Table of contents:
Preface Gallery Notation Contents
1 Sets, Orders, Relations and Measures 1.1 Sets and Orders Exercises 1.2 Convergence and Summability in R Exercises 1.3 Maps and Multimaps (Relations) Exercises 1.4 Measurable Spaces Exercises 1.5 Measures Exercises 1.6 Completion of a Measure Exercises 1.7 Lebesgue and Stieltjes Measures Exercises 1.8 * Product Measures Exercises 1.9 * Regular Measures on Metric Spaces Exercises Notes, Remarks, and Additional Reading
2 Encounters With Limits 2.1 Convergences Exercises 2.2 Topologies 2.2.1 General Facts About Topologies Exercises 2.2.2 Connectedness Exercises 2.2.3 Lower Semicontinuity Exercises 2.2.4 Compactness Exercises 2.3 Metric Spaces 2.3.1 General Facts About Metric Spaces Exercises 2.3.2 Complete Metric Spaces 2.3.3 Application to Ordinary Differential Equations Exercises 2.3.4 Compact Metric Spaces Exercises Additional Reading
3 Elements of Functional Analysis 3.1 Normed Spaces 3.1.1 General Properties of Normed Spaces Exercises 3.1.2 Continuity of Linear and Multilinear Maps Exercises 3.1.3 Finite Dimensional Normed Spaces Exercises 3.1.4 Series and Summable Families Exercises 3.1.5 Spaces of Continuous Functions Exercises 3.2 Topological Vector Spaces. Weak Topologies Exercises 3.3 Separation and Extension. Polarity 3.3.1 Convex Sets and Convex Functions Exercises 3.3.2 Separation and Extension Theorems Exercises 3.3.3 Polarity and Orthogonality Exercises 3.4 Couplings and Reflexivity 3.4.1 Couplings 3.4.2 Reflexivity and Weak Topologies Exercises 3.4.3 Uniform Convexity Exercises 3.4.4 Separability Exercises 3.5 Some Key Results of Functional Analysis 3.5.1 Some Classical Theorems Exercises 3.5.2 Densely Defined Operators and Transposition Exercises 3.5.3 The Spectrum of a Linear Operator Exercises 3.5.4 Compact Operators Exercises 3.6 Elementary Integration Theory 3.6.1 Regulated Functions and Their Integrals 3.6.2 * Functions of Bounded Variation and Integration Exercises 3.6.3 * Application: The Dual of C(T) Additional Reading
4 Hilbert Spaces 4.1 Hermitian Forms Exercises 4.2 Best Approximation Exercises 4.3 Orthogonal Families Exercises 4.4 The Dual of a Hilbert Space Exercises 4.5 Fourier Series 4.5.1 Application: The Dirichlet Problem for the Disk 4.5.2 Application: Dido's Problem Exercises 4.6 Orthogonal Polynomials Exercises 4.7 Elementary Spectral Theory for Self-Adjoint Operators Exercises Additional Reading
5 The Power of Differential Calculus 5.1 Differentiation of One-Variable Functions 5.1.1 Derivatives of One-Variable Functions 5.1.2 The Mean Value Theorem Exercises 5.2 Primitives and Integrals Exercises 5.3 Directional Differential Calculus Exercises 5.4 Classical Differential Calculus 5.4.1 The Main Concepts and Results of Differential Calculus Exercises 5.4.2 Higher Order Derivatives Exercises 5.4.3 Taylor's Formulas Exercises 5.4.4 Differentiable Partitions of Unity 5.5 Solving Equations and Inverting Maps 5.5.1 Newton's Method Exercises 5.5.2 The Inverse Mapping Theorem 5.5.3 The Implicit Function Theorem Exercises 5.5.4 Geometric Applications Exercises 5.5.5 * The Eikonal Equation 5.5.6 * Critical Points Exercises 5.5.7 * The Method of Characteristics Exercises 5.6 Applications to Optimization 5.6.1 Unconstrained Minimization Exercises 5.6.2 Normal Cones, Tangent Cones, and Constraints Exercises 5.6.3 Calculus of Tangent and Normal Cones 5.6.4 Multiplier Rules Exercises 5.7 Introduction to the Calculus of Variations 5.7.1 The One-Variable Case 5.7.2 Some Examples 5.7.3 The Legendre Transform 5.7.4 The Hamiltonian Formalism Exercises 5.7.5 The Several Variables Case Additional Reading
6 A Touch of Convex Analysis 6.1 Continuity Properties of Convex Functions Exercises 6.2 Differentiability Properties of Convex Functions 6.2.1 Derivatives of Convex Functions 6.2.2 Subdifferentials of Convex Functions Exercises 6.2.3 Differentiability of Convex Functions Exercises 6.2.4 Elementary Calculus Rules for Subdifferentials Exercises 6.2.5 Application to Optimality Conditions Exercises 6.3 The Legendre-Fenchel Transform and Its Applications 6.3.1 The Legendre-Fenchel Transform Exercises 6.3.2 A Brief Account of Convex Duality Theories Exercises 6.3.3 Duality and Subdifferentiability Results Exercises 6.3.4 The Interplay Between a Function and Its Conjugate Exercises 6.3.5 Conditioning and Well-Posedness Exercises 6.4 * Applications to the Geometry of Normed Spaces Exercises 6.5 Regularization of Convex Functions Additional Reading
7 Integration 7.1 Step Functions and μ-Measurable Functions 7.2 Integrable Functions and Their Integrals Exercises 7.3 Approximation of Integrable Functions Exercises 7.4 Convergence Results Exercises 7.5 Integrals Depending on a Parameter 7.6 Integration on a Product Exercises 7.7 Change of Variables Exercises 7.8 Measures on Spheres Exercises Additional Reading
8 Differentiation and Integration 8.1 Vectorial Measures Exercises 8.2 Decomposition and Differentiation of Measures 8.2.1 Decompositions of Measures 8.2.2 Differentiation of Measures Exercises 8.3 Differentiation of Measures on Rd Exercises 8.4 Derivatives of One-Variable Functions Exercises 8.5 Lebesgue Lp(S,E) Spaces 8.5.1 Basic Facts About Lebesgue Spaces Exercises 8.5.2 Nemytskii Maps 8.6 Duality and Reflexivity of Lebesgue Spaces Exercises 8.7 Compactness in Lebesgue Spaces Exercises 8.8 Convolution and Regularization Exercises 8.9 Some Useful Transforms 8.9.1 The Fourier Transform Exercises 8.9.2 Introduction to the Radon Transform Exercises Additional Reading
9 Partial Differential Equations 9.1 Definition and Basic Properties of Sobolev Spaces 9.1.1 Test Functions and Weak Derivatives 9.1.2 Definition and First Properties of Sobolev Spaces 9.1.3 Calculus Rules in Sobolev Spaces 9.1.4 Extension 9.1.5 Traces Exercises 9.2 Embedding Results 9.3 Elliptic Problems 9.3.1 Ellipticity 9.3.2 Energy Estimates and Existence Results 9.3.3 Regularity of Solutions 9.3.4 Maximum Principles Exercises 9.4 Nonlinear Problems 9.4.1 Transforming Equations Exercises 9.4.2 Using Potential Functions 9.4.3 Order Methods 9.4.4 Monotone Multimaps Exercises 9.4.5 Representation of Monotone Multimaps 9.4.6 Surjectivity of Maximally Monotone Multimaps Exercises 9.4.7 Sums of Maximally Monotone Multimaps Exercises 9.4.8 Variational Inequalities Additional Reading
10 Evolution Problems 10.1 Ordinary Differential Equations 10.1.1 Separation of Variables 10.1.2 Existence Results Exercises 10.1.3 Uniqueness and Globalization of Solutions Exercises 10.1.4 The Exponential Map Exercises 10.1.5 The Laplace Transform Exercises 10.2 Semigroups 10.2.1 Continuous Linear Semigroups and Their Generators Exercises 10.2.2 Characterization of Generators of Continuous Semigroups Exercises 10.2.3 * Dissipative and Accretive Multimaps Exercises 10.3 Parabolic Problems: The Heat Equation 10.4 Hyperbolic Problems: The Wave Equation Exercises
Appendix: The Brouwer's Fixed Point Theorem Additional Reading
References
Index
Universitext
Jean-Paul Penot
Analysis
From Concepts to Applications
Universitext
Series Editors
Sheldon Axler, San Francisco State University
Vincenzo Capasso, Università degli Studi di Milano
Carles Casacuberta, Universitat de Barcelona
Angus MacIntyre, Queen Mary, University of London
Kenneth Ribet, University of California, Berkeley
Claude Sabbah, CNRS, École Polytechnique, Palaiseau, France
Endre Süli, University of Oxford
Wojbor A. Woyczyński, Case Western Reserve University, Cleveland, OH
Universitext is a series of textbooks that presents material from a wide variety of mathematical disciplines at master's level and beyond. The books, often well class-tested by their author, may have an informal, personal, even experimental approach to their subject matter. Some of the most successful and established books in the series have evolved through several editions, always following the evolution of teaching curricula, to very polished texts. Thus as research topics trickle down into graduate-level teaching, first textbooks written for new, cutting-edge courses may make their way into Universitext.
More information about this series at http://www.springer.com/series/223
Jean-Paul Penot Université Pierre et Marie Curie Paris, France
Universitext
ISSN 0172-5939
ISSN 2191-6675 (electronic)
ISBN 978-3-319-32409-8
ISBN 978-3-319-32411-1 (eBook)
DOI 10.1007/978-3-319-32411-1
Nothing is lost, nothing is created out of nothing, everything is transformed.
Lavoisier

Mathematics has been created by human beings for their needs. The teaching mathematician must remain a teacher of action.
Henri Lebesgue

I hear, I forget; I see, I remember; I do, I understand!
Chinese proverb

Preface

The purpose of this book is to present some basic results in analysis that can be used to solve various problems, so it may serve as a kind of toolbox. We do not intend to give a complete panorama of the field. In particular, we concentrate on real analysis and leave complex analysis aside; however, we consider complex functions now and then. Although we cannot claim that this volume is devoted to applied analysis, our choice of topics is driven by our wish to present results that can be applied to concrete problems. It is in this sense that this book could be called "Motivated Analysis". This expression is due to J.-P. Aubin and means that we shall gather some results of analysis that may be useful in applications, even if the nature of these applications is not the focus of the text. Thus, our route is not the choice of some important problems that occur in applications, as in [22, 62, 87, 88, 101, 129, 130, 140, 178, 191, 236, 241, 245], but neither is it the panorama of deep advances in mathematical concepts presented in [100] and [101].

A leading thread throughout this book is a strategy often advised by mathematicians and illustrated in many instances: if a problem seems to be difficult, try to change it into a more tractable problem. This can be done in various ways. A first approach consists in reformulating the question, keeping its crucial features and dropping inessential details as much as possible (in a sense, mathematics can be considered as an art of striptease). In doing so, one is often led to a more general problem that is no more difficult to solve. On the contrary, because its framework is rather bare and simple, its solution is often easier to reach. Of course, some knowledge of general mathematics may help.
Another means of solving a problem consists in using a transformation. This approach can be combined with the simplification just mentioned. Quite often, a hidden or rather intricate operation is interchanged with a simple operation via the transformation. For instance, a convolution may be transformed into an addition or a product. Then the problem may appear much more tractable. Many transformations have been proposed by mathematicians. Among them are those due to Cole-Hopf, Fenchel, Fourier, Hilbert, Laplace, Legendre, Mellin, Radon and Vaslov. For this reason, we focus our attention on some useful transforms, even if we just give an introduction to them. Their applications to economics, engineering, mathematics, medicine and physics are numerous and important.

The passage from monotone operators to convex functions (and the inverse passage) is another example of such a transformation that is not yet classical, but it is fruitful and it has attracted a number of mathematicians recently. We give some attention to it in Chap. 9.

The different chapters of this book may serve as separate courses. However, we consider it important not to neglect the interdependence of the subjects. This is revealed in several instances throughout the book. Integration can be taught without a knowledge of abstract measure theory, but the latter can serve as an important foundation and it is the basis of probability theory. Nemytskii operators play a crucial role in nonlinear partial differential equations. Functional analysis permeates almost all of the topics considered in the book. Elementary differential calculus can be set in Euclidean spaces rather than in normed vector spaces. However, the main lines may be hidden by a heavy use of partial derivatives and components, and some applications would be lost if we confined differential calculus to such a restricted framework. The balance between generality and simplicity is not easy to reach.
Take the notion of convergence for example. It can be considered as the keystone of analysis. For that reason we present the main lines of the concept, but it could be expounded more thoroughly. In fact, we encourage the reader to prune rather than to develop what is presented here, since we have given more than the essentials of what is needed in practice. Also, the theory of distributions could be given a more prominent role than the one we offer it in our presentation of Sobolev spaces; incidentally the convergence approach is certainly simpler than the topological approach to distributions considered as elements of a dual space. However, we prefer to give a rather direct treatment, even if the concept of transposition is central to the understanding of generalized derivatives and generalized solutions.

The difficulty in choosing the presentation of a subject is well reflected in the following quotation from Julian Barnes, Staring at the Sun (Jonathan Cape, London, 1986), mentioned by I. Smith in Bulletin of the American Mathematical Society 52 (3) p. 415 (2015):

…everything you wanted to say required a context. If you gave the full context, people thought you a rambling old fool. If you didn't give the context, people thought you a laconic old fool.
In his remarkable book [117] L.C. Evans writes "notation is a nightmare". It certainly presents difficulties, in particular when one tries to reconcile various uses.
However, this difficulty is no greater than the challenge posed by the choice of topics and the writing of clear and concise proofs. Our experience with texts that were written one or more lifetimes ago showed us that the need for rigour and precision has increased and is likely to increase more in the future. Thus, we avoid some common abuses such as confusing a function $f$ or $f(\cdot)$ with its value $f(x)$, a sequence $(x_n)$ with its general term $x_n$ or its set of values $\{x_n\}$, and a space with its dual. We distinguish the adjoint $A^*$ of a continuous linear map $A$ and its transpose $A^\intercal$, and we distinguish the derivative $Df$ of a function and its gradient $\nabla f$. If you are not convinced by such distinctions, you are challenged to give at once the higher derivatives of a composite function. Also, we refrain from using the notation $f_{x_1}$ to denote the partial derivative of a function $f$ with respect to its first variable $x_1$ because $f_{x_1}$ may denote the function $f(x_1, \cdot)$ of the variables other than $x_1$, the variable $x_1$ being "frozen".

On the other hand, our position is not rigid, as the next two examples show. We do not follow the choice of C. Zălinescu in [265], who denotes by $\alpha^{\#}$ the special conjugate of a function $\alpha$ defined on $\mathbb{R}_+$ rather than $\mathbb{R}$; although his choice is wise, we prefer the reader to see at first glance the link with convex conjugacy, even if there is a risk of confusion (but this risk is limited since we only use nonnegative values of the arguments). Also, we retain the classical terminology concerning "locally uniformly convex" norms despite the fact that the notion is a pointwise notion rather than a local one. Our general choice of discarding ambiguity may lead to unusual expressions or notations. We hope the reader will not be disturbed by such novelties and that authors will support them. Most mathematicians are unaware that the notation they use was once considered shocking.
Rarely is credit given to their inventors or promoters, such as Oresme (for coordinates and exponents), Recorde (for the sign $=$), Leibniz (for derivatives, products, quotients and $\int$, as the first letter of the Latin summa or sum was denoted at that time), and Peano (for inclusion…). In this respect we must say that the notation $\mathbb{P}$ used here for the set of positive numbers (more present in analysis than the set $\mathbb{Q}$ of rational numbers) is, to our knowledge, due to J.M. Borwein. The scalar product of two vectors $u$, $v$ is denoted here by the compromise $\langle u \mid v \rangle$ between the mathematicians' and the physicists' uses; but, from time to time, we use the dot notation $u \cdot v$ in view of its simplicity. We avoid the notation $(u, v)$, which may mean either a pair or an open interval. In general, we try to avoid ambiguity since mathematics aims at being a clear language. For that reason, we sometimes depart from the most common terminology in order to avoid confusion. But, consciously or unconsciously, we also use some abuses of notation that tradition has made acceptable (until they are rejected?).

In preparing this book, we have benefited from the teaching of our masters, some of whom were actors in the clarification or the setting up of the subject. We also benefited from several excellent books. The readers will easily detect these sources. Our choices were dictated by our wish to present the most elegant proofs. The word "elegant" may seem a strange choice for a mathematical book. However, mathematicians are often sensitive to aesthetics when considering proofs. Under this term they often appreciate bold, clear, concise proofs. A frequent drawback is the effort required from the reader. But the reward is a better understanding of the nature of the result and of its possible applications. For that reason, we encourage the reader to approach proofs with paper and pen in order to devise variants or developments or to mark the decisive steps. Experiencing proofs is the best apprenticeship for a field. We are so convinced of the value of proofs that sometimes we give two proofs of the same result; on the other hand we skip some proofs, either because they are too involved or because they are outside the scope of the book.

Exercises are often augmented with hints, but not complete solutions. The most difficult exercises are marked with an asterisk; they are included as complements rather than for training. An asterisk also marks results or sections that can be omitted on a first reading.

It is a great pleasure to thank my colleagues and friends who kindly read and criticized parts of the successive versions of this book; in particular, the contributions of Luc Barbet, Marc Dambrine, Marc Durand, Emmanuel Giner, Dena Kazerani, Alexander Kruger, Khadra Nachi, Steve Robinson, Lionel Thibault, Constantin Zălinescu and Nadia Zlateva have been precious for reducing the number of mistakes or misprints and for bringing me some encouragement. I hope the reader will find this toolbox useful and will enjoy the rich legacy of our predecessors. Comments and criticisms will be welcome at [email protected].

Paris, France
July 2016
Jean-Paul Penot
Gallery
The list of mathematicians that follows is neither a pantheon nor an academy. It is directly connected to the results expounded in this book and is intended to give the reader an idea of the historical appearance of the ideas involved here. The brief indications that follow are by no means biographies. They just give some elements sketching a historical perspective: we endeavor to devote just one line to each author. It is a pity such elements are so brief and in some sense unfair for works whose extent is often large and varied (in particular, for Euler, Gauss, Cauchy, Poincaré and von Neumann). In some cases a biography would be rather short (for instance, Gateaux was killed in 1914 shortly after leaving the Ecole Normale Supérieure), but in many cases mathematicians have enjoyed a rather long life. Also, one may regret that some anecdotes about people who were living persons with various talents and weaknesses could not be reproduced here. For instance, many mathematicians were concerned with philosophical or logical questions; others were involved in the study of physical phenomena. Hausdorff published literary and philosophical works under the pseudonym of Paul Mongré; Vandermonde was one of the founders of the Conservatoire des Arts et Métiers as he was involved in metallurgy; Green was a miller and he hardly attended school; W.H. Young worked on so many topics with his wife Grace Chisholm-Young that it is not clear whether the results attributed to him are joint results or not.

Because nationalism has brought so much terrible suffering to human beings, and because present nationalities are not the same as those of the past, in general we pick the names of a few places where the discoveries were elaborated rather than the nationalities of authors. This is also a credit to their scientific environments. When the authors have conducted research in many places, we simply state the name of the country.
Our choices suffer from outrageous simplifications, not just of locations, but also for what concerns achievements. Happily, many sources can complete the glimpse we give hereafter, among which are [28, 39, 49, 87, 105, 110, 148, 156] and the web sites
en.wikipedia.org
www-history.mcs.st-andrews.ac.uk/history/BiogIndex.html
www.storyofmathematics.com
Abel, Niels Henrik, 1802–1829, Oslo: algebraic equations, elliptic functions, integrals, series.
Aleksandrov, Pavel Sergeïevich, 1896–1982, Moscow: general and algebraic topology.
d'Alembert Le Rond, Jean, 1717–1783, Paris: analysis, complex numbers, statistics.
Archimedes, circa 287–212 BC, Greek inventor and mathematician: $\pi$, curves.
Arzelà, Cesare, 1847–1912: functional analysis, integration.
Ascoli, Giulio, 1843–1896, Milano: sets whose points are functions.
Baire, René, 1874–1932, Montpellier, Dijon: theory of functions, general topology.
Banach, Stefan, 1892–1945, Lwow: functional analysis.
Bernoulli, Jacob, 1654–1705, Basel: curves, polar coordinates, probabilities.
Bernstein, Felix, 1878–1956, Göttingen and USA: set theory.
Bessel, Friedrich, 1784–1846, Königsberg, astronomer: special functions.
Bochner, Salomon, 1899–1982, Munich, Princeton: integration theory.
du Bois-Reymond, 1831–1889, Heidelberg, Berlin, Freiburg: analysis, Fourier series.
Bolzano, Bernhard, 1781–1848, Prague: set theory, logic, topology.
Boole, George, 1815–1864, Britain, Cork: logic, order, set theory.
Borel, Emile, 1871–1956, Paris: measure theory, probabilities, game theory.
Brouwer, Luitzen Egbertus Jan, 1881–1966, Amsterdam: intuitionism, fixed points.
Brunn, Herman Karl, 1862–1939, Germany: measure and geometry.
Bunyakovsky, Viktor, 1804–1889, Saint Petersburg: analysis, probabilities.
Cantor, Georg Ferdinand Ludwig, 1845–1918, Göttingen, Halle: cardinality, logic.
Carleson, Lennart, born 1928, Stockholm, Los Angeles: harmonic analysis.
Cauchy, Augustin-Louis, 1789–1857, Paris: real and complex analysis.
Cartan, Henri, 1904–2008, Paris: algebraic topology, homology, complex variables.
Caratheodory, Constantin, 1873–1950, Greece, Germany: PDE, calculus of variations.
Cavalieri, Francesco Bonaventura, 1598–1647, Bologna: precursor of integration theory.
Cesàro, Ernesto, 1859–1906, Naples: divergent series and trigonometric series.
Chasles, Michel, 1793–1880, Paris: geometry, in particular projective geometry.
Chebyshev, Pafnuty, 1821–1894, Saint Petersburg: probability, number theory, mechanics.
Dini, Ulisse, 1845–1918, Pisa: analysis, derivatives, geometry.
Dirac, Paul Adrien Maurice, 1902–1984, British physicist using generalized functions.
Dirichlet, Gustav Lejeune, 1805–1859, Göttingen: number theory, series, mechanics.
Egorov, Dmitri Fyodorovich, 1869–1931, Moscow: measure theory.
Euler, Leonhard, 1707–1783, Basel, Saint Petersburg, Berlin: analysis and more.
Fatou, Pierre, 1878–1929, Paris: astronomy, integral calculus, complex analysis.
Fejér, Lipót, 1880–1959, Budapest: Fourier series.
de Fermat, Pierre, 1601–1665, Toulouse: number theory, geometry, optimization.
Fischer, Ernst Sigismund, 1875–1954, Erlangen, Köln: functional analysis.
Fourier, Joseph Jean-Baptiste, 1768–1830, Auxerre, Paris, Grenoble: series, analysis.
Fréchet, Maurice, 1878–1973, Poitiers, Strasbourg, Paris: analysis, metric spaces.
Fresnel, Augustin-Jean, 1788–1827, Paris physicist and engineer: light, integrals.
Friedrichs, Kurt Otto, 1901–1982, Aachen, Braunschweig, New York: Sobolev spaces.
Fubini, Guido, 1879–1943, Pisa, Torino, Princeton: differential equations, integration.
Gagliardo, Emilio, 1930–2008, Pavia: analysis.
Gårding, Lars, 1919–2014, Lund, Sweden: partial differential equations, bird songs.
Gateaux, René, 1889–1914, Paris, Roma: functional analysis, differential calculus.
Gauss, Carl Friedrich, 1777–1855, Göttingen: geometry, number theory, analysis.
Gram, Jørgen Pedersen, 1850–1916, Copenhagen: approximation, number theory, determinants.
Green, George, 1793–1841, English mathematician and physicist: a formula.
Notation

i.e.    That is to say
f.i.    For instance
$:=$    Equality by definition
$=$    Equality
$\leq$    Less than or equal to
$\geq$    Greater than or equal to
$<$    Less than
$>$    Greater than
$\forall$    For all
$\forall_\mu$    $\mu$-almost everywhere
$\exists$    There exists
$\in$    Belongs to
$\subset$    Included in
$\cap$ or $\bigcap$    Intersection
$\cup$ or $\bigcup$    Union
$\sqcup$ or $\coprod$    Union of mutually disjoint sets
$\vee$ or $\bigvee$    Supremum
$\wedge$ or $\bigwedge$    Infimum
$\sum$    Sum
$\int$    Integral
$\lambda_d$    The Lebesgue measure on $\mathbb{R}^d$
$\to$    Converges
$\mapsto$    Is sent to
$\to_S$    Converges while remaining in $S$
$\to r_+$    Converges to $r$ while remaining greater than $r$
$\rightharpoonup$    Converges in the weak topology
$\emptyset$    The empty set
$\infty$    Infinity
$\mathbb{C}$    The set of complex numbers
$\mathbb{N}$    The set of natural integers
$\mathbb{N}_k$    The set $\{1, \ldots, k\}$
$\mathbb{P}$    The set of positive real numbers
$\mathbb{Q}$    The set of rational numbers
$\mathbb{R}$    The set of real numbers
$\mathbb{R}_+$    The set of nonnegative real numbers
$\mathbb{R}_-$    The set of nonpositive real numbers
$\overline{\mathbb{R}}_+$    The set $\mathbb{R}_+ \cup \{+\infty\}$ of nonnegative extended real numbers
$\overline{\mathbb{R}}$    The set $\mathbb{R} \cup \{-\infty, +\infty\}$ of extended real numbers
$\mathbb{R}_\infty$    The set $\mathbb{R} \cup \{+\infty\}$
$\mathbb{Z}$    The set of integers
$\Delta_m$    The canonical $m$-simplex $\{(t_1, \ldots, t_m) \in \mathbb{R}^m_+ : t_1 + \ldots + t_m = 1\}$
$A \setminus B$    The set of $a \in A$ that do not belong to $B$
$A \Delta B$    The symmetric difference of $A$ and $B$: $(A \setminus B) \cup (B \setminus A)$
$(a, b)$    The pair formed by $a$ and $b$
$]a, b[$    The open interval with end points $a$ and $b$
$[a, b]$    The closed interval with end points $a$ and $b$
$[a, b[$    The semi-closed interval $[a, b] \setminus \{b\}$
$]a, b]$    The semi-closed interval $[a, b] \setminus \{a\}$
$\langle \cdot, \cdot \rangle$    The coupling function between a normed space and its dual
$\cdot$    An unspecified variable
$\langle \cdot \mid \cdot \rangle$    The scalar product
$\|\cdot\|$ or $\|\cdot\|_E$    The norm of a normed space (n.v.s.) $E$
$\|f\|_p$    The norm of a function $f$ in $L_p(S, \mu, E)$ for $p \in [1, \infty[$
$\|f\|_\infty$    $\sup_x \|f(x)\|$ or $\operatorname{esssup} \|f\|$
$\operatorname{supp} f$    The support of a function $f$
$I_S$ or just $I$    The identity map from $S$ to $S$
$F^{-1}$    The inverse of a multimap (or multifunction) $F$
$A^\intercal$    The transpose of the continuous linear map $A$
$A^*$    The adjoint of the continuous linear map $A$ between Hilbert spaces
$1_S$    The characteristic function of the subset $S$: 1 on $S$, 0 elsewhere
$\iota_S$    The indicator function of the subset $S$: 0 on $S$, $+\infty$ elsewhere
$\sigma_S$ or $h_S$    The support function of the subset $S$
$d_S$, $d(\cdot, S)$    Distance to a subset $S$ of a metric space
$\operatorname{card} S$    Cardinal of a set $S$
$\operatorname{int} S$    The interior of the subset $S$ of a topological space
$\operatorname{cl} S$    The closure of the subset $S$ of a topological space
$\operatorname{bdry} S$    The boundary of the subset $S$ of a topological space
$\operatorname{diam} S$    The diameter of a subset $S$ of a metric space $(X, d)$
$\dim X$    The dimension of a vector space
$\operatorname{co} S$    The convex hull of the subset $S$ of a linear space
$\overline{\operatorname{co}}\, S$    The closed convex hull of the subset $S$ of a n.v.s.
$\overline{\operatorname{co}}^*\, S$    The weak$^*$ closed convex hull of a subset $S$ of a dual space
$\operatorname{span} S$    The smallest linear subspace containing $S$
$S^0$    The polar set $\{x^* \in X^* : \forall x \in S \ \langle x^*, x \rangle \leq 1\}$ of $S$
$B(x, r)$    The open ball with center $x$ and radius $r$ in a metric space
$B[x, r]$    The closed ball with center $x$ and radius $r$ in a metric space
$B_X$    The closed unit ball in a n.v.s. (normed space) $X$
$L(X, Y)$    The set of continuous linear maps from $X$ to another n.v.s. $Y$
$X^*$    The topological dual space $L(X, \mathbb{R})$ of a n.v.s. $X$
$\Gamma(X)$    The set of closed proper convex functions on a n.v.s. $X$
$C(X)$    The space of continuous functions on a topological space $X$
$C_b(X)$    The space of bounded continuous functions on $X$
$C(X, Y)$    The set of continuous maps from $X$ to another space $Y$
$C^k(W)$    The set of functions of class $C^k$ on an open subset $W$ of a n.v.s.
$C^k_b(W)$    The space of functions in $C^k(W) \cap C_b(W)$ with derivatives in $C_b(W)$
$C^k_c(W)$    The set of functions of class $C^k$ with compact support on $W$
$C^k(W, Y)$    The set of maps of class $C^k$ from $W$ into a n.v.s. $Y$
$T$    An interval of $\mathbb{R}$
$R(T, E)$    The space of regulated functions from $T$ into a n.v.s. $E$
$\mathcal{L}_p(S, \mu, E)$    The space of $p$-integrable maps from $S$ into $E$
$\mathcal{L}_p(S, \mu)$ or $\mathcal{L}_p(\mu)$, $\mathcal{L}_p(S)$    The space $\mathcal{L}_p(S, \mu, E)$ for $E := \mathbb{R}$
$L_p(S, \mu, E)$    The space of classes for equality a.e. of elements of $\mathcal{L}_p(S, \mu, E)$
$\mathcal{P}(X)$    The set of subsets of a set $X$
$\mathcal{O}$    The family of open subsets of a topological space $X$
$\mathcal{N}(x)$    The family of neighborhoods of $x$ in a topological space
$\mathcal{G}_\delta$    The family of countable intersections of open subsets
$\mathcal{S}$    A ring or a $\sigma$-algebra on a set $S$
$f'$ or $Df$    The derivative of a function $f$ or a map $f$
$D_i f$ or $\frac{\partial}{\partial x_i} f$    The partial derivative of $f$ with respect to the $i$-th variable
$D^\alpha f$    The partial derivative of $f$ for the multi-index $\alpha$
$\nabla f$    The gradient of a function $f$
$\partial f(x)$    The subdifferential of a function $f$ at $x$
$\Delta f$    The Laplacian of the function $f$
Measure what is measurable, and make measurable what is not so. (Misura ciò che è misurabile, e rendi misurabile ciò che non lo è) Galileo Galilei (1564–1642)
Abstract This chapter is devoted to some preliminary subjects and techniques. Sets and orders are briefly considered, in particular for defining nets and sequences which are used throughout the book. Basic facts about countability are reviewed. Since a practice in set theory is needed for topology and analysis, measure spaces are chosen for training. The classical extension results are presented and the product measure is treated without using integration theory. The Lebesgue and Stieltjes measures are introduced as examples.
A knowledge of basic set theory is desirable for a reading of the present book, as in various branches of mathematics. However, it is not our purpose to enter into the subtleties of the logical arguments involved in such foundations. We just assume the reader is familiar with the usual operations $\cup$, $\cap$, $\times$ of set theory and has a basic knowledge of the fundamental sets $\mathbb{N}$, $\mathbb{Z}$, $\mathbb{R}$, $\mathbb{C}$. We refer to the list of notations for the definitions of these sets and further information.

We devote this opening chapter to some preliminary material dealing with sets, orders, correspondences and particular families of subsets of a set. We gather this material because we believe the reader must have a certain familiarity with these basic techniques before using more specific concepts. In some cases the reader will have to jump to the next chapter when some topological notions are intertwined with measure spaces.

The concepts of measurable space and of measure offer an opportunity to train in the practice of sets and maps. For this reason, we present them in this first chapter. Such a choice has a drawback: in some places we have to anticipate some topological notions. We suggest the reader skip these passages or briefly look at the next chapter to fetch the required information. We hope the benefit of the gained familiarity with the basic tools of analysis will compensate for any inconvenience. Integration will be presented much later (Chap. 7) because it requires many more concepts of analysis such as normed vector spaces and completeness. Introducing measure spaces at this stage and not later may also be useful to readers beginning a study of probability.

In order to demonstrate the need for the training we mentioned, let us consider the following problem.

Problem. A yard is paved with rectangular stones. Knowing that the area of a rectangle is the product of its length by its width, prove that the area of the yard is the sum of the areas of the stones (Fig. 1.1).
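The statement of the Problem can at least be checked numerically on one small instance. The following is a minimal sketch and in no way a proof: the yard dimensions, the list of stones and the helper names are all hypothetical, chosen only for illustration, and the check relies on integer coordinates so that the yard splits into unit cells.

```python
from itertools import product

# Hypothetical data: a 4-by-3 yard and four stones given as (x, y, width, height).
YARD = (4, 3)
STONES = [(0, 0, 2, 3), (2, 0, 2, 1), (2, 1, 1, 2), (3, 1, 1, 2)]


def area(stone):
    """Area of a rectangle: the product of its width by its height."""
    _, _, w, h = stone
    return w * h


def covers_exactly_once(yard, stones):
    """Return True if every unit cell of the yard lies in exactly one stone."""
    width, height = yard
    for cx, cy in product(range(width), range(height)):
        hits = sum(
            x <= cx < x + w and y <= cy < y + h for x, y, w, h in stones
        )
        if hits != 1:
            return False
    return True


yard_area = YARD[0] * YARD[1]
stones_area = sum(area(s) for s in STONES)
```

On this instance the stones pave the yard without overlap and the two areas agree. The difficulty of the Problem lies in proving this for arbitrary pavings, where no common grid of unit cells need exist.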
1.1 Sets and Orders

Many facts in mathematics (and real life) can be formulated in terms of relations. A relation $R$ between two sets $X$, $Y$ is a subset $R$ of $X \times Y$. Given $x \in X$ one sets $R(x) := \{y \in Y : (x, y) \in R\}$. Viewed in this way a relation can be considered as a map from $X$ into the set $\mathcal{P}(Y)$ of subsets of $Y$. We shall return to this viewpoint in Sect. 1.3. Order is an important topic in mathematics (and in real life). However not all orders are total orders for which two elements are always comparable as in $\mathbb{N}$, $\mathbb{Z}$, $\mathbb{R}$. Authority in a modern family provides an example of order that is not total: neither of the two parents is above the other, even if both are above any child. Using a sheet of paper and the order induced by the height on the page, one can make drawings of various ordered situations (Fig. 1.2). In precise terms, a preorder or partial preorder or preference relation on a set $X$ is a relation $A$ between elements of $X$, often denoted by $\le$, with $A(x) := \{y \in X : x \le y\}$, that is reflexive ($x \le x$, or $x \in A(x)$, for all $x \in X$) and transitive ($A \circ A \subset A$, i.e. $x \le y$, $y \le z \Rightarrow x \le z$ for $x, y, z \in X$). One also writes $y \ge x$ instead of $x \le y$ or $y \in A(x)$ and one reads: $y$ is above $x$ or $y$ is preferred to $x$. The preorder $\le$ is an order whenever it is antisymmetric in the sense that for any $x, y \in X$ one has $x = y$ whenever $x \le y$ and $y \le x$. Two elements $x$, $y$ of a preordered set $(X, \le)$ are
Fig. 1.2 Authority in various families
said to be comparable if either $x \le y$ or $y \le x$. If this is the case for all pairs of elements of $X$, one says that $(X, \le)$ is totally ordered. As mentioned above, this is not always the case (think of the set $X := \mathcal{P}(S)$ of subsets of a set $S$ with inclusion when $S$ has at least two elements). Given a subset $S$ of $(X, \le)$, an element $m$ of $X$ is called an upper bound (resp. a lower bound) of $S$ if one has $s \le m$ (resp. $m \le s$) for all $s \in S$. An upper bound of $S$ that belongs to $S$ is a greatest element of $S$; but it may happen that $S$ has upper bounds but no greatest element. For instance, $1$ is a least upper bound of $[0, 1[ \, := \{r \in \mathbb{R} : 0 \le r < 1\}$ in $\mathbb{R}$ but not a greatest element of this set. An element $m$ of $X$ is a least upper bound of $S$ if $m$ is an upper bound of $S$ and if for any upper bound $m'$ of $S$ one has $m \le m'$. If $\le$ is an order, such an element is unique and one writes $m = \sup S$ or $m = \bigvee S$. The definition of a greatest lower bound of $S$ can be deduced from the definition of a least upper bound by reversing the preorder, i.e. by introducing the preorder $\le'$ given by $x \le' y$ if $y \le x$. An ordered set $(X, \le)$ is called a lattice if for any $(x, x') \in X \times X$ the set $\{x, x'\}$ has a least upper bound denoted by $x \vee x'$ and a greatest lower bound denoted by $x \wedge x'$. If $\sup S$ (resp. $\inf S$) exists for any nonempty subset $S$ of $X$ that has upper (resp. lower) bounds, $(X, \le)$ is called a complete lattice. A map $f : X \to X'$ between two preordered sets $(X, \le)$ and $(X', \le')$ will be called homotone or order-preserving or increasing if for any $x_1 \le x_2$ in $X$ one has $f(x_1) \le' f(x_2)$. It is isotone if it is bijective, homotone and if $f^{-1}$ is homotone. If $f$ is homotone with respect to the reverse order on $X'$ one says that $f$ is antitone or order-reversing or decreasing. Since the term “monotone” is often used for one-variable real-valued functions that are either order-preserving or order-reversing, we prefer to avoid it. When the order in $X'$ is not total we also avoid the term “nondecreasing”, which is ambiguous. If $f$ is such that $f(x_1) < f(x_2)$ whenever $x_1 < x_2$, i.e.
$x_1 \le x_2$ and $x_1 \ne x_2$, we say that $f$ is strictly increasing. The preceding notions are illustrated in the following fixed point theorem.

Theorem 1.1 (Knaster-Tarski) Let $(L, \le)$ be an upper complete lattice, i.e. an ordered set in which any nonempty subset has a least upper bound. Suppose $f : L \to L$ is order-preserving and there exists some $z \in L$ such that $z \le f(z)$. Then the set $F := \{x \in L : f(x) = x\}$ of fixed points of $f$ is nonempty and $F$ has a greatest element.

Proof Let $D := \{x \in L : x \le f(x)\}$. It is a nonempty subset of $L$ (as $z \in D$), hence it has a least upper bound $u$. For all $x \in D$ one has $x \le u$, hence $f(x) \le f(u)$ and, by transitivity, $x \le f(u)$. Thus $f(u)$ is an upper bound of $D$. Since $u$ is the least upper bound of $D$, we have $u \le f(u)$. Thus $u \in D$. Since $f$ is homotone, we observe that $f(u) \le f(f(u))$, so that $f(u) \in D$. Then, by definition of $u$, we get $f(u) \le u$, hence $f(u) = u$ and $u \in F$. Finally, since $F$ is contained in $D$, for any $v \in F$ one has $v \le u$. $\square$

Example Let $L$ be the power set $\mathcal{P}(X)$ (also denoted by $2^X$) of a nonempty set $X$, i.e. the set of subsets of $X$. With respect to the order given by inclusion, $L$ is a complete lattice, the least upper bound (resp. greatest lower bound) of a family $(S_i)_{i \in I}$ of subsets of $X$ being $\bigcup_{i \in I} S_i$ (resp. $\bigcap_{i \in I} S_i$). Since $(L, \subset)$ has the empty set $\varnothing$ as
a least element, if $F : \mathcal{P}(X) \to \mathcal{P}(X)$ is an order-preserving map, there exists some $M \in \mathcal{P}(X)$ such that $F(M) = M$.

Theorem 1.2 (Cantor-Bernstein-Schröder) If there exist injective maps $f : X \to Y$ and $g : Y \to X$ between the sets $X$ and $Y$, then there exists a bijective map $h : X \to Y$.

Proof As in the preceding example, let $L := \mathcal{P}(X)$ be the power set of $X$. Let us define $F : \mathcal{P}(X) \to \mathcal{P}(X)$ by
$$\forall A \in \mathcal{P}(X) \qquad F(A) := X \setminus g(Y \setminus f(A)),$$
where for two subsets $B$, $C$ of a set $Z$ one writes $B \setminus C := \{b \in B : b \notin C\}$. It is easy to show that $F$ is homotone. Using the preceding example, we conclude that there exists some subset $M$ of $X$ such that $F(M) = M$. This relation means that $g(Y \setminus f(M)) = X \setminus M$. Let us define $h : X \to Y$ by $h(x) := f(x)$ if $x \in M$, $h(x) := g^{-1}(x)$ if $x \in X \setminus M$. Given $x \ne x'$ in $X$ we can show that it is impossible that $h(x) = h(x')$ by considering separately the cases $x, x' \in M$, $x, x' \in X \setminus M$ and $x \in M$, $x' \in X \setminus M$ (in the latter case we have $h(x) \in f(M)$ and $h(x') \in Y \setminus f(M)$). On the other hand, for all $y \in Y$, we have either $y \in f(M) = h(M)$ or $y \in Y \setminus f(M)$ and then $y = h(x)$ for $x := g(y)$. Thus $h$ is a bijection of $X$ onto $Y$. $\square$

The preceding theorem opens the way to comparison of the “size” of sets. One says that two sets $X$ and $Y$ are equipotent or have the same cardinality and one writes $\operatorname{card} X = \operatorname{card} Y$ if there exists a bijection from $X$ onto $Y$. This defines an equivalence relation between sets. If $X$ and $Y$ are finite sets, equipotence means that $X$ and $Y$ have the same number of elements. For infinite sets, equipotence may be more mysterious. For instance the set $\mathbb{Q}$ of rational numbers is equipotent to the set $\mathbb{N}$ of natural numbers and to the set $\mathbb{Z}$ of integers. More surprising is the fact that $\mathbb{R}$ and $\mathbb{R}^2$ are equipotent. Still some rules are more familiar and easy to prove. For instance, if $X$ and $X'$ are equipotent sets and if $Y$ and $Y'$ are equipotent sets, then $X \times Y$ and $X' \times Y'$ are equipotent. Also, the disjoint union of $X$ and $Y$ is equipotent to the disjoint union of $X'$ and $Y'$. However, the following comparison theorem, whose last assertion is a rephrasing of the Cantor-Bernstein-Schröder theorem, is not an obvious result.
Theorem 1.3 For any two sets $X$ and $Y$ at least one of the following two assertions holds: (a) $X$ is equipotent to a subset of $Y$; (b) $Y$ is equipotent to a subset of $X$. Moreover, if these assertions hold simultaneously, then $X$ and $Y$ are equipotent.

If $X$ is equipotent to a subset of $Y$ but is not equipotent to $Y$ itself, one writes $\operatorname{card} X < \operatorname{card} Y$. A set $X$ is said to be infinite if there exists a subset $X'$ of $X$ equipotent to $X$ and different from $X$. A set $X$ is said to be countable or denumerable if it is equipotent to $\mathbb{N}$ or a subset of $\mathbb{N}$. It is uncountable if it is not countable. The existence of different sorts of infinite sets is revealed by the following result.

Theorem 1.4 (Cantor) For any set $X$ one has $\operatorname{card} X < \operatorname{card} \mathcal{P}(X)$.

Proof Since the map that assigns to $x \in X$ the singleton $\{x\}$ is injective, it suffices to prove that there is no surjective map $f : X \to \mathcal{P}(X)$. If $f$ were such a map, setting $A := \{x \in X : x \notin f(x)\}$, we could find no $x \in X$ such that $f(x) = A$: if $x \in A$ we would have $x \notin f(x) = A$, and if $x \in X \setminus A$ we would have $x \in f(x) = A$, a contradiction in both cases. $\square$

Let us note that the Knaster-Tarski Theorem does not use the axiom of choice or its equivalent statements. This axiom asserts that for any nonempty set $X$ there exists a map $f : \mathcal{P}(X) \to X$ such that for any nonempty subset $A$ of $X$ one has $f(A) \in A$. Such an assertion seems to be plausible, as is the assertion that a product of nonempty subsets is nonempty. However, such assertions are equivalent to other statements such as Zermelo’s axiom and Zorn’s axiom which are not obvious. Zermelo’s axiom (or theorem) asserts that on any set $X$ one can introduce an order such that any nonempty subset of $X$ has a least element. Most mathematicians work in a framework accepting such assertions. Since Zorn’s axiom (or lemma) is the most useful statement for our purposes, let us give an account of it with the corresponding terminology. A subset $C$ of $(X, \le)$ that is totally ordered with respect to the induced preorder is called a chain.
A preorder $\le$ on $X$ is said to be upper inductive (resp. lower inductive) if any chain $C$ has an upper bound (resp. a lower bound). Recall that an element $\bar{x}$ of a preordered space $(X, \le)$ is said to be maximal if for any $x \in X$ such that $\bar{x} \le x$ one has $x \le \bar{x}$; it is called minimal if it is maximal for the reverse preorder. Zorn’s Lemma can be stated as follows.

Theorem 1.5 (Zorn’s Lemma or Zorn’s Axiom) Any preordered set whose preorder is upper (resp. lower) inductive has at least one maximal (resp. minimal) element.

As mentioned above, this statement is equivalent to a number of other axioms such as Zermelo’s axiom or well-ordering principle (a non-intuitive assertion) or the axiom of choice, which seems to be very natural. We shall not deal with such aspects of the foundations of mathematics.

Corollary 1.1 Let $(X, \le)$ be a preordered set whose preorder is upper (resp. lower) inductive. Then, for any $x_0 \in X$ there exists a maximal (resp. minimal) element $\bar{x}$ satisfying $x_0 \le \bar{x}$ (resp. $\bar{x} \le x_0$).
Proof Suppose $\le$ is lower inductive, the other case being settled by considering the opposite order. The set $X_0 := \{x \in X : x \le x_0\}$ is lower inductive: any chain $C$ in $X_0$ is a chain in $X$, and any lower bound in $X$ of a nonempty chain $C$ of $X_0$ belongs to $X_0$ by transitivity (while $x_0$ is a lower bound of the empty chain). Thus $X_0$ has a minimal element $\bar{x}$. Since any minimal element of $X_0$ is a minimal element of $X$, the corollary is established. $\square$

We need some other concepts, in particular for convergence questions. A map $f : H \to I$ between two preordered spaces is said to be filtering if for all $i \in I$ there exists an $h \in H$ such that $f(k) \ge i$ whenever $k \in H$ satisfies $k \ge h$. A preordered set $(I, \le)$ is said to be directed if any finite subset $F$ of $I$ has an upper bound. This occurs as soon as $I$ is nonempty and any pair of elements in $I$ has an upper bound. A subset $J$ of a preordered set $(I, \le)$ is said to be cofinal if for all $i \in I$ there exists some $j \in J$ such that $j \ge i$.
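The proof of Theorem 1.1 is constructive enough to be run on a small example. The following is a minimal sketch, assuming the lattice is the power set of a finite set ordered by inclusion; the successor map `succ`, the set `X` and the function names are hypothetical choices made for illustration only. It builds $D = \{A : A \subset f(A)\}$ by brute force and takes its union as the least upper bound $u$.

```python
from itertools import combinations


def powerset(base):
    """All subsets of `base`, as frozensets."""
    items = sorted(base)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def greatest_fixed_point(f, base):
    """Mimic the proof of Theorem 1.1 in (P(base), inclusion):
    u is the union (least upper bound) of D = {A : A is a subset of f(A)}."""
    D = [A for A in powerset(base) if A <= f(A)]
    return frozenset().union(*D)


# Hypothetical data: a partial successor map on six points.
succ = {0: 1, 1: 2, 2: 0, 3: 4, 4: 5}   # the point 5 has no successor
X = frozenset(range(6))


def f(A):
    """Order-preserving map on P(X): points whose successor lies in A."""
    return frozenset(x for x in X if succ.get(x) in A)


u = greatest_fixed_point(f, X)
fixed_points = [A for A in powerset(X) if f(A) == A]
```

Here the empty set plays the role of $z$ (it satisfies $\varnothing \subset f(\varnothing)$), and `u` contains every set listed in `fixed_points`, as the theorem predicts. The brute-force enumeration is only practical for very small sets.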
Exercises

1. Show that a subset $C$ of a preordered space $(X, \le)$ is a chain iff (if and only if) $C \times C \subset A \cup A^{-1}$, where $A := \{(x, y) : x \le y\}$, $A^{-1} := \{(x, y) : (y, x) \in A\}$.
2. Let $(I, \le)$ be a directed set. Show that if $J \subset I$ is not cofinal, then $I \setminus J$ is cofinal.
3. Let $(X, \le)$ be a preordered space. Verify that the relation $<$ defined by $x < y$ if $x \le y$ and not $y \le x$ is transitive.
4. Show that if a map $f : H \to I$ between two preordered spaces is a homotone bijection, if $(H, \le)$ is totally ordered and if $(I, \le)$ is ordered, then $f$ is isotone.
5. Show that a homotone map $f : H \to I$ between two preordered spaces is filtering iff $f(H)$ is cofinal.
6. Give an example of a subset $J$ of a preordered space $(I, \le)$ having more than one supremum. Verify that when $\le$ is an order, a subset of $I$ has at most one supremum.
7. Show that when a subset $J$ of a preordered space $(I, \le)$ has a greatest element $k$ then $k$ is a supremum of $J$ and for any supremum $s$ of $J$ one has $k \le s$ and $s \le k$. Note that when a supremum $s$ of $J$ belongs to $J$, then $s$ is a greatest element of $J$.
8. Let $(I, \le)$ and $(J, \le)$ be two ordered sets. Verify that the relation $(i, j) \le (i', j')$ if $i < i'$ (i.e. $i \le i'$ and $i \ne i'$) or if $i = i'$ and $j \le j'$ is an order relation (the lexicographic order) on $I \times J$. Show that this order is total if the orders on $I$ and $J$ are total.
9. Show that the union of a countable family of countable sets is countable. Deduce from this that the set $\mathcal{P}_f(\mathbb{N})$ of finite subsets of $\mathbb{N}$ is countable.
10. Let $(f_n)$ be a sequence of maps from $\mathbb{N}$ into $\mathbb{N}$. Let $f : \mathbb{N} \to \mathbb{N}$ be defined by $f(n) = f_n(n) + 1$ for $n \in \mathbb{N}$. Show that there is no $k \in \mathbb{N}$ such that $f = f_k$. Deduce from this that the set of maps from $\mathbb{N}$ into $\mathbb{N}$ is uncountable.
11. Show that the set $\mathbb{Q}$ of rational numbers is countable.
12. Let $E$ be an infinite set and let $F$ be a finite subset of $E$. Show that $E$ and $E \setminus F$ are equipotent. Prove the same conclusion when $F$ is countable and $E \setminus F$ is infinite, admitting that any infinite set contains an infinite countable subset.
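13. Deduce from the preceding exercise that the set $\mathbb{R}$ of real numbers is equipotent to $[0, 1]$.
14. Let $X$ be a set and let $\mathcal{S} \subset \mathcal{P}(X)$. Consider $f : X \times \mathcal{P}(X) \to \mathbb{R}$ such that $f(x, S) = \min_{i \in I} f(x, S_i)$ for all $x \in X$ and all families $(S_i)_{i \in I}$ in $\mathcal{S}$ whose union is $S \in \mathcal{S}$. Here $\min_{i \in I} f(x, S_i)$ means that there exists some $j \in I$ such that $f(x, S_j) \le f(x, S_i)$ for all $i \in I$. Given $S \in \mathcal{P}(X)$ let
$$W_S(\mathcal{S}) := \{x \in X : f(x, S) \le f(x, S') \ \ \forall S' \in \mathcal{S}\}.$$
Show that for any family $(S_i)_{i \in I}$ in $\mathcal{S}$ whose union is $S \in \mathcal{S}$ one has
$$W_S(\mathcal{S}) = \bigcup_{i \in I} W_{S_i}(\mathcal{S}).$$
Given a family $(\mathcal{S}_\alpha)_{\alpha \in A}$ of subsets of $\mathcal{P}(X)$ and $\mathcal{S} := \bigcup_{\alpha \in A} \mathcal{S}_\alpha$, verify that
$$W_S(\mathcal{S}) = \bigcap_{\alpha \in A} W_S(\mathcal{S}_\alpha).$$
Give an interpretation when $f(x, S) := \min\{d(x, y) : y \in S\}$ where $d : X \times X \to \mathbb{R}_+$ is some function. If $d$ is the distance on the surface of the globe and $\mathcal{S}$ is the family of states, $W_S(\mathcal{S})$ can be considered as the territorial waters of $S$.
15. Given ordered sets $\mathcal{S}$, $\mathcal{T}$, a map $D : \mathcal{S} \to \mathcal{T}$ is called a duality if for any family $(S_i)_{i \in I}$ in $\mathcal{S}$ having an infimum $\bigwedge_{i \in I} S_i$ the family $(D(S_i))_{i \in I}$ has a supremum $\bigvee_{i \in I} D(S_i)$ and if
$$D\Big(\bigwedge_{i \in I} S_i\Big) = \bigvee_{i \in I} D(S_i).$$
Suppose $\mathcal{S}$ is a complete inf-lattice in the sense that any nonempty family in $\mathcal{S}$ has an infimum. Show that for any antitone map $D : \mathcal{S} \to \mathcal{T}$ there exists a greatest antitone map $D' : \mathcal{T} \to \mathcal{S}$ such that $D'(D(S)) \le S$ for all $S \in \mathcal{S}$ and that $D'$ is given by
$$D'(T) := \bigwedge \{S \in \mathcal{S} : D(S) \le T\}.$$
Show that $D'$ is a duality when $D$ is a duality and that $D(D'(T)) \le T$ for all $T \in \mathcal{T}$.
16. Given sets $X$, $Y$, $\mathcal{S} \subset \mathcal{P}(X)$, $\mathcal{T} \subset \mathcal{P}(Y)$ endowed with the order induced by the inclusion and such that $\mathcal{S}$ is a complete sup-lattice, a map $P : \mathcal{S} \to \mathcal{T}$ is called a polarity if for any family $(S_i)_{i \in I}$ in $\mathcal{S}$ having a supremum in $\mathcal{S}$ one has
$$P\Big(\bigvee_{i \in I} S_i\Big) = \bigcap_{i \in I} P(S_i).$$
Show that there exists a smallest antitone map $P' : \mathcal{T} \to \mathcal{S}$ such that $P'(P(S)) \supset S$ for all $S \in \mathcal{S}$ and that $P'$ is given by
$$P'(T) := \bigvee \{S \in \mathcal{S} : T \subset P(S)\}.$$
Show that when $\mathcal{S} = \mathcal{P}(X)$ one has $P'(T) := \{x \in X : T \subset P(\{x\})\}$.
17. Given sets $X$, $Y$, a point $s$ of $Y$ and $f : X \times Y \to \mathbb{R}_+$, for a subset $T$ of $Y$ let $V_s(T) := \{x \in X : f(x, s) \le f(x, t) \ \forall t \in T\}$ with $V_s(\varnothing) := X$. The set $V_s(T)$ is called the Voronoi cell associated with $s$ and $T$. Show that $V_s : \mathcal{P}(Y) \to \mathcal{P}(X)$ is a polarity on the set $\mathcal{P}(Y)$ of subsets of $Y$, i.e. for any family $(T_i)_{i \in I}$ of $\mathcal{P}(Y)$ one has
$$V_s\Big(\bigcup_{i \in I} T_i\Big) = \bigcap_{i \in I} V_s(T_i).$$
Assume that $\mathcal{C}$ is a subset of $\mathcal{P}(Y)$ containing $Y$ and the singletons of $Y$ and that $\mathcal{C}$ is stable under intersections. Prove that for all $B \in \mathcal{P}(Y)$ there exists a smallest subset $\operatorname{clo}(B)$ in $\mathcal{C}$ containing $B$. Let $\mathcal{V}$ be a subset of $\mathcal{P}(X)$ such that $V_s(C) \in \mathcal{V}$ for all $C \in \mathcal{C}$. Verify that the restriction of $V_s$ to $\mathcal{C}$ and $\mathcal{V}$ is still a polarity from $\mathcal{C}$ into $\mathcal{V}$ in the sense that
$$V_s(C) = \bigcap_{i \in I} V_s(C_i) \quad \text{for} \quad C := \bigvee_{i \in I} C_i := \operatorname{clo}\Big(\bigcup_{i \in I} C_i\Big).$$
Describe the polarity $V_s' : \mathcal{V} \to \mathcal{C}$ associated with $V_s$ as in the preceding exercise.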
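The Voronoi cells of Exercise 17 are easy to experiment with on finite data. The sketch below is only an illustration, not a solution of the exercise: the grid `X`, the sites `Y` and the squared Euclidean distance `f` are assumptions chosen for the example. It computes $V_s(T)$ directly from the definition and compares $V_s(T_1 \cup T_2)$ with $V_s(T_1) \cap V_s(T_2)$ on one instance.

```python
from itertools import product


def voronoi_cell(f, X, s, T):
    """V_s(T) = {x in X : f(x, s) <= f(x, t) for all t in T}.
    For empty T the condition is vacuous, so V_s(empty set) = X."""
    return {x for x in X if all(f(x, s) <= f(x, t) for t in T)}


# Hypothetical data: a 5 x 5 grid of points and four sites.
X = set(product(range(5), repeat=2))
Y = [(0, 0), (4, 0), (2, 4), (4, 4)]


def f(x, y):
    """Squared Euclidean distance, a nonnegative function on X x Y."""
    return (x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2


s = Y[0]
T1 = {Y[1]}
T2 = {Y[2], Y[3]}

lhs = voronoi_cell(f, X, s, T1 | T2)
rhs = voronoi_cell(f, X, s, T1) & voronoi_cell(f, X, s, T2)
```

On this instance `lhs` and `rhs` coincide, which is what the polarity property states; a numerical check of this kind does not replace the proof asked for in the exercise.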
1.2 Convergence and Summability in $\mathbb{R}$

We assume the reader is familiar with the usual properties of $\mathbb{R}$. On the other hand, it may be useful to review some convergence properties generalizing the convergence of sequences and series. A general approach will be presented later on. A net of real numbers comprises the data of a directed set $(I, \le)$ and of a family $(x_i)_{i \in I}$ of elements of $\mathbb{R}$. One says that a net $(x_i)_{i \in I}$ converges to $x \in \mathbb{R}$, and one writes $(x_i)_{i \in I} \to x$, if for every $\varepsilon > 0$ there exists some $i_\varepsilon \in I$ such that $x - \varepsilon < x_i < x + \varepsilon$ for all $i \ge i_\varepsilon$. One also writes $x = \lim_{i \in I} x_i$ or just $x = \lim_i x_i$. The limit is unique: if $(x_i)_{i \in I} \to x$ and $(x_i)_{i \in I} \to y$ with $x < y$, taking $\varepsilon := (y - x)/2$, since $I$ is directed we can find some $k \in I$ such that for $i \ge k$ we have both $x_i < x + \varepsilon$ and $x_i > y - \varepsilon$, an impossibility. It is easy to see that if $(x_i)_{i \in I} \to x$ and $(y_i)_{i \in I} \to y$, then one has $(x_i + y_i)_{i \in I} \to x + y$. Also, for all $r \in \mathbb{R}$ one has $(r x_i)_{i \in I} \to r x$. The next sufficient condition for convergence is similar to a well known condition for sequences. Taking nets instead of sequences is useful to obtain a sound understanding of some matters such as the Riemann integral. Here we say that $(x_i)_{i \in I}$ is an increasing net if for $i \le j$ in $I$ we have $x_i \le x_j$.

Theorem 1.6 Let $(x_i)_{i \in I}$ be an increasing net of real numbers that is bounded above. Then $(x_i)_{i \in I} \to x := \sup_{i \in I} x_i$.

Proof Given $\varepsilon > 0$ we can find some $h \in I$ such that $x_h > x - \varepsilon$. Then, for $i \ge h$ we have $x_i \ge x_h > x - \varepsilon$ since $(x_i)_{i \in I}$ is increasing, and of course, $x_i < x + \varepsilon$. $\square$

Consequently, a bounded below decreasing net $(x_i)_{i \in I}$ converges to $\inf_{i \in I} x_i$. A general convergence criterion is the Cauchy criterion. It concerns Cauchy nets, i.e. nets $(x_i)_{i \in I}$ having the property that for every $\varepsilon > 0$ there exists some $h \in I$ such that $|x_i - x_j| \le \varepsilon$ whenever $i, j \in I$ satisfy $i \ge h$, $j \ge h$. A convergent net is a Cauchy net. The converse is interesting because it enables us to assert that a net converges without knowing its limit.

Theorem 1.7 (Cauchy Criterion) Any Cauchy net in $\mathbb{R}$ converges.

Proof Let $(x_i)_{i \in I}$ be a Cauchy net. For $i \in I$ let $a_i := \sup_{j \ge i} x_j$, $b_i := \inf_{j \ge i} x_j$; they are finite for $i$ large enough. Then $(a_i)_{i \in I}$ is decreasing and $(b_i)_{i \in I}$ is increasing. Moreover, for all $\varepsilon > 0$ there exists some $h_\varepsilon \in I$ such that $|a_i - b_i| \le \varepsilon$ for all $i \in I$ satisfying $i \ge h_\varepsilon$. Then, for $i \ge h_1$ in $I$ we have $a_i \ge b_{h_1} - 1$ and $b_i \le a_{h_1} + 1$. Thus the nets $(a_i)_{i \in I}$ and $(b_i)_{i \in I}$ are convergent. Their respective limits $a$ and $b$ satisfy $a \ge b$ since $a_i \ge b_i$ for all $i \in I$, and for every $\varepsilon > 0$ we can find $h_\varepsilon \in I$ such that for $i \ge h_\varepsilon$ we have $a \le a_i \le b_i + \varepsilon \le b + \varepsilon$. Thus $a = b$ and for $i \ge h_\varepsilon$ we have $x_i \le a_i \le b + \varepsilon$ and $x_i \ge b_i \ge a - \varepsilon$, so that $(x_i) \to \ell := a = b$. $\square$

Let us turn to summability questions.
Let $(r_t)_{t \in T}$ be an arbitrary family of real numbers, $T$ being an arbitrary set. For $J$ in the family $\mathcal{J}$ of finite subsets of $T$ let $s_J := \sum_{j \in J} r_j$. The set $\mathcal{J}$ ordered by inclusion is directed (for $J', J'' \in \mathcal{J}$ the set $J := J' \cup J''$ is greater than or equal to $J'$ and $J''$). This observation gives a meaning to the following definition.

Definition 1.1 A family $(r_t)_{t \in T}$ of real numbers is said to be summable if the net $(s_J)_{J \in \mathcal{J}}$ of finite sums converges to some $s \in \mathbb{R}$ called the sum of the family $(r_t)_{t \in T}$. One writes $s = \sum_{t \in T} r_t$.

Clearly, if all but a finite number of members $r_t$ of the family $(r_t)_{t \in T}$ are null, then the family $(r_t)_{t \in T}$ is summable. Theorem 1.6 ensures that if $(r_t)_{t \in T}$ is a family of nonnegative numbers and if there is some $c \in \mathbb{R}$ such that $s_J \le c$ for all $J \in \mathcal{J}$, then the family $(r_t)_{t \in T}$ is summable and $\sum_{t \in T} r_t = \sup_{J \in \mathcal{J}} s_J$. Another criterion is as follows.
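To make the net of finite sums concrete, here is a small sketch under stated assumptions: the family is $r_t := 2^{-t}$ for $t \in \mathbb{N}$ (an example chosen here, not taken from the text), whose sum is $2$, and exact rational arithmetic is used. For $\varepsilon := 2^{-n}$ the finite set $H := \{0, \dots, n\}$ serves as the index $i_\varepsilon$ of the definition of convergence: every finite $J \supset H$ satisfies $2 - \varepsilon \le s_J < 2$. The code samples random finite sets $J$ containing $H$ and checks this inequality.

```python
from fractions import Fraction
import random


def r(t):
    """Hypothetical family r_t = 2^(-t), t = 0, 1, 2, ..."""
    return Fraction(1, 2 ** t)


def finite_sum(J):
    """s_J: the sum of r_t over the finite index set J."""
    return sum((r(t) for t in J), Fraction(0))


def check_tail_bound(n, trials=200, seed=0):
    """Check 2 - 2^(-n) <= s_J < 2 for random finite J containing {0, ..., n}."""
    rng = random.Random(seed)
    H = set(range(n + 1))
    eps = Fraction(1, 2 ** n)
    for _ in range(trials):
        extra = set(rng.sample(range(n + 1, n + 60), rng.randint(0, 20)))
        J = H | extra
        if not (2 - eps <= finite_sum(J) < 2):
            return False
    return True
```

Random sampling only probes finitely many members of the net, so this is an illustration of the definition rather than a verification of summability.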
Proposition 1.1 (Cauchy Summability Criterion) A family $(r_t)_{t \in T}$ of real numbers is summable whenever it satisfies the condition: for all $\varepsilon > 0$ there exists a finite subset $H_\varepsilon$ of $T$ such that for any finite subset $F$ of $T$ contained in $T \setminus H_\varepsilon$ one has $|s_F| \le \varepsilon$.

Proof This follows from the fact that the net $(s_J)_{J \in \mathcal{J}}$ satisfies the Cauchy criterion if the family $(r_t)_{t \in T}$ satisfies the Cauchy summability criterion: given $\varepsilon > 0$, let $H_\varepsilon \in \mathcal{J}$ be such that $|s_F| \le \varepsilon$ for any $F \in \mathcal{J}$ contained in $T \setminus H_\varepsilon$; then, for $J, K \in \mathcal{J}$ containing $H_\varepsilon$, one has $|s_J - s_K| \le |s_{J \setminus K}| + |s_{K \setminus J}| \le 2\varepsilon$ since $J \setminus K$ and $K \setminus J$ are contained in $T \setminus H_\varepsilon$. $\square$

Exercise Prove the converse: if the family $(r_t)_{t \in T}$ is summable, then it satisfies the Cauchy summability criterion.

Corollary 1.2 Let $(r_t)_{t \in T}$ be a summable family of real numbers. Then, for any subset $T'$ of $T$ the family $(r_t)_{t \in T'}$ is summable.

Corollary 1.3 If $(r_t)_{t \in T}$ is a summable family of real numbers, then the set $T'$ of $t \in T$ such that $r_t \ne 0$ is countable.

Proof The converse of the Cauchy summability criterion ensures that for all $n \in \mathbb{N}$ there exists a finite subset $H_n$ of $T$ such that for all $t \in T \setminus H_n$ one has $|r_t| \le 1/(n+1)$ (take $F := \{t\}$). Thus $T'$ is contained in $\bigcup_n H_n$, hence is at most countable. $\square$

Let us say that a family $(r_t)_{t \in T}$ of real numbers is absolutely summable if the family $(|r_t|)_{t \in T}$ is summable. The following result is surprising, as it differs from a well known fact for series.

Proposition 1.2 A family $(r_t)_{t \in T}$ of real numbers is absolutely summable if and only if it is summable.

Proof If $(r_t)_{t \in T}$ is absolutely summable, for every $\varepsilon > 0$ we can find a finite subset $H_\varepsilon$ of $T$ such that for any finite subset $F$ of $T \setminus H_\varepsilon$ one has $\sum_{t \in F} |r_t| < \varepsilon$. Then $|\sum_{t \in F} r_t| < \varepsilon$, and $(r_t)_{t \in T}$ satisfies the Cauchy summability condition, hence is summable. Conversely, suppose $(r_t)_{t \in T}$ is summable. Let $T_+ := \{t \in T : r_t \ge 0\}$, $T_- := T \setminus T_+$. Setting $r_t^+ := \max(r_t, 0)$, $r_t^- := \max(-r_t, 0)$ we see, using Corollary 1.2, that the family $(r_t^+)_{t \in T}$ is summable and its sum is the sum of the family $(r_t)_{t \in T_+}$. Similarly, the family $(r_t^-)_{t \in T}$ is summable and its sum is the sum of the family $(-r_t)_{t \in T_-}$. Then the family $(|r_t|)_{t \in T} = (r_t^+ + r_t^-)_{t \in T}$ is summable and its sum is the sum of $\sum_{t \in T} r_t^+$ and of $\sum_{t \in T} r_t^-$. $\square$

Corollary 1.4 A countable family $(r_n)_{n \in \mathbb{N}}$ of real numbers is summable iff (if and only if) the series $\sum r_n$ is absolutely convergent.

Proof Saying that the series $\sum r_n$ is absolutely convergent means that the partial sums $\sum_{0 \le k \le n} |r_k|$ converge as $n \to +\infty$. Then the finite sums $\sum_{n \in J} |r_n|$ are bounded above. Thus the family $(r_n)_{n \in \mathbb{N}}$ is absolutely summable, hence summable. Conversely, if the family is summable, it is absolutely summable by Proposition 1.2, so that the increasing sequence of partial sums $\sum_{0 \le k \le n} |r_k|$ is bounded above, hence converges. $\square$
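Corollary 1.4 explains why a convergent but not absolutely convergent series does not define a summable family: its value depends on the order of the terms. The sketch below illustrates this numerically with the alternating harmonic series, an example chosen here and not taken from the text. It relies on the classical facts that the natural order gives $\ln 2$ and that the rearrangement "two positive terms, then one negative term" gives $\tfrac{3}{2} \ln 2$; the function names and truncation sizes are arbitrary.

```python
import math


def natural_order(n_terms):
    """Partial sum 1 - 1/2 + 1/3 - 1/4 + ... with n_terms terms."""
    return sum((-1) ** (k + 1) / k for k in range(1, n_terms + 1))


def rearranged(n_blocks):
    """Same terms, grouped as two positive terms followed by one negative:
    1 + 1/3 - 1/2 + 1/5 + 1/7 - 1/4 + ..."""
    total = 0.0
    for m in range(1, n_blocks + 1):
        total += 1 / (4 * m - 3) + 1 / (4 * m - 1) - 1 / (2 * m)
    return total


limit_natural = math.log(2)
limit_rearranged = 1.5 * math.log(2)
```

Both partial sums use every term of the series exactly once in the limit, yet they approach different values, so no single number can serve as the sum of the family $((-1)^{n+1}/n)_{n \ge 1}$ in the sense of Definition 1.1.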
In general, since there is no order on the set $T$ one can say that the summability of a family $(r_t)_{t \in T}$ is a commutative property in the sense that if $f : T' \to T$ is a bijection, and if $r'_{t'} := r_{f(t')}$, the family $(r_t)_{t \in T}$ is summable if and only if the family $(r'_{t'})_{t' \in T'}$ is summable. Let us present some properties.

Proposition 1.3 If $c \in \mathbb{R}$ and if $(r_t)_{t \in T}$ and $(r'_t)_{t \in T}$ are summable families of real numbers, then $(c r_t + r'_t)_{t \in T}$ is summable and its sum is $c \sum_{t \in T} r_t + \sum_{t \in T} r'_t$.

Proof The result stems from the fact that for any finite subset $J$ of $T$ one has $\sum_{t \in J} (c r_t + r'_t) = c \sum_{t \in J} r_t + \sum_{t \in J} r'_t$. $\square$

We also have an associativity property that enables us to sum by gathering bunches.

Theorem 1.8 Let $(T_a)_{a \in A}$ be a partition of $T$ in the sense that the subsets $T_a$ of $T$ are mutually disjoint and such that $T = \bigcup_{a \in A} T_a$. If $(r_t)_{t \in T}$ is a summable family of real numbers, then for all $a \in A$ the family $(r_t)_{t \in T_a}$ is summable and if $s_a$ denotes its sum the family $(s_a)_{a \in A}$ is summable and its sum is the sum $s$ of the family $(r_t)_{t \in T}$:
$$\sum_{a \in A} \sum_{t \in T_a} r_t = \sum_{t \in T} r_t.$$

Proof We already know that for all $a \in A$ the family $(r_t)_{t \in T_a}$ is summable. Let us prove the other two assertions. For $a \in A$, let us denote by $\mathcal{J}_a$ (resp. $\mathcal{J}$) the family of finite subsets of $T_a$ (resp. $T$). Given $\varepsilon > 0$ let $K \in \mathcal{J}$ be such that for all $J \in \mathcal{J}$ containing $K$ one has $|s_J - s| \le \varepsilon$. Let $C := \{a \in A : K \cap T_a \ne \varnothing\}$, so that $K = \bigcup_{c \in C} K \cap T_c$ and $C$ is finite. Let $B$ be a finite subset of $A$ containing $C$. Let $n$ be the number of elements of $B$. For each $b \in B$ we can find some $J_b \in \mathcal{J}_b$ containing $K \cap T_b$ such that $|s_{J_b} - s_b| \le \varepsilon / n$. Then the set $J := \bigcup_{b \in B} J_b$ contains $K$ and since the sets $J_b$ ($b \in B$) are disjoint, we have $\sum_{b \in B} s_{J_b} = \sum_{j \in J} r_j$,
$$\Big| \sum_{b \in B} s_{J_b} - s \Big| = \Big| \sum_{j \in J} r_j - s \Big| \le \varepsilon, \qquad \Big| \sum_{b \in B} s_{J_b} - \sum_{b \in B} s_b \Big| \le \varepsilon,$$
hence
$$\Big| \sum_{b \in B} s_b - s \Big| \le 2\varepsilon.$$
Since $\varepsilon > 0$ is arbitrarily small, this shows that the family $(s_a)_{a \in A}$ is summable with sum $s$. $\square$
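The identity of Theorem 1.8 can be checked exactly on a family with finite support, which is summable by the remark following Definition 1.1. The sketch below is such a check under stated assumptions: the index set `T`, the values `r` and the two partitions (by first coordinate and by anti-diagonal) are hypothetical choices, and rational arithmetic avoids rounding issues.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical finite family indexed by pairs (i, j).
T = [(i, j) for i in range(4) for j in range(5)]
r = {(i, j): Fraction((-1) ** (i + j), 2 ** i * 3 ** j) for (i, j) in T}


def packet_sums(family, key):
    """Sums s_a over the blocks T_a = {t : key(t) = a} of a partition of T."""
    sums = defaultdict(Fraction)          # Fraction() is 0
    for t, value in family.items():
        sums[key(t)] += value
    return dict(sums)


total = sum(r.values(), Fraction(0))
by_row = packet_sums(r, key=lambda t: t[0])               # blocks T_i = {i} x J
by_diagonal = packet_sums(r, key=lambda t: t[0] + t[1])   # blocks i + j = const
```

Summing the values of `by_row` or of `by_diagonal` returns `total` in both cases. For a finite family this is just associativity and commutativity of addition; the content of the theorem is that the same regrouping remains valid for infinite summable families.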
Exercises

1. Let $(T_a)_{a \in A}$ be a partition of a set $T$. For each $a \in A$ let $(r_t)_{t \in T_a}$ be a summable family of nonnegative real numbers, with sum $s_a$. Suppose that the family $(s_a)_{a \in A}$ is summable with sum $s$. Show that the family $(r_t)_{t \in T}$ is summable and its sum is $s$.
2. Let $I$ and $J$ be two sets and let $(a_i)_{i \in I}$, $(b_j)_{j \in J}$ be two (absolutely) summable families of real numbers. Show that the family $(a_i b_j)_{(i,j) \in I \times J}$ is summable and its sum is $(\sum_{i \in I} a_i)(\sum_{j \in J} b_j)$.
3. Let $b$ be an integer greater than 1.
(a) Verify that for any sequence $(x_n)$ of nonnegative integers less than $b$ the series $\sum_{n \ge 0} x_n b^{-(n+1)}$ converges to some $s \in [0, 1]$.
(b) Conversely, show that any $s \in [0, 1]$ is the sum of such a series and that this series is unique provided $s$ does not belong to the set of numbers of the form $k b^{-m}$ for some $k, m \in \mathbb{N}$. In the latter case there are exactly two such series with sum $s$.
(c) Taking $b = 2$ in what precedes, prove that $[0, 1]$ is equipotent to $\mathcal{P}(\mathbb{N})$.
(d) Using Theorem 1.4 and Exercise 13 of Sect. 1.1 prove that $\mathbb{R}$ is uncountable.
1.3 Maps and Multimaps (Relations)

This section forms an introduction to multivalued analysis, which occupies an increasingly important position in analysis. Mathematics is usually viewed as a precise field in which there is no ambiguity. That is not the case. All mathematicians use some abuses of notation or abuses of terminology. This is not a severe weakness as long as these abuses are well recognized and mastered. However, it is preferable to limit their use to specific cases in which a more precise notation or terminology would be too heavy or cumbersome. For instance, given a map $f : X \to Y$ between two sets and $y \in Y$, one often writes $f^{-1}(y)$ instead of $f^{-1}(\{y\})$, where, for $B \subset Y$ one sets $f^{-1}(B) := \{x \in X : f(x) \in B\}$. Such an abuse can hardly lead to mistakes. However, some common abuses may lead to misunderstandings or mistakes and we encourage the reader to avoid them. For instance, we recommend denoting a map $f : X \to Y$ between two sets by $f$ or $f(\cdot)$ rather than by $f(x)$. Just as $f$ cannot be identified with one of its values, it cannot be identified with its image, i.e. the set $f(X)$ of its values: two different maps $f, g : X \to Y$ may have the same image. For this reason, we avoid the notation $\{x_n\}$ for a sequence in a set $X$, i.e. a map $s : \mathbb{N} \to X$, with $x_n := s(n)$. Let us recall that a sequence in a set $X$ is a map $s$ from $\mathbb{N}$ to $X$, hence is an element of $X^{\mathbb{N}}$. While the notations $(x_n)_{n \in \mathbb{N}}$, $(x_n)_{n \ge 0}$ or just $(x_n)$ are unambiguous, the notations $x_n$, $\{x_n : n \in \mathbb{N}\}$ or $\{x_n\}$ should be avoided.
It may be of interest to recall that the correspondence $y \mapsto f^{-1}(y)$ (called the inverse image) is not a map from $Y$ into $X$ but a multimap or multifunction, i.e. a map from $Y$ into the set $\mathcal{P}(X)$ of subsets of $X$. It enjoys nice properties: for any family $(B_i)_{i \in I}$ of subsets of $Y$ one has
$$f^{-1}\Big(\bigcup_{i \in I} B_i\Big) = \bigcup_{i \in I} f^{-1}(B_i), \qquad f^{-1}\Big(\bigcap_{i \in I} B_i\Big) = \bigcap_{i \in I} f^{-1}(B_i)$$
and, if $B \subset B' \in \mathcal{P}(Y)$, $f^{-1}(B) \subset f^{-1}(B')$, $f^{-1}(B' \setminus B) = f^{-1}(B') \setminus f^{-1}(B)$, where $B' \setminus B := \{y \in B' : y \notin B\}$ is the complement of $B$ in $B'$. For direct images, given a family $(A_i)_{i \in I}$ of subsets of $X$ one just has
$$f\Big(\bigcup_{i \in I} A_i\Big) = \bigcup_{i \in I} f(A_i), \qquad f\Big(\bigcap_{i \in I} A_i\Big) \subset \bigcap_{i \in I} f(A_i)$$
as $f(A) \subset f(A')$ when $A \subset A'$. If $F : X \to \mathcal{P}(Y)$ is a multimap, one defines direct and inverse images of subsets by
$$F(A) := \bigcup_{a \in A} F(a), \qquad F^{-1}(B) := \{x \in X : F(x) \cap B \ne \varnothing\}$$
for $A \in \mathcal{P}(X)$, $B \in \mathcal{P}(Y)$. Note that $F^{-1}$ appears as a multimap $F^{-1} : Y \to \mathcal{P}(X)$ given, for $y \in Y$, by $F^{-1}(y) := F^{-1}(\{y\}) = \{x \in X : y \in F(x)\}$. This notation is compatible with the one we used for inverse images, considering a map $f : X \to Y$ as a multimap $F$ whose values are the singletons $F(x) := \{f(x)\}$. In order to underline the analogy with maps, a multimap $F : X \to \mathcal{P}(Y)$ is often denoted by $F : X \rightrightarrows Y$. Like maps, multimaps can be composed: given multimaps $F : X \rightrightarrows Y$, $G : Y \rightrightarrows Z$, the composition of $F$ and $G$ is the multimap $G \circ F : X \rightrightarrows Z$ given by
$$(G \circ F)(x) := G(F(x)),$$
where $G(B)$, for $B := F(x)$, is defined as above. Then one has the associativity rule $H \circ (G \circ F) = (H \circ G) \circ F$. It is often convenient to associate to a multimap $F : X \rightrightarrows Y$ its graph
$$\operatorname{gph}(F) := \{(x, y) \in X \times Y : y \in F(x)\}.$$
This subset of $X \times Y$ (also denoted by $G(F)$ when no confusion may arise) characterizes $F$ since
$$F(x) = \{y \in Y : (x, y) \in G(F)\}. \tag{1.1}$$
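For finite sets these definitions translate directly into code. The following is a minimal sketch, assuming a multimap $F : X \rightrightarrows Y$ is stored as a dictionary sending each point of $X$ to the set $F(x)$; the helper names and the sample multimaps `F`, `G` are illustrative assumptions, not notation from the text.

```python
def image(F, A):
    """Direct image F(A): the union of the values F(a) for a in A."""
    return set().union(*(F.get(a, set()) for a in A))


def inverse_image(F, B):
    """Inverse image F^{-1}(B) = {x : F(x) meets B}."""
    return {x for x, values in F.items() if values & set(B)}


def compose(G, F):
    """Composition (G o F)(x) = G(F(x))."""
    return {x: image(G, values) for x, values in F.items()}


def graph(F):
    """gph(F) = {(x, y) : y in F(x)}."""
    return {(x, y) for x, values in F.items() for y in values}


def from_graph(pairs, domain):
    """Recover a multimap on `domain` from a set of pairs, as in (1.1)."""
    F = {x: set() for x in domain}
    for x, y in pairs:
        F[x].add(y)
    return F


# Hypothetical multimaps; F(3) is empty, so 3 is not in dom F.
F = {1: {"a", "b"}, 2: {"b"}, 3: set()}
G = {"a": {10}, "b": {10, 20}, "c": {30}}

GF = compose(G, F)
```

Passing the underlying set $X$ explicitly to `from_graph` matters because points where $F(x)$ is empty leave no trace in the graph.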
Conversely, to any subset $G$ of $X \times Y$ one can associate a multimap $F : X \rightrightarrows Y$ by setting $F(x) = \{y \in Y : (x, y) \in G\}$, so that $G$ is the graph of $F$. Moreover, when $G$ is the graph $G(M)$ of some multimap $M$, one gets $F = M$ via this reverse process. Thus, there is a one-to-one correspondence between subsets of $X \times Y$ and multimaps from $X$ into $Y$. This correspondence is simpler than the correspondence between maps and their graphs, since in the latter correspondence one has to consider only subsets $G$ whose vertical slices $G \cap (\{x\} \times Y)$ (for $x \in X$) are singletons. In view of this one-to-one correspondence between a multimap and its graph, it is often convenient to identify a multimap with its graph and to say that a multimap has a property P if its graph has this property (such as closedness, convexity...). This viewpoint is often fruitful and without any important risk of confusion; however, when $X$ and $Y$ are endowed with some operation $\ast$ one has to be aware that $F \ast F'$ usually denotes the multimap $x \mapsto F(x) \ast F'(x)$ and not the multimap whose graph is $G(F) \ast G(F')$. Moreover, one has to be careful with the order of the terms in the product $X \times Y$ since they determine the direction of the multimap. When $X$ is a product $X = X_1 \times X_2$ one has to be precise when one associates to a subset $G$ of $X \times Y$ a multimap, as a partial multimap also can be defined in this way. Note that if $F : X \rightrightarrows Y$ is a multimap and if $A$ (resp. $B$) is a subset of $X$ (resp. $Y$) one has
$$F(A) = p_Y(\operatorname{gph}(F) \cap (A \times Y)), \qquad F^{-1}(B) = p_X(\operatorname{gph}(F) \cap (X \times B)),$$
where $p_X : X \times Y \to X$ and $p_Y : X \times Y \to Y$ are the so-called canonical projections defined by $p_X(x, y) := x$ and $p_Y(x, y) := y$. We also observe that
$$\operatorname{gph}(F^{-1}) = (\operatorname{gph}(F))^{-1},$$
where for $G \subset X \times Y$ one sets $G^{-1} := \{(y, x) \in Y \times X : (x, y) \in G\}$. Let us note that here we do not leave the realm of multimaps, whereas the inverse of a mapping is in general a multimap, not a map (and this fact is a source of many mistakes for beginners in mathematics). The domain $\operatorname{dom} F$ or $D(F)$ of a multimap $F : X \rightrightarrows Y$ is given by
$$\operatorname{dom} F := D(F) := \{x \in X : F(x) \ne \varnothing\}.$$
It is also the range or image $R(F^{-1}) := \operatorname{Im} F^{-1} := F^{-1}(Y)$ of $F^{-1}$. Conversely, $\operatorname{dom} F^{-1}$ is the image of $F$: the roles of $F$ and $F^{-1}$ are fully symmetric and $F = (F^{-1})^{-1}$. For any subsets $A$, $A'$ of $X$ (resp. $B$, $B'$ of $Y$) one has $F(A \cup A') = F(A) \cup F(A')$ and $F^{-1}(B \cup B') = F^{-1}(B) \cup F^{-1}(B')$. Let us observe that, contrary to what occurs for maps, in general one has
$$F^{-1}(B \cap B') \ne F^{-1}(B) \cap F^{-1}(B'). \tag{1.2}$$
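A two-line example makes (1.2) tangible. The sketch below again assumes multimaps stored as dictionaries of sets, with hypothetical helper names; it builds $F^{-1}$ by transposing the graph and exhibits sets $B$, $B'$ for which the two sides of (1.2) differ.

```python
def inverse(F):
    """Multimap whose graph is the transposed graph: y -> {x : y in F(x)}."""
    Finv = {}
    for x, values in F.items():
        for y in values:
            Finv.setdefault(y, set()).add(x)
    return Finv


def image(F, A):
    """Direct image F(A): the union of the values F(a) for a in A."""
    return set().union(*(F.get(a, set()) for a in A))


# Hypothetical multimap with F(0) = {"a", "b"} and F(1) = {"b"}.
F = {0: {"a", "b"}, 1: {"b"}}
Finv = inverse(F)

B, B2 = {"a"}, {"b"}
left = image(Finv, B & B2)                    # F^{-1}(B ∩ B')
right = image(Finv, B) & image(Finv, B2)      # F^{-1}(B) ∩ F^{-1}(B')
```

Here `left` is empty while `right` contains the point `0`, whose value $F(0)$ meets both $B$ and $B'$ without meeting their intersection. Transposing twice returns `F`, in line with $F = (F^{-1})^{-1}$.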
Note that since $F^{-1}$ may be an arbitrary multimap from $Y$ to $X$ and since for a multimap $M : Y \rightrightarrows X$ one has $M(A \cap B) \ne M(A) \cap M(B)$ in general, taking $F = M^{-1}$, so that $F^{-1} = M$, we obtain (1.2). The following proposition can be considered as a preparation for the use of multimaps. It will be useful later when considering monotone multimaps (also called monotone operators); see Sect. 9.4.3 for the meaning of the terms used here.

Proposition 1.4 Let $X$ be a vector space, let $r \in \mathbb{R} \setminus \{0\}$ and let $M : X \rightrightarrows X$ be a multimap. Then the resolvents of $M$ are related to the Yosida regularizations of $M$ by
$$r^{-1}[I_X - (I_X + rM)^{-1}] = (r I_X + M^{-1})^{-1}. \tag{1.3}$$
In particular, for $r = 1$, $I_X - (I_X + M)^{-1} = (I_X + M^{-1})^{-1}$. Here and elsewhere $I_X$ (resp. $I_Y$) denotes the identity map of $X$ (resp. $Y$).

Proof Relation (1.3) is a consequence of the following equivalences:
$$\begin{aligned}
y \in r^{-1}[I_X - (I_X + rM)^{-1}](x) &\Leftrightarrow x - ry \in (I_X + rM)^{-1}(x) \\
&\Leftrightarrow x \in (I_X + rM)(x - ry) \\
&\Leftrightarrow x - (x - ry) \in rM(x - ry) \\
&\Leftrightarrow y \in M(x - ry) \\
&\Leftrightarrow x - ry \in M^{-1}(y) \\
&\Leftrightarrow x \in (r I_X + M^{-1})(y) \\
&\Leftrightarrow y \in (r I_X + M^{-1})^{-1}(x).
\end{aligned}$$
$\square$
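Relation (1.3) can be tested on a multimap with a finite graph. The sketch below assumes $X = \mathbb{Q}$, represents a multimap by its graph (a finite set of pairs of `Fraction` values) and implements the operations appearing in (1.3) pointwise; the function names and the sample graph `M` are hypothetical. A point $x$ with $M(x) = \varnothing$ simply contributes no pair.

```python
from fractions import Fraction


def inverse(G):
    """Graph of the inverse multimap."""
    return {(y, x) for (x, y) in G}


def scale(c, G):
    """Graph of cM: x -> {c*y : y in M(x)}."""
    return {(x, c * y) for (x, y) in G}


def identity_plus(c, G):
    """Graph of c*I + M: x -> {c*x + y : y in M(x)}."""
    return {(x, c * x + y) for (x, y) in G}


def identity_minus(G):
    """Graph of I - N: x -> {x - y : y in N(x)}."""
    return {(x, x - y) for (x, y) in G}


def both_sides(M, r):
    """Graphs of the left and right members of (1.3)."""
    resolvent = inverse(identity_plus(1, scale(r, M)))      # (I + rM)^{-1}
    lhs = scale(1 / r, identity_minus(resolvent))
    rhs = inverse(identity_plus(r, inverse(M)))             # (rI + M^{-1})^{-1}
    return lhs, rhs


# Hypothetical multimap on Q; note that M(0) has two elements.
M = {
    (Fraction(0), Fraction(1)),
    (Fraction(0), Fraction(-1)),
    (Fraction(1), Fraction(2)),
    (Fraction(3, 2), Fraction(0)),
}
lhs, rhs = both_sides(M, Fraction(2, 3))
```

Both graphs reduce to the set of pairs $(u + rv, v)$ with $(u, v) \in \operatorname{gph}(M)$, which is exactly what the chain of equivalences in the proof expresses.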
Exercises 1. Give an example showing that, for a multimap F W X Y, one may have F 1 ı F ¤ IX
F ı F 1 ¤ IY :
2. Show that IX F 1 ı F (the inclusion being the inclusion of graphs or images) if and only if dom.F/ D X. Also show that IY F ı F 1 if and only if F.X/ D Y. 3. Given multimaps F W X Y, G W Y Z and H WD G ı F, give a sufficient condition in order to have F G1 ı H. Show that this inclusion may not hold. 4. Give an example showing that for a multimap F W X Y and subsets B, B0 of Y in general one has F 1 .B \ B0 / ¤ F 1 .B/ \ F 1 .B0 /. 5. Given multimaps F W X Y, G W Y Z show that gph.G ı F/ D .IX G/.gphF/ D .F IZ /1 .gphG/:
6. Considering an order relation $\leq$ on a set $X$ as a multimap $F : X \rightrightarrows X$ whose graph is the set $\{(x, y) \in X \times X : x \leq y\}$, write the properties of $\leq$ as properties of $F$. Do the same with an equivalence relation.
1.4 Measurable Spaces

The family $\mathcal{P}(X)$ of all the subsets of a set $X$ has a rich algebraic structure inherited from the operations $\cap$, $\cup$ and from the inclusion $\subset$. The aim of the present section is a study of some remarkable subclasses of $\mathcal{P}(X)$.

The inclusion $\subset$ endows $\mathcal{P}(X)$ with a lattice structure. It is even a complete lattice in the sense that for any family $(A_i)_{i \in I}$ of subsets of $X$, the intersection (resp. union) of the family $(A_i)_{i \in I}$ is the greatest lower bound (resp. smallest upper bound) of $(A_i)_{i \in I}$ with respect to the order defined by $\subset$. Moreover, $\varnothing$ (resp. $X$) is the least element (resp. greatest element) of $\mathcal{P}(X)$ and $\mathcal{P}(X)$ is complemented, i.e. every element $A$ of $\mathcal{P}(X)$ has a complement $A^c$ such that $A \cap A^c = \varnothing$ and $A \cup A^c = X$; obviously $A^c = X \setminus A := \{x \in X : x \notin A\}$. Furthermore, $\mathcal{P}(X)$ is distributive in the sense that for any family $(A_i)_{i \in I}$ of subsets of $X$ and any $B \in \mathcal{P}(X)$ one has
$$\Big(\bigcup_{i \in I} A_i\Big) \cap B = \bigcup_{i \in I} (A_i \cap B), \qquad \Big(\bigcap_{i \in I} A_i\Big) \cup B = \bigcap_{i \in I} (A_i \cup B).$$
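One says that $\mathcal{P}(X)$ is a Boolean lattice or a Boolean algebra. Besides union and intersection, $\mathcal{P}(X)$ is endowed with the operation $\Delta$ called the symmetric difference given by
$$\forall A, B \in \mathcal{P}(X) \qquad A \Delta B := (A \setminus B) \cup (B \setminus A) = (A \cup B) \setminus (A \cap B),$$
where for $C$, $D \in \mathcal{P}(X)$ one sets $C \setminus D := \{x \in C : x \notin D\}$. The operation $\Delta$ is commutative: $A \Delta B = B \Delta A$ and associative: for all $A$, $B$, $C \in \mathcal{P}(X)$ one has $(A \Delta B) \Delta C = A \Delta (B \Delta C)$ (exercise).

It can be convenient to embed $\mathcal{P}(X)$ into the set $\mathcal{F}(X, \mathbb{R})$ of real-valued functions on $X$ by using the characteristic map $\chi : \mathcal{P}(X) \to \mathbb{R}^X := \mathcal{F}(X, \mathbb{R})$ given by $\chi(A) = 1_A$, where $1_A \in \mathcal{F}(X, \mathbb{R})$ is the characteristic function of $A \in \mathcal{P}(X)$ (sometimes called the indicator function, but we prefer to keep this term for another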
function) defined by $1_A(x) := 1$ if $x \in A$, $1_A(x) := 0$ if $x \in A^c := X \setminus A$.

It is such that $\chi(A \cap B) = \chi(A) \cdot \chi(B)$, the product of the two functions $\chi(A)$ and $\chi(B)$ on $X$. It is still more illuminating to consider $\chi$ as taking its values in the set $\mathcal{F}(X, \mathbb{Z})$ of integer-valued functions and to compose $\chi$ with the map $p : \mathcal{F}(X, \mathbb{Z}) \to \mathcal{F}(X, \mathbb{Z}_2)$ induced by the quotient map $q : \mathbb{Z} \to \mathbb{Z}_2 := \mathbb{Z}/2\mathbb{Z}$ by setting $p(f) := q \circ f$. Then one gets the characteristic map $\chi_2 := p \circ \chi : \mathcal{P}(X) \to \mathcal{F}(X, \mathbb{Z}_2)$. Identifying $\mathbb{Z}_2$ with the set $\{0, 1\}$, the operations on the ring $\mathbb{Z}_2$ are carried into operations on $\{0, 1\}$ corresponding to parity, since $q(n) = 0$ if $n$ is even, $q(n) = 1$ if $n$ is odd. Then, for all $A$, $B \in \mathcal{P}(X)$ one has
$$\chi_2(A \cap B) = \chi_2(A) \cdot \chi_2(B), \qquad \chi_2(A \Delta B) = \chi_2(A) + \chi_2(B).$$
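The two relations above can be checked exhaustively on a small set. The sketch below is only an illustration under stated assumptions: the set `X` has three elements, a function with values in $\mathbb{Z}_2$ is stored as a tuple of 0s and 1s indexed by `X`, and the names `chi2`, `add` and `mul` are hypothetical.

```python
from itertools import combinations

X = [0, 1, 2]

def subsets(X):
    """All subsets of X, as frozensets."""
    return [frozenset(c) for k in range(len(X) + 1) for c in combinations(X, k)]

def chi2(A):
    """Characteristic function of A with values in Z_2, as a tuple indexed by X."""
    return tuple(1 if x in A else 0 for x in X)

def add(f, g):
    """Pointwise sum in Z_2."""
    return tuple((a + b) % 2 for a, b in zip(f, g))

def mul(f, g):
    """Pointwise product in Z_2."""
    return tuple((a * b) % 2 for a, b in zip(f, g))

PX = subsets(X)
# Intersection corresponds to the product, symmetric difference (A ^ B) to the sum.
products_ok = all(chi2(A & B) == mul(chi2(A), chi2(B)) for A in PX for B in PX)
sums_ok = all(chi2(A ^ B) == add(chi2(A), chi2(B)) for A in PX for B in PX)
# Distinct subsets have distinct characteristic functions.
injective = len({chi2(A) for A in PX}) == len(PX)
```

Since the characteristic map is injective and respects both operations, the ring axioms for symmetric difference and intersection can be read off from those of $\mathbb{Z}_2$-valued functions.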
Then one realizes that $\mathcal{P}(X)$ is given a ring structure (in the usual algebraic sense) with the two operations $\Delta$ (for the addition) and $\cap$ (for the product). In order to check the usual rules for rings, it suffices to observe that $\chi_2$ is injective (since $A = B$ whenever $1_A = 1_B$); in fact $\chi_2$ is a bijection whose inverse is the map $g \mapsto g^{-1}(1)$. Then $\chi_2$ becomes a ring isomorphism from $\mathcal{P}(X)$ onto $\mathcal{F}(X, \mathbb{Z}_2)$. Moreover, the empty set $\varnothing$ is the neutral element for $\Delta$ and $X$ is the unit for $\cap$. These observations justify the following terminology.

Definition 1.2 A nonempty subclass $\mathcal{A}$ of $\mathcal{P}(X)$ is called a ring if for all $A$, $B \in \mathcal{A}$ one has $A \Delta B \in \mathcal{A}$ and $A \cap B \in \mathcal{A}$. If, moreover, $X \in \mathcal{A}$ one says that $\mathcal{A}$ is a (Boolean) algebra. A subclass $\mathcal{S}$ of $\mathcal{P}(X)$ is called a $\sigma$-algebra (resp. a $\sigma$-ring) if it is an algebra (resp. a ring) and if the union of a countable family of elements of $\mathcal{S}$ is in $\mathcal{S}$.

The following criterion may appear as more convenient.

Proposition 1.5 A nonempty subclass $\mathcal{A}$ of $\mathcal{P}(X)$ is a ring if and only if it is such that for all $A$, $B \in \mathcal{A}$ one has $A \cup B \in \mathcal{A}$ and $A \setminus B \in \mathcal{A}$. A nonempty subclass $\mathcal{A}$ of $\mathcal{P}(X)$ is a $\sigma$-algebra if and only if it is such that $X \in \mathcal{A}$, $\bigcup_n A_n \in \mathcal{A}$ whenever $A_n \in \mathcal{A}$ for all $n \in \mathbb{N}$, and $A^c := X \setminus A \in \mathcal{A}$ for all $A \in \mathcal{A}$.

Proof For the characterization of rings, the only if assertion follows from the relations
$$A \cup B = (A \Delta B) \Delta (A \cap B), \qquad A \setminus B = A \Delta (A \cap B).$$
The converse is a consequence of the relations
$$A \Delta B = (A \setminus B) \cup (B \setminus A), \qquad A \cap B = (A \cup B) \Delta (A \Delta B).$$
The only if assertion of the characterization of $\sigma$-algebras is obvious. The if assertion follows from the preceding characterization since $A \setminus B = A \cap B^c$ and $A \cap B = X \setminus (A^c \cup B^c)$. $\square$
Note that when $\mathcal{A}$ is closed under finite unions, since $A \setminus B = (A \cup B) \setminus B$ the requirement $A \setminus B \in \mathcal{A}$ for all $A$, $B \in \mathcal{A}$ is satisfied whenever $\mathcal{A}$ is relatively complemented in the sense that $A \setminus B \in \mathcal{A}$ for all $A$, $B \in \mathcal{A}$ satisfying $B \subset A$. Note also that a nonempty ring contains the empty set $\varnothing$ since for any $A \in \mathcal{A}$ one has $A \Delta A = \varnothing$.

Among the preceding notions, the notion of a $\sigma$-algebra is the most important one, as the next definition and the sequel show.

Definition 1.3 A measurable space is a pair $(X, \mathcal{S})$ where $\mathcal{S}$ is a $\sigma$-algebra of subsets of $X$. A map $f$ between two measurable spaces $(X, \mathcal{S})$, $(Y, \mathcal{T})$ is said to be measurable if for all $B \in \mathcal{T}$ one has $f^{-1}(B) \in \mathcal{S}$.

Clearly, given measurable spaces $(X, \mathcal{S})$, $(Y, \mathcal{T})$, $(Z, \mathcal{U})$, and measurable maps $f : (X, \mathcal{S}) \to (Y, \mathcal{T})$, $g : (Y, \mathcal{T}) \to (Z, \mathcal{U})$, the map $g \circ f : (X, \mathcal{S}) \to (Z, \mathcal{U})$ is measurable.

If $(X, \mathcal{S})$ is a measurable space, if $W$ is a set, and if $j : W \to X$ is a map, the family $\mathcal{S}_W$ of subsets of $W$ of the form $j^{-1}(B)$ with $B \in \mathcal{S}$ is a $\sigma$-algebra called the inverse image of $\mathcal{S}$ by $j$. In particular, if $(X, \mathcal{S})$ is a measurable space and if $W$ is a subset of $X$, taking for $j$ the canonical injection, we see that the family $\mathcal{S}_W$ of subsets $A$ of $W$ such that there is some $B \in \mathcal{S}$ satisfying $B \cap W = A$ is a $\sigma$-algebra called the $\sigma$-algebra induced by $\mathcal{S}$ on $W$. Note that when $W \in \mathcal{S}$, for $A \in \mathcal{P}(W)$ one has $A \in \mathcal{S}_W$ if and only if $A \in \mathcal{S}$.

The intersection of a family $(\mathcal{S}_i)_{i \in I}$ of rings (resp. $\sigma$-algebras) of $X$ is a ring (resp. a $\sigma$-algebra). As a consequence, for any subset $\mathcal{G}$ of $\mathcal{P}(X)$ there is a smallest ring $\mathcal{S}$ (resp. $\sigma$-algebra) containing $\mathcal{G}$: $\mathcal{S}$ is the intersection of the family of rings (resp. $\sigma$-algebras) containing $\mathcal{G}$ (since $\mathcal{P}(X)$ is itself a $\sigma$-algebra, this family is nonempty). One says that $\mathcal{S}$ is the ring (resp. $\sigma$-algebra) generated by $\mathcal{G}$. A similar observation holds for algebras and $\sigma$-rings. Let us note the following useful observation.

Lemma 1.1 Let $j : W \to X$ be a map and let $\mathcal{B}$ be the $\sigma$-algebra on $X$ generated by some subset $\mathcal{G}$ of $\mathcal{P}(X)$. Then the $\sigma$-algebra $\mathcal{A} := \{j^{-1}(B) : B \in \mathcal{B}\}$ on $W$ is generated by $\mathcal{F} := \{j^{-1}(G) : G \in \mathcal{G}\}$. In particular, if $W$ is a subset of a set $X$ and if $\mathcal{B}$ is the $\sigma$-algebra generated by some subset $\mathcal{G}$ of $\mathcal{P}(X)$, then the induced $\sigma$-algebra $\mathcal{A} := \{B \cap W : B \in \mathcal{B}\}$ on $W$ is generated by $\mathcal{F} := \{G \cap W : G \in \mathcal{G}\}$.

Proof Let $\mathcal{A}'$ be a $\sigma$-algebra on $W$ containing $\mathcal{F}$. Set
$$\mathcal{B}' := \{B' \in \mathcal{P}(X) : j^{-1}(B') \in \mathcal{A}'\}.$$
Since $\mathcal{B}'$ contains $\mathcal{G}$ and is a $\sigma$-algebra, $\mathcal{B}'$ contains $\mathcal{B}$. Thus, for all $B \in \mathcal{B}$ we have $j^{-1}(B) \in \mathcal{A}'$. In other words we have $\mathcal{A} \subset \mathcal{A}'$. Since $\mathcal{A}$ is clearly a $\sigma$-algebra, this shows that $\mathcal{A}$ is the smallest $\sigma$-algebra on $W$ containing $\mathcal{F}$. $\square$

The preceding construction can be generalized and applied to products by taking for $g_i$ below the canonical projections.
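On a finite set, the generated $\sigma$-algebra can be computed by closing the generators under complements and unions, which makes it possible to test Lemma 1.1 on an example. The sketch below assumes finite sets (so that countable unions reduce to finite ones); the function names and the sample map `j` are hypothetical.

```python
def generated_sigma_algebra(X, G):
    """Smallest class containing G, the empty set and X, closed under complement and union.

    X is assumed finite, so closure under finite unions is enough.
    """
    X = frozenset(X)
    S = {frozenset(), X} | {frozenset(g) for g in G}
    while True:
        new = {X - A for A in S} | {A | B for A in S for B in S}
        if new <= S:          # nothing new was produced: S is closed
            return S
        S |= new

def preimage(j, B):
    """Inverse image of B under the map j, stored as a dictionary."""
    return frozenset(w for w in j if j[w] in B)

X = range(4)
G = [{0}, {0, 1}]
B = generated_sigma_algebra(X, G)          # sigma-algebra on X generated by G

W = ["p", "q", "r"]
j = {"p": 0, "q": 1, "r": 1}               # a map j : W -> X
A_pullback = {preimage(j, b) for b in B}   # {j^{-1}(B) : B in the sigma-algebra}
A_generated = generated_sigma_algebra(W, [preimage(j, g) for g in G])
```

In this example both constructions yield the same four subsets of `W`, in agreement with the lemma.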
Proposition 1.6 Let $X$ be a set and for $i \in I$ let $g_i : X \to X_i$ be a map. Given $\sigma$-algebras $\mathcal{S}_i$ on $X_i$ there is a smallest $\sigma$-algebra $\mathcal{S}$ on $X$ such that for all $i \in I$ the map $g_i : (X, \mathcal{S}) \to (X_i, \mathcal{S}_i)$ is measurable. Moreover, if $\mathcal{S}_i$ is generated by $\mathcal{G}_i \subset \mathcal{S}_i$, then $\mathcal{S}$ is generated by the collection $\mathcal{G}$ of finite intersections $\bigcap_{j \in J} g_j^{-1}(G_j)$ for $J$ a finite subset of $I$ and $G_j \in \mathcal{G}_j$. If $(W, \mathcal{R})$ is a measurable space, a map $f : (W, \mathcal{R}) \to (X, \mathcal{S})$ is measurable if and only if for all $i \in I$ the map $g_i \circ f$ is measurable.

Proof Given $\mathcal{G}_i \subset \mathcal{S}_i$ generating $\mathcal{S}_i$ we denote by $\mathcal{S}$ the $\sigma$-algebra generated by the union $\mathcal{F}$ of the families $\mathcal{F}_i := \{g_i^{-1}(G_i) : G_i \in \mathcal{G}_i\}$ for $i \in I$. Then, for all $i \in I$, the class $\mathcal{A}_i := \{A_i \in \mathcal{S}_i : g_i^{-1}(A_i) \in \mathcal{S}\}$ is a $\sigma$-algebra containing $\mathcal{G}_i$, so that $\mathcal{A}_i = \mathcal{S}_i$ and $g_i$ is measurable. Clearly $\mathcal{S}$ is the smallest $\sigma$-algebra satisfying this property and $\mathcal{S}$ is generated by the class $\mathcal{G}$ of the statement.

If $f : (W, \mathcal{R}) \to (X, \mathcal{S})$ is measurable, then for all $i \in I$ the map $g_i \circ f$ is measurable. Conversely, if for all $i \in I$ the map $g_i \circ f$ is measurable, then for all $F \in \mathcal{F} := \bigcup_{i \in I} \mathcal{F}_i$ one has $f^{-1}(F) \in \mathcal{R}$ so that $f$ is measurable. $\square$

Let us describe an inverse construction consisting in endowing the image of a map with a $\sigma$-algebra. Given a map $f : X \to Y$ between two sets and a $\sigma$-algebra $\mathcal{S}$ in $X$, the family $\mathcal{T}$ of subsets $B$ of $Y$ such that $f^{-1}(B) \in \mathcal{S}$ is a $\sigma$-algebra. The $\sigma$-algebra $\mathcal{T}$ is called the direct image of $\mathcal{S}$ by $f$ and is denoted by $f(\mathcal{S})$.

More generally, one can glue together measurable spaces.

Lemma 1.2 Let $X$ be a set and let $(X_i)_{i \in I}$ be a family of subsets of $X$. If for all $i \in I$, $\mathcal{A}_i$ is a ring (resp. a $\sigma$-ring) of $X_i$ then
$$\mathcal{A} := \{A \in \mathcal{P}(X) : \forall i \in I \;\; A \cap X_i \in \mathcal{A}_i\}$$
is a ring (resp. $\sigma$-ring) of $X$. If $\mathcal{A}_i$ is an algebra (resp. $\sigma$-algebra) of $X_i$ then $\mathcal{A}$ is an algebra (resp. $\sigma$-algebra).

Proof The first assertion stems from the relations
$$(A \cap B) \cap X_i = (A \cap X_i) \cap (B \cap X_i), \qquad (A \setminus B) \cap X_i = (A \cap X_i) \setminus (B \cap X_i), \qquad \Big(\bigcup_n A_n\Big) \cap X_i = \bigcup_n (A_n \cap X_i).$$
The second one is a consequence of the relation $A^c \cap X_i = X_i \setminus A$. $\square$
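The direct image construction described before Lemma 1.2 can be made concrete on finite sets. In the sketch below, which is only an illustration, the map `f` is a dictionary, the class `S` is a small $\sigma$-algebra on `X`, and the names `powerset`, `preimage` and `direct_image` are hypothetical.

```python
from itertools import combinations

def powerset(Y):
    """All subsets of Y, as frozensets."""
    Y = list(Y)
    return [frozenset(c) for k in range(len(Y) + 1) for c in combinations(Y, k)]

def preimage(f, B):
    """Inverse image of B under the map f, stored as a dictionary."""
    return frozenset(x for x in f if f[x] in B)

def direct_image(f, S, Y):
    """Class of subsets B of Y whose inverse image under f belongs to S."""
    return {B for B in powerset(Y) if preimage(f, B) in S}

X = frozenset({0, 1, 2, 3})
S = {frozenset(), frozenset({0, 1}), frozenset({2, 3}), X}   # a sigma-algebra on X
Y = frozenset("abcd")
f = {0: "a", 1: "a", 2: "b", 3: "c"}                          # "d" is not attained

T = direct_image(f, S, Y)
closed_under_complement = all(Y - B in T for B in T)
closed_under_union = all(B | C in T for B in T for C in T)
```

A subset of `Y` belongs to `T` exactly when it contains both or neither of the points `"b"` and `"c"`; points outside the image of `f`, such as `"d"`, impose no constraint.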
Anticipating the notion of a topological space, let us mention that if $(X, \mathcal{O})$ is a topological space, the $\sigma$-algebra $\mathcal{B}$ generated by the family $\mathcal{O}$ of open subsets of $X$ is called the Borel algebra of $(X, \mathcal{O})$ and its members are called the Borel subsets of $(X, \mathcal{O})$. One often denotes $\mathcal{B}$ by $\mathcal{B}(X)$ rather than $\mathcal{B}(X, \mathcal{O})$.

Proposition 1.7 Given measurable spaces $(X, \mathcal{S})$, $(Y, \mathcal{T})$, and a family of subsets $\mathcal{G}$ generating $\mathcal{T}$, a map $f : X \to Y$ is measurable with respect to $\mathcal{S}$ and $\mathcal{T}$ if and only if for all $G \in \mathcal{G}$ one has $f^{-1}(G) \in \mathcal{S}$.
In particular, if $(Y, \mathcal{O})$ is a topological space and if $\mathcal{T}$ is the Borel $\sigma$-algebra of $(Y, \mathcal{O})$, a map $f : X \to Y$ is measurable if and only if for all $O \in \mathcal{O}$ one has $f^{-1}(O) \in \mathcal{S}$. Equivalently, in this particular case, $f : X \to Y$ is measurable if and only if, for any closed subset $C$ of $Y$, the set $f^{-1}(C)$ is in $\mathcal{S}$.

The second assertion of the proposition entails that any continuous map between two topological spaces is measurable for the associated Borel $\sigma$-algebras.

Proof It suffices to show that $f$ is measurable whenever $f^{-1}(G) \in \mathcal{S}$ for all $G \in \mathcal{G}$. This follows from the fact that $f(\mathcal{S}) := \{H \in \mathcal{P}(Y) : f^{-1}(H) \in \mathcal{S}\}$ is a $\sigma$-algebra containing $\mathcal{G}$, so that $f(\mathcal{S})$ contains $\mathcal{T}$. $\square$

It is of interest to consider functions with values in $\overline{\mathbb{R}} := \mathbb{R} \cup \{-\infty, +\infty\}$ rather than in $\mathbb{R}$ because $\overline{\mathbb{R}}$ is compact and every family in $\overline{\mathbb{R}}$ has a supremum and an infimum. We use the fact that there exists an increasing bijection $h : \overline{\mathbb{R}} \to [-1, +1]$ such as the one given by $h(r) := r/(|r| + 1)$ for $r \in \mathbb{R}$, $h(-\infty) := -1$, $h(+\infty) := 1$ that can serve to define a topology on $\overline{\mathbb{R}}$. Moreover, the topology of $\mathbb{R}$ is the induced topology of $\overline{\mathbb{R}}$ and the Borel algebra $\mathcal{B}(\mathbb{R})$ of $\mathbb{R}$ is the $\sigma$-algebra induced by the Borel algebra $\mathcal{B}(\overline{\mathbb{R}})$ of $\overline{\mathbb{R}}$. Thus a function $f : X \to \mathbb{R}$ is measurable if and only if it is measurable as a function from $X$ to $\overline{\mathbb{R}}$.

Corollary 1.5 Given a measurable space $(X, \mathcal{S})$ and a function $f : X \to \mathbb{R}$ (resp. $f : X \to \overline{\mathbb{R}}$), endowing $\mathbb{R}$ (resp. $\overline{\mathbb{R}}$) with its Borel $\sigma$-algebra, $f$ is measurable if and only if for all $r \in \mathbb{R}$ or all $r \in \mathbb{Q}$ the set $\{f > r\} := f^{-1}(]r, +\infty[)$ (resp. $f^{-1}(]r, +\infty])$) is in $\mathcal{S}$.

Proof This is a consequence of the fact that $\mathcal{B}(\mathbb{R})$ (resp. $\mathcal{B}(\overline{\mathbb{R}})$) is generated by the family of intervals $]r, +\infty[$ (resp. $]r, +\infty]$) for $r \in \mathbb{Q}$. $\square$

Proposition 1.8 If $f$, $g : X \to \overline{\mathbb{R}}$ are measurable, then $f \vee g := \sup(f, g)$ and $f \wedge g := \inf(f, g)$ are measurable. If $f$ and $g$ are finite-valued, then $(f, g) : X \to \mathbb{R}^2$ is measurable for the Borel $\sigma$-algebra of $\mathbb{R}^2$ and $f + g$, $f - g$, and $f \cdot g$ are measurable.

Proof The first assertion follows from the relations
$$(f \vee g)^{-1}(]r, +\infty]) = f^{-1}(]r, +\infty]) \cup g^{-1}(]r, +\infty]),$$
$$(f \wedge g)^{-1}(]r, +\infty]) = f^{-1}(]r, +\infty]) \cap g^{-1}(]r, +\infty]).$$
To show that $h := (f, g)$ is measurable when $f$ and $g$ are measurable it suffices to observe that for open subsets $U$, $V$ of $\mathbb{R}$ the set $h^{-1}(U \times V) = f^{-1}(U) \cap g^{-1}(V)$ is in $\mathcal{S}$ and that the Borel $\sigma$-algebra of $\mathbb{R}^2$ is generated by the family of products of open subsets. Using the continuous functions $(r, s) \mapsto r + s$, $(r, s) \mapsto r - s$, $(r, s) \mapsto rs$ and taking their compositions with $h$, we get the last assertion. $\square$

Let us note the following stability result.

Lemma 1.3 Let $(X, \mathcal{S})$ be a measurable space, let $(Y, d)$ be a metric space and let $(f_n)$ be a sequence of measurable maps from $X$ into $Y$ which converges pointwise in the sense that for all $x \in X$ one has $(f_n(x)) \to f(x)$. Then the limit $f$ of $(f_n)$ is measurable.
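Proof This stems from the fact that, for every nonempty closed subset $C$ of $Y$, for $C_k := \{y \in Y : d(y, C) \leq 2^{-k}\}$ one has
$$f^{-1}(C) = \bigcap_{k \in \mathbb{N}} \bigcup_{m \in \mathbb{N}} \bigcap_{n \geq m} f_n^{-1}(C_k). \qquad \square$$

In an important case, the construction of the ring generated by a subset of $\mathcal{P}(X)$ can be made explicit. It will be convenient to use the following notion.

Definition 1.4 A subclass $\mathcal{C}$ of the class $\mathcal{P}(X)$ of all the subsets of a set $X$ is a semi-ring if for all $A$, $B$ in $\mathcal{C}$ one has $A \cap B \in \mathcal{C}$ and if $A \setminus B$ is the union of a finite family of disjoint elements of $\mathcal{C}$.

Let us note that if a semi-ring $\mathcal{C}$ is nonempty, then it contains the empty set $\varnothing$ since otherwise, given $C \in \mathcal{C}$, one cannot obtain $C \setminus C$ as a finite union of nonempty sets. In $\mathbb{R}$ the family $\mathcal{C}$ of intervals of the form $[a, b[ \, := \{r \in \mathbb{R} : a \leq r < b\}$ with $a$, $b \in \mathbb{R}$, $a \leq b$, is an important example of a semi-ring.

In the sequel if $A$ and $B$ are two disjoint subsets of $X$ (in the sense that $A \cap B = \varnothing$) we write $A \sqcup B$ for $A \cup B$. It will be convenient to say that a class $\mathcal{F}$ of subsets of a set $X$ is disjoint if distinct members of $\mathcal{F}$ are disjoint. If $(A_i)_{i \in I}$ is a disjoint family of subsets of $X$, we write $\bigsqcup_{i \in I} A_i$ for $\bigcup_{i \in I} A_i$. We say that $(A_i)_{i \in I}$ is a partition of $A \subset X$ if $A = \bigsqcup_{i \in I} A_i$.

Lemma 1.4 The ring generated by a semi-ring $\mathcal{C}$ on $X$ is the set $\mathcal{A}$ formed by the unions of disjoint finite subfamilies of $\mathcal{C}$. Moreover, $\mathcal{A}$ is also the set of unions of finite subfamilies of $\mathcal{C}$ and for $C$, $C_1, \ldots, C_n$ in $\mathcal{C}$ one can find finite families $(D_j)_{j \in J}$, $(D_{i,j})_{j \in J_i}$ ($i \in \mathbb{N}_n$) of members of $\mathcal{C}$ such that
$$C \setminus \Big(\bigcup_{i=1}^n C_i\Big) = \bigsqcup_{j \in J} D_j, \tag{1.4}$$
$$\bigcup_{i=1}^n C_i = \bigsqcup_{i=1}^n \bigsqcup_{j \in J_i} D_{i,j} \quad \text{with } D_{i,j} \subset C_i, \; D_{i,j} \in \mathcal{C} \text{ for } i \in \mathbb{N}_n, \; j \in J_i. \tag{1.5}$$

Proof Given $A$, $B \in \mathcal{A}$ let us first show that $A \cap B$ belongs to $\mathcal{A}$. Let $I$, $J$ be finite sets and let $(C_i)_{i \in I}$, $(D_j)_{j \in J}$ be two families of disjoint elements of $\mathcal{C}$ such that $A = \bigcup_{i \in I} C_i$ and $B = \bigcup_{j \in J} D_j$. Then $A \cap B = \bigcup_{(i,j) \in I \times J} C_i \cap D_j$ and the family $(C_i \cap D_j)_{(i,j) \in I \times J}$ is a family of disjoint elements of $\mathcal{C}$, so that $A \cap B \in \mathcal{A}$. Now $A \setminus B = \bigcup_{i \in I} (C_i \setminus B) = \bigcup_{i \in I} \bigcap_{j \in J} (C_i \setminus D_j)$ and since $C_i \setminus D_j \in \mathcal{A}$ we have $C_i' := \bigcap_{j \in J} (C_i \setminus D_j) \in \mathcal{A}$ by what precedes and an induction. Since the family $(C_i')_{i \in I}$ is formed of disjoint elements of $\mathcal{A}$, each of which is a disjoint union of elements of $\mathcal{C}$, we have $A \setminus B \in \mathcal{A}$; as $A \cup B = (A \setminus B) \sqcup B$ is then also in $\mathcal{A}$, $\mathcal{A}$ is a ring by Proposition 1.5. Since any ring containing $\mathcal{C}$ contains the elements of $\mathcal{A}$, the ring generated by $\mathcal{C}$ is $\mathcal{A}$.

Relation (1.4) can be established by induction on $n$ since we know the result for $n = 1$ and $C \setminus (\bigcup_{1 \leq i \leq n} C_i) = (C \setminus (\bigcup_{1 \leq i \leq n-1} C_i)) \setminus C_n$. Relation (1.5) is a consequence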
of the following relation
$$\bigcup_{i=1}^n C_i = \bigsqcup_{i=1}^n C_i' \quad \text{with } C_i' := C_i \setminus \Big(\bigcup_{h=1}^{i-1} C_h\Big) \text{ for } i \geq 2, \quad C_1' := C_1,$$
the property $C_i' \cap C_j' = \varnothing$ for $i < j$ stemming from the inclusion $C_i' \subset C_i \subset \bigcup_{1 \leq h \leq j-1} C_h$. Finally, relation (1.5) ensures that the family $\mathcal{A}'$ of unions of finite families of elements of $\mathcal{C}$ coincides with $\mathcal{A}$. $\square$

The next two results take into account the fact that $\mathcal{P}(X)$ has a natural order given by inclusion. They are closely related and have important consequences.

Let us say that a class $\mathcal{D}$ of subsets of a set $X$ is an increasing class if it includes the union of any increasing sequence of members of $\mathcal{D}$; here a sequence $(A_n)$ of $\mathcal{P}(X)$ is said to be increasing (resp. decreasing) if $A_n \subset A_{n+1}$ (resp. $A_{n+1} \subset A_n$) for all $n \in \mathbb{N}$. A class $\mathcal{D}$ of subsets of a set $X$ is a decreasing class if it includes the intersection of any decreasing sequence of members of $\mathcal{D}$. A class $\mathcal{M}$ of subsets of $X$ is called a monotone class if it is an increasing class and a decreasing class. Let us recall that a class $\mathcal{D}$ of subsets of a set $X$ is relatively complemented if for all $A$, $B \in \mathcal{D}$ such that $B \subset A$ one has $A \setminus B \in \mathcal{D}$. Given $\mathcal{C} \subset \mathcal{P}(X)$ there is a smallest relatively complemented increasing class (resp. monotone class) containing $\mathcal{C}$. It is called the relatively complemented increasing class (resp. the monotone class) generated by $\mathcal{C}$.

Theorem 1.9 (Increasing Class Theorem) Let $X$ be a set and let $\mathcal{C} \subset \mathcal{P}(X)$ be closed under the formation of finite intersections. Then the relatively complemented increasing class $\mathcal{D}$ generated by $\mathcal{C}$ is the $\sigma$-ring $\mathcal{R}$ generated by $\mathcal{C}$. If $X$ is the union of a countable subfamily of $\mathcal{C}$, then $\mathcal{D}$ is the $\sigma$-algebra generated by $\mathcal{C}$.

The proof below shows that instead of assuming that $\mathcal{C}$ is closed under finite intersections one may assume that for all $C$, $C' \in \mathcal{C}$ one has $C \cap C' \in \mathcal{D}$. However, in the applications we have in view, the class $\mathcal{C}$ is closed under intersections.

Proof Since a $\sigma$-ring is a relatively complemented increasing class, one has $\mathcal{D} \subset \mathcal{R}$. In order to prove the reverse inclusion we begin by showing that $\mathcal{D}$ is closed under (finite) intersections. Let
$$\mathcal{D}' := \{A \in \mathcal{D} : A \cap C \in \mathcal{D} \;\; \forall C \in \mathcal{C}\}.$$
By assumption, $\mathcal{C}$ is contained in $\mathcal{D}'$. The relation $(A \setminus B) \cap C = (A \cap C) \setminus (B \cap C)$ shows that $\mathcal{D}'$ is relatively complemented. Moreover, $\mathcal{D}'$ is clearly an increasing class as is $\mathcal{D}$. Thus $\mathcal{D}' = \mathcal{D}$. Now let
$$\mathcal{D}'' := \{A \in \mathcal{D} : A \cap D \in \mathcal{D} \;\; \forall D \in \mathcal{D}\}.$$
The relation $\mathcal{D}' = \mathcal{D}$ implies that $\mathcal{C} \subset \mathcal{D}''$. Moreover, the same arguments show that $\mathcal{D}''$ is a relatively complemented increasing class, hence that $\mathcal{D}'' = \mathcal{D}$. Therefore $\mathcal{D}$
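is closed under finite intersections, hence is a $\sigma$-ring. It follows that $\mathcal{R} \subset \mathcal{D}$ and that $\mathcal{R} = \mathcal{D}$. If $X$ is the union of a countable subfamily of $\mathcal{C}$, then $X \in \mathcal{R} = \mathcal{D}$: $\mathcal{D}$ is a $\sigma$-algebra. $\square$

Theorem 1.10 (Monotone Class Theorem) Let $\mathcal{G}$ be a class of subsets of a set $X$. Suppose the monotone class $\mathcal{M}$ generated by $\mathcal{G}$ contains the complements of the members of $\mathcal{G}$ and the finite intersections of members of $\mathcal{G}$. Then $\mathcal{M}$ is the $\sigma$-algebra $\mathcal{S}$ generated by $\mathcal{G}$. The same conclusion holds if $\mathcal{M}$ contains the finite unions of members of $\mathcal{G}$ and the complements of members of $\mathcal{G}$. In particular, if $\mathcal{G}$ is an algebra, then $\mathcal{M}$ is the $\sigma$-algebra generated by $\mathcal{G}$.

Proof The class $\mathcal{M}'$ of sets $M$ in $\mathcal{M}$ such that $M^c := X \setminus M$ belongs to $\mathcal{M}$ is a monotone class and by assumption it contains $\mathcal{G}$. Thus $\mathcal{M}' = \mathcal{M}$ and $\mathcal{M}$ is closed under taking complements. Let $\mathcal{C}$ be the class of finite intersections of elements of $\mathcal{G}$, so that $\mathcal{C}$ is closed under finite intersections. Assuming $\mathcal{C}$ is contained in $\mathcal{M}$ means that $\mathcal{M}$ is the monotone class generated by $\mathcal{C}$. Theorem 1.9 shows that $\mathcal{M}$ is the $\sigma$-ring generated by $\mathcal{C}$. Thus $\mathcal{M}$ is also the $\sigma$-ring generated by $\mathcal{G}$. Since $\mathcal{M}$ is closed under complementation, $\mathcal{M}$ is the $\sigma$-algebra generated by $\mathcal{G}$. The case when $\mathcal{M}$ contains the finite unions of members of $\mathcal{G}$ and the complements of members of $\mathcal{G}$ can be proved by taking complements. $\square$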
Exercises

1. Given a set $X$ and two nonempty subsets $A$, $B$ of $X$, describe the rings, algebras, $\sigma$-rings, and $\sigma$-algebras generated by the families $\mathcal{G} := \{A\}$, $\mathcal{G}' := \{A, B\}$.
2. Show that the family $\mathcal{C}_d$ of boxes of $\mathbb{R}^d$ of the form $[a_1, b_1[ \times \ldots \times [a_d, b_d[$ is a semi-ring.
3. Prove that the Borel family of $\mathbb{R}^d$ is generated by the semi-ring $\mathcal{C}_d$ of Exercise 2.
4. Let $(A_n)$ be a sequence in a $\sigma$-ring $\mathcal{A}$ of subsets of a set $X$. Prove that the intersection of the family $\{A_n : n \in \mathbb{N}\}$ is in $\mathcal{A}$ as are the sets $\bigcap_m \bigcup_{n \geq m} A_n$ and $\bigcup_m \bigcap_{n \geq m} A_n$, often denoted by $\limsup_n A_n$ and $\liminf_n A_n$, respectively.
5. Let $X$ be a set and let $((X_i, \mathcal{S}_i))_{i \in I}$ be a family of measurable spaces. Given maps $f_i : X \to X_i$ show that there exists a unique $\sigma$-algebra $\mathcal{S}$ in $X$ such that for any measurable space $(W, \mathcal{U})$ a map $g : (W, \mathcal{U}) \to (X, \mathcal{S})$ is measurable if and only if for all $i \in I$ the map $f_i \circ g : (W, \mathcal{U}) \to (X_i, \mathcal{S}_i)$ is measurable. Verify that $\mathcal{S}$ is the smallest $\sigma$-algebra on $X$ for which the maps $f_i$ are measurable.
6. Deduce from the preceding exercise that the product of a family of measurable spaces can be endowed with the structure of a measurable space in a canonical way making the projections measurable.
7. Given measurable functions $f$, $g : X \to \mathbb{R}$ for some $\sigma$-algebra $\mathcal{S}$ in $X$, check that the measurability of $f + g$ is a consequence of the fact that $\{f + g > r\}$ is the union over $q \in \mathbb{Q}$ of the sets $\{f > q\} \cap \{g > r - q\}$.
8*. (Stone's Theorem) Show that any Boolean lattice $(S, \leq)$ (in the algebraic sense) is isomorphic to a ring $\mathcal{S}$ of subsets of a set $X$. [Hint: consider the set $X \subset \mathbb{Z}_2^S$ of homomorphisms $x : S \to \mathbb{Z}_2 := \{0, 1\}$ and for $s \in S$ set $f(s) := \{x \in X : x(s) = 1\}$; to show that $f : S \to \mathcal{P}(X)$ is injective, consider first the case when $S$ is finite and given $s \in S$ use a compactness argument involving the family of finite subrings of $S$ containing $s$ to construct some $x \in f(s)$.]
9. Verify that a relatively complemented increasing class of subsets of a set $X$ is a monotone class. [Hint: given a decreasing sequence $(A_n)$ of subsets of $X$ note that $\bigcap_n A_n = A_0 \setminus (\bigcup_n B_n)$ for $B_n := A_0 \setminus A_n$.] From this observation deduce that Theorem 1.9 is a consequence of Theorem 1.10 (and thus that the two results are equivalent).
1.5 Measures

The concept of measure space is a fundamental notion linked with some additivity properties.

Definition 1.5 Given a set $X$ and a subset $\mathcal{C}$ of $\mathcal{P}(X)$, a function $\mu : \mathcal{C} \to \mathbb{R}_\infty := \mathbb{R} \cup \{+\infty\}$ is said to be additive (resp. subadditive) if for every finite sequence $(C_1, \ldots, C_n)$ of disjoint elements of $\mathcal{C}$ whose union $C$ is in $\mathcal{C}$ one has $\mu(C) = \mu(C_1) + \ldots + \mu(C_n)$ (resp. $\mu(C) \leq \mu(C_1) + \ldots + \mu(C_n)$). The function $\mu$ is said to be countably additive or $\sigma$-additive (resp. countably subadditive or $\sigma$-subadditive) if for any sequence $(C_n)$ of disjoint elements of $\mathcal{C}$ whose union $C$ is in $\mathcal{C}$ one has $\mu(C) = \sum_n \mu(C_n)$ (resp. $\mu(C) \leq \sum_n \mu(C_n)$). The function $\mu$ is said to be finite if it takes its values in $\mathbb{R}$. It is said to be $\sigma$-finite if there exists a sequence $(A_n)$ of sets in $\mathcal{C}$ whose union is $X$ such that $\mu(A_n) < +\infty$ for all $n \in \mathbb{N}$.

Definition 1.6 A measure on $(X, \mathcal{C})$ with $\varnothing \in \mathcal{C}$ is a countably additive function $\mu : \mathcal{C} \to \overline{\mathbb{R}}_+ := [0, +\infty]$ that satisfies $\mu(\varnothing) = 0$.

For the sake of simplicity we adopt this (usual) terminology although countably additive functions with values in $\mathbb{R}$ are of interest; they will be called signed measures (see Chap. 8 and the exercises). When $\mathcal{C}$ is a $\sigma$-algebra $\mathcal{S}$ and $\mu : \mathcal{S} \to \overline{\mathbb{R}}_+$ is a measure, the triple $(X, \mathcal{S}, \mu)$ is called a measure space. If moreover $\mu(X) = 1$ one says that $\mu$ is a probability on $X$.
Example The counting measure $\mu$ on a set $X$ is defined on $\mathcal{P}(X)$ by $\mu(A) := 0$ if $A$ is the empty set, $\mu(A) := n$ if $A$ has $n$ distinct points, and $\mu(A) := +\infty$ if $A$ is infinite.

Example Given a function $\omega : X \to \overline{\mathbb{R}}_+$ on a set $X$, the discrete measure $\mu$ associated with the weights $(\omega_x)_{x \in X}$ with $\omega_x := \omega(x)$ is defined by $\mu(A) := \sum_{x \in A} \omega_x$. If $\omega_x = 1$ for all $x$ we get the counting measure on $X$.

We observe that any additive function $\mu$ with values in $\overline{\mathbb{R}}_+$ on a ring $\mathcal{A}$ is nondecreasing since for $A$, $B \in \mathcal{A}$ with $A \subset B$ one has $\mu(B) = \mu(A) + \mu(B \setminus A) \geq \mu(A)$. This observation justifies the existence of the limits in the next lemma.

Let us stress the difference between additivity and $\sigma$-additivity by pointing out the following "continuity" property.

Lemma 1.5 For an additive function $\mu : \mathcal{A} \to \overline{\mathbb{R}}_+$ on a ring $\mathcal{A}$ of subsets of $X$, $\mu$ is $\sigma$-additive if and only if for any increasing sequence $(A_n)$ of elements of $\mathcal{A}$ whose union $A$ is in $\mathcal{A}$ one has $\mu(A) = \lim_n \mu(A_n)$. Moreover, if $\mu : \mathcal{A} \to \overline{\mathbb{R}}_+$ is $\sigma$-additive, if $(B_n)$ is a decreasing sequence of elements of $\mathcal{A}$ whose intersection $B$ is in $\mathcal{A}$, and if for some $m \in \mathbb{N}$, $\mu(B_m)$ is finite, then one has $\mu(B) = \lim_n \mu(B_n)$. If $\mu$ is finite and additive and if for every decreasing sequence $(B_n)$ of elements of $\mathcal{A}$ whose intersection is empty one has $\lim_n \mu(B_n) = 0$, then $\mu$ is $\sigma$-additive.

Proof Let $\mu$ be $\sigma$-additive and let $(A_n)$ be an increasing sequence in $\mathcal{A}$ whose union $A$ is in $\mathcal{A}$. Setting $B_0 := A_0$, $B_n := A_n \setminus A_{n-1}$ for $n \in \mathbb{N} \setminus \{0\}$, we get a sequence of disjoint elements of $\mathcal{A}$ whose union is $A$ (since $B_0 \cup \ldots \cup B_n = A_n$) so that we get
$$\mu(A) = \sum_{n=0}^{\infty} \mu(B_n) = \lim_{n \to \infty} \sum_{k=0}^{n} \mu(B_k) = \lim_{n \to \infty} \mu(B_0 \cup \ldots \cup B_n) = \lim_{n \to \infty} \mu(A_n).$$
Conversely, suppose $\mu$ is additive and satisfies the above property. Let us show that $\mu$ is $\sigma$-additive. Given a sequence $(A_n)$ of disjoint elements of $\mathcal{A}$ whose union $A$ is in $\mathcal{A}$, let us set $B_n := A_0 \cup \ldots \cup A_n$. Then $B_n \in \mathcal{A}$, $(B_n)$ is increasing and its union is $A \in \mathcal{A}$, so that
$$\mu(A) = \lim_{n \to \infty} \mu(B_n) = \lim_{n \to \infty} \sum_{k=0}^{n} \mu(A_k) = \sum_{k=0}^{\infty} \mu(A_k).$$
The second assertion is obtained by considering the increasing sequence $(B_m \setminus B_n)_{n \geq m}$ whose union is $B_m \setminus B$, by passing to the limit, and by using the fact that $\mu(B) = \mu(B_m) - \mu(B_m \setminus B)$ and $\mu(B_n) = \mu(B_m) - \mu(B_m \setminus B_n)$ for $n \geq m$.

Finally, let us prove the last assertion. Let $A \in \mathcal{A}$ and let $(A_n)$ be a partition of $A$ by elements of $\mathcal{A}$. Let $B_n := A \setminus \bigcup_{p < n} A_p$, so that $B_n = \bigcap_{p < n} (A \setminus A_p)$. Since $B_{n+1} \subset B_n$ and $\bigcap_n B_n = \varnothing$ we have $(\mu(B_n)) \to 0$. Moreover, $A = A_0 \sqcup \ldots \sqcup A_{n-1} \sqcup B_n$, so that $\mu(A) = \mu(A_0) + \ldots + \mu(A_{n-1}) + \mu(B_n)$. This means that $\mu(A) = \sum_n \mu(A_n)$. $\square$
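The discrete measure of the example above is easy to experiment with. The following sketch is a finite illustration only: on a finite set $\sigma$-additivity reduces to additivity, so it shows monotonicity and additivity rather than a genuine limit. The weights $2^{-(n+1)}$ and the name `discrete_measure` are assumptions made for the example; exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction

def discrete_measure(weights):
    """Return the set function A -> sum of weights[x] for x in A."""
    def mu(A):
        return sum((weights[x] for x in A), Fraction(0))
    return mu

# Hypothetical weights on X = {0, ..., 19}.
weights = {n: Fraction(1, 2 ** (n + 1)) for n in range(20)}
mu = discrete_measure(weights)

A = frozenset(range(0, 20, 2))   # even points
B = frozenset(range(1, 20, 2))   # odd points, disjoint from A
additive = mu(A | B) == mu(A) + mu(B)

# Increasing sequence A_n = {0, ..., n}: the values mu(A_n) are nondecreasing
# and the last one equals mu(X).
values = [mu(range(n + 1)) for n in range(20)]
total = mu(weights.keys())
```

Taking all weights equal to 1 in `discrete_measure` gives the counting measure restricted to this finite set.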
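Theorem 1.9 yields uniqueness results for extensions.

Theorem 1.11 Let $\mathcal{C}$ be a family of subsets of a set $X$ closed under the formation of finite intersections and let $\mu : \mathcal{C} \to \overline{\mathbb{R}}_+$ be $\sigma$-additive. If two measures $\nu$, $\nu'$ on the $\sigma$-ring $\mathcal{S}$ generated by $\mathcal{C}$ extend $\mu$, then they coincide provided one of the following two conditions is satisfied:
(a) $\nu$ is finite;
(b) $\mu$ is $\sigma$-finite.

Proof Let us first suppose $\nu$ is finite on $\mathcal{S}$. Let
$$\mathcal{D} := \{D \in \mathcal{S} : \nu(D) = \nu'(D)\}.$$
Since $\mathcal{D}$ contains $\mathcal{C}$, if we prove that $\mathcal{D}$ is a relatively complemented increasing class we get the result since, by Theorem 1.9, $\mathcal{S}$ is the relatively complemented increasing class generated by $\mathcal{C}$, so that $\mathcal{S} \subset \mathcal{D}$. For all $D$, $E \in \mathcal{D}$ with $E \subset D$ we have $D \setminus E \in \mathcal{D}$ since $D \setminus E \in \mathcal{S}$ and
$$\nu(D \setminus E) = \nu(D) - \nu(E) = \nu'(D) - \nu'(E) = \nu'(D \setminus E).$$
Let $(D_n)$ be an increasing sequence in $\mathcal{D}$ and let $D := \bigcup_n D_n \in \mathcal{S}$. Setting $E_0 := D_0$, $E_1 := D_1 \setminus D_0$, ..., $E_n := D_n \setminus D_{n-1}$, ... we get a sequence of disjoint elements of $\mathcal{D}$ such that $E_0 \sqcup \ldots \sqcup E_n = D_0 \cup \ldots \cup D_n$. By $\sigma$-additivity we get
$$\nu(D) = \nu\Big(\bigsqcup_n E_n\Big) = \sum_n \nu(E_n) = \sum_n \nu'(E_n) = \nu'\Big(\bigsqcup_n E_n\Big) = \nu'(D)$$
so that $D \in \mathcal{D}$. Thus $\mathcal{D}$ is a relatively complemented increasing class and $\mathcal{D} = \mathcal{S}$.

Now let us suppose $X$ is the union of a countable family $(A_n)$ of $\mathcal{C}$, each $A_n$ having a finite measure; note that then $\mathcal{S}$ is the $\sigma$-algebra generated by $\mathcal{C}$. Let
$$\mathcal{E} := \{E \in \mathcal{S} : \nu(E \cap A_k) = \nu'(E \cap A_k) \;\; \forall k \in \mathbb{N}\}.$$
Since $\mathcal{C}$ is closed under intersections, $\mathcal{C}$ is contained in $\mathcal{E}$. Using Lemma 1.5, we see that if $(E_n)$ is an increasing sequence in $\mathcal{E}$, then $\bigcup_n E_n \in \mathcal{E}$. Now if $E$, $F \in \mathcal{E}$ are such that $F \subset E$, since $(E \setminus F) \cap A_k = (E \cap A_k) \setminus (F \cap A_k)$ we see as above that $E \setminus F \in \mathcal{E}$. Again, Theorem 1.9 ensures that $\mathcal{E} = \mathcal{S}$.

Let us define an increasing sequence $(B_n)$ in $\mathcal{S}$ by setting $B_0 := A_0$, $B_1 := B_0 \cup (A_1 \setminus B_0)$, ..., $B_n := B_{n-1} \cup (A_n \setminus B_{n-1})$. Since $A_n \in \mathcal{C}$ and $B_{n-1} \in \mathcal{S} = \mathcal{E}$, for all $n \geq 1$ we have
$$\nu'(A_n \setminus B_{n-1}) = \nu'(A_n) - \nu'(A_n \cap B_{n-1}) = \nu(A_n) - \nu(A_n \cap B_{n-1}) = \nu(A_n \setminus B_{n-1}).$$
By induction we show that for $n \geq 1$ we have $\nu'(B_{n-1}) = \nu(B_{n-1})$. For $n = 1$ this equality holds since $B_0 = A_0$. Now if $\nu'(B_{n-1}) = \nu(B_{n-1})$, by the preceding relations and by additivity we get $\nu'(B_n) = \nu(B_n)$. Thus $\nu'(B_n) = \nu(B_n)$ for all $n \in \mathbb{N}$ and $X = \bigcup_n B_n$.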
Let us prove by induction that for all $S \in \mathcal{S}$ and all $n \in \mathbb{N}$ we have
$$\nu'(S \cap B_n) = \nu(S \cap B_n). \tag{1.6}$$
This relation is satisfied for $n = 0$ since $B_0 = A_0$ and $\mathcal{S} = \mathcal{E}$. Suppose it holds for $n$. Since $S \cap B_{n+1} = (S \cap B_n) \cup (A_{n+1} \cap (S \setminus B_n))$ with $S \setminus B_n \in \mathcal{S} = \mathcal{E}$, so that $\nu'(A_{n+1} \cap (S \setminus B_n)) = \nu(A_{n+1} \cap (S \setminus B_n))$, by additivity we get (1.6) with $n$ changed into $n + 1$. Thus, for all $S \in \mathcal{S}$ and all $n \in \mathbb{N}$ (1.6) holds. Then, for all $S \in \mathcal{S}$ we get
$$\nu'(S) = \lim_n \nu'(S \cap B_n) = \lim_n \nu(S \cap B_n) = \nu(S). \qquad \square$$

Let us give a first extension result.

Lemma 1.6 Let $\mu : \mathcal{C} \to \overline{\mathbb{R}}_+$ be an additive function on a semi-ring $\mathcal{C}$ of subsets of $X$. Then there exists a unique additive function $\overline{\mu}$ on the ring $\mathcal{A}$ generated by $\mathcal{C}$ whose restriction to $\mathcal{C}$ is $\mu$.

Proof Lemma 1.4 ensures that for any $A \in \mathcal{A}$ there exists a finite family $(C_1, \ldots, C_n)$ of disjoint elements of $\mathcal{C}$ such that $A = C_1 \cup \ldots \cup C_n$. Then, by the additivity requirement, $\overline{\mu}(A)$ cannot be anything else than
$$\overline{\mu}(A) = \mu(C_1) + \ldots + \mu(C_n).$$
Let us show that this value is independent of the decomposition of $A$. Given another family $(D_1, \ldots, D_p)$ of disjoint elements of $\mathcal{C}$ whose union is $A$ we observe that the family $(E_{i,j}) := (C_i \cap D_j)$ is formed with disjoint elements of $\mathcal{C}$ and its union is $A$. Since for $j \in \mathbb{N}_p := \{1, \ldots, p\}$ we have $D_j = E_{1,j} \cup \ldots \cup E_{n,j} \in \mathcal{C}$, we get $\mu(D_j) = \mu(E_{1,j}) + \ldots + \mu(E_{n,j})$ and $\sum_{j=1}^p \mu(D_j) = \sum_{j=1}^p \sum_{i=1}^n \mu(E_{i,j})$ and similarly $\sum_{i=1}^n \mu(C_i) = \sum_{i=1}^n \sum_{j=1}^p \mu(E_{i,j})$. Thus $\overline{\mu}(A)$ is unambiguously defined and any additive function $\mu'$ on $\mathcal{A}$ extending $\mu$ gives to $A$ the same value. In particular, if $A \in \mathcal{C}$ we have $\overline{\mu}(A) = \mu(A)$.

Given $A \in \mathcal{A}$ and $B \in \mathcal{C}$ with $A \cap B = \varnothing$ our construction shows that $\overline{\mu}(A \cup B) = \overline{\mu}(A) + \mu(B) = \overline{\mu}(A) + \overline{\mu}(B)$. Given $A$, $B \in \mathcal{A}$ with $A \cap B = \varnothing$, writing $B := C_1 \cup \ldots \cup C_n$ with $C_i \in \mathcal{C}$ and $C_i \cap C_j = \varnothing$ for $i \neq j$ we can prove by induction on $n$ that $\overline{\mu}(A \cup B) = \overline{\mu}(A) + \mu(C_1) + \ldots + \mu(C_n) = \overline{\mu}(A) + \overline{\mu}(B)$. Thus $\overline{\mu}$ is additive on $\mathcal{A}$. $\square$

A refinement of the preceding lemma can be given. It will serve to define a product measure; but since another means can be used, this lemma can be skipped in a first reading.
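The independence of the decomposition, which is the heart of the proof of Lemma 1.6, can be observed on the semi-ring of half-open intervals of the real line with the length function. In the sketch below an interval $[a, b[$ is stored as a pair `(a, b)` of rationals; the two decompositions `P` and `Q` of the same set and all function names are hypothetical choices for this illustration.

```python
from fractions import Fraction

def length(pieces):
    """Sum of b - a over a family of disjoint intervals [a, b[ given as pairs (a, b)."""
    return sum((b - a for a, b in pieces), Fraction(0))

def is_disjoint(pieces):
    """True when the half-open intervals are pairwise disjoint."""
    s = sorted(pieces)
    return all(s[i][1] <= s[i + 1][0] for i in range(len(s) - 1))

def common_refinement(P, Q):
    """Nonempty intersections of a member of P with a member of Q."""
    out = []
    for a, b in P:
        for c, d in Q:
            lo, hi = max(a, c), min(b, d)
            if lo < hi:                       # the intersection is [lo, hi[
                out.append((lo, hi))
    return out

F = Fraction
# Two decompositions of the same set [0, 3[ ∪ [5, 6[ into disjoint intervals.
P = [(F(0), F(1)), (F(1), F(3)), (F(5), F(6))]
Q = [(F(0), F(1, 2)), (F(1, 2), F(2)), (F(2), F(3)),
     (F(5), F(11, 2)), (F(11, 2), F(6))]
E = common_refinement(P, Q)                   # plays the role of the family (E_{i,j})
```

Both `length(P)` and `length(Q)` agree with `length(E)`, mirroring the double-sum argument of the proof.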
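Lemma 1.7 Let $\mu : \mathcal{C} \to \overline{\mathbb{R}}_+$ be a $\sigma$-additive function on a semi-ring $\mathcal{C}$ of subsets of $X$. Then the additive function $\overline{\mu}$ on the ring $\mathcal{A}$ generated by $\mathcal{C}$ extending $\mu$ is $\sigma$-additive.

Proof Let $A \in \mathcal{A}$ be the union of a sequence $(A_n)$ of disjoint members of $\mathcal{A}$. Since $A \in \mathcal{A}$ we can find a finite family $(C_1, \ldots, C_m)$ of disjoint elements of $\mathcal{C}$ whose union is $A$. Then, for all $i \in \mathbb{N}_m$, we have $C_i = \bigcup_{n \geq 0} (A_n \cap C_i)$. Since for all $(i, n) \in \mathbb{N}_m \times \mathbb{N}$ the set $A_{i,n} := A_n \cap C_i$ belongs to $\mathcal{A}$, we can find a finite family $(C_{1,i,n}, \ldots, C_{p(i,n),i,n})$ of disjoint elements of $\mathcal{C}$ whose union is $A_{i,n}$. By $\sigma$-additivity of $\mu$ and associativity of sums we have
$$\mu(C_i) = \sum_{n=0}^{\infty} \sum_{j=1}^{p(i,n)} \mu(C_{j,i,n}) = \sum_{n=0}^{\infty} \overline{\mu}(A_n \cap C_i).$$
Thus, by definition of $\overline{\mu}$, we get
$$\overline{\mu}(A) = \sum_{i=1}^{m} \mu(C_i) = \sum_{i=1}^{m} \sum_{n=0}^{\infty} \overline{\mu}(A_n \cap C_i).$$
Since the terms of this series are nonnegative, since $\overline{\mu}$ is additive, and since $(A_n \cap C_1) \cup \ldots \cup (A_n \cap C_m) = A_n \cap A = A_n$ and $(A_n \cap C_i) \cap (A_n \cap C_{i'}) = \varnothing$ for $i \neq i'$, we get
$$\overline{\mu}(A) = \sum_{n=0}^{\infty} \sum_{i=1}^{m} \overline{\mu}(A_n \cap C_i) = \sum_{n=0}^{\infty} \overline{\mu}(A_n).$$
Thus $\overline{\mu}$ is $\sigma$-additive. $\square$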
A general means to get a measure uses the notion of an outer measure, a notion that is less exacting than the concept of measure. An outer measure on a set $X$ is an increasing, $\sigma$-subadditive function $\omega : \mathcal{P}(X) \to \overline{\mathbb{R}}_+$ satisfying $\omega(\varnothing) = 0$. Outer measures are easily obtained.

Proposition 1.9 Let $X$ be a set, let $\mathcal{C} \subset \mathcal{P}(X)$ with $\varnothing \in \mathcal{C}$, and let $\mu : \mathcal{C} \to \overline{\mathbb{R}}_+$ be such that $\mu(\varnothing) = 0$. For any subset $S$ of $X$ let $\mathcal{C}(S)$ be the collection of sequences $(C_n)$ of elements of $\mathcal{C}$ whose unions contain $S$ and let
$$\omega(S) := \inf\Big\{\sum_n \mu(C_n) : (C_n) \in \mathcal{C}(S)\Big\}. \tag{1.7}$$
Then $\omega$ is an outer measure on $X$ such that $\omega(C) \leq \mu(C)$ for all $C \in \mathcal{C}$. If $\mathcal{C}$ is a semi-ring and if $\mu$ is $\sigma$-subadditive, then the restriction of $\omega$ to $\mathcal{C}$ is $\mu$.

In this definition we use the familiar convention that the infimum of the empty set in $\overline{\mathbb{R}}$ is $+\infty$.
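Proof Clearly $\omega(\varnothing) = 0$. Let $S \subset T \subset X$. Since $\mathcal{C}(T) \subset \mathcal{C}(S)$ we get $\omega(S) \leq \omega(T)$. Let us show that $\omega$ is $\sigma$-subadditive. Let $(S_k)$ be a sequence in $\mathcal{P}(X)$ and let $S$ be the union of the $S_k$'s. If for some $k \in \mathbb{N}$ one has $\mathcal{C}(S_k) = \varnothing$, then $\mathcal{C}(S) = \varnothing$ and the relation
$$\omega(S) \leq \sum_k \omega(S_k) \tag{1.8}$$
holds, each side being $+\infty$. If $\mathcal{C}(S_k) \neq \varnothing$ for all $k$, given $\varepsilon > 0$, we pick sequences $(C_{k,n})_n \in \mathcal{C}(S_k)$ such that
$$\sum_n \mu(C_{k,n}) \leq \omega(S_k) + 2^{-k} \varepsilon.$$
Then the family $(C_{k,n})_{k,n}$ can be organized in a sequence $(B_j)_{j \geq 0}$ by using a bijection $h : \mathbb{N} \to \mathbb{N}^2$ and setting $B_j := C_{h(j)}$ so that $(B_j)$ covers $S$ and
$$\omega(S) \leq \sum_j \mu(B_j) = \sum_{k,n} \mu(C_{k,n}) \leq \sum_k \big(\omega(S_k) + 2^{-k} \varepsilon\big) = \sum_k \omega(S_k) + 2\varepsilon.$$
Since $\varepsilon > 0$ is arbitrary, relation (1.8) is established and $\omega$ is an outer measure. Given $C \in \mathcal{C}$, setting $C_0 := C$, $C_n := \varnothing$ for $n > 0$, we see that $\omega(C) \leq \mu(C)$.

Finally, let us show that when $\mathcal{C}$ is a semi-ring and $\mu$ is countably subadditive the restriction of $\omega$ to $\mathcal{C}$ is $\mu$. Given $C \in \mathcal{C}$, for any $(C_n) \in \mathcal{C}(C)$ we may suppose the union of the sets $C_n$ is $C$ (replacing $C_n$ with $C_n \cap C$) and that the sets $C_n$ are disjoint (replacing $C_n$ with a finite family of disjoint sets $D_{i,n} \in \mathcal{C}$ whose union is $C_n \setminus C_{n-1} \setminus \ldots \setminus C_0$ and relabelling them). Then we have
$$\mu(C) \leq \sum_n \mu(C_n),$$
hence, taking the infimum over $\mathcal{C}(C)$, $\mu(C) \leq \omega(C)$ and equality holds. $\square$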
Example The trivial outer measure on a set $X$ is given by $\omega(\varnothing) = 0$ and $\omega(S) = 1$ for all nonempty $S \in \mathcal{P}(X)$.

Example The counting measure on a set $X$ is a measure on $\mathcal{P}(X)$, hence is an outer measure.

Example Let $X$ be an infinite set and let $\omega$ be given by $\omega(S) = 0$ if $S$ is countable, $\omega(S) := 1$ if $S$ is uncountable. Then $\omega$ is an outer measure.

Example Let $\mathcal{C}$ be the family of intervals of $\mathbb{R}$ of the form $[a, b[$ with $a \leq b$ and let $\mu : \mathcal{C} \to \overline{\mathbb{R}}_+$ be given by $\mu([a, b[) := b - a$, the length of $[a, b[$. We shall show in Sect. 1.7 that one can define an outer measure $\omega$ extending $\mu$ called the Lebesgue outer measure on $\mathbb{R}$.
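Formula (1.7) can be evaluated by brute force when the covering class is finite. The sketch below relies on an assumption specific to this setting: since the class is finite and the values are nonnegative, repeating a set in a covering sequence never lowers the sum, so the infimum is attained on subfamilies of the class. The class `C`, the values `mu` and the function names are hypothetical.

```python
from itertools import combinations

def powerset(X):
    """All subsets of X, as frozensets."""
    X = list(X)
    return [frozenset(c) for k in range(len(X) + 1) for c in combinations(X, k)]

def outer_measure(C, mu):
    """Return omega, where omega(S) is the least total mu-value of a subfamily of C covering S.

    The value is infinite when no subfamily of C covers S.
    """
    C = list(C)
    def omega(S):
        S = frozenset(S)
        best = float("inf")
        for k in range(len(C) + 1):
            for family in combinations(C, k):
                if S <= frozenset().union(*family):
                    best = min(best, sum(mu[c] for c in family))
        return best
    return omega

X = {0, 1, 2, 3}
# mu is deliberately not subadditive: mu({0,1,2}) = 3 > mu({0,1}) + mu({1,2}) = 2.
mu = {
    frozenset(): 0,
    frozenset({0, 1}): 1,
    frozenset({1, 2}): 1,
    frozenset({0, 1, 2}): 3,
}
omega = outer_measure(mu.keys(), mu)

subsets = powerset(X)
monotone = all(omega(S) <= omega(T) for S in subsets for T in subsets if S <= T)
subadditive = all(omega(S | T) <= omega(S) + omega(T) for S in subsets for T in subsets)
```

In this example `omega` is strictly smaller than `mu` on the set `{0, 1, 2}`, which shows that the inequality between the outer measure and the original function may be strict when the latter is not subadditive; the point `3` is covered by no member of the class, so its outer measure is infinite.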
Example Let $\mathcal{C}$ be the family of rectangles of $\mathbb{R}^d$ of the form $[a_1, b_1[ \times \dots \times [a_d, b_d[$ with $a_i \le b_i$ for $i = 1, \dots, d$, and let $\mu : \mathcal{C} \to \mathbb{R}_+$ be given by $\mu([a_1, b_1[ \times \dots \times [a_d, b_d[) := (b_1 - a_1) \cdots (b_d - a_d)$. We shall show in Sect. 1.7 that one can define an outer measure $\omega$ extending $\mu$, called the Lebesgue outer measure on $\mathbb{R}^d$. $\square$

Now let us show how to get a measure from an outer measure $\omega$ on a set $X$. Let us say that a subset $M$ of $X$ is $\omega$-measurable if
\[ \forall S \in \mathcal{P}(X) \qquad \omega(S) \ge \omega(M \cap S) + \omega(M^c \cap S), \tag{1.9} \]
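with $M^c := X \setminus M$. In fact, since $\omega$ is subadditive, this relation is an equality.

Theorem 1.12 (Carathéodory) Let $X$ be a set and let $\omega : \mathcal{P}(X) \to \overline{\mathbb{R}}_+$ be an outer measure on the family $\mathcal{P}(X)$ of all subsets of $X$. Then
(a) the collection $\mathcal{M}$ of all $\omega$-measurable subsets of $X$ is a $\sigma$-algebra;
(b) the restriction of $\omega$ to $\mathcal{M}$ is a measure on $\mathcal{M}$.

Proof Since relation (1.9) is symmetric in $M$ and $M^c$, one has $M^c \in \mathcal{M}$ whenever $M \in \mathcal{M}$. Given $A$, $B$ in $\mathcal{M}$, let us show that $A \cap B \in \mathcal{M}$. For any $S \in \mathcal{P}(X)$, since $B$ is $\omega$-measurable we have
\[ \omega(A \cap S) = \omega(A \cap S \cap B) + \omega(A \cap S \cap B^c). \]
Adding $\omega(A^c \cap S)$ to both sides of this relation, on the left we obtain $\omega(S)$ since $A \in \mathcal{M}$. Thus, to prove that $A \cap B \in \mathcal{M}$ it suffices to show that
\[ \omega((A \cap B)^c \cap S) = \omega(A \cap S \cap B^c) + \omega(A^c \cap S). \]
This is seen by using the fact that $A \in \mathcal{M}$ and by replacing $M$ and $S$ with $A$ and $(A \cap B)^c \cap S$ respectively in (1.9):
\[ \omega((A \cap B)^c \cap S) = \omega(A \cap S \cap (A \cap B)^c) + \omega(A^c \cap S \cap (A \cap B)^c), \]
since $A \cap S \cap (A \cap B)^c = A \cap S \cap B^c$ and $A^c \cap S \cap (A \cap B)^c = A^c \cap S$. Thus $A \cap B \in \mathcal{M}$.

Now, let $A$ and $B$ be disjoint elements of $\mathcal{M}$. Since $A^c \cap B^c \in \mathcal{M}$ by what precedes, we have $A \cup B = (A^c \cap B^c)^c \in \mathcal{M}$. Moreover, given an arbitrary element $S$ of $\mathcal{P}(X)$, using relation (1.9) with $A$ instead of $M$ and $(A \cup B) \cap S$ instead of $S$, we get
\[ \omega((A \cup B) \cap S) = \omega(A \cap (A \cup B) \cap S) + \omega(A^c \cap (A \cup B) \cap S) = \omega(A \cap S) + \omega(B \cap S). \]
In particular, taking $S = X$ we see that $\omega$ is additive on $\mathcal{M}$.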
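Now, let $(A_n)$ be a sequence of disjoint $\omega$-measurable subsets of $X$, let $A$ be its union, and let $S \in \mathcal{P}(X)$. For any $k \in \mathbb{N} \setminus \{0\}$ we have
\[ \omega((A_0 \cup \dots \cup A_k) \cap S) = \sum_{i=0}^{k} \omega(A_i \cap S), \]
as an induction shows, the case $k = 1$ being already established. Using relation (1.9) with $A_0 \cup \dots \cup A_k \in \mathcal{M}$ instead of $M$, we deduce from the preceding relation that
\[ \omega(S) = \omega((A_0 \cup \dots \cup A_k) \cap S) + \omega((A_0 \cup \dots \cup A_k)^c \cap S) \ge \sum_{i=0}^{k} \omega(A_i \cap S) + \omega(A^c \cap S), \]
by the inclusion $A^c \cap S \subset (A_0 \cup \dots \cup A_k)^c \cap S$ and the fact that $\omega$ is increasing. Whence, since $k$ is arbitrarily large and since $\omega$ is $\sigma$-subadditive,
\[ \omega(S) \ge \sum_{i=0}^{\infty} \omega(A_i \cap S) + \omega(A^c \cap S) \ge \omega(A \cap S) + \omega(A^c \cap S). \]
This proves that $A \in \mathcal{M}$. Since $\mathcal{M}$ is stable under complementation and finite intersections, any countable union of members of $\mathcal{M}$ can be written as a countable union of disjoint members of $\mathcal{M}$. Thus $\mathcal{M}$ is a $\sigma$-algebra, and since the preceding relations are equalities when $S = A$ we see that $\omega$ is $\sigma$-additive on $\mathcal{M}$. $\square$

Combining this theorem with Proposition 1.9, we get an extension result.

Proposition 1.10 (Hahn) Let $\mu : \mathcal{A} \to \overline{\mathbb{R}}_+$ be a measure on a semi-ring or a ring $\mathcal{A}$ of subsets of a set $X$. Then there exists a measure $\mu_\sigma : \mathcal{A}_\sigma \to \overline{\mathbb{R}}_+$ on the $\sigma$-algebra $\mathcal{A}_\sigma$ generated by $\mathcal{A}$ whose restriction to $\mathcal{A}$ is $\mu$. If moreover $\mu$ is $\sigma$-finite, then the extension of $\mu$ is unique.

Proof By Lemma 1.6 there is no loss of generality in assuming that $\mathcal{A}$ is a ring. Let $\omega$ be the outer measure associated with $\mu$ given by Proposition 1.9 and let $\mathcal{M}$ be the $\sigma$-algebra of $\omega$-measurable subsets of $X$. In order to show that $\mathcal{A}$ is contained in $\mathcal{M}$ let us introduce the family $\mathcal{M}_0$ formed by the subsets $M \in \mathcal{P}(X)$ such that
\[ \forall A \in \mathcal{A} \qquad \omega(A) \ge \omega(A \cap M) + \omega(A \cap M^c). \tag{1.10} \]
Clearly, $\mathcal{M} \subset \mathcal{M}_0$ and since by Proposition 1.9 the restriction of $\omega$ to $\mathcal{A}$ is additive as it is the measure $\mu$, we have $\mathcal{A} \subset \mathcal{M}_0$. Let us show that $\mathcal{M}_0 = \mathcal{M}$. This will imply that $\mathcal{A} \subset \mathcal{M}$, hence that the $\sigma$-algebra $\mathcal{A}_\sigma$ generated by $\mathcal{A}$ is contained in $\mathcal{M}$, and Theorem 1.12 will ensure that the restriction $\mu_\sigma$ of $\omega$ to $\mathcal{A}_\sigma$ is a measure. Then the restriction of $\mu_\sigma$ to $\mathcal{A}$, being also the restriction of $\omega$ to $\mathcal{A}$, is $\mu$.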
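We have to show that every $M \in \mathcal{M}_0$ belongs to $\mathcal{M}$, or that
\[ \forall S \in \mathcal{P}(X) \qquad \omega(S) \ge \omega(S \cap M) + \omega(S \cap M^c). \tag{1.11} \]
This relation is trivially satisfied if $\omega(S) = +\infty$. If $\omega(S)$ is finite, for every $\varepsilon > 0$ one can find a sequence $(A_n)$ of $\mathcal{A}$ such that $S \subset \bigcup_n A_n$ and $\sum_n \mu(A_n) < \omega(S) + \varepsilon$. Since for all $n$ we have $A_n \in \mathcal{A}$, hence $\mu(A_n) = \omega(A_n) \ge \omega(A_n \cap M) + \omega(A_n \cap M^c)$ by (1.10), summing we get
\[ \omega(S) + \varepsilon \ge \sum_n \omega(A_n \cap M) + \sum_n \omega(A_n \cap M^c) \ge \omega(S \cap M) + \omega(S \cap M^c), \]
since $\omega$ is $\sigma$-subadditive and increasing. Since $\varepsilon$ is arbitrarily small, relation (1.11) is satisfied and $M \in \mathcal{M}$. The uniqueness assertion in the case when $\mu$ is $\sigma$-finite is given by Theorem 1.11. $\square$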
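On a finite set the Carathéodory criterion (1.9) can be checked by exhaustive enumeration. The sketch below is only an illustration under that finiteness assumption; the helper names `powerset`, `measurable_sets` and `trivial_outer` are hypothetical. It applies the criterion to the trivial outer measure and to the counting measure from the examples above.

```python
from itertools import chain, combinations

def powerset(X):
    """All subsets of the finite set X, as frozensets."""
    items = sorted(X)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

def measurable_sets(X, omega):
    """Subsets M of X with omega(S) = omega(S & M) + omega(S - M) for every S."""
    subsets = powerset(frozenset(X))
    return [M for M in subsets
            if all(omega(S) == omega(S & M) + omega(S - M) for S in subsets)]

def trivial_outer(S):
    """Trivial outer measure: 0 on the empty set, 1 on every nonempty set."""
    return 0 if not S else 1

X = {1, 2, 3}
trivial_measurable = measurable_sets(X, trivial_outer)  # only the empty set and X
counting_measurable = measurable_sets(X, len)           # every subset of X
```

For the trivial outer measure, any $M$ that splits some $S$ into two nonempty parts violates (1.9), so the resulting $\sigma$-algebra is as small as possible. For the counting measure every subset is measurable.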
Exercises
1. Let $(X, \mathcal{M}, \mu)$ be a measure space and let $A \in \mathcal{M}$. Verify that any partition of $A$ into sets of positive measure is at most countable.
2. Let $(X, \mathcal{M}, \mu)$ be a measure space and let $A_1, \dots, A_n \in \mathcal{M}$ with finite measures. Show that $\mu(A_1 \cup A_2) = \mu(A_1) + \mu(A_2) - \mu(A_1 \cap A_2)$ and $\mu(A_1 \cup A_2 \cup A_3) = \sum_{i=1}^{3} \mu(A_i) + \mu(\bigcap_{i=1}^{3} A_i) - \mu(A_1 \cap A_2) - \mu(A_2 \cap A_3) - \mu(A_3 \cap A_1)$. Generalize the preceding relations to the case of four or more subsets.
3. Let $(X, \mathcal{M}, \mu)$ be a finite measure space and let $A_1, \dots, A_n \in \mathcal{M}$. Show that if $\sum_{i=1}^{n} \mu(A_i) > (n - 1)\mu(X)$ then $\bigcap_{i=1}^{n} A_i$ is nonempty.
4. Verify that an additive function $\mu : \mathcal{C} \to \overline{\mathbb{R}}_+$ on a semi-ring is increasing.
5. Give an example of an additive function on a $\sigma$-algebra that is not $\sigma$-additive. [Hint: consider an infinite set $X$ and $\mu : \mathcal{P}(X) \to \overline{\mathbb{R}}_+$ given by $\mu(S) = 0$ if $S$ is finite, $\mu(S) = +\infty$ if $S$ is infinite.]
6. Verify that there exists a decreasing sequence $(B_n)$ of Borel subsets of $\mathbb{R}$ such that $\lambda(\bigcap_n B_n) \ne \lim_n \lambda(B_n)$, $\lambda$ being the Lebesgue measure. [Hint: take $B_n := \{x \in \mathbb{R} : |x| > n\}$.]
7. Let $X$ be a set, let $\mathcal{C} \subset \mathcal{P}(X)$ be closed under finite intersections, with $\emptyset \in \mathcal{C}$. Let $\mu : \mathcal{C} \to \overline{\mathbb{R}}_+$ be such that $\mu(\emptyset) = 0$ and such that for any sequence $(C_n)$ of $\mathcal{C}$ whose union $C$ is in $\mathcal{C}$ one has $\mu(C) \le \sum_n \mu(C_n)$. Show that the restriction to $\mathcal{C}$ of the outer measure $\omega$ deduced from $\mu$ coincides with $\mu$. [Hint: if $C \in \mathcal{C}$ and if $(C_n)_n \in \mathcal{C}(C)$ consider the sequence $(C_n \cap C)_n$.]
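8. Let $X$ be a set, let $\mu : \mathcal{A} \to \overline{\mathbb{R}}_+$ be a measure on a ring of subsets of $X$, and let $\omega$ be the outer measure associated with $\mu$. Show that for every increasing sequence $(S_n)$ of $\mathcal{P}(X)$ one has $\omega(\bigcup_n S_n) = \lim_n \omega(S_n)$. [Hint: for any $\varepsilon > 0$ and any $n \in \mathbb{N}$ pick $B_n$ in the $\sigma$-algebra $\mathcal{A}_\sigma$ generated by $\mathcal{A}$ such that $S_n \subset B_n$ and $\omega(B_n) < \omega(S_n) + \varepsilon/2^n$ and let $C_n := \bigcap_{p=n}^{\infty} B_p$, so that $(C_n)$ is increasing, $S_n \subset C_n \subset B_n$ and $\lim_n \omega(S_n) = \lim_n \omega(C_n) = \omega(\bigcup_n C_n) \ge \omega(\bigcup_n S_n)$, and use the fact that $\omega$ is increasing.]
9. Let $(X, d)$ be a metric space (see Sect. 2.3). An outer measure $\omega : \mathcal{P}(X) \to \overline{\mathbb{R}}_+$ is called a metric exterior measure if for any $A$, $B \in \mathcal{P}(X)$ one has $\omega(A \cup B) = \omega(A) + \omega(B)$ whenever $\mathrm{gap}(A, B) > 0$, where $\mathrm{gap}(A, B) := \inf\{d(a, b) : a \in A,\ b \in B\}$. Show that if $\omega$ is a metric exterior measure, then the Borelian subsets of $X$ are measurable and the restriction of $\omega$ to the Borel $\sigma$-algebra is a measure. [See [56, p. 214], [240, p. 267].]
10. Let $(X, d)$ be a metric space and let $\alpha > 0$. For $\varepsilon > 0$ and a subset $A$ of $X$, let $\mathcal{R}_\varepsilon(A)$ be the set of countable coverings of $A$ by balls whose diameter is at most $\varepsilon$. Set
\[ \mu_{\alpha,\varepsilon}(A) := \inf\Bigl\{ \sum_n (\operatorname{diam} B_n)^\alpha : (B_n) \in \mathcal{R}_\varepsilon(A) \Bigr\}. \]
Show that the function $\varepsilon \mapsto \mu_{\alpha,\varepsilon}(A)$ is nonincreasing on $\mathbb{P} := \,]0, \infty[$. Verify that $\mu_{\alpha,\varepsilon}$ is an outer measure on $X$. Deduce from this that $\omega_\alpha := \lim_{\varepsilon \to 0_+} \mu_{\alpha,\varepsilon}$ is an outer measure on $X$. It is called the $\alpha$-Hausdorff measure.
11. Keeping the data and the notation of the preceding exercise with $X = \mathbb{R}^d$, verify that for all $A \subset X$ and $\varepsilon \in \,]0, 1]$ the functions $\alpha \mapsto \varepsilon^{-\alpha} \mu_{\alpha,\varepsilon}(A)$ and $\alpha \mapsto \omega_\alpha(A)$ are nonincreasing on $\mathbb{P}$. Given $\alpha \ge d$ and $\varepsilon > 0$, show that there exists some $c_d > 0$ such that for every rectangle $R$ of $\mathbb{R}^d$ one has $\mu_{\alpha,\varepsilon}(R) \le c_d\, \varepsilon^{\alpha - d} \lambda_d(R)$, where $\lambda_d$ is the measure on $\mathbb{R}^d$ associated with the Lebesgue outer measure. Deduce from this that for all $\alpha > d$ and all bounded subsets $A$ of $\mathbb{R}^d$ one has $\omega_\alpha(A) = 0$. The Hausdorff dimension $h(A)$ of a bounded subset $A$ of $\mathbb{R}^d$ is defined by
\[ h(A) := \inf\{\alpha > 0 : \omega_\alpha(A) = 0\}. \]
Show that $\omega_\alpha(A) = 0$ for $\alpha > h(A)$ and that $\omega_\alpha(A) = +\infty$ for $\alpha < h(A)$. Show that if $A$ is a nonempty rectangle of $\mathbb{R}^d$ then $h(A) = d$.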
1.6 Completion of a Measure

Given a measure space $(X, \mathcal{S}, \mu)$ it is often necessary to consider properties which hold almost everywhere, i.e. outside a negligible set. A precise definition is as follows.
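Definition 1.7 Given a measure $\mu$ on a ring $\mathcal{S}$ of subsets of a set $X$, a subset $N$ of $X$ is called a null set or a negligible set if for every $\varepsilon > 0$ there exists some $S \in \mathcal{S}$ such that $N \subset S$ and $\mu(S) \le \varepsilon$. If every null set belongs to $\mathcal{S}$ one says that $\mu$ is complete. If some ambiguity may arise one says that $N$ is $\mu$-null.

Clearly, any subset of a null set is a null set. If $\mathcal{S}$ is a $\sigma$-ring, any countable union $N$ of null sets $N_n$ is a null set since for any $\varepsilon > 0$ one can find some $S_n \in \mathcal{S}$ such that $N_n \subset S_n$ and $\mu(S_n) \le \varepsilon/2^n$, so that one has $N \subset \bigcup_{n \ge 1} S_n \in \mathcal{S}$ and $\mu(\bigcup_{n \ge 1} S_n) \le \varepsilon$. If $\mathcal{S}$ is stable under countable intersections, $N \in \mathcal{P}(X)$ is a null set if and only if $N$ is contained in some $S \in \mathcal{S}$ with $\mu(S) = 0$. Also, if $\mu$ is complete, a subset $N$ of $X$ is a null set if and only if $N \in \mathcal{S}$ and $\mu(N) = 0$.

Proposition 1.11 Given a measure $\mu$ on a ring (resp. $\sigma$-ring, resp. $\sigma$-algebra) $\mathcal{S}$ of subsets of a set $X$, the family $\mathcal{S}_\mu$ of sets of the form $S \cup N$, where $S \in \mathcal{S}$ and $N$ is a null set, is a ring (resp. $\sigma$-ring, resp. $\sigma$-algebra) called the completion of $\mathcal{S}$. Setting $\overline{\mu}(S \cup N) = \mu(S)$ for $S \in \mathcal{S}$, $N$ null, one gets a measure $\overline{\mu}$ on $\mathcal{S}_\mu$ extending $\mu$, and $\overline{\mu}$ is complete. Moreover, the null sets for $\overline{\mu}$ are the null sets for $\mu$.

Proof Let $T := S \cup N$ with $N$ null and $S \in \mathcal{S}$. Replacing $N$ with $N \setminus S$ (which is still a null set) we may suppose $S$ and $N$ are disjoint, so that $T = S \,\Delta\, N$. Since the symmetric difference is an associative and commutative operation, for $T' := S' \,\Delta\, N'$ with $S' \in \mathcal{S}$ and $N'$ null we have
\[ T \,\Delta\, T' = (S \,\Delta\, S') \,\Delta\, (N \,\Delta\, N') \in \mathcal{S}_\mu. \]
Since $(S \cup N) \cap (S' \cup N') = (S \cap S') \cup N''$ with $N'' := (S \cap N') \cup (S' \cap N) \cup (N \cap N')$ we see that $\mathcal{S}_\mu$ is stable under finite intersections. Moreover, if $\mathcal{S}$ is a $\sigma$-ring, one easily checks that the union of a countable family of elements of $\mathcal{S}_\mu$ is in $\mathcal{S}_\mu$. Thus $\mathcal{S}_\mu$ is a $\sigma$-ring (and a $\sigma$-algebra if $\mathcal{S}$ is a $\sigma$-algebra).

Now let us observe that the definition of $\overline{\mu}$ is coherent: if $S \cup N = S' \cup N'$ with $N$, $N'$ null sets and $S$, $S' \in \mathcal{S}$, we have $S \,\Delta\, S' \subset N \cup N'$, so that $S \,\Delta\, S'$ is a null set, and $\mu(S \,\Delta\, S') = 0$, $\mu(S \setminus S') = 0 = \mu(S' \setminus S)$, $\mu(S) = \mu(S \cap S') + \mu(S \setminus S') = \mu(S \cap S')$ and similarly $\mu(S') = \mu(S \cap S')$, so that $\mu(S') = \mu(S)$. Obviously $\overline{\mu}(\emptyset) = 0$. Let $(T_n)$ be a sequence of disjoint members of $\mathcal{S}_\mu$, with $T_n := S_n \cup N_n$, $S_n \in \mathcal{S}$, $N_n$ null. Since the union $N$ of the family $(N_n)$ is a null set, the union $T$ of the family $(T_n)$ is such that $T = S \cup N$, where $S$ is the union of $(S_n)$, so that
\[ \overline{\mu}(T) = \mu(S) = \sum_n \mu(S_n) = \sum_n \overline{\mu}(T_n). \]
Clearly $\overline{\mu}$ extends $\mu$. If $Y \in \mathcal{P}(X)$ is a $\overline{\mu}$-null set, for every $\varepsilon > 0$ we can find some $T := S \cup N \in \mathcal{S}_\mu$ containing $Y$ with $\overline{\mu}(T) = \mu(S) \le \varepsilon/2$, $S \in \mathcal{S}$, and some $S' \in \mathcal{S}$ with $N \subset S'$, $\mu(S') \le \varepsilon/2$, so that we have $Y \subset S \cup S'$ with $\mu(S \cup S') \le \varepsilon$ and $Y$ is a null set, so that $Y = \emptyset \cup Y \in \mathcal{S}_\mu$. Thus $(X, \mathcal{S}_\mu, \overline{\mu})$ is complete. $\square$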
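Proposition 1.12 Let $\mu : \mathcal{S} \to \overline{\mathbb{R}}_+$ be a measure on a $\sigma$-algebra $\mathcal{S}$ of subsets of a set $X$ and let $(X, \mathcal{S}_\mu, \overline{\mu})$ be the completion of $(X, \mathcal{S}, \mu)$. Let $\omega : \mathcal{P}(X) \to \overline{\mathbb{R}}_+$ be the outer measure deduced from $\mu$. Then every $T \in \mathcal{S}_\mu$ belongs to the class $\mathcal{M}$ of $\omega$-measurable subsets and one has $\omega(T) = \overline{\mu}(T)$. If $\mu$ is $\sigma$-finite, then $\mathcal{S}_\mu = \mathcal{M}$.

Proof Let us first show that every element $T$ of $\mathcal{S}_\mu$ belongs to $\mathcal{M}$. We start with a null set $N$. By the observations following Definition 1.7 there exists an $N' \in \mathcal{S}$ containing $N$ that satisfies $\mu(N') = 0$; thus by (1.7) $\omega(N) = 0$. Moreover, for every $Y \in \mathcal{P}(X)$ we have $\omega(N \cap Y) \le \omega(N) = 0$, hence
\[ \omega(Y) \ge \omega(N^c \cap Y) = \omega(N \cap Y) + \omega(N^c \cap Y), \]
so that $N \in \mathcal{M}$. Now for every $T := S \cup N \in \mathcal{S}_\mu$ with $N$ null, $S \in \mathcal{S}$, and $S \cap N = \emptyset$ we have $T \in \mathcal{M}$ since $\mathcal{M}$ is a $\sigma$-algebra containing $\mathcal{S}$ and $N \in \mathcal{M}$. Moreover, since $\omega$ is subadditive and increasing,
\[ \omega(T) \le \omega(S) + \omega(N) = \omega(S) \le \omega(T). \]
Since $\omega(S) = \mu(S) = \overline{\mu}(T)$, we get $\omega(T) = \overline{\mu}(T)$.

It remains to show that $\mathcal{M} \subset \mathcal{S}_\mu$ when $\mu$ is $\sigma$-finite. Let us first suppose $M \in \mathcal{M}$ with $\omega(M) < +\infty$ and construct some $A \in \mathcal{S}$ such that $M \subset A$ and $\mu(A) = \omega(M)$. By definition of $\omega$, for every $k \in \mathbb{N} \setminus \{0\}$ we can find a covering $(A_{k,n})_n$ of $M$ by members of the $\sigma$-ring $\mathcal{S}$ such that
\[ \sum_n \mu(A_{k,n}) \le \omega(M) + \frac{1}{k}. \]
For $A_k := \bigcup_n A_{k,n}$ we have $A_k \in \mathcal{S}$, $M \subset A_k$, and $\mu(A_k) \le \omega(M) + 1/k$. Let $A := \bigcap_k A_k$, so that $A \in \mathcal{S}$, $M \subset A$ and $\mu(A) \le \inf_k \mu(A_k) \le \omega(M)$. Since $\omega(M) \le \omega(A) = \mu(A)$ we get $\mu(A) = \omega(M)$. Thus $M = A \,\Delta\, (A \setminus M)$ with $\omega(A \setminus M) = 0$ and $A \setminus M \in \mathcal{M}$ (but not necessarily $A \setminus M \in \mathcal{S}$). Replacing $M$ with $A \setminus M$ in what precedes, we find some $B \in \mathcal{S}$ such that $A \setminus M \subset B$ and $\mu(B) = \omega(A \setminus M) = 0$. This shows that $A \setminus M$ is a null set and that $M = A \,\Delta\, (A \setminus M) \in \mathcal{S}_\mu$. In the general case, taking a sequence $(S_n)$ in $\mathcal{S}$ whose union is $X$ and such that $\mu(S_n) < +\infty$ for all $n$, we reduce the question to the case of $M_n := M \cap S_n$, which is of finite measure. $\square$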
Exercises
1. Find two Borel subsets $A$, $B$ of $\mathbb{R}$ whose Lebesgue measure is $0$ and whose sum $A + B$ is $\mathbb{R}$.
2. With the notation of Proposition 1.11 show that $\mathcal{S}_\mu$ is the smallest $\sigma$-ring containing $\mathcal{S}$ on which a complete measure extending $\mu$ can be defined.
3. Let $(X, \mathcal{S}, \mu)$ be a measure space, the values of $\mu$ on $\mathcal{S}$ being finite. For $A$, $B \in \mathcal{S}$ set $d_0(A, B) = \mu(A \,\Delta\, B)$. Verify that $d_0$ is a semimetric: for any $A$, $B$, $C \in \mathcal{S}$
\[ d_0(A, A) = 0, \qquad d_0(B, A) = d_0(A, B), \qquad d_0(A, C) \le d_0(A, B) + d_0(B, C). \]
Let $\widehat{\mathcal{S}}$ be the quotient space of $\mathcal{S}$ with respect to the equivalence relation $A \sim B$ if $d_0(A, B) = 0$ and let $p : \mathcal{S} \to \widehat{\mathcal{S}}$ be the quotient map. Verify that the function $d : \widehat{\mathcal{S}} \times \widehat{\mathcal{S}} \to \mathbb{R}$ characterized by $d(p(A), p(B)) := d_0(A, B)$ for all $A$, $B \in \mathcal{S}$ is a metric on $\widehat{\mathcal{S}}$ and prove that $(\widehat{\mathcal{S}}, d)$ is complete. [Hint: using the relation $A_0 \,\Delta\, A_n \subset \bigcup_{i=0}^{n-1} A_i \,\Delta\, A_{i+1}$ show that for any sequence $(A_n)$ of $\mathcal{S}$ satisfying $d_0(A_n, A_{n+1}) \le 2^{-n}$ one has $(d_0(A_n, B)) \to 0$ for $B := \bigcup_n \bigcap_{p \ge n} A_p$, and $(d_0(A_n, C)) \to 0$ for $C := \bigcap_n \bigcup_{p \ge n} A_p$.] Prove that if $\mathcal{S}$ is the $\sigma$-algebra generated by a subalgebra $\mathcal{A}$ of $\mathcal{S}$ then the image of $\mathcal{A}$ by $p$ is dense in $\widehat{\mathcal{S}}$.
4. (Poincaré's Recurrence Theorem) Let $(X, \mathcal{S}, \mu)$ be a finite measure space and let $T : X \to X$ be a measure-preserving map, i.e. $\mu(T(A)) = \mu(A)$ for all $A \in \mathcal{S}$. Show that for all $A \in \mathcal{S}$ there exists a null set $N$ of $A$ such that for all $x \in A \setminus N$ one has $T^{(n)}(x) \in A$ for infinitely many $n$. [Hint: first prove that there exists some null set $N_0$ of $A$ such that for all $x \in A \setminus N_0$ one has $T^{n}(x) \in A$ for at least one $n \in \mathbb{N} \setminus \{0\}$; then apply this preliminary result to $T^{(2)} := T \circ T$, $T^{(3)}, \dots$ and conclude.]
1.7 Lebesgue and Stieltjes Measures

Anticipating on Sect. 2.2, let us observe that if $\mathcal{B}$ is the $\sigma$-algebra generated by a family $\mathcal{O}$ of subsets of a set $X$ and if there exists a countable subfamily $\mathcal{C}$ of $\mathcal{B}$ such that each element of $\mathcal{O}$ is the union of a subfamily of $\mathcal{C}$, then $\mathcal{B}$ is the $\sigma$-algebra generated by $\mathcal{C}$. In particular, if $(X, \mathcal{O})$ is a topological space, if $\mathcal{B}$ is the Borel algebra generated by $\mathcal{O}$, and if every open set is the union of a family contained in a countable subfamily $\mathcal{C}$ of $\mathcal{B}$, then $\mathcal{B}$ is also generated by $\mathcal{C}$. That happens when $\mathcal{O}$ has a countable base. For $X = \mathbb{R}^d$ one can take for $\mathcal{C}$ the family of open balls whose radii are rational and whose centers have rational coordinates. One can also take closed such balls. Since any open interval $]a, b[$ of $\mathbb{R}$ is the countable union of a family of semi-closed intervals of the form $[a_n, b_n[$, one can also take for $\mathcal{C}$ the family $\mathcal{R}_d$ of semi-closed rectangles we call boxes $C := [a_1, b_1[ \times \dots \times [a_d, b_d[$ with $a_i$, $b_i \in \mathbb{R}$, with the convention $[a_i, b_i[ = \emptyset$ if $a_i \ge b_i$. Such a family is a semi-ring. For the case $d = 1$ this follows from the relations
\[ [a, b[ \,\cap\, [a', b'[ = [\sup(a, a'), \inf(b, b')[, \qquad [a, b[ \,\setminus\, [a', b'[ = [a, \inf(a', b)[ \,\cup\, [\sup(a, b'), b[, \]
the two intervals of the last right-hand side being disjoint. The general case follows by induction and the observation made above. Let us note that it is the stability of $\mathcal{R}_d$ under intersection which justifies our choice of semi-closed intervals and, in dimension $d > 1$, the choice of boxes for $\mathcal{C}$. Moreover, $\mathcal{C}$ is a semi-ring.

The preceding observations and the extension results of Sect. 1.5 entail that there is a unique measure $\lambda_d$ on the Borel $\sigma$-algebra $\mathcal{B}_d := \mathcal{B}(\mathbb{R}^d)$ of $\mathbb{R}^d$ whose restriction to $\mathcal{R}_d$ is the volume: $\lambda_d(C) := (b_1 - a_1) \cdots (b_d - a_d)$ for $C := [a_1, b_1[ \times \dots \times [a_d, b_d[$, provided one shows that $\lambda_d$ is $\sigma$-additive on $\mathcal{R}_d$. That will be shown for the larger class of Stieltjes measures. The completion of $\mathcal{B}_d$ for the measure $\lambda_d$ is called the Lebesgue $\sigma$-algebra. The measure on this algebra is still denoted by $\lambda_d$ rather than $\overline{\lambda}_d$.

We are ready to study Stieltjes measures on the Borel $\sigma$-algebra $\mathcal{B}$ of $\mathbb{R}$ (or on the Lebesgue $\sigma$-algebra of $\mathbb{R}$). They are the measures that are finite on any interval bounded above. First, we associate a Stieltjes measure to an increasing, left-continuous function $g : \mathbb{R} \to \mathbb{R}$. We start with the Stieltjes outer measure $\omega := \omega_g$ associated with $g$. Let us denote by $\mathcal{C}$ the collection of bounded intervals of the form $C := [a, b[$ with $a$, $b \in \mathbb{R}$, $a \le b$, and for $C := [a, b[$ let us set $\mu(C) := g(b) - g(a)$. For a subset $S$ of $\mathbb{R}$, $\omega$ is defined by
\[ \omega(S) = \inf\Bigl\{ \sum_n \mu(C_n) : (C_n) \in \mathcal{C}(S) \Bigr\}, \]
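where $\mathcal{C}(S)$ is the collection of sequences $(C_n)$ of $\mathcal{C}$ whose unions contain $S$. In order to prove that the restriction of $\omega$ to $\mathcal{C}$ coincides with $\mu$, by Proposition 1.9 it suffices to show that $\mu$ is $\sigma$-additive. We do that by using the notion of compactness of Sect. 2.2.4. Let $(C_n)_{n \ge 0} := ([a_n, b_n[)_{n \ge 0}$ be a sequence of disjoint elements of $\mathcal{C}$ whose union $C := [a, b[$ belongs to $\mathcal{C}$. Since by Lemma 1.6 $\mu$ can be extended to an additive function, also denoted by $\mu$, on the ring $\mathcal{A}$ generated by $\mathcal{C}$, for all $p \in \mathbb{N} \setminus \{0\}$ we have
\[ \sum_{n=0}^{p} \mu(C_n) = \mu(C_0 \cup \dots \cup C_p) \le \mu(C), \]
hence $\sum_{n=0}^{\infty} \mu(C_n) \le \mu(C)$. Let us prove the reverse inequality. Given $\varepsilon > 0$ let $b' < b$ be such that $g(b') > g(b) - \varepsilon/2$ and for all $n \in \mathbb{N}$ let $a'_n < a_n$ be such that $g(a'_n) > g(a_n) - \varepsilon/2^{n+2}$; such points exist since $g$ is left continuous. By compactness of $[a, b']$ we can find some $p \in \mathbb{N}$ such that
\[ [a, b'[ \;\subset\; [a, b'] \;\subset\; \bigcup_{n=0}^{p} \,]a'_n, b_n[ \;\subset\; \bigcup_{n=0}^{p} [a'_n, b_n[, \]
so that
\[
\begin{aligned}
g(b) - g(a) &< g(b') - g(a) + \frac{\varepsilon}{2} = \mu([a, b'[) + \frac{\varepsilon}{2} \le \sum_{n=0}^{p} \mu([a'_n, b_n[) + \frac{\varepsilon}{2} \\
&= \sum_{n=0}^{p} \bigl(g(b_n) - g(a'_n)\bigr) + \frac{\varepsilon}{2} \le \sum_{n=0}^{p} \bigl(g(b_n) - g(a_n)\bigr) + \sum_{n=0}^{p} \frac{\varepsilon}{2^{n+2}} + \frac{\varepsilon}{2},
\end{aligned}
\]
and therefore
\[ \mu(C) \le \sum_{n=0}^{p} \mu(C_n) + \varepsilon. \]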
Since $\varepsilon > 0$ is arbitrarily small, the required inequality is established, so $\mu$ is $\sigma$-additive on $\mathcal{C}$ and its extension to $\mathcal{A}$ is also $\sigma$-additive. Moreover, the restriction of $\omega$ to $\mathcal{C}$ coincides with $\mu$. Since $\mathbb{R}$ is the union of the intervals $[n, n + 1[$ with finite measure, Proposition 1.10 ensures that the restriction of $\omega$ to the $\sigma$-algebra $\mathcal{A}_\sigma$ generated by $\mathcal{A}$ is an extension of $\mu$. Since $\mathcal{A}_\sigma = \mathcal{B}$ we have proved the following result.

Proposition 1.13 Given a nondecreasing, left continuous function $g : \mathbb{R} \to \mathbb{R}$, there exists a unique $\sigma$-finite measure $\mu := \mu_g$ on the Borel $\sigma$-algebra $\mathcal{B}$ of $\mathbb{R}$ such that $\mu([a, b[) = g(b) - g(a)$ for all $a, b \in \mathbb{R}$ with $a \le b$.

If $g$ is bounded below, $\mu_g$ is called the Stieltjes measure associated with $g$. Since any bounded closed interval $T := [a, b]$ is the intersection of a decreasing sequence $(C_n)$ of intervals $C_n$ of the form $[a, b_n[$, Lemma 1.5 ensures that $\mu(T) = \lim_n g(b_n) - g(a) = g(b_+) - g(a)$ with $g(b_+) := \lim_{b' (> b) \to b} g(b')$. Similarly, writing $]a, b[$ as the union of an increasing sequence of intervals in $\mathcal{C}$, we get $\mu(]a, b[) = g(b) - g(a_+)$. The measure $\mu_h$ associated with the function $h := g + c$, where $c$ is a constant, obviously coincides with $\mu_g$. Thus, when $g$ is bounded below we can assume that $\inf g(\mathbb{R}) = 0$ and then, for all $r \in \mathbb{R}$, we have $\mu(]-\infty, r[) = g(r)$. If $g$ is bounded, $\mu_g$ is finite. In particular, if $\inf g(\mathbb{R}) = 0$ and $\sup g(\mathbb{R}) = 1$ we get a probability on $\mathcal{B}$.

Now we show that any Stieltjes measure on the Borel $\sigma$-algebra $\mathcal{B}$ of $\mathbb{R}$ is obtained in the preceding manner.

Proposition 1.14 Let $\mu : \mathcal{B} \to \overline{\mathbb{R}}_+$ be a measure on the Borel $\sigma$-algebra of $\mathbb{R}$ such that $\mu(]-\infty, r[)$ is finite for all $r \in \mathbb{R}$. Then, the function $g : \mathbb{R} \to \mathbb{R}$ given by $g(r) := \mu(]-\infty, r[)$ is nondecreasing, left continuous and $\inf g(\mathbb{R}) = 0$. Moreover, the measure $\mu_g$ associated with $g$ coincides with $\mu$.

Proof Given $\mu$ and $g$ as in the statement, for $r \le s$ in $\mathbb{R}$ we get $g(r) \le g(s)$ since $]-\infty, r[ \,\subset\, ]-\infty, s[$. Let $(r_n)$ be an increasing sequence in $]-\infty, r[$ with limit $r$ (we write $(r_n) \to r_-$). Since $]-\infty, r[ \,= \bigcup_n ]-\infty, r_n[$, Lemma 1.5 ensures that $(g(r_n)) \to g(r)$. We easily deduce from this the left continuity of $g$. The relation $\inf g(\mathbb{R}) = 0$ similarly follows from Lemma 1.5. Since for $a \le b$ in $\mathbb{R}$ one has $[a, b[ \,=\, ]-\infty, b[ \,\setminus\, ]-\infty, a[$, we have $\mu([a, b[) = g(b) - g(a)$. The uniqueness assertion of the preceding proposition ensures that the measure $\mu_g$ coincides with $\mu$. $\square$
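The correspondence of Proposition 1.14 is easy to check for a purely atomic measure. The sketch below is illustrative only: the dictionary `atoms` and the function names are assumptions, not notation from the text. It builds $g(r) := \mu(]-\infty, r[)$ for three point masses and compares $g(b) - g(a)$ with $\mu([a, b[)$ computed directly.

```python
from fractions import Fraction

# Hypothetical finite measure given by point masses: atom -> weight.
atoms = {
    Fraction(0): Fraction(1, 2),
    Fraction(1): Fraction(1, 3),
    Fraction(2): Fraction(1, 6),
}

def mu_half_open(a, b):
    """mu([a, b[) computed directly from the atoms lying in [a, b[."""
    return sum((w for x, w in atoms.items() if a <= x < b), Fraction(0))

def g(r):
    """Distribution function g(r) = mu(]-inf, r[): total weight strictly below r."""
    return sum((w for x, w in atoms.items() if x < r), Fraction(0))

# g(b) - g(a) should reproduce mu([a, b[) for a <= b.
check = g(Fraction(2)) - g(Fraction(0)) == mu_half_open(Fraction(0), Fraction(2))
```

Because the atom at $r$ is excluded from $]-\infty, r[$, the function `g` is left continuous: it jumps just after each atom, and the size of the jump $g(r_+) - g(r)$ is the mass of $\{r\}$.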
The function $g$ associated with a Stieltjes measure $\mu$ on $\mathcal{B}$ is called the distribution function of $\mu$. In particular, the study of probabilities on $\mathcal{B}$ can be somewhat reduced to the study of functions.

The Lebesgue measure $\lambda$ is obtained as the special case of Proposition 1.13 obtained by taking for $g$ the identity function of $\mathbb{R}$. Then for any bounded interval $T$ of $\mathbb{R}$ the measure $\lambda(T)$ of $T$ is its length. Let us note that the proof of Proposition 1.10 shows that $\lambda$ is a measure on the $\sigma$-algebra $\mathcal{M}$ of measurable subsets of $\mathbb{R}$ for the outer measure $\omega$ associated with the length. Countable subsets of $\mathbb{R}$ have Lebesgue measure $0$, a singleton $\{a\}$ being the interval $[a, a]$ with length $0$. However there are uncountable subsets $S$ of $\mathbb{R}$ such that $\lambda(S) = 0$, for example the Cantor set.

Example (The Cantor Set) Consider the sequence $(C_n)$ of subsets of $[0, 1]$ obtained by taking $C_0 := [0, 1]$, $C_1 := [0, 1/3] \cup [2/3, 1]$, $C_{n+1}$ being obtained from $C_n$ by removing the open middle third of each interval forming $C_n$. The Cantor set is the set $C := \bigcap_n C_n$. Its Lebesgue measure is $0$ since $\lambda(C_n) = (2/3)^n$. It can be shown that $C$ is the image of the set $\{0, 1\}^{\mathbb{N}}$ of sequences $(k_n)_n$ with $k_n = 0$ or $1$ under the map $(k_n) \mapsto \sum_n 2k_n/3^{n+1}$. Since this map is injective, $C$ is uncountable (it has the cardinality of the continuum).

Proposition 1.15 The Borel $\sigma$-algebra $\mathcal{B}$ and the Lebesgue $\sigma$-algebra $\mathcal{M}$ on $\mathbb{R}$ are invariant under all translations $t_c : x \mapsto x - c$ and dilations $h_r : x \mapsto rx$ for $c \in \mathbb{R}$ and $r > 0$. Moreover, $\lambda(t_c(S)) = \lambda(S)$ and $\lambda(h_r(S)) = r\lambda(S)$ for all $S \in \mathcal{M}$.

Proof The class of sets $B \in \mathcal{B}$ (resp. $B \in \mathcal{M}$) such that $t_c(B) \in \mathcal{B}$ (resp. $h_r(B) \in \mathcal{M}$) is a $\sigma$-algebra, as is easily seen. Since it contains the class of intervals, it coincides with $\mathcal{B}$ (resp. $\mathcal{M}$). Moreover, $\lambda$, $\lambda \circ t_c$, and $r^{-1} \lambda \circ h_r$ coincide on the class $\mathcal{C}$ of intervals, hence coincide on $\mathcal{M}$. $\square$
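The relation $\lambda(C_n) = (2/3)^n$ used in the Cantor set example can be checked with exact rational arithmetic. This is a small sketch; the function names are illustrative and not taken from the text.

```python
from fractions import Fraction

def cantor_step(intervals):
    """Remove the open middle third of each closed interval (a, b)."""
    out = []
    for a, b in intervals:
        third = (b - a) / 3
        out.append((a, a + third))
        out.append((b - third, b))
    return out

def cantor_level(n):
    """Endpoints of the closed intervals forming C_n, starting from C_0 = [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = cantor_step(intervals)
    return intervals

def total_length(intervals):
    """Sum of the lengths of pairwise disjoint intervals."""
    return sum((b - a for a, b in intervals), Fraction(0))

level_5_length = total_length(cantor_level(5))
```

Each step doubles the number of intervals and divides their common length by $3$, so $C_n$ consists of $2^n$ intervals of length $3^{-n}$.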
Exercises
1. Let $\mathcal{C}$ be the semi-ring of bounded semi-closed intervals of $\mathbb{R}$ and let $\lambda$ be the length measure on $\mathcal{C}$. Let $C \in \mathcal{C}$ and $C_1, \dots, C_m$ be disjoint elements of $\mathcal{C}$ such that $C_1 \cup \dots \cup C_m \subset C$. Prove that $\lambda(C_1) + \dots + \lambda(C_m) \le \lambda(C)$. Let $C$ and $C_1, \dots, C_m$ be elements of $\mathcal{C}$ such that $C \subset C_1 \cup \dots \cup C_m$. Prove that $\lambda(C) \le \lambda(C_1) + \dots + \lambda(C_m)$. Deduce from these facts that there is a unique (countably additive) measure extending $\lambda$ on the $\sigma$-ring generated by $\mathcal{C}$.
2*. Let $X := [0, 1]$ and let $I := X/\!\sim$ be the quotient of $X$ by the equivalence relation $\sim$ defined by $x \sim y$ if $x - y \in \mathbb{Q}$. Using the axiom of choice, define a map $s : I \to X$ such that $s(i) \in p^{-1}(i)$ for all $i \in I$, where $p : X \to X/\!\sim$ is the quotient map. Show that $A := s(I)$ does not belong to the $\sigma$-algebra of Lebesgue measurable subsets of $X$. [See [240, pp. 24–25].]
3. Prove that the Lebesgue $\sigma$-algebra $\mathcal{L}$ of $\mathbb{R}$ is equipotent to $\mathcal{P}(\mathbb{R})$. [Using the fact that the Cantor set $C$ is equipotent to $\mathbb{R}$ and belongs to the class $\mathcal{N}$ of null sets of $\mathbb{R}$, note the inclusions $\mathcal{P}(C) \subset \mathcal{N} \subset \mathcal{L} \subset \mathcal{P}(\mathbb{R})$.]
4*. (A Lebesgue measurable set that is not Borel) Let $h : [0, 1] \to [0, 1]$ be given by $h(1) := 1$, $h(x) := \sum_{n \ge 1} 2x_n/3^n$ if $x := \sum_{n \ge 1} x_n/2^n$ with $x_n \in \{0, 1\}$ and such that for all $k \in \mathbb{N}$ there exists an $n \ge k$ with $x_n \ne 1$. Verify that $h$ is strictly increasing, hence Borel measurable, and that $h([0, 1]) \subset C$, the Cantor set. Let $A$ be the set obtained in Exercise 2. Show that $S := h(A)$ is Lebesgue measurable but not a Borel set. [Hint: $S$ is a null set as $S \subset C$, but since $h^{-1}(S) = A$ is not a Borel set, $S$ cannot be a Borel set.]
5*. (A non-Borel function that is Riemann-integrable) Let $C_n$ be the set described in the construction of the Cantor set $C$ and let $f_n := 1_{C_n}$ be its characteristic function. Verify that $\int_0^1 f_n = (2/3)^n \to 0$. Deduce from this that if $f$ is the characteristic function of the subset $S$ of $C$ obtained in Exercise 4, then $f$ is not a Borel function but is Riemann-integrable with a null integral. [Hint: note that $0 \le 1_S \le 1_C \le 1_{C_n}$.]
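1.8 * Product Measures

Because the content of this section is somewhat involved and can be obtained more easily by using integration theory, we suggest that this section be skipped on first reading. On the other hand, it has its place in the present chapter and it can be considered as good training in set theory.

Given $\sigma$-finite measure spaces $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$, can one endow the product space $X \times Y$ with a canonical measure? It is the purpose of the present section to give a positive answer to this question. We first consider product rings and $\sigma$-algebras.

Given sets $X$, $Y$ and rings $\mathcal{A}$, $\mathcal{B}$ on $X$ and $Y$ respectively, we denote by $\mathcal{C}$ the collection of rectangles, i.e. sets of the form $C := A \times B$ with $A \in \mathcal{A}$, $B \in \mathcal{B}$, and by $\mathcal{A} \boxtimes \mathcal{B}$ the collection of unions of finite families of disjoint elements of $\mathcal{C}$. The relations
\[
\begin{aligned}
(A \times B) \cap (A' \times B') &= (A \cap A') \times (B \cap B'), \\
(A \times B) \setminus (A' \times B') &= [(A \setminus A') \times B] \cup [(A \cap A') \times (B \setminus B')], \\
(A \times B)^c &= (A^c \times Y) \cup (A \times B^c)
\end{aligned}
\]
show that $\mathcal{C}$ is a semi-ring, and Lemma 1.4 entails that $\mathcal{A} \boxtimes \mathcal{B}$ is the ring generated by $\mathcal{C}$. Moreover, $\mathcal{A} \boxtimes \mathcal{B}$ is an algebra if $\mathcal{A}$ and $\mathcal{B}$ are algebras. The following proposition describes the $\sigma$-algebra generated by $\mathcal{A} \boxtimes \mathcal{B}$. Given a class $\mathcal{G}$ of subsets of a set $Z$, we denote by $\mathcal{G}_\sigma$ the $\sigma$-algebra generated by $\mathcal{G}$.

Proposition 1.16 Given rings $\mathcal{A}$, $\mathcal{B}$ on the sets $X$ and $Y$ respectively, the $\sigma$-algebra $\mathcal{A} \otimes \mathcal{B} := (\mathcal{A} \boxtimes \mathcal{B})_\sigma$ generated by the semi-ring $\mathcal{C}$ or by the ring $\mathcal{A} \boxtimes \mathcal{B}$ coincides with the $\sigma$-algebra $\mathcal{A}_\sigma \otimes \mathcal{B}_\sigma := (\mathcal{A}_\sigma \boxtimes \mathcal{B}_\sigma)_\sigma$ generated by the ring $\mathcal{A}_\sigma \boxtimes \mathcal{B}_\sigma$.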
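Proof Since $\mathcal{D} := \mathcal{A} \boxtimes \mathcal{B} \subset \mathcal{A}_\sigma \boxtimes \mathcal{B}_\sigma$ it follows that $\mathcal{A} \otimes \mathcal{B} := \mathcal{D}_\sigma \subset \mathcal{A}_\sigma \otimes \mathcal{B}_\sigma$. Given $B \in \mathcal{B}$, the set $\mathcal{A}_B := \{M \in \mathcal{P}(X) : M \times B \in \mathcal{D}_\sigma\}$ contains $\mathcal{A}$ and is closed under finite intersections, relative complementation since $(M \setminus N) \times B = (M \times B) \setminus (N \times B)$, and symmetric differences since $(M \times B) \,\Delta\, (M' \times B) = (M \,\Delta\, M') \times B$. Moreover, $\mathcal{A}_B$ is closed under countable unions. Thus $\mathcal{A}_B$ contains $\mathcal{A}_\sigma$. This means that for all $M \in \mathcal{A}_\sigma$ and all $B \in \mathcal{B}$ we have $M \times B \in \mathcal{D}_\sigma$. Thus the set $\mathcal{B}_M := \{B \in \mathcal{P}(Y) : M \times B \in \mathcal{D}_\sigma\}$ contains $\mathcal{B}$. As above one can show it is a $\sigma$-algebra. Thus $\mathcal{B}_M$ contains $\mathcal{B}_\sigma$, so that for all $M \in \mathcal{A}_\sigma$ and all $N \in \mathcal{B}_\sigma$ we have $M \times N \in \mathcal{D}_\sigma$. Thus $\mathcal{A}_\sigma \otimes \mathcal{B}_\sigma \subset \mathcal{D}_\sigma$ and equality holds. $\square$

We denote by $p_X : X \times Y \to X$ and $p_Y : X \times Y \to Y$ the canonical projections. If $f : X \times Y \to Z$ is a map with values in a measurable space $(Z, \mathcal{S})$ and if $P \in \mathcal{P}(X \times Y)$, for $x \in X$ we denote by $f_x : Y \to Z$ the partial map of $f$ and by $P_x \in \mathcal{P}(Y)$ the slice of $P$ defined by
\[ f_x(y) := f(x, y), \quad y \in Y, \qquad P_x := \{y \in Y : (x, y) \in P\}. \]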
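Lemma 1.8 Let $X$, $Y$ be sets and let $\mathcal{A}$, $\mathcal{B}$ be rings (resp. $\sigma$-algebras) on $X$ and $Y$ respectively. Then, for all $P \in \mathcal{A} \boxtimes \mathcal{B}$ (resp. $\mathcal{A} \otimes \mathcal{B}$) and for all $x \in X$ one has $P_x \in \mathcal{B}$. For any measurable map $f : (X \times Y, \mathcal{A} \otimes \mathcal{B}) \to (Z, \mathcal{S})$ and any $x \in X$, $f_x$ is measurable. If $(W, \mathcal{W})$ is a measurable space, a map $g : W \to X \times Y$ is measurable with respect to $\mathcal{W}$ and $\mathcal{A} \otimes \mathcal{B}$ if and only if $p_X \circ g$ and $p_Y \circ g$ are measurable.

Proof Let $\mathcal{Q}$ be the collection of sets $Q \in \mathcal{A} \boxtimes \mathcal{B}$ (resp. $\mathcal{A} \otimes \mathcal{B}$) such that $Q_x \in \mathcal{B}$ for all $x \in X$. We want to show that $\mathcal{Q} = \mathcal{A} \boxtimes \mathcal{B}$ (resp. $\mathcal{A} \otimes \mathcal{B}$). Since $(X \times Y)_x = Y$ and since $\mathcal{Q}$ contains the family $\mathcal{C}$ of rectangles of the form $A \times B$ with $A \in \mathcal{A}$, $B \in \mathcal{B}$, it suffices to show that $\mathcal{Q}$ is a ring (resp. a $\sigma$-algebra). This follows from the relations
\[ (P \setminus Q)_x = P_x \setminus Q_x, \qquad (P \cup Q)_x = P_x \cup Q_x, \qquad \Bigl(\bigcup_n P_n\Bigr)_x = \bigcup_n (P_n)_x \]
for all $P$, $Q$, $P_n \in \mathcal{P}(X \times Y)$ and all $x \in X$.

Let $f : (X \times Y, \mathcal{A} \otimes \mathcal{B}) \to (Z, \mathcal{S})$ be measurable and let $x \in X$. For every $S \in \mathcal{S}$ one has $f_x^{-1}(S) = (f^{-1}(S))_x \in \mathcal{B}$ by the first part of the lemma. Thus $f_x$ is measurable.

Let $g : W \to X \times Y$. If $g$ is measurable with respect to $\mathcal{W}$ and $\mathcal{A} \otimes \mathcal{B}$, since $p_X$ and $p_Y$ are measurable, $p_X \circ g$ and $p_Y \circ g$ are measurable. Conversely, if $p_X \circ g$ and $p_Y \circ g$ are measurable, for all $A \in \mathcal{A}$, $B \in \mathcal{B}$ one has $g^{-1}(A \times B) = (p_X \circ g)^{-1}(A) \cap (p_Y \circ g)^{-1}(B) \in \mathcal{W}$. Since $\mathcal{C}$ generates the $\sigma$-algebra $\mathcal{A} \otimes \mathcal{B}$, $g$ is measurable. $\square$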
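For the remaining part of this section, $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ are $\sigma$-finite measure spaces. We want to construct a natural measure $\mu \otimes \nu$ on $(X \times Y, \mathcal{M} \otimes \mathcal{N})$. We denote by $\mathcal{A}$ (resp. $\mathcal{B}$) the ring formed by those $A \in \mathcal{M}$ (resp. $B \in \mathcal{N}$) such that $\mu(A) < +\infty$ (resp. $\nu(B) < +\infty$). In the sequel we assume $\mu$ and $\nu$ are $\sigma$-finite, hence that $\mathcal{M} = \mathcal{A}_\sigma$ and $\mathcal{N} = \mathcal{B}_\sigma$. Then, Proposition 1.16 ensures that
\[ \mathcal{M} \otimes \mathcal{N} = (\mathcal{A} \boxtimes \mathcal{B})_\sigma = \mathcal{C}_\sigma. \]
It is natural to define $\mu \otimes \nu$ on $\mathcal{C}$ by setting
\[ (\mu \otimes \nu)(A \times B) := \mu(A)\,\nu(B), \qquad A \in \mathcal{A},\ B \in \mathcal{B}. \tag{1.12} \]
Let us first show that $\mu \otimes \nu$ is additive on $\mathcal{C}$. We need a refinement of Lemma 1.4.

Lemma 1.9 Given a finite family $(C_i)_{i \in I}$ of elements of a semi-ring $\mathcal{C}$ one can find a finite partition $(E_j)_{j \in J}$ of $C := \bigcup_{i \in I} C_i$ by elements of $\mathcal{C}$ such that for all $i \in I$ one has $C_i = \bigcup_{j \in J_i} E_j$ for some subset $J_i$ of $J$.

Proof We prove this assertion by induction on the number $n$ of elements of $I$. For $n = 2$, $I := \{1, 2\}$ we take $J := \{1, 2, 3\}$, $J_1 := \{1, 3\}$, $J_2 := \{2, 3\}$, and we write
\[ C_1 \cup C_2 = E_1 \sqcup E_2 \sqcup E_3 \quad \text{with } E_1 := C_1 \setminus C_2,\ E_2 := C_2 \setminus C_1,\ E_3 := C_1 \cap C_2. \]
In order to pass from $n - 1$ to $n \ge 3$, given a family $(C_i)_{i \in I}$ with $I := \mathbb{N}_n$, we write $C_1 \cup \dots \cup C_{n-1} = \bigsqcup_{h \in H} E_h$, $C_i := \bigsqcup_{h \in H_i} E_h$ with $H_i \subset H \subset \mathbb{N}$ by our induction assumption. Then, replacing $(C_1, C_2)$ with $(\bigsqcup_{h \in H} E_h, C_n)$ in the preceding relation, we get the decomposition
\[ C_1 \cup \dots \cup C_{n-1} \cup C_n = \Bigl(\bigsqcup_{h \in H} (E_h \cap C_n)\Bigr) \sqcup \Bigl(C_n \setminus \bigsqcup_{h \in H} E_h\Bigr) \sqcup \Bigl(\bigsqcup_{h \in H} (E_h \setminus C_n)\Bigr). \]
For $h \in H$ let $F_h := E_h \cap C_n \in \mathcal{C}$. Since $C_n \setminus (\bigsqcup_{h \in H} E_h) = \bigcap_{h \in H} (C_n \setminus E_h)$ belongs to the ring $\mathcal{A}$ generated by $\mathcal{C}$, we can find a finite subset $K$ of $\mathbb{N} \setminus H$ and a partition $(F_k)_{k \in K}$ of $C_n \setminus (\bigsqcup_{h \in H} E_h)$ by members of $\mathcal{C}$. Similarly, we can find a finite subset $L$ of $\mathbb{N} \setminus (H \cup K)$, a partition $(L_h)_{h \in H}$ of $L$, and partitions $(F_\ell)_{\ell \in L_h}$ of $E_h \setminus C_n$ by elements of $\mathcal{C}$. Then, for $J := H \cup K \cup L$, the family $(F_j)_{j \in J}$ is a partition of $C := C_1 \cup \dots \cup C_n$. For $i \in \mathbb{N}_{n-1}$ we have $C_i = \bigsqcup_{j \in J_i} F_j$ for $J_i := H_i \cup (\bigcup_{h \in H_i} L_h)$ and for $i := n$ we have $C_n = \bigsqcup_{j \in J_n} F_j$ for $J_n := H \cup K$, so that the induction step is established. $\square$

Lemma 1.10 The function $\mu \otimes \nu$ is additive on $\mathcal{C} := \{A \times B : A \in \mathcal{A},\ B \in \mathcal{B}\}$.

Proof Let us consider a finite partition $(C_i)_{i \in I}$ of $C := A \times B \in \mathcal{C}$ by elements $C_i := A_i \times B_i$ of $\mathcal{C}$. The families $(A_i)_{i \in I}$ and $(B_i)_{i \in I}$ are not partitions in general. However, by the preceding lemma, we can find finite sets $J$, $K$ and partitions $(E_j)_{j \in J}$, $(F_k)_{k \in K}$ of $A$ and $B$ by nonempty members of $\mathcal{A}$ and $\mathcal{B}$ respectively such that for all $i \in I$ there exist some subsets $J_i$, $K_i$ of $J$ and $K$ respectively such that $(E_j)_{j \in J_i}$ and $(F_k)_{k \in K_i}$ are partitions of $A_i$ and $B_i$ respectively.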
Fig. 1.3 Refined pavement of the yard of Fig. 1.1
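Then $(E_j \times F_k)_{(j,k) \in J \times K}$ is a partition of $A \times B$: for all $(j, k) \in J \times K$, picking some $(x, y) \in E_j \times F_k$ we can find some $i \in I$ such that $(x, y) \in C_i := A_i \times B_i$, so that $(j, k) \in J_i \times K_i$; on the other hand, when $(E_j \times F_k) \cap (E_{j'} \times F_{k'}) \ne \emptyset$ we have $(j, k) = (j', k')$. Then
\[ (\mu \otimes \nu)(C) = \mu(A)\,\nu(B) = \Bigl(\sum_{j \in J} \mu(E_j)\Bigr)\Bigl(\sum_{k \in K} \nu(F_k)\Bigr) = \sum_{(j,k) \in J \times K} \mu(E_j)\,\nu(F_k). \]
Similarly, since $(E_j \times F_k)_{(j,k) \in J_i \times K_i}$ is a partition of $A_i \times B_i$,
\[ (\mu \otimes \nu)(C_i) = \mu(A_i)\,\nu(B_i) = \Bigl(\sum_{j \in J_i} \mu(E_j)\Bigr)\Bigl(\sum_{k \in K_i} \nu(F_k)\Bigr) = \sum_{(j,k) \in J_i \times K_i} \mu(E_j)\,\nu(F_k). \]
Now, let us note that $(J_i \times K_i)_{i \in I}$ is a partition of $J \times K$ since for $h \ne i$ in $I$ and $(j, k) \in (J_h \times K_h) \cap (J_i \times K_i)$ we have $E_j \subset A_h \cap A_i$, $F_k \subset B_h \cap B_i$, hence $E_j \times F_k \subset (A_h \times B_h) \cap (A_i \times B_i) = \emptyset$. Thus, by associativity of summation,
\[ \sum_{(j,k) \in J \times K} \mu(E_j)\,\nu(F_k) = \sum_{i \in I} \sum_{(j,k) \in J_i \times K_i} \mu(E_j)\,\nu(F_k) = \sum_{i \in I} (\mu \otimes \nu)(C_i). \]
It follows that $(\mu \otimes \nu)(C) = \sum_{i \in I} (\mu \otimes \nu)(C_i)$. $\square$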
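The refinement argument of Lemma 1.10 can be followed on a concrete partition of a rectangle in $\mathbb{R}^2$, taking for both measures the length of semi-closed intervals. The sketch below is an illustration only: the helper names and the sample partition are assumptions. The grid produced by cutting along every endpoint plays the role of the family $(E_j \times F_k)$.

```python
from itertools import product

def area(rect):
    """Product of the lengths of the two sides of ((a, b), (c, d)) = [a, b[ x [c, d[."""
    (a, b), (c, d) = rect
    return (b - a) * (d - c)

def grid_cells(rects):
    """Common refinement: cut along every x- and y-endpoint occurring in rects."""
    xs = sorted({t for (x, _) in rects for t in x})
    ys = sorted({t for (_, y) in rects for t in y})
    x_cells = list(zip(xs, xs[1:]))   # plays the role of the partition (E_j)
    y_cells = list(zip(ys, ys[1:]))   # plays the role of the partition (F_k)
    return list(product(x_cells, y_cells))

def contains(rect, cell):
    """True when the grid cell is included in the rectangle."""
    (a, b), (c, d) = rect
    (e, f), (h, k) = cell
    return a <= e and f <= b and c <= h and k <= d

# Hypothetical partition of [0, 2[ x [0, 2[ into three rectangles.
whole = ((0, 2), (0, 2))
parts = [((0, 1), (0, 2)), ((1, 2), (0, 1)), ((1, 2), (1, 2))]
cells = grid_cells(parts)

# Each part is the disjoint union of the grid cells it contains.
part_areas_from_cells = [sum(area(c) for c in cells if contains(r, c)) for r in parts]
```

Summing first over the cells of each part and then over the parts is the "associativity of summation" step of the proof. Both orders give the area of the whole rectangle.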
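Then, by Lemma 1.6, $\mu \otimes \nu$ can be uniquely extended into an additive function on the ring $\mathcal{A} \boxtimes \mathcal{B}$. In order to prove that $\mu \otimes \nu$ is $\sigma$-additive we need an estimate (Fig. 1.3).

Lemma 1.11 Let $c \in \mathbb{R}_+$ and let $P \in \mathcal{A} \boxtimes \mathcal{B}$ be such that $\nu(P_x) \le c$ for all $x \in X$. Then one has $(\mu \otimes \nu)(P) \le c\,\mu(p_X(P))$.

Proof Every element of $\mathcal{A} \boxtimes \mathcal{B}$ is a finite union of a disjoint family of rectangles in $\mathcal{C}$. We carry out the proof by an induction on the number $n$ of such rectangles. For $n = 1$, $P := A \times B$ and since $P_x = B$ if $x \in A$ and $P_x = \emptyset$ if $x \in X \setminus A$, we have $(\mu \otimes \nu)(P) = \mu(A)\,\nu(B) \le c\,\mu(A)$ if $\nu(B) \le c$.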
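Now we assume the result is established when $P$ is the disjoint union of at most $n - 1$ rectangles and we prove it when $P = C_1 \sqcup \dots \sqcup C_n$ with $C_i := A_i \times B_i$, where $A_i \in \mathcal{A}$, $B_i \in \mathcal{B}$ for $i \in \mathbb{N}_n$. Let $A := A_1 \cup \dots \cup A_{n-1}$ and let
\[ D = (A_n \setminus A) \times B_n, \qquad E = \bigsqcup_{i=1}^{n-1} (A_i \setminus A_n) \times B_i, \qquad F = \bigcup_{i=1}^{n-1} (A_i \cap A_n) \times (B_i \cup B_n). \]
These sets are disjoint because their projections on $X$ are disjoint and we easily see that $P = D \cup E \cup F$. Since these sets are unions of at most $n - 1$ rectangles, the induction assumption and the inclusions $D_x \subset P_x$, $E_x \subset P_x$, $F_x \subset P_x$ for all $x \in X$ yield
\[
\begin{aligned}
(\mu \otimes \nu)(P) &= (\mu \otimes \nu)(D) + (\mu \otimes \nu)(E) + (\mu \otimes \nu)(F) \\
&\le c\,\mu(p_X(D)) + c\,\mu(p_X(E)) + c\,\mu(p_X(F)) \\
&\le c\,\bigl(\mu(A_n \setminus A) + \mu(A \setminus A_n) + \mu(A \cap A_n)\bigr) = c\,\mu(A \cup A_n) = c\,\mu(p_X(P)). \qquad \square
\end{aligned}
\]

Theorem 1.13 The function $\mu \otimes \nu$ has a unique extension as a measure on $\mathcal{M} \otimes \mathcal{N}$.

Proof Since $\mathcal{M} \otimes \mathcal{N}$ is the $\sigma$-algebra generated by $\mathcal{A} \boxtimes \mathcal{B}$, by Theorem 1.10 it suffices to show that $\mu \otimes \nu$ is $\sigma$-additive on $\mathcal{A} \boxtimes \mathcal{B}$. We first assume that $\mu$ and $\nu$ are finite measures; let $a := \mu(X) > 0$, $b := \nu(Y) > 0$ (the cases $a = 0$ and $b = 0$ are trivial). Since $\mu \otimes \nu$ is finite and additive on $\mathcal{A} \boxtimes \mathcal{B}$, Lemma 1.5 ensures that it suffices to show that for every decreasing sequence $(P_n)$ of elements of $\mathcal{A} \boxtimes \mathcal{B}$ whose intersection is empty one has $\lim_n (\mu \otimes \nu)(P_n) = 0$. Assume on the contrary that there exist $r > 0$ and a decreasing sequence $(P_n)$ such that $\bigcap_n P_n = \emptyset$ and $\lim_n (\mu \otimes \nu)(P_n) > r$. Since for all $P \in \mathcal{A} \boxtimes \mathcal{B}$ and all $x \in X$ the set $P_x$ belongs to $\mathcal{B}$ and $x \mapsto \nu(P_x)$ is a finite linear combination of characteristic functions of members of $\mathcal{A}$, the sets $A_n := \{x \in X : \nu((P_n)_x) > r/(2a)\}$ are measurable. Let $Q_n := P_n \setminus (A_n \times Y)$, so that for all $x \in p_X(Q_n) \subset X \setminus A_n$ we have $\nu((Q_n)_x) = \nu((P_n)_x) \le r/(2a)$. The preceding lemma entails that $(\mu \otimes \nu)(Q_n) \le (r/(2a))\,\mu(p_X(Q_n)) \le r/2$. Since
\[ r < (\mu \otimes \nu)(P_n) = (\mu \otimes \nu)(P_n \setminus Q_n) + (\mu \otimes \nu)(Q_n) \le (\mu \otimes \nu)(A_n \times Y) + r/2, \]
we have $b\,\mu(A_n) > r/2$. Since the sequence $(A_n)$ is decreasing we cannot have $\bigcap_n A_n = \emptyset$. Let $x \in \bigcap_n A_n$. By definition of $A_n$ we have $\nu((P_n)_x) > r/(2a)$. Since the sequence $((P_n)_x)_n$ is decreasing, its intersection must be nonempty. Let $y \in \bigcap_n (P_n)_x$. Then we have $(x, y) \in P_n$ for all $n \in \mathbb{N}$, contradicting our assumption that $\bigcap_n P_n = \emptyset$.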
1.8 * Product Measures
When .X; M; / and .Y; N ; / are -finite measure spaces, the function ˝ is -finite. Given C WD A B 2 C and a countable partition .Cn / of C by elements of C, replacing the measures and by the measures A W M 7! .M \ A/ and B W N 7! .N \ B/, what precedes shows that . ˝ /.C/ D ˙n . ˝ /.Cn /. Then, Hahn’s extension theorem (Theorem 1.10) shows that ˝ can be extended to an additive function on M ˝ N . t u In general, a product of complete measure spaces is not complete. To see this, let WD ˝ be the product measure on X Y of two complete measure spaces .X; A; / and .Y; B; /. We denote by N the set of null sets with respect to a measure with D , , . Then we observe that (with the usual convention 0 .C1/ D 0) .N P.Y// [ .P.X/ N / N :
(1.13)
It follows that if A 2 N nf¿g and B 2 P.Y/nB, then P WD A B … A ˝ B since otherwise, by Lemma 1.8, for x 2 A we would have B D Px 2 B. In particular, since the Lebesgue -algebra L on R obtained by completing the Borel -algebra B of R is different from P.R/ and since the family N of its null sets is not reduced to ¿, we get that .R2 ; L ˝ L; ˝ / is not complete. For d 2 Nnf0g let Ld be the -algebra on Rd obtained by completing the Borel d d -algebra Bd of R Qd and let Cd be the semi-ring on R formed by the semi-closed rectangles C WD iD1 Œai ; bi Œ with ai bi for all i 2 Nd . Since any open subset of Rd is a countable union of such rectangles, Bd is generated by Cd . Let us make precise the links between completed measures and product measures. Proposition 1.17 Let .X; A; / and .Y; B; / be two -finite measure spaces and let .X; A ; / and .Y; B ; / be the completed measure spaces. Then one has .A ˝ B /˝ D .A ˝ B/˝ ;
N˝ D N˝ ;
˝ D ˝ :
Proof The inclusions A ˝ B A ˝ B .A ˝ B /˝ are obvious. Given A WD A [ M 2 A , B WD B [ N 2 B with A 2 A, B 2 B, M 2 N , N 2 N , setting .A ˝ B/[N˝ WD fD [ N W D 2 A B; N 2 N˝ g, we have A B D .A B/ [ .M B/ [ .A N/ [ .M N/ 2 .A B/[N˝ in view of relation (1.13). Thus A B .A B/[N˝ .A ˝ B/˝ : Since the restriction . ˝ / jA˝B of ˝ to A ˝ B coincides with ˝ as both measures coincide on the ring A B that generates A ˝ B, every ˝ -null set P in A ˝ B is ˝ -null: N˝ N˝ . Now, since A ˝ B is a -algebra,
given P 2 N˝ we can find some Q 2 A ˝ B .A ˝ B/˝ such that P Q and . ˝ /.Q/ D 0. The characterization of null sets in .A ˝ B/˝ ensures that there exists some R 2 A ˝ B such that Q R and . ˝ /.R/ D 0. Thus P R with . ˝ /.R/ D 0, so that P 2 N˝ . Therefore N˝ D N ˝ . For every S 2 .A ˝ B /˝ we can find T 2 A ˝ B and P 2 N˝ such that S D T [ P. Then, by what precedes, we have T 2 .A ˝ B/ [ N˝ and P 2 N˝ , so that S D .T 0 [ P0 / [ P D T 0 [ .P [ P0 / with T 0 2 A ˝ B and P [ P0 2 N˝ . Thus S 2 .A ˝ B/˝ and we conclude that .A ˝ B /˝ .A ˝ B/˝ . The reverse inclusion is obvious since for R 2 A ˝ B and N 2 N˝ D N˝ we have R 2 A ˝ B and N 2 N˝ , hence R [ N 2 .A ˝ B /˝ . Moreover, . ˝ /.R [ N/ D . ˝ /.R/ D . ˝ /.R/ D . ˝ /.R [ N/. Thus ˝ D ˝ . t u
Exercises

1. Let $(X, \mathcal{M}, \mu)$ be a $\sigma$-finite complete measure space. Show that a nonnegative function $f$ on $X$ is measurable if and only if its positive hypograph $H_f := \{(x, r) \in X \times \mathbb{R}_+ : r \le f(x)\}$ is measurable. Give properties of the map $f \mapsto (\mu \otimes \lambda)(H_f)$.

2. Let $((X_n, \mathcal{M}_n, \mu_n))$ be a sequence of probability spaces (this means that $\mu_n(X_n) = 1$ for all $n$). Let $\mathcal{M}$ be the $\sigma$-algebra on $X := \prod_n X_n$ generated by the families of sets of the form $p_n^{-1}(A_n)$ with $A_n \in \mathcal{M}_n$, where $p_n : X \to X_n$ is the canonical projection. Show that there exists a probability $\mu$ on $\mathcal{M}$ satisfying $\mu(A \times \prod_{k>n} X_k) = (\mu_1 \otimes \dots \otimes \mu_n)(A)$ for all $n \in \mathbb{N}$ and all $A \in \mathcal{M}_1 \otimes \dots \otimes \mathcal{M}_n$.

3. (Brunn-Minkowski inequality) Prove that for two compact subsets $A$, $B$ of $\mathbb{R}^d$ the following inequality holds, with $A + B := \{a + b : a \in A,\ b \in B\}$, $\lambda_d$ being the Lebesgue measure of $\mathbb{R}^d$:
$$\lambda_d^{1/d}(A + B) \ge \lambda_d^{1/d}(A) + \lambda_d^{1/d}(B).$$
[See [190, p. 88].]
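As a numerical sanity check of the inequality of Exercise 3 (not a proof), one can look at axis-parallel boxes $A = \prod_i [0, a_i]$ and $B = \prod_i [0, b_i]$: then $A + B = \prod_i [0, a_i + b_i]$, so the three Lebesgue measures are products of side lengths. The sketch below relies only on this special case; the function names are illustrative and do not come from the text.

```python
import math
import random

def box_volume(sides):
    """Lebesgue measure of the box [0, s_1] x ... x [0, s_d]."""
    return math.prod(sides)

def brunn_minkowski_gap(a, b):
    """vol(A+B)^(1/d) - vol(A)^(1/d) - vol(B)^(1/d) for boxes A, B with sides a, b."""
    d = len(a)
    sum_sides = [ai + bi for ai, bi in zip(a, b)]  # A + B is again a box
    return (box_volume(sum_sides) ** (1 / d)
            - box_volume(a) ** (1 / d)
            - box_volume(b) ** (1 / d))

# Random boxes in dimensions 1 to 5; the gap should never be negative.
random.seed(0)
gaps = []
for _ in range(1000):
    d = random.randint(1, 5)
    a = [random.uniform(0.1, 3.0) for _ in range(d)]
    b = [random.uniform(0.1, 3.0) for _ in range(d)]
    gaps.append(brunn_minkowski_gap(a, b))

min_gap = min(gaps)
```

For homothetic boxes, such as sides `[1, 2]` and `[2, 4]`, the gap vanishes up to rounding, which illustrates the equality case in this restricted setting.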
1.9 * Regular Measures on Metric Spaces

In this section we anticipate the notion of a metric space. The reader may either obtain the required knowledge from Chap. 2 or suppose the metric space $(X, d)$ we consider is $\mathbb{R}^d$ endowed with one of its usual norms. Note that the Borel $\sigma$-algebra $\mathcal{B}$ of $(X, d)$, i.e. the $\sigma$-algebra generated by the family $\mathcal{O}$ of open subsets of $X$, is not easy to describe. Thus, our aim is to obtain estimates of the measures of Borelian subsets, i.e. sets in $\mathcal{B}$. We denote by $\mathcal{O}$ (resp. $\mathcal{F}$, resp. $\mathcal{K}$) the family of open (resp. closed, resp. compact) subsets of $(X, d)$.
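Definition 1.8 A measure $\mu$ on $(X, \mathcal{B})$ is said to be outer regular (resp. inner regular) if for all $B \in \mathcal{B}$ one has $\mu(B) = \inf\{\mu(G) : G \in \mathcal{O},\ B \subset G\}$ (resp. $\mu(B) = \sup\{\mu(K) : K \in \mathcal{K},\ K \subset B\}$). The measure $\mu$ is said to be regular if it is both outer regular and inner regular.

Proposition 1.18 Let $\mu$ be a finite measure on $(X, \mathcal{B})$. Then for every $B \in \mathcal{B}$ and every $\varepsilon > 0$ there exist $G \in \mathcal{O}$ and $F \in \mathcal{F}$ such that $F \subset B \subset G$ and $\mu(G \setminus F) < \varepsilon$.

Proof We want to show that the family $\mathcal{A}$ of elements $A$ of $\mathcal{B}$ such that for all $\varepsilon > 0$ there exist $G \in \mathcal{O}$ and $F \in \mathcal{F}$ such that $F \subset A \subset G$ and $\mu(G \setminus F) < \varepsilon$ is a $\sigma$-algebra containing $\mathcal{O}$, whence $\mathcal{B}$. Let $A \in \mathcal{O}$; we take $G = A$ and we observe that for $(r_n) \to 0_+$ and $F_n := \{x \in X : d(x, X \setminus A) \ge r_n\}$ we have $F_n \in \mathcal{F}$, $F_n \subset A$, and $\bigcup_n F_n = A$, so that by Lemma 1.5 $\mu(A) = \lim_n \mu(F_n)$. Thus, we can take $F := F_n$ with $n$ large enough and we get $\mathcal{O} \subset \mathcal{A}$.

Now using Proposition 1.5, let us show $\mathcal{A}$ is a $\sigma$-algebra. Let $(A_n)$ be a sequence in $\mathcal{A}$ and let $A := \bigcup_n A_n$. Given $\varepsilon > 0$, for all $n \in \mathbb{N}$ we pick $G_n \in \mathcal{O}$ and $F_n \in \mathcal{F}$ such that
$$F_n \subset A_n \subset G_n \quad \text{and} \quad \mu(G_n \setminus F_n) < \varepsilon / 2^{n+2}.$$
Let $G := \bigcup_n G_n \in \mathcal{O}$, $E := \bigcup_n F_n$ so that $E \subset A \subset G$ and $G \setminus E \subset \bigcup_n (G_n \setminus F_n)$. By $\sigma$-subadditivity we get $\mu(G \setminus E) \le \sum_n \mu(G_n \setminus F_n) \le \varepsilon/2$. On the other hand, since $\mu(E) = \lim_n \mu(\bigcup_{k=0}^{n} F_k)$, we can find $m \in \mathbb{N}$ such that $\mu(E \setminus F) < \varepsilon/2$ for $F := \bigcup_{k=0}^{m} F_k$. Then $F$ is closed and $\mu(G \setminus F) \le \mu(G \setminus E) + \mu(E \setminus F) < \varepsilon$. Thus $A \in \mathcal{A}$. Finally, if $A \in \mathcal{A}$, then $A^c := X \setminus A \in \mathcal{A}$ since if $(G, F) \in \mathcal{O} \times \mathcal{F}$ is such that $F \subset A \subset G$, for $G^c := X \setminus G \in \mathcal{F}$ and $F^c := X \setminus F \in \mathcal{O}$ we have $G^c \subset A^c \subset F^c$ and $F^c \setminus G^c = G \setminus F$, hence $\mu(F^c \setminus G^c) = \mu(G \setminus F) < \varepsilon$. □

The conclusion of the proposition is equivalent to the following assertion: for all $B \in \mathcal{B}$ one has
$$\mu(B) = \sup\{\mu(F) : F \in \mathcal{F},\ F \subset B\} = \inf\{\mu(G) : G \in \mathcal{O},\ B \subset G\}.$$
Let us give a generalization to the case when $\mu$ is $\sigma$-finite.

Theorem 1.14 If $\mu$ is a $\sigma$-finite measure on $(X, \mathcal{B})$, then for all $B \in \mathcal{B}$ one has
$$\mu(B) = \sup\{\mu(F) : F \in \mathcal{F},\ F \subset B\}.$$
If $X$ is the countable union of a sequence $(X_n)$ of open subsets with finite measures, then $\mu$ is outer regular.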
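If $X$ is the countable union of a sequence $(X_n)$ of compact subsets with finite measures, then $\mu$ is inner regular.

Proof Let $(X_n)$ be a sequence in $\mathcal{B}$ with union $X$ such that $\mu(X_n) < +\infty$ for all $n$. We may suppose $(X_n)$ is increasing. Then, for all $B \in \mathcal{B}$ we have $\mu(B) = \lim_n \mu(B \cap X_n)$. Given $r < \mu(B)$ there exists an $m \in \mathbb{N}$ such that $\mu(B \cap X_m) > r$. The measure $\mu_m$ given by $\mu_m(A) := \mu(A \cap X_m)$ being finite, the preceding proposition yields some $F \in \mathcal{F}$ such that $F \subset B$ and $\mu_m(F) > r$. Thus $\mu(F) \ge \mu(F \cap X_m) = \mu_m(F) > r$. This proves the first assertion.

Now suppose $X_n$ is open (and with finite measure) for all $n \in \mathbb{N}$, and let $\varepsilon > 0$ be given. The measure $\mu_n$ on $\mathcal{B}$ given by $\mu_n(A) := \mu(A \cap X_n)$ being finite, the preceding proposition yields some $G_n \in \mathcal{O}$ such that $B \subset G_n$ and $\mu_n(G_n) < \mu_n(B) + \varepsilon/2^{n+1}$, i.e.
$$\mu(G_n \cap X_n) < \mu(B \cap X_n) + 2^{-n-1}\varepsilon. \tag{1.14}$$
Let $G := \bigcup_n G_n \in \mathcal{O}$ and let $H_n := \bigcup_{k=0}^{n} G_k \cap X_k \in \mathcal{O}$. Let us show by induction on $n$ that
$$\mu(H_n) \le \mu(B \cap X_n) + \varepsilon - 2^{-n-1}\varepsilon. \tag{1.15}$$
For $n = 0$, this relation is satisfied by our choice of $G_0$. Let us assume it is satisfied for $n - 1$. Since $H_n \setminus H_{n-1} \subset (G_n \cap X_n) \setminus H_{n-1}$ we have
$$\mu(H_n \setminus H_{n-1}) \le \mu((G_n \cap X_n) \setminus H_{n-1}) = \mu(G_n \cap X_n) - \mu((G_n \cap X_n) \cap H_{n-1}).$$
Moreover, since
$$B \cap X_{n-1} \subset G_{n-1} \cap X_{n-1} \subset H_{n-1}, \qquad B \cap X_{n-1} \subset B \cap X_n \subset G_n \cap X_n,$$
hence $B \cap X_{n-1} \subset G_n \cap X_n \cap H_{n-1}$, $\mu(B \cap X_{n-1}) \le \mu(G_n \cap X_n \cap H_{n-1}) < \infty$, we deduce from (1.14) and the preceding inequalities that
$$\mu(H_n) \le \mu(H_n \setminus H_{n-1}) + \mu(H_{n-1}) \le \mu(G_n \cap X_n) - \mu(B \cap X_{n-1}) + \mu(H_{n-1}) \le \mu(B \cap X_n) + 2^{-n-1}\varepsilon + \varepsilon - 2^{-n}\varepsilon = \mu(B \cap X_n) + \varepsilon - 2^{-n-1}\varepsilon$$
and relation (1.15) holds for all $n \in \mathbb{N}$. Passing to the limit in relation (1.15), for $H := \bigcup_{k \ge 0} H_k$ we get $\mu(H) \le \mu(B) + \varepsilon$ and $B = \bigcup_n (B \cap X_n) \subset \bigcup_n (G_n \cap X_n) = H$.

For the last assertion, one just observes that $F \cap X_n \in \mathcal{K}$ and $\mu(F) = \lim_n \mu(F \cap X_n)$. □

Corollary 1.6 Let $(X, d)$ be a separable locally compact metric space and let $\mu$ be a measure on the Borel $\sigma$-algebra $\mathcal{B}$ of $X$ that is finite on compact subsets of $X$. Then $\mu$ is regular. In particular, any measure on the Borel $\sigma$-algebra $\mathcal{B}_d$ of $\mathbb{R}^d$ that is finite on compact subsets of $\mathbb{R}^d$ is regular.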
Proof Let $\{x_n : n \in \mathbb{N}\}$ be a countable dense subset of $X$ and let $I$ be the set of pairs $(n, r) \in \mathbb{N} \times \mathbb{Q}$ such that $B[x_n, r]$ is compact. Since $I$ is countable, we can find a sequence $(I_k)_{k \ge 0}$ of finite subsets whose union is $I$. Let $X_k := \bigcup_{(n,r) \in I_k} B[x_n, r] \in \mathcal{K}$. For each $x \in X$ there exists a compact neighborhood $K_x$ of $x$. Taking $(n, r) \in \mathbb{N} \times \mathbb{Q}$ such that $x \in B(x_n, r)$ and $B[x_n, r] \subset K_x$, we get that $x \in X_k$ for some $k \in \mathbb{N}$. Thus $X = \bigcup_k X_k$ and $X = \bigcup_k \operatorname{int}(X_k)$. The regularity of $\mu$ follows from the theorem. □
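To make outer regularity concrete, here is a small sketch for the Lebesgue measure on $\mathbb{R}$ and the null set $B := \mathbb{Q} \cap [0, 1]$: covering the $n$-th rational by an open interval of length $\varepsilon/2^{n+1}$ gives an open set $G \supset B$ whose measure is at most the sum of the lengths, hence at most $\varepsilon$. The code only handles the first finitely many rationals, and the enumeration and the function names are assumptions made for the illustration.

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """First `count` distinct rationals p/q in [0, 1], listed by increasing denominator."""
    seen, out = set(), []
    q = 1
    while len(out) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out

def covering_intervals(points, eps):
    """Open interval of length eps / 2**(n+1) centred at the n-th point."""
    return [(x - eps / 2 ** (n + 2), x + eps / 2 ** (n + 2))
            for n, x in enumerate(points)]

eps = Fraction(1, 100)
points = rationals_in_unit_interval(50)
intervals = covering_intervals(points, eps)
# Sum of the lengths: an upper bound for the measure of the union.
total_length = sum(b - a for a, b in intervals)
```

Exact rational arithmetic is used so that the bound `total_length < eps` is not blurred by rounding.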
Exercises

1. Show that any finite measure $\mu$ on the Borel $\sigma$-algebra of a Polish space $(X, d)$, i.e. a complete and separable metric space, is regular in the sense of Definition 1.8. Prove moreover that there is a sequence $(K_n)$ of compact subsets of $X$ such that $\lim_n \mu(X \setminus K_n) = 0$.

2. Show that two measures $\mu$, $\nu$ on the Borel $\sigma$-algebra of a metric space $(X, d)$ coincide whenever one of the following conditions is satisfied.
(a) $\mu$ and $\nu$ coincide on the family $\mathcal{O}$ of open subsets of $X$ and $X$ is the union of a countable family of open subsets of finite measures.
(b) $\mu$ and $\nu$ are finite and coincide on the family $\mathcal{K}$ of compact subsets of $X$ and $X$ is separable and locally compact.
(c) $\mu$ and $\nu$ coincide on the family $\mathcal{K}$ of compact subsets of $X$ and $X$ is separable and complete with $\mu(X) = \nu(X) < +\infty$.

3. Let $(X, d)$ be a separable metric space and let $\mu$ be a measure on the Borel $\sigma$-algebra of $X$. The essential support of a measurable function $f : X \to \mathbb{R}$ is defined as the intersection $S_\mu(f)$ of the family of closed subsets $F$ of $X$ such that $f = 0$ on $X \setminus F$.
(a) Show that when $f$ is continuous one has $S_\mu(f) \subset \operatorname{supp}(f) := \operatorname{cl}(\{x \in X : f(x) \ne 0\})$.
(b) Show that $S_\mu(f) = \operatorname{supp}(f)$ when $f$ is continuous and $\mu(O) > 0$ for all $O \in \mathcal{O} \setminus \{\varnothing\}$.
(c) Show that for any measurable function $f$ on $X$ one has $f = 0$ a.e. on $X \setminus S_\mu(f)$.
(d) Verify that for two measurable functions $f$, $g$ on $X$ satisfying $|f| \le |g|$ a.e. one has $S_\mu(f) \subset S_\mu(g)$. Deduce from this that $S_\mu(f) = S_\mu(g)$ whenever $f = g$ a.e.
(e) Let $(f_n)$ be an increasing sequence of nonnegative measurable functions with limit $f$. Show that $S_\mu(f) = \bigcup_n S_\mu(f_n)$.

4. Let $(X, d)$ be a separable metric space and let $\mu$ be a measure on the Borel $\sigma$-algebra of $X$. Let $\mathcal{O}_\mu := \{O \in \mathcal{O} : \mu(O) = 0\}$. Show that $\mathcal{O}_\mu$ is nonempty and has a greatest element $O_\mu$. The support of the measure $\mu$ is defined to be $X \setminus O_\mu$. Determine the support of a Dirac measure $\delta_a$, of the counting measure $\mu_c$ on a finite set $X$, and of the Lebesgue measure $\lambda_d$ on $X := \mathbb{R}^d$.
5. Let $(X, d)$ be a separable metric space, let $\mu$ be a measure on the Borel $\sigma$-algebra of $X$, and let $f : X \to \mathbb{R}_+$ be a measurable function. Show that the support of the measure $\mu_f := f\mu$ coincides with the essential support $S_\mu(f)$ of $f$ defined in Exercise 3.
Notes, Remarks, and Additional Reading

A basic knowledge of elementary analysis is required for the reading of the present book. Among the numerous monographs devoted to such a topic one is referred to [10, 30, 34, 59, 63, 84, 125, 146, 169, 176, 177, 183, 184, 213, 223, 224, 270].

The Cantor-Bernstein Theorem was stated by Georg Cantor during the period 1895–1897 and proved by his student Felix Bernstein (aged 18!) in 1896. It was published in the book “Leçons sur la théorie des fonctions” by Emile Borel in 1898 under a proposal of G. Cantor, who was facing harsh criticisms, in particular from Leopold Kronecker. Ernst Schröder produced an (imperfect) proof the same year. David Hilbert gave his strong approval of the advances of G. Cantor in the analysis of the various forms of infinity but some mathematicians are still reluctant to adopt the axiom of choice.

Historical views on the subjects treated in this book can be found in the references [28, 39, 49, 87, 104, 105, 110, 148, 156, 166]. More on set-valued analysis is contained in the books [12, 13, 93, 208, 221]. More on measure theory can be found in the following references: [38, 39, 41, 42, 49, 54, 64, 80, 99, 100, 105, 118, 126, 127, 132, 150, 156, 168, 182, 183, 185, 190, 193, 224, 226, 247].

The notion of metric spaces is due to Maurice Fréchet. It will be expounded in the next chapter.
Chapter 2
Encounters With Limits
Mathematics is not an arid land in the scientific universe. It is simultaneously the queen, maid and daughter of the observational sciences.
La mathématique ne constitue pas une terre aride dans l’univers scientifique. Elle est à la fois reine, servante et fille des sciences de l’observation.
Gustave Choquet (1915–2006)
Abstract The notion of limit is central in analysis. Thus the concept of convergence is presented in a general framework and then in the classes of topological spaces and metric spaces. Compactness, connectedness, completeness are studied in detail. Baire’s Theorem is included as well as Ekeland’s Variational Principle. The contraction theorem is proved and as an application an existence result for ordinary differential equations is presented.
2.1 Convergences

The first approach to convergences appeared after the first quarter of the twentieth century. It is not often used nowadays, even if in some cases it is simpler than a topological approach (this is the case for pointwise convergence and for the second example given below). Since the selected rules are prevalent in practice, it is worth stating them in a formal definition.

Let us first recall that a sequence in a set $X$ is a map $s$ from $\mathbb{N}$ to $X$, hence an element of $X^{\mathbb{N}}$. Setting $x_n := s(n)$ for $n \in \mathbb{N}$ we often write $(x_n)_{n \in \mathbb{N}}$, $(x_n)_{n \ge 0}$ or just $(x_n)$ instead of $s$. A subsequence of $s := (x_n)$ is a sequence $s' := (x'_n)$ of $X$ such that there exists a strictly increasing map $k : \mathbb{N} \to \mathbb{N}$ satisfying $s' = s \circ k$, i.e. $x'_n = x_{k(n)}$. Thus one obtains $s'$ by deleting some terms of $s$ and by reindexing the remaining terms.

Definition 2.1 A space with limits is a set $X$ such that a relation denoted by $\to$ is defined between the sets $X^{\mathbb{N}}$ and $X$ and read as $(x_n)$ converges to $x$ or $x$ is a limit of $(x_n)$, the relation $\to$ being required to satisfy the following properties:
(L1) any constant sequence with value $x$ converges to $x$;
(L2) if $(x_n) \to x$, then any subsequence $(x'_n)$ of $(x_n)$ converges to $x$;
(L3) if $x \in X$ and $(x_n) \in X^{\mathbb{N}}$ are such that any subsequence $(x'_n)$ of $(x_n)$ has a subsequence $(x''_n)$ converging to $x$, then $(x_n) \to x$.
For some needs, it is useful to add a uniqueness condition:
(U) if $(x_n) \to x$ and $(x_n) \to x'$ then $x = x'$.

Clearly, the convergences in $\mathbb{R}$, $\mathbb{R}_\infty$, $\overline{\mathbb{R}} := \mathbb{R} \cup \{-\infty, +\infty\}$, $\mathbb{C}$, $\mathbb{R}^d$ satisfy these four conditions.

Example If $X$ is the set of real-valued functions on a set $S$, or, in other terms, if $X = \mathbb{R}^S$, then pointwise convergence on $X$ is defined by $(x_n) \to x$ if and only if for all $s \in S$ one has $(x_n(s))_n \to x(s)$ as $n \to +\infty$. Here the element $x := (x_s)_{s \in S}$ of $\mathbb{R}^S$ is identified with the function $f : S \to \mathbb{R}$ defined by $f(s) := x_s$ and is also denoted by $(x(s))_{s \in S}$.

Example Uniform convergence on the set $X$ of real-valued functions on $S$ is defined in the following way: $(f_n) \xrightarrow{u} f$ if and only if for all $\varepsilon > 0$ one can find some $n(\varepsilon) \in \mathbb{N}$ such that $|f_n(s) - f(s)| \le \varepsilon$ for all $n \ge n(\varepsilon)$ and all $s \in S$.
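The difference between the two examples can be seen numerically. The sketch below uses the sequence $f_n(s) := s^n$ on $S := [0, 1)$, a standard illustration chosen here and not taken from the text: it converges pointwise to $0$, while $\sup_{s \in S} |f_n(s)| = 1$ for every $n$, so the convergence is not uniform. A finite grid can only give a lower bound for the supremum, which is enough to see that it does not tend to $0$.

```python
def f_n(n, s):
    """n-th term of the sequence f_n(s) = s**n on S = [0, 1)."""
    return s ** n

def sup_distance_on_grid(n, grid):
    """Max over the grid of |f_n(s) - 0|: a lower bound for the sup over S."""
    return max(abs(f_n(n, s)) for s in grid)

grid = [k / 1000 for k in range(1000)]  # sample points of [0, 1)
indices = (10, 100, 1000)

# Pointwise behaviour at the fixed point s = 0.9: the values tend to 0.
pointwise_values = [f_n(n, 0.9) for n in indices]

# Uniform distance to the zero function: the point 1 - 1/n is added to the
# grid because f_n(1 - 1/n) stays bounded away from 0.
sup_values = [sup_distance_on_grid(n, grid + [1 - 1 / n]) for n in indices]
```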
Example Let $X$ be the set of continuous functions on $\mathbb{R}^d$ that vanish outside a bounded subset. Declare that a sequence $(f_n) \to f$ if there exists a bounded subset $B$ of $\mathbb{R}^d$ such that the functions $f_n$ and $f$ are null on $\mathbb{R}^d \setminus B$ and if the sequence of restrictions to $B$ uniformly converges to the restriction of $f$: $(f_n|_B) \xrightarrow{u} f|_B$. Variants of such a convergence are used in the theory of distributions.

Sometimes sequences are inadequate and one must replace them with generalized sequences, also called nets (see Exercise 5). A net in a set $X$ is a map $s$ from a directed set $(I, \le)$ into $X$. Setting $x_i := s(i)$, one also denotes it by $(x_i)_{i \in I}$. A subnet is a net $s' := (x'_j)_{j \in J}$ such that there exists a filtering map $h : J \to I$ such that $x'_j = x_{h(j)}$ for all $j \in J$. Note that, in contrast to what occurs for subsequences, one takes for $J$ a directed set that may differ from $I$. It is often of the form $J := I \times K$, where $K$ is another directed set, or a subset of $I \times K$, $h$ being the first projection or a restriction of it. In some simple cases one can take for $J$ a cofinal subset of $I$ and for $h$ the injection of $J$ into $I$, but that is not always possible. The axioms we select are the analogues of those of the preceding definition.

Definition 2.2 A convergence space is a set $X$ such that for every directed set $I$ there is a relation denoted by $\to$ between the set $X^I$ of nets of $X$ indexed by $I$ and $X$ itself in such a way that the following conditions are satisfied:
(C1) for every $x \in X$ the constant net with value $x$ converges to $x$;
(C2) if $(x_i)_{i \in I} \to x$ and if $(x'_j)_{j \in J}$ is a subnet of $(x_i)_{i \in I}$, then $(x'_j)_{j \in J} \to x$;
(C3) if $x \in X$ and $(x_i)_{i \in I}$ is a net in $X$ such that for every subnet $(x'_j)_{j \in J}$ of $(x_i)_{i \in I}$ there exists a subnet $(x''_k)_{k \in K}$ of $(x'_j)_{j \in J}$ that converges to $x$, then $(x_i)_{i \in I} \to x$.

The preceding examples can be adapted to convergence spaces. A map $f : X \to Y$ between two convergence spaces is said to be continuous at $x \in X$ if for any net $(x_i)_{i \in I}$ converging to $x$ the net $(f(x_i))_{i \in I}$ converges to $f(x)$. It is continuous on some subset $A$ of $X$ if for all $x \in A$ it is continuous at $x$.

If $(W, \to)$ is a convergence space and if $X$ is a subset of $W$, the induced convergence on $X$ is defined by $(x_i)_{i \in I} \to_X x$ if $(x_i)_{i \in I} \to x$ in $W$. It is easy to verify that the conditions (C1)–(C3) are satisfied and that a map $f : V \to X$ from a convergence space $(V, \to_V)$ into $(X, \to_X)$ is continuous at $v \in V$ if and only if $f$ is continuous at $v$ when considered as a map from $V$ into $W$.

If $(X_a)_{a \in A}$ is a family of convergence spaces, the product $X := \prod_{a \in A} X_a$ is made a convergence space by requiring that a net $(x_i)_{i \in I} \to x$ if for all $a \in A$ the $a$-component $(p_a(x_i)) \to p_a(x)$. Then, a map $f : W \to X$ from a convergence space $W$ to $X$ is continuous at $w \in W$ if and only if for all $a \in A$ the $a$-component $f_a := p_a \circ f$ is continuous at $w$.

The lower limit (resp. upper limit) of a net $(r_i)_{i \in I}$ of real numbers is defined by
$$\liminf_{i \in I} r_i := \sup_{h \in I}\ \inf_{i \in I,\ i \ge h} r_i \qquad (\text{resp. } \limsup_{i \in I} r_i := \inf_{h \in I}\ \sup_{i \in I,\ i \ge h} r_i).$$
These substitutes for the limit always exist in $\overline{\mathbb{R}} := \mathbb{R} \cup \{-\infty, +\infty\}$.
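For a net indexed by $I := \mathbb{N}$, i.e. a sequence, the two formulas can be approximated on a finite horizon. The sketch below is only a numerical illustration: the tails $\{i \ge h\}$ are truncated at `horizon`, $h$ ranges over the first half of the indices so that every tail keeps many terms, and the sequence $r_i := (-1)^i (1 + 1/(i+1))$ is an example chosen here, with lower limit $-1$ and upper limit $1$.

```python
def tail_inf_sup(r, h, horizon):
    """inf and sup of r(i) over h <= i < horizon (finite stand-in for i >= h)."""
    tail = [r(i) for i in range(h, horizon)]
    return min(tail), max(tail)

def approx_liminf_limsup(r, horizon):
    """Approximate sup_h inf_{i>=h} r_i and inf_h sup_{i>=h} r_i with h < horizon // 2."""
    tails = [tail_inf_sup(r, h, horizon) for h in range(horizon // 2)]
    liminf = max(lo for lo, _ in tails)
    limsup = min(hi for _, hi in tails)
    return liminf, limsup

def r(i):
    """Oscillating sequence with lower limit -1 and upper limit 1."""
    return (-1) ** i * (1 + 1 / (i + 1))

liminf, limsup = approx_liminf_limsup(r, 2000)
```

The sequence has no limit, yet both quantities exist, in agreement with the remark that these substitutes for the limit always exist in the extended real line.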
Whereas it is useful to evoke the use of limits, the two concepts of limit spaces and of convergence spaces are not of great use in the present state of analysis. So, in the next sections we turn to other means of dealing with limits.

Exercises

1. Given a net $(r_i)_{i \in I}$ of real numbers, show that there exist subnets $(s_j)_{j \in J}$ and $(t_k)_{k \in K}$ such that $\limsup_{i \in I} r_i = \lim_{j \in J} s_j$ and $\liminf_{i \in I} r_i = \lim_{k \in K} t_k$.

2. Given a net $(r_i)_{i \in I}$ of real numbers, show that for any subnet $(q_h)_{h \in H}$ of $(r_i)_{i \in I}$ that converges one has $\liminf_{i \in I} r_i \le \lim_{h \in H} q_h \le \limsup_{i \in I} r_i$.

3. Define pointwise convergence of nets for functions from a set $S$ into a convergence space $(X, \to)$ and verify conditions (C1)–(C3). Do the same for uniform convergence of functions with values in $\mathbb{R}^d$.

4. Let $X$ be the set of real-valued continuous functions on $\mathbb{R}^d$ that vanish outside a bounded subset. Declare that a net $(f_i) \to f$ if there exist a bounded subset $B$ of $\mathbb{R}^d$ and some $i_0 \in I$ such that for all $i \ge i_0$ in $I$ the functions $f_i$ and $f$ are null on $\mathbb{R}^d \setminus B$ and if the net of restrictions to $B$ uniformly converges to the restriction of $f$: $(f_i|_B) \xrightarrow{u} f|_B$. Verify conditions (C1)–(C3).

5. (Sequences do not suffice). Let $S$ be an uncountable set and let $X$ be the set of real-valued functions on $S$ equipped with pointwise convergence. Let $Y$ be the subset of $X$ formed by those $f \in X$ that are null off a finite subset. Show that $Y$ is dense in $X$ in the sense that every $f \in X$ is the limit of a net $(f_i)_{i \in I}$ of $Y$. Verify that if $f \in X$ is the limit of a sequence $(f_n)_{n \in \mathbb{N}}$ of $Y$, then the set $S_f := \{s \in S : f(s) \ne 0\}$ is countable. Deduce from this the fact that the sequential closure of $Y$, i.e. the set of limits of sequences in $Y$, is different from its closure $X$.

6. Using the notion of topology displayed in the next section, define a convergence $\to$ on a topological space $(X, \mathcal{O})$ by setting $(x_i)_{i \in I} \to x$ if for any $O \in \mathcal{O}$ containing $x$ there exists some $h \in I$ such that $x_i \in O$ for all $i \in I$ satisfying $i \ge h$.

7. Conversely, associate to any convergence space $(X, \to)$ a topology $\mathcal{O}$ by taking for $\mathcal{O}$ the set of subsets $O$ of $X$ such that for any $x \in O$ and any net $(x_i)_{i \in I} \to x$ one has $x_i \in O$ whenever $i \ge h$ for some $h \in I$. Verify the assumptions (O1), (O2) of the next definition. Prove that the convergence $\to$ is stronger than (i.e. implies) the convergence $\to_{\mathcal{O}}$ associated with the topology $\mathcal{O}$. Show that for any map $f : X \to Y$ with values in a topological space $(Y, \mathcal{O}_Y)$ the map $f$ is continuous from $(X, \mathcal{O})$ into $(Y, \mathcal{O}_Y)$ if and only if $f$ is continuous for the convergence $\to$ on $X$ and the convergence associated with the topology of $Y$.

8*. Find an additional condition on a convergence in order that it is the convergence associated with a topology. [See [174].]
2.2 Topologies

The success of topology is due to two features: first, convergences are defined through an intuitive notion of neighborhoods for each point; second, the formalism and the rules of set theory can be used efficiently.

2.2.1 General Facts About Topologies

A topology on a set $X$ is obtained by selecting a family of subsets called the family of closed subsets having some stability property. Equivalently, one usually introduces topologies by considering the family of complements of closed sets. These sets are called open sets.

Definition 2.3 A topology on a set $X$ is the data comprising a family $\mathcal{O}$ of so-called open subsets that satisfies the following two requirements:
(O1) the union of any subfamily of $\mathcal{O}$ belongs to $\mathcal{O}$;
(O2) the intersection of any finite subfamily of $\mathcal{O}$ belongs to $\mathcal{O}$.

By convention, we admit that these two conditions include the requirements that $X$ and the empty set $\varnothing$ belong to $\mathcal{O}$. A topological space $(X, \mathcal{O})$ is also denoted by $X$ if the choice of the topology $\mathcal{O}$ is unambiguous. A subset $F$ of $X$ is declared to be closed if $X \setminus F$ belongs to $\mathcal{O}$.

Exercise Give conditions characterizing the family of closed (resp. open) subsets of a topological space in terms of nets for the convergence associated to $\mathcal{O}$ defined in Exercise 6 of the preceding section and recalled a few lines below.

A subset $V$ of a topological space $(X, \mathcal{O})$ is a neighborhood of some $x \in X$ if there exists some $U \in \mathcal{O}$ such that $x \in U \subset V$. For $x \in X$ we denote by $\mathcal{N}(x)$ the family of neighborhoods of $x$. The topology $\mathcal{O}$ is determined by $(\mathcal{N}(x))_{x \in X}$, as shown by the exercises: a subset $O$ of $X$ is open if and only if for all $x \in O$ one has $O \in \mathcal{N}(x)$. To any topology on $X$ one can associate a convergence $\to$ by setting:
$$\big((x_i)_{i \in I} \to x\big) \iff \big(\forall V \in \mathcal{N}(x)\ \exists i_V \in I : i \in I,\ i \ge i_V \Rightarrow x_i \in V\big).$$
When a limit is unique one also writes $x = \lim_{i \in I} x_i$.

Exercise Verify that the conditions (C1), (C2), (C3) are satisfied. Moreover, a subset $C$ of $X$ is closed if and only if for any net $(x_i)_{i \in I}$ of $C$ and any $x \in X$ satisfying $(x_i)_{i \in I} \to x$ one has $x \in C$.

Definition 2.4 A map $f : (X, \mathcal{O}) \to (X', \mathcal{O}')$ between two topological spaces is said to be continuous at $x \in X$ if for any $V' \in \mathcal{N}(f(x))$ there exists some $V \in \mathcal{N}(x)$ such that $f(V) \subset V'$. The map $f$ is said to be continuous on some subset $A$ of $X$ if it is continuous at all $x \in A$.
The definition of continuity of $f$ at a point $\bar{x}$ is natural: in order that $f(x)$ be close enough to $f(\bar{x})$ it suffices to take $x$ close enough to $\bar{x}$. However, one has to remember that in this condition, the neighborhood $V'$ of $f(\bar{x})$ should be prescribed first. Continuity of $f$ at $\bar{x}$ can be expressed by requiring that for any $V' \in \mathcal{N}(f(\bar{x}))$ one has $f^{-1}(V') \in \mathcal{N}(\bar{x})$.

Exercise Show that $f : X \to X'$ is continuous (on $X$) if and only if for all $O' \in \mathcal{O}'$ its inverse image $f^{-1}(O') := \{x \in X : f(x) \in O'\}$ belongs to $\mathcal{O}$.

The composition of two continuous maps is clearly continuous. A bijection that is continuous and whose inverse is continuous is called a homeomorphism. Topology is the study of properties that are preserved under homeomorphisms.

A topology $\mathcal{O}'$ on $X$ is said to be weaker than a topology $\mathcal{O}$ if the identity map $I_X : (X, \mathcal{O}) \to (X, \mathcal{O}')$ is continuous, i.e. if any member of $\mathcal{O}'$ is in $\mathcal{O}$, i.e. if $\mathcal{O}' \subset \mathcal{O}$. One also says that $\mathcal{O}$ is finer or stronger than $\mathcal{O}'$.

Example On a set $X$ there is a topology that is weaker than any other one, the rough topology: its family of open sets is $\mathcal{O}_R := \{\varnothing, X\}$. There is a topology that is finer than any other one, the discrete topology, for which any subset is open: $\mathcal{O}_D := \mathcal{P}(X)$.

Given a family $\mathcal{G}$ of subsets of a set $X$, there is a topology $\mathcal{O}$ on $X$ that is the weakest among those containing $\mathcal{G}$. It is obtained as the intersection of the family of topologies $\mathcal{O}_i$ satisfying $\mathcal{G} \subset \mathcal{O}_i$. Then one says that $\mathcal{G}$ generates $\mathcal{O}$. If $\mathcal{B} \subset \mathcal{O}$ is such that any element of $\mathcal{O}$ is a union of elements of $\mathcal{B}$, one says that $\mathcal{B}$ is a base of $\mathcal{O}$. It is easy to verify that when $\mathcal{G}$ generates $\mathcal{O}$, the family $\mathcal{B}$ of finite intersections of elements of $\mathcal{G}$ is a base of $\mathcal{O}$.

A family $\mathcal{U}$ of subsets of $X$ is a base of neighborhoods of $x$ if $\mathcal{U}$ is contained in the family $\mathcal{N}(x)$ of neighborhoods of $x$ and if for any $V \in \mathcal{N}(x)$ there exists some $U \in \mathcal{U}$ such that $U \subset V$. Given $\mathcal{B} \subset \mathcal{O}$, we see that $\mathcal{B}$ is a base of $\mathcal{O}$ if and only if for all $x \in X$, $\mathcal{B}(x) := \{U \in \mathcal{B} : x \in U\}$ is a base of neighborhoods of $x$. The notion of continuity can be localized by using neighborhood bases. A map $f : (X, \mathcal{O}) \to (X', \mathcal{O}')$ is continuous at $x \in X$ if for any neighborhood $W$ in some neighborhood base of $f(x)$ in $(X', \mathcal{O}')$ there exists some $V \in \mathcal{N}(x)$ such that $f(V) \subset W$.

Given a set $X$, a family $(X_a, \mathcal{O}_a)_{a \in A}$ of topological spaces, and a family $(g_a)_{a \in A}$ of maps $g_a : X \to X_a$, among all the topologies on $X$ for which all $g_a$ are continuous, there is one that is weaker than any other. It is the topology $\mathcal{O}_X$ generated by the sets $g_a^{-1}(G_a)$ for $a \in A$ and $G_a \in \mathcal{O}_a$. It is easy to verify that a map $f : W \to X$ from a topological space $(W, \mathcal{O}_W)$ into $X$ is continuous with respect to $\mathcal{O}_X$ if and only if $g_a \circ f : W \to X_a$ is continuous for all $a \in A$. When $X$ is the product $\prod_{a \in A} X_a$ and $g_a$ is the canonical projection, the topology $\mathcal{O}_X$ is called the product topology. When $X$ is a subset of a topological space $(Y, \mathcal{O}_Y)$ and one considers for the family $(g_a)_{a \in A}$ the sole canonical injection $j : X \to Y$ one says that $\mathcal{O}_X$ is the induced topology. Then $O \subset X$ belongs to $\mathcal{O}_X$ if and only if there exists some $G \in \mathcal{O}_Y$ such that $O = G \cap X$. It is easy to verify that the convergence associated with $\mathcal{O}_X$ is the induced convergence.
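On a finite set the construction of the topology generated by a family $\mathcal{G}$ can be carried out explicitly: one forms the base of finite intersections of members of $\mathcal{G}$ (with $X$ as the empty intersection) and then all unions of members of the base. The sketch below follows this recipe; the names `is_topology` and `generated_topology` and the sample data are illustrative assumptions, and the closure test under pairwise unions and intersections suffices only because the set is finite.

```python
from itertools import combinations

def is_topology(X, opens):
    """Check (O1) and (O2) on a finite set, with the conventions on X and the empty set."""
    opens = set(opens)
    if frozenset() not in opens or frozenset(X) not in opens:
        return False
    return all((U | V) in opens and (U & V) in opens
               for U, V in combinations(opens, 2))

def generated_topology(X, G):
    """Weakest topology containing G: unions of finite intersections of members of G."""
    X = frozenset(X)
    base = {X}  # X is the intersection of the empty subfamily
    for k in range(1, len(G) + 1):
        for sub in combinations(G, k):
            base.add(frozenset.intersection(*map(frozenset, sub)))
    opens = {frozenset()}  # the empty union
    for B in base:
        opens |= {O | B for O in opens}  # add every union involving B
    return opens

X = {1, 2, 3}
G = [{1, 2}, {2, 3}]
T = generated_topology(X, G)
```

Here $\mathcal{G}$ itself is not a topology, since the intersection $\{2\}$ of its two members is missing; the generated topology adds exactly that set, together with $\varnothing$ and $X$.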
Besides the notion of limit associated with a topology $\mathcal{O}$ on $X$, a weaker notion is available. One says that a net $(x_i)_{i \in I}$ of $X$ has a cluster point $x \in X$ if for any $V \in \mathcal{N}(x)$ and any $i \in I$ one can find some $j \in I$ such that $j \ge i$ and $x_j \in V$. One can show that $x \in X$ is a cluster point of $(x_i)_{i \in I}$ if and only if there exists a subnet of $(x_i)_{i \in I}$ that converges to $x$. The if condition is immediate. For the necessary condition one can take $J := \{(i, V) \in I \times \mathcal{N}(x) : x_i \in V\}$, a cofinal subset of $I \times \mathcal{N}(x)$ for the product order, and define $h : J \to I$ by $h(i, V) := i$.

A topology $\mathcal{O}$ on a set $X$ is said to be Hausdorff if for every pair $(x, x')$ of distinct points of $X$ one can find neighborhoods $V \in \mathcal{N}(x)$, $V' \in \mathcal{N}(x')$ that are disjoint. This property is equivalent to uniqueness of limits of nets.

Proposition 2.1 A topology $\mathcal{O}$ on a set $X$ is Hausdorff if and only if all nets in $X$ have at most one limit. In fact, if $\mathcal{O}$ is Hausdorff and if a net $(x_i)_{i \in I}$ of $X$ has a limit $x$, it cannot have a different cluster point.

Proof Suppose $\mathcal{O}$ is Hausdorff and a net $(x_i)_{i \in I}$ of $X$ has a limit $x$ and a cluster point $y \ne x$. Let $V \in \mathcal{N}(x)$, $W \in \mathcal{N}(y)$ be such that $V \cap W = \varnothing$. By definition, there exists an $i_V \in I$ such that $x_i \in V$ for all $i \ge i_V$. Thus one cannot find some $j \ge i_V$ such that $x_j \in W$, contradicting the assumption that $y$ is a cluster point of $(x_i)_{i \in I}$. Now suppose $\mathcal{O}$ is not Hausdorff: there exists a pair $(x, y)$ of distinct points of $X$ such that for any $V \in \mathcal{N}(x)$, $W \in \mathcal{N}(y)$ one has $V \cap W \ne \varnothing$. Denoting $\mathcal{N}(x) \times \mathcal{N}(y)$ by $I$ and giving to $I$ the order opposite to inclusion, for $i := (V, W) \in I$ one can pick $x_i \in V \cap W$. Then the net $(x_i)_{i \in I}$ converges to $x$ and to $y$. □

Corollary 2.1 In a Hausdorff topological space $(X, \mathcal{O})$ finite subsets, in particular singletons, are closed.

Proof It suffices to show that a singleton $S := \{x\}$ is closed. If $y \in X \setminus S$ one can find $V \in \mathcal{N}(x)$ and $W \in \mathcal{N}(y)$ that are disjoint. Then $W$ is contained in $X \setminus S$. This proves that $X \setminus S$ is open and $S$ is closed. □

A topology $\mathcal{O}$ on $X$ is uniquely determined by its associated convergence for nets: a subset $C$ of $X$ is closed if and only if it contains the limits of its convergent nets. In general $\mathcal{O}$ is not determined by the convergence of sequences. The following proposition shows that nets may be convenient. It also shows that continuity in topological spaces coincides with continuity for the induced convergence spaces.

Proposition 2.2 A map $f : X \to Y$ between two topological spaces is continuous at $x \in X$ if and only if for any net $(x_i)_{i \in I}$ of $X$ converging to $x$, the net $(f(x_i))_{i \in I}$ converges to $f(x)$. When $\mathcal{N}(x)$ has a countable base sequences can be used instead of nets.

Proof Necessity is immediate. Let us show sufficiency. Suppose $f$ is not continuous at $x$. Then, there exists a $V \in \mathcal{N}(f(x))$ such that for all $U \in \mathcal{N}(x)$ there exists some $x_U \in U$ with $f(x_U) \notin V$. Then, the net $(x_U)_{U \in \mathcal{N}(x)} \to x$ but $(f(x_U))_{U \in \mathcal{N}(x)}$ does not converge to $f(x)$. □

The closure $\operatorname{cl}(S)$ of a subset $S$ of a topological space $(X, \mathcal{O})$ is the intersection of the family of all closed subsets of $X$ containing $S$. It is clearly the smallest closed
subset of $(X, \mathcal{O})$ containing $S$. The interior $\operatorname{int}(S)$ of $S$ is the union of all the open subsets of $(X, \mathcal{O})$ contained in $S$. It is the largest open subset of $X$ contained in $S$. Thus $\operatorname{int}(S) = X \setminus \operatorname{cl}(X \setminus S)$. The boundary or frontier of $S$ is $\operatorname{bdry}(S) := \operatorname{cl}(S) \setminus \operatorname{int}(S)$.

Exercise Prove that the closure $\operatorname{cl}(S)$ of a subset $S$ of a topological space $(X, \mathcal{O})$ is the set of limits of nets of $S$ that converge in $X$.

Proposition 2.3 The closure $\operatorname{cl}(S)$ of a subset $S$ of a topological space $(X, \mathcal{O})$ is the set of points $x \in X$ such that for any neighborhood $V$ of $x$ one has $S \cap V \ne \varnothing$.

Proof If $x \in X \setminus \operatorname{cl}(S)$ there exists some closed set $C$ containing $S$ such that $x \in X \setminus C$. Then $V := X \setminus C$ is an open neighborhood of $x$ and $S \cap V = \varnothing$. Conversely, if for some $V \in \mathcal{N}(x)$ one has $S \cap V = \varnothing$, taking some open subset $U$ satisfying $x \in U \subset V$ we see that $C := X \setminus U$ is a closed subset containing $S$ (since $U \subset V$ and $S \cap V = \varnothing$) and $x \notin C$, hence $x \notin \operatorname{cl}(S)$. □

Corollary 2.2 A point $x$ of a topological space $(X, \mathcal{O})$ is a cluster point of a net $(x_i)_{i \in I}$ of $X$ if and only if $x \in \bigcap_{i \in I} C_i$, where $C_i := \operatorname{cl}(\{x_j : j \in I,\ j \ge i\})$.

Proof This follows from the fact that $x$ is a cluster point of $(x_i)_{i \in I}$ if and only if for all $i \in I$ and all $V \in \mathcal{N}(x)$ one has $V \cap \{x_j : j \in I,\ j \ge i\} \ne \varnothing$. □

A subset $D$ of $(X, \mathcal{O})$ is said to be dense in a subset $E$ of $X$ if $D \subset E$ and if $E$ is contained in the closure of $D$. A topological space is said to be separable if it contains a countable dense subset.

Example Given a directed set $(I, \le)$, let $I_\infty := I \cup \{\infty\}$, where $\infty$ is an additional element satisfying $i \le \infty$ for all $i \in I$. Then one can endow $I_\infty$ with the topology $\mathcal{O}$ defined by $G \in \mathcal{O}$ if either $G$ is contained in $I$, or there exists some $h \in I$ such that $i \in G$ for all $i \in I_\infty$ such that $i \ge h$. Thus $I$ is dense in $I_\infty$. Given a topological space $(X, \mathcal{O})$, $x \in X$ and a net $(x_i)_{i \in I}$ of $X$, one easily checks that $(x_i)_{i \in I} \to x$ if and only if the map $f : I_\infty \to X$ given by $f(i) := x_i$, $f(\infty) := x$ is continuous at $\infty$.
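The characterization of Proposition 2.3 lends itself to a direct computation on a finite topological space. In the sketch below the closure is computed by testing open sets containing a point (which suffices, since every neighborhood contains one), the interior is the union of the open sets contained in $S$, and the identity $\operatorname{int}(S) = X \setminus \operatorname{cl}(X \setminus S)$ can then be checked. The three-point space and the function names are assumptions made for the illustration.

```python
def closure(X, opens, S):
    """Points x such that every open set containing x meets S (Proposition 2.3)."""
    S = frozenset(S)
    return frozenset(x for x in X
                     if all(U & S for U in opens if x in U))

def interior(opens, S):
    """Union of all the open sets contained in S."""
    S = frozenset(S)
    return frozenset().union(*(U for U in opens if U <= S))

# A topology on a three-point set.
X = frozenset({1, 2, 3})
opens = [frozenset(), frozenset({2}), frozenset({1, 2}), frozenset({2, 3}), X]

S = {1}
cl_S = closure(X, opens, S)
int_S = interior(opens, S)
boundary_S = cl_S - int_S
```

In this space the singleton $\{2\}$ is dense, because every nonempty open set contains $2$, whereas $\{1\}$ is closed with empty interior.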
Definition 2.5 Given two topological spaces $(W, \mathcal{O})$, $(X', \mathcal{O}')$, a subset $X$ of $W$ and $w \in \operatorname{cl}(X)$, one says that $f : X \to X'$ has a limit $x'$ as $x \to_X w$ (i.e. $x \to w$ with $x \in X$), or that $f$ converges to $x'$ as $x \to_X w$ and one writes $x' = \lim_{x \to_X w} f(x)$, if for any $V' \in \mathcal{N}(x')$ there exists some $V \in \mathcal{N}(w)$ such that $f(V \cap X) \subset V'$.

Thus $f$ converges to $x'$ as $x \to_X w$ iff for any net $(x_i)_{i \in I}$ of $X$ satisfying $(x_i)_{i \in I} \to w$ in $W$ one has $(f(x_i))_{i \in I} \to x'$ in $X'$. If $X = W$, one just writes $x' = \lim_{x \to w} f(x)$. Thus, $f$ is continuous at $w$ if and only if $f$ has the limit $f(w)$ as $x \to w$. We invite the reader to verify that the notion $(x_n) \to 0_+$ in $\mathbb{R}$ (i.e. $(x_n) \to 0$ and $x_n > 0$ for all $n \in \mathbb{N}$) corresponds to the case when $W := \mathbb{R}$, $X := \mathbb{P}$, the set of positive real numbers.

The preceding definition is a special case of a more general concept. Given another map $g : X \to Y$ with values in another topological space $(Y, \mathcal{G})$ and some $y \in Y$, one says that $f$ has a limit $x'$ as $g(x) \to y$, or that $f$ converges to $x'$ as $g(x) \to y$ and one writes $x' = \lim_{g(x) \to y} f(x)$, if for any $V' \in \mathcal{N}(x')$ there exists a $W \in \mathcal{N}(y)$
such that $f(x) \in V'$ for all $x \in g^{-1}(W)$. Taking for $g$ the canonical injection of $X$ into $(Y, \mathcal{G}) := (W, \mathcal{O})$, one recovers the preceding notion of limit.

The next result is often used for uniqueness purposes.

Proposition 2.4 Let $(W, \mathcal{O})$, $(X', \mathcal{O}')$ be two topological spaces, let $X$ be a dense subset of $W$ and let $f, g : W \to X'$ be two continuous maps. If the restrictions of $f$ and $g$ to $X$ coincide, then $f$ and $g$ coincide.

Proof The set $Z := \{w \in W : f(w) = g(w)\}$ is a closed subset of $W$ containing $X$ (the closedness of $Z$ uses that $X'$ is Hausdorff). Since $X$ is dense in $W$, we have $W = \operatorname{cl}(X) \subset Z$, hence $Z = W$. □
Exercises

1. Let $f : \mathbb{R} \to \mathbb{R}$ be a map such that $f(r + s) = f(r) + f(s)$ for all $r, s \in \mathbb{R}$. Show that $f(q) = q f(1)$ for all $q \in \mathbb{Q}$. Prove that $f$ is linear over $\mathbb{R}$ when moreover $f$ is continuous or monotone.
2. Write the alphabet with capital letters and decide which letters are mutually homeomorphic.
3. Let $(X, \mathcal{O})$ be a Hausdorff topological space and let $f : X \to X$ be a continuous map. Show that the set $F := \{x \in X : f(x) = x\}$ is closed in $X$.
4. Let $(X, \mathcal{O}_X)$, $(Y, \mathcal{O}_Y)$ be topological spaces and let $f : X \to Y$ be a continuous map. Show that the set $G := \{(x, y) \in X \times Y : y = f(x)\}$ is homeomorphic to $X$.
5. Show that a topological space $(X, \mathcal{O})$ is Hausdorff if and only if the diagonal $\Delta_X := \{(x, x') \in X^2 : x = x'\}$ is closed in $X^2$.
6. Let $X$ and $Y$ be topological spaces, let $f : X \to Y$ and let $(X_i)_{i \in I}$ be a covering of $X$, i.e. a family of subsets of $X$ whose union is $X$. Suppose the restriction $f_i$ of $f$ to $X_i$ is continuous for each $i \in I$. Show that if every $X_i$ is open, then $f$ is continuous.
7. With the notation of the preceding exercise, suppose that $I$ is finite and that every $X_i$ is closed. Show that $f$ is continuous if every $f_i$ is continuous. Give an example showing that the assumption that $I$ is finite cannot be dropped. Give an example showing that the assumption that every $X_i$ is closed cannot be dropped.
8. Let $X$ and $Y$ be topological spaces, let $s \in Y$, and let $f : X \times Y \to \mathbb{R}$ be separately continuous (i.e. $f$ is continuous in each of its two variables). For a subset $T$ of $Y$ let $V_s(T) := \{x \in X : f(x, s) \le f(x, t)\ \forall t \in T\}$, with $V_s(\emptyset) := X$, be the Voronoi cell associated with $s$ and $T$ as in Exercise 17 of Sect. 1.1. Show that $V_s(T)$ is closed and that $V_s(\operatorname{cl}(T)) = V_s(T)$ for each $T \in \mathcal{P}(Y)$.
2.2.2 Connectedness

It is sometimes useful to know that a space is not made of several pieces: this can be used to globalize some properties or for uniqueness results. For example, if $X$ is an open subset of $\mathbb{R}$ and if $f : X \to \mathbb{R}$ is a differentiable function whose derivative is $0$, we cannot conclude that $f$ is constant because $X$ can be the union of disjoint open intervals. We must give a precise definition.

Definition 2.6 A topological space $(X, \mathcal{O})$ is said to be connected if $\emptyset$ and $X$ are the only subsets of $X$ that are both open and closed. The space $(X, \mathcal{O})$ is arcwise connected if any two points $x_0$, $x_1$ of $X$ can be joined by a continuous arc, i.e. if for any $x_0$, $x_1$ there is a continuous map $c : [0, 1] \to X$ such that $c(0) = x_0$, $c(1) = x_1$.

Connectedness is a more important notion than arcwise connectedness, but the latter is more intuitive and the proofs of several propositions below for arcwise connectedness are very easy. The reader is invited to verify this assertion.

The definition of connectedness can be rephrased by saying that $X$ is connected if any partition of $X$ into two open subsets is improper, i.e. if one of the subsets is empty and the other one is the whole space, a partition of $X$ being a covering by disjoint subsets. The reader must be aware that there are topological spaces that are not connected; in such a space, showing that a subset is not closed does not prove that it is open (a frequent mistake). For instance, if $X$ is the union of two open disjoint intervals $A$, $B$ of $\mathbb{R}$, then $A$ and $B$ are also closed in $X$ since their complements are open.

Example A bounded, closed interval $X := [a, b]$ of $\mathbb{R}$ is connected (with respect to the induced topology). To prove this, let us consider a nonempty subset $C$ of $X$ that is both open and closed and let us show that $C = X$. We may suppose $a \in C$ (otherwise we consider $C' := X \setminus C$). Let $s := \sup T$ with $T := \{t \in X : [a, t] \subset C\}$. Since $C$ is open in $X$ we have $s > a$.
Moreover, $s \in T$ since there exists a sequence $(t_n)$ of $T \subset C$ converging to $s$, so that $[a, s[\, = \bigcup_n [a, t_n[\, \subset C$, and since $C$ is closed we have $[a, s] \subset C$. Let us show that assuming $s < b$ leads to a contradiction. Since $C$ is open in $X$ and $s \in C$, we can find $\varepsilon > 0$ such that $s + \varepsilon \le b$ and $[s, s + \varepsilon] \subset C$. Then since $s \in T$ we get $[a, s + \varepsilon] = [a, s] \cup [s, s + \varepsilon] \subset C$; this means that $s + \varepsilon \in T$, contradicting the definition of $s$. It follows from Proposition 2.7 below that any interval of $\mathbb{R}$ is connected. □

The use of connectedness for existence results is illustrated by the following properties. The second one is often called the Intermediate Value Theorem.

Proposition 2.5 (Customs Lemma) Let $C$ be a connected subset of a topological space $(X, \mathcal{O})$ and let $S$ be a subset of $X$. If $C \cap \operatorname{int}(S)$ and $C \cap (X \setminus \operatorname{cl}(S))$ are nonempty, then $C$ contains some point of the boundary of $S$ (Fig. 2.2).
Fig. 2.2 The customs lemma
Fig. 2.3 The Daisy property
Proof If, on the contrary, $C$ does not meet the boundary of $S$, then $C$ is the union of the two disjoint nonempty sets $C \cap \operatorname{int}(S)$ and $C \cap (X \setminus \operatorname{cl}(S))$, both of which are open in $C$, contradicting the connectedness of $C$. □

Proposition 2.6 (Bolzano) Let $(X, \mathcal{O})$ be a connected topological space and let $f : X \to \mathbb{R}$ be a continuous map. Let $a < b < c$ in $\mathbb{R}$ be such that $a \in f(X)$, $c \in f(X)$. Then there exists some $x \in X$ such that $f(x) = b$.

Proof If $f^{-1}(b) = \emptyset$, the sets $f^{-1}(]-\infty, b[)$ and $f^{-1}(]b, +\infty[)$ form a partition of $X$ into two nonempty open subsets, an impossibility if $X$ is connected. □

The following property can be used as a convenient criterion (Fig. 2.3).
Fig. 2.4 The Comb or Rake property
Proposition 2.7 (The Daisy Property) Let $(X, \mathcal{O})$ be a topological space that is the union of a family $(X_i)_{i \in I}$ of connected subspaces. If $\bigcap_{i \in I} X_i$ is nonempty, then $X$ is connected.

Proof Let $x \in \bigcap_{i \in I} X_i$. Let $C$ be a nonempty subset of $X$ that is closed and open. Changing $C$ into $X \setminus C$ if necessary, we may suppose $x \in C$. Then $C \cap X_i$ is open and closed in $X_i$ and contains $x$. Since $X_i$ is connected, we have $C \cap X_i = X_i$. Thus $C \supset X_i$ for all $i \in I$, hence $C = X$. □

A slight refinement can be given (Fig. 2.4).

Corollary 2.3 (The Comb or Rake Property) Let $(X, \mathcal{O})$ be a topological space that is the union of a family $(X_i)_{i \in I}$ of connected subspaces. If there is some nonempty connected subspace $Y$ of $X$ such that $X_i \cap Y \neq \emptyset$ for all $i \in I$, then $X$ is connected.

Proof Set $Y_i := X_i \cup Y$. Then, by the preceding proposition, $Y_i$ is connected. Since $X = \bigcup_{i \in I} Y_i$ and $\bigcap_{i \in I} Y_i$ contains $Y \neq \emptyset$, $X$ is connected. □

Proposition 2.8 A product of connected spaces is connected.

Proof Let us first consider the case of two factors: let $Z := X \times Y$, the spaces $X$, $Y$ being connected and nonempty. Let $x_0 \in X$. Then $Y_0 := \{x_0\} \times Y$ is homeomorphic to $Y$, hence is connected, and for all $y \in Y$ the subspace $X_y := X \times \{y\}$ is connected and meets $Y_0$ as $(x_0, y) \in X_y \cap Y_0$. Since $Z = \bigcup_{y \in Y} X_y$, $Z$ is connected by the rake property (Corollary 2.3).

Now let us consider the general case of an arbitrary product $X := \prod_{i \in I} X_i$ of connected spaces. Given two nonempty open subsets $A$, $B$ of $X$ satisfying $A \cup B = X$, the construction of the product topology ensures that there exist a finite subset $J$ of $I$ and open subsets $A_J$, $B_J$ of $X_J := \prod_{j \in J} X_j$ such that $A = A_J \times X_{I \setminus J}$ and $B = B_J \times X_{I \setminus J}$. Then $A_J \cup B_J = X_J$. Since $X_J$ is connected in view of the first part of the proof and of an induction, we have $A_J \cap B_J \neq \emptyset$. Thus $A \cap B \neq \emptyset$ and $X$ is connected. □

The preceding proof used the obvious fact that when two topological spaces are homeomorphic, both are connected when one of them is connected. This fact is also a consequence of a more general property that is obviously valid for arcwise connectedness.
Proposition 2.9 Let $(X, \mathcal{O}_X)$, $(Y, \mathcal{O}_Y)$ be two topological spaces and let $f : X \to Y$ be continuous. If $X$ is connected, then $f(X)$ is connected with respect to the induced topology.

Proof There is no loss of generality in assuming $f(X) = Y$. Then, if $G \in \mathcal{O}_Y$ is nonempty and closed, $f^{-1}(G)$ is nonempty, open and closed. Since $X$ is connected we have $f^{-1}(G) = X$, hence $G = f(f^{-1}(G)) = f(X) = Y$. □

Corollary 2.4 An arcwise connected space is connected.

Proof Let $(X, \mathcal{O})$ be an arcwise connected space and let $x_0 \in X$. By definition, for all $x \in X$ there exists a continuous map $f_x : [0, 1] \to X$ such that $f_x(0) = x_0$ and $f_x(1) = x$. Since $C_x := f_x([0, 1])$ is connected and since $X = \bigcup_{x \in X} C_x$ with $x_0 \in C_x$ for all $x$, $X$ is connected by Proposition 2.7. □

Given a topological space $(X, \mathcal{O})$ and $x \in X$, Proposition 2.7 implies that the union $C(x)$ of all connected subsets of $X$ containing $x$ is connected. It is clearly the largest connected subset containing $x$. Moreover, $X$ can be split into a partition of connected subsets called the connected components of $X$ by taking those $C(x)$ that are disjoint (note that if $C(x) \cap C(x') \neq \emptyset$ then $C(x) = C(x')$). It follows from Exercise 1 below that the connected components of $X$ are closed subsets.

A topological space $(X, \mathcal{O})$ is said to be locally connected if every point of $X$ has a base of neighborhoods formed by connected sets. Clearly $\mathbb{R}$ is locally connected, but $\mathbb{Q}$ and $\mathbb{R} \setminus \mathbb{Q}$ are not locally connected. It is easy to show that a space is locally connected if and only if the connected components of any open subset are open.
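On a finite topological space, Definition 2.6 can be checked mechanically by looking for a proper nonempty subset that is both open and closed. The following Python sketch is only an illustration: the function name `is_connected` and the two sample spaces are choices made here, and the code assumes (without verifying it) that the list of open sets it receives really is a topology on the given points.

```python
def is_connected(points, open_sets):
    """Definition 2.6 on a finite space: X is connected iff the only
    subsets that are both open and closed are the empty set and X."""
    X = frozenset(points)
    opens = {frozenset(s) for s in open_sets}  # assumed to form a topology on X
    for G in opens:
        # G is closed exactly when its complement X - G is open.
        if G and G != X and (X - G) in opens:
            return False  # G is a proper nonempty open and closed subset
    return True


# Sierpinski space on {0, 1}: the open sets are {}, {1} and {0, 1}.
sierpinski = [set(), {1}, {0, 1}]
# Discrete two-point space: every subset is open.
discrete = [set(), {0}, {1}, {0, 1}]

print(is_connected({0, 1}, sierpinski))  # True
print(is_connected({0, 1}, discrete))    # False
```

In the discrete two-point space the singleton $\{0\}$ is open with an open complement, which is the finite analogue of the union of two disjoint open intervals discussed after Definition 2.6.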
Exercises

1. Let $A$ and $B$ be two subsets of a topological space $(X, \mathcal{O})$ such that $A \subset B \subset \operatorname{cl}(A)$. Show that $B$ is connected whenever $A$ is connected. Deduce from this that $\operatorname{cl}(A)$ is connected and that any connected component of $X$ is closed.
2. Let $(X, \mathcal{O})$ be a topological space such that $X$ is the union of a sequence $(X_n)_n$ of connected subsets satisfying $X_n \cap X_{n+1} \neq \emptyset$ for all $n \in \mathbb{N}$. Show that $X$ is connected.
3. Show that the connected subsets of $\mathbb{R}$ are the intervals.
4. Prove that any open subset of $\mathbb{R}$ is the union of a finite or countable family of disjoint open intervals.
5. Verify that a topological space $(X, \mathcal{O})$ is connected if and only if any continuous map $f : X \to \mathbb{Z}$ is constant.
6. Let $A$ and $B$ be two nonempty closed subsets of a topological space. Show that if $A \cap B$ and $A \cup B$ are connected, then $A$ and $B$ are connected. Show by an example of two subsets of $\mathbb{R}$ that the assumption that $A$ and $B$ are closed cannot be omitted.
7. Let $G := \{(r, \sin(1/r)) : r \in\, ]0, 1]\}$ and let $X$ be its closure in $\mathbb{R}^2$. Prove that $X$ is connected but that $X$ is not arcwise connected and not locally connected.
2.2.3 Lower Semicontinuity

In order to deal with minimization problems, one may use a one-sided weakening of continuity when a continuity assumption is not realistic. A precise definition is as follows.

Definition 2.7 A function $f : X \to \overline{\mathbb{R}} := \mathbb{R} \cup \{-\infty, +\infty\}$ on a topological space $X$ is said to be lower semicontinuous (l.s.c.) at some $x \in X$ if for every real number $r < f(x)$ there exists some member $V$ of the family $\mathcal{N}(x)$ of neighborhoods of $x$ such that $r < f(v)$ for all $v \in V$. A function $f$ is upper semicontinuous (u.s.c.) at $x$ whenever $-f$ is l.s.c. at $x$. The function $f$ is said to be lower semicontinuous (l.s.c.) on some subset $S$ of $X$ if $f$ is lower semicontinuous at each point of $S$.

We observe that $f$ is automatically l.s.c. at $x$ when $f(x) = -\infty$; when $f(x) = +\infty$ the lower semicontinuity of $f$ means that the values of $f$ can be as large as required provided one remains in a small enough neighborhood of $x$. When $f(x)$ is finite the definition amounts to assigning to any $\varepsilon > 0$ a neighborhood $V_\varepsilon$ of $x$ such that $f(v) > f(x) - \varepsilon$ for each $v \in V_\varepsilon$. Thus, lower semicontinuity allows sudden upward changes of the value of $f$ but excludes sudden downward changes. Obviously, $f$ is continuous at $x$ iff it is both l.s.c. and u.s.c. at $x$.

Example The function $f : \mathbb{R} \to \mathbb{R}$ given by $f(x) = 1$ for $x < 0$, $f(x) = 0$ for $x \in \mathbb{R}_+$ is l.s.c. but not continuous at $0$.

Example The indicator function $\iota_A$ of a subset $A$ of $X$, defined by $\iota_A(x) = 0$ for $x \in A$, $\iota_A(x) = +\infty$ for $x \in X \setminus A$, is l.s.c. if and only if $A$ is closed, as is easily seen. Such a function is of great use in optimization theory and nonsmooth analysis.

Example The characteristic function $1_A$ of a subset $A$ of $X$, defined by $1_A(x) = 1$ for $x \in A$, $1_A(x) = 0$ for $x \in X \setminus A$, is l.s.c. if and only if $A$ is open. Such a function is of primary importance in integration theory.

Example (The Length Function) Given a metric space $(M, d)$, let $X := C(T, M)$ be the space of continuous maps (curves or arcs of $M$) from $T := [0, 1]$ to $M$.
Given a subdivision $\sigma := \{t_0 = 0 < t_1 < \ldots < t_n = 1\}$ of $T$, let us set for $x \in X$
\[ \ell_\sigma(x) := \sum_{i=1}^{n} d(x(t_{i-1}), x(t_i)), \]
and let $\ell(x)$ be the supremum of $\ell_\sigma(x)$ as $\sigma$ varies in the set of finite subdivisions of $T$. The properties established below yield that $\ell$ is l.s.c. when $X$ is endowed with the topology of uniform convergence (and even when $X$ is endowed with the topology of pointwise convergence). However $\ell$ is not continuous: one can increase $\ell$ by following a nearby curve which makes many small changes (a fact any dog knows, when tied with a leash). Details are given in Exercise 3 below.

The following characterizations are global ones.
Proposition 2.10 For a function $f : X \to \overline{\mathbb{R}}$ the following assertions are equivalent:
(a) $f$ is l.s.c.;
(b) the epigraph $E := \{(x, r) \in X \times \mathbb{R} : r \ge f(x)\}$ of $f$ is closed;
(c) for each $s \in \mathbb{R}$ the sublevel set $S(s) := \{x \in X : f(x) \le s\}$ is closed.

Proof (a)⇒(b) It suffices to prove that $(X \times \mathbb{R}) \setminus E$ is open when $f$ is l.s.c. Given $(x, r) \in (X \times \mathbb{R}) \setminus E$, i.e. such that $r < f(x)$, for any $r' \in\, ]r, f(x)[$ one can find a neighborhood $V$ of $x$ such that $r' < f(v)$ for all $v \in V$. Then $V \times\, ]-\infty, r'[$ is a neighborhood of $(x, r)$ in $X \times \mathbb{R}$ which does not meet $E$. Hence $(X \times \mathbb{R}) \setminus E$ is open.
(b)⇒(c) It suffices to observe that for each $s \in \mathbb{R}$ one has $S(s) \times \{s\} = E \cap (X \times \{s\})$.
(c)⇒(a) Given $x \in X$ and $r \in \mathbb{R}$ such that $r < f(x)$, one has $x \in X \setminus S(r)$, which is open, and for all $v \in V := X \setminus S(r)$ one has $r < f(v)$. □

The notion of lower semicontinuity is intimately tied to the concept of lower limit (denoted by $\liminf$), which is a one-sided concept of limit which can be used even when there is no limit. In the following definition we assume $X$ is a subspace of a larger space $W$, $w \in \operatorname{cl}(X)$ (a situation which will be encountered later, for instance when $X = \mathbb{P} :=\, ]0, \infty[$, $W = \mathbb{R}$ and $w = 0$) and we denote by $\mathcal{N}(w)$ the family of neighborhoods of the point $w$ in $W$.

Definition 2.8 Given a topological space $W$, a subspace $X$ and a point $w$ in the closure of $X$, the lower limit of a function $f : X \to \overline{\mathbb{R}}$ at $w$ is the extended real number
\[ \liminf_{x \to_X w} f(x) := \sup_{V \in \mathcal{N}(w)} \ \inf_{v \in V \cap X} f(v). \]
Setting $m_V := \inf f(V \cap X)$, the supremum over $V \in \mathcal{N}(w)$ of the family $(m_V)_V$ can also be considered as the limit of the net $(m_V)_V$; this explains the terminology. One can show that $\sup_V m_V$ is also the least cluster point of $f(x)$ as $x \to w$ in $X$. When $W$ is metrizable, one can replace the family $\mathcal{N}(w)$ by the family of balls centered at $w$, so that $\liminf_{x \to w} f(x) = \sup_{r > 0} m_r$, with $m_r := \inf f(B(w, r) \cap X)$, is the limit of a sequence.

Exercise Deduce from the preceding that the lower limit of a function $f : X \to \overline{\mathbb{R}}$ at $w$ is the least of the limits of the converging nets $(f(x_i))_{i \in I}$, where $(x_i)_{i \in I}$ is a net in $X$ converging to $w$.

Lower semicontinuity can be characterized using the notion of lower limit (here $W = X$).

Lemma 2.1 A function $f : X \to \overline{\mathbb{R}}$ on a topological space $X$ is l.s.c. at some $w \in X$ iff one has $f(w) \le \liminf_{x \to w} f(x)$.

Proof Clearly, when $f$ is l.s.c. at $w$, one has $f(w) \le \liminf_{x \to w} f(x)$. Conversely, when this inequality holds, for any $r < f(w)$, by the definition of the supremum
over $\mathcal{N}(w)$, one can find $V \in \mathcal{N}(w)$ such that $r < \inf_{v \in V} f(v)$, so that $f$ is l.s.c. at $w$. □

One can also use nets for such a characterization.

Lemma 2.2 A function $f : X \to \overline{\mathbb{R}}$ on a topological space $X$ is l.s.c. at some $x \in X$ iff for any net $(x_i)_{i \in I}$ in $X$ converging to $x$ one has $f(x) \le \liminf_{i \in I} f(x_i)$. When $x$ has a countable base of neighborhoods, one can replace nets by sequences in that characterization.

Proof The condition is necessary: if a net $(x_i)_{i \in I}$ in $X$ converges to $x$, for any $r < f(x)$ there exists some $V \in \mathcal{N}(x)$ such that $f(v) > r$ for all $v \in V$, and there exists some $h \in I$ such that $x_i \in V$ for $i \ge h$, so that $\inf_{i \ge h} f(x_i) \ge r$, hence $\liminf_{i \in I} f(x_i) \ge \inf_{i \ge h} f(x_i) \ge r$.

Conversely, suppose $f$ is not l.s.c. at $x$ and let $(V_i)_{i \in I}$ be a base of neighborhoods of $x$: there exists some $r < f(x)$ such that for any $i \in I$ there exists some $x_i \in V_i$ such that $f(x_i) < r$. Ordering $I$ by $j \ge i$ if $V_j \subset V_i$, we get a net $(x_i)_{i \in I}$ which converges to $x$ and is such that $\liminf_{i \in I} f(x_i) \le r$.

The second assertion follows from the fact that when $x$ has a countable base of neighborhoods, one can take a decreasing sequence of neighborhoods for a base. □

Let us give some useful calculus rules (with the convention $0 \cdot r = 0$ for all $r \in \overline{\mathbb{R}}$).

Exercise For any $\alpha \in \mathbb{R}_+$ and $f : X \to \overline{\mathbb{R}}$ one has $\liminf_{x \to \overline{x}} \alpha f(x) = \alpha \liminf_{x \to \overline{x}} f(x)$.

Exercise If $f, g : X \to \overline{\mathbb{R}}$ are such that $\{\liminf_{x \to \overline{x}} f(x), \liminf_{x \to \overline{x}} g(x)\} \neq \{-\infty, +\infty\}$, then
\[ \liminf_{x \to \overline{x}} (f + g)(x) \ge \liminf_{x \to \overline{x}} f(x) + \liminf_{x \to \overline{x}} g(x). \]
The family of lower semicontinuous functions enjoys stability properties.

Proposition 2.11 If $(f_i)_{i \in I}$ is a family of functions which are l.s.c. at $x$, then the function $f := \sup_{i \in I} f_i$ is l.s.c. at $x$. For any $\alpha \in \mathbb{R}_+$ and $f$, $g$ which are l.s.c. at $x$, the functions $\inf(f, g)$ and $\alpha f$ are l.s.c. at $x$; the same is true for $f + g$ provided $\{f(x), g(x)\} \neq \{-\infty, +\infty\}$. If moreover $f$ and $g$ are nonnegative, then $fg$ is l.s.c. at $x$. If $f : X \to \mathbb{R}$ is finite and l.s.c. at $x$ and if $g : \mathbb{R} \to \overline{\mathbb{R}}$ is nondecreasing and l.s.c. at $f(x)$, then $g \circ f$ is l.s.c. at $x$.

One may observe that in the first assertion one cannot replace lower semicontinuity by continuity, as shown by the above example of arc length.

Proof Let $r \in \mathbb{R}$ be such that $r < f(x)$. There exists some $j \in I$ such that $r < f_j(x)$, hence one can find some $V \in \mathcal{N}(x)$ such that $r < f_j(v) \le f(v)$ for all $v \in V$. The proofs of the other assertions are also straightforward or follow from the preceding lemma. □
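The metric form of Definition 2.8, $\liminf_{x \to w} f(x) = \sup_{r > 0} \inf f(B(w, r))$, together with the criterion $f(w) \le \liminf_{x \to w} f(x)$ of Lemma 2.1, can be explored numerically on the real line. The Python sketch below is only a heuristic illustration: the name `lower_limit`, the chosen radii and the sampling grid are assumptions made here, and a finite sample can only approximate an infimum over a ball (it is exact for the two step functions used).

```python
def lower_limit(f, w, radii=(1.0, 0.1, 0.01, 0.001), samples=2001):
    """Approximate sup over r of inf f(B(w, r)) on the real line by sampling."""
    best = float("-inf")
    for r in radii:
        # Sample points of [w - r, w + r]; w itself is always included.
        pts = [w] + [w - r + 2 * r * k / (samples - 1) for k in range(samples)]
        m_r = min(f(v) for v in pts)  # sampled stand-in for inf f(B(w, r))
        best = max(best, m_r)         # supremum over the radii tried
    return best


def f(x):
    # Step function of the text: l.s.c. at 0 (the value jumps upward to the left).
    return 1.0 if x < 0 else 0.0


def g(x):
    # Values drop below g(0) = 0 arbitrarily close to 0: not l.s.c. at 0.
    return -1.0 if x < 0 else 0.0


print(f(0.0) <= lower_limit(f, 0.0))  # True
print(g(0.0) <= lower_limit(g, 0.0))  # False
```

The first function is the example given after Definition 2.7; the second shows a sudden downward change near $0$, which lower semicontinuity excludes.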
Proposition 2.12 For any function $f : X \to \overline{\mathbb{R}}$ on a topological space $X$, the family of l.s.c. functions majorized by $f$ has a greatest element $\overline{f}$ called the lower semicontinuous hull of $f$ (in short, the l.s.c. hull of $f$). Its epigraph is the intersection with $X \times \mathbb{R}$ of the closure $\operatorname{cl}(\operatorname{epi} f)$ of the epigraph of $f$ in $X \times \overline{\mathbb{R}}$. The function $\overline{f}$ is given by
\[ \overline{f}(x) = \liminf_{u \to x} f(u) = \sup_{U \in \mathcal{N}(x)} \ \inf_{u \in U} f(u). \]
Proof The first assertion is a direct consequence of Proposition 2.11. The second assertion easily stems from the fact that setting $g(x) := \min\{r : (x, r) \in \operatorname{cl}(\operatorname{epi} f)\}$ one defines a lower semicontinuous function that is the greatest lower semicontinuous function majorized by $f$. The proof of the explicit expression of $\overline{f}$ is left as an exercise. □

The treatment of the following example requires the knowledge of some material in integration theory to be found in Sects. 7.4 and 8.5. Its importance justifies its presence here.

Example Let $E$ be a Banach space, let $(S, \mathcal{S}, \mu)$ be a measure space and let $L : E \times S \to \overline{\mathbb{R}}$ be a measurable function such that for a null set $N$ in $S$ the function $L_s : e \mapsto L(e, s)$ is lower semicontinuous whenever $s \in S \setminus N$. Suppose that for some $p \in [1, \infty[$ and some $a \in \mathbb{R}$, $b \in L_1(S)$ one has
\[ L(e, s) \ge b(s) - a \|e\|^p \qquad \forall (e, s) \in E \times (S \setminus N). \]
Then the function $j : L_p(S, E) \to \overline{\mathbb{R}}$ given by $j(x) := \int_S L(x(s), s)\, d\mu(s)$ is lower semicontinuous.

We first prove this assertion in the case $a = 0$, $b = 0$. Let $(x_n)$ be a sequence in $L_p(S, E)$ converging to some $x \in L_p(S, E)$. Taking a subsequence if necessary we may suppose $(j(x_n))$ converges to $\liminf_n j(x_n)$. Taking a further subsequence we may assume that $(x_n(s)) \to x(s)$ a.e. in $S$. Our lower semicontinuity assumption on $L$ ensures that
\[ \liminf_n L(x_n(s), s) \ge L(x(s), s) \quad \text{a.e.,} \]
so that, by Fatou's lemma, we obtain
\[ \liminf_n \int_S L(x_n(s), s)\, d\mu(s) \ge \int_S \liminf_n L(x_n(s), s)\, d\mu(s) \ge \int_S L(x(s), s)\, d\mu(s), \]
or $\liminf_n j(x_n) \ge j(x)$, the required lower semicontinuity property.

In the general case we set $M(e, s) := L(e, s) - b(s) + a \|e\|^p \in \overline{\mathbb{R}}_+$ and $m(x) := \int_S M(x(s), s)\, d\mu(s)$ for $x \in L_p(S, E)$. Given $x \in L_p(S, E)$ and a sequence $(x_n) \to x$ in
$L_p(S, E)$, we have
\[ \liminf_n j(x_n) = \liminf_n \Big( m(x_n) - a \int_S \|x_n(s)\|^p\, d\mu(s) + \int_S b(s)\, d\mu(s) \Big) \ge m(x) - a \|x\|_p^p + \int_S b(s)\, d\mu(s) = j(x). \]
That shows that $j$ is lower semicontinuous. □
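The remark made in the length function example, that $\ell$ is not continuous for uniform convergence, can be observed numerically with sawtooth curves in $M := \mathbb{R}^2$ approaching the segment $x(t) := (t, 0)$. The Python sketch below is an illustration under assumptions made here: the names `sawtooth`, `x_n`, `polygonal_length` and `report` are hypothetical, and the sawtooth with $n$ teeth of height $1/(2n)$ is one possible choice of nearby curve. The polygonal sum computed is $\ell_\sigma(x_n)$ for the subdivision $\sigma$ with nodes $j/(2n)$, which is a lower bound for $\ell(x_n)$.

```python
import math


def sawtooth(n, t):
    """Second coordinate of a curve with n teeth of slope +1/-1 and height 1/(2n)."""
    u = (n * t) % 1.0            # position inside the current tooth, in [0, 1[
    return min(u, 1.0 - u) / n   # rises for u <= 1/2, then falls back to 0


def x_n(n, t):
    """Sawtooth curve in R^2, uniformly within 1/(2n) of the segment t -> (t, 0)."""
    return (t, sawtooth(n, t))


def polygonal_length(curve, subdivision):
    """The sum of d(x(t_{i-1}), x(t_i)) over a subdivision t_0 < ... < t_k."""
    pts = [curve(t) for t in subdivision]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))


def report(n):
    # Subdivision whose nodes are the corners of the sawtooth.
    sub = [j / (2 * n) for j in range(2 * n + 1)]
    length = polygonal_length(lambda t: x_n(n, t), sub)
    sup_dist = max(sawtooth(n, t) for t in sub)  # attained at the peaks
    return length, sup_dist


for n in (1, 10, 100):
    print(n, report(n))  # length stays near sqrt(2) while sup_dist shrinks like 1/(2n)
```

The uniform distance to the segment tends to $0$ while the lengths stay near $\sqrt{2} > 1 = \ell(x)$, which is compatible with lower semicontinuity of $\ell$ but rules out continuity.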
Exercises

1. Using the relation $E = \bigcap_{i \in I} E_i$, where $E_i$ is the epigraph of a function $f_i$ and $E$ is the epigraph of $f := \sup_{i \in I} f_i$, show that $f$ is l.s.c. on $X$ when each $f_i$ is l.s.c. on $X$. Use a similar argument with sublevel sets.
2. Suppose $X$ is a metric space. Show that $f$ is l.s.c. at $x$ iff for any sequence $(x_n)$ converging to $x$ one has $f(x) \le \liminf_n f(x_n)$.
3. Let $(M, d)$ be a metric space and let $X := C(T, M)$, where $T := [0, 1]$. Given some $x \in X$ and some element $s$ of the set $S$ of nondecreasing sequences $s := (s_n)_{n \ge 0}$ satisfying $s_0 = 0$, $s_n = 1$ for $n$ large, let
\[ \ell_s(x) := \sum_{n \ge 0} d(x(s_n), x(s_{n+1})) \]
(observe that the preceding sum contains only a finite number of non-zero terms). Define the length of a curve $x \in X$ by $\ell(x) := \sup_{s \in S} \ell_s(x)$. Show that $\ell_s : X \to \mathbb{R}$ is continuous when $X$ is endowed with the metric of uniform convergence (and even when $X$ is provided with the topology of pointwise convergence). Conclude that the length $\ell$ is a l.s.c. function on $X$. Show that $\ell$ is not continuous by taking $M := \mathbb{R}^2$, $x$ given by $x(t) := (t, 0)$ and by showing that there is some $x_n \in X$ such that $d(x_n, x) \to 0$ and $\ell(x_n) \ge \sqrt{2}$. [Hint: for $n > 0$ take the first component of $x_n(t)$ to be $t$ and its second component to be $t - \frac{2k}{2n}$ for $t \in [\frac{2k}{2n}, \frac{2k+1}{2n}]$, $k \le n$, and $-t + \frac{2k+2}{2n}$ for $t \in [\frac{2k+1}{2n}, \frac{2k+2}{2n}]$, $k \le n - 1$.]
4. Show that the infimum of an infinite family of l.s.c. functions is not necessarily l.s.c. [Hint: observe that any function $f$ on $X$ is the infimum of the family $(f_a)_{a \in X}$ of functions given by $f_a(x) = f(a)$ if $x = a$, $+\infty$ else.]
5. Let $f : X \to \mathbb{R}_\infty$ be a l.s.c. function on a topological space $X$ and let $A$ be a nonempty subset of $X$. Show that $\inf f(A) = \inf f(\operatorname{cl} A)$. Can one replace $\inf$ by $\sup$?
6. Show that the supremum of a family of continuous functions is not necessarily continuous.
7. (Ritz's method) Let $f : X \to \mathbb{R}$ be an upper semicontinuous function on a topological space $X$ and let $(X_n)_{n \ge 0}$ be a sequence of subspaces such that for
all $x \in X$ there exists a sequence $(x_n) \to x$ satisfying $x_n \in X_n$ for all $n$. Let $m := \inf f(X)$, $m_n := \inf f(X_n)$. Show that $m = \lim_n m_n$.
2.2.4 Compactness

The existence of limits being a frequent aim, the following definition is of interest.

Definition 2.9 A topological space $(X, \mathcal{O})$ is said to be compact if it is Hausdorff and if any net in $X$ has a convergent subnet.

Equivalently, since a cluster point of a net is the limit of some subnet, a Hausdorff topological space $(X, \mathcal{O})$ is compact if every net in $X$ has a cluster point. Moreover, if a net $(x_i)_{i \in I}$ of a compact space $(X, \mathcal{O})$ has a unique cluster point $x$, then it converges to $x$: if this were not the case, one could find an open neighborhood $V$ of $x$ and a cofinal subset $J$ of $I$ such that $x_j \notin V$ for all $j \in J$, and then $(x_j)_{j \in J}$ would have a convergent subnet whose limit would be in $X \setminus V$, contradicting the uniqueness of the cluster point of $(x_i)_{i \in I}$.

The property of the definition can be characterized in different ways. The usual one deals with open coverings of $X$, i.e. families $(O_i)_{i \in I}$ of open subsets whose union is $X$. Another one deals with families $(C_i)_{i \in I}$ satisfying the finite intersection property, i.e. such that for any finite subset $J$ of $I$ one has $\bigcap_{j \in J} C_j \neq \emptyset$.

Theorem 2.1 A Hausdorff topological space $(X, \mathcal{O})$ is compact if and only if every open covering $(O_i)_{i \in I}$ of $X$ has a finite subcovering, if and only if any family $(C_i)_{i \in I}$ of closed subsets with the finite intersection property has a nonempty intersection.

Proof Setting $O_i := X \setminus C_i$ (and conversely $C_i := X \setminus O_i$) one sees that the last two properties are equivalent, since for all $J \subset I$ the family $(O_j)_{j \in J}$ is a covering of $X$ if and only if $\bigcap_{j \in J} C_j$ is empty.

Let us assume the last property is satisfied and let $(x_i)_{i \in I}$ be a net in $X$. Setting $C_i := \operatorname{cl}\{x_j : j \ge i\}$, we see that for every finite subset $J$ of $I$ and for any $k \in I$ such that $k \ge j$ for all $j \in J$ (such a $k$ exists since $I$ is directed) one has $\bigcap_{j \in J} C_j \supset C_k \neq \emptyset$, so that $\bigcap_{i \in I} C_i$ is nonempty. Since by Corollary 2.2 $\bigcap_{i \in I} C_i$ is the set of cluster points of $(x_i)_{i \in I}$, we get a cluster point of $(x_i)_{i \in I}$.
Conversely, suppose that every net in $X$ has a cluster point and let $(C_i)_{i \in I}$ be a family of closed subsets satisfying the finite intersection property. Let $\mathcal{J}$ be the family of finite subsets of $I$ and for $J \in \mathcal{J}$ let $x_J \in C_J := \bigcap_{j \in J} C_j$. Since $\mathcal{J}$ is directed with respect to inclusion, we get a net $(x_J)_{J \in \mathcal{J}}$ in $X$. Let us show that if $\bigcap_{i \in I} C_i = \emptyset$ the net $(x_J)_{J \in \mathcal{J}}$ cannot have a cluster point. Suppose $x$ is a cluster point of $(x_J)_{J \in \mathcal{J}}$. Since $x \notin \bigcap_{i \in I} C_i$ there exists some $k \in I$ such that $x \in V := X \setminus C_k$. Then, for $J \in \mathcal{J}$ satisfying $J \supset K := \{k\}$ we cannot have $x_J \in V$ since $x_J \in C_J \subset C_k$. Thus $x$ cannot be a cluster point of $(x_J)_{J \in \mathcal{J}}$. □

Corollary 2.5 Let $X$ be a subset of a Hausdorff topological space $(W, \mathcal{O}_W)$ endowed with the induced topology $\mathcal{O}_X := \{O \cap X : O \in \mathcal{O}_W\}$. Then $(X, \mathcal{O}_X)$ is compact if and only if every covering $(O_i)_{i \in I}$ of $X$ by members of $\mathcal{O}_W$ has a finite subcovering.
Here a family $(W_i)_{i \in I}$ of subsets of $W$ is called a covering of $X$ if $X \subset \bigcup_{i \in I} W_i$, and a subcovering is a subfamily $(W_j)_{j \in J}$ of $(W_i)_{i \in I}$ that is still a covering of $X$.

Proof If $(X, \mathcal{O}_X)$ is compact, for every covering $(O_i)_{i \in I}$ of $X$ by open subsets of $W$, the family $(O_i \cap X)_{i \in I}$ being a covering of $X$ for the induced topology $\mathcal{O}_X$, one can find a finite subset $J$ of $I$ such that $(O_j \cap X)_{j \in J}$ is a covering of $X$. Then $(O_j)_{j \in J}$ is a finite subcovering of $X$. Conversely, suppose every covering $(O_i)_{i \in I}$ of $X$ by members of $\mathcal{O}_W$ has a finite subcovering $(O_j)_{j \in J}$. Then, if $(G_i)_{i \in I}$ is an open covering of $(X, \mathcal{O}_X)$, picking $O_i \in \mathcal{O}_W$ such that $G_i = O_i \cap X$, we can find a finite subset $J$ of $I$ such that $X \subset \bigcup_{j \in J} O_j$, and then $(G_j)_{j \in J}$ is a finite subcovering of $(G_i)_{i \in I}$. Thus $(X, \mathcal{O}_X)$ is compact. □

Example A discrete space, i.e. a set endowed with the discrete topology, is compact if and only if it is finite.

Example If $(x_n)$ is a convergent sequence in a Hausdorff topological space $(W, \mathcal{O})$ and if $x := \lim_n x_n$, then the set $X := \{x_n : n \in \mathbb{N}\} \cup \{x\}$ is compact with respect to the induced topology. In fact, given a covering $(O_i)_{i \in I}$ of $X$ by open subsets of $W$ we can find some $k \in I$ such that $x \in O_k$. Since $(x_n) \to x$ there exists some $m \in \mathbb{N}$ such that $x_n \in O_k$ for $n > m$. Then, taking for all $j \in \mathbb{N}$, $j \le m$, some $i(j) \in I$ such that $x_j \in O_{i(j)}$, we obtain a finite subcover of $X$ by taking the family $\{O_{i(j)} : 0 \le j \le m\} \cup \{O_k\}$. □

Another important example is given by the following theorem.

Theorem 2.2 (Heine-Borel-Lebesgue) Every closed bounded interval of $\mathbb{R}$ is compact.

Proof Let $X := [a, b]$ with $a \le b$ in $\mathbb{R}$ and let $(O_i)_{i \in I}$ be a covering of $X$ by open subsets of $\mathbb{R}$. Let $A$ be the set of $x \in X$ such that $[a, x]$ is covered by a finite number of members of $(O_i)_{i \in I}$. Then $A$ is nonempty (since $a \in A$), hence it has a least upper bound $c \le b$. Let $h \in I$ be such that $c \in O_h$. Suppose $c < b$. Given $\varepsilon > 0$ such that $c + \varepsilon \le b$ and $[c - \varepsilon, c + \varepsilon] \subset O_h$, one can find some $x \in A$ such that $c - \varepsilon < x \le c$.
By definition of $A$ one can find a finite subset $J$ of $I$ such that $[a, x] \subset \bigcup_{j \in J} O_j$. Then $c \in A$ and $[a, c + \varepsilon] \subset \bigcup_{k \in K} O_k$ for $K := J \cup \{h\}$, so that $c + \varepsilon \in A$, a contradiction. Thus $c = b$ and $b \in A$: $[a, b]$ can be covered by a finite subfamily of $(O_i)_{i \in I}$. □

Corollary 2.6 The space $\overline{\mathbb{R}} := \mathbb{R} \cup \{-\infty, +\infty\}$ is compact.

Proof This follows from the fact that there exists a homeomorphism $h$ from $\overline{\mathbb{R}}$ onto $[-1, 1]$, for instance $h : r \mapsto r/(1 + |r|)$ for $r \in \mathbb{R}$, $h(-\infty) := -1$, $h(+\infty) := 1$. □

Let us give some permanence properties.

Proposition 2.13 Let $X$ be a closed subset of a compact topological space $(W, \mathcal{O}_W)$. Then, denoting by $\mathcal{O}_X := \{O \cap X : O \in \mathcal{O}_W\}$ the induced topology, $(X, \mathcal{O}_X)$ is compact.
Proof If $(x_i)_{i \in I}$ is a net in $X$, it has a subnet $(x_j)_{j \in J}$ that converges in $W$. But since $X$ is closed in $W$, the limit $x$ of $(x_j)_{j \in J}$ belongs to $X$ and $(x_j)_{j \in J}$ converges in $(X, \mathcal{O}_X)$. □

Proposition 2.14 Let $X$ be a subset of a Hausdorff topological space $(W, \mathcal{O}_W)$. If $X$ is compact with respect to the induced topology $\mathcal{O}_X$, then $X$ is closed in $(W, \mathcal{O}_W)$.

Proof Let $w$ be an element of the closure of $X$. Then, there exists a net $(x_i)_{i \in I}$ of $X$ that converges to $w$. But since $(X, \mathcal{O}_X)$ is compact, $(x_i)_{i \in I}$ has a cluster point $x \in X$. Then Proposition 2.1 ensures that $x = w$, so that $w \in X$. □

Corollary 2.7 The compact subsets of $\mathbb{R}$ (with respect to the induced topology) are the closed bounded subsets of $\mathbb{R}$.

Proof Since $\mathbb{R}$ is Hausdorff, the preceding proposition shows that a compact subset $X$ of $\mathbb{R}$ is closed. It is bounded since otherwise we could find a sequence $(x_n)$ in $X$ satisfying $|x_n| > n$; such a sequence cannot have a cluster point. Conversely, let $X$ be a closed bounded subset of $\mathbb{R}$. Then there exists a closed bounded interval $W := [a, b]$ containing $X$. Since $W$ is compact and $X$ is closed in $W$, $X$ is compact by Proposition 2.13. □

Exercise Show that in a Hausdorff topological space the union of a finite family of compact subsets is compact and the intersection of a family of compact subsets is compact.

Theorem 2.3 Let $(X, \mathcal{O}_X)$ and $(Y, \mathcal{O}_Y)$ be two Hausdorff topological spaces, and let $f : X \to Y$ be a continuous map. If $X$ is compact, then $Z := f(X)$ is compact.

Proof Let $(z_i)_{i \in I}$ be a net in $Z$. If $x_i \in X$ is such that $z_i = f(x_i)$, then $(x_i)_{i \in I}$ has a convergent subnet $(x_j)_{j \in J}$. Then $(z_j)_{j \in J}$ is a subnet of $(z_i)_{i \in I}$ that converges to $f(\lim_j x_j)$. □

Corollary 2.8 Let $(X, \mathcal{O}_X)$ and $(Y, \mathcal{O}_Y)$ be two Hausdorff topological spaces, and let $f : X \to Y$ be a continuous injective map. If $X$ is compact, then $f$ is a homeomorphism from $X$ onto $f(X)$.

Proof If $C$ is a closed subset of $X$, $C$ is compact and $f(C)$ is compact too, so that $f(C)$ is closed in $f(X)$.
□

The proof of the following theorem requires Zorn's Lemma when the product has an infinite number of factors.

Theorem 2.4 (Tykhonov) The product of a family of compact topological spaces is compact.

Proof We admit the result when the product has an infinite number of factors. The case of a finite number of factors is reduced to the case of two factors by an induction. Let $Z := X \times Y$ be the product of two compact spaces and let $(z_i)_{i \in I} := ((x_i, y_i))_{i \in I}$ be a net in $Z$. A subnet $(x_j)_{j \in J}$ of $(x_i)_{i \in I}$ converges. In turn, a subnet $(y_k)_{k \in K}$ of $(y_j)_{j \in J}$ converges. Then $(z_k)_{k \in K}$ is a convergent subnet of $(z_i)_{i \in I}$. □
Corollary 2.9 The compact subsets of $\mathbb{R}^d$ (with respect to the induced topology) are the closed bounded subsets of $\mathbb{R}^d$.

Here a subset of $\mathbb{R}^d$ is said to be bounded if its projections are bounded.

Proof If $S$ is a compact subset of $\mathbb{R}^d$, it is closed and its projections are compact, hence bounded. Conversely, if $S$ is a closed bounded subset of $\mathbb{R}^d$, then $S$ is contained in a product of closed bounded intervals. Thus $S$ is a closed subset of a compact space, hence $S$ is compact. □

A subset $X$ of a topological space $(W, \mathcal{O}_W)$ is said to be relatively compact if its closure is compact. Thus, any subset of a relatively compact subset is relatively compact.

Since fixed point results enable us to solve equations, the interest of the notion of compactness is illustrated by the following theorem. It has been given many proofs; an elementary one can be found in [197, 222] and in the appendix. Recall that a subset $C$ of a vector space is said to be convex if for all $x_0, x_1 \in C$ and $t \in [0, 1]$ one has $(1 - t)x_0 + t x_1 \in C$.

Theorem 2.5 (Brouwer) Let $X$ be a compact convex subset of $\mathbb{R}^d$ (or of a finite dimensional normed vector space) and let $f : X \to X$ be a continuous map. Then there exists some $x \in X$ such that $f(x) = x$.

Corollary 2.10 (Hairy Ball Theorem) Let $X$ be a (finite dimensional) Euclidean space with scalar product $\langle \cdot \mid \cdot \rangle$, unit closed ball $B_X$, unit sphere $S_X$, let $r > 0$, and let $g : rB_X \to X$ be continuous and pointing inside $rB_X$, i.e. such that
\[ \langle g(x) \mid x \rangle \le 0 \qquad \forall x \in rS_X. \]
Then there exists some $z \in rB_X$ such that $g(z) = 0$.

Proof Suppose on the contrary that $g(x) \neq 0$ for all $x \in rB_X$. Then $h : rB_X \to X$ given by
\[ h(x) := \frac{r}{\|g(x)\|}\, g(x), \qquad x \in rB_X, \]
is continuous and takes its values in $rS_X \subset rB_X$, hence has a fixed point $x \in rB_X$ by Brouwer's Theorem. Since $x = h(x) \in rS_X$, we get the contradiction
\[ r^2 = \|h(x)\|^2 = \langle h(x) \mid x \rangle = \frac{r}{\|g(x)\|} \langle g(x) \mid x \rangle \le 0. \]
This contradiction proves that $g$ has a zero in $rB_X$. □
Corollary 2.11 Let $X$ be a (finite dimensional) Euclidean space, let $b, r \in \mathbb{R}_+$, and let $f : X \to X$ be continuous and such that $\langle f(x) \mid x \rangle \ge b \|x\|$ for all $x \in rS_X$. Then for all $y \in bB_X$ the equation $f(x) = y$ has a solution $x \in rB_X$. If $\langle f(x) \mid x \rangle \ge c(\|x\|) \|x\|$ for some function $c(\cdot)$ such that $c(r) \to \infty$ as $r \to \infty$, then $f(X) = X$. Moreover, there exists a function $k : \mathbb{R}_+ \to \mathbb{R}_+$ such that for all $x, y \in X$ satisfying $f(x) = y$ one has $\|x\| \le k(\|y\|)$.

Proof Given $y \in bB_X$ let us set $g(\cdot) := y - f(\cdot)$, so that $g$ is continuous and for $x \in rS_X$
\[ \langle g(x) \mid x \rangle = \langle y \mid x \rangle - \langle f(x) \mid x \rangle \le b \|x\| - b \|x\| = 0. \]
Thus, there exists some $x \in rB_X$ such that $g(x) = 0$, i.e. $f(x) = y$. Assuming that $\langle f(x) \mid x \rangle \ge c(\|x\|) \|x\|$ for $c(\cdot)$ satisfying $\lim_{r \to \infty} c(r) = \infty$, and setting
\[ k(s) := \sup\{ r \in \mathbb{R}_+ : c(r) \le s \}, \qquad s \in \mathbb{R}_+, \]
we see that $k$ takes finite values and that whenever $x$ is such that $f(x) = y$ we have $c(\|x\|) \|x\| \le \langle f(x) \mid x \rangle = \langle y \mid x \rangle \le \|y\| \, \|x\|$, hence $\|x\| \le k(\|y\|)$. □

Some topological spaces of interest have a local behavior involving compactness.

Definition 2.10 A topological space $(X, \mathcal{O}_X)$ is said to be locally compact if it is Hausdorff and if each point of $X$ has a compact neighborhood.

A compact space is obviously locally compact. A discrete space (i.e. a space in which each subset is open) is locally compact, but it is not compact if it is infinite. The space $\mathbb{R}$ with its usual topology is locally compact but not compact. An open subset of a compact space or locally compact space is locally compact, as the following proposition shows.

Proposition 2.15 In a compact space, or more generally in a locally compact space, each point has a base of neighborhoods formed by compact sets.

Proof Let $(X, \mathcal{O}_X)$ be a compact space, let $x \in X$ and let $U \in \mathcal{N}(x)$. We want to show that there exists some closed $V \in \mathcal{N}(x)$ contained in $U$. Without loss of generality we may assume $U$ is open. If no such $V$ exists, since $\mathcal{N}(x)$ is stable under finite intersections, the family $\{ V \setminus U : V \in \mathcal{N}(x), \ \mathrm{cl}(V) = V \}$ of closed subsets of $X \setminus U$ has the finite intersection property. Since $X \setminus U$ is compact by Proposition 2.13, one can find some $y \in X \setminus U$ belonging to every closed neighborhood $V$ of $x$. Then we have $y \ne x$ and since $X$ is Hausdorff we can find some $V \in \mathcal{N}(x)$, $W \in \mathcal{N}(y)$ with $V \cap W = \emptyset$. Since we may assume $W$ is open, replacing $V$ with its closure we may suppose $V$ is closed. Then $y \in V$. Since $y \in W$ and $V \cap W = \emptyset$ we get a contradiction.

Now suppose $X$ is locally compact. Let $x \in X$ and let $U \in \mathcal{N}(x)$. By assumption there is some $W \in \mathcal{N}(x)$ that is compact. The preceding yields some compact neighborhood $V$ of $x$ in $W$ contained in $U \cap W$. Then $V$ is a neighborhood of $x$ in $X$ and is contained in $U$. □
Proposition 2.16 Every open or closed subset $X$ of a locally compact space $(W, \mathcal{O}_W)$ is locally compact.

Proof If $X$ is open, for every $x \in X$ one has $X \in \mathcal{N}(x)$, so that there is some compact $V \in \mathcal{N}(x)$ contained in $X$ and $V$ is a neighborhood of $x$ with respect to the induced topology on $X$. If $X$ is closed in $W$ and if $x \in X$, taking a neighborhood $V$ of $x$ in $W$ that is compact, we see that $V \cap X$ is compact and is a neighborhood of $x$ in $X$ with respect to the induced topology. Thus $X$ is locally compact. □

Proposition 2.17 The product of a finite family of locally compact spaces is locally compact. In particular, $\mathbb{R}^d$ is locally compact.

Proof It suffices to prove that the product $Z := X \times Y$ of two locally compact spaces is locally compact. Given $z := (x, y) \in Z$ we pick $U \in \mathcal{N}(x)$, $V \in \mathcal{N}(y)$ that are compact. Then $U \times V$ is a compact neighborhood of $z$. □

Given a topological space $(X, \mathcal{O})$ there are different ways of embedding it into a compact space. When $(X, \mathcal{O})$ is locally compact the simplest approach consists in adding a point $\omega$ to $X$ and declaring that a subset $O$ of $W := X \cup \{\omega\}$ is open if either it belongs to $\mathcal{O}$ or if $\omega \in O$ and $X \setminus O$ is compact. It is easy to see that the resulting topological space is compact. It is called the Alexandroff compactification of $X$ or one point compactification of $X$.

Exercise Show that the Alexandroff compactification of $\mathbb{R}^d$ is homeomorphic to the unit sphere $S^d$ of $\mathbb{R}^{d+1}$. [Hint: use the stereographic projection $p : S^d \setminus \{n\} \to \mathbb{R}^d$, where $n$ is the "north pole" $n := (0, \ldots, 0, 1)$ of $S^d$, defined as follows: for $x \in S^d \setminus \{n\}$, $p(x)$ is the point of $\mathbb{R}^d \times \{0\}$ that belongs to the half-line $n + \mathbb{R}_+ (x - n)$.]

The following theorem is the main existence result in optimization theory.

Theorem 2.6 (Weierstrass) Let $f : X \to \overline{\mathbb{R}}$ be a lower semicontinuous function on a (nonempty) compact topological space $X$. Then the set $M := \{ w \in X : f(w) \le f(x) \ \forall x \in X \}$ of minimizers of $f$ is nonempty.

Proof We may suppose $m := \inf f(X) < \infty$, for otherwise $f$ is constant with value $\infty$. Setting $S_f(r) := \{ x \in X : f(x) \le r \}$ for $r \in \mathbb{R}$, the family $\{ S_f(r) : r > m \}$ is formed of nonempty closed subsets and any finite subfamily has a nonempty intersection: $\bigcap_{1 \le i \le k} S_f(r_i) = S_f(r_j)$ where $r_j := \min_{1 \le i \le k} r_i$. Since $X$ is compact, $M = \bigcap_{r > m} S_f(r)$ is therefore nonempty. □

Given a topological space $X$, one may try to weaken its topology in order to enlarge the family of compact subsets. Then, a continuous function on $X$ may not remain continuous. There are interesting cases, for instance making use of convexity assumptions, for which the function still remains lower semicontinuous, so that the preceding generalization of the classical existence of a minimizer under a continuity assumption is of interest.

Let us give a criterion for the lower semicontinuity of a function obtained by minimization.
Proposition 2.18 Let $W$, $X$ be topological spaces, let $\bar{w} \in W$ and let $f : W \times X \to \overline{\mathbb{R}}$ be a function which is lower semicontinuous at $(\bar{w}, x)$ for every $x \in X$. If the following compactness assumption is satisfied, then the performance function $p : W \to \overline{\mathbb{R}}$ given by $p(w) := \inf_{x \in X} f(w, x)$ is lower semicontinuous at $\bar{w}$:

(C) for any net $(w_i)_{i \in I} \to \bar{w}$ there exist a subnet $(w_j)_{j \in J}$, a convergent net $(x_j)_{j \in J}$ in $X$ and $(\varepsilon_j)_{j \in J} \to 0_+$ such that $f(w_j, x_j) \le p(w_j) + \varepsilon_j$ for all $j \in J$.

Proof Given a net $(w_i)_{i \in I} \to \bar{w}$ such that $(p(w_i))_{i \in I}$ converges, let $(w_j)_{j \in J}$ be a subnet of $(w_i)_{i \in I}$, and let $(x_j)_{j \in J}$ and $(\varepsilon_j) \to 0_+$ be as in (C). Then, if $x$ is the limit of $(x_j)_{j \in J}$, one has
\[ p(\bar{w}) \le f(\bar{w}, x) \le \liminf_{j \in J} f(w_j, x_j) \le \liminf_{j \in J} \big( p(w_j) + \varepsilon_j \big) = \lim_{i \in I} p(w_i). \]
Since $\liminf_{w \to \bar{w}} p(w)$ is the limit of $(p(w_i))_{i \in I}$ for some net $(w_i)_{i \in I} \to \bar{w}$ such that $(p(w_i))_{i \in I}$ converges, these inequalities show that $p(\bar{w}) \le \liminf_{w \to \bar{w}} p(w)$. □

Corollary 2.12 Let $W$ and $X$ be topological spaces, $X$ being compact, and let $f : W \times X \to \mathbb{R}_\infty := \mathbb{R} \cup \{+\infty\}$ be lower semicontinuous. Then the performance function $p$ defined as above is lower semicontinuous.

Proof We give a proof in the case when $f$ is just lower semicontinuous at each point of $\{\bar{w}\} \times X$; when $f$ is lower semicontinuous on $W \times X$, a simpler proof can be given using Weierstrass' Theorem. Condition (C) is clearly satisfied when $X$ is compact since for any net $(w_i)_{i \in I} \to \bar{w}$ and for any sequence $(\alpha_n) \to 0_+$ one can take $H := I \times \mathbb{N}$, $w_h := w_i$, $\varepsilon_h := \alpha_n$ for $h := (i, n)$, pick $x_h \in X$ satisfying $f(w_h, x_h) \le p(w_h) + \varepsilon_h$, and take a subnet $(x_j)_{j \in J}$ of $(x_h)_{h \in H}$ which converges in $X$. □
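The performance function of Proposition 2.18 can be explored numerically. The following is only a minimal sketch, not part of the text: the helper `performance`, the sample objective $f(w, x) = wx$ and the grid `X_GRID` sampling the compact set $X = [-1, 1]$ are illustrative assumptions. In general a minimum over a grid merely approximates the infimum; in this example it is exact because the minimum is attained at an endpoint of the grid, so that $p(w) = -|w|$.

```python
def performance(f, w, x_grid):
    """Approximate p(w) = inf_x f(w, x) by a minimum over a finite grid of X."""
    return min(f(w, x) for x in x_grid)


# Hypothetical data: X = [-1, 1] sampled at 201 points, f(w, x) = w * x.
X_GRID = [i / 100 - 1 for i in range(201)]


def f(w, x):
    return w * x


for w in (-2.0, 0.0, 0.5):
    # For this particular f the exact value is p(w) = -|w|.
    print(w, performance(f, w, X_GRID))
```

Here $p$ is even continuous; the interest of the proposition lies in cases where $f$ is merely lower semicontinuous.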
Exercises

1. For every net $(r_i)_{i \in I}$ in the compact space $\overline{\mathbb{R}}$ show that $\liminf_{i \in I} r_i$ is the least cluster point of $(r_i)_{i \in I}$ and $\limsup_{i \in I} r_i$ is the greatest cluster point of $(r_i)_{i \in I}$.
2. Let $X$ and $Y$ be topological spaces and let $Z$ be a closed subset of $X \times Y$. Show that if $Y$ is compact, then the projection $p_X(Z)$ of $Z$ on $X$ is closed. Give an example showing that one cannot drop the assumption that $Y$ is compact.
3. Let $X$ and $Y$ be topological spaces, $Y$ being Hausdorff, and let $f : X \to Y$ be a map. Show that if $f$ is continuous then the graph $G := \{ (x, f(x)) : x \in X \}$ of $f$ is closed in $X \times Y$. Give an example with $X = Y = \mathbb{R}$ showing that the converse is not true. Prove that if $Y$ is compact, then the converse is true.
4. Let $(K_n)_n$ be a decreasing sequence of nonempty compact subsets of a topological space $X$. Show that $K := \bigcap_n K_n$ is nonempty and that for any open subset $G$ of $X$ containing $K$ there exists some $m \in \mathbb{N}$ such that $K_m \subset G$. Give a generalization to a filtered family $(K_i)_{i \in I}$ of nonempty compact subsets.
5. Let $X$ be a Hausdorff topological space and let $A$ and $B$ be two disjoint compact subsets of $X$. Show that there exist open subsets $U$, $V$ of $X$ such that $A \subset U$, $B \subset V$ and $U \cap V = \emptyset$. [Hint: start with the case when $B$ is a singleton.]
6. Let $X := [0, 1] \times [0, 1]$ be endowed with the lexicographic order $\preceq$. Let $\mathcal{O}$ be the topology generated by the open intervals $T_{w,z} := \{ x \in X : w \prec x \prec z \}$ of $X$ and the intervals $\{ x \in X : x \prec z \}$, $\{ x \in X : w \prec x \}$, where $x \prec x'$ means that $x \preceq x'$ and $x \ne x'$. Show that $X$ is compact with respect to this topology.
7. Show that for any compact subset $K$ of a locally compact space $X$ and any neighborhood $U$ of $K$ there exists some compact neighborhood $V$ of $K$ contained in $U$.
8. Show that the intersection of two locally compact subsets of a topological space is locally compact.
9. Prove that the union of two locally compact subsets of a topological space is not always locally compact. [Hint: in $\mathbb{R}$ take $A := \{a\}$ where $a$ is the limit of a sequence $(a_n)$ of $\mathbb{R} \setminus \{a\}$ and $B := \mathbb{R} \setminus (\{ a_n : n \in \mathbb{N} \} \cup \{a\})$, or in $\mathbb{R}^2$ take $A := \mathbb{R} \times \mathbb{P}$, $B := \{(0, 0)\}$.]
10. Show by an example that the image of a locally compact space $X$ under a continuous map $f : X \to Y$ is not necessarily locally compact. [Hint: take a bijection $f$ from $\mathbb{N}$ onto $\mathbb{Q}$.]
11. Prove the odd Hairy Ball Theorem: for $d$ odd there is no nowhere vanishing continuous vector field on the unit sphere $S^{d-1} := S_{\mathbb{R}^d}$ tangent to $S_{\mathbb{R}^d}$. [Hint: use the appendix.]
12. (Beals) Let $B$ be the closed unit ball of the space $X := c_0$ of sequences $x := (x_n)_{n \ge 0}$ with limit $0$ endowed with the supremum norm. Prove that the map $f : B \to B$ given by $f(x) = (1, x_0, x_1, \ldots)$ for $x := (x_0, x_1, \ldots) \in B$ is nonexpansive, i.e. does not increase distances, and has no fixed point.
13. (Stone) A Boolean ring is a ring $A$ such that $a^2 = a$ for all $a \in A$. A topological space $(X, \mathcal{O})$ is said to be totally disconnected if the family $\mathcal{B}$ of subsets of $X$ which are simultaneously open and closed forms a base of its topology. Show that $\mathcal{B}$ forms a Boolean ring with unit when the product is $\cap$ and the addition is as in Definition 1.2. Conversely, show that any Boolean ring with unit is isomorphic to the ring of subsets of a compact topological space which are simultaneously open and closed. [See [106, p. 41].]
2.3 Metric Spaces

A usual means of studying convergence on a set $X$ is to reduce the question to the case of convergence in $\mathbb{R}$. That can be done if one has at one's disposal functions from $X \times X$ to $\mathbb{R}$ that allow such a transfer. Metrics (also called distances) and semimetrics are such means.

2.3.1 General Facts About Metric Spaces

Definition 2.11 A semimetric on a set $X$ is a function $d : X \times X \to \mathbb{R}_+ := [0, +\infty[$ satisfying the properties:

(SM1) for all $x \in X$ one has $d(x, x) = 0$;
(SM2) for all $x, x' \in X$ one has $d(x, x') = d(x', x)$;
(SM3) for all $x, x', x'' \in X$ one has $d(x, x'') \le d(x, x') + d(x', x'')$.

A metric is a semimetric $d$ such that for all $x, x' \in X$, $d(x, x') = 0 \Rightarrow x = x'$. A pseudo-metric on $X$ is a function $d : X \times X \to \overline{\mathbb{R}}_+ := [0, +\infty]$ satisfying (SM1)–(SM3). The relation in (SM3) is called the triangle inequality. A metric space (resp. semimetric space) is a pair $(X, d)$ formed by a set $X$ and a metric (resp. semimetric) $d$ on $X$. When there is no ambiguity on $d$, we just write $X$. A uniform space is a pair $(X, (d_a)_{a \in A})$, where $(d_a)_{a \in A}$ is a family of pseudo-metrics on $X$.

In a metric space $(X, d)$ (resp. a uniform space $(X, (d_a)_{a \in A})$) one can introduce a convergence by setting
\[ (x_i)_{i \in I} \to x \iff (d(x_i, x))_{i \in I} \to 0 \]
\[ \text{(resp. } (x_i)_{i \in I} \to x \iff \forall a \in A \quad (d_a(x_i, x))_{i \in I} \to 0 \text{).} \]

Exercise Verify the axioms (C1), (C2), (C3) of convergence spaces (Definition 2.2).

In fact, a semimetric $d$ induces a topology $\mathcal{O}$ on $X$ defined by: $G \in \mathcal{O}$ iff for all $x \in G$ there exists some $r > 0$ such that the open ball $B(x, r) := \{ x' \in X : d(x, x') < r \}$ is contained in $G$. Thus $\mathcal{O}$ is the topology generated by the family of open balls. This family is a base of $\mathcal{O}$ and for all $x \in X$, the family of open balls centered at $x$ is a base of neighborhoods of $x$. In the sequel, the closed ball with center $x$ and radius $r \in \mathbb{R}_+$ is the set
\[ B[x, r] := \{ x' \in X : d(x, x') \le r \}. \]
The family of closed balls centered at $x$ with positive radius is also a base of neighborhoods of $x$. A topology can also be associated with a uniform space $(X, (d_a)_{a \in A})$: it is the topology generated by the balls $B_a(x, r) := d_a(x, \cdot)^{-1}([0, r[)$ for $a \in A$, $x \in X$, $r > 0$. It is easy to show that the convergence on $(X, d)$ or $(X, (d_a)_{a \in A})$ described above is the convergence associated with the topology we just defined. Moreover, when $d$ is a metric, the associated topology is Hausdorff: given $x, x' \in X$ such that $x \ne x'$, for $r \in \, ]0, d(x, x')/2[$ the balls $B(x, r)$ and $B(x', r)$ are disjoint in view of the triangle inequality. Hence convergent nets or sequences have a unique limit by Proposition 2.1. The existence of a metric $d$ on $X$ implies
a noticeable property of accumulation points. We propose it in the next exercise. A point $a$ is called an accumulation point of a subset $S$ of a topological space $X$ if every neighborhood $V$ of $a$ contains some point $x \in S$, $x \ne a$.

Exercise Show that every neighborhood $V$ of an accumulation point $a$ of a subset $S$ of a metric space $(X, d)$ contains an infinite family of points of $S$. [Hint: by induction define a sequence $(x_n)$ of $S \setminus \{a\}$ such that $d(x_{n+1}, a) < d(x_n, a)$.]

Given a subset $S$ of a metric space $(X, d)$ the diameter of $S$ is $\mathrm{diam}(S) := \sup\{ d(x, y) : x, y \in S \}$ and the distance to $S$ is the function $d_S : X \to \mathbb{R}_+$ given by
\[ d_S(x) := \inf\{ d(x, y) : y \in S \}, \qquad x \in X. \]
This notion is often convenient. If $A$, $B$ are two subsets of $X$, their gap is
\[ \mathrm{gap}(A, B) := \inf\{ d(a, b) : a \in A, \ b \in B \}. \]
This number is sometimes called the distance between $A$ and $B$ but this terminology is improper as $(A, B) \mapsto \mathrm{gap}(A, B)$ is not a metric on the set $\mathcal{P}(X)$ of subsets of $X$ or on the set of closed subsets of $X$.

In metric spaces continuity can be expressed with the help of $\varepsilon$'s and $\delta$'s: a map $f : (X, d) \to (X', d')$ is continuous at $\bar{x} \in X$ if and only if for all $\varepsilon > 0$ one can find some $\delta > 0$ such that $d'(f(x), f(\bar{x})) < \varepsilon$ for all $x \in X$ satisfying $d(x, \bar{x}) < \delta$. In metric spaces, one can avoid nets and just use sequences in a convenient way:

Proposition 2.19 For a map $f : (X, d) \to (X', d')$ between two metric spaces and $\bar{x} \in X$ the following assertions are equivalent:

(a) $f$ is continuous at $\bar{x} \in X$;
(b) for every sequence $(x_n) \to \bar{x}$ one has $(f(x_n)) \to f(\bar{x})$;
(c) for every sequence $(x_n) \to \bar{x}$ one can find a subsequence $(x_{k(n)})$ such that $(f(x_{k(n)})) \to f(\bar{x})$.

Proof (a)⇒(b) Sequences being particular nets, the implication stems from Proposition 2.2. (b)⇒(c) being obvious, it remains to show that (c)⇒(a).

(c)⇒(a) If $f$ is not continuous at $\bar{x}$ there exists some $\varepsilon > 0$ such that for all $\delta > 0$ one has $f(B(\bar{x}, \delta)) \not\subset B(f(\bar{x}), \varepsilon)$. Taking a sequence $(\delta_n) \to 0_+$ we can find some $x_n \in B(\bar{x}, \delta_n)$ such that $f(x_n) \notin B(f(\bar{x}), \varepsilon)$. Then we have $(x_n) \to \bar{x}$ but for any subsequence $(x_{k(n)})$ the sequence $(f(x_{k(n)}))$ does not converge to $f(\bar{x})$. □

Proposition 2.20 In a metric space $(X, d)$ the closure $\mathrm{cl}(S)$ of a subset $S$ is the set of limits of sequences in $S$.

Proof If $x$ is the limit of a sequence $(x_n)$ of $S$, then clearly $x \in \mathrm{cl}(S)$. Conversely, if $x \in \mathrm{cl}(S)$, for any $n \ge 1$ the set $S \cap B(x, 1/n)$ is nonempty. Picking $x_n \in S \cap B(x, 1/n)$ we get a sequence $(x_n)$ converging to $x$. □
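The diameter, the distance to a set and the gap introduced in this subsection are easy to compute for finite subsets of the Euclidean plane. The sketch below is only an illustration under stated assumptions: the function names `diameter`, `distance_to_set`, `gap` and the three sample sets `A`, `B`, `C` are hypothetical and do not come from the text. It exhibits the two reasons why the gap is not a metric: distinct sets can have gap $0$, and the triangle inequality can fail.

```python
from itertools import product
from math import dist  # Euclidean distance between two points (Python 3.8+)


def diameter(S):
    """diam(S) = sup of d(x, y) over x, y in S (a maximum for finite S)."""
    return max(dist(x, y) for x, y in product(S, S))


def distance_to_set(x, S):
    """d_S(x) = inf of d(x, y) over y in S (a minimum for finite S)."""
    return min(dist(x, y) for y in S)


def gap(A, B):
    """gap(A, B) = inf of d(a, b) over a in A, b in B."""
    return min(dist(a, b) for a, b in product(A, B))


# Hypothetical finite subsets of the plane: A and B share a point, B and C too.
A = [(0.0, 0.0), (1.0, 0.0)]
B = [(1.0, 0.0), (5.0, 0.0)]
C = [(5.0, 0.0), (9.0, 0.0)]

print(diameter(A))                      # largest distance inside A
print(distance_to_set((3.0, 0.0), A))   # distance from (3, 0) to A
print(gap(A, B), gap(B, C), gap(A, C))  # gap(A, C) exceeds gap(A, B) + gap(B, C)
```

With these sets one has $\mathrm{gap}(A, B) = \mathrm{gap}(B, C) = 0$ although $A \ne B$, while $\mathrm{gap}(A, C) = 4$.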
Proposition 2.21 In a metric space $(X, d)$ any cluster point of a sequence in $X$ is the limit of a subsequence.

Proof Let $x$ be a cluster point of a sequence $(x_n)$ of $X$. Given a sequence $(r_n) \to 0_+$, an induction on $n$ gives a sequence $(k(n))$ of $\mathbb{N}$ such that $k(n + 1) > k(n)$ and $x_{k(n)} \in B(x, r_n)$ for all $n$. Then $(x_{k(n)})_n$ is a subsequence of $(x_n)$ that converges to $x$. □

A map $f : (X, d) \to (X', d')$ is said to be uniformly continuous if for all $\varepsilon > 0$ one can find some $\delta > 0$ such that $d'(f(w), f(x)) < \varepsilon$ for all $w, x \in X$ satisfying $d(w, x) < \delta$. Such a map is clearly continuous at each $x \in X$ and one sees that $\delta$ does not depend on $x$. If there exists a modulus $\mu$, i.e. a function $\mu : \mathbb{R}_+ \to \overline{\mathbb{R}}_+ := [0, +\infty]$ satisfying $\mu(t) \to 0$ as $t \to 0$, such that $d'(f(w), f(x)) \le \mu(d(w, x))$ for all $w, x \in X$, then $f$ is uniformly continuous. The converse is true and we invite the reader to produce a modulus $\mu$ satisfying the preceding property (and even the least such modulus $\mu_f$, called the modulus of uniform continuity of $f$). The case when $\mu$ is linear deserves some attention, but there are other cases of interest, for example the case $\mu(t) = c t^{\alpha}$ with $c \in \mathbb{R}_+$, $\alpha > 0$ (then $f$ is said to be Hölderian).

A map $f : (X, d) \to (X', d')$ is said to be Lipschitzian if there exists some $c \in \mathbb{R}_+$ such that $d'(f(x_1), f(x_2)) \le c\, d(x_1, x_2)$ for all $x_1, x_2 \in X$. The constant $c$ is called a Lipschitz constant (or rate, or rank). The least such constant is called the (exact) Lipschitz rate of $f$. If this rate is at most $1$, $f$ is said to be nonexpansive. If this rate is less than $1$ one says that $f$ is contractive or a contraction. If $f$ is a bijection and if for all $x_1, x_2 \in X$ one has $d'(f(x_1), f(x_2)) = d(x_1, x_2)$ one says that $f$ is an isometry; then $f^{-1}$ is also an isometry. For $x \in X$ the Lipschitz rate of $f$ at $x$ is the infimum of the Lipschitz rates of the restrictions of $f$ to the neighborhoods of $x$ (and $+\infty$ if there is no neighborhood of $x$ on which $f$ is Lipschitzian). If for all $x \in X$ there is a neighborhood $V$ of $x$ such that the restriction $f|_V$ is Lipschitzian, $f$ is said to be locally Lipschitzian.

If $(X, (d_a)_{a \in A})$ and $(Y, (d_b)_{b \in B})$ are two uniform spaces, a map $f : X \to Y$ is said to be uniformly continuous if for all $b \in B$ and all $\varepsilon \in \mathbb{P}$ one can find a finite subset $A(b, \varepsilon)$ of $A$ and $\delta > 0$ such that $d_b(f(x), f(x')) \le \varepsilon$ whenever $x, x' \in X$ satisfy $d_a(x, x') \le \delta$ for all $a \in A(b, \varepsilon)$.

Exercise Show that the composition of two uniformly continuous maps is uniformly continuous.

Different equivalence relations can be defined on the set of metrics on a set $X$. Two metrics $d$, $d'$ are said to be topologically equivalent if the topologies they define coincide. They are said to be uniformly equivalent (resp. metrically equivalent) if the identity map from $(X, d)$ into $(X, d')$ and its inverse are uniformly continuous (resp. Lipschitzian).

Proposition 2.22 A metric space $(X, d)$ is separable if and only if its topology has a countable base.

Proof If the topology of $(X, d)$ has a countable base $\mathcal{B} := \{ B_n : n \in \mathbb{N} \}$, picking an arbitrary point $a_n \in B_n$ for all $n$ we get a dense subset $A := \{ a_n : n \in \mathbb{N} \}$ since
for all $x \in X$ and any open subset $U$ containing $x$ there exists some $n \in \mathbb{N}$ such that $B_n \subset U$. Conversely, suppose $X$ contains a countable dense subset $A := \{ a_n : n \in \mathbb{N} \}$. Then we claim that the family $\mathcal{B} := \{ B(a_n, q) : n \in \mathbb{N}, \ q \in \mathbb{Q}, \ q > 0 \}$ is a base of the topology of $(X, d)$. In fact, given $\bar{x} \in X$ and $r > 0$ we can find $q \in \mathbb{Q}$ such that $0 < q < r/2$ and if $n \in \mathbb{N}$ is such that $a_n \in B(\bar{x}, q)$, then $\bar{x} \in B(a_n, q)$ and for all $x \in B(a_n, q)$ we have $x \in B(\bar{x}, r)$ by the triangle inequality. □

Example The set $\mathbb{R}$ of real numbers endowed with its usual metric has a countable base since the set $\mathbb{Q}$ of rational numbers is dense in $\mathbb{R}$.

Corollary 2.13 A subspace of a separable metric space is separable.

Proof This follows from the proposition and from the fact that if $\mathcal{B}$ is a countable base of $(X, d)$, then for a subspace $Y$, the family $\mathcal{B}_Y := \{ B \cap Y : B \in \mathcal{B} \}$ is a base for the induced topology on $Y$. □

On the product $Z := X \times Y$ of two metric spaces $(X, d_X)$, $(Y, d_Y)$, a metric $d$ is called a product metric if the canonical projections $p_X : Z \to X$, $p_Y : Z \to Y$ and the insertions $j_b : x \mapsto (x, b)$, $j_a : y \mapsto (a, y)$ are nonexpansive.

Exercise Show that a metric $d$ on $Z := X \times Y$ is a product metric if and only if for all $(u, v), (x, y) \in X \times Y$ one has
\[ \max(d_X(u, x), d_Y(v, y)) \le d((u, v), (x, y)) \le d_X(u, x) + d_Y(v, y). \]
The left (resp. right) side defines a convenient metric usually denoted by $d_\infty$ (resp. $d_1$). Describe its balls.

Whereas the product $X$ of a family of metric spaces $(X_s, d_s)$ ($s \in S$, an arbitrary set) cannot be provided with a (sensible, i.e. inducing the product topology) metric in general, we have seen that a product of topological spaces $(X_s, \mathcal{O}_s)$ ($s \in S$) can always be endowed with a topology $\mathcal{O}$ that makes the projections $p_s : X \to X_s$ continuous and that is as weak as possible, namely it is the topology generated by the sets $p_s^{-1}(O_s)$ for $s \in S$, $O_s \in \mathcal{O}_s$. Its associated convergence is componentwise convergence. When $(X_s, \mathcal{O}_s) := (Y, \mathcal{O}_Y)$ for all $s \in S$, identifying the product $X$ with the set $Y^S$ of maps from $S$ to $Y$, the convergence associated with the product topology $\mathcal{O}$ on $X$ coincides with pointwise convergence: $(f_i)_{i \in I} \to f$ in $Y^S$ if, for all $s \in S$, $(f_i(s))_{i \in I} \to f(s)$. When $\mathcal{O}_Y$ is the topology associated with a metric $d_Y$ on $Y$, a stronger convergence can be defined on $Y^S$: it is the so-called uniform convergence, for which $(f_i)_{i \in I} \to f$ iff $(d_\infty(f_i, f)) := (\sup_{s \in S} d_Y(f_i(s), f(s))) \to 0$. This convergence is adapted to bounded functions, but it can be considered for any set of maps from a set $S$ into a metric space $Y$. It enjoys better preservation properties, such as the following one.

Theorem 2.7 Let $X$ be a topological space and let $(Y, d)$ be a metric space. Let $(f_i)_{i \in I}$ be a net (or a sequence) of continuous functions from $X$ into $Y$ that converges uniformly to some map $f : X \to Y$. Then $f$ is continuous.
Proof Let $\bar{x} \in X$ and let $\varepsilon > 0$ be given. We can find $k \in I$ such that for $i \ge k$ we have $\sup_{x \in X} d(f_i(x), f(x)) \le \varepsilon/3$. Since $f_k$ is continuous there exists a neighborhood $V$ of $\bar{x}$ such that $d(f_k(x), f_k(\bar{x})) \le \varepsilon/3$ for all $x \in V$. Then, for $x \in V$ we have
\[ d(f(x), f(\bar{x})) \le d(f(x), f_k(x)) + d(f_k(x), f_k(\bar{x})) + d(f_k(\bar{x}), f(\bar{x})) \le \varepsilon. \]
This proves that $f$ is continuous at $\bar{x}$. □

Some constructions can be obtained by using metrics.

Lemma 2.3 (Urysohn's Lemma) Let $A$ and $B$ be two disjoint nonempty closed subsets of a metric space $(X, d)$. Then there exists a continuous function $h : X \to [-1, 1]$ such that $h(x) = -1$ for all $x \in A$ and $h(x) = 1$ for all $x \in B$. If $C$ is a closed subset of $X$ and if $f : C \to [-1, 1]$ is a continuous function, there exists a continuous function $g : X \to [-1/3, 1/3]$ such that $|f(x) - g(x)| \le 2/3$ for all $x \in C$.

Proof Since $A$ and $B$ are disjoint and closed, for all $x \in X$ we have $g(x) := d(x, A) + d(x, B) > 0$. Setting $h(x) := (d(x, A) - d(x, B))/g(x)$ we obtain the required function.

Let us set $A := \{ x \in C : f(x) \le -1/3 \}$ and $B := \{ x \in C : f(x) \ge 1/3 \}$. If $A$ is empty we take $g := 1/3$; if $B$ is empty we take $g := -1/3$. If $A$ and $B$ are both nonempty we take $g := (1/3) h$ where $h$ is as in the first assertion. We obtain the relation $|f(x) - g(x)| \le 2/3$ for all $x \in C$ by considering the three cases $x \in A$, $x \in B$, $x \in C \setminus (A \cup B)$. □

Theorem 2.8 (Tietze-Urysohn) Let $C$ be a closed subset of a metric space $(X, d)$ and let $f : C \to \mathbb{R}$ be a bounded continuous function. Then there exists a continuous function $g : X \to \mathbb{R}$ such that $g|_C = f$, $\inf g(X) = \inf f(C)$, and $\sup g(X) = \sup f(C)$.

Proof The result is obvious when $f$ is constant. Changing $f$ into $af + b$, where $a$ and $b$ are appropriate real numbers, we may assume $\inf f(C) = -1$ and $\sup f(C) = 1$. The second assertion of the lemma yields a continuous function $g_0$ with values in $[-1/3, 1/3]$ such that $|f(x) - g_0(x)| \le 2/3$ for all $x \in C$. Let us suppose that inductively we have defined for $n \in \mathbb{N}_k := \{1, \ldots, k\}$ a continuous function $g_n$ such that
\[ |g_n| \le 1 - \left(\frac{2}{3}\right)^{n+1}, \qquad |g_n|_C - f| \le \left(\frac{2}{3}\right)^{n+1}. \tag{2.1} \]
Applying the second assertion of the lemma to the function $(3/2)^{k+1} (g_k|_C - f)$, we get a continuous function $h_{k+1}$ with values in $[-2^{k+1}/3^{k+2}, 2^{k+1}/3^{k+2}]$ such that for all $x \in C$
\[ |f(x) - g_k(x) - h_{k+1}(x)| \le \left(\frac{2}{3}\right)^{k+2}. \tag{2.2} \]
Setting $g_{k+1} := g_k + h_{k+1}$ we obtain the two inequalities of relation (2.1) for $n = k + 1$. Since for $x \in X$ we have $|g_{k+1}(x) - g_k(x)| \le 2^{k+1}/3^{k+2}$ for all $k \in \mathbb{N}$, the sequence $(g_k)$ converges uniformly on $X$ to a function $g$ that is continuous by Theorem 2.7. Moreover, for $x \in C$, passing to the limit in relation (2.2) we get $g(x) = f(x)$. □

Metrics can be used in order to tackle optimization problems with constraints: a natural idea consists in introducing some penalty terms, replacing the objective $f$ by a penalized objective. If one has to minimize $f$ on an admissible subset $A$ of a metric space $(X, d)$ one may consider the minimization of $f_s := f + s\, d_A(\cdot)$ on the whole space $X$, expecting that for a large penalization rate $s$ the effect will be similar. Here $d_A(x) := d(x, A) := \inf\{ d(x, w) : w \in A \}$ for $x \in X$. In general one has to replace $s$ by an infinite sequence $(s_n) \to \infty$, so that one has to solve a sequence of unconstrained problems. In some favorable cases a single penalized problem suffices, as in the simple situation presented in the following result.

Proposition 2.23 (Exact Penalization) Let $X$ be a metric space and let $f : X \to \mathbb{R}$ be a Lipschitzian function with rate $r$. Then, for any $s \ge r$ and any nonempty subset $A$ of $X$
\[ \inf_{x \in A} f(x) = \inf_{x \in X} \big( f(x) + s\, d_A(x) \big). \tag{2.3} \]
Moreover, $\bar{x} \in A$ is a minimizer of $f$ on $A$ if and only if $\bar{x}$ is a minimizer of $f_s := f + s\, d_A$ on $X$. If $A$ is closed and if $s > r$, any minimizer $z$ of $f_s$ belongs to $A$.

A local version can be given by replacing $X$ with a neighborhood of $\bar{x}$.

Proof Since $f_s = f$ on $A$, we have $m := \inf f(A) \ge \inf f_s(X)$. If we had strict inequality we could find $x \in X$ such that $f_s(x) < m$. Then we would have $s\, d_A(x) < m - f(x)$, so that we could pick $x' \in A$ such that $s\, d(x, x') < m - f(x)$. Since $f$ is Lipschitzian with rate $r \le s$ we would get $f(x') \le f(x) + s\, d(x, x') < m$, a contradiction. The second assertion follows from (2.3).

Suppose now that $A$ is closed and that for some $s > r$ a minimizer $z$ of $f_s$ does not belong to $A$. Then $d_A(z)$ is positive, so that, by the relations $\inf f(A) = \inf f_s(X) = f(z) + s\, d_A(z)$, $r\, d_A(z) < s\, d_A(z) = \inf f(A) - f(z)$, one can find $a \in A$ such that $r\, d(a, z) < \inf f(A) - f(z)$, contradicting the relations $\inf f(A) = \inf f_r(X) \le f(z) + r\, d(a, z)$. □

When the admissible set $A$ is defined by equalities or inequalities, it is sensible to take these relations into account. In the case when the admissible set $A$ is defined as $A := g^{-1}(C)$, where $g : X \to W$ is a map with values in another metric space $(W, d')$ and $C$ is a closed subset of $W$ such that for some $c \in \mathbb{P}$ one has
\[ d_A(x) \le c\, d'(g(x), C) \]
for all $x \in X$, one can replace $f_s$ with $f + sc\, d'(g(\cdot), C)$. Since $d_C$ is often easier to compute than the distance to the implicitly defined set $A := g^{-1}(C)$, as is the case when $W := \mathbb{R}^m$, $C := \mathbb{R}^m_-$, such a penalized problem is often more tractable.
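The exact penalization result of Proposition 2.23 can be checked on a toy problem. The following sketch is only an illustration, under assumptions that are not in the text: $X = \mathbb{R}$ is sampled by a finite grid `GRID` of $[-3, 3]$, the admissible set is $A = [0, 1]$, and $f(x) = x$, which is Lipschitzian with rate $r = 1$, so that $\inf f(A) = 0$. The helper names `dist_to_interval` and `min_penalized` are hypothetical.

```python
def dist_to_interval(x, lo=0.0, hi=1.0):
    """Distance d_A(x) from x to the interval A = [lo, hi] of the real line."""
    return max(lo - x, x - hi, 0.0)


def f(x):
    return x  # Lipschitzian with rate r = 1; its infimum on A = [0, 1] is 0


def min_penalized(s, grid):
    """Return (minimal value of f_s = f + s * d_A on the grid, a point attaining it)."""
    return min((f(x) + s * dist_to_interval(x), x) for x in grid)


GRID = [i / 100 for i in range(-300, 301)]  # finite sample of [-3, 3]

for s in (0.5, 1.0, 2.0):
    value, point = min_penalized(s, GRID)
    print(f"s = {s}: min f_s = {value} attained at x = {point}")
```

For $s = 2 > r$ the penalized minimum equals $\inf f(A) = 0$ and is attained only at $0 \in A$. For $s = r = 1$ the value is still $0$, but it is also attained outside $A$, which is why the last assertion of the proposition requires $s > r$. For $s = 0.5 < r$ the penalized minimum on the grid drops below $\inf f(A)$.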
Exercises

1. A function $h : \mathbb{R}_+ \to \mathbb{R}_+$ is said to be subadditive if it satisfies $h(r + s) \le h(r) + h(s)$ for all $r, s \in \mathbb{R}_+$. Let $H$ be the set of subadditive functions $h : \mathbb{R}_+ \to \mathbb{R}_+$ satisfying $h(0) = 0$ and $h(r) > 0$ for all $r > 0$. Verify that for $h, k \in H$ one has $h + k \in H$, $h \vee k \in H$, $h \circ k \in H$ and that $H$ is stable under pointwise limits and suprema. Prove that if $h : \mathbb{R}_+ \to \mathbb{R}_+$ is concave, increasing and such that $h(0) = 0$, then $h \in H$. Show that for every metric $d$ on a set $X$, the function $h \circ d$ is a metric on $X$. Using the functions $r \mapsto r/(1 + r)$ and $r \mapsto \min(r, 1)$ show that any metric $d$ on $X$ is uniformly equivalent to a bounded metric.
2. Give examples of disjoint closed subsets $A$, $B$ of a metric space $(X, d)$ such that $\mathrm{gap}(A, B) = 0$. Show that the triangle inequality $\mathrm{gap}(A, C) \le \mathrm{gap}(A, B) + \mathrm{gap}(B, C)$ is not valid.
3. (Hausdorff-Pompeiu metric) Let $(X, d)$ be a metric space and let $\mathcal{B}_0$ be the set of nonempty bounded subsets of $X$. For $A, B \in \mathcal{B}_0$ let $e(A, B)$ be the excess of $A$ over $B$ defined by $e(A, B) := \sup\{ d(a, B) : a \in A \}$. Verify that $e(A, B) = 0$ if and only if $A \subset \mathrm{cl}(B)$. Verify that $d_H : (A, B) \mapsto \max(e(A, B), e(B, A))$ is a semimetric on $\mathcal{B}_0$ and a metric on the set $\mathcal{F}_b$ of nonempty closed bounded subsets of $X$.
4. Let $(X, d)$ be a metric space and let $(Y, \mathcal{O}_Y)$ be a topological space. Show that a map $f : X \to Y$ is continuous at $\bar{x} \in X$ if and only if for any sequence $(x_n) \to \bar{x}$ one has $(f(x_n)) \to f(\bar{x})$.
5. Prove that if $f : \mathbb{R}^d \to \mathbb{R}$ is uniformly continuous, then there exist $a, b \in \mathbb{R}_+$ such that $f(x) \le a\, d(x, 0) + b$ for all $x \in \mathbb{R}^d$.
6. Let $S$ be a subset of a metric space $(X, d)$. For $r > 0$ let $B(S, r) := \{ x \in X : d(x, S) < r \}$. Verify that $B(S, r)$ is the union over $x \in S$ of the balls $B(x, r)$ and that $\bigcap_{r > 0} B(S, r) = \mathrm{cl}(S)$. Examine whether similar conclusions hold for $B[S, r] := \{ x \in X : d(x, S) \le r \}$.
7. Let $(M, d)$ be a metric space in which closed balls are compact. Suppose $M$ is arcwise connected. Show that any pair of points $x_0$, $x_1$ in $M$ can be joined by a curve with least length (a so-called geodesic) [see [75]]. Identify such a curve when $M$ is the unit sphere $S^{d-1}$ of $\mathbb{R}^d$ and when $M$ is a circular cylinder in $\mathbb{R}^3$. Such curves have prompted the development of differential geometry.
8. Given a metric space $(M, d)$, let $X$ be the set of subsets of $M$. For a net $(S_i)_{i \in I}$ in $X$ define
\[ \liminf_{i \in I} S_i := \{ x \in M : \lim_{i \in I} d(x, S_i) = 0 \}, \qquad \limsup_{i \in I} S_i := \{ x \in M : \liminf_{i \in I} d(x, S_i) = 0 \} \]
and write $(S_i)_{i \in I} \to S$ if $\liminf_{i \in I} S_i = \limsup_{i \in I} S_i = S$. Verify the axioms of convergence spaces.
9. Devise another proof of the Tietze-Urysohn extension theorem that gives an explicit expression for the extension $g$ of $f$ in the case $\inf f(C) = 1$, $\sup f(C) = 2$:
\[ g(x) := \frac{1}{d(x, C)} \inf_{w \in C} f(w)\, d(w, x), \qquad x \in X \setminus C, \]
and $g(x) = f(x)$ for $x \in C$. Show that $g$ is well defined on $X$, satisfies $\inf g(X) = \inf f(C)$, $\sup g(X) = \sup f(C)$, and is continuous. [Hint: note that $x \mapsto \inf_{w \in C} f(w)\, d(w, x)$ is Lipschitzian with rate $2$ and that for $x \in X \setminus C$ and $D(x) := C \cap B(x, 2 d(x, C))$ one has $g(x) = (1/d(x, C)) \inf_{w \in D(x)} f(w)\, d(w, x)$.]
10. Prove that any continuous function $f$ on a metric space $(X, d)$ can be uniformly approximated by a locally Lipschitz function [see [133]].
11. Verify the following counterexample showing that a continuous function $f$ on a metric space $(X, d)$ cannot always be uniformly approximated by a Lipschitz function: take $X := \mathbb{R}$ with its usual distance and $f$ continuous such that $f(n) = 0$ for $n \in \mathbb{N}$, $f(n + r_n) = 1$ for $n \in \mathbb{N}$, where $(r_n)$ is a sequence in $]0, 1[$ with limit $0$. This counterexample and the preceding reference have been communicated to the author by G. Beer.
12. Let $(X, d)$ be a metric space, let $s \in X$ and for a subset $T$ of $X$ let $V_s(T) := \{ x \in X : d(x, s) \le d(x, t) \ \forall t \in T \}$ be the Voronoi cell associated with $s$ and $T$ (with $V_s(\emptyset) := X$ and $f := d$). Verify that $s \in \mathrm{int}\, V_s(T)$ if and only if $s \in X \setminus \mathrm{cl}(T)$.
13. Let $(X, d)$ be a metric space such that for all $w, x \in X$ and $r > 0$ such that $r < d(w, x)$ one has $d(x, B[w, r]) < d(x, w)$. Using the notation of the preceding exercise, show that when $s \in X \setminus \mathrm{cl}(T)$ one has $V_s(T) = V_s(\mathrm{bdry}(T))$. When $s \in \mathrm{cl}(T)$ show that the relation $V_s(T) = V_s(\mathrm{bdry}(T))$ may or may not hold. [Hint: for $X := \mathbb{R}^2$, $s = (0, 0)$, $T = \mathbb{R}^2_+$ one has $V_s(T) = V_s(\mathrm{bdry}(T)) = \mathbb{R}^2_-$ whereas for $T := \mathbb{R} \times \mathbb{R}_+$ one has $V_s(T) = \{0\} \times \mathbb{R}_-$, $V_s(\mathrm{bdry}(T)) = \{0\} \times \mathbb{R}$.] Show that the assumption on $(X, d)$ is satisfied whenever for any $w, x \in X$ with $w \ne x$ there exists a connected set $S$ containing $w$ and $x$ such that $d(w, z) + d(z, x) = d(w, x)$ for all $z \in S$.
14. (McShane, Whitney, 1934) Let $W$ be a nonempty subset of a metric space $(X, d)$ and let $f : W \to \mathbb{R}$ be a Lipschitz function with rate $r$. Show that the functions $f^\flat, f^\sharp : X \to \mathbb{R}$ defined by
\[ f^\flat(x) := \sup_{w \in W} \big( f(w) - r\, d(w, x) \big), \qquad f^\sharp(x) := \inf_{w \in W} \big( f(w) + r\, d(w, x) \big) \]
are Lipschitzian with rate $r$ and extend $f$. Prove that any Lipschitzian extension $g$ of $f$ with rate $r$ satisfies $f^\flat \le g \le f^\sharp$.
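The McShane-Whitney extensions of the last exercise can be computed directly when $W$ is finite. The sketch below is only an illustration under stated assumptions: it takes the smallest extension to be $x \mapsto \sup_{w \in W} (f(w) - r\, d(w, x))$ and the largest to be $x \mapsto \inf_{w \in W} (f(w) + r\, d(w, x))$, and the names `mcshane_extensions`, `lower`, `upper` as well as the sample data on the real line are hypothetical.

```python
def mcshane_extensions(points, values, r, d):
    """Return the smallest and largest rate-r Lipschitzian extensions of the
    function taking values[i] at points[i], for a finite set of points."""

    def lower(x):
        # sup over w of f(w) - r * d(w, x)
        return max(fw - r * d(w, x) for w, fw in zip(points, values))

    def upper(x):
        # inf over w of f(w) + r * d(w, x)
        return min(fw + r * d(w, x) for w, fw in zip(points, values))

    return lower, upper


# Hypothetical data on the real line: f(0) = 0, f(1) = 1, f(3) = 0, rate r = 1.
W = [0.0, 1.0, 3.0]
F = [0.0, 1.0, 0.0]
lower, upper = mcshane_extensions(W, F, 1.0, lambda u, v: abs(u - v))

for x in (0.0, 1.0, 2.0, 3.0):
    print(x, lower(x), upper(x))
```

Both functions agree with the data on $W$, and at $x = 2$ they give the bounds $0$ and $1$ between which every extension with rate $1$ must lie.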
2.3.2 Complete Metric Spaces The structure of metric space is richer than the structure of topological space. In particular, one disposes of the notion of a Cauchy sequence: a sequence .xn / of .X; d/ is called a Cauchy sequence if .d.xn ; xp // ! 0 as n; p ! C1. A metric space is said to be complete if its Cauchy sequences are convergent. The interest of such a notion is the fact that one can assert the convergence of such a sequence without knowing its limit. The following result is worth noting; its proof is immediate. Proposition 2.24 If a Cauchy sequence .xn / in a metric space .X; d/ has a converging subsequence, then .xn / is converging. It is often convenient to replace Cauchy sequences with more special sequences. We call a sequence .xn / in a metric space .X; d/ an Abel sequence if there exists some c 2 RC and some r 20; 1Œ such that d.xn ; xnC1 / crn for all n 2 N. We observe that Abel sequences can be a substitute to Cauchy sequences in view of the following lemma, the easy proof of which is left to the reader. Lemma 2.4 Any Abel sequence in a metric space is a Cauchy sequence. Any Cauchy sequence has a subsequence that is an Abel sequence. Corollary 2.14 A metric space .X; d/ is complete if and only if any Abel sequence in X is convergent. Let us give some permanence properties. Proposition 2.25 If .X; dX / is a subspace of a metric space .W; dW / with the induced metric and if .X; dX / is complete, then X is closed in W. Conversely, any closed subset X of a complete metric space .W; dW / is complete with respect to the induced metric. Proof Let us show that any point w in the closure cl.X/ of X belongs to X. Proposition 2.20 ensures that some sequence .xn / of X converges to w. Such a sequence is a Cauchy sequence, hence has a limit x 2 X. Uniqueness of limits in metric spaces implies that w D x 2 X. For the converse, let X be a closed subset of a complete metric space .W; dW /. 
Since a Cauchy sequence in $X$ with respect to the induced metric $d_X$ is a Cauchy sequence in $(W, d_W)$, it converges in $W$, and in fact in $X$ since $X$ is closed. Thus $X$ is complete with respect to $d_X$. □

Other permanence properties concern function spaces. It is extremely useful to consider metric spaces formed with functions or maps.

Proposition 2.26 Let $S$ be a set and let $(M, d)$ be a metric space. On the set $B(S, M)$ of maps $f : S \to M$ that are bounded, i.e. such that $f(S)$ is bounded in $(M, d)$, one can
define a metric by setting, for $f, g \in B(S, M)$,
$$d_\infty(f, g) := \sup_{s \in S} d(f(s), g(s)).$$
If $(M, d)$ is complete, $(B(S, M), d_\infty)$ is complete.

Proof It is easy to see that $d_\infty$ is a metric on $B(S, M)$. Let $(f_n)_n$ be a Cauchy sequence in $(B(S, M), d_\infty)$. Since for every $x \in S$ the evaluation map $f \mapsto f(x)$ is nonexpansive, the sequence $(f_n(x))_n$ is a Cauchy sequence in $(M, d)$. When $(M, d)$ is complete, this sequence converges. Let $f(x)$ be its limit. If $\varepsilon \mapsto k(\varepsilon)$ is such that $d_\infty(f_m, f_n) \le \varepsilon$ for $n \ge m \ge k(\varepsilon)$, passing to the limit as $n \to \infty$ we see that $d(f_m(x), f(x)) \le \varepsilon$ for $m \ge k(\varepsilon)$ and all $x \in S$. Equivalently we have $d_\infty(f_m, f) \le \varepsilon$ for $m \ge k(\varepsilon)$, so that $(f_n) \to f$ in $(B(S, M), d_\infty)$ since $f$ is bounded as $\sup_{x, x' \in S} d(f(x), f(x')) \le \sup_{x, x' \in S} d(f_m(x), f_m(x')) + 2\varepsilon < +\infty$. □

When a sequence $(f_n)$ converges to $f$ for $d_\infty$ one says that $(f_n)$ converges uniformly to $f$. This property is stronger than pointwise convergence, which means that for all $x \in S$ one has $(f_n(x))_n \to f(x)$.

Proposition 2.27 Let $X$ be a topological space and let $(M, d)$ be a metric space. The subspace $C_b(X, M)$ of bounded continuous functions from $X$ to $M$ is closed in $B(X, M)$ with respect to the metric $d_\infty$. Thus, if $(M, d)$ is complete, $(C_b(X, M), d_\infty)$ is complete.

Proof We have to prove that the uniform limit $f$ of a sequence $(f_n)$ in $C_b(X, M)$ belongs to $C_b(X, M)$. We know from the preceding proof that $f$ is bounded. The continuity of $f$ is established in Theorem 2.7. □

Completeness can be used with great success for extension results.

Theorem 2.9 Let $(W, d_W)$ and $(Y, d_Y)$ be complete metric spaces and let $f : X \to Y$ be uniformly continuous, where $X$ is a dense subset of $W$. Then $f$ can be extended uniquely to a uniformly continuous map $\overline{f} : W \to Y$.

Proof Uniqueness of the extension stems from Proposition 2.4. Let us prove the existence of a uniformly continuous extension. Let $m : \mathbb{R}_+ \to \mathbb{R}_+$ be a modulus of uniform continuity of $f$ in the sense that $d_Y(f(x), f(x')) \le m(d_W(x, x'))$ for all $x, x' \in X$. Given $w \in W$, let $(x_n)$ be a sequence in $X$ with limit $w$. Since $(x_n)$ is a Cauchy sequence in $X$, $(f(x_n))$ is a Cauchy sequence in $(Y, d_Y)$. Since $(Y, d_Y)$ is complete, $(f(x_n))$ has a limit $y \in Y$. This limit does not depend on the choice of the sequence $(x_n)$ since given two such sequences $(x_n)$, $(x'_n)$ one has $d_Y(f(x_n), f(x'_n)) \le m(d_W(x_n, x'_n))$, so that, passing to the limit, we get $(f(x'_n)) \to y$. Denoting this limit by $\overline{f}(w)$ we define a map $\overline{f} : W \to Y$. For all $t > 1$ this map satisfies $d_Y(\overline{f}(w), \overline{f}(w')) \le m(t\, d_W(w, w'))$ since for any sequences $(x_n) \to w$, $(x'_n) \to w'$ in $X$ one has $d_W(x_n, x'_n) \le t\, d_W(w, w')$ for $n$ large enough, so that $d_Y(\overline{f}(w), \overline{f}(w')) = \lim_n d_Y(f(x_n), f(x'_n)) \le m(t\, d_W(w, w'))$. Let us observe that when $m$ is continuous (and not just continuous at 0), $m$ is also a modulus of uniform continuity of the map $\overline{f}$. □
The notion of a complete metric space enables us to present a powerful fixed point theorem. Here we deal with contraction maps, i.e. Lipschitzian maps whose Lipschitz rate is less than 1; the process used is called the method of successive approximations.

Theorem 2.10 (Contraction Theorem, Picard, Banach) Let $(X, d)$ be a complete metric space and let $f : X \to X$ be a contraction. Then $f$ has a fixed point $x$. Moreover, $x$ is unique and for every $x_0 \in X$ one has $(f^{(n)}(x_0)) \to x$ for $f^{(n)} := f \circ f^{(n-1)}$.

Proof Let $c \in [0, 1[$ be the Lipschitz rate of $f$. Uniqueness of the fixed point stems from the contraction property: if $x$ and $y$ satisfy $f(x) = x$, $f(y) = y$ we have $d(x, y) = d(f(x), f(y)) \le c\, d(x, y)$ and as $c \in [0, 1[$ this is possible only if $d(x, y) = 0$, i.e. $x = y$. Given $x_0 \in X$ let us define inductively $x_{n+1} = f(x_n)$ for $n \in \mathbb{N}$. Since for $n \ge 1$ we have $x_n = f(x_{n-1})$ and $x_{n+1} = f(x_n)$, the contraction property yields
$$d(x_{n+1}, x_n) \le c\, d(x_n, x_{n-1}).$$
By induction, this relation entails that $d(x_{n+1}, x_n) \le c^n d(x_1, x_0)$ for all $n \in \mathbb{N}$. The sequence $(x_n)$ is thus an Abel sequence (hence a Cauchy sequence), hence has a limit $x \in X$. Passing to the limit in the relation $d(f(x_n), x_n) \le c^n d(x_1, x_0)$ we get $d(f(x), x) = 0$, hence $f(x) = x$. □

Let us observe that the convergence of $(x_n)$ to $x$ is rapid enough since $d(x_{n+p}, x_n)$
$$\le \sum_{k=n+1}^{n+p} d(x_k, x_{k-1}) \le d(x_1, x_0) \sum_{k=n}^{\infty} c^k = \frac{d(x_1, x_0)}{1 - c}\, c^n \qquad (2.4)$$
and, passing to the limit as $p \to +\infty$, $d(x, x_n) \le (1 - c)^{-1} c^n d(f(x_0), x_0)$.

Corollary 2.15 Let $W$ be a topological space, let $(X, d)$ be a complete metric space and let $f : W \times X \to X$ be a continuous map such that for some $c \in [0, 1[$ and all $w \in W$ the partial map $f_w : x \mapsto f(w, x)$ is Lipschitzian with rate $c$. Then there exists a unique continuous map $g : W \to X$ such that $g(w) = f(w, g(w))$ for all $w \in W$.

Proof For $w \in W$, let $g(w)$ be the unique fixed point of $f_w$. We have to show that $g$ is continuous. Let $z \in W$ and given $\varepsilon > 0$ let $V \in \mathcal{N}(z)$ be such that $d(f(w, g(z)), f(z, g(z))) \le \varepsilon$ for all $w \in V$. The triangle inequality yields for $w \in V$
$$d(g(w), g(z)) = d(f_w(g(w)), f_z(g(z))) \le d(f_w(g(w)), f_w(g(z))) + d(f_w(g(z)), f_z(g(z))) \le c\, d(g(w), g(z)) + \varepsilon,$$
so that $d(g(w), g(z)) \le (1 - c)^{-1} \varepsilon$. This shows that $g$ is continuous at $z$.
□
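The method of successive approximations and the estimate (2.4) can be tried numerically. The following is only a minimal sketch under assumptions of ours: the space is $\mathbb{R}$ with the usual metric, the map `contraction` is $x \mapsto \frac{1}{2}\cos x$ (Lipschitzian with rate $c = 1/2$ by the mean value theorem), and the names `successive_approximations` and `a_priori_bound` are illustrative, not taken from the text.

```python
import math

def contraction(x):
    # Illustrative contraction on R: |f'(x)| = |sin(x)| / 2 <= 1/2.
    return 0.5 * math.cos(x)

def successive_approximations(f, x0, n_steps):
    """Return the list [x_0, x_1, ..., x_n] with x_{k+1} = f(x_k)."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(f(xs[-1]))
    return xs

def a_priori_bound(c, n, x0, x1):
    """Right-hand side of the estimate d(x, x_n) <= c^n / (1 - c) * d(x_1, x_0)."""
    return c ** n / (1.0 - c) * abs(x1 - x0)

c = 0.5
xs = successive_approximations(contraction, 3.0, 60)
x_star = xs[-1]  # numerical stand-in for the fixed point

for n in (0, 5, 10, 20):
    # actual distance to the fixed point versus the bound derived from (2.4)
    print(n, abs(xs[n] - x_star), a_priori_bound(c, n, xs[0], xs[1]))
```

The bound depends only on the first step $d(x_1, x_0)$ and on $c$, so the number of iterations needed for a prescribed accuracy can be estimated before iterating.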
A remarkable property of complete metric spaces is the Baire property.

Theorem 2.11 (Baire) Let $(X, d)$ be a complete metric space. If $(G_n)$ is a sequence of open dense subsets of $X$, then $\bigcap_n G_n$ is dense. If $X$ is the union of a countable family $(F_n)$ of closed subsets, then one of them has a nonempty interior.

Proof Let $(G_n)$ be a sequence of open dense subsets of $X$. Let us show that $G := \bigcap_n G_n$ is dense. Let $(s_n)$ be a sequence of positive numbers with limit 0. Given a nonempty open subset $U$ of $X$, the set $G_n \cap U$ is nonempty and open; in particular, $G_0 \cap U$ contains some closed ball $B[x_0, r_0]$ with $r_0 \in {]0, s_0]}$. Assume by induction that we have constructed open balls $B(x_k, r_k)$ with $r_k \le s_k$, $B[x_k, r_k] \subset B(x_{k-1}, r_{k-1}) \cap G_k$ for $k = 1, \dots, n$. Since $B(x_n, r_n)$ meets $G_{n+1}$, we can find a closed ball $B[x_{n+1}, r_{n+1}] \subset G_{n+1} \cap B(x_n, r_n)$ with $r_{n+1} \le s_{n+1}$. The sequence $(x_n)$ obtained in this way is a Cauchy sequence (since $d(x_{n+p}, x_n) \le s_n$ for all $n, p$). Its limit $x$ belongs to $B[x_m, r_m] \subset G_m$ for all $m \in \mathbb{N}$ and in particular $x \in B[x_0, r_0] \subset U$ and $x \in G$, so that $G \cap U$ is nonempty: $G$ is dense. Now suppose $X = \bigcup_n F_n$, where each $F_n$ is closed. Then $G_n := X \setminus F_n$ is open and if $F_n$ has an empty interior then $G_n$ is dense. If this happens for all $n \in \mathbb{N}$, then $\bigcap_n G_n$ is dense, an impossibility since $\bigcap_n G_n = \emptyset$. Thus, at least one $F_n$ has a nonempty interior. □

The preceding result can be expressed in terms of genericity. A subset $G$ of some topological space $T$ is generic if it contains the intersection of a countable family of open subsets of $T$ (a so-called $G_\delta$ set, the notation being a reminder of the German term “Gebiete”, while the notation $F_\sigma$ stems from the French “fermé” for closed) that are dense in $T$; other terminologies are that $G$ is residual or that the complement of $G$ is meager or a set of first category. It is convenient to say that a property involving a point is generic if it holds on a generic subset.
The main feature of this notion is that the intersection of a finite (or countable) family of generic subsets is still generic, a property that does not hold for dense subsets (consider the set of rational numbers and the set of irrational numbers in $\mathbb{R}$). The (equivalent) properties of Theorem 2.11 can be phrased as follows: in a complete metric space any generic subset is dense. A topological space satisfying this property is called a Baire space. Locally compact topological spaces are also Baire spaces.

Complete metric spaces can be characterized by an approximate minimization principle that is extremely useful. Here, given $\varepsilon > 0$, we say that a point $\bar{x}$ of a set $X$ is an $\varepsilon$-minimizer of a function $f : X \to \mathbb{R}_\infty$ if $f(\bar{x}) \le \inf f(X) + \varepsilon$.

Theorem 2.12 (Ekeland) Let $(X, d)$ be a complete metric space and let $f : X \to \mathbb{R}_\infty$ be a bounded below lower semicontinuous function taking at least one finite value. Given $\varepsilon > 0$, an $\varepsilon$-minimizer $\bar{x}$ of $f$, and given $c, r > 0$ satisfying $c r \ge \varepsilon$, one can find $u \in B[\bar{x}, r]$ such that $f(u) + c\, d(u, \bar{x}) \le f(\bar{x})$ and
$$f(u) < f(x) + c\, d(u, x) \quad \text{for all } x \in X \setminus \{u\}. \qquad (2.5)$$
Thus, not too far from the $\varepsilon$-minimizer $\bar{x}$, we can find a strict minimizer $u$ of the modified function $f_{c,u} := f + c\, d(u, \cdot)$. Replacing the metric $d$ with the metrics $d' := d/(1 + d)$ or $d'' := \min(d, 1)$ and $c$ with the general term of a sequence $(c_n) \to 0_+$, we can ensure that the approximate function $f_{c,u}$ is as close to $f$ as required with respect to the uniform metric. However, there is a trade-off between the accuracies of the two approximating elements $u$, $f_{c,u}$: one cannot expect to get arbitrarily accurate approximations of $f$ and of $\bar{x}$ at the same time.

Proof We associate to $f$ and $c$ an order on $X$ defined by $w \preceq x$ if $f(w) + c\, d(w, x) \le f(x)$. Let
$$B(x) := \{ w \in X : f(w) + c\, d(w, x) \le f(x) \}, \qquad x \in X,$$
be the set of elements below $x$. We have $x \in B(x)$ for all $x \in X$ and the relations $y \in B(x)$, $x \in B(y)$ imply $d(x, y) = 0$, i.e. $x = y$. Let us verify that the relation $B$ satisfies the transitivity property $B(y) \subset B(x)$ for all $x \in X$, $y \in B(x)$. We may assume $x \in \operatorname{dom} f := f^{-1}(\mathbb{R})$, so that $f(y) < \infty$. Then, for all $z \in B(y)$ we also have $f(z) < \infty$ and $c\, d(y, z) \le f(y) - f(z)$. Since $y \in B(x)$, we also have $c\, d(x, y) \le f(x) - f(y)$. Adding the respective sides of these two inequalities, and using the triangle inequality, we get $c\, d(x, z) \le f(x) - f(z)$, or $z \in B(x)$. Thus $B$ defines an order; we shall construct a minimal element. Given $x \in \operatorname{dom} f$, we can define inductively a sequence starting from $x_0 := x$ by picking $x_{n+1} \in B(x_n)$ satisfying
$$f(x_{n+1}) \le \frac{1}{2} f(x_n) + \frac{1}{2} \inf f(B(x_n)). \qquad (2.6)$$
Such a choice is possible: it suffices to use the definition of an infimum when $\inf f(B(x_n)) < f(x_n)$ and to take $x_{n+1} = x_n$ when $\inf f(B(x_n)) = f(x_n)$. Since $x_n \in B(x_n)$, (2.6) ensures that the sequence $(f(x_n))$ is nonincreasing, hence is convergent as $f$ is bounded below. Let $\ell := \lim_n f(x_n)$. Since $x_{n+1} \in B(x_n)$ we have $c\, d(x_n, x_{n+1}) \le f(x_n) - f(x_{n+1})$ and by induction
$$c\, d(x_n, x_{n+p}) \le f(x_n) - f(x_{n+p}) \qquad (2.7)$$
for all $n, p \ge 0$. Since $(f(x_n))$ converges, the right-hand side of (2.7) is arbitrarily small for $n$ large; thus $(x_n)$ is a Cauchy sequence, hence has a limit we denote by $u$. Because $f$ is lower semicontinuous, for each $n \in \mathbb{N}$ the set $B(x_n)$ is closed. Since relation (2.7) says that $x_{n+p} \in B(x_n)$ for all $p \ge 0$, we get $u \in B(x_n)$. In particular, taking $n = 0$ and remembering that $x_0 = x$, we get
$$f(u) + c\, d(x, u) \le f(x).$$
Moreover, by the transitivity property of relation $B$, for all $v \in B(u)$ and all $n \in \mathbb{N}$, we have $v \in B(x_n)$. Thus $\inf f(B(x_n)) + c\, d(x_n, v) \le f(v) + c\, d(x_n, v) \le f(x_n)$ and relation (2.6) yields
$$c\, d(x_n, v) \le f(x_n) - \inf f(B(x_n)) \le 2\, (f(x_n) - f(x_{n+1})) \to 0,$$
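hence $d(v, u) = 0$. It follows that $B(u) = \{u\}$. This relation means that (2.5) is satisfied. If the starting point $\bar{x}$ is such that $f(\bar{x}) \le \inf f(X) + \varepsilon$, and $\varepsilon \le c r$, we have
$$\inf f(X) + c\, d(u, \bar{x}) \le f(u) + c\, d(u, \bar{x}) \le f(\bar{x}) \le \inf f(X) + c r,$$
so that $d(u, \bar{x}) \le r$.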
□
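The construction used in this proof can be followed step by step on a finite metric space, where it stops after finitely many steps. The sketch below is an illustration under assumptions of ours: the space is a set of integers with the usual distance, the function `f`, the constant `c` and the helper names `below` and `ekeland_point` are hypothetical, and at each step we pick a minimizer of $f$ on $B(x_n)$, which is one admissible way to satisfy (2.6). Integer data keep all comparisons exact.

```python
def below(f, dist, c, points, x):
    """Return B(x) = {w : f(w) + c * d(w, x) <= f(x)} within a finite set of points."""
    return [w for w in points if f(w) + c * dist(w, x) <= f(x)]

def ekeland_point(f, dist, c, points, x0):
    """Follow the sequence x_{n+1} in B(x_n) of the proof until B(x_n) = {x_n}.

    Choosing a minimizer of f on B(x_n) satisfies (2.6). If the minimizer is x_n
    itself, no other point lies below x_n, so x_n is the point u of the theorem.
    """
    x = x0
    while True:
        nxt = min(below(f, dist, c, points, x), key=f)
        if nxt == x:
            return x
        x = nxt

# Hypothetical data: integers in [-20, 20], f with two near-minimizers at -7 and 7.
points = list(range(-20, 21))
f = lambda x: abs(x * x - 50)
dist = lambda w, x: abs(w - x)
c = 3
x0 = 0

u = ekeland_point(f, dist, c, points, x0)
print("u =", u, " f(u) =", f(u), " d(u, x0) =", dist(u, x0))
```

With these data one can check directly that $f(u) + c\, d(u, x_0) \le f(x_0)$ and that $u$ is a strict minimizer of $f + c\, d(u, \cdot)$ on the finite set, as stated in (2.5).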
Given a metric space $(X, d)$ one may look for a complete metric space $(\hat{X}, \hat{d})$ that is close enough to $(X, d)$. A precise answer can be given, even for semimetric spaces.

Theorem 2.13 Given a semimetric space $(X, d)$ there is a complete semimetric space $(\hat{X}, \hat{d})$ and an isometry $j$ from $X$ onto a dense subset of $(\hat{X}, \hat{d})$. Moreover, if $(Y, d_Y)$ is a complete metric space and if $f : X \to Y$ is a uniformly continuous map, there is a unique uniformly continuous map $\hat{f} : \hat{X} \to Y$ such that $f = \hat{f} \circ j$.

Proof This last property ensures that the pair $(\hat{X}, j)$ is unique up to an isometry. In fact, if $(\hat{X}', j')$ is another pair, there is a uniformly continuous map $\hat{j}' : \hat{X} \to \hat{X}'$ such that $j' = \hat{j}' \circ j$. Similarly, there is a uniformly continuous map $\hat{j} : \hat{X}' \to \hat{X}$ such that $j = \hat{j} \circ j'$. Then $j = \hat{j} \circ (\hat{j}' \circ j) = I_{\hat{X}} \circ j$, so that the two maps $\hat{j} \circ \hat{j}'$ and $I_{\hat{X}}$ coincide by the uniqueness requirement (or the density of $j(X)$ in $\hat{X}$). Similarly, one shows that $\hat{j}' \circ \hat{j}$ coincides with the identity map $I_{\hat{X}'}$ on $\hat{X}'$.

Several specific constructions can be given for $(\hat{X}, j)$. One consists in taking for $\hat{X}$ the set of equivalence classes of Cauchy sequences for the relation $(x_n) \sim (x'_n)$ if $\lim_n d(x_n, x'_n) = 0$. We invite the reader to complete the construction by defining $\hat{d}$ and taking for $j(x)$ the class of the constant sequence with value $x$.

When $d$ is a metric that is bounded above on $X^2$ we can also take an embedding $j$ into the space $C_b(X) := C_b(X, \mathbb{R})$ endowed with the metric $d_\infty$ defined by $d_\infty(f, g) = \sup_{x \in X} |f(x) - g(x)|$. Given $x \in X$ we define $j(x)$ as the function $w \mapsto d(w, x)$ on $X$. Then, for $x, y \in X$, the triangle inequality yields $d_\infty(j(x), j(y)) \le d(x, y)$. In fact this inequality is an equality since $|j(x)(y) - j(y)(y)| = d(x, y)$. Taking for $\hat{X}$ the closure of $j(X)$ in $C_b(X)$ for $\hat{d} := d_\infty$ we get a complete space in which $j(X)$ is dense. The last assertion of the statement is a consequence of Theorem 2.4.
When $d$ is an arbitrary metric, turning the requirement that $j$ be isometric into the requirement that $j$ be nonexpansive, i.e. Lipschitzian with rate 1, we can define $j$ by setting $j(x)(w) := \min(d(w, x), 1)$. □

If $(d_a)_{a \in A}$ is a family of semimetrics on $X$, one can consider on $X$ the topology generated by the balls $B_a(x, r) := \{ w \in X : d_a(w, x) < r \}$. However, $X$ has a structure richer than a topology. It is called a uniformity and $(X, (d_a)_{a \in A})$ is called a uniform space. A sequence $(x_n)$ of $X$ is called a Cauchy sequence if for all $a \in A$ one has $d_a(x_n, x_p) \to 0$ as $n, p \to +\infty$. A uniform space is said to be complete if any Cauchy sequence is convergent.
2.3.3 Application to Ordinary Differential Equations

We intend to give a simple existence theorem for ordinary differential equations. One of its proofs relies on a generalization of the Contraction Theorem.

Proposition 2.28 Let $(X, d)$ be a complete metric space and let $f : X \to X$ be a continuous map such that for some $k \in \mathbb{N} \setminus \{0\}$ the $k$ times iterated map $f^{(k)} := f \circ \dots \circ f$ is a contraction. Then $f$ has a unique fixed point.

Proof If $x$ and $y$ are fixed points of $f$, then they are fixed points of $f^{(k)}$, hence $x = y$. Let $x$ be the fixed point of $f^{(k)}$. We know that for any $x_0 \in X$ we have $(f^{(kn)}(x_0))_n \to x$. In particular, taking $x_0 := f(x)$ we get that $(f^{(kn+1)}(x))_n \to x$. On the other hand, since $f^{(kn)}(x) = x$, we have $f(f^{(kn)}(x)) = f(x)$. By uniqueness of limits, we get $f(x) = x$. □

For this existence result, we anticipate some notions of derivation and integration to be found later. The reader may suppose $E := \mathbb{R}$ for the sake of simplicity.

Theorem 2.14 Let $T$ be a bounded interval of $\mathbb{R}$, let $E$ be a Banach space and let $f : T \times E \to E$ be a continuous map such that for some $c \in \mathbb{R}_+$ one has $\| f(t, e) - f(t, e') \| \le c\, \| e - e' \|$ for all $t \in T$, $e, e' \in E$. Then, given $t_0 \in T$, $e_0 \in E$ there exists a unique solution $x \in C(T, E)$ with a continuous derivative $x'$ to the equation
$$x'(t) = f(t, x(t)), \quad t \in T, \qquad x(t_0) = e_0. \qquad (2.8)$$
Proof We admit that $x \in C(T, E)$ satisfies (2.8) if and only if $x$ satisfies the integral equation
$$x(t) = e_0 + \int_{t_0}^{t} f(s, x(s))\, ds. \qquad (2.9)$$
Let us endow the space $X := C_b(T, E)$ of bounded continuous maps from $T$ to $E$ with the norm $\| \cdot \|_\infty$ and let us consider the map $F : X \to X$ given by
$$F(x)(t) := e_0 + \int_{t_0}^{t} f(s, x(s))\, ds.$$
For $x, y \in C_b(T, E)$ we have
$$\| F(x) - F(y) \|_\infty = \sup_{t \in T} \left\| \int_{t_0}^{t} \big( f(s, x(s)) - f(s, y(s)) \big)\, ds \right\| \le \sup_{t \in T} \left| \int_{t_0}^{t} c\, \| x(s) - y(s) \|\, ds \right| \le \ell(T)\, c\, \| x - y \|_\infty$$
where $\ell(T)$ is the length of $T$. Thus, for $\ell(T) c < 1$, $F$ is a contraction and $F$ has a fixed point that is a solution to (2.9). In the general case it can be shown by induction that for all $k \in \mathbb{N} \setminus \{0\}$, $x, y \in C_b(T, E)$ one has
$$\| F^{(k)}(x) - F^{(k)}(y) \|_\infty \le \frac{c^k\, \ell(T)^k}{k!}\, \| x - y \|_\infty.$$
For $k$ large enough one has $c^k \ell(T)^k / k! < 1$ and we can apply the preceding proposition to find a fixed point of $F$, hence a solution to (2.8). □
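The iteration $x \mapsto F(x)$ used in this proof can be imitated on a grid. The sketch below is a rough numerical illustration under assumptions of ours: $E = \mathbb{R}$, $T = [0, 1]$, $t_0 = 0$, $e_0 = 1$, the right-hand side `rhs` is $f(t, e) = e$ (so that $c = 1$ and the exact solution is $t \mapsto e^t$), and the integral in (2.9) is replaced with the trapezoidal rule. The names `picard_step` and `sup_dist` are illustrative.

```python
import math

def picard_step(f, t, x, e0):
    """Discrete version of F: F(x)(t_i) = e0 + integral from t_0 to t_i of f(s, x(s)) ds,
    the integral being approximated by the trapezoidal rule on the grid t."""
    y = [e0]
    for i in range(1, len(t)):
        h = t[i] - t[i - 1]
        y.append(y[-1] + 0.5 * h * (f(t[i - 1], x[i - 1]) + f(t[i], x[i])))
    return y

def sup_dist(x, y):
    """Uniform distance between two grid functions."""
    return max(abs(a - b) for a, b in zip(x, y))

def rhs(t, e):
    # Illustrative right-hand side: x' = x, Lipschitzian in e with rate c = 1.
    return e

n = 200
t = [i / n for i in range(n + 1)]
x = [1.0] * (n + 1)          # starting guess: the constant function e0
gaps = []                    # uniform distances between successive iterates
for _ in range(20):
    x_new = picard_step(rhs, t, x, 1.0)
    gaps.append(sup_dist(x_new, x))
    x = x_new

error = max(abs(xi - math.exp(ti)) for xi, ti in zip(x, t))
print("first gaps:", gaps[:4])
print("distance to exp on the grid:", error)
```

In this example the successive gaps decay roughly like $1/k!$, in line with the estimate on $F^{(k)}$ above; the remaining distance to the exact solution comes from the quadrature rule, not from the iteration.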
Exercises

1. Observe that the proof we gave of Corollary 2.15 only uses the continuity of the partial map $w \mapsto f(w, g(z))$. Prove that such an assumption is equivalent to the continuity of $f$ at $(z, g(z))$ in view of the hypothesis that for all $w \in W$ the map $f_w$ is Lipschitzian with rate $c$.
2. Let $(X, d)$ be a compact metric space and let $f : X \to X$ be a map such that for $x, x' \in X$ with $x \ne x'$ one has $d(f(x), f(x')) < d(x, x')$. Show that $f$ has a unique fixed point that can be found by the method of successive approximations.
3. Let $(X, d)$ be a complete metric space, let $c > 1$, and let $f : X \to X$ be a continuous map such that for $x, x' \in X$ one has $d(f(x), f(x')) \ge c\, d(x, x')$. Show that $f$ has a unique fixed point.
4. Let $T := [a, b]$, let $X := L_2(T)$, let $k \in L_2(T \times T)$ be such that $\int_{T^2} k^2(s, t)\, ds\, dt < 1$, and let $f : \mathbb{R} \times T^2 \to \mathbb{R}$ be such that for all $(r, r', s, t) \in \mathbb{R}^2 \times T^2$ one has $| f(r, s, t) - f(r', s, t) | \le k(s, t)\, | r - r' |$ and such that for all $x \in X$ the function $t \mapsto \int_a^b f(x(s), s, t)\, ds$ belongs to $X$. Prove that for every $y \in X$ the following integral equation has a unique solution:
$$x(t) = y(t) + \int_a^b f(x(s), s, t)\, ds.$$
5. Let $X$ be an open subset of a complete metric space $(W, d)$, with $X \ne W$. For $x, x' \in X$, let
$$d_X(x, x') := d(x, x') + \left| \frac{1}{d(x, W \setminus X)} - \frac{1}{d(x', W \setminus X)} \right|.$$
Show that $d_X$ is a metric on $X$ whose associated topology is the induced topology. Prove that $(X, d_X)$ is complete. Does this contradict the fact that in general $X$ is not closed in $W$?
6. Show that on $\mathbb{R}$ the function $d' : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ given by $d'(w, x) = | w^3 - x^3 |$ is a metric topologically equivalent to the usual metric $d$ given by $d(w, x) := | w - x |$, but not uniformly equivalent to $d$. Verify that the Cauchy sequences for $d$ and $d'$ are the same.
7. Let $(X, d)$ be a complete metric space and let $f, g : X \to X$ be two contractions with rate $c \in {]0, 1[}$. Let $x$ (resp. $y$) be the fixed point of $f$ (resp. $g$). Prove that $d(x, y) \le (1 - c)^{-1} d_\infty(f, g)$ where $d_\infty(f, g) := \sup_{w \in X} d(f(w), g(w))$. [Hint: In relation (2.4) take $n = 0$, $x_0 := y$, pass to the limit on $p$ and note that $d(f(y), y) \le d_\infty(f, g)$.]
2.3.4 Compact Metric Spaces

Some additional properties of compact spaces can be obtained when they are metrizable, i.e. when their topologies can be associated with metrics. They are even valid for sequentially compact spaces, a topological space $X$ being called sequentially compact if every sequence in $X$ has a convergent subsequence. First we note a uniformity property related to coverings.

Lemma 2.5 (Lebesgue) Let $(W, d)$ be a metric space and let $X$ be a subset of $W$ that is sequentially compact. Then for any family $(U_i)_{i \in I}$ of open subsets of $W$ whose union contains $X$ there exists some $r > 0$ such that for all $x \in X$ the ball $B(x, r)$ is contained in some $U_i$.

Proof If no such $r$ exists, for all $n \in \mathbb{N} \setminus \{0\}$ one can find some $x_n \in X$ such that for all $i \in I$ the ball $B(x_n, 1/n)$ is not contained in $U_i$. Let $x \in X$ be the limit of a subsequence $(x_{k(n)})_n$ of $(x_n)$. Let $r_n := d(x_{k(n)}, x)$ and let $j \in I$ be such that $x \in U_j$. Let $s > 0$ be such that $B(x, s) \subset U_j$. For $n$ large enough we have $r_n + 1/k(n) < s$ so that $B(x_{k(n)}, 1/k(n))$ is contained in $B(x, s)$, hence in $U_j$. This contradicts the choice of the balls $B(x_n, 1/n)$. □

The second property we consider captures the idea that such a space can be approximated by a finite set.

Definition 2.12 A metric space $(X, d)$ is said to be precompact if for any $\varepsilon > 0$ there exists a finite subset $F_\varepsilon$ of $X$ such that for all $x \in X$ one has $d(x, F_\varepsilon) < \varepsilon$.

This means that for all $\varepsilon > 0$ there is a covering of $X$ by a finite number of balls of radius $\varepsilon$. Such a space is clearly bounded and separable. But we have more.

Theorem 2.15 For a metric space $(X, d)$ the following properties are equivalent:
(a) every sequence $(x_n)$ of $X$ has a cluster point;
(b) $X$ is sequentially compact;
(c) $X$ is precompact and complete;
(d) every infinite subset $S$ of $X$ has an accumulation point;
(e) $X$ is compact.
Proof The equivalence (a)$\Leftrightarrow$(b) stems from Proposition 2.21 and the implication (e)$\Rightarrow$(a) is obvious. Let us prove (b)$\Rightarrow$(c). If $X$ is sequentially compact then $X$ is complete since every Cauchy sequence in $X$ has a convergent subsequence, hence is convergent. If $X$ is not precompact there exists some $\varepsilon > 0$ such that $X$ is not covered
by a finite number of balls of radius $\varepsilon$. Thus, by induction we build a sequence $(x_n)$ such that, for all $n \in \mathbb{N}$, $x_{n+1}$ is not contained in $B(x_0, \varepsilon) \cup \dots \cup B(x_n, \varepsilon)$. Such a sequence cannot have a convergent subsequence, since its terms are at mutual distances at least $\varepsilon$. Now let us show that (c)$\Rightarrow$(d). For every $n \in \mathbb{N}$ there exists a finite subset $F_n$ of $X$ such that the balls with center in $F_n$ and radius $2^{-n}$ cover $X$, hence cover $S$. One of these balls contains an infinite number of points of $S$. Let us denote by $x_n \in F_n$ its center. Similarly there exists some $x_{n+1} \in F_{n+1}$ such that $B(x_{n+1}, 2^{-n-1})$ contains an infinite number of points of $S \cap B(x_n, 2^{-n})$. By the triangle inequality we have $d(x_{n+1}, x_n) < 2^{-n+1}$. Since $X$ is complete, the sequence $(x_n)$ built in this way converges to some $x \in X$ since it is a Cauchy sequence (in fact an Abel sequence). Again, the triangle inequality shows that $x$ is an accumulation point of $S$. The implication (d)$\Rightarrow$(a) is immediate: given a sequence $(x_n)$ of $X$ either $S := \{ x_n : n \in \mathbb{N} \}$ is finite and then a subsequence of $(x_n)$ is constant, hence convergent, or else $S$ is infinite and any accumulation point $a$ of $S$ is a cluster point of $(x_n)$ since we know that any neighborhood $V$ of $a$ contains an infinite number of points of $S$. It remains to show that (a)$\Rightarrow$(e). Let $(U_i)_{i \in I}$ be an open covering of $X$. The preceding lemma yields some $r > 0$ such that for all $x \in X$ there exists some $i(x) \in I$ such that $B(x, r) \subset U_{i(x)}$. On the other hand, since (a)$\Rightarrow$(c), $X$ is precompact and there exists a finite subset $F$ of $X$ such that $\{ B(x, r) : x \in F \}$ is a covering of $X$. Then $\{ U_{i(x)} : x \in F \}$ is a finite covering of $X$: $X$ is compact. □

Corollary 2.16 A compact metric space is separable.

Proof Let $X$ be a compact metric space. Since $X$ is precompact, for any sequence $(r_n) \to 0_+$ one can find a finite subset $F_n$ of $X$ such that $\{ B(x, r_n) : x \in F_n \}$ is a covering of $X$. Then $D := \bigcup_n F_n$ is a dense countable subset of $X$. □

Another important consequence of compactness and metrizability follows.
Theorem 2.16 Let $(W, d_W)$ be a metric space and let $X$ be a relatively compact subset of $W$. If $f : W \to Y$ is a continuous map with values in another metric space $(Y, d_Y)$, then $f$ is uniformly continuous around $X$ in the following sense: for every $\varepsilon > 0$ there exists some $\delta > 0$ such that for all $w \in W$, $x \in X$ satisfying $d_W(w, x) < \delta$ one has $d_Y(f(w), f(x)) < \varepsilon$.

Of course, if $X = W$ the conclusion is just usual uniform continuity of $f$.

Proof Given $\varepsilon > 0$, since $f$ is continuous, for all $z \in \mathrm{cl}(X)$ there exists some open neighborhood $U_z$ of $z$ such that $f(U_z) \subset B(f(z), \varepsilon/2)$. Lemma 2.5 yields some $r > 0$ such that for all $x$ in the compact set $\mathrm{cl}(X)$ one has $B(x, r) \subset U_{z(x)}$ for some $z(x) \in \mathrm{cl}(X)$. Then, for $w \in B(x, r)$ one has $d_Y(f(w), f(x)) \le d_Y(f(w), f(z(x))) + d_Y(f(z(x)), f(x)) < \varepsilon$ since $w, x \in B(x, r) \subset U_{z(x)}$. Thus we can take $\delta = r$. □

Exercise Give a proof by contradiction using sequences and subsequences.

Exercise Give a proof using sequences and cluster points.

Exercise Show that the result is valid if $X$ is a semi-metric space and $Y$ is a uniform space.
The compactness assumption in Theorem 2.6 can be relaxed by using a notion of coercivity. Let us say that a function $f : X \to \mathbb{R}$ on a metric space $X$ is coercive if for all $r \in \mathbb{R}$ the sublevel set $S_f(r) := f^{-1}(]-\infty, r])$ is bounded, or, equivalently, if $f(x) \to \infty$ as $d(x, x_0) \to \infty$ ($x_0$ being an arbitrary point of $X$). This notion is essentially used in the case when $X$ is a normed space and $f(x) \to \infty$ as $\| x \| \to \infty$. We will say that $f$ is compactly coercive if for all $r \in \mathbb{R}$ the sublevel set $S_f(r)$ is compact. When $f$ is lower semicontinuous and the closed balls of $X$ are compact, both notions coincide. For such a function, the existence of minimizers is ensured.

Lemma 2.6 Let $f : X \to \mathbb{R}_\infty$ be a compactly coercive function on a metric space $X$. Then $f$ attains its minimum. In particular, when the closed balls of $X$ are compact, any coercive lower semicontinuous function on $X$ attains its minimum.

Proof The result being obvious when $f$ only takes the value $+\infty$, let us take $r \in \mathbb{R}$ such that $S_f(r) := f^{-1}(]-\infty, r])$ is nonempty. By assumption, $S_f(r)$ is compact. By Weierstrass' Theorem, $f$ attains its infimum on $S_f(r)$. Since $\inf f(X) = \inf f(S_f(r))$, any minimizer of the restriction $f|_{S_f(r)}$ of $f$ is also a minimizer of $f$. □

Proposition 2.29 Let $C$ be a nonempty closed convex subset of a Euclidean space $X$ and let $a \in X$. Then there exists some $p \in C$ called the best approximation of $a$ in $C$ such that $\| p - a \| \le \| x - a \|$ for all $x \in C$. Moreover, $p$ is characterized by the inequality
$$\forall x \in C \qquad \langle a - p \mid x - p \rangle \le 0. \qquad (2.10)$$
Proof The function $x \mapsto \| x - a \|$ is continuous and coercive and the closed balls of $X$ are compact since they are dilations of smaller balls and since $X$ is locally compact. Setting $f(x) = \| x - a \|$ if $x \in C$ and $f(x) := +\infty$ if $x \in X \setminus C$, we get a coercive lower semicontinuous function on $X$. It attains its infimum at some point $p$ of $C$, so that $\| p - a \| \le \| x - a \|$ for all $x \in C$. Given $x \in C$, for $t \in {]0, 1]}$ we have $x_t := p + t(x - p)$ in $C$ by convexity, hence
$$\| p - a \|^2 \le \| x_t - a \|^2 = \| p - a \|^2 + 2t \langle p - a \mid x - p \rangle + t^2 \| x - p \|^2.$$
Simplifying both sides and dividing by $t$ and then passing to the limit as $t \to 0_+$ we get (2.10). □
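Inequality (2.10) is easy to test numerically when the best approximation is known in closed form. The sketch below assumes, for illustration only, that $C$ is a box $\prod_i [\ell_i, u_i]$ in $\mathbb{R}^3$, for which the best approximation is obtained by clipping each coordinate; the point `a`, the box, and the helper names are our choices, not taken from the text.

```python
import math
import random

def project_onto_box(a, lower, upper):
    """Best approximation of a in the closed convex box C = prod [lower_i, upper_i]."""
    return [min(max(ai, lo), hi) for ai, lo, hi in zip(a, lower, upper)]

def inner(u, v):
    """Euclidean scalar product."""
    return sum(ui * vi for ui, vi in zip(u, v))

def sub(u, v):
    return [ui - vi for ui, vi in zip(u, v)]

def norm(u):
    return math.sqrt(inner(u, u))

a = [2.0, -0.5, 0.3]
lower = [0.0, 0.0, 0.0]
upper = [1.0, 1.0, 1.0]
p = project_onto_box(a, lower, upper)

# Sample points of C and evaluate the left-hand side of (2.10) at each of them.
rng = random.Random(0)
samples = [[rng.uniform(lo, hi) for lo, hi in zip(lower, upper)] for _ in range(1000)]
worst = max(inner(sub(a, p), sub(x, p)) for x in samples)

print("p =", p)
print("largest sampled value of <a - p | x - p>:", worst)
```

On this example the sampled values of $\langle a - p \mid x - p \rangle$ are all nonpositive, and no sampled point of $C$ is closer to $a$ than $p$.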
Exercises

1. Let $X$ be a closed subset of $\mathbb{R}^d$ and let $f : X \to \mathbb{R}$ be lower semicontinuous and pseudo-coercive in the sense that there exists some $x_0 \in X$ such that $f(x_0) < \liminf_{\| x \| \to \infty,\ x \in X} f(x)$. Show that $f$ attains its infimum.
2. Let $X$ be a closed subset of $\mathbb{R}^d$ and let $f : X \to \mathbb{R}$. Assume $f$ is finitely minimizable in the sense that there exists an $r \in \mathbb{R}_+$ such that, for any $t > m := \inf f(X)$, there exists some $x \in X$ with $\| x \| \le r$, $f(x) < t$. Show that any pseudo-coercive function is finitely minimizable and that any finitely minimizable lower semicontinuous function on $X$ attains its infimum at some point of $X \cap B[0, r]$, where $r$ is the radius of essential minimization, i.e. the infimum of the real numbers $r$ for which the above definition is satisfied.
3. Prove Weierstrass' Theorem in the case when $X$ is a compact metric space by using a minimizing sequence of $f$, i.e. a sequence $(x_n)$ of $X$ such that $(f(x_n)) \to \inf f(X)$.
4. Show that any l.s.c. function $f$ on $[0, 1]$ (or on a separable metric space $X$) is the supremum of the family of continuous functions majorized by $f$.
5. Prove Corollary 2.12 by using open subsets.
6. Show that among all cylindrical barrels of a given area $s$ there is one with greatest volume.
7. (d'Alembert-Gauss Theorem) Prove that any nonconstant polynomial $P$ with complex coefficients has at least one root in $\mathbb{C}$. [Hint: verify that $|P(\cdot)|$ is coercive and show that if $z_0$ is such that $|P(z_0)| = \inf \{ |P(z)| : z \in \mathbb{C} \}$, then $P(z_0) = 0$.]
8. Show that the following properties of a locally compact metric space $(X, d)$ are equivalent:
(a) $X$ is the union of an increasing sequence $(X_n)$ of open relatively compact subsets of $X$ such that $\mathrm{cl}(X_n) \subset X_{n+1}$ for all $n \in \mathbb{N}$;
(b) $X$ is the union of a countable family of compact subsets;
(c) $X$ is separable.
Everything should be made as simple as possible, but not simpler. Albert Einstein.
Abstract In this central chapter, the fundamental elements of functional analysis are presented. Although topological vector spaces are considered, essentially for the use of weak topologies, the focus is on normed spaces. Normed spaces in duality or metric duality form a convenient framework. The main pillars of functional analysis are presented: separation properties, the uniform boundedness theorem, the open mapping theorem, the closed graph theorem, and others. Some special properties such as reflexivity, separability, and uniform convexity are given some attention. An account of spectral theory for linear operators is presented and the case of compact operators is considered. The chapter ends with a presentation of regulated functions and functions of bounded variation. This puts at our disposal an elementary integration theory that is often sufficient for simple purposes.
Besides subsets of Euclidean spaces, spaces of functions are the most common examples of topological spaces. Historically, they prompted the study of abstract metric spaces and topological spaces: the notion of a metric space appeared as a transposition of the usual notion in $\mathbb{R}^d$. Then, it was realized that the concept of a norm is even a simplification of the concept of a metric and that the framework of linear spaces equipped with a norm is a very convenient framework to tackle and solve a number of problems. This chapter is devoted to such a tool. We focus on linearity, leaving aside the powerful tools of nonlinear functional analysis. Most of the results we expound have been obtained during the twentieth century. They have applications in various problems and thus are important. Since many problems can be solved by using a fixed point theorem, let us quote one such result in a simple form:

(Brouwer's Fixed Point Theorem) For any continuous map $f : B_X \to B_X$ from the closed unit ball $B_X$ of a Euclidean space $X$ into itself there exists some $x \in B_X$ such that $f(x) = x$.
A proof of this result is presented in the appendix; it involves several tools from various chapters of the book. Whereas a generalization due to J. Schauder of this result has to be used to solve problems in functional spaces (see Theorem 3.28 below), we first present a proof for the simple one-dimensional case for which $B_X = [-1, 1]$. Then, setting $g(x) = f(x) - x$, we see that $g(-1) \ge 0$, $g(1) \le 0$ and the existence of some $x \in [-1, 1]$ such that $g(x) = 0$ stems from the intermediate value theorem.
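The one-dimensional argument can be turned into a small computation: since $g(-1) \ge 0 \ge g(1)$, bisection on $g$ locates a fixed point of $f$. The sketch below is an illustration under our own assumptions: the function name `fixed_point_by_bisection` is hypothetical, and the test map is $\cos$, which sends $[-1, 1]$ into itself.

```python
import math

def fixed_point_by_bisection(f, tol=1e-12):
    """Locate x in [-1, 1] with f(x) = x for a continuous f : [-1, 1] -> [-1, 1].

    The loop keeps g(lo) >= 0 and g(hi) <= 0 for g(x) = f(x) - x, which holds
    initially because f takes its values in [-1, 1].
    """
    def g(x):
        return f(x) - x

    lo, hi = -1.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) >= 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x = fixed_point_by_bisection(math.cos)
print("x =", x, " cos(x) - x =", math.cos(x) - x)
```

Bisection only uses the sign of $g$, exactly as the intermediate value theorem does, so no contraction property of $f$ is needed here.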
3.1 Normed Spaces

When a set is endowed with a topology (or a metric) and has an algebraic structure, it is interesting to study the case when these two structures are compatible in the sense that the operations are continuous. If this is not the case, one might face difficulties.
3.1.1 General Properties of Normed Spaces

Given a linear space $X$, it is natural to select metrics or semimetrics that are compatible with the operations, so that one may expect simplifications. Let us say that a semimetric $d$ on a linear space $X$ is compatible (with the linear structure) if for all $x, y, v \in X$ one has $d(x + v, y + v) = d(x, y)$ (invariance by translations) and if for all $\lambda \in \mathbb{R}$ (or $\lambda \in \mathbb{C}$ if $X$ is a linear space over $\mathbb{C}$) one has $d(\lambda x, \lambda y) = |\lambda|\, d(x, y)$. The following result is immediate.

Lemma 3.1 If a semimetric $d$ on a linear space $X$ is compatible, then the function $p : X \to \mathbb{R}_+$ given by $p(x) := d(x, 0)$ is a seminorm, i.e. a function $p : X \to \mathbb{R}_+$ satisfying the following properties:
(SN1) $p(\lambda x) = |\lambda|\, p(x)$ for all $\lambda \in \mathbb{R}$ (or $\lambda \in \mathbb{C}$) and all $x \in X$;
(SN2) $p(x + y) \le p(x) + p(y)$ for all $x, y \in X$.
This last relation is called the triangle inequality. It stems from the relations
$$d(x + y, 0) \le d(x + y, y) + d(y, 0) = d(x, 0) + d(y, 0).$$
Conversely, given a seminorm $p$ on $X$, one gets a compatible semimetric $d$ by setting $d(x, y) := p(x - y)$. For $x, y, z \in X$ one has
$$d(x, z) = p(x - z) = p((x - y) + (y - z)) \le p(x - y) + p(y - z) = d(x, y) + d(y, z).$$
Since $p(-v) = p(v)$ for all $v \in X$, one also has $d(y, x) = d(x, y)$. Moreover, from (SN1) one deduces that $p(0) = 0$, hence $d(x, x) = 0$ for all $x \in X$. The
compatibility conditions are immediate consequences of the definition of $d$ and of (SN1). A seminorm satisfying the condition
(N0) $p(x) = 0$ implies $x = 0$
is called a norm. The associated semimetric is then a metric and the converse is true. A norm is often denoted by $x \mapsto \| x \|$. A normed space is a linear space equipped with a norm. A Banach space is a complete normed space.

Example On $\mathbb{R}^d$ familiar norms are defined as follows for $x = (x_1, \dots, x_d)$:
$$\| x \|_1 = |x_1| + \dots + |x_d|, \qquad \| x \|_\infty = \max(|x_1|, \dots, |x_d|), \qquad \| x \|_p = (|x_1|^p + \dots + |x_d|^p)^{1/p} \quad (p \in [1, \infty[).$$
The norm $\|\cdot\|_2$ corresponding to the choice $p = 2$ is called the Euclidean norm. It has interesting properties due to the fact that it is associated with a scalar product but it is not always as convenient as $\|\cdot\|_1$ or $\|\cdot\|_\infty$.
Example If $S$ is an arbitrary set and if $(E, \|\cdot\|)$ is a normed space, it is easy to show that on the space $B(S,E)$ (resp. $B(S)$) of bounded maps from $S$ into $E$ (resp. of bounded real-valued functions on $S$) one gets a norm $\|\cdot\|_\infty$ by setting $\|f\|_\infty := \sup_{s \in S} \|f(s)\|$ (resp. $\|f\|_\infty := \sup_{s \in S} |f(s)|$). This norm is called the norm of uniform convergence or the sup norm.
It is convenient to denote by $B_X$ the closed ball $B[0,1]$ centered at 0 with radius 1, called the (closed) unit ball of $X$. A subset $B$ of $X$ is seen to be bounded if and only if there exists some $r \in \mathbb{R}_+$ such that $B \subset rB_X$. The unit sphere of $X$ is the set $S_X$ of $u \in X$ such that $\|u\| = 1$. Two norms $\|\cdot\|$ and $\|\cdot\|'$ are said to be equivalent if there exist two positive constants $c$, $c'$ such that $(1/c)\|\cdot\| \le \|\cdot\|' \le c'\|\cdot\|$. Such a relation defines an equivalence relation among norms. Moreover, if the norms $\|\cdot\|$ and $\|\cdot\|'$ are equivalent then the associated metrics $d$ and $d'$ are metrically (hence uniformly and topologically) equivalent. The following result can be considered as a training in the use of norms (Fig. 3.1).
Proposition 3.1 (Riesz) Let $Y$ be a closed linear subspace of a normed space $(X, \|\cdot\|)$ with $Y \ne X$. Then, for every $\varepsilon > 0$ one can find some $x \in X$ such that $\|x\| = 1$ and $d(x,Y) \ge 1 - \varepsilon$, for $d(x,Y) := \inf\{d(x,y) : y \in Y\}$.
Proof Let $z \in X \setminus Y$ and let $\varepsilon > 0$ be given. Since $Y$ is closed, $d(z,Y) > 0$, so we can find some $y_\varepsilon \in Y$ such that $r := \|z - y_\varepsilon\| \le (1+\varepsilon)\, d(z,Y)$.
[Fig. 3.1 Approximate orthogonality (labels: $x$, $Y$)]
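Let $x = r^{-1}(z - y_\varepsilon)$. If $y \in Y$ we have $y_\varepsilon + ry \in Y$, hence $\|z - y_\varepsilon - ry\| \ge d(z,Y)$ and $\|x - y\| = \|r^{-1}(z - y_\varepsilon) - y\| = r^{-1}\|z - y_\varepsilon - ry\| \ge r^{-1} d(z,Y) \ge (1+\varepsilon)^{-1}$. Taking the infimum over $y \in Y$, it follows that $d(x,Y) \ge (1+\varepsilon)^{-1} \ge 1 - \varepsilon$. □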
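The equivalence of the familiar norms on $\mathbb{R}^d$ can be checked numerically on sample vectors. The sketch below is only an illustration, not part of the text's argument: the helper names `norm_p` and `equivalence_holds` are hypothetical, and the chain $\|x\|_\infty \le \|x\|_2 \le \|x\|_1 \le d\,\|x\|_\infty$ that it tests is the standard comparison between these three norms.

```python
import math

def norm_p(x, p):
    """p-norm on R^d for 1 <= p < infinity; p = math.inf gives the max norm."""
    if p == math.inf:
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def equivalence_holds(x, tol=1e-12):
    """Check ||x||_inf <= ||x||_2 <= ||x||_1 <= d * ||x||_inf for one vector x."""
    d = len(x)
    n_inf, n_2, n_1 = norm_p(x, math.inf), norm_p(x, 2), norm_p(x, 1)
    return n_inf <= n_2 + tol and n_2 <= n_1 + tol and n_1 <= d * n_inf + tol

x = [3.0, -4.0, 0.0]
print(norm_p(x, 1), norm_p(x, 2), norm_p(x, math.inf))
print(equivalence_holds(x))
```

A finite number of samples proves nothing, of course; the point is only to make the constants in the definition of equivalent norms concrete.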
Exercises
1. Show that in a normed space $(X, \|\cdot\|)$ the closure of the open ball $B(a,r)$ is the closed ball $B[a,r]$ and that the interior of $B[a,r]$ is the open ball $B(a,r)$. Give an example showing that this property is not always satisfied in a general metric space.
2. Let $A$ be a closed subset of a normed space $(X, \|\cdot\|)$ and let $B$ be a compact subset of $X$. Show that $A + B := \{x + y : x \in A,\ y \in B\}$ is closed. Give an example showing that the sum of two closed subsets of $X$ is not always closed. [Hint: take $X := \mathbb{R}^2$, $A := \mathbb{R} \times \{0\}$, $B := \{(x,y) \in \mathbb{R}_+ \times \mathbb{R}_+ : xy = 1\}$.]
3. Let $X$ (often denoted by $c_{00}$) be the space of sequences $x := (x_n)$ of real numbers such that $x_n$ is null for $n$ large enough. Show that the norms given by $\|x\|_\infty := \sup_n |x_n|$ and $\|x\|_1 = \sum_n |x_n|$ are not equivalent.
4. Show that two norms $\|\cdot\|$ and $\|\cdot\|'$ on a linear space $X$ are equivalent if and only if the associated metrics $d$ and $d'$ are metrically equivalent (resp. uniformly equivalent, resp. topologically equivalent).
5. Let $(X, \|\cdot\|)$ be a normed vector space, let $x, y \in X$ and let $r \in \mathbb{R}_+$. Show that $r\|x\| = \lim_{n \to \infty} (\|(r+n)x + y\| - \|nx + y\|)$.
6. Let $X$ be the space of functions of class $C^1$ on some compact interval $T$ of $\mathbb{R}$ equipped with the norm defined by $\|x\| := \sup_{t \in T} |x(t)| + \sup_{t \in T} |x'(t)|$. Show that setting $\|x\|' := \sup_{t \in T} (|x(t)| + |x'(t)|)$ one defines an equivalent norm.
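7. Let $(X, \|\cdot\|)$ be a normed vector space and let $x, y \in X \setminus \{0\}$. Show that
$\|x - y\| \ge \frac{1}{2} \max(\|x\|, \|y\|) \left\| \frac{x}{\|x\|} - \frac{y}{\|y\|} \right\|$, $\quad \|x - y\| \ge \frac{1}{4} (\|x\| + \|y\|) \left\| \frac{x}{\|x\|} - \frac{y}{\|y\|} \right\|$.
Show that the constants $\frac{1}{2}$ and $\frac{1}{4}$ cannot be replaced with greater constants.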
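As a numerical hint for Exercise 3 above (not a proof), one may compare the two norms on the finitely supported sequences $(1, \dots, 1, 0, 0, \dots)$ with $n$ ones. The function names below are hypothetical; trailing zeros are simply omitted from the lists.

```python
def sup_norm(x):
    """Sup norm of a finitely supported sequence given by its nonzero prefix."""
    return max((abs(t) for t in x), default=0.0)

def one_norm(x):
    """Sum of absolute values of a finitely supported sequence."""
    return sum(abs(t) for t in x)

def ratio(n):
    """Quotient ||x||_1 / ||x||_inf for x = (1, ..., 1, 0, 0, ...) with n ones."""
    x = [1.0] * n
    return one_norm(x) / sup_norm(x)

print([ratio(n) for n in (1, 10, 100)])
```

The quotient grows with $n$, which suggests why no constant $c$ with $\|x\|_1 \le c\,\|x\|_\infty$ can exist on $c_{00}$.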
3.1.2 Continuity of Linear and Multilinear Maps
Continuity of linear maps can be handled in several convenient ways.
Proposition 3.2 A linear map $\ell : X \to Y$ between two normed spaces $(X, \|\cdot\|_X)$, $(Y, \|\cdot\|_Y)$ is continuous (and in fact Lipschitzian) if and only if there exists some $c \in \mathbb{R}_+$ such that $\|\ell(x)\|_Y \le c\|x\|_X$ for all $x \in X$. The least such constant is called the norm of $\ell$.
Proof Let us prove more, namely that the following assertions about a linear map $\ell : X \to Y$ are equivalent:
(a) $\ell$ is bounding, i.e. for any bounded subset $B$ of $X$, $\ell(B)$ is bounded;
(b) $\ell(B_X)$ is bounded;
(c) there exists some $c \in \mathbb{R}_+$ such that $\|\ell(x)\|_Y \le c\|x\|_X$ for all $x \in X$;
(d) $\ell$ is Lipschitzian;
(e) $\ell$ is continuous;
(f) $\ell$ is continuous at 0.
The implications (a) $\Rightarrow$ (b) and (d) $\Rightarrow$ (e) $\Rightarrow$ (f) are obvious. The implication (b) $\Rightarrow$ (c) is a consequence of the homogeneity of $\ell$: if $\ell(B_X)$ is contained in $cB_Y$, then for all $x \in X \setminus \{0\}$, setting $r := \|x\|_X$, we have $\|\ell(x/r)\|_Y \le c$, hence $\|\ell(x)\|_Y \le cr = c\|x\|_X$, and $\|\ell(0)\|_Y = 0$. The implication (c) $\Rightarrow$ (d) is a consequence of the additivity of $\ell$: for all $x, x' \in X$ we have $\|\ell(x) - \ell(x')\|_Y = \|\ell(x - x')\|_Y \le c\|x - x'\|_X$. Finally, let us prove that (f) $\Rightarrow$ (a). When $\ell$ is continuous at 0 there exists some $\delta > 0$ such that $\ell(\delta B_X) \subset B_Y$. Then if $B$ is a subset of $rB_X$ for some $r \in \mathbb{R}_+$ one has $\ell(B) \subset (r/\delta)B_Y$: $\ell(B)$ is bounded. □
For a linear form, i.e. a linear map from $X$ into $\mathbb{R}$, a simple characterization of continuity is available. In the sequel we say that a subset $H$ of a linear space $X$ is a hyperplane if there exist $c \in \mathbb{R}$ and a linear form $h \ne 0$ on $X$ such that $H = h^{-1}(c)$.
Corollary 3.1 Let $X$ be a (real) normed vector space, let $c \in \mathbb{R}$, and let $h$ be a non-null linear form on $X$. The hyperplane $H := h^{-1}(c)$ is closed if and only if $h$ is continuous.
Proof Obviously, when $h$ is continuous, $H$ is closed, $\{c\}$ being closed in $\mathbb{R}$. Conversely, suppose that $H$ is closed. Since $h$ is non null, $X \setminus H$ is nonempty. Let $x_0 \in X \setminus H$, so that there exists some $r > 0$ for which $B(x_0, r) \subset X \setminus H$. Assuming $h(x_0) < c$ (the other possibility is dealt with by changing $h$, $c$ into $-h$, $-c$), let us note that $h(x) < c$ for all $x \in B(x_0, r)$: otherwise, if there exists an $x_1 \in B(x_0, r)$ such that $h(x_1) > c$, for $t := (c - h(x_0))(h(x_1) - h(x_0))^{-1} \in\ ]0,1[$ one has $h((1-t)x_0 + tx_1) = c$ and $(1-t)x_0 + tx_1 \in B(x_0, r)$, contradicting $B(x_0, r) \subset X \setminus H$. Then, for all $u \in B(0,1)$ we have $h(x_0 + ru) < c$, i.e. $h(u) < r^{-1}(c - h(x_0))$; applying this to $-u$ as well, we see that $h$ is bounded on $B(0,1)$, so that $h$ is continuous. □
Corollary 3.2 Let $X$ be a dense linear subspace of a normed space $W$ and let $Y$ be a Banach space. If $f : X \to Y$ is a continuous linear map, then there exists a unique continuous linear map $g : W \to Y$ whose restriction to $X$ is $f$.
Proof Since $f$ is Lipschitzian with rate $\|f\|$ as we have seen, $f$ has a unique extension $g$ to $W$ as a Lipschitzian map. The linearity of $g$ follows from a passage to the limit in the relation $f(x_n + ry_n) = f(x_n) + rf(y_n)$ with $(x_n) \to x$, $(y_n) \to y$. □
The space $L(X,Y)$ of continuous linear maps from $X$ to $Y$ can be turned into a normed space.
Proposition 3.3 The map $u \mapsto \|u\| := \sup\{\|u(x)\|_Y : x \in B_X\}$ on the space $L(X,Y)$ of continuous linear maps from $X$ into $Y$ is a norm on $L(X,Y)$. In particular, the (topological) dual space $X^* := L(X, \mathbb{R})$ of a normed space $X$ is a normed space.
Proof The fact that this map is a seminorm is an easy consequence of the definitions. It is a norm since when $\|u\| = 0$ one has $u(x) = 0$ for all $x \in B_X$, hence for all $x \in X$ by homogeneity. □
Example In general the supremum in the definition of $\|u\|$ is not attained, even for $Y = \mathbb{R}$. Taking $T := [0,1]$ and $X := \{x \in C(T) : x(0) = 0\}$ with $\|x\| = \|x\|_\infty := \sup_{t \in T} |x(t)|$ for $x \in X$, one can see that for $u \in X^*$ given by $u(x) = \int_0^1 x(t)\,dt$ one has $\|u\| = 1$ since for $x_n(t) = t^{1/n}$ one has $u(x_n) = n(n+1)^{-1}$.
However, if $u(x) = 1$ for some $x \in B_X$ we have $\int_0^1 (1 - x(t))\,dt = 0$, hence $x(t) = 1$ for all $t \in T$, contradicting $x(0) = 0$.
Exercise Show that for $u \in L(X,Y)$, $\|u\|$ is the least $c \in \mathbb{R}_+$ such that $\|u(x)\|_Y \le c\|x\|_X$ for all $x \in X$. Moreover, $\|u\| = \sup\{\|u(x)\|_Y : x \in S_X\} = \sup\{\|u(x)\|_Y / \|x\|_X : x \in X \setminus \{0\}\}$.
Exercise Show that if $X$, $Y$, $Z$ are normed spaces and if $u \in L(X,Y)$, $v \in L(Y,Z)$, then $\|v \circ u\| \le \|v\| \cdot \|u\|$.
Proposition 3.4 If $X$ and $Y$ are normed spaces and if $Y$ is complete, then $L(X,Y)$ is complete.
Proof Let $(u_n)$ be a Cauchy sequence in $L(X,Y)$. For each $x \in X$, since $\|u_n(x) - u_m(x)\| \le \|u_n - u_m\| \cdot \|x\|$, the sequence $(u_n(x))$ is a Cauchy sequence in $Y$, hence it converges to some $y_x \in Y$ which we denote by $u(x)$. The rules for limits in $Y$ show that $u$ is a linear map. For $x \in B_X$, passing to the limit on $n$ in the relation $\|u_n(x)\| \le \|u_n(x) - u_m(x)\| + \|u_m(x)\|$ we see that $u(B_X)$ is bounded. Another passage to the limit in the relation $\sup_{x \in B_X} \|u_n(x) - u_m(x)\| \le \|u_n - u_m\|$ shows that $(\|u - u_m\|)_m \to 0$, so that $L(X,Y)$ is complete. □
Corollary 3.3 If $X$ is a normed space, then its (topological) dual $X^* := L(X, \mathbb{R})$ is complete.
A norm $\|\cdot\|$ on the product $X \times Y$ of two normed spaces is a product norm if its associated metric is a product metric. This amounts to the following inequalities for all $(x,y) \in X \times Y$: $\|(x,y)\|_\infty := \max(\|x\|_X, \|y\|_Y) \le \|(x,y)\| \le \|(x,y)\|_1 := \|x\|_X + \|y\|_Y$.
Corollary 3.4 If $X$ is a normed space, there exists a complete normed space $\hat{X}$ and an isometry $j$ from $X$ onto a dense subspace of $\hat{X}$.
Proof It can be checked directly that the completion $\hat{X}$ of $X$ as described above as the set of equivalence classes of Cauchy sequences of $X$ can be given a linear structure and a norm inducing on $X$ the original norm. However, we prefer to invoke a result below ensuring that the canonical embedding $e_X : X \to X^{**} := (X^*)^*$ given by $\langle e_X(x), x^* \rangle := \langle x^*, x \rangle$ is an isometry onto its image $e_X(X)$. Then we can take for $\hat{X}$ the closure of $e_X(X)$ in $X^{**}$. □
Now let us turn to the continuity of multilinear maps.
Proposition 3.5 Let $X_1, \dots, X_m$, $Y$ be normed vector spaces and let $u : X_1 \times \dots \times X_m \to Y$ be an $m$-linear map, i.e. a map that is linear with respect to each of its $m$ variables. Then $u$ is continuous if and only if there exists some $c \in \mathbb{R}_+$ such that for all $(x_1, \dots, x_m) \in X := X_1 \times \dots \times X_m$ one has
$\|u(x_1, \dots, x_m)\| \le c\, \|x_1\| \cdots \|x_m\|$. (3.1)
Proof For the sake of simplicity, we give the proof for $m = 2$ only. The general case is similar, but is more laborious to write. Let us endow $X$ with the supremum norm. Condition (3.1) is clearly sufficient to prove the continuity of $u$ at $(0,0)$. The continuity of $u$ at an arbitrary $\bar{x} := (\bar{x}_1, \bar{x}_2) \in X$ ensues since for $(x_1, x_2) \in B[\bar{x}, \delta]$ with $\delta \in\ ]0,1]$ one has $\|x_1\| \le \|\bar{x}\| + 1$, hence $\|u(x_1, x_2) - u(\bar{x}_1, \bar{x}_2)\| = \|u(x_1, x_2 - \bar{x}_2) + u(x_1 - \bar{x}_1, \bar{x}_2)\| \le c\delta(2\|\bar{x}\| + 1)$. Now let us prove the necessity of (3.1). Suppose $u$ is continuous at $(0,0)$. Then, there exists some $r > 0$ such that for all $x := (x_1, x_2) \in B[0,r]$ one has $u(x_1, x_2) \in B[0,1]$.
Given $x := (x_1, x_2) \in X$, let $w := (w_1, w_2) \in B[0,r]$ be such that $r^{-1}\|x_i\| w_i = x_i$ for $i = 1, 2$ (if $x_i \ne 0$ we take $w_i := rx_i / \|x_i\|$ and if $x_i = 0$ we take $w_i := 0$). Then we have $u(x_1, x_2) = r^{-2}\|x_1\|\,\|x_2\|\, u(w_1, w_2)$, hence $\|u(x_1, x_2)\| \le r^{-2}\|x_1\|\,\|x_2\|$. □
We can turn the space $L_m(X_1, \dots, X_m; Y)$ of continuous $m$-linear maps from $X_1 \times \dots \times X_m$ into $Y$ into a normed space by setting $\|u\| := \sup\{\|u(x_1, \dots, x_m)\| : \sup(\|x_1\|, \dots, \|x_m\|) \le 1\}$ for $u \in L_m(X_1, \dots, X_m; Y)$. It can be shown that $\|u\|$ is the least constant $c$ such that (3.1) holds for all $(x_1, \dots, x_m) \in X := X_1 \times \dots \times X_m$. It can also be shown that the space $L_m(X_1, X_2, \dots, X_m; Y)$ is isometric to the space $L(X_1, L(X_2, \dots, L(X_m, Y) \dots))$. In particular it is complete if $Y$ is complete. We just state and prove the case $m = 2$; the general case follows by induction.
Proposition 3.6 Given normed spaces $X$, $Y$, $Z$, and $u \in L_2(X,Y;Z)$, for $x \in X$ let $u_x \in L(Y,Z)$ be given by $u_x(y) := u(x,y)$. Then $\tilde{u} : x \mapsto u_x$ is a linear continuous map from $X$ into $L(Y,Z)$ and the map $u \mapsto \tilde{u}$ is a linear isometry from $L_2(X,Y;Z)$ onto $L(X, L(Y,Z))$. In particular, $L_2(X,Y;\mathbb{R})$ is isometric to $L(X, Y^*)$.
Proof Since for all $(x,y) \in X \times Y$ we have $\|u_x(y)\| = \|u(x,y)\| \le \|u\|\,\|x\|\,\|y\|$, we see that $u_x$ is linear and continuous and that
$\sup_{x \in B_X} \|u_x\| = \sup_{x \in B_X} \sup_{y \in B_Y} \|u(x,y)\| = \sup_{(x,y) \in B_X \times B_Y} \|u(x,y)\| = \|u\|$.
This shows that the map $u \mapsto \tilde{u}$ (which is obviously linear) from $L_2(X,Y;Z)$ into $L(X, L(Y,Z))$ is an isometry onto its image. Let us prove this map is surjective. Given $v \in L(X, L(Y,Z))$, setting $u(x,y) := v(x)(y)$ we clearly define a bilinear map that is continuous since $\|u(x,y)\| \le \|v(x)\|\,\|y\| \le \|v\|\,\|x\|\,\|y\|$, and $\tilde{u} = v$. □
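The operator norm of Proposition 3.3 can be made concrete in finite dimensions. The following sketch assumes $X = Y = \mathbb{R}^d$ with the norm $\|\cdot\|_\infty$ and a linear map given by a matrix; it relies on the standard fact that the norm is then the largest absolute row sum. The names `mat_vec`, `operator_norm_sup` and `sampled_lower_bound` are hypothetical, and the random sampling is only a sanity check.

```python
import random

def sup_norm(x):
    return max(abs(t) for t in x)

def mat_vec(a, x):
    """Image of the vector x under the linear map with matrix a (list of rows)."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def operator_norm_sup(a):
    """Norm of x -> a x on (R^d, ||.||_inf): the largest absolute row sum."""
    return max(sum(abs(aij) for aij in row) for row in a)

def sampled_lower_bound(a, trials=2000, seed=0):
    """Largest value of ||a x||_inf found over random points x of the unit ball."""
    rng = random.Random(seed)
    d = len(a[0])
    best = 0.0
    for _ in range(trials):
        x = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        best = max(best, sup_norm(mat_vec(a, x)))
    return best

a = [[1.0, -2.0], [0.5, 0.25]]
c = operator_norm_sup(a)
x_star = [1.0, -1.0]  # sign pattern of the row with the largest absolute sum
print(c, sup_norm(mat_vec(a, x_star)), sampled_lower_bound(a))
```

Here the supremum is attained at `x_star`, in contrast with the infinite dimensional example given after Proposition 3.3, where it is not attained.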
Exercises
1. Let $f$ be a map from a normed space $X$ into another one $Y$ that is bounded on the unit ball of $X$ and additive, i.e. such that $f(x + x') = f(x) + f(x')$ for all $x, x' \in X$. Prove that $f$ is linear.
2. Let $h \ne 0$ be a continuous linear form on a normed space $X$, and let $H := h^{-1}(0)$. Show that for all $x \in X$ the distance $d(x,H)$ from $x$ to the hyperplane $H$ is given by $d(x,H) = |h(x)| / \|h\|$.
3. Let $X := c_0$ be the space of real sequences $(x_n)$ with limit 0 endowed with the norm given by $\|x\|_\infty := \sup_n |x_n|$ for $x := (x_n)$. Let $h \in X^*$ be given by $h(x) := \sum_n 2^{-n} x_n$ for $x := (x_n) \in X$. Compute $\|h\|$. Show that there is no $x \in X$ such that $\|x\| = 1$ and $h(x) = \|h\|$. Setting $H := h^{-1}(0)$, show that for every $x \in X \setminus H$ there is no $v \in H$ such that $d(x,H) = \|x - v\|$.
4. A function $q : X \to \mathbb{R}$ on a normed space is said to be quadratic if there exists a symmetric bilinear map $b : X \times X \to \mathbb{R}$ such that $q(x) = b(x,x)$ for all $x \in X$. Verify that such a bilinear map is unique. Prove that $b$ is continuous if and only if $q$ is continuous. [Hint: define $b$ by $b(x,y) := \frac{1}{2}[q(x+y) - q(x) - q(y)]$ for $x, y \in X$.]
5. Let $X$ (often denoted by $c_0$) be the space of sequences $x := (x_n)$ of real numbers such that $\lim_n x_n = 0$. Let $Y$ (often denoted by $\ell_1$) be the space of sequences $y := (y_n)$ of real numbers such that $\sum_n |y_n| < +\infty$. Given $y \in Y$, show that $f_y : x \mapsto \sum_n x_n y_n$ is a continuous linear form on $X$. Prove that $y \mapsto f_y$ is a linear isometry from $Y$ onto the dual $X^*$ of $X$.
6. Let $Z$ (often denoted by $\ell_\infty$) be the space of bounded sequences $z := (z_n)$ of real numbers. Given $z \in Z$, show that $g_z : y \mapsto \sum_n z_n y_n$ is a continuous linear form on $\ell_1$. Prove that $z \mapsto g_z$ is a linear isometry from $Z$ onto the dual $Y^*$ of $Y$. Verify that the canonical injection $j : X \to Z$ is compatible with the injection $X \to X^{**}$ via the preceding identifications.
7. Prove that on any infinite dimensional normed space $X$ there is a linear form $f$ that is not continuous. [Hint: by Zorn's Lemma there exists a maximal family $\{e_i : i \in I\}$ of linearly independent elements of the unit sphere $S_X$. It is an algebraic basis of $X$ and since $X$ is infinite dimensional, one can find a subset $N$ of $I$ and a bijection $j : N \to \mathbb{N}$. Setting $f(e_n) = j(n)$ for $n \in N$, $f(e_i) = 0$ for $i \in I \setminus N$ one gets an unbounded linear form.]
3.1.3 Finite Dimensional Normed Spaces
Finite dimensional normed spaces have interesting properties. We start with $\mathbb{R}^d$ endowed with the product topology and the norm $\|\cdot\|_\infty$.
Proposition 3.7 A subset $S$ of $\mathbb{R}^d$ is compact with respect to the induced topology if and only if it is closed and bounded.
Proof If $S$ is compact, by Proposition 2.14 it is closed in $\mathbb{R}^d$. It is bounded since the norm $\|\cdot\|_\infty$ on $\mathbb{R}^d$ is continuous. Conversely, if $S$ is bounded it is contained in some ball $B[0,r]$ with respect to the norm $\|\cdot\|_\infty$ which is a product of intervals $[-r,r]$, hence is compact. If moreover $S$ is closed, it is compact as a closed subset of a compact space. □
Proposition 3.8 All norms on a finite dimensional vector space $X$ are equivalent.
Proof Taking an isomorphism from $\mathbb{R}^d$ onto $X$, where $d$ is the dimension of $X$, we may suppose $X = \mathbb{R}^d$. Thus it suffices to prove that any norm $\|\cdot\|$ on $\mathbb{R}^d$ is equivalent to the norm $\|\cdot\|_\infty$. Setting $c := \|e_1\| + \dots + \|e_d\|$, where $(e_1, \dots, e_d)$ is the canonical basis of $\mathbb{R}^d$, for all $x = (x_1, \dots, x_d) \in \mathbb{R}^d$ we have $\|x\| \le |x_1|\,\|e_1\| + \dots + |x_d|\,\|e_d\| \le c\|x\|_\infty$.
Moreover, $\|\cdot\|$ is continuous and even Lipschitzian since for all $x, y \in \mathbb{R}^d$ we have $|\|x\| - \|y\|| \le \|x - y\| \le c\|x - y\|_\infty$. Since the unit sphere $S$ of $\mathbb{R}^d$ with respect to the norm $\|\cdot\|_\infty$ is compact, the function $\|\cdot\|$ attains its infimum $m$ on $S$. Since $0 \notin S$ this infimum cannot be 0. Thus $m$ is positive and for all $x \in S$ we have $\|x\| \ge m\|x\|_\infty$. By homogeneity this inequality is valid for all $x \in \mathbb{R}^d$ and complements the first inequality. □
Corollary 3.5 If $X$ is a finite dimensional vector subspace of a normed vector space $W$, then $X$ is closed in $W$.
Proof Taking a basis of $X$ we get an isometry of $X$ onto $\mathbb{R}^d$ equipped with a certain norm. Since this norm is equivalent to $\|\cdot\|_\infty$, $X$ is complete, hence closed in $W$. □
Theorem 3.1 (Riesz) A normed vector space is locally compact if and only if it is finite dimensional.
Proof If $X$ is a finite dimensional vector space, taking a basis of $X$ we get an isomorphism from $\mathbb{R}^d$ onto $X$. The norm of $X$ being transformed to a norm on $\mathbb{R}^d$ that is equivalent to the norm $\|\cdot\|_\infty$, we see that $X$ is locally compact. Conversely, suppose $X$ is locally compact. Taking an equivalent norm if necessary, we may assume the unit ball $B_X$ of $X$ is compact. Thus it can be covered by a finite number of balls $B(x_i, 1/2)$, $i \in \mathbb{N}_m$. Let $Y$ be the subspace generated by $x_1, \dots, x_m$. By the preceding corollary, $Y$ is closed in $X$. Let us show that assuming $Y \ne X$ leads to a contradiction. Taking $\varepsilon = 1/2$ in Proposition 3.1 we get some $x \in X$ satisfying $\|x\| = 1$ and $d(x,Y) \ge 1/2$. This contradicts the fact that there exists some $i \in \mathbb{N}_m$ such that $x \in B(x_i, 1/2)$, since then $d(x,Y) \le \|x - x_i\| < 1/2$. Thus $Y = X$ and $X$ is finite dimensional. □
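The role of the dimension in Theorem 3.1 can be glimpsed numerically. In $(\mathbb{R}^d, \|\cdot\|_\infty)$ the $d$ canonical basis vectors lie on the unit sphere at mutual distance 1, so an open ball of radius $1/2$ contains at most one of them and any cover of the unit ball by such balls needs at least $d$ balls. The sketch below, with hypothetical helper names, merely checks the mutual distances; it is an illustration, not a step of the proof.

```python
from itertools import combinations

def unit_vector(i, d):
    """i-th canonical basis vector of R^d."""
    return [1.0 if k == i else 0.0 for k in range(d)]

def sup_dist(x, y):
    """Distance associated with the sup norm."""
    return max(abs(s - t) for s, t in zip(x, y))

def min_pairwise_distance(d):
    """Smallest sup-norm distance between two distinct basis vectors of R^d."""
    vs = [unit_vector(i, d) for i in range(d)]
    return min(sup_dist(x, y) for x, y in combinations(vs, 2))

print([min_pairwise_distance(d) for d in (2, 5, 20)])
```

As $d$ grows, the number of balls of radius $1/2$ needed to cover the unit ball grows at least like $d$, which is consistent with the failure of compactness of the unit ball in infinite dimensions.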
Exercises
1. Show that if $X$ is finite dimensional, every linear map from $X$ into a normed vector space $Y$ is continuous. [Hint: use an isomorphism of $\mathbb{R}^d$ onto $X$.]
2. Show that if $X$ is a finite dimensional subspace of a normed vector space $W$ and if $Y$ is a closed subspace of $W$, then $X + Y$ is closed in $W$.
3. Show that if $X$ is a finite dimensional supplement of a vector subspace $Y$ of a normed space $W$, then $X$ is a topological supplement of $Y$ in the sense that $W$ is the topological direct sum of $X$ and $Y$, i.e. that the projections of $W$ onto $X$ and $Y$ are continuous.
4. Prove that a normed vector space whose unit sphere $S_X$ is compact is finite dimensional. [Hint: use the fact that the unit ball $B_X$ is the image of $[0,1] \times S_X$ under the map $(r,u) \mapsto ru$.]
5. Show that an infinite dimensional Banach space cannot have a countable algebraic basis $\{e_n : n \in \mathbb{N}\}$. [Hint: let $X_n$ be the linear subspace generated by $\{e_0, \dots, e_n\}$. Using Baire's Theorem show that $\mathrm{int}(X_k) \ne \emptyset$ for some $k \in \mathbb{N}$, hence $X_k = X$, a contradiction.]
3.1.4 Series and Summable Families
The study of summable families of real numbers we made can easily be extended to families of elements of normed spaces. We slightly change the notation of Sect. 1.2.
Definition 3.1 A family $(x_i)_{i \in I}$ of elements of a normed vector space $(X, \|\cdot\|)$ is said to be summable with sum $s$ if the family $(s_J)_{J \in \mathcal{J}}$ of finite sums $s_J := \sum_{j \in J} x_j$ converges to $s$. Here $\mathcal{J}$ denotes the directed set of finite subsets of $I$ with the order given by inclusion.
The next properties are immediate consequences of the definition.
Proposition 3.9 If $r \in \mathbb{R}$ and if $(x_i)_{i \in I}$, $(x'_i)_{i \in I}$ are summable families of $X$, then $(rx_i)_{i \in I}$ and $(x_i + x'_i)_{i \in I}$ are summable families.
Proposition 3.10 If $X$ and $Y$ are normed spaces, if $\ell : X \to Y$ is a continuous linear map and if $(x_i)_{i \in I}$ is a summable family of $X$, then $(\ell(x_i))_{i \in I}$ is a summable family of $Y$ with sum $\ell(s)$.
Proof This follows from the fact that the net $(\sum_{j \in J} \ell(x_j))_{J \in \mathcal{J}}$ converges to $\ell(s)$, where $s$ is the sum $\sum_{i \in I} x_i$, since $\sum_{j \in J} \ell(x_j) = \ell(s_J)$ with $s_J := \sum_{j \in J} x_j$ and since $(s_J)_{J \in \mathcal{J}}$ converges to $s$. □
Proposition 3.11 (Cauchy Summability Criterion) A family $(x_i)_{i \in I}$ of elements of a complete normed space $X$ is summable if and only if it satisfies the condition: for all $\varepsilon > 0$ there exists a finite subset $H_\varepsilon$ of $I$ such that for any finite subset $F$ of $I$ contained in $I \setminus H_\varepsilon$ one has $\|s_F\| \le \varepsilon$.
Proof The condition is sufficient since the net $(s_J)_{J \in \mathcal{J}}$ satisfies the Cauchy criterion whenever the family $(x_i)_{i \in I}$ satisfies the Cauchy summability criterion: given $\varepsilon > 0$, let $H_\varepsilon \in \mathcal{J}$ be such that $\|s_F\| \le \varepsilon$ for any $F \in \mathcal{J}$ contained in $I \setminus H_\varepsilon$; then, for $J, K \in \mathcal{J}$ containing $H_\varepsilon$, since $s_J - s_K = s_{J \setminus K} - s_{K \setminus J}$ with $J \setminus K$ and $K \setminus J$ contained in $I \setminus H_\varepsilon$, we have $\|s_J - s_K\| \le 2\varepsilon$. Conversely, let $(x_i)_{i \in I}$ be a summable family of $X$ with sum $s$. Given $\varepsilon > 0$ we can find $H \in \mathcal{J}$ such that for $J \in \mathcal{J}$ containing $H$ we have $\|s_J - s\| \le \varepsilon/2$. Then, for any $F \in \mathcal{J}$ contained in $I \setminus H$, setting $J := H \cup F$ we have $\|s_F\| = \|s_J - s_H\| \le \|s_J - s\| + \|s - s_H\| \le \varepsilon$. □
Corollary 3.6 A subfamily of a summable family of a Banach space $X$ is summable.
A simple means of showing that a family of elements of a Banach space is summable consists in reducing the question to the summability of a family of nonnegative real numbers. This relies on the next proposition and on the notion of an absolutely summable family: a family $(x_i)_{i \in I}$ is said to be absolutely summable if the family $(\|x_i\|)_{i \in I}$ is summable.
Proposition 3.12 Any absolutely summable family of a Banach space is summable.
Proof It suffices to show that an absolutely summable family $(x_i)_{i \in I}$ satisfies the Cauchy summability criterion. This follows from the Cauchy summability criterion for $(\|x_i\|)_{i \in I}$ and the fact that for any finite subset $F$ of $I$ one has $\|s_F\| \le \sum_{i \in F} \|x_i\|$. □
It can be shown that in any infinite dimensional Banach space there exist summable families that are not absolutely summable. That does not occur in finite dimensional spaces.
Proposition 3.13 In a finite dimensional normed space a family is summable if and only if it is absolutely summable.
Proof Since any finite dimensional normed space is isomorphic to $\mathbb{R}^d$ for some $d \ge 1$, it suffices to show that a summable family $(x_i)_{i \in I}$ of $\mathbb{R}^d$ is absolutely summable. Proposition 3.10 ensures that for all $k \in \mathbb{N}_d$ the family $(x_i^k)_{i \in I}$ of the $k$-components of $(x_i)_{i \in I}$ is summable since $x_i^k = p_k(x_i)$, where $p_k : \mathbb{R}^d \to \mathbb{R}$ is the $k$-th projection. Then $(x_i^k)_{i \in I}$ is absolutely summable, so that $(\|x_i\|_1)_{i \in I} = (|x_i^1| + \dots + |x_i^d|)_{i \in I}$ is summable. □
Let us turn to series in a normed space $(X, \|\cdot\|)$. A series in $X$ is a pair of sequences $(x_n)$, $(s_n)$ of $X$ such that $s_n = x_0 + \dots + x_n$ for all $n \in \mathbb{N}$; $x_n$ is called the $n$-th term and $s_n$ is called the $n$-th partial sum of the series. The pair $(x_n)$, $(s_n)$ is often denoted by $\sum_n x_n$, a formal notation. One says that the series $\sum_n x_n$ converges if the sequence $(s_n)$ converges. The limit $s$ of $(s_n)$ is then called the sum of the series. Then, the remainder $r_n := \sum_{p \ge n} x_p$ converges to 0. For series of nonnegative general terms, convergence can be simply characterized.
Lemma 3.2 Let $(x_n)_n$ be a sequence of nonnegative real numbers.
Then the following assertions are equivalent:
(a) the series with general term $x_n$ is convergent;
(b) the partial sums $s_n$ are bounded above;
(c) the family $(x_n)_{n \in \mathbb{N}}$ is summable.
Proof (a) $\Rightarrow$ (b) Since the sequence $(s_n)$ is nondecreasing, if it converges one has $s_n \le \lim_n s_n$. (b) $\Rightarrow$ (c) Let $c \ge s_n$ for all $n \in \mathbb{N}$. For any finite subset $J$ of $\mathbb{N}$ one can find $n$ such that $J \subset [0,n]$, so that $s_J \le c$ and the nondecreasing net $(s_J)_{J \in \mathcal{J}}$ is convergent. (c) $\Rightarrow$ (a) The implication is a general fact, valid in any normed space. In fact, if $(x_n)_{n \in \mathbb{N}}$ is summable with sum $s$, for every $\varepsilon > 0$ one can find a finite subset $H$ of $\mathbb{N}$ such that for $n \ge \sup H$ and $J := \{0, \dots, n\}$ one has $s_n = s_J$, hence $|s - s_n| = |s - s_J| < \varepsilon$: $(s_n) \to s$. □
The rules for convergence of sequences yield rules for convergence of series (sums, images) as above. The Cauchy criterion can be adapted as follows.
Proposition 3.14 A series with general term $x_n$ in a Banach space is convergent if and only if for all $\varepsilon > 0$ there exists a $k_\varepsilon \in \mathbb{N}$ such that for $p > n \ge k_\varepsilon$ one has $\left\| \sum_{k=n+1}^{p} x_k \right\| \le \varepsilon$.
Proof This statement is merely the Cauchy criterion for the sequence $(s_n)_n$. □
It follows that the general term of a convergent series tends to 0. It is well known that this property is far from being sufficient. Let us compare convergence of series with summability. We start with a simple observation.
Lemma 3.3 Let $(x_n)_{n \ge 0}$ be a sequence in a normed space $X$. If the family $(x_n)_{n \in \mathbb{N}}$ satisfies the Cauchy summability criterion and if the series with general term $x_n$ is convergent with sum $s$, then the family $(x_n)_{n \in \mathbb{N}}$ is summable and its sum is $s$.
Proof Given $\varepsilon > 0$ we can find a finite subset $H$ of $\mathbb{N}$ such that $\|s_F\| < \varepsilon/2$ for any finite subset $F$ of $\mathbb{N} \setminus H$ and we can find $k \in \mathbb{N}$ such that $\|s_n - s\| < \varepsilon/2$ for all $n \ge k$. Without loss of generality we may suppose $k \ge \max H$. Then, for $K := \{0, 1, \dots, k\}$ we have $s_k = s_K$ and for any finite subset $J$ of $\mathbb{N}$ containing $K$, setting $F := J \setminus K$ we have $\|s_J - s\| = \|s_K + s_F - s\| \le \|s_K - s\| + \|s_F\| < \varepsilon$. Thus the family $(x_n)$ is summable with sum $s$. □
Theorem 3.2 A series with general term $x_n$ in a normed vector space is commutatively convergent (in the sense that for any permutation $\sigma$ of $\mathbb{N}$ the series with general term $x_{\sigma(n)}$ is convergent) if and only if the family $(x_n)_{n \in \mathbb{N}}$ is summable.
Proof Suppose the family $(x_n)_{n \in \mathbb{N}}$ is summable, with sum $s$. Let $\sigma$ be a permutation of $\mathbb{N}$ and let $y_n := x_{\sigma(n)}$, $t_n := y_0 + \dots + y_n$, $s_J := \sum_{j \in J} x_j$ for $J$ a finite subset of $\mathbb{N}$. For any $\varepsilon > 0$ there exists some finite subset $H$ of $\mathbb{N}$ such that for any finite subset $J$ of $\mathbb{N}$ containing $H$ one has $\|s_J - s\| \le \varepsilon$. Since $H$ is finite there exists a $k \in \mathbb{N}$ such that $H \subset \sigma(\{0, \dots, k\})$. Then, for $n \ge k$ one has $J := \sigma(\{0, \dots, n\}) \supset H$, hence $\|t_n - s\| = \|s_J - s\| \le \varepsilon$: the series $\sum_n y_n$ converges to $s$.
Conversely, suppose the series with general term $x_n$ is commutatively convergent. The lemma asserts that if the family $(x_n)_{n \in \mathbb{N}}$ satisfies the Cauchy summability criterion, then it is summable. It remains to show that if the Cauchy summability criterion is not satisfied we get a contradiction. In such a case there exists an $\varepsilon > 0$ such that for any $H \in \mathcal{J}$ one can find $F(H) \in \mathcal{J}$ contained in $\mathbb{N} \setminus H$ such that $\|s_{F(H)}\| > \varepsilon$. Taking $H_0 := \{0\}$ and inductively setting $H_{n+1} := H_n \cup F(H_n)$, we get a sequence $(F(H_n))$ of disjoint finite subsets of $\mathbb{N}$ such that $\|s_{F(H_n)}\| > \varepsilon$. Keeping the order on $F(H_n)$ and ranking in a consecutive order the sets $F(H_n)$, we get a strictly increasing sequence $(k(n))_n$ and a bijective map $\sigma : \mathbb{N} \to \mathbb{N}$ such that $F(H_n) = \{\sigma(j) : k(n) \le j < k(n+1)\}$. The series with general term $y_n := x_{\sigma(n)}$ cannot be convergent since for all $n \in \mathbb{N}$ we have $\left\| \sum_{j=k(n)}^{k(n+1)-1} y_j \right\| = \|s_{F(H_n)}\| > \varepsilon$. This is the required contradiction. □
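A classical numerical illustration of Theorem 3.2 is the alternating harmonic series $\sum_n (-1)^n/(n+1)$, which converges to $\ln 2$ but is not summable as a family. The sketch below compares the natural order with the well-known rearrangement taking two positive terms followed by one negative term, whose sum is $\frac{3}{2}\ln 2$. The function names are hypothetical and the finite partial sums only approximate the limits.

```python
import math

def natural_order_sum(n_terms):
    """Partial sum of sum_n (-1)^n / (n + 1) in the natural order."""
    return sum((-1) ** n / (n + 1) for n in range(n_terms))

def rearranged_sum(n_blocks):
    """Partial sum of the rearrangement 1 + 1/3 - 1/2 + 1/5 + 1/7 - 1/4 + ..."""
    total = 0.0
    for k in range(n_blocks):
        total += 1.0 / (4 * k + 1) + 1.0 / (4 * k + 3)  # two terms with odd denominators
        total -= 1.0 / (2 * k + 2)                       # one term with an even denominator
    return total

print(natural_order_sum(100000), math.log(2))
print(rearranged_sum(100000), 1.5 * math.log(2))
```

Both orders use every term exactly once, yet the limits differ, so the series is convergent without being commutatively convergent.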
A series with general term $x_n$ in a normed space is said to be absolutely convergent if the series with general term $\|x_n\|$ is convergent. Lemma 3.2 shows that the series $\sum x_n$ is absolutely convergent if and only if the family $(x_n)_{n \in \mathbb{N}}$ is absolutely summable. Combining this observation with Proposition 3.12 and the preceding theorem we get the next statement.
Corollary 3.7 In a Banach space, any absolutely convergent series is commutatively convergent.
Adding Proposition 3.13 to the preceding implications, we get nice equivalences.
Proposition 3.15 For a sequence $(x_n)$ in a finite dimensional space the following assertions are equivalent:
(a) the series $\sum x_n$ is commutatively convergent;
(b) the series $\sum x_n$ is absolutely convergent;
(c) the family $(x_n)_{n \in \mathbb{N}}$ is absolutely summable;
(d) the family $(x_n)_{n \in \mathbb{N}}$ is summable.
When $X$ is a space of functions from a set $T$ to a Banach space $E$, besides pointwise convergence, one may consider uniform convergence. Taking for $X$ the space $B(T,E)$ of bounded functions from $T$ to $E$ with the supremum norm $\|\cdot\|_\infty$, it may be convenient to find a convergent series $\sum_n r_n$ such that $\|x_n(t)\| \le r_n$ for all $t \in T$. Then one says that the series $\sum_n x_n$ is normally convergent; then it is absolutely convergent in $X := B(T,E)$.
Let us provide some means for studying series that are convergent but not necessarily absolutely convergent. The Abel transformation is such a means. It presents some similarity with integration by parts.
Lemma 3.4 Let $(x_n)$ be a sequence in a normed space $X$ such that $c_n := \sup_{k \ge 0} \|x_n + x_{n+1} + \dots + x_{n+k}\| < \infty$ for all $n \in \mathbb{N}$ and let $(r_n)$ be a nonincreasing sequence of nonnegative numbers. Then for all $n, k \in \mathbb{N}$ one has
$\|r_n x_n + r_{n+1} x_{n+1} + \dots + r_{n+k} x_{n+k}\| \le r_n c_n$. (3.2)
Proof For $n, k \in \mathbb{N}$ let $y_{n,k} := x_n + x_{n+1} + \dots + x_{n+k}$, so that $y_{n,0} = x_n$. Since $x_{n+j} = y_{n,j} - y_{n,j-1}$ for $j = 1, \dots, k$, we have
$$\sum_{j=0}^{k} r_{n+j} x_{n+j} = r_n y_{n,0} + \sum_{j=1}^{k} r_{n+j} (y_{n,j} - y_{n,j-1}) = (r_n - r_{n+1}) y_{n,0} + \dots + (r_{n+k-1} - r_{n+k}) y_{n,k-1} + r_{n+k} y_{n,k},$$
$$\left\| \sum_{j=0}^{k} r_{n+j} x_{n+j} \right\| \le c_n \big( (r_n - r_{n+1}) + \dots + (r_{n+k-1} - r_{n+k}) + r_{n+k} \big) = c_n r_n.$$
□
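Estimate (3.2) is easy to test numerically for real sequences. The sketch below uses a hypothetical helper `abel_bound_holds`; since $c_n$ is a supremum over all $k$, the code replaces it by the maximum over the finite range $k = 0, \dots, k_{\max}$, which is enough because the proof of the lemma for a given $k$ only involves $y_{n,0}, \dots, y_{n,k}$. The sample choice $x_n = (-1)^n$, $r_n = 1/(n+1)$ is an assumption made for illustration.

```python
def abel_bound_holds(x, r, n, k_max, tol=1e-12):
    """Check |r_n x_n + ... + r_{n+k} x_{n+k}| <= r_n * c_n for k = 0..k_max.

    x and r are functions of the index; c_n is approximated by the maximum of
    |x_n + ... + x_{n+k}| over the same finite range of k.
    """
    partial, weighted = 0.0, 0.0
    c_n = 0.0
    weighted_sums = []
    for k in range(k_max + 1):
        partial += x(n + k)                 # y_{n,k}
        weighted += r(n + k) * x(n + k)     # left-hand side of (3.2), before the norm
        c_n = max(c_n, abs(partial))
        weighted_sums.append(abs(weighted))
    return all(w <= r(n) * c_n + tol for w in weighted_sums)

alternating = lambda n: (-1.0) ** n
weights = lambda n: 1.0 / (n + 1)
print(abel_bound_holds(alternating, weights, n=3, k_max=50))
```

With these choices $c_n = 1$, so the check amounts to the familiar bound of the tail of an alternating series by its first term.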
It follows from this estimate that when $(c_n r_n) \to 0$ the series with general term $r_n x_n$ satisfies the Cauchy criterion.
Proposition 3.16 (Abel) Let $(x_n)$ be a sequence in a complete normed space $X$ and let $(r_n)$ be a nonincreasing sequence of nonnegative real numbers. If either the series $\sum_n x_n$ converges or $(r_n) \to 0$ and the sequence $(s_n)$ of partial sums $s_n := x_0 + \dots + x_n$ is bounded in $X$, then the series $\sum r_n x_n$ converges.
Proof Since $X$ is complete, in view of the preceding lemma it suffices to show that $(c_n r_n) \to 0$ in each of these two cases. In the first case, the convergence of $\sum_n x_n$ implies that $(c_n) \to 0$. Since $r_n \le r_0$ we have $c_n r_n \le c_n r_0$ and we get $(c_n r_n) \to 0$. In the second case we have $y_{n,k} := x_n + \dots + x_{n+k} = s_{n+k} - s_{n-1}$, so that $c_n := \sup_{k \ge 0} \|y_{n,k}\|$ is bounded above by some $c$. Then $c_n r_n \le c r_n$ and $(c_n r_n) \to 0$. □
Example An alternating series is a series whose general term is $(-1)^n r_n$ where $(r_n)$ is a nonincreasing sequence of nonnegative numbers. Since the partial sums $s_n$ of the series $\sum_n (-1)^n$ are either 1 or 0, if $(r_n) \to 0$ then the series $\sum_n (-1)^n r_n$ converges.
Example The case $x_n = z^n$ with $z \in \mathbb{C}$, $|z| = 1$, $z \ne 1$ also pertains to the second case of the proposition. Since $s_n = (1 - z^{n+1})(1 - z)^{-1}$ one has $|s_n| \le 2|1 - z|^{-1}$, so that if $(r_n) \to 0$ and $(r_n)$ is nonincreasing, the series $\sum_n r_n z^n$ is convergent. Setting $z := e^{it}$ we see that for $t \notin 2\pi\mathbb{Z}$ the series with general terms $r_n \cos nt$ and $r_n \sin nt$ are convergent.
As an application of series in Banach spaces, let us consider the question of invertibility of linear maps. A similar study could be made in Banach algebras (i.e. Banach spaces endowed with a continuous bilinear and associative product). When $X$ and $Y$ are finite dimensional normed spaces and $f$ is a linear isomorphism, we know that any linear map $g$ that is close enough to $f$ with respect to some norm on the space $L(X,Y)$ of linear continuous maps from $X$ into $Y$ is still an isomorphism: taking bases in $X$ and $Y$ we see that if $g$ is close enough to $f$ its determinant will remain different from 0.
A similar result holds in infinite dimensional Banach spaces.
Proposition 3.17 Let $f$ be a linear isomorphism between two Banach spaces $X$ and $Y$. Then any $g \in L(X,Y)$ such that $\|f - g\| < \|f^{-1}\|^{-1}$ is an isomorphism. Thus, the set of linear continuous maps that are isomorphisms is open in the space $L(X,Y)$.
Proof Let us first consider the case $X = Y$, $f = I_X$. Let $u := I_X - g$, so that $u \in L(X,X)$ satisfies $\|u\| < 1$. Since the map $(v,w) \mapsto w \circ v$ is continuous, since
$$I_X - u^{n+1} = (I_X - u) \circ \left( \sum_{k=0}^{n} u^k \right) = \left( \sum_{k=0}^{n} u^k \right) \circ (I_X - u),$$
and since the series $\sum_{k=0}^{\infty} u^k$ is absolutely convergent (as $\|u^k\| \le \|u\|^k$), we get that its sum is a right and left inverse of $I_X - u$. Thus $I_X - u$ is invertible.
The general case can be deduced from this special case. Given $g \in L(X,Y)$ such that $\|f - g\| < r := \|f^{-1}\|^{-1}$, setting $u := I_X - f^{-1} \circ g$, we observe that $\|u\| \le \|f^{-1} \circ (f - g)\| \le \|f^{-1}\|\,\|f - g\| < 1$. Therefore, by what precedes, $f^{-1} \circ g = I_X - u$ is invertible. It follows that $g$ is invertible, with inverse $(I_X - u)^{-1} \circ f^{-1}$. □
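The first step of the proof can be tried out on matrices. The sketch below, a minimal illustration under the assumption that $X = \mathbb{R}^2$ with the sup norm, approximates $(I_X - u)^{-1}$ by a partial sum of the series $\sum_k u^k$ and checks that the product with $I_X - u$ is close to the identity. The helper names and the sample matrix are hypothetical.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_add(a, b):
    return [[s + t for s, t in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_inverse(u, terms=60):
    """Approximate (I - u)^{-1} by the partial sum I + u + ... + u^terms."""
    n = len(u)
    total = identity(n)
    power = identity(n)
    for _ in range(terms):
        power = mat_mul(power, u)
        total = mat_add(total, power)
    return total

u = [[0.2, 0.1], [0.0, 0.3]]            # largest absolute row sum 0.3 < 1
i_minus_u = [[0.8, -0.1], [0.0, 0.7]]
product = mat_mul(i_minus_u, neumann_inverse(u))
print(product)
```

The truncation error is of the order of $\|u\|^{61}$ here, so the printed product agrees with the identity matrix up to rounding.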
Exercises
1. Prove that if $(x_i)_{i \in I}$ is a summable family of a Banach space $X$ the set $D := \{i \in I : x_i \ne 0\}$ is countable.
2. Recall that a (vector) basis of a vector space $E$ is a family $(e_i)_{i \in I}$ of elements of $E$ such that any element $x$ of $E$ can be written in a unique way as a linear combination of a finite subfamily of $(e_i)_{i \in I}$. Prove that an infinite dimensional Banach space cannot have a countable basis. [Hint: given a sequence $(e_n)_n$ of linearly independent vectors of norm 1, let $X_n$ be the space generated by $\{e_0, \dots, e_n\}$; define inductively a sequence $(r_n)$ of nonnegative numbers by $r_0 := 0$, $r_1 := 0$, $r_{n+1} = (1/3)\, d(r_n e_n, X_{n-1})$ for $n \ge 1$ and show that the series $\sum_n r_n e_n$ is absolutely convergent but that its sum does not belong to any of the subspaces $X_n$.]
3. Let $T := [0,1]$, $X := C(T)$, let $f \in X$ be given by $f(0) = 0$, $f(t) := t \sin^2(\pi/t)$ for $t \in T \setminus \{0\}$ and let $f_n \in X$ be defined by $f_n := f \cdot 1_{[1/(n+1),\, 1/n]}$. Show that the family $(f_n)$ is summable but not absolutely summable.
4. In the space $X := \ell_\infty$ of bounded sequences of real numbers provided with the norm given by $\|x\| := \sup_n |x_n|$ for $x := (x_n)$, let $a_n$ be the element of $X$ whose components are all 0 except for the $n$th one which is $1/(n+1)$. Show that the family $(a_n)$ is not absolutely summable but is summable.
5. Let $X$, $Y$, $Z$ be normed spaces and let $b : X \times Y \to Z$ be a continuous bilinear map. If $(x_i)_{i \in I}$ (resp. $(y_j)_{j \in J}$) is an absolutely summable family of $X$ (resp. $Y$), prove that $(b(x_i, y_j))_{(i,j) \in I \times J}$ is absolutely summable with sum $b(\sum_i x_i, \sum_j y_j)$ when $Z$ is complete.
6. Let $(k(n))_n$ be a strictly increasing sequence of integers with $k(0) = 0$ and let $\sum_n x_n$ be a convergent series in a normed space $X$. For $n \in \mathbb{N}$ let $y_n := x_{k(n)} + \dots + x_{k(n+1)-1}$. Show that the series $\sum_n y_n$ converges to the sum $s$ of the series $\sum_n x_n$.
7. Let $X$ and $Y$ be Banach spaces and let $S \subset L(X,Y)$ be the set of $A \in L(X,Y)$ such that there exists a $B \in L(Y,X)$ (called a left inverse of $A$) satisfying $B \circ A = I_X$. Show that $S$ is open in $L(X,Y)$. [Hint: use Proposition 3.17.]
3.1.5 Spaces of Continuous Functions The question arises of characterizing subsets of a space of functions that are compact or complete. We already considered the question of completeness. Let us give a criterion for compactness. It uses the following definition.
Definition 3.2 Given a topological space $X$ and a metric space $Y$, a set $F$ of maps from $X$ into $Y$ is said to be equicontinuous at $x \in X$ if for every $\varepsilon > 0$ one can find some $V \in \mathcal{N}(x)$ such that for all $f \in F$ and all $v \in V$ one has $d(f(v), f(x)) \le \varepsilon$. The set $F$ is said to be equicontinuous if it is equicontinuous at all $x \in X$.
Clearly, any finite set of continuous maps is equicontinuous. If $X$ is a metric space and if there exist a neighborhood $U$ of $x$, $c > 0$ and $\alpha > 0$ such that $d(f(u), f(x)) \le c\,d(u,x)^\alpha$ for all $u \in U$ and $f \in F$, then $F$ is equicontinuous at $x$. In particular, if the functions in $F$ are Lipschitzian with the same rate $c$, then $F$ is equicontinuous.
Theorem 3.3 (Ascoli–Arzela) Let $X$ be a compact topological space, let $Y$ be a complete metric space, and let $F$ be a subset of the space $C(X,Y)$ of continuous maps from $X$ into $Y$. Then, endowing $C(X,Y)$ with the metric $d_\infty$ of uniform convergence, $F$ is relatively compact in $C(X,Y)$ if and only if $F$ is equicontinuous and for all $x \in X$ the set $F(x) := \{f(x) : f \in F\}$ is relatively compact in $Y$.
Proof If the closure $\overline{F}$ of $F$ in $C(X,Y)$ is compact, since for every $x \in X$ the evaluation $e_x : f \mapsto f(x)$ is continuous, $F(x)$ is contained in the compact set $e_x(\overline{F})$, hence is relatively compact in $Y$. Since $F$ is precompact by Theorem 2.15, given $\varepsilon > 0$ one can find some finite subset $\{f_i : i \in \mathbb{N}_m\}$ of $F$ such that the balls $B(f_i, \varepsilon/3)$ ($i \in \mathbb{N}_m$) cover $F$. Let $x \in X$ and let $V \in \mathcal{N}(x)$ be such that $d(f_i(x), f_i(v)) \le \varepsilon/3$ for all $i \in \mathbb{N}_m$ and all $v \in V$. Then, for all $f \in F$ and all $v \in V$, picking $i \in \mathbb{N}_m$ such that $f \in B(f_i, \varepsilon/3)$, we have
$$d(f(x), f(v)) \le d(f(x), f_i(x)) + d(f_i(x), f_i(v)) + d(f_i(v), f(v)) \le \varepsilon:$$
$F$ is equicontinuous at $x$.
For the converse, since $(C(X,Y), d_\infty)$ is complete, it suffices to prove that $F$ is precompact when $F$ is equicontinuous and when $F(x)$ is relatively compact for all $x \in X$. Given $\varepsilon > 0$, by equicontinuity, for all $x \in X$ we can find some $V_x \in \mathcal{N}(x)$ such that $d(f(v), f(x)) \le \varepsilon/4$ for all $v \in V_x$ and all $f \in F$. Since $X$ is compact, there is a finite subset $A$ of $X$ such that $\{V_a : a \in A\}$ is a cover of $X$. Since $F(a)$ is relatively compact in $Y$ for all $a \in A$, so is $F(A) := \bigcup_{a\in A} F(a)$. Let $b_1, \dots, b_m$ be points of $F(A)$ such that $\{B(b_i, \varepsilon/4) : i \in \mathbb{N}_m\}$ is a cover of $F(A)$. Let $K$ be the finite set of maps $k$ from $A$ into $\mathbb{N}_m$. For all $f \in F$ there exists some $k \in K$ such that $f(a) \in B(b_{k(a)}, \varepsilon/4)$ for all $a \in A$. Thus $F$ is covered by the sets
$$F_k := \{g \in F : \forall a \in A,\ g(a) \in B(b_{k(a)}, \varepsilon/4)\}, \qquad k \in K.$$
Since for $g, h \in F_k$ and $x \in X$ one can find some $a \in A$ such that $x \in V_a$, one has
$$d(g(x), h(x)) \le d(g(x), g(a)) + d(g(a), b_{k(a)}) + d(b_{k(a)}, h(a)) + d(h(a), h(x)) \le \varepsilon;$$
the diameters of the sets $F_k$ are at most $\varepsilon$ and $F$ is precompact. $\square$
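The role of equicontinuity in the preceding theorem can be illustrated numerically. The sketch below is an assumption-laden experiment, not part of the proof: it takes $X = [0,1]$, $Y = \mathbb{R}$, and finite truncations of the two families $t \mapsto t^n$ and $t \mapsto \sin(nt)/n$. Any finite family is equicontinuous, so the computation only suggests how the full families behave. The helper name `oscillation_at` is hypothetical.

```python
import math

def oscillation_at(family, x, delta, samples=200):
    """Largest |f(v) - f(x)| over f in family and sampled v in [x - delta, x], clipped at 0."""
    worst = 0.0
    for f in family:
        fx = f(x)
        for j in range(samples + 1):
            v = max(0.0, x - delta * j / samples)
            worst = max(worst, abs(f(v) - fx))
    return worst

# Truncations of two families on [0, 1]; the default argument n=n freezes the index.
powers = [lambda t, n=n: t ** n for n in range(1, 400)]
scaled_sines = [lambda t, n=n: math.sin(n * t) / n for n in range(1, 400)]

# Near x = 1 the powers oscillate by almost 1 however small delta is (once n is large),
# whereas the scaled sines share the Lipschitz rate 1 and oscillate by at most delta.
power_oscillation = oscillation_at(powers, 1.0, 0.01)
sine_oscillation = oscillation_at(scaled_sines, 1.0, 0.01)
```

The second family satisfies the Lipschitz criterion stated after Definition 3.2 with $c = 1$, while the first one fails to be equicontinuous at $x = 1$ as $n$ ranges over all of $\mathbb{N}$.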
Under a compactness assumption and a monotonicity assumption one can pass from pointwise convergence to uniform convergence.
Theorem 3.4 (Dini) Let $X$ be a compact topological space and let $(f_n)$ be a sequence of continuous real-valued functions that pointwise converges to some continuous function $f$. If $(f_n)$ is increasing (in the sense that $f_n \le f_{n+1}$ for all $n \in \mathbb{N}$) then $(f_n) \to f$ uniformly.
Proof Given $\varepsilon > 0$, for $n \in \mathbb{N}$ let $U_n := \{x \in X : f(x) - f_n(x) < \varepsilon\}$. Since $f$ and $f_n$ are continuous, $U_n$ is open (we note that it would suffice to suppose $f$ is upper semicontinuous and $f_n$ is lower semicontinuous). For all $x \in X$, since $(f_n(x)) \to f(x)$, we have $x \in U_n$ for $n$ large enough. Since $X$ is compact we can select a finite subcovering $\{U_k : k \in \mathbb{N}_m\}$ from the covering $\{U_n : n \in \mathbb{N}\}$. The sequence $(f_n)$ being increasing, we have $U_k \subset U_m$ for $k \in \mathbb{N}_m$. Thus $U_m = X$ and for $n \ge m$ and all $x \in X$ we have $|f(x) - f_n(x)| = f(x) - f_n(x) < \varepsilon$. $\square$
It is often useful to approximate a continuous function by simple functions. A prototype of such a process is the following result in which $r \mapsto \sqrt{r}$ could be replaced by any continuous function, as we shall show later.
Lemma 3.5 (Weierstrass) There exists a sequence $(p_n)$ of polynomials which converges uniformly on $[0,1]$ to $r \mapsto \sqrt{r}$ and is increasing on this interval.
Proof We define $p_n$ by induction, setting $p_0 = 0$ and
$$p_{n+1}(r) := p_n(r) + \tfrac{1}{2}\bigl(r - p_n^2(r)\bigr), \qquad n \in \mathbb{N},\ r \in \mathbb{R}. \tag{3.3}$$
We show by induction that $p_n(r) \le \sqrt{r}$ for $r \in [0,1]$ thanks to the relation
$$\sqrt{r} - p_{n+1}(r) = \bigl(\sqrt{r} - p_n(r)\bigr)\bigl[1 - \tfrac{1}{2}\bigl(\sqrt{r} + p_n(r)\bigr)\bigr] \ge 0,$$
since $\tfrac{1}{2}(\sqrt{r} + p_n(r)) \le \sqrt{r} \le 1$. It follows that $p_{n+1}(r) \ge p_n(r)$ for $n \in \mathbb{N}$ and $r \in [0,1]$. The increasing sequence $(p_n(r))$ of $[0,1]$ converges to some $q(r) \in [0,1]$. Passing to the limit in relation (3.3) we get $r - q^2(r) = 0$, hence $q(r) = \sqrt{r}$. Since $[0,1]$ is compact and $(p_n)$ is increasing on $[0,1]$, Dini's Theorem ensures that the convergence is uniform. $\square$
In the sequel $S$ is a compact topological space and we say that a subset $A$ of the space $C(S)$ of continuous real-valued functions on $S$ separates the points of $S$ if for any pair $(x,y)$ of distinct points of $S$ there exists some $f \in A$ such that $f(x) \ne f(y)$. Let us consider a subalgebra $A$ of $C(S)$, i.e. a vector subspace of $C(S)$ such that $fg \in A$ whenever $f, g \in A$.
Lemma 3.6 If $A$ is a subalgebra of $C(S)$, for any $f \in A$ one has $|f| \in \overline{A} := \mathrm{cl}(A)$. Moreover, $\overline{A}$ is a sublattice of $C(S)$: for all $f, g \in \overline{A}$ one has $f \wedge g,\ f \vee g \in \overline{A}$.
Proof Let $f \in A$. We may suppose $r := \|f\|_\infty > 0$. Then, setting $f_n(s) := p_n(f^2(s)/r^2)$, where $(p_n)$ is as in the preceding lemma, we get that $f_n \in A$ as $A$ is an algebra and $p_n(0) = 0$, and $(f_n) \to |f|/r$ for $\|\cdot\|_\infty$. Thus $|f|/r \in \overline{A}$. Since $\overline{A}$ is a subalgebra of $C(S)$, as is easily seen, for all $f \in \overline{A}$ we have $|f| \in \overline{A}$.
For all $f, g \in \overline{A}$ one has $f \vee g = \tfrac{1}{2}(f + g + |f - g|) \in \overline{A}$ and $f \wedge g = \tfrac{1}{2}(f + g - |f - g|) \in \overline{A}$. $\square$
Lemma 3.7 If $A$ is a subalgebra of $C(S)$ that separates the points of $S$ and contains the constant functions, then for every pair $x$, $y$ of distinct points of $S$ and any pair $r$, $s$ of real numbers there is some $f \in A$ such that $f(x) = r$ and $f(y) = s$.
Proof Given $x \ne y$ in $S$, by assumption there is some $g \in A$ such that $g(x) \ne g(y)$. Setting $t := (r - s)/(g(x) - g(y))$, since $A$ contains the constant functions, we get that $f := r + t(g - g(x)) \in A$, $f(x) = r$, and $f(y) = s$. $\square$
Lemma 3.8 If $A$ is a subalgebra of $C(S)$ that separates the points of $S$ and contains the constant functions, then for all $f \in C(S)$, $\varepsilon > 0$ and all $x \in S$ there is some $g \in \overline{A}$ such that $g(x) = f(x)$ and $g \le f + \varepsilon$.
Proof Let $f \in C(S)$, $\varepsilon > 0$ and $x \in S$ be given. For all $y \in S\setminus\{x\}$, by the preceding lemma, there exists some $g_y \in A$ such that $g_y(x) = f(x)$ and $g_y(y) < f(y) + \varepsilon$; for $y = x$ we take for $g_x$ the constant function with value $f(x)$. Let $V_y$ be an open neighborhood of $y$ such that $g_y(v) < f(v) + \varepsilon$ for all $v \in V_y$. Since $S$ is compact, there exists a finite subset $Y$ of $S$ such that $\{V_y : y \in Y\}$ is a covering of $S$. Then, by Lemma 3.6, $g := \inf_{y\in Y} g_y \in \overline{A}$, $g(x) = f(x)$ and for all $w \in S$ we can find $y \in Y$ such that $w \in V_y$, so that $g(w) \le g_y(w) < f(w) + \varepsilon$. $\square$
Theorem 3.5 (Stone-Weierstrass) Let $S$ be a compact topological space and let $A$ be a subalgebra of $C(S)$ that separates the points of $S$ and contains the constant functions. Then $A$ is dense in $(C(S), \|\cdot\|_\infty)$.
Proof Let $f \in C(S)$ and let $\varepsilon > 0$ be given. By the preceding lemma, for all $x \in S$ there exists some $g_x \in \overline{A}$ such that $g_x(x) = f(x)$ and $g_x \le f + \varepsilon$. Since $f$ and $g_x$ are continuous, there exists an open neighborhood $U_x$ of $x$ such that $g_x(u) \ge f(u) - \varepsilon$ for all $u \in U_x$. Let $X$ be a finite subset of $S$ such that $\{U_x : x \in X\}$ covers $S$. Then $h := \sup_{x\in X} g_x \in \overline{A}$ by Lemma 3.6 and satisfies $h \le f + \varepsilon$, $h \ge f - \varepsilon$ since every $z \in S$ belongs to some $U_x$ with $x \in X$. Thus $\|h - f\|_\infty \le \varepsilon$ and $f \in \mathrm{cl}(\overline{A}) = \overline{A}$. $\square$
The corresponding conclusion with $C(S)$ replaced by the space $C(S,\mathbb{C})$ of complex-valued continuous functions on $S$ is not true. However, one can get a similar conclusion by adding an assumption.
Corollary 3.8 Let $S$ be a compact topological space and let $A$ be a subalgebra of $C(S,\mathbb{C})$ that separates the points of $S$, contains the constant functions and is such that the conjugate $\overline{f}$ belongs to $A$ for all $f \in A$. Then $A$ is dense in $(C(S,\mathbb{C}), \|\cdot\|_\infty)$.
Proof Let $B := \{f \in A : \overline{f} = f\}$. Then $B$ is a real subalgebra of $C(S)$ that contains the constant functions. Moreover, since for all $f \in A$ we have $\tfrac{1}{2}(f + \overline{f}) \in B$ and $\tfrac{1}{2i}(f - \overline{f}) \in B$, $B$ separates the points of $S$. Thus $B$ is dense in $C(S)$, hence $A = B + iB$ is dense in $C(S,\mathbb{C}) = C(S) + iC(S)$. $\square$
Since the polynomial functions on $\mathbb{R}^d$ separate the points and form an algebra, we get the following announced consequence.
Corollary 3.9 For any compact subset $S$ of $\mathbb{R}^d$ and any continuous function $f$ on $S$ there exists a sequence $(p_n)$ of polynomials on $\mathbb{R}^d$ that converges uniformly to $f$ on $S$.
Corollary 3.10 If $S$ is a compact metric space, the spaces $C(S)$ and $C(S,\mathbb{C})$ are separable.
Proof Since $C(S,\mathbb{C})$ is the topological direct sum of $C(S)$ and $iC(S)$, it suffices to prove the result for $C(S)$. Now $S$ is separable by Corollary 2.16, hence there exists a countable base $\{G_n : n \in \mathbb{N}\}$ of the topology of $S$. Set $g_n := d(\cdot, S\setminus G_n)$. Given a pair $x$, $y$ of distinct points of $S$ there exists some $n \in \mathbb{N}$ such that $x \in G_n$ and $y \in S\setminus G_n$, so that $g_n(x) > 0$ and $g_n(y) = 0$. Thus the algebra $A$ generated by $\{g_n : n \in \mathbb{N}\}$ and the constant functions separates the points of $S$, and the Stone-Weierstrass Theorem ensures that $A$ is dense in $C(S)$. Since $A$ is the linear span of the countable family of monomials $g_1^{\alpha_1} \cdots g_n^{\alpha_n}$ with $n \in \mathbb{N}$, $\alpha_1, \dots, \alpha_n \in \mathbb{N}$, the linear combinations of these monomials with rational coefficients form a countable subset that is dense in $A$, hence in $C(S)$. $\square$
The following result shows a link between algebraic properties and topological properties. It will be used for the spectral analysis of operators on a Hilbert space. Let us recall that a subset $J$ of a ring or an algebra $R$ is an ideal if it is a subring (or a subgroup) of $R$ such that $fg \in J$ whenever $f \in R$ and $g \in J$.
Theorem 3.6 (Ideal Theorem) Let $S$ be a compact space and let $J$ be a closed ideal of $R := C(S)$ endowed with the sup norm. Let $Z := \{x \in S : \forall f \in J,\ f(x) = 0\}$. Then $J$ coincides with the set of $f \in R$ such that $f(z) = 0$ for all $z \in Z$. In other words, the set $\mathcal{C}$ of closed subsets of $S$ is in bijection with the set $\mathcal{J}$ of closed ideals of $C(S)$ via the map $T \mapsto J_T := \{f \in C(S) : f(z) = 0\ \forall z \in T\}$.
Proof Let $f \in R$ be such that $f(z) = 0$ for all $z \in Z$. Given $\varepsilon > 0$, we will find some $f_\varepsilon \in J$ such that $\|f_\varepsilon - f\|_\infty \le \varepsilon$. Since $J$ is closed, this will prove that $f \in J$. Let $Z_\varepsilon := \{x \in S : |f(x)| < \varepsilon\}$, $S_\varepsilon := S\setminus Z_\varepsilon$. Since $Z$ is contained in $Z_\varepsilon$, for all $y \in S_\varepsilon$ there exists some $g_y \in J$ such that $g_y(y) \ne 0$ and some open neighborhood $V_y$ of $y$ such that $g_y(v) \ne 0$ for all $v \in V_y$. Let $y_1, \dots, y_k$ be such that $(V_{y_1}, \dots, V_{y_k})$ is a finite covering of the compact set $S_\varepsilon$. Let
$$g := g_{y_1}^2 + \dots + g_{y_k}^2,$$
so that $g \in J$, $g$ takes nonnegative values and is positive on $S_\varepsilon$. Then, for all $n \in \mathbb{N}$ the function $g_n := n(1 + ng)^{-1} g$ is in $J$ since $1 + ng \ge 1$ and $n(1 + ng)^{-1} \in C(S)$. Since $g$ is bounded below by some $\alpha > 0$ on the compact set $S_\varepsilon$, we see that $(g_n) \to 1$ and $(g_n f) \to f$ uniformly on $S_\varepsilon$. On the other hand, since $0 \le n(1 + ng)^{-1} g \le 1$ we have $|g_n(x) f(x) - f(x)| \le |f(x)| < \varepsilon$ for all $x \in Z_\varepsilon$. Thus, taking $n$ large enough, for $f_\varepsilon := g_n f$, we have $\|f_\varepsilon - f\|_\infty \le \varepsilon$ and $g_n f \in J$. $\square$
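The recursion (3.3) used in the proof of Lemma 3.5 is easy to experiment with. The following sketch evaluates the polynomials pointwise rather than symbolically; the function names `sqrt_iterates` and `sup_error`, and the choice of a uniform grid as a stand-in for the supremum over $[0,1]$, are assumptions made for illustration only.

```python
def sqrt_iterates(r, steps):
    """Return [p_0(r), ..., p_steps(r)] for p_0 = 0 and p_{n+1} = p_n + (r - p_n**2) / 2."""
    values = [0.0]
    for _ in range(steps):
        p = values[-1]
        values.append(p + 0.5 * (r - p * p))
    return values

def sup_error(steps, grid=1000):
    """Maximum over a uniform grid of [0, 1] of sqrt(r) - p_steps(r)."""
    return max((j / grid) ** 0.5 - sqrt_iterates(j / grid, steps)[-1]
               for j in range(grid + 1))

# The first iterates at r = 1 are p_0 = 0, p_1 = 1/2, p_2 = 7/8.
first_values_at_one = sqrt_iterates(1.0, 2)
```

On such a grid the values increase with $n$ and stay below $\sqrt{r}$, and the grid error decreases slowly: the convergence is slowest near $r = 0$, which is consistent with a uniform but not geometric rate.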
Exercises
1. Verify that the union of a finite family of equicontinuous sets of maps from a topological space $X$ into a metric space $Y$ is equicontinuous. Deduce from this result that in particular any finite set of continuous maps is equicontinuous.
2. Let $X$ be a topological space, let $Y$ be a metric space and let $(f_i)_{i\in I}$ be a net of maps from $X$ into $Y$ that converges pointwise to some $f : X \to Y$. Show that if the family $\{f_i : i \in I\}$ is equicontinuous at $x \in X$, then $f$ is continuous at $x$.
3. With the data of Exercise 2 show that the closure in $(B(X,Y), d_\infty)$ of an equicontinuous subset $F$ is equicontinuous.
4. Let $X$ be a compact topological space, let $Y$ be a metric space and let $(f_n)$ be a sequence in $C(X,Y)$ that is equicontinuous and converges pointwise to some $f : X \to Y$. Prove that $(f_n) \to f$ uniformly.
5. Let $X$ be a topological space, let $Y$ be a metric space and let $\{f_n : n \in \mathbb{N}\}$ be a family of maps from $X$ into $Y$ that is equicontinuous at $x \in X$. Show that if $(f_n(x))_n$ converges to some $y \in Y$ and if $(x_n) \to x$, then $(f_n(x_n)) \to y$.
6. Let $S$ and $T$ be two compact topological spaces and let $A := C(S) \otimes C(T)$ be the set of finite sums of separable functions, i.e. functions of the form $(s,t) \mapsto f(s)g(t)$ with $f \in C(S)$, $g \in C(T)$. Show that $A$ is dense in $C(S \times T)$.
7. Let $S = T = [0,1]$ and let $\{r_i : i \in \mathbb{N}_m\}$ be a finite family of distinct points of $S$. Prove that the functions $r \mapsto |r - r_i|$ ($i \in \mathbb{N}_m$) are linearly independent. Deduce from this observation that the function $h : (s,t) \mapsto |s - t|$ cannot be an element of $C(S) \otimes C(T)$.
8. Prove that the additional condition $\overline{f} \in A$ for all $f \in A$ in Corollary 3.8 cannot be omitted. [Hint: let $S := \{z \in \mathbb{C} : |z| \le 1\}$, let $A$ be the set of restrictions to $S$ of the complex polynomial functions on $\mathbb{C}$. Note that for all $f \in A$, hence for all $f \in \overline{A}$ (the closure of $A$), one has $f(0) = \frac{1}{2\pi}\int_0^{2\pi} f(e^{it})\,dt$, a relation that is not satisfied by all $f \in C(S)$.]
9. Let $S$ be a compact subset of a metric space $(T,d)$. Deduce from Theorem 3.5 that any $f \in C(S)$ is the restriction to $S$ of some $g \in C(T)$. [Hint: Take $A := \{g|_S : g \in C(T)\}$ and note that for all $f \in A$ one can find a $g \in C(T)$ extending $f$ such that $\sup g = \sup f$ and $\inf g = \inf f$. Given a sequence $(\varepsilon_n) \to 0_+$ find a sequence $(g_n)$ of $C(T)$ such that $\|f - \sum_{0\le k\le n} g_k|_S\| \le \varepsilon_n$, $\sup_{t\in T} |g_n(t)| \le \varepsilon_{n-1}$ and take $g := \sum_n g_n$.]
3.2 Topological Vector Spaces. Weak Topologies
In infinite dimensional normed spaces, compact subsets are scarce. A natural means to get a richer family of compact subsets on a normed space $(X, \|\cdot\|)$ is to weaken the topology: then there will be more convergent nets and, since open covers will not be as rich, finding finite subfamilies will be easier. The drawbacks are that continuity of maps issued from $X$ will be lost in general and that no norm
will be available to define the weakened topology if $X$ is infinite dimensional. A partial remedy for the first inconvenience will be proposed in the next subsection. Now, the lack of a norm will not be too dramatic if one realizes that the structure of topological linear space is preserved. This means that the two operations $(x,y) \mapsto x + y$ and $(\lambda, x) \mapsto \lambda x$ will be continuous with respect to the new topology. One will even have at one's disposal a family of seminorms defining the topology.
Recall that a seminorm on a linear space $X$ is a function $p : X \to \mathbb{R}_+$ that is subadditive (i.e. such that $p(x + y) \le p(x) + p(y)$ for all $(x,y) \in X \times X$) and absolutely homogeneous (i.e. such that $p(\lambda x) = |\lambda|\,p(x)$ for all $(\lambda, x) \in \mathbb{R} \times X$) or, equivalently, subadditive, positively homogeneous (i.e. such that $p(\lambda x) = \lambda p(x)$ for all $(\lambda, x) \in \mathbb{R}_+ \times X$) and even (i.e. such that $p(-x) = p(x)$ for all $x \in X$). Note that a seminorm $p$ is a norm iff $p^{-1}(0) = \{0\}$. The topology associated with a family $(p_i)_{i\in I}$ of seminorms on $X$ is the topology generated by the family of semi-balls $B_i(a,r) := \{x \in X : p_i(x - a) < r\}$ for all $a \in X$, $r \in \mathbb{P}$, $i \in I$. Such a topology is clearly compatible with the operations on $X$, so that $X$ becomes a topological linear space. It is even a locally convex topological linear space in the sense that each point has a base of neighborhoods that are convex. One can show that this property is equivalent to the existence of a family of seminorms defining the topology.
On the (topological) dual space $X^*$ of a topological linear space $X$, i.e. on the space of continuous linear forms on $X$, a natural family of seminorms is the family $(p_x)_{x\in X}$ given by $p_x(f) := |f(x)|$ or, adopting a notation we will use frequently, $p_x(x^*) := |\langle x^*, x\rangle|$ for $x^* \in X^*$. Then a net $(f_i)_{i\in I}$ of $X^*$ converges to some $f \in X^*$ if and only if for all $x \in X$, $(f_i(x))_{i\in I} \to f(x)$; then we write $(f_i)_{i\in I} \overset{*}{\to} f$. Thus, the obtained topology on $X^*$, denoted by $\sigma^* := \sigma(X^*, X)$ and called the weak$^*$ topology, is just the topology induced by pointwise convergence. It is the weakest topology on $X^*$ for which the evaluations $f \mapsto f(x)$ are continuous, for all $x \in X$.
Although this topology is poor, it preserves some continuity properties. In particular, if $X$ and $Y$ are normed spaces and if $A \in L(X,Y)$, its transpose map (often called the adjoint) $A^\intercal : Y^* \to X^*$ defined by $A^\intercal(y^*) := y^* \circ A$ for $y^* \in Y^*$, or
$$\langle A^\intercal(y^*), x\rangle = \langle y^*, A(x)\rangle, \qquad (x, y^*) \in X \times Y^*,$$
is not just continuous with respect to the topologies induced by the dual norms (the so-called strong topologies), since $\|A^\intercal(y^*)\| = \|y^* \circ A\| \le \|A\|\,\|y^*\|$ for all $y^* \in Y^*$; it is also continuous with respect to the weak$^*$ topologies: when $(y_i^*)_{i\in I} \overset{*}{\to} y^*$ one has $(A^\intercal(y_i^*))_{i\in I} \overset{*}{\to} A^\intercal(y^*)$ since for all $x \in X$ one has $(A^\intercal(y_i^*)(x))_{i\in I} = (y_i^*(A(x)))_{i\in I} \to y^*(A(x))$. Note that when $X$ and $Y$ are Hilbert spaces (i.e., Banach spaces whose norms derive from scalar products), so that they can be identified with their dual spaces, $A^\intercal$ corresponds to the adjoint $A^* : Y \to X$ of $A$ characterized by $\langle A^*(y) \mid x\rangle_X = \langle y \mid A(x)\rangle_Y$ for all $x \in X$, $y \in Y$, $\langle\cdot\mid\cdot\rangle_X$ (resp. $\langle\cdot\mid\cdot\rangle_Y$) denoting the scalar product in $X$ (resp. $Y$).
Let us show there are sufficiently many linear forms on $X^*$ that are continuous with respect to the weak$^*$ topology.
Proposition 3.18 The set of continuous linear forms on $X^*$ endowed with the weak$^*$ topology can be identified with $X$.
Proof By definition, for all $x \in X$, the linear form $e_x : x^* \mapsto \langle x^*, x\rangle$ on $X^*$ is continuous with respect to the weak$^*$ topology $\sigma^*$. Let us show that any continuous linear form $f$ on $(X^*, \sigma^*)$ coincides with some $e_x$. We can find $\delta > 0$ and a finite family $(a_1, \dots, a_m)$ in $X$ such that $|f(x^*)| < 1$ for all $x^* \in X^*$ satisfying $p_{a_i}(x^*) := |\langle x^*, a_i\rangle| < \delta$ for $i \in \mathbb{N}_m := \{1, \dots, m\}$. Setting $x_i := a_i/\delta$, we get $|f(x^*)| \le \max_{1\le i\le m} |\langle x^*, x_i\rangle|$ since otherwise, by homogeneity, we could find $x^* \in X^*$ such that $|f(x^*)| = 1$ and $\max_{1\le i\le m} |\langle x^*, x_i\rangle| < 1$, contradicting the choice of $a_i$ and $x_i$. Changing the indexing if necessary, we may suppose that, for some $k \in \mathbb{N}_m$, $x_1, \dots, x_k$ form a basis of the linear space spanned by $x_1, \dots, x_m$. Let
$$A := (e_{x_1}, \dots, e_{x_k}) : X^* \to \mathbb{R}^k.$$
Then, denoting by $N$ the kernel of $A$ and by $p : X^* \to X^*/N$ the canonical projection, $f$ can be factorized into $f = g \circ p$ for some linear form $g$ on $X^*/N$, since $f$ vanishes on $N$ by the preceding inequality. Since $A$ is surjective, there is also an isomorphism $B : X^*/N \to \mathbb{R}^k$ such that $A$ is factorized into $A = B \circ p$. Then $p = B^{-1} \circ A$ and $f = g \circ B^{-1} \circ A$, hence $f(x^*) = c_1 x^*(x_1) + \dots + c_k x^*(x_k)$ for all $x^* \in X^*$, where $c_1, \dots, c_k$ are the components of $g \circ B^{-1}$ in $(\mathbb{R}^k)^*$. Thus $f = e_x$ for $x := c_1 x_1 + \dots + c_k x_k \in X$. $\square$
The following result shows that in introducing the weak$^*$ topology we have attained our aim of getting sufficiently many compact subsets.
Theorem 3.7 (Alaoglu-Bourbaki) Every weak$^*$ closed, bounded subset of the dual space $X^*$ of $X$ is weak$^*$ compact, i.e. is compact with respect to the weak$^*$ topology.
Proof It suffices to show that the closed unit ball $B^* := B_{X^*}$ of $X^*$ is weak$^*$ compact. To do so, let us denote by $S$ the unit sphere $S_X$ of $X$, by $H$ the space of positively homogeneous functions on $X$ and by $H_S$ the space of all the restrictions to $S$ of the elements of $H$. The restriction operator $r : H \to H_S$ is then a bijection, with inverse given by $r^{-1}(h)(x) = t\,h(t^{-1}x)$ for $x \in X\setminus\{0\}$, $t := \|x\|$, $r^{-1}(h)(0) = 0$. Then $r$ and $r^{-1}$ are continuous with respect to the pointwise convergence topologies on $H$ and $H_S$, and for this topology $H_S$ is homeomorphic to the product space $\mathbb{R}^S$. The subset $B^*$ of $H$ is easily seen to be closed with respect to the pointwise convergence topology on $H$. Moreover, $r(B^*)$ is contained in $[-1,1]^S$, which is compact by the Tychonov Theorem. Thus, $r(B^*)$ and $B^*$ are compact in $H_S$ and $H$, respectively. It follows that $B^*$ is compact in $X^*$ endowed with the weak$^*$ topology. $\square$
The weak topology on $X$ is the topology $\sigma(X, X^*)$ associated with the semi-norms $p_{x^*} : x \mapsto |x^*(x)|$. It will be shown later that this topology is the topology on $X$ induced by $\sigma(X^{**}, X^*)$ when $X$ is considered as a subspace of its bidual space $X^{**} := (X^*)^*$.
Exercises
1. Prove that the interior with respect to the weak$^*$ topology of the unit ball of the dual of an infinite dimensional Banach space is empty. [Hint: any neighborhood of 0 with respect to the weak$^*$ topology contains an unbounded subset, in fact a non-trivial linear space.]
2. A cone of a linear space is a subset stable under the homotheties $h_t : x \mapsto tx$ for all $t > 0$. Show that a weak$^*$ closed cone $Q$ of the dual $X^*$ of a normed space $X$ is weak$^*$ locally compact (i.e., for each point $\overline{x}^*$ of $Q$ there exists a weak$^*$ neighborhood $V$ of $\overline{x}^*$ such that $Q \cap V$ is weak$^*$ compact) if and only if there exists a neighborhood $U$ of 0 such that $Q \cap U$ is weak$^*$ compact. [Hint: let $u_1, \dots, u_n \in X$ be such that $U_1 \cap \dots \cap U_n \subset U$ for $U_i := \{x^* \in X^* : x^*(u_i) \le 1\}$. Given $\overline{x}^* \in Q$, let $t > \max(1, \overline{x}^*(u_1), \dots, \overline{x}^*(u_n))$. Let $V_i := f_i^{-1}(]-\infty, t[)$ for $i = 1, \dots, n$, where $f_i : x^* \mapsto x^*(u_i)$. Then $V := V_1 \cap \dots \cap V_n$ is a weak$^*$ neighborhood of $\overline{x}^*$ and $Q \cap V = t(Q \cap t^{-1}V) \subset t(Q \cap U)$, which is weak$^*$ compact.]
3. Let $p \in\, ]1,\infty[$ and let $\ell_p$ be the space of sequences $x := (x_n)$ such that $\|x\|_p := (\sum_n |x_n|^p)^{1/p} < \infty$. Show that the dual of $\ell_p$ is $\ell_q$ for $q := (1 - 1/p)^{-1}$. Given $x := (x_n) \in \ell_q$ and a sequence $(x^{(k)})_{k\ge 0}$ in $\ell_q$, show that $(x^{(k)})$ converges weakly to $x$ if and only if it is bounded and for each $n \in \mathbb{N}$ one has $(x_n^{(k)})_k \to x_n$.
4. Let $\ell_1$ be the space of sequences $x := (x_n)$ such that $\|x\|_1 := \sum_n |x_n| < \infty$. Show that the dual of $\ell_1$ is the space $\ell_\infty$ of bounded sequences with the norm $\|\cdot\|_\infty$ given by $\|x\|_\infty := \sup_n |x_n|$. Prove that in $\ell_1$ a sequence converges weakly if and only if it converges strongly. For $k \in \mathbb{N}$, let $x^{(k)} := (x_n^{(k)})_n$ be given by $x_n^{(k)} := \delta_{nk} := 1$ if $n = k$, 0 otherwise. Deduce from what precedes that the bounded sequence $(x^{(k)})_k$ has no subsequence that converges weakly.
5. Let $C$ be the subset of the space $\ell_\infty$ defined in the preceding exercise that consists of those $x := (x_n)$ such that $\lim_n x_n = 1$ and $x_n \in [0,1]$ for all $n \in \mathbb{N}$. Show that $C$ is a closed, convex, bounded subset of $\ell_\infty$, hence is weakly closed, but that $C$ is not weak$^*$ closed, i.e. closed for $\sigma(\ell_\infty, \ell_1)$.
6. Let $A \in L(X,Y)$ be a continuous linear map between two normed spaces. Show that the transpose map $A^\intercal : Y^* \to X^*$ is continuous and that $\|A^\intercal\| = \|A\|$.
7. Kolmogorov's normability criterion. Prove that a locally convex topological vector space $X$ is normable, i.e. its topology is associated with a norm, if and only if some neighborhood $V$ of 0 is bounded.
3.3 Separation and Extension. Polarity This section is devoted to one of the major tools of functional analysis, the possibility of making use of linear continuous functionals on a normed space. In particular, we show that this family is rich enough and enables us to separate disjoint
convex sets under some additional conditions. We first review some properties of convex subsets.
3.3.1 Convex Sets and Convex Functions
Let us recall that a subset $C$ of a linear space $X$ is said to be convex if a segment whose extremities are in $C$ is entirely contained in $C$: for all $x_0, x_1 \in C$, $t \in [0,1]$, one has $x_t := (1-t)x_0 + t x_1 \in C$. Among convex subsets, the simplest ones are affine subspaces obtained by translating linear subspaces, half-spaces (subsets $D$ such that there exist a linear form $\ell$ on $X$ and $r \in \mathbb{R}$ for which $D = \ell^{-1}(]-\infty, r[)$ or $D = \ell^{-1}(]-\infty, r])$) and convex cones. The latter are the subsets that are stable under addition and positive homotheties $h_r : x \mapsto rx$ (with $r \in \mathbb{P} := ]0,\infty[$ fixed), as easily checked. From antiquity to the present day, polyhedral subsets, i.e. finite intersections of closed half-spaces, have played a special role among convex subsets as they enjoy particular properties not shared by all convex sets.
A function $f$ from a linear space $X$ to $\overline{\mathbb{R}} := \mathbb{R} \cup \{-\infty, +\infty\}$ is said to be convex if its epigraph
$$E_f := \operatorname{epi} f := \{(x,r) \in X \times \mathbb{R} : r \ge f(x)\}$$
is convex or, equivalently, if for any $t \in [0,1]$, $x_0, x_1 \in X$,
$$f((1-t)x_0 + t x_1) \le (1-t) f(x_0) + t f(x_1)$$
(with the convention that $(-\infty) + (+\infty) = +\infty$ and $0\cdot(+\infty) = +\infty$, $0\cdot(-\infty) = -\infty$ we adopt in the sequel). It is easy to show that $f$ is convex if and only if its strict epigraph
$$E'_f := \operatorname{epi}_s f := \{(x,r) \in X \times \mathbb{R} : r > f(x)\}$$
is convex. A function $f$ is concave if $-f$ is convex. A function $s : X \to \overline{\mathbb{R}}$ is said to be sublinear if its epigraph is a convex cone, i.e. if it is subadditive ($s(x + x') \le s(x) + s(x')$ for all $x, x' \in X$) and positively homogeneous ($s(tx) = t s(x)$ for all $t \in \mathbb{P}$, $x \in X$). A sublinear function $p$ with nonnegative values is called a gauge; if moreover $p$ is finite and even, i.e. if $p(-x) = p(x)$ for every $x \in X$, then $p$ is a semi-norm.
Example Let $g : X \to \mathbb{R}_\infty$ be a convex function. The associated (positively) homogeneous function is the function $h : X \times \mathbb{R} \to \mathbb{R}_\infty$ given by $h(x,r) := r g(x/r)$ for $(x,r) \in X \times \mathbb{P}$, $h(0,0) := 0$, $h(x,r) = +\infty$ otherwise. Then $h$ is sublinear since it is clearly positively homogeneous and it is convex since $(x,r,s) \in \operatorname{epi} h$ if and only if $(x,s,r) \in \mathbb{P}(\operatorname{epi} g \times \{1\}) \cup (\{0\} \times \mathbb{R}_+ \times \{0\})$, which is a convex cone.
Example The support function of a subset $S$ of a normed space $X$ is the function $\sigma_S$ or $h_S : X^* \to \overline{\mathbb{R}}$ given by
$$\sigma_S(x^*) := h_S(x^*) := \sup\{\langle x^*, x\rangle : x \in S\}, \qquad x^* \in X^*. \tag{3.4}$$
It is convex: here we use the fact that the supremum of a family of convex functions is convex since the intersection of a family of convex subsets of a vector space is convex.
A convex function taking the value $-\infty$ is very special (on any straight line there are at most two points at which it takes a finite value and if the function is lower semicontinuous no such point exists); therefore we will usually discard them and only consider functions with values in $\mathbb{R}_\infty := \mathbb{R} \cup \{+\infty\}$. In contrast, it is useful to admit functions taking the value $+\infty$. Among them is the indicator function $\iota_C$ of a subset $C$ of $X$: let us recall it is given by $\iota_C(x) = 0$ for $x \in C$, $\iota_C(x) = +\infty$ for $x \in X\setminus C$. For instance, in a minimization problem one can take a constraint $C$ into account by replacing an objective function $f$ by $f_C := f + \iota_C$.
One calls a function proper if it does not take the value $-\infty$ and takes at least one finite value. The expression nonimproper would be less ambiguous, but the risk of confusion with the topological concept is limited, so that we retain the usual terminology. Moreover, the epigraph of a function $f : X \to \mathbb{R}_\infty$ is a proper subset (nonempty and not the whole space) of $X \times \mathbb{R}$ if and only if $f$ is proper. We denote by $D_f$ or $\operatorname{dom} f$ the domain of $f$, i.e. the projection on $X$ of $E_f := \operatorname{epi} f$:
$$D_f := \operatorname{dom} f := \{x \in X : f(x) < +\infty\}.$$
The following statement will be used repeatedly; it relies on the obvious fact that the image of a convex set under a linear map is convex.
Lemma 3.9 Let $W$ and $X$ be linear spaces and let $f : W \times X \to \overline{\mathbb{R}}$ be convex. Then the performance function $p : W \to \overline{\mathbb{R}}$ defined as follows is convex:
$$p(w) := \inf_{x\in X} f(w,x).$$
Proof The result follows from the fact that the strict epigraph of $p$ is the projection on $W \times \mathbb{R}$ of the strict epigraph of $f$. $\square$
Let us add that if $f$ is positively homogeneous in the variable $w$, so is $p$.
Example If $C$ is a convex subset (resp. a convex cone) of a normed space, then the associated distance function $d_C : w \mapsto \inf_{x\in C} \|w - x\|$ is convex (resp. sublinear).
Example Given $f, g : X \to \overline{\mathbb{R}}$, their infimal convolution $f \,\square\, g : X \to \overline{\mathbb{R}}$ defined by
$$(f \,\square\, g)(w) := \inf\{f(u) + g(v) : u, v \in X,\ u + v = w\} = \inf_{x\in X} \bigl(f(w - x) + g(x)\bigr)$$
is convex whenever $f$ and $g$ are convex. If $f$ and $g$ are sublinear then $f \,\square\, g$ is sublinear. The preceding example is the case corresponding to $f := \|\cdot\|$, $g := \iota_C$.
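The last remark can be checked on a toy case. The sketch below works in $X = \mathbb{R}$ with $C = [1,2]$ and replaces the infimum over $X$ by a minimum over a finite grid of candidate points; the grid, the interval, and the helper names (`infimal_convolution`, `indicator_interval`, `distance_to_interval`) are assumptions chosen for illustration.

```python
import math

def infimal_convolution(f, g, w, candidates):
    """Approximate inf over x of f(w - x) + g(x), restricted to the finite set candidates."""
    return min(f(w - x) + g(x) for x in candidates)

def indicator_interval(lo, hi):
    """Indicator function of [lo, hi]: 0 on the interval, +infinity elsewhere."""
    return lambda x: 0.0 if lo <= x <= hi else math.inf

def distance_to_interval(w, lo, hi):
    """Closed-form distance from w to [lo, hi] in R."""
    return max(lo - w, 0.0, w - hi)

# Grid of step 0.01 on [-5, 5]; it contains the endpoints 1.0 and 2.0 of C exactly.
grid = [k / 100 for k in range(-500, 501)]
iota_C = indicator_interval(1.0, 2.0)

# Infimal convolution of |.| with the indicator of C, evaluated at a few points.
values = [infimal_convolution(abs, iota_C, w, grid) for w in (-3.0, 0.0, 1.5, 4.0)]
```

At the sample points the grid minimum coincides with `distance_to_interval(w, 1.0, 2.0)`, in agreement with the identity $\|\cdot\| \,\square\, \iota_C = d_C$.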
Besides the indicator function, the support function, and the distance function, another function associated with a convex set plays a noteworthy role. If $C$ is a subset of $X$ containing the origin, the gauge function (or Minkowski gauge) $\mu_C$ of $C$ is defined by
$$\mu_C(x) := \inf\{r \in \mathbb{R}_+ : x \in rC\}, \qquad x \in X.$$
Clearly, $\mu_C$ is positively homogeneous and one has $C \subset \mu_C^{-1}([0,1])$. If $C$ is star-shaped, i.e. if for all $x \in C$, $t \in [0,1]$ one has $tx \in C$, then $\mu_C^{-1}([0,1[) \subset C$. If moreover $C$ is algebraically closed in the sense that its intersection with every ray $L_u := \mathbb{R}_+ u$, $u \in X\setminus\{0\}$, is closed in $L_u$, then $C = \mu_C^{-1}([0,1])$. In particular, the gauge function of the closed unit ball $B_X$ of a normed space $(X, \|\cdot\|)$ is just $\|\cdot\|$. We leave the proof of the next lemma as an exercise using the fact that $\mu_C(x) = \inf_r h(x,r)$ where $h(x,r) := r g(x/r)$ with $g := \iota_C + 1$. Hereafter, a subset $C$ of a linear space $X$ is said to be absorbing if for all $x \in X$ there exists some $r > 0$ such that $x \in rC$.
Lemma 3.10 The gauge $\mu_C$ of a convex subset $C$ of $X$ is sublinear. A subset $C$ of $X$ is absorbing if and only if $\mu_C$ is finitely valued.
Since the intersection of a family of convex subsets is convex, any nonempty subset $A$ of a linear space $X$ is contained in a convex set $C$ that is the smallest in the family $\mathcal{C}_A$ of convex sets containing $A$. It is denoted by $\operatorname{co}(A)$ and called the convex set generated by $A$ or the convex hull of $A$. It is obtained as the intersection of the family $\mathcal{C}_A$. It is easy to see that $\operatorname{co}(A)$ is the set of convex combinations of elements of $A$, i.e. $\operatorname{co}(A)$ is the set of $x \in X$ that can be written as $t_1 a_1 + \dots + t_n a_n$ with $n \in \mathbb{N}\setminus\{0\}$, $a_i \in A$, $t := (t_1, \dots, t_n)$ being an element of the canonical simplex $\Delta_n$, i.e. the set of $t := (t_1, \dots, t_n) \in \mathbb{R}^n_+$ satisfying $t_1 + \dots + t_n = 1$.
The convex hull $\operatorname{co}(f)$ of a function $f : X \to \mathbb{R}_\infty$ is the greatest convex function $g$ bounded above by $f$. Its epigraph is almost the convex hull of the epigraph $E_f$ of $f$. In fact, it is the vertical closure of $\operatorname{co}(E_f)$ in the sense that one has $\operatorname{epi}_s g \subset \operatorname{co}(E_f) \subset \operatorname{epi} g$. Thus
$$g(x) := \inf_{m\ge 1} \inf\Bigl\{\sum_{i=1}^m t_i f(x_i) : (t_1, \dots, t_m) \in \Delta_m,\ x_i \in X,\ t_1 x_1 + \dots + t_m x_m = x\Bigr\}.$$
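Returning to the gauge function defined above, its value can be approximated by bisection as soon as membership in $C$ can be tested. The sketch below assumes that $C \subset \mathbb{R}^2$ is convex and contains the origin, so that $\{r > 0 : x \in rC\}$ is an upper interval; the function name `gauge`, the membership oracle `contains`, and the search bound `hi` are hypothetical choices.

```python
import math

def gauge(contains, x, hi=1e6, tol=1e-10):
    """Approximate inf{r >= 0 : x in r*C} by bisection.

    contains(p) tests whether the point p (a list of coordinates) lies in C.
    Assumes C is convex with 0 in C; returns math.inf if x is not in hi*C.
    """
    if all(c == 0 for c in x):
        return 0.0  # 0 belongs to r*C for every r >= 0
    if not contains([c / hi for c in x]):
        return math.inf
    lo = 0.0
    while hi - lo > tol:
        mid = (lo + hi) / 2  # mid > 0 because hi > lo >= 0
        if contains([c / mid for c in x]):
            hi = mid
        else:
            lo = mid
    return hi

def in_l1_ball(p):
    """Closed unit ball of the norm |p_0| + |p_1|."""
    return abs(p[0]) + abs(p[1]) <= 1

def in_euclidean_ball(p):
    """Closed Euclidean unit ball."""
    return p[0] ** 2 + p[1] ** 2 <= 1

def in_half_plane(p):
    """Half-plane {p : p_0 <= 1}: convex and absorbing, but not bounded."""
    return p[0] <= 1
```

For the two unit balls the computed gauge agrees, up to the tolerance, with the corresponding norm, as stated in the text; for the half-plane it approximates $\max(x_0, 0)$, a sublinear function that is not a norm.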
Exercise Show that for $g := \operatorname{co}(f)$ the inclusions $\operatorname{epi}_s g \subset \operatorname{co}(\operatorname{epi} f) \subset \operatorname{epi} g$ may be strict. [Hint: consider $f : \mathbb{R} \to \mathbb{R}$ given by $f(0) := 1$ and $f(x) := |x|$ for $x \in \mathbb{R}\setminus\{0\}$.]
Note that in general, the union of a family $(C_p)$ of convex subsets is no longer convex; but when $(C_p)$ is an increasing sequence (with respect to inclusion), the union is convex. Similarly, the infimum of a countable family $(k_p)$ of convex functions is convex when the sequence $(k_p)$ is decreasing; but that is not the case if the sequence $(k_p)$ does not satisfy this property.
When $X$ is a normed space, any subset $S$ is contained in a smallest closed convex subset, its closed convex hull $\overline{\operatorname{co}}(S)$. Using the following elementary result, it is easy to check that this set is just the closure of $\operatorname{co}(S)$. In fact, the lemma and the preceding assertion are valid in any topological linear space. In the sequel, a number of results given for normed spaces are valid for topological linear spaces. We leave the proofs of the next two results as exercises.
Lemma 3.11 The closure $\mathrm{cl}(C)$ and the interior $\mathrm{int}(C)$ of a convex subset $C$ of a normed space are convex. Moreover, if $t \in [0,1[$, $x_0 \in \mathrm{int}(C)$ and $x_1 \in C$ then $(1-t)x_0 + t x_1 \in \mathrm{int}(C)$.
Lemma 3.12 If the interior of a convex subset $C$ of a normed space is nonempty, then one has $\mathrm{cl}(C) = \mathrm{cl}(\mathrm{int}(C))$ and $\mathrm{int}(\mathrm{cl}(C)) = \mathrm{int}(C)$.
Lemma 3.13 If $C$ is a nonempty convex subset of a finite dimensional space, then $C$ has a nonempty interior (called the relative interior and denoted by $\mathrm{ri}(C)$) in the affine subspace $A$ it generates.
Proof By definition, $A$ is the smallest affine subspace containing $C$. Using a translation, we may suppose $0 \in C$, so that $A$ is the linear subspace generated by $C$. Let $n$ be the dimension of $A$ and let $m$ be the greatest integer $k$ such that there exists a linearly independent family $\{e_1, \dots, e_k\}$ in $C$ satisfying
$$\operatorname{co}\{e_1, \dots, e_k\} = \{t_1 e_1 + \dots + t_k e_k : (t_1, \dots, t_k) \in \Delta_k\} \subset C.$$
Let $\{e_1, \dots, e_m\}$ be such a family and let $L$ be the linear space it generates. Then $C$ is contained in $L$: otherwise, we could find some $e \in C\setminus L$ and the family $\{e_1, \dots, e_m, e\}$ would satisfy the above conditions and be strictly larger than $\{e_1, \dots, e_m\}$. Thus $L = A$, $m = n$ and, since $0 \in C$ and $C$ is convex, the set $\operatorname{co}\{0, e_1, \dots, e_m\}$ is contained in $C$ and has a nonempty interior in $A$ with respect to the unique Hausdorff linear topology on $A$ obtained by transporting the topology of $\mathbb{R}^m$ by the isomorphism defined by the base $\{e_1, \dots, e_m\}$. $\square$
Exercises

1. Let $A$ and $B$ be convex subsets of a normed space $(X, \|\cdot\|)$. Show that the set $C := \{(1-t)a + tb : a \in A,\ b \in B,\ t \in [0,1]\}$ is the convex hull $\mathrm{co}(A \cup B)$ of $A \cup B$, i.e. the smallest convex subset of $X$ containing $A \cup B$.

2. Let $A$ and $B$ be compact convex subsets of a normed space $(X, \|\cdot\|)$. Show that the sets $C := \mathrm{co}(A \cup B)$ and $S := A + B$ are compact. Give an example showing that the convex hull of the sum of two closed convex subsets of $\mathbb{R}^2$ is not always closed. [Hint: take $A := \mathbb{R} \times \{1\}$, $B := \{(x,y) \in \mathbb{R}_+ \times \mathbb{R}_+ : xy \ge 1\}$.]

3. (Hermite-Hadamard) Given a continuous convex function $f : [a,b] \to \mathbb{R}$ prove the inequalities
$$f\Big(\frac{a+b}{2}\Big) \le \frac{1}{b-a} \int_a^b f(x)\,dx \le \frac{f(a) + f(b)}{2}.$$
[Hint: for the right inequality use the relation $f(x) \le f(a) + (f(b) - f(a))(b-a)^{-1}(x-a)$ and integrate; for the left inequality, split the interval $[a,b]$ into $[a, (a+b)/2] \cup [(a+b)/2, b]$.]

4. Given a sequence $(E_n)_{n \ge 1}$ of nonempty subsets of a linear space $Z$, show that the convex hull $C$ of the union $E$ of the $E_n$'s is the union over $p \in \mathbb{N} \setminus \{0\}$ of the convex hulls $C_p$ of $E_1 \cup \dots \cup E_p$:
$$C := \mathrm{co}(E) = \bigcup_{p} C_p \quad \text{where} \quad C_p := \mathrm{co}\Big(\bigcup_{1 \le n \le p} E_n\Big).$$
For $m, p \in \mathbb{N} \setminus \{0\}$, setting $\mathbb{N}_m := \{1, \dots, m\}$ and denoting by $J_{m,p}$ the set of maps $j : \mathbb{N}_m \to \mathbb{N}_p$, show that the set $C_p$ is given by
$$C_p = \bigcup_{m \ge 1} \bigcup_{j \in J_{m,p}} \Big\{ \sum_{i=1}^m t_i x_i : t := (t_1, \dots, t_m) \in \Delta_m,\ x_i \in E_{j(i)} \Big\}.$$
5. Given a sequence $(h_n)$ of functions on a linear space $X$, show that the convex hull $k$ of the function $h := \inf_n h_n$ is the infimum over $p \in \mathbb{N} \setminus \{0\}$ of the convex hulls $k_p := \mathrm{co}(h_1, \dots, h_p)$ of the functions $h_1, \dots, h_p$. The function $k_p$ is given by
$$k_p(x) = \inf_{m \ge 1}\ \inf_{j \in J_{m,p}}\ \inf\Big\{ \sum_{i=1}^m t_i h_{j(i)}(x_i) : (t_1, \dots, t_m) \in \Delta_m,\ x_i \in X,\ \sum_{i=1}^m t_i x_i = x \Big\}.$$
6. Let $C$ be a closed subset of a normed space that is midconvex in the sense that for any $x, y \in C$ one has $\frac{1}{2}x + \frac{1}{2}y \in C$. Show that $C$ is convex.

7. Let $f : X \to \mathbb{R}_\infty$ be a midconvex function in the sense that for any $x, y \in X$ one has $f(\frac{1}{2}x + \frac{1}{2}y) \le \frac{1}{2}f(x) + \frac{1}{2}f(y)$. Prove that $f$ is convex if $f$ is lower semicontinuous.

8. Let $X$ be a normed space, let $A \in L(X, X^*)$, $b \in X^*$, $c \in \mathbb{R}$, and let $f : X \to \mathbb{R}$ be given by $f(x) := \frac{1}{2}\langle Ax, x\rangle + \langle b, x\rangle + c$. Show that $f$ is convex if $A$ is positive semidefinite in the sense that $\langle Ax, x\rangle \ge 0$ for all $x \in X$.
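The Hermite-Hadamard inequalities of Exercise 3 can be checked numerically before attempting a proof. The following Python sketch is only an illustration: the function name `hermite_hadamard`, the choice $f = \exp$ on $[0,1]$ and the composite midpoint rule used for the integral are assumptions made here, not part of the exercise.

```python
import math

def hermite_hadamard(f, a, b, n=10_000):
    """Return (f at the midpoint, mean value of f on [a, b], mean of the endpoint values).

    The integral is approximated by the composite midpoint rule with n cells
    (an assumption of this sketch; any convergent quadrature would do).
    """
    h = (b - a) / n
    integral = h * sum(f(a + (i + 0.5) * h) for i in range(n))
    return f((a + b) / 2), integral / (b - a), (f(a) + f(b)) / 2

# Hypothetical test case: the convex function exp on [0, 1].
left, mean, right = hermite_hadamard(math.exp, 0.0, 1.0)
print(left, mean, right)
```

For $f = \exp$ on $[0,1]$ the middle term approximates $e - 1$, which indeed lies between $e^{1/2}$ and $(1 + e)/2$.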
3.3.2 Separation and Extension Theorems

Looking at both the analytical face and the geometrical face of the results of this section is fruitful. In fact, the following extension and separation theorems are closely intertwined. We start with a finite dimensional separation property.

Theorem 3.8 (Finite Dimensional Separation Property) Let $C$ be a nonempty convex subset of a finite dimensional vector space $X$ and let $a \in X \setminus C$. Then, there exists some $f \in X^* \setminus \{0\}$ such that $f(a) \ge \sup f(C)$. If moreover $C$ is closed, one can request that $f(a) > \sup f(C)$.

Proof Let us first consider the case when $C$ is closed. Since $X$ is finite dimensional, we may endow $X$ with the norm associated with a scalar product $\langle \cdot \mid \cdot \rangle$. Then, by Corollary 2.29, the point $a$ has a best approximation $p$ in $C$ characterized by
$$\forall z \in C \qquad \langle z - p \mid a - p \rangle \le 0.$$
For $f \in X^*$ defined by $f(x) := \langle x \mid a - p \rangle$, for every $z \in C$ we have $f(p) \ge f(z)$ and the second conclusion is established since $f(a) - f(p) = \|a - p\|^2 > 0$, as $a \notin C$.

Now let us consider the general case in which $C$ is not assumed to be closed. Let $S_{X^*}$ be the unit sphere of $X^*$ and for $x \in C$ let
$$S_x := \{u^* \in S_{X^*} : \langle u^*, x \rangle \le \langle u^*, a \rangle\},$$
so that $\ell \in S_{X^*}$ is such that $\ell(a) \ge \sup \ell(C)$ if and only if $\ell \in \bigcap_{x \in C} S_x$. Since $X^*$ is finite dimensional, $S_{X^*}$ is compact, hence this intersection is nonempty provided the family of closed subsets $(S_x)_{x \in C}$ has the finite intersection property. Thus, we have to show that for any finite subset $F := \{x_1, \dots, x_n\}$ of $C$ one has $S_{x_1} \cap \dots \cap S_{x_n} \ne \varnothing$. Let
$$\Delta_n := \{(t_1, \dots, t_n) \in \mathbb{R}^n_+ : t_1 + \dots + t_n = 1\}, \qquad h : (t_1, \dots, t_n) \mapsto t_1 x_1 + \dots + t_n x_n, \qquad E := \mathrm{co}(F) = h(\Delta_n).$$
Since the canonical simplex $\Delta_n$ is compact and $h$ is continuous, $E$ is compact, hence closed, and contained in $C$; in particular $a \notin E$. The first part of the proof yields some $\ell$ in $X^* \setminus \{0\}$ satisfying $\ell(a) > \ell(z)$ for all $z \in E$, in particular for $z \in F$. Without loss of generality we may suppose $\|\ell\| = 1$. Thus $\ell \in S_x$ for all $x \in F$: the family $(S_x)_{x \in C}$ has the finite intersection property. □

Corollary 3.11 Let $A$ and $B$ be two disjoint nonempty convex subsets of a finite dimensional space $X$. Then there exists some $f \in X^* \setminus \{0\}$ such that
$$\forall a \in A,\ \forall b \in B \qquad f(a) \le f(b).$$

Proof Since $C := A - B$ is convex and since $A$ and $B$ are disjoint, one has $0 \notin C$ and it suffices to take the linear form $f$ provided by Theorem 3.8 for the point $0$: the relation $0 = f(0) \ge \sup f(C)$ means that $f(a) - f(b) \le 0$ for all $a \in A$, $b \in B$. □

Now let us deal with the possibly infinite dimensional case, for which one has to use the axiom of choice in the form of Zorn's Lemma. The analytical versions are intimately linked with the geometrical versions. In the latter case one is led to detect the special place of half-spaces among convex subsets; in the analytical versions, one sheds light on the special place of linear forms among sublinear forms.
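The first step of the proof of Theorem 3.8 (separating a point from a closed convex set by means of its best approximation) can be illustrated in $\mathbb{R}^n$ with the Euclidean scalar product. The sketch below is a hedged example, not the general construction: it assumes that $C$ is a box $[lo_1, hi_1] \times \dots \times [lo_n, hi_n]$, for which the best approximation is obtained by clamping coordinates, and all function names are hypothetical.

```python
def dot(u, v):
    """Euclidean scalar product on R^n."""
    return sum(ui * vi for ui, vi in zip(u, v))

def project_onto_box(a, lo, hi):
    """Best approximation of a in the box: clamp each coordinate."""
    return [min(max(ai, l), h) for ai, l, h in zip(a, lo, hi)]

def separating_form(a, lo, hi):
    """Return (w, p): p is the best approximation of a, and f(x) = <x | w> with w = a - p."""
    p = project_onto_box(a, lo, hi)
    w = [ai - pi for ai, pi in zip(a, p)]
    return w, p

def sup_on_box(w, lo, hi):
    """Supremum of x -> <x | w> over the box, attained coordinatewise at an endpoint."""
    return sum(max(wi * l, wi * h) for wi, l, h in zip(w, lo, hi))

# Hypothetical data: the unit square and a point outside it.
a = [3.0, 0.5]
lo, hi = [0.0, 0.0], [1.0, 1.0]
w, p = separating_form(a, lo, hi)
gap = dot(w, a) - sup_on_box(w, lo, hi)   # f(a) - sup f(C)
print(p, w, gap)
```

In this example the gap $f(a) - \sup f(C)$ coincides with $\|a - p\|^2$, in agreement with the relation $f(a) - f(p) = \|a - p\|^2$ used in the proof.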
We first observe that a sublinear form $s$ on a linear space $X$ is linear if (and only if) it is odd: for any $x, y \in X$, $r \in \mathbb{R}$, $r < 0$, one has
$$s(x+y) \le s(x) + s(y) = -s(-x) - s(-y) \le -s(-x-y) = s(x+y), \qquad s(rx) = s(|r|(-x)) = |r|\, s(-x) = r s(x).$$

Proposition 3.19 The space $S(X)$ of finite sublinear functions on the vector space $X$, ordered by the pointwise order, is (lower) inductive, hence has minimal elements. Each such element is a linear form.

Proof We have to show that any totally ordered subset $C$ of $S(X)$ has a lower bound. Let $s_0$ be a fixed element of $C$. For every $s \in C$, $x \in X$, we have $s(x) \ge \inf(s_0(x), -s_0(-x))$, since we have either $s \ge s_0$ or $s(x) \ge -s(-x) \ge -s_0(-x)$ if $s \le s_0$. It follows that $p : X \to \mathbb{R}$ given by $p(x) := \inf\{s(x) : s \in C\}$ is finite, and, as easily checked, it is sublinear. Thus, by Zorn's Lemma, $S(X)$ has minimal elements. The second assertion is a consequence of the next lemma. This lemma is motivated by the observation preceding the statement, which incites us to look for sublinear forms that are odd on some linear subspaces.

Lemma 3.14 Let $s \in S(X)$ and let $u \in X$. Then the function $s_u$ given by
$$s_u(x) := \inf\{s(x - tu) - s(-tu) : t \in \mathbb{R}_+\}$$
is sublinear and such that $s_u \le s$, $s_u(-u) = -s_u(u)$. Thus, when $s$ is minimal in $S(X)$, one has $s_u = s$ and $s(-u) = -s(u)$ for all $u \in X$, so that the proof of the proposition reduces to the proof of this lemma.

Proof We first observe that the infimum in the definition of $s_u(x)$ is finite since for all $t \in \mathbb{R}_+$ we have $s(-tu) \le s(x - tu) + s(-x)$, hence
$$\forall t \in \mathbb{R}_+ \qquad -s(-x) \le s(x - tu) - s(-tu).$$
Moreover, the inequality $s_u(x) \le s(x)$ stems from the choice $t = 0$ in the definition of $s_u$. It is easy to see that $s_u$ is sublinear. Taking $t = 1$ in the definition of $s_u(u)$, we get $s_u(u) \le -s(-u)$. But since $0 \le s_u(u) + s_u(-u)$ and $s_u \le s$, we obtain
$$-s(-u) \le -s_u(-u) \le s_u(u) \le -s(-u),$$
hence $s_u(u) = -s_u(-u)$. □

Corollary 1.1 of Zorn's Lemma yields the following consequence.
Corollary 3.12 For every $s \in S(X)$ there exists some linear form $\ell$ on $X$ such that $\ell \le s$.

The next statement is suggestive. It holds under more general assumptions (Exercise 1; Fig. 3.2). In fact, if $f : X \to \mathbb{R}$ is convex, $g$ is as in the statement and $-g \le f$ there exists a linear form $\ell$ on $X$ such that $-g \le \ell \le f$. This follows from Theorem 3.9 by considering $h$ given by $h(x) := \inf_{t > 0} (1/t)(f(tx) - f(0))$ that is sublinear and such that $-g \le h \le f$.

[Fig. 3.2 The Sandwich Theorem (graphs of $f$ and $-g$)]

Theorem 3.9 (Sandwich Theorem) Let $g : X \to \mathbb{R}_\infty := \mathbb{R} \cup \{+\infty\}$ and $h : X \to \mathbb{R}$ be sublinear functions on a linear space $X$. If $-g \le h$ there exists a linear form $\ell$ on $X$ such that $-g \le \ell \le h$.

Proof Let $s : X \to \mathbb{R}$ be defined by
$$s(x) := \inf\{h(x+y) + g(y) : y \in X\}.$$
Since $h(y) \le h(x+y) + h(-x)$ and since $-h(y) \le g(y)$ for all $y \in X$, we have
$$h(x+y) + g(y) \ge h(y) - h(-x) + g(y) \ge -h(-x),$$
so that $s(x) > -\infty$ for all $x \in X$, and, of course, $s(x) \le h(x) < +\infty$. We easily verify that $s$ is sublinear (in fact, $s$ is the infimal convolution of $h$ and $k : X \to \mathbb{R}_\infty$ given by $k(x) = g(-x)$ for $x \in X$) and that $s \le h$, $s \le k$. Thus, taking a linear form $\ell \le s$, as in the preceding corollary, we have $\ell \le h$, $\ell \le k$, hence, for $x \in X$, $-\ell(x) = \ell(-x) \le k(-x) = g(x)$. □

Theorem 3.10 (Hahn-Banach) Let $X_0$ be a vector subspace of a real vector space $X$, let $\ell_0$ be a linear form on $X_0$ and let $h : X \to \mathbb{R}$ be a sublinear functional such that $\ell_0(x) \le h(x)$ for every $x \in X_0$. Then there exists a linear form $\ell$ on $X$ extending $\ell_0$ such that $\ell \le h$.
Proof Let $g : X \to \mathbb{R}_\infty$ be given by $g(x) = -\ell_0(x)$ for $x \in X_0$, $g(x) := +\infty$ for $x \in X \setminus X_0$. It is easy to see that $g$ is sublinear and that $-g \le h$. Taking $\ell$ such that $-g \le \ell \le h$, for $x \in X_0$ we get $\ell_0(x) = -g(x) \le \ell(x)$ and similarly $\ell_0(-x) \le \ell(-x)$, so that $\ell(x) = \ell_0(x)$ for all $x \in X_0$. □

Now let us turn to the case when $X$ is endowed with a topology.

Corollary 3.13 Let $X$ be a topological vector space and let $h : X \to \mathbb{R}$ be a continuous sublinear functional. Then there exists a continuous linear form $\ell$ on $X$ such that $\ell \le h$.

Proof By Corollary 3.12 there exists a linear form $\ell$ on $X$ such that $\ell \le h$. Let us prove that $\ell$ is continuous. Given $\varepsilon > 0$ we take a symmetric neighborhood $V$ of $0$ such that $h(x) \le \varepsilon$ for all $x \in V$. Then for $x \in V$ we have $\ell(x) \le h(x) \le \varepsilon$ and, since $-x \in V$, also $-\ell(x) = \ell(-x) \le \varepsilon$, so that $|\ell(x)| \le \varepsilon$ for all $x \in V$. Thus $\ell$ is continuous. □

In particular, if $p : X \to \mathbb{R}$ is a seminorm on $X$ one can find a linear form $\ell$ on $X$ such that $\ell \le p$. Such an assertion can be made more precise. We just give a version with a norm.

Corollary 3.14 Let $X$ be a normed vector space and let $\bar{x} \in X$. Then there exists a continuous linear form $\ell$ on $X$ such that $\|\ell\| = 1$ and $\ell(\bar{x}) = \|\bar{x}\|$.

Proof Let $X_0 := \mathbb{R}\bar{x}$ and let $\ell_0$ be the linear form on $X_0$ given by $\ell_0(r\bar{x}) = r\|\bar{x}\|$ for $r \in \mathbb{R}$. Thus, for every $x \in X_0$, one has $\ell_0(x) \le h(x) := \|x\|$. The Hahn-Banach Theorem yields some linear form $\ell$ on $X$ extending $\ell_0$ such that $\ell \le h$. Then one has $\|\ell\| \le 1$ and $\ell(\bar{x}) = \|\bar{x}\|$, hence $\|\ell\| = 1$. □

The preceding corollary can be rephrased by saying that the duality map $J : X \to \mathcal{P}(X^*)$ defined as follows has nonempty values:
$$J(x) := \{x^* \in X^* : \langle x^*, x \rangle = \|x\|^2,\ \|x^*\| = \|x\|\}.$$
This (multi)map or set-valued map is a useful tool, in particular for the geometry of normed spaces and for the study of dissipative operators. Explicit expressions for some spaces are to be found in the exercises below. Another tool is given in the next corollary. On a special class of normed spaces called inner product spaces or Hilbert spaces we shall have at our disposal a better tool satisfying a bilinear property.
Corollary 3.15 (Lumer) For any normed space $(X, \|\cdot\|)$ there exists a semi-scalar product, i.e. a function $[\cdot, \cdot] : X \times X \to \mathbb{R}$ such that for all $x, y, z \in X$, $r \in \mathbb{R}$ one has
$$[x, y+z] = [x,y] + [x,z], \qquad [x, ry] = r[x,y], \qquad |[x,y]| \le \|x\| \cdot \|y\|, \qquad [x,x] = \|x\|^2.$$

Proof It suffices to take a selection $j$ of $J$, i.e. a map $j : X \to X^*$ such that $j(x) \in J(x)$ for all $x \in X$ and to set $[x,y] = \langle j(x), y \rangle$. Note that we can even require that $[tx, y] = t[x,y]$ for all $(t, x, y) \in \mathbb{R}_+ \times X \times X$. □

Corollary 3.14 is a special case of the next corollary.

Corollary 3.16 Let $X$ be a normed vector space and let $Y$ be a vector subspace of $X$. Then any continuous linear form $y^*$ on $Y$ has a linear continuous extension $x^*$ to $X$ such that $\|x^*\| = \|y^*\|$.

Proof Let $c := \|y^*\|$. Theorem 3.10 yields some linear form $\ell$ on $X$ extending $y^*$ and satisfying $\ell \le c\,\|\cdot\|$. Then $x^* := \ell$ is continuous and $\|x^*\| = c$. □

Corollary 3.17 Let $Y$ be a closed linear subspace of a normed space $X$. If $Y \ne X$ there exists a non-null continuous linear form $f$ on $X$ that is null on $Y$.

Proof Let $p : X \to X/Y$ be the quotient map. Since $Y \ne X$ one can find some non-null $z \in X/Y$. Then Corollary 3.14 yields some $\ell$ in the dual of $X/Y$ such that $\ell(z) \ne 0$. Then $f = \ell \circ p$ is non-null on $X$ and null on $Y$. □

Corollary 3.18 Let $Y$ be a closed vector subspace of a Banach space $X$. Then $Y^*$ is isometric to $X^*/Y^\perp$, where $Y^\perp := \{x^* \in X^* : x^*(y) = 0\ \forall y \in Y\}$.

Proof Let $r : X^* \to Y^*$ be the restriction map given by $r(x^*) := x^*|_Y$. Corollary 3.16 ensures that $r$ is onto. The kernel of $r$ being precisely $Y^\perp$, one can factorize $r$ as $r = q \circ p$, where $p : X^* \to X^*/Y^\perp$ is the canonical projection and $q : X^*/Y^\perp \to Y^*$ is bijective and continuous. Giving to $X^*/Y^\perp$ the quotient norm defined by $\|z\| := \inf\{\|x^*\| : x^* \in p^{-1}(z)\}$, Corollary 3.16 can serve to prove that $q$ is isometric. □

Corollary 3.19 Given normed spaces $X$, $Y$, the transpose $A^\intercal$ of $A \in L(X, Y)$ satisfies $\|A^\intercal\| = \|A\|$.

Proof We have seen that $\|A^\intercal\| \le \|A\|$. Given $x \in X$, Corollary 3.14 yields some $y^* \in Y^*$ such that $\|y^*\| = 1$ and $\langle y^*, Ax \rangle = \|Ax\|$. Then
$$\|Ax\| = \langle A^\intercal y^*, x \rangle \le \|A^\intercal y^*\| \cdot \|x\| \le \|A^\intercal\| \cdot \|x\|.$$
Thus $\|A\| \le \|A^\intercal\|$ and equality holds. □

Now let us turn to geometric forms of the Hahn-Banach Theorem. We first consider an algebraic version.
We recall that a subset $C$ of a vector space $X$ is said to be absorbing if for all $x \in X$ there exists some $r > 0$ such that $rx \in C$. The core of a convex subset $C$ of $X$ is the set of points $a \in X$ such that $C - a$ is absorbing.

Proposition 3.20 Let $C$ be an absorbing convex subset of a vector space $X$ and let $e \in X \setminus \mathrm{core}\, C$. Then there exists a hyperplane $H$ of $X$ such that $e \in H$ and $H \cap \mathrm{core}\, C = \varnothing$. Moreover, $\mathrm{core}\, C$ is contained in one of the strict half-spaces determined by $H$.

Proof Let $j := \mu_C$ be the Minkowski gauge of $C$:
$$j(x) := \inf\{t > 0 : x \in tC\}.$$
Since $C$ is absorbing and convex, $j$ is finite on $X$ and sublinear. For all $x \in \mathrm{core}\, C$ one has $j(x) < 1$ since there exists some $r > 0$ such that $rx \in C - x$, hence $j(x) \le (1+r)^{-1}$. Conversely, if $j(x) < 1$ then $x \in \mathrm{core}\, C$ since for all $u \in X$ and for $\varepsilon > 0$ such that $\varepsilon \max(j(u), j(-u)) < 1 - j(x)$ one has, for all $r \in [-\varepsilon, \varepsilon]$, $j(x + ru) \le j(x) + j(ru) < 1$, hence $x + ru \in tC \subset C$ for some $t \in\, ]0,1[$. Since $e \in X \setminus \mathrm{core}\, C$, we have $j(e) \ge 1$. Let $X_0 := \mathbb{R}e$, and let $\ell_0 : X_0 \to \mathbb{R}$ be given by $\ell_0(re) := r j(e)$. Then, since $r j(e) \le 0 \le j(re)$ for $r \le 0$, we have $\ell_0 \le j|_{X_0}$, so that there exists some linear form $h$ on $X$ extending $\ell_0$ with $h \le j$. Let $H := \{x \in X : h(x) = j(e)\}$. Then $e \in H$ and for $x \in \mathrm{core}\, C$ we have $h(x) \le j(x) < 1 \le j(e)$, hence $x \notin H$ and $\mathrm{core}\, C \subset h^{-1}(]-\infty, j(e)[)$. □

A topological version is eased by the following observation.

Lemma 3.15 Let $X$ be a normed vector space, let $c \in \mathbb{R}$, and let $h$ be a non-null linear form on $X$. The hyperplane $H := h^{-1}(c)$ is closed if and only if $h$ is continuous.

Proof Obviously, when $h$ is continuous, $H$ is closed, $\{c\}$ being closed in $\mathbb{R}$. Conversely, suppose that $H$ is closed. Since $h$ is non-null, $X \setminus H$ is nonempty. Let $x_0 \in X \setminus H$, so that there exists some $r > 0$ for which $B(x_0, r) \subset X \setminus H$. Assuming $h(x_0) < c$ (the other possibility is dealt with by changing $h$, $c$ into $-h$, $-c$), let us note that $h(x) < c$ for all $x \in B(x_0, r)$: otherwise, if there exists an $x_1 \in B(x_0, r)$ such that $h(x_1) > c$, for $t := (h(x_0) - c)(h(x_0) - h(x_1))^{-1}$ one has $h((1-t)x_0 + t x_1) = c$ and $(1-t)x_0 + t x_1 \in B(x_0, r)$, contradicting $B(x_0, r) \subset X \setminus H$. Then, for all $u \in B(0,1)$ we have $h(x_0 + ru) < c$ or $h(u) < r^{-1}(c - h(x_0))$ and $h$ is continuous. □

Theorem 3.11 (Eidelheit) Let $A$ and $B$ be two disjoint nonempty convex subsets of a topological vector space $X$. If $A$ is open, then there exists a closed hyperplane $H$ separating $A$ and $B$: for some $f \in X^* \setminus \{0\}$, $r \in \mathbb{R}$ one has
$$\forall a \in A,\ \forall b \in B \qquad f(a) > r \ge f(b).$$

Proof Let $D := A - B := \{a - b : a \in A,\ b \in B\}$. It is a convex subset of $X$ which is open as the union over $b \in B$ of the translated sets $A - b$, and $0 \notin D$. Taking $e \in D$ and setting $C := e - D$, we see that $e \notin C$, $0 \in C$ and $C$ is absorbing. Thus, by Proposition 3.20, there exist some $s > 0$ and some linear form $f$ on $X$ such that $f(e) = s$ and $f(x) < s$ for all $x \in C$. Since $f$ is bounded above on the neighborhood $C$ of $0$, $f$ is continuous. Moreover, for $a \in A$, $b \in B$ one has $f(e - a + b) < s = f(e)$, hence $f(a) \ge \sup f(B)$. In fact, since $A$ is open and $f \ne 0$, one must have $f(a) > r := \sup f(B)$. □

Theorem 3.12 (Hahn-Banach Strong Separation Theorem) Let $A$ and $B$ be two disjoint nonempty convex subsets of a normed space (or a locally convex topological vector space) $X$. If $A$ is compact and $B$ is closed, then there exist some $f \in X^* \setminus \{0\}$ and some $r \in \mathbb{R}$, $\delta > 0$ such that
$$\forall a \in A,\ \forall b \in B \qquad f(a) > r + \delta > r > f(b).$$
[Fig. 3.3 Separation of two convex subsets $A$ and $B$]
Proof For every $a \in A$ there exists a symmetric open convex neighborhood $V_a$ of $0$ in $X$ such that $(a + 2V_a) \cap B = \varnothing$. Let $F$ be a finite subset of $A$ such that the family $(a + V_a)_{a \in F}$ forms a finite covering of $A$. Then, if $V$ is the intersection of the family $(V_a)_{a \in F}$, $V$ is an open neighborhood of $0$ and $A' \cap B = \varnothing$ for $A' := A + V$. The Eidelheit Theorem yields $f \in X^* \setminus \{0\}$ and $s \in \mathbb{R}$ such that $f(a) > s \ge f(b)$ for all $a \in A'$, $b \in B$. The compactness of $A$ ensures that there exists a $\delta > 0$ such that $f(a) > s + 2\delta$ for all $a \in A$. Setting $r := s + \delta$, we get the result (Fig. 3.3). □

Example The compactness assumption on $A$ cannot be omitted, as shown by the example of $X = \mathbb{R}^2$, $A := \{(r,s) \in \mathbb{R}^2_+ : rs \ge 1\}$, $B := \mathbb{R} \times\, ]-\infty, 0]$.

The following application to approximate solutions to linear systems will be used later on.

Lemma 3.16 (Helly) Let $X$ be a normed space, let $f_1, \dots, f_n$ in the dual $X^*$ of $X$ and let $a_1, \dots, a_n$ be real numbers. The following assertions are equivalent:
(a) for any $\varepsilon > 0$ there exists some $x \in B_X$ such that $|f_i(x) - a_i| \le \varepsilon$ for all $i \in \mathbb{N}_n$;
(b) for all $(r_1, \dots, r_n) \in \mathbb{R}^n$ one has $\big|\sum_{i=1}^n r_i a_i\big| \le \big\|\sum_{i=1}^n r_i f_i\big\|$.

Proof (a)$\Rightarrow$(b) Given $(r_1, \dots, r_n) \in \mathbb{R}^n$, let $s := |r_1| + \dots + |r_n|$, so that by (a), given $\varepsilon > 0$ one can find $x_\varepsilon \in B_X$ such that
$$\Big| \sum_{i=1}^n r_i f_i(x_\varepsilon) - \sum_{i=1}^n r_i a_i \Big| \le \sum_{i=1}^n |r_i|\, |f_i(x_\varepsilon) - a_i| \le \varepsilon s.$$
Thus, since $\|x_\varepsilon\| \le 1$,
$$\Big| \sum_{i=1}^n r_i a_i \Big| \le \Big| \Big(\sum_{i=1}^n r_i f_i\Big)(x_\varepsilon) \Big| + \varepsilon s \le \Big\| \sum_{i=1}^n r_i f_i \Big\| + \varepsilon s.$$
Since $\varepsilon > 0$ is arbitrarily small, we get the inequality of assertion (b).

(b)$\Rightarrow$(a) Let us consider the map $f : X \to \mathbb{R}^n$ with components $f_1, \dots, f_n$ and let $a := (a_1, \dots, a_n)$. Assertion (a) means that $a$ belongs to the closure $\mathrm{cl}(f(B_X))$ of the image of $B_X$ under $f$. If that does not hold, by compactness of $\mathrm{cl}(f(B_X))$ and Theorem 3.12, one can find $x^* = (r_1, \dots, r_n) \in \mathbb{R}^n$ and $c \in \mathbb{R}$ such that
$$\forall x \in B_X \qquad x^* \cdot f(x) < c < x^* \cdot a.$$
Then (b) does not hold since these relations imply that
$$\Big\| \sum_{i=1}^n r_i f_i \Big\| = \sup_{x \in B_X} \Big| \Big(\sum_{i=1}^n r_i f_i\Big)(x) \Big| = \sup_{x \in B_X} |x^* \cdot f(x)| \le c < \sum_{i=1}^n r_i a_i.$$
□

A special case of the Fenchel transform we will study later on is the passage from closed convex subsets (or their indicator functions) to their support functions. Recall that the support function $h_C$ or $\sigma_C$ of a subset $C$ of a normed space $X$ is the function $h_C : X^* \to \overline{\mathbb{R}}$ given by
$$h_C(x^*) := \sup\{\langle x^*, x \rangle : x \in C\}.$$

Corollary 3.20 (Hörmander) The map $h : C \mapsto h_C$ is an injective lattice morphism from the set $\mathcal{C}(X)$ of nonempty closed convex subsets of the normed space $X$ into the space $\mathcal{H}(X^*)$ of positively homogeneous functions on $X^*$ null at $0$. Moreover, $h_{\lambda C} = \lambda h_C$ for all $\lambda \in \mathbb{R}_+$, $C \in \mathcal{C}(X)$ and $h_{\mathrm{cl}(A+B)} = h_A + h_B$ for all $A, B \in \mathcal{C}(X)$.

Proof We just prove the injectivity of $h$, leaving the other assertions as exercises. It suffices to prove that for $C, D \in \mathcal{C}(X)$ satisfying $h_C \ge h_D$ one has $C \supset D$ since the roles of $C$ and $D$ can be interchanged. Given $b \in X \setminus C$ we can find $x^* \in X^*$ such that $\langle x^*, b \rangle > \sup_{x \in C} \langle x^*, x \rangle$. Then we cannot have $b \in D$ since otherwise we would have $h_D(x^*) \ge \langle x^*, b \rangle > \sup_{x \in C} \langle x^*, x \rangle = h_C(x^*)$. □
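The additivity rule $h_{\mathrm{cl}(A+B)} = h_A + h_B$ of Corollary 3.20 can be observed on polygons in $\mathbb{R}^2$. The sketch below is a hedged illustration: it relies on the fact that the support function of the convex hull of a finite set is the maximum of the linear form over that set, identifies $(\mathbb{R}^2)^*$ with $\mathbb{R}^2$ through the Euclidean scalar product, and uses two hypothetical triangles whose sum is already closed.

```python
import math

def support(points, u):
    """Support function at u of the convex hull of a finite set of points of R^2."""
    return max(px * u[0] + py * u[1] for px, py in points)

# Hypothetical data: vertices of two triangles.
A = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
B = [(-1.0, -1.0), (1.0, -1.0), (0.0, 2.0)]

# co(A) + co(B) is the convex hull of the pairwise sums of the vertices.
A_plus_B = [(ax + bx, ay + by) for ax, ay in A for bx, by in B]

# Compare h_{A+B} with h_A + h_B on twelve directions of the unit circle.
directions = [(math.cos(2 * math.pi * k / 12), math.sin(2 * math.pi * k / 12))
              for k in range(12)]
max_gap = max(abs(support(A_plus_B, u) - support(A, u) - support(B, u))
              for u in directions)
print(max_gap)
```

Up to rounding errors the printed gap vanishes, as predicted; testing finitely many directions does not of course replace the proof left as an exercise.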
Exercises

1. Let $X$ and $Y$ be finite dimensional spaces, let $A \in L(X, Y)$ and let $f : X \to \mathbb{R}_\infty$, $g : Y \to \mathbb{R}_\infty$ be convex functions such that $\mathbb{R}_+(\mathrm{dom}\, g - A(\mathrm{dom}\, f)) = Y$ and $f \ge -g \circ A$. Prove that there exist some $\ell \in X^*$ and $c \in \mathbb{R}$ satisfying $f \ge \ell - c \ge -g \circ A$.

2. Prove the Mazur-Orlicz Theorem: Let $h : X \to \mathbb{R}$ be a sublinear functional on some vector space $X$ and let $C$ be a nonempty convex subset of $X$. Then there exists a linear form $\ell$ on $X$ such that $\ell \le h$ and $\inf \ell(C) = \inf h(C)$. [See [232, p. 13].]

3. Prove the Mazur-Bourgin Theorem: Let $C$ be a convex subset with nonempty interior in a topological vector space $X$ and let $A$ be an affine subspace of $X$ such that $A \cap \mathrm{int}\, C = \varnothing$. Prove that there exists a hyperplane $H$ of $X$ containing $A$ which does not meet $\mathrm{int}\, C$. [See [164, p. 5].]

4. Prove the Mazur Theorem: Let $(x_n)$ be a sequence in a normed space $X$ that weakly converges to some $x \in X$. Then there exists a sequence $(y_n)$ strongly converging to $x$ such that, for all $k \in \mathbb{N}$, $y_k$ is a convex combination of the $x_n$'s. [Hint: Consider the closed convex hull of $\{x_n : n \in \mathbb{N}\}$.]

5. Prove the Sandwich Theorem using Eidelheit's Theorem.

6. Prove Stone's Theorem: Let $A$ and $B$ be disjoint convex subsets of a normed space $X$. Show that there exists a pair $(C, D)$ of disjoint convex subsets satisfying $A \subset C$ and $B \subset D$ which is maximal with respect to the order induced by inclusion. Show that when $A$ is open one can take for $C$ and $D$ opposite half spaces, $C$ being open.

7. Verify that for $p \in\, ]1, \infty[$, $q := (1 - 1/p)^{-1}$ and the usual norm, the duality map $J : L_p(S, \mu) \to L_q(S, \mu)$ is single-valued and is given by $J(x)(s) = \|x\|_p^{2-p}\, |x(s)|^{p-2} x(s)$ for $s \in S$ such that $x(s) \ne 0$, $J(x)(s) = 0$ for $s \in S$ such that $x(s) = 0$.

8. Verify that the duality (multi)map $J : L_1(S, \mu) \rightrightarrows (L_1(S, \mu))^*$ is given by $J(x)(s) = \{\|x\|_1\, x(s)/|x(s)|\}$ for $s \in S$ such that $x(s) \ne 0$, $J(x)(s) = \{y(s) : y \in L_\infty(S, \mu),\ \|y\|_\infty \le \|x\|_1\}$ for $s \in x^{-1}(0)$, $(L_1(S, \mu))^*$ being identified with $L_\infty(S, \mu)$.
3.3.3 Polarity and Orthogonality

Let us give a short account of polarity, a passage from a subset of a normed space $X$ to a subset of the dual $X^*$ of $X$ (or the reverse, or, more generally from a subset of $X$ to a subset of a space $Y$ paired with $X$ by a bilinear coupling function). This correspondence is a geometric analogue of a correspondence for functions, the Fenchel conjugacy, which we will study in Chap. 6. The polar set $S^0$ of a subset $S$ of $X$ is the set defined as follows with the help of its support function:
$$S^0 := h_S^{-1}(]-\infty, 1]) := \{x^* \in X^* : \forall x \in S\ \ \langle x^*, x \rangle \le 1\}.$$
Clearly, $S^0$ is a weak$^*$ closed convex subset of $X^*$ containing $0$. If $S$ is a cone, then $S^0$ is a convex cone and
$$S^0 = \{x^* \in X^* : \forall x \in S\ \ \langle x^*, x \rangle \le 0\};$$
if $S$ is a linear subspace, then $S^0$ is the linear subspace
$$S^\perp := \{x^* \in X^* : \forall x \in S\ \ \langle x^*, x \rangle = 0\},$$
also called the orthogonal of $S$. It is also easy to show that
$$(S \cup T)^0 = S^0 \cap T^0.$$

[Fig. 3.4 Polar cones: a cone $C$ and its polar cone $C^0$]

A base of neighborhoods of $0$ for the weak$^*$ topology is formed by the polar sets of finite subsets. On the other hand, one has the following classical theorem (Fig. 3.4).

Theorem 3.13 (Alaoglu-Bourbaki) Let $X$ be a normed space and let $S$ be a neighborhood of $0$. Then $S^0$ is weak$^*$ compact.

Proof Since $S^0 \subset T^0$ when $T \subset S$ and since $S^0$ is weak$^*$ closed, it suffices to prove the result when $S$ is a ball centered at $0$. Since $(rS)^0 = r^{-1} S^0$ for $r > 0$, we may suppose $S = B_X$. Then $S^0 = B_{X^*}$ and the result has been shown in Theorem 3.7 in that case. □

The polar $S^0$ of a subset $S$ of $X^*$ is defined similarly by
$$S^0 := \{x \in X : \forall x^* \in S\ \ \langle x^*, x \rangle \le 1\}.$$
If $S$ is a subset of $X$, then its bipolar is the set $S^{00} := (S^0)^0$.

Corollary 3.21 (Bipolar Theorem) For every nonempty subset $S$ of a normed space $X$, its bipolar is the closed convex hull of $S \cup \{0\}$: $S^{00} = \overline{\mathrm{co}}(S \cup \{0\})$. In particular, if $S$ is a convex subset of $X$ containing $0$, then $S^{00}$ is the closure $\mathrm{cl}(S)$ of $S$.

Proof Let $C := \overline{\mathrm{co}}(S \cup \{0\})$. Since one has $S \subset S^{00}$, and since $S^{00}$ is closed convex and contains $0$, one has $C \subset S^{00}$. Given $b \in X \setminus C$, Theorem 3.12 yields $x^* \in X^*$ and $r \in \mathbb{R}$ such that $\langle x^*, b \rangle > r > \langle x^*, a \rangle$ for all $a \in C$. Since $0 \in C$, one has $r > 0$ and $r^{-1} x^* \in C^0 \subset S^0$, hence $b \notin S^{00}$. Therefore $S^{00} = C$. □

Remark One must be careful with the use of polar sets. One cannot apply the preceding result with $S$ a subset of the dual space of $X$, unless one replaces the closure with the weak$^*$ closure $\overline{\mathrm{co}}^*(S \cup \{0\})$ of $\mathrm{co}(S \cup \{0\})$, using Proposition 3.18.
Let us give some calculus rules. They will be completed by a more refined result (Theorem 3.25).

Proposition 3.21 Let $G$ and $H$ be cones (resp. linear subspaces) of a normed space $E$. Then
$$G^0 \cap H^0 = (G + H)^0 \qquad (\text{resp. } G^\perp \cap H^\perp = (G + H)^\perp), \tag{3.5}$$
so that when $G$ and $H$ are convex cones one has $(G^0 \cap H^0)^0 = \mathrm{cl}(G + H)$. If $G$ and $H$ are closed, convex cones, then
$$G \cap H = (G^0 + H^0)^0 \qquad (\text{resp. } G \cap H = (G^\perp + H^\perp)^\perp). \tag{3.6}$$

Proof Since $M^0 \subset L^0$ when $L \subset M$ and since $L^0 = (L \cup \{0\})^0$, we have $(G + H)^0 \subset G^0 \cap H^0$. Conversely, for any $f \in G^0 \cap H^0$ and any $x \in G$, $y \in H$ we have $\langle f, x + y \rangle = \langle f, x \rangle + \langle f, y \rangle \le 0$, hence $f \in (G + H)^0$. If $G$ and $H$ are closed, convex cones, replacing $G$ and $H$ with their polar cones in relation (3.5), we get $G \cap H = G^{00} \cap H^{00} = (G^0 + H^0)^0$. □

The following is a prototype of a duality result for minimization problems.

Proposition 3.22 Given a convex cone $C$ of a normed space $X$ and an element $\bar{x}$ of $X$ one has
$$d(\bar{x}, C) = \max\{\langle x^*, \bar{x} \rangle : x^* \in C^0,\ \|x^*\| \le 1\}.$$
If $D$ is a weak$^*$ closed convex cone of $X^*$, then for all $\bar{x}^* \in X^*$ one has
$$d(\bar{x}^*, D) = \sup\{\langle \bar{x}^*, x \rangle : x \in D^0 \cap B_X\}.$$

Proof For every $x^* \in C^0 \cap B_{X^*}$ and any $x \in C$ we have
$$\langle x^*, \bar{x} \rangle \le \langle x^*, \bar{x} - x \rangle \le \|x^*\| \cdot \|\bar{x} - x\| \le \|\bar{x} - x\|, \tag{3.7}$$
hence, taking the infimum over $x \in C$, $\langle x^*, \bar{x} \rangle \le \delta := d(\bar{x}, C)$. Now, since $C$ and the open ball $B(\bar{x}, \delta)$ are disjoint, we can separate them by a hyperplane $H := \{x \in X : \langle \bar{x}^*, x \rangle = c\}$ with $\|\bar{x}^*\| = 1$, $c \in \mathbb{R}$:
$$\forall x \in C,\ y \in B(\bar{x}, \delta) \qquad \langle \bar{x}^*, x \rangle \le c < \langle \bar{x}^*, y \rangle.$$
Replacing $x$ by $tx$ with $t \to \infty$ and $t \to 0_+$, we see that $\bar{x}^* \in C^0$ and $c \ge 0$. Taking the infimum over $y \in B(\bar{x}, \delta) = \bar{x} + B(0, \delta)$ we get $c \le \langle \bar{x}^*, \bar{x} \rangle - \delta$, hence $\delta \le \langle \bar{x}^*, \bar{x} \rangle$. With (3.7) this shows that the supremum of $\{\langle x^*, \bar{x} \rangle : x^* \in C^0 \cap B_{X^*}\}$ is attained for $x^* = \bar{x}^*$ and its value is $\delta$.

Now let $D$ be a weak$^*$ closed convex cone of $X^*$ and let $\bar{x}^* \in X^*$. Inequalities similar to those in (3.7) show that $\sup\{\langle \bar{x}^*, x \rangle : x \in D^0 \cap B_X\} \le \delta := d(\bar{x}^*, D)$. For all $r \in\, ]0, \delta[$ the closed ball $B[\bar{x}^*, r]$ is weak$^*$ compact and disjoint from the weak$^*$ closed convex set $D$. Theorem 3.12 yields some $c_r \in \mathbb{R}$ and $x_r \in X$, the dual of $(X^*, \sigma(X^*, X))$, such that $\|x_r\| = 1$ and
$$\forall x^* \in D,\ y^* \in B[\bar{x}^*, r] \qquad \langle x^*, x_r \rangle \le c_r < \langle y^*, x_r \rangle.$$
By homogeneity we see that $x_r \in D^0$. Since $0 \in D$ we have $c_r \ge 0$ and taking the infimum over $y^* \in B[\bar{x}^*, r]$ we get $c_r \le \langle \bar{x}^*, x_r \rangle - r$. Thus, $r \le \langle \bar{x}^*, x_r \rangle \le \sup\{\langle \bar{x}^*, x \rangle : x \in D^0 \cap B_X\}$ and since $r$ is arbitrarily close to $\delta$ we get the second equality of the statement. □

Remark Moreover, if the infimum over $w \in C$ of the distances $\|\bar{x} - w\|$ is attained at $\bar{w} \in C$, then one has $\langle \bar{x}^*, \bar{x} - \bar{w} \rangle = \|\bar{x} - \bar{w}\|$ since the following inequalities are equalities:
$$\|\bar{x} - \bar{w}\| = \langle \bar{x}^*, \bar{x} \rangle \le \langle \bar{x}^*, \bar{x} - \bar{w} \rangle \le \|\bar{x} - \bar{w}\|.$$

The following lemma will be used to establish a minimax theorem. In it, we denote by $\vee$ and $\wedge$ the operations given by $r_1 \vee \dots \vee r_k = \max(r_1, \dots, r_k)$;
$r_1 \wedge \dots \wedge r_k = \min(r_1, \dots, r_k)$ for $r_i \in \mathbb{R}$, $i = 1, \dots, k$ and, as above, $\Delta_k$ stands for the canonical simplex of $\mathbb{R}^k$: $\Delta_k := \{(s_1, \dots, s_k) \in \mathbb{R}^k_+ : s_1 + \dots + s_k = 1\}$. As usual, max (resp. min) means that one has attainment of the supremum (resp. infimum) when it is finite.

Lemma 3.17 Let $f_1, \dots, f_k$ be convex functions on a convex subset $C$ of a vector space $X$. Then
$$\inf_C (f_1 \vee \dots \vee f_k) = \max\{\inf_C (s_1 f_1 + \dots + s_k f_k) : s := (s_1, \dots, s_k) \in \Delta_k\}.$$
If $g_1, \dots, g_k$ are concave functions on $C$ then
$$\sup_C (g_1 \wedge \dots \wedge g_k) = \min\{\sup_C (s_1 g_1 + \dots + s_k g_k) : s := (s_1, \dots, s_k) \in \Delta_k\}.$$

Proof Let $h := f_1 \vee \dots \vee f_k$. Then for each $s := (s_1, \dots, s_k) \in \Delta_k$ we have $h \ge h_s := s_1 f_1 + \dots + s_k f_k$, hence $\inf_C h \ge \inf_C h_s$ and $\inf_C h \ge \sup\{\inf_C (s_1 f_1 + \dots + s_k f_k) : s := (s_1, \dots, s_k) \in \Delta_k\}$, with equality if $\inf_C h = -\infty$. Now let
$$A := \{r = (r_1, \dots, r_k) \in \mathbb{R}^k : \exists x \in C,\ r_i > f_i(x),\ i = 1, \dots, k\},$$
which is convex. For $t \le \inf_C h$ one has $b := (t, \dots, t) \notin A$. The finite dimensional separation theorem yields some $s = (s_1, \dots, s_k) \in \mathbb{R}^k \setminus \{0\}$ such that
$$s_1 r_1 + \dots + s_k r_k \ge s_1 t + \dots + s_k t \qquad \forall r = (r_1, \dots, r_k) \in A.$$
We have $s_i \ge 0$ for $i = 1, \dots, k$ since $r_i$ can be arbitrarily large. Since $s \ne 0$, by homogeneity, we may suppose $s_1 + \dots + s_k = 1$, i.e. $s \in \Delta_k$. Then, for each $x \in C$, since $r_i$ can be arbitrarily close to $f_i(x)$ we get
$$s_1 f_1(x) + \dots + s_k f_k(x) \ge s_1 t + \dots + s_k t = t.$$
Therefore $\inf_C (s_1 f_1 + \dots + s_k f_k) \ge t$ and since $t$ can be arbitrarily close to $\inf_C h$, we get $\sup_{s \in \Delta_k} \inf_C (s_1 f_1 + \dots + s_k f_k) \ge \inf_C h$, so that equality holds. When $\inf_C h$ is finite we can take $t = \inf_C h$ and the inequality $\inf_C (s_1 f_1 + \dots + s_k f_k) \ge t$ shows that we have attainment for this $s \in \Delta_k$. The second assertion is obtained by setting $f_i := -g_i$. □

Theorem 3.14 (Infimax Theorem) Let $A$ and $B$ be nonempty convex subsets of vector spaces $X$ and $Y$ respectively, and let $\ell : A \times B \to \mathbb{R}$ be a function that is convex in its first variable and concave in its second variable. Then, if $B$ is compact with respect to some topology on $Y$ and if $\ell$ is upper semicontinuous in its second variable, one has
$$\inf_{x \in A} \max_{y \in B} \ell(x, y) = \max_{y \in B} \inf_{x \in A} \ell(x, y).$$

Proof The inequality $\alpha := \inf_{x \in A} \sup_{y \in B} \ell(x, y) \ge \sup_{y \in B} \inf_{x \in A} \ell(x, y) =: \beta$ is valid without any assumption. Here we can write max instead of sup since $B$ is compact and $\ell(x, \cdot)$ is u.s.c. for each $x \in A$, as is $\inf_{x \in A} \ell(x, \cdot)$. Given $k \in \mathbb{N} \setminus \{0\}$ and $a_1, \dots, a_k \in A$, applying the preceding lemma with $C = B$, $g_i = \ell(a_i, \cdot)$, we can find $s \in \Delta_k$ such that
$$\sup_{b \in B} (\ell(a_1, b) \wedge \dots \wedge \ell(a_k, b)) = \sup_{b \in B} (s_1 \ell(a_1, b) + \dots + s_k \ell(a_k, b)).$$
Since $\ell(\cdot, b)$ is convex for each $b \in B$, we get
$$\sup_{b \in B} (\ell(a_1, b) \wedge \dots \wedge \ell(a_k, b)) \ge \sup_{b \in B} \ell(s_1 a_1 + \dots + s_k a_k, b) \ge \alpha.$$
Introducing for $a \in A$ the closed subset $B_a := \{b \in B : \ell(a, b) \ge \alpha\}$, which is nonempty by Weierstrass' Theorem, we deduce from these inequalities that $B_{a_1} \cap \dots \cap B_{a_k}$ is nonempty. The finite intersection property of the compact space $B$ ensures that $\bigcap_{a \in A} B_a$ is nonempty. This means that there exists some $\bar{b} \in B$ such that $\inf_{a \in A} \ell(a, \bar{b}) \ge \alpha$. Thus $\beta \ge \alpha$ and equality holds. □
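The conclusion of the Infimax Theorem can be observed on a simple convex-concave function. The sketch below is an illustration only: the function $\ell(x, y) := x^2 + xy - y^2$, the sets $A = B = [-1, 1]$ and the uniform grid replacing them are assumptions chosen here, and a grid search is not a proof.

```python
def ell(x, y):
    """A hypothetical function, convex in x and concave in y."""
    return x * x + x * y - y * y

# Uniform grid on [-1, 1]; it contains the saddle point coordinate 0.
grid = [k / 50 for k in range(-50, 51)]

# inf over x of max over y, and max over y of inf over x, both on the grid.
inf_max = min(max(ell(x, y) for y in grid) for x in grid)
max_inf = max(min(ell(x, y) for x in grid) for y in grid)
print(inf_max, max_inf)
```

On this grid both quantities equal $0$, the value of $\ell$ at its saddle point $(0, 0)$: for $x \ne 0$ one has $\max_y \ell(x, y) \ge \ell(x, 0) > 0$ and for $y \ne 0$ one has $\min_x \ell(x, y) \le \ell(0, y) < 0$.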
Exercises

1. Let $j : Y \to X$ be the canonical injection of a vector subspace $Y$ of a normed space $X$ into $X$ and let $j^\intercal : X^* \to Y^*$ be its transpose map given by $j^\intercal(x^*) := x^* \circ j$ for $x^* \in X^*$. Rephrase Corollary 3.16 as: $j^\intercal$ is surjective. Show that the kernel of $j^\intercal$ is the polar $Y^0$ of $Y$ and that $Y^*$ can be isometrically identified with $X^*/Y^0$.

2. Let $A : W \to X$ be a continuous linear operator with transpose map $A^\intercal : X^* \to W^*$ given by $A^\intercal(x^*) = x^* \circ A$. Show that for $D := A(C)$ one has $D^0 = (A^\intercal)^{-1}(C^0)$.

3. Let $A : W \to X$ be a continuous linear operator with transpose map $A^\intercal : X^* \to W^*$ and let $D$ be a closed convex subset of $X$ containing the origin. Prove that $(A^{-1}(D))^0 = \mathrm{cl}^*(A^\intercal(D^0))$.

4. Let $G$ and $H$ be closed convex cones of a normed space $X$. Prove that $(G \cap H)^0 = \mathrm{cl}^*(G^0 + H^0)$ and that one can omit the weak$^*$ closure if $G \cap \mathrm{int}\, H \ne \varnothing$.

5. Verify that the sets $G := \{(x, y, z) : x^2 + y^2 \le z^2,\ z \ge 0\}$ and $H := \{(x, y, z) : y = -z\}$ are closed convex cones of $\mathbb{R}^3$. Describe the polar cones $G^0$, $H^0$, $(G \cap H)^0$. Show that $(1, 1, 1) \in (G \cap H)^0 \setminus (G^0 + H^0)$ and deduce that the sum of two closed convex cones is not necessarily closed.

6. (Ky Fan's Infimax Theorem) Prove the conclusion of Theorem 3.14 with the assumptions that $A$ is a nonempty set, that $B$ is a nonempty compact topological space, and that $\ell$ is upper semicontinuous in its second variable and is convex-concave-like in the following sense: for any $t \in [0, 1]$ and any $x_1, x_2 \in A$, $y_1, y_2 \in B$ there exist some $x_3 \in A$, $y_3 \in B$ such that
$$\ell(x_3, y) \le (1-t)\ell(x_1, y) + t\ell(x_2, y) \qquad \forall y \in B,$$
$$\ell(x, y_3) \ge (1-t)\ell(x, y_1) + t\ell(x, y_2) \qquad \forall x \in A.$$
[Hint: adapt the proof of the Infimax Theorem.]

7. (Sion's Infimax Theorem) Prove the conclusion of Theorem 3.14 with the assumption that $\ell$ is convex-concave changed into the assumption that $\ell$ is quasiconvex-quasiconcave, i.e. for all $r \in \mathbb{R}$, $x \in A$, $y \in B$ the sets $\{a \in A : \ell(a, y) \le r\}$ and $\{b \in B : \ell(x, b) \ge r\}$ are convex.
3.4 Couplings and Reflexivity 3.4.1 Couplings It may be useful to consider weak topologies in a symmetric way. For such a purpose, given normed vector spaces X and Y, we consider a coupling c W XY ! R (or c W X Y ! C if X and Y are complex spaces, but here we consider only real
140
3 Elements of Functional Analysis
vector spaces), i.e. a continuous bilinear function such that the maps $c_X : X \to Y^*$ and $c_Y : Y \to X^*$ given by
\[ c_X(x) := c(x, \cdot), \quad x \in X, \qquad c_Y(y) := c(\cdot, y), \quad y \in Y, \]
are injective. This means that if $x \in X$ is such that $c(x, y) = 0$ for all $y \in Y$ then $x = 0$ and symmetrically if $y \in Y$ is such that $c(x, y) = 0$ for all $x \in X$ then $y = 0$. We say that $c$ is a metric coupling if $c_X$ and $c_Y$ are monometries, i.e. isometries onto their images, or, in other terms, preserve the norms. For any coupling $c : X \times Y \to \mathbb{R}$ the map $c_Y : Y \to X^*$ allows us to identify $Y$ with a subspace of $X^*$ endowed with a stronger norm (and symmetrically $c_X$ allows us to identify $X$ with a subspace of $Y^*$). When $c$ is a metric coupling the norm of $Y$ (resp. $X$) is the induced norm.

The main example of a coupling is the evaluation map $e : X \times X^* \to \mathbb{R}$ given by $e(x, x^*) := x^*(x)$ for $(x, x^*) \in X \times X^*$. The assertion that it is a coupling relies on Corollary 3.14. Let us note that it is a metric coupling. Since $e_{X^*}$ is just the identity map $I_{X^*}$ of $X^*$, we have $\|e_{X^*}(\cdot)\|_{X^*} = \|\cdot\|_{X^*}$. On the other hand, Corollary 3.14 asserts that for all $x \in X$ one can find some $\ell_x \in X^*$ such that $\|\ell_x\| = 1$ and $\ell_x(x) = \|x\|$; since $|x^*(x)| \le \|x\|$ for all $x^*$ in the closed unit ball $B_{X^*}$ of $X^*$, this means that
\[ \|e_X(x)\|_{X^{**}} := \sup_{x^* \in B_{X^*}} e_X(x)(x^*) = \sup_{x^* \in B_{X^*}} x^*(x) = \ell_x(x) = \|x\|. \]
The map $e_X : X \to X^{**}$ is called the canonical embedding. When it is surjective the space $X$ is said to be reflexive.
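A standard example may help fix ideas about the canonical embedding. The sketch below relies on the classical identifications $c_0^* \cong \ell_1$ and $\ell_1^* \cong \ell_\infty$, which are assumed here and are not established in this section.

```latex
% Assumed classical identifications (not proved in this section):
%   (c_0)^* = \ell_1 and (\ell_1)^* = \ell_\infty, via <x, y> = \sum_n x_n y_n.
\[
  X = c_0, \qquad X^{*} \cong \ell_1, \qquad X^{**} \cong \ell_\infty,
  \qquad \langle x, y \rangle = \sum_{n \ge 0} x_n y_n .
\]
% Under these identifications e_X is the inclusion of c_0 into \ell_\infty:
\[
  e_X(x)(y) = \sum_{n \ge 0} x_n y_n \quad (x \in c_0,\ y \in \ell_1),
  \qquad \|e_X(x)\|_{\infty} = \sup_{n} |x_n| = \|x\| .
\]
% The constant sequence (1, 1, 1, ...) belongs to \ell_\infty but not to c_0,
% so e_X is an isometry that is not surjective: c_0 is not reflexive.
```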
3.4.2 Reflexivity and Weak Topologies

Not all Banach spaces are reflexive (see the exercises). We shall soon give examples of reflexive Banach spaces.

Proposition 3.23 If $c : X \times Y \to \mathbb{R}$ is a metric coupling between Banach spaces and if $X$ is reflexive, then $Y$ can be identified with the dual $X^*$ of $X$ via the map $c_Y$.

Proof Let $Z := c_Y(Y)$ in $X^*$. Since $c_Y$ is an isometry from $Y$ onto $Z$, $Z$ is a closed subspace of $X^*$. In order to prove that $Z = X^*$ it suffices to show that any $x^{**} \in X^{**}$ such that $\langle x^{**}, z \rangle = 0$ for all $z \in Z$ is null. Since $X$ is reflexive, there exists an $x \in X$ such that $x^{**} = e_X(x)$. Then, for all $y \in Y$, setting $x^* := c_Y(y) \in X^*$ we have
\[ c(x, y) = \langle c_Y(y), x \rangle = \langle x^*, x \rangle = \langle x^{**}, x^* \rangle = \langle x^{**}, c_Y(y) \rangle = 0. \]
Since $c$ is a coupling, these equalities show that $x = 0$, hence $x^{**} = 0$. □
Given a coupling $c : X \times Y \to \mathbb{R}$, since $Y$ can be identified with its image $c_Y(Y)$ in $X^*$, one can endow $Y$ with the topology $\sigma(Y, X)$ induced by the weak$^*$ topology $\sigma(X^*, X)$. Similarly, identifying $X$ with the subspace $c_X(X)$ of $Y^*$ one can endow $X$ with the topology $\sigma(X, Y)$ induced by $\sigma(Y^*, Y)$. An adaptation of the proof of Proposition 3.18 proves the following result.

Proposition 3.24 Let $c : X \times Y \to \mathbb{R}$ be a coupling. Then the (topological) dual of $(Y, \sigma(Y, X))$ is $X$.

Taking $Y := X^*$ and taking for $c$ the evaluation $e$, the topology on $X$ induced by the canonical embedding $e_X$ of $X$ into $X^{**}$ endowed with its weak$^*$ topology $\sigma(X^{**}, X^*)$ is called the weak topology on $X$ and is denoted by $\sigma(X, X^*)$. Thus, $\sigma := \sigma(X, X^*)$ is the topology induced by the family $(p_f)_{f \in X^*}$ of seminorms on $X$ given by $p_f(x) := |f(x)|$ for $x \in X$, $f \in X^*$ and one has
.xi /i2I ! x ” 8x 2 X .x .xi //i2I ! x .x/: Thus, the weak topology of X is indeed weaker than the topology associated with the norm; the latter is often called the strong topology. In the sequel we often write .xi /i2I ! x for weak or weak convergence and hx; yi instead of c.x; y/. Remark If A W X ! Y is a continuous linear map between two normed spaces, then A is continuous for the weak topologies on X and Y: if .xi /i2I ! x, then, for all y 2 Y one has .hAxi ; y i/i2I D .hxi ; A| .y /i/i2I !hx; A| .y /i D hAx; y i;
so that .Axi /i2I ! Ax and A is weakly continuous.
□
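The gap between weak and norm convergence can be illustrated numerically. The following is only a minimal sketch, under the assumption that the Hilbert space $\ell_2$ is truncated to finitely many coordinates and that its dual is identified with $\ell_2$ itself through the Euclidean pairing; the names `unit_vector`, `pairing` and `norm` are hypothetical helpers. It shows that the pairings of the unit vectors $e_n$ with a fixed square-summable $y$ tend to $0$, while the $e_n$ keep norm $1$ and stay at distance $\sqrt{2}$ from each other.

```python
import math

def unit_vector(n, dim):
    """Return the n-th unit vector e_n of R^dim (0-based index)."""
    return [1.0 if k == n else 0.0 for k in range(dim)]

def pairing(x, y):
    """Euclidean pairing <x, y>, standing in for the duality bracket on l_2."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Euclidean norm of x."""
    return math.sqrt(pairing(x, x))

# Truncation level (an assumption of this sketch) and a fixed test vector
# y_k = 1/(k+1), which is square summable.
DIM = 2000
y = [1.0 / (k + 1) for k in range(DIM)]

# <e_n, y> = y_n tends to 0 as n grows ...
pairings = [pairing(unit_vector(n, DIM), y) for n in (10, 100, 1000)]
# ... although every e_n has norm 1 ...
norms = [norm(unit_vector(n, DIM)) for n in (10, 100, 1000)]
# ... and two distinct unit vectors are sqrt(2) apart.
gap = norm([a - b for a, b in zip(unit_vector(10, DIM), unit_vector(100, DIM))])
```

Only one test vector $y$ is used here; weak convergence of $(e_n)$ to $0$ requires the same behavior for every $y$ in $\ell_2$, which follows from the square summability of the coordinates.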
When $X$ is finite dimensional, the weak topology on $X$ coincides with the strong topology (and, similarly, the weak$^*$ topology of $X^*$ coincides with the topology associated with the norm). In fact, a net $(f_i)_{i \in I}$ in $X^*$ weak$^*$ converges to some $f \in X^*$ if and only if for every element $b$ of a base of $X$ the net $(f_i(b))_{i \in I}$ converges to $f(b)$, and this is enough to imply the convergence for the dual norm. If $X$ is infinite dimensional, the weak (resp. weak$^*$) topology never coincides with the strong topology (the topology induced by the norm or dual norm). This stems from the fact that no neighborhood $V$ of $0$ in the weak or weak$^*$ topology is bounded since it contains the intersection of the kernels of a finite family of linear forms. Although $\sigma(X, X^*)$ does not coincide with the strong topology, for the class of convex subsets closedness for these two topologies is the same. One has to be warned that this fact is not valid for convex subsets of a dual space in the weak$^*$ topology.

Proposition 3.25 (Mazur) Closed convex subsets of a Banach space $X$ are weakly closed.

Proof Let $C$ be a nonempty closed convex subset of $X$. If $C = X$, $C$ is obviously weakly closed. Suppose $C \ne X$ and let $a \in X \setminus C$. Taking $A := \{a\}$ and $B := C$ in
Theorem 3.12, we get some $f \in X^* \setminus \{0\}$ and some $r \in \mathbb{R}$ such that $f(a) > r > f(b)$ for all $b \in C$. Thus $W := f^{-1}(]r, +\infty[)$ is an open neighborhood of $a$ with respect to the weak topology contained in $X \setminus C$: $X \setminus C$ is open in the weak topology. □

In general, the weak topology does not provide compact subsets as easily as does the weak$^*$ topology. However, when $X$ is reflexive, since then the weak topology coincides with the weak$^*$ topology obtained by considering $X$ as the dual of $X^*$, we do get a rich family of compact subsets. We state this fact in the following corollary.

Corollary 3.22 Every bounded weakly closed subset of a reflexive Banach space $X$ is weakly compact. In particular, every bounded, closed, convex subset of $X$ is weakly compact.

In order to show that this property characterizes reflexivity, let us prove a noteworthy lemma.

Lemma 3.18 (Goldstine) The image $e_X(B_X)$ of the unit ball $B_X$ of a normed space $X$ via the canonical embedding $e_X : X \to X^{**}$ is dense in the closed unit ball $B_{X^{**}}$ of $X^{**}$ for the $\sigma(X^{**}, X^*)$ topology.

Proof The conclusion means that for any $x^{**} \in B_{X^{**}}$ and any $\sigma(X^{**}, X^*)$-neighborhood $V$ of $x^{**}$ we have $e_X(B_X) \cap V \ne \emptyset$. By construction of $\sigma(X^{**}, X^*)$ we can find $\varepsilon > 0$ and a finite set $F := \{f_1, \dots, f_n\}$ in $X^*$ such that
\[ W := \{ y^{**} \in X^{**} : |\langle y^{**} - x^{**}, f_i \rangle| \le \varepsilon,\ i = 1, \dots, n \} \subset V. \]
Since $\|x^{**}\| \le 1$, setting $a_i := \langle x^{**}, f_i \rangle$, for all $(r_1, \dots, r_n) \in \mathbb{R}^n$ we have
\[ \Bigl| \sum_{i=1}^{n} r_i a_i \Bigr| = \Bigl| \Bigl\langle x^{**}, \sum_{i=1}^{n} r_i f_i \Bigr\rangle \Bigr| \le \Bigl\| \sum_{i=1}^{n} r_i f_i \Bigr\|. \]
Then we deduce from Lemma 3.16 that there exists some $x \in B_X$ such that $|f_i(x) - a_i| \le \varepsilon$, i.e. $|\langle e_X(x) - x^{**}, f_i \rangle| \le \varepsilon$ for $i = 1, \dots, n$, or $e_X(x) \in W \subset V$. □

Theorem 3.15 A normed space $X$ is reflexive if and only if its closed unit ball $B_X$ is weakly compact.

Proof If $X$ is reflexive the canonical embedding $e_X : X \to X^{**}$ is an isomorphism (and even an isometry). Its inverse $e_X^{-1}$ is continuous, hence, by the preceding remark, it is continuous with respect to the weak topologies $\sigma(X^{**}, X^{***}) = \sigma(X^{**}, X^*)$ on $X^{**}$ and $\sigma(X, X^*)$ on $X$. Since the unit ball $B_{X^{**}}$ is $\sigma(X^{**}, X^*)$-compact, $B_X$ is $\sigma(X, X^*)$-compact. Conversely, let us suppose $B_X$ is $\sigma(X, X^*)$-compact. Since $e_X$ is continuous, hence continuous with respect to the weak topologies $\sigma(X, X^*)$ on $X$ and $\sigma(X^{**}, X^{***})$ on $X^{**}$ and a fortiori with respect to the topology $\sigma(X^{**}, X^*)$ on $X^{**}$, we get that $e_X(B_X)$ is $\sigma(X^{**}, X^*)$-compact, hence $\sigma(X^{**}, X^*)$-closed. Since $e_X(B_X)$ is dense in $B_{X^{**}}$ in the topology $\sigma(X^{**}, X^*)$, we get that $e_X(B_X) = B_{X^{**}}$ and, by homogeneity, $e_X(X) = X^{**}$. □
Remark In the preceding proof some care is required concerning the topologies on $X^{**}$. In fact, if $X$ is a Banach space, $e_X(B_X)$ is closed in $B_{X^{**}}$ in the norm topology. But if $X$ is not reflexive, then $e_X(B_X)$ is not closed in $B_{X^{**}}$ in the topology $\sigma(X^{**}, X^*)$ since it is dense in $B_{X^{**}}$ with respect to this topology but distinct from this ball. □

Let us give some elementary permanence properties of reflexive spaces.

Proposition 3.26 A closed subspace $Y$ of a reflexive Banach space $X$ is reflexive.

Proof Let us first observe that the weak topology $\sigma(Y, Y^*)$ on $Y$ coincides with the topology induced by $\sigma(X, X^*)$ on $Y$. In fact, by definition, a net $(y_i)_{i \in I}$ in $Y$ converges to $y \in Y$ for $\sigma(Y, Y^*)$ if and only if for every $g \in Y^*$ one has $(g(y_i))_{i \in I} \to g(y)$. But since $g$ is the restriction to $Y$ of some $f \in X^*$, if $(y_i)_{i \in I} \to y$ in the topology induced by $\sigma(X, X^*)$ then one has $(g(y_i))_{i \in I} = (f(y_i))_{i \in I} \to f(y) = g(y)$. Conversely, if $(y_i)_{i \in I} \to y$ in the $\sigma(Y, Y^*)$ topology, given $f \in X^*$, setting $g := f|_Y$ we get $(f(y_i))_{i \in I} = (g(y_i))_{i \in I} \to g(y) = f(y)$, so that $(y_i)_{i \in I} \to y$ in the topology induced by $\sigma(X, X^*)$. Now $B_Y = B_X \cap Y$ and $Y$ is weakly closed (by Proposition 3.25). Thus $B_Y$ is weakly closed in $B_X$, hence is weakly compact, so that $Y$ is reflexive. □

Proposition 3.27 A Banach space $X$ is reflexive if and only if its dual $X^*$ is reflexive.

Proof When $X$ is reflexive one has $\sigma(X^*, X^{**}) = \sigma(X^*, X)$ and since $B_{X^*}$ is compact in the $\sigma(X^*, X)$ topology, it is compact in the $\sigma(X^*, X^{**})$ topology, so that $X^*$ is reflexive by Theorem 3.15. When $X^*$ is reflexive, $X^{**}$ is reflexive by the preceding. Since $e_X(X)$ is a closed subspace of $X^{**}$, $e_X(X)$ is reflexive by the preceding proposition. Since $X$ and $e_X(X)$ are isometric, $X$ is reflexive. □

The following results are of interest but they are outside the scope of our purposes, although they have some bearing on our study in the reflexive case. We refer to [106, 109, 119, 165] for the proofs.
Theorem 3.16 (Eberlein-Šmulian) For a subset S of a Banach space X the following assertions are equivalent: (a) the weak closure w cl.S/ of S is weakly compact; (b) every sequence in S has a weakly convergent subsequence; (c) every sequence in S has a weak cluster point. Theorem 3.17 (Banach, Dieudonné, Krein, Šmulian) Let C be a convex subset of the dual X of a Banach space X. If for all r 2 RC the set C \ rBX is closed for .X ; X/, then C is closed for .X ; X/. Theorem 3.18 (James) Let A be a bounded and weakly closed subset of a Banach space X. If every continuous linear form on X attains its supremum on A then A is weakly compact.
In particular, if every continuous linear form on X attains its supremum on BX then X is reflexive.
Exercises 1. Prove that the quotient X=Y of a reflexive space by a closed subspace Y is reflexive. 2. Prove that if the quotient X=Y of a Banach space by a closed reflexive subspace Y is reflexive then X is reflexive. [See [165, p. 126].] 3. Show that a weakly lower semicontinuous function f on a reflexive space X attains it infimum if it is coercive in the sense that f .x/ ! 1 when kxk ! 1. 4. The purpose of this exercise is to show that weak continuity of continuous maps cannot be expected in general in an infinite dimensional Banach space X. (a) Prove that the unit sphere SX of X is dense in the closed unit ball BX endowed with the topology induced by the weak topology . (b) Verify the continuity of the retraction r W X ! BX given by r.x/ WD x= max.kxk; 1/. (c) Given x 2 X such that kxk D 1=2, let .xi /i2I be a net in SX weakly converging to x. Observe that .r.2xi //i2I D .xi /i2I weakly converges to x and not to r.2x/ D 2x. 5 . Show that the weak topology of a Banach space X need not be sequential. Here a topology T on X is said to be sequential if the closure cl.S/ for T of any subset S of X is the set of limits of convergent sequences in S. [Hint: in a separable Hilbert space X with orthonormal base .en / show that 0 is in the weak closure of the set S WD fem C men W m; n 2 N; m < ng but no sequence in S weakly converges to 0.] 6 . (Šmulian’s Theorem). Prove that any sequence in a weakly compact subset of a Banach space has a weakly convergent subsequence. 7 . Let I be an infinite set and let X WD `1 .I/ be the space of bounded functions on I with the supremum norm. Show that the unit ball BX of X contains a weak compact subset that has no weak convergent sequence besides the ones that are eventually constant. 8. Show that the class W of Banach spaces having weak sequentially compact dual balls is stable under the following operations (see [97, p. 227]): (a) taking dense continuous linear images; (b) taking quotients; (c) taking subspaces. 9. 
Prove a result similar to the one of exercise 5 of Sect. 3.2 for a weakly closed cone of a normed space endowed with the weak topology. 10. (a) Show that the polar set P0 WD fx 2 X W hx ; xi 1 8x 2 Pg of a cone P of a normed space X is a cone and is given by P0 D fx 2 X W hx ; xi 0 8x 2 Pg.
(b) A base of a convex cone Q is a convex subset C of Q such that 0 … C and Q D RC C. Show that a closed convex cone P of a Banach space has a nonempty interior if and only if its polar cone Q WD P0 has a weak compact base. 11. (a) Verify that the polar cone Q of the cone P WD f0g RC R2 is locally compact but does not have a compact base. (b) Prove that if Q is a weak closed convex cone of the dual of a Banach space, Q has a weak compact base if and only if it is locally compact. [See [109].] 12 . (Davis-Figiel-Johnson-Pelczynski Theorem) Let Q be a weakly compact symmetric convex subset of a Banach space X. Show that there exists a weakly compact symmetric convex subset P of X containing Q such that the linear span Y of P endowed with the gauge of P is a reflexive space.
3.4.3 Uniform Convexity

In this subsection we give a useful criterion for reflexivity. Let us call a gage or a forcing function a function $\gamma : \mathbb{R}_+ \to \mathbb{R}_+$ that is nondecreasing and such that $\gamma(r) = 0$ if and only if $r = 0$. Let us say that a subset $A$ of a normed space $(X, \|\cdot\|)$ is uniformly rotund or uniformly convex if there is a gage $\gamma$ such that for any $x, y \in A$ one has $\tfrac{1}{2}(x + y) + \gamma(\tfrac{1}{2}\|x - y\|) B_X \subset A$. The space $(X, \|\cdot\|)$ is said to be uniformly rotund or uniformly convex if its unit ball $B_X$ is uniformly rotund. Thus $(X, \|\cdot\|)$ is uniformly convex if and only if the following property holds:

(UC) for every $\varepsilon > 0$ there exists some $\delta > 0$ such that
\[ x, y \in B_X,\ \|x - y\| \ge \varepsilon \implies \bigl\| \tfrac{1}{2}(x + y) \bigr\| \le 1 - \delta. \]

Such a property is of metric character and is not preserved under isomorphisms, as one can see by taking different usual norms on $\mathbb{R}^d$. It is easy to see that $X$ is uniformly convex if and only if there exists a modulus (i.e. a nondecreasing function $\mu : \mathbb{R}_+ \to \mathbb{R}_+$ such that $\mu(0) = 0$ and $\mu$ is continuous at $0$) satisfying $\|x - y\| \le \mu(1 - \tfrac{1}{2}\|x + y\|)$ for all $x, y \in B_X$ (Fig. 3.5).

Fig. 3.5 Uniform convexity (figure omitted; it shows points labeled $x$, $z$, $y$)
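Property (UC) and its dependence on the norm can be probed numerically on $\mathbb{R}^2$. The sketch below is an illustration under stated assumptions, not part of the text: it uses the parallelogram identity of the Euclidean norm, which gives the explicit choice $\delta(\varepsilon) = 1 - \sqrt{1 - \varepsilon^2/4}$ for $\varepsilon \in \,]0, 2]$, and it checks this bound on random samples; all function names are hypothetical. It also exhibits two points of the unit ball of the supremum norm that are at distance $2$ while their midpoint still has norm $1$, so that (UC) fails for that norm.

```python
import math
import random

def euclid(v):
    """Euclidean norm on R^2."""
    return math.hypot(v[0], v[1])

def sup_norm(v):
    """Supremum norm on R^2."""
    return max(abs(v[0]), abs(v[1]))

def midpoint_norm(norm, x, y):
    """Norm of the midpoint (x + y) / 2."""
    return norm(((x[0] + y[0]) / 2.0, (x[1] + y[1]) / 2.0))

def delta_euclid(eps):
    """delta(eps) obtained from the parallelogram identity, for 0 < eps <= 2."""
    return 1.0 - math.sqrt(1.0 - eps * eps / 4.0)

def worst_midpoint_euclid(eps, samples=20000, seed=0):
    """Largest midpoint norm found among sampled pairs x, y of the Euclidean
    unit ball with ||x - y|| >= eps (a random search, not a proof)."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        x = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
        y = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
        far_apart = euclid((x[0] - y[0], x[1] - y[1])) >= eps
        if euclid(x) <= 1.0 and euclid(y) <= 1.0 and far_apart:
            worst = max(worst, midpoint_norm(euclid, x, y))
    return worst

# Supremum norm: x and y lie in the unit ball, ||x - y|| = 2, yet the
# midpoint (1, 0) has norm 1, so no delta > 0 can work in (UC).
x_sup, y_sup = (1.0, 1.0), (1.0, -1.0)
sup_midpoint = midpoint_norm(sup_norm, x_sup, y_sup)
```

For instance, `worst_midpoint_euclid(0.5)` stays below `1 - delta_euclid(0.5)`, in agreement with (UC) for the Euclidean norm, whereas `sup_midpoint` equals `1.0`.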
Theorem 3.19 (Milman-Pettis) Any uniformly convex Banach space is reflexive.

Proof Let $e_X : X \to X^{**}$ be the canonical embedding. Since $e_X(X)$ is closed in $X^{**}$ in the norm topology (since it is complete), it suffices to show that any $x^{**} \in X^{**}$ belongs to the closure of $e_X(X)$. Without loss of generality we may suppose $\|x^{**}\| = 1$. Given $\varepsilon > 0$ we want to find some $x \in X$ such that $\|e_X(x) - x^{**}\| \le \varepsilon$. Let $\delta > 0$ be associated with $\varepsilon$ as in (UC). Since $\|x^{**}\| = 1$ we can find $x^* \in S_{X^*}$ such that $\langle x^{**}, x^* \rangle > 1 - \delta/2$. Let
\[ V := \{ y^{**} \in X^{**} : |\langle y^{**} - x^{**}, x^* \rangle| < \delta/2 \}. \]
Since $V$ is open in the $\sigma(X^{**}, X^*)$-topology and since $e_X(B_X)$ is $\sigma(X^{**}, X^*)$-dense in $B_{X^{**}}$ by Lemma 3.18, we can find some $x \in B_X$ such that $e_X(x) \in V$. Let us show that assuming that $\|e_X(x) - x^{**}\| > \varepsilon$ leads to a contradiction. This inequality means that $x^{**} \in W := X^{**} \setminus (e_X(x) + \varepsilon B_{X^{**}})$, a $\sigma(X^{**}, X^*)$-open subset of $X^{**}$ since $\varepsilon B_{X^{**}}$ is $\sigma(X^{**}, X^*)$-compact. Applying again Lemma 3.18, we can find some $y \in B_X$ such that $e_X(y) \in V \cap W$: $\|e_X(y) - e_X(x)\| > \varepsilon$ and $|\langle e_X(y) - x^{**}, x^* \rangle| < \delta/2$. Since $e_X(x) \in V$ by our choice of $x$, we also have $|\langle e_X(x) - x^{**}, x^* \rangle| < \delta/2$ and we get $|\langle e_X(x + y) - 2x^{**}, x^* \rangle| < \delta$, hence $\langle e_X(x + y), x^* \rangle > 2\langle x^{**}, x^* \rangle - \delta > 2 - 2\delta$ and
\[ \|x + y\| = \|e_X(x + y)\| = \|e_X(x + y)\| \cdot \|x^*\| > 2 - 2\delta. \]
Since $\tfrac{1}{2}\|x + y\| > 1 - \delta$, (UC) ensures that $\|x - y\| < \varepsilon$, contradicting the relation $\|e_X(y) - e_X(x)\| > \varepsilon$ and the fact that $e_X$ is isometric. □

The preceding criterion is just a sufficient condition, not a necessary condition. But it enables us to prove that some usual Banach spaces such as Hilbert spaces and Lebesgue spaces $L_p(S)$ for $p \in \,]1, +\infty[$ are reflexive. Also, uniform convexity implies a useful property relating weak and strong topologies:

Proposition 3.28 (Kadec-Klee) Let $(x_i)_{i \in I}$ be a net (or a sequence) in a uniformly convex Banach space $X$ which weakly converges to $x \in X$ and is such that $(\|x_i\|)_{i \in I} \to \|x\|$ (or just $\limsup_{i \in I} \|x_i\| \le \|x\|$). Then $(\|x_i - x\|)_{i \in I} \to 0$.

Proof Since the result is obvious if $x = 0$, we may suppose $x \ne 0$. Let $r_i := \max(\|x_i\|, \|x\|)$, so that $(r_i)_{i \in I} \to r := \|x\|$. Let $u_i := x_i / r_i$, $u := x / \|x\|$, so that $u_i, u \in B_X$ and $(u_i)_{i \in I} \to x/r = u$ in the $\sigma(X, X^*)$ topology. Then $(\tfrac{1}{2}(u_i + u)) \to u$ in the $\sigma(X, X^*)$ topology, so that $\|u\| \le \liminf_{i \in I} \tfrac{1}{2}\|u_i + u\|$. But since $\|u\| = 1$ and $\|u + u_i\| \le \|u\| + \|u_i\| \le 2$, we get $(\tfrac{1}{2}\|u_i + u\|) \to 1$, hence, by condition (UC), $(\|u_i - u\|)_{i \in I} \to 0$. We conclude that $(x_i)$ converges to $x$ in the norm topology. □

In turn, the dual Kadec-Klee Property implies that the duality map is continuous.
Proposition 3.29 Let $(X, \|\cdot\|)$ be a normed space such that the dual norm $\|\cdot\|_*$ has the sequential dual Kadec-Klee Property: $(x_n^*)$ converges to $x^*$ in norm whenever $(x_n^*) \to x^*$ in the weak$^*$ topology and $(\|x_n^*\|_*) \to \|x^*\|_*$. Then, if the duality map $J$ is single-valued, it is continuous from $X$ to $X^*$ with respect to the topologies associated with the norms.

Proof For simplicity, we assume that $B_{X^*}$ is sequentially compact for the weak$^*$ topology. Let $(x_n) \to x$ in $(X, \|\cdot\|)$, so that $(\|x_n\|) \to \|x\|$ and $(\|J(x_n)\|_*) \to \|J(x)\|_*$. From the bounded sequence $(J(x_n))$ one can extract a subsequence $(J(x_{k(n)}))$ that weak$^*$ converges to some $x^* \in X^*$. Then
\[ \langle x^*, x \rangle = \lim_n \langle J(x_{k(n)}), x_{k(n)} \rangle = \lim_n \|x_{k(n)}\|^2 = \|x\|^2, \]
so that $\|x^*\|_* \ge \|x\|$. But since $\|\cdot\|_*$ is weak$^*$ lower semicontinuous, one has $\|x^*\|_* \le \liminf_n \|J(x_{k(n)})\|_* = \|x\|$. Thus $\|x^*\|_* = \|x\|$ and $\langle x^*, x \rangle = \|x\|^2$, hence $x^* = J(x)$. Since such a subsequence $(J(x_{k(n)}))$ can be extracted from any subsequence of $(J(x_n))$, the sequence $(J(x_n))$ weak$^*$ converges to $J(x)$, $(\|J(x_n)\|_*) \to \|J(x)\|_*$ and, by the sequential dual Kadec-Klee Property, $(\|J(x_n) - x^*\|_*) \to 0$. □
Exercises 1. Let .X; kk/ be a normed vector space. Show that the following properties are equivalent: (a) 8u, v 2 SX , u ¤ v H) ku C vk < 1I (b) 8u, v 2 X, u ¤ 0, ku C vk D kuk C kvk H) 9r 2 RC : v D ru. When the norm satisfies these properties, it is said to be strictly convex and the space is said to be strictly convex. 2. Show that when the dual norm kk on the dual of a normed vector space .X; kk/ is strictly convex, the duality map J is single-valued. 3. Let .X; kk/ be a normed vector space such that there exists on X an equivalent norm kk0 that is strictly convex. Show that for all c 21; 1Œ there exists some b 2 P such that the norm kkb WD kk C b kk0 satisfies kk kkb c kk and is strictly convex. 4. Let .X; kk/ be a uniformly convex normed vector space and let p 21; 1Œ. Prove that for every ı > 0 there exists an " > 0 such that for all x, y 2 BX one has p 1 1 1 kxkp C kykp ": kx yk ı H) .x C y/ 2 2 2 5. Let .X; kk/ be a normed vector space such that there exists on X an equivalent norm kk0 that is uniformly convex. Using the preceding exercise, show that for all c 21; 1Œ there exists some b 2 P such that the norm kkb WD kk C b kk0 satisfies kk kkb c kk and is uniformly convex. 6. Prove that any uniformly convex normed space is strictly convex.
7. Let C be a nonempty closed convex subset of a strictly convex Banach space X and let w 2 X. Show that there is at most one point a 2 C such that kw ak D dC .w/ WD infx2C kx wk. 8. Let C be a nonempty closed convex subset of a reflexive Banach space. Show that for every w 2 W there exists some point a 2 C such that kw ak D dC .w/. 9. Let C be a nonempty closed convex subset of a strictly convex reflexive Banach space X. Show that there exists a unique point a WD PC .w/ 2 C such that kw ak D dC .w/. 10. Let C be a nonempty closed convex subset of a uniformly convex Banach space X. Show that for every w 2 X, any sequence .xn / in C satisfying .kxn wk/n ! dC .w/ is a Cauchy sequence. Conclude that there exists a unique point a WD PC .w/ 2 C such that kw ak D dC .w/. Prove that the map PC W X ! C is uniformly continuous on bounded subsets of X. 11 . For p 21; 1Œ let h W R ! R be given by h.t/ WD jtjp C 1 212p jt C 1j. After a study of h, prove that there exists some cp 2 P such that for all r; s 2 R one has jr sjp cp .jrjp C jsjp /.jrjp C jsjp 21p jr C sj/p=2 : Deduce from this the fact that for p 21; 2 the space Lp .S; / is uniformly convex. We shall return to this question in Sect. 8.5.
3.4.4 Separability

It is often important to know whether one can use sequences when dealing with weak topologies. Such a question is related to the metrizability of the topology induced by a weak topology on a bounded subset. We recall that a topological space is said to be metrizable if its topology can be associated with a metric.

Theorem 3.20 Let $X$ and $Y$ be normed spaces in metric duality. If $X$ is separable, then the topology induced by $\sigma(Y, X)$ on any bounded subset of $Y$ is metrizable. If $Y$ is the dual of $X$ or if $X$ is the dual of $Y$, the converse is true.

Proof If $X$ is separable, then $B_X$ is separable by Corollary 2.13. Let $\{a_n : n \in \mathbb{N}\}$ be a countable dense subset of $B_X$. It is easy to see that the function
\[ y \mapsto \|y\|_0 := \sum_{n \ge 1} 2^{-n} |\langle a_n, y \rangle| \]
is a norm on $Y$. Let us show that the topology induced by $\sigma(Y, X)$ on $B_Y$ coincides with the topology associated with the norm $\|\cdot\|_0$. Then the same will hold on any bounded subset.
Given $\bar{y} \in B_Y$ and $r > 0$, let us find a neighborhood $V$ of $\bar{y}$ in $(B_Y, \sigma(Y, X))$ such that $V \subset U := B_0(\bar{y}, r) := \{y \in Y : \|y - \bar{y}\|_0 < r\}$. We take $V$ of the form
\[ V := \{ y \in B_Y : |\langle a_i, y - \bar{y} \rangle| < \varepsilon,\ i \in \mathbb{N}_k \} \]
for some $\varepsilon \in \,]0, r[$ and $k \in \mathbb{N} \setminus \{0\}$ such that $\varepsilon + 2^{1-k} < r$. Then, for $y \in V$, since $\|y - \bar{y}\| \le 2$ we have $y \in U$ since
\[ \|y - \bar{y}\|_0 = \sum_{n=1}^{k} 2^{-n} |\langle a_n, y - \bar{y} \rangle| + \sum_{n=k+1}^{\infty} 2^{-n} |\langle a_n, y - \bar{y} \rangle| < \varepsilon + \sum_{n=k+1}^{\infty} 2^{-n} \cdot 2 < r. \]
Conversely, given a neighborhood $W$ of $\bar{y}$ in $(B_Y, \sigma(Y, X))$, let us find $r > 0$ such that $B_0(\bar{y}, r) \cap B_Y \subset W$. We may assume that there is a finite subset $\{x_1, \dots, x_m\}$ of $X$ and $\varepsilon > 0$ such that
\[ W := \{ y \in Y : |\langle x_i, y - \bar{y} \rangle| < \varepsilon,\ i \in \mathbb{N}_m \}. \]
Without loss of generality we may assume $x_i \in B_X$ for all $i \in \mathbb{N}_m$. Since $\{a_n : n \in \mathbb{N}\}$ is dense in $B_X$, for each $i \in \mathbb{N}_m$ we can pick some $n(i) \in \mathbb{N}$ such that $\|x_i - a_{n(i)}\| < \varepsilon/4$. Let us pick $r > 0$ such that $2^{n(i)} r < \varepsilon/2$ for all $i \in \mathbb{N}_m$ and let us show that $B_0(\bar{y}, r) \cap B_Y \subset W$. Given $y \in B_0(\bar{y}, r) \cap B_Y$, for $i \in \mathbb{N}_m$ we have $2^{-n(i)} |\langle a_{n(i)}, y - \bar{y} \rangle| < r$, so that $y \in W$ since
\[ |\langle x_i, y - \bar{y} \rangle| \le |\langle x_i - a_{n(i)}, y - \bar{y} \rangle| + |\langle a_{n(i)}, y - \bar{y} \rangle| < 2\varepsilon/4 + 2^{n(i)} r < \varepsilon. \]
Finally, let us prove the converse in the case $Y = X^*$: assuming that the topology induced by $\sigma(Y, X)$ on $B_Y$ is defined by a metric $d$, let us show that $X$ is separable. For $n \in \mathbb{N} \setminus \{0\}$ let $U_n := \{y \in B_Y : d(y, 0) < 1/n\}$ and let $V_n$ be a neighborhood of $0$ in $(B_Y, \sigma(Y, X))$ such that $V_n \subset U_n$. Shrinking $V_n$ if necessary we can find $\varepsilon_n > 0$ and a finite subset $F_n$ of $X$ such that
\[ V_n := \{ y \in B_Y : \forall x \in F_n \ |\langle x, y \rangle| < \varepsilon_n \}. \]
Then $\bigcap_n V_n \subset \bigcap_n U_n = \{0\}$, so that if $y \in Y$ is such that $\langle x, y \rangle = 0$ for all $x \in F := \bigcup_n F_n$ one has $y = 0$. Therefore, the linear space generated by $F$ is dense in $X$. Since $F$ is countable, this means that $X$ is separable (consider the set of linear combinations of elements of $F$ with rational coefficients). The case $X = Y^*$ is more delicate and we refer, for instance, to monographs on functional analysis [106, p. 426]. □

From the relative weak$^*$ compactness and the metrizability of bounded subsets of the dual of a separable Banach space one gets the following useful statement.
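The weighted sum used in the proof above can be evaluated on a toy case. The sketch below makes a simplifying assumption that departs from the proof: the dense sequence $(a_n)$ of $B_X$ is replaced by the unit vectors $e_n$ of (a truncation of) $\ell_2$, which are not dense in the unit ball, and the pairing is the Euclidean one. The names `weighted_norm` and `unit_vector` are hypothetical. The point is only to see that the weighted sum of $e_m$ decays like $2^{-m}$ although each $e_m$ has norm $1$, which is the kind of behavior a metric describing weak convergence on bounded sets must have.

```python
def weighted_norm(y):
    """Evaluate sum_{n >= 1} 2^{-n} |<e_n, y>| for a finitely supported
    y = (y_1, y_2, ...), i.e. sum_n 2^{-n} |y_n| (unit vectors play the
    role of the sequence (a_n); this is an assumption of the sketch)."""
    return sum(abs(c) / 2.0 ** n for n, c in enumerate(y, start=1))

def unit_vector(m, dim):
    """Return e_m in R^dim, with a 1-based index m."""
    return [1.0 if k == m else 0.0 for k in range(1, dim + 1)]

DIM = 40  # truncation level, chosen arbitrarily for the illustration
values = [weighted_norm(unit_vector(m, DIM)) for m in (1, 5, 20)]
```

Here `values` lists the weighted sums of $e_1$, $e_5$ and $e_{20}$, namely $2^{-1}$, $2^{-5}$ and $2^{-20}$.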
Corollary 3.23 Let $(f_n)$ be a bounded sequence in the dual of a separable Banach space $X$. Then $(f_n)$ has a subsequence that converges in the weak$^*$ topology $\sigma(X^*, X)$.

The following fact related to the preceding duality relationship is noteworthy.

Proposition 3.30 If the dual $X^*$ of a Banach space $X$ is separable, then $X$ is separable.

Proof Let $\{x_n^* : n \in \mathbb{N}\}$ be a countable dense subset of $S_{X^*}$. For all $n \in \mathbb{N}$ let $x_n \in S_X$ be such that $\langle x_n^*, x_n \rangle > 1/2$. Let $Y$ be the smallest closed linear subspace containing $\{x_n : n \in \mathbb{N}\}$. It suffices to prove that $Y = X$. If $Y \ne X$ one can find some $x^* \in S_{X^*}$ such that $\langle x^*, y \rangle = 0$ for all $y \in Y$. Let $k \in \mathbb{N}$ be such that $\|x_k^* - x^*\| < 1/2$. Then one gets the contradiction:
\[ 0 = \langle x^*, x_k \rangle = \langle x_k^*, x_k \rangle - \langle x_k^* - x^*, x_k \rangle \ge \langle x_k^*, x_k \rangle - \|x_k^* - x^*\| \cdot \|x_k\| > 0. \]
□

Corollary 3.24 A Banach space $X$ is reflexive and separable if and only if its dual $X^*$ is reflexive and separable.

Proof If $X^*$ is reflexive and separable then $X$ is reflexive by Proposition 3.27 and it is separable by the preceding proposition. Conversely, if $X$ is reflexive and separable then its dual $X^*$ is reflexive and it is separable since $(X^*)^* = X$ is separable. □

Theorem 3.21 Any bounded sequence in a reflexive Banach space has a subsequence that converges in the weak topology.

Proof Let $(x_n)$ be a bounded sequence in a reflexive Banach space $X$. The closed linear subspace $Y$ generated by $\{x_n\}$ is separable and reflexive by Proposition 3.26. Then $Y^*$ is separable by the preceding corollary. Corollary 3.23 ensures that $(x_n)$ has a convergent subsequence with respect to the topology $\sigma(Y, Y^*)$. The proof of Proposition 3.26 showed that $\sigma(Y, Y^*)$ is the topology induced by $\sigma(X, X^*)$ on $Y$. The conclusion ensues. □
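The converse of Proposition 3.30 fails in general, which explains why reflexivity is required in Corollary 3.24. The following sketch records the standard counterexample; it assumes the classical identification $\ell_1^* \cong \ell_\infty$, which is not proved in this section.

```latex
% Assumed classical identification: (\ell_1)^* = \ell_\infty.
\[
  X = \ell_1 \ \text{is separable:}\quad
  \Bigl\{ x \in \ell_1 : x_n \in \mathbb{Q} \ \forall n,\
          x_n = 0 \text{ for } n \text{ large} \Bigr\}
  \ \text{is countable and dense.}
\]
% Its dual is not separable: indicator functions of distinct subsets
% A, B of \mathbb{N} are at distance 1 from each other in \ell_\infty,
\[
  X^{*} \cong \ell_\infty, \qquad
  \| 1_A - 1_B \|_\infty = 1 \quad (A \ne B \subset \mathbb{N}),
\]
% and there are uncountably many subsets of \mathbb{N}. Hence X is
% separable while X^* is not, and by Corollary 3.24 the space \ell_1
% cannot be reflexive.
```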
Exercises 1. Show that if X is an infinite dimensional Banach space the weak topology .X; X / is not metrizable. 2. Prove that if X is a reflexive infinite dimensional normed space there exists in the unit sphere SX of X a sequence that converges to 0 in the .X; X / topology. 3. Prove the same conclusion if X has a separable dual and is infinite dimensional. 4. Let X WD `1 be the space of sequences x WD .xn /n2N of real numbers such that kxk1 WD ˙n0 jxn j < C1, endowed with the norm kk1 . Prove that any weakly convergent sequence in X converges in the norm topology.
5. Show that the space $\ell_1$ is separable. 6. Show that the space $c_0$ of sequences of real numbers with limit $0$, endowed with the supremum norm, is separable. 7. Prove that the space $\ell_\infty$ of bounded real sequences endowed with the supremum norm is not separable. [Hint: Considering a sequence as a function on $\mathbb{N}$ and denoting by $1_A$ the characteristic function of a subset $A$ of $\mathbb{N}$ defined by $1_A(n) = 1$ if $n \in A$, $1_A(n) = 0$ otherwise, note that for distinct subsets $A$, $B$ of $\mathbb{N}$ one has $\|1_A - 1_B\|_\infty = 1$. Deduce from Cantor's Theorem that the space $\ell_\infty$ is not separable as it cannot have a countable base of open sets.] 8. Using James' Theorem, give another proof of the fact that a uniformly convex space $X$ is reflexive. [Hint: for $x^* \in X^*$ with norm $r > 0$ show that a sequence $(x_n)$ of $B_X$ satisfying $\lim_n x^*(x_n) = r$ is a Cauchy sequence.]
3.5 Some Key Results of Functional Analysis We devote this section to some classical theorems of functional analysis and to some applications and refinements. Most of them are consequences of Baire’s Theorem. With the Hahn-Banach Theorem, they form the pillars of linear functional analysis.
3.5.1 Some Classical Theorems

The Uniform Boundedness Theorem, the Open Mapping Theorem and the Closed Range Theorem that are on our agenda are among the cornerstones of linear functional analysis. They have many consequences and applications. We start with a side result that enables us to reduce an interiority condition to an algebraic condition. Recall that the core (or algebraic interior) of a convex subset $C$ of a vector space $X$ is the set of points $a \in X$ such that for all $v \in X$ there exists some $\alpha > 0$ for which $a + [-\alpha, \alpha] v \subset C$. One has the following characterizations.

Lemma 3.19 For a nonempty convex subset $C$ of a vector space $X$ and $a \in X$, the following assertions are equivalent:
(a) $a \in \operatorname{core} C$;
(b) $C - a$ is absorbing: for all $x \in X$ there exists a $t > 0$ such that $tx \in C - a$;
(c) $X = \mathbb{R}_+ (C - a) := \{ r(c - a) : r \in \mathbb{R}_+,\ c \in C \}$;
(d) $a \in C$ and for all $v \in X$ there exists an $\varepsilon > 0$ such that $a + [0, \varepsilon] v \subset C$.

Proof Since a subset $D$ of $X$ is absorbing if and only if, for every $x \in X$ there exists some $r > 0$ such that $x \in rD$, the implications (a)$\Rightarrow$(b)$\Rightarrow$(c) are obvious. Assertion (d) is clearly satisfied when $X = \{0\}$. Let us show that (c)$\Rightarrow$(d) when $X \ne \{0\}$. Taking $v \ne 0$ in $X$, we can write $v = r(c - a)$, $-v = r'(c' - a)$, for some $c, c' \in C$ and $r, r' > 0$ (as $v \ne 0$), so that
\[ 0 = r(r + r')^{-1} c + r'(r + r')^{-1} c' - a \in C - a \]
by convexity, hence $a \in C$. Moreover, setting $\varepsilon := 1/r$, for $s \in [0, \varepsilon]$, we have $a + sv = (1 - sr)a + src \in C$. Assuming (d) holds, given $v \in X$, taking $\varepsilon, \varepsilon' > 0$ such that $a + [0, \varepsilon] v \subset C$, $a + [0, \varepsilon'](-v) \subset C$ and setting $\alpha := \min(\varepsilon, \varepsilon')$ we see that $a + [-\alpha, \alpha] v \subset C$, so that $a \in \operatorname{core} C$. □

Let us give comparisons between the core and the interior of a convex subset of a normed vector space. The inclusion $\operatorname{int} C \subset \operatorname{core} C$ always holds since for all $a \in \operatorname{int} C$ and all $v \in X$ the map $f : t \mapsto a + tv$ is continuous and $f(0) \in \operatorname{int} C$, so that $f(t) \in C$ for $|t|$ small enough. We leave the first two comparisons that follow as exercises.

Exercise If $C$ is a convex subset of a normed vector space and if the interior of $C$ is nonempty, then $\operatorname{int} C = \operatorname{core} C$. [Hint: for $\bar{x} \in \operatorname{core} C$ and $a \in \operatorname{int} C$ let $\varepsilon > 0$ be such that $z := \bar{x} + \varepsilon(\bar{x} - a) \in C$. Then note that $h : x \mapsto z + \varepsilon(1 + \varepsilon)^{-1}(x - z)$ is a homeomorphism of $X$ onto $X$ that maps $C$ onto a neighborhood of $\bar{x}$ contained in $C$.]

Exercise If $C$ is a convex subset of a finite dimensional vector space, then $\operatorname{int} C = \operatorname{core} C$. [Hint: use the preceding exercise and the fact that $C$ has a nonempty interior in the affine subspace it generates, which is the whole space if $\operatorname{core} C$ is nonempty.]

The next relationship requires a proof.

Proposition 3.31 The core of a closed convex subset $C$ of a Banach space $X$ coincides with its interior.

Proof We may suppose $\operatorname{core} C \ne \emptyset$ and, using a translation if necessary, that $0 \in \operatorname{core} C$. Then $X$ is the union of the closed subsets $nC$ for $n \in \mathbb{N} \setminus \{0\}$. Since $X$ is a Baire space, one of these sets has a nonempty interior. Thus $C$ has a nonempty interior and the result follows by the first exercise above. □

The preceding proposition can be considered as a special case of the next one when in it one takes $X := \{0\}$, $F := \{0\} \times C$. Note that the projection of a closed subset is not necessarily closed, so that new arguments are required.
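The closedness assumption in Proposition 3.31 cannot simply be dropped. The following sketch gives a standard counterexample; it assumes the existence of a discontinuous linear form on an infinite dimensional normed space (obtained, for instance, from an algebraic basis), a fact that is not established in this section.

```latex
% Assumption: X is an infinite dimensional normed space and
% f : X -> \mathbb{R} is a linear form that is NOT continuous.
\[
  C := \{ x \in X : |f(x)| \le 1 \} \quad \text{is convex.}
\]
% 0 belongs to core C: for every v in X, with \alpha := 1/(1 + |f(v)|),
\[
  |f(t v)| = |t|\,|f(v)| \le \frac{|f(v)|}{1 + |f(v)|} < 1
  \quad \text{for } |t| \le \alpha,
  \qquad \text{so } [-\alpha, \alpha]\, v \subset C .
\]
% int C is empty: if a + r B_X \subset C for some a in C and r > 0, then
\[
  |f(u)| = |f(a + u) - f(a)| \le 1 + |f(a)| \quad \text{for all } u \in r B_X ,
\]
% so f would be bounded on a ball, hence continuous, a contradiction.
% Thus core C \neq int C; this is consistent with Proposition 3.31
% because C is not closed.
```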
Proposition 3.32 (Robinson) Let $C := p_Y(F)$ be the projection on $Y$ of a closed convex subset $F$ of the product $X \times Y$ of two Banach spaces. Then $\operatorname{core} C = \operatorname{int} C$.

Proof in the case $X$ is reflexive In such a case a simple proof using a compactness argument can be given. Again, we may suppose $\operatorname{core} C \ne \emptyset$ and, using translations if necessary, that $0 \in \operatorname{core} C$ and $(0, 0) \in F$. Then, for all $y \in Y$ there exist some $(u, v) \in F$ and some $k \in \mathbb{N} \setminus \{0\}$ such that $y = kv$. Taking $m \in \mathbb{N} \setminus \{0\}$ such that $u \in mB_X$, where $B_X$ is the closed unit ball of $X$, we see that $y = km\, p_Y(u/m, v/m) \in km F(B_X)$, where $F(B_X) := p_Y(F \cap (B_X \times Y))$, $F$ being considered as a multimap from $X$ into $Y$. Thus $Y = \bigcup_{n \ge 1} nF(B_X)$ and $0 \in \operatorname{core} F(B_X)$. It remains to prove that $F(B_X)$ is (weakly) closed and convex, so that, by the preceding proposition,
$0 \in \operatorname{int} F(B_X) \subset \operatorname{int} C$. Convexity is obvious and closedness follows from Exercise 2 of Sect. 2.2.4. Let us adapt the argument to the present case. Let $y$ be the (weak) limit of a net $(y_i)_{i \in I}$ in $F(B_X)$. For all $i \in I$ we pick $x_i \in B_X$ such that $y_i \in F(x_i)$, i.e. $(x_i, y_i) \in F$. Since $B_X$ is weakly compact, a subnet $(x_j)_{j \in J}$ of $(x_i)_{i \in I}$ has a weak limit $x$ in $B_X$. Since $F$ is weakly closed, we have $(x, y) \in F$ and $y \in F(B_X)$. □

Proof of Proposition 3.32 in the general case Again, by translations and homotheties we can reduce the task to showing that if $0 \in \operatorname{core} C$ and $(0, 0) \in F$, then, for some $s > 0$, $C$ contains the ball $sB_Y$. As above, considering $F$ as a multimap from $X$ into $Y$ we have $Y = \bigcup_{n \ge 1} nF(B_X)$. The Baire category theorem asserts that for some $n \ge 1$ the set $\operatorname{cl}(nF(B_X))$ has a nonempty interior. Thus $D := \operatorname{cl} F(B_X)$ has a nonempty interior. Let $y \in \operatorname{int}(D)$. Since $0 \in \operatorname{core} C$ there exists some $s > 0$ such that $-sy \in C$. If $u \in X$ is such that $-sy \in F(u)$, taking $t \in \,]0, s]$ such that $ts^{-1}u \in B_X$, by convexity we see that $-ty \in F(B_X) \subset D := \operatorname{cl} F(B_X)$. Then, since $D$ is convex, by Lemma 3.11 one has
\[ 0 = \frac{t}{1+t}\, y + \frac{1}{1+t}\,(-ty) \in \frac{t}{1+t} \operatorname{int}(D) + \frac{1}{1+t}\, D \subset \operatorname{int}(D). \]
Thus there exists an $r > 0$ such that $rB_Y \subset D = \operatorname{cl} F(B_X)$. The rest of the proof is devoted to showing that $(1-q) r B_Y \subset F(B_X)$ for all $q \in \,]0, 1[$ (the reader can just take $q := 1/2$ but since $\operatorname{int}(rB_Y)$ is the union of the balls $r'B_Y$ for $r' \in [0, r[$, our proof shows more: $\operatorname{int}(rB_Y) \subset F(B_X)$). Let $y \in (1-q) r B_Y$. Since by convexity, for $t \in [0, 1]$, $trB_Y \subset \operatorname{cl} F(tB_X)$, given $w \in trB_Y$ and $\varepsilon > 0$ we can find $u \in tB_X$, $v \in F(u)$ such that $\|v - w\| < \varepsilon$. In particular, taking $t := 1 - q$, $w := y$ we can find $x_1 \in (1-q)B_X$, $y_1 \in F(x_1)$ such that $\|y_1 - y\| < \varepsilon_0 := \tfrac{1}{2}(1-q)q^2 r$. Then, setting $x_0 := 0$, we show by induction on $n \in \mathbb{N} \setminus \{0\}$ that there exists some $(x_n, y_n) \in F$ such that
\[ \|x_n - x_{n-1}\| \le q^{n-1}(1-q), \qquad \|y_n - y\| \le \tfrac{1}{2}(1-q) q^{n+1} r. \tag{3.8} \]
These relations imply $\|x_n\| = \|x_0 + \sum_{j=1}^{n} (x_j - x_{j-1})\| \le \sum_{j=1}^{n} q^{j-1}(1-q) \le 1 - q^n$ and they are satisfied for $n = 1$. Assuming $x_0, x_1, \dots, x_k$ are constructed in such a way that (3.8) is satisfied for $n \le k$, taking $w_k := y + t_k(y - y_k)$, with $t_k := 2q^{-k}(1-q)^{-1}$, so that $\|w_k - y\| = t_k \|y_k - y\| \le qr$ and $\|w_k\| \le qr + \|y\| \le r$, or $w_k \in rB_Y$, we pick $u_k \in B_X$, $v_k \in F(u_k)$ such that $\|v_k - w_k\| < \varepsilon_k := (1-q)q^{k+2} r/2$ and we set
\[ (x_{k+1}, y_{k+1}) := \frac{t_k}{1+t_k}\,(x_k, y_k) + \frac{1}{1+t_k}\,(u_k, v_k) \in F \]
since $F$ is convex. Moreover, using the relation $1/(1+t_k) \le 1/t_k$, we get
\[ \|x_{k+1} - x_k\| \le \frac{1}{1+t_k}\,(\|x_k\| + \|u_k\|) \le \frac{1}{2}\, q^k (1-q)(2 - q^k) \le q^k (1-q). \]
By our choice of $w_k$,

$$y_{k+1} - y = \frac{t_k}{1+t_k}(y_k - y) + \frac{1}{1+t_k}\big[(w_k - y) + (v_k - w_k)\big] = \frac{1}{1+t_k}(v_k - w_k),$$

so that we get $\|y_{k+1} - y\| \le (1+t_k)^{-1}\|v_k - w_k\| \le \varepsilon_k := (1-q)q^{k+2}r/2$ and relation (3.8) is established. The Cauchy sequence $(x_n)$ has a limit $x$ and $(x, y) = \lim_n (x_n, y_n) \in F$ since $F$ is closed, and $x \in B_X$ since $x_n \in B_X$ for all $n \in \mathbb{N}$. Thus $y \in F(B_X)$. □

Remark Interchanging the roles of $X$ and $Y$, we get that for any multimap $F : X \rightrightarrows Y$ with closed convex graph and domain $D := p_X(F)$ one has $\operatorname{core} D = \operatorname{int} D$. Taking for $F$ the epigraph of a lower semicontinuous convex function $f$, we get the next corollary.

Corollary 3.25 Let $W$ be a Banach space and let $f : W \to \overline{\mathbb{R}}$ be a lower semicontinuous convex function. Then $\operatorname{core}(\operatorname{dom} f) = \operatorname{int}(\operatorname{dom} f)$.

The following generalization of the Open Mapping Theorem is a versatile tool.

Theorem 3.22 (Robinson-Ursescu) Let $X$, $Y$ be Banach spaces and let $F : X \rightrightarrows Y$ be a multimap with closed convex graph. Then, for any $(\overline{x}, \overline{y})$ in (the graph of) $F$ such that $\overline{y} \in \operatorname{core} F(X)$, the multimap $F$ is open at $(\overline{x}, \overline{y})$ in the sense that for any $U \in \mathcal{N}(\overline{x})$ one has $F(U) \in \mathcal{N}(\overline{y})$. In fact, $F$ is open at $(\overline{x}, \overline{y})$ with a linear rate: there exists some $c > 0$ such that

$$\forall t \in {]0,1]} \qquad B(\overline{y}, tc) \subset F(B(\overline{x}, t)).$$
Proof Without loss of generality, we may suppose $(\overline{x}, \overline{y}) = (0, 0)$, so that $F(X)$ is absorbing. Let $B$ be the closed ball with center $\overline{x} = 0$ and radius $r$ in $X$ and let $C := F(B) = p_Y((B \times Y) \cap \operatorname{gph} F)$. The beginning of the first proof of Proposition 3.32 shows that $C$ is absorbing, but let us show that again, which will prove that $0 \in \operatorname{int} C$ by Proposition 3.32 in which we replace $F$ with $(B \times Y) \cap \operatorname{gph} F$. Let $y$ be an arbitrary point of $Y$; since $F(X)$ is absorbing, there exist some $s > 0$ and $x \in X$ such that $sy \in F(x)$. If $x \in B$, then $sy \in C$. If $x \in X \setminus B$, let $t := r\|x\|^{-1} \in {]0,1[}$, so that $tx \in B$. Since $F$ is convex we have

$$sty = tsy + (1-t)0 \in tF(x) + (1-t)F(0) \subset F(tx + (1-t)0) \subset F(B) = C.$$

Thus, $C$ is absorbing and any neighborhood of $\overline{x}$ is mapped by $F$ onto a neighborhood of $\overline{y}$: $F$ is open at $(\overline{x}, \overline{y})$. The last assertion stems from the convexity of $F$: if $B(\overline{y}, c)$ is contained in $F(B(\overline{x}, 1))$, then, for $t \in {]0,1]}$, we have $tF(B(\overline{x}, 1)) + (1-t)F(\overline{x}) \subset F(tB(\overline{x}, 1) + (1-t)\overline{x})$, hence

$$B(\overline{y}, tc) = tB(\overline{y}, c) + (1-t)\overline{y} \subset tF(B(\overline{x}, 1)) + (1-t)F(\overline{x}) \subset F(B(\overline{x}, t)). \qquad \square$$
Taking for $F$ a surjective linear map with closed graph, we get a classical result.

Theorem 3.23 (Banach-Schauder Open Mapping Theorem) Let $X$ and $Y$ be Banach spaces and let $A : X \to Y$ be a continuous linear mapping such that $A(X) = Y$. Then there exists some $c > 0$ such that for all $y \in Y$ one can find $x \in X$ satisfying $A(x) = y$ and $\|x\| \le c\|y\|$, and $A$ is open, i.e. for any open subset $U$ of $X$, its image $V := A(U)$ is an open subset of $Y$.

We immediately point out some important consequences. The first is obtained by replacing $Y$ with $A(X)$, which is a Banach space when it is closed in $Y$.

Corollary 3.26 Let $X$ and $Y$ be Banach spaces and let $A \in L(X, Y)$ be such that $R(A) := A(X)$ is closed. Then there exists some $c > 0$ such that for all $y \in R(A)$ one can find $x \in X$ satisfying $A(x) = y$ and $\|x\| \le c\|y\|$.

When $A$ is bijective, the conclusion means that $A^{-1}$ is continuous, a remarkable fact.

Theorem 3.24 (Banach Isomorphism Theorem) If $A$ is a linear continuous bijection between two Banach spaces, then $A$ is an isomorphism.

Corollary 3.27 (Closed Graph Theorem) Any linear map with closed graph between two Banach spaces is continuous.

Proof Let $B : Y \to Z$ be such a map. The graph $X$ of $B$, being a closed linear subspace of $Y \times Z$, is a Banach space and $A : (y, By) \mapsto y$ is a continuous bijection from $X$ onto $Y$. Then its inverse $y \mapsto (y, By)$ is continuous and $B$ is continuous. □

Exercise Let $X$ and $Y$ be Banach spaces and let $A : X \to Y$ be a linear map such that, for every $y^* \in Y^*$, the linear form $y^* \circ A$ is continuous on $X$. Show that $A$ is continuous. [Hint: note that the graph $G$ of $A$ is closed as $G = \{(x, y) \in X \times Y : \forall y^* \in Y^* \ \ y^*(A(x) - y) = 0\}$.]

A factorization result is often helpful.

Lemma 3.20 Let $X$, $Y$ be Banach spaces, let $A \in L(X, Y)$ be such that $A(X) = Y$ and let $\ell \in X^*$ be such that $\ell(x) = 0$ for all $x \in N := \ker A$. Then there exists some $y^*$ in the dual $Y^*$ of $Y$ such that $\ell = y^* \circ A$.

Proof Since $A$ is onto and since for any $x, x' \in X$ satisfying $A(x) = A(x')$ one has $\ell(x) = \ell(x')$, there exists a map $k : Y \to \mathbb{R}$ such that $\ell = k \circ A$. It is easy to see that $k$ is linear. Now, the Banach Open Mapping Theorem asserts that there exists some $c > 0$ such that for all $y \in Y$ one can find some $x \in A^{-1}(y)$ satisfying $\|x\| \le c\|y\|$. It follows that for all $y \in Y$ one has $k(y) = k(A(x)) = \ell(x) \le \|\ell\| c \|y\|$. Thus $k$ is continuous and we can take $y^* = k$. □

Remark Instead of using the Banach Open Mapping Theorem, one can introduce the canonical projection $p : X \to X/N$, observe that $\ell$ can be factorized as $\ell = \hat{\ell} \circ p$ for some $\hat{\ell}$ in the dual of $X/N$ and use the Banach Isomorphism Theorem to get that the map $\hat{A} : X/N \to Y$ induced by $A$ is an isomorphism, so that one has $\ell = y^* \circ A$ with $y^* := \hat{\ell} \circ \hat{A}^{-1}$. □
Corollary 3.28 Let $G$ and $H$ be two closed subspaces of a Banach space $E$ such that $G + H$ is closed. Then there exists some $c > 0$ such that every $z \in G + H$ can be decomposed into $z = x + y$ with $x \in G$, $y \in H$, $\|x\| \le c\|z\|$ and $\|y\| \le c\|z\|$.

Proof This is a special case of Corollary 3.26 obtained by taking $X := G \times H$ and $A : (x, y) \mapsto x + y$, so that $R(A) = G + H$. □

Corollary 3.29 Let $X$ and $Y$ be closed subspaces of a Banach space $Z$ such that $Z = X + Y$ and $X \cap Y = \{0\}$. Then $Z$ is the direct topological sum of $X$ and $Y$ in the sense that the map $S : (x, y) \mapsto x + y$ is an isomorphism from $X \times Y$ onto $Z$.

Then the projectors $p_X$ and $p_Y$ of $Z$ onto $X$ and $Y$ are continuous since $(p_X, p_Y) = S^{-1}$. One says that $X$ (and $Y$) are (topologically) complemented. In Hilbert spaces, every closed subspace is complemented. This is not the case in general in Banach spaces. For example, the closed subspace $c_0$ of $\ell_\infty$ is not complemented (see Exercises 5 and 6 of Sect. 3.3.3 for the definitions of these spaces).

Corollary 3.30 Let $X$ and $Y$ be closed subspaces of a Banach space $W$ such that $Z := X + Y$ is closed. Then there exists a $c > 0$ such that, for all $w \in W$,

$$d(w, X \cap Y) \le c\, d(w, X) + c\, d(w, Y).$$

Proof Given $t > 1$ and $w \in W$ we can find $x \in X$, $y \in Y$ such that

$$\|w - x\| \le t\, d(w, X), \qquad \|w - y\| \le t\, d(w, Y).$$

Taking $z := x - y$ in Corollary 3.28, we obtain some $x' \in X$, $y' \in Y$ such that

$$x - y = x' + y', \qquad \|x'\| + \|y'\| \le c\|x - y\|.$$

Then $x - x' = y' + y \in X \cap Y$ and since $\|x' - y'\| \le \|x'\| + \|y'\| \le c\|x - y\|$ we get

$$\begin{aligned}
d(w, X \cap Y) &\le \tfrac{1}{2}\|2w - (x - x') - (y + y')\| \\
&\le \tfrac{1}{2}\big(\|w - x\| + \|w - y\| + \|x' - y'\|\big) \\
&\le \tfrac{1}{2}\big(\|w - x\| + \|w - y\| + c\|x - w\| + c\|w - y\|\big) \\
&\le \tfrac{1}{2}\, t(c + 1)\big(d(w, X) + d(w, Y)\big).
\end{aligned}$$

Since $t$ is arbitrarily close to 1, the constant of the statement can be estimated as $\frac{1}{2}(c + 1)$, where $c$ is as in Corollary 3.28. □

Exercise Let $A$ be a linear continuous map from a Banach space $X$ onto a Banach space $Y$. Show that $A$ has a continuous right inverse $B$ (i.e. a map $B \in L(Y, X)$ such
that $A \circ B = I_Y$) if and only if the kernel $N$ of $A$ has a topological complement $M$ (i.e. if $X$ is the direct topological sum of $M$ and $N$).

Exercise Let $A$ be an injective linear continuous map from a Banach space $X$ into a Banach space $Y$. Show that $A$ has a continuous left inverse $B$ (i.e. a map $B \in L(Y, X)$ such that $B \circ A = I_X$) if and only if the image $R(A)$ of $A$ has a topological complement $Q$.

The next theorem is a result of independent interest completing Proposition 3.21.

Theorem 3.25 Let $G$ and $H$ be closed linear subspaces of a Banach space $E$. The following assertions are equivalent:

$$G + H \text{ is closed in } E; \tag{3.9}$$

$$G^\perp + H^\perp \text{ is closed in } E^*; \tag{3.10}$$

$$G + H = (G^\perp \cap H^\perp)^\perp; \tag{3.11}$$

$$G^\perp + H^\perp = (G \cap H)^\perp. \tag{3.12}$$

Proof (3.11)⇒(3.9) and (3.12)⇒(3.10) are immediate implications. The reverse implication (3.9)⇒(3.11) is a consequence of Proposition 3.21.

(3.9)⇒(3.12) Since $G^\perp \subset (G \cap H)^\perp$ and $H^\perp \subset (G \cap H)^\perp$, hence $G^\perp + H^\perp \subset (G \cap H)^\perp$, it suffices to prove that $(G \cap H)^\perp \subset G^\perp + H^\perp$. Let $f \in (G \cap H)^\perp$. Given two decompositions $z = x + y = x' + y'$ of $z \in G + H$ with $x, x' \in G$, $y, y' \in H$ we have $f(x) = f(x')$ since $x - x' = y' - y \in G \cap H$. Thus one can define $\ell : F := G + H \to \mathbb{R}$ by $\ell(z) := f(x)$ for $z = x + y$ with $x \in G$, $y \in H$, and $\ell$ is seen to be linear. Moreover, $\ell$ is continuous since by Corollary 3.28 there exists a $c > 0$ such that for all $z \in F$ we can find $x \in G$, $y \in H$ such that $z = x + y$, $\|x\| \le c\|z\|$, hence $|\ell(z)| \le c\|f\|\|z\|$. The Hahn-Banach Theorem allows us to consider $\ell$ as the restriction to $F$ of a continuous linear form on $E$ still denoted by $\ell$. Then we can write $f$ as $f = (f - \ell) + \ell$ with $f - \ell \in G^\perp$ and $\ell \in H^\perp$.

(3.10)⇒(3.9) Corollary 3.30 asserts that there exists a $c > 0$ such that

$$\forall z^* \in E^* \qquad d(z^*, G^\perp \cap H^\perp) \le c\, d(z^*, G^\perp) + c\, d(z^*, H^\perp). \tag{3.13}$$

Now by the second part of Proposition 3.22, for $D := G^\perp$, $H^\perp$, $G^\perp \cap H^\perp$ we have

$$\forall z^* \in E^* \qquad d(z^*, D) = \sup\{\langle z^*, z\rangle : z \in B_E \cap D^\perp\} = h_{B_{D^\perp}}(z^*).$$

By relation (3.5) we have $(G^\perp \cap H^\perp)^\perp = (G + H)^{\perp\perp} =: F$, where $F$ is the closure of $G + H$. Now, by Corollary 3.20 about support functions, (3.13) implies that

$$c^{-1}B_F \subset \operatorname{cl}(B_G + B_H).$$
The proof of the Open Mapping Theorem applied to the map $A : (x, y) \mapsto x + y$ from $G \times H$ to $F$ shows that $\operatorname{int}(c^{-1}B_F) \subset A(B_{G \times H}) = B_G + B_H$ when $G \times H$ is endowed with the sup norm. This shows that $A$ is surjective or that $F = G + H$: $G + H$ is closed. □

The following classical result is known as the Closed Range Theorem.

Theorem 3.26 (Closed Range Theorem) Let $X$ and $Y$ be Banach spaces and let $A^{\top}$ be the transpose map of $A \in L(X, Y)$. Then the range $R(A^{\top})$ of $A^{\top}$ is closed if and only if the range $R(A)$ of $A$ is closed.

Proof Suppose $R(A)$ is closed. Corollary 3.26 yields some $c > 0$ such that for all $y \in R(A)$ one can find $x \in X$ satisfying $A(x) = y$ and $\|x\| \le c\|y\|$. Then, given $y^* \in Y^*$, for all $y \in R(A)$ we have

$$|\langle y^*, y\rangle| = |\langle y^*, Ax\rangle| = |\langle A^{\top}y^*, x\rangle| \le \|A^{\top}y^*\| \cdot \|x\| \le c\|A^{\top}y^*\| \cdot \|y\|,$$

so that $\|y^*\| \le c\|A^{\top}y^*\|$. This inequality shows that any Cauchy sequence in $R(A^{\top})$ is the image under $A^{\top}$ of a Cauchy sequence in $Y^*$. Since $A^{\top}$ is continuous, we see that any Cauchy sequence in $R(A^{\top})$ converges. Thus $R(A^{\top})$ is closed in $X^*$.

Conversely, suppose $R(A^{\top})$ is closed. Then $R(A^{\top\top})$ is closed in $Y^{**}$. Considering $X$ (resp. $Y$) as a subspace of $X^{**}$ (resp. $Y^{**}$), for all $x \in X$, $y^* \in Y^*$ we have

$$\langle y^*, A^{\top\top}x\rangle = \langle A^{\top}y^*, x\rangle = \langle y^*, Ax\rangle,$$

so that $A^{\top\top}x = Ax$. As above we can find some $c > 0$ such that $\|x^{**}\| \le c\|A^{\top\top}x^{**}\|$ for all $x^{**} \in X^{**}$. Given $y \in \operatorname{cl}(A(X)) \subset \operatorname{cl}(A^{\top\top}(X))$ and a sequence $(x_n)$ in $X$ such that $(A^{\top\top}(x_n)) \to y$ we get that $(x_n)$ is a Cauchy sequence in $X$, hence has a limit $x$. Since $A^{\top\top}$ is continuous we have $y = A^{\top\top}(x) = A(x)$, hence $y \in A(X)$. This shows that $R(A) := A(X)$ is closed in $Y^{**}$, hence in $Y$. □

The next classical theorem is often useful.

Theorem 3.27 (Banach-Steinhaus or Uniform Boundedness Theorem) Let $X$, $Y$ be normed spaces, $X$ being complete, and let $F$ be a subset of the space $L(X, Y)$ of continuous linear maps from $X$ to $Y$. If for all $x \in X$, the set $F(x) := \{f(x) : f \in F\}$ is bounded in $Y$, then $F$ is bounded in $L(X, Y)$ with respect to the usual norm.

Proof Denoting by $B_Y$ the unit ball of $Y$, for $n \in \mathbb{N}$ consider the closed set

$$X_n := \{x \in X : \forall f \in F \ \ \|f(x)\| \le n\} = \bigcap_{f \in F} f^{-1}(nB_Y).$$

By assumption, $X$ is the union of the family $(X_n)$. By Baire's Theorem 2.11, there is some $k \in \mathbb{N}$ such that $X_k$ has a nonempty interior. If $a \in \operatorname{int} X_k$, we also have $-a \in \operatorname{int} X_k$ since $X_k$ is symmetric with respect to 0. Since $X_k$ is convex, it follows that $0 \in \operatorname{int} X_k$. Thus, if $r > 0$ is such that $rB_X \subset X_k$, we have $\|f\| \le r^{-1}k$ for all $f \in F$. □
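The completeness of $X$ cannot be dropped from the theorem. The following standard counterexample, written out here as an illustration (the space $c_{00}$ and the forms $f_n$ are not taken from the text above), shows a pointwise bounded family that is not bounded in norm.

```latex
% X := c_{00}, the finitely supported real sequences, with the sup norm; X is not complete.
X := c_{00} = \{x = (x_k)_{k \ge 1} : x_k = 0 \text{ for all but finitely many } k\},
\qquad \|x\| := \sup_{k} |x_k| .
% Continuous linear forms on X:
f_n(x) := n\, x_n \qquad (n \ge 1).
% Pointwise boundedness: if x_k = 0 for k > N(x), then f_n(x) = 0 for n > N(x), so
\sup_{n} |f_n(x)| = \max_{n \le N(x)} n\,|x_n| < \infty \quad \text{for every } x \in X .
% Yet, testing on the n-th unit sequence e_n,
\|f_n\| = n \longrightarrow \infty .
```

Here each set $X_k$ of the proof is closed with empty interior, which is compatible with Baire's Theorem only because $c_{00}$ is not complete.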
Corollary 3.31 A weak* bounded subset of the dual $X^*$ of a Banach space $X$ is bounded. A weakly bounded subset of $X$ is bounded. In particular, weak* convergent sequences of $X^*$ and weakly convergent sequences of $X$ are bounded.

Here a subset $F$ of $X^*$ (resp. $S \subset X$) is said to be weak* (resp. weakly) bounded if for all $x \in X$ (resp. $x^* \in X^*$) the set $\{f(x) : f \in F\}$ (resp. $\{x^*(x) : x \in S\}$) is bounded in $\mathbb{R}$.

Proof The first assertion is the special case of the theorem corresponding to $Y := \mathbb{R}$. The second stems from the fact that the canonical embedding of $X$ into $X^{**}$ is isometric. □

We end this subsection by deriving from Brouwer's Theorem a fixed point theorem that may be used to solve equations in functional spaces. The proof relies on two useful tools: an approximation method and partitions of unity.

Theorem 3.28 (Schauder) Let $X$ be a normed space, let $C$ be a nonempty closed convex subset of $X$ and let $f : C \to C$ be a continuous map such that $f(C)$ is contained in a compact subset $K$ of $C$. Then $f$ has a fixed point.

Proof Given $r > 0$ let $(B(y_i, r/2))_{i \in I}$ be a finite covering of $K$ by open balls with radius $r/2$, with $y_i \in K$ for all $i \in I$. Let us set

$$p_i(x) := \max(r - \|f(x) - y_i\|, 0), \qquad p(x) := \sum_{i \in I} p_i(x).$$

Then $p_i$ and $p$ are continuous and for all $x \in C$ we can find some $i \in I$ such that $f(x) \in B(y_i, r/2)$, so that $p(x) \ge p_i(x) \ge r/2$. Thus, the map $f_r$ defined by

$$f_r(x) := \frac{1}{p(x)} \sum_{i \in I} p_i(x) y_i$$

is continuous and $\|f_r(x) - f(x)\| \le r$ for all $x \in C$: indeed, setting $I(x) := \{i \in I : p_i(x) > 0\}$, we have $\|f(x) - y_i\| < r$ for all $i \in I(x)$, hence

$$\|f_r(x) - f(x)\| = \Big\| \frac{1}{p(x)} \sum_{i \in I(x)} p_i(x) y_i - \frac{1}{p(x)} \sum_{i \in I(x)} p_i(x) f(x) \Big\| \le \frac{1}{p(x)} \sum_{i \in I(x)} p_i(x) \|f(x) - y_i\| \le r.$$

Let $C_r$ be the convex hull of $\{y_i : i \in I\}$. It is a compact convex subset of the linear span of $\{y_i : i \in I\}$. Since $f_r(C_r)$ is contained in $C_r$, the map $f_r|_{C_r}$ has a fixed point $x_r \in C_r$ by Theorem 2.5. Since $\|x_r - f(x_r)\| = \|f_r(x_r) - f(x_r)\| \le r$ and $f(x_r) \in K$, we can find a sequence $(r(n))$ with limit 0 such that $(f(x_{r(n)}))_n$ and $(x_{r(n)})_n$ converge to the same limit $x$. Then, by continuity of $f$, we have $f(x) = x$. □
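The approximation step of this proof can be tried numerically. The sketch below is only an illustration under stated assumptions: the space is $\mathbb{R}^2$ with the Euclidean norm, $C = K = [0,1]^2$, and the map `f`, the grid of centers and the function name `schauder_approximation` are hypothetical choices, not taken from the text. It builds $f_r$ from the weights $p_i$ and checks the estimate $\|f_r(x) - f(x)\| \le r$ on a sample of points; it does not compute a fixed point.

```python
import math

def schauder_approximation(f, centers, r):
    """Build f_r(x) = (1/p(x)) * sum_i p_i(x) * y_i, where
    p_i(x) = max(r - ||f(x) - y_i||, 0) and p = sum_i p_i."""
    def f_r(x):
        fx = f(x)
        weights = [max(r - math.dist(fx, y), 0.0) for y in centers]
        total = sum(weights)  # p(x) >= r/2 as soon as the balls B(y_i, r/2) cover f(C)
        if total <= 0.0:
            raise ValueError("f(x) is farther than r from every center")
        return tuple(
            sum(w * y[k] for w, y in zip(weights, centers)) / total
            for k in range(len(fx))
        )
    return f_r

# Hypothetical data: C = K = [0, 1]^2 and a continuous self-map f of the square.
def f(x):
    return (0.5 * (1.0 + math.cos(3.0 * x[1])), x[0] * x[0])

r = 0.2
step = 0.1  # every point of the square is within step/sqrt(2) < r/2 of a grid node
centers = [(i * step, j * step) for i in range(11) for j in range(11)]
f_r = schauder_approximation(f, centers, r)

# Check the estimate ||f_r(x) - f(x)|| <= r on a finer sample of the square.
samples = [(i / 20.0, j / 20.0) for i in range(21) for j in range(21)]
worst_gap = max(math.dist(f_r(x), f(x)) for x in samples)
print(worst_gap <= r)
```

Each value $f_r(x)$ is a convex combination of the centers, so $f_r$ maps into the finite dimensional compact convex set $C_r$, which is what allows Brouwer's Theorem to be applied.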
Exercises

1. Show that if $(e_i)_{i \in I}$ is an (algebraic) basis of an infinite dimensional Banach space $X$, then $I$ is uncountable. [Hint: if $I = \mathbb{N}$, for $n \in \mathbb{N}$ let $X_n$ be the linear span of $\{e_i : i \in \mathbb{N}_n\}$, so that $X$ is the union of the closed subspaces $X_n$. Using Baire's Theorem, obtain a contradiction.]

2. Let $X$ and $Y$ be Banach spaces and let $b : X \times Y \to \mathbb{R}$ be a bilinear map. Assuming that $b$ is separately continuous in each of its two variables, show that $b$ is continuous. Is the conclusion valid if $b$ is not bilinear? [Hint: define $B : X \to Y^*$ by $B(x) := b(x, \cdot)$ and use the Uniform Boundedness Theorem.]

3. Let $X$ and $Y$ be Banach spaces and let $(A_n)$ be a sequence in $L(X, Y)$ such that for all $x \in X$ the sequence $(A_n x)$ has a limit denoted by $Ax$. Show that $A \in L(X, Y)$ and that for any convergent sequence $(x_n) \to x$ in $X$ one has $(A_n x_n) \to Ax$.

4. Let $X$ be a reflexive Banach space and let $(x_n)$ be a sequence in $X$ such that for all $f \in X^*$ the sequence $(f(x_n))$ has a limit. Deduce from the preceding exercise that there exists some $x \in X$ such that $(x_n) \to x$ weakly. Exhibit a sequence in a (non-reflexive) Banach space $X$ such that the preceding property does not hold. [Hint: take for $X$ the space of sequences $v := (v_n)$ of real numbers such that $(v_n) \to 0$ and take $x_n := 1_{[0,n]}$ with $x_{n,k} := 1$ for $k \le n$, 0 for $k > n$.]

5. Let $X$ be a Banach space and let $A : X \to X^*$ be a linear operator such that $\langle Ax, x\rangle \ge 0$ for all $x \in X$. Show that $A$ is continuous. [Hint: use the closed graph theorem.]

6. Let $X$ and $Y$ be Banach spaces and let $A \in L(X, Y)$ be surjective. Show that the kernel $N(A)$ of $A$ has a complement if and only if $A$ has a right inverse $B \in L(Y, X)$ (meaning that $A \circ B = I_Y$).

7. Let $X$ and $Y$ be Banach spaces and let $A \in L(X, Y)$ be injective. Show that $R(A)$ is closed and has a complement if and only if $A$ has a left inverse $B \in L(Y, X)$ (meaning that $B \circ A = I_X$).

8. (Robinson) Given a Banach space $X$, a normed space $Y$, and a closed convex subset $F$ of $X \times Y$ such that $p_X(F)$ is bounded, show that $\operatorname{int}(p_Y(F)) = \operatorname{int}(\operatorname{cl}(p_Y(F)))$. [Hint: adapt the proof of Proposition 3.32.]

9. (Metric regularity property) Given $c \in \mathbb{P}$ and a multimap $F : X \rightrightarrows Y$ whose graph is convex satisfying $B(\overline{y}, c) \subset F(B(\overline{x}, 1))$ for some $(\overline{x}, \overline{y}) \in F$, show that for all $(x, y) \in X \times B(\overline{y}, c)$ one has

$$d(x, F^{-1}(y)) \le \frac{1 + \|x - \overline{x}\|}{c - \|y - \overline{y}\|}\, d(y, F(x)).$$

3.5.2 Densely Defined Operators and Transposition

In many problems involving derivatives or partial derivatives one is faced with operators that are not defined everywhere, such as $A : x \mapsto x'$ on $C([0,1])$, which is defined on the subspace $C^1([0,1])$ of continuously differentiable functions on
$[0,1]$, or $\Delta : C(\Omega) \to C(\Omega)$ given by $\Delta u = \sum_{i=1}^{d} D_i^2 u$ for $u \in C^2(\Omega)$, where $\Omega$ is an open subset of $\mathbb{R}^d$ and $C^2(\Omega)$ is the space of functions having continuous derivatives of order one and two on $\Omega$. Thus, it is of interest to perform a general study of so-called unbounded operators, i.e. operators whose domain is a subspace. Applications to quantum mechanics can be found in [67, 74, 185] for instance. In the rest of this subsection $X$ and $Y$ are Banach spaces and $A : D(A) \to Y$ is a linear map whose domain $D(A)$ is a linear subspace of $X$. We first show that an elementary rule for integration and the notion of transposition can be extended to closed densely defined operators, i.e. operators whose graphs are closed and whose domains are dense.

Lemma 3.21 Let $A : D(A) \to Y$ be a closed linear map, let $T := [a, b]$ be a compact interval of $\mathbb{R}$, and let $f \in C(T, X)$ be such that $f(T) \subset D(A)$ and $A \circ f \in C(T, Y)$. Then $\int_T f \in D(A)$ and $A(\int_T f) = \int_T A \circ f$.

Proof By assumption, $g : t \mapsto (f(t), A(f(t)))$ is continuous from $T$ into the graph $G(A)$ of $A$ endowed with the norm induced by $X \times Y$. Since $G(A)$ is closed, hence complete, $\int_T g \in G(A)$. Since $\int_T g = (\int_T f, \int_T A \circ f)$, we get $\int_T A \circ f = A(\int_T f)$. □

Definition 3.3 Let $A : D(A) \to Y$ be a linear map with dense domain $D(A)$. Let $D(A^{\top})$ be the set of $y^* \in Y^*$ such that $y^* \circ A$ is continuous on $D(A)$ with respect to the induced topology. Then there exists a unique linear map $A^{\top} : D(A^{\top}) \to X^*$ satisfying the relation

$$\langle A^{\top}y^*, x\rangle = \langle y^*, Ax\rangle \qquad \forall y^* \in D(A^{\top}),\ \forall x \in D(A).$$

It is called the transpose (or conjugate or dual operator) of $A$.

The fact that $A^{\top}y^* \in X^*$ is well defined when $y^* \in D(A^{\top})$ stems from the property that the linear continuous map $y^* \circ A$ has a unique (linear) continuous extension to $X$ since $D(A)$ is dense in $X$ and $y^* \circ A$ is uniformly continuous. A routine argument shows that $D(A^{\top})$ is a linear subspace and that $A^{\top}$ is linear.

The transpose of $A$ is often called the adjoint of $A$, but we prefer to keep this term (and the notation $A^*$) for a related notion specific to Hilbert spaces. Namely, if $J_X : X \to X^*$ and $J_Y : Y \to Y^*$ are the duality maps of Hilbert spaces $X$ and $Y$ one sets $A^* := J_X^{-1} \circ A^{\top} \circ J_Y$, using the fact that $J_X$ and $J_Y$ are isomorphisms. In other terms, $A^*$ is characterized by

$$\forall x \in D(A),\ y \in J_Y^{-1}(D(A^{\top})) \qquad \langle A^*y \mid x\rangle = \langle y \mid Ax\rangle,$$

where $\langle \cdot \mid \cdot \rangle$ is the scalar product. If $X$ and $Y$ are finite dimensional Euclidean spaces, both notions can be identified. It can be shown that if $A : D(A) \to Y$ and $B : D(B) \to Y$ are such that $D(A) \cap D(B)$ is dense in $X$, then $(A + B)^{\top} = A^{\top} + B^{\top}$ on $D(A^{\top}) \cap D(B^{\top})$. Let us note that
when $D(A) = X$ and $A$ is continuous, then $D(A^{\top}) = Y^*$ since for all $y^* \in Y^*$ the linear form $y^* \circ A$ is continuous. Moreover, in such a case, we have seen that $A^{\top}$ is continuous. This fact is valid even if $X$ and $Y$ are just locally convex topological vector spaces (exercise).

Example The position operator $Q : X \to X$, with $X := L_2(\mathbb{R})$, the Lebesgue space of square integrable functions on $\mathbb{R}$, is the map given by $Q(x)(r) := rx(r)$. It plays an important role in quantum mechanics. Its domain is the set $D(Q) := \{x \in L_2(\mathbb{R}) : I_{\mathbb{R}}x \in L_2(\mathbb{R})\}$ where $I_{\mathbb{R}}(r) := r$ for $r \in \mathbb{R}$. It is unbounded since for $T_n := [n, n+1]$ one has $\|Q(1_{T_n})\|_2 \ge n$ while $\|1_{T_n}\|_2 = 1$. It is a symmetric operator since for $x, y \in D(Q)$ one has

$$\langle Qx \mid y\rangle = \int_{\mathbb{R}} rx(r)y(r)\,dr = \langle Qy \mid x\rangle.$$

Moreover, it is self-adjoint, i.e. one has $D(Q^*) \subset D(Q)$: if $y \in D(Q^*)$ there exists some $z \in L_2(\mathbb{R})$ such that $\langle Qx \mid y\rangle = \langle x \mid z\rangle$ for all $x \in D(Q)$, in particular for all $x \in C_c^\infty(\mathbb{R})$, which implies that $z(r) = ry(r)$ for a.e. $r \in \mathbb{R}$, or $y \in D(Q)$. The spectrum $\sigma(Q)$ of $Q$ is $\mathbb{R}$, i.e. for all $\lambda \in \mathbb{R}$ the operator $Q - \lambda I$ is not invertible: if $T$ were a continuous inverse of $Q - \lambda I$, for all $x \in L_2(\mathbb{R})$ the relation $(Q - \lambda I)(Tx) = x$ would imply $(r - \lambda)(Tx)(r) = x(r)$, whereas for $x := 1_{]\lambda, \mu[}$ with $\mu > \lambda$ the function $r \mapsto (r - \lambda)^{-1}x(r)$ does not belong to $L_2(\mathbb{R})$. However, $Q$ has no eigenvalue, as is easily seen. □

Proposition 3.33 If $A : D(A) \to Y$ is a densely defined linear map, its transpose $A^{\top}$ is a closed map in the sense that its graph $G(A^{\top})$ is closed in $Y^* \times X^*$ (and even weak* closed). For the domain $D(A^{\top})$ of $A^{\top}$ to be weak* dense in $Y^*$ it is necessary and sufficient that $A$ has a closed extension $\overline{A}$. If this is the case, the transpose $A^{\top\top} := (A^{\top})^{\top}$ of $A^{\top}$ is the smallest closed extension of $A$. In particular, if $A$ is closed in the sense that $G(A)$ is closed, then $A^{\top\top} = A$.

Proof Let $S : X^* \times Y^* \to Y^* \times X^*$ be the (symplectic) isomorphism defined by $S(x^*, y^*) = (-y^*, x^*)$. The definition of $A^{\top}$ shows that

$$G(A^{\top}) = S(G(A)^\perp). \tag{3.14}$$

In fact, $(y^*, x^*) \in G(A^{\top})$ if and only if for all $(x, y) \in G(A)$ one has $\langle y^*, y\rangle - \langle x^*, x\rangle = 0$, or $(y^*, x^*) = S(x^*, -y^*)$ with $(x^*, -y^*) \in G(A)^\perp$. Since $S$ is an isomorphism, and since $G(A)^\perp$ is weak* closed, $G(A^{\top})$ is weak* closed.

Assume that $D(A^{\top})$ is weak* dense in $Y^*$. The preceding shows that $A^{\top\top}$ considered as an operator from $X$ into $Y$ is closed and $G(A^{\top\top}) = T(G(A^{\top})^\perp)$ for $T := S^{\top} : (y, x) \mapsto (x, -y)$. It is clearly an extension of $A$ and, since for a linear subspace $Z$ of $X^* \times Y^*$ one has $(S(Z))^\perp = T^{-1}(Z^\perp)$, one gets

$$G(A^{\top\top}) = T(G(A^{\top})^\perp) = T((S(G(A)^\perp))^\perp) = T(T^{-1}(G(A)^{\perp\perp})) = G(A)^{\perp\perp}.$$
Thus, by the bipolar theorem, $G(A^{\top\top})$ is the smallest closed subspace containing $G(A)$ and $A$ has a closed extension $A^{\top\top}$.

Conversely, assuming that $A$ has a closed extension $\overline{A}$, let us prove that $D(A^{\top})$ is weak* dense. Since $D(\overline{A}^{\top}) \subset D(A^{\top})$, it suffices to prove that $D(\overline{A}^{\top})$ is weak* dense in $Y^*$, or that $D(B^{\top})$ is weak* dense in $Y^*$ when $B$ is a closed densely defined operator. Thus, we assume that $A$ is closed. Let $y \in D(A^{\top})^\perp$; then $(y, 0) \in G(A^{\top})^\perp$. Setting as above $T(v, u) := (u, -v)$ for $(v, u) \in Y \times X$, so that $T^{-1}(u, v) = (-v, u)$, for a subspace $Z$ of $X^* \times Y^*$ we have $(S(Z))^\perp = T^{-1}(Z^\perp)$, hence, by relation (3.14) and the bipolar theorem,

$$G(A^{\top})^\perp = (S(G(A)^\perp))^\perp = T^{-1}(G(A)^{\perp\perp}) = T^{-1}(G(A)).$$

Then $(y, 0) \in G(A^{\top})^\perp$ implies that $T(y, 0) = (0, -y) \in G(A)$, hence $y = 0$. Thus $D(A^{\top})$ is weak* dense in $Y^*$. □

Other relationships between $A$ and $A^{\top}$ are described in the next result.

Proposition 3.34 For a closed, densely defined operator $A$ between $D(A) \subset X$ and $Y$ one has

$$N(A^{\top}) = R(A)^\perp, \qquad N(A) = R(A^{\top})^\perp. \tag{3.15}$$

Proof Introducing the closed subspaces $G := G(A)$, $H := X \times \{0\}$ (the "horizontal subspace") of $E := X \times Y$, one has

$$X \times R(A) = G + H, \qquad N(A) \times \{0\} = G \cap H \tag{3.16}$$

and similarly, since $G(A^{\top}) = S(G(A)^\perp) = S(G^\perp)$,

$$R(A^{\top}) \times Y^* = G^\perp + H^\perp, \qquad \{0\} \times N(A^{\top}) = G^\perp \cap H^\perp. \tag{3.17}$$

Thus, since $(G + H)^\perp = G^\perp \cap H^\perp$ by (3.5), we get

$$\{0\} \times N(A^{\top}) = G^\perp \cap H^\perp = (G + H)^\perp = \{0\} \times R(A)^\perp.$$

On the other hand, since $G \cap H = (G^\perp + H^\perp)^\perp$ by (3.6), we get

$$N(A) \times \{0\} = G \cap H = (G^\perp + H^\perp)^\perp = (R(A^{\top}) \times Y^*)^\perp = R(A^{\top})^\perp \times \{0\},$$

so that relation (3.15) holds. □

Remark From (3.15) and the relations $W^{\perp\perp} = \operatorname{cl} W$, $Z^{\perp\perp} = \operatorname{cl}^* Z$ (the weak* closure of $Z$) for linear subspaces $W \subset X$, $Z \subset X^*$, we deduce $N(A^{\top})^\perp = \operatorname{cl} R(A)$, $N(A)^\perp = \operatorname{cl}^* R(A^{\top})$. □

The Closed Range Theorem can be generalized to densely defined closed maps.
Theorem 3.29 (Closed Range Theorem) Let $A : D(A)\ (\subset X) \to Y$ be a densely defined closed operator. The following assertions are equivalent:

$$R(A) = \operatorname{cl}(R(A)); \tag{3.18}$$

$$R(A^{\top}) = \operatorname{cl} R(A^{\top}); \tag{3.19}$$

$$R(A) = N(A^{\top})^\perp; \tag{3.20}$$

$$R(A^{\top}) = N(A)^\perp. \tag{3.21}$$

Proof Setting $G := G(A)$, $H := X \times \{0\}$, we saw that

$$N(A) \times \{0\} = G \cap H, \qquad \{0\} \times N(A^{\top}) = G^\perp \cap H^\perp, \tag{3.22}$$

$$X \times R(A) = G + H, \qquad R(A^{\top}) \times Y^* = G^\perp + H^\perp. \tag{3.23}$$

Thus $R(A)$ is closed if and only if $G + H$ is closed, if and only if (by Theorem 3.25) $G^\perp + H^\perp$ is closed, if and only if $R(A^{\top})$ is closed. When $R(A) = N(A^{\top})^\perp$, $R(A)$ is closed. Conversely, if $R(A)$ is closed, by relation (3.15) one has $R(A) = R(A)^{\perp\perp} = N(A^{\top})^\perp$. Similarly, when $R(A^{\top}) = N(A)^\perp$, $R(A^{\top})$ is closed. Conversely, if $R(A^{\top})$ is closed, then by (3.23) and Theorem 3.25, $G^\perp + H^\perp$ and $G + H$ are closed so that, by (3.23), $R(A)$ is closed; moreover, by (3.12) and (3.22), $R(A^{\top}) \times Y^* = G^\perp + H^\perp = (G \cap H)^\perp = N(A)^\perp \times Y^*$, hence $R(A^{\top}) = N(A)^\perp$. □

Alternate proof (for further training with the preceding results)

(3.18)⇔(3.20) The implication (3.20)⇒(3.18) being obvious, let us prove the reverse implication. By definition of $A^{\top}$ we have $R(A) \subset N(A^{\top})^\perp$ since for $x \in D(A)$, $y := Ax$, $y^* \in N(A^{\top})$ we have $\langle y^*, y\rangle = \langle y^*, Ax\rangle = \langle A^{\top}y^*, x\rangle = 0$. To prove that $R(A) = N(A^{\top})^\perp$ when $R(A)$ is closed we show that for $y \in Y \setminus R(A)$ we have $y \in Y \setminus N(A^{\top})^\perp$. Since $R(A)$ is closed, the Hahn-Banach Theorem yields some $y^* \in Y^*$ such that $\langle y^*, z\rangle = 0$ for all $z \in R(A)$ and $\langle y^*, y\rangle = 1$. Thus $y^* \in D(A^{\top})$ and $y^* \in N(A^{\top})$. Since $\langle y^*, y\rangle = 1$, we have $y \notin N(A^{\top})^\perp$.

(3.19)⇔(3.21) The implication (3.21)⇒(3.19) being obvious, let us prove the reverse implication. Again, by definition of $A^{\top}$, we have $R(A^{\top}) \subset N(A)^\perp$ since for $x^* := A^{\top}y^*$ with $y^* \in D(A^{\top})$ and $x \in N(A)$ we have $\langle x^*, x\rangle = \langle A^{\top}y^*, x\rangle = \langle y^*, Ax\rangle = 0$. Let us prove the opposite inclusion when $R(A^{\top})$ is closed. Given $x^* \in N(A)^\perp$, we observe that for every $y \in R(A)$ and any $x, x' \in A^{-1}(y)$ we have $\langle x^*, x\rangle = \langle x^*, x'\rangle$ since $x - x' \in N(A)$. Thus, we get a linear form $f$ on $R(A)$ by setting $f(y) := \langle x^*, x\rangle$ for $x \in A^{-1}(y)$. Since $B : G(A) \to Y$ given by $B(x, Ax) = Ax$ is continuous when $G(A)$ is endowed with the norm induced by $X \times Y$ and since $R(B) = R(A)$ is closed, Corollary 3.26 provides some $c > 0$ such that for all $y \in R(A)$ there exists some $z \in G(A)$ such that $z \in B^{-1}(y)$ and $\|z\| \le c\|y\|$. Taking $x \in D(A)$ such that $z := (x, Ax)$ we see that for any $y \in R(A)$ there exists some $x \in D(A)$ such that $\|x\| \le c\|y\|$ and $y = A(x)$. Thus $f$ is continuous. Taking a linear continuous extension $y^* \in Y^*$ of $f$ we see that $y^* \in D(A^{\top})$ and for all $x \in D(A)$ we have $\langle x^*, x\rangle = f(Ax) = \langle y^*, Ax\rangle$. Thus $x^* = A^{\top}y^* \in R(A^{\top})$.
Let us reduce the proof of the equivalence (3.18)⇔(3.19) to the case of a continuous operator that is already known (Theorem 3.26). To do so, we provide the graph $G := G(A)$ of $A$ with the norm induced by the norm of $X \times Y$, so that $G$ is a Banach space. Its dual is isometric to the quotient space $(X^* \times Y^*)/G^\perp$ via the restriction map $S : X^* \times Y^* \to G^*$ and a subset $H$ of $G^*$ is closed if and only if $S^{-1}(H)$ is closed in $X^* \times Y^*$. Let $P$ be the restriction to $G$ of the second projection $(x, y) \mapsto y$. Since $A(x) = P(x, Ax)$ for all $x \in D(A)$ we have $R(P) = R(A)$. On the other hand, since $P^{\top}(z^*) = S(0, z^*)$ for all $z^* \in Y^*$, we have $(x^*, y^*) \in S^{-1}(R(P^{\top}))$ if and only if there exists a $z^* \in Y^*$ such that $S(x^*, y^*) = S(0, z^*)$, if and only if there exists $z^* \in Y^*$ such that $x^* = A^{\top}(z^* - y^*)$. Thus $S^{-1}(R(P^{\top})) = R(A^{\top}) \times Y^*$ and $R(P^{\top})$ is closed if and only if $R(A^{\top})$ is closed. Using the fact that $R(P^{\top})$ is closed if and only if $R(P)$ is closed, we obtain the equivalence (3.18)⇔(3.19). □

The following consequence is often used to show the solvability of an equation $Ax = y$: one reduces the question to the search for a constant $c > 0$ such that $\|y^*\| \le c\|A^{\top}y^*\|$ for all $y^* \in D(A^{\top})$. This method is called the method of a priori estimates.

Corollary 3.32 Let $A : D(A)\ (\subset X) \to Y$ be a densely defined closed operator between Banach spaces. The following assertions are equivalent:

(a) $A$ is surjective: $R(A) = Y$;
(b) there exists some $c > 0$ such that $\|y^*\| \le c\|A^{\top}y^*\|$ for all $y^* \in D(A^{\top})$;
(c) $R(A^{\top})$ is closed and $N(A^{\top}) = \{0\}$.

Proof (a)⇒(b) By homogeneity it suffices to prove that the set $Z^* := \{y^* \in D(A^{\top}) : \|A^{\top}y^*\| \le 1\}$ is bounded in $Y^*$ or even (invoking the Uniform Boundedness Theorem), that for any $y \in Y$, the set $Z^*(y) := \{\langle y^*, y\rangle : y^* \in Z^*\}$ is bounded in $\mathbb{R}$. Since $R(A) = Y$ we can pick some $x \in D(A)$ such that $Ax = y$. Then, for $y^* \in Z^*$ we have $|\langle y^*, y\rangle| = |\langle A^{\top}y^*, x\rangle| \le c\|A^{\top}y^*\|$ with $c := \|x\|$.

(b)⇒(c) Let $x^* \in \operatorname{cl}(R(A^{\top}))$. If $(y_n^*)$ is a sequence in $D(A^{\top})$ such that $(A^{\top}y_n^*) \to x^*$, the estimate in (b) shows that $(y_n^*)$ is a Cauchy sequence, hence has a limit $y^*$ in $Y^*$. Since the graph of $A^{\top}$ is closed by Proposition 3.33, we get $x^* = A^{\top}y^* \in R(A^{\top})$: $R(A^{\top})$ is closed. The equality $N(A^{\top}) = \{0\}$ is a direct consequence of the relation $\|y^*\| \le c\|A^{\top}y^*\|$ for all $y^* \in D(A^{\top})$.

(c)⇒(a) This follows from Theorem 3.29: $R(A) = N(A^{\top})^\perp = Y$. □

In Exercise 1 a dual result is proposed; its use is more limited. For a densely defined closed operator $A$ between Banach spaces we have the implications

$A$ surjective ⇒ $A^{\top}$ injective, $\qquad A^{\top}$ surjective ⇒ $A$ injective
and in finite dimensional spaces the reverse implications hold. In general, this is not the case in infinite dimensional spaces (see Exercises 2 and 3).
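The finite dimensional case can be checked directly on matrices, since there surjectivity and injectivity are rank conditions and a matrix and its transpose have the same rank. The following sketch is an illustration only: the helper names `rank`, `transpose`, `is_surjective`, `is_injective` and the two sample matrices are hypothetical choices, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (given as a list of rows) by Gauss-Jordan elimination over Q."""
    m = [[Fraction(v) for v in row] for row in rows]
    rk = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Look for a pivot in this column among the rows not used yet.
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][col] != 0:
                factor = m[i][col] / m[rk][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

def transpose(rows):
    return [list(col) for col in zip(*rows)]

def is_surjective(rows):
    """A maps R^n onto R^m exactly when rank(A) = m (the number of rows)."""
    return rank(rows) == len(rows)

def is_injective(rows):
    """A is one-to-one on R^n exactly when rank(A) = n (the number of columns)."""
    return rank(rows) == len(rows[0])

A = [[1, 0, 2], [0, 1, 1]]   # hypothetical map R^3 -> R^2 of rank 2
B = [[1, 2], [2, 4]]         # hypothetical map R^2 -> R^2 of rank 1

for M in (A, B):
    print(is_surjective(M), is_injective(transpose(M)),
          is_surjective(transpose(M)), is_injective(M))
```

In each printed row the first two values agree and the last two values agree, which is the finite dimensional form of both implications together with their converses.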
Exercises

1. Let $A : D(A)\ (\subset X) \to Y$ be a densely defined closed operator between Banach spaces. Show that the following assertions are equivalent:
(a) $A^{\top}$ is surjective: $R(A^{\top}) = X^*$;
(b) there exists a $c > 0$ such that $\|x\| \le c\|Ax\|$ for all $x \in D(A)$;
(c) $R(A)$ is closed and $N(A) = \{0\}$.

2. Let $X = Y = \ell_1$ and let $A \in L(X, Y)$ be given by $Ax := (a_n x_n)_n$ for $x := (x_n)_n$, where $(a_n) \to 0$ and $a_n > 0$ for all $n$. Verify that $A$ is injective but that $R(A^{\top}) \ne X^*$.

3. With the same data as in the preceding exercise, verify that $A^{\top}$ is injective but that $A$ is not surjective.

4. Let $A : D(A)\ (\subset X) \to Y$ be a densely defined closed operator between Banach spaces. Show that $R(A)$ is closed if and only if there exists some $c \in \mathbb{R}_+$ such that $d(x, N(A)) \le c\|Ax\|$ for all $x \in D(A)$. [Hint: first, consider the case $D(A) = X$ and use the quotient space $X/N(A)$. Then, reduce the general case to the preceding one by considering the operator $A_G : G(A) \to Y$ defined as the restriction to $G(A)$ of the second projection, $G(A)$ being endowed with the restriction of a product norm.]

5. Let $X := \ell_1$ and let $A : D(A)\ (\subset X) \to X$ be the operator defined by $D(A) := \{x := (x_n) \in \ell_1 : (nx_n) \in \ell_1\}$, $Ax = (nx_n)$ for $x := (x_n) \in D(A)$. Verify that $A$ is densely defined and closed but that $A^{\top}$ is not densely defined in $X^* = \ell_\infty$.

3.5.3 The Spectrum of a Linear Operator

Given a Banach space $X$ over $\mathbb{K} := \mathbb{R}$ or $\mathbb{C}$ and $A \in L(X) := L(X, X)$, it is of interest to consider the set $\rho(A)$ of $\lambda \in \mathbb{K}$ such that $A - \lambda I$ is an isomorphism; it is called the resolvent set of $A$. The complement of $\rho(A)$ is called the spectrum of $A$ and is denoted by $\sigma(A)$. The spectrum of $A$ obviously contains the set $\sigma_e(A)$ of eigenvalues of $A$, $\lambda \in \mathbb{K}$ being called an eigenvalue of $A$ if there exists some $v \in X \setminus \{0\}$ such that $Av = \lambda v$. Then $v$ is called an eigenvector. The set $\sigma_e(A)$ is also called the point spectrum and denoted by $\sigma_p(A)$. The eigenspace $X_\lambda$ corresponding to the eigenvalue $\lambda$ is the set formed by 0 and the eigenvectors associated with $\lambda$, i.e. $N(A - \lambda I)$, the kernel of $A - \lambda I$. If $X$ is finite dimensional, $\sigma_e(A) = \sigma(A)$ since $A - \lambda I$ is an isomorphism whenever it is injective. However, in infinite dimensional spaces the inclusion $\sigma_e(A) \subset \sigma(A)$ may be strict, as one may have $N(A) := A^{-1}(0) = \{0\}$, i.e. $0 \notin \sigma_e(A)$, and $R(A) :=$
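$A(X) \ne X$, hence $0 \in \sigma(A)$. This is the case for the right shift $A$ on the space $\ell_\infty$ of bounded sequences given by $A(x_0, x_1, \ldots) = (0, x_0, x_1, \ldots)$.

As already shown by the finite dimensional case, the situation is much more interesting in complex vector spaces than in real spaces. In complex vector spaces it can be shown that $\sigma(A)$ is always nonempty (even if $\sigma_e(A)$ is empty, as in the above example), as in the finite dimensional setting. However, different techniques are involved and in this book we focus our attention on real vector spaces. In this setting we have the following elementary result.

Proposition 3.35 The (real) spectrum of a linear continuous operator $A \in L(X)$ on a real Banach space $X$ is a compact subset of the interval $[-c, c]$ for $c := \|A\|$. Moreover, if $\lambda_0 \in \rho(A)$, then for $|\lambda - \lambda_0| < 1/\|(\lambda_0 I - A)^{-1}\|$ one has $\lambda \in \rho(A)$ and

$$(\lambda I - A)^{-1} = \sum_{n \ge 0} (\lambda_0 - \lambda)^n (\lambda_0 I - A)^{-(n+1)}, \tag{3.24}$$

$$\|(\lambda I - A)^{-1}\| \le \frac{\|(\lambda_0 I - A)^{-1}\|}{1 - |\lambda - \lambda_0| \cdot \|(\lambda_0 I - A)^{-1}\|}. \tag{3.25}$$

Proof Since $\rho(A)$ is open by Proposition 3.17, $\sigma(A)$ is closed. We have to show that $\sigma(A) \subset [-c, c]$. Now for $\lambda \in \mathbb{R} \setminus [-c, c]$, $I - \lambda^{-1}A$ is an isomorphism since $\|\lambda^{-1}A\| < 1$. Thus $A - \lambda I$ is an isomorphism and $\lambda \in \rho(A)$. Setting $R(\lambda) = (\lambda I - A)^{-1}$ and writing

$$\lambda I - A = \lambda_0 I - A + (\lambda - \lambda_0)I = [I - (\lambda_0 - \lambda)R(\lambda_0)](\lambda_0 I - A),$$

we see that $\lambda I - A$ is invertible when $|\lambda - \lambda_0| < 1/\|R(\lambda_0)\|$. Then the inverse is given by

$$R(\lambda) = R(\lambda_0)[I - (\lambda_0 - \lambda)R(\lambda_0)]^{-1} = \sum_{n=0}^{\infty} (\lambda_0 - \lambda)^n R(\lambda_0)^{n+1}.$$

Summing the geometric series $\sum_{n \ge 0} |\lambda - \lambda_0|^n \|R(\lambda_0)\|^{n+1}$, it follows that $\|R(\lambda)\| \le \|R(\lambda_0)\|/(1 - |\lambda - \lambda_0| \|R(\lambda_0)\|)$. □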
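The series expansion of the resolvent around a point $\lambda_0$ of the resolvent set can be checked numerically in the simplest setting. The sketch below assumes $X = \mathbb{R}^2$ with the sup norm (so the operator norm is the maximal absolute row sum); the matrix `A`, the values `lam0` and `lam`, and all helper names are hypothetical choices made for the illustration.

```python
def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def resolvent(A, lam):
    """R(lam) = (lam*I - A)^(-1) for a 2x2 matrix A."""
    return inverse_2x2([[lam - A[0][0], -A[0][1]], [-A[1][0], lam - A[1][1]]])

def row_sum_norm(m):
    """Operator norm induced by the sup norm on R^2."""
    return max(sum(abs(v) for v in row) for row in m)

def resolvent_by_series(A, lam0, lam, terms=80):
    """Partial sum of sum_{n>=0} (lam0 - lam)^n R(lam0)^(n+1)."""
    R0 = resolvent(A, lam0)
    power = R0      # holds R(lam0)^(n+1)
    coeff = 1.0     # holds (lam0 - lam)^n
    total = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(terms):
        total = [[total[i][j] + coeff * power[i][j] for j in range(2)] for i in range(2)]
        power = mat_mul(power, R0)
        coeff *= lam0 - lam
    return total

A = [[0.0, 1.0], [-2.0, -3.0]]   # hypothetical example with eigenvalues -1 and -2
lam0, lam = 1.0, 1.5
R0_norm = row_sum_norm(resolvent(A, lam0))
assert abs(lam - lam0) < 1.0 / R0_norm   # hypothesis |lam - lam0| < 1/||R(lam0)||

direct = resolvent(A, lam)
series = resolvent_by_series(A, lam0, lam)
error = max(abs(direct[i][j] - series[i][j]) for i in range(2) for j in range(2))
# Bound obtained by summing the geometric series of norms.
geometric_bound = R0_norm / (1.0 - abs(lam - lam0) * R0_norm)
print(error, row_sum_norm(direct), geometric_bound)
```

The partial sums converge geometrically with ratio at most $|\lambda - \lambda_0| \cdot \|R(\lambda_0)\|$, so a few dozen terms already reproduce the directly computed inverse to machine precision in this example.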
Proposition 3.36 If $(e_1, \ldots, e_n)$ is a finite family of eigenvectors corresponding to distinct eigenvalues of $A \in L(X)$, then $e_1, \ldots, e_n$ are linearly independent.

Proof We prove the assertion by induction on $n$. For $n = 1$ the assertion is obvious. We assume the assertion is valid for $n - 1$ and prove it for $n$. Let $\lambda_i$ be the eigenvalue corresponding to $e_i$ for $i \in \mathbb{N}_n$. Since $e_1, \ldots, e_{n-1}$ are linearly independent by our induction assumption, it suffices to prove that assuming that $e_n = c_1 e_1 + \ldots + c_{n-1}e_{n-1}$ for some $c_i \in \mathbb{R}$ leads to a contradiction. This relation implies that

$$\lambda_n(c_1 e_1 + \ldots + c_{n-1}e_{n-1}) = \lambda_n e_n = Ae_n = c_1\lambda_1 e_1 + \ldots + c_{n-1}\lambda_{n-1}e_{n-1},$$
hence that ci .n i / D 0 for i 2 Nn1 . It follows that ci D 0 and en D 0, a contradiction. u t The notions of eigenvalue and eigenvector can be extended to any linear operator A whose domain D.A/ is smaller than the whole Banach space X. The resolvent set .A/ of a closed (linear) operator A is defined as the set of 2 R such that I A is a bijection of D.A/ onto X. Then, the resolvent operator R WD R./ WD .I A/1 is a continuous linear operator since its graph is closed, as is the graph of I A, as is easily seen. Thus, when D.A/ D X, these definitions reduce to the former ones. The following properties of the resolvent set and of the resolvent operator are useful, in particular for semigroups (see Chap. 10). Proposition 3.37 The resolvent set .A/ of an unbounded closed linear operator A on a real Banach space X is open. Moreover, if 0 2 .A/ and if 2 R satisfies j 0 j < 1= kR0 k we have 2 .A/ and R D
X
. 0 /n .0 I A/.nC1/ :
n0
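Proof Again, given $\lambda_0 \in \rho(A)$, for $\lambda$ satisfying $|\lambda - \lambda_0| < 1/\|R_{\lambda_0}\|$ we have
$$\lambda I - A = \lambda_0 I - A + (\lambda - \lambda_0) I = [I - (\lambda_0 - \lambda) R_{\lambda_0}](\lambda_0 I - A)$$
and we see that $\lambda I - A$ is invertible. Then the inverse is given as above by
$$R_\lambda = R_{\lambda_0} [I - (\lambda_0 - \lambda) R_{\lambda_0}]^{-1} = \sum_{n=0}^{\infty} (\lambda_0 - \lambda)^n R_{\lambda_0}^{n+1}.$$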
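The series expansion of the resolvent can be checked numerically in the simplest situation. The following is only a sketch under assumptions that are not in the text: $X = \mathbb{R}^2$, $D(A) = X$, and a sample matrix $A$ chosen for illustration; the helper names `matmul`, `inverse`, `resolvent` and `resolvent_series` are hypothetical. For this $A$ and $\lambda_0 = 0$ the row-sum norm of $R_{\lambda_0}$ is $2/3$, so $\lambda = 0.5$ satisfies $|\lambda - \lambda_0| < 1/\|R_{\lambda_0}\|$.

```python
# Illustrative check (assumed example, not from the text): the series of
# Proposition 3.37 for a 2x2 matrix A, where D(A) = X = R^2.

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(m):
    """Inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def resolvent(A, lam):
    """R_lam = (lam*I - A)^(-1)."""
    return inverse([[lam - A[0][0], -A[0][1]], [-A[1][0], lam - A[1][1]]])

def resolvent_series(A, lam0, lam, terms=60):
    """Partial sum of sum_n (lam0 - lam)^n * R_{lam0}^(n+1)."""
    R0 = resolvent(A, lam0)
    power = R0      # holds R0^(n+1)
    coeff = 1.0     # holds (lam0 - lam)^n
    total = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(terms):
        total = [[total[i][j] + coeff * power[i][j] for j in range(2)]
                 for i in range(2)]
        power = matmul(power, R0)
        coeff *= lam0 - lam
    return total

A = [[2.0, 1.0], [0.0, 3.0]]            # sample operator (assumption)
direct = resolvent(A, 0.5)              # (0.5*I - A)^(-1) computed directly
series = resolvent_series(A, 0.0, 0.5)  # expansion around lam0 = 0
gap = max(abs(direct[i][j] - series[i][j])
          for i in range(2) for j in range(2))
print(gap)
```

The printed gap should be at the level of rounding errors, since the terms of the series decay geometrically with ratio $|\lambda_0 - \lambda| \, \|R_{\lambda_0}\| = 1/3$.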
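□

Proposition 3.38 If $R_\lambda$ is the resolvent of a closed linear operator $A$ then one has $A R_\lambda = \lambda R_\lambda - I$ and
(a) For $x \in D(A)$ and $\lambda \in \rho(A)$ one has $R_\lambda x \in D(A)$ and $R_\lambda A x = A R_\lambda x$.
(b) For $\lambda$, $\mu \in \rho(A)$ one has $R_\lambda - R_\mu = (\mu - \lambda) R_\lambda R_\mu$ and $R_\lambda R_\mu = R_\mu R_\lambda$.
(c) The map $\lambda \mapsto R_\lambda$ is infinitely differentiable (in the usual sense that for all $n \in \mathbb{N} \setminus \{0\}$, the $n$-th derivative $R^{(n)}_\lambda := \lim_{\delta (\neq 0) \to 0} (1/\delta)(R^{(n-1)}_{\lambda + \delta} - R^{(n-1)}_\lambda)$ exists) and
$$R^{(n)}_\lambda = (-1)^n n! \, R_\lambda^{n+1}.$$

Proof For all $x \in X$, $\lambda \in \rho(A)$, setting $w := R_\lambda x$, one has $w \in D(A)$, $\lambda w - A w = x$, hence $A R_\lambda x = \lambda R_\lambda x - x$.
(a) If $x \in D(A)$ and $\lambda \in \rho(A)$, for $w := R_\lambda x$ one has $A w = \lambda w - x \in D(A)$ and $A x = A(\lambda w - A w) = (\lambda I - A)(A w)$, hence $R_\lambda A x = A w = A R_\lambda x$.
(b) Given $x \in X$ and $\lambda, \mu \in \rho(A)$, setting $w := R_\lambda x$, $z := R_\mu x$, we have $(\lambda I - A)(w - z) = x - (\mu I - A)(z) + (\mu - \lambda) z = (\mu - \lambda) z$, so that $w - z = (\mu - \lambda) R_\lambda z = (\mu - \lambda) R_\lambda R_\mu x$. The relation $R_\lambda R_\mu = R_\mu R_\lambda$ ensues by interchanging the roles of $\lambda$ and $\mu$.
(c) The map $\lambda \mapsto R_\lambda$ is differentiable: taking $\mu := \lambda + \delta$, and passing to the limit in the relation $\delta^{-1}(R_{\lambda + \delta} - R_\lambda) = -R_\lambda R_{\lambda + \delta}$, we get $R^{(1)}_\lambda = -R_\lambda^2$. An induction on $n$ gives the relation $R^{(n)}_\lambda = (-1)^n n! \, R_\lambda^{n+1}$. It can also be obtained from the properties of power series. □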
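The resolvent identity and the formula for the first derivative can be tested on a small example. This is a sketch under assumptions not made in the text: $X = \mathbb{R}^2$, a sample matrix $A$, and a difference quotient with a small step in place of the limit; the helper names are hypothetical.

```python
# Illustrative check (assumed example): resolvent identity and first
# derivative of lam -> R_lam for a 2x2 matrix.

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def resolvent(A, lam):
    """R_lam = (lam*I - A)^(-1) for a 2x2 matrix A."""
    a, b = lam - A[0][0], -A[0][1]
    c, d = -A[1][0], lam - A[1][1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def max_gap(p, q):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(p[i][j] - q[i][j]) for i in range(2) for j in range(2))

A = [[2.0, 1.0], [0.0, 3.0]]     # sample operator (assumption)
lam, mu = 0.5, -1.0              # two points of the resolvent set of A
R_lam, R_mu = resolvent(A, lam), resolvent(A, mu)

# (b): R_lam - R_mu = (mu - lam) * R_lam * R_mu, and the resolvents commute.
lhs = [[R_lam[i][j] - R_mu[i][j] for j in range(2)] for i in range(2)]
prod = matmul(R_lam, R_mu)
rhs = [[(mu - lam) * prod[i][j] for j in range(2)] for i in range(2)]
identity_gap = max_gap(lhs, rhs)
commutator_gap = max_gap(prod, matmul(R_mu, R_lam))

# (c) with n = 1: the difference quotient approximates -R_lam^2.
delta = 1e-6
R_shift = resolvent(A, lam + delta)
quotient = [[(R_shift[i][j] - R_lam[i][j]) / delta for j in range(2)]
            for i in range(2)]
square = matmul(R_lam, R_lam)
minus_square = [[-square[i][j] for j in range(2)] for i in range(2)]
derivative_gap = max_gap(quotient, minus_square)

print(identity_gap, commutator_gap, derivative_gap)
```

The first two gaps are pure rounding errors, while the third one is of the order of the step `delta`, as expected from a first-order difference quotient.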
Exercises

1. Prove that for any Banach space $X$ and any $A \in L(X) := L(X, X)$ one has $\sigma(A^{\intercal}) = \sigma(A)$. Give examples showing that there is no general inclusion between $\sigma_e(A^{\intercal})$ and $\sigma_e(A)$. [Hint: use the right shift and the left shift on a sequence space.]

2*. Let $T := [0, 1]$, $X := L_p(T)$, with $p \in [1, \infty[$ and let $A \in L(X)$ be given by $(Ax)(t) := \int_0^t x(s)\,ds$. Determine $\sigma(A)$ and $\sigma_e(A)$. For $\lambda \in \rho(A)$, give an explicit expression of $(A - \lambda I)^{-1}$. Determine $A^{\intercal}$. [Hint: see Chap. 8 for the definition of $L_p(T)$.]

3. Let $X$ be a Banach space and let $A \in L(X)$. Verify that for all $\lambda \in \rho(A)$ one has $(A - \lambda I)^{-1} A = A (A - \lambda I)^{-1}$ and $d(\lambda, \sigma(A)) \ge 1/\|(A - \lambda I)^{-1}\|$. [Hint: use the fact that all $\mu \in \mathbb{R}$ satisfying $|\mu - \lambda| \cdot \|(A - \lambda I)^{-1}\| < 1$ are in $\rho(A)$.] Show that for all $\lambda \in \mathbb{R}$ satisfying $|\lambda| > \|A\|$ one has $\|I + \lambda (A - \lambda I)^{-1}\| \le \|A\| / (|\lambda| - \|A\|)$. Prove that if $0 \in \rho(A)$ then one has $\sigma(A^{-1}) = 1/\sigma(A)$.

4. Let $X$ be a Banach space and let $A \in L(X) := L(X, X)$. Show that for all $\lambda, \mu \in \rho(A)$ one has $(A - \lambda I)^{-1} - (A - \mu I)^{-1} = (\lambda - \mu)(A - \lambda I)^{-1}(A - \mu I)^{-1}$.

5. Let $X$ be a Banach space and let $A \in L(X)$. Show that $(\|A^n\|^{1/n})$ converges to a limit denoted by $r(A)$ and called the spectral radius of $A$. [Hint: show that $\limsup_n \|A^n\|^{1/n} \le r := \inf_{n \ge 1} \|A^n\|^{1/n}$ by taking for all $s > r$ some $k \in \mathbb{N}$ such that $\|A^k\|^{1/k} < s$ and by noting that for $n = kq + p$ with $p, q \in \mathbb{N}$, $p < k$ one has $\|A^n\|^{1/n} \le s^{kq/n} \|A\|^{p/n}$.] Verify that $r(A) \le \|A\|$ but for $A : \mathbb{R}^2 \to \mathbb{R}^2$ given by $A(x_1, x_2) = (0, x_1)$ one has $r(A) = 0$, $\|A\| = 1$. Prove that $\lambda \in \rho(A)$ whenever $\lambda$ satisfies $|\lambda| > r(A)$ and that $(A - \lambda I)^{-1} = -\sum_{n \ge 0} \lambda^{-n-1} A^n$. Verify that for $A : \mathbb{R}^3 \to \mathbb{R}^3$ given by $A(x_1, x_2, x_3) = (x_2, -x_1, 0)$ one has $\sigma(A) = \{0\}$ but $r(A) = 1$, as $A^3 = -A$. If $X$ is a complex space one has $r(A) = \max\{|\lambda| : \lambda \in \sigma(A)\}$.

6. Let $X$ be a Banach space and let $A \in L(X)$ be such that $\{A^n : n \in \mathbb{N}\}$ is bounded. Show that $\mathrm{cl}(R(I - A)) = \{x \in X : (C_n x) \to 0\}$ for $C_n = (1/n)(A + A^2 + \dots + A^n)$ (Cesàro's sum). Deduce from this characterization the relation $N(I - A) \cap \mathrm{cl}(R(I - A)) = \{0\}$. Suppose that for some $x \in X$ and some infinite subset $N$ of $\mathbb{N}$ the sequence $(C_n x)_{n \in N}$ weakly converges to some $\bar{x}$. Show that $A\bar{x} = \bar{x}$ and $(C_n x)_{n \ge 0} \to \bar{x}$ (the mean ergodic theorem). When $X$ is reflexive, infer from the preceding that there exists some $M \in L(X)$ such that $(C_n x) \to Mx$ for all $x \in X$ and that $M = M^2 = A \circ M = M \circ A$, with $R(M) = N(I - A)$, $N(M) = \overline{R(I - A)} = \mathrm{cl}(R(I - A))$. [Hint: see [262].]

7*. Show that the Laplace operator $-\Delta$ on $X := L_2(\mathbb{R}^d)$ with domain $H^2(\mathbb{R}^d)$ is self-adjoint and its spectrum is $\mathbb{R}_+$. [Hint: use the characterization of $H^2(\mathbb{R}^d)$ in terms of the Fourier transform; see [67, p. 266].]
3.5.4 Compact Operators

In this subsection $X$ and $Y$ are two Banach spaces. A continuous linear operator $A \in L(X, Y)$ is said to be compact if for any bounded subset $B$ of $X$ its image $A(B)$ is relatively compact, i.e. $\mathrm{cl}(A(B))$ is compact. Such operators keep some familiar properties of operators between finite dimensional normed spaces.

Example If $A \in L(X, Y)$ has finite rank, i.e. if the range $R(A)$ of $A$ is finite dimensional, then $A$ belongs to the set $K(X, Y)$ of compact (linear) operators. The converse is not true. It was proved by P. Enflo in 1972 that it is not even true that any element of $K(X, Y)$ can be approximated (for the norm topology of $L(X, Y)$) by a sequence of finite rank operators.

Exercise Prove that the set $K(X, Y)$ of compact operators from $X$ into $Y$ is closed in $L(X, Y)$. [Hint: if $A = \lim_n A_n$ with $A_n \in K(X, Y)$ for all $n$, prove that $A(B_X)$ is precompact.]

Exercise Prove that the set $K(X, Y)$ is a linear space and that for any Banach spaces $W$, $Z$, any $A \in L(W, X)$, $B \in K(X, Y)$, $C \in L(Y, Z)$ one has $B \circ A \in K(W, Y)$, $C \circ B \in K(X, Z)$.

Proposition 3.39 For all $A \in L(X, Y)$ one has $A^{\intercal} \in K(Y^*, X^*)$ if and only if $A \in K(X, Y)$.

Proof Given $A \in K(X, Y)$ let us show that $A^{\intercal}(B_{Y^*})$ is relatively compact in $(X^*, \|\cdot\|_*)$. Let $K := \mathrm{cl}(A(B_X))$ and let $F := \{y^*|_K : y^* \in B_{Y^*}\} \subset C(K)$. Then $K$ is compact and $F$ is equicontinuous since for all $y^* \in B_{Y^*}$, $f := y^*|_K \in F$ and $y, y' \in K$ one has $|f(y) - f(y')| = |y^*(y) - y^*(y')| \le \|y - y'\|$. Moreover, for all $y \in K$ the set $F(y) = \{y^*(y) : y^* \in B_{Y^*}\}$ is bounded in $\mathbb{R}$. The Ascoli-Arzelà theorem (Theorem 3.3) ensures that $F$ is relatively compact in $C(K)$. Thus, any sequence $(f_n)$ in $F$ has a subsequence $(f_{k(n)})$ that converges to some $f \in C(K)$. Picking $y_n^* \in B_{Y^*}$ such that $f_n = y_n^*|_K$ we see that
$$\sup_{x \in B_X} \left| y^*_{k(m)}(Ax) - y^*_{k(n)}(Ax) \right| = \sup_{y \in K} \left| f_{k(n)}(y) - f_{k(m)}(y) \right| \to 0 \quad \text{as } m, n \to \infty$$
so that $(x^*_{k(n)}) := (A^{\intercal}(y^*_{k(n)}))$ is a Cauchy sequence in $X^*$, hence is convergent: $\mathrm{cl}(A^{\intercal}(B_{Y^*}))$ is compact in $X^*$. Conversely, let $A \in L(X, Y)$ be such that $A^{\intercal} \in K(Y^*, X^*)$. Then, by the preceding, $A^{\intercal\intercal} \in K(X^{**}, Y^{**})$. Thus $\mathrm{cl}(A^{\intercal\intercal}(B_X)) \subset \mathrm{cl}(A^{\intercal\intercal}(B_{X^{**}}))$ is compact in $Y^{**}$. But since $A(B_X) = A^{\intercal\intercal}(B_X)$ and since $Y$ is closed in $Y^{**}$, we get that $\mathrm{cl}(A(B_X))$ is compact in $Y$. □

The next result is of practical interest.

Theorem 3.30 (Fredholm Alternative) For any $A \in K(X) := K(X, X)$ the spaces $R(I - A)$ and $R(I - A^{\intercal})$ are closed and in fact
$$R(I - A) = (N(I - A^{\intercal}))^{\perp}, \qquad R(I - A^{\intercal}) = (N(I - A))^{\perp}.$$
Moreover, $N(I - A)$ and $N(I - A^{\intercal})$ are finite dimensional with the same dimension. In particular, $N(I - A) = \{0\}$ if and only if $R(I - A) = X$.

This result entails a solvability criterion for the equation $x - Ax = y$, according to the two cases $\dim N(I - A^{\intercal}) = 0$ or $n := \dim N(I - A^{\intercal}) > 0$: either for every $y \in X$ the equation $x - Ax = y$ has a unique solution, or the homogeneous equation $x - Ax = 0$ has $n$ linearly independent solutions and the inhomogeneous equation $x - Ax = y$ is solvable if and only if $y$ satisfies $n$ orthogonality relations $\langle y, y_i^* \rangle = 0$, $i \in \mathbb{N}_n$, where $(y_i^*)_{i \in \mathbb{N}_n}$ is a base of $N(I - A^{\intercal})$.

Proof Let $Z := N(I - A)$. Since $B_Z \subset A(B_X)$, $B_Z$ is compact, hence $Z$ is finite dimensional by Theorem 3.1. Since $A^{\intercal}$ is compact, $N(I - A^{\intercal})$ is also finite dimensional. Let us show that $R(I - A)$ is closed. By the Closed Range Theorem, this will imply that $R(I - A) = (N(I - A^{\intercal}))^{\perp}$. Given $y = \lim y_n$ with $y_n := x_n - A x_n \in R(I - A)$ for all $n$, we have to show that $y \in R(I - A)$. Since $N(I - A)$ is finite dimensional, we can find $w_n \in N(I - A)$ such that
$$\|x_n - w_n\| = r_n := d(x_n, N(I - A)).$$
Let us first prove that the sequence $(r_n)$ is bounded. Otherwise, taking a subsequence if necessary, we may assume $(r_n) \to \infty$. Setting $u_n := r_n^{-1}(x_n - w_n) \in B_X$ and taking another subsequence and relabelling it, we may assume $(A(u_n))$ converges to some $z \in \mathrm{cl}(A(B_X))$. Then, observing that $A(w_n) = w_n$, we see that
$$y_n = (x_n - w_n) - A(x_n - w_n) \tag{3.26}$$
and $(r_n^{-1} y_n) \to 0$. Thus $(u_n) \to z$ and $z = Az$. Therefore $z \in N(I - A)$ and $d(u_n, N(I - A)) \le \|u_n - z\| \to 0$, contradicting the fact that $d(u_n, N(I - A)) = r_n^{-1} d(r_n u_n, N(I - A)) = r_n^{-1} \|w_n - x_n\| = 1$. Thus $(x_n - w_n)$ is bounded and since $A$ is a compact operator, taking a subsequence if necessary, we may assume that $(A(x_n - w_n))$ has a limit $v$. Then, by (3.26), $(x_n - w_n) \to y + v$. Setting $x := y + v$ and passing to the limit in (3.26), we get $y = x - Ax \in R(I - A)$: this space is closed. Applying Theorem 3.26, we get that
$$R(I - A) = (N(I - A^{\intercal}))^{\perp}, \qquad R(I - A^{\intercal}) = (N(I - A))^{\perp}.$$

Let us prove the last assertion. We first show that assuming $N(I - A) = \{0\}$ and $R(I - A) \neq X$ leads to a contradiction. Setting $X_n := (I - A)^n(X)$ we see that $X_n$ is closed and that the sequence $(X_n)$ is strictly decreasing since taking $x \in X \setminus R(I - A)$ we have $(I - A)^n(x) \in X_n \setminus X_{n+1}$: if we could find $w \in X$ such that $(I - A)^n(x) = (I - A)^{n+1}(w)$, since $(I - A)^n$ is injective, we would have $x = (I - A)(w)$, contradicting the fact that $x \notin (I - A)(X)$. Since $X_{n+1}$ is closed, Proposition 3.1 yields some $u_n \in X_n$ such that $\|u_n\| = 1$ and $d(u_n, X_{n+1}) \ge 1/2$. Given $k < n$ in $\mathbb{N}$ and observing that $A u_n - A u_k = v - u_k$ with $v := u_n - (I - A)(u_n) + (I - A)(u_k) \in X_{k+1}$ we get $\|A u_n - A u_k\| \ge d(u_k, X_{k+1}) \ge 1/2$, contradicting the relative compactness of $A(B_X)$. Thus $R(I - A) = X$ when $N(I - A) = \{0\}$.

Conversely, suppose that $R(I - A) = X$. Then by relation (3.15) or Corollary 3.32 we have $N(I - A^{\intercal}) = (R(I - A))^{\perp} = \{0\}$. Since $A^{\intercal} \in K(X^*)$, the preceding step ensures that $R(I - A^{\intercal}) = X^*$. Using Corollary 3.32 once more, we get that $N(I - A) = (R(I - A^{\intercal}))^{\perp} = \{0\}$.

It remains to show that, for any $A \in K(X)$, $d := \dim N(I - A)$ is equal to $d' := \dim N(I - A^{\intercal})$. Suppose $d < d'$. Since $N(I - A)$ is finite dimensional there exists a continuous projector $P : X \to N(I - A)$ with $R(P) = N(I - A)$. Since $R(I - A) = (N(I - A^{\intercal}))^{\perp}$ has finite codimension $d'$, it has a complement $Y$ with dimension $d'$. Since we assume that $d < d'$, we can find some $C \in L(N(I - A), Y)$ that is injective but not surjective. Let $y \in Y \setminus R(C)$ and let $B : X \to X$ be defined by $B := A + C \circ P$. Since $C \circ P$ has finite rank, we have $B \in K(X)$. Let us show that $N(I - B) = \{0\}$. Given $x \in N(I - B)$ we have
$$0 = x - Bx = (x - Ax) - (C \circ P)(x) \in R(I - A) \oplus Y,$$
so that $x - Ax = 0$ and $(C \circ P)(x) = 0$, hence $x \in N(I - A)$ and $Px = 0$. Thus $x = Px = 0$ and $N(I - B) = \{0\}$. By the preceding step we obtain that $R(I - B) = X$. However, if $x \in (I - B)^{-1}(y)$ we have $y = (x - Ax) - C(Px) \in R(I - A) \oplus Y$, hence $y = -C(Px) \in R(C)$, contradicting $y \in Y \setminus R(C)$. Thus $d' \le d$.

Applying the same result to the compact operator $A^{\intercal}$, we get
$$\dim N(I_{X^{**}} - A^{\intercal\intercal}) \le \dim N(I_{X^*} - A^{\intercal}) \le \dim N(I_X - A).$$
But since $A^{\intercal\intercal}$ is an extension of $A$, we have $N(I_X - A) \subset N(I_{X^{**}} - A^{\intercal\intercal})$, hence equality. Thus $d' = d$. □
Corollary 3.33 For every eigenvalue $\lambda \neq 0$ of a compact operator $A$, $\lambda$ is an eigenvalue of $A^{\intercal}$ with the same multiplicity, i.e. the dimensions of the eigenspaces are the same.

Proof It suffices to apply the second assertion of the theorem to $\lambda^{-1} A$. □

Now let us study the spectra of compact operators. A first result is as follows.

Proposition 3.40 The set $\sigma_e(A) \setminus \{0\}$ of non-null eigenvalues of $A \in K(X)$ is formed of isolated points: if $(\lambda_n) \to \lambda$ with $\lambda_n \in \sigma_e(A) \setminus \{0\}$ for all $n$ and $\lambda_m \neq \lambda_n$ for $m \neq n$, then one has $\lambda = 0$.

Proof Let $(\lambda_n) \to \lambda$ with $\lambda_n \in \sigma_e(A) \setminus \{0\}$ for all $n$ and $\lambda_m \neq \lambda_n$ for $m \neq n$, and suppose that $\lambda \neq 0$. Let $e_n \in X \setminus \{0\}$ be such that $A e_n = \lambda_n e_n$ and let $E_n$ be the space spanned by $\{e_1, \dots, e_n\}$. We know that $E_n$ has dimension $n$ (Proposition 3.36), so that $E_{n-1} \neq E_n$. For all $n > 1$ Riesz's lemma (Proposition 3.1) provides some $u_n \in E_n$ such that $\|u_n\| = 1$ and $d(u_n, E_{n-1}) > \frac{1}{2}$. Since $(A - \lambda_n I)(E_n) \subset E_{n-1}$, for $k < n$ we have
$$\left\| \lambda_k^{-1} A u_k - \lambda_n^{-1} A u_n \right\| = \left\| \lambda_k^{-1}(A u_k - \lambda_k u_k) - \lambda_n^{-1}(A u_n - \lambda_n u_n) + u_k - u_n \right\| \ge d(u_n, E_{n-1}) \ge \frac{1}{2}.$$
Since $(A u_n)$ has a convergent subsequence and since $(\lambda_n) \to \lambda \neq 0$, the sequence $(\lambda_n^{-1} A u_n)$ has a convergent subsequence and we get a contradiction. □

Theorem 3.31 If $A \in K(X)$ then $\sigma_e(A) \setminus \{0\} = \sigma(A) \setminus \{0\}$. If $X$ is infinite dimensional one has $0 \in \sigma(A)$ and either $\sigma(A)$ is a finite set or there exists a sequence $(\lambda_n)$ with limit $0$ such that $\sigma(A) \setminus \{0\} = \{\lambda_n\}$.

Proof If $\lambda \in \mathbb{R} \setminus \sigma_e(A)$ and $\lambda \neq 0$, we have $N(\lambda^{-1} A - I) = N(A - \lambda I) = \{0\}$. Then, by Theorem 3.30, we have $R(A - \lambda I) = X$. By the Banach isomorphism theorem, $A - \lambda I$ is an isomorphism: $\lambda \in \rho(A) = \mathbb{R} \setminus \sigma(A)$. If $0 \notin \sigma(A)$, $A$ is invertible and $I = A \circ A^{-1}$ is compact. Then $B_X$ is compact, hence $X$ is finite dimensional. Thus $0 \in \sigma(A)$ if $X$ is infinite dimensional. Suppose $\sigma(A)$ is an infinite set. For every $\varepsilon > 0$ the set $\{\lambda \in \sigma(A) : |\lambda| \ge \varepsilon\}$ must be finite since it is compact by Proposition 3.35 and formed of isolated points by the preceding proposition and the relation $\sigma_e(A) \setminus \{0\} = \sigma(A) \setminus \{0\}$. Thus $\sigma(A) \setminus \{0\}$ is countable and ordering $\sigma(A) \setminus \{0\} = \{\lambda_n\}$ in such a way that $(|\lambda_n|)$ is decreasing, we get a sequence with limit $0$. □
Exercises

1. Given any sequence $(\lambda_n) \to 0$ in $\mathbb{R}$, show that there is a compact operator $A$ in $X := \ell_1$ with $\sigma(A) = \{\lambda_n : n \in \mathbb{N}\}$. [Hint: take $A(x) = (\lambda_0 x_0, \lambda_1 x_1, \dots)$ for $x = (x_0, x_1, \dots)$ and note that $A$ is the limit in $L(X)$ of a sequence of finite rank operators.]

2. Prove that any compact operator $A \in K(X, Y)$ is completely continuous in the sense that for any weakly convergent sequence $(x_n)$ in $X$ the sequence $(A(x_n))$ is strongly convergent in $Y$ (and its limit is $Ax$ if $x$ is the weak limit of $(x_n)$).

3. Let $X$ and $Y$ be Banach spaces, $X$ being reflexive, and let $A \in L(X, Y)$. Show that $A(B_X)$ is closed and that $A(B_X)$ is compact when $A$ is compact. Find a compact operator $A$ such that $A(B_X)$ is not compact. [Hint: take $X = Y = C(T)$ for $T := [0, 1]$ and set $(Ax)(t) := \int_0^t x(s)\,ds$.]

4. (J.-L. Lions) Let $(X, \|\cdot\|_X)$, $(Y, \|\cdot\|_Y)$ and $(Z, \|\cdot\|_Z)$ be three Banach spaces and let $A \in K(X, Y)$, $B \in L(Y, Z)$ with $B$ injective. Prove that for every $\varepsilon > 0$ there exists some $c_\varepsilon > 0$ such that for all $x \in X$ one has $\|Ax\|_Y \le \varepsilon \|x\|_X + c_\varepsilon \|B(Ax)\|_Z$. [Hint: suppose that for some $\varepsilon > 0$ and some sequence $(x_n)$ of $S_X$ one has $\|A x_n\|_Y \ge \varepsilon + n \|B(A x_n)\|_Z$ and obtain a contradiction with the injectivity of $B$.]

5. Apply the preceding to show that for every $\varepsilon > 0$ there exists some $c_\varepsilon > 0$ such that for all $x \in C^1(T)$, where $T$ is a compact interval, one has $\|x\|_\infty \le \varepsilon \|x'\|_\infty + c_\varepsilon \|x\|_1$. Can one interchange the assumptions on $A$ and $B$?

6. Let $(X, \|\cdot\|_X)$, $(Y, \|\cdot\|_Y)$ and $(Z, \|\cdot\|_Z)$ be three Banach spaces and let $A \in K(X, Y)$, $C \in L(X, Z)$ with $C$ injective. Suppose $X$ is reflexive. Prove that for every $\varepsilon > 0$ there exists some $c_\varepsilon > 0$ such that for all $x \in X$ one has $\|Ax\|_Y \le \varepsilon \|x\|_X + c_\varepsilon \|Cx\|_Z$. [Hint: use an argument similar to the one in the preceding exercise.]

7. Let $S$ be a compact space endowed with a regular Borel measure $\mu$. Given a continuous function $G : S \times S \to \mathbb{R}$, show that the operator $A : C(S) \to C(S)$ given by $(Ax)(s) := \int_S G(s, t) x(t)\,d\mu(t)$ is compact, $C(S)$ being equipped with the norm $\|\cdot\|_\infty$. Prove the same conclusion when $A$ is considered as an operator on $L_2(S)$.

8. In contrast with what occurs for operators of the form $I - A$ with $A$ compact, show that a continuous linear operator on an infinite dimensional Banach space may be injective and not surjective or surjective and not injective. [Hint: take the right shift and the left shift on a sequence space such as $\ell_2$, as defined in Exercise 11 below.]

9. Given two Banach spaces $X$ and $Y$, let $A \in L(X, Y)$ be a continuous linear operator such that $R(A)$ has finite codimension, i.e. $Y = R(A) + Z$ for a finite dimensional subspace $Z$ of $Y$. Show that $R(A)$ is closed. [Hint: apply the Open Mapping Theorem to the map $B : X \times Z \to Y$ given by $B(x, z) = Ax + z$.]

10. Given two Banach spaces $X$ and $Y$, a continuous linear operator $A \in L(X, Y)$ is said to be a Fredholm operator if $N(A)$ is finite dimensional and $R(A)$ has finite codimension. The index of $A$ in the set $F(X, Y)$ of Fredholm operators from $X$ into $Y$ is defined by
$$\operatorname{ind} A := \dim N(A) - \operatorname{codim} R(A).$$
Note that for any $K \in K(X, X)$ one has $A := I_X - K \in F(X, X)$ and that when $X$ and $Y$ are finite dimensional $F(X, Y) = L(X, Y)$ and that for any $A \in L(X, Y)$ one has $\operatorname{ind} A = \dim X - \dim Y$. Show that the set $F(X, Y)$ is an open subset of $L(X, Y)$ and that the map $\operatorname{ind} : F(X, Y) \to \mathbb{Z}$ is continuous, hence is locally constant. Prove that $A \in L(X, Y)$ belongs to $F(X, Y)$ if and only if there exists some $B \in L(Y, X)$ such that $A \circ B - I_Y$ and $B \circ A - I_X$ are compact operators if and only if there exists some $B \in L(Y, X)$ such that $A \circ B - I_Y$ and $B \circ A - I_X$ are finite rank operators. Prove that $A \in L(X, Y)$ belongs to $F(X, Y)$ if and only if $A^{\intercal}$ belongs to $F(Y^*, X^*)$ and then $\operatorname{ind} A^{\intercal} = -\operatorname{ind} A$. Show that for $A \in F(X, Y)$ and $B \in K(X, Y)$ one has $A + B \in F(X, Y)$ and $\operatorname{ind}(A + B) = \operatorname{ind} A$. Given another Banach space $Z$ and $B \in F(Y, Z)$, $A \in F(X, Y)$ show that $B \circ A \in F(X, Z)$ and that $\operatorname{ind}(B \circ A) = \operatorname{ind} A + \operatorname{ind} B$.

11. Let $X := \ell_2$ be the space of square summable sequences with its usual norm. Let $S_r : X \to X$ be the right shift defined by $S_r x := y$ with $y_0 = 0$, $y_n := x_{n-1}$ for $x := (x_n)$, $y := (y_n)$ and let $S_\ell : X \to X$ be the left shift defined by $S_\ell x = z$ with $z_n = x_{n+1}$ for $x := (x_n)$. Show that for $c \in \mathbb{R} \setminus \{-1, 1\}$ the maps $S_r - c I_X$ and $S_\ell - c I_X$ are Fredholm operators and compute their indexes. Note that this is not the case for $c \in \{-1, 1\}$.
3.6 Elementary Integration Theory In this section we deal with some classes of one-variable maps that are regular enough and for which an integration theory can be given in a simple way.
3.6.1 Regulated Functions and Their Integrals We start with functions that have simple discontinuities. Definition 3.4 A function f W T ! X from a compact interval T WD Œa; b of R into a real Banach space X is said to be regulated if, for all t 2 Œa; bŒ (resp. t 2a; b), f has a limit on the right f .tC / WD limr!t;r>t f .r/ (resp. on the left f .t / WD lims!t;s 0 such that fn is constant on Œt ı; tŒ\T and t; t C ı \ T. Thus for s, t in one of these two intervals we have kf .s/ f .t/k ". Since X is complete, this proves that the one-sided limits of f at t exist. Moreover, if fn is normalized, f is normalized. Proposition 3.42 The space R.T; X/ (resp. Rn .T; X/) of regulated functions (resp. normalized regulated functions) from T to X endowed with the norm kk1 is a Banach space. Proof This follows from the fact that R.T; X/ is the closure of the set S.T; X/ of t u order step functions in .B.T; X/; kk1 /, hence is complete. It is worth noting the following facts. Proposition 3.43 For any regulated function f W T ! X, the set f .T/ is relatively compact in X (i.e. cl. f .T// is compact). Moreover, the set of discontinuities of f is at most countable. Proof If f 2 R.T; X/ and if . fn / is a sequence in S.T; X/ that converges uniformly to f we see that f .T/ is precompact since it can be approximated by fn .T/, which is finite. The second assertion stems from the fact that the set D of discontinuities of f is the union of the sets Dn WD ft 2 T W kf .tC / f .t /k 1=ng for n 2 Nnf0g. t u The integral of a stair function f can be defined unambiguously as follows: if s0 D a < s1 < : : : < sk D b is such that f .t/ D ci for all t 2si1 ; si Œ for i D 1; : : :k, then Z
$$\int_T f := \int_a^b f(t)\,dt := \sum_{i=1}^{k} (s_i - s_{i-1}) c_i.$$
It is easy to show that this element of $X$ does not depend on the subdivision of $T$. Moreover, for any stair function $f$ from $T$ to $X$, the triangle inequality ensures that
$$\left\| \int_T f \right\| \le (b - a) \|f\|_\infty. \tag{3.27}$$
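The definition of the integral of a stair function, its independence from the subdivision and the estimate (3.27) can be illustrated as follows. This is only a sketch under assumptions not made in the text: $X = \mathbb{R}^2$ with the maximum norm, a sample stair function, and a hypothetical helper `stair_integral`; the values at the points $s_i$ themselves play no role.

```python
# Illustrative sketch (assumed example): integral of a stair function
# with values in R^2, given by its subdivision and its constant values.

def stair_integral(knots, values):
    """Return sum_i (s_i - s_{i-1}) * c_i.

    knots  : s_0 < s_1 < ... < s_k with s_0 = a, s_k = b
    values : c_1, ..., c_k, where c_i is the value on ]s_{i-1}, s_i[
    """
    dim = len(values[0])
    total = [0.0] * dim
    for s_prev, s_next, c in zip(knots, knots[1:], values):
        for j in range(dim):
            total[j] += (s_next - s_prev) * c[j]
    return total

# f = (1, -2) on ]0, 0.5[ and f = (3, 0) on ]0.5, 2[.
coarse = stair_integral([0.0, 0.5, 2.0], [(1.0, -2.0), (3.0, 0.0)])

# The same function described with one more subdivision point.
fine = stair_integral([0.0, 0.25, 0.5, 2.0],
                      [(1.0, -2.0), (1.0, -2.0), (3.0, 0.0)])

# Estimate (3.27) with the maximum norm on R^2: here b - a = 2 and the
# uniform norm of f on the open intervals is 3.
norm_of_integral = max(abs(x) for x in coarse)
bound = (2.0 - 0.0) * 3.0

print(coarse, fine, norm_of_integral, bound)
```

Both descriptions of the function give the same vector, and the norm of the integral stays below $(b - a)\|f\|_\infty$.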
Since the space $S(T, X)$ of stair functions is dense in the space $R(T, X)$, the map $f \mapsto \int_T f$ can be extended by continuity from $S(T, X)$ to $R(T, X)$:
$$\int_T f = \lim_n \int_T f_n \quad \text{if } f = \lim_n f_n.$$
Again, this map is linear, continuous and with norm $b - a$ as (3.27) remains valid for $f \in R(T, X)$. Moreover, given $a \le b \le c$ in $\mathbb{R}$, for all $f \in R([a, c], X)$ one has the Chasles' relation
$$\int_a^c f = \int_a^b f + \int_b^c f \tag{3.28}$$
which easily follows from the case of stair functions by a passage to the limit. The following composition property is crucial: using continuous linear forms $e^*$ on $X$, it enables us to determine the integral of a regulated function $f \in R(T, X)$ with the help of the integrals of the real-valued functions $e^* \circ f$ which determine $\int_T f$ uniquely.

Proposition 3.44 Given Banach spaces $X$ and $Y$ and $A \in L(X, Y)$, for every $f \in R(T, X)$ one has $A \circ f \in R(T, Y)$ and $\int_T A \circ f = A(\int_T f)$.

Proof The first assertion is a direct consequence of the definition. It can also be checked by taking a sequence $(f_n)$ in $S(T, X)$ which converges uniformly to $f$. Since the relation $\int_T A \circ f_n = A(\int_T f_n)$ is immediate, the second assertion follows from the definition of the integral of $A \circ f$ since $(A(\int_T f_n)) \to A(\int_T f)$, $A$ being continuous and $(\int_T f_n)$ converging to $\int_T f$. □

This result can be completed.

Proposition 3.45 Let $E$, $F$ be Banach spaces, let $U$ be an open subset of $E \times T$ and let $W := \{f \in R(T, E) : \mathrm{cl}(\{(f(t), t) : t \in T\}) \subset U\}$. Then $W$ is an open subset of $R(T, E)$. Moreover, if $g : U \to F$ is a continuous map, then the map $g^{\diamond} : W \to R(T, F)$ defined by $g^{\diamond}(f)(t) := g(f(t), t)$ is continuous.

We encourage the reader to first consider the simpler case of the subset $V$ of $W$ formed by those $f \in C(T, E)$ such that $(f(t), t) \in U$ for all $t \in T$, or even suppose $U := U_0 \times T$ where $U_0$ is an open subset of $E$.

Proof For any $f_0 \in W$ the map $t \mapsto (f_0(t), t)$ is regulated on $T$ with values in $E \times \mathbb{R}$. Thus the closure $K := \mathrm{cl}(\{(f_0(t), t) : t \in T\})$ of its image is compact. Thus
$$\delta := \mathrm{gap}(K, (E \times \mathbb{R}) \setminus U) := \inf\{\|k - z\| : k \in K, \ z \in (E \times \mathbb{R}) \setminus U\}$$
is positive. Thus, given $\delta' \in \, ]0, \delta[$ and $f \in R(T, E)$ satisfying $\|f - f_0\|_\infty \le \delta'$ one has $\{(f(t), t) : t \in T\} \subset B[K, \delta'] := \{z \in E \times \mathbb{R} : d(z, K) \le \delta'\} \subset U$ and since $B[K, \delta']$ is closed one gets $\mathrm{cl}(\{(f(t), t) : t \in T\}) \subset U$, i.e. $f \in W$. Thus $W$ is open. The definition of a regulated function shows that for any $f \in W$ the map $g^{\diamond}(f) : t \mapsto g(f(t), t)$ is in $R(T, F)$. Moreover, for $f_0 \in W$ and $K$ as above, $g$ is uniformly continuous around $K$ in the sense that for any $\varepsilon > 0$ one can find some $\delta > 0$ such that $\|g(z) - g(k)\| \le \varepsilon$ whenever $k \in K$ and $z \in B[k, \delta]$. Thus, given $f \in W$ satisfying $\|f - f_0\|_\infty \le \delta$, one has $\|g^{\diamond}(f) - g^{\diamond}(f_0)\|_\infty \le \varepsilon$. This proves that $g^{\diamond}$ is continuous on $W$. □

Corollary 3.34 Given Banach spaces $E$, $F$, $G$ and continuous maps $A : T \to L(E, F)$, $B : T \to L^2(E, F; G)$ the maps $A^{\diamond} : R(T, E) \to R(T, F)$ and $B^{\diamond} : R(T, E) \times R(T, F) \to R(T, G)$ given by
$$A^{\diamond}(f)(t) := A(t).f(t), \qquad B^{\diamond}(f, g)(t) := B(t)(f(t), g(t))$$
are linear continuous and bilinear continuous respectively. Moreover, $\|A^{\diamond}\| = \|A\|_\infty$ and $\|B^{\diamond}\| = \|B\|_\infty$. In fact, the conclusions are valid whenever $A$ (resp. $B$) is regulated.

Proof The linearity of $A^{\diamond}$ and the bilinearity of $B^{\diamond}$ are obvious. The fact that $A^{\diamond}$ and $B^{\diamond}$ are well-defined whenever $A$ and $B$ are regulated stems from the definition of a regulated map. The inequalities
$$\|A^{\diamond}(f)\|_\infty \le \sup_{t \in T} \|A(t)\| \cdot \sup_{t \in T} \|f(t)\|,$$
$$\|B^{\diamond}(f, g)\|_\infty \le \sup_{t \in T} \|B(t)\| \cdot \sup_{t \in T} \|f(t)\| \cdot \sup_{t \in T} \|g(t)\|$$
for $f \in R(T, E)$, $g \in R(T, F)$ are obvious and prove that $A^{\diamond}$ and $B^{\diamond}$ are continuous with $\|A^{\diamond}\| \le \|A\|_\infty$, $\|B^{\diamond}\| \le \|B\|_\infty$. The reverse inequalities (which are not as important) follow from appropriate choices of $f$ and $g$ and are left as exercises. □
3.6.2 *Functions of Bounded Variation and Integration

Let us turn to another class of functions that enables us to speak of the length of a curve. Here $T$ still stands for a compact interval $[a, b]$ but we can replace the range space $X$ or $E$ with a metric space. Given $f : T \to X$ and a subdivision $\sigma := (s_0, s_1, \dots, s_k)$ of $T$, i.e. an increasing finite sequence in $T$ such that $s_0 = a$, $s_k = b$, we set
$$V_\sigma(f) := \sum_{i=1}^{k} d(f(s_{i-1}), f(s_i)).$$

Definition 3.5 A map $f : T \to X$ is said to be a function of bounded variation or a function of finite variation if the supremum $V(f) := \sup_\sigma V_\sigma(f)$ over the set $\Sigma(T)$ of subdivisions of $T$ is finite. The number $V(f)$ is called the total variation of $f$. The set of functions of bounded variation from $T$ to $X$ is denoted by $BV(T, X)$.

If $f$ is Lipschitzian with rate $c$ on $T$, then for every $\sigma \in \Sigma(T)$ one has $V_\sigma(f) \le c(b - a)$, so that $f$ is a function of bounded variation. If $X = \mathbb{R}$ and if $f$ is nondecreasing, then for every $\sigma \in \Sigma(T)$ one has $V_\sigma(f) \le f(b) - f(a)$, so that $f$ belongs to $BV(T, X)$. Of course, the same inclusion holds if $f$ is nonincreasing.

Let us endow $\Sigma(T)$ with the order induced by inclusion of the sets of values. Then $\Sigma(T)$ is a directed set and since the triangle inequality shows that $V_\sigma(f) \le V_\tau(f)$ if $\sigma \subset \tau$, i.e. if $\tau$ is a refinement of $\sigma$, we see that $V(f)$ is the limit of the net $(V_\sigma(f))_{\sigma \in \Sigma(T)}$. When $f$ is continuous, $V(f)$ can be interpreted as the limit of $V_\sigma(f)$ as $\mu(\sigma) \to 0$, where $\mu : \Sigma(T) \to \mathbb{R}_+$ is the mesh defined by $\mu(\sigma) := \sup\{s_i - s_{i-1} : i \in \mathbb{N}_k\}$ if $\sigma := (s_0, \dots, s_k)$ with $s_0 = a < s_1 < \dots < s_k = b$.

Let $S$ be another compact interval of $\mathbb{R}$ and let $f : S \to X$, $g : T \to X$ be equivalent in the sense that there exists an increasing bijection $h : S \to T$ such that $f = g \circ h$. Then one has $V(f) = V(g)$: if $\sigma := (s_0, \dots, s_k) \in \Sigma(S)$, then for $\tau := h(\sigma) := (h(s_0), \dots, h(s_k)) \in \Sigma(T)$ one has $V_\sigma(f) = V_\tau(g) \le V(g)$, hence $V(f) \le V(g)$ and similarly $V(g) \le V(f)$. This remark allows us to define the length of an arc of $X$, an arc being defined as the equivalence class $\tilde{f}$ of an element $f$ of $BV(T, X)$, its length being defined by setting $\ell(\tilde{f}) := V(f)$.

Let us focus on the case when $X$ is a normed space, in particular on the case $X = \mathbb{R}$.

Proposition 3.46 If $X$ is a normed vector space, the set $BV(T, X)$ of functions of bounded variation from the interval $T$ to $X$ is a vector space and $V(\cdot)$ is a seminorm on $BV(T, X)$. Moreover, if $X$ is a Banach space, $BV(T, X)$ is a subset of $R(T, X)$.
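The sums attached to subdivisions, and the fact that they can only increase under refinement, can be observed on a simple real-valued example. This is a sketch under assumptions not made in the text: $X = \mathbb{R}$ with its usual distance, $T = [0, 1]$, sample functions chosen for illustration, and a hypothetical helper `variation`.

```python
# Illustrative sketch (assumed example): the sum attached to a subdivision
# for a real-valued function, and its behaviour under refinement.

def variation(f, subdivision):
    """Return sum_i |f(s_i) - f(s_{i-1})| for the given subdivision."""
    return sum(abs(f(t) - f(s))
               for s, t in zip(subdivision, subdivision[1:]))

def tent(t):
    """A continuous function on [0, 1] that decreases, then increases."""
    return abs(t - 0.5)

coarse = variation(tent, [0.0, 1.0])                      # misses the kink
refined = variation(tent, [0.0, 0.5, 1.0])                # sees the kink
finer = variation(tent, [0.0, 0.25, 0.5, 0.75, 1.0])      # nothing new

# For a nondecreasing function the sum telescopes to f(b) - f(a).
cube = variation(lambda t: t ** 3, [i / 10 for i in range(11)])

print(coarse, refined, finer, cube)
```

Here the coarse subdivision gives $0$ while every subdivision containing the point $0.5$ gives $1$, which is the total variation of this function on $[0, 1]$.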
Proof Let $f$, $g \in BV(T, X)$ and let $h := f + g$. For every $\sigma := (s_0, \dots, s_k) \in \Sigma(T)$ and $i \in \mathbb{N}_k$ we have
$$\|h(s_i) - h(s_{i-1})\| \le \|f(s_i) - f(s_{i-1})\| + \|g(s_i) - g(s_{i-1})\|,$$
hence $V_\sigma(h) \le V_\sigma(f) + V_\sigma(g)$, and $V(h) \le V(f) + V(g)$. The relation $V(rf) = |r| V(f)$ for $r \in \mathbb{R}$ and $f \in BV(T, X)$ is obvious.

Given $f \in BV(T, X)$ and $t \in T := [a, b]$, $t > a$, let $(t_n)$ be an increasing sequence in $T$ with limit $t$. Then $(f(t_n))$ must be a Cauchy sequence. Otherwise we can find $\varepsilon > 0$ and a subsequence $(t_{k(n)})_n$ of $(t_n)$ such that $\|f(t_{k(2n)}) - f(t_{k(2n+1)})\| \ge \varepsilon$. Then for $m \in \mathbb{N} \setminus \{0\}$, taking $\sigma := (a, t_{k(0)}, \dots, t_{k(2m+1)}, b)$ we get $V_\sigma(f) \ge \sum_{n=0}^{m} \|f(t_{k(2n)}) - f(t_{k(2n+1)})\| \ge m\varepsilon$, contradicting the fact that $V(f) < \infty$. Now, if $(t_n)$ and $(t_n')$ are increasing sequences with limit $t$, we can order the terms of $\{t_n : n \in \mathbb{N}\} \cup \{t_n' : n \in \mathbb{N}\}$ in such a way that $(t_n)$ and $(t_n')$ are subsequences of an increasing sequence $(t_n'')$. Thus, when $X$ is complete, $\lim_n f(t_n) = \lim_n f(t_n')$. We deduce from this observation that the left limit $f(t_-)$ exists. Similarly, we can show that the right limit $f(t_+)$ exists for all $t \in [a, b[$: $f$ is regulated. □

Theorem 3.32 The space $BV(T, \mathbb{R})$ coincides with the set of differences of two nondecreasing functions.

Proof Since $BV(T, \mathbb{R})$ is a vector space and contains the cone of nondecreasing functions, every difference of two nondecreasing functions belongs to $BV(T, \mathbb{R})$. Conversely, let $f \in BV(T, \mathbb{R})$. For $t \in T$, let $f_t$ be the restriction of $f$ to $T_t := [a, t]$ and let $g(t) := V(f_t)$. For $r < s$ in $T$ one has $g(r) \le g(s)$ and in fact $g(r) + |f(r) - f(s)| \le g(s)$. Thus, for $h := f + g$ we get
$$h(r) - h(s) = g(r) - g(s) + f(r) - f(s) \le -|f(r) - f(s)| + f(r) - f(s) \le 0.$$
Thus $h$ is nondecreasing and $f = h - g$. □
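The construction used in this proof has a discrete analogue that is easy to experiment with. The following sketch rests on assumptions not made in the text: the function is only known through samples on a grid, and the variation $g(t) = V(f_t)$ is replaced by the variation along that grid, which is in general only a lower bound for it; the helper name `jordan_decomposition` is hypothetical.

```python
# Illustrative sketch (assumed discrete analogue): writing sampled values
# of f as h - g with h and g nondecreasing along the grid.
import math

def jordan_decomposition(samples):
    """Return (h, g) with g the cumulated variation along the grid
    and h = f + g, so that f = h - g."""
    g = [0.0]
    for prev, cur in zip(samples, samples[1:]):
        g.append(g[-1] + abs(cur - prev))
    h = [fv + gv for fv, gv in zip(samples, g)]
    return h, g

grid = [i / 100 for i in range(101)]
f = [math.sin(2 * math.pi * t) for t in grid]   # sample function (assumption)
h, g = jordan_decomposition(f)

tol = 1e-12
g_monotone = all(g[i] <= g[i + 1] + tol for i in range(len(g) - 1))
h_monotone = all(h[i] <= h[i + 1] + tol for i in range(len(h) - 1))
recovery_gap = max(abs(fv - (hv - gv)) for fv, hv, gv in zip(f, h, g))

print(g_monotone, h_monotone, recovery_gap, g[-1])
```

For this sample the last value of `g` is close to $4$, the total variation of $t \mapsto \sin(2\pi t)$ on $[0, 1]$, because the grid contains the points where the function changes its direction of monotonicity.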
The integration of regulated functions presented in a previous subsection is a simplified approach to Riemann integration. In turn, the latter theory is a special case of the Riemann-Stieltjes integration theory as we intend to show. It is obtained by taking for integrator function $g$ the identity map $I_T$ given by $I_T(t) := t$. Let $g$ be a given element of the space $BV_n(T, \mathbb{R})$ of normalized functions of bounded variation on $T$; we do not mention it in our notation as it is fixed. Let us denote by $\Sigma'(T)$ the set of pairs $(\sigma, \rho)$ such that $\sigma := (s_0, \dots, s_k) \in \Sigma(T)$, $\rho := (r_0, \dots, r_{k-1})$ satisfy $s_{i-1} < r_{i-1} < s_i$ for $i \in \mathbb{N}_k$. We define a directed preorder on $\Sigma'(T)$ by setting $(\sigma, \rho) \le (\sigma', \rho')$ if $\sigma \subset \sigma'$ (i.e. if $\sigma'$ is obtained from $\sigma$ by adding some points). For $f : T \to X$ let us say that $f$ is integrable with respect to $g$ if the net $(S_{\sigma, \rho}(f))_{(\sigma, \rho) \in \Sigma'(T)}$ defined as follows converges: for $(\sigma, \rho) \in \Sigma'(T)$ with $\sigma := (s_0, \dots, s_k)$, $\rho := (r_0, \dots, r_{k-1})$
$$S_{\sigma, \rho}(f) := \sum_{i=1}^{k} f(r_{i-1})(g(s_i) - g(s_{i-1})).$$
Then the limit is denoted by $\int f\,dg$ and is called the Riemann-Stieltjes integral of $f$ with respect to $g$ and, if $g = I_T$, the Riemann integral of $f$. This integrability requirement may seem stringent; however it is satisfied for continuous maps and even for regulated maps as we intend to show.

Let $f$ be an element of the set $S(T, X)$ of stair functions: for some $\tau = (s_0, \dots, s_k) \in \Sigma(T)$, $f$ is constant with value $e_i$ on $]s_{i-1}, s_i[$. Then, for any $(\sigma, \rho) \in \Sigma'(T)$ such that $\tau \subset \sigma$, gathering terms, we see that
$$S_{\sigma, \rho}(f) = \sum_{i=1}^{k} (g(s_i) - g(s_{i-1})) e_i,$$
so that $(S_{\sigma, \rho}(f))_{(\sigma, \rho) \in \Sigma'(T)}$ converges to this sum. Moreover, in such a case we have
$$\left\| \int f\,dg \right\| \le \max_{1 \le i \le k} \|e_i\| \sum_{i=1}^{k} |g(s_i) - g(s_{i-1})| \le \|f\|_\infty V(g).$$
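The sums entering the definition of the Riemann-Stieltjes integral are straightforward to compute. The following is only a sketch under assumptions not made in the text: $X = \mathbb{R}$, $T = [0, 1]$, the continuous integrand $f(t) = t$, the integrator $g(t) = t^2$, uniform subdivisions with midpoints as intermediate points, and a hypothetical helper `rs_sum`. For these smooth data the integral equals $\int_0^1 t \cdot 2t\,dt = 2/3$.

```python
# Illustrative sketch (assumed example): Riemann-Stieltjes sums
# S(f) = sum_i f(r_{i-1}) * (g(s_i) - g(s_{i-1})) for real-valued f, g.

def rs_sum(f, g, knots, tags):
    """Sum attached to the subdivision `knots` and the intermediate
    points `tags`, where s_{i-1} < r_{i-1} < s_i."""
    return sum(f(r) * (g(t) - g(s))
               for s, t, r in zip(knots, knots[1:], tags))

def uniform_pair(n):
    """Uniform subdivision of [0, 1] with n intervals and their midpoints."""
    knots = [i / n for i in range(n + 1)]
    tags = [(s + t) / 2 for s, t in zip(knots, knots[1:])]
    return knots, tags

f = lambda t: t          # integrand (assumption)
g = lambda t: t * t      # integrator of bounded variation (assumption)

rough = rs_sum(f, g, *uniform_pair(10))
sharp = rs_sum(f, g, *uniform_pair(1000))

print(rough, sharp)
```

Both sums approach $2/3$, and the finer subdivision is closer to it, in accordance with the convergence of the net of sums for continuous integrands.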
Now, given $f \in R_n(T, X)$ and a sequence $(f_n)$ of $S_n(T, X)$ converging uniformly to $f$, the preceding inequality shows that for $n$, $p \in \mathbb{N}$ we have
$$\left\| \int f_n\,dg - \int f_p\,dg \right\| \le \|f_n - f_p\|_\infty V(g),$$
so that $(\int f_n\,dg)_n$ is a Cauchy sequence in $X$. We denote by $\int f\,dg$ its limit. Since two Cauchy sequences appear as subsequences of the Cauchy sequence obtained by alternating terms, the limit is independent of the choice of $(f_n)$. Moreover, the map $f \mapsto \int f\,dg$ is linear and continuous from $R(T, X)$ into $X$ and it satisfies
$$\left\| \int f\,dg \right\| \le \|f\|_\infty V(g).$$

Remark If $f \in R_n(T, X)$, i.e. if $f \in R(T, X)$ is normalized, the preceding construction can be simplified: one takes a sequence $(f_n)$ of normalized stair functions converging to $f$ and replaces pairs $(\sigma, \rho) \in \Sigma'(T)$ with pairs $(\sigma, \rho(\sigma))$ where for $\sigma := (s_0, \dots, s_k) \in \Sigma(T)$, $\rho(\sigma) := (s_0, \dots, s_{k-1})$. However, the Riemann-Stieltjes construction applies to a more general class of functions than $R_n(T, X)$. □

For $X := \mathbb{R}$, $f : T \to \mathbb{R}$ and $g$ nondecreasing, it is usual to use the integrability criterion that the Darboux sums
$$\overline{S}_\sigma(f) := \sum_{i=1}^{k} \overline{c}_{i-1}(g(s_i) - g(s_{i-1})), \qquad \underline{S}_\sigma(f) := \sum_{i=1}^{k} \underline{c}_{i-1}(g(s_i) - g(s_{i-1})),$$
where $\overline{c}_i := \sup\{f(t) : t \in \, ]s_i, s_{i+1}[\}$ and $\underline{c}_i := \inf\{f(t) : t \in \, ]s_i, s_{i+1}[\}$, are such that $(\overline{S}_\sigma(f) - \underline{S}_\sigma(f)) \to 0$. If $f$ is Riemann-Stieltjes integrable this criterion is satisfied since for any subdivision $\sigma$ the numbers $\overline{c}_i$ and $\underline{c}_i$ are arbitrarily close to actual values of $f$ on the interval $]s_i, s_{i+1}[$. Conversely, if $f$ is bounded the nets $(\overline{S}_\sigma(f))$ and $(\underline{S}_\sigma(f))$ belong to a compact interval of $\mathbb{R}$ and if $\tau \in \Sigma(T)$ is a refinement of $\sigma \in \Sigma(T)$, using refinements that consist in adding a single point and making successive steps, one can show that
$$\underline{S}_\sigma(f) \le \underline{S}_\tau(f) \le \overline{S}_\tau(f) \le \overline{S}_\sigma(f).$$
Theorem 1.6 shows that the increasing net $(\underline{S}_\sigma(f))$ converges and the decreasing net $(\overline{S}_\sigma(f))$ converges. Their limits are the same since $(\overline{S}_\sigma(f) - \underline{S}_\sigma(f)) \to 0$. Moreover, since for any $(\sigma, \rho) \in \Sigma'(T)$ one has $\underline{S}_\sigma(f) \le S_{\sigma, \rho}(f) \le \overline{S}_\sigma(f)$ the net $(S_{\sigma, \rho}(f))_{(\sigma, \rho) \in \Sigma'(T)}$ converges. We admit the following characterization (see [132, Thm 6.16] for instance).
Theorem 3.33 A bounded function $f : T \to \mathbb{R}$ on a compact interval $T$ is Riemann integrable if and only if there exists a subset $N$ of $T$ of Lebesgue measure $0$ such that $f$ is continuous at each point of $T \setminus N$.

As above, one can show that given Banach spaces $X$ and $Y$ and $A \in L(X, Y)$, for every $f \in R(T, X)$ (or even for every Riemann integrable $f$) one has $\int (A \circ f)\,dg = A(\int_T f\,dg)$. Moreover, one can devise an analogue of the Chasles' relation (3.28). Because the Riemann integral has poor properties in terms of convergence, we shall not continue its study. We just quote a comparison result with the Lebesgue integral to which we shall devote more attention, referring to [132, Thm 6.15] or [240, Thm 1.5 p. 57] for a proof.

Theorem 3.34 If a bounded function $f : T \to \mathbb{R}$ on a compact interval $T$ is Riemann integrable then it is Lebesgue integrable and its Riemann integral coincides with its Lebesgue integral.
Exercises

1. Verify that the function $f$ defined by $f(0) := 0$, $f(x) := x^2 \sin(1/x^2)$ for $x \in \mathbb{R} \setminus \{0\}$ is not of bounded variation on $T := [0, 1]$ although it has a derivative at each point of $T$.

2. Given $a < b < c$ in $\mathbb{R}$ and $v \in BV([a, c], X)$, show that $V_a^c(v) = V_a^b(v) + V_b^c(v)$ and that $s \mapsto V_a^s(v)$, the variation of $v$ on the interval $[a, s]$, is a nondecreasing function.

3. Verify that the function $f \mapsto \|f(a)\| + V(f)$ is a norm on the space $BV(T, X)$, where $T := [a, b]$ and $X$ is a normed space.

4. (A generalization of the Stieltjes integral) Given Banach spaces $X$, $Y$, $Z$, a continuous bilinear map $(x, y) \mapsto x \cdot y$ from $X \times Y$ to $Z$, a function $g \in BV(T, Y)$ for $T := [a, b]$ and a (right-) normalized stair function $f$ from $T$ to $X$, given $(\sigma, \rho) \in \Sigma'(T)$ with $\sigma := (s_0, \ldots, s_k) \in \Sigma(T)$, $\rho := (r_0, \ldots, r_{k-1})$, set
$$S_{\sigma,\rho}(f) := \sum_{i=1}^{k} f(r_{i-1}) \cdot (g(s_i) - g(s_{i-1})).$$
(a) Show that the net $(S_{\sigma,\rho}(f))_{(\sigma,\rho) \in \Sigma'(T)}$ converges. [Hint: consider first the case when $f$ is the step function $u_s e$ for $s \in\, ]a, b[$, $e \in X$ and use linearity.] Its limit is denoted by $\int_T f\,dg$.
(b) Show that $\int_T f\,dg$ does not depend on the decomposition of $f$. Verify that $\|\int_T f\,dg\| \le V(g)\,\|f\|_\infty$.
(c) Deduce from this inequality that the map $f \mapsto \int_T f\,dg$ can be extended to a continuous linear map from the space $R_n(T, X)$ of normalized regulated functions with values in $X$ into $Z$ satisfying the same inequality.
(d) Conversely, given a continuous linear form $f$ on the space $R_n(T) := R_n(T, \mathbb{R})$, for $s \in T$, let $g(s) := f(u_s)$, where $u_s$ is defined as above. Show that $g$ is of bounded variation on $T$ and that $V_a^b(g) \le \|f\|$.
(e) Deduce from the preceding a correspondence between the (topological) dual of the space $R_n(T)$ and the space $BV(T)$. [See [189].]

5. (Integration by parts) Prove the following equality for $f$, $g \in BV(T, \mathbb{R})$, $T := [a, b]$:
$$\int_a^b f\,dg = f(b)g(b) - f(a)g(a) - \int_a^b g\,df.$$

6. Let $f$ be the function on $[0, 1]$ given by $f(0) = 0$, $f(x) := \sin(1/x)$ for $x \in\, ]0, 1]$. Prove that $f$ is not regulated but is Riemann integrable.

7. Let $f$ be the function on $T := [0, 1]$ given by $f(x) = 0$ if $x \in T \setminus \mathbb{Q}$, $f(x) := 1/q$ for $x \in T \cap \mathbb{Q}$ with $x = p/q$, where $p, q \in \mathbb{N}$ have no common factor. Prove that $f$ is regulated.

8. Show that the function $f$ on $T := [0, 1]$ given by $f(x) = 0$ if $x \in T \setminus \mathbb{Q}$, $f(x) := 1$ for $x \in T \cap \mathbb{Q}$ is not Riemann integrable.

9. Deduce from the preceding exercise that a pointwise limit of a sequence of Riemann integrable functions is not necessarily Riemann integrable. [Hint: taking a sequence $(q_n)$ such that $\{q_n : n \in \mathbb{N}\} = T \cap \mathbb{Q}$, define $f_n$ on $T$ by $f_n(x) = 1$ if $x = q_k$ with $k \le n$, $f_n(x) = 0$ otherwise and observe that $f_n$ is a stair function and $(f_n) \to f$, where $f$ is as in the preceding exercise.]
3.6.3 Application: The Dual of C(T)
Let us use the Stieltjes integral to identify the dual of $(C(T), \|\cdot\|_\infty)$ when $T$ is a compact interval of $\mathbb{R}$.

Theorem 3.35 (Riesz) For any element $x^*$ of the dual of the space $X := C(T)$ of continuous functions on $T := [a, b]$ there exists some $g \in BV(T) := BV(T, \mathbb{R})$ such that $x^*(f) = \int f\,dg$ for all $f \in C(T)$.

Proof We have seen that for all $g \in BV_n(T)$ the map $f \mapsto \int f\,dg$ is a continuous linear form on $C(T)$. Let us show that any element $x^*$ in the dual of the space $C(T)$ is of this form. Using the Hahn-Banach Theorem, we pick some $x_R^* \in R(T)^*$ extending $x^*$ and satisfying $\|x_R^*\| = \|x^*\|$. For $s \in T$, $s \ne b$ we denote by $u_s$ the function $1_{[a,s[}$ defined by $1_{[a,s[}(t) := 1$ if $t \in [a, s[$, $0$ otherwise and we set $u_b = 1_{[a,b]}$. Let us show that the function $g : T \to \mathbb{R}$ defined by
$$g(s) := x_R^*(u_s), \qquad s \in T,$$
is of bounded variation. Let $\sigma := \{s_0 = a < s_1 < \ldots < s_n = b\} \in \Sigma(T)$. Let $\varepsilon_i \in \{-1, 1\}$ be such that $\varepsilon_i (g(s_i) - g(s_{i-1})) = |g(s_i) - g(s_{i-1})|$. Then, by linearity of $x_R^*$ we have
$$\sum_{i=1}^{n} |g(s_i) - g(s_{i-1})| = \sum_{i=1}^{n} \varepsilon_i (g(s_i) - g(s_{i-1})) = x_R^*\Big(\sum_{i=1}^{n} \varepsilon_i (u_{s_i} - u_{s_{i-1}})\Big),$$
hence, since $\|x_R^*\| = \|x^*\|$ and $\|\sum_{i=1}^{n} \varepsilon_i (u_{s_i} - u_{s_{i-1}})\|_\infty \le 1$,
$$\sum_{i=1}^{n} |g(s_i) - g(s_{i-1})| \le \|x^*\|.$$
Thus $g$ is of bounded variation and $V(g) \le \|x^*\|$. Now let us show that for all $f \in C(T)$ we have $x^*(f) = \int f\,dg$. Given $f \in C(T)$ and $\sigma := \{s_0 = a < s_1 < \ldots < s_n = b\} \in \Sigma(T)$ we set
$$f_\sigma := \sum_{i=1}^{n} (u_{s_i} - u_{s_{i-1}}) f(s_{i-1}),$$
so that $f_\sigma$ is constant on $[s_{i-1}, s_i[$ with value $f(s_{i-1})$ and
$$\int f_\sigma\,dg = \sum_{i=1}^{n} (g(s_i) - g(s_{i-1})) f(s_{i-1}) = \sum_{i=1}^{n} (x_R^*(u_{s_i}) - x_R^*(u_{s_{i-1}})) f(s_{i-1}) = x_R^*(f_\sigma).$$
Since $f$ is uniformly continuous, we have $(\|f - f_\sigma\|_\infty) \to 0$ as the mesh of $\sigma$ goes to $0$. Since both $x_R^*$ and $x \mapsto \int x\,dg$ are continuous on $R_n(T)$, we get $x^*(f) = \int f\,dg$. □
Riemann has shown us that proofs are better achieved through ideas than through long calculations. David Hilbert, 1897.
Abstract Hilbert spaces form a major class of normed spaces. They offer geometric properties that are similar to those of Euclidean spaces. In particular one can identify them with their duals and, given a nonempty closed convex subset of such a space, to every point of the space corresponds a closest point in the set. When the set is a linear subspace, this correspondence defines an orthogonal projection. Hilbert spaces also serve as models for important classes of function spaces. Since one can define Hilbert bases that generalize algebraic bases by using series, the study of Fourier series is set in such a framework.
Hilbert spaces form a special class of normed spaces of particular interest. They resemble Euclidean spaces as in them a notion of orthogonality can be defined. Moreover, the angle between two vectors can be given a meaning. Besides the interest in such generalizations of elementary geometry, they have nice properties in terms of duality, best approximation and form a convenient framework for the study of operators.
one has $h(x, x) \in \mathbb{R}$ for all $x \in X$. If $X$ is a real vector space, then a Hermitian form is just a bilinear symmetric form. If for all $x \in X \setminus \{0\}$ one has $h(x, x) > 0$ (resp. $h(x, x) \ge 0$) one says that $h$ is positive definite (resp. positive). Then $h$ is also called a scalar product. We use Dirac's notation $\langle x \mid y \rangle$ for $h(x, y)$ (pronounced bra for $\langle x$ and ket for $y \rangle$) which is widely used in physics and in quantum mechanics, but a variety of notations can be encountered: among them are the original one $[x \mid y]$ by H. Grassmann (1862), its variant $(x \mid y)$ by N. Bourbaki and the simplified notation $\langle x, y \rangle$ that may introduce some confusion with the coupling between $X$ and its dual. In general, the notation $x \cdot y$ is reserved for the scalar product of Euclidean spaces.

Example Let $X$ be the space of continuous functions on $[0, 1]$ with values in $\mathbb{C}$. For $x, y \in X$ let $h(x, y) := \int_0^1 \overline{x(t)}\,y(t)\,dt$. This Hermitian form is the prototype of a useful class of Hermitian forms over infinite dimensional spaces.

Example Let $I$ be an infinite set and let $X := \ell_2(I)$ (resp. $\ell_2(I, \mathbb{C})$) be the space of families $x := (x_i)_{i \in I}$ of real (resp. complex) numbers such that $(|x_i|^2)_{i \in I}$ is summable. Then, using the Cauchy-Schwarz inequality (4.1) in Euclidean spaces, one can show that for any $x := (x_i)_{i \in I}$, $y := (y_i)_{i \in I}$ the scalar product $\langle x \mid y \rangle := \sum_{i \in I} \overline{x_i}\,y_i$ is well defined. For $I := \mathbb{N}$ the space $\ell_2 := \ell_2(\mathbb{N})$ is a useful model. □

A vector $x \in X$ is said to be orthogonal to a family $Y$ of vectors in $X$ if for all $y \in Y$ one has $\langle x \mid y \rangle = 0$. Then one writes $x \perp Y$. Two subsets $Y$ and $Z$ of $X$ are said to be orthogonal if $z \perp Y$ for all $z \in Z$ and then one writes $Y \perp Z$.

Theorem 4.1 (Bunyakovsky-Cauchy-Schwarz) If $h$ is a positive Hermitian form on $X$, for every $x, y \in X$ one has
$$|h(x, y)|^2 \le h(x, x)\,h(y, y).$$

Proof Let $a := h(x, x)$, $b := h(y, y)$, $c := h(x, y)$. Changing $x$ into $\lambda x$, with $\lambda \in \mathbb{C}$ satisfying $|\lambda| = 1$ chosen so that $h(\lambda x, y) = |c|$, we may suppose $c \in \mathbb{R}_+$. The relation $0 \le h(x - y, x - y)$, i.e. $2c \le a + b$, yields $c \le 1 = ab$ when $a = 1$, $b = 1$, hence, by homogeneity, $|c|^2 \le ab$ when $a \ne 0$, $b \ne 0$. When $a = 0$, changing $x$ into $tx$ with $t \in \mathbb{R}_+$ we obtain the relation $2tc \le b$ and $c = 0$ by taking the limit as $t \to +\infty$. The case $b = 0$ is similar. □

Corollary 4.1 (Minkowski) If $h$ is a positive Hermitian form on $X$, the function $p : x \mapsto h(x, x)^{1/2}$ is a semi-norm (and a norm if $h$ is positive definite).

Proof Clearly, for all $\lambda \in \mathbb{C}$, $x \in X$ one has $p(\lambda x) = |\lambda|\,p(x)$. Now, for all $x, y \in X$, the (Bunyakovsky-) Cauchy-Schwarz inequality entails $h(x, y) + h(y, x) \le 2\,|h(x, y)| \le 2 p(x) p(y)$, hence
$$p(x + y)^2 = h(x + y, x + y) = h(x, x) + h(x, y) + h(y, x) + h(y, y) \le (p(x) + p(y))^2.$$
4.1 Hermitian Forms
Taking the square root of each side we get $p(x + y) \le p(x) + p(y)$. □

Fig. 4.1 The parallelogram law
Thus, when $h$ is positive definite, setting $\|x\| := p(x)$, the Cauchy-Schwarz inequality can be written as
$$\forall x, y \in X \qquad |\langle x \mid y \rangle| \le \|x\| \cdot \|y\|, \tag{4.1}$$
a well-known inequality for the scalar product in Euclidean spaces. Expanding the square of the norm, the classical parallelogram law (Fig. 4.1) can also be obtained:
$$\forall x, y \in X \qquad \|x + y\|^2 + \|x - y\|^2 = 2\,\|x\|^2 + 2\,\|y\|^2.$$
A similar expansion (made above) yields the famous Pythagoras' Theorem:

Theorem 4.2 (Pythagoras) If $x$ and $y$ are orthogonal, then $\|x + y\|^2 = \|x\|^2 + \|y\|^2$.

If $X$ is endowed with a positive definite Hermitian form, one says that $X$ is a pre-Hilbertian space or a pre-Hilbert space. If $X$ is complete with respect to the associated norm, one says that $X$ is a Hilbert space. The following exercises show that in a pre-Hilbertian space the scalar product is determined by the associated semi-norm.
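The three identities of this section can be checked numerically on $\mathbb{C}^n$. This is a minimal sketch under assumptions: the helper names `inner`, `norm`, `add`, `sub` are hypothetical, and the scalar product is taken semi-linear in its first argument and linear in the second, which is the convention suggested by the later use of $f_w : x \mapsto \langle w \mid x \rangle$ as a linear form.

```python
def inner(x, y):
    """Scalar product <x | y> on C^n: conjugate-linear in x, linear in y."""
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))


def norm(x):
    """Norm associated with the scalar product; <x | x> is real and >= 0."""
    return inner(x, x).real ** 0.5


def add(x, y):
    return [a + b for a, b in zip(x, y)]


def sub(x, y):
    return [a - b for a, b in zip(x, y)]


if __name__ == "__main__":
    x = [1 + 2j, -1j, 3]
    y = [2, 1 - 1j, 0.5j]
    # Cauchy-Schwarz inequality (4.1).
    print(abs(inner(x, y)) <= norm(x) * norm(y))
    # Parallelogram law: both sides agree up to rounding.
    print(norm(add(x, y)) ** 2 + norm(sub(x, y)) ** 2,
          2 * norm(x) ** 2 + 2 * norm(y) ** 2)
    # Pythagoras for the orthogonal pair u, v.
    u, v = [1, 1j], [1j, 1]
    print(inner(u, v), norm(add(u, v)) ** 2, norm(u) ** 2 + norm(v) ** 2)
```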
Exercises

1. Let $h$ be a Hermitian form on a real vector space. Show that
$$h(x, y) = \frac{1}{4}\,[h(x + y, x + y) - h(x - y, x - y)].$$

2. Let $h$ be a Hermitian form on a complex vector space. Show that
$$h(x, y) = \frac{1}{4}\,[h(x + y, x + y) - h(x - y, x - y) + i\,h(x + iy, x + iy) - i\,h(x - iy, x - iy)].$$

3. Let $X$ be a complex linear space and let $b : X \times X \to \mathbb{C}$ be a $\mathbb{C}$-bilinear form on $X$. Show that there exists some $x \in X$, $x \ne 0$ such that $b(x, x) = 0$.

4. Deduce from the preceding exercises that the restriction of a Hermitian form $h$ to a linear subspace $Z$ of $X$ is null if one has $h(z, z) = 0$ for all $z \in Z$.

5. Deduce from Exercises 1 and 2 that if the norm on a normed space $(X, \|\cdot\|)$ over $\mathbb{R}$ or $\mathbb{C}$ satisfies the parallelogram law, then it is the norm associated with a scalar product.

6. Let $(X, \|\cdot\|)$ be a normed space over $\mathbb{R}$ or $\mathbb{C}$. Suppose that for all two dimensional linear subspaces $Y$ of $X$ the induced norm is associated with a scalar product. Show that the norm of $X$ is associated with a scalar product.

7. Let us define a sesquilinear form on a normed space $X$ over $\mathbb{C}$ as a map $f : X \times X \to \mathbb{C}$ such that, for all $x \in X$, $f(x, \cdot)$ is $\mathbb{C}$-linear and continuous and for all $y \in X$, $f(\cdot, y)$ is semi-linear and continuous. Let $b := \sup\{|f(x, y)| : x, y \in S_X\}$ and $c := \sup\{|f(x, x)| : x \in S_X\}$. Show that $c \le b \le 4c$. Compare with the case when $f$ is Hermitian.

8. Show that when $h$ is a positive definite Hermitian form, the Cauchy-Schwarz inequality is an equality for some vectors $x, y \in X$ if $x$ and $y$ are linearly dependent.

9. Prove that when $h$ is a positive definite Hermitian form, Minkowski's inequality $\|x + y\| \le \|x\| + \|y\|$ is an equality if and only if there exists $r \in \mathbb{R}_+$ such that $y = rx$ or $x = ry$.

10. Show that the complexified space of a real pre-Hilbertian space is a complex pre-Hilbertian space. [Hint: the complexified space of a real vector space $X$ is $X \times X$ and, for $c := a + ib$ with $a, b \in \mathbb{R}$, $c(x, y) := (ax - by, bx + ay)$.]

11. Given two non-null vectors $u$, $v$ in a pre-Hilbertian space prove that
$$\left\| \frac{u}{\|u\|^2} - \frac{v}{\|v\|^2} \right\| = \frac{\|u - v\|}{\|u\| \cdot \|v\|}.$$
Give an interpretation in terms of surveying or astronomy: knowing the distance of the observer to two observed points and the angle under which they appear, one can compute their mutual distance.

12. Let $X$ be a pre-Hilbertian space and let $w$, $x$, $y$, $z \in X$. Prove the Ptolemy inequality:
$$\|w - y\| \cdot \|x - z\| \le \|w - x\| \cdot \|y - z\| + \|x - y\| \cdot \|w - z\|.$$
[Hint: Reduce the question to the case $w = 0$ and use the preceding exercise.]

13. Let $X$, $Y$ be two pre-Hilbertian spaces and let $A : X \to Y$ be isometric (i.e. such that $\|A(x) - A(x')\| = \|x - x'\|$ for all $x$, $x' \in X$) and such that $A(0) = 0$. Show that $A$ is linear.

14. Let $f : \mathbb{R} \to \mathbb{C}$ be a continuous function such that $\int_{\mathbb{R}} |f(t)|^2\,dt < +\infty$. Show that for all $r \in \mathbb{R}$ one has
$$\left| \int_{\mathbb{R}} f(t) f(t - r)\,dt \right| \le \int_{\mathbb{R}} |f(t)|^2\,dt.$$
4.2 Best Approximation

Given a closed convex subset $C$ of a pre-Hilbertian space $X$ and $w \in X \setminus C$, one may wonder whether there exists some $a \in C$ such that $\|a - w\| \le \|x - w\|$ for all $x \in C$. Such a point $a$ is called a best approximation of $w$ in $C$ or a projection of $w$ on $C$. It is unique, as shown by the parallelogram law: given two such points $a$, $a'$ one has
$$\|a - a'\|^2 = 2\,\|a - w\|^2 + 2\,\|a' - w\|^2 - 4\,\Big\|\tfrac{1}{2} a + \tfrac{1}{2} a' - w\Big\|^2 \le 0,$$
since $(a + a')/2 \in C$ and $\|(a + a')/2 - w\| \ge \inf_{x \in C} \|x - w\|$; hence $a = a'$.

It is easy to find the best approximation of $w$ when $C$ is a one-dimensional subspace $\mathbb{R}u$ or $\mathbb{C}u$ with $u \in X \setminus \{0\}$. It suffices to find $a \in C$ such that $w - a \perp C$: then, by Pythagoras' Theorem, for all $x \in C$ one has
$$\|w - a\|^2 \le \|w - a\|^2 + \|a - x\|^2 = \|w - x\|^2.$$
Such a point $a := \lambda u$ is obtained by taking $\lambda$ such that $\langle w - \lambda u \mid u \rangle = 0$, i.e. $\lambda = \|u\|^{-2} \langle w \mid u \rangle$. The number $\lambda$ is called the Fourier coefficient of $w$ with respect to $u$. A similar construction holds when $C$ is the vector space spanned by a finite family $(u_1, \ldots, u_k)$ of orthogonal vectors. Taking
$$a := \lambda_1 u_1 + \ldots + \lambda_k u_k \quad \text{with } \lambda_i := \|u_i\|^{-2} \langle w \mid u_i \rangle \text{ for } i = 1, \ldots, k$$
we get that $w - a \perp u_i$ for $i = 1, \ldots, k$ since $u_i \perp u_j$ for $j \ne i$, hence by linearity, $w - a \perp x$ for all $x \in C$ and again $\|w - a\|^2 \le \|w - x\|^2$ by Pythagoras' Theorem. A general existence result can be given.

Theorem 4.3 Let $C$ be a nonempty complete convex subset of a pre-Hilbertian space $X$ and let $w \in X \setminus C$. Then there exists some $a \in C$ such that $\|a - w\| = \inf_{x \in C} \|x - w\|$. This point, called the projection of $w$ on $C$, is unique. It is denoted by $P_C(w)$ in the sequel and is characterized by the inequality
$$\forall x \in C \qquad \operatorname{Re}\langle w - a \mid x - a \rangle \le 0. \tag{4.2}$$

Proof Let $(x_n)$ be a sequence in $C$ such that $(\|w - x_n\|) \to d := \inf_{x \in C} \|x - w\|$. Setting $\varepsilon_n := \|x_n - w\|^2 - d^2$, the parallelogram law shows that $(x_n)$ is a Cauchy sequence:
$$\tfrac{1}{2}\,\|x_n - x_p\|^2 = \|x_n - w\|^2 + \|x_p - w\|^2 - 2\,\Big\|\tfrac{1}{2}(x_n + x_p) - w\Big\|^2 \le d^2 + \varepsilon_n + d^2 + \varepsilon_p - 2d^2 = \varepsilon_n + \varepsilon_p.$$
Since $C$ is complete with respect to the induced metric, $(x_n)$ converges to some $a \in C$, and passing to the limit we get $\|w - a\| = d$. Let us prove (4.2). For all $x \in C$ and all $t \in\, ]0, 1]$ we have $x_t := (1 - t)a + tx \in C$, hence
$$\|w - a\|^2 \le \|w - x_t\|^2 = \|w - a\|^2 + t^2\,\|x - a\|^2 - 2t \operatorname{Re}\langle w - a \mid x - a \rangle,$$
$$0 \le t\,\|x - a\|^2 - 2 \operatorname{Re}\langle w - a \mid x - a \rangle$$
after simplification. Taking the limit as $t \to 0_+$, we get (4.2). Conversely, let us assume $a \in C$ satisfies (4.2). Then, for all $x \in C$, we have
$$\|w - x\|^2 = \|w - a\|^2 + \|a - x\|^2 - 2 \operatorname{Re}\langle w - a \mid x - a \rangle \ge \|w - a\|^2.$$
Therefore $a$ is a best approximation of $w$ in $C$. □
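The characterization (4.2) can be observed on a concrete closed convex set. The sketch below is an illustration under assumptions, not part of the text: $C$ is taken to be the closed ball of $\mathbb{R}^n$ centered at the origin, whose projection has the closed form $w \mapsto w$ if $\|w\| \le r$ and $w \mapsto r\,w/\|w\|$ otherwise, and the names `project_onto_ball` and `max_variational_gap` are hypothetical.

```python
import random


def dot(x, y):
    """Euclidean scalar product on R^n."""
    return sum(a * b for a, b in zip(x, y))


def project_onto_ball(w, radius=1.0):
    """Projection of w onto the closed ball of the given radius (centre 0)."""
    n = dot(w, w) ** 0.5
    if n <= radius:
        return list(w)
    return [radius * c / n for c in w]


def max_variational_gap(w, a, points):
    """Largest value of <w - a | x - a> over the given points x of C.

    By (4.2) this is <= 0 (up to rounding) when a is the projection of w.
    """
    residual = [wi - ai for wi, ai in zip(w, a)]
    return max(dot(residual, [xi - ai for xi, ai in zip(x, a)]) for x in points)


if __name__ == "__main__":
    rng = random.Random(0)
    w = [3.0, 4.0]
    a = project_onto_ball(w)
    # Sample points of C by projecting random points of the square onto C.
    sample = [project_onto_ball([rng.uniform(-1, 1), rng.uniform(-1, 1)])
              for _ in range(1000)]
    print(a, max_variational_gap(w, a, sample))
```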
Corollary 4.2 Let $C$ be a complete convex subset of a pre-Hilbertian space $X$. Then the map $P_C$ is Lipschitzian with rate $1$: for all $w, w' \in X$ one has $\|P_C(w) - P_C(w')\| \le \|w - w'\|$.

Proof Given $w, w' \in X$, let $a := P_C(w)$, $a' := P_C(w')$, $z := (w - w') - (a - a')$, so that
$$\|w - w'\|^2 = \|a - a'\|^2 + \|z\|^2 + 2 \operatorname{Re}\langle z \mid a - a' \rangle. \tag{4.3}$$
Relation (4.2) shows that
$$\operatorname{Re}\langle w - a \mid a' - a \rangle \le 0, \qquad \operatorname{Re}\langle a' - w' \mid a' - a \rangle \le 0,$$
hence, by addition, $\operatorname{Re}\langle z \mid a' - a \rangle = \operatorname{Re}\langle (w - a) + (a' - w') \mid a' - a \rangle \le 0$. Plugging this estimate into (4.3) we get $\|w - w'\|^2 \ge \|a - a'\|^2$. □

Corollary 4.3 Let $Y$ be a complete linear subspace of a pre-Hilbertian space $X$. Then the map $P_Y$ is linear and continuous. For all $x \in X$, $P_Y(x)$ is the unique point $y$ of $Y$ such that $x - y$ is orthogonal to $Y$. Thus $X$ is the topological direct sum of $Y$ and $Y^\perp$: $X = Y \oplus Y^\perp$ and $Y^\perp = \ker P_Y$, $Y^{\perp\perp} = Y$.

Proof The characterization of $y := P_Y(x)$ deduced from (4.2) can be written as
$$\forall y' \in Y \qquad \operatorname{Re}\langle x - y \mid y' - y \rangle \le 0.$$
For $z$ arbitrary in $Y$, taking successively $y + z$, $y - z$ (and $y + iz$, $y - iz$ if $X$ is a complex linear space) in place of $y'$, we get $\langle x - y \mid z \rangle = 0$. The linearity of $P_Y$ ensues. Writing $x = y + (x - y)$, we note that $X = Y + Y^\perp$. The sum is direct since for $u \in Y \cap Y^\perp$ we have $u = 0$. Since $P_Y$ is continuous, this sum is a topological direct sum.

If $x \in \ker P_Y$ we have $x = x - P_Y(x) \in Y^\perp$. Conversely, if $x \in Y^\perp$ we have $x = P_Y(x) + (x - P_Y(x)) = 0 + x$ and the uniqueness of the decomposition ensures that $P_Y(x) = 0$. The inclusion $Y \subset Y^{\perp\perp}$ is obvious; if $x \in Y^{\perp\perp}$ we have in particular $\langle x - P_Y(x) \mid x \rangle = 0$ and since $\langle x - P_Y(x) \mid P_Y(x) \rangle = 0$, we get $\langle x - P_Y(x) \mid x - P_Y(x) \rangle = 0$, hence $x = P_Y(x) \in Y$. □

Corollary 4.4 For any linear subspace $Y$ of a Hilbert space $X$ one has $Y^{\perp\perp} = \operatorname{cl}(Y)$.

Proof Since $Y^{\perp\perp}$ is closed and contains $Y$, one has $\operatorname{cl}(Y) \subset Y^{\perp\perp}$. On the other hand, since $Y \subset \operatorname{cl}(Y)$, one has $Y^{\perp\perp} \subset (\operatorname{cl}(Y))^{\perp\perp} = \operatorname{cl}(Y)$. Thus $Y^{\perp\perp} = \operatorname{cl}(Y)$. □
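The properties of $P_Y$ stated in Corollaries 4.2 and 4.3 can be checked on a finite dimensional example. This is a minimal sketch assuming a real Euclidean space and a subspace $Y$ given by an orthogonal spanning family, so that $P_Y(x) = \sum_i \|u_i\|^{-2} \langle u_i \mid x \rangle u_i$ as in the construction with Fourier coefficients above; the function name `project_onto_span` is hypothetical.

```python
def dot(x, y):
    """Euclidean scalar product on R^n."""
    return sum(a * b for a, b in zip(x, y))


def project_onto_span(x, orthogonal_family):
    """Orthogonal projection of x onto the span of pairwise orthogonal,
    non-null vectors, using the Fourier coefficients <u | x> / ||u||^2."""
    p = [0.0] * len(x)
    for u in orthogonal_family:
        coeff = dot(u, x) / dot(u, u)
        p = [pi + coeff * ui for pi, ui in zip(p, u)]
    return p


if __name__ == "__main__":
    family = [[1.0, 1.0, 0.0], [1.0, -1.0, 0.0]]  # spans the plane x3 = 0
    x = [3.0, 4.0, 5.0]
    y = project_onto_span(x, family)
    residual = [xi - yi for xi, yi in zip(x, y)]
    print(y)                                   # projection of x on Y
    print([dot(residual, u) for u in family])  # x - P_Y(x) is orthogonal to Y
    print(project_onto_span(y, family))        # P_Y is idempotent
```

The same helper can be used to observe the rate-1 Lipschitz property by comparing $\|P_Y(x) - P_Y(x')\|$ with $\|x - x'\|$ for two points.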
Exercises 1. Let C be a closed convex cone of a Hilbert space. Show that the projection x WD PC .w/ of w 2 X is characterized by w 2 C, Rehw x j xi D 0, so that one has kwk2 D kw xk2 C kxk2 . 2. Let C be a complete convex cone of a pre-Hilbertian space X. Show that the set C0 WD fx 2 X W 8y 2 C hx j yi 0g is a closed convex cone and that C00 WD .C0 /0 D C. 3. Let C and D be two closed convex cones of a real Hilbert space X such that D D C0 , hence C D D0 by the preceding exercise. Show that for all .x; y; z/ 2 X 3 the following two assertions are equivalent: z D x C y; x D PC .z/;
x 2 C; y 2 D; y D PD .z/.
x?y
4. Prove that the projection x WD PY .w/ of a point w of a pre-Hilbertian space X on a complete linear subspace Y of X is the unique point y 2 Y such that x y is orthogonal to Y. 5. Let Y and Z be two linear subspaces of a pre-Hilbertian space X and for a, b 2 X let A WD a C Y, B WD b C Z. Show that the following two assertions are equivalent: gap.A; B/ D ka bk for gap.A; B/ WD inffku vk W u 2 A; v 2 Bg a b is orthogonal to Y and to Z. 6. Let A and B be two nonempty complete convex subsets of a pre-Hilbertian space X, B being bounded. Show that there exists some .a; b/ 2 A B such that ka bk D gap.A; B/ WD inffku vk W u 2 A; v 2 Bg. 7. Let C be a convex subset of a closed affine subspace A of a Hilbert space X and let w 2 X, a WD PA .w/. Show that if a has a projection x in C, then x is also the projection of w in C. 8. Show that the conclusion of the preceding exercise is no longer true if A is replaced with a general closed convex subset of X.
9. Let C be the closure of the union of an increasing family .Cn / of complete convex subsets of a pre-Hilbertian space X. Let w 2 X be such that w has a best approximation x WD PC .w/ in C. Show that x D limn PCn .w/. [Hint: show that .PCn .w// is a Cauchy sequence.] 10. Let .Cn / be a decreasing sequence of complete convex subsets of a preHilbertian space X. For x 2 X, let d.x/ WD limn d.x; Cn /. Suppose that for some x 2 X one has d.x/ < C1. Show that d.x/ < C1 for all x 2 X. Prove that the diameter of Cn \ BŒx; d.x/ C " tends to 0 as .n; "/ ! .C1; 0/. Deduce from this that the intersection C of the Cn ’s is nonempty and that d.x/ D d.x; C/. 11. Let C be a bounded complete convex subset of a pre-Hilbertian space X and let f W C ! R be a lower semicontinuous convex function. Using Exercise 10 show that f attains its infimum on C. Prove the same conclusion when f is quasiconvex in the sense that for all r 2 R the set f 1 . 1; r/ is convex. 12. Let C be a complete convex subset of a real pre-Hilbertian space X. Using Theorem 4.3 show that C is the intersection of a family of closed affine halfspaces, i.e. of a family of subsets of the form Df ;r WD f 1 .1; r/ with f 2 X , r 2 R. If C is a complete convex cone of X, show that C is the intersection of a family of closed half-spaces, i.e. subsets of the form Df ;0 . 13. Let X be a Hilbert space, let a be the bilinear form associated with a continuous positive semidefinite symmetric linear map A W X ! X and for a nonempty subset T of X let V0 .T/ WD fx 2 X W a.x; x/ a.x t; x t/ 8t 2 Tg be the Voronoi cell of T with respect to the origin (taking Y WD X and f WD a in Exercise 8 of Sect. 2.3). Show that V0 .T/ D V0 .Tnf0g/ D V0 .cl.T// is a closed convex subset containing the origin and that V0 .T/ is polyhedral when T is finite. [Hint: show that V0 .T/ WD fx 2 X W 2hAt j xi hAt j ti 8t 2 Tg.] Suppose A is positive definite. Prove that 0 2 intV0 .T/ if and only if 0 … cl.T/. 
Prove that for any closed convex subset W containing the origin there exists some closed convex subset T of X such that V0 .T/ D W. [Hint: use the bipolar theorem and write T D co.f0g [ fy A1 y W y 2 W 0 nf0gg/ for some y 2 R.]
4.3 Orthogonal Families

The convenience of Cartesian coordinates in Euclidean spaces incites us to look for a similar device in Hilbert spaces. The use of orthogonal families will present such an analogy. However, some differences appear as finite families are not sufficient in infinite dimensional spaces.

A family $(b_i)_{i \in I}$ of elements of a pre-Hilbertian space is said to be orthogonal if for all $i, j \in I$ with $i \ne j$ one has $b_i \perp b_j$. It is said to be orthonormal if it is orthogonal and if for all $i \in I$ one has $\|b_i\| = 1$. Any orthogonal family of non-null vectors is linearly independent: if for some finite subset $J$ of $I$ and some family $(\lambda_j)_{j \in J}$ of numbers one has $\sum_{j \in J} \lambda_j b_j = 0$, then for all $k \in J$ one has $\lambda_k \|b_k\|^2 = \langle b_k \mid \sum_{j \in J} \lambda_j b_j \rangle = 0$, hence $\lambda_k = 0$. One can pass from an orthogonal family $(b_i)_{i \in I}$ of non-null vectors to an orthonormal family $(e_i)_{i \in I}$ by setting $e_i := b_i / \|b_i\|$. Let us describe a process that allows us to pass from a linearly independent family to an orthogonal family.

Proposition 4.1 (Gram-Schmidt) Let $(b_n)$ be a finite or countable family of linearly independent vectors of a pre-Hilbertian space $X$ and let $X_n$ be the linear subspace spanned by $b_1, \ldots, b_n$. Setting $a_1 := b_1$, $a_{n+1} := b_{n+1} - P_{X_n}(b_{n+1})$ for $n \ge 1$, one gets an orthogonal family such that $a_1, \ldots, a_n$ generate $X_n$ for all $n$.

Let us note that if the family $(b_n)$ is total in the sense that the union of the spaces $X_n$ is dense in $X$, then $(a_n)$ is total.

Proof Clearly $a_1$ generates $X_1$. Assume that $a_1, \ldots, a_n$ generate $X_n$. Then $a_{n+1} \in X_{n+1}$ and, on the other hand, any element of $X_{n+1}$ can be written as a linear combination of $a_1, \ldots, a_n$ and $b_{n+1} = a_{n+1} + P_{X_n}(b_{n+1})$, hence as a linear combination of $a_1, \ldots, a_n$ and $a_{n+1}$. Thus $a_1, \ldots, a_{n+1}$ generate $X_{n+1}$. It remains to show that $(a_n)$ is an orthogonal family. By Corollary 4.3, $a_{n+1}$ is orthogonal to $X_n$, so that $a_{n+1} \perp a_i$ for $i = 1, \ldots, n$. □

In practice, one determines $a_{n+1}$ by looking for coefficients $\lambda_1, \ldots, \lambda_n$ such that $a_{n+1} = b_{n+1} - \lambda_1 a_1 - \ldots - \lambda_n a_n$ satisfies $\langle a_j \mid a_{n+1} \rangle = 0$ for $j = 1, \ldots, n$, i.e. $\langle a_j \mid b_{n+1} \rangle - \lambda_j \|a_j\|^2 = 0$, so that $\lambda_j = \|a_j\|^{-2} \langle a_j \mid b_{n+1} \rangle$.

Proposition 4.2 (Bessel) Let $(e_i)_{i \in I}$ be an orthonormal family of a pre-Hilbertian space $X$ and for $x \in X$ let $x_i := \langle e_i \mid x \rangle$. Then the family $(|x_i|^2)_{i \in I}$ is summable and
$$\sum_{i \in I} |x_i|^2 \le \|x\|^2. \tag{4.4}$$
Proof We have to show that for any finite subset $J$ of $I$ we have $\sum_{j \in J} |x_j|^2 \le \|x\|^2$. Setting $x_J := \sum_{j \in J} x_j e_j$, this inequality stems from the relations
$$0 \le \|x - x_J\|^2 = \|x\|^2 - \sum_{j \in J} (\langle x \mid x_j e_j \rangle + \langle x_j e_j \mid x \rangle) + \|x_J\|^2 = \|x\|^2 - \sum_{j \in J} |x_j|^2 \tag{4.5}$$
since $\langle x \mid x_j e_j \rangle = x_j \overline{x_j} = |x_j|^2 = \langle x_j e_j \mid x \rangle$ and $\|x_J\|^2 = \sum_{j \in J} |x_j|^2$. □
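The practical form of the Gram-Schmidt process, with coefficients $\lambda_j = \|a_j\|^{-2} \langle a_j \mid b_{n+1} \rangle$, and Bessel's inequality (4.4) can be sketched as follows. The code assumes real vectors in $\mathbb{R}^n$ and a linearly independent input family; the names `gram_schmidt`, `normalize` and `bessel_sum` are hypothetical helpers introduced for the illustration.

```python
def dot(x, y):
    """Euclidean scalar product on R^n."""
    return sum(a * b for a, b in zip(x, y))


def gram_schmidt(vectors):
    """Turn a linearly independent family (b_n) into an orthogonal family (a_n)
    with a_{n+1} = b_{n+1} - sum_j lambda_j a_j, lambda_j = <a_j | b_{n+1}> / ||a_j||^2."""
    orthogonal = []
    for b in vectors:
        a = list(b)
        for prev in orthogonal:
            lam = dot(prev, b) / dot(prev, prev)
            a = [ai - lam * pi for ai, pi in zip(a, prev)]
        orthogonal.append(a)
    return orthogonal


def normalize(family):
    """Pass from an orthogonal family of non-null vectors to an orthonormal one."""
    return [[c / dot(v, v) ** 0.5 for c in v] for v in family]


def bessel_sum(x, orthonormal_family):
    """Sum of |<e_i | x>|^2 over the family; at most ||x||^2 by (4.4)."""
    return sum(dot(e, x) ** 2 for e in orthonormal_family)


if __name__ == "__main__":
    b = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
    e = normalize(gram_schmidt(b))
    x = [1.0, 2.0, 3.0]
    print(bessel_sum(x, e[:2]), dot(x, x))  # strict inequality for a non-total family
    print(bessel_sum(x, e), dot(x, x))      # equality when the family is total
```

With only two of the three orthonormal vectors the sum stays below $\|x\|^2$; with all three the family is total in $\mathbb{R}^3$ and the two quantities agree up to rounding.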
Theorem 4.4 (Parseval) For an orthonormal family $(e_i)_{i \in I}$ of a pre-Hilbertian space $X$ the following assertions are equivalent:
(a) the family $(e_i)_{i \in I}$ is total in $X$;
(b) for all $x \in X$, setting $x_i := \langle e_i \mid x \rangle$, the family $(|x_i|^2)$ is summable with sum $\sum_{i \in I} |x_i|^2 = \|x\|^2$;
(c) for all $x \in X$ the family $(x_i e_i)_{i \in I}$ is summable with sum $x$.

Proof (a)⇒(b) Let $x \in X$ and let $\varepsilon > 0$ be given. By assumption, there exists some finite subfamily $(e_j)_{j \in J}$ of $(e_i)_{i \in I}$ and some element $y$ of the space $X_J$ generated by $(e_j)_{j \in J}$ such that $\|x - y\| \le \varepsilon$. Let $z$ be the projection of $x$ on $X_J$. One has $\|x - z\| \le \|x - y\|$ and $x - \sum_{j \in J} x_j e_j$ is orthogonal to each $e_j$ by the definition of $x_j$, so that $z = \sum_{j \in J} x_j e_j$. Therefore, by (4.5),
$$0 \le \|x\|^2 - \sum_{j \in J} |x_j|^2 = \|x - z\|^2 \le \varepsilon^2.$$
This shows that the family $(|x_i|^2)_{i \in I}$ is summable, with sum $\|x\|^2$.

(b)⇒(c) We have to show that for any $\varepsilon > 0$ one can find a finite subset $J_\varepsilon$ of $I$ such that for all finite subsets $J$ of $I$ containing $J_\varepsilon$ one has $\|\sum_{j \in J} x_j e_j - x\| \le \varepsilon$. We take $J_\varepsilon$ such that $|\sum_{j \in J} |x_j|^2 - \|x\|^2| \le \varepsilon^2$ for any finite subset $J$ of $I$ containing $J_\varepsilon$. Since $\|\sum_{j \in J} x_j e_j - x\|^2 = \|x\|^2 - \sum_{j \in J} |x_j|^2$ by a computation of the preceding proof, this choice of $J_\varepsilon$ is suitable.

(c)⇒(a) This implication is obvious since any $x \in X$ can be approximated by a finite linear combination of the family $(e_i)_{i \in I}$. □

An orthonormal family $(e_i)_{i \in I}$ satisfying the assertions of the preceding theorem is called an orthonormal basis or a Hilbert basis. One has to recall that such a family is not an algebraic basis if it is infinite.

Proposition 4.3 An orthonormal family $(e_i)_{i \in I}$ of a Hilbert space $X$ is a Hilbert basis if, and only if, it is maximal in the set of orthonormal families with respect to set inclusion.

Proof Let $(e_i)_{i \in I}$ be a Hilbert basis of a pre-Hilbertian space. If $e$ is a vector orthogonal to all the $e_i$'s, then $e$ is orthogonal to the closed linear space generated by $(e_i)_{i \in I}$, i.e. $e$ is orthogonal to $X$, hence is $0$. This shows that there is no orthonormal family strictly containing $(e_i)_{i \in I}$: $(e_i)_{i \in I}$ is maximal. Conversely, let $(e_i)_{i \in I}$ be a maximal orthonormal family of a Hilbert space $X$. If the family $(e_i)_{i \in I}$ is not total, one can find a non-null vector $e$ orthogonal to the closed linear space generated by $(e_i)_{i \in I}$. Since we may suppose $\|e\| = 1$, adding the vector $e$ to $(e_i)_{i \in I}$, we get a strictly larger family, a contradiction. Thus $(e_i)_{i \in I}$ is total. □

Corollary 4.5 Any Hilbert space contains a Hilbert basis.

Proof This follows from Zorn's Lemma, the union of an increasing set of orthonormal families being an orthonormal family. □

Corollary 4.6 Any real (resp. complex) Hilbert space is isometric to some space $\ell_2(I)$ (resp. $\ell_2(I, \mathbb{C})$). Any real (resp. complex) separable Hilbert space is isometric to $\ell_2$ (resp. $\ell_2(\mathbb{N}, \mathbb{C})$).

Remark By the preceding corollary and the polarization identity, the assertions of Theorem 4.4 are equivalent to the following one:
(d) for all $x$, $y \in X$, setting $y_i := \langle e_i \mid y \rangle$, the family $(\overline{x_i}\,y_i)_{i \in I}$ is summable with sum $\langle x \mid y \rangle$.
Exercises

1. Let $X$ and $Y$ be two linear subspaces of a real pre-Hilbertian space $Z$. Suppose there exists some $c \in \mathbb{R}_+$ such that $|\langle x \mid y \rangle| = c\,\|x\| \cdot \|y\|$ for all $x \in X$, $y \in Y$. Show that either $X$ and $Y$ are one-dimensional or $c = 0$ (i.e. $X \perp Y$).

2. Let $(x_i)_{i \in \mathbb{N}_k}$ be a finite sequence in a pre-Hilbertian space $X$. The Gram determinant of $(x_i)_{i \in \mathbb{N}_k}$ is the determinant $G(x_1, \ldots, x_k) := \det(\langle x_i \mid x_j \rangle)$.
(a) Show that $G(x_1, \ldots, x_k) \in \mathbb{R}_+$ and that $G(x_1, \ldots, x_k) = 0$ if and only if the family $(x_i)_{i \in \mathbb{N}_k}$ is linearly dependent. [Hint: use an orthonormal basis of the linear subspace generated by $(x_i)_{i \in \mathbb{N}_k}$.]
(b) Suppose the family $(x_i)_{i \in \mathbb{N}_k}$ is linearly independent and let $Y$ be the linear subspace generated by $(x_i)_{i \in \mathbb{N}_k}$. Show that the distance $d(w, Y)$ of a point $w \in X$ to $Y$ is equal to $\sqrt{G(w, x_1, \ldots, x_k) / G(x_1, \ldots, x_k)}$. [Hint: write the projection $x$ of $w$ on $Y$ as a linear combination of the $x_i$'s.]

3. Let $X$ be the pre-Hilbertian space $C(T)$, where $T$ is a compact interval of $\mathbb{R}$, endowed with the scalar product $\langle \cdot \mid \cdot \rangle$ given by $\langle x \mid y \rangle = \int_T x(t) y(t)\,dt$. Given a linearly independent family $(x_1, \ldots, x_n)$ of elements of $X$, let $d_n \in C(T)$ be such that $d_n(t)$ is the determinant of the matrix whose $i$-th line is $(\langle x_i \mid x_1 \rangle, \ldots, \langle x_i \mid x_n \rangle)$ for $i = 1, \ldots, n - 1$ and whose $n$-th line is $(x_1(t), \ldots, x_n(t))$. Prove that $d_1, \ldots, d_n$ form a linearly independent family spanning the space $X_n$ generated by $(x_1, \ldots, x_n)$ and that $\langle d_n \mid d_n \rangle = G(x_1, \ldots, x_n)\,G(x_1, \ldots, x_{n-1})$ where $G(x_1, \ldots, x_k) := \det(\langle x_i \mid x_j \rangle)$.

4. Let $X$ and $Y$ be separable real Hilbert spaces with orthonormal bases $(e_n)_{n \in \mathbb{N}}$ and $(f_n)_{n \in \mathbb{N}}$ respectively. For $A \in L(X, Y)$ show that the following series have the same (possibly infinite) sum:
$$\sum_{m=0}^{\infty} \|A(e_m)\|^2, \qquad \sum_{m=0}^{\infty} \sum_{n=0}^{\infty} \langle A(e_m) \mid f_n \rangle^2, \qquad \sum_{n=0}^{\infty} \|A^*(f_n)\|^2.$$
When their sums are finite they are denoted by $\|A\|_{HS}^2$. Observe that $\|A\|_{HS}$ is independent of the choice of the basis $(e_m)_{m \in \mathbb{N}}$ and verify that $A \mapsto \|A\|_{HS}$ defines a norm on the space $HS(X, Y) := \{A \in L(X, Y) : \|A\|_{HS} < \infty\}$ called the space of Hilbert-Schmidt operators. Show that the norm $\|\cdot\|_{HS}$ is associated with the scalar product $\langle \cdot \mid \cdot \rangle_{HS}$ on $HS(X, Y)$ defined by
$$\langle A \mid B \rangle_{HS} := \sum_{m=1}^{\infty} \langle A(e_m) \mid B(e_m) \rangle.$$
Prove that $\|\cdot\| \le \|\cdot\|_{HS}$ and that $(HS(X, Y), \|\cdot\|_{HS})$ is complete.

5. Consider separable real Hilbert spaces $X$ and $Y$ with orthonormal bases $(e_n)_{n \in \mathbb{N}}$ and $(f_n)_{n \in \mathbb{N}}$ respectively. Given a bounded sequence $s := (s_n)_{n \in \mathbb{N}}$ of real numbers show that one defines a continuous linear map $A : X \to Y$ by setting $A(x) = \sum_{m=1}^{\infty} s_m \langle x \mid e_m \rangle f_m$. Prove that $A$ belongs to $HS(X, Y)$ as defined in the preceding exercise if and only if $s \in \ell_2$, i.e. $\|s\|_2^2 := \sum_{m=1}^{\infty} s_m^2 < \infty$ and that $\|A\|_{HS} = \|s\|_2$. Deduce from this relation that the norms $\|\cdot\|_{HS}$ and $\|\cdot\|$ on $HS(X, Y)$ are not equivalent.

6. Show that if $W$, $X$, $Y$, and $Z$ are separable infinite dimensional Hilbert spaces and if $B \in L(W, X)$, $C \in L(Y, Z)$, $A \in HS(X, Y)$ as defined in Exercise 4, then $A \circ B \in HS(W, Y)$, $C \circ A \in HS(X, Z)$ and $\|A \circ B\|_{HS} \le \|A\|_{HS} \cdot \|B\|$, $\|C \circ A\|_{HS} \le \|A\|_{HS} \cdot \|C\|$.
4.4 The Dual of a Hilbert Space

A remarkable property of real Hilbert spaces is that they can be identified with their dual spaces.

Theorem 4.5 (Riesz) If $X$ is a Hilbert space, for every $w \in X$ the function $f_w : x \mapsto \langle w \mid x \rangle$ is a continuous linear form with norm $\|w\|$. The map $w \mapsto f_w$ is an isometric semi-linear map from $X$ onto the dual $X^*$ of $X$.

Proof The linearity of $f_w$ is obvious and its continuity is a consequence of the Cauchy-Schwarz inequality: $|f_w(x)| \le \|w\| \cdot \|x\|$ for all $x \in X$, so that $\|f_w\| \le \|w\|$. Since $f_w(w) = \|w\|^2$, we get $\|f_w\| = \|w\|$. Clearly $w \mapsto f_w$ is semi-linear and injective. Let us show that this map is onto. Let $w^* \in X^*$. If $w^* = 0$ we have $w^* = f_w$ with $w = 0$. Assuming $w^* \ne 0$, let $Y := \ker w^*$. It is a closed linear subspace of $X$, hence a complete subspace. Corollary 4.3 ensures that $X = Y \oplus Y^\perp$. Since $Y \ne X$ as $w^* \ne 0$, we can pick some $b \in Y^\perp \setminus \{0\}$. Then we have $Y \subset \ker f_b$ and since both these subspaces are hyperplanes, we have $Y = \ker f_b$. It follows that there exists some scalar $\lambda$ such that $w^* = \lambda f_b$. Then $w^* = f_{\overline{\lambda} b}$. □

It is very convenient to identify the dual of a real Hilbert space $X$ with $X$ itself. In particular, to the derivative of a differentiable function $f$ at $x \in X$ corresponds a vector denoted by $\nabla f(x)$ and called the gradient of $f$ at $x$. It is characterized by
$$\forall y \in X \qquad \langle \nabla f(x) \mid y \rangle = \langle f'(x), y \rangle.$$

Corollary 4.7 Any Hilbert space is reflexive.

Proof The Riesz isometry $R : X \to X^*$ enables us to endow the dual $X^*$ of $X$ with a scalar product $\langle \cdot \mid \cdot \rangle_*$ obtained by setting for $x^*$, $y^* \in X^*$
$$\langle x^* \mid y^* \rangle_* := \langle R^{-1}(y^*) \mid R^{-1}(x^*) \rangle.$$
Thus, for all $x \in X$, taking $x^* = R(x)$ one gets $\langle x^* \mid y^* \rangle_* = \langle R^{-1}(y^*) \mid x \rangle$. The Riesz isometry $R_* : X^* \to X^{**}$ associated with the scalar product $\langle \cdot \mid \cdot \rangle_*$ on $X^*$ is defined by $\langle R_*(x^*), y^* \rangle = \langle x^* \mid y^* \rangle_*$. Thus, for $y := R^{-1}(y^*)$ one gets
$$\langle R_*(x^*), y^* \rangle = \langle x^* \mid y^* \rangle_* = \langle y \mid x \rangle = \langle Ry, x \rangle = \langle y^*, x \rangle.$$
This string of equalities proves that $R_*(R(x))$ coincides with the image of $x$ in the canonical injection $j : X \to X^{**}$. Since $R$ and $R_*$ are onto, $j = R_* \circ R$ is onto and $X$ is reflexive. □

Corollary 4.8 An element $x$ of a Hilbert space $X$ is the weak limit of a net $(x_i)_{i \in I}$ of $X$ if and only if for all $y \in X$ one has $\langle x \mid y \rangle = \lim_{i \in I} \langle x_i \mid y \rangle$.

Exercise Show that for a net $(x_i)_{i \in I}$ in a Hilbert space $X$ and $x \in X$ one has $(x_i)_{i \in I} \to x$ if and only if $x$ is the weak limit of $(x_i)_{i \in I}$ and $(\|x_i\|)_{i \in I} \to \|x\|$.

Given Hilbert spaces $X$ and $Y$ and an element $A$ of the space $L(X, Y)$ of continuous linear maps from $X$ into $Y$, the Riesz isomorphisms $R_X$ and $R_Y$ of $X$ and $Y$ respectively enable us to associate to the transpose $A^{\mathsf{T}} \in L(Y^*, X^*)$ of $A$ an element $A^*$ of $L(Y, X)$ called the adjoint of $A$. It is defined by $A^* := R_X^{-1} \circ A^{\mathsf{T}} \circ R_Y$ or through the relation
$$\forall x \in X,\ y \in Y \qquad \langle x \mid A^* y \rangle = \langle Ax \mid y \rangle.$$
We leave as an exercise the proof of the following proposition.

Proposition 4.4 The map $A \mapsto A^*$ satisfies the following properties:
$$(\alpha A)^* = \overline{\alpha} A^*, \quad (A + B)^* = A^* + B^*, \quad \|A^*\| = \|A\|, \quad (A \circ B)^* = B^* \circ A^*, \quad (A^*)^* = A, \quad \|A^* A\| = \|A\|^2.$$

An operator $A \in L(X, X)$ such that $A^* = A$ is called Hermitian or self-adjoint. If $X$ is a real Hilbert space, $A$ is also said to be symmetric (since then the bilinear form $(x, y) \mapsto \langle Ax \mid y \rangle$ is symmetric). Then $h_A$ given by $h_A(x, y) := \langle Ax \mid y \rangle$ is a Hermitian form on $X$.

Theorem 4.6 (Lax-Milgram) Let $X$ be a real Hilbert space and let $A \in L(X, X)$ be coercive in the sense that there exists some $c > 0$ such that $\langle Ax \mid x \rangle \ge c\,\|x\|^2$ for all $x \in X$. Then $A$ is an isomorphism from $X$ onto $X$. Conversely, if $A \in L(X, X)$ is an isomorphism and if $A$ is symmetric and such that $\langle Ax \mid x \rangle \ge 0$ for all $x \in X$, then $A$ is coercive.

Proof of the direct assertion For all $x \in X \setminus \{0\}$ we have $\|Ax\| = \sup_{u \in S_X} \langle Ax \mid u \rangle \ge \langle Ax \mid x / \|x\| \rangle \ge c\,\|x\|$. Thus $A$ is an isomorphism from $X$ onto $A(X)$. Thus $A(X)$ is complete, hence closed in $X$. On the other hand $A(X)$ is dense since $x = 0$ whenever $x \in A(X)^\perp$ as $c\,\|x\|^2 \le \langle Ax \mid x \rangle = 0$. Thus $A(X) = X$. □
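In finite dimension the adjoint introduced above is the conjugate transpose, and its defining relation $\langle x \mid A^* y \rangle = \langle Ax \mid y \rangle$ together with the rule $(A \circ B)^* = B^* \circ A^*$ can be verified directly. This is a minimal sketch on $\mathbb{C}^n$ with matrices stored as lists of rows; the helper names `inner`, `mat_vec`, `mat_mul` and `adjoint` are hypothetical, and the scalar product is assumed semi-linear in its first argument.

```python
def inner(x, y):
    """Scalar product <x | y> on C^n: conjugate-linear in x, linear in y."""
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))


def mat_vec(A, x):
    """Image of the vector x under the matrix A (list of rows)."""
    return [sum(a * c for a, c in zip(row, x)) for row in A]


def mat_mul(A, B):
    """Matrix of the composition A o B."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def adjoint(A):
    """Conjugate transpose of A, i.e. the adjoint for the standard scalar product."""
    return [[complex(A[i][j]).conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]


if __name__ == "__main__":
    A = [[1 + 1j, 2], [0, 3 - 1j]]
    B = [[2, 1j], [1 - 1j, 1]]
    x, y = [1, 1j], [2 - 1j, 1]
    # Defining relation of the adjoint: the two numbers agree.
    print(inner(x, mat_vec(adjoint(A), y)), inner(mat_vec(A, x), y))
    # (A o B)* and B* o A* have the same entries.
    print(adjoint(mat_mul(A, B)))
    print(mat_mul(adjoint(B), adjoint(A)))
```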
198
4 Hilbert Spaces
Proof in the case A is symmetric The norm $\|\cdot\|_A$ associated with the Hermitian form $h_A$ introduced above satisfies
$$\forall x \in X \qquad \|x\|_A := (h_A(x, x))^{1/2} \ge c^{1/2} \|x\|$$
and also $\|x\|_A \le \|A\|^{1/2} \|x\|$. Thus, $\|\cdot\|_A$ is equivalent to the norm associated with the scalar product $\langle \cdot \mid \cdot \rangle$ of $X$ and the dual of $X$ with respect to $\|\cdot\|_A$ coincides with $X^*$. The Riesz isomorphism theorem ensures that for all $w^* \in X^*$ there exists some $w \in X$ such that $\langle w^*, x \rangle = h_A(w, x)$ for all $x \in X$. Given $y \in X$ and taking $w^* := \langle y \mid \cdot \rangle$, we get $\langle y \mid x \rangle = h_A(w, x) = \langle Aw \mid x \rangle$ for all $x \in X$, hence $y = Aw$. Thus $A$ is onto. The injectivity of $A$ is immediate.

For the converse we introduce $a > 0$ such that $\|A^{-1} x\| \le a \|x\|$ for all $x \in X$. The Cauchy-Schwarz inequality yields
$$|\langle Ay \mid x \rangle|^2 = |\langle Ax \mid y \rangle|^2 \le \langle Ax \mid x \rangle \langle Ay \mid y \rangle \qquad \forall x, y \in X.$$
Taking $y := A^{-1} x$ we get $\|x\|^4 \le \langle Ax \mid x \rangle \|x\| \|y\| \le a \langle Ax \mid x \rangle \|x\|^2$, hence $\|x\|^2 \le a \langle Ax \mid x \rangle$ or $\langle Ax \mid x \rangle \ge a^{-1} \|x\|^2$. □

Thus, when $A$ is symmetric, the Lax-Milgram Theorem can be deduced from the Riesz representation theorem by introducing the new scalar product given by
$$\langle x \mid y \rangle_A := \langle Ax \mid y \rangle.$$
Also, when $A$ is symmetric, for all $b \in X$, setting $f(x) := \frac{1}{2} \langle Ax \mid x \rangle - \langle b \mid x \rangle$, we get a convex, continuous function $f$. Since it is coercive, it attains its infimum at some $x \in X$ characterized by $Ax = b$ since $\nabla f(x) = Ax - b$; again this proves the surjectivity of $A$. This link with optimization is important.
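The link with optimization can be illustrated in finite dimensions. The following is a minimal sketch, assuming $X = \mathbb{R}^2$ with its usual scalar product; the matrix `A`, the vector `b`, the step size and the iteration count are illustrative choices, not taken from the text. It runs a plain gradient descent on $f(x) = \frac{1}{2}\langle Ax \mid x\rangle - \langle b \mid x\rangle$ and checks that the minimizer satisfies $Ax = b$.

```python
# Gradient descent on f(x) = 1/2 <Ax|x> - <b|x> for a symmetric coercive matrix A.
# A, b, the step size and the iteration count are illustrative assumptions.

def matvec(A, x):
    """Return the product A x for a matrix given as a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

def gradient(A, b, x):
    """Gradient of f at x; equal to A x - b because A is symmetric."""
    return [ax_i - b_i for ax_i, b_i in zip(matvec(A, x), b)]

def minimize_quadratic(A, b, step, iterations):
    """Fixed-step gradient descent started at the origin."""
    x = [0.0] * len(b)
    for _ in range(iterations):
        g = gradient(A, b, x)
        x = [x_i - step * g_i for x_i, g_i in zip(x, g)]
    return x

A = [[2.0, 1.0], [1.0, 3.0]]   # symmetric with positive eigenvalues, hence coercive
b = [1.0, 2.0]
x_min = minimize_quadratic(A, b, step=0.2, iterations=500)
residual = max(abs(r) for r in gradient(A, b, x_min))   # size of A x - b at the end
```

For this choice the exact solution of $Ax = b$ is $(0.2, 0.6)$. A fixed step works here because it is smaller than $2/\|A\|$; in practice one would rather use a conjugate gradient method.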
Exercises

1. Using the map $w \mapsto f_w$ from a pre-Hilbertian space $X$ into $X^*$, with $f_w$ defined by $f_w(x) = \langle w \mid x \rangle$ for $x \in X$, show that $X$ is isometric to a dense linear subspace of $X^*$. Deduce from this fact that any pre-Hilbertian space is isometric to a dense linear subspace of a Hilbert space.

2. Let $X$ be a Hilbert space and let $P : X \to X$ be a map satisfying $P \circ P = P$ and $\langle P(w) \mid x \rangle = \langle w \mid P(x) \rangle$ for all $w, x \in X$. Show that $P$ is linear and continuous. Setting $Y := \{ y \in X : P(y) = y \}$ show that $P$ coincides with the projection operator $P_Y$ on $Y$.

3. Let $T := [0, 1]$ and let $X := C(T, \mathbb{C})$ be endowed with the Hermitian form defined by $\langle x \mid y \rangle = \int_0^1 \overline{x(t)}\, y(t)\, dt$. Given $a \in X$, let $A \in L(X, X)$ be given by $(Ax)(t) := a(t) x(t)$. Show that $A$ has an adjoint $A^* \in L(X, X)$ although $X$ is not complete. [Hint: Verify that $(A^* x)(t) := \overline{a(t)}\, x(t)$ for all $x \in X$, $t \in T$.] Compute $\|A\|$.

4. With the notation of the preceding exercise, let $K : T \times T \to \mathbb{C}$ be a continuous map and let $A \in L(X, X)$ be given by $(Ax)(t) := \int_0^1 K(s, t) x(s)\, ds$. Show that $A$ has an adjoint and give an explicit expression for it.

5. Let $X$ be a Hilbert space and let $A \in L(X, X)$. Prove that the following assertions are equivalent:
(a) $A^* A = I_X$, the identity map of $X$;
(b) $\langle Ax \mid Ay \rangle = \langle x \mid y \rangle$ for all $x, y \in X$;
(c) $\|Ax\| = \|x\|$ for all $x \in X$.
An operator satisfying these three properties is called unitary if $X$ is a complex space or orthogonal if $X$ is a real space.

6. Let $X$ be a pre-Hilbertian space and let $A \in L(X, X)$. One says that $A$ is a symmetry if $P := (1/2)(A + I_X)$ is a projection operator. Using Exercises 2 and 5 show that $A$ is a symmetry if and only if $A$ is both Hermitian and unitary.

7. Let $X$ be a Hilbert space and let $A \in L(X, X)$. Let $q : x \mapsto \frac{1}{2} b(x, x) := \frac{1}{2} \langle Ax \mid x \rangle$ be the quadratic form associated with $A$. Show that $q$ is coercive (in the sense that $q(x) \to \infty$ as $\|x\| \to \infty$) if, and only if $q$ is supercoercive (in the sense that $\liminf_{\|x\| \to \infty} q(x) / \|x\| > 0$) if, and only if $q$ is hypercoercive (in the sense that $\lim_{\|x\| \to \infty} q(x) / \|x\| = \infty$) if, and only if there exists some $c > 0$ such that $q(x) \ge c \|x\|^2$ for all $x \in X$.

8. Let $X$ be a Hilbert space, let $W$ be a closed linear subspace of $X$ and let $A \in L(X, X)$ be a coercive operator. Show that the operator $B \in L(W, W)$ given by $B := P_W \circ A \circ j_W$, where $j_W : W \to X$ is the canonical injection and $P_W : X \to W$ is the orthogonal projection onto $W$, is coercive and satisfies $b(w, w) = a(w, w)$ for all $w \in W$, where $b(w, w) := \langle Bw \mid w \rangle$, $a(x, x) := \langle Ax \mid x \rangle$ for $w \in W$, $x \in X$. Given $\ell \in X^*$, prove that there exists some $c > 0$ such that the solutions $u \in X$, $v \in W$ of the equations $Au = \ell$, $Bv = \ell|_W$ satisfy $\|u - v\| \le c\, d(u, W)$.

9. (Galerkin method) Let $X$ be a Hilbert space, let $(W_n)$ be a sequence of closed linear subspaces of $X$ and let $A \in L(X, X)$ be a coercive operator. The map $P_{W_n} \circ A \circ j_{W_n}$ is denoted by $A_n \in L(W_n, W_n)$ and, given $\ell \in X^*$, the solution of the equation $A_n w = \ell|_{W_n}$ (resp. $Ax = \ell$) is denoted by $u_n$ (resp. $u$). Let $Y$ be a dense linear subspace of $X$ such that $(d(y, W_n)) \to 0$ for all $y \in Y$. Show that $(\|u_n - u\|)_n \to 0$.

10. (Stampacchia's Theorem) Let $C$ be a nonempty closed convex subset of a Hilbert space $X$ and let $a$ be a continuous bilinear form on $X$ that is coercive in the sense that there exists some $c > 0$ such that $a(x, x) \ge c \|x\|^2$ for all $x \in X$. Prove that for all $f \in X^*$ there exists a unique $u \in C$ such that $a(u, x - u) \ge \langle f, x - u \rangle$ for all $x \in C$. [Hint: for $r := c / \|a\|^2$ consider the map $g : C \to C$ defined by $g(x) := P_C(rf - rAx + x)$, where $A \in L(X, X)$ is the operator associated with $a$ and $f$ is identified with an element of $X$ by the Riesz isomorphism, and show that $g$ is a contraction. Then the fixed point of $g$ is the solution to the above variational inequality.] Many unilateral (i.e. one-sided) problems can be studied with such a model.
11. Show that the Lax-Milgram Theorem is a consequence of Stampacchia's Theorem. [Hint: take $C = X$.]

12. Given separable infinite dimensional Hilbert spaces $X$ and $Y$ and elements $A$, $B$ of the space $HS(X, Y)$ of Hilbert-Schmidt operators from $X$ into $Y$, as defined in Exercise 4 of the preceding section, along with its scalar product, show that $\langle A^* \mid B^* \rangle_{HS} = \langle A \mid B \rangle_{HS}$, so that $\|A^*\|_{HS} = \|A\|_{HS}$.
4.5 Fourier Series

Let $X$ be the space of continuous complex-valued functions that are periodic with period 1 (in short, 1-periodic). It can be identified with the space $C(\mathbb{T}, \mathbb{C})$ of continuous complex-valued functions on the torus $\mathbb{T} := \mathbb{R}/\mathbb{Z}$. For $x, y \in X$ let
$$\langle x \mid y \rangle := \int_0^1 \overline{x(t)}\, y(t)\, dt.$$
This positive Hermitian form is positive definite since the relation $\langle x \mid x \rangle = 0$ implies that $x$ is null on $[0, 1]$ since $x$ is continuous, hence is null on $\mathbb{R}$ since $x$ is 1-periodic. The space $X$ is not complete with respect to the associated norm, but its completion is the classical Lebesgue space $L_2(\mathbb{T}, \mathbb{C})$.

Proposition 4.5 The family $(e_n)_{n \in \mathbb{Z}}$ given by $e_n(t) = e^{2\pi i n t}$ forms a Hilbert basis of $X := C(\mathbb{T}, \mathbb{C})$. For $x \in X$, $n \in \mathbb{Z}$, setting $x_n := c_n(x) := \langle e_n \mid x \rangle$, the series $\sum_{n=-\infty}^{+\infty} x_n e_n$ converges and its sum is $x$.

Proof The relations $\langle e_n \mid e_p \rangle = 0$ for $n, p \in \mathbb{Z}$, $n \ne p$ and $\langle e_n \mid e_n \rangle = 1$ are immediate. It is a consequence of the Stone-Weierstrass Theorem that any $f \in X$ is the limit for the norm $\|\cdot\|_\infty$ of a sequence of linear combinations of the $e_n$'s. Since one has $\langle x \mid x \rangle \le \|x\|_\infty^2$ for all $x \in X$, the family $(e_n)$ is total in $X$, hence forms a Hilbert basis of $X$. The second assertion is a consequence of Theorem 4.4. □

In the space of continuous real-valued 1-periodic functions endowed with the restriction of the preceding scalar product, one can show that the constant function 1 together with the family $(\sqrt{2} \sin 2\pi n t, \sqrt{2} \cos 2\pi n t)_{n \ge 1}$ forms an orthonormal family and that a result similar to Proposition 4.5 holds for the Fourier series associated with a continuous 1-periodic real-valued function.

Let us note that the convergence of the series holds for the norm $\|\cdot\|_2$ associated with the scalar product and not for the norm $\|\cdot\|_\infty$. One does not even have pointwise convergence in general.
The Fourier series of $x \in C(\mathbb{T}, \mathbb{C})$ is the series associated with the sequence $\dot{x} := (x_n)_{n \in \mathbb{Z}}$ given by
$$x_n := \langle e_n \mid x \rangle = \int_0^1 e^{-2\pi i n t} x(t)\, dt.$$
We observe that the Fourier series of $f \in L_1(\mathbb{T}, \mathbb{C})$ is related to the Fourier transform $\widehat{1_S f}$ of the function $1_S f$, with $S := [0, 1]$, via the relation $\dot{f} = \widehat{1_S f}\,|_{\mathbb{Z}}$. Here $1_S f$ denotes the function on $\mathbb{R}$ given by $1_S f(t) = f(t)$ for $t \in S$, $1_S f(t) = 0$ for $t \in \mathbb{R} \setminus S$, and the Fourier transform of $g \in R(\mathbb{R}, \mathbb{C})$ is given by
$$\hat{g}(y) := \int_{\mathbb{R}} e^{-2\pi i x y} g(x)\, dx.$$
If $f : \mathbb{R} \to \mathbb{C}$ is a periodic continuous function with period $T$, one defines the Fourier coefficients of $f$ (with respect to the functions $t \mapsto (1/T) e^{2\pi i n t / T}$) as the Fourier coefficients of the 1-periodic function $x$ given by $x(t) := f(Tt)$:
$$c_n(f; T) = \frac{1}{T} \int_0^T e^{-2\pi i n s / T} f(s)\, ds.$$
In particular, if $f$ is $2\pi$-periodic, then $c_n(f; 2\pi) = \frac{1}{2\pi} \int_0^{2\pi} e^{-i n s} f(s)\, ds$. We denote by $S_n(f; T)$, or for short $S_n(f)$ when $T = 1$, the trigonometric polynomial defined by
$$S_n(f; T)(s) := \sum_{k=-n}^{n} c_k(f; T)\, e^{2\pi i k s / T}, \qquad s \in \mathbb{R}.$$
When $f$ is real, one has $c_{-n}(f; T) = \overline{c_n(f; T)}$ and it is of interest to gather terms and consider the trigonometric polynomial
$$S_n(f; T)(s) := c_0(f; T) + \sum_{k=1}^{n} a_k(f; T) \cos(2\pi k s / T) + \sum_{k=1}^{n} b_k(f; T) \sin(2\pi k s / T), \qquad s \in \mathbb{R},$$
with
$$a_k(f; T) := c_k(f; T) + c_{-k}(f; T) = \frac{2}{T} \int_0^T f(s) \cos(2\pi k s / T)\, ds \in \mathbb{R},$$
$$b_k(f; T) := i\,(c_k(f; T) - c_{-k}(f; T)) = \frac{2}{T} \int_0^T f(s) \sin(2\pi k s / T)\, ds \in \mathbb{R}.$$
Theorem 4.4 shows that for all $f \in C(\mathbb{T}, \mathbb{C})$ or $L_2(\mathbb{T}, \mathbb{C})$ one has $\dot{f} \in \ell_2(\mathbb{Z}, \mathbb{C})$. More precisely, one has the following result.
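Theorem 4.7 (Riemann-Lebesgue-Parseval) For all $f \in L_2(\mathbb{T}, \mathbb{C})$ one has $\left( \int_0^1 |S_n(f)(t) - f(t)|^2\, dt \right)_n \to 0$ and $(c_n(f)) \to 0$. Moreover
$$\sum_{n=-\infty}^{+\infty} |c_n(f)|^2 = \int_0^1 |f(t)|^2\, dt.$$
Conversely, one can associate to every $(c_n) \in \ell_2(\mathbb{Z}, \mathbb{C})$ an element $f$ of $L_2(\mathbb{T}, \mathbb{C})$ such that $c_n(f) = c_n$ by setting $f(t) := \sum_{n \in \mathbb{Z}} c_n e^{2\pi i n t}$ for $t \in \mathbb{T}$. With a stronger assumption one gets a more regular function.

Theorem 4.8 (Fourier's Inversion Formula) For every sequence $c := (c_n)_{n \in \mathbb{Z}}$ in $\ell_1(\mathbb{Z}, \mathbb{C}) := \{ (c_n)_{n \in \mathbb{Z}} : \sum_{n \in \mathbb{Z}} |c_n| < +\infty \}$ the series $\sum_{n \in \mathbb{Z}} c_n e^{2\pi i n t}$ converges uniformly to some $x \in C(\mathbb{T}, \mathbb{C})$.

Here, given a sequence $(z_n)_{n \in \mathbb{Z}} \in \mathbb{C}^{\mathbb{Z}}$, the series $\sum_{n \in \mathbb{Z}} z_n$ is said to converge if the series $\sum_{n \ge 0} z_n$ and $\sum_{n \ge 1} z_{-n}$ converge and its sum is the sum of the two sums. A similar convention holds for the uniform or pointwise convergence of a series of functions.

Proof For all $n \in \mathbb{Z}$ and all $t \in \mathbb{R}$ one has $|e^{2\pi i n t}| = 1$, so that $\sum_{n \in \mathbb{Z}} c_n e^{2\pi i n t}$ converges uniformly whenever $\sum_{n \in \mathbb{Z}} |c_n|$ converges. Since the trigonometric polynomial functions $\sum_{n=-k}^{k} c_n e^{2\pi i n t}$ are 1-periodic continuous functions, the sum of the series $\sum_{n \in \mathbb{Z}} c_n e^{2\pi i n t}$ is a 1-periodic continuous function. □

The growth property of the Fourier series of $x$ reflects the regularity of $x$ and the Fourier series of the derivative of $x$ is obtained by a term-by-term differentiation of the Fourier series of $x$.

Proposition 4.6 For any $x$ in the space $C^1(\mathbb{T}, \mathbb{C})$ of 1-periodic continuously differentiable functions, the Fourier series of $x'$ is the sequence $(2\pi i n x_n)$, where $(x_n)$ is the sequence of Fourier coefficients of $x$, and $\sup_{n \in \mathbb{Z}} |n x_n| < \infty$. Conversely, if $(c_n) \in \mathbb{C}^{\mathbb{Z}}$ is such that $\sum_{n \in \mathbb{Z}} |n| |c_n| < +\infty$, then the function $t \mapsto \sum_{n \in \mathbb{Z}} c_n e^{2\pi i n t}$ belongs to the space $C^1(\mathbb{T}, \mathbb{C})$.

Proof An integration by parts shows that
$$x'_n = \int_0^1 e^{-2\pi i n t} x'(t)\, dt = \left[ e^{-2\pi i n t} x(t) \right]_0^1 + 2\pi i n \int_0^1 e^{-2\pi i n t} x(t)\, dt = 2\pi i n x_n,$$
the bracket being null since $x$ is 1-periodic. In particular $2\pi |n x_n| = |x'_n| \le \|x'\|_\infty$. The converse assertion is a consequence of Theorem 5.6 below since the series $2\pi i \sum_{n \in \mathbb{Z}} n c_n e^{2\pi i n t}$ converges uniformly. □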
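Parseval's identity and the rule $c_n(x') = 2\pi i n\, c_n(x)$ can be checked numerically. The sketch below is only an illustration: the test function `x` is a hypothetical trigonometric polynomial, its derivative `dx` is written by hand, and the integrals over $[0, 1]$ are approximated by Riemann sums on a uniform grid (which happen to be exact, up to rounding, for trigonometric polynomials of low degree).

```python
import cmath
import math

SAMPLES = 2048  # number of grid points on [0, 1); an arbitrary choice

def fourier_coefficient(x, n, samples=SAMPLES):
    """Riemann-sum approximation of c_n(x), the integral of exp(-2 pi i n t) x(t) over [0, 1]."""
    total = 0j
    for j in range(samples):
        t = j / samples
        total += cmath.exp(-2j * math.pi * n * t) * x(t)
    return total / samples

def x(t):
    """Hypothetical test function: 3 e_1 + (1 - 2i) e_{-2}."""
    return 3 * cmath.exp(2j * math.pi * t) + (1 - 2j) * cmath.exp(-4j * math.pi * t)

def dx(t):
    """Derivative of x, computed by hand."""
    return (3 * 2j * math.pi * cmath.exp(2j * math.pi * t)
            + (1 - 2j) * (-4j * math.pi) * cmath.exp(-4j * math.pi * t))

coeffs = {n: fourier_coefficient(x, n) for n in range(-5, 6)}

# Parseval: the sum of |c_n|^2 against the integral of |x|^2 over [0, 1].
parseval_lhs = sum(abs(c) ** 2 for c in coeffs.values())
parseval_rhs = sum(abs(x(j / SAMPLES)) ** 2 for j in range(SAMPLES)) / SAMPLES

# Proposition 4.6: c_n(x') should equal 2 pi i n c_n(x).
derivative_gap = max(
    abs(fourier_coefficient(dx, n) - 2j * math.pi * n * coeffs[n])
    for n in range(-5, 6)
)
```

Here both sides of Parseval's identity equal $|3|^2 + |1 - 2i|^2 = 14$, and `derivative_gap` is of the order of rounding errors.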
Example Let $f : \mathbb{R} \to \mathbb{R}$ be the $2\pi$-periodic sawtooth function given by $f(s) = s$ for $s \in [-\pi, \pi[$. An integration by parts gives $c_n(f; 2\pi) = (-1)^{n+1} / (in)$ for $n \ne 0$ and $c_0(f; 2\pi) = 0$. Thus, the Fourier series of $f$ is
$$\sum_{n \in \mathbb{Z} \setminus \{0\}} \frac{(-1)^{n+1}}{in}\, e^{ins} = 2 \sum_{n=1}^{\infty} (-1)^{n+1} \frac{\sin ns}{n}.$$
It can be shown that this series converges to $f(s)$ for $s \in \mathbb{R} \setminus (2\mathbb{Z} + 1)\pi$. However, for $s = (2k + 1)\pi$ with $k \in \mathbb{Z}$, the sum of this series is 0, i.e. $(1/2)(f(s_-) + f(s_+))$, a general fact, as stated below. Then, Parseval's identity yields
$$\frac{1}{\pi} \int_{-\pi}^{\pi} s^2\, ds = \sum_{n=1}^{\infty} |b_n(f; 2\pi)|^2 = 4 \sum_{n=1}^{\infty} \frac{1}{n^2}.$$
We recover the result $\sum_{n \ge 1} (1/n^2) = \pi^2 / 6$ first established by Euler.

Example Considering the $2\pi$-periodic function $f : \mathbb{R} \to \mathbb{R}$ given by $f(s) = s^2$ for $s \in [-\pi, \pi[$, we invite the reader to prove another identity due to Euler: $\sum_{n \ge 1} (1/n^4) = \pi^4 / 90$.

Example The $2\pi$-periodic function $f : \mathbb{R} \to \mathbb{R}$ given by $f(s) = \pi/2 - s/2$ for $s \in\, ]0, 2\pi[$ and $f(0) = f(2\pi) = 0$ is odd, so that its coefficients $a_n(f; 2\pi)$ are null, whereas an integration by parts shows that $b_n(f) := b_n(f; 2\pi) = 1/n$.

Example The Dirichlet kernel is the trigonometric polynomial $D_n$ defined by
$$D_n(s) := \sum_{k=-n}^{n} e^{iks}.$$
It is of crucial importance in the study of Fourier series since for any $2\pi$-periodic integrable function $f$ one has
$$S_n(f; 2\pi)(s) := \frac{1}{2\pi} \sum_{k=-n}^{n} e^{iks} \int_0^{2\pi} f(t) e^{-ikt}\, dt = \frac{1}{2\pi} \int_0^{2\pi} f(t) D_n(s - t)\, dt.$$
Using the relation $e^{iks} + e^{-iks} = 2 \cos ks$ we see that $D_n(s) = 1 + 2 \sum_{k=1}^{n} \cos ks$. Moreover, considering the geometric progressions $\sum_{k=0}^{n} c^k$ and $\sum_{k=-n}^{-1} c^k$ with $c := e^{is}$, whose sums are $(1 - c^{n+1})/(1 - c)$ and $(c^{-n} - 1)/(1 - c)$ respectively for $s \in \mathbb{R} \setminus (2\pi\mathbb{Z})$, and writing
$$\frac{1 - c^{n+1}}{1 - c} + \frac{c^{-n} - 1}{1 - c} = \frac{c^{-n-1/2} - c^{n+1/2}}{c^{-1/2} - c^{1/2}},$$
we get
$$D_n(s) = \frac{\sin((2n + 1)s/2)}{\sin(s/2)}.$$
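The examples above lend themselves to a quick numerical check. The following sketch, in which the sample points and truncation orders are arbitrary choices, compares the defining sum of the Dirichlet kernel with its closed form, evaluates partial sums of the sawtooth series, and approximates Euler's sum $\sum 1/n^2$.

```python
import cmath
import math

def dirichlet_sum(n, s):
    """D_n(s) as the sum of exp(i k s) for k = -n, ..., n (real by symmetry)."""
    return sum(cmath.exp(1j * k * s) for k in range(-n, n + 1)).real

def dirichlet_closed(n, s):
    """Closed form of D_n(s); valid only when s is not a multiple of 2 pi."""
    return math.sin((2 * n + 1) * s / 2) / math.sin(s / 2)

def sawtooth_partial_sum(n, s):
    """Partial sum 2 * sum_{k=1}^{n} (-1)^(k+1) sin(k s) / k of the sawtooth series."""
    return 2 * sum((-1) ** (k + 1) * math.sin(k * s) / k for k in range(1, n + 1))

# Largest discrepancy between the two expressions of D_n on a few sample points.
kernel_gap = max(
    abs(dirichlet_sum(n, s) - dirichlet_closed(n, s))
    for n in (1, 5, 20)
    for s in (0.3, 1.0, 2.5, -1.7)
)

# Partial sum of the series whose value pi^2 / 6 follows from Parseval's identity.
basel_partial = sum(1 / k ** 2 for k in range(1, 100001))
```

The partial sums of the sawtooth series approach $f(s) = s$ slowly inside $]-\pi, \pi[$ (the error is of order $1/n$) and vanish at $s = \pi$, in agreement with the value $(1/2)(f(s_-) + f(s_+)) = 0$.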
The question of convergence of Fourier series is a delicate and important subject which is beyond the scope of the book. In [239, pp. 83–87] one can find an example of a continuous periodic function whose Fourier series diverges. For the reader's information we quote some positive results. The last one is a recent and deep theorem.

Proposition 4.7 For all $f \in L_1(\mathbb{T}, \mathbb{C})$ its Fourier series converges to $f$ in Cesàro's sense at every point of continuity $t$ of $f$: $\tilde{S}_n(f)(t) := (1/n)[S_0(f)(t) + \ldots + S_{n-1}(f)(t)]$ converges to $f(t)$ as $n \to \infty$. If $f$ is continuous then $(\tilde{S}_n(f))_n \to f$ uniformly.

Proposition 4.8 Let $f \in C(\mathbb{R}, \mathbb{C})$ be 1-periodic and stable at some $\bar{t} \in \mathbb{R}$ in the sense that there exists some $c > 0$ such that $|f(t) - f(\bar{t})| \le c\, |t - \bar{t}|$ for $t$ near $\bar{t}$. Then the Fourier series of $f$ converges to $f(\bar{t})$ at $\bar{t}$.

Proposition 4.9 Let $f$ be 1-periodic and regulated. Then for all $t \in \mathbb{R}$ at which the Fourier series of $f$ converges, it converges to $(1/2)(f(t_-) + f(t_+))$.

Theorem 4.9 (Dirichlet-Jordan) Let $f \in C(\mathbb{R}, \mathbb{C})$ be 1-periodic and with bounded variation on $[0, 1]$. Then, for all $t \in \mathbb{R}$ the Fourier series of $f$ pointwise converges to $t \mapsto (1/2)(f(t_-) + f(t_+))$. In particular, the Fourier series of $f$ pointwise converges to $f$ on the set of continuity points of $f$.

Theorem 4.10 (Carleson [65]) For every $f \in L_2(\mathbb{T}, \mathbb{C})$ its Fourier series converges to $f$ almost everywhere.
4.5.1 Application: The Dirichlet Problem for the Disk

The Dirichlet problem for the open unit disc $\Omega := B(0, 1)$ in the Euclidean plane is to solve the steady-state heat equation
$$\Delta u := \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \quad \text{in } \Omega, \qquad u|_{\partial\Omega} = f,$$
where $\partial\Omega$ is the boundary of $\Omega$, i.e. the unit circle, and $f$ is a given function on $\partial\Omega$. The geometry of the problem leads us to use polar coordinates $(r, \theta)$, so that, writing $u(r, \theta)$ instead of $u(r \cos\theta, r \sin\theta)$ by an abuse of notation,
$$\Delta u = \frac{\partial^2 u}{\partial r^2} + \frac{1}{r} \frac{\partial u}{\partial r} + \frac{1}{r^2} \frac{\partial^2 u}{\partial \theta^2}.$$
Writing $r^2 \Delta u = 0$ in the form
$$r^2 \frac{\partial^2 u}{\partial r^2} + r \frac{\partial u}{\partial r} = -\frac{\partial^2 u}{\partial \theta^2},$$
we look for solutions $(r, \theta) \mapsto u(r, \theta)$ with separable variables: $u(r, \theta) := v(r) w(\theta)$. We require that $v(1) = 1$, so that the boundary condition becomes $w = f$. Dividing both sides of the preceding equation by $v(r) w(\theta)$ we get
$$\frac{r^2 v''(r) + r v'(r)}{v(r)} = -\frac{w''(\theta)}{w(\theta)}.$$
Since the two sides depend on independent variables, they must be the same constant. We call it $\lambda$ and we get two equations:
$$w''(\theta) + \lambda w(\theta) = 0 \tag{4.6}$$
$$r^2 v''(r) + r v'(r) = \lambda v(r). \tag{4.7}$$
Since $w$ must be periodic, we look for solutions of the first equation of the form
$$w_n(\theta) = a_n \cos n\theta + b_n \sin n\theta$$
with $n^2 = \lambda$, $n \in \mathbb{N}$, $a_n, b_n \in \mathbb{R}$. In view of the linearity of the problem Fourier suggested assuming that $f$ is of the form
$$f(\theta) = \sum_{n \in \mathbb{N}} a_n \cos n\theta + \sum_{n \in \mathbb{N}} b_n \sin n\theta. \tag{4.8}$$
In fact, assuming $f$ is $2\pi$-periodic and of class $C^2$, Proposition 4.6 ensures that $f$ can be expanded in a uniformly convergent Fourier series as in the right-hand side of (4.8), with $\sum_{n \ge 0} n^2 (|a_n| + |b_n|) < +\infty$. Now we look for solutions $v_n$ of (4.7) with $\lambda = n^2 > 0$ by setting $v(r) = r^n z(r)$ for $r \in\, ]0, 1]$, with $z(1) = 1$. This leads to the equation $r z''(r) + (2n + 1) z'(r) = 0$. We discard the solutions of the form $r \mapsto c r^{-n}$ that are unbounded around 0 and we keep the solution $z = 1$, so that $v_n$ is given by $v_n(r) := r^n$, $n \in \mathbb{N} \setminus \{0, 1\}$, in order that $u$ be of class $C^2$, and we can take
$$u(r, \theta) = \sum_{n \in \mathbb{N}} v_n(r) w_n(\theta) = \sum_{n \in \mathbb{N}} r^n (a_n \cos n\theta + b_n \sin n\theta).$$
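The series solution can be evaluated directly once the coefficients $a_n$, $b_n$ of the boundary datum are known. The sketch below is a hedged illustration: the boundary datum $f(\theta) = \cos\theta + 0.5 \sin 3\theta$ is a hypothetical choice with finitely many nonzero coefficients, and harmonicity is only tested through a five-point finite-difference Laplacian at one interior point.

```python
import math

def solve_disk(a, b, r, theta):
    """Evaluate u(r, theta) = sum_n r^n (a_n cos(n theta) + b_n sin(n theta))."""
    return sum(
        r ** n * (a_n * math.cos(n * theta) + b_n * math.sin(n * theta))
        for n, (a_n, b_n) in enumerate(zip(a, b))
    )

def u_cartesian(a, b, x, y):
    """Same function in Cartesian coordinates, via r = |(x, y)| and theta = atan2(y, x)."""
    return solve_disk(a, b, math.hypot(x, y), math.atan2(y, x))

def discrete_laplacian(a, b, x, y, h=1e-3):
    """Five-point finite-difference approximation of the Laplacian of u at (x, y)."""
    center = u_cartesian(a, b, x, y)
    return (
        u_cartesian(a, b, x + h, y) + u_cartesian(a, b, x - h, y)
        + u_cartesian(a, b, x, y + h) + u_cartesian(a, b, x, y - h)
        - 4 * center
    ) / h ** 2

# Hypothetical boundary datum f(theta) = cos(theta) + 0.5 sin(3 theta).
a = [0.0, 1.0, 0.0, 0.0]
b = [0.0, 0.0, 0.0, 0.5]

value = u_cartesian(a, b, 0.3, 0.4)            # u at an interior point
laplacian = discrete_laplacian(a, b, 0.3, 0.4)  # should be close to 0
boundary = solve_disk(a, b, 1.0, 0.7)           # should reproduce f(0.7)
```

For this datum the solution is the harmonic polynomial $x + 0.5(3x^2 y - y^3)$, so `value` is $0.322$ up to rounding.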
4.5.2 Application: Dido's Problem

Dido's problem is the most famous example of a so-called isoperimetric problem. It consists in determining a curve in $\mathbb{R}^2$ with a given length enclosing a figure of greatest area. It is connected with the legend of the foundation of Carthage, as told by Virgil, an instance of the guile of colonialists (or women, depending on your views).

Let $z := (x, y) \in C^1([0, 2\pi], \mathbb{R}^2)$ be a simple closed curve of $\mathbb{R}^2$. This means that $z(2\pi) = z(0)$ and that for any pair $s, t \in [0, 2\pi[$ one has $z(s) \ne z(t)$ whenever $s \ne t$. In fact, we identify two curves $w$ and $z$ if there exists an increasing function $h : [a, b] \to [0, 2\pi]$ of class $C^1$ such that $h(a) = 0$, $h(b) = 2\pi$ and $w(s) = z(h(s))$ for all $s \in [a, b]$. The length of $z$ is given by
$$\ell(z) := \int_0^{2\pi} (x'(t)^2 + y'(t)^2)^{1/2}\, dt$$
and does not change if $z$ is reparameterized into $z \circ h$ as above. Thus we assume that $z$ is parameterized by arc length, i.e. that $x'(t)^2 + y'(t)^2 = 1$ for all $t \in [0, 2\pi]$, so that $\ell(z) = 2\pi$. It can be shown (Jordan's theorem) that $\mathbb{R}^2 \setminus z([0, 2\pi])$ has two connected components, one bounded and one unbounded. Let $\Omega$ be the bounded one, called the region enclosed by $z$. Green's formula asserts that the area $a$ of $\Omega$ is given by
$$a := \frac{1}{2} \int_0^{2\pi} (x(t) y'(t) - y(t) x'(t))\, dt = \frac{1}{2i} \int_0^{2\pi} \overline{z(t)}\, z'(t)\, dt = \frac{1}{2i} \int_0^1 \overline{w(s)}\, w'(s)\, ds,$$
considering $z$ as a complex-valued function and setting $w(s) := z(2\pi s)$ for $s \in [0, 1]$. Here the real part $x x' + y y'$ of $\overline{z} z'$ has a null integral over $[0, 2\pi]$, taking into account the relations $x^2(0) = x^2(2\pi)$ and $y^2(0) = y^2(2\pi)$. Using Parseval's equality, since $c_n(w') = 2\pi i n\, c_n(w)$, we get
$$a = \frac{1}{2i} \sum_{n \in \mathbb{Z}} \overline{c_n(w)}\, c_n(w') = \pi \sum_{n \in \mathbb{Z}} n\, |c_n(w)|^2.$$
Since $|z'| = 1$ and since $n^2 \ge n$ for all $n \in \mathbb{Z}$ we obtain
$$\ell^2(z) = 2\pi \int_0^{2\pi} |z'(t)|^2\, dt = \int_0^1 |w'(s)|^2\, ds = \sum_{n \in \mathbb{Z}} |c_n(w')|^2 = 4\pi^2 \sum_{n \in \mathbb{Z}} n^2 |c_n(w)|^2 \ge 4\pi a,$$
with strict inequality if $c_n(w) \ne 0$ for at least one $n \in \mathbb{Z} \setminus \{-1, 0, 1\}$. Equality holds for $w(s) = e^{2\pi i s}$, either by considering the Fourier coefficients of $w$ or by noting that in such a case one has $\ell(z) = 2\pi$ and $a = \pi$. Thus, the circle is the solution of Dido's problem. The above proof is due to Hurwitz (1902).
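The isoperimetric inequality $\ell^2 \ge 4\pi a$ can be observed numerically. The sketch below approximates a closed curve by an inscribed polygon, computing the length by summing chord lengths and the area by the discrete analogue of $\frac{1}{2}\int (x y' - y x')\, dt$; the ellipse with semi-axes 2 and 1 and the number of sample points are hypothetical choices for illustration.

```python
import math

def length_and_area(curve, samples=20000):
    """Length and enclosed area of the polygon through `samples` points of a closed curve.

    `curve` maps t in [0, 2 pi] to a point (x(t), y(t)).
    """
    points = [curve(2 * math.pi * j / samples) for j in range(samples)]
    length = 0.0
    twice_area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        length += math.hypot(x1 - x0, y1 - y0)
        twice_area += x0 * y1 - x1 * y0   # discrete version of x y' - y x'
    return length, abs(twice_area) / 2

def circle(t):
    return (math.cos(t), math.sin(t))

def ellipse(t):
    # Hypothetical example with semi-axes 2 and 1.
    return (2 * math.cos(t), math.sin(t))

circle_length, circle_area = length_and_area(circle)
ellipse_length, ellipse_area = length_and_area(ellipse)

# Isoperimetric deficit l^2 - 4 pi a: nearly 0 for the circle, positive for the ellipse.
circle_deficit = circle_length ** 2 - 4 * math.pi * circle_area
ellipse_deficit = ellipse_length ** 2 - 4 * math.pi * ellipse_area
```

The deficit of the circle is zero up to the discretization error, whereas the ellipse, whose area is $2\pi$, has a clearly positive deficit.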
The following exercises give a (very) short account of the recent theory of wavelets. It has numerous applications (for instance in the design of JPEG2000, the industrial standard for image compression replacing the older Fourier-based JPEG standard). See [31, 149, 158, 181, 212, 240, 257, 258].
Exercises

1*. Let $L_2(\mathbb{R})$ be the completion of the space of continuous functions $f$ on $\mathbb{R}$ such that
$$\|f\|_2 := \left( \int_{\mathbb{R}} |f(r)|^2\, dr \right)^{1/2} < +\infty,$$
endowed with this norm $\|\cdot\|_2$.
(a) Show that for $m, n \in \mathbb{Z}$ the translation $T_m$ and dilation $D_n$ operators given as follows are unitary operators:
$$(T_m f)(r) := f(r - m), \qquad (D_n f)(r) := 2^{n/2} f(2^n r), \qquad r \in \mathbb{R}.$$
(b) Note that $D_n T_m = T_{2^{-n} m} D_n$.
(c) Show that for all $\psi \in L_2(\mathbb{R})$ one has $\widehat{D_n T_m \psi}(s) = g_{m,n}(s)\, \hat{\psi}(2^{-n} s)$ with $g_{m,n}(s) := 2^{-n/2} e^{-2\pi i m 2^{-n} s}$, $\hat{\psi} := \mathcal{F}(\psi)$ being the Fourier transform of $\psi$.
A (dyadic) wavelet is a function $\psi \in L_2(\mathbb{R})$ such that the family $W := \{ D_n T_m \psi : m, n \in \mathbb{Z} \}$ forms an orthonormal basis of $L_2(\mathbb{R})$.

2*. (a) Let $S := [-1, -\frac{1}{2}[\ \cup\ [\frac{1}{2}, 1[$. Verify that the family $\{ 2^n S : n \in \mathbb{Z} \}$ is a partition of $\mathbb{R} \setminus \{0\}$.
(b) Show that the family $\{ e_k 1_S : k \in \mathbb{Z} \}$ is an orthonormal basis of $L_2(S)$.
(c) Deduce from this that for all $n \in \mathbb{Z}$ the restrictions to the set $2^n S$ of the functions $g_{m,n}$ ($m \in \mathbb{Z}$) defined in the preceding exercise form an orthonormal basis of $L_2(2^n S)$.
(d) Conclude that $\psi_S := \mathcal{F}^{-1}(1_S)$ is a wavelet, called the Shannon wavelet.

3*. A general method to construct wavelets is the so-called Multiresolution Analysis (MRA) method. An MRA is a sequence $(V_n)_{n \in \mathbb{Z}}$ of closed linear subspaces of $L_2(\mathbb{R})$ whose union is dense in $L_2(\mathbb{R})$ and such that
(i) $(V_n)$ is increasing: $V_n \subset V_{n+1}$ for all $n \in \mathbb{Z}$;
(ii) $\bigcap_n V_n = \{0\}$;
(iii) $V_{n+1} = D_1 V_n$ for all $n$: $f \in V_n$ if, and only if $f(2\,\cdot) \in V_{n+1}$;
(iv) there exists a $\varphi \in V_0$ such that $(T_k \varphi)_{k \in \mathbb{Z}}$ is an orthonormal basis of $V_0$.
(a) Given an MRA $(V_n)$, let $W_n := V_n^{\perp} \cap V_{n+1}$. Show that the subspaces $W_n$ are mutually orthogonal and that $\bigoplus_{n \in \mathbb{Z}} W_n = L_2(\mathbb{R})$.
(b) Prove that $D_n W_0 = W_n$ for all $n \in \mathbb{Z}$.
(c) Suppose that for some $\psi \in W_0$ the family $(T_k \psi)_{k \in \mathbb{Z}}$ forms an orthonormal basis of $W_0$. Deduce from the preceding that $(D_n T_k \psi)_{k \in \mathbb{Z}}$ forms an orthonormal basis of $W_n$ and that $(D_n T_k \psi)_{k, n \in \mathbb{Z}}$ forms an orthonormal basis of $L_2(\mathbb{R})$: $\psi$ is a wavelet.
(d) Let $\varphi := 1_{[0,1[}$ and let $V_0$ be the closed space spanned by the family $(T_k \varphi)_{k \in \mathbb{Z}}$. Show that for $V_n := D_n V_0$ the family $(V_n)_{n \in \mathbb{Z}}$ is an MRA.
(e) Let $\psi_H = 1_{[0,1/2[} - 1_{[1/2,1[}$. Show that $(D_n T_k \psi_H)_{k, n \in \mathbb{Z}}$ forms an orthonormal basis of $L_2(\mathbb{R})$: $\psi_H$ is called the Haar wavelet.

4*. Prove that a similar construction with $\varphi := \operatorname{sinc}$ given by $\operatorname{sinc}(r) := \frac{\sin \pi r}{\pi r}$ for $r \in \mathbb{R} \setminus \{0\}$ and $\operatorname{sinc}(0) := 1$ shows that the Shannon wavelet is associated with an MRA. The function sinc is involved in the following result (Whittaker-Shannon-Kotelnikov Sampling Theorem): if $f \in L_2(\mathbb{R})$ is such that the support of $\hat{f} := \mathcal{F}(f)$ is contained in $[-1/2, 1/2]$ then $f$ is determined by its values on $\mathbb{Z}$:
$$\forall r \in \mathbb{R} \qquad f(r) = \sum_{k \in \mathbb{Z}} f(k) \operatorname{sinc}(r - k).$$

5. Let $(r_n)_n$ be a nonincreasing sequence of nonnegative real numbers and let $(c_n)_n$ be a sequence of complex numbers such that there exists an $m \in \mathbb{R}_+$ satisfying $|c_0 + \ldots + c_n| \le m$ for all $n \in \mathbb{N}$. Prove that for all $n \in \mathbb{N}$ one has
$$|r_0 c_0 + \ldots + r_n c_n| \le m r_0.$$
[Hint: set $s_n := c_0 + \ldots + c_n$ and write (Abel transformation) $r_0 c_0 + \ldots + r_n c_n = s_0 r_0 + (s_1 - s_0) r_1 + \ldots + (s_n - s_{n-1}) r_n$.]

6. Using the preceding exercise, show that the series $\sum_{n \ge 1} (1/n) \sin nx$ converges uniformly on $[\varepsilon, 2\pi - \varepsilon]$ for all $\varepsilon \in\, ]0, \pi[$. [Hint: prove and use the relation
$$\sum_{1 \le k \le p} \sin (n + k)x = \frac{1}{2 \sin(x/2)} \left[ \cos((2n + 1)x/2) - \cos((2n + 2p + 1)x/2) \right]$$
for $x \in \mathbb{R} \setminus 2\pi\mathbb{Z}$.]

7. Denoting by $D_n : x \mapsto (\sin x/2)^{-1} \sin((2n + 1)x/2)$ for $x \in \mathbb{R} \setminus (2\pi\mathbb{Z})$ the Dirichlet kernel, Fejér's kernel is given by
$$F_n(x) := \frac{1}{n + 1} (D_0(x) + D_1(x) + \ldots + D_n(x)).$$
Show that
$$F_{n-1}(x) = \frac{1}{n} \frac{\sin^2(nx/2)}{\sin^2(x/2)}, \qquad x \in \mathbb{R} \setminus (2\pi\mathbb{Z}).$$
Given a $2\pi$-periodic regulated function $f$ one sets
$$R_n(f)(x) := \frac{1}{n + 1} (S_0(f; 2\pi)(x) + \ldots + S_n(f; 2\pi)(x)).$$
Prove that
$$f(x) - R_n(f)(x) = \frac{1}{2\pi} \int_0^{2\pi} (f(x) - f(x - t)) F_n(t)\, dt, \qquad (R_n(f)(x))_n \to \frac{1}{2} (f(x_-) + f(x_+))$$
and that if $f$ is continuous $(R_n(f))_n \to f$ uniformly. [Hint: cut the integration interval into the three pieces $[0, \alpha]$, $[\alpha, 2\pi - \alpha]$, $[2\pi - \alpha, 2\pi]$.]
4.6 Orthogonal Polynomials

Many mathematicians have proposed families of orthogonal polynomials. They are used for various concrete problems such as approximation, representation, and differential equations; see, for instance, [20, 107]. They enjoy special properties. We just give a brief, general account.

Given a closed interval $T$ in $\mathbb{R}$ (bounded or unbounded) and a continuous function $w$ on $T$, positive on $\operatorname{int} T$, considered as a weight, let $X$ be the set of continuous functions $x(\cdot)$ on $T$ such that
$$\int_T w(t)\, |x(t)|^2\, dt < +\infty.$$
The relations $2|ab| \le |a|^2 + |b|^2$ and $|a + b|^2 \le 2|a|^2 + 2|b|^2$ show that $X$ is a linear space and that the function $h$ given by
$$h(x, y) := \int_T w(t)\, \overline{x(t)}\, y(t)\, dt$$
is a Hermitian form on $X$. This Hermitian form is positive definite since $h(x, x) = 0$ implies that $x$ is null on $\operatorname{int} T$, hence is null on $T$. In the sequel we assume that for all $n \in \mathbb{N}$ we have
$$\int_T w(t)\, |t^n|\, dt < +\infty,$$
so that $X$ contains the restrictions to $T$ of the polynomial functions. Since the coefficients of a polynomial function that is null on $T$ are null, the monomials $t^n$ are linearly independent and one can use the Gram-Schmidt process to construct from them an orthogonal family $(p_n)$. Let us give some classical examples.

Example Hermite polynomials are obtained for $T := \mathbb{R}$ and $w(t) := e^{-t^2}$.

Example Laguerre polynomials are obtained for $T := \mathbb{R}_+$ and $w(t) := e^{-t}$.

Example Jacobi polynomials are obtained for $T := [-1, 1]$ and $w(t) := (1 - t)^r (1 + t)^s$ with $r, s \in\, ]-1, +\infty[$. For $r = s = 0$ (so that $w(\cdot) = 1$) they are called Legendre polynomials. For $r = s = -1/2$ they are called Chebyshev polynomials.

Among the interesting properties of the scalar product associated with $w$ and of the polynomials $p_n$ we note the following, left as exercises (note that the degree of $p_1 p_n - p_{n+1}$ is at most $n$).

Proposition 4.10 For all $x, y, z$ in the space $X$ introduced above, one has $\langle x \mid yz \rangle = \langle xy \mid z \rangle = \langle xyz \mid 1 \rangle$. For all $n$ one has $\langle p_1 p_n \mid p_{n+1} \rangle = \langle p_{n+1} \mid p_{n+1} \rangle$.
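The Gram-Schmidt construction can be carried out exactly in the Legendre case $T = [-1, 1]$, $w = 1$, since $\int_{-1}^{1} t^k\, dt$ equals $2/(k + 1)$ for even $k$ and $0$ for odd $k$. The following is a minimal sketch under that assumption; polynomials are represented as coefficient lists (lowest degree first), a representation chosen here for illustration, and the family is left unnormalized so that each $p_n$ is monic.

```python
from fractions import Fraction

def inner(p, q):
    """<p | q> = integral over [-1, 1] of p(t) q(t) dt, for real coefficient lists."""
    total = Fraction(0)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            if (i + j) % 2 == 0:   # odd powers integrate to 0 on [-1, 1]
                total += p_i * q_j * Fraction(2, i + j + 1)
    return total

def gram_schmidt(degree):
    """Orthogonalize 1, t, ..., t^degree without normalizing, so each p_n stays monic."""
    family = []
    for n in range(degree + 1):
        p = [Fraction(0)] * n + [Fraction(1)]          # the monomial t^n
        for q in family:
            coefficient = inner(p, q) / inner(q, q)    # component of p along q
            padded = q + [Fraction(0)] * (len(p) - len(q))
            p = [p_k - coefficient * q_k for p_k, q_k in zip(p, padded)]
        family.append(p)
    return family

legendre_monic = gram_schmidt(3)
```

The result is $1$, $t$, $t^2 - 1/3$, $t^3 - (3/5)t$; these are the Legendre polynomials up to the usual normalization $p_n(1) = 1$, for instance $(1/2)(3t^2 - 1) = (3/2)(t^2 - 1/3)$.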
Exercises

1. Prove that the polynomials $p_n$ satisfy an inductive relation of the form
$$p_n(t) = (t + b_n) p_{n-1}(t) - c_n p_{n-2}(t)$$
for $n \ge 2$, with $b_n \in \mathbb{R}$, $c_n > 0$.

2. Show that for all $n \in \mathbb{N}$ the polynomial $p_n$ has $n$ distinct real roots contained in $\operatorname{int} T$.

3. Verify that for the Chebyshev polynomials $p_n$ the functions $\sqrt{2/\pi}\, p_n$ form an orthonormal family.

4. Show that the first Legendre polynomials are given by $p_0(t) = 1$, $p_1(t) = t$, $p_2(t) = (1/2)(3t^2 - 1)$, $p_3(t) = (1/2)(5t^3 - 3t)$.

5. Prove that $\sum_n x^n p_n(t) = (1 - 2tx + x^2)^{-1/2}$ for the Legendre polynomials $p_n$.

6. Prove that the Legendre polynomial $p_n$ is a solution of the differential equation
$$(t^2 - 1) p_n''(t) + 2t p_n'(t) - n(n + 1) p_n(t) = 0$$
and that one has $p_n(t) = \dfrac{1}{2^n n!} \dfrac{d^n}{dt^n} (t^2 - 1)^n$.
4.7 Elementary Spectral Theory for Self-Adjoint Operators

If $X$ is a complex (resp. real) Hilbert space, an operator $A \in L_{\mathbb{C}}(X, X)$ (resp. $A \in L_{\mathbb{R}}(X, X)$) is called Hermitian (resp. self-adjoint or symmetric) if $A^* = A$. This relation is equivalent to the fact that $h_A$ defined by $h_A(x, y) := \langle Ax \mid y \rangle$ for $x, y \in X$ is Hermitian (resp. symmetric). When $X$ is a complex Hilbert space $A$ is Hermitian if and only if $h_A(x, x) \in \mathbb{R}$ for all $x \in X$. In fact, if $A = A^*$, then for all $x \in X$ one has $\langle Ax \mid x \rangle = \langle x \mid A^* x \rangle = \langle x \mid Ax \rangle = \overline{\langle Ax \mid x \rangle}$. Conversely, if $h_A(x, x) \in \mathbb{R}$ for all $x \in X$ we have
$$\forall x \in X \qquad \langle A^* x \mid x \rangle = \langle x \mid Ax \rangle = \overline{\langle Ax \mid x \rangle} = \langle Ax \mid x \rangle,$$
hence $A^* = A$ in view of the second part of the next lemma, taking in it $A^* - A$ instead of $A$.

Lemma 4.1 If a self-adjoint operator $A$ on a real Hilbert space $X$ satisfies $\langle Ax \mid x \rangle = 0$ for all $x \in X$ then $A = 0$. The same conclusion holds if $X$ is a complex Hilbert space and if $A$ is any $\mathbb{C}$-linear operator satisfying this property.

Proof This stems from the so-called polarization identity: for all $x, y \in X$
$$\langle A(x + y) \mid x + y \rangle - \langle A(x - y) \mid x - y \rangle = 2 \langle Ax \mid y \rangle + 2 \langle Ay \mid x \rangle.$$
By assumption the left-hand side is 0. If $A$ is self-adjoint the right-hand side is $4 \langle Ax \mid y \rangle$, so that $Ax = 0$ for all $x \in X$. If $A$ is $\mathbb{C}$-linear, replacing $x$ by $ix$ we get $-i \langle Ax \mid y \rangle + i \langle Ay \mid x \rangle = 0$ along with $\langle Ax \mid y \rangle + \langle Ay \mid x \rangle = 0$, hence $\langle Ax \mid y \rangle = 0$. □

Remark In the real case the conclusion does not hold if $A$ is not self-adjoint, as shown by the rotation $(r, s) \mapsto (-s, r)$ in $\mathbb{R}^2$. □

On the space $L(X) := L_{\mathbb{R}}(X, X)$ one defines a preorder by setting $A \le B$ if $\langle Ax \mid x \rangle \le \langle Bx \mid x \rangle$ for all $x \in X$ and one says that $A$ is positive (or positive semidefinite) if $\langle Ax \mid x \rangle \ge 0$ for all $x \in X$. By the preceding lemma, this preorder induces an order on the space $L_s(X)$ of symmetric operators, or, when $X$ is a complex space, on the space $L_{\mathbb{C}}(X, X)$ of $\mathbb{C}$-linear operators. For $A \in L(X)$, setting
$$a := \inf_{x \in S_X} \langle Ax \mid x \rangle, \qquad b := \sup_{x \in S_X} \langle Ax \mid x \rangle,$$
which are finite real numbers in the interval $[-\|A\|, \|A\|]$, one has
$$aI \le A \le bI, \tag{4.9}$$
where $I := I_X$ denotes the identity map of $X$. When $A$ is symmetric one can refine Theorem 3.35 which asserts that the spectrum $\sigma(A)$ of $A$ is contained in $[-\|A\|, \|A\|]$.

Proposition 4.11 If $A$ is a symmetric operator the numbers $a$ and $b$ defined in (4.9) are elements of the spectrum $\sigma(A)$ of $A$. Moreover, $\sigma(A) \subset [a, b]$ and $\|A\| = \max(-a, b)$.

Proof We first observe that for $r > b$ one has $r \notin \sigma(A)$, i.e. $A - rI$ is invertible, in view of the Lax-Milgram Theorem and of the inequalities $r - b > 0$, $\langle Ax \mid x \rangle \le b \|x\|^2$,
$$\forall x \in X \qquad \langle (rI - A)x \mid x \rangle \ge (r - b) \|x\|^2.$$
Changing $A$ into $-A$, we see that $]-\infty, a[$ does not meet $\sigma(A)$. Gathering the two conclusions we get $\sigma(A) \subset [a, b]$.

Now let us show that $b \in \sigma(A)$. Let $\beta := \|bI - A\|^{1/2}$. Setting $h(x, y) := \langle (bI - A)x \mid y \rangle$ for $x, y \in X$, we define a positive symmetric bilinear form. The Cauchy-Schwarz inequality yields
$$|\langle (bI - A)x \mid y \rangle| \le |\langle (bI - A)x \mid x \rangle|^{1/2} \cdot |\langle (bI - A)y \mid y \rangle|^{1/2}$$
for all $x, y \in X$, so that, taking the supremum over $y \in S_X$ we see that
$$\|(bI - A)x\| \le \beta\, |\langle (bI - A)x \mid x \rangle|^{1/2}.$$
Taking a sequence $(x_n)$ in $S_X$ such that $(\langle Ax_n \mid x_n \rangle)_n \to b$, we get $(\|(bI - A)x_n\|)_n \to 0$: $bI - A$ is not invertible, i.e. $b \in \sigma(A)$. Similarly, $-a \in \sigma(-A) = -\sigma(A)$.

Let $c := \max(-a, b)$. Clearly, (4.9) shows that $b \le \|A\|$ and $-a \le \|A\|$. Thus $c \le \|A\|$. Conversely, the polarization identity yields
$$4 \langle Ax \mid y \rangle = \langle A(x + y) \mid x + y \rangle - \langle A(x - y) \mid x - y \rangle \le b \|x + y\|^2 - a \|x - y\|^2 \le c\, (\|x + y\|^2 + \|x - y\|^2) = 2c\, (\|x\|^2 + \|y\|^2).$$
Assuming $x \ne 0$, $y \ne 0$ and replacing $x$ with $sx$, $y$ with $ty$ and choosing $s := (\|y\| / \|x\|)^{1/2}$, $t := (\|x\| / \|y\|)^{1/2}$, we obtain
$$2 \langle Ax \mid y \rangle \le c\, (s^2 \|x\|^2 + t^2 \|y\|^2) = 2c\, \|x\| \cdot \|y\|.$$
Taking the supremum for $x, y \in S_X$, we get $\|A\| \le c$ and the announced equality. □

Remark If $A$ is a Hermitian operator, one still has the relation $\|A\| = \max(|a|, |b|) = \sup \{ |\langle Ax \mid x \rangle| : x \in S_X \}$; see [182, thm 10, section VII,2].
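In finite dimensions, Proposition 4.11 says that the extreme values of $\langle Ax \mid x \rangle$ on the unit sphere are the smallest and largest eigenvalues of a symmetric matrix, and that the operator norm is $\max(-a, b)$. The sketch below checks this for a hypothetical $2 \times 2$ symmetric matrix by sampling the unit circle; the sampling resolution is an arbitrary choice and the computed values are approximations.

```python
import math

def rayleigh_bounds_and_norm(A, samples=100000):
    """Approximate a = inf <Ax|x>, b = sup <Ax|x> and ||A|| = sup |Ax| over the unit circle of R^2."""
    a, b, norm = math.inf, -math.inf, 0.0
    for j in range(samples):
        theta = math.pi * j / samples          # x and -x give the same values
        x = (math.cos(theta), math.sin(theta))
        Ax = (A[0][0] * x[0] + A[0][1] * x[1], A[1][0] * x[0] + A[1][1] * x[1])
        quadratic = Ax[0] * x[0] + Ax[1] * x[1]   # <Ax | x>
        a, b = min(a, quadratic), max(b, quadratic)
        norm = max(norm, math.hypot(Ax[0], Ax[1]))
    return a, b, norm

A = [[1.0, 2.0], [2.0, -3.0]]   # hypothetical symmetric matrix
a, b, norm = rayleigh_bounds_and_norm(A)
```

For this matrix the eigenvalues are $-1 \pm 2\sqrt{2}$, so that $a \approx -3.83$, $b \approx 1.83$ and $\|A\| = -a$, the larger of $-a$ and $b$.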
Corollary 4.9 If a symmetric operator $A$ is such that $\sigma(A) = \{0\}$, then $A = 0$.

Starting with a symmetric operator $A$ on $X$, for any polynomial $p$ one can define a new operator $p(A)$ given by
$$p(A) := c_n A^n + \ldots + c_1 A + c_0 I_X \quad \text{if} \quad p(t) = c_n t^n + \ldots + c_1 t + c_0.$$
Then one obtains a ring homomorphism from the algebra $\mathbb{R}[t]$ of real polynomials into the subalgebra $\mathcal{A}$ of $L(X, X)$ generated by $A$. But one can achieve more in associating an operator $f(A)$ to any continuous function $f$ on the interval $[a, b]$, $a$ and $b$ being given as above, so that one gets a linear ring homomorphism of the algebra $C([a, b])$ into the closure $\operatorname{cl}\mathcal{A}$ of the algebra $\mathcal{A}$ in $L(X, X)$. We need a preliminary algebraic result.

Lemma 4.2 Let $p$ be a real polynomial that is nonnegative on $[a, b]$. Then there exist a finite family $(q_h)_{h \in H}$ of real polynomials and a partition $H = I \cup J \cup K$ such that
$$p(t) = \sum_{i \in I} q_i(t)^2 + (t - a) \sum_{j \in J} q_j(t)^2 + (b - t) \sum_{k \in K} q_k(t)^2.$$

Proof We factor $p$ into a product of real polynomials of degree one and two, the quadratic factors being irreducible polynomials of the form $(t - c)^2 + d^2$. The other factors are of the form $(t - r)$, where $r$ belongs to the set $R$ of real roots of $p$. If $r \in R$ belongs to the interior of $[a, b]$, its multiplicity is even since otherwise $p$ would change sign around $r$. If $r \in R$ is not greater than $a$ we write the linear factor $t - r$ as $(t - a) + (a - r)$ with $a - r \ge 0$. If $r \in R$ is not less than $b$ we write the linear factor $r - t$ as $(r - b) + (b - t)$ and $r - b$ is a square. Since $p \ge 0$ on $[a, b]$, the coefficient in front of the product of such factors is positive (we assume $p \ne 0$, the case $p = 0$ being trivial). Multiplying out all these factors and noting that a product of sums of squares is a sum of squares, we get an expression of the announced type, except that there still remain terms of the form $(t - r)(s - t) q(t)^2$, where $r \le a$, $s \ge b$ and $q$ is a real polynomial. However, the identity
$$(t - r)(s - t) = \frac{(s - t)(t - r)^2 + (t - r)(s - t)^2}{s - r}$$
enables us to reduce these terms to terms of the other types. □

From this lemma we can deduce that the map $p \mapsto p(A)$ preserves positivity.

Proposition 4.12 If $A$ is a symmetric operator, $a, b \in \mathbb{R}$ are such that $aI_X \le A \le bI_X$ and if $p$ is a real polynomial that is positive on $[a, b]$, then one has $p(A) \ge 0$. If $p$ and $q$ are real polynomials such that $p \le q$ on $[a, b]$, then one has $p(A) \le q(A)$. Also $\|p(A)\| \le \|p\|_\infty$ with $\|p\|_\infty := \sup_{t \in [a, b]} |p(t)|$.
Proof Clearly, for any real polynomial q one has q2 .A/ D .q.A//2 0. Moreover, if B and C are two symmetric operators with B 0 and if BC D CB, then one has BC2 0 since hBC2 x j xi D hCBCx j xi D hBCx j Cxi 0: The first assertion then follows from the preceding lemma. The second one ensues by considering p q. Since for c WD kpk1 one has c p c on Œa; b, we get cIX p.A/ cIX , hence kp.A/k c by Proposition 4.7. t u Since the map p 7! p.A/ is linear and continuous from the set of restrictions to Œa; b of polynomials to L.X; X/, the Stone-Weierstrass Theorem and the extension theorem ensure that this linear map can be extended to a linear map from C.Œa; b/ to L.X; X/ and that for all f 2 C.Œa; b/ one has kf .A/k kf k1 . Moreover, if .pn / ! f and .qn / ! g then .pn qn / ! fg and so . fg/.A/ D f .A/g.A/. Thus the extended map
W f 7! f .A/ is again an algebra homomorphism from C.Œa; b/ to clA. Proposition 4.13 If A is a positive symmetric operator, then there exist some S (called the square root of A and often denoted by A1=2 ) in the closure of the algebra A generated by A such that S2 D A. Moreover, one has AS D SA. If A and B are two commuting positive symmetric operators, then AB is again positive. Proof The first assertion is obtained by using the function t 7! t1=2 on Œ0; kAk and observing that for any polynomial p, A commutes with p.A/. For the second assertion one introduces the square root S of A and uses the fact that S2 B 0, as observed above. t u The preceding analysis can be refined; this refinement can be bypassed in a first reading. Let KA be the kernel of the map W f 7! f .A/: KA WD ff 2 C.Œa; b/ W f .A/ D 0g: It is an ideal of the ring C.Œa; b/: for all h 2 C.Œa; b/, k 2 KA one has hk 2 KA since .hk/.A/ D h.A/k.A/ D 0. Let ZA WD ft 2 Œa; b W 8f 2 KA f .t/ D 0g be the zero set of KA called the (Gelfand) spectrum of A. Since ZA is closed, any continuous real-valued function f on ZA can be extended to a continuous function g on Œa; b, with kgk1 D kf kA WD supfjf .t/j W t 2 ZA g by using Urysohn’s theorem. If h is another extension of f , since h g is null on ZA , the Ideal Theorem ensures that h g belongs to the ideal KA , so that h.A/ D g.A/. Thus, we may denote by f .A/ this unambiguously defined operator. The map ˛ W C.ZA / ! clA thus obtained is easily seen to be an algebra homomorphism. If f is the restriction to ZA of a function fQ 2 C.Œa; b/, then, by construction, one has f .A/ D fQ .A/. Thus the map
4.7 Elementary Spectral Theory for Self-Adjoint Operators
$\tilde f \mapsto \tilde f(A)$ from $C([a,b])$ to $\operatorname{cl}\mathcal{A}$ can be factorized through the map $\alpha : f \mapsto f(A)$ from $C(Z_A)$ to $\operatorname{cl}\mathcal{A}$ and the restriction map $\tilde f \mapsto \tilde f|_{Z_A}$ from $C([a,b])$ to $C(Z_A)$: it is the composition of $\alpha$ with this restriction map.

Theorem 4.11 (Spectral Theorem) The map $\alpha : C(Z_A) \to \operatorname{cl}\mathcal{A}$ is an isomorphism of Banach algebras sending the cone of nonnegative functions on $Z_A$ onto the cone of positive elements of $\operatorname{cl}\mathcal{A}$. Moreover, $\alpha$ is an isometry.

Proof For $f \geq 0$ in $C(Z_A)$, we have $f(A) \geq 0$ since $f$ can be extended to some $\tilde f \geq 0$ in $C([a,b])$ and Proposition 4.12 can be used. Conversely, let us show that if for some $f \in C([a,b])$ we have $f(A) \geq 0$, then necessarily $f \geq 0$ on $Z_A$. Suppose on the contrary that for some $\bar t \in Z_A$ we have $f(\bar t) < 0$. For some $\varepsilon > 0$ we have $f(t) < 0$ for all $t \in [\bar t - \varepsilon, \bar t + \varepsilon] \cap [a,b]$. Let $g \in C([a,b])$ be piecewise affine, $g \geq 0$, null off $[\bar t - \varepsilon, \bar t + \varepsilon]$ and such that $g(\bar t) = 1$. Then, since $fg \leq 0$ we have $f(A)g(A) \leq 0$. But since $f(A) \geq 0$, $g(A) \geq 0$, Proposition 4.13 ensures that $f(A)g(A) \geq 0$. Thus $f(A)g(A) = 0$ and $fg \in K_A$. This is impossible since $(fg)(\bar t) \neq 0$ and $K_A = \{h \in C([a,b]) : h(z) = 0 \ \forall z \in Z_A\}$ by the Ideal Theorem.

We have seen the inequality $\|f(A)\| \leq \|f\|_A$ for all $f \in C(Z_A)$. Let us prove the reverse inequality. Let $r := \|f(A)\|$, so that $rI_X - f(A) \geq 0$ and $rI_X + f(A) \geq 0$. The preceding shows that $r - f \geq 0$ and $r + f \geq 0$ on $Z_A$, hence $\|f\|_A \leq r$ and $\|f(A)\| = \|f\|_A$. Thus $\alpha$ is an isometry, hence is injective. Given $B \in \operatorname{cl}\mathcal{A}$ let $(f_n)$ be a sequence of polynomial functions such that $(f_n(A)) \to B$. Since $\alpha$ is isometric, the restriction to $Z_A$ of the sequence $(f_n)$ is a Cauchy sequence, hence converges to some $f \in C(Z_A)$ in the norm $\|\cdot\|_A$. Then $B = f(A)$, so that $\alpha$ is a bijection. $\square$

Let us give a characterization of the spectrum $Z_A$ of $A$.

Proposition 4.14 If $A$ is a symmetric operator then its Gelfand spectrum $Z_A$ coincides with its spectrum, i.e. the set $\sigma(A)$ of numbers $z \in \mathbb{C}$ (in fact $z \in \mathbb{R}$) such that $A - zI_X$ is not invertible.

Proof For $z \in \mathbb{C} \setminus Z_A$ the function $g : t \mapsto (t - z)(t - \bar z)$ is non-null on $Z_A$. Then $h := 1/g \in C(Z_A)$, so that $(A - zI_X)(A - \bar z I_X)h(A) = I_X$ and $A - zI_X$ is invertible. Thus, if $A - zI_X$ is non-invertible one has $z \in Z_A$. Let us show that for any $z \in Z_A$ the operator $A - zI_X$ is non-invertible.
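Suppose on the contrary that $A - zI_X$ has an inverse $B$. For $n \in \mathbb{N} \setminus \{0\}$ let $g_n : \mathbb{R} \to \mathbb{R}$ be the continuous function defined by
$$g_n(t) := n \ \text{ for } t \in \left[z - \tfrac{1}{n}, z + \tfrac{1}{n}\right], \qquad g_n(t) := \frac{1}{|t - z|} \ \text{ for } t \in \mathbb{R} \setminus \left[z - \tfrac{1}{n}, z + \tfrac{1}{n}\right].$$
Since $|(t - z)g_n(t)| \leq 1$ for all $t$, we have $\|(A - zI_X)g_n(A)\| \leq 1$, hence
$$\|g_n(A)\| = \|B(A - zI_X)g_n(A)\| \leq \|B\|.$$
On the other hand, $\|g_n\|_A \geq |g_n(z)| = n$, so that $\|g_n(A)\| \geq n$, a contradiction for $n > \|B\|$. $\square$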
4 Hilbert Spaces
An eigenvector of an operator $A \in L(X,X)$ is a vector $v \in X \setminus \{0\}$ such that for some number $c$ (called the associated eigenvalue) one has $Av = cv$. Any eigenvalue $c$ of an operator $A$ satisfies $|c| \leq \|A\|$ since $\|Av\| \leq \|A\| \cdot \|v\|$. For a symmetric operator on a finite dimensional Hilbert space or, more generally, for a compact symmetric operator on a Hilbert space $X$, one can say more. Recall that $A$ is said to be compact if the image $A(B_X)$ of the unit ball of $X$ is contained in a compact set.

Proposition 4.15 If $A$ is a compact symmetric operator on a Hilbert space $X \neq \{0\}$, then either $a := \inf_{x \in S_X} \langle Ax \mid x\rangle$ or $b := \sup_{x \in S_X} \langle Ax \mid x\rangle$ is an eigenvalue of $A$.

Proof Proposition 4.11 ensures the existence of a sequence $(x_n)$ of $S_X$ such that $(|\langle Ax_n \mid x_n\rangle|)_n \to c := \|A\|$. Since $r_n := \langle Ax_n \mid x_n\rangle \in \mathbb{R}$, we can find $\varepsilon_n \in \{-1, 1\}$ such that $r_n = \varepsilon_n |r_n|$. Taking a subsequence if necessary, we may suppose $(\varepsilon_n)$ has a limit $\varepsilon \in \{-1, 1\}$. Then
$$\|Ax_n - \varepsilon_n c x_n\|^2 = \|Ax_n\|^2 - 2\varepsilon_n c \langle Ax_n \mid x_n\rangle + \varepsilon_n^2 c^2 \|x_n\|^2 \leq \|A\|^2 - 2c|r_n| + c^2.$$
The right-hand side converging to 0, we get that $(Ax_n - \varepsilon_n c x_n) \to 0$. Since $A$ is compact, taking another subsequence if necessary, we may suppose $(Ax_n)$ has a limit $y \in X$. Then $(\varepsilon_n c x_n)$ converges to $y$. If $c = 0$ one has $A = 0$ and the result is obvious. If $c \neq 0$, for $v := \varepsilon c^{-1} y$ one has $(x_n) \to v$, hence $v \in S_X$ and, passing to the limit, $Av - \varepsilon c v = 0$, so that $v$ is an eigenvector of $A$. Since $\|A\| = \max(|a|, |b|)$ by Proposition 4.11, the corresponding eigenvalue $\varepsilon c$ is $b$ if $\varepsilon = 1$ or $a$ if $\varepsilon = -1$. $\square$

Given an eigenvalue $\lambda$ of a symmetric operator $A$, let $E_\lambda$ be the subspace spanned by the corresponding eigenvectors. Clearly $E_\lambda = \{x \in X : Ax = \lambda x\}$. If $\lambda$ and $\mu$ are two distinct eigenvalues of $A$ one has $E_\lambda \perp E_\mu$ since for $x \in E_\lambda$, $y \in E_\mu$ one has $\lambda\langle x \mid y\rangle = \langle Ax \mid y\rangle = \langle x \mid Ay\rangle = \mu\langle x \mid y\rangle$, hence $\langle x \mid y\rangle = 0$.
If $A$ is compact and if $\lambda \neq 0$, the dimension of the eigenspace $E_\lambda$ is finite: otherwise, taking an infinite family $(e_n)$ of orthonormal vectors of $E_\lambda$ we could not find a convergent subsequence in $(A(e_n)) = (\lambda e_n)$ since $\|e_n - e_p\|^2 = 2$ for $n \neq p$, a contradiction with the compactness of $A$. For a similar reason, for any $\varepsilon > 0$ there is only a finite number of eigenvalues $\lambda$ satisfying $|\lambda| \geq \varepsilon$. Thus, if $X$ is infinite dimensional, 0 is the only possible limit of a sequence of distinct eigenvalues.

Given an operator $A$ on a Hilbert space $X$, a linear subspace $Y$ of $X$ is said to be $A$-invariant if $A(Y) \subset Y$, i.e. $A(y) \in Y$ for all $y \in Y$. Then $\operatorname{cl}(Y)$ is $A$-invariant and if $A$ is symmetric, $Y^\perp$ is invariant. Moreover, the restriction of $A$ to $Y$ is again symmetric.

Theorem 4.12 (Spectral Theorem for Compact Symmetric Operators) Let $A$ be a compact symmetric operator on a Hilbert space $X$. Then the set $S$ of eigenvalues of $A$ is a finite or countable subset of $\mathbb{R}$. If $S$ is finite (in particular if $X$ is finite dimensional), $X$ is the direct sum of the mutually orthogonal eigenspaces $E_\lambda$ for $\lambda \in S$.
If $S$ is infinite, $X$ is the Hilbertian sum of the eigenspaces $E_\lambda$ for $\lambda \in S$.

The last assertion means that any $x \in X$ is the sum of a series $\sum_n x_n$ such that $x_n \in E_{\lambda_n}$ with $(\lambda_n) \to 0$ in $S$ and $x_n \perp x_p$ for $n \neq p$. Taking a Hilbert basis of each subspace $E_\lambda$, including $E_0 = N(A)$, we get a Hilbert basis of $X$ formed of eigenvectors.

Proof Let $Y$ be the closure of the linear subspace generated by the eigenspaces $E_\lambda$ for $\lambda \in S$ and let $Z := Y^\perp$. Then $Z$ is invariant under $A$ and the restriction $A_Z$ of $A$ to $Z$ has no eigenvalue. Proposition 4.15 shows that $Z = \{0\}$. Taking orthonormal bases in all eigenspaces (including the kernel of $A$) and gathering them, we get the announced decomposition, even if we do not give more details about Hilbert sums of subspaces.
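In finite dimensions the statements above can be checked by hand. The following is only a minimal sketch for a symmetric $2 \times 2$ matrix; the helper names (`eig_sym_2x2`, `rayleigh_bounds`, `spectral_apply`) and the sample matrix are illustrative assumptions, not notation from the text. It compares the extreme values of $\langle Ax \mid x\rangle$ on the unit circle with the two eigenvalues, checks that the eigenvectors are orthogonal, and rebuilds $Ax$ from the eigen-decomposition.

```python
import math

def eig_sym_2x2(p, q, r):
    """Eigenpairs (value, unit vector) of the symmetric matrix [[p, q], [q, r]]."""
    mean = (p + r) / 2.0
    radius = math.hypot((p - r) / 2.0, q)
    # The quadratic form at the unit vector (cos t, sin t) equals
    # mean + ((p - r) / 2) * cos(2 t) + q * sin(2 t),
    # so it is maximal when 2 t = atan2(2 q, p - r).
    theta = 0.5 * math.atan2(2.0 * q, p - r)
    v_max = (math.cos(theta), math.sin(theta))
    v_min = (-math.sin(theta), math.cos(theta))
    return (mean - radius, v_min), (mean + radius, v_max)

def apply(p, q, r, x):
    """Matrix-vector product A x."""
    return (p * x[0] + q * x[1], q * x[0] + r * x[1])

def dot(x, y):
    return x[0] * y[0] + x[1] * y[1]

def rayleigh_bounds(p, q, r, samples=3600):
    """Sampled inf and sup of <Ax | x> over the unit circle."""
    values = []
    for k in range(samples):
        t = 2.0 * math.pi * k / samples
        x = (math.cos(t), math.sin(t))
        values.append(dot(apply(p, q, r, x), x))
    return min(values), max(values)

def spectral_apply(pairs, x):
    """Compute sum over eigenpairs of lambda * <x | v> * v."""
    out = [0.0, 0.0]
    for lam, v in pairs:
        coeff = lam * dot(x, v)
        out[0] += coeff * v[0]
        out[1] += coeff * v[1]
    return (out[0], out[1])

p, q, r = 2.0, 1.0, 2.0            # illustrative matrix [[2, 1], [1, 2]]
pairs = eig_sym_2x2(p, q, r)
a, b = rayleigh_bounds(p, q, r)
x = (0.3, -1.7)
print("eigenvalues:", pairs[0][0], pairs[1][0])
print("sampled inf/sup of <Ax|x>:", a, b)
print("A x:", apply(p, q, r, x), "from eigenspaces:", spectral_apply(pairs, x))
```

In this finite dimensional case both $a$ and $b$ are eigenvalues; Proposition 4.15 only guarantees one of them in general.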
Exercises

1. Let $X$ and $Y$ be two pre-Hilbertian spaces. Given $a \in X$, $b \in Y$, one defines $A := a \otimes b \in L(X,Y)$ by $A(x) := \langle a \mid x\rangle b$ for $x \in X$. Describe the image and the kernel of $A$. Compute $\|A\|$. Determine the adjoint $A^*$ of $A$. Find relations between the image (resp. kernel) of $A$ and the kernel (resp. image) of $A^*$.

2. Let $X$ and $Y$ be two Hilbert spaces and let $(e_n)$ (resp. $(f_n)$) be an orthonormal family of $X$ (resp. $Y$). Given a bounded sequence $(\lambda_n)$ of $\mathbb{R}$, let $A : X \to Y$ be defined by
$$A = \sum_n \lambda_n e_n \otimes f_n.$$
Show that $A$ is a well defined continuous linear map and compute $\|A\|$ and $A^*$. Assuming that $\lambda_n \neq 0$ for all $n \in \mathbb{N}$, show that $A$ is injective if and only if $(e_n)$ is a Hilbert basis of $X$. Show that $A(X)$ is dense in $Y$ if and only if $(f_n)$ is a Hilbert basis of $Y$. Find conditions ensuring that $A$ is invertible and express $A^{-1}$. In the case $X = Y$ find the eigenvalues of $A$.

3. Let $X$ be a real Hilbert space and let $A \in L(X)$ be symmetric. Prove that $\sigma(A) \subset \mathbb{R}_+$ if and only if $\langle Ax \mid x\rangle \geq 0$ for all $x \in X$. Prove the following equivalences:
$$\sigma(A) \subset [0,1] \iff \|A\| \leq 1 \ \& \ \langle Ax \mid x\rangle \geq 0 \ \forall x \in X \iff \langle Ax \mid x\rangle \geq \|Ax\|^2 \ \forall x \in X \iff \|x\|^2 \geq \langle Ax \mid x\rangle \geq 0 \ \forall x \in X.$$
. . . Leibniz quite rapidly developed formal analysis in the form in which we know it. That is, in a form specially suitable to teach analysis by people who do not understand it to people who will never understand it. V.I. Arnol’d, Huygens and Barrow, Newton and Hooke, Birkhäuser Verlag, Basel, 1990.
Abstract This chapter is devoted to a rather complete exposition of classical differential calculus. However, we present some non-classical variants which are often easier to handle. Besides the idea of approximation carried by the notion of derivative, important existence theorems can be obtained: the inverse function theorem and its relative, the implicit function theorem. Several applications are studied: the Legendre transform, the method of characteristics and some partial differential equations. Applications to geometry and optimization are devised and the calculus of variations that served as an incentive to the development of the subject is evoked with numerous examples.
case of one-variable maps to the case of maps defined on open subsets of normed spaces. It is not as strong as Fréchet differentiability. Moreover, for some results, differentiability does not suffice and one needs some continuity property of the derivative or a stronger notion of approximation. The main questions we treat are the invertibility of nonlinear maps, its applications to geometrical notions and its uses for optimization problems. We end the chapter with an introduction to the calculus of variations, which has been a strong incentive for the development of differential calculus since the end of the 17th century. Differentiability questions for convex functions will be considered in the next chapter. Differential equations will be studied in Chap. 10 devoted to evolution problems.

As an example showing the power of differential calculus, let us consider Dido's problem already evoked in the preceding chapter. It consists in finding a curve of given length enclosing the greatest area. Its origin can be found in the legend of the foundation of Carthage. According to the Aeneid, the queen Dido made the modest request that a piece of land that could be delimited by the hide of an ox be given to her and her men. It was accepted but she had the hide cut into a long, thin string and the piece of land became the city of Qart Hadasht (Carthage) (the first example of cunning colonialism). We take two opposite points considered as the beginning of the string and its middle and represented by $(0,0)$ and $(1,0)$ in $\mathbb{R}^2$ and we look for a function $x : [0,1] \to \mathbb{R}_+$ such that the area between its graph and the segment $[0,1] \times \{0\}$ is maximal among all curves of a given length $\ell$. Thus, we have to maximize
$$\int_0^1 x(t)\,dt \quad \text{subject to} \quad \int_0^1 \sqrt{1 + x'(t)^2}\,dt = \ell, \quad x(0) = 0, \quad x(1) = 0.$$
The theory of Lagrange multipliers and the Euler equation lead us to find a multiplier $\lambda \in \mathbb{R}$ and $x(\cdot)$ (an unknown in an infinite dimensional space!) such that
$$1 - \lambda \frac{d}{dt} \frac{x'(t)}{\sqrt{1 + x'(t)^2}} = 0.$$
Integrating, we look for some constant $c$ such that
$$t - \lambda \frac{x'(t)}{\sqrt{1 + x'(t)^2}} = c.$$
Solving this equation in $x'(t)$ we get
$$x'(t) = \pm \frac{t - c}{\sqrt{\lambda^2 - (t - c)^2}},$$
the sign being that of $\lambda$. Another integration yields some constant $c'$ such that
$$x(t) = \mp \sqrt{\lambda^2 - (t - c)^2} + c'.$$
Thus the graph of $x$ lies on the circle with equation $(t - c)^2 + (y - c')^2 = \lambda^2$ in $\mathbb{R}^2$ with coordinates $(t, y)$. The constraints $x(0) = 0$, $x(1) = 0$ yield $c = 1/2$, $\lambda^2 = c^2 + c'^2$ and, the arc being at most half a circle, the length constraint yields $2|\lambda| \arcsin(1/(2|\lambda|)) = \ell$. Choosing the distance between the extremities of the string to be $d$, with $d := 2\pi^{-1}\ell$ rather than 1, one finds that the arc $w(\cdot) := d\,x(\cdot/d)$ with length $\ell$ solving the problem is half a circle.
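The circular-arc solution can be explored numerically. The sketch below is an illustration under stated assumptions, not the book's method: it assumes the arc over the chord $[0,1]$ is at most half a circle, so that elementary geometry gives the length $2\lambda \arcsin(1/(2\lambda))$ for radius $\lambda \geq 1/2$, and it solves the length constraint by bisection. The function names and the comparison with an isosceles triangle of the same length are illustrative choices.

```python
import math

def arc_length(lam):
    """Length of the arc of radius lam >= 1/2 over a chord of length 1 (at most a half circle)."""
    return 2.0 * lam * math.asin(1.0 / (2.0 * lam))

def radius_for_length(ell):
    """Solve arc_length(lam) = ell by bisection; arc_length decreases from pi/2 to 1."""
    if not 1.0 < ell <= math.pi / 2.0:
        raise ValueError("need 1 < ell <= pi/2 for a graph over [0, 1]")
    lo, hi = 0.5, 1.0
    while arc_length(hi) > ell:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if arc_length(mid) > ell:
            lo = mid          # arc too long: the radius must grow
        else:
            hi = mid
    return 0.5 * (lo + hi)

def arc(lam, t):
    """Upper arc x(t) through (0, 0) and (1, 0) on the circle of radius lam."""
    c, c_prime = 0.5, -math.sqrt(lam * lam - 0.25)
    return math.sqrt(lam * lam - (t - c) ** 2) + c_prime

def polyline_length(lam, n=2000):
    pts = [(k / n, arc(lam, k / n)) for k in range(n + 1)]
    return sum(math.hypot(x1 - x0, y1 - y0)
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))

def trapezoid_area(lam, n=2000):
    ys = [arc(lam, k / n) for k in range(n + 1)]
    return (sum(ys) - 0.5 * (ys[0] + ys[-1])) / n

def segment_area(lam):
    """Closed-form area of the circular segment cut off by the chord."""
    theta = 2.0 * math.asin(1.0 / (2.0 * lam))
    return 0.5 * lam * lam * (theta - math.sin(theta))

def triangle_area(ell):
    """Area of the isosceles triangle of base 1 whose two other sides total ell."""
    return 0.5 * math.sqrt((ell / 2.0) ** 2 - 0.25)

ell = 1.2
lam = radius_for_length(ell)
print("radius:", lam)
print("polyline length:", polyline_length(lam))
print("arc area:", segment_area(lam), "triangle area:", triangle_area(ell))
```

For $\ell = \pi/2$ the computed radius is $1/2$, which is the half circle over $[0,1]$.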
5.1 Differentiation of One-Variable Functions The differentiation of one-variable vector-valued functions is not very different from the differentiation of one-variable real-valued functions. In both cases, the calculus relies on rules for limits. The aims are similar too. In both cases, the purpose consists in drawing some information about the behavior of the function from some knowledge concerning the derivative. In the vector-valued case, the direction of the derivative takes as great importance as its magnitude.
5.1.1 Derivatives of One-Variable Functions

In this section, $T$ is an open interval of $\mathbb{R}$ and $f : T \to X$ is a map with values in a normed space $X$.

Definition 5.1 The map $f$ is said to be right differentiable (resp. left differentiable) at $t \in T$ if the quotient $(f(t + s) - f(t))/s$ has a limit as $s \to 0_+$, i.e. $s \to 0$ with $s > 0$ (resp. as $s \to 0_-$, i.e. $s \to 0$ with $s < 0$). These limits, denoted by $f_r'(t)$ or $D_r f(t)$ and $f_\ell'(t)$ or $D_\ell f(t)$ respectively, are called the right and the left derivatives of $f$ at $t$. When these limits coincide, $f$ is said to be differentiable at $t$ and their common value $f'(t)$ is called the derivative of $f$ at $t$.

Thus $f$ is differentiable at $t$ if and only if the quotient $(f(t + s) - f(t))/s$ has a limit as $s \to 0$, with $s \neq 0$, or, equivalently, if there exist some vector $v$ $(= f'(t)) \in X$ and some functions $\alpha$, $r : T' := T - t \to X$ such that $\alpha(s) \to 0$ as $s \to 0$ and $r(s) = s\alpha(s)$ for which one has the expansion
$$f(t') = f(t) + (t' - t)v + r(t' - t), \qquad (5.1)$$
as seen by setting $\alpha(0) := 0$, $\alpha(s) := (f(t + s) - f(t))/s - v$ for $s \in T' \setminus \{0\}$, $r(s) = f(t + s) - f(t) - sv$ for $s \in T'$. The function $r$ is called a remainder. The following rules are immediate consequences of the rules for limits.
5 The Power of Differential Calculus
Proposition 5.1 If $f, g : T \to X$ are differentiable at $t \in T$ and $\lambda, \mu \in \mathbb{R}$, then $h := \lambda f + \mu g$ is differentiable at $t$ and its derivative at $t$ is $h'(t) = \lambda f'(t) + \mu g'(t)$.

Proposition 5.2 If $f : T \to X$ is differentiable at $t \in T$, if $Y$ is another normed space and if $A : X \to Y$ is linear and continuous, then $g := A \circ f$ is differentiable at $t$ and $g'(t) = A(f'(t))$.

Similar rules hold for right derivatives and left derivatives. We will see later a more general composition rule (or chain rule). The following composition rule can be proved using quotients in the same way as for scalar functions. We prefer using expansions as in (5.1) because such expansions give the true flavor of differential calculus, i.e. approximations by continuous affine functions. Moreover, one does not need to take care of denominators taking the value 0.

Proposition 5.3 If $T$, $U$ are open intervals of $\mathbb{R}$, if $g : T \to U$ is differentiable at $\bar t \in T$ and if $h : U \to X$ is differentiable at $\bar u := g(\bar t)$, then $f := h \circ g$ is differentiable at $\bar t$ and $f'(\bar t) = g'(\bar t)h'(\bar u)$.

Proof Let $v := h'(\bar u)$ and let $\alpha : T \to \mathbb{R}$, $\beta : U \to X$ be such that $\alpha(t) \to 0$ as $t \to \bar t$, $\beta(u) \to 0$ as $u \to \bar u$ with
$$g(t) - g(\bar t) = (t - \bar t)g'(\bar t) + (t - \bar t)\alpha(t), \qquad h(u) - h(\bar u) = (u - \bar u)v + (u - \bar u)\beta(u).$$
Then one has
$$f(t) - f(\bar t) = h(g(t)) - h(\bar u) = (g(t) - \bar u)v + (g(t) - \bar u)\beta(g(t)) = (t - \bar t)g'(\bar t)v + (t - \bar t)\alpha(t)v + (t - \bar t)(g'(\bar t) + \alpha(t))\beta(g(t)).$$
Since $g(t) \to \bar u$ as $t \to \bar t$, one sees that $\alpha(t)v + (g'(\bar t) + \alpha(t))\beta(g(t)) \to 0$ as $t \to \bar t$, so that $f$ is differentiable at $\bar t$ and $f'(\bar t) = g'(\bar t)v = g'(\bar t)h'(\bar u)$. $\square$

Now let us devise a rule for the derivative of a product. It can be generalized to a finite number of factors.

Proposition 5.4 (Leibniz Rule) Let $X$, $Y$, $Z$ be normed spaces and let $b : X \times Y \to Z$ be a continuous bilinear map. If $f : T \to X$, $g : T \to Y$ are differentiable at $t$, then the function $h : r \mapsto b(f(r), g(r))$ is differentiable at $t$ and
$$h'(t) = b(f'(t), g(t)) + b(f(t), g'(t)).$$

Proof By assumption, there exist some $\alpha : (T - t) \to X$, $\beta : (T - t) \to Y$ satisfying $\alpha(s) \to 0$, $\beta(s) \to 0$ as $s \to 0$ such that
$$f(t + s) = f(t) + sf'(t) + s\alpha(s), \qquad g(t + s) = g(t) + sg'(t) + s\beta(s).$$
Plugging these expansions into $b$ and setting
$$\gamma(s) := b(\alpha(s), g(t)) + b(f(t), \beta(s)) + s\,b(f'(t) + \alpha(s), g'(t) + \beta(s)),$$
so that $\gamma(s) \to 0$ as $s \to 0$, we get
$$h(t + s) - h(t) = s\,b(f'(t), g(t)) + s\,b(f(t), g'(t)) + s\gamma(s)$$
and $s^{-1}(h(t + s) - h(t)) \to b(f'(t), g(t)) + b(f(t), g'(t))$. $\square$
For a real-valued function $f : [a,b] \to \mathbb{R}$, as substitutes for the derivative of $f$ at $t \in \,]a,b[$, Ulisse Dini introduced the following four derivative numbers (called the Dini derivatives of $f$ at $t$):
$$D_- f(t) := \liminf_{s \to 0_-} \frac{f(t + s) - f(t)}{s}, \qquad D^- f(t) := \limsup_{s \to 0_-} \frac{f(t + s) - f(t)}{s},$$
$$D_+ f(t) := \liminf_{s \to 0_+} \frac{f(t + s) - f(t)}{s}, \qquad D^+ f(t) := \limsup_{s \to 0_+} \frac{f(t + s) - f(t)}{s}.$$
They always exist in $\overline{\mathbb{R}}$. Of course, $f$ is right differentiable at $t$ if and only if $D_+ f(t) = D^+ f(t)$ and this common value is finite; a similar equivalence holds for the left derivative. They enable us to obtain estimates akin to the ones we will deal with in the next subsection. We also refer to Theorem 8.14 and its corollaries for some refinements.

Lemma 5.1 Let $f : [a,b] \to \mathbb{R}$ be continuous and such that for some $c, d \in \mathbb{R}$ one of the Dini derivatives, denoted by $Df$, satisfies $c \leq Df(t) \leq d$ for all $t \in \,]a,b[$. Then one has $c(b - a) \leq f(b) - f(a) \leq d(b - a)$.

Proof Changing $f$ into $-f$ or $f$ into $t \mapsto f(-t)$ we may suppose $Df = D_+ f$. We first prove that $f(b) - f(a) \geq 0$ whenever $f$ satisfies $Df(t) \geq 0$ for all $t \in \,]a,b[$. Suppose on the contrary that $f(b) - f(a) < 0$ and take $p \in \mathbb{P}$ such that $f(b) - f(a) < -p(b - a)$. For $a' \in \,]a,b[$ close to $a$ we still have $f(b) - f(a') < -p(b - a')$. Setting $g(t) := f(t) - f(a') + p(t - a')$, we have $g(a') = 0$, $Dg(t) = Df(t) + p > 0$, so that there exists some $r \in \,]a',b[$ such that $g(r) > 0$. Since $g$ is continuous and $g(b) < 0$ there exists some $s \in \,]r,b[$ such that $g(s) = 0$ and $g(t) < 0$ for all $t \in \,]s,b]$. Then we have $Dg(s) \leq 0$, contradicting $Dg(s) > 0$. Thus our assertion is proved. Applying this assertion to the functions $h$ and $k$ given by $h(t) := f(t) - ct$, $k(t) := dt - f(t)$ we get the conclusion of the lemma. $\square$

It follows from the lemma that the supremum (and the infimum) of the four Dini derivatives of a continuous function on any interval are the same. We leave to the reader the task of proving that when one of the four Dini derivatives of a continuous function $f$ is continuous at some $t$, then $f$ is differentiable at $t$. The next result is well known.

Proposition 5.5 Let $T$ be an interval of $\mathbb{R}$, let $f : T \to \mathbb{R}$, and let $x \in \operatorname{int} T$ be such that $f$ is differentiable at $x$. If $f$ attains its minimum over $T$ at $x$, then $f'(x) = 0$.

Proof After a passage to the limit, this is a consequence of the fact that for all $u \in \mathbb{P}$ small enough one has $u^{-1}[f(x + u) - f(x)] \geq 0$ and $(-u)^{-1}[f(x - u) - f(x)] \leq 0$. $\square$
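The four Dini derivatives can differ at a single point. The sketch below gives crude numerical estimates only: it replaces each $\liminf$ and $\limsup$ by a minimum and maximum of difference quotients over a finite window of small increments, which is an approximation and not a limit. The function `dini_estimates`, its parameters and the two test functions are illustrative assumptions.

```python
import math

def dini_estimates(f, t, h=1e-2, n=200_000, shrink=1e-3):
    """Min and max of difference quotients for |s| running geometrically from h down to h * shrink."""
    ratio = shrink ** (1.0 / n)
    lo_right = lo_left = math.inf
    hi_right = hi_left = -math.inf
    s = h
    for _ in range(n):
        q_right = (f(t + s) - f(t)) / s        # increment s > 0
        q_left = (f(t - s) - f(t)) / (-s)      # increment -s < 0
        lo_right, hi_right = min(lo_right, q_right), max(hi_right, q_right)
        lo_left, hi_left = min(lo_left, q_left), max(hi_left, q_left)
        s *= ratio
    return {"D_-": lo_left, "D^-": hi_left, "D_+": lo_right, "D^+": hi_right}

def wiggle(t):
    """t * sin(1/t), extended by 0 at t = 0: continuous, with all quotients at 0 oscillating in [-1, 1]."""
    return t * math.sin(1.0 / t) if t != 0.0 else 0.0

est_wiggle = dini_estimates(wiggle, 0.0)
est_abs = dini_estimates(abs, 0.0)
print("t sin(1/t) at 0:", est_wiggle)
print("|t| at 0:", est_abs)
```

For $t \mapsto |t|$ at 0 the two right numbers coincide and so do the two left ones, so one-sided derivatives exist; for $t \mapsto t\sin(1/t)$ neither one-sided derivative exists at 0.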
[Fig. 5.1 The Descartes-Snell law of refraction]
Application: The Descartes-Snell Law of Refraction Suppose two media are separated by the horizontal plane $z = 0$ and that the speed of light is $s_1$ in the first one and $s_2$ in the second one. Using the fact that the trajectory of light minimizes the travel time and is along a line in an isotropic medium, prove the relation
$$\frac{1}{s_1} \sin\theta_1 = \frac{1}{s_2} \sin\theta_2$$
in which $\theta_1$ (resp. $\theta_2$) is the angle of the first (resp. second) ray with the vertical line (Fig. 5.1). [Hint: assume the trajectory joins the point $(0, 0, -a)$ to the point $(c, 0, b)$, with $a$, $b$, $c \in \mathbb{P}$ and passes through the point $(x, 0, 0)$. Using the fact that $x$ minimizes the travel time
$$f(x) = \frac{1}{s_1}(x^2 + a^2)^{1/2} + \frac{1}{s_2}((c - x)^2 + b^2)^{1/2},$$
find an interpretation of the relation $f'(x) = 0$.]
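The relation can be observed numerically without computing $f'$. The sketch below minimizes the travel time by a derivative-free ternary search (an assumption of this illustration, justified by the convexity of the travel time in $x$) and then compares $\sin\theta_1/s_1$ with $\sin\theta_2/s_2$. The numerical values of $a$, $b$, $c$, $s_1$, $s_2$ are arbitrary.

```python
import math

def travel_time(x, a, b, c, s1, s2):
    """Time along the broken line (0, 0, -a) -> (x, 0, 0) -> (c, 0, b)."""
    return math.hypot(x, a) / s1 + math.hypot(c - x, b) / s2

def minimize_unimodal(fun, lo, hi, iters=200):
    """Ternary search for the minimizer of a unimodal function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if fun(m1) < fun(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

a, b, c = 1.0, 2.0, 3.0      # illustrative geometry
s1, s2 = 1.0, 0.75           # illustrative speeds

x_star = minimize_unimodal(lambda x: travel_time(x, a, b, c, s1, s2), 0.0, c)
sin_theta1 = x_star / math.hypot(x_star, a)            # angle with the vertical, first medium
sin_theta2 = (c - x_star) / math.hypot(c - x_star, b)  # angle with the vertical, second medium

print("crossing point x:", x_star)
print("sin(theta1)/s1:", sin_theta1 / s1)
print("sin(theta2)/s2:", sin_theta2 / s2)
```

The two printed ratios agree up to the accuracy of the search, which is exactly the interpretation of $f'(x) = 0$ asked for in the hint.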
5.1.2 The Mean Value Theorem

The Mean Value Theorem is a precious tool for obtaining estimates. For this reason, it is a cornerstone of differential calculus. Let us note that the elementary version recalled in the following lemma is not valid when the function takes its values in a linear space of dimension greater than one.

Lemma 5.2 Let $f : T \to \mathbb{R}$ be a continuous function on some interval $T := [a,b]$ of $\mathbb{R}$, with $a < b$. If $f$ is differentiable on $]a,b[$ then there exists some $c \in \,]a,b[$ such that
$$f(b) - f(a) = f'(c)(b - a).$$

Example Let $f : [0,1] \to \mathbb{R}^2$ be given by $f(t) := (t^2, t^3)$ for $t \in T := [0,1]$. Then one cannot find any $c \in \operatorname{int} T$ satisfying the preceding relation since the system $2c = 1$, $3c^2 = 1$ has no solution. $\square$

Instead, a statement in the form of an estimate is valid.

Theorem 5.1 Let $X$ be a normed space, let $T := [a,b]$ be a compact interval of $\mathbb{R}$ and let $f : T \to X$, $g : T \to \mathbb{R}$ be continuous on $T$ and have right derivatives on $]a,b[$ such that $\|D_r f(t)\| \leq D_r g(t)$ for every $t \in \,]a,b[$. Then
$$\|f(b) - f(a)\| \leq g(b) - g(a). \qquad (5.2)$$
Proof It suffices to prove that for every given $\varepsilon > 0$, $b$ belongs to the set
$$T_\varepsilon := \{t \in T : \|f(t) - f(a)\| \leq g(t) - g(a) + \varepsilon(t - a)\}.$$
This set is nonempty since $a \in T_\varepsilon$ and it is closed, being defined by an inequality whose sides are continuous. Let $s := \sup T_\varepsilon \leq b$. Then $s \in T_\varepsilon$. We first suppose $f$ and $g$ have right derivatives on $[a,b[$ and we show that assuming $s < b$ leads to a contradiction. The existence of the right derivatives of $f$ and $g$ at $s$ yields some $\delta \in \,]0, b - s[$ such that for $r \in \,]0, \delta]$ one has
$$\left\| \frac{f(s + r) - f(s)}{r} - D_r f(s) \right\| \leq \frac{\varepsilon}{2}, \qquad \left| \frac{g(s + r) - g(s)}{r} - D_r g(s) \right| \leq \frac{\varepsilon}{2}.$$
It follows that for $r \in \,]0, \delta]$ one has
$$\|f(s + r) - f(s)\| \leq r\|D_r f(s)\| + r\varepsilon/2, \qquad g(s + r) - g(s) \geq rD_r g(s) - r\varepsilon/2.$$
Therefore, since $s \in T_\varepsilon$ and $\|D_r f(s)\| \leq D_r g(s)$,
$$\|f(s + r) - f(a)\| \leq \|f(s + r) - f(s)\| + \|f(s) - f(a)\|$$
$$\leq rD_r g(s) + r\varepsilon/2 + g(s) - g(a) + \varepsilon(s - a)$$
$$\leq g(s + r) - g(s) + r\varepsilon + g(s) - g(a) + \varepsilon(s - a)$$
$$= g(s + r) - g(a) + \varepsilon(s + r - a).$$
This string of inequalities shows that $s + r \in T_\varepsilon$, a contradiction with the definition of $s$. Thus $b \in T_\varepsilon$ and the result is established under the additional assumption that the right derivatives of $f$ and $g$ exist at $a$ (note that we may have $s = a$ in the preceding). When this additional assumption is not made, we take $a' \in \,]a,b[$ and we apply the preceding case to the interval $[a', b]$:
$$\|f(b) - f(a')\| \leq g(b) - g(a').$$
Then, passing to the limit as $a' \to a_+$, we get the announced inequality. $\square$
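For the curve $f(t) = (t^2, t^3)$ on $[0,1]$ the mean value equality fails while the estimate (5.2) holds. The sketch below illustrates this with the particular choice $g(t) = mt$, $m := \sup_{[0,1]} \|f'\| = \sqrt{13}$ for the Euclidean norm; the grid sizes and helper names are illustrative assumptions.

```python
import math

def f(t):
    return (t * t, t ** 3)

def df(t):
    return (2.0 * t, 3.0 * t * t)

def norm(v):
    return math.hypot(v[0], v[1])

# Equality f(1) - f(0) = f'(c) fails: the first coordinate forces c = 1/2,
# but then the second coordinate of f'(c) is 3/4 instead of 1.
c = 0.5
second_coordinate = df(c)[1]

# ||f'(t)|| = sqrt(4 t^2 + 9 t^4) is increasing on [0, 1], so the sampled
# maximum is attained at t = 1 and equals sqrt(13).
m = max(norm(df(k / 1000.0)) for k in range(1001))

def estimate_holds(n=50):
    """Check ||f(b) - f(a)|| <= m (b - a) on a grid of pairs a <= b in [0, 1]."""
    for i in range(n + 1):
        for j in range(i, n + 1):
            a, b = i / n, j / n
            increment = (f(b)[0] - f(a)[0], f(b)[1] - f(a)[1])
            if norm(increment) > m * (b - a) + 1e-12:
                return False
    return True

print("f'(1/2) =", df(c), "whereas f(1) - f(0) = (1, 1)")
print("m =", m, " estimate holds on the grid:", estimate_holds())
```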
Remark Since we allow the possibility that the right derivatives do not exist at the extremities of the interval, we may assume that the derivatives do not exist (or do not satisfy the assumed inequality) at a finite number of points of $T$. To prove this, it suffices to subdivide the interval into subintervals and to gather the obtained inequalities by using the triangle inequality.

In fact, one can exclude a countable set of points of $T$, but the proof is more involved; see [66], [100, p.153]. The result can also be deduced from Theorem 8.14, even by replacing $D_r g$ with one of its right Dini derivatives, observing that for $h : T \to \mathbb{R}$ given by $h(t) := \|f(t) - f(a)\|$ one has $D_r h(t) \leq \|D_r f(t)\|$ for all $t \in T$ since for $s > 0$ one has
$$\|f(t + s) - f(a)\| - \|f(t) - f(a)\| \leq \|f(t + s) - f(t)\|.$$
Thus $h - g$ is nonincreasing and one obtains the following refinement.

Theorem 5.2 With the notation of Theorem 5.1, the estimate (5.2) holds when $f$ and $g$ are continuous on $T$ and have right derivatives on $T \setminus D$, where $D$ is countable, such that $\|D_r f(t)\| \leq D_r g(t)$ for every $t \in T \setminus D$.

The most usual application is given in the following corollary in which we take $g(t) = mt$ for some $m \in \mathbb{R}_+$ and $t \in T$. The Lipschitz property is obtained by substituting an arbitrary pair $t$, $t'$ (with $t \leq t'$) for $a$, $b$.

Corollary 5.1 Let $f : T \to X$ be continuous on $T := [a,b]$, let $m \in \mathbb{R}_+$ and let $D$ be a countable subset of $T$. Suppose that, for all $t \in \,]a,b[ \,\setminus D$, $f$ has a right derivative at $t$ such that $\|D_r f(t)\| \leq m$. Then $f$ is Lipschitzian with rate $m$ on $T$ and, in particular,
$$\|f(b) - f(a)\| \leq m(b - a).$$

The case $m = 0$ yields the following noteworthy consequence.

Corollary 5.2 Let $f : [a,b] \to X$ be continuous and such that $f$ has a right derivative $D_r f$ on $]a,b[ \,\setminus D$ that is null, $D$ being countable. Then $f$ is constant on $[a,b]$.

The purpose of obtaining estimates often requires the introduction of auxiliary functions, as in the proof of the following useful corollary.

Corollary 5.3 Let $f : T \to X$ be continuous on $T := [a,b]$, let $v \in X$, $r \in \mathbb{R}_+$ and let $D$ be a countable subset of $T$. Suppose $f$ has a right derivative on $]a,b[ \,\setminus D$ such that $D_r f(t) \in v + rB_X$ for every $t \in \,]a,b[ \,\setminus D$. Then
$$f(b) \in f(a) + (b - a)v + (b - a)rB_X.$$

Proof Define $h : T \to X$ by $h(t) := f(t) - tv$. Then $h$ is continuous and for $t \in \,]a,b[ \,\setminus D$ one has $\|D_r h(t)\| = \|D_r f(t) - v\| \leq r$. Then Corollary 5.1 entails that
$$\|f(b) - f(a) - (b - a)v\| = \|h(b) - h(a)\| \leq (b - a)r,$$
an estimate equivalent to the inclusion of the statement. $\square$
Remark The terminology of the theorem stems from the fact that the mean value $\bar v := (b - a)^{-1}(f(b) - f(a))$ is estimated by the approximate speed $v$, with an error $r$ that is exactly the magnitude of the uncertainty of the estimate of the instantaneous speed $f_r'(t) := D_r f(t)$. Note that the shorter the lapse of time $(b - a)$, the more precise is the localization of $f(b)$ by $f(a) + (b - a)v$. Thus, if you lose your dog, be sure to have a rather precise idea of his speed and direction and do not lose time in looking for him.
Exercises

1. Prove Corollary 5.1 by deducing it from the classical Mean Value Theorem (Lemma 5.2) for real-valued functions, using the Hahn-Banach Theorem. [Hint: Take $y^*$ with norm 1 such that $\langle y^*, y\rangle = \|y\|$ for $y := f(x) - f(w)$, set $g(t) := \langle y^*, f(x + t(w - x))\rangle$ and pick $\theta \in \,]0,1[$ such that $g(1) - g(0) = D_r g(\theta)$.]

2. Prove Theorem 5.2. [See [66, 100].]
5.2 Primitives and Integrals

The aim of this section is to present an inverse of the differentiation operator. In fact, as revealed by the Darboux property (Exercise 1), not all functions from some interval $T$ of $\mathbb{R}$ to a real Banach space $X$ are derivatives. Therefore, we will get a primitive $g$ of a function $f$ on $T$ only if $f$ is regular enough. Here we use the following terminology.

Definition 5.2 A function $g : T \to X$ is said to be a primitive of $f : T \to X$ if $g$ is continuous and if there exists a countable subset $D$ of $T$ such that, for all $t \in T \setminus D$, $g$ is differentiable at $t$ and $g'(t) = f(t)$.

Uniqueness of the primitive taking an assigned value at some point of $T$ is asserted by the next proposition.

Proposition 5.6 If $g_1$ and $g_2$ are two primitives of an arbitrary function $f : T \to X$, then $g_1 - g_2$ is constant.

Proof If $g_1$ and $g_2$ are two primitives of $f$ there exist countable subsets $D_1$ and $D_2$ of $T$ such that $g_i$ is differentiable on $T \setminus D_i$ and $g_i'(t) = f(t)$ for all $t \in T \setminus D_i$ ($i = 1, 2$). Then, for the countable set $D := D_1 \cup D_2$, the continuous function $g_1 - g_2$ is differentiable on $T \setminus D$ and its derivative is 0 there; thus $g_1 - g_2$ is constant by Corollary 5.2. $\square$

A partial inverse of the differentiation operator is given in the next result.

Theorem 5.3 For $f : T \to X$ regulated, the map $g : t \mapsto \int_a^t f(s)\,ds$ is a primitive of $f$.
Proof Given $t \in [a,b[$, $\varepsilon > 0$, let $\delta \in \,]0, b - t[$ be such that for $r \in \,]0, \delta]$ one has $\|f(t + r) - f(t_+)\| \leq \varepsilon$. Since for $c := f(t_+)$ one has $\int_t^{t+r} c = rc$, it follows from the Chasles' relation and (3.27) that
$$\left\| \int_a^{t+r} f - \int_a^t f - rc \right\| = \left\| \int_t^{t+r} (f - c) \right\| \leq r\varepsilon.$$
This relation shows that $g : t \mapsto \int_a^t f(s)\,ds$ has a right derivative at $t$ whose value is $c$. Similarly, if $t \in \,]a,b]$, then $g$ has $f(t_-)$ as a left derivative at $t$. Therefore, if $f$ is continuous at $t \in \,]a,b[$, then $g$ is differentiable at $t$ and $g'(t) = f(t)$. Since the set $D$ of discontinuities of $f$ is countable, we get that $g$ is differentiable on $T \setminus D$ with derivative $f$. Moreover, $g$ is continuous on $T$ in view of the Chasles' relation and (3.27). $\square$

Corollary 5.4 If $f : T \to X$ is continuous, then $g : t \mapsto \int_a^t f(s)\,ds$ is of class $C^1$ (i.e., differentiable with a continuous derivative) and its derivative is $f$.

Let us give two rules that are useful for the computation of primitives.

Proposition 5.7 (Change of Variables) Let $h : S = [\alpha, \beta] \to \mathbb{R}$ be the primitive of a regulated function $h'$ such that $h(S) \subset T$ and let $f \in R(T, X)$. If either $f$ is continuous or $h$ is monotone, then $s \mapsto h'(s)f(h(s))$ is regulated and for all $r \in [\alpha, \beta]$ one has
$$\int_\alpha^r h'(s)f(h(s))\,ds = \int_{h(\alpha)}^{h(r)} f(t)\,dt. \qquad (5.3)$$
Proof When $f$ is continuous, since $h$ is continuous, $f \circ h$ is continuous and then $k : s \mapsto h'(s)f(h(s))$ is regulated; the same is true when $h$ is either increasing or decreasing. Then, the left-hand side of equality (5.3) is the value at $r$ of the primitive $j$ of $k$ satisfying $j(\alpha) = 0$. The right-hand side is $g(h(r))$, where $g$ is the primitive of $f$ satisfying $g(h(\alpha)) = 0$. Under each of our assumptions, for a countable subset $D$ of $S$, the derivative of $g \circ h$ at $r \in S \setminus D$ exists and is $h'(r)g'(h(r)) = h'(r)f(h(r))$. The uniqueness of the primitive of $k$ null at $\alpha$ gives the equality. $\square$

Proposition 5.8 (Integration by Parts) Let $X$, $Y$ and $Z$ be Banach spaces, let $(x, y) \mapsto x \ast y$ be a continuous bilinear map from $X \times Y$ into $Z$ and let $f : T \to X$, $g : T \to Y$ be primitives of regulated functions, with $T := [a,b]$. Then
$$\int_a^b f(t) \ast g'(t)\,dt = f(b) \ast g(b) - f(a) \ast g(a) - \int_a^b f'(t) \ast g(t)\,dt.$$

Proof The functions $t \mapsto f(t) \ast g'(t)$ and $t \mapsto f'(t) \ast g(t)$ clearly have one-sided limits at all points of $T := [a,b]$. Moreover, their sum is the derivative of $h : t \mapsto f(t) \ast g(t)$ on $T \setminus D$ where $D$ is the countable set of points of nondifferentiability of $f$ or $g$. Thus the result amounts to the equality $\int_a^b h'(t)\,dt = h(b) - h(a)$ that stems from the uniqueness of the primitive of $h'$ that takes the value 0 at $a$. $\square$
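Both rules can be checked on smooth examples with a simple quadrature. The sketch below is an illustration only: it uses a midpoint Riemann sum (an assumption of this example, not the integral of regulated functions constructed in the text), the substitution $h(s) = s^2$ with $f = \cos$ for the change of variables, and the scalar product on $\mathbb{R}^2$ as the bilinear map for the integration by parts. The names carrying the suffixes `_cv` and `_ip` are illustrative.

```python
import math

def midpoint(fun, a, b, n=20000):
    """Midpoint Riemann sum of a real-valued function on [a, b]."""
    step = (b - a) / n
    return step * sum(fun(a + (k + 0.5) * step) for k in range(n))

# Change of variables with h(s) = s^2, h'(s) = 2 s and f = cos on [0, r].
r = 1.3
lhs_cv = midpoint(lambda s: 2.0 * s * math.cos(s * s), 0.0, r)
rhs_cv = midpoint(math.cos, 0.0, r * r)

# Integration by parts with the scalar product of R^2 as bilinear map.
def dot(x, y):
    return x[0] * y[0] + x[1] * y[1]

def f_ip(t):
    return (t, 1.0)

def df_ip(t):
    return (1.0, 0.0)

def g_ip(t):
    return (math.sin(t), t * t)

def dg_ip(t):
    return (math.cos(t), 2.0 * t)

lhs_ip = midpoint(lambda t: dot(f_ip(t), dg_ip(t)), 0.0, 1.0)
rhs_ip = (dot(f_ip(1.0), g_ip(1.0)) - dot(f_ip(0.0), g_ip(0.0))
          - midpoint(lambda t: dot(df_ip(t), g_ip(t)), 0.0, 1.0))

print("change of variables:", lhs_cv, rhs_cv)
print("integration by parts:", lhs_ip, rhs_ip)
```

In closed form the first pair equals $\sin(r^2)$ and the second pair equals $\sin 1 + \cos 1$.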
Exercises

1. (Darboux property) Show that the derivative $f$ of a differentiable function $g : T \to \mathbb{R}$ satisfies the intermediate value property: given $a, b \in T$ with $f(a) < f(b)$ and $r \in \,]f(a), f(b)[$, there exists some $c$ between $a$ and $b$ such that $f(c) = r$.

2. Show that there exist a continuous function $f : \mathbb{R} \to \mathbb{R}$ and two continuous functions $g_1$, $g_2$ whose difference is not constant and which are such that $g_1$ and $g_2$ are differentiable on $\mathbb{R} \setminus N$, where $N$ is a set of measure zero, with $g_1'(t) = g_2'(t) = f(t)$ for all $t \in \mathbb{R} \setminus N$. [Take $f = 0$, $g_1 = 0$ and for $g_2$ take an increasing function whose derivative is 0 a.e.]

3. Prove Theorem 5.2. [See [100, 8.5.1].]
5.3 Directional Differential Calculus

Now let us consider maps from an open subset $W$ of a normed space $X$ into another normed space $Y$. In order to reduce the study of differentiability to the one-variable case it is natural to take restrictions to line segments or to compose with regular curves in $W$.

Definition 5.3 Let $X$, $Y$ be normed spaces, let $W$ be an open subset of $X$, let $x \in W$ and let $f : W \to Y$. We say that $f$ has a radial derivative at $x$ in the direction $u \in X$ if $(1/t)(f(x + tu) - f(x))$ has a limit when $t \to 0_+$. We denote by $d_r f(x, u)$ this limit. If $f$ has a radial derivative at $x$ in every direction $u$, we say that $f$ is radially differentiable at $x$. If moreover the map $D_r f(x) : u \mapsto d_r f(x, u)$ is linear and continuous, we say that $f$ is Gateaux differentiable at $x$ and call $D_r f(x)$ the Gateaux derivative of $f$ at $x$.

One often says that $f$ is directionally differentiable at $x$ but we prefer to keep this terminology for a slightly more demanding notion we consider now. In fact, although the notion of radial differentiability is simple and useful, it has several drawbacks; the main one is the fact that this notion does not enjoy a chain rule. The variant that follows does enjoy such a rule and reflects a smoother behavior of $f$ when the direction $u$ is submitted to small changes.

Definition 5.4 Let $X$, $Y$ be normed spaces, let $W$ be an open subset of $X$, let $x \in W$ and let $f : W \to Y$. We say that $f$ has a directional derivative at $x$ in the direction $u \in X$, or that $f$ is differentiable at $x$ in the direction $u$, if $(1/t)(f(x + tv) - f(x))$ has a limit when $(t, v) \to (0_+, u)$. We denote by $f'(x, u)$ or $df(x, u)$ this limit. If $f$ has a directional derivative at $x$ in every direction $u$, we say that $f$ is directionally differentiable at $x$. If moreover the map $f'(x) := Df(x) : u \mapsto f'(x, u)$ is linear and continuous, we say that $f$ is Hadamard differentiable at $x$.

The concepts of directional derivative and radial derivative are different, as the next example shows. Thus, it is convenient to have two notations at our disposal.
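A numerical sketch of the difference between the two notions, using the function $f(r, s) = r^3 s/(r^4 + s^2)$ with $f(0, 0) = 0$ on $\mathbb{R}^2$: along every fixed direction the difference quotients at the origin tend to 0, whereas along the moving directions $v_t = (1, t) \to (1, 0)$ they stay equal to $1/2$. The sampled directions and step sizes are arbitrary choices of this illustration.

```python
import math

def f(r, s):
    if r == 0.0 and s == 0.0:
        return 0.0
    return r ** 3 * s / (r ** 4 + s ** 2)

def radial_quotient(u, t):
    """(f(0 + t u) - f(0)) / t for a fixed direction u."""
    return f(t * u[0], t * u[1]) / t

def curve_quotient(t):
    """(f(0 + t v_t) - f(0)) / t with v_t = (1, t), which tends to u = (1, 0)."""
    return f(t, t * t) / t

directions = [(math.cos(k * math.pi / 8.0), math.sin(k * math.pi / 8.0))
              for k in range(16)]
worst_radial = max(abs(radial_quotient(u, 1e-6)) for u in directions)

print("largest radial quotient at t = 1e-6:", worst_radial)
print("quotients along v_t = (1, t):", [curve_quotient(t) for t in (1e-1, 1e-2, 1e-3)])
```

So the radial derivative at the origin is 0 in each sampled direction, while no limit as $(t, v) \to (0_+, (1, 0))$ can exist.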
Example-Exercise Let f W R2 ! R be given by f .r; s/ D .r4 Cs2 /1 r3 s for .r; s/ 2 R2 nf.0; 0/g, f .0; 0/ D 0. It is Gateaux differentiable at .0; 0/ but not directionally differentiable at .0; 0/. t u The (frequent) use of the same notation for the radial and directional derivatives is justified by the following observation showing the compatibility of the two notions. Proposition 5.9 If X and Y are normed spaces, if W is an open subset of X and if f W W ! Y has a directional derivative at x in the direction u, then it has a radial derivative at x in the direction u and both derivatives coincide. In particular, if f is Hadamard differentiable at x, then it is Gateaux differentiable at x. Conversely, if f is Lipschitzian on a neighborhood V of x, then f is directionally differentiable at x in any direction u in which f is radially differentiable. Proof The first assertions stem from an application of the definition of a limit. Let us prove the converse assertion. Let k be the Lipschitz rate of f on V and let u 2 X be such that f is radially differentiable at x in the direction u. Setting 0 r.t; v/ WD f .x C tv/ f .x/ u/ we have t1 r.t; u/ ! 0 as t ! 0 and tfr .x; 1 1 since t .r.t; v/ r.t; u// D t . f .x C tv/ f .x C tu// kkv uk ! 0 as .t; v/ ! .0C ; u/, we get t1 r.t; v/ ! 0 as .t; v/ ! .0C ; u/. t u While radial differentiability of f at x in the direction u is equivalent to differentiability of the function fx;u W t 7! f .x C tu/ at 0, directional differentiability of f at x amounts to differentiability of the composition of f with curves issued from x with the initial direction u, as the next proposition shows. Proposition 5.10 The map f W W ! Y is differentiable at x in the direction u 2 Xnf0g if and only if f is radially differentiable at x in the direction u and for any > 0 and any map c W Œ0; ! W that is right differentiable at 0 with Dr c.0/ D u, c.0/ D x, the map f ı c is right differentiable at 0 and Dr . f ı c/.0/ D dr f .x; u/. 
The characterization is still valid if one makes the additional requirement that $c$ be continuous on $[0,\tau]$.

Proof Suppose $f$ is differentiable at $x$ in the direction $u \in X$. Given $\tau > 0$ and $c : [0,\tau] \to W$ that is right differentiable at $0$ with $c'_+(0) = u$ and $c(0) = x$, let us set $v_t := (1/t)(c(t) - c(0))$, so that $v_t \to u$ as $t \to 0_+$. Then
\[ \frac{f(c(t)) - f(c(0))}{t} = \frac{f(x+tv_t) - f(x)}{t} \to df(x,u) \quad \text{as } t \to 0_+ . \]
Now let us prove the sufficient condition. Suppose $f$ has a radial derivative at $x$ in the direction $u$ but is not differentiable at $x$ in the direction $u \ne 0$. There exist $\varepsilon > 0$ and some sequence $(t_n,u_n) \to (0_+,u)$ such that $x + t_n u_n \in W$ for all $n \in \mathbb{N}$ and
\[ \left\| \frac{f(x+t_n u_n) - f(x)}{t_n} - d_r f(x,u) \right\| \ge \varepsilon . \tag{5.4} \]
We may assume that $t_{n+1} \le (1/2) t_n$. Then, let us define $c : [0,t_0] \to X$ by $c(0) := x$,
\[ c(t) := x + (t_n - t_{n+1})^{-1} \bigl[ (t_n - t)\, t_{n+1} u_{n+1} + (t - t_{n+1})\, t_n u_n \bigr] \quad \text{for } t \in [t_{n+1}, t_n[ . \]
Then one sees that $(1/t)(c(t) - c(0)) \to u$, but since $c(t_n) = x + t_n u_n$, in view of (5.4), $f \circ c$ is not differentiable at $0$ with derivative $d_r f(x,u)$. $\square$

Corollary 5.5 Let $X$, $Y$ be normed spaces, let $T$, $W$ be open subsets of $\mathbb{R}$ and $X$ respectively, let $c : T \to X$ be differentiable at $t \in T$ and $f : W \to Y$ be Hadamard differentiable at $x := c(t) \in W$ and such that $c(T) \subset W$. Then $f \circ c$ is differentiable at $t$ and
\[ (f \circ c)'(t) = Df(x)(c'(t)). \]

Thus, $Df(x)$ appears as the continuous linear map transforming velocities.

It is easy to show that any linear combination of maps having radial (resp. directional) derivatives at $x$ in some direction $u$ has a radial (resp. directional) derivative at $x$ in the direction $u$. In particular, any linear combination of two Gateaux (resp. Hadamard) differentiable maps is Gateaux (resp. Hadamard) differentiable. One also deduces from Proposition 5.2 that if $f$ has a directional (resp. radial) derivative at $x$ in the direction $u$ and if $A : Y \to Z$ is a continuous linear map, then $A \circ f$ has a directional (resp. radial) derivative at $x$ in the direction $u$ and $(A \circ f)'(x,u) = A(f'(x,u))$.

The preceding example-exercise shows that the composition of two radially differentiable maps is not necessarily radially differentiable. However, one does have a chain rule for directionally differentiable maps. These facts show that Hadamard differentiability is a more interesting notion than Gateaux differentiability.

Theorem 5.4 Let $X$, $Y$, $Z$ be normed spaces, let $U$ and $V$ be open subsets of $X$ and $Y$ respectively and let $f : U \to Y$, $g : V \to Z$ be directionally differentiable maps at $x \in W := f^{-1}(V)$ and $y := f(x) \in V$ respectively. Then $h := g \circ f$ is directionally differentiable at $x$ and
\[ d(g \circ f)(x,u) = dg(f(x), df(x,u)). \]
In particular, if $f$ is Hadamard differentiable at $x$ and if $g$ is Hadamard differentiable at $y := f(x)$, then $h := g \circ f$ is Hadamard differentiable at $x$ and
\[ D(g \circ f)(x) = Dg(y) \circ Df(x). \]

Proof More generally, let us show that if $f$ has a directional derivative at $x$ in the direction $u \in X$ and if $g$ has a directional derivative at $f(x)$ in the direction $v := df(x,u)$, then $h := g \circ f$ has a directional derivative at $x$ in the direction $u$. For $(t,u')$ close enough to $(0,u)$ one has $x + tu' \in W$. Let $q(t,u') := (1/t)(f(x+tu') - f(x))$.
Then $q(t,u') \to v := df(x,u)$ as $(t,u') \to (0_+,u)$. Therefore
\[ \frac{h(x+tu') - h(x)}{t} = \frac{g(y + t q(t,u')) - g(y)}{t} \to dg(y,v) \quad \text{as } (t,u') \to (0_+,u). \]
The statement can also be proved by using Proposition 5.10. $\square$
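The example-exercise on $f(r,s) = (r^4+s^2)^{-1} r^3 s$ can be probed numerically. The following minimal sketch (a sanity check, not a proof; the function names and the chosen direction $(1,2)$ are illustrative assumptions) compares the radial quotient at the origin in a fixed direction with the quotient along $v_t = (1,t) \to u = (1,0)$, which is the kind of perturbed direction that directional (Hadamard) differentiability must tolerate.

```python
def f(r, s):
    """f(r, s) = r^3 s / (r^4 + s^2), extended by 0 at the origin."""
    if r == 0.0 and s == 0.0:
        return 0.0
    return r**3 * s / (r**4 + s**2)


def radial_quotient(t, u):
    """(f(0 + t*u) - f(0)) / t for a fixed direction u."""
    return f(t * u[0], t * u[1]) / t


def perturbed_quotient(t):
    """(f(0 + t*v_t) - f(0)) / t with v_t = (1, t), which tends to u = (1, 0)."""
    return f(t, t * t) / t


ts = [10.0 ** (-k) for k in range(1, 7)]
# Fixed direction: the quotients shrink with t, consistent with a zero radial derivative.
radial = [radial_quotient(t, (1.0, 2.0)) for t in ts]
# Perturbed directions: the quotient stays at 1/2 instead of tending to 0.
perturbed = [perturbed_quotient(t) for t in ts]
```

The radial quotients decrease roughly like $t/2$, while the perturbed quotients equal $1/2$ up to rounding, since $f(t,t^2) = t/2$.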
The notion of radial differentiability is sufficient to get a mean value theorem. Recall that the line segment $[a,b]$ (respectively $]a,b[$) with end points $a$, $b$ in a normed space is the set $\{(1-t)a + tb : t \in [0,1]\}$ (respectively $\{(1-t)a + tb : t \in \,]0,1[\}$).

Proposition 5.11 If $f : W \to Y$ is radially differentiable at each point of a segment $[w,x]$ contained in $W$, then
\[ \| f(x) - f(w) \| \le \sup_{t \in \,]0,1[} \| d_r f(w + t(x-w), x-w) \| . \]
Proof Let $h : [0,1] \to Y$ be given by $h(t) := f((1-t)w + tx)$; it is right differentiable on $]0,1[$, with right derivative $D_r h(t) = d_r f((1-t)w + tx, x-w)$, and continuous on $[0,1]$. Then Corollary 5.1 yields the estimate. $\square$

A variant can be derived when $f$ is Gateaux differentiable at each point of $S := \,]a,b[$, since then one has $\| d_r f(z, x-w) \| \le \| D_r f(z) \| \cdot \| x-w \|$ for all $z \in S$, $w$, $x \in X$.

Proposition 5.12 Let $X$ and $Y$ be normed spaces, let $W$ be an open subset of $X$ containing the segment $[w,x]$ and let $f : W \to Y$ be continuous on $[w,x]$ and Gateaux differentiable at each point of $S := \,]w,x[$, with $c := \sup_{z \in S} \| D_r f(z) \| < +\infty$. Then one has
\[ \| f(x) - f(w) \| \le c \, \| x-w \| . \]

Corollary 5.6 Let $X$ and $Y$ be normed spaces, let $W$ be a convex open subset of $X$ and let $f : W \to Y$ be Gateaux differentiable at each point of $W$ and such that for some $c \in \mathbb{R}$ one has $\| D_r f(w) \| \le c$ for every $w \in W$. Then $f$ is Lipschitzian with rate $c$: for all $x, x' \in W$ one has
\[ \| f(x) - f(x') \| \le c \, \| x-x' \| . \]
In particular, if $D_r f(w) = 0$ for every $w \in W$, then $f$ is constant on $W$.

This result is also valid if $W$ is connected instead of convex. An extension of the estimate of Proposition 5.12 is also valid in the case when $W$ is connected, provided one replaces the usual distance with the geodesic distance $d_W$ in $W$ defined as in Exercise 5. The following corollary gives an approximation of $f$ when an approximate value of the derivative of $f$ around $x$ is available.
Corollary 5.7 Let $X$ and $Y$ be normed spaces, let $X_0$ be a linear subspace of $X$, let $W$ be a convex open subset of $X$ and let $f : W \to Y$ be Gateaux differentiable at each point of $W$ and such that for some $c \in \mathbb{R}$ and some $\ell \in L(X_0,Y)$ one has $\| D_r f(x)(u) - \ell(u) \| \le c \|u\|$ for every $x \in W$, $u \in X_0$. Then, for any $x, x' \in W$ such that $x - x' \in X_0$, one has
\[ \| f(x) - f(x') - \ell(x-x') \| \le c \, \| x-x' \| . \]

This result (obtained by changing $f$ into $f - \ell$ in the preceding corollary) will serve to get Fréchet differentiability from Gateaux differentiability. For the moment, let us point out another passage from Gateaux differentiability to Hadamard differentiability.

Proposition 5.13 Let $W$ be an open subset of $X$. If $f : W \to Y$ is radially differentiable on a neighborhood $V$ of $\overline{x}$ in $W$ and if, for some $u \in X \setminus \{0\}$, its radial derivative $f_r' := d_r f : V \times X \to Y$ is continuous at $(\overline{x},u)$, then $f$ is directionally differentiable at $\overline{x}$ in the direction $u$. In particular, if $f$ is Gateaux differentiable on $V$ and if $f_r' : V \times X \to Y$ is continuous at each point of $\{\overline{x}\} \times X$, then $f$ is Hadamard differentiable at $\overline{x}$.

Proof Without loss of generality, we may suppose $u$ has norm 1. Given $\varepsilon > 0$, let $\delta \in \,]0,1[$ be such that $\| f_r'(x,v) - f_r'(\overline{x},u) \| \le \varepsilon$ for all $(x,v) \in B(\overline{x},2\delta) \times B(u,\delta)$, with $B(\overline{x},2\delta) \subset V$. Setting $r(t,v) := f(\overline{x}+tv) - f(\overline{x}) - t f_r'(\overline{x},u)$, we observe that for every $v \in B(u,\delta)$ the map $r_v := r(\cdot,v)$ is differentiable on $[0,\delta]$ and $\| r_v'(t) \| = \| f_r'(\overline{x}+tv,v) - f_r'(\overline{x},u) \| \le \varepsilon$. Since $r_v(0) = 0$, Corollary 5.1 yields $\| r(t,v) \| \le \varepsilon t$. This shows that $f$ has $f_r'(\overline{x},u)$ as a directional derivative at $\overline{x}$ in the direction $u$. The last assertion is an immediate consequence. $\square$

The importance of this continuity condition leads us to introduce a definition.

Definition 5.5 Given normed spaces $X$, $Y$ and an open subset $W$ of $X$, a map $f : W \to Y$ is said to be of class $D^1$ at $w \in W$ (resp. on $W$) if it is Hadamard differentiable around $w$ (resp. on $W$) and if $df : W \times X \to Y$ is continuous at $(w,v)$ for all $v \in X$ (resp. on $W \times X$).
We say that $f$ is of class $D^k$ with $k \in \mathbb{N}$, $k > 1$, if $f$ is of class $D^1$ and if $df$ is of class $D^{k-1}$. We denote by $D^1(W,Y)$ the space of maps of class $D^1$ from $W$ to $Y$ and by $BD^1(W,Y)$ the space of maps $f \in D^1(W,Y)$ that are bounded and such that $f' : w \mapsto Df(w) := df(w,\cdot)$ is bounded from $W$ to $L(X,Y)$.

Let us note the following two properties.

Proposition 5.14 For any $f \in D^1(W,Y)$ the map $f' : w \mapsto Df(w)$ is locally bounded.

Proof Suppose, on the contrary, that there exist $w \in W$ and a sequence $(w_n) \to w$ such that $(r_n) := (\| Df(w_n) \|) \to +\infty$. For each $n \in \mathbb{N}$ one can pick some unit vector $u_n \in X$ such that $\| df(w_n,u_n) \| > r_n - 1$. Setting (for $n \in \mathbb{N}$ large) $x_n := r_n^{-1} u_n$, we see that $((w_n,x_n)) \to (w,0)$ but $(\| df(w_n,x_n) \|) \to 1$, a contradiction with the continuity of $df$ at $(w,0)$, where $df(w,0) = 0$. $\square$
Corollary 5.8 For $f : W \to Y$, where $W$ is a convex open subset of $X$, the following assertions are equivalent:
(a) $f$ is of class $D^1$;
(b) there exists a continuous map $h : W \times W \times X \to Y$ such that for all $(w,x) \in W \times W$ one has $h(w,x,\cdot) \in L(X,Y)$ and $f(w) - f(x) = h(w,x,w-x)$;
(c) $f$ is Hadamard (or Gateaux) differentiable, $f'$ is locally bounded and for all $u \in X$ the map $x \mapsto f'(x)u$ is continuous.
In particular, if $Y = \mathbb{R}$ and if $f \in D^1(W,\mathbb{R})$, the derivative of $f$ is continuous when $X^*$ is endowed with the topology of uniform convergence on compact sets.

Proof (a)$\Rightarrow$(b) Take $h(w,x,v) := \int_0^1 df((1-t)x + tw, v)\,dt$. For the reverse implication, observe that for all $x \in W$, $u \in X$, setting $w := x + tu$ one has
\[ \frac{1}{t}\bigl( f(x+tu) - f(x) \bigr) = h(x+tu,x,u) \to h(x,x,u) \quad \text{as } t\,(\ne 0) \to 0, \]
so that $f$ is Gateaux differentiable and $df(x,u) = h(x,x,u)$. Thus $df$ is continuous and $f$ is of class $D^1$.

The implication (a)$\Rightarrow$(c) stems from the preceding proposition. For the reverse implication we assume that there exist an $m > 0$ and a neighborhood $U$ of $x$ in $W$ such that $\| f'(w) \| \le m$ for $w \in U$ and, given $u \in X$ and $\varepsilon > 0$, that there exists a neighborhood $V_\varepsilon$ of $x$ contained in $U$ such that $\| f'(w)u - f'(x)u \| \le \varepsilon/2$ for $w \in V_\varepsilon$. Then, for $v \in B(u,\varepsilon/2m)$ and $w \in V_\varepsilon$ we have
\[ \| f'(w)v - f'(x)u \| \le \| f'(w)(v-u) \| + \| f'(w)u - f'(x)u \| \le m \frac{\varepsilon}{2m} + \frac{\varepsilon}{2} = \varepsilon, \]
which shows that $(w,v) \mapsto f'(w)v$ is continuous. $\square$
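As a quick check of assertion (b) of Corollary 5.8, here is a worked instance under the illustrative assumption $X = Y = W = \mathbb{R}$ and $f(x) = x^3$ (this choice is not from the text). Taking $h(w,x,v)$ to be the integral of $df((1-t)x+tw, v)$ over $t \in [0,1]$, one can compute $h$ in closed form and verify both the factorization $f(w) - f(x) = h(w,x,w-x)$ and the identity $df(x,u) = h(x,x,u)$.

```latex
\begin{align*}
h(w,x,v) &= \int_0^1 df\bigl((1-t)x+tw,\,v\bigr)\,dt
          = \int_0^1 3\bigl((1-t)x+tw\bigr)^2 v\,dt
          = (w^2+wx+x^2)\,v,\\
h(w,x,w-x) &= (w^2+wx+x^2)(w-x) = w^3-x^3 = f(w)-f(x),\\
h(x,x,u) &= 3x^2u = df(x,u).
\end{align*}
```

Here $h$ is a polynomial, hence continuous, and linear in its third argument, as required in (b).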
Example Let $E$, $F$ be Banach spaces, let $T$ be a compact interval of $\mathbb{R}$, let $U$ be an open subset of $E \times T$ and let $g : U \to F$ be continuous and such that for every $(e,t) \in U$ the Gateaux derivative $D_1 g(e,t) = Dg_t(e)$ of $g_t := g(\cdot,t)$ at $e$ exists and the map $(e,t,v) \mapsto D_1 g(e,t).v$ is continuous on $U \times E$. Let $R(T,E)$ (resp. $R(T,F)$) be the space of regulated maps from $T$ to $E$ (resp. $F$). Denoting by $W$ the set of $w \in R(T,E)$ such that $\mathrm{cl}(\{(w(t),t) : t \in T\}) \subset U$ and by $g^\diamond : W \to R(T,F)$ the map defined by $g^\diamond(w)(t) := g(w(t),t)$ for $w \in W$, $t \in T$, we recall that Proposition 3.45 established that $g^\diamond$ is continuous. For a similar reason the map $(w,z) \mapsto D_1 g(w(\cdot),\cdot).z(\cdot)$ is continuous. Let $V := \{(e,e',t) : ((1-s)e + se', t) \in U \ \forall s \in [0,1]\}$ and let $H : V \times E \to F$ be continuous and such that
\[ g(e,t) - g(e',t) = G(e,e',t).(e-e') = H(e,e',e-e',t) \qquad \forall (e,e',t) \in V, \]
where $G(e,e',t) = \int_0^1 D_1 g((1-s)e + se', t)\,ds$ and in particular $G(e,e,t) = D_1 g(e,t)$ for all $(e,t) \in U$. Plugging the values of $w$, $z \in W$ into $g$ and $H$, for all $t \in T$, we get
\[ g^\diamond(w)(t) - g^\diamond(z)(t) = H(w(t), z(t), w(t) - z(t), t) = H^\diamond(w,z,w-z)(t). \]
Since $H^\diamond$ is continuous and since $H^\diamond(w,z,\cdot)$ is linear and continuous by Proposition 3.45, we get that $g^\diamond$ is of class $D^1$ from $W$ into $R(T,F)$. $\square$

Proposition 5.15 If $X$, $Y$, $Z$ are normed spaces, if $U$ and $V$ are open subsets of $X$ and $Y$ respectively and if $f \in D^1(U,Y)$, $g \in D^1(V,Z)$, with $f(U) \subset V$, then $h := g \circ f \in D^1(U,Z)$.

Proof This follows from the relation $dh(u,x) = dg(f(u), df(u,x))$ for all $(u,x) \in U \times X$. $\square$

Under a differentiability assumption, convex functions, integral functionals and Nemytskii operators are important examples of maps of class $D^1$. See Proposition 6.16 and Sect. 8.5.2.
Exercises

1. Let $X$, $Y$ be normed spaces and let $W$ be an open subset of $X$. Prove that $f : W \to Y$ is Hadamard differentiable at $x$ if and only if there exists a continuous linear map $\ell : X \to Y$ such that the map $q_t$ given by $q_t(v) := (1/t)(f(x+tv) - f(x))$ converges to $\ell$ as $t \to 0_+$, uniformly on compact subsets of $X$. Deduce another proof of Proposition 5.12 below from this characterization.
2. Prove that if $f : W \to Y$ is radially differentiable at $x$ in the direction $u$ and if $f$ is directionally steady at $x$ in the direction $u$ in the sense that $(1/t)(f(x+tv) - f(x+tu)) \to 0$ as $(t,v) \to (0_+,u)$, then $f$ is directionally differentiable at $x$ in the direction $u$. Give an example showing that this criterion is more general than the Lipschitz condition of Proposition 5.9.
3. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be given by $f(r,s) := r^2 s (r^2+s^2)^{-1}$ for $(r,s) \in \mathbb{R}^2 \setminus \{(0,0)\}$, $f(0,0) = 0$. Show that $f$ has a radial derivative (which is, in fact, a bilateral derivative) but is not Gateaux differentiable at $(0,0)$.
4. Let $E$ be a Hilbert space and let $X := D^1(T,E)$, where $T := [0,1]$. Endow $X$ with the norm $\|x\| := \sup_{t \in T} \|x(t)\| + \sup_{t \in T} \|x'(t)\|$. Define the length of a curve $x : [0,1] \to E$ by
\[ \ell(x) := \int_0^1 \| x'(t) \| \,dt . \]
(a) Show that $\ell$ is a continuous sublinear functional on $X$ with Lipschitz rate 1.
(b) Let $W$ be the set of $x \in X$ such that $x'(t) \ne 0$ for all $t \in [0,1]$. Show that $W$ is an open subset of $X$ and that $\ell$ is Gateaux differentiable on $W$.
(c) Show that $\ell$ is of class $D^1$ on $W$ [Hint: use convergence results for integrals]. In order to prove that $\ell$ is of class $C^1$ one may use the results in the following questions.
(d) Let $E_0 := E \setminus \{0\}$ and let $D : E_0 \to E$ be given by $D(v) := \|v\|^{-1} v$. Given $u, v \in E_0$ show that $\| D(u) - D(v) \| \le 2 \|u\|^{-1} \|u-v\|$.
(e) Deduce from the preceding inequality that $\ell'$ is continuous.
5. Prove the assertions following Corollary 5.6, defining the geodesic distance $d_W(x,x')$ between two points $x, x'$ of $W$ as the infimum of the lengths of curves joining $x$ to $x'$.
6. Prove that if $f : W \to Y$ has a directional derivative at some point $x$ of the open subset $W$ of $X$, then its derivative $Df(x) : u \mapsto df(x,u)$ is continuous if it is linear.
7. Prove Proposition 5.11 by deducing it from the classical Mean Value Theorem (Lemma 5.2) for real-valued functions, using the Hahn-Banach Theorem. [Hint: Take $y^*$ with norm one such that $\langle y^*, y \rangle = \|y\|$ for $y := f(x) - f(w)$, set $g(t) := \langle y^*, f(x + t(w-x)) \rangle$ and pick $\theta \in \,]0,1[$ such that $g(1) - g(0) = D_r g(\theta)$.]
8. Show that the norm $x \mapsto \|x\| := \sup_{t \in T} |x(t)|$ on the Banach space $X := C(T)$ of continuous functions on $T := [0,1]$ is Hadamard differentiable at $x \in X$ if and only if the function $t \mapsto |x(t)|$ attains its maximum on $T$ at a single point.
9. (a) Let $a, b$ be two points of a normed space $X$. Show that the function $g$ given by $g(t) := \|a + tb\|$ has a right derivative and a left derivative at all points of $\mathbb{R}$.
(b) Let $f : T \to X$, where $T$ is an interval of $\mathbb{R}$. Show that if $f$ has a right derivative $D_r f(t)$ at some $t \in T$, then $g \circ f$ has a right derivative at $t$ and $D_r(g \circ f)(t) \le \| D_r f(t) \|$.
10. Use the preceding exercise to deduce a mean value theorem from Lemma 5.2.
11. Let $f : X \to Y$ be a map of class $C^1$ between two normed spaces such that $f(tx) = t f(x)$ for all $(t,x) \in \mathbb{R} \times X$. Show that $f$ is linear, and in fact that $f(x) = f'(0)(x)$.
12. A map $f : X \to Y$ between two normed spaces is said to have a Schwarz derivative at $x \in X$ if for all $v \in X$ the quotient $(1/2t)(f(x+tv) - f(x-tv))$ has a limit $d_s f(x,v)$ as $t \to 0_+$. Show that if $f$ has a radial derivative at $x$, then it has a Schwarz derivative at $x$ and $d_s f(x,v) = (1/2)(d_r f(x,v) - d_r f(x,-v))$.
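The Schwarz derivative of Exercise 12 can exist where the Gateaux derivative does not. A minimal numerical sketch, assuming the illustrative choice $X = Y = \mathbb{R}$, $f = |\cdot|$ and $x = 0$ (helper names are hypothetical): the radial quotients in the directions $v$ and $-v$ both approximate $|v|$, so the symmetric quotient vanishes.

```python
def schwarz_quotient(f, x, v, t):
    """Symmetric quotient (f(x + t v) - f(x - t v)) / (2 t)."""
    return (f(x + t * v) - f(x - t * v)) / (2.0 * t)


def radial_quotient(f, x, v, t):
    """One-sided quotient (f(x + t v) - f(x)) / t."""
    return (f(x + t * v) - f(x)) / t


t = 1e-6
v = 3.0
ds = schwarz_quotient(abs, 0.0, v, t)        # symmetric quotient at 0
dr_plus = radial_quotient(abs, 0.0, v, t)    # approximates d_r f(0, v) = |v|
dr_minus = radial_quotient(abs, 0.0, -v, t)  # approximates d_r f(0, -v) = |v|
half_difference = 0.5 * (dr_plus - dr_minus)
```

Both `ds` and `half_difference` are zero up to rounding, in line with the formula of Exercise 12, although $v \mapsto d_r f(0,v) = |v|$ is not linear.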
5.4 Classical Differential Calculus

The behavior of nonlinear maps is difficult to control. Differential calculus can help. Its main purpose is to obtain information about a given nonlinear map by using an affine approximation to it around a given point.
5.4.1 The Main Concepts and Results of Differential Calculus

Of course, the meaning of the word "approximation" has to be made precise. For that purpose, we define remainders. Fréchet differentiability consists in an approximation by a continuous affine map, the error being a remainder.

Definition 5.6 Given normed spaces $X$ and $Y$, we denote by $o(X,Y)$ the set of maps $r : X \to Y$ such that $r(0) = 0$ and $r(x)/\|x\| \to 0$ as $x \to 0$ in $X \setminus \{0\}$. The elements of $o(X,Y)$ will be called remainders.

Thus, $r : X \to Y$ is a remainder if and only if there exists some map $\alpha : X \to Y$ satisfying $\alpha(x) \to 0$ as $x \to 0$ and $r(\cdot) = \|\cdot\| \alpha(\cdot)$. Moreover, $r \in o(X,Y)$ if and only if there exists a modulus $\mu : \mathbb{R}_+ \to \overline{\mathbb{R}}_+ := \mathbb{R}_+ \cup \{\infty\}$ such that $\| r(x) \| \le \mu(\|x\|) \|x\|$ for $x \in X$ (recall that $\mu : \mathbb{R}_+ \to \overline{\mathbb{R}}_+$ is a modulus when $\mu$ is nondecreasing, $\mu(0) = 0$ and $\mu$ is continuous at $0$). Such a case occurs when there exist $c > 0$ and $p > 1$ such that $\| r(x) \| \le c \|x\|^p$. Following Landau, remainders are often denoted by $o(\cdot)$ and different remainders are often represented by the same letters since they are considered as inessential for the assigned purposes.

If $r, s : X \to Y$ are two maps that coincide on some neighborhood $V$ of $0$ in $X$, then $s$ belongs to $o(X,Y)$ if and only if $r$ belongs to $o(X,Y)$. Thus if $q : V \to Y$ is defined on some neighborhood $V$ of $0$ in $X$, we consider that $q$ is a remainder if some extension $r$ of $q$ to all of $X$ is a remainder. The preceding observation shows that this property does not depend on the choice of the extension.

The following two results are direct consequences of the rules for limits.

Lemma 5.3 For any normed spaces $X$, $Y$ the set $o(X,Y)$ of remainders is a linear space.

Lemma 5.4 Given normed spaces $X$, $Y_1, \ldots, Y_k$, $Y := Y_1 \times \ldots \times Y_k$, a map $r : X \to Y$ is a remainder if and only if its components $r_1, \ldots, r_k$ are remainders.

The class of remainders is stable under composition by continuous linear maps.

Lemma 5.5 For any normed spaces $W$, $X$, $Y$, $Z$, for any $r \in o(X,Y)$ and any continuous linear maps $A : W \to X$, $B : Y \to Z$, one has $r \circ A \in o(W,Y)$ and $B \circ r \in o(X,Z)$ (hence $B \circ r \circ A \in o(W,Z)$).

Proof Let $\alpha : X \to Y$ be such that $r(x) = \|x\| \alpha(x)$ and $\alpha(x) \to 0$ as $x \to 0$. Then, if $A : W \to X$ is stable at $0$, i.e. is such that there exists some $c > 0$ for which $\| A(w) \| \le c \|w\|$ for $w$ in a neighborhood of $0$ in $W$, in particular if $A$ is linear and continuous, one has
\[ \| r(A(w)) \| = \| A(w) \| \, \| \alpha(A(w)) \| \le c \|w\| \, \| \alpha(A(w)) \| \]
and $\alpha(A(w)) \to 0$ as $w \to 0$, so that $r \circ A \in o(W,Y)$. Similarly, if $B : Y \to Z$ is stable at $0$, then $B \circ r \in o(X,Z)$. The assertion about $B \circ r \circ A$ is a combination of the two other cases. $\square$

We are ready to define differentiability in the Fréchet sense; this notion is so common that one often writes "differentiable" instead of "Fréchet differentiable".
Definition 5.7 Given normed spaces $X$, $Y$ and an open subset $W$ of $X$, a map $f : W \to Y$ is said to be Fréchet differentiable (or firmly differentiable, or just differentiable) at $\overline{x} \in W$ if there exist a continuous linear map $\ell : X \to Y$ and a remainder $r \in o(X,Y)$ such that for $w \in W$ one has
\[ f(w) = f(\overline{x}) + \ell(w - \overline{x}) + r(w - \overline{x}). \tag{5.5} \]
It is often convenient to write the preceding relation in the form $f(\overline{x}+u) - f(\overline{x}) = \ell(u) + r(u)$ for $u$ close to $0$. Here the continuous affine map $x \mapsto f(\overline{x}) + \ell(x - \overline{x})$ can be viewed as an approximation of $f$ that essentially determines the behavior of $f$ around $\overline{x}$. The continuous linear map $\ell$ is called the derivative of $f$ at $\overline{x}$ and is denoted by $Df(\overline{x})$ or $f'(\overline{x})$. It is unique: given two approximations $\ell_1, \ell_2$ of $f(\overline{x}+\cdot) - f(\overline{x})$ around $0$ and two remainders $r_1, r_2$ such that $f(\overline{x}+u) - f(\overline{x}) = \ell_1(u) + r_1(u) = \ell_2(u) + r_2(u)$, one has $\ell_1 = \ell_2$ since $\ell := \ell_1 - \ell_2$ is the remainder $r := r_2 - r_1$; in fact, writing $r = \|\cdot\| \alpha(\cdot)$ with $\alpha(x) \to 0$ as $x \to 0$, for every $u \in X$ and any $t > 0$ small enough one has
\[ \ell(u) = \frac{1}{t} r(tu) = \frac{1}{t} \alpha(tu) \|tu\| = \alpha(tu) \|u\| \to 0 \quad \text{as } t \to 0, \]
so that $\ell(u) = 0$. Thus $L(X,Y) \cap o(X,Y) = \{0\}$. Uniqueness is also a consequence of Corollary 5.12 below and of the fact that the directional derivative is unique as it is obtained as a limit.

When $Y := \mathbb{R}$, the derivative $Df(\overline{x})$ of $f$ at $\overline{x}$ belongs to the dual $X^*$ of $X$. When $X$ is a Hilbert space with scalar product $\langle \cdot \mid \cdot \rangle$ it may be convenient to use the Riesz isometry $R : X \to X^*$ given by $\langle R(x), y \rangle = \langle x \mid y \rangle$ to get an element $\nabla f(\overline{x})$ of $X$ called the gradient of $f$ at $\overline{x}$ by setting $\nabla f(\overline{x}) := R^{-1}(Df(\overline{x}))$. It allows us to visualize the derivative, but, in some respects, it is preferable to work with the derivative, in particular when dealing with composite maps and higher order derivatives.

Proposition 5.16 If $f : W \to Y$ is differentiable at $\overline{x} \in W$, then it is continuous at $\overline{x}$.

Proof This follows from the fact that any remainder is continuous at $0$. $\square$
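Definition 5.7 can be illustrated numerically by checking that the error of the affine approximation is small compared with $\|u\|$. The sketch below is only a plausibility check under illustrative assumptions: the map $f(x_0,x_1) = (x_0 x_1, \sin x_0)$ on $\mathbb{R}^2$ with the Euclidean norm, the base point, the direction and the helper names are all chosen here, not taken from the text.

```python
import math


def f(x):
    """An illustrative smooth map from R^2 to R^2."""
    return (x[0] * x[1], math.sin(x[0]))


def Df(x, u):
    """Candidate derivative of f at x applied to the increment u."""
    return (x[1] * u[0] + x[0] * u[1], math.cos(x[0]) * u[0])


def norm(v):
    """Euclidean norm on R^2."""
    return math.hypot(v[0], v[1])


def remainder_ratio(x_bar, u):
    """||f(x_bar + u) - f(x_bar) - Df(x_bar)(u)|| / ||u||."""
    base = f(x_bar)
    moved = f((x_bar[0] + u[0], x_bar[1] + u[1]))
    linear = Df(x_bar, u)
    r = (moved[0] - base[0] - linear[0], moved[1] - base[1] - linear[1])
    return norm(r) / norm(u)


x_bar = (0.7, -0.3)
direction = (0.6, 0.8)  # a unit vector
ratios = [remainder_ratio(x_bar, (s * direction[0], s * direction[1]))
          for s in (1e-1, 1e-2, 1e-3)]
```

The ratios shrink roughly in proportion to $\|u\|$, which is the behavior expected of a remainder bounded by $c\|u\|^2$ (the case $p = 2$ mentioned after Definition 5.6).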
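Proposition 5.17 If $f, g : W \to Y$ are differentiable at $\overline{x} \in W$, then for any $\lambda, \mu \in \mathbb{R}$ the map $h := \lambda f + \mu g$ is differentiable at $\overline{x}$ and $Dh(\overline{x}) = \lambda Df(\overline{x}) + \mu Dg(\overline{x})$.

Proof If $r(x) := f(\overline{x}+x) - f(\overline{x}) - f'(\overline{x})(x)$, $s(x) := g(\overline{x}+x) - g(\overline{x}) - g'(\overline{x})x$, one has $h(\overline{x}+x) = h(\overline{x}) + \lambda f'(\overline{x})(x) + \mu g'(\overline{x})(x) + t(x)$, where $t := \lambda r + \mu s \in o(X,Y)$. Thus $h$ is differentiable at $\overline{x}$ and $h'(\overline{x}) = \lambda f'(\overline{x}) + \mu g'(\overline{x})$. $\square$

Examples
(a) A constant map is everywhere differentiable and its derivative is $0$.
(b) A continuous linear map $\ell \in L(X,Y)$ is differentiable at any point $\overline{x}$ and its derivative at $\overline{x}$ is $\ell$ since $\ell(\overline{x}+x) = \ell(\overline{x}) + \ell(x)$.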
(c) A continuous affine map $f := \ell + c$, where $\ell \in L(X,Y)$ and $c \in Y$, is differentiable at any $\overline{x} \in X$ and $Df(\overline{x}) = \ell$.
(d) If $f : X := X_1 \times X_2 \to Y$ is a continuous bilinear map then $f$ is differentiable at any point $\overline{x} := (\overline{x}_1, \overline{x}_2) \in X$ and for $x = (x_1,x_2)$ one has $Df(\overline{x})(x) = f(\overline{x}_1,x_2) + f(x_1,\overline{x}_2)$ since $f(\overline{x}+x) - f(\overline{x}) = f(\overline{x}_1,x_2) + f(x_1,\overline{x}_2) + f(x_1,x_2)$. Here $f$ is a remainder since $\| f(x) \| \le \|f\| \, \|x_1\| \, \|x_2\| \le \|f\| \, \|x\|^2$ whenever $\|x\| \ge \|x\|_\infty := \max(\|x_1\|, \|x_2\|)$.
(e) If $f : X \to Y$ is a continuous quadratic map in the sense that there exists a continuous bilinear map $b : X \times X \to Y$ such that $f(x) = b(x,x)$, then $f$ is differentiable at any point $\overline{x} \in X$ and $Df(\overline{x})(x) = b(\overline{x},x) + b(x,\overline{x})$ for $x \in X$. This follows from the chain rule below and the preceding example. Alternatively, one may observe that $r := f$ is a remainder since for every $x \in X$ one has $\| f(x) \| \le \|b\| \, \|x\|^2$ and $f(\overline{x}+x) = f(\overline{x}) + b(\overline{x},x) + b(x,\overline{x}) + f(x)$.
(f) If $p : X \to Y$ is a continuous homogeneous polynomial of degree $k$, i.e. if there exists a symmetric $k$-multilinear map $f : X^k \to Y$ such that $p(x) := f(x, \ldots, x)$ for all $x \in X$, then $p$ is differentiable and $Dp(\overline{x})(x) = k f(\overline{x}, \ldots, \overline{x}, x)$ for all $x \in X$.
(g) If $f : T \to Y$ is defined on an open interval $T$ of $\mathbb{R}$, $f$ is differentiable at $\overline{x} \in T$ if and only if $f$ has a derivative at $\overline{x}$ and $Df(\overline{x})$ is the linear map $r \mapsto r f'(\overline{x})$, hence $f'(\overline{x}) = Df(\overline{x})(1)$. The key point in this example is detailed in the following exercise. $\square$

Exercise Show that for any normed space $Y$ the space $L(\mathbb{R},Y)$ is isomorphic (and even isometric) to $Y$ via the evaluation map $\ell \mapsto \ell(1)$ whose inverse is the map $v \mapsto \ell_v$, where $\ell_v \in L(\mathbb{R},Y)$ is defined by $\ell_v(r) := rv$ for $r \in \mathbb{R}$.

The following characterization will be helpful.

Lemma 5.6 Given an open subset $W$ of $X$, a map $f : W \to Y$ is differentiable at $\overline{x} \in W$ if and only if there exists a map $F : W \to L(X,Y)$ that is continuous at $\overline{x}$ and such that $f(w) - f(\overline{x}) = F(w)(w - \overline{x})$ for all $w \in W$.

Proof Assume that for a map $F : W \to L(X,Y)$ continuous at $\overline{x}$ we have $f(w) = f(\overline{x}) + F(w)(w - \overline{x})$ for all $w \in W$. Then $f(\overline{x}+x) - f(\overline{x}) = F(\overline{x})(x) + r(x)$, where $r(x) := (F(\overline{x}+x) - F(\overline{x}))(x)$ for $x$ small. Since $\| r(x) \| \le \| F(\overline{x}+x) - F(\overline{x}) \| \cdot \|x\|$, $r$ is a remainder and $f$ is differentiable at $\overline{x}$ with $Df(\overline{x}) = F(\overline{x})$.

Let us prove the converse. Using the Hahn-Banach Theorem, for $x \in X$ we pick $\ell_x \in X^*$ such that $\| \ell_x \| = 1$ and $\ell_x(x) = \|x\|$. Then, setting $A := Df(\overline{x})$, $\alpha(u) := r(u)/\|u\|$ for $u \in X \setminus \{0\}$, $\alpha(0) = 0$, where $r$ is the remainder appearing in (5.5), so that $r(u) = \alpha(u) \|u\| = \alpha(u) \ell_u(u)$ with $\alpha(u) \to 0$ as $u \to 0$, we get
\[ f(\overline{x}+u) - f(\overline{x}) = (A + \alpha(u) \ell_u)(u) = F(w)(w - \overline{x}) \quad \text{with } w := \overline{x} + u \]
for $F(w) := A + \alpha(w - \overline{x}) \ell_{w - \overline{x}} \to A = F(\overline{x})$ as $w \to \overline{x}$. $\square$
Corollary 5.9 Given Banach spaces $X$, $Y$ and an open subset $W$ of $X$, a map $f : W \to Y$ is of class $C^1$ on $W$, i.e. differentiable on $W$ with a continuous derivative $f' : W \to L(X,Y)$, if and only if there exist an open subset $V$ of $W \times W$ containing the diagonal $\{(w,w) : w \in W\}$ and a continuous map $F : V \to L(X,Y)$ such that $f(w) - f(x) = F(w,x)(w-x)$ for all $(w,x) \in V$. In such a case, one has $f'(w) = F(w,w)$ for all $w \in W$.

Proof The sufficient condition is a direct consequence of Lemma 5.6. For the necessary condition one can take $V := \{(w,x) \in W \times W : (1-t)x + tw \in W \ \forall t \in [0,1]\}$ and $F(w,x) := \int_0^1 f'((1-t)x + tw)\,dt$. $\square$

Example Let us keep the notation of the example following Corollary 5.8: $E$, $F$ are Banach spaces, $T$ is a compact interval of $\mathbb{R}$, $U$ is an open subset of $E \times T$ and $g : U \to F$ is continuous and such that for every $(e,t) \in U$ the Gateaux derivative $D_1 g(e,t) = Dg_t(e)$ of $g_t := g(\cdot,t)$ at $e$ exists. This time we assume that the map $D_1 g$ is continuous from $U$ into $L(E,F)$. Again, let us denote by $W$ the set of $w \in R(T,E)$ such that $\mathrm{cl}(\{(w(t),t) : t \in T\}) \subset U$ and by $g^\diamond : W \to R(T,F)$ the map defined by $g^\diamond(w)(t) := g(w(t),t)$ for $w \in W$, $t \in T$, as in Proposition 3.45. Here $R(T,E)$ and $R(T,F)$ are the spaces of regulated functions from $T$ to $E$ and $F$, respectively. Then $g^\diamond : W \to R(T,F)$ is of class $C^1$ and, with a slight abuse of notation, $Dg^\diamond(w) = (D_1 g)^\diamond(w) \in L(R(T,E), R(T,F))$ for all $w \in W$. A similar (and simpler) result holds for $g^\diamond$ considered as a map from $W \cap C(T,E)$ into $C(T,F)$. $\square$

Let us present the chain rule. It is a cornerstone of differential calculus.

Theorem 5.5 (Chain Rule) Let $X$, $Y$, $Z$ be normed spaces, let $U$, $V$ be open subsets of $X$ and $Y$ respectively and let $f : U \to Y$, $g : V \to Z$ be differentiable at $\overline{x} \in U$ and $\overline{y} = f(\overline{x})$ respectively and such that $f(U) \subset V$. Then $h := g \circ f$ is differentiable at $\overline{x}$ and
\[ Dh(\overline{x}) = Dg(\overline{y}) \circ Df(\overline{x}). \tag{5.6} \]
Proof Let $\ell := Df(\overline{x})$, $m := Dg(\overline{y})$ and let $r \in o(X,Y)$, $s \in o(Y,Z)$ be defined by
\[ r(x) := f(\overline{x}+x) - f(\overline{x}) - \ell(x), \qquad s(y) := g(\overline{y}+y) - g(\overline{y}) - m(y). \]
Then, setting $y := \ell(x) + r(x)$ for $x \in U - \overline{x}$, so that $f(\overline{x}+x) = \overline{y} + y$, we get
\[ h(\overline{x}+x) - h(\overline{x}) - m(\ell(x)) = g(\overline{y}+y) - g(\overline{y}) - m(y - r(x)) = s(y) + m(r(x)). \tag{5.7} \]
Lemma 5.5 ensures that $m \circ r \in o(X,Z)$. Now, given $c > \|\ell\|$, there exists some $\rho > 0$ such that for $x \in B(0,\rho)$ one has $\| r(x) \| \le (c - \|\ell\|) \|x\|$, hence $\| \ell(x) + r(x) \| \le c \|x\|$. Then the proof of Lemma 5.5 ensures that $s \circ (\ell + r) \in o(X,Z)$. Thus, the right-hand side $s \circ (\ell + r) + m \circ r$ of (5.7) is a remainder and we conclude that $h$ is differentiable at $\overline{x}$ with derivative the continuous linear map $m \circ \ell$. $\square$

Exercise Give a short proof of Theorem 5.5 using Lemma 5.4.
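In finite dimensions the chain rule (5.6) says that the Jacobian matrix of $g \circ f$ is the product of the Jacobian matrices of $g$ and $f$. The following sketch checks this numerically for one pair of maps; the maps $f$, $g$, the base point and the step size are illustrative assumptions, and the finite-difference comparison is a plausibility check rather than a proof.

```python
import math


def f(x):
    """Illustrative inner map from R^2 to R^2."""
    return (x[0] * x[1], x[0] + x[1] ** 2)


def Df(x):
    """Jacobian matrix of f at x (rows are components of f)."""
    return [[x[1], x[0]], [1.0, 2.0 * x[1]]]


def g(y):
    """Illustrative outer map from R^2 to R."""
    return math.sin(y[0]) + y[0] * y[1]


def Dg(y):
    """Jacobian (a single row) of g at y."""
    return [math.cos(y[0]) + y[1], y[0]]


def h(x):
    """The composition g o f."""
    return g(f(x))


def chain_rule_gradient(x):
    """Row vector Dg(f(x)) multiplied by the matrix Df(x)."""
    row, jac = Dg(f(x)), Df(x)
    return [row[0] * jac[0][j] + row[1] * jac[1][j] for j in range(2)]


def central_difference_gradient(x, step=1e-6):
    """Partial derivatives of h at x approximated by central differences."""
    grad = []
    for j in range(2):
        plus, minus = list(x), list(x)
        plus[j] += step
        minus[j] -= step
        grad.append((h(plus) - h(minus)) / (2.0 * step))
    return grad


x_bar = (0.5, -1.2)
by_chain_rule = chain_rule_gradient(x_bar)
by_differences = central_difference_gradient(x_bar)
max_gap = max(abs(a - b) for a, b in zip(by_chain_rule, by_differences))
```

The two gradients agree up to the discretization and rounding error of the difference quotient.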
The following corollary is a consequence of the fact that the derivative of a continuous linear map $\ell$ at an arbitrary point is $\ell$ itself.

Corollary 5.10 Let $X$, $Y$, $Z$ be normed spaces, let $U$, $V$ be open subsets of $X$ and $Y$ respectively and let $f : U \to Y$, $g : V \to Z$ be such that $f(U) \subset V$ and let $h := g \circ f$.
(a) If $f$ is differentiable at $\overline{x}$ and $V := Y$, $g \in L(Y,Z)$, then $h$ is differentiable at $\overline{x}$ and $Dh(\overline{x}) = g \circ Df(\overline{x})$.
(b) If $g$ is differentiable at $\overline{y} := f(\overline{x})$ and $U := X$, $f \in L(X,Y)$, then $h$ is differentiable at $\overline{x}$ and $Dh(\overline{x}) = Dg(\overline{y}) \circ f$.

Corollary 5.11 The differentiability of $f : W \to Y$ (with $W$ open in $X$) at $\overline{x}$ does not depend on the choices of the norms on $X$ and $Y$ within their equivalence classes.

In fact, changing the norm amounts to composing with the identity map, the source space and the range space being endowed with two different norms.

Corollary 5.12 Let $X$, $Y$ be normed spaces, let $W$ be an open subset of $X$ and let $f : W \to Y$. If $f$ is Fréchet differentiable at $\overline{x} \in W$, then $f$ is Hadamard differentiable at $\overline{x}$. If $X$ is finite dimensional, the converse holds.

Thus, the Mean Value theorems of Sect. 5.2 are in force for Fréchet differentiability. Also, the interpretation of the derivative as a rule for the transformation of velocities remains valid for the Fréchet derivative.

Proof The first assertion follows from the definitions or from Theorem 5.5 and Proposition 5.10. Assuming the dimension of $X$ is finite, let us prove that if $f$ is directionally differentiable at $\overline{x}$, and if its directional derivative $f'(\overline{x},\cdot)$ is continuous, then $r$ given by $r(w) := f(\overline{x}+w) - f(\overline{x}) - f'(\overline{x},w)$ is a remainder. Adding the assumption that $f'(\overline{x},\cdot)$ is linear, this will prove the converse assertion. Suppose, on the contrary, that there exist $\varepsilon > 0$ and a sequence $(w_n) \to 0$ such that, for all $n \in \mathbb{N}$, $\| r(w_n) \| > \varepsilon \|w_n\|$. Then $t_n := \|w_n\|$ is positive; setting $u_n := t_n^{-1} w_n$ and taking a subsequence if necessary, we may suppose the sequence $(u_n)$ converges to some vector $u$ of the unit sphere of $X$. Then, given $\varepsilon' \in \,]0,\varepsilon[$, we can find $k \in \mathbb{N}$ such that for $n \ge k$ we have $\| f'(\overline{x},u_n) - f'(\overline{x},u) \| \le \varepsilon - \varepsilon'$, so that
\[ \left\| \frac{f(\overline{x}+t_n u_n) - f(\overline{x})}{t_n} - f'(\overline{x},u) \right\| > \varepsilon \|u_n\| - \| f'(\overline{x},u_n) - f'(\overline{x},u) \| \ge \varepsilon', \]
contradicting the assumption that $f$ is differentiable at $\overline{x}$ in the direction $u$. $\square$
Another link between directional differentiability and firm differentiability is pointed out in the next statement. A direct proof using Corollary 5.7 is easy (Exercise 8). We present a proof in the case f 0 is continuous around x.
Proposition 5.18 If $f$ is Gateaux differentiable on $W$ and if $f' : W \to L(X,Y)$ is continuous at $\overline{x} \in W$, then $f$ is Fréchet differentiable at $\overline{x}$.

Proof Without loss of generality, replacing $Y$ by its completion, we may suppose $Y$ is complete; replacing $W$ with a ball centered at $\overline{x}$, we may also suppose $W$ is convex. Then, for $x \in W$ one has $f(x) - f(\overline{x}) = F(x)(x - \overline{x})$ with
\[ F(x) := \int_0^1 Df(\overline{x} + t(x - \overline{x}))\,dt, \]
and $F$ is continuous, so that the criterion of Lemma 5.6 applies. $\square$
This result shows that it may be a sensible strategy to start with radial differentiability in order to prove that a map is of class $C^1$, i.e. that it is differentiable with a continuous derivative. For instance, if one is dealing with an integral functional
\[ f(x) := \int_S F(s, x(s))\,ds, \]
where $S$ is some measure space and $x$ belongs to some space of measurable maps, it is advisable to use Lebesgue's Theorem to differentiate inside the integral (under appropriate assumptions) by taking the limit in the quotient
\[ \frac{1}{t} \bigl[ f(x+tu) - f(x) \bigr] = \int_S \frac{1}{t} \bigl[ F(s, x(s) + tu(s)) - F(s, x(s)) \bigr]\,ds . \]
Continuity arguments may be invoked later, for instance by using Krasnoselski's criterion (see Subsection 8.5.2 and [12, 179, 254]). However, there are cases in which such a map is Gateaux differentiable but not Fréchet differentiable. See [92, Example 15.2].

Let us note other consequences of Theorem 5.5.

Proposition 5.19 Let $X$, $Y_1, \ldots, Y_n$ be normed spaces, let $W$ be an open subset of $X$ and let $f := (f_1, \ldots, f_n) : W \to Y := Y_1 \times \ldots \times Y_n$. Then $f$ is differentiable at $\overline{x} \in W$ if and only if its components $f_i : W \to Y_i$ ($i = 1, \ldots, n$) are differentiable at $\overline{x}$ and for $v \in X$
\[ Df(\overline{x})(v) = (Df_1(\overline{x})(v), \ldots, Df_n(\overline{x})(v)). \]

Proof Let $p_i : Y \to Y_i$ denote the $i$th canonical projection. If $f$ is differentiable at $\overline{x}$, then Corollary 5.10 ensures that $f_i := p_i \circ f$ is differentiable at $\overline{x}$ and $Df_i(\overline{x}) = p_i \circ Df(\overline{x})$. Conversely, suppose that $f_1, \ldots, f_n$ are differentiable at $\overline{x}$. Let $r_i \in o(X,Y_i)$ be given by $r_i(x) = f_i(\overline{x}+x) - f_i(\overline{x}) - Df_i(\overline{x})(x)$. Then, by Lemma 5.4, we have that $r := (r_1, \ldots, r_n) \in o(X,Y)$ and $r(x) = f(\overline{x}+x) - f(\overline{x}) - \ell(x)$ for $\ell \in L(X,Y)$ given by $\ell(x) := (Df_1(\overline{x})(x), \ldots, Df_n(\overline{x})(x))$. Thus $f$ is differentiable at $\overline{x}$, with derivative $\ell$. $\square$

Now, let us consider the case when the source space $X$ is a product $X_1 \times \ldots \times X_n$ and $W$ is an open subset of $X$. One says that $f : W \to Y$ has a partial derivative at $\overline{x} \in W$ relative to $X_i$ for some $i \in \mathbb{N}_n$ if the map $f_{i,\overline{x}} : x_i \mapsto f(\overline{x}_1, \ldots, \overline{x}_{i-1}, x_i, \overline{x}_{i+1}, \ldots, \overline{x}_n)$ is differentiable at $\overline{x}_i$. Then, one denotes by $D_i f(\overline{x})$ or $\frac{\partial f}{\partial x_i}(\overline{x})$ the derivative of the map $f_{i,\overline{x}}$ at $\overline{x}_i$. Let $j_i \in L(X_i,X)$ be the insertion given by $j_i(x_i) := (0, \ldots, 0, x_i, 0, \ldots, 0)$. Since the map $f_{i,\overline{x}}$ is just the composition of the affine map $x_i \mapsto j_i(x_i - \overline{x}_i) + \overline{x} = (\overline{x}_1, \ldots, \overline{x}_{i-1}, x_i, \overline{x}_{i+1}, \ldots, \overline{x}_n)$ with $f$, from Corollary 5.10 (b) and the fact that $v = j_1(v_1) + \ldots + j_n(v_n)$, while $D_i f(\overline{x}) = Df_{i,\overline{x}}(\overline{x}_i) = Df(\overline{x}) \circ j_i$, one gets the following proposition.

Proposition 5.20 If $f : W \to Y$ is defined on an open subset $W$ of a product space $X := X_1 \times \ldots \times X_k$ and if $f$ is differentiable at $\overline{x}$, then for $i = 1, \ldots, k$, the map $f$ has a partial derivative at $\overline{x}$ relative to $X_i$ and
\[ \forall v := (v_1, \ldots, v_k) \qquad Df(\overline{x})(v) = D_1 f(\overline{x}) v_1 + \ldots + D_k f(\overline{x}) v_k . \]
When $X := \mathbb{R}^m$, $Y := \mathbb{R}^n$, the matrix $(D_i f_j(\overline{x}))$ of $Df(\overline{x})$ formed with the partial derivatives of the components $(f_j)_{1 \le j \le n}$ of $f$ is called the Jacobian matrix of $f$ at $\overline{x}$. It determines $Df(\overline{x})$.

Note that it may happen that $f$ has partial derivatives at $\overline{x}$ with respect to all its variables but is not differentiable at $\overline{x}$.

Example Let $f : \mathbb{R}^2 \to \mathbb{R}$ be given by $f(r,s) := rs(r^2+s^2)^{-1}$ for $(r,s) \ne (0,0)$ and $f(0,0) = 0$. Since $f(r,0) = 0 = f(0,s)$, $f$ has partial derivatives with respect to its two variables at $(0,0)$. However, $f$ is not continuous at $(0,0)$, hence is not differentiable at $(0,0)$. $\square$

We will shortly present a partial converse of Proposition 5.20. For this purpose (among others) it will be useful to introduce a reinforced notion of differentiability that allows us to formulate several results with assumptions weaker than continuous differentiability.

Definition 5.8 Let $X$ and $Y$ be normed spaces, let $W$ be an open subset of $X$ and let $\overline{x} \in W$. A map $f : W \to Y$ is said to be circa-differentiable (or peri-differentiable, or strictly differentiable) at $\overline{x}$ if there exists some continuous linear map $\ell \in L(X,Y)$ such that, for $x, x' \in W$, one has
\[ \frac{\| f(x) - f(x') - \ell(x-x') \|}{\| x-x' \|} \to 0 \quad \text{as } x, x' \to \overline{x} \text{ with } x' \ne x. \tag{5.8} \]
Taking $x' = \overline{x}$ in relation (5.8), one sees that if $f$ is circa-differentiable at $\overline{x}$, then $f$ is differentiable at $\overline{x}$ and $Df(\overline{x}) = \ell$. If $X_0$ is a linear subspace of $X$ we say that $f$ is circa-differentiable (or strictly differentiable) at $\overline{x}$ with respect to $X_0$ if there exists some continuous linear map $\ell \in L(X_0, Y)$ such that (5.8) holds whenever $x, x' \in W$ satisfy $x - x' \in X_0$. Let us relate the preceding notion to continuous differentiability.

Definition 5.9 The map $f : W \to Y$ will be said to be continuously differentiable at $\overline{x} \in W$, or of class $C^1$ at $\overline{x}$, and we write $f \in C^1_{\overline{x}}(W, Y)$, if $f$ is differentiable on some neighborhood $V \subset W$ of $\overline{x}$ and if the derivative $f' : V \to L(X, Y)$ of $f$ given
by $f'(x) := Df(x)$ for $x \in V$ is continuous at $\overline{x}$. If $f$ is of class $C^1$ at each point $x$ of $W$, then $f$ is said to be of class $C^1$ on $W$ and one writes $f \in C^1(W, Y)$. One says that $f$ is of class $C^k$ with $k \in \mathbb{N}$, $k > 1$, if $f$ is of class $C^1$ and if $f'$ is of class $C^{k-1}$. Then, one writes $f \in C^k(W, Y)$.

Proposition 5.21 Let $X$ and $Y$ be normed spaces, let $W$ be an open subset of $X$ and let $\overline{x} \in W$. A map $f : W \to Y$ which is differentiable on a neighborhood $U \subset W$ of $\overline{x}$ is circa-differentiable at $\overline{x} \in W$ if and only if $f \in C^1_{\overline{x}}(W, Y)$.

Proof Suppose $f \in C^1_{\overline{x}}(W, Y)$ and let $\ell := Df(\overline{x})$. Given $\varepsilon > 0$ one can find $\delta > 0$ such that $B(\overline{x}, \delta) \subset W$ and for $x \in B(\overline{x}, \delta)$ one has $\| Df(x) - \ell \| \le \varepsilon$. Then, using Corollary 5.7, for $x, x' \in B(\overline{x}, \delta)$ one has

$$\| f(x') - f(x) - \ell(x' - x) \| \le \varepsilon \| x' - x \|,$$

so that $f$ is circa-differentiable at $\overline{x}$. Conversely, suppose $f$ is circa-differentiable at $\overline{x}$ and is differentiable on a neighborhood $V$ of $\overline{x}$ contained in $W$. Given $u \in X$ and $\varepsilon > 0$, assuming that the preceding inequality holds whenever $x, x' \in B(\overline{x}, \delta) \subset V$, one gets for all $x \in B(\overline{x}, \delta)$,

$$\| Df(x)(u) - \ell(u) \| = \lim_{t \to 0_+} t^{-1} \| f(x + tu) - f(x) - \ell(tu) \| \le \varepsilon \| u \|,$$

so that $\| Df(x) - \ell \| \le \varepsilon$ and $f' : x \mapsto Df(x)$ is continuous at $\overline{x}$. □
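A standard one-variable example separates plain differentiability from circa-differentiability: $f(x) := x^2 \sin(1/x)$ with $f(0) := 0$ is differentiable at $0$ with $f'(0) = 0$, but $f'$ is not continuous at $0$, so by Proposition 5.21 the map $f$ is not circa-differentiable there. The sketch below evaluates the quotient of (5.8) with $\ell = 0$ along the pairs $x_n := 1/(2\pi n)$, $x'_n := 1/(2\pi n + \pi/2)$; both the example and the choice of sequences are assumptions made for illustration, not taken from the text.

```python
import math


def f(x):
    # f(x) = x^2 sin(1/x), extended by f(0) = 0.
    return x * x * math.sin(1.0 / x) if x != 0.0 else 0.0


def strict_quotient(n):
    # Quotient |f(x) - f(x')| / |x - x'| from (5.8) with l = 0,
    # along x_n = 1/(2 pi n) and x'_n = 1/(2 pi n + pi/2), both tending to 0.
    x = 1.0 / (2.0 * math.pi * n)
    x_prime = 1.0 / (2.0 * math.pi * n + math.pi / 2.0)
    return abs(f(x) - f(x_prime)) / abs(x - x_prime)


def frechet_quotient(x):
    # Quotient |f(x) - f(0)| / |x| for differentiability at 0 with derivative 0.
    return abs(f(x) - f(0.0)) / abs(x)


quotients = [strict_quotient(n) for n in (10, 100, 1000, 10000)]
print(quotients)                 # stays near 2/pi instead of tending to 0
print(frechet_quotient(1e-5))    # bounded by |x|, hence tends to 0
```

A short computation gives the exact value $2\pi n / ((2\pi n + \pi/2)(\pi/2))$ for the first quotient, which tends to $2/\pi$ rather than to $0$.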
We are now in a position to give a converse of Proposition 5.20.

Proposition 5.22 If $f : W \to Y$ is defined on an open subset $W$ of a product space $X := X_1 \times \dots \times X_k$, if for some $i \in \mathbb{N}_k$, $f$ has a partial derivative at $\overline{x} \in W$ relative to $X_i$ and if $f$ is circa-differentiable at $\overline{x}$ with respect to $X_1, \dots, X_{i-1}, X_{i+1}, \dots, X_k$, then $f$ is differentiable at $\overline{x}$. In particular, if $f$ has partial derivatives on some neighborhood of $\overline{x}$, all of which but one are continuous at $\overline{x}$, then $f$ is differentiable at $\overline{x}$.

Proof It suffices to give the proof for $k = 2$; an induction yields the general case. Thus, let $f$ be circa-differentiable at $\overline{x}$ with respect to $X_1$ and have a partial derivative at $\overline{x}$ relative to $X_2$. The first assumption means that there exists some $\ell_1 \in L(X_1, Y)$ such that for every $\varepsilon > 0$ one can find some $\delta > 0$ such that $B(\overline{x}, 2\delta) \subset W$ and for $x := (x_1, x_2) \in B(\overline{x}, \delta)$, $u_1 \in X_1$, $\| u_1 \| \le \delta$ one has

$$\| f(x_1 + u_1, x_2) - f(x_1, x_2) - \ell_1(u_1) \| \le \varepsilon \| u_1 \|. \tag{5.9}$$

Setting $\ell_2 := D_2 f(\overline{x})$ and taking a smaller $\delta > 0$ if necessary, we may suppose that

$$\| f(\overline{x}_1, \overline{x}_2 + u_2) - f(\overline{x}_1, \overline{x}_2) - \ell_2(u_2) \| \le \varepsilon \| u_2 \|$$
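for any $u_2 \in X_2$ satisfying $\| u_2 \| \le \delta$. Then, taking $(x_1, x_2) := (\overline{x}_1, \overline{x}_2 + u_2)$ in (5.9) with $u := (u_1, u_2) \in B(0, \delta)$, we get

$$\| f(\overline{x} + u) - f(\overline{x}) - \ell_1(u_1) - \ell_2(u_2) \| \le \| f(\overline{x} + u) - f(\overline{x}_1, \overline{x}_2 + u_2) - \ell_1(u_1) \| + \| f(\overline{x}_1, \overline{x}_2 + u_2) - f(\overline{x}) - \ell_2(u_2) \| \le \varepsilon \| u_1 \| + \varepsilon \| u_2 \| = \varepsilon \| (u_1, u_2) \|$$

if one takes the norm on $X$ given by $\| (u_1, u_2) \| := \| u_1 \| + \| u_2 \|$. □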
Corollary 5.13 A map $f : W \to Y$ defined on an open subset $W$ of a product space $X := X_1 \times \dots \times X_k$ is of class $C^1$ on $W$ if and only if $f$ has partial derivatives on $W$ that are jointly continuous.

Now, let us give a result dealing with the interchange of limits and differentiation.

Theorem 5.6 Let $(f_n)$ be a sequence of Fréchet (resp. Hadamard) differentiable functions from a bounded, convex, open subset $W$ of a normed space $X$ to a Banach space $Y$. Suppose
(a) there exists some $\overline{x} \in W$ such that $(f_n(\overline{x}))$ converges in $Y$;
(b) the sequence $(f_n')$ uniformly converges on $W$ to some map $g : W \to L(X, Y)$.
Then $(f_n)$ uniformly converges on $W$ to some map $f$ that is Fréchet (resp. Hadamard) differentiable on $W$. Moreover, $f' = g$.

Proof Let us prove the first assertion. Let $r > 0$ be such that $W$ is contained in the ball $B(\overline{x}, r)$. Given $n$, $p$ in $\mathbb{N}$, Corollary 5.6 yields, for every $x \in W$

$$\| f_p(x) - f_p(\overline{x}) - (f_n(x) - f_n(\overline{x})) \| \le \| f_p' - f_n' \|_\infty \, \| x - \overline{x} \| \le r \| f_p' - f_n' \|_\infty, \tag{5.10}$$

$$\| f_p(x) - f_n(x) \| \le \| f_p(\overline{x}) - f_n(\overline{x}) \| + r \| f_p' - f_n' \|_\infty. \tag{5.11}$$

Since $\| f_p' - f_n' \|_\infty \to 0$ as $n, p \to \infty$ and since $(f_p(\overline{x}) - f_n(\overline{x})) \to 0$ as $n, p \to \infty$, we see that $(f_n(x))$ is a Cauchy sequence, hence has a limit in the complete space $Y$; we denote it by $f(x)$. Passing to the limit on $p$ in (5.11) we see that the limit is uniform on $W$.

Now, given $x \in W$, let us prove that $f$ is differentiable at $x$ with derivative $g(x)$. Given $\varepsilon > 0$, we can find $k \in \mathbb{N}$ such that for $p > n \ge k$ one has $\| f_p' - f_n' \|_\infty \le \varepsilon/3$, hence $\| g - f_n' \|_\infty \le \varepsilon/3$. Using again Corollary 5.6 with $x' := x + u \in W$, we get

$$\| (f_p(x + u) - f_p(x)) - (f_n(x + u) - f_n(x)) \| \le (\varepsilon/3) \| u \|$$

and passing to the limit on $p$, we obtain

$$\| f(x + u) - f(x) - (f_n(x + u) - f_n(x)) \| \le (\varepsilon/3) \| u \|. \tag{5.12}$$
In the Fréchet differentiable case, we can find $\delta > 0$ such that $B(x, \delta) \subset W$ and for all $u \in \delta B_X$

$$\| f_k(x + u) - f_k(x) - g(x)(u) \| \le \| f_k(x + u) - f_k(x) - f_k'(x)(u) \| + \| f_k'(x)(u) - g(x)(u) \| \le \frac{\varepsilon}{3} \| u \| + \frac{\varepsilon}{3} \| u \|.$$

Combining this estimate with relation (5.12) in which we take $n = k$, we get

$$\forall u \in \delta B_X \qquad \| f(x + u) - f(x) - g(x)(u) \| \le \varepsilon \| u \|,$$

so that $f$ is Fréchet differentiable at $x$ with derivative $g(x)$.

In the Hadamard differentiable case, given $\varepsilon > 0$ and a unit vector $u$, we take $\delta \in \, ]0, 1[$ such that $B(x, 2\delta) \subset W$ and for $t \in \, ]0, \delta[$, $v \in B(u, \delta)$

$$\| f_k(x + tv) - f_k(x) - g(x)(tu) \| \le \| f_k(x + tv) - f_k(x) - f_k'(x)(tu) \| + \| f_k'(x)(tu) - g(x)(tu) \| \le \frac{\varepsilon}{3} t + \frac{\varepsilon}{3} t.$$

Gathering this estimate with relation (5.12), in which we take $n = k$ and replace $u$ by $tv$, we get

$$\forall (t, v) \in \, ]0, \delta[ \, \times B(u, \delta) \qquad \| f(x + tv) - f(x) - g(x)(tu) \| \le \varepsilon t,$$

so that $f$ is Hadamard differentiable at $x$ and $f'(x) = g(x)$. □
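Assumption (b) of Theorem 5.6 cannot simply be dropped. A classical illustration (an assumption chosen for this sketch, not an example from the text) is $f_n(x) := \sqrt{x^2 + 1/n}$ on $W := \, ]-1, 1[$: the sequence converges uniformly to $x \mapsto |x|$, which is not differentiable at $0$, while the derivatives $f_n'(x) = x / \sqrt{x^2 + 1/n}$ do not form a uniformly Cauchy sequence. The code estimates both facts numerically on a hypothetical grid.

```python
import math


def f_n(n, x):
    return math.sqrt(x * x + 1.0 / n)


def df_n(n, x):
    # Derivative of f_n, computed by hand.
    return x / math.sqrt(x * x + 1.0 / n)


# Sample points of W = ]-1, 1[ (the grid is an illustrative choice; it contains 0).
grid = [i / 1000.0 - 1.0 for i in range(1, 2000)]


def uniform_distance_to_abs(n):
    # Grid estimate of sup |f_n(x) - |x||; the supremum is attained at x = 0.
    return max(abs(f_n(n, x) - abs(x)) for x in grid)


def derivative_gap(n):
    # |f_{4n}'(x) - f_n'(x)| at x = 1/sqrt(n): a lower bound for the uniform
    # distance between f_{4n}' and f_n' on W (valid for n >= 2).
    x = 1.0 / math.sqrt(n)
    return abs(df_n(4 * n, x) - df_n(n, x))


for n in (4, 100, 10000):
    print(n, uniform_distance_to_abs(n), derivative_gap(n))
```

The first column of results behaves like $1/\sqrt{n}$, while the gap between derivatives stays equal to $1/\sqrt{1.25} - 1/\sqrt{2}$ for every $n$, so $(f_n')$ does not converge uniformly on $W$.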
Corollary 5.14 Let $X$, $Y$ be normed spaces, $Y$ being complete, and let $W$ be an open subset of $X$. The space $B^1(W, Y)$ (resp. $BC^1(W, Y)$) of bounded, Lipschitzian, differentiable (resp. of class $C^1$) maps from $W$ to $Y$ is complete with respect to the norm $\| \cdot \|_{1,\infty}$ given by

$$\| f \|_{1,\infty} := \sup_{x \in W} \| f(x) \| + \sup_{x \in W} \| f'(x) \|.$$

Here we use the fact that if $f$ is Lipschitzian and differentiable, its derivative is bounded.

Proof Let $(f_n)$ be a Cauchy sequence in $(B^1(W, Y), \| \cdot \|_{1,\infty})$. Then $(f_n')$ is a Cauchy sequence in the space $B(W, L(X, Y))$ of bounded maps from $W$ into $L(X, Y)$ with respect to the uniform norm; thus it converges and its limit is continuous if $f_n \in BC^1(W, Y)$. Similarly, $(f_n)$ converges in $B(W, Y)$. The theorem ensures that the limit $f$ of $(f_n)$ is Fréchet differentiable and its derivative is the limit of $(f_n')$, hence is bounded. Thus $f$ belongs to $B^1(W, Y)$ and $(f_n) \to f$ for $\| \cdot \|_{1,\infty}$. If $(f_n)$ is contained in $BC^1(W, Y)$, then $f'$ is continuous, hence $f \in BC^1(W, Y)$. □

A directional version follows similarly from Theorem 5.6.
Corollary 5.15 Let $X$, $Y$ be normed spaces, $Y$ being complete, and let $W$ be an open subset of $X$. The space $BH^1(W, Y)$ of bounded, Lipschitzian, Hadamard differentiable maps from $W$ to $Y$ is complete with respect to the norm $\| \cdot \|_{1,\infty}$. The same is true for its subspace $BD^1(W, Y)$ formed by bounded, Lipschitzian maps of class $D^1$.

The following theorem is a deep result that we state without proof. It requires tools from measure theory. We refer to [118] for the proof.

Theorem 5.7 (Rademacher) A locally Lipschitz function $f$ on an open subset of $\mathbb{R}^d$ is differentiable almost everywhere.
Exercises
1. (a) Show that $r : X \to Y$ is a remainder if and only if there exists a remainder $\rho$ on $\mathbb{R}$ such that $\| r(x) \| \le \rho(\| x \|)$ for all $x$ close to $0$. (b) Prove the other two characterizations of remainders that follow the definition.
2. Define a notion of directional remainder that could be used for the study of Hadamard differentiability.
3. Show that when $f : W \to Y$ is Fréchet differentiable at $\overline{x}$, then it is stable at $\overline{x}$ in the sense that there exists a $c > 0$ such that $\| f(\overline{x} + x) - f(\overline{x}) \| \le c \| x \|$ for $\| x \|$ small enough.
4. Give a direct proof that Fréchet differentiability implies Hadamard differentiability.
5. Show that if $f : X_1 \times X_2 \to Y$ is circa-differentiable at $\overline{x} := (\overline{x}_1, \overline{x}_2)$ with respect to $X_1$ and $X_2$, then it is circa-differentiable at $\overline{x}$.
6. In Theorem 5.6, when $W$ is not bounded, assuming that $(f_n')$ converges to $g$ uniformly on bounded subsets of $W$, obtain a similar interchange result in which the convergence of $(f_n)$ to $f$ is uniform on bounded subsets of the open convex set $W$.
7. In Theorem 5.6, assuming that $W$ is a connected open subset of $X$ and that the convergence of $(f_n')$ is locally uniform (in the sense that for every $x \in W$ there exists some ball with center $x$ contained in $W$ on which the convergence of $(f_n')$ is uniform), prove that $(f_n)$ is locally uniformly convergent and that its limit $f$ is differentiable with derivative $g$.
8. Give a proof of Proposition 5.18 assuming only continuity of $f'$ at $\overline{x}$.
9. With the hypothesis of Proposition 5.18 show that the map $f$ is circa-differentiable at $\overline{x}$. Is it of class $C^1$ at $\overline{x}$?
10. Express the chain rule for differentiable maps between $\mathbb{R}^m$, $\mathbb{R}^n$, $\mathbb{R}^p$ in terms of a matrix product for the Jacobians of $f$ and $g$.
11. Using the Hahn-Banach Theorem, show that $f : W \to Y$ is circa-differentiable at $a \in W$ if and only if there exists a map $F : W \times W \to L(X, Y)$ continuous at $(a, a)$ such that $f(u) - f(v) = F(u, v)(u - v)$. Then $f'(a) = F(a, a)$.
12. Show that if X is finite dimensional, f W U ! Y, with U open in X, is of class D1 if and only if f is of class C1 . [Hint: for any element e of a basis of X the map x 7! Df .x/.e/ is continuous when f is of class D1 .] 13. Given normed spaces X, Y and a topology T (or a convergence) on the space of maps from BX to Y, one can define the notion of a T -semiderivative at x of a map f W B.x; r/ ! Y: it consists in requiring that the family of maps . ft /0
q and is Hadamard differentiable at f if p D q, with P0 . f /:h D 1T h, where T WD ft 2 S W f .t/ > 0g and 1T .s/ WD 1 for s 2 T, 1T .s/ WD 0 for s 2 S n T. Give a counter-example showing that P is not necessarily Fréchet differentiable if p D q. [Hint: assume that there exists a sequence .Sn / such that ..Sn // ! 0 and .Sn / > 0 for all n 2 N and set vn WD 1Sn , f WD .1=2/1S and show that .P. f C vn / P. f / vn /= kvn k does not converge to 0 as n ! 1.]
5.4.2 Higher Order Derivatives

Let $X$ and $Y$ be normed spaces and let $f : W \to Y$ be a differentiable map. Since the derivative $f' : W \to L(X, Y)$ of $f$ takes its values in a normed space, requiring its differentiability at some $\overline{x} \in W$ has a meaning. Then we say that $f$ is twice differentiable at $\overline{x}$ and we define the second derivative of $f$ at $\overline{x}$ as the map

$$f''(\overline{x}) := Df'(\overline{x}) \in L(X, L(X, Y)).$$
Given $u$, $v \in X$ and observing that the evaluation map $e_v : L(X, Y) \to Y$ defined by $e_v(\ell) := \ell(v)$ is linear and continuous, the chain rule for derivatives yields

$$(f''(\overline{x})(u))(v) = e_v(Df'(\overline{x}).u) = D(e_v \circ f')(\overline{x})(u) = \lim_{t \to 0_+} \frac{1}{t} \big( f'(\overline{x} + tu)(v) - f'(\overline{x})(v) \big).$$
Such a relation allows us to compute $f''(\overline{x})$. The space $L(X, L(X, Y))$ is isometric to the space $L^2(X; Y) := L^2(X \times X; Y)$ of continuous bilinear maps from $X \times X$ to $Y$ via the map $\Phi : L(X, L(X, Y)) \to L^2(X; Y)$ given by $\Phi(B) := b$ with $b(u, v) := B(u)(v)$; the equalities

$$\| b \|_{L^2(X;Y)} := \sup_{u, v \in S_X} \| b(u, v) \| = \sup_{u \in S_X} \Big( \sup_{v \in S_X} \| B(u)(v) \| \Big) = \sup_{u \in S_X} \| B(u) \| = \| B \|_{L(X, L(X,Y))}$$

being consequences of the definitions. Thus, one can consider $f''(\overline{x})$ as a continuous bilinear map and write $f''(\overline{x})(u, v)$ instead of $(f''(\overline{x})(u))(v)$.

Definition 5.10 A map $f : W \to Y$, where $W$ is an open subset of $X$, is said to be $n$ times differentiable at $\overline{x} \in W$ if $f$ is $n - 1$ times differentiable on an open neighborhood $V$ of $\overline{x}$ contained in $W$ and if the $(n-1)$-th derivative $f^{(n-1)}$ of $f$ is differentiable at $\overline{x}$. Then one sets $f^{(n)}(\overline{x}) := Df^{(n-1)}(\overline{x}) \in L(X, L^{n-1}(X; Y))$, where $L^{n-1}(X; Y) := L^{n-1}(X \times \dots \times X; Y)$. Using the fact that the space $L(X, L^{n-1}(X; Y))$ is isometric to $L^n(X; Y)$ via the map $\Phi_n$ given by $\Phi_n(A)(x_1, \dots, x_n) := A(x_1)(x_2, \dots, x_n)$, $f^{(n)}(\overline{x})$ can be considered as a continuous $n$-linear map from $X^n$ to $Y$. If $f$ is $n$ times differentiable at each point of $W$ and if $f^{(n)}$ is continuous from $W$ into $L^n(X; Y)$, then $f$ is said to be of class $C^n$. It is of class $C^\infty$ if it is of class $C^n$ for all $n \in \mathbb{N} \setminus \{0\}$. For $m \in \mathbb{N} \setminus \{0\}$, $n > m$, one can show that $f^{(n)}$ is the $m$-th derivative of $f^{(n-m)}$.

The next theorem exhibits an important symmetry property of $f''(\overline{x})$.

Theorem 5.8 (Schwarz) Let $f : W \to Y$ be twice differentiable at $\overline{x} \in W$. Then, setting $V := \{ (u, v) \in X \times X : \overline{x} + u \in W, \ \overline{x} + v \in W, \ \overline{x} + u + v \in W \}$,

$$r(u, v) := f(\overline{x} + u + v) - f(\overline{x} + u) - f(\overline{x} + v) + f(\overline{x}) - f''(\overline{x})(u, v),$$

one has $r(u, v) / (\| u \| + \| v \|)^2 \to 0$ as $(u, v) \to (0, 0)$ in $V \setminus \{(0, 0)\}$. Therefore $f''(\overline{x})$ considered as a continuous bilinear map is symmetric. By induction, for $n \ge 2$ one can prove that $f^{(n)}(\overline{x})$ is symmetric in its $n$ variables.

Proof Given $\varepsilon > 0$, the differentiability of $f'$ at $\overline{x}$ yields some $\delta > 0$ such that $\overline{x} + 2\delta B_X \subset W$ and

$$x \in 2\delta B_X \implies \| f'(\overline{x} + x) - f'(\overline{x}) - Df'(\overline{x}).x \| \le \frac{\varepsilon}{2} \| x \|. \tag{5.13}$$
Taking $u \in \delta B_X$ and setting

$$g_u(v) := r(u, v), \qquad v \in \delta B_X,$$

with $r$ as in the statement, we compute $Dg_u(v)$ with the help of the rule $Db(u, \cdot) = b(u, \cdot)$ for the linear map $b(u, \cdot)$:

$$Dg_u(v) = f'(\overline{x} + u + v) - f'(\overline{x} + v) - f''(\overline{x})(u, \cdot) = [ f'(\overline{x} + u + v) - f'(\overline{x}) - f''(\overline{x})(u + v, \cdot) ] - [ f'(\overline{x} + v) - f'(\overline{x}) - f''(\overline{x})(v, \cdot) ],$$

so that, by (5.13), for $v \in \delta B_X$, $\| Dg_u(v) \| \le \varepsilon (\| u \| + \| v \|)$. Applying Corollary 5.6 we get $\| g_u(v) \| = \| g_u(v) - g_u(0) \| \le \varepsilon (\| u \| + \| v \|) \| v \| \le \varepsilon (\| u \| + \| v \|)^2$. This proves the first assertion. The second one follows from the symmetry of $s(u, v) := f(\overline{x} + u + v) - f(\overline{x} + u) - f(\overline{x} + v)$ in $u$ and $v$ and the relation $\| f''(\overline{x})(u, v) - f''(\overline{x})(v, u) \| = \| r(v, u) - r(u, v) \| \le 2\varepsilon (\| u \| + \| v \|)^2$ for $u$, $v \in \delta B_X$ by a homogeneity argument. □

Corollary 5.16 Given normed spaces $X_1$, $X_2$, $Y$, $\overline{x} := (\overline{x}_1, \overline{x}_2) \in W$, an open subset of $X := X_1 \times X_2$, and $f : W \to Y$ that is twice differentiable at $\overline{x}$, the partial derivatives $D_1 f : W \to L(X_1, Y)$ and $D_2 f : W \to L(X_2, Y)$ are differentiable and $D_2 D_1 f(\overline{x})(v_1)(v_2) = D_1 D_2 f(\overline{x})(v_2)(v_1)$ for all $v_1 \in X_1$, $v_2 \in X_2$.

Proof Setting $u_1 := (v_1, 0)$, $u_2 := (0, v_2)$, one has $D_2 D_1 f(\overline{x})(v_1)(v_2) = D(f'(\cdot)u_1)(\overline{x}).u_2$ and $D_1 D_2 f(\overline{x})(v_2)(v_1) = D(f'(\cdot)u_2)(\overline{x}).u_1$. The symmetry of $f''(\overline{x})$ entails the result. □

When $X := \mathbb{R}^d$, for $f : W \to Y$, where $W$ is an open subset of $X$, it is advisable to gather the partial derivatives. Namely, if $f$ is $k$ times differentiable at $\overline{x}$, if $\alpha := (\alpha_1, \dots, \alpha_d) \in \mathbb{N}^d$ with $|\alpha| = k$ for $|\alpha| := \alpha_1 + \dots + \alpha_d$ one writes

$$D^\alpha f(\overline{x}) := D_1^{\alpha_1} \dots D_d^{\alpha_d} f(\overline{x}) := \frac{\partial^{\alpha_1}}{\partial x_1^{\alpha_1}} \dots \frac{\partial^{\alpha_d}}{\partial x_d^{\alpha_d}} f(\overline{x})$$

for these partial derivatives.

Example If $f : X \to Y$ is linear and continuous, then $f$ is of class $C^\infty$ and $f^{(n)} = 0$ for $n \ge 2$ since $f'(x) = f$ for all $x \in X$, so that $f'$ is constant.

Example Let $b : X_1 \times X_2 \to Y$ be a continuous bilinear map. Then $b$ is of class $C^\infty$ and $Db(x_1, x_2)(v_1, v_2) = b(v_1, x_2) + b(x_1, v_2)$, so that $b'$ is linear from $X_1 \times X_2$ into $L(X_1 \times X_2, Y)$, $b^{(2)}(x_1, x_2)$ is independent of $(x_1, x_2)$ and $b^{(n)} = 0$ for $n \ge 3$.

Proposition 5.23 Let $X$, $Y$, $Z$ be normed spaces, let $U$, $V$ be open subsets of $X$ and $Y$ respectively and let $f : U \to Y$, $g : V \to Z$ be $n$ times differentiable at $\overline{x} \in U$ and $\overline{y} := f(\overline{x}) \in V$ respectively, with $f(U) \subset V$. Then $h := g \circ f$ is $n$ times differentiable at $\overline{x}$. If $f$ and $g$ are of class $C^n$, then $g \circ f$ is of class $C^n$.
Proof The result is known for $n = 1$ and one has $h'(x) = g'(f(x)) \circ f'(x)$ for $x \in U$. If $n = 2$, since $h'$ is composed of the map $x \mapsto (f'(x), (g' \circ f)(x))$ with values in the product space $L(X, Y) \times L(Y, Z)$ and of the map $(A, B) \mapsto B \circ A$ from $L(X, Y) \times L(Y, Z)$ into $L(X, Z)$ which is bilinear and continuous, applying Leibniz's rule and the chain rule one sees that $h'$ is differentiable at $\overline{x}$ and for $u_1, u_2 \in X$ one has

$$h''(\overline{x})(u_1, u_2) = g''(f(\overline{x}))\big( f'(\overline{x})u_1, f'(\overline{x})u_2 \big) + g'(f(\overline{x})).f''(\overline{x})(u_1, u_2).$$

More generally, by induction we get that $h'$ is $(n-1)$ times differentiable at $\overline{x}$ (or of class $C^{n-1}$ if $f$ and $g$ are of class $C^n$). □

The preceding result and a change of notation give a means to compute the second derivative $f''(\overline{x})(u_1, u_2)$ at $\overline{x} \in U$ of a map $f : U \to Y$ that is twice differentiable at $\overline{x}$ by reducing the question to the computation of the second derivative of the map $k := f \circ j : \mathbb{R}^2 \to Y$ where $j : \mathbb{R}^2 \to X$ is the affine map defined by $j(r_1, r_2) := \overline{x} + r_1 u_1 + r_2 u_2$: setting $e_1 := (1, 0)$, $e_2 := (0, 1)$ one has

$$f''(\overline{x})(u_1, u_2) = k''(0, 0)(e_1, e_2).$$
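The reduction to $k := f \circ j$ lends itself to a numerical check. In the sketch below, $k''(0,0)(e_1, e_2)$ is the mixed second partial derivative of $k$ at the origin, approximated by a central mixed difference. The polynomial `f`, the point `x_bar`, the directions `u1`, `u2` and the step `h` are hypothetical choices made for illustration, and the Hessian used for comparison was computed by hand.

```python
def f(a, b):
    # Hypothetical polynomial test map R^2 -> R (an assumption, not from the text).
    return a ** 3 * b + b ** 2


def second_derivative(g, x_bar, u1, u2, h=1e-3):
    # k(r1, r2) := g(x_bar + r1*u1 + r2*u2). The mixed central difference below
    # approximates k''(0, 0)(e1, e2), i.e. the mixed second partial of k at (0, 0).
    def k(r1, r2):
        point = [x + r1 * a + r2 * b for x, a, b in zip(x_bar, u1, u2)]
        return g(*point)

    return (k(h, h) - k(h, -h) - k(-h, h) + k(-h, -h)) / (4 * h * h)


x_bar = (1.0, 2.0)
u1 = (1.0, 1.0)
u2 = (2.0, -1.0)

approx = second_derivative(f, x_bar, u1, u2)
swapped = second_derivative(f, x_bar, u2, u1)

# Hessian of f at (a, b) = (1, 2), by hand: [[6ab, 3a^2], [3a^2, 2]] = [[12, 3], [3, 2]].
exact = 12 * u1[0] * u2[0] + 3 * (u1[0] * u2[1] + u1[1] * u2[0]) + 2 * u1[1] * u2[1]
print(approx, swapped, exact)
```

The agreement of `approx` and `swapped` reflects the symmetry asserted by Theorem 5.8; both are close to the hand-computed value, up to discretization error.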
Exercises
1. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be given by $f(r, s) := rs(r^2 - s^2)/(r^2 + s^2)$ for $(r, s) \in \mathbb{R}^2 \setminus \{(0, 0)\}$ and $f(0, 0) = 0$. Show that the four partial derivatives $D_1 D_1 f$, $D_1 D_2 f$, $D_2 D_1 f$, $D_2 D_2 f$ exist at every $(r, s) \in \mathbb{R}^2$ but that $D_1 D_2 f(0, 0) \neq D_2 D_1 f(0, 0)$.
2. Let $f_n$, $g_n : \mathbb{R} \to \mathbb{R}$ be given by $g_n(r) := r/(1 + n|r|)$, $f_n(r) := \int_0^r g_n(s)\,ds$ for $r \in \mathbb{R}$, $n \in \mathbb{N}$. Let $X := c_0$ be the space of sequences $x := (x_n)_n$ such that $\lim_n x_n = 0$. Show that the map $f : (x_n) \mapsto (f_n(x_n))$ belongs to $C^1(X, X)$ and that for all $v \in X$ the map $f'(\cdot)(v)$ is differentiable at $0$ but that $f'$ is not differentiable at $0$.
3. Let $f : X \to Y$ be a map of class $C^2$ between two normed spaces such that $f(tx) = t^2 f(x)$ for all $t \in \mathbb{R}$, $x \in X$. Show that $f$ is quadratic and that in fact $f(x) = (1/2) f''(0)(x, x)$.
4. Let $\varphi : \mathbb{R} \to \mathbb{R}$ be given by $\varphi(r) = \exp(-(1 - r^2)^{-1})$ for $r \in \, ]-1, 1[$, $\varphi(r) = 0$ for $r \in \mathbb{R} \setminus \, ]-1, 1[$. Verify that $\varphi$ is of class $C^\infty$. Let $c := \int_{-1}^{1} \varphi(r)\,dr$. For $f : \mathbb{R} \to \mathbb{R}$ continuous or regulated and $n \in \mathbb{N}$, let $f_n$ be given by $f_n(t) = (n/c) \int_{-\infty}^{\infty} f(s) \varphi(n(t - s))\,ds$. Show that $f_n$ is of class $C^\infty$. Prove that when $f$ is continuous $(f_n)$ converges to $f$ uniformly. Find the limit of $(f_n(t))$ when $f$ is regulated. [Hint: start with the case when $f$ is a step function.]
5. Let $f : \mathbb{R} \to \mathbb{R}$ be a twice differentiable function such that for some $a \in \mathbb{R}$ one has $f(r) \ge 0$, $f'(r) > 0$, and $f''(r) \le 0$ for all $r \in [a, \infty[$. Show that $f(t) \ge f'(t)(t - a)/2$ for all $t \in [a, \infty[$ and $f(t) \ge f'(t)t/4$ for $t \in [2a, \infty[$. [Hint: use the function $g$ given by $g(t) := (t - a) f(t) - (1/2)(t - a)^2 f'(t)$ for $t \in [a, \infty[$.]
6. Given normed spaces $Y_1$, $Y_2$, $Z$, a bilinear product $b : Y_1 \times Y_2 \to Z$, and maps $f_1 : U \to Y_1$, $f_2 : U \to Y_2$ that are twice differentiable at $\overline{x} \in U$, show that $h := b \circ (f_1, f_2)$ is twice differentiable at $\overline{x}$ and give the expression of $h''(\overline{x})(u_1, u_2)$.
7. Given an open subset $W$ of a normed vector space $X$, the Lie bracket of two vector fields $f$, $g$ in $C^1(W, X)$ is the map $h := [f, g] \in C(W, X)$ given by

$$h(w) := Df(w)g(w) - Dg(w)f(w), \qquad w \in W.$$

Verify the following properties in which $f$, $g$, $h \in C^1(W, X)$:
(a) $[g, f] = -[f, g]$;
(b) $[rf + sg, h] = r[f, h] + s[g, h]$ for $r, s \in \mathbb{R}$;
(c) $[\varphi f, \psi g] = \varphi \psi [f, g] + \psi (D\varphi.g) f - \varphi (D\psi.f) g$ for $\varphi$, $\psi \in C^1(W, \mathbb{R})$;
(d) $[[f, g], h] + [[g, h], f] + [[h, f], g] = 0$ for $f$, $g$, $h \in C^2(W, X)$.
5.4.3 Taylor's Formulas

Higher order derivatives enable us to give precise approximations of a map. Such approximations, which are so useful in applications, can be given in different forms. We start with a result of a pointwise character that resembles the definition of Fréchet differentiability. Throughout this subsection $X$ and $Y$ are normed spaces, $\overline{x}$ is a point of an open subset $W$ of $X$, and $f : W \to Y$ is a map.

Theorem 5.9 If $f : W \to Y$ is $n$ times differentiable at $\overline{x}$, with $n \in \mathbb{N} \setminus \{0\}$, then setting

$$r_n(x) := f(\overline{x} + x) - f(\overline{x}) - \sum_{k=1}^{n} \frac{1}{k!} f^{(k)}(\overline{x})(x, \dots, x) \tag{5.14}$$

one defines a remainder of order $n$, i.e. $\lim_{x (\neq 0) \to 0} r_n(x) / \| x \|^n = 0$, in symbols $r_n(x) = o(\| x \|^n)$.

Proof For $n = 1$, this is just the definition of differentiability. Let us prove the result by induction. Example (g) after the definition of differentiability ensures that for $k \ge 2$ the derivative of the polynomial function $x \mapsto (1/k!) f^{(k)}(\overline{x})(x^{(k)})$ (where $x^{(k)}$ stands for $(x, \dots, x)$ with $k$ entries) is the function $x \mapsto (1/(k-1)!) f^{(k)}(\overline{x})(x^{(k-1)}, \cdot) \in L(X, Y)$. Thus

$$Dr_n(x) = f'(\overline{x} + x) - f'(\overline{x}) - \sum_{k=2}^{n} \frac{1}{(k-1)!} f^{(k)}(\overline{x})(x^{(k-1)}, \cdot).$$

This is the remainder $r'_{n-1}$ associated with $f'$. Then induction ensures that for all $\varepsilon > 0$ one can find $\delta > 0$ such that $\| r'_{n-1}(x) \| \le \varepsilon \| x \|^{n-1}$ for all $x \in \delta B_X$. Then, the Mean
Value Theorem applied to $r_n$ in the ball $\| x \| B_X$ yields $\| r_n(x) \| = \| r_n(x) - r_n(0) \| \le \varepsilon \| x \|^n$ for all $x \in \delta B_X$. □

One can obtain more precise estimates by making more stringent assumptions. We need some preliminaries.

Lemma 5.7 Let $E$, $F$, $G$ be normed spaces and let $b : E \times F \to G$ be a continuous bilinear map denoted by $(u, v) \mapsto u \ast v$. Given an open interval $T$ of $\mathbb{R}$ and $n + 1$ times differentiable functions $e : T \to E$, $f : T \to F$, the derivative of the function $g : T \to G$ defined by

$$g(t) := \sum_{k=0}^{n} (-1)^{n-k} e^{(n-k)}(t) \ast f^{(k)}(t)$$

is $t \mapsto (-1)^n e^{(n+1)}(t) \ast f(t) + e(t) \ast f^{(n+1)}(t)$. In particular, the derivative of the function

$$g : t \mapsto \sum_{k=0}^{n} \frac{1}{k!} (1 - t)^k f^{(k)}(t)$$

is $(1/n!)(1 - t)^n f^{(n+1)}(t)$.

Proof Applying Leibniz's rule to $t \mapsto e^{(n-k)}(t) \ast f^{(k)}(t)$ and noting the cancellations we are left with the conclusion of the first assertion. For the second one we take $E := \mathbb{R}$, $e(t) := (1 - t)^n / n!$, $b(r, x) := rx$ for $(r, x) \in \mathbb{R} \times F$. Then $(-1)^{n-k} e^{(n-k)}(t) = (1 - t)^k / k!$ and $e^{(n+1)}(t) = 0$, hence the result. □

Theorem 5.10 (Taylor's Theorem with Lagrange's Remainder) Let $f : W \to Y$ be $n + 1$ times differentiable on $W$ and let $\overline{x} \in W$, $x \in X$ be such that the segment $[\overline{x}, \overline{x} + x]$ is contained in $W$. If $\| f^{(n+1)}(w) \|$ is bounded above by some $c \in \mathbb{R}_+$ for all $w \in W$, then the remainder $r_n$ defined by relation (5.14) satisfies

$$\| r_n(x) \| \le \frac{c}{(n+1)!} \| x \|^{n+1}.$$

Proof For $x \in X$ such that $[\overline{x}, \overline{x} + x] \subset W$ and $t \in [0, 1]$, let us set $f_x(t) := f(\overline{x} + tx)$. Then for $k \in \mathbb{N}_{n+1}$ we have $f_x^{(k)}(t) = f^{(k)}(\overline{x} + tx)(x^{(k)})$, hence $\| (1/n!)(1 - t)^n f_x^{(n+1)}(t) \| \le (c/n!)(1 - t)^n \| x \|^{n+1}$. Applying the Mean Value Theorem to the functions $g_x : t \mapsto \sum_{k=0}^{n} (1/k!)(1 - t)^k f_x^{(k)}(t)$, whose derivative is $(1/n!)(1 - t)^n f_x^{(n+1)}(t)$, and $h_x : t \mapsto -(1 - t)^{n+1} (c/(n+1)!) \| x \|^{n+1}$, whose derivative is $(c/n!)(1 - t)^n \| x \|^{n+1}$, we get

$$\| r_n(x) \| = \| g_x(1) - g_x(0) \| \le h_x(1) - h_x(0) = \frac{c}{(n+1)!} \| x \|^{n+1}$$

as required. □
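The estimate of Theorem 5.10 is easy to test on the real line. The sketch below takes $f := \sin$ (an illustrative choice, not from the text), for which every derivative is bounded by $c = 1$ and $\sin^{(k)}(t) = \sin(t + k\pi/2)$, and compares $|r_n(x)|$ from (5.14) with the bound $c\,|x|^{n+1}/(n+1)!$; the values of `x_bar` and `x` are hypothetical.

```python
import math


def sin_derivative(k, t):
    # k-th derivative of sin at t: sin^(k)(t) = sin(t + k*pi/2).
    return math.sin(t + k * math.pi / 2)


def taylor_remainder(n, x_bar, x):
    # r_n(x) from (5.14), specialised to f = sin on the real line.
    poly = sum(sin_derivative(k, x_bar) * x ** k / math.factorial(k)
               for k in range(1, n + 1))
    return math.sin(x_bar + x) - math.sin(x_bar) - poly


def lagrange_bound(n, x, c=1.0):
    # Right-hand side c |x|^(n+1) / (n+1)! of Theorem 5.10; c = 1 works for sin.
    return c * abs(x) ** (n + 1) / math.factorial(n + 1)


x_bar, x = 0.3, 0.8
rows = [(n, abs(taylor_remainder(n, x_bar, x)), lagrange_bound(n, x))
        for n in range(1, 8)]
for n, remainder, bound in rows:
    print(n, remainder, bound)
```

In each printed row the remainder stays below the Lagrange bound, and both decrease rapidly with $n$.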
Theorem 5.11 (Taylor's Theorem with Integral Remainder) Let $f : W \to Y$ be $n + 1$ times continuously differentiable on $W$ and let $\overline{x} \in W$, $x \in X$ be such that
the segment $[\overline{x}, \overline{x} + x]$ is contained in $W$. Then the remainder $r_n$ defined by (5.14) satisfies

$$r_n(x) = \int_0^1 \frac{(1 - t)^n}{n!} f^{(n+1)}(\overline{x} + tx)(x^{(n+1)})\,dt.$$

Proof With the notation of the preceding proof, this follows from the relations $r_n(x) = g_x(1) - g_x(0) = \int_0^1 g_x'(t)\,dt$ and $g_x'(t) = (1/n!)(1 - t)^n f_x^{(n+1)}(t) = (1/n!)(1 - t)^n f^{(n+1)}(\overline{x} + tx)(x^{(n+1)})$. □

Let us quote the next result, which is outside the scope of the present book.

Theorem 5.12 (Aleksandrov) Let $f : \mathbb{R}^d \to \mathbb{R}$ be a convex function. Then for some set $N$ of null measure in $\mathbb{R}^d$ and all $x \in \mathbb{R}^d \setminus N$, $f$ admits a second-order Taylor expansion at $x$.
Exercises
1. Given $k \in \mathbb{N} \setminus \{0\}$, Banach spaces $X$ and $Y$ and an open subset $W$ of $X$, show that the space $BC^k(W, Y)$ of maps of class $C^k$ from $W$ to $Y$ whose derivatives are bounded is a Banach space with respect to the norm $\| \cdot \|_{C^k}$ given by $\| f \|_{C^k} := \sup_{0 \le h \le k} \| D^h f \|_\infty$.
2. Let $g : T \to \mathbb{R}$, with $T := \, ]-a, a[$, be odd and five times differentiable. Show that for all $x \in T$ there exists some $y \in \, ]0, a[$ such that

$$g(x) = \frac{x}{3} \big( g'(x) + 2g'(0) \big) - \frac{x^5}{180} g^{(5)}(y).$$

3. (Simpson's formula) Let $f : [a, b] \to \mathbb{R}$ be five times differentiable. Deduce from the preceding exercise that there exists some $c \in \, ]a, b[$ such that

$$f(b) - f(a) = \frac{b - a}{6} \big[ f'(a) + f'(b) + 4f'((a + b)/2) \big] - \frac{(b - a)^5}{2880} f^{(5)}(c).$$
5.4.4 Differentiable Partitions of Unity

We intend to show that the family of functions of class $C^\infty$ on $\mathbb{R}^d$ is rich enough. We start with an analogue of Urysohn's lemma. Then we describe a useful tool known as a partition of unity.

Lemma 5.8 Let $A$, $B$ be two disjoint closed subsets of $\mathbb{R}^d$ such that

$$\operatorname{gap}(A, B) := \inf \{ \| a - b \| : a \in A, \ b \in B \} > 0.$$
Then, there exists some $f \in C^\infty(\mathbb{R}^d)$ such that $f(a) = 1$, $f(b) = 0$ for all $a \in A$, $b \in B$ and $f(x) \in [0, 1]$ for all $x \in \mathbb{R}^d$.

Proof Let $\varphi : \mathbb{R}^d \to \mathbb{R}$ be defined by

$$\varphi(x) := c\, e^{1/(\| x \|^2 - 1)} \ \text{ for } x \in B(0, 1), \qquad \varphi(x) = 0 \ \text{ for } x \in \mathbb{R}^d \setminus B(0, 1),$$

where $c > 0$ is chosen in such a way that $\int \varphi = 1$. Given $r \in \mathbb{P}$, $r < (1/2) \operatorname{gap}(A, B)$ and setting $C := B(A, r) := \{ x \in \mathbb{R}^d : d(x, A) < r \}$, $f(x) = r^{-d} \int_C \varphi((x - y)/r)\,dy$, we get the required function (we use here a derivation result for integrals depending on a parameter; the factor $r^{-d}$ normalizes the integral so that $f = 1$ on $A$). □

Proposition 5.24 Let $(U_1, \dots, U_k)$ be a finite open covering of a compact subset $K$ of $\mathbb{R}^d$. Then one can find nonnegative functions $(p_i)_{i \in \mathbb{N}_k}$ of class $C^\infty$ such that $\sum_{i=1}^{k} p_i \le 1$, $\sum_{i=1}^{k} p_i = 1$ on $K$ and $\operatorname{supp} p_i := \operatorname{cl}(\{ x : p_i(x) \neq 0 \})$ is a compact subset of $U_i$ for $i \in \mathbb{N}_k$.

Proof For each $x \in K$ we pick an open neighborhood $W_x$ whose closure $\operatorname{cl}(W_x)$ is a compact subset of some $U_{i(x)}$. Let $x_1, \dots, x_n$ be a finite family of points of $K$ such that $W_{x_1}, \dots, W_{x_n}$ is a covering of $K$. For $i \in \mathbb{N}_k$ let $V_i$ be the union of the sets $W_{x_j}$ whose closures are contained in $U_i$. Then $A_i := \operatorname{cl}(V_i)$ is a compact subset of $U_i$, so that $\operatorname{gap}(A_i, \mathbb{R}^d \setminus U_i) > 0$. Let $q_i \in C^\infty(\mathbb{R}^d)$ be such that $q_i = 1$ on $A_i$, $q_i = 0$ on $\mathbb{R}^d \setminus U_i$ and $q_i(\mathbb{R}^d) \subset [0, 1]$; we may require that $\operatorname{supp} q_i$ is compact. Setting

$$p_1 := q_1, \quad p_2 := q_2 (1 - q_1), \quad \dots, \quad p_k := q_k (1 - q_1) \dots (1 - q_{k-1}),$$

by induction we see that

$$p_1 + \dots + p_k = 1 - (1 - q_1) \dots (1 - q_k).$$

This relation implies that $p_1 + \dots + p_k = 1$ on $K$ (and in fact on the union of the $A_i$'s). □

A family $(S_i)_{i \in I}$ of subsets of $\mathbb{R}^d$ (or a topological space) is said to be locally finite if for all $x \in \mathbb{R}^d$ there exists a neighborhood $V$ of $x$ and a finite subset $I(x)$ of $I$ such that $S_i \cap V = \emptyset$ for all $i \in I \setminus I(x)$. A family $(p_i)_{i \in I}$ of nonnegative functions of class $C^\infty$ on an open subset $U$ of $\mathbb{R}^d$ is called a partition of unity of $U$ if the family $(\operatorname{supp} p_i)_{i \in I}$ is locally finite and if $\sum_{i \in I} p_i(x) = 1$ for all $x \in U$. A partition of unity $(p_i)_{i \in I}$ is said to be subordinated to a covering $(U_j)_{j \in J}$ of open subsets of $U$ if for all $i \in I$ there exists some $j(i) \in J$ such that $\operatorname{supp} p_i \subset U_{j(i)}$. We admit the following result (see [182]). Some steps of its proof have been presented above.
Theorem 5.13 Given an open subset $U$ of $\mathbb{R}^d$ and an open covering $(U_j)_{j \in J}$ of $U$ there exists a countable partition of unity $(p_i)_{i \in I}$ subordinated to the covering $(U_j)_{j \in J}$.
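The inductive construction used in the proof of Proposition 5.24 can be tried out on the real line. In the sketch below the cutoff functions $q_i$ are not obtained by the convolution of Lemma 5.8 but by a different, swapped-in device: a smooth step built from $t \mapsto e^{-1/t}$. The intervals $U_i$, the margin and the sample grid of $K = [0, 3]$ are hypothetical values chosen for illustration.

```python
import math


def h(t):
    # Smooth on R and equal to 0 for t <= 0 (classical exp(-1/t) building block).
    return math.exp(-1.0 / t) if t > 0.0 else 0.0


def smooth_step(t):
    # Smooth, equal to 0 for t <= 0 and to 1 for t >= 1, with values in [0, 1].
    return h(t) / (h(t) + h(1.0 - t))


def cutoff(a, b, margin):
    # q = 1 on [a + margin, b - margin], q = 0 outside ]a, b[.
    def q(x):
        return smooth_step((x - a) / margin) * smooth_step((b - x) / margin)
    return q


# Hypothetical covering of K = [0, 3] by U_1, U_2, U_3 (illustrative values only).
intervals = [(-1.0, 1.5), (0.5, 2.5), (1.5, 4.0)]
qs = [cutoff(a, b, 0.4) for a, b in intervals]


def partition(x):
    # p_1 := q_1 and p_i := q_i (1 - q_1) ... (1 - q_{i-1}), as in the proof.
    values, remaining = [], 1.0
    for q in qs:
        qx = q(x)
        values.append(qx * remaining)
        remaining *= 1.0 - qx
    return values


K = [3.0 * i / 300 for i in range(301)]
print(max(abs(sum(partition(x)) - 1.0) for x in K))   # close to 0 on K
```

Since the sets where $q_i = 1$ cover $K$ in this example, the product $(1 - q_1)(1 - q_2)(1 - q_3)$ vanishes on $K$, which is exactly why the $p_i$ sum to $1$ there.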
5.5 Solving Equations and Inverting Maps

In this section, we show that simple methods linked with differentiability notions lead to efficient ways of solving nonlinear systems or vectorial equations

$$f(x) = y. \tag{5.15}$$

Here $X$ and $Y$ are Banach spaces, $W$ is an open subset of $X$, $y \in Y$, and $f : W \to Y$ is a map. We start with a classical constructive algorithm.
5.5.1 Newton's Method

Newton's method is an iterative process that relies on a notion of approximation by a linear map that slightly differs from differentiability. We formulate it in the next definition.

Definition 5.11 The map $f : W \to Y$ has a Newton approximation at $\overline{x} \in W$ if there exist $r > 0$, $\alpha > 0$ and a map $A : B(\overline{x}, r) \to L(X, Y)$ such that $B(\overline{x}, r) \subset W$ and

$$\forall x \in B(\overline{x}, r) \qquad \| f(x) - f(\overline{x}) - A(x)(x - \overline{x}) \| \le \alpha \| x - \overline{x} \|. \tag{5.16}$$

A map $A : V \to L(X, Y)$ is a slant derivative of $f$ at $\overline{x}$ if $V$ is a neighborhood of $\overline{x}$ contained in $W$ and if for any $\alpha > 0$ there exists some $r > 0$ such that $B(\overline{x}, r) \subset V$ and relation (5.16) holds.

Thus $f$ is differentiable at $\overline{x}$ if and only if $f$ has a slant derivative at $\overline{x}$ that is constant on some neighborhood of $\overline{x}$. But condition (5.16) is much less demanding, as the next lemma shows.

Lemma 5.9 The following assertions about a map $f : W \to Y$ are equivalent:
(a) $f$ has a Newton approximation $A$ that is bounded near $\overline{x}$;
(b) $f$ is stable at $\overline{x}$, i.e. there exist $c > 0$, $r > 0$ such that

$$\forall x \in B(\overline{x}, r) \qquad \| f(x) - f(\overline{x}) \| \le c \| x - \overline{x} \|; \tag{5.17}$$

(c) $f$ has a slant derivative $A$ at $\overline{x}$ that is bounded on some neighborhood of $\overline{x}$.

Proof (a)$\Rightarrow$(b) If for some $\alpha, \beta > 0$ and some $r > 0$ a map $A : B(\overline{x}, r) \to L(X, Y)$ is such that (5.16) holds with $\| A(x) \| \le \beta$ for all $x \in B(\overline{x}, r)$, then, by the triangle inequality, relation (5.17) holds with $c := \alpha + \beta$.

(b)$\Rightarrow$(c) We use a corollary of the Hahn-Banach Theorem asserting the existence of some map $S : X \to X^*$ such that $S(x)(x) = \| x \|$ and $\| S(x) \| = 1$ for all $x \in X$. Suppose (5.17) holds. Then, setting $A(\overline{x}) = 0$ and, for $x \in W \setminus \{\overline{x}\}$, $u \in X$,

$$A(x)(u) = \langle S(x - \overline{x}), u \rangle \frac{f(x) - f(\overline{x})}{\| x - \overline{x} \|},$$
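we easily check that $\| A(x) \| \le c$ for all $x \in B(\overline{x}, r)$ and that $A(x)(x - \overline{x}) = f(x) - f(\overline{x})$ for all $x \in W$, so that (5.16) holds with $\alpha = 0$ and $A$ is a slant derivative of $f$ at $\overline{x}$.

(c)$\Rightarrow$(a) is clear since a slant derivative of $f$ at $\overline{x}$ is a Newton approximation of $f$ at $\overline{x}$. □

In the elementary method that follows, we first assume that (5.15) has a solution $\overline{x}$.

Proposition 5.25 Let $\overline{x}$ be a solution to equation (5.15) with $y = 0$, let $\alpha, \beta, r > 0$ satisfying $\gamma := \alpha\beta < 1$ and let $A : B(\overline{x}, r) \to L(X, Y)$ be such that (5.16) holds, $A(x)$ being invertible with $\| A(x)^{-1} \| \le \beta$ for all $x \in B(\overline{x}, r)$. Then, for any initial point $x_0 \in B(\overline{x}, r)$, the sequence $(x_n)$ given by

$$x_{n+1} := x_n - A(x_n)^{-1}(f(x_n)) \tag{5.18}$$

is well defined and converges linearly to $\overline{x}$ with rate $\gamma$ in the sense that $\| x_{n+1} - \overline{x} \| \le \gamma \| x_n - \overline{x} \|$ for all $n \in \mathbb{N}$. Thus, setting $c := \| x_0 - \overline{x} \|$, one has $\| x_n - \overline{x} \| \le c \gamma^n$. It follows that if $A$ is a slant derivative of $f$ at $\overline{x}$, $(x_n)$ converges superlinearly to $\overline{x}$: for all $\varepsilon > 0$ there is some $k \in \mathbb{N}$ such that $\| x_{n+1} - \overline{x} \| \le \varepsilon \| x_n - \overline{x} \|$ for all $n \ge k$.

Proof Using the fact that $f(\overline{x}) = 0$, so that

$$x_{n+1} - \overline{x} = A(x_n)^{-1} \big( f(\overline{x}) - f(x_n) + A(x_n)(x_n - \overline{x}) \big),$$

we inductively obtain that

$$\| x_{n+1} - \overline{x} \| \le \beta \| f(x_n) - f(\overline{x}) - A(x_n)(x_n - \overline{x}) \| \le \alpha\beta \| x_n - \overline{x} \|,$$

so that $x_{n+1} \in B(\overline{x}, r)$: the whole sequence $(x_n)$ is well defined and converges to $\overline{x}$. □

Under reinforced assumptions, one can show the existence of a solution.

Theorem 5.14 (Kantorovich) Let $x_0 \in W$, $\alpha, \beta > 0$, $r > 0$ with $\gamma := \alpha\beta < 1$, $B(x_0, r) \subset W$ and let $A : B(x_0, r) \to L(X, Y)$ be such that for all $x \in B(x_0, r)$ the map $A(x) : X \to Y$ has a right inverse $B(x) : Y \to X$ satisfying $\| B(x)(\cdot) \| \le \beta \| \cdot \|$ and

$$\forall w, x \in B(x_0, r) \qquad \| f(w) - f(x) - A(x)(w - x) \| \le \alpha \| w - x \|. \tag{5.19}$$

If $\| f(x_0) \| < \beta^{-1}(1 - \gamma) r$ and if $f$ is continuous, the sequence $(x_n)$ given by the Newton iteration

$$x_{n+1} := x_n - B(x_n)(f(x_n)) \tag{5.20}$$

is well defined and converges to a solution $\overline{x}$ of equation (5.15) with $y = 0$. Moreover, one has $\| x_n - \overline{x} \| \le r \gamma^n$ for all $n \in \mathbb{N}$ and $\| \overline{x} - x_0 \| \le \beta (1 - \gamma)^{-1} \| f(x_0) \| < r$.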
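A small numerical experiment shows the iteration (5.20) at work in $\mathbb{R}^2$. This is only a sketch under stated assumptions: the system `f`, the starting point and the stopping tolerance are hypothetical, $A(x)$ is taken to be the Jacobian $Df(x)$ and $B(x)$ its inverse (computed by Cramer's rule), and the constants $\alpha$, $\beta$, $r$ of the theorem are not verified.

```python
import math


def f(x, y):
    # Hypothetical system f(x, y) = 0 (an assumption, not from the text):
    # the circle of radius 2 intersected with the diagonal x = y.
    return (x * x + y * y - 4.0, x - y)


def jacobian(x, y):
    # A(x) := Df(x), written as the rows of a 2x2 matrix.
    return ((2.0 * x, 2.0 * y), (1.0, -1.0))


def apply_inverse(a, rhs):
    # B(x)(rhs) := A(x)^{-1} rhs by Cramer's rule; assumes det A(x) != 0.
    (a11, a12), (a21, a22) = a
    det = a11 * a22 - a12 * a21
    return ((rhs[0] * a22 - a12 * rhs[1]) / det,
            (a11 * rhs[1] - a21 * rhs[0]) / det)


def newton(x0, steps=20, tol=1e-14):
    # Iteration (5.20): x_{n+1} := x_n - B(x_n)(f(x_n)).
    x, history = x0, [x0]
    for _ in range(steps):
        fx = f(*x)
        if math.hypot(*fx) <= tol:
            break
        dx = apply_inverse(jacobian(*x), fx)
        x = (x[0] - dx[0], x[1] - dx[1])
        history.append(x)
    return x, history


solution, history = newton((1.5, 1.2))
root = math.sqrt(2.0)
errors = [math.hypot(p[0] - root, p[1] - root) for p in history]
print(solution)
print(errors)
```

Here the iterates approach $(\sqrt{2}, \sqrt{2})$ very quickly, faster than the linear rate guaranteed by the theorem, since the Jacobian is a slant derivative of this smooth map.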
Here $B(x)$ is a right inverse of $A(x)$ if $A(x) \circ B(x) = I_Y$; $B(x)$ is not supposed to be linear.

Proof Let us prove by induction that $x_n \in B(x_0, r)$, $\| x_{n+1} - x_n \| \le \beta \gamma^n \| f(x_0) \|$ and $\| f(x_n) \| \le \gamma^n \| f(x_0) \|$. For $n = 0$ these relations are obvious. Assuming they are valid for $n < k$, we get

$$\| x_k - x_0 \| \le \sum_{n=0}^{k-1} \| x_{n+1} - x_n \| \le \beta \| f(x_0) \| \sum_{n=0}^{\infty} \gamma^n = \beta \| f(x_0) \| (1 - \gamma)^{-1} < r,$$

or $x_k \in B(x_0, r)$ and, since $f(x_{k-1}) + A(x_{k-1})(x_k - x_{k-1}) = 0$, from (5.20), (5.19),

$$\| f(x_k) \| = \| f(x_k) - f(x_{k-1}) - A(x_{k-1})(x_k - x_{k-1}) \| \le \alpha \| x_k - x_{k-1} \| \le \gamma^k \| f(x_0) \|,$$

$$\| x_{k+1} - x_k \| \le \beta \| f(x_k) \| \le \beta \gamma^k \| f(x_0) \|.$$

Since $\gamma < 1$, the sequence $(x_n)$ is a Cauchy sequence, hence converges to some $\overline{x} \in X$ satisfying $\| \overline{x} - x_0 \| \le \beta \| f(x_0) \| (1 - \gamma)^{-1} < r$. Moreover, by continuity of $f$, we get $f(\overline{x}) = \lim_n f(x_n) = 0$. Finally,

$$\| x_n - \overline{x} \| \le \lim_{p \to +\infty} \| x_n - x_p \| \le \lim_{p \to +\infty} \sum_{k=n}^{p-1} \| x_{k+1} - x_k \| \le r \gamma^n.$$

□

From Kantorovich's Theorem we deduce a result that is the root of many important developments in nonlinear analysis.

Theorem 5.15 (Lyusternik-Graves Theorem) Let $X$ and $Y$ be Banach spaces, let $W$ be an open subset of $X$ and let $g : W \to Y$ be circa-differentiable at some $\overline{x} \in W$ with a surjective derivative $Dg(\overline{x})$. Then $g$ is open at $\overline{x}$. More precisely, there exist some $\rho$, $\sigma$, $\kappa > 0$ such that $g$ has a right inverse $h : B(g(\overline{x}), \sigma) \to W$ satisfying $\| h(y) - \overline{x} \| \le \kappa \| y - g(\overline{x}) \|$ and

$$\forall (w, y) \in B(\overline{x}, \rho) \times B(g(\overline{x}), \sigma) \quad \exists x \in W : \ g(x) = y, \ \| x - w \| \le \kappa \| y - g(w) \|. \tag{5.21}$$
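Proof Let $A : W \to L(X, Y)$ be the constant map with value $A := Dg(\overline{x})$ (we use a familiar abuse of notation). The Open Mapping Theorem yields some $\beta > 0$ and some right inverse $B : Y \to X$ of $A$ such that $\| B(\cdot) \| \le \beta \| \cdot \|$. Let $\alpha, r > 0$ be such that $\gamma := \alpha\beta < 1$, $B(\overline{x}, 2r) \subset W$ and

$$\forall w, x \in B(\overline{x}, 2r) \qquad \| g(w) - g(x) - Dg(\overline{x})(w - x) \| \le \alpha \| w - x \|. \tag{5.22}$$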
Let ; > 0 be such that C < ˇ 1 .1 /r, and let 20; r be such that g.w/ 2 B.g.x/; / for all w 2 B.x; /. Given w 2 B.x; /, y 2 B.g.x/; /, let us set f .x/ WD g.x/ y for x 2 B.x; /, so that k f .w/k kg.w/ g.x/k C kg.x/ yk < C < ˇ 1 .1 /r and, by (5.22), (5.19) holds in the ball B.x0 ; /, with x0 WD w. Using the estimate kx x0 k ˇ k f .x0 /k .1 /1 < r obtained in the proof of Kantorovich’s Theorem for a solution x of the equation f .x/ D 0, we get some x 2 W such that g.x/ D y, kx wk kg.w/ yk with WD ˇ.1 /1 . The right inverse h is obtained by taking w WD x in (5.21). t u
Exercises

1. Let $X$ and $Y$ be Banach spaces, let $\bar{x} \in X$, $b, c, r > 0$, $W := B(\bar{x}, r)$, $f : W \to Y$ be of class $C^1$ and such that $f'$ is Lipschitzian with rate $c$ on $W$ and $\|f'(w)^{\top}(y^*)\| \ge b \|y^*\|$ for all $w \in W$, $y^* \in Y^*$. Let $b > cr$. Using Kantorovich's Theorem, prove that for all $y \in B(f(\bar{x}), (b - cr)r)$ there exists an $x \in W$ satisfying $f(x) = y$ and $\|x - \bar{x}\| \le b^{-1} \|y - f(\bar{x})\|$. [Hint: use the Banach-Schauder Theorem to find a right inverse $B(w)$ of $A(w) := f'(w)$ for all $w \in W$ satisfying $\|B(w)(\cdot)\| \le b^{-1} \|\cdot\|$ and use Exercise 15 of Sect. 2.2.4 to verify condition (5.19).]

2*. Using Exercise 15 of Sect. 2.2.4 establish a refined version of Kantorovich's Theorem and prove that the conclusion of the preceding result can be extended to any $y \in B(f(\bar{x}), br)$.

3*. Convexity of images of small balls (Polyak). Let $X$ be a Hilbert space, let $Y$ be a normed space, let $a \in X$, $c, \rho, \sigma > 0$, $W := B(a, \rho)$, $f : W \to Y$ be differentiable and such that $f'$ is Lipschitzian with rate $c$ on $W$ and $\|f'(a)^{\top}(y^*)\| \ge \sigma \|y^*\|$ for all $y^* \in Y^*$. Prove that for $r > 0$, $r < \min(\rho, \sigma/2c)$ the image $f(B)$ of $B := B(a, r)$ under the nonlinear map $f$ is convex. [Hint: given $x_0, x_1 \in B$, $y_0 := f(x_0)$, $y_1 := f(x_1)$, $\bar{y} := (1/2)(y_0 + y_1)$, $\bar{x} := (1/2)(x_0 + x_1)$, show that $\|f'(w)^{\top}(y^*)\| \ge b \|y^*\|$ for all $w \in W$, $y^* \in Y^*$ for $b := \sigma - cr$ and apply the preceding exercise.]

4*. Extend the (surprising!) result of the preceding exercise to the case when $X$ is a Banach space with a uniformly convex norm.
5.5.2 The Inverse Mapping Theorem

The Inverse Mapping Theorem is a milestone of differential calculus. It illustrates the usefulness and the power of derivatives. It has numerous applications in differential geometry, differential topology and in the study of dynamical systems.
When $f : T \to \mathbb{R}$ is a continuous function on some open interval $T$ of $\mathbb{R}$, one can use the order of $\mathbb{R}$ and the intermediate value theorem to obtain results about the invertibility of $f$. If, moreover, $f$ is differentiable at some $r \in T$ and if $f'(r)$ is non-null, one can conclude that $f(T)$ contains some neighborhood of $f(r)$. When $f$ is a map of several variables one would like to know whether such a conclusion is valid, and even more, whether $f$ induces a bijection from some neighborhood of a given point $x$ onto some neighborhood of $f(x)$. Of course, one cannot expect a global result without further assumptions since the derivative is a local notion.

Following R. Descartes' advice, we will reach our main results, concerning the possibility of inverting nonlinear maps, through several small steps; some of them are of independent interest. First, given a bijection $f$ between two metric spaces $X, Y$, we would like to know whether a map close enough to $f$ is still a bijection. We have seen such a result for continuous linear maps between Banach spaces (Theorem 3.17). Now let us turn to a nonlinear setting.

Let us first observe that if $f : U \to V$ is a bijection between two open subsets of normed spaces $X$ and $Y$ respectively, it may occur that $f$ is differentiable at some $a \in U$ whereas its inverse $g$ is not differentiable at $b = f(a)$: take $U = V = \mathbb{R}$, $f$ given by $f(x) = x^3$, whose inverse $y \mapsto y^{1/3}$ is not differentiable at $0$. However, if $f$ is differentiable at some $a \in U$ and if its inverse $g$ is differentiable at $b := f(a)$, then the derivative of $g$ at $b$ is the inverse $f'(a)^{-1}$ of the derivative $f'(a)$ of $f$ at $a$. This fact simply follows from the chain rule: from $g \circ f = I_U$ and $f \circ g = I_V$ one deduces that $g'(b) \circ f'(a) = I_X$ and $f'(a) \circ g'(b) = I_Y$. Our first step is not as obvious as the preceding observation since one of its assumptions is now a conclusion.

Lemma 5.10 Let $U$ and $V$ be two open subsets of normed spaces $X$ and $Y$ respectively. Assume that $f : U \to V$ is a homeomorphism that is differentiable at $a \in U$ and such that $f'(a)$ is an isomorphism. Then the inverse $g$ of $f$ is differentiable at $b = f(a)$ and $g'(b) = f'(a)^{-1}$.

Proof Using translations if necessary, we may suppose $a = 0$, $f(a) = 0$ without loss of generality. Changing $f$ into $h^{-1} \circ f$, where $h := f'(a)$, we may also suppose $Y = X$ and $f'(a) = I_X$. Then, setting $s(y) := g(y) - y$, we have to show that $s(y)/\|y\| \to 0$ as $y \to 0$, $y \ne 0$. Let us set $r(x) := f(x) - x$. Given $\varepsilon \in {]0, 1[}$, we can find $\rho > 0$ such that $\|r(x)\| \le (\varepsilon/2) \|x\|$ for $x \in \rho B_X$. Since $g$ is continuous, we can find $\sigma > 0$ such that $\|g(y)\| \le \rho$ for $y \in \sigma B_Y$. Then, for $y \in \sigma B_Y$ and $x := g(y)$, we have $y = f(x) = x + r(x)$, hence
$$\|y\| \ge \|x\| - \|r(x)\| \ge (1/2) \|x\|, \qquad \|s(y)\| = \|g(y) - y\| = \|r(x)\| \le (\varepsilon/2) \|x\| \le \varepsilon \|y\|.$$
□

In order to get a stronger result in which the invertibility of $f$ is part of the conclusion instead of being an assumption, we will use the reinforced differentiability property of Definition 5.8. Recall that a map $f : W \to Y$ from an open subset
$W$ of a normed space $X$ into another normed space $Y$ is circa-differentiable (or strictly differentiable) at $a \in W$ if there exists a continuous linear map $\ell : X \to Y$ such that the map $r = f - \ell$ is Lipschitzian with arbitrarily small Lipschitz rate on sufficiently small neighborhoods of $a$: for any $\varepsilon > 0$ there exists a $\rho > 0$ such that $B(a, \rho) \subset W$ and
$$\forall w, w' \in B(a, \rho) \qquad \|f(w) - f(w') - \ell(w - w')\| \le \varepsilon \|w - w'\|.$$
The criterion for circa-differentiability given in Proposition 5.21 uses continuous differentiability or slightly less. Thus, the reader who is not interested in refinements may suppose throughout that $f$ is of class $C^1$.

Our next step is a perturbation result. We formulate it in a general framework.

Lemma 5.11 Let $(U, d)$ be a metric space, let $Y$ be a normed space, let $j, h : U \to Y$ be such that
(a) $j$ is injective and its inverse $j^{-1} : j(U) \to U$ is Lipschitzian with rate $\gamma$;
(b) $h$ is Lipschitzian with rate $\kappa$.
Then, if $\gamma\kappa < 1$, the map $f := j + h$ is still injective and its inverse $f^{-1} : f(U) \to U$ is Lipschitzian with rate $\gamma (1 - \gamma\kappa)^{-1}$.

Note that the Lipschitz rate of the inverse of the perturbed map $f$ is close to the Lipschitz rate of $j^{-1}$ when $\kappa$ is small. It may be convenient to reformulate this lemma by saying that a map $e : X \to Y$ between two metric spaces is expansive with rate $c > 0$ if for all $x, x' \in X$ one has
$$d(e(x), e(x')) \ge c \, d(x, x').$$
This property amounts to $d(e^{-1}(y), e^{-1}(y')) \le c^{-1} d(y, y')$ for any $y, y' \in e(X)$, i.e. $e$ is injective and its inverse is Lipschitzian with rate $c^{-1}$ on the image $e(X)$ of $e$. Thus the lemma can be rephrased as follows:

Lemma 5.12 Let $X$ be a metric space and let $Y$ be a normed space. Let $e : X \to Y$ be expansive with rate $c > 0$ and let $h : X \to Y$ be Lipschitzian with rate $\ell < c$. Then $g := e + h$ is expansive with rate $c - \ell$.

Proof This ensues from the following relations, which are valid for any $x, x' \in X$:
$$\|g(x) - g(x')\| \ge \|e(x) - e(x')\| - \|h(x) - h(x')\| \ge c \, d(x, x') - \ell \, d(x, x').$$
Note that for $c = \gamma^{-1}$, $\ell = \kappa$ one has $(c - \ell)^{-1} = \gamma (1 - \gamma\kappa)^{-1}$. □
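The perturbation estimate can be observed on a one-dimensional example. This is a hedged sketch only: the maps $j(x) = 2x$ and $h(x) = \tfrac12 \sin x$, the sampling grid, and the names `GAMMA`, `KAPPA` are illustrative assumptions. With these choices the inverse of $j$ has rate $1/2$, $h$ has rate $1/2$, and the predicted rate for the inverse of $f = j + h$ is $\tfrac12 (1 - \tfrac14)^{-1} = \tfrac23$.

```python
import math

GAMMA = 0.5   # Lipschitz rate of the inverse of j(x) = 2x (assumed example)
KAPPA = 0.5   # Lipschitz rate of h(x) = 0.5 * sin(x)     (assumed example)


def j(x):
    return 2.0 * x


def h(x):
    return 0.5 * math.sin(x)


def f(x):
    return j(x) + h(x)


def worst_inverse_rate(points):
    """Largest ratio |x - y| / |f(x) - f(y)| over distinct sample points."""
    worst = 0.0
    for i, x in enumerate(points):
        for y in points[i + 1:]:
            worst = max(worst, abs(x - y) / abs(f(x) - f(y)))
    return worst


points = [-5.0 + 0.05 * n for n in range(201)]   # grid on [-5, 5]
bound = GAMMA / (1.0 - GAMMA * KAPPA)            # predicted rate of f^{-1}
observed = worst_inverse_rate(points)
print(observed, bound)
```

On this grid the observed ratio comes close to the bound near the points where $\cos x = -1$, which suggests that the estimate cannot be improved in general.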
Since we have defined differentiability only on open subsets, it will be important to ensure that $f(U)$ is open in order to apply Lemma 5.10. We reach this conclusion in two steps. The first one relies on the Banach-Picard contraction theorem.

Lemma 5.13 Let $W$ be an open subset of a Banach space $Y$ and let $k : W \to Y$ be a Lipschitzian map with rate $c < 1$. Then the image of $W$ under $f := I_W + k$ is open.

Proof We will prove that for any $a \in W$ and for any closed ball $B[a, r]$ contained in $W$, the closed ball $B[f(a), (1-c)r]$ is contained in the set $f(W)$, and in fact in the set $f(B[a, r])$. Without loss of generality, we may suppose $a = 0$, $k(a) = 0$, using translations if necessary. Given $y \in (1-c) r B_Y$ we want to find $x \in r B_Y$ such that $y = f(x)$. This equation can be written as $y - k(x) = x$. We note that $x \mapsto y - k(x)$ is Lipschitzian with rate $c < 1$ and that it maps $r B_Y$ into itself since
$$\|y - k(x)\| \le \|y\| + \|k(x)\| \le (1-c) r + c r = r.$$
Since $r B_Y$ is a complete metric space, the contraction theorem yields some fixed point $x$ of this map. Thus $y = f(x) \in f(W)$. □

Lemma 5.14 Let $(U, d)$ be a metric space, let $Y$ be a Banach space, let $\gamma > 0$, $\kappa > 0$ with $\gamma\kappa < 1$, and let $j, h : U \to Y$ be such that $W := j(U)$ is open and
(a) $j$ is injective and its inverse $j^{-1} : W \to U$ is Lipschitzian with rate $\gamma$;
(b) $h$ is Lipschitzian with rate $\kappa$.
Then, the map $f := j + h$ is injective, its inverse is Lipschitzian and $f(U)$ is open.

Proof Let $k := h \circ j^{-1}$, so that $f \circ j^{-1} = I_W + k$ and $k$ is Lipschitzian with rate $\gamma\kappa < 1$. Then, Lemma 5.13 shows that $f(U) = f(j^{-1}(W)) = (I_W + k)(W)$ is open. □

We are ready to state the Inverse Mapping Theorem.

Theorem 5.16 (Inverse Mapping Theorem) Let $X$ and $Y$ be Banach spaces, let $W$ be an open subset of $X$ and let $f : W \to Y$ be circa-differentiable at $a \in W$ and such that $f'(a)$ is an isomorphism from $X$ onto $Y$. Then there exist neighborhoods $U$ of $a$ and $V$ of $b := f(a)$ such that $U \subset W$ and such that $f$ induces a homeomorphism from $U$ onto $V$ whose inverse is differentiable at $b$.

Proof In the preceding lemma, let us take $j := f'(a)$, $h = f - j$. Since $j$ is an isomorphism, its inverse is Lipschitzian with rate $\|j^{-1}\|$. Let $U$ be an open neighborhood of $a$ on which $h$ is Lipschitzian with rate $\kappa < 1/\|j^{-1}\|$. Then, by the preceding lemma, $V := f(U)$ is open and $f|_U$ is a homeomorphism from $U$ onto $V$ and, by Lemma 5.10, its inverse is differentiable at $b$. □

Exercise Show that the inverse of $f$ is in fact circa-differentiable at $b$.

Exercise (Square Root of an Operator) Let $E$ be a Banach space and let $X := L(E, E)$. Considering the map $f : X \to X$ given by $f(u) := u^2 := u \circ u$, show that
there exist a neighborhood $V$ of $I_E$ in $X$ and a differentiable map $g : V \to X$ such that $g(v)^2 := g(v) \circ g(v) = v$ for all $v \in V$.

The following classical terminology is helpful.

Definition 5.12 A $C^k$-diffeomorphism between two open subsets of normed spaces is a homeomorphism that is of class $C^k$ as is its inverse ($k \ge 1$).

The following example plays an important role in the sequel, so we make it a lemma.

Lemma 5.15 Let $X$ and $Y$ be Banach spaces. Then the set $\mathrm{Iso}(X, Y)$ of isomorphisms from $X$ onto $Y$ is open in $L(X, Y)$ and the map $i : \mathrm{Iso}(X, Y) \to \mathrm{Iso}(Y, X)$ given by $i(u) = u^{-1}$ is a $C^\infty$-diffeomorphism, i.e. a $C^k$-diffeomorphism for all $k \ge 1$.

Proof The first assertion has been proved in Proposition 3.17. Let us prove the second assertion by first considering the case $X = Y$ and by showing that $i$ is differentiable at the identity map $I_X$, with derivative $Di(I_X)$ given by $Di(I_X)(v) = -v$. Taking $\rho \in {]0, 1[}$, this follows from the expansion
$$\forall v \in L(X, X), \; \|v\| \le \rho \qquad (I_X + v)^{-1} = I_X - v + s(v)$$
with $s(v) := v^2 \circ \sum_{k=0}^{\infty} (-1)^k v^k$: $s$ defines a remainder since $\|(-1)^k v^k\| \le \rho^k$ and $\|s(v)\| \le (1-\rho)^{-1} \|v\|^2$. Thus $i$ is differentiable at $I_X$. Now, in the general case, for $u \in \mathrm{Iso}(X, Y)$, and $w \in L(X, Y)$ satisfying $\|w\| < 1/\|u^{-1}\|$, $v := u^{-1} \circ w$, one has $u + w = u \circ (I_X + v) \in \mathrm{Iso}(X, Y)$,
$$i(u + w) = \left( u \circ (I_X + u^{-1} \circ w) \right)^{-1} = (I_X + u^{-1} \circ w)^{-1} \circ u^{-1} = \left( I_X - u^{-1} \circ w + s(v) \right) \circ u^{-1} = i(u) - u^{-1} \circ w \circ u^{-1} + s(v) \circ u^{-1},$$
and since $w \mapsto s(u^{-1} \circ w) \circ u^{-1}$ is a remainder, one sees that $i$ is differentiable at $u$, with
$$Di(u)(w) = -u^{-1} \circ w \circ u^{-1}. \tag{5.23}$$
Thus the derivative $i' : \mathrm{Iso}(X, Y) \to L(L(X, Y), L(Y, X))$ is obtained by composing $i$ with the map $k : L(Y, X) \to L(L(X, Y), L(Y, X))$ given by $k(z)(w) := -z \circ w \circ z$ for $z \in L(Y, X)$, $w \in L(X, Y)$, which is continuous and quadratic, hence is of class $C^\infty$. It follows that $i'$ is continuous and $i$ is of class $C^1$. Then $i'$ is of class $C^1$. By induction, we obtain that $i$ is of class $C^k$ for all $k \ge 1$. Since $i$ is a bijection with inverse $i^{-1} : \mathrm{Iso}(Y, X) \to \mathrm{Iso}(X, Y)$ given by $i^{-1}(z) = z^{-1}$, we get that $i$ is a $C^\infty$-diffeomorphism. □

Note that formula (5.23) generalizes the usual case $i(t) = t^{-1}$ on $\mathbb{R} \setminus \{0\}$ for which $i'(u) = -u^{-2}$ and $Di(u)(w) = -u^{-2} w$.
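Formula (5.23) can be checked by finite differences in the simplest noncommutative setting, $X = Y = \mathbb{R}^2$ with $L(X, Y)$ identified with $2 \times 2$ matrices. The sketch below is an illustration under assumptions: the matrices `u`, `w` and the step `t` are arbitrary choices, and the helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inv(a):
    """Inverse of an invertible 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def axpy(a, b, t):
    """Return a + t * b entrywise."""
    return [[a[i][j] + t * b[i][j] for j in range(2)] for i in range(2)]


u = [[2.0, 1.0], [1.0, 3.0]]      # an invertible matrix (assumed example)
w = [[0.3, -0.2], [0.5, 0.1]]     # a direction (assumed example)
t = 1e-6                          # finite-difference step

u_inv = inv(u)
shifted = inv(axpy(u, w, t))
finite_diff = [[(shifted[i][j] - u_inv[i][j]) / t for j in range(2)]
               for i in range(2)]
predicted = [[-x for x in row] for row in matmul(matmul(u_inv, w), u_inv)]

error = max(abs(finite_diff[i][j] - predicted[i][j])
            for i in range(2) for j in range(2))
print(error)
```

The order of the factors matters here: since `u` and `w` do not commute, $-u^{-2} w$ would in general give a different matrix than $-u^{-1} w u^{-1}$.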
Corollary 5.17 Let $X$ and $Y$ be Banach spaces, let $W$ be an open subset of $X$ and let $f : W \to Y$ be of class $C^k$ ($k \ge 1$) and such that $f'(a)$ is an isomorphism from $X$
onto $Y$ for some $a \in W$. Then there exist neighborhoods $U$ of $a$ and $V$ of $b := f(a)$ such that $U \subset W$ and such that $f|_U$ is a $C^k$-diffeomorphism between $U$ and $V$.

Proof Let us first consider the case $k = 1$. The Inverse Mapping Theorem ensures that $f$ induces a homeomorphism from a neighborhood $U$ of $a$ onto a neighborhood $V$ of $b$. Since $f'$ is continuous at $a$ and since the set $\mathrm{Iso}(X, Y)$ of isomorphisms from $X$ onto $Y$ is open in $L(X, Y)$, taking a smaller $U$ if necessary, we may assume that $f'(x)$ is an isomorphism for all $x \in U$. Then, Lemma 5.10 guarantees that $g := f^{-1}$ is differentiable at $f(x)$ for all $x \in U$. Moreover, one has
$$g'(y) = (f'(g(y)))^{-1}.$$
Since the map $i : u \mapsto u^{-1}$ is of class $C^\infty$ on $\mathrm{Iso}(X, Y)$, $g' = i \circ f' \circ g$ is continuous. Thus $g$ is of class $C^1$. Now suppose by induction that $g$ is of class $C^k$ if $f$ is of class $C^k$, and let us prove that when $f$ is of class $C^{k+1}$, then $g$ is of class $C^{k+1}$. This follows from the expression $g' = i \circ f' \circ g$, which shows that $g'$ is of class $C^k$ as a composite of maps of class $C^k$. □

Let us give a global version of the Inverse Mapping Theorem.

Corollary 5.18 Let $X$ and $Y$ be Banach spaces, let $W$ be an open subset of $X$ and let $f : W \to Y$ be an injection of class $C^k$ such that, for every $x \in W$, the linear map $f'(x)$ is an isomorphism from $X$ onto $Y$. Then $f(W)$ is open and $f$ is a $C^k$-diffeomorphism between $W$ and $f(W)$.

Proof The Inverse Mapping Theorem ensures that $f(W)$ is open in $Y$. Thus $f$ is a continuous bijection from $W$ onto $f(W)$ and its inverse is locally of class $C^k$, hence is of class $C^k$. □

Exercise Let $f : T \to \mathbb{R}$ be a continuous function on some open interval $T$ of $\mathbb{R}$. Show that if $f$ is differentiable at some $r \in T$ with $f'(r)$ non-null, then $f(T)$ contains some neighborhood of $f(r)$. Show by an example that it may happen that there is no neighborhood of $r$ on which $f$ is injective.

Example-Exercise (Polar Coordinates) Let $W := {]0, \infty[} \times {]-\pi, \pi[} \subset \mathbb{R}^2$ and let $f : W \to \mathbb{R}^2$ be given by $f(r, \theta) = (r \cos\theta, r \sin\theta)$. Then $f$ is a bijection from $W$ onto $\mathbb{R}^2 \setminus D$, with $D := {]-\infty, 0]} \times \{0\}$, and the Jacobian matrix of $f$ at $(r, \theta)$ is
$$\begin{pmatrix} \cos\theta & -r \sin\theta \\ \sin\theta & r \cos\theta \end{pmatrix}.$$
Its determinant (called the Jacobian of $f$) is $r(\cos^2\theta + \sin^2\theta) = r > 0$, hence $f$ is a diffeomorphism of class $C^\infty$ from $W$ onto $f(W)$. Using the relation $\tan(\theta/2) = 2 \sin(\theta/2) \cos(\theta/2) / (2 \cos^2(\theta/2)) = \sin\theta / (1 + \cos\theta)$, show that its inverse is given by
$$(x, y) \mapsto \left( \sqrt{x^2 + y^2}, \; 2 \operatorname{Arc\,tan} \frac{y}{x + \sqrt{x^2 + y^2}} \right).$$
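The inverse formula for polar coordinates can be sanity-checked numerically before proving it. The following is a minimal sketch, not a proof; the function names and the sample points (all taken in $W$, hence off the half-line $D$) are illustrative assumptions.

```python
import math


def polar_to_cartesian(r, theta):
    """The map f(r, theta) = (r cos theta, r sin theta)."""
    return (r * math.cos(theta), r * math.sin(theta))


def cartesian_to_polar(x, y):
    """Candidate inverse; valid off the half-line D = ]-inf, 0] x {0}."""
    r = math.hypot(x, y)
    return (r, 2.0 * math.atan(y / (x + r)))


# Sample points (r, theta) with r > 0 and -pi < theta < pi (assumed examples).
samples = [(0.5, -3.0), (1.0, -1.0), (2.0, 0.0), (3.0, 0.5), (1.0, 2.5), (0.1, 3.0)]

errors = []
for r, theta in samples:
    r2, theta2 = cartesian_to_polar(*polar_to_cartesian(r, theta))
    errors.append(max(abs(r2 - r), abs(theta2 - theta)))

print(max(errors))
```

The denominator $x + \sqrt{x^2 + y^2}$ vanishes exactly on $D$, which is why a single formula covers the whole range $]-\pi, \pi[$ of angles.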
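[Fig. 5.2 Euler angles: axes $x$, $y$, $z$ with the angles $\theta$ and $\omega$]

Exercise (Spherical Coordinates) Let $W := {]0, \infty[} \times {]-\pi, \pi[} \times {]-\tfrac{\pi}{2}, \tfrac{\pi}{2}[}$ and let $f : W \to \mathbb{R}^3$ be given by $f(r, \theta, \omega) = (r \cos\theta \sin\omega, r \sin\theta \sin\omega, r \cos\omega)$. Show that $f$ is a diffeomorphism from $W$ onto its image. The angles $\theta, \omega$ are known as Euler angles. On the globe, they can serve to measure latitude and longitude (Fig. 5.2).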
Exercise Is $f : \mathbb{R}^2 \to \mathbb{R}^2$ given by $f(x, y) := (x^2 - y^2, 2xy)$ a diffeomorphism? Give an interpretation by considering $z \mapsto z^2$, with $z := x + iy$, identifying $\mathbb{C}$ with $\mathbb{R}^2$.

Exercise (Globalization) Let $f : X \to Y$ be a map of class $C^1$ between two Banach spaces. Suppose there exist $a, b \in \mathbb{R}_+$ such that for every $x \in X$ the linear map $f'(x)$ is invertible and satisfies $\|f'(x)^{-1}\| \le a \|x\| + b$. Prove that $f$ is a homeomorphism, in fact a diffeomorphism, from $X$ onto $Y$. [See: [92, Thm 15.4].]
5.5.3 The Implicit Function Theorem

Functions are sometimes defined in an implicit, indirect way. For example, in economics, the famous Phillips curve is defined through the equation
$$u^{1.39} (w + 0.9) = 9.64,$$
where $u$ is the rate of unemployment and $w$ is the annual rate of variation of nominal wages; in such a case one can express $u$ in terms of $w$ and vice versa. However, given Banach spaces $X, Y, Z$, an open subset $W$ of $X \times Y$ and a map $f : W \to Z$, it is often
impossible to determine an explicit map $h : X_0 \to Y$ from an open subset $X_0$ of $X$ such that $(x, h(x)) \in W$ and $f(x, h(x)) = 0$ for all $x \in X_0$. When the existence of such a map is known (but not necessarily in an explicit form), one says that it is an implicit function determined by $f$. The following result guarantees the existence and regularity of such a map.

Theorem 5.17 Let $X, Y, Z$ be Banach spaces, let $W$ be an open subset of $X \times Y$ and let $f : W \to Z$ be a map of class $C^1$ at $(a, b) \in W$ such that $f(a, b) = 0$ and the second partial derivative $D_Y f(a, b)$ is an isomorphism from $Y$ onto $Z$. Then, there exist open neighborhoods $U$ of $(a, b)$ and $V$ of $a$ in $W$ and $X$ respectively and a map $h : V \to Y$ of class $C^1$ at $a$ such that $h(a) = b$ and
$$((x, y) \in U, \; f(x, y) = 0) \iff (x \in V, \; y = h(x)). \tag{5.24}$$
If $f$ is of class $C^k$ with $k \ge 1$ on $W$, then $h$ is of class $C^k$ on $V$. Moreover,
$$Dh(a) = -D_Y f(a, b)^{-1} \circ D_X f(a, b). \tag{5.25}$$

Proof Let $F : W \to X \times Z$ be the map given by $F(x, y) := (x, f(x, y))$. Then $F$ is of class $C^1$ at $(a, b)$ as are its components and
$$DF(a, b)(x, y) = (x, D_X f(a, b) x + D_Y f(a, b) y).$$
It is easy to see that $DF(a, b)$ is invertible and that its inverse is given by
$$(DF(a, b))^{-1}(x, z) = \left( x, \; -(D_Y f(a, b))^{-1} \circ D_X f(a, b) x + (D_Y f(a, b))^{-1} z \right).$$
Therefore, the Inverse Mapping Theorem yields open neighborhoods $U$ of $(a, b)$ in $W$ and $U'$ of $(a, 0)$ in $X \times Z$ such that $F$ induces a homeomorphism from $U$ onto $U'$ of class $C^1$ at $(a, b)$. Its inverse $G$ is of class $C^1$ at $(a, 0)$, satisfies $G(a, 0) = (a, b)$, and has the form $(x, z) \mapsto (x, g(x, z))$. Let $V := \{x \in X : (x, 0) \in U'\}$ and let $h : V \to Y$ be given by $h(x) = g(x, 0)$. Then, the equivalence
$$((x, y) \in U, \; (x, z) = (x, f(x, y))) \iff ((x, z) \in U', \; (x, y) = (x, g(x, z)))$$
entails, by definition of $V$ and $h$,
$$((x, y) \in U, \; f(x, y) = 0) \iff (x \in V, \; y = h(x)).$$
When $f$ is of class $C^k$ on $W$, with $k \ge 1$, $F$ is of class $C^k$, hence $G$ and $h$ are of class $C^k$ on $U'$ and $V$ respectively. Moreover, the computation of the inverse $DF(a, b)^{-1}$ we have done shows that
$$Dh(a) = D_X g(a, 0) = -D_Y f(a, b)^{-1} \circ D_X f(a, b).$$
□
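The derivative formula $Dh(a) = -D_Y f(a, b)^{-1} \circ D_X f(a, b)$ can be compared with a finite-difference quotient on a scalar example. This is a sketch under assumptions that are not in the text: the function $f(x, y) = y^3 + y - x$, the base point $(a, b) = (2, 1)$, and the use of Newton's method in $y$ to evaluate the implicit function are all illustrative choices.

```python
def f(x, y):
    """Example equation f(x, y) = 0 with f(2, 1) = 0 (assumed example)."""
    return y ** 3 + y - x


def f_x(x, y):
    """Partial derivative of f with respect to x."""
    return -1.0


def f_y(x, y):
    """Partial derivative of f with respect to y; never zero here."""
    return 3.0 * y * y + 1.0


def h(x, y0=1.0, steps=50):
    """Implicit function y = h(x), computed by Newton's method in y."""
    y = y0
    for _ in range(steps):
        y -= f(x, y) / f_y(x, y)
    return y


a, b = 2.0, 1.0
t = 1e-5
finite_diff = (h(a + t) - h(a - t)) / (2.0 * t)   # central difference quotient
formula = -f_x(a, b) / f_y(a, b)                  # -D_Y f^{-1} D_X f = 1/4

print(finite_diff, formula)
```

Here the implicit function is never written in closed form; only its values and the partial derivatives of $f$ at $(a, b)$ are used, which is the situation the theorem is designed for.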
Example Let $X$ be a Hilbert space and, for $Y := \mathbb{R}$, let $f : X \times Y \to \mathbb{R}$ be given by $f(x, y) = \|x\|^2 + y^2 - 1$. Then $f$ is of class $C^\infty$ and for $(a, b) := (0, 1)$ one has $Df(a, b)(u, v) = 2(a \mid u) + 2bv = 2v$; hence $D_Y f(a, b) = 2 I_Y$ is invertible and $D_Y f(a, b)^{-1} = \tfrac{1}{2} I_Y$. Here we can take $U := B(a, 1) \times {]0, +\infty[}$, $V := B(a, 1)$ and the implicit function is given by $h(x) = (1 - \|x\|^2)^{1/2}$. As mentioned above, it is not always the case that $U$ and $h$ can be described explicitly as in this classical parameterization of the upper hemisphere. □

When $Z$ is finite dimensional, the regularity assumption on $f$ can be relaxed in two ways.

Theorem 5.18 Let $X, Y, Z$ be Banach spaces, $Y$ and $Z$ being finite dimensional, let $W$ be an open subset of $X \times Y$ and let $f : W \to Z$ be Fréchet differentiable at $(a, b) \in W$ such that $f(a, b) = 0$ and the partial derivative $D_Y f(a, b)$ is an isomorphism from $Y$ onto $Z$. Then, there exist open neighborhoods $U$ of $(a, b)$ and $V$ of $a$ in $W$ and $X$ respectively and a map $h : V \to Y$ Fréchet differentiable at $a$ such that $h(a) = b$ and
$$\forall x \in V \qquad f(x, h(x)) = 0.$$
Differentiating this relation, we recover the value of $Dh(a)$:
$$Dh(a) = -D_Y f(a, b)^{-1} \circ D_X f(a, b).$$
The proof below is slightly simpler when $A := D_X f(a, b) = 0$; one can reduce it to that case by a linear change of variables.

Proof Using translations and composing $f$ with $D_Y f(a, b)^{-1}$, we may suppose $(a, b) = (0, 0)$, $Z = Y$ and $D_Y f(a, b) = I_Y$. Let $r : W \to Y$ be a remainder such that
$$f(x, y) := Ax + y + r(x, y).$$
For $\varepsilon \in {]0, 1/2]}$ let $\delta := \delta(\varepsilon) > 0$ be such that $\delta B_{X \times Y} \subset W$, $\|r(x, y)\| \le \varepsilon (\|x\| + \|y\|)$ for all $(x, y) \in \delta B_{X \times Y}$. Let $\beta := \delta/2$, $\alpha := (2\|A\| + 1)^{-1} \beta$ and for $x \in \alpha B_X$ let $k_x : \beta B_Y \to Y$ be given by
$$k_x(y) := -Ax - r(x, y).$$
Then, $k_x$ maps $\beta B_Y$ into itself since for $y \in \beta B_Y$ we have $\|k_x(y)\| \le \|A\| \alpha + (1/2)(\alpha + \beta) \le \beta$. Brouwer's Fixed Point Theorem ensures that $k_x$ has a fixed point $y_x \in \beta B_Y$: $-Ax - r(x, y_x) = y_x$. Then, setting $h(x) := y_x$, we have $f(x, h(x)) = Ax + h(x) + r(x, h(x)) = 0$. It remains to show that $h$ is differentiable
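at $0$. Since
$$\|h(x)\| = \|k_x(h(x))\| \le \|A\| \|x\| + \varepsilon \|x\| + \varepsilon \|h(x)\|,$$
so that $\|h(x)\| \le (1-\varepsilon)^{-1} (\|A\| + \varepsilon) \|x\|$, we get
$$\|h(x) + Ax\| = \|r(x, h(x))\| \le \varepsilon \|x\| + \varepsilon \|h(x)\| \le \varepsilon (1-\varepsilon)^{-1} (\|A\| + 1) \|x\|.$$
This shows that $h$ is differentiable at $0$ with derivative $-A$. □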
A similar (and simpler) proof yields the first assertion of the next statement.

Theorem 5.19 Let $X$ and $Y$ be normed spaces, $Y$ being finite dimensional, and let $f : X \to Y$ be continuous on a neighborhood of $a \in X$ and differentiable at $a$, with $f'(a)(X) = Y$. Then there exist a neighborhood $V$ of $b := f(a)$ in $Y$ and a right inverse $g : V \to X$ that is differentiable at $b$ and such that $g(b) = a$. If $C$ is a convex subset of $X$, if $a \in C$ and if $f'(a)(\mathrm{cl}(\mathbb{R}_+(C - a))) = Y$, one can even get that $g(V) \subset C$ if one does not require that the directional derivative of $g$ at $b$ is linear.

The second weakening of the assumptions concerns the kind of differentiability.

Theorem 5.20 Let $X, Y, Z$ be Banach spaces, $Y$ and $Z$ being finite dimensional, let $W$ be an open subset of $X \times Y$ and let $f : W \to Z$ be a map of class $D^1$ at $(a, b) \in W$ such that $f(a, b) = 0$ and the partial derivative $D_Y f(a, b)$ is an isomorphism from $Y$ onto $Z$. Then, there exist open neighborhoods $U$ of $(a, b)$ and $V$ of $a$ in $W$ and $X$ respectively and a map $h : V \to Y$ of class $D^1$ such that $h(a) = b$ and
$$((x, y) \in U, \; f(x, y) = 0) \iff (x \in V, \; y = h(x)).$$

Proof We may suppose $W$ is a ball $B((a, b), \rho_0)$, $Y = Z$, $D_Y f(a, b) = I_Y$. With the notation of the preceding proof, using the compactness of the unit ball of $Y$, we may suppose the remainder $r$ satisfies, for $\rho \in (0, \rho_0)$ and every $x \in \rho B_X$, $y, y' \in \rho B_Y$,
$$\|r(x, y) - r(x, y')\| = \left\| \int_0^1 \left( D_Y f(x, (1-t) y + t y') - I_Y \right)(y - y') \, dt \right\| \le c(\rho) \|y - y'\|,$$
where $c(\rho) \to 0$ as $\rho \to 0_+$. Taking $\rho_0$ small enough, we see that the map $k_x$ is a contraction with rate $c(\rho_0) \le 1/2$. Picking $\alpha \in (0, \rho_0)$ so that, for $x \in \alpha B_X$, $\|k_x(0)\| = \|f(x, 0)\| \le \rho/2$, the Banach-Picard Contraction Theorem ensures that $k_x$ has a unique fixed point $y_x$ in the ball $\rho B_Y$. Then, setting $h(x) := y_x$, we have $f(x, h(x)) = 0$ and $y_x$ is the unique solution of the equation $f(x, y) = 0$ in the ball $\rho B_Y$. Moreover, $h$ is continuous as a uniform limit of continuous maps given by iterations. Restricting $f$ to $X_1 \times Y$, where $X_1$ is an arbitrary finite dimensional subspace of $X$, we get that $h$ is Gateaux differentiable. Since $\mathrm{Iso}(Y)$ is an open subset
of $L(Y, Y)$ and since $(x, y) \mapsto D_Y f(x, y)$ is continuous with respect to the norm of $L(Y, Y)$ by the above argument, we obtain from the relation
$$Dh(x) v = -D_Y f(x, h(x))^{-1} (D_X f(x, h(x)) v)$$
that $(x, v) \mapsto Dh(x) v$ is continuous. □
Exercises

1. Show that the Inverse Mapping Theorem can be deduced from the Implicit Mapping Theorem by considering the map $(x, y) \mapsto y - f(x)$.

2. Let $f : \mathbb{R}^4 \to \mathbb{R}^3$ be given by
$$f(w, x, y, z) = (w + x + y + z, \; w^2 + x^2 + y^2 + z - 2, \; w^3 + x^3 + y^3 + z).$$
Show that there is a neighborhood $V$ of $a := 0$ in $\mathbb{R}$ and a map $h : V \to \mathbb{R}^3$ of class $C^1$ such that $h(0) = (0, -1, 1)$ and $f(h(z), z) = 0$ for every $z \in V$. Compute the derivative of $h$ at $0$.

3. Let $X$ be the space of square $n \times n$ matrices and let $f : X \times \mathbb{R} \to \mathbb{R}$ be given by $f(A, r) = \det(A - rI)$. Let $\bar{r} \in \mathbb{R}$ be such that $f(A, \bar{r}) = 0$ and $D_2 f(A, \bar{r}) \ne 0$. Show that there is an open neighborhood $U$ of $A$ in $X$ and a function $\lambda : U \to \mathbb{R}$ of class $C^\infty$ such that, for each $B$ in $U$, $\lambda(B)$ is a simple eigenvalue of $B$.

4. Given Banach spaces $W$, $X$, $Z$, $Y := Z^*$ and maps $f : W \times X \to \mathbb{R}$, $g : W \times X \to Z$ of class $C^2$, consider the parameterized mathematical programming problem
$$(P_w) \qquad \text{minimize } f(w, x) \text{ subject to } g(w, x) = 0$$
and let $p(w)$ be its value. Suppose that for some $\bar{w} \in W$ and a solution $\bar{x} \in X$ of $(P_{\bar{w}})$ the derivative $B := D_X g(\bar{w}, \bar{x})$ is surjective and its kernel $N$ has a topological supplement $M$. Let $\ell$ be the Lagrangian of $(P_w)$:
$$\ell(w, x, y) := f(w, x) + \langle y, g(w, x) \rangle$$
and let $\bar{y}$ be a multiplier at $\bar{x}$, i.e. an element of $Y$ such that $D_X \ell(\bar{w}, \bar{x}, \bar{y}) = 0$. Suppose $D_X^2 \ell(\bar{w}, \bar{x}, \bar{y})|_N$ induces an isomorphism from $N$ onto $N^* \simeq M^{\perp}$. Let $A := D_X^2 \ell(\bar{w}, \bar{x}, \bar{y})$.

(a) Show that for any $(x^*, z) \in X^* \times Z$ the system
$$Au + B^{\top} v = x^*, \qquad Bu = z$$
has a unique solution $(u, v) \in X \times Y$ continuously depending on $(x^*, z)$.
(b) Show that the Karush-Kuhn-Tucker system
$$D_X f(w, x) + y \circ D_X g(w, x) = 0, \qquad g(w, x) = 0$$
determines $(x(w), y(w))$ as an implicit function of $w$ in a neighborhood of $\bar{w}$ with $x(\bar{w}) = \bar{x}$, $y(\bar{w}) = \bar{y}$, the multiplier at $\bar{x}$.

(c) Suppose $x(w)$ is a solution to $(P_w)$ for $w$ close to $\bar{w}$. Show that $p$ is of class $C^1$ near $\bar{w}$. Using the relations $p(w) = \ell(w, x(w), y(w))$, $D_X \ell(w, x(w), y(w)) = 0$, $D_Y \ell(w, x(w), y(w)) = 0$, show that $Dp(w) = D_W \ell(w, x(w), y(w))$.

(d) Deduce from the preceding that $p$ is of class $C^2$ around $\bar{w}$ and give the expression of $D^2 p(\bar{w}) := (p'(\cdot))'(\bar{w})$.
5.5.4 Geometric Applications

When looking at familiar objects such as forks, knives, funnels, roofs, spires, one sees that some points are smooth, while some other points of the objects present ridges or peaks or cracks. Mathematicians have found concepts that enable one to deal with such cases (see [47], [78], [208], [221] for instance). In this subsection we essentially focus our attention on smooth objects. The notions of (regular) curve, surface, hypersurface, etc. can be embodied in a general framework in which some differential calculus can be done. The underlying idea is the possibility of straightening a piece of the set; for this purpose, some forms of the inverse mapping theorem will be appropriate. We first define a notion of smoothness for a subset $S$ of a normed space $X$ around some point.

Definition 5.13 A subset $S$ of a normed space $X$ is said to be $C^k$-smooth around a point $a \in S$ if there exist normed spaces $Y$, $Z$, an open neighborhood $U$ of $a$ in $X$, an open neighborhood $V$ of $0$ in $Y \times Z$ and a $C^k$-diffeomorphism $\varphi : U \to V$ such that $\varphi(a) = 0$ and
$$\varphi(U \cap S) = (Y \times \{0\}) \cap V. \tag{5.26}$$
A subset $S$ of a normed space $X$ is said to be a submanifold of class $C^k$ if it is $C^k$-smooth around each of its points.

Thus, $\varphi$ straightens $U \cap S$ onto the piece $(Y \times \{0\}) \cap V$ of the linear space $Y \times \{0\}$, which can be identified with a neighborhood of $0$ in $Y$. The map $\varphi$ is called a chart and a collection $\{\varphi_i\}$ of charts whose domains form a covering of $S$ is called an atlas. When $Y$ is of dimension $d$, one says that $S$ is of dimension $d$ around $a$. When $Z$ is of dimension $c$, one says that $S$ is of codimension $c$ around $a$. The following example can be seen as a general model.
Example Let $X := Y \times Z$, where $Y$, $Z$ are normed spaces, let $W$ be an open subset of $Y$ and let $f : W \to Z$ be a map of class $C^k$. Then, its graph $S := \{(w, f(w)) : w \in W\}$ is a $C^k$-submanifold of $X$: taking $U := V := W \times Z$, and setting $\varphi(w, z) := (w, z - f(w))$, we define a $C^k$-diffeomorphism from $U$ onto $V$ with inverse given by $\varphi^{-1}(w, z) = (w, z + f(w))$ for which (5.26) is satisfied. □

When in the preceding example $Z := \mathbb{R}$ and one takes the epigraph $E := \{(w, y) \in W \times \mathbb{R} : y \ge f(w)\}$ of $f$, one obtains a model for the notion of a submanifold with boundary. We just give a formal definition in which a subset $Z_+$ of a normed space $Z$ is said to be a half-space of $Z$ if there exists some $h \in Z^* \setminus \{0\}$ such that $Z_+ := h^{-1}(\mathbb{R}_+)$.

Definition 5.14 A subset $S$ of a normed space $X$ is said to be a $C^k$-submanifold with boundary if, for every point $a$ of $S$, either $S$ is $C^k$-smooth around $a$ or there exist normed spaces $Y$, $Z$, a half-space $Z_+$ of $Z$, an open neighborhood $U$ of $a$ in $X$, an open neighborhood $V$ of $0$ in $Y \times Z$, and a $C^k$-diffeomorphism $\varphi : U \to V$ such that $\varphi(a) = 0$ and $\varphi(U \cap S) = (Y \times Z_+) \cap V$.

Such a notion is useful when giving a precise meaning to the expression "$S$ is a regular open subset of $\mathbb{R}^d$" (an improper expression, since usually one considers the closure of such a set).

There are two usual ways of obtaining submanifolds: either through equations or through parameterizations. For instance, the graph $S$ of the preceding example can either be defined as the image under $(I_W, f) : w \mapsto (w, f(w))$ of the parameter space $W$ or as the set of points $(y, z) \in Y \times Z$ satisfying $y \in W$ and the equation $z - f(y) = 0$. As a more concrete example, we observe that for given $a, b \in \mathbb{P}$, the ellipse
$$E := \left\{ (x, y) \in \mathbb{R}^2 : \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \right\}$$
can be seen as the image of the parameterization $f : \mathbb{R} \to \mathbb{R}^2$ given by $f(t) := (a \cos t, b \sin t)$.

Exercise Give parameterizations for the ellipsoid
$$\left\{ (x, y, z) \in \mathbb{R}^3 : \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 \right\}$$
and do the same for the other surfaces of $\mathbb{R}^3$ defined by quadratic forms.

Even if $S$ is not smooth around $a \in S$, one can get an idea of its shape around $a$ by using an approximation. The concept of tangent cone offers such an approximation; it can be seen as a geometric counterpart to the directional derivative.

Definition 5.15 The tangent cone (or contingent cone) to a subset $S$ of a normed space $X$ at some point $a$ in the closure of $S$ is the set $T(S, a)$ of vectors $v \in X$ such that there exist sequences $(v_n) \to v$, $(t_n) \to 0_+$ for which $a + t_n v_n \in S$ for all $n \in \mathbb{N}$.
Equivalently, one has $v \in T(S, a)$ if and only if there exist sequences $(a_n)$ in $S$, $(t_n) \to 0_+$ such that $(v_n) := (t_n^{-1}(a_n - a)) \to v$: $v$ is the limit of a sequence of secants to $S$ issued from $a$. Some rules for dealing with tangent cones are given in the next lemma, whose elementary proof is left as an exercise.

Lemma 5.16 Let $X$ be a normed space, let $S, S'$ be subsets of $X$ such that $S \subset S'$. Then for every $a \in S$ one has $T(S, a) \subset T(S', a)$. If $U$ is an open subset of $X$, then for any $a \in S \cap U$ one has $T(S, a) = T(S \cap U, a)$. If $X'$ is another normed space, if $g : U \to X'$ is Hadamard differentiable at $a$, and if $S' \subset X'$ contains $g(S \cap U)$, then one has $Dg(a)(T(S, a)) \subset T(S', g(a))$. If $\varphi : U \to V$ is a $C^k$-diffeomorphism between two open subsets of normed spaces $X$, $X'$ and if $S$ is a subset of $X$ containing $a$, then, for $S' := \varphi(S \cap U)$ and $a' := \varphi(a)$, one has $T(S', a') = D\varphi(a)(T(S, a))$.

Exercise Deduce from the second assertion of the lemma that for $g : U \to X'$ Hadamard differentiable at $a$, $b := g(a)$, $S := g^{-1}(b)$ one has $T(S, a) \subset \ker Dg(a)$. Moreover, if for some $c > 0$, $\rho > 0$ one has $d(x, g^{-1}(b)) \le c \, d(g(x), b)$ for all $x \in B(a, \rho)$, then one has $T(S, a) = \ker Dg(a)$.

Exercise Let $S := \{(x, y) \in \mathbb{R}^2 : x^3 = y^2\}$. Show that $T(S, (0, 0)) = \mathbb{R}_+ \times \{0\}$.

When $S$ is smooth around $a \in S$ in the sense of Definition 5.13 one can give an alternative characterization of $T(S, a)$ in terms of velocities.

Proposition 5.26 If $S$ is $C^1$-smooth around $a \in S$, then the tangent cone $T(S, a)$ to $S$ at $a$ coincides with the set $T^I(S, a)$ of $v \in X$ such that there exist $\tau > 0$ and $c : [0, \tau] \to X$ right differentiable at $0$ with $c'_+(0) = v$ and satisfying $c(0) = a$, $c(t) \in S$ for all $t \in [0, \tau]$. Moreover, if $\varphi : U \to V$ is a $C^1$-diffeomorphism such that $\varphi(a) = 0$ and $\varphi(S \cap U) = (Y \times \{0\}) \cap V$, then one has $T(S, a) = (D\varphi(a))^{-1}(Y \times \{0\})$ and $T(S, a)$ is a closed linear subspace of $X$.

Proof The result follows from Lemma 5.16 and the observation that if $S$ is an open subset of some closed linear subspace $L$ of $X$ then $T(S, a) = L = T^I(S, a)$. □

Now let us turn to sets defined by equations. We need the following result.

Theorem 5.21 (Submersion Theorem) Let $X$ and $Z$ be Banach spaces, let $W$ be an open subset of $X$ and let $g : W \to Z$ be a map of class $C^k$ with $k \ge 1$, such that for some $a \in W$ the map $Dg(a)$ is surjective and its kernel $N$ has a topological supplement $M$ in $X$. Then, there exist an open neighborhood $U$ of $a$ in $W$ and a diffeomorphism $\varphi$ of class $C^k$ from $U$ onto a neighborhood $V$ of $(0, g(a))$ in $N \times Z$ such that $\varphi(a) = (0, g(a))$ and $g|_U = p \circ \varphi$, where $p$ is the canonical projection from $N \times Z$ onto $Z$. In particular, $g$ is open around $a$ in the sense that for every open subset $U'$ of $U$, the image $g(U')$ is open.
This result shows that the nonlinear map g has been straightened into a simple continuous linear map, a projection, by using the diffeomorphism '. Proof Let F W W ! N Z be given by F.x/ D . pN .x/ pN .a/; g.x//, where pN W X ! N is the projection on N associated with the isomorphism between X and M N. Then F is of class Ck and DF.a/.x/ D . pN .x/; Dg.a/.x//. Clearly DF.a/ is injective: when pN .x/ D 0, Dg.a/.x/ D 0, one has x 2 M \ N, hence x D 0. Let us show that DF.a/ is surjective: given . y; z/ 2 N Z, there exists v 2 X such that Dg.a/.v/ D z and since y pN .v/ 2 N, for x WD v C y pN .v/, we have that Dg.a/.x/ D Dg.a/.v/ D z and pN .x/ D pN . y/ D y. Thus, by the Banach isomorphism theorem, we have that DF.a/ is an isomorphism of X onto N Z. The Inverse Mapping Theorem ensures that the restriction ' of F to some open neighborhood U of a is a Ck -diffeomorphism onto some neighborhood V of .0; g.a//. t u Note that for Z WD R, the condition on g reduces to: g is of class Ck and g0 .a/ ¤ 0. Note also that when N WD f0g, we recover the inverse function theorem. The application we have in view follows readily. Corollary 5.19 Let X and Z be Banach spaces, let W be an open subset of X and let g W W ! Z be a map of class Ck with k 1. Let S WD fx 2 W W g.x/ D 0g: Suppose that for some a 2 S the map g0 .a/ WD Dg.a/ is surjective and its kernel N has a topological supplement in X. Then S is Ck -smooth around a. Moreover, T.S; a/ D ker g0 .a/. Proof Using the notation of the Submersion Theorem, setting Y WD N, we see that Definition 5.13 is satisfied, noting that for x 2 U we have x 2 S \ U if and only if p.'.x// D g.x/ D 0, if and only if '.x/ 2 .Y f0g/ \ V. Now, the preceding proposition asserts that T.S; a/ D .' 0 .a//1 .Y f0g/. But since g j U D p ı ', we have g0 .a/ D p ı ' 0 .a/, ker g0 .a/ D .' 0 .a//1 .ker p/ D .' 0 .a//1 .Y f0g/. Hence T.S; a/ D ker g0 .a/. t u The regularity condition on g can be relaxed thanks to the Lyusternik-Graves Theorem. 
Proposition 5.27 (Lyusternik) Let $X$ and $Y$ be Banach spaces, let $W$ be an open subset of $X$ and let $g : W \to Y$ be circa-differentiable at $a \in S := \{x \in W : g(x) = 0\}$, with $g'(a)(X) = Y$. Then $T(S, a) = \ker g'(a)$.

Proof The inclusion $T(S, a) \subset \ker g'(a)$ follows from Lemma 5.16. Conversely, let $v \in \ker g'(a)$. Theorem 5.15 yields some $\kappa, \rho > 0$ such that for all $w \in B(a, \rho)$ there exists some $x \in W$ such that $g(x) = y := 0$, $\|x - w\| \leq \kappa \|g(w)\|$. Taking $w := a + tv$ with $t > 0$ so small that $w \in B(a, \rho)$, we get some $x_t \in S$ satisfying $\|x_t - (a + tv)\| \leq o(t) := \kappa \|g(a + tv)\|$; this quantity is indeed negligible with respect to $t$ since $g(a) = 0$ and $g'(a)(v) = 0$. Thus $v \in T(S, a)$ and even $v \in T^I(S, a)$. □
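The estimate used in the proof can be observed numerically on a simple case. The sketch below is only an illustration under assumptions that are not in the text: it takes $X = \mathbb{R}^3$, $g(x) = \frac{1}{2}(\|x\|^2 - 1)$, so that $S$ is the unit sphere and the distance to $S$ is $|\,\|x\| - 1\,|$, and it picks a point $a \in S$ and a vector $v \in \ker g'(a)$ by hand. It checks that the distance from $a + tv$ to $S$ is negligible with respect to $t$.

```python
import math

def dist_to_unit_sphere(x):
    """Distance from a nonzero point x to S = {x : |x| = 1}, i.e. | |x| - 1 |."""
    return abs(math.sqrt(sum(c * c for c in x)) - 1.0)

a = (1.0, 0.0, 0.0)    # a point of S (illustrative choice)
v = (0.0, 3.0, -4.0)   # g'(a)v = <a | v> = 0, so v lies in ker g'(a)

def ratio(t):
    """Return d_S(a + t v) / t, which should tend to 0 as t -> 0+."""
    w = tuple(ai + t * vi for ai, vi in zip(a, v))
    return dist_to_unit_sphere(w) / t

ratios = [ratio(10.0 ** (-k)) for k in range(1, 6)]
for k, q in enumerate(ratios, start=1):
    print(f"t = 1e-{k}:  d_S(a + t v) / t = {q:.3e}")
```

The printed ratios decrease roughly linearly in $t$, which is the behavior $\|x_t - (a + tv)\| = o(t)$ expected from the proposition in this particular example.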
5 The Power of Differential Calculus
In the following example, we use the fact that when $Y = \mathbb{R}$, the surjectivity condition on $g'(a)$ reduces to $g'(a) \neq 0$ (or $\nabla g(a) \neq 0$ if $X$ is a Hilbert space).

Example-Exercise For a Hilbert space $X$, let $g : X \to \mathbb{R}$ be given by $g(x) := \frac{1}{2}(\langle A(x) \mid x \rangle - 1)$, where $A$ is a linear isomorphism from $X$ onto $X$ that is symmetric, i.e. such that $\langle A(x) \mid y \rangle = \langle A(y) \mid x \rangle$ for every $x, y \in X$. Let $S := g^{-1}(\{0\})$. For all $a \in S$ one has $\nabla g(a) = A(a) \neq 0$ since $\langle A(a) \mid a \rangle = 1$. Thus $S$ is a $C^\infty$-submanifold of $X$. Taking $X = \mathbb{R}^2$ and appropriate isomorphisms $A$, find the classical conic curves; then take $X = \mathbb{R}^3$ and find the classical conic surfaces, including the sphere, the ellipsoid, the paraboloid and the hyperboloid.

A variant of the submersion theorem can be given with differentiability instead of circa-differentiability when the spaces are finite dimensional. Its proof (which we skip) relies on Brouwer's Fixed Point Theorem rather than on the contraction theorem.

Proposition 5.28 Let $X$ and $Z$ be Banach spaces, $Z$ being finite dimensional, let $W$ be an open subset of $X$ and let $g : W \to Z$ be Hadamard differentiable at $a \in W$, with $Dg(a)(X) = Z$. Then, there exist open neighborhoods $U$ of $a$ in $W$, $V$ of $g(a)$ in $Z$ and a map $h : V \to U$ that is differentiable at $g(a)$ and such that $h(g(a)) = a$, $g \circ h = I_V$. In particular, $g$ is open at $a$.

Now let us turn to representations via parameterizations. We need the following result.

Theorem 5.22 (Immersion Theorem) Let $P$ and $X$ be Banach spaces, let $O$ be an open subset of $P$ and let $f : O \to X$ be a map of class $C^k$ with $k \geq 1$, such that for some $\bar{p} \in O$ the map $Df(\bar{p})$ is injective and its image $Y$ has a topological supplement $Z$ in $X$. Then there exist open neighborhoods $U$ of $a := f(\bar{p})$ in $X$, $Q$ of $\bar{p}$ in $O$, $W$ of $0$ in $Z$ and a $C^k$-diffeomorphism $\psi : V := Q \times W \to U$ such that $\psi(q, 0) = f(q)$ for all $q \in Q$.

Again the conclusion can be written in the form of a commutative diagram, since $f \mid Q = \psi \circ j$, where $j : Q \to Q \times W$ is the canonical injection $y \mapsto (y, 0)$.
Again the nonlinear map $f$ has been straightened by $\psi$ into a linear map $j = \psi^{-1} \circ (f \mid Q)$.

Proof Let $F : O \times Z \to X$ be given by $F(p, z) = f(p) + z$. Then $F$ is of class $C^k$ and $F'(\bar{p}, 0)(p, z) = f'(\bar{p})(p) + z$ for $(p, z) \in P \times Z$, so that $F'(\bar{p}, 0)$ is an isomorphism from $P \times Z$ onto $Y + Z = X$. The Inverse Mapping Theorem asserts that $F$ induces a $C^k$-diffeomorphism $\psi$ from some open neighborhood of $(\bar{p}, 0)$ onto some open neighborhood $U$ of $f(\bar{p})$. Taking a smaller neighborhood of $(\bar{p}, 0)$ if necessary, we may suppose it has the form of a product $Q \times W$. Clearly, $\psi(q, 0) = f(q)$ for $q \in Q$. □

Example-Exercise Let $P := \mathbb{R}^2$, $O := \,]-\pi, \pi[ \,\times\, ]-\pi/2, \pi/2[$, $X := \mathbb{R}^3$ and $f$ be given by $f(\varphi, \theta) := (\cos\theta \cos\varphi, \cos\theta \sin\varphi, \sin\theta)$. Identify the image of $f$.
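The construction $F(p, z) = f(p) + z$ used in the proof of the Immersion Theorem can be made explicit on a toy case. The following sketch is an illustration under assumptions of our own: $P = \mathbb{R}$, $X = \mathbb{R}^2$, the parabola $f(p) = (p, p^2)$, $\bar{p} = 0$ and the supplement $Z = \{0\} \times \mathbb{R}$ of the image of $f'(0)$. In this case the straightening map and its inverse are available in closed form, and they happen to be global.

```python
import math

def f(p):
    """Immersion of R into R^2 (a parabola); f'(p) = (1, 2p) is injective."""
    return (p, p * p)

def psi(q, w):
    """psi(q, w) = f(q) + (0, w), where (0, w) ranges over Z = {0} x R."""
    x, y = f(q)
    return (x, y + w)

def psi_inv(x, y):
    """Inverse of psi: recover the parameter q and the transverse coordinate w."""
    return (x, y - x * x)

samples = [(-1.5, 0.3), (0.0, 0.0), (0.7, -2.0), (2.0, 1.0)]
for q, w in samples:
    x, y = psi(q, w)
    q_back, w_back = psi_inv(x, y)
    print((q, w), "->", (x, y), "->", (q_back, w_back))
```

Here `psi(q, 0)` coincides with `f(q)`, which is the relation $\psi(q, 0) = f(q)$ of the theorem, and `psi_inv` sends the curve onto the horizontal axis.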
Exercise Let us note that the image $f(O)$ of $f$ is not necessarily a $C^k$-submanifold of $X$. Verify that a counterexample can be given by taking $P := \mathbb{R}$, $X := \mathbb{R}^2$, $f(p) := (p, 0)$ for $p \in \,]-\infty, 0[$, $f(p) := (p, 1 - (1 - p^2)^{1/2})$ for $p \in [0, 1[$, $f(p) := (2 - p, (-p^2 + 3p - 2)^{1/2} + 1)$ for $p \in [1, 2[$, $f(p) := (0, e^{2-p})$ for $p \in [2, \infty[$.

A topological assumption ensures that the image $f(O)$ is a $C^k$-submanifold of $X$.

Corollary 5.20 (Embedding Theorem) Let $P$ and $X$ be Banach spaces, let $O$ be an open subset of $P$ and let $f : O \to X$ be a map of class $C^k$ with $k \geq 1$ such that for every $p \in O$ the map $f'(p)$ is injective and its image has a topological supplement in $X$. Then, if $f$ is a homeomorphism from $O$ onto $f(O)$, its image $S := f(O)$ is a $C^k$-submanifold of $X$. Moreover, for every $p \in O$ one has $T(S, f(p)) = f'(p)(P)$.

One says that $f$ is an embedding of $O$ into $X$ and that $S$ is parameterized by $O$.

Proof Given $a := f(\bar{p})$ in $S$, with $\bar{p} \in O$, we take $Q_a \subset O$, $U_a \subset X$, $W_a \subset Z$ and a $C^k$-diffeomorphism $\psi_a : V_a := Q_a \times W_a \to U_a$ such that $\psi_a(q, 0) = f(q)$ for all $q \in Q_a$ as in the preceding theorem. Performing a translation in $P$, we may suppose $\bar{p} = 0$. Using the assumption that $f$ is a homeomorphism from $O$ onto $S = f(O)$, we can find an open subset $U'_a$ of $X$ such that $f(Q_a) = S \cap U'_a$. Let $U := U_a \cap U'_a$, $V := \psi_a^{-1}(U)$, $\varphi := \psi_a^{-1} \mid U$, $Y := P$, so that $\varphi(a) = (0, 0)$. Let us check relation (5.26), i.e. $\varphi(S \cap U) = (Y \times \{0\}) \cap V$. For all $(y, 0) \in (Y \times \{0\}) \cap V$, we have $x := \varphi^{-1}(y, 0) = \psi_a(y, 0) = f(y) \in S$, hence $x \in S \cap U$; conversely, when $x \in S \cap U = f(Q_a)$ there is a unique $q \in Q_a$ such that $x = f(q)$, so that $x = \psi_a(q, 0) = \varphi^{-1}(q, 0)$ and $\varphi(x) = (q, 0) \in (Y \times \{0\}) \cap V$. Then $T(S, a) = T(S \cap U, a) = (\varphi'(a))^{-1}(T((Y \times \{0\}) \cap V, 0))$, and, since $T((Y \times \{0\}) \cap V, 0) = Y \times \{0\}$ and $(\varphi'(a))^{-1} = \psi_a'(0)$, we get $T(S, a) = \psi_a'(0)(P \times \{0\}) = f'(\bar{p})(P)$. □
Exercises

1. (Conic section) Let $S \subset \mathbb{R}^3$ be defined by the equations $x^2 + y^2 - 1 = 0$, $x - z = 0$. Show that $S$ is a submanifold of $\mathbb{R}^3$ of class $C^\infty$ (it has been known since Apollonius that $S$ is an ellipse). Find an explicit diffeomorphism (in fact linear isomorphism) sending $S$ onto an ellipse of the plane $\mathbb{R}^2 \times \{0\}$.

2. (Viviani's window) Let $S$ be the subset of $\mathbb{R}^3$ defined by the system $x^2 + y^2 = x$, $x^2 + y^2 + z^2 - 1 = 0$. Show that $S$ is a submanifold of $\mathbb{R}^3$ of class $C^\infty$.

3. (The torus) Let $r > s > 0$, let $O := \,]0, 2\pi[ \,\times\, ]0, 2\pi[$ and let $f : O \to \mathbb{R}^3$ be given by $f(\alpha, \beta) = ((r + s\cos\beta)\cos\alpha, (r + s\cos\beta)\sin\alpha, s\sin\beta)$. Show that $f$ is an embedding onto the torus $T$ deprived of its greatest circle and of the set $T \cap (\mathbb{R}_+ \times \{0\} \times \mathbb{R})$, where
$$T := \{(x, y, z) \in \mathbb{R}^3 : (\sqrt{x^2 + y^2} - r)^2 + z^2 = s^2\}.$$
4. Using the Submersion Theorem, show that $T$ is a $C^\infty$-submanifold of $\mathbb{R}^3$.

5. (a) (Beltrami's tractricoid) Let $f : \mathbb{R} \to \mathbb{R}^2$ be given by $f(t) := (1/\cosh t, t - \tanh t)$. Determine the points of $T := f(\mathbb{R})$ that are smooth.
(b) (Beltrami's pseudo-sphere) Let $g(s, t) := (\cos s/\cosh t, \sin s/\cosh t, t - \tanh t)$. Determine the points of $S := g(\mathbb{R}^2)$ that are smooth. They form a surface of (negative) constant Gaussian curvature. This can serve as a model for hyperbolic geometry.

6. Study the cross-cap surface
$$\{((1 + \cos v)\cos u, (1 + \cos v)\sin u, \tanh(u - \pi)\sin v) : (u, v) \in [0, 2\pi] \times [0, 1]\}$$
and compare it with the self-intersecting disc, the image of $[0, 2\pi] \times [0, 1]$ under the parameterization
$$(u, v) \mapsto (v\cos 2u, v\sin 2u, v\cos u).$$

7. Study Whitney's umbrella $\{(uv, u, v^2) : (u, v) \in \mathbb{R}^2\}$. Verify that it is determined by the equation $x^2 - y^2 z = 0$. Such a surface is of interest in the theory of singularities. For this surface or the preceding one, make some drawings if you can or find some on the internet.

8. Let $O := \,]0, 1[ \,\cup\, ]1, \infty[ \,\subset \mathbb{R}$, $f : O \to \mathbb{R}^2$ being given by $f(t) = (t + t^{-1}, 2t + t^{-2})$. Show that $f$ is an embedding, but that its continuous extension to $]0, +\infty[$ given by $f(1) = (2, 3)$ is of class $C^k$ but is not an immersion.

9. Let $X$ be a normed space and let $f : X \to \mathbb{R}$ be Lipschitzian around $\bar{x} \in X$. Show that $f$ is Hadamard differentiable at $\bar{x}$ if and only if the tangent cone to the graph $G$ of $f$ at $(\bar{x}, f(\bar{x}))$ is a hyperplane.

10. Show that the fact that the tangent cone at $(\bar{x}, f(\bar{x}))$ to the epigraph $E$ of $f$ is a half-space does not imply that $f$ is Hadamard differentiable at $\bar{x}$.
5.5.5 *The Eikonal Equation

The eikonal equation plays a role in wave propagation phenomena such as rings on the surface of water, or seismic techniques for petrol exploration. We present a study of this equation in the form of a detailed problem.

(A) Given a nonempty subset $S$ of a Banach space $X$ we denote by $d_S : X \to \mathbb{R}$ the distance function to $S$ given by
$$d_S(x) := \inf_{y \in S} \|x - y\|, \qquad x \in X.$$

1) Verify that $d_S$ is Lipschitzian with rate 1 so that when $d_S$ is differentiable at some $x \in X$ one has $\|d_S'(x)\| \leq 1$.

2) a) Suppose there exists a $\bar{y} \in S$ such that $\|x - \bar{y}\| \leq \|x - y\|$ for all $y \in S$. Setting $x_t := (1 - t)x + t\bar{y}$ for $t \in [0, 1]$, verify that $d_S(x_t) = (1 - t)d_S(x)$.
b) Assuming moreover that $d_S$ is differentiable at $x$, deduce from this the relation
$$d_S'(x)(\bar{y} - x) = -\|\bar{y} - x\|.$$
c) Conclude that in this case, when $x \notin S$, one has $\|d_S'(x)\| = 1$.

(B) Consider the example
$$S := \{(x_1, x_2) \in \mathbb{R}^2 : x_1^2 + x_2^2 = 1, \; x_1 > -1\} \subset X := \mathbb{R}^2$$
with its Euclidean norm. Let $W := \,]-\pi, \pi[$, $g : W \to \mathbb{R}^2$ be given by $g(w) = (\cos w, \sin w)$.

1) Show that $d_S(x) = |\,\|x\| - 1\,|$ for $x \in \mathbb{R}^2$ by first observing that for $x \in \mathbb{R}^2 \setminus D$, where $D := \,]-\infty, 0] \times \{0\}$, there exists a $\bar{y} \in S$ such that $\|x - \bar{y}\| \leq \|x - y\|$ for all $y \in S$.

2) Show that the map $f : (w, r) \mapsto ((r + 1)\cos w, (r + 1)\sin w)$ is a $C^1$ diffeomorphism from $U := \,]-\pi, \pi[ \,\times\, ]-1, 1[$ onto a neighborhood $V$ of $S$.

3) Set $f^{-1}(x_1, x_2) := (w(x_1, x_2), r(x_1, x_2))$, compute $r(x_1, x_2)$, and show that
$$\left(\frac{\partial r}{\partial x_1}(x_1, x_2)\right)^2 + \left(\frac{\partial r}{\partial x_2}(x_1, x_2)\right)^2 = 1.$$
(C) In the sequel $S$ is a hypersurface of $X$ parametrized by a $C^1$-embedding $g : W \to X$ where $W$ is an open subset of a Banach space $H$, $g'(w)$ being injective for all $w \in W$. It is also assumed that there exists a $C^1$ map $n : W \to X \setminus \{0\}$ such that for all $w \in W$ the subspace $\mathbb{R}n(w) := \{rn(w) : r \in \mathbb{R}\}$ is a topological complement of $g'(w)(H)$.

1) Let $f : W \times \mathbb{R} \to X$ be given by $f(w, r) := g(w) + rn(w)$. Compute $f'(w, r)(u, s)$ for $(w, r) \in W \times \mathbb{R}$, $(u, s) \in H \times \mathbb{R}$.

2) Show that for all $w \in W$, $f'(w, 0)$ is an isomorphism from $H \times \mathbb{R}$ onto $X$.

3) Deduce from this that for all $w \in W$ there exists an open neighborhood $U_w$ of $(w, 0)$ in $W \times \mathbb{R}$ and an open neighborhood $V_w$ of $g(w)$ in $X$ such that $f$ induces a diffeomorphism from $U_w$ onto $V_w$.

4) In the sequel it is assumed that there exists an open subset $U_0$ of $W \times \mathbb{R}$ containing $W \times \{0\}$ such that $f \mid U_0$ is injective. Show that there exists an open subset $U$ of $U_0$ containing $W \times \{0\}$ and an open subset $V$ of $X$ containing $S$ such that $f$ induces a diffeomorphism from $U$ onto $V$. The inverse of $f \mid U$ is denoted by $h : x \mapsto (w(x), r(x))$.

(D) From now on $X$ is a Hilbert space whose norm is associated with the scalar product $\langle \cdot \mid \cdot \rangle$.

1) Assume that for all $w \in W$ one has $\|n(w)\| = 1$. Show that for all $(w, u) \in W \times H$ one has $\langle n'(w)(u) \mid n(w) \rangle = 0$.

2) Assuming moreover that for all $w \in W$ the vector $n(w)$ is orthogonal to $g'(w)(H)$, show that
$$\langle f'(w, r)(u, s) \mid n(w) \rangle = s \qquad \forall w \in W, \; (u, s) \in H \times \mathbb{R}.$$
3) Given $v \in X$, one takes $u := w'(x)(v)$, $s := r'(x)(v)$. Show that
$$\langle f'(w(x), r(x))(u, s) \mid n(w(x)) \rangle = s.$$
Conclude that
$$r'(x)(v) = \langle n(w(x)) \mid v \rangle,$$
and $\|r'(x)\| \; (= \|n(w(x))\|) = 1$.
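The eikonal relation $\|r'(x)\| = 1$ obtained at the end of the problem can be checked numerically on the example of part (B). The sketch below is a plausibility check, not a solution of the problem. It assumes that the second component of $f^{-1}$ in part (B) is $r(x_1, x_2) = \|x\| - 1$, which is what inverting $f(w, r) = ((r + 1)\cos w, (r + 1)\sin w)$ suggests, and it approximates the partial derivatives by central differences. The sample points and the step `h` are arbitrary choices.

```python
import math

def r(x1, x2):
    """Signed distance to the unit circle (assumed form of r in part (B))."""
    return math.hypot(x1, x2) - 1.0

def grad_norm(x1, x2, h=1e-6):
    """Euclidean norm of the gradient of r, estimated by central differences."""
    d1 = (r(x1 + h, x2) - r(x1 - h, x2)) / (2.0 * h)
    d2 = (r(x1, x2 + h) - r(x1, x2 - h)) / (2.0 * h)
    return math.hypot(d1, d2)

# Sample points away from the origin and from the half-line D.
points = [(0.9, 0.3), (0.2, -1.4), (1.5, 1.5), (0.1, 0.6)]
norms = [grad_norm(*p) for p in points]
for p, nrm in zip(points, norms):
    print(p, f"|grad r| ~ {nrm:.8f}")
```

All values are close to 1, in agreement with $(\partial r/\partial x_1)^2 + (\partial r/\partial x_2)^2 = 1$.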
5.5.6 *Critical Points

For a function $f : W \to \mathbb{R}$ of class $C^1$ on an open subset $W$ of a Banach space $X$ a point $\bar{x} \in W$ such that $f'(\bar{x}) = 0$ is called a critical point of $f$. The set $C_f$ of critical points of $f$ plays an important role: for several phenomena it is more important than the set of minimizers of $f$ or the set of maximizers of $f$. Of course, such sets are subsets of $C_f$; but even if $\bar{x} \in C_f$ is neither a local minimizer nor a local maximizer of $f$, one knows that $f$ does not change much around $\bar{x}$ and that is already a noticeable property often called stationarity.

Under an additional nondegeneracy assumption, the behavior of $f$ around a critical point can be described in a very simple manner. Let us note that it offers an analogy with the case $f'(\bar{x}) \neq 0$ for which one can find a neighborhood $V$ of $\bar{x}$ and a $C^1$-diffeomorphism $h : V \to U$ onto a neighborhood $U$ of $0$ such that $h(\bar{x}) = 0$ and $f(x) = f(\bar{x}) + \ell(h(x))$ for $x \in V$ with $\ell := f'(\bar{x})$. Here, in contrast, we suppose $f'(\bar{x}) = 0$ and that $\bar{x}$ is a nondegenerate critical point in the sense that the second derivative $b := f''(\bar{x})$ of $f$ at $\bar{x}$ is such that the map $x \mapsto b(x, \cdot)$ is a linear isomorphism from $X$ onto $X^*$. Again, a change of variables yields a very simple form of the function.

Theorem 5.23 (Morse-Palais Lemma) Let $f : W \to \mathbb{R}$ be a function of class $C^n$ ($n \geq 3$) on an open subset $W$ of a Hilbert space $X$ with inner product $\langle \cdot \mid \cdot \rangle$. Let $\bar{x} \in W$ be a nondegenerate critical point of $f$ and let $b := f''(\bar{x})$. Then there exist an open neighborhood $V \subset W$ of $\bar{x}$, an open neighborhood $U$ of $0$ in $X$ and a $C^{n-2}$-diffeomorphism $h : V \to U$ such that $h(\bar{x}) = 0$, $h'(\bar{x}) = I_X$ and
$$f(x) = f(\bar{x}) + \tfrac{1}{2} b(h(x), h(x)) \qquad \forall x \in V.$$
(The factor $\frac{1}{2}$ is the one dictated by Taylor's formula, since $h'(\bar{x}) = I_X$.)
Thus, by "the change of variables" $h$, $f$ can be considered as a simple quadratic form.

Proof Without loss of generality we may assume $\bar{x} = 0$, $f(0) = 0$ and that $W$ is a ball centered at $0$. Then
$$f(x) = \int_0^1 f'(tx)x \, dt, \qquad x \in W,$$
and, similarly, replacing $f$ with $f'$,
$$f'(tx) = \int_0^1 f''(stx)tx \, ds, \qquad x \in W, \; t \in [0, 1],$$
so that $f(x) = \langle G(x)(x) \mid x \rangle$ with
$$G(x) = \int_0^1 \int_0^1 f''(stx)t \, ds \, dt.$$
Here $G$ is a map of class $C^{n-2}$ from $W$ into the space $L^2_s(X, X^*)$ of symmetric linear maps from $X$ into $X^*$ identified with the space of continuous symmetric bilinear forms on $X$. Since $\int_0^1 \int_0^1 t \, ds \, dt = \frac{1}{2}$, the map $B := G(0) = \frac{1}{2} f''(0)$ is, by assumption, an isomorphism of $X$ onto $X^*$. Thus, for $x$ close to $0$, $G(x)$ is an isomorphism of $X$ onto $X^*$. Let $H : W \to L_s(X, X)$ be given by
$$H(x) := B^{-1} \circ G(x),$$
so that, for $x$ close to $0$, $H(x)$ is an isomorphism of $X$ onto $X$ close to $I_X$ and since $B$ and $G(x)$ are symmetric
$$H(x)^* = G(x) \circ B^{-1}, \qquad H(x)^* \circ B = B \circ H(x).$$
Let $R(x)$ be the square root of $H(x)$, so that $R(x) \circ R(x) = H(x)$. Since $R(x)$ is the sum of a convergent series in $H(x)$ and $R(x)^*$ is the sum of a convergent series in $H(x)^*$, we have
$$R(x)^* \circ B = B \circ R(x),$$
hence
$$R(x)^* \circ B \circ R(x) = B \circ R(x) \circ R(x) = B \circ H(x) = G(x).$$
Setting $h(x) := R(x)(x)$ we get
$$\langle B(h(x)) \mid h(x) \rangle = \langle B(R(x)(x)) \mid R(x)(x) \rangle = \langle (R(x)^* \circ B \circ R(x))(x) \mid x \rangle = \langle G(x)(x) \mid x \rangle = f(x).$$
Finally, $h$ is of class $C^{n-2}$ and $h'(0) = R(0)$, the square root of $H(0) = I_X$, so that $h$ defines a diffeomorphism as required. □

Exercise Using the spectral decomposition of $B$, show that one can find two closed subspaces $Y$ and $Z$ of $X$ endowed with compatible norms such that $X = Y \oplus Z$,
$Z = Y^\perp$ and for $x := y + z$ with $(y, z) \in Y \times Z$
$$f(h^{-1}(x)) = f(\bar{x}) + \|y\|^2 - \|z\|^2.$$

Exercise For $r \in \mathbb{R}$, classify the critical points of $f_r : \mathbb{R}^2 \to \mathbb{R}$ defined by $f_r(x, y) := x^3 - y^3 + 3rxy$ as local maximizers, minimizers or saddle points.

The search for critical points is facilitated by the following condition.

Definition 5.16 A differentiable function $f : W \to \mathbb{R}$ is said to satisfy the Palais-Smale condition (PS$_c$) for some value $c \in \mathbb{R}$ if any sequence $(x_n)$ of $W$ such that $(f(x_n)) \to c$ and $(f'(x_n)) \to 0$ has a convergent subsequence. It satisfies condition (PS) if for all $c \in \mathbb{R}$ it satisfies condition (PS$_c$).

Such a condition is a kind of compactness condition that involves $f$ itself rather than the space $X$: of course, a coercive differentiable function on a finite dimensional space satisfies condition (PS). But condition (PS) can also be satisfied for interesting functions on infinite dimensional spaces. Such a fact has been used for the study of partial differential equations.

An important consequence of condition (PS$_c$) is the following deformation property around a level $c$ which is not a critical value of $f$, i.e. not an element of $f(C_f)$, where $C_f$ is the set of critical points of $f$. It is obtained by taking the flow of an appropriate vector field. We admit it. Here we say that $f$ is of class $C^{1,1}$ if $f$ is of class $C^1$ and if its derivative is Lipschitzian on bounded subsets. Also, for $r \in \mathbb{R}$, we set $W_r := \{w \in W : f(w) \leq r\}$.

Theorem 5.24 Suppose that $f : W \to \mathbb{R}$ is of class $C^{1,1}$ and satisfies condition (PS$_c$) for some $c \in \mathbb{R}$ such that $C_f \cap f^{-1}(c) = \emptyset$. Then for all $\varepsilon > 0$ sufficiently small there exists some $\delta \in \,]0, \varepsilon[$ and a homotopy $h : W \times [0, 1] \to W$, i.e. a continuous map satisfying $h(\cdot, 0) = I_W(\cdot)$, the identity map, and the following properties:
(a) $h(w, 1) = w$ for all $w \in W \setminus f^{-1}([c - \varepsilon, c + \varepsilon])$;
(b) $h(w, 1) \in W_{c-\delta} := f^{-1}(]-\infty, c - \delta])$ for all $w \in W_{c+\delta}$;
(c) $f(h(w, t)) \leq f(w)$ for all $(w, t) \in W \times [0, 1]$.
This deformation property can be used to obtain the following rather intuitive result known as the Mountain Pass Theorem. Hikers will not be surprised by it (Fig. 5.3).

Theorem 5.25 (Ambrosetti-Rabinowitz) Let $W$ be an open subset of a Banach space $X$ and let $f : W \to \mathbb{R}$ be of class $C^{1,1}$ and such that for some $w_0 \in W$, $r \in \,]0, d(w_0, X \setminus W)[$, $w_1 \in W \setminus B[w_0, r]$, $m > \max(f(w_0), f(w_1))$ one has $f(w) \geq m$ for all $w \in S(w_0, r) := \{w \in X : \|w - w_0\| = r\}$. If $f$ satisfies condition (PS$_c$) for
$$c := \inf_{g \in G} \max_{t \in T} f(g(t))$$
with $T := [0, 1]$, $G := \{g \in C(T, W) : g(0) = w_0, \; g(1) = w_1\}$, then $c$ is a critical value of $f$, i.e. $c = f(w)$ for some critical point $w$ of $f$.
Fig. 5.3 The Mountain Pass Theorem
Proof The Custom Lemma 2.5 (or just the intermediate value theorem) ensures that for all $g \in G$ there exists some $t_g \in T$ such that $g(t_g) \in S(w_0, r)$. Thus, for all $g \in G$ one has $\max_{t \in T} f(g(t)) \geq m$, hence $c \geq m$. Assuming that $C_f \cap f^{-1}(c) = \emptyset$, i.e. that $c$ is not a critical value of $f$, we will obtain a contradiction with the deformation property in which we take $\varepsilon > 0$ satisfying $\varepsilon < m - \max(f(w_0), f(w_1))$. Let $\delta \in \,]0, \varepsilon[$ and let $h : W \times [0, 1] \to W$ be a homotopy satisfying conditions (a), (b), (c) of Theorem 5.24. Let $g_\delta \in G$ be such that $\max_{t \in T} f(g_\delta(t)) \leq c + \delta$ and let $g := h(g_\delta(\cdot), 1)$. Since $f(w_i) < m - \varepsilon \leq c - \varepsilon$ for $i = 0, 1$, condition (a) gives $h(w_0, 1) = w_0$, $h(w_1, 1) = w_1$; since $h$ is continuous, we have $g \in G$, whence $\max_{t \in T} f(g(t)) \geq c$. However, since $g_\delta(T) \subset W_{c+\delta}$ by our choice of $g_\delta$ and since $h(w, 1) \in W_{c-\delta}$ for all $w \in W_{c+\delta}$ by condition (b) of Theorem 5.24, we get $\max_{t \in T} f(g(t)) \leq c - \delta$, a contradiction with the definition of $c$. □

The following example shows that one cannot drop assumption (PS$_c$).

Example (Brézis-Nirenberg) Let $f : \mathbb{R}^2 \to \mathbb{R}$ be given by $f(x, y) = x^2 + (1 - x)^3 y^2$, let $w_0 := (0, 0)$, $w_1 := (2, 2)$, $r := 1/2$. Then one can show that $f(w_0) = 0$, $f(w_1) = 0$, $\inf f(r\mathbb{S}^1) > 0$ but $0$ is the unique critical point of $f$. Note that one can find a sequence $(w_n)$ such that $(f'(w_n)) \to 0$, $(f(w_n)) \to c$ but $(\|w_n\|) \to \infty$.

In Sect. 9.4 an application of the Mountain Pass Theorem will be given to semilinear boundary-value problems.
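The elementary claims of the Brézis-Nirenberg example can be checked numerically. The sketch below assumes the reading $f(x, y) = x^2 + (1 - x)^3 y^2$ of the formula, which is the one consistent with $f(w_1) = 0$. It evaluates $f$ at $w_0$ and $w_1$, samples $f$ on the circle of radius $r = 1/2$ and uses the hand-computed gradient. The sampling resolution is an arbitrary choice, and the check is not a proof.

```python
import math

def f(x, y):
    return x * x + (1.0 - x) ** 3 * y * y

def grad_f(x, y):
    """Gradient of f computed by hand from the formula above."""
    return (2.0 * x - 3.0 * (1.0 - x) ** 2 * y * y,
            2.0 * (1.0 - x) ** 3 * y)

w0, w1, r = (0.0, 0.0), (2.0, 2.0), 0.5

# Minimum of f over 3600 sample points of the circle of radius r centered at w0.
circle_min = min(
    f(r * math.cos(2.0 * math.pi * k / 3600), r * math.sin(2.0 * math.pi * k / 3600))
    for k in range(3600)
)

print("f(w0) =", f(*w0), " f(w1) =", f(*w1))
print("sampled min of f on the circle:", circle_min)
print("gradient at the origin:", grad_f(*w0))
```

On the circle one has $|x| \leq 1/2$, hence $(1 - x)^3 \geq 1/8$ and $f(x, y) \geq (x^2 + y^2)/8 = 1/32$, which the sampled minimum respects. The gradient formula also shows why the origin is the only critical point: $\partial f/\partial y = 0$ forces $y = 0$ or $x = 1$, and $\partial f/\partial x$ equals $2x$ in the first case and $2$ in the second.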
Exercises

1. Let $S$ be a subset of a proper affine subspace $A$ of $\mathbb{R}^d$ contained in a ball with radius $r$ and for $\varepsilon > 0$ let $S_\varepsilon := S + \varepsilon B$, where $B$ is the closed unit ball for the norm $\|\cdot\|_\infty$. Prove the following estimate for the outer Lebesgue measure $\lambda_d^*$ of $S_\varepsilon$: $\lambda_d^*(S_\varepsilon) \leq 2^d (r + \varepsilon)^{d-1} \varepsilon$. [Hint: use a translation and an orthogonal transformation.]

2. (Sard's Theorem) Let $W$ be an open subset of $\mathbb{R}^d$ and let $f : W \to \mathbb{R}^d$ be of class $C^1$. Denote by $C$ the set of critical points of $f$, i.e. the set of points $z \in W$ such that $Df(z)$ is not surjective. Prove that the set $f(C)$ of critical values of $f$ has measure zero. [Hint: use the preceding exercise and the Mean Value Theorem. See [190, Section 13.5.1].]
5.5.7 *The Method of Characteristics

Let us consider the nonlinear partial differential equation
$$F(w, Du(w), u(w)) = 0, \qquad w \in W_0, \tag{5.27}$$
where $W$ is a reflexive Banach space, $W_0$ is an open subset of $W$ whose boundary $\partial W_0$ is a submanifold of class $C^2$ and $F : (w, p, z) \mapsto F(w, p, z)$ is a function of class $C^2$ on $W_0 \times W^* \times \mathbb{R}$. We look for a solution $u$ of class $C^2$ satisfying the boundary condition
$$u \mid \partial W_0 = g, \tag{5.28}$$
where $g : \partial W_0 \to \mathbb{R}$ is a given function of class $C^2$. We leave apart the question of compatibility conditions for the data $(F, g)$. The method of characteristics consists in associating to (5.27) a system of ordinary differential equations (in which $W^{**}$ is identified with $W$) called the system of characteristics:
$$w'(s) = D_p F(w(s), p(s), z(s)) \tag{5.29}$$
$$p'(s) = -D_w F(w(s), p(s), z(s)) - D_z F(w(s), p(s), z(s)) p(s) \tag{5.30}$$
$$z'(s) = \langle D_p F(w(s), p(s), z(s)), p(s) \rangle. \tag{5.31}$$
Suppose a smooth solution $u$ of (5.27) is known. Let us relate it to a solution $s \mapsto (w(s), p(s), z(s))$ of the system (5.29)–(5.31). Let
$$q(s) := Du(y(s)), \qquad r(s) := u(y(s)),$$
where $y(\cdot)$ is the solution of the differential equation
$$y'(s) := D_p F(y(s), Du(y(s)), u(y(s))), \qquad y(0) = w_0.$$
Then
$$r'(s) = Du(y(s)).y'(s) = \langle q(s), D_p F(y(s), q(s), r(s)) \rangle.$$
For all $e \in W$, identifying $W$ and $W^{**}$ we have
$$q'(s).e = D^2 u(y(s)).y'(s).e = \langle D_p F(y(s), q(s), r(s)), D^2 u(y(s)).e \rangle.$$
Now, taking the derivative of the function $F(\cdot, Du(\cdot), u(\cdot))$ and writing $u$, $Du$ instead of $u(w)$, $Du(w)$, we have
$$D_w F(w, Du, u)e + D_p F(w, Du, u)D^2 u(w).e + D_z F(w, Du, u)Du(w)e = 0.$$
Thus, replacing $(w, Du, u)$ by $(y(s), q(s), r(s))$ and noting that $e$ is arbitrary in $W$, we get
$$q'(s) = -D_w F(y(s), q(s), r(s)) - D_z F(y(s), q(s), r(s))q(s).$$
It follows that $s \mapsto (y(s), q(s), r(s))$ is a solution of the characteristic system. Taking the same initial data $(w_0, p_0, g(w_0))$, by uniqueness of the solution of the characteristic system, we get $y(s) = w(s)$, $p(s) = q(s)$ and $z(s) = r(s) := u(w(s))$. This means that knowing the solution of the characteristic system, we get the value of $u$ at $w(s)$. If around some point $\bar{w} \in W_0$ we can represent any point $w$ of a neighborhood of $\bar{w}$ as the value $w(s)$ for the solution of (5.29)–(5.31) issued from some initial data, then we get $u$ around $\bar{w}$. In the following classical example, the search for the initial data is particularly simple.

Example: quasilinear equations Let $W := \mathbb{R}^n$, $W_0 := \mathbb{R}^{n-1} \times \mathbb{P}$, $F$ being given by $F(w, p, z) := p.b(w, z) - c(w, z)$, where $b : W_0 \times \mathbb{R} \to W$, $c : W_0 \times \mathbb{R} \to \mathbb{R}$. Then, taking into account the relation $D_p F(w(s), p(s), z(s)).p(s) = p(s).b(w(s), z(s)) = c(w(s), z(s))$, equations (5.29), (5.31) of the characteristic system read as a system in $(w, z)$:
$$w'(s) = b(w(s), z(s)), \qquad z'(s) = c(w(s), z(s)).$$
In the case $b := (b_1, \ldots, b_n)$ is constant with $b_n \neq 0$ and $c(w, z) := z^{k+1}/k$, with $k > 0$, the solution of this system with initial data $((v, 0), g(v)) \in \mathbb{R}^n \times \mathbb{P}$ is given by
$$w_i(s) = b_i s + v_i \;\; (i = 1, \ldots, n - 1), \qquad w_n(s) = b_n s, \qquad z(s) = \frac{g(v)}{(1 - g(v)^k s)^{1/k}}.$$
It is defined for $s$ in the interval $S := [0, g(v)^{-k}[$. Given $x := (x_1, \ldots, x_n) \in W_0$ near $\bar{x} \in W_0$, the initial data $v$ is found by solving the equations $b_i s + v_i = x_i$ ($i \in \mathbb{N}_{n-1}$), $x_n = b_n s$: $v_i = x_i - a_i x_n$ with $a_i := b_i/b_n$. The preceding shows that, with $v := (x_1 - a_1 x_n, \ldots, x_{n-1} - a_{n-1} x_n)$ and $s = x_n/b_n$, $u$ is given by
$$u(x) = \frac{g(v)}{(1 - g(v)^k x_n/b_n)^{1/k}}$$
for $x$ in the set $\{(x_1, \ldots, x_n) : x_n \, g(x_1 - a_1 x_n, \ldots, x_{n-1} - a_{n-1} x_n)^k < b_n\}$. □
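The closed form of $z(s)$ in the quasilinear example can be compared with a direct numerical integration of $z'(s) = z(s)^{k+1}/k$. The sketch below uses a classical fourth-order Runge-Kutta scheme written by hand. The values `g0` (standing for $g(v)$) and `k`, the number of steps, and the function names are illustrative assumptions, not data from the text.

```python
def z_exact(s, g0, k):
    """Closed form z(s) = g0 / (1 - g0**k * s)**(1/k), valid for 0 <= s < g0**(-k)."""
    return g0 / (1.0 - g0 ** k * s) ** (1.0 / k)

def z_rk4(s_end, g0, k, steps=2000):
    """Integrate z' = z**(k+1) / k from z(0) = g0 up to s_end with classical RK4."""
    rhs = lambda z: z ** (k + 1) / k
    h = s_end / steps
    z = g0
    for _ in range(steps):
        s1 = rhs(z)
        s2 = rhs(z + 0.5 * h * s1)
        s3 = rhs(z + 0.5 * h * s2)
        s4 = rhs(z + h * s3)
        z += h * (s1 + 2.0 * s2 + 2.0 * s3 + s4) / 6.0
    return z

g0, k = 0.8, 2.0              # illustrative values of g(v) > 0 and k > 0
s_blowup = g0 ** (-k)         # right end of the interval of definition
s = 0.5 * s_blowup            # a point well inside the interval

print("numerical:", z_rk4(s, g0, k))
print("exact    :", z_exact(s, g0, k))
```

The two values agree to many digits. Choosing `s` close to `s_blowup` would show the solution growing without bound, which is why the interval of definition is $[0, g(v)^{-k}[$.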
A special case of equation (5.27) is of great importance. It corresponds to the case $w := (x, t) \in W_0 := U \times \,]0, \tau[$ for some $\tau \in \,]0, +\infty]$ and some open subset $U$ of a hyperplane $X$ of $W$ and $F((x, t), (y, v), z) := v + H(x, t, y, z)$, so that equation (5.27) and the boundary condition (5.28) take the form
$$D_t u(x, t) + H(x, t, D_x u(x, t), u(x, t)) = 0, \qquad (x, t) \in U \times \,]0, \tau[, \tag{5.32}$$
$$u(x, 0) = g(x), \qquad x \in U. \tag{5.33}$$
Such a system is called a Hamilton-Jacobi equation. Let us note that, as in the example of quasilinear equations, the general case can be reduced to this form under a mild condition. First, since $W_0$ is the interior of a smooth manifold with boundary, taking a chart, we may assume for a local study that $W_0 = U \times \,]0, \tau[$ for some $\tau > 0$ and some open subset $U$ of a hyperplane $X$ of $W$. Now, using the implicit function theorem around $w \in \partial W_0$, $F$ can be reduced to the form $F((x, t), (y, v), z) := v + H(x, t, y, z)$ provided $D_v F(w, p, z) \neq 0$. Such a condition can be expressed intrinsically (i.e., without using the chart) by finding a vector $v$ transverse to $\partial W_0$ at $w$ such that $D_w F(w, y, z).v \neq 0$.

The characteristic system associated with (5.32) can be reduced to
$$x'(s) = D_y H(x(s), s, y(s), z(s)) \tag{5.34}$$
$$y'(s) = -D_x H(x(s), s, y(s), z(s)) - D_z H(x(s), s, y(s), z(s)) y(s) \tag{5.35}$$
$$z'(s) = D_y H(x(s), s, y(s), z(s)).y(s) - H(x(s), s, y(s), z(s)) \tag{5.36}$$
by dropping the equation $t'(s) = 1$ and noting that an equation for $D_t u(x(s), t(s))$ is not needed since this derivative is known to be $-H(x(s), s, y(s), z(s))$. (Equation (5.35) is the $y$-component of (5.30) for $F = v + H$.) In order to take into account the dependence on the initial condition $(v, Dg(v), g(v))$, the one-jet of $g$ at $v \in U \subset X$, let us denote by $s \mapsto (\hat{x}(s, v), \hat{y}(s, v), \hat{z}(s, v))$ the solution to the system (5.34)–(5.36). Since the right-hand side of this system is of class $C^1$, the theory of differential equations ensures that the solution is a mapping of class $C^1$ in $(s, v)$. In view of the initial data, we have
$$D_v \hat{x}(0, v)v' = v' \qquad \forall v \in U, \; v' \in X.$$
It follows that for all $\bar{v} \in U$ there exist a neighborhood $V$ of $\bar{v}$ in $U$ and some $\sigma \in \,]0, \tau[$ such that, for $s \in \,]0, \sigma[$, the map $\hat{x}_s : v \mapsto \hat{x}(s, v)$ is a diffeomorphism from $V$ onto $V_s := \hat{x}(s, V)$. From the preceding analysis, we get that for $x \in V_s$ one has $u(x, s) = \hat{z}(s, v)$ with $v := (\hat{x}_s)^{-1}(x)$. Thus we get a local solution to the system (5.32)–(5.33). In general, one cannot get a global solution with such a method: it may happen that for two values $v_1$, $v_2$ of $v$ the characteristic curves issued from $v_1$ and $v_2$ take the same value for some $t > 0$.
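The obstruction to global solutions mentioned above can be seen on a one-dimensional illustration. The sketch below rests on choices of our own: $X = \mathbb{R}$, $H(x, t, y, z) := zy$ (a Burgers-type equation) and the initial datum $g(v) = -\tanh v$. For this $H$ the system (5.34)–(5.36) gives $z' = 0$ and $x' = z$, hence $\hat{x}(s, v) = v + s\,g(v)$ and $\hat{z}(s, v) = g(v)$. The code locates two distinct starting points whose characteristics meet at time $s = 2$ while carrying different values.

```python
import math

def g(v):
    return -math.tanh(v)                 # illustrative initial datum

def x_hat(s, v):
    return v + s * g(v)                  # characteristic issued from v; z stays equal to g(v)

def dv_x_hat(s, v):
    return 1.0 - s / math.cosh(v) ** 2   # derivative of x_hat with respect to v

def bisect(fun, lo, hi, tol=1e-12):
    """Plain bisection; assumes fun(lo) and fun(hi) have opposite signs."""
    flo = fun(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = fun(mid)
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

s = 2.0
v_star = bisect(lambda v: x_hat(s, v), 1.0, 3.0)   # positive root of v = 2 tanh v

print("v* =", v_star)
print("x_hat(2, v*) =", x_hat(s, v_star), " x_hat(2, -v*) =", x_hat(s, -v_star))
print("values carried:", g(v_star), g(-v_star))
print("D_v x_hat(0.5, 0) =", dv_x_hat(0.5, 0.0), " D_v x_hat(1, 0) =", dv_x_hat(1.0, 0.0))
```

For $s < 1$ the derivative $D_v \hat{x}(s, v)$ stays positive, so $v \mapsto \hat{x}(s, v)$ is invertible and the local construction works. At $s = 1$ it vanishes at $v = 0$, and at $s = 2$ the characteristics issued from $v^*$ and $-v^*$ both reach $x = 0$ with different values of $\hat{z}$, so no single-valued $u(0, 2)$ can be read off.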
Exercises

1. Write down the characteristic system for the conservation law
$$D_t u(x, t) + D_x u(x, t).b(u(x, t)) = 0, \qquad u(v, 0) = g(v),$$
where $b : \mathbb{R} \to X$, $g : X \to \mathbb{R}$ are of class $C^1$. Verify that its solution satisfies $\hat{x}(s, v) = v + s\,b(g(v))$, $\hat{z}(s, v) = g(v)$. Compute $D_v \hat{x}(s, v)$ and show that for all $\bar{v} \in X$, this element of $L(X, X)$ is invertible for $(s, v)$ close enough to $(0, \bar{v})$. Deduce a local solution of the conservation law equation from this property.

2. (Haar's Uniqueness Theorem) Suppose $X = \mathbb{R}$ and $H : X \times \mathbb{R} \times X^* \times \mathbb{R} \to \mathbb{R}$ satisfies the Lipschitz condition with constants $k$, $\ell$
$$|H(x, t, y, z) - H(x, t, y', z')| \leq k|y - y'| + \ell|z - z'|$$
for $(x, t, y, y', z, z') \in T \times \mathbb{R}^4$, where, for some constants $a, b, c$, $T$ is the triangle $T := \{(x, t) \in X \times [0, a] : x \in [b + \ell t, c - \ell t]\}$. Show that if $u_1$, $u_2$ are two solutions of class $C^1$ in $T$ of the system (5.32)–(5.33), then $u_1 = u_2$.

3. Suppose $X = \mathbb{R}$, $g = I_X$ and $H : X \times \mathbb{R} \times X^* \times \mathbb{R} \to \mathbb{R}_\infty$ is given by $H(x, t, y, z) := |t - 1|^{-1/2} y$ for $t \in [0, 1[$, $+\infty$ otherwise. Using the method of characteristics, show that a solution to the system (5.32)–(5.33) is given by $u(x, t) = x - 2 + 2\sqrt{1 - t}$ for $(x, t) \in X \times \,]0, 1[$.

4. Suppose $X = \mathbb{R}$ and $g$ and $H$ are given by $g(x) := x^2/2$ and $H(x, t, y, z) := -y^2/2$. Using the method of characteristics, show that a solution to the system (5.32)–(5.33) is given by $u(x, t) = x^2/(2(1 - t))$ for $(x, t) \in X \times (0, 1)$.

5. Suppose $X = \mathbb{R}$, $g$ and $H$ are given by $g(x) := x$, $H(x, t, y, z) := e^{-3t} yz(a'(t)e^{2t} + b'(t)z^2) - z$, where $a$ and $b$ are nonnegative functions of class $C^1$ satisfying $a(0) = 1$, $b(0) = 0$, $a + b > 0$. Show that the characteristics associated with the system (5.32)–(5.33) satisfy $\hat{x}(t, v) = a(t)v + b(t)v^3$, $\hat{z}(t) = e^t v$, so that $v \mapsto \hat{x}(t, v)$ is a bijection. Assuming that there exists some $\tau > 0$ such that $a(t) = 0$ for $t \geq \tau$, show that $u(x, t) = e^t b(t)^{-1/3} x^{1/3}$ for $(x, t) \in X \times [\tau, \infty[$ so that $u$ is not differentiable at $(0, t)$.

6. Suppose $X = \mathbb{R}$ and $g$ and $H$ are given by $g(x) := x^2/2$, $H(x, t, y, z) := a'(t)e^{-t} y^2/2 + b'(t)e^{-3t} y^4 - z$, where $a$ and $b$ are as in the preceding exercise. Show that the characteristics associated with the system (5.32)–(5.33) satisfy $\hat{x}(t, v) = a(t)v + 4b(t)v^3$, $\hat{z}(t) = e^t(a(t)v^2/2 + 3b(t)v^4)$, so that for $t \geq \tau$, $v \mapsto \hat{x}(t, v)$ is a bijection on a neighborhood of $0$, in spite of the fact that $D_v \hat{x}(t, 0) = 0$, and $u(x, t) = 3 \cdot 4^{-4/3} e^t b(t)^{-1/3} x^{4/3}$ (the value of $\hat{z}(t)$ at $v = (x/(4b(t)))^{1/3}$), so that $u$ is of class $C^1$ but not $C^2$ around $(0, t)$.

7*. Let $W$ be an open subset of a Banach space $X$, let $F : W \to X$ be of class $C^1$, and let $u : \mathbb{R} \times W \to W$ be its flow, so that $D_1 u = F \circ u$. Show that $D_2 u.F = F \circ u$. Given $g \in C^1(W, \mathbb{R})$, let $f := g \circ u$. Show that $f$ satisfies $D_1 f(t, x) - D_2 f(t, x)F(x) = 0$ and $f(0, x) = g(x)$ for all $x \in W$, $t \in \mathbb{R}$. For $X := \mathbb{R}^d$, $x := (x_1, \ldots, x_d)$, solve the equation $\frac{\partial f}{\partial t}(t, x) = \sum_{i=1}^d x_i \frac{\partial f}{\partial x_i}(t, x)$.
5.6 Applications to Optimization

For unconstrained optimization, differential calculus offers easily obtained criteria. Constrained minimization (or maximization) requires more attention and we have to deal with some preliminaries. We consider it in the form of the problem
$$(\mathcal{P}) \qquad \text{minimize } f(x) \text{ under the constraint } x \in F,$$
where $F$ is a nonempty subset of a normed space $X$ called the feasible set or the admissible set and where $f : X \to \mathbb{R}$ is the objective function.
5.6.1 Unconstrained Minimization

In the case when the feasible set is the whole space or an open subset $W$ of a normed space $X$, differential calculus can readily be applied to the minimization problem (for the maximization problem one uses $-f$ instead of $f$).

Proposition 5.29 Let $\bar{x} \in W$ be a local minimizer of a function $f : W \to \mathbb{R}$, i.e. such that for some neighborhood $V$ of $\bar{x}$ one has $f(\bar{x}) \leq f(v)$ for all $v \in V$. Then, if $f$ is differentiable at $\bar{x}$ one has $f'(\bar{x}) = 0$. If $f$ is twice differentiable at $\bar{x}$ then one has $f''(\bar{x})(x, x) \geq 0$ for all $x \in X$.

Proof For the first assertion it suffices to assume $f$ is Gateaux differentiable at $\bar{x}$ since for all $v \in X$ one has $(1/t)(f(\bar{x} + tv) - f(\bar{x})) \geq 0$ for $t > 0$ small enough, hence, passing to the limit as $t \to 0_+$, $d_r f(\bar{x}, v) \geq 0$. Since similarly we have $d_r f(\bar{x}, -v) \geq 0$ we conclude that $D_r f(\bar{x}) = 0$. When $f$ is twice differentiable at $\bar{x}$, Theorem 5.9 asserts that $r(x) = f(\bar{x} + x) - f(\bar{x}) - f'(\bar{x})(x) - (1/2)f''(\bar{x})(x, x)$ is a remainder of order two. Thus, for all $v \in X$, $\lim_{t \to 0_+} t^{-2} r(tv) = 0$ and
$$f''(\bar{x})(v, v) = \lim_{t \to 0_+} \frac{2}{t^2}(f(\bar{x} + tv) - f(\bar{x})) \geq 0$$
since $f'(\bar{x}) = 0$. □
A corresponding sufficient condition requires a reinforced assumption.

Proposition 5.30 Let $f : W \to \mathbb{R}$ be twice differentiable at $\bar{x} \in W$ and such that $f'(\bar{x}) = 0$ and, for some $c > 0$, $f''(\bar{x})(x, x) \geq c\|x\|^2$ for all $x \in X$. Then $\bar{x}$ is a local strict minimizer of $f$.

Note that the assumption on $f''(\bar{x})$ is satisfied if $f''(\bar{x})$ is a nondegenerate positive bilinear form.
Proof Given $\varepsilon \in \,]0, c/2[$ one can find some $\delta > 0$ such that the remainder $r$ given by $r(x) := f(\bar{x} + x) - f(\bar{x}) - f'(\bar{x})x - (1/2)f''(\bar{x})(x, x)$ satisfies $|r(x)| \leq \varepsilon\|x\|^2$ when $x \in \delta B_X$. Then, our assumptions entail that $f(v) \geq f(\bar{x}) + (c/2 - \varepsilon)\|v - \bar{x}\|^2$ for $v \in B[\bar{x}, \delta]$, hence $f(v) > f(\bar{x})$ for $v \in B[\bar{x}, \delta] \setminus \{\bar{x}\}$. □

Other criteria could be given using higher order derivatives. Hereafter we will formulate optimality conditions for the problem with constraints. These conditions will involve the concept of a normal cone.
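The quantitative estimate in the proof of Proposition 5.30 can be tested on a concrete function. The sketch below is an illustration under assumptions of our own: $X = \mathbb{R}^2$, $f(x, y) = x^2 + xy + y^2 + x^3$ and $\bar{x} = 0$. It estimates $f''(0)$ by finite differences, takes for $c$ its smallest eigenvalue, and checks the inequality $f(v) \geq f(0) + (c/2 - \varepsilon)\|v\|^2$ on a circle of radius $\delta$. Here the remainder is $r(x, y) = x^3$, so $\varepsilon = \delta$ is admissible on that circle.

```python
import math

def f(x, y):
    return x * x + x * y + y * y + x ** 3

def hessian_at_origin(h=1e-4):
    """Second derivatives of f at (0, 0) by central finite differences."""
    fxx = (f(h, 0.0) - 2.0 * f(0.0, 0.0) + f(-h, 0.0)) / (h * h)
    fyy = (f(0.0, h) - 2.0 * f(0.0, 0.0) + f(0.0, -h)) / (h * h)
    fxy = (f(h, h) - f(h, -h) - f(-h, h) + f(-h, -h)) / (4.0 * h * h)
    return fxx, fxy, fyy

fxx, fxy, fyy = hessian_at_origin()
trace, det = fxx + fyy, fxx * fyy - fxy * fxy
c = 0.5 * (trace - math.sqrt(trace * trace - 4.0 * det))  # smallest eigenvalue of f''(0)

delta = 0.1            # radius of the test circle
eps = delta            # |r(x, y)| = |x|**3 <= delta * |(x, y)|**2 on that circle
lower_bound = (0.5 * c - eps) * delta ** 2

circle_min = min(
    f(delta * math.cos(2.0 * math.pi * j / 720), delta * math.sin(2.0 * math.pi * j / 720))
    for j in range(720)
)
print("c ~", c, " bound:", lower_bound, " sampled min of f - f(0):", circle_min)
```

The sampled minimum stays above the bound $(c/2 - \varepsilon)\delta^2$, as the proof predicts. The cubic term makes $f$ unbounded below, so $0$ is only a local strict minimizer, which is all the proposition claims.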
Exercises

1. Give examples of functions and of critical points that are not local minimizers.

2. Give an example of a function $f$ on $\mathbb{R}^2$ such that $Df(0) = 0$, $D^2 f(0).v.v \geq 0$ but $0$ is not a local minimizer of $f$.

3. Given $(x, y) \in \mathbb{R}^2$, find $t \in \mathbb{R}$ such that $(\cos t - x)^2 + (\sin t - y)^2 \leq (\cos r - x)^2 + (\sin r - y)^2$ for all $r \in \mathbb{R}$. Give a geometric interpretation.

4. Let $X$ be a Euclidean space or a Hilbert space, let $a_1, a_2 \in X$, and let $u_1, u_2 \in X \setminus \{0\}$. Solve the minimization problem of $\|x_1 - x_2\|$ for $x_1 \in a_1 + \mathbb{R}u_1$, $x_2 \in a_2 + \mathbb{R}u_2$. Consider in particular the case $X = \mathbb{R}^3$.

5. For $d \geq 2$, let $f$ be the polynomial function given by
$$f(x) = (x_d + 1)^3 (x_1^2 + \ldots + x_{d-1}^2) + x_d^2.$$
Show that $0$ is the unique critical point of $f$ and that $0$ is a local strict minimizer of $f$, but not a global minimizer.

6. Let $X$ be a Euclidean space or a Hilbert space and let $f : X \to \mathbb{R}$ be a function of class $C^2$. Suppose that for some $c > 0$ and some $\bar{x} \in X$ one has $f(x) \geq f(\bar{x}) + c\|x - \bar{x}\|^2$ for all $x \in X$. Show that $Df(\bar{x}) = 0$ and $D^2 f(\bar{x}).v.v \geq 2c\|v\|^2$ for all $v \in X$. Prove that there exist $a > 0$ and a neighborhood $U$ of $\bar{x}$ such that $\|\nabla f(x)\| \geq a\|x - \bar{x}\|$ for all $x \in U$. Deduce from this the existence of some $b > 0$ and some neighborhood $V$ of $\bar{x}$ such that $f(x) \leq \inf f(X) + b\|\nabla f(x)\|^2$ for all $x \in V$.

7. Gradient algorithm. Let $X$ be a Euclidean space or a Hilbert space and let $f : X \to \mathbb{R}$ be a function of class $C^1$ whose gradient is Lipschitzian with rate $c$. Suppose that for some $b > 0$ one has $f(x) \leq m + b\|\nabla f(x)\|^2$ for all $x \in X$, with $m := \inf f(X)$. Given $x_0 \in X$ and a sequence $(t_n)$ in $]0, 1/c[$ one defines inductively a sequence $(x_n)$ by setting $x_{n+1} = x_n - t_n \nabla f(x_n)$. Show that $f(x_{n+1}) \leq f(x_n) - \frac{1}{2} t_n \|\nabla f(x_n)\|^2$ for all $n \in \mathbb{N}$ and that if for some $q \in \,]0, 1[$ one has $t_n \geq 2b(1 - q)$ for all $n \in \mathbb{N}$, then one has $f(x_n) - m \leq q^n (f(x_0) - m)$ and $(f(x_n)) \to m$. Prove that $\|x_{n+1} - x_n\|^2 \leq 2t_n (f(x_n) - f(x_{n+1})) \leq 2t_n (f(x_n) - m)$ for all $n \in \mathbb{N}$. Setting $a := (\frac{2}{c}(f(x_0) - m))^{1/2}$ deduce from the preceding inequalities that $\|x_{n+1} - x_n\| \leq aq^{n/2}$ for all $n$. Conclude that $(x_n)$ converges to some minimizer $\bar{x}$ of $f$.
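8. Gradient algorithm with optimal step. Let $f : \mathbb{R}^d \to \mathbb{R}$ be the quadratic function given by $f(x) := \frac{1}{2}\langle Ax \mid x\rangle + \langle b \mid x\rangle + c$, where $A$ is a positive definite matrix, $b \in \mathbb{R}^d$ and $c \in \mathbb{R}$. Show that $f$ has a unique minimizer $\bar x$. Given $x_0 \in \mathbb{R}^d$, consider the sequence defined inductively by $x_{n+1} = x_n + t_n d_n$, where $d_n := -\nabla f(x_n)$ and $t_n := \|d_n\|^2 / \langle A d_n \mid d_n\rangle$ minimizes the function $t \mapsto f(x_n + t d_n)$ on $\mathbb{R}$. Verify that $d_{n+1} = d_n - t_n A d_n$ and $\langle d_{n+1} \mid d_n\rangle = 0$ for all $n \in \mathbb{N}$. Let $m := \inf f(\mathbb{R}^d)$ and let $\kappa := \lambda_1/\lambda_d$ be the conditioning of $A$, where $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_d$ is the sequence of eigenvalues of $A$. Show that
$$f(x_{n+1}) - m = (f(x_n) - m)\left[1 - \frac{\|d_n\|^4}{\langle A d_n \mid d_n\rangle \langle A^{-1} d_n \mid d_n\rangle}\right],$$
$$f(x_n) - m \le (f(x_0) - m)\,\frac{(\kappa - 1)^{2n}}{(\kappa + 1)^{2n}},$$
$$\|x_n - \bar x\| \le 2^{1/2} \lambda_d^{-1/2} (f(x_0) - m)^{1/2}\,\frac{(\kappa - 1)^n}{(\kappa + 1)^n}.$$
[Hint: use Kantorovich's inequality: $\langle Ax \mid x\rangle \langle A^{-1}x \mid x\rangle \le \frac{1}{4}(\kappa^{1/2} + \kappa^{-1/2})^2 \|x\|^4$ for all $x \in \mathbb{R}^d$.]

9. Let $p : \mathbb{R} \to \mathbb{R}$ be a polynomial function with positive values. If $n$ is its degree, show that the polynomial function $q := p + p' + \dots + p^{(n)}$ takes its values in $\mathbb{R}_+$. [Hint: observe that the term of highest degree in $q$ is the term of highest degree in $p$, so that $q$ attains its minimum at some $r \in \mathbb{R}$, and that $q'(r) = q(r) - p(r)$, so that $q(x) \ge q(r) = p(r)$.]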
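As a numerical companion to Exercise 8, the following Python sketch runs the gradient algorithm with optimal step on a small quadratic and records the quantities appearing in the exercise. The matrix `A`, the vector `b`, the starting point and the helper names are illustrative assumptions, not data from the text; the run checks one instance and proves nothing.

```python
import numpy as np

def steepest_descent_exact(A, b, x0, n_steps):
    """Gradient algorithm with optimal step for f(x) = 0.5*<Ax|x> + <b|x>.

    The constant c of the exercise is omitted: it does not affect the iterates.
    """
    x = np.asarray(x0, dtype=float)
    iterates, directions = [x.copy()], []
    for _ in range(n_steps):
        d = -(A @ x + b)              # d_n = -grad f(x_n)
        dd = d @ d
        if dd == 0.0:                 # already at the minimizer
            break
        t = dd / (d @ (A @ d))        # t_n = ||d_n||^2 / <A d_n | d_n>
        x = x + t * d
        iterates.append(x.copy())
        directions.append(d)
    return iterates, directions

def f(A, b, x):
    return 0.5 * x @ (A @ x) + b @ x

# Illustrative data (assumed): a symmetric positive definite 2x2 matrix.
A = np.array([[4.0, 1.0], [1.0, 2.0]])
b = np.array([-1.0, 3.0])
x_bar = np.linalg.solve(A, -b)        # unique minimizer: A x + b = 0
m = f(A, b, x_bar)

eigs = np.linalg.eigvalsh(A)          # eigenvalues in ascending order
kappa = eigs[-1] / eigs[0]            # conditioning lambda_1 / lambda_d
rate = ((kappa - 1.0) / (kappa + 1.0)) ** 2

iterates, directions = steepest_descent_exact(A, b, np.array([5.0, -7.0]), 20)
gaps = [f(A, b, x) - m for x in iterates]   # f(x_n) - m
```

On this instance one can inspect `directions` to see that consecutive directions are orthogonal up to rounding, and compare `gaps[n + 1]` with `rate * gaps[n]`, which is the per-step form of the second estimate of the exercise.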
5.6.2 Normal Cones, Tangent Cones, and Constraints

In fact we will use some variants of the concept of normal cone that fit different differentiability assumptions on the function $f$. When the feasible set is a convex set these variants coincide (Exercise 6) and the concept is very simple (Fig. 5.4).

Definition 5.17 The normal cone $N(C, \bar x)$ to a convex subset $C$ of $X$ at $\bar x \in C$ is the set of $x^* \in X^*$ which attain their maximum on $C$ at $\bar x$:
$$N(C, \bar x) := \{x^* \in X^* : \forall x \in C \quad \langle x^*, x - \bar x\rangle \le 0\}.$$
Thus, when $C$ is a linear subspace, $N(C, \bar x) = C^{\perp}$, where $C^{\perp}$ is the orthogonal of $C$ (or annihilator of $C$) in $X^*$:
$$C^{\perp} := \{x^* \in X^* : \forall x \in C \quad \langle x^*, x\rangle = 0\}.$$
When $C$ is a cone, one has $N(C, 0) = C^0$, where $C^0$ is the polar cone of $C$.
5.6 Applications to Optimization
Fig. 5.4 The normal cone to a convex subset (a convex set $C$ with the normal cones $N(C, a)$ and $N(C, b)$ at two of its points $a$ and $b$)
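Definition 5.17 can be tested directly when $C$ is a polytope of $\mathbb{R}^n$: a linear form attains its maximum over a polytope at a vertex, so it is enough to check the inequality $\langle x^*, x - \bar x\rangle \le 0$ at the vertices. The sketch below does this for the unit square; the function name `in_normal_cone`, the tolerance and the chosen points are assumptions made for illustration.

```python
import numpy as np

def in_normal_cone(x_star, x_bar, vertices, tol=1e-12):
    """Check <x*, x - x_bar> <= 0 at every vertex of a polytope C.

    Every point of C is a convex combination of the vertices, so the
    inequality then holds on all of C.
    """
    x_star = np.asarray(x_star, dtype=float)
    x_bar = np.asarray(x_bar, dtype=float)
    return all(x_star @ (np.asarray(v, dtype=float) - x_bar) <= tol
               for v in vertices)

square = [(0, 0), (1, 0), (0, 1), (1, 1)]        # C = [0, 1]^2

# At the corner (0, 0): both components of x* are nonpositive.
corner = in_normal_cone((-1.0, -2.0), (0, 0), square)
# At the middle of the bottom edge: the outward normal (0, -1) qualifies ...
edge_ok = in_normal_cone((0.0, -1.0), (0.5, 0), square)
# ... but a vector with a component along the edge does not.
edge_bad = in_normal_cone((1.0, -1.0), (0.5, 0), square)
# At an interior point only x* = 0 is normal.
interior = in_normal_cone((0.1, 0.0), (0.5, 0.5), square)
```

Making a drawing of the four cases alongside the computed booleans is a quick way to match the definition with the picture of an "exterior normal".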
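In the nonconvex case the preceding definition has to be modified by introducing a remainder in the inequality in order to allow a certain curvature or inaccuracy.

Definition 5.18 The firm or Fréchet normal cone $N_F(S, \bar x)$ to a subset $S$ of $X$ at $\bar x \in S$ is the set of $x^* \in X^*$ for which there exists a remainder $r(\cdot)$ such that $x^*(\cdot) - r(\cdot - \bar x)$ attains its maximum on $S$ at $\bar x$:
$$x^* \in N_F(S, \bar x) \iff \exists r \in o(X, \mathbb{R}) \ \ \forall x \in S \quad \langle x^*, x - \bar x\rangle \le r(x - \bar x).$$
In other terms, $x^* \in X^*$ is a firm normal to $S$ at $\bar x$ iff for every $\varepsilon > 0$ there exists a $\delta > 0$ such that for all $x \in S \cap B(\bar x, \delta)$ one has $\langle x^*, x - \bar x\rangle \le \varepsilon \|x - \bar x\|$. Equivalently
$$x^* \in N_F(S, \bar x) \iff \limsup_{x (\in S) \to \bar x,\ x \ne \bar x} \frac{1}{\|x - \bar x\|} \langle x^*, x - \bar x\rangle \le 0.$$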
We will give some properties and calculus rules in the next subsection. For the moment it is important to convince oneself that this notion corresponds to the intuitive idea of an "exterior normal" to a set, for instance by making drawings in simple cases. We present a necessary condition using this concept without any delay. In it we say that $f$ attains a local maximum (resp. local minimum) on $F$ at $\bar x$ if $f(x) \le f(\bar x)$ (resp. $f(x) \ge f(\bar x)$) for all $x$ in some neighborhood of $\bar x$ in $F$. It is convenient to say that $\bar x$ is a local maximizer (resp. local minimizer) of $f$ on $F$.

Theorem 5.26 (Fermat's Rule) Suppose $f$ attains a local maximum on $F$ at $\bar x$ and is Fréchet differentiable at $\bar x$. Then
$$f'(\bar x) \in N_F(F, \bar x).$$
If $f$ attains a local minimum on $F$ at $\bar x$ and is Fréchet differentiable at $\bar x$ then
$$0 \in f'(\bar x) + N_F(F, \bar x).$$
Proof Suppose $f$ attains a local maximum on $F$ at $\bar x$ and is differentiable at $\bar x$. Set $f(x) = f(\bar x) + \langle x^*, x - \bar x\rangle + r(x - \bar x)$ with $r$ a remainder, $x^* := f'(\bar x)$, so that for $x \in F$ close enough to $\bar x$ one has
$$\langle x^*, x - \bar x\rangle + r(x - \bar x) = f(x) - f(\bar x) \le 0.$$
Hence $x^* \in N_F(F, \bar x)$. Changing $f$ into $-f$, one obtains the second assertion. □

The second formula shows how the familiar rule $f'(\bar x) = 0$ of unconstrained minimization has to be changed by introducing an additional term involving the normal cone. Without such an additional term the condition would be utterly invalid.

Example The identity map $f = I_{\mathbb{R}}$ on $\mathbb{R}$ attains its minimum on $F := [0, 1]$ at $0$ but $f'(0) = 1$.

Example Suppose $F$ is the unit sphere of the Euclidean space $\mathbb{R}^3$ representing the surface of the earth and suppose $f$ is a smooth function representing the temperature. If $f$ attains a local minimum on $F$ at $\bar x$, in general $\nabla f(\bar x)$ is not $0$; however $\nabla f(\bar x)$ is on the downward vertical at $\bar x$ and, if one can increase one's altitude at that point, one usually experiences a decrease of temperature. □

When the objective function $f$ is not Fréchet differentiable but just Hadamard differentiable, an analogue of Fermat's Rule can still be given by introducing a variant of the notion of a firm normal cone. It goes as follows; although this variant appears to be more technical than the concept of a Fréchet normal cone, it is a natural and important notion. It can be formulated with the help of the notion of a directional remainder: $r : X \to Y$ is a directional remainder if for all $u \in X \setminus \{0\}$ one has $r(tv)/t \to 0$ as $t \to 0_+$, $v \to u$; we write $r \in o_D(X, Y)$.

Definition 5.19 The normal cone (or directional normal cone) to the subset $F$ at $\bar x \in \mathrm{cl}(F)$ is the set $N(F, \bar x) := N_D(F, \bar x)$ of $x^* \in X^*$ for which there exists a directional remainder $r(\cdot)$ such that $x^*(\cdot) - r(\cdot - \bar x)$ attains its maximum on $F$ at $\bar x$:
$$x^* \in N(F, \bar x) \iff \exists r \in o_D(X, \mathbb{R}) \ \ \forall x \in F \quad \langle x^*, x - \bar x\rangle \le r(x - \bar x).$$
In other terms, $x^* \in X^*$ is a normal to $F$ at $\bar x$ if and only if for all $u \in X \setminus \{0\}$ and $\varepsilon > 0$ there exists a $\delta > 0$ such that $\langle x^*, v\rangle \le \varepsilon$ for any $(t, v) \in\, ]0, \delta[\, \times B(u, \delta)$ satisfying $\bar x + tv \in F$:
$$x^* \in N(F, \bar x) \iff \forall u \in X \quad \limsup_{(t,v) \to (0_+, u),\ \bar x + tv \in F} \frac{1}{t} \langle x^*, (\bar x + tv) - \bar x\rangle \le 0.$$
Let us note that the case $u = 0$ can be discarded in the preceding reformulation because the condition is automatically satisfied in this case with $\delta = \varepsilon \min(1, \|x^*\|^{-1})$. This cone often coincides with the Fréchet normal cone and it always contains it, as the preceding reformulations show.
Lemma 5.17 For any subset $F$ and any $\bar x \in \mathrm{cl}(F)$ one has $N_F(F, \bar x) \subset N(F, \bar x)$.

The duality property we prove now compensates for the complexity of the definition of the (directional) normal cone compared to the definition of the firm normal cone.

Proposition 5.31 The normal cone to $F$ at $\bar x$ is the polar cone to the tangent cone to $F$ at $\bar x$:
$$x^* \in N(F, \bar x) \iff \forall u \in T(F, \bar x) \quad \langle x^*, u\rangle \le 0.$$

Proof Given $x^* \in N(F, \bar x)$ and $u \in T(F, \bar x) \setminus \{0\}$, for any $\varepsilon > 0$, taking $\delta \in\, ]0, \varepsilon[$ such that $\langle x^*, v\rangle \le \varepsilon$ for any $(t, v) \in\, ]0, \delta[\, \times B(u, \delta)$ satisfying $\bar x + tv \in F$ and observing that such a pair $(t, v)$ exists since $u \in T(F, \bar x)$, we get $\langle x^*, u\rangle \le \langle x^*, v\rangle + \|x^*\| \|u - v\| \le \varepsilon + \varepsilon \|x^*\|$. As $\varepsilon$ is arbitrarily small, we get $\langle x^*, u\rangle \le 0$.

Conversely, given $x^*$ in the polar cone of $T(F, \bar x)$, given $u \in T(F, \bar x)$ and given $\varepsilon > 0$, taking $\delta > 0$ such that $\delta \|x^*\| \le \varepsilon$, the inequality $\langle x^*, v\rangle \le \varepsilon$ holds whenever $t \in\, ]0, \delta[$ and $v \in t^{-1}(F - \bar x) \cap B(u, \delta)$ since
$$\langle x^*, v\rangle \le \langle x^*, u\rangle + \langle x^*, v - u\rangle \le \|x^*\| \|u - v\| \le \delta \|x^*\| \le \varepsilon.$$
If $u \in X \setminus T(F, \bar x)$ we can find $\delta > 0$ such that no such pair $(t, v)$ exists. Thus, we have $\langle x^*, v\rangle \le \varepsilon$ for any $(t, v) \in\, ]0, \delta[\, \times B(u, \delta)$ satisfying $\bar x + tv \in F$: $x^* \in N(F, \bar x)$. □

Theorem 5.27 (Fermat's Rule) Suppose $f$ attains a local maximum on $F$ at $\bar x \in F$ and is Hadamard differentiable at $\bar x$. Then, for all $v \in T(F, \bar x)$ one has $f'(\bar x)v \le 0$, that is,
$$f'(\bar x) \in N(F, \bar x).$$
If $f$ attains a local minimum on $F$ at $\bar x$ then, for all $v \in T(F, \bar x)$ one has $f'(\bar x)v \ge 0$, that is,
$$0 \in f'(\bar x) + N(F, \bar x).$$

Proof Let $V$ be an open neighborhood of $\bar x$ in $X$ such that $f(x) \le f(\bar x)$ for all $x \in F \cap V$. Given $v \in T(F, \bar x)$, let $(v_n) \to v$, $(t_n) \to 0_+$ be sequences such that $\bar x + t_n v_n \in F$ for all $n \in \mathbb{N}$. For $n$ large enough, we have $x_n := \bar x + t_n v_n \in F \cap V$, hence $f(\bar x + t_n v_n) - f(\bar x) \le 0$. Dividing by $t_n$ and passing to the limit, the (Hadamard) differentiability of $f$ at $\bar x$ yields $f'(\bar x)(v) \le 0$. □

It is possible to give a third version of Fermat's Rule that does not assume that $f$ is differentiable; it is set in the space $X$ instead of its dual $X^*$. In it, we use the directional (lower) derivative (or contingent derivative) of $f$ given by
$$f^D(\bar x, u) := \liminf_{(t,v) \to (0_+, u)} \frac{1}{t} \left( f(\bar x + tv) - f(\bar x) \right)$$
and the tangent cone to $F$ at $\bar x$ as introduced in Definition 5.15.
In view of their fundamental character, we will return to these notions of tangent and normal cones. For the moment, the definition itself suffices to give the primal version of the Fermat rule we announced. Note that this version entails the preceding theorem since $f^D(\bar x, \cdot) = f'(\bar x)$ when $f$ is Hadamard differentiable at $\bar x$.

Theorem 5.28 Suppose $f$ attains a local maximum on $F$ at $\bar x$. Then $f^D(\bar x, u) \le 0$ for all $u \in T(F, \bar x)$.

Proof Let $u \in T(F, \bar x)$. There exist $(t_n) \to 0_+$, $(u_n) \to u$ such that $\bar x + t_n u_n \in F$ for all $n \in \mathbb{N}$. For $n$ large enough we have $f(\bar x + t_n u_n) \le f(\bar x)$, so that
$$f^D(\bar x, u) \le \liminf_n \frac{1}{t_n} \left( f(\bar x + t_n u_n) - f(\bar x) \right) \le 0.$$
□

For minimization problems, a variant of the tangent cone is required since the rule $f^D(\bar x, u) \ge 0$ for $u \in T(F, \bar x)$ is not valid in general.

Example Let $F := \{0\} \cup \{2^{-2n} : n \in \mathbb{N}\} \subset \mathbb{R}$ and let $f : \mathbb{R} \to \mathbb{R}$ be even and given by $f(x) = 0$ for every $x \in F$, $f(2^{-2k+1}) = -2^{-2k+1}$, $f$ being affine on each interval $[2^{-j}, 2^{-j+1}]$. Show that $f^D(\bar x, 1) = -1$ for $\bar x := 0$, although $f(\bar x) = \min f(F)$.

Definition 5.20 The incident cone (or adjacent cone) to $F$ at $\bar x \in \mathrm{cl}(F)$ is the set
$$T^I(F, \bar x) := \{u \in X : \forall (t_n) \to 0_+,\ \exists (u_n) \to u,\ \bar x + t_n u_n \in F \ \forall n\}$$
$$= \left\{u \in X : \forall (t_n) \to 0_+,\ \exists (x_n) \to \bar x,\ \left(\frac{x_n - \bar x}{t_n}\right) \to u,\ x_n \in F \ \forall n\right\}.$$
It is easy to show that
$$u \in T^I(F, \bar x) \iff \lim_{t \to 0_+} \frac{1}{t}\, d(\bar x + tu, F) = 0.$$
Let us also introduce the incident derivative of a function $f$ at $\bar x$ by
$$f^I(\bar x, u) := \inf\{r \in \mathbb{R} : (u, r) \in T^I(E_f, \bar x_f)\},$$
where $E_f$ is the epigraph of $f$ and $\bar x_f := (\bar x, f(\bar x))$.

Proposition 5.32 Suppose $f$ is directionally stable at $\bar x$ in the sense that for all $u \in X \setminus \{0\}$ one has $(1/t)(f(\bar x + tv) - f(\bar x + tu)) \to 0$ as $(t, v) \to (0_+, u)$. If $f$ attains a local minimum on $F$ at $\bar x$ then
$$f^I(\bar x, u) \ge 0 \ \text{ for all } u \in T(F, \bar x); \qquad f^D(\bar x, u) \ge 0 \ \text{ for all } u \in T^I(F, \bar x).$$
Proof Suppose on the contrary that there exists some $u \in T(F, \bar x)$ such that $f^I(\bar x, u) < 0$. Then, there exists some $r < 0$ such that $(u, r) \in T^I(E_f, \bar x_f)$; thus, if $(t_n) \to 0_+$ and $(u_n) \to u$ are such that $\bar x + t_n u_n \in F$ for all $n \in \mathbb{N}$, one can find a sequence $((v_n, r_n)) \to (u, r)$ such that $\bar x_f + t_n(v_n, r_n) \in E_f$ for all $n \in \mathbb{N}$. Then $f(\bar x) + t_n r_n \ge f(\bar x + t_n v_n)$ for all $n \in \mathbb{N}$ and
$$0 > r \ge \limsup_n \frac{1}{t_n} \left( f(\bar x + t_n v_n) - f(\bar x) \right) = \limsup_n \frac{1}{t_n} \left( f(\bar x + t_n u_n) - f(\bar x) \right) \ge 0,$$
a contradiction. The proof of the second assertion is similar. □
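The distance characterization of the incident cone can be explored numerically on the set $F = \{0\} \cup \{2^{-2n}\}$ of the example preceding Definition 5.20. The sketch below evaluates the quotient $(1/t)\, d(\bar x + tu, F)$ at $\bar x = 0$, $u = 1$ along two sequences $t \to 0_+$. The truncation of $F$ at `n_max`, the convention that $n$ starts at 0 and the function names are assumptions made for this illustration.

```python
def dist_to_F(s, n_max=60):
    """Distance from s >= 0 to F = {0} U {2**(-2n) : n = 0, ..., n_max} (truncated)."""
    points = [0.0] + [2.0 ** (-2 * n) for n in range(n_max + 1)]
    return min(abs(s - p) for p in points)

def quotient(t, u=1.0):
    """(1/t) * d(x_bar + t*u, F) at x_bar = 0."""
    return dist_to_F(t * u) / t

# Along t_n = 2**(-2n) the points t_n * 1 belong to F, so the quotient is 0:
# the lower limit of the quotient is 0.
on_F = [quotient(2.0 ** (-2 * n)) for n in range(1, 20)]

# Along the midpoints of consecutive points of F the quotient stays at 3/5,
# so the quotient has no limit 0 as t -> 0+.
mid = [quotient(0.5 * (2.0 ** (-2 * n) + 2.0 ** (-2 * n - 2)))
       for n in range(1, 20)]
```

The first list is consistent with $1 \in T(F, 0)$, while the second indicates that $1 \notin T^I(F, 0)$ for this set, so the tangent cone and the incident cone differ here.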
Exercises

1. Given an element $\bar x$ of the closure of a subset $F$ of a normed space $X$, show that the tangent cone and the incident cone can be expressed as follows:
$$T(F, \bar x) = \{v \in X : \liminf_{t \to 0_+} \frac{1}{t}\, d(\bar x + tv, F) = 0\},$$
$$T^I(F, \bar x) = \{v \in X : \lim_{t \to 0_+} \frac{1}{t}\, d(\bar x + tv, F) = 0\}.$$

2. Deduce from Exercise 1 that $v \in T(F, \bar x)$ if and only if $v \in \limsup_{t \to 0_+} \frac{1}{t}(F - \bar x)$ and that $v \in T^I(F, \bar x)$ if and only if $v \in \liminf_{t \to 0_+} \frac{1}{t}(F - \bar x)$, the limits of a family of sets being defined as in Exercise 8 of Sect. 2.3.1.

3. Find a subset $F$ of $\mathbb{R}$ such that $1 \in T(F, 0)$ but $T^I(F, 0) = \{0\}$.

4. Show that if $X$ is a finite dimensional normed space, then, for any subset $F$ of $X$ and any $\bar x \in \mathrm{cl}(F)$, one has $N(F, \bar x) = N_F(F, \bar x)$.

5. Show that for any subset $F$ of a normed space and any $\bar x \in \mathrm{cl}(F)$, the cones $N(F, \bar x)$ and $N_F(F, \bar x)$ are convex and closed.

6. Show that for any convex subset $C$ of a normed space $X$ and any $\bar x \in \mathrm{cl}(C)$ the cones $N(C, \bar x)$ and $N_F(C, \bar x)$ coincide with the normal cone in the sense of convex analysis described in Definition 5.17.

7. Let $f : \mathbb{R} \to \mathbb{R}$ be differentiable at $a \in \mathbb{R}$ and such that $a$ is a minimizer of $f$ on some interval $[a, b]$ with $b > a$. Verify that $f'(a) \ge 0$.

8. Show that the incident cone $T^I(F, \bar x)$ can be called the velocity cone of $F$ at $\bar x$ since $v \in T^I(F, \bar x)$ if and only if there exists some $c : [0, 1] \to F$ such that $c(0) = \bar x$, $c$ is right differentiable at $0$ and $c'_+(0) = v$.

9. Give an example of a function $f$ on a normed space $X$ that attains its infimum on a subset $F$ of $X$ at some $a \in F$ and of some $u \in T(F, a)$ such that $f^D(a, u) < 0$. [Hint: use Exercise 3.]
5.6.3 Calculus of Tangent and Normal Cones

We devote this subsection to some calculus rules for normal cones. These rules will enable us to compute the normal cones to sets defined by equalities and inequalities, an important topic for the application to concrete optimization problems. In order to show that the two notions of normal cone we introduced correspond to the classical notion in the smooth case, let us make some easy but useful observations.

Proposition 5.33 The notions of normal cone and of Fréchet normal cone are local notions: if $F$ and $G$ are two subsets such that $F \cap V = G \cap V$ for some neighborhood $V$ of $\bar x$, then $N(F, \bar x) = N(G, \bar x)$ and $N_F(F, \bar x) = N_F(G, \bar x)$.

Proposition 5.34 Given normed spaces $X$, $Y$ and $\bar x \in F \subset X$, $\bar y \in G \subset Y$, one has
$$N(F \times G, (\bar x, \bar y)) = N(F, \bar x) \times N(G, \bar y),$$
$$N_F(F \times G, (\bar x, \bar y)) = N_F(F, \bar x) \times N_F(G, \bar y).$$

Proposition 5.35 The normal cone and the Fréchet normal cone are antitone: for $F \subset G$ and any $\bar x \in \mathrm{cl}\,F$ one has $N(G, \bar x) \subset N(F, \bar x)$ and $N_F(G, \bar x) \subset N_F(F, \bar x)$. Moreover, if $F$ is a finite union, $F = \bigcup_{i \in I} F_i$, then
$$N(F, \bar x) = \bigcap_{i \in I} N(F_i, \bar x), \qquad N_F(F, \bar x) = \bigcap_{i \in I} N_F(F_i, \bar x).$$

This fact helps in the computation of normal cones, as the next example shows.

Example Let $F := \{(r, s) \in \mathbb{R}^2 : rs = 0\}$, so that $F = F_1 \cup F_2$ with $F_1 := \mathbb{R} \times \{0\}$, $F_2 := \{0\} \times \mathbb{R}$. Then, as $F_i$ is a linear subspace, one has $N(F_i, 0) = F_i^{\perp}$, hence $N(F, 0) = F_1^{\perp} \cap F_2^{\perp} = \{0\}$.

However, the computations of normal cones to intersections are not obvious. One may just have the inclusions
$$N(F \cap G, \bar x) \supset N(F, \bar x) \cup N(G, \bar x), \qquad N_F(F \cap G, \bar x) \supset N_F(F, \bar x) \cup N_F(G, \bar x).$$
Example Let $X := \mathbb{R}^2$ with its usual Euclidean norm and let $F := B_X + e$, $G := B_X - e$, where $e = (0, 1)$. Then $N(F \cap G, 0) = \mathbb{R}^2$ whereas $N(F, 0) \cup N(G, 0) = \{0\} \times \mathbb{R}$.

Now let us show that the notions of normals and firm normals are invariant under differentiable transformations (diffeomorphisms).

Proposition 5.36 Let $g : U \to V$ be a map between two open subsets of the normed spaces $X$ and $Y$, respectively, and let $B \subset U$, $C \subset V$ be such that $g(B) \subset C$. Then, if $g$ is Fréchet differentiable (resp. Hadamard differentiable) at $\bar x \in B$, for $\bar y := g(\bar x)$, one has
$$N_F(C, \bar y) \subset (g'(\bar x)^{\top})^{-1}(N_F(B, \bar x)) \quad (\text{resp. } N(C, \bar y) \subset (g'(\bar x)^{\top})^{-1}(N(B, \bar x))). \tag{5.37}$$
Relation (5.37) is an equality when $C = g(B)$ and there exist $\rho > 0$, $c > 0$ such that
$$\forall y \in C \cap B(\bar y, \rho) \quad d(\bar x, g^{-1}(y) \cap B) \le c\, d(y, \bar y). \tag{5.38}$$
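Proof Let $y^*$ be an element of $N_F(C, \bar y)$: for some remainder $r(\cdot)$ and for all $y \in C$ we have $\langle y^*, y - \bar y\rangle \le r(\|y - \bar y\|)$. The differentiability of $g$ at $\bar x$ can be written in the following form for some remainder $s$:
$$g(x) - g(\bar x) = A(x - \bar x) + s(\|x - \bar x\|), \tag{5.39}$$
where $A := g'(\bar x)$. Taking $x \in B$, since $y := g(x) \in C$, we get
$$\langle A^{\top}(y^*), x - \bar x\rangle = \langle y^*, g(x) - g(\bar x) - s(\|x - \bar x\|)\rangle \le r(\|g(x) - g(\bar x)\|) - \langle y^*, s(\|x - \bar x\|)\rangle =: t(\|x - \bar x\|)$$
where $t$ is a remainder since $\|g(x) - g(\bar x)\| \le (\|A\| + 1)\|x - \bar x\|$ for $x$ close enough to $\bar x$. The proof for the normal cone is similar. It can also be deduced from the inclusion $g'(\bar x)(T(B, \bar x)) \subset T(C, \bar y)$.

Now suppose $C = g(B)$ and relation (5.38) holds for some $\rho > 0$, $c > 0$. Then, for all $y \in C \cap B(\bar y, \rho)$, there exists some $x_y \in g^{-1}(y) \cap B$ satisfying $\|x_y - \bar x\| \le 2c\|y - \bar y\|$. Let $y^* \in Y^*$ be such that $x^* := g'(\bar x)^{\top}(y^*) \in N_F(B, \bar x)$. Then, there exists a remainder $r(\cdot)$ such that
$$\forall x \in B \quad \langle y^*, g'(\bar x)(x - \bar x)\rangle = \langle x^*, x - \bar x\rangle \le r(\|x - \bar x\|).$$
Taking into account (5.39), we get, for all $y \in C \cap B(\bar y, \rho)$,
$$\langle y^*, y - \bar y\rangle = \langle y^*, g(x_y) - g(\bar x)\rangle \le r(\|x_y - \bar x\|) + \|y^*\| \, \|s(\|x_y - \bar x\|)\|$$
and, since $\|x_y - \bar x\| \le 2c\|y - \bar y\|$, we conclude that $y^* \in N_F(C, \bar y)$. □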
Corollary 5.21 Let $g : U \to V$ be a bijection between two open subsets of the normed spaces $X$ and $Y$ respectively such that $g$ and $h := g^{-1}$ are Hadamard differentiable (resp. Fréchet differentiable) at $\bar x$ and $\bar y := g(\bar x)$ respectively and let $B \subset U$, $C = g(B)$. Then
$$N(B, \bar x) = g'(\bar x)^{\top}(N(C, \bar y)) \quad (\text{resp. } N_F(B, \bar x) = g'(\bar x)^{\top}(N_F(C, \bar y))).$$
Proof Since $h'(\bar y)^{\top}$ is the inverse of $g'(\bar x)^{\top}$, one has the inclusions of Proposition 5.36 and their analogues in which $h$, $\bar y$, $C$ take the roles of $g$, $\bar x$, $B$, respectively. □

For an inverse image, it is possible to ensure equality in the inclusions of Proposition 5.36. However a technical assumption called a qualification condition should be added, otherwise the result may be invalid, as the following example shows.

Example Let $X = Y = \mathbb{R}$, $g(x) = x^2$, $C = \{0\}$, $B = g^{-1}(C)$. Then $N(B, 0) = \mathbb{R} \ne g'(0)^{\top}(N(C, 0)) = \{0\}$.

The factorization of Lemma 3.20 will be helpful for handling inverse images.

Proposition 5.37 (Lyusternik) Let $X$, $Y$ be Banach spaces, let $U$ be an open subset of $X$ and let $g : U \to Y$ be circa-differentiable at $\bar x \in U$ with $g'(\bar x)(X) = Y$. Then, for $S := g^{-1}(\bar y)$ with $\bar y := g(\bar x)$ one has $N(S, \bar x) = N_F(S, \bar x) = g'(\bar x)^{\top}(Y^*)$.

Proof Proposition 5.36 ensures that $g'(\bar x)^{\top}(Y^*) \subset N_F(S, \bar x) \subset N(S, \bar x)$. Now, given $x^* \in N(S, \bar x)$, for all $v \in T(S, \bar x) = \ker g'(\bar x) = T^I(S, \bar x)$ we have $\langle v, x^*\rangle = 0$, so that Lemma 3.20 yields some $y^* \in Y^*$ such that $x^* = y^* \circ g'(\bar x) = g'(\bar x)^{\top}(y^*)$. □

A more general case is treated in the next theorem.

Theorem 5.29 Let $X$, $Y$ be Banach spaces, let $U$ be an open subset of $X$ and let $g : U \to Y$ be a map that is circa-differentiable at $\bar x \in U$ with $A := g'(\bar x)$ surjective. Then, if $C$ is a subset of $Y$ and if $\bar x \in B := g^{-1}(C)$, $\bar y := g(\bar x) \in C$, one has
$$N(B, \bar x) = g'(\bar x)^{\top}(N(C, \bar y)), \qquad N_F(B, \bar x) = g'(\bar x)^{\top}(N_F(C, \bar y)).$$

Proof We prove the Fréchet case only, leaving the directional case to the reader. The Lyusternik-Graves Theorem (Theorem 5.15) asserts the existence of $\rho > 0$, $c > 0$ such that for all $y \in B(\bar y, \rho)$ there exists an $x_y \in g^{-1}(y)$ satisfying $\|x_y - \bar x\| \le c\|y - \bar y\|$. When $y \in C \cap B(\bar y, \rho)$ we have $x_y \in g^{-1}(C) = B$, hence $d(\bar x, g^{-1}(y) \cap B) \le d(\bar x, x_y) \le c\, d(y, \bar y)$. Moreover, setting $V := B(\bar y, \rho)$, $U := g^{-1}(V)$, $B' := B \cap U$, $C' := C \cap V$, we have $g(B') = C'$ and $N_F(B, \bar x) = N_F(B', \bar x)$ and $N_F(C, \bar y) = N_F(C', \bar y)$. Thus, we can replace $B$ with $B'$ and $C$ with $C'$. Then Proposition 5.36 ensures that $N_F(B, \bar x) = g'(\bar x)^{\top}(N_F(C, \bar y))$. □

Exercise With the notation and the assumption of the theorem show that $T(B, \bar x) = (g'(\bar x))^{-1}(T(C, \bar y))$.

Exercise Apply the theorem and the preceding exercise to the case $Y := \mathbb{R}^d$, $C := \mathbb{R}^d_+$.
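Proposition 5.37 can be illustrated on the unit sphere of $\mathbb{R}^3$, written as $S = g^{-1}(0)$ with $g(x) = \|x\|^2 - 1$, for which $g'(\bar x)^{\top}(Y^*)$ is the line spanned by $\bar x$. The sketch below evaluates the quotient $\langle x^*, x - \bar x\rangle / \|x - \bar x\|$ of the limsup characterization of firm normals along points of $S$ approaching $\bar x$. The choice of $g$, of the base point and of the test vectors is an assumption made for illustration; sampling along one curve only suggests the behavior of the limsup.

```python
import numpy as np

def g(x):
    return x @ x - 1.0                 # S = g^{-1}(0) is the unit sphere of R^3

def g_prime(x):
    return 2.0 * x                     # g'(x) as a vector; g'(x)^T(y*) = y* * 2x

x_bar = np.array([0.0, 0.0, 1.0])
x_star = 3.0 * g_prime(x_bar)          # element of g'(x_bar)^T(Y*), with y* = 3
w_star = np.array([1.0, 0.0, 0.0])     # not a multiple of x_bar

def quotient(functional, x):
    """<functional, x - x_bar> / ||x - x_bar|| for a point x of S."""
    return functional @ (x - x_bar) / np.linalg.norm(x - x_bar)

def point_on_sphere(theta):
    """Point of S at angle theta from x_bar in the (x1, x3)-plane."""
    return np.array([np.sin(theta), 0.0, np.cos(theta)])

thetas = [10.0 ** (-k) for k in range(1, 7)]
normal_q = [quotient(x_star, point_on_sphere(th)) for th in thetas]
other_q = [quotient(w_star, point_on_sphere(th)) for th in thetas]
```

The values in `normal_q` shrink with the angle, as expected for a firm normal, whereas those in `other_q` stay close to 1, so `w_star` fails the limsup condition.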
5.6.4 Multiplier Rules

As observed above, the usual necessary condition $f'(a) = 0$ in order that a function $f : X \to \mathbb{R}$ attains its minimum at $a$ when it is differentiable there, has to be modified when some restrictions are imposed. In the present section we consider the frequent case of constraints defined by equalities and we present a practical rule. The case of inequalities is dealt with in the exercises. The famous Lagrange multiplier rule is a direct consequence of Fermat's Rule and of Proposition 5.37.

Theorem 5.30 (Lagrange Multiplier Rule) Let $X$, $Y$ be Banach spaces, let $W$ be an open subset of $X$, let $f : W \to \mathbb{R}$ be differentiable at $a$ and let $g : W \to Y$ be circa-differentiable at $a$ with $g'(a)(X) = Y$. Let $b := g(a)$. Suppose that $f$ attains on $S := g^{-1}(b)$ a local minimum at $a$. Then there exists some $y^* \in Y^*$ (called the Lagrange multiplier) such that
$$f'(a) = y^* \circ g'(a).$$

Example Let us find the shape of a box having a given volume $v > 0$ and a minimum area. Denoting by $x$, $y$, $z$ the sizes of the sides of the box, we are led to
$$\text{minimize } f(x, y, z) := 2(xy + yz + zx) \quad \text{subject to } g(x, y, z) := xyz - v = 0, \quad x, y, z > 0.$$
First, we secure the existence of a solution by showing that $f$ is coercive on $S := g^{-1}(0)$. In fact, if $w_n := (x_n, y_n, z_n) \in S$ and $(\|w_n\|) \to +\infty$, one of the components of $w_n$, say $x_n$, converges to $+\infty$; then, since $y_n + z_n \ge 2\sqrt{y_n z_n} = 2\sqrt{v/x_n}$, we get
$$f(w_n) \ge 2x_n(y_n + z_n) \ge 4\sqrt{v x_n} \to +\infty.$$
Now let $(x, y, z)$ be a minimizer of $f$ on $S$. Since the derivative of $g$ is non-null at $(x, y, z)$, the Lagrange multiplier rule yields some $\lambda \in \mathbb{R}$ such that
$$2(y + z) = \lambda yz, \qquad 2(z + x) = \lambda zx, \qquad 2(x + y) = \lambda xy.$$
Then, multiplying each side of the first equation by $x$, and doing similar operations with the other two equations, we get
$$\lambda v = \lambda xyz = 2x(y + z) = 2y(z + x) = 2z(x + y),$$
hence, by summation, $3\lambda v = 4(xy + yz + zx) > 0$. Subtracting side by side the equations expressing the Lagrange multiplier rule, we get
$$2(y - x) = \lambda z(y - x), \qquad 2(z - y) = \lambda x(z - y), \qquad 2(x - z) = \lambda y(x - z).$$
Since $\lambda$, $x$, $y$, $z$ are positive, considering the various cases, we get $x = y = z$. Since the unique solution of the necessary condition is $w := (v^{1/3}, v^{1/3}, v^{1/3})$, we conclude that $w$ is the solution of the problem and the optimal box is a cube. We also note that the least area is $a(v) := f(w) = 6v^{2/3}$ and that $\lambda = 4v^{-1/3}$ is exactly the derivative of the function $v \mapsto a(v)$, a general fact we will explain later on, which shows that the artificial multiplier $\lambda$ has in fact an important interpretation as a measure of the change of the optimal value when the parameter $v$ varies.

Example-Exercise Let $X$ be some Euclidean space and let $A \in L(X, X)$ be symmetric. Let $f$ and $g$ be given by $f(x) = (Ax \mid x)$, $g(x) = \|x\|^2 - 1$. Take $v \in S_X$ such that $f$ attains its minimum on the unit sphere $S_X$ at $v$. Then show that there exists some $\lambda \in \mathbb{R}$ such that $Av = \lambda v$. Deduce from this result that any symmetric square matrix is diagonalizable.
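The conclusions of the box example (cube of side $v^{1/3}$, least area $6v^{2/3}$, multiplier $4v^{-1/3}$ equal to the derivative of the optimal value) can be checked numerically. The following sketch does so for one volume; the value `v = 2.0`, the step `h`, the sampling ranges and the helper names are assumptions made for illustration, and the random comparison is only a sanity check.

```python
import random

def area(x, y, z):
    """f(x, y, z) = 2(xy + yz + zx)."""
    return 2.0 * (x * y + y * z + z * x)

def least_area(v):
    """a(v) = 6 v^(2/3), the area of the cube of volume v."""
    return 6.0 * v ** (2.0 / 3.0)

v = 2.0                                  # illustrative volume (assumed)
side = v ** (1.0 / 3.0)
lam = 4.0 * v ** (-1.0 / 3.0)            # multiplier found in the example

# Residual of the multiplier equation 2(y + z) = lam * y * z at the cube.
residual = 2.0 * (side + side) - lam * side * side

# Central finite-difference approximation of a'(v), to compare with lam.
h = 1e-6
fd_derivative = (least_area(v + h) - least_area(v - h)) / (2.0 * h)

# Areas of randomly sampled feasible boxes x * y * z = v.
random.seed(0)
samples = []
for _ in range(1000):
    x = random.uniform(0.2, 5.0)
    y = random.uniform(0.2, 5.0)
    samples.append(area(x, y, v / (x * y)))
```

Comparing `fd_derivative` with `lam` makes the sensitivity interpretation of the multiplier concrete: a small change of the prescribed volume changes the least area by approximately `lam` times that change.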
Exercises

1. (Simplified Karush-Kuhn-Tucker Theorem) Let $X$, $Z$ be Banach spaces, let $g : X \to Z$ be circa-differentiable at $\bar x$ with $g'(\bar x)(X) = Z$ and let $C \subset Z$ be a closed convex cone of $Z$. Suppose $\bar x \in F := g^{-1}(C)$ is a minimizer on $F$ of a function $f : X \to \mathbb{R}$ that is differentiable at $\bar x$. Use Theorem 5.29 and the Fermat rule in order to obtain the existence of some $y^* \in C^0$ such that $\langle y^*, g(\bar x)\rangle = 0$, $f'(\bar x) + y^* \circ g'(\bar x) = 0$.

2. (a) Compute the tangent cone at $(0, 0)$ to the set
$$F := \{(r, s) \in \mathbb{R}^2 : s \ge |r| (1 + r^2)^{-1}\}.$$
(b) Use the Fermat rule to give a necessary condition in order that $(0, 0)$ be a local minimizer of a function $f$ on $F$, assuming that $f$ is differentiable at $(0, 0)$.
(c) Rewrite $F$ as $F = \{(r, s) \in \mathbb{R}^2 : g_1(r, s) \le 0,\ g_2(r, s) \le 0\}$ with $g_1$, $g_2$ given by $g_1(r, s) = r(1 + r^2)^{-1} - s$, $g_2(r, s) = -r(1 + r^2)^{-1} - s$ and apply the Karush-Kuhn-Tucker Theorem to get the condition obtained in (b).

3. (a) Compute the tangent cone to the set $F = F' \cup F''$ at $a \in F$ where
$$F' := \{(r, s) \in \mathbb{R}^2 : r^4 + s^4 - 2rs = 0\}, \qquad F'' := \{(r, s) \in \mathbb{R}^2 : r^4 + s^4 + 2rs = 0\},$$
first for some point $a \ne (0, 0)$, then for $a = (0, 0)$. [Hint: first study the symmetry properties of $F$ and set $s = tr$.]
(b) Write a necessary condition in order that a differentiable function $f : \mathbb{R}^2 \to \mathbb{R}$ attains on $F$ a local minimum at $(0, 0)$. Assuming that $f$ is twice differentiable at $(0, 0)$, write a second order necessary condition.

4. Give the dimensions of a cylindrical can that has a given volume $v$ and the least area $a(v)$. Give an interpretation of the multiplier in terms of the derivative of $a(\cdot)$.

5. Give the dimensions of a cylindrical can that has a given area $a$ and the greatest volume $v(a)$. Give an interpretation of the multiplier in terms of the derivative of $v(\cdot)$.

6. Give the dimensions of a box without lid that has a given volume $v$ and the least area $a(v)$. Give an interpretation of the multiplier in terms of the derivative of $a(\cdot)$.

7. Give the dimensions of a box without lid that has a given area $a$ and the greatest volume $v(a)$. Give an interpretation of the multiplier in terms of the derivative of $v(\cdot)$.
5.7 Introduction to the Calculus of Variations

The importance of the calculus of variations stems from its role in the history of the development of analysis and from its ability to present general principles that govern a number of physical phenomena. Among these are the Fermat Principle ruling the route of light and the Euler-Maupertuis Principle of Least Action governing mechanics. Historically, the calculus of variations appeared at the end of the 17th century with the brachistochrone problem, solved in 1696 by Johann Bernoulli (Fig. 5.5). This problem consists in determining a curve joining two given points of space along which a frictionless bead slides under the action of gravity in a minimal time. The novelty of such a problem lies in the fact that the unknown is a geometrical object, a curve or a function, not a real number or a finite sequence of real numbers.
Fig. 5.5 The brachistochrone problem
Thus, such a topic brings to the fore the use of functional spaces, even for one-dimensional problems. We do not limit our attention to this case because many partial differential equations are derived from problems in the calculus of variations.
5.7.1 The One-Variable Case

In fact, the choice of an appropriate space of functions in which a solution is to be found is part of the problem. Several choices are possible. The simplest one is the space of functions of class $C^1$, but it rules out piecewise $C^1$ functions. The most general one involves absolutely continuous maps and Lebesgue null sets and is a bit technical; for many problems piecewise $C^1$ curves would suffice but the space of piecewise $C^1$ functions is not complete. We adopt an intermediate choice close to the choice of the class $C^1$.

Let $E$ be a Banach space and let $T := [a, b]$ be a compact interval of $\mathbb{R}$. We will use the space $X := R^1(T, E)$ of functions $x : T \to E$ that are primitives of (normalized) regulated functions from $T$ to $E$; this means that there exists a function $x' : T \to E$ that is right continuous on $[a, b[$ and has a left limit $x'(t_-)$ for all $t \in\, ]a, b]$ with $x'(b) = x'(b_-)$ such that
$$x(t) = x(a) + \int_a^t x'(s)\, ds, \qquad t \in T.$$
Then $x'$ is determined by $x$ since for each $t \in [a, b[$, $x'(t)$ is the right derivative of $x$ at $t$ and $x'(b)$ is the left derivative of $x$ at $b$. We endow $X$ with the norm
$$\|x\| = \sup_{t \in T} \|x(t)\| + \sup_{t \in T} \|x'(t)\|.$$
It is equivalent to the norm $x \mapsto \|x(a)\| + \sup_{t \in T} \|x'(t)\|$, as is easily seen. Then $X$ is a Banach space (use Theorem 5.6).

Without loss of generality, we may suppose $T := [0, 1]$; passing from $x \in R^1(T, E)$ to $w \in R^1([0, 1], E)$ by setting $w(s) := x(a + s(b - a))$ enables us to reduce a problem set on $R^1(T, E)$ to a problem set in $R^1([0, 1], E)$.

Given $(e_0, e_1) \in E \times E$, an open subset $U$ of $E \times E \times T$ and a continuous function $L : U \to \mathbb{R}$, the problem consists in minimizing the function $j$ given by
$$j(x) = \int_0^1 L(x(t), x'(t), t)\, dt$$
over the set $W_U(e_0, e_1)$ of elements $x$ of $X$ such that $x(0) = e_0$, $x(1) = e_1$ and $(x(t), x'(t), t) \in U$ for each $t \in T$. We note that since $L$ is continuous the function $t \mapsto L(x(t), x'(t), t)$ is regulated, so that the integral is well defined.
Let us note that our choice of the space $R^1(T, E)$ for the solution is larger than the space $C^1(T, E)$, so that it may happen that a solution exists in $R^1(T, E)$, but not in $C^1(T, E)$.

Example Let $L : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ be given by $L(e, v) := e^2 (v^2 - 4)^2$ and let $e_0 := 0$, $e_1 := 1$. Then $x \in R^1(T) := R^1(T, \mathbb{R})$ given by $x(t) = (2t - 1)_+ := \max(2t - 1, 0)$ is a solution of the problem since $j(x) = 0$ and $j(x) \ge 0$ for all $x \in R^1(T)$. However, the problem has no solution in $C^1(T)$. To see this, we note that if $x \in C^1(T)$ satisfies the conditions $x(0) = 0$, $x(1) = 1$, $x'$ must take the value $1$ for at least one $t \in T$ since otherwise we can find $\alpha > 0$ such that either $x'(t) \ge 1 + \alpha$ or $x'(t) \le 1 - \alpha$ for all $t \in T$ and in each case the boundary conditions are not satisfied. Then, on some neighborhood of $t$ one has $x'(t) \in\, ]0, 2[$ and $x$ vanishes at most once on this neighborhood, so that $j(x) > 0$. □

Even by using a larger space than $C^1(T, E)$ and taking a very regular Lagrangian $L$, the minimization problem of $j$ over $W_U(e_0, e_1)$ may have no solution, as shown by the next example. The existence of a minimizer can be ensured by the use of compactness and lower semicontinuity arguments. Such a method is usually called the direct method of the calculus of variations (see the last example of Sect. 2.2.3 and, for instance, [61, 78, 86]).

Example Let $L : U := \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ be given by $L(e, v) = e^2 + (v^2 - 1)^2$ and let $e_0 = 0 = e_1$. Let us show that $\inf\{j(w) : w \in W_U(e_0, e_1)\} = 0$ but that there is no $w \in W_U(e_0, e_1)$ such that $j(w) = 0$. The latter fact is obvious since $j(w) = 0$ implies that $w(t) = 0$ for all $t \in [0, 1]$ (since $w$ is continuous) and $|w'(t)| = 1$ for almost every $t \in [0, 1]$ and these two requirements are incompatible. To demonstrate the first fact we note that $j(w) \ge 0$ for all $w \in W_U(e_0, e_1)$ and we construct a sequence $(w_n)$ in $W_U(e_0, e_1)$ such that $(j(w_n)) \to 0$. We determine $w_n$ by requiring that $w_n(k/n) = 0$ for all $k \in \mathbb{N}_n \cup \{0\}$ and $w_n'(t) = 1$ for $t \in\, ](2k - 2)/2n, (2k - 1)/2n[$ and $w_n'(t) = -1$ for $t \in\, ](2k - 1)/2n, 2k/2n[$ for $k \in \mathbb{N}_n$. Then we have $w_n'(t)^2 = 1$ for almost all $t$ and $w_n(t)^2 \le 1/4n^2$ for all $t \in [0, 1]$, so that $j(w_n) \le 1/4n^2$. □

In fact, in the sequel we replace the set $W_U(e_0, e_1)$ with $W \cap H(e_0, e_1)$, where $H(e_0, e_1)$ is the affine subspace
$$H(e_0, e_1) := \{x \in X : x(0) = e_0,\ x(1) = e_1\}$$
and $W := \{x \in X : \mathrm{cl}((J^1 x)(T)) \subset U\}$ with $(J^1 x)(T) := \{(x(t), x'(t), t) : t \in T\}$. Such a choice brings some preliminary results of interest.

Lemma 5.18 Given $U$, $L$, $W$ and $j$ as above, the set $W$ is open in $X$ and $j$ is continuous on $W$.

Proof This can be deduced from Proposition 3.45 by using the embedding $x \mapsto (x, x')$ from $X := R^1(T, E)$ into $R(T, E^2)$, or more directly as follows. By Proposition 3.43, for all $x \in X$, the set $\mathrm{cl}((J^1 x)(T))$ is a compact subset of $E \times E \times T$. Thus, if $x \in W$, there exists some $r > 0$ such that $B((J^1 x)(T), r) \subset U$. Then for all $w \in X$ satisfying $\|w - x\| < r$ one has $w \in W$. Thus $W$ is open in $X$.
Moreover, $L$ being continuous, it is uniformly continuous around $\mathrm{cl}((J^1 x)(T))$ in the sense that for every $\varepsilon > 0$ one can find a $\delta > 0$ such that for all $(e, v, t) \in \mathrm{cl}((J^1 x)(T))$ and all $(e', v', t') \in B((e, v, t), \delta)$ one has $|L(e', v', t') - L(e, v, t)| \le \varepsilon$. Therefore, for all $w \in X$ satisfying $\|w - x\| \le \delta$, one has
$$|L(w(t), w'(t), t) - L(x(t), x'(t), t)| \le \varepsilon,$$
hence $|j(w) - j(x)| \le \varepsilon$. □
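Returning to the example with $L(e, v) = e^2 + (v^2 - 1)^2$, the bound $j(w_n) \le 1/4n^2$ for the sawtooth sequence $(w_n)$ can be observed numerically. The sketch below evaluates $j(w_n)$ with a midpoint rule on a grid aligned with the kinks of $w_n$. The function names and the grid size are assumptions; the code uses the fact that $w_n'(t)^2 = 1$ off a finite set, so only the term $w_n^2$ contributes to the integral.

```python
def sawtooth(n, t):
    """w_n(t): distance from t to the nearest multiple of 1/n (slopes +1 and -1)."""
    s = (t * n) % 1.0
    return min(s, 1.0 - s) / n

def j(n, cells_per_piece=200):
    """Midpoint-rule value of the integral of w_n^2 over [0, 1].

    The term (w_n'^2 - 1)^2 vanishes except at finitely many points,
    so it is omitted from the quadrature.
    """
    m = 2 * n * cells_per_piece          # grid aligned with the kinks k/(2n)
    h = 1.0 / m
    return sum(sawtooth(n, (i + 0.5) * h) ** 2 for i in range(m)) * h

values = {n: j(n) for n in (1, 2, 4, 8)}
```

A direct computation of the integral gives $1/(12 n^2)$, comfortably below the bound $1/4n^2$ used in the text, and the entries of `values` decrease accordingly as $n$ grows.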
Proposition 5.38 Suppose $L$ is continuous on $U$ and has partial derivatives with respect to its first and second variables that are continuous on $U$. Then $j$ is of class $C^1$ on $W$ and for $\bar x \in W$, $x \in X$ one has
$$j'(\bar x)x = \int_0^1 \left[ D_1 L(\bar x(t), \bar x'(t), t)\, x(t) + D_2 L(\bar x(t), \bar x'(t), t)\, x'(t) \right] dt.$$

Proof Using an interchange between integration and differentiation one can prove that $j$ is Gateaux differentiable with a derivative given by the above formula. Then the continuity of the derivative shows that in fact $j$ is Fréchet differentiable. Alternatively, using the embedding $x \mapsto (x, x')$ from $R^1(T, E)$ into $R(T, E^2)$ considered above and the linear map $f \mapsto \int_0^1 f(t)\, dt$ from $R(T, E)$ into $E$, differentiability can be deduced from the chain rule and Corollary 5.8.

In order to be more explicit, let us set $L_t(e, v) = L(e, v, t)$ for $(e, v, t) \in U$,
$$Y := \{(e_1, e_2, v_1, v_2, t) : \forall s \in [0, 1],\ ((1 - s)e_1 + se_2, (1 - s)v_1 + sv_2, t) \in U\},$$
$$Z := \{(w_1, w_2) \in W^2 : \forall t \in T,\ (w_1(t), w_2(t), w_1'(t), w_2'(t), t) \in Y\},$$
and, for $(e_1, e_2, v_1, v_2, t) \in Y$,
$$K(e_1, e_2, v_1, v_2, t) := \int_0^1 DL_t((1 - s)e_1 + se_2, (1 - s)v_1 + sv_2)\, ds,$$
so that
$$L(e_1, v_1, t) - L(e_2, v_2, t) = K(e_1, e_2, v_1, v_2, t)(e_1 - e_2, v_1 - v_2). \tag{5.40}$$
The compactness of $T := [0, 1]$ easily yields that $Y$ is open in $E^2 \times E^2 \times T$. Then a proof similar to that of Lemma 5.18 shows that $Z$ is open in $X \times X$ and that the map
$$h : (w_1, w_2, x) \mapsto \int_0^1 K(w_1(t), w_2(t), w_1'(t), w_2'(t), t)(x(t), x'(t))\, dt$$
from $Z \times X$ into $\mathbb{R}$ is continuous. Since this map is linear in $x$ and such that
$$j(w_1) - j(w_2) = h(w_1, w_2, w_1 - w_2)$$
we get from Corollary 5.8 that $j$ is of class $D^1$ and that for $w \in W$ the derivative of $j$ at $w$ is $h(w, w, \cdot)$. Such a conclusion is enough for the necessary condition we have in view.

With an additional effort we can show that $j$ is of class $C^1$. For $(w_1, w_2) \in Z$, $t \in T$ setting
$$k(w_1, w_2)(t) := K(w_1(t), w_2(t), w_1'(t), w_2'(t), t),$$
we define a continuous map $k$ from $Z$ into $R(T, E^* \times E^*)$. Now the map $\varphi : R(T, E^* \times E^*) \to (R^1(T, E))^*$ defined by
$$\varphi(k_1, k_2)(x) := \int_0^1 \left[ k_1(t) x(t) + k_2(t) x'(t) \right] dt$$
is linear and continuous. Composing $k$ with $\varphi$ we obtain a continuous map from $Z$ into $(R^1(T, E))^*$. For $(w_1, w_2) \in Z$, by substituting $(w_i(t), w_i'(t))$ for $(e_i, v_i)$, $i = 1, 2$, in (5.40) and integrating over $T$, we see that
$$j(w_1) - j(w_2) = (\varphi \circ k)(w_1, w_2)(w_1 - w_2)$$
and we deduce from Lemma 5.6 that $j$ is differentiable at $w \in W$ and that $j'(w) = (\varphi \circ k)(w, w)$. Thus $j$ is of class $C^1$; see also Cor. 5.9. □

Let us look for necessary minimality conditions.

Proposition 5.39 Suppose $L$ satisfies the assumptions of the preceding proposition and $\bar x$ is a local minimizer of $j$ on $W(e_0, e_1)$. Then $\bar x$ is a critical point of $j$ on $W(e_0, e_1)$ in the following sense:
$$j'(\bar x)v = 0 \qquad \forall v \in X_0 := W(0, 0) := \{x \in X : x(0) = 0 = x(1)\}.$$

Proof Let $N$ be a neighborhood of $\bar x$ in $X$ such that $j(w) \ge j(\bar x)$ for every $w \in N \cap W(e_0, e_1)$. Given $v \in X_0$, for $r \in \mathbb{R}$ with $|r|$ small enough, we have $w := \bar x + rv \in W$ by Lemma 5.18 and $w(0) = e_0$, $w(1) = e_1$. Thus $w \in N \cap W(e_0, e_1)$, hence $j(\bar x + rv) \ge j(\bar x)$ for $|r|$ small enough. It follows that $j'(\bar x)v = 0$. □

Such a condition can be given a more explicit and more efficient form.

Theorem 5.31 (Euler-Lagrange Condition) Suppose $L$ satisfies the assumptions of Proposition 5.38 and $\bar x \in W$ is a critical point of $j$ on $W(e_0, e_1)$. Then the function $D_2 L(\bar x(\cdot), \bar x'(\cdot), \cdot)$ is a primitive of $D_1 L(\bar x(\cdot), \bar x'(\cdot), \cdot)$: for every $t \in [0, 1[$ the right derivative of $D_2 L(\bar x(\cdot), \bar x'(\cdot), \cdot)$ exists and is such that
$$\frac{d}{dt} D_2 L(\bar x(t), \bar x'(t), t) = D_1 L(\bar x(t), \bar x'(t), t). \tag{5.41}$$
5 The Power of Differential Calculus
Here, by an abuse of notation, we denote the right derivative as the derivative. In fact, for a countable subset $D$ of $T$ this relation holds for all $t \in T \setminus D$ with the usual bilateral derivative. The solutions of equation (5.41) are called extremals. We break the proof into three steps of independent interest. Taking $A(t) := D_1L(x(t), x'(t), t)$, $B(t) := D_2L(x(t), x'(t), t)$ in the last one, we shall obtain the result. The first step is as follows.

Lemma 5.19 Let $f$ be an element of the space $R_n(T, \mathbb{R})$ of normalized regulated functions on $T$ such that $f(t) \ge 0$ for all $t \in T$ and $\int_0^1 f(t)\,dt = 0$. Then $f = 0$.

Proof Suppose, on the contrary, that there exists some $r \in T$ such that $f(r) > 0$. When $r < 1$, by the right continuity of $f$ at $r$, we can find some $\alpha, \delta > 0$ such that $r + \delta < 1$ and $f(s) \ge \alpha$ for $s \in [r, r + \delta]$. Then we get $\int_0^1 f(t)\,dt \ge \int_r^{r+\delta} f(t)\,dt \ge \alpha\delta > 0$, a contradiction. If $r = 1$, a similar argument using the left continuity of $f$ at 1 also leads to a contradiction. □

Lemma 5.20 Let $F \in R_n(T, E^*)$ be such that for all $x \in X_0 := \{x \in X : x(0) = 0 = x(1)\}$ one has $\int_0^1 F(t).x'(t)\,dt = 0$. Then $F(\cdot)$ is constant. More precisely, for $e^* := \int_0^1 F(t)\,dt$ one has $F(t) = e^*$ for all $t \in T$.

Proof Changing $F$ into $G := F - e^*$, it suffices to show that $G(\cdot) = 0$ when $\int_0^1 G(t)\,dt = 0$ and $\int_0^1 G(t).x'(t)\,dt = 0$ for every $x \in X_0$. Given $e \in E$, let us introduce $f, g : T \to \mathbb{R}$ and $v, x : T \to E$ given by $g(t) = G(t)(e) := \langle G(t), e\rangle$, $f(t) = (g(t))^2$, $v(t) := G(t)(e)e := \langle G(t), e\rangle e$, $x(t) = \int_0^t v(s)\,ds$. We see that $x(0) = 0$, $x_r'(t) = v(t)$ for $t \in [0, 1[$, and $x(1) = \int_0^1 v(t)\,dt = \langle \int_0^1 G(t)\,dt, e\rangle e = 0$ since the mean of $G$ is 0. Thus $x \in X_0$. Our assumption yields
\[ \int_0^1 f(t)\,dt = \int_0^1 G(t)(e)\,\langle G(t), e\rangle\,dt = \int_0^1 G(t)(\langle G(t), e\rangle e)\,dt = \int_0^1 G(t).x'(t)\,dt = 0. \]
Lemma 5.19 entails $f(\cdot) = 0$. Since $e$ is arbitrary in $E$, we get $G(\cdot) = 0$. □

Lemma 5.21 (Du Bois-Reymond) Let $A, B \in R_n(T, E^*)$ be such that
\[ \forall x \in X_0 \qquad \int_0^1 [A(t)x(t) + B(t)x'(t)]\,dt = 0. \]
Then $B$ is a primitive of $A$: for every $t \in T$ one has $B(t) = B(0) + \int_0^t A(s)\,ds$.
Proof Let us set $C(t) := B(0) + \int_0^t A(s)\,ds$. Then, for each $x \in X_0$ the function $t \mapsto C(t)x(t)$ has a right derivative $t \mapsto A(t)x(t) + C(t)x'(t)$ and by assumption
\[ 0 = \int_0^1 [A(t)x(t) + B(t)x'(t)]\,dt = \int_0^1 \frac{d}{dt}(C(t)x(t))\,dt + \int_0^1 (B(t) - C(t))x'(t)\,dt \]
\[ = C(1)x(1) - C(0)x(0) + \int_0^1 (B(t) - C(t))x'(t)\,dt = \int_0^1 (B(t) - C(t))x'(t)\,dt. \]
Lemma 5.20 ensures that $B - C$ is constant. Since $B(0) - C(0) = 0$, $B = C$. □
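The Euler-Lagrange condition (5.41) can be illustrated numerically. The sketch below is not taken from the text: it assumes the hypothetical Lagrangian $L(e, v, t) := v^2/2 + e^2/2$ on $E := \mathbb{R}$ with boundary values $x(0) = 0$, $x(1) = 1$, for which (5.41) reads $x''(t) = x(t)$ and has the solution $x(t) = \sinh t / \sinh 1$. The functional is discretized on a uniform grid, and stationarity with respect to each interior grid value is enforced by Gauss-Seidel sweeps (a choice made here for simplicity).

```python
import math

def solve_discrete_euler_lagrange(n=20, sweeps=5000):
    """Gauss-Seidel iteration for the discretized equation x'' = x, x(0) = 0, x(1) = 1."""
    h = 1.0 / n
    x = [i * h for i in range(n + 1)]  # initial guess satisfying the boundary conditions
    for _ in range(sweeps):
        for i in range(1, n):
            # Stationarity of the discrete functional with respect to x[i]:
            # (x[i+1] - 2*x[i] + x[i-1]) / h**2 = x[i]
            x[i] = (x[i - 1] + x[i + 1]) / (2.0 + h * h)
    return x

def max_error(x):
    """Largest gap between the grid values and the exact extremal sinh(t)/sinh(1)."""
    n = len(x) - 1
    return max(abs(x[i] - math.sinh(i / n) / math.sinh(1.0)) for i in range(n + 1))

if __name__ == "__main__":
    print(max_error(solve_discrete_euler_lagrange()))
```

The gap is expected to shrink like the square of the mesh size, since the second difference is a second-order approximation of $x''$.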
Let us consider some cases for which equation (5.41) can be simplified. In such cases, one can obtain first integrals, i.e. functions of $x(\cdot)$ that are constant along an extremal.

Corollary 5.22 Suppose the Lagrangian $L$ is independent of $e$: $L(e, v, t) = \widehat{L}_t(v)$. Then, for every extremal $x(\cdot)$, the map $t \mapsto D\widehat{L}_t(x'(t))$ is constant. More generally, if for some $\bar{e} \in E$ one has $D_1L(e, v, t).\bar{e} = 0$ for all $(e, v, t) \in E \times E \times T$, then the function $t \mapsto D_2L(x(t), x'(t), t).\bar{e}$ is constant.

Proof Since $D_1L(x(t), x'(t), t).\bar{e} = 0$, the Euler-Lagrange relation (5.41) means that $\frac{d}{dt} D_2L(x(t), x'(t), t).\bar{e} = 0$, hence $t \mapsto D_2L(x(t), x'(t), t).\bar{e}$ is constant. If $D_1L = 0$, this conclusion holds for every $\bar{e} \in E$ and $t \mapsto D\widehat{L}_t(x'(t))$ is constant. □

Example Let $E := \mathbb{R}$ and let $L$ be given by $L(e, v, t) := v^3/3$. Then every extremal $x$ must satisfy $x'(t)^2 = c^2$ for some $c \in \mathbb{R}$. If we choose to minimize $j$ on $C^1(T, \mathbb{R})$ with the boundary conditions $x(0) = x_0$, $x(1) = x_1$ we find a unique solution $x(t) = x_0 + (x_1 - x_0)t$ for $t \in T$. If we minimize $j$ on $X := R^1(T, \mathbb{R})$, we find infinitely many extremals that are piecewise $C^1$, hence in $X$.

Example Let $E := \mathbb{R}$ and let $L$ be given by $L(e, v, t) := (1/t)\sqrt{v^2 + 1}$. Then every extremal $x$ satisfies for some $c \in \mathbb{R}$ the equation $D_2L(x(t), x'(t), t) = c$, i.e.
\[ \frac{x'(t)}{t\sqrt{x'(t)^2 + 1}} = c, \]
so that $x'(t)^2(1 - c^2t^2) = c^2t^2$ or, assuming $x'$ takes nonnegative values,
\[ x'(t) = \frac{ct}{\sqrt{1 - c^2t^2}}. \]
Integrating, we obtain $x(t) = -c^{-1}\sqrt{1 - c^2t^2} + b$ for some $b \in \mathbb{R}$, so that $(x(t) - b)^2 + t^2 = c^{-2}$
and the graph of $x$ lies on a circle. The constants $b$ and $c$ can be determined by the initial and final values $e_0$ and $e_1$.

Exercise In a number of classical problems the Lagrangian has the form $L(e, v, t) := g(t, e)\sqrt{\|v\|^2 + 1}$. Show that in such a case, with $E := \mathbb{R}$, the Euler equation takes the form
\[ D_1L(x(t), x'(t), t) - D_3L(x(t), x'(t), t)\,x'(t) - \frac{x''(t)}{x'(t)^2 + 1}\,L(x(t), x'(t), t) = 0. \]
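The example with $L(e, v, t) := (1/t)\sqrt{v^2 + 1}$ lends itself to a quick numerical sanity check. The sketch below uses helper names that are illustrative only, with arbitrarily chosen constants $c$ and $b$ such that $|ct| < 1$ on the sampled range. It evaluates $D_2L$ along the curve $x(t) = -c^{-1}\sqrt{1 - c^2t^2} + b$, with $x'$ approximated by a central difference, and checks that the curve lies on the circle $(x - b)^2 + t^2 = c^{-2}$.

```python
import math

def x_extremal(t, c, b):
    """Candidate extremal x(t) = -sqrt(1 - c^2 t^2)/c + b, defined while |c t| < 1."""
    return -math.sqrt(1.0 - (c * t) ** 2) / c + b

def d2_lagrangian(t, c, b, h=1e-6):
    """D_2 L(x(t), x'(t), t) = x'(t) / (t sqrt(x'(t)^2 + 1)), x' by central difference."""
    dx = (x_extremal(t + h, c, b) - x_extremal(t - h, c, b)) / (2.0 * h)
    return dx / (t * math.sqrt(dx * dx + 1.0))

def circle_residual(t, c, b):
    """(x(t) - b)^2 + t^2 - c^(-2), which should vanish on the circle."""
    return (x_extremal(t, c, b) - b) ** 2 + t * t - 1.0 / (c * c)

if __name__ == "__main__":
    for t in (0.1, 0.5, 1.0):
        print(t, d2_lagrangian(t, 0.8, 0.3), circle_residual(t, 0.8, 0.3))
```

Up to discretization and rounding errors, the first printed quantity should stay equal to the chosen $c$ and the second should vanish.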
Under some regularity conditions, higher-order necessary conditions can be obtained.

Corollary 5.23 (Erdmann) Suppose the Lagrangian $L$ is autonomous, i.e. independent of $t$: $L(e, v, t) = L(e, v)$. Then, for every twice differentiable extremal $x(\cdot)$, the function
\[ h : t \mapsto D_2L(x(t), x'(t)).x'(t) - L(x(t), x'(t)) \]
is constant.

Proof It suffices to check that the two terms $D_2L(x(t), x'(t)).x''(t)$ cancel in the computation of $h'$, so that
\[ h'(t) = \frac{d}{dt}[D_2L(x(t), x'(t))].x'(t) - D_1L(x(t), x'(t)).x'(t) = 0. \]
□

Proposition 5.40 (Legendre) Suppose $L$ is of class $C^2$. If $x$ is a minimizer of $j$ on some ball centered at $x$ for the norm of $R^1(T, E)$, then for all $t \in T$ the operator $D_2^2L(x(t), x'(t), t)$ is positive semi-definite.

Proof As in Proposition 5.38, given $y \in R^1(T, E)$ one can show that the second derivative $j''(x).y.y$ of $j$ at $x$ in the direction $y$ is given by
\[ j''(x).y.y = \int_0^1 [D_1^2L(z(t)).y(t).y(t) + 2D_{1,2}^2L(z(t)).y(t)y'(t) + D_2^2L(z(t)).y'(t).y'(t)]\,dt \]
where $z(t) := (x(t), x'(t), t)$. For $a, b \in T$ with $a < b$, $e \in E$, and $n \in \mathbb{N}$ let $h_n : [0, 1] \to \mathbb{R}$ be the sawtooth function in $R^1(T, \mathbb{R})$ satisfying $h_n(t) = 0$ for $t \in [0, a] \cup [b, 1]$, $h_n'(t) = (-1)^k$ for $t \in [a + k(b - a)/n, a + (k + 1)(b - a)/n[$ and let $y_n(t) := h_n(t)e$, so that $\|y_n'\|_\infty = \|e\|$ and $\|y_n\|_\infty = \|e\|(b - a)/n$. Taking $y := y_n$ in the relation $j''(x).y.y \ge 0$ and passing to the limit we get
\[ \int_a^b D_2^2L(z(t)).e.e\,dt \ge 0. \]
Using a proof by contradiction as in Lemma 5.19, we conclude that for all $t \in T$ we have $D_2^2L(z(t)).e.e \ge 0$. □
5.7.2 Some Examples

Geodesics. These are the curves of minimal length joining two points. The simplest case is that of a Euclidean space $E := \mathbb{R}^d$. Then the length of a curve $x : [0, 1] \to E$, $x := (x_1, \ldots, x_d)$ of class $C^1$ (or in $R^1(T, E)$) is given by
\[ j(x) = \int_0^1 \|x'(t)\|\,dt = \int_0^1 \sqrt{x_1'(t)^2 + \ldots + x_d'(t)^2}\,dt. \]
Thus the Lagrangian $L$ is given by $L(e, v, t) := \widehat{L}(v) := \|v\|$ and Corollary 5.22 applies. Since $D\widehat{L}(v) = v/\|v\|$ for $v \in \mathbb{R}^d \setminus \{0\}$, along an extremal $x$ whose derivative does not vanish, the direction $x'(t)/\|x'(t)\|$ of the derivative is a constant vector $u$. This can be seen by using coordinates since for $i \in \mathbb{N}_d$ the function $t \mapsto D_i\widehat{L}(x'(t)) = x_i'(t)/\sqrt{x_1'(t)^2 + \ldots + x_d'(t)^2}$ is constant. Parameterizing the extremal $x$ joining two points $e_0$, $e_1$ of $E$ by the arc length $s(t) := \int_0^t \|x'(r)\|\,dr$, we find that the curve is a line segment: $x(t) = e_0 + s(t)u$ with $u = (e_1 - e_0)/s(1)$, so that $x$ runs along the segment $[e_0, e_1]$.

The Trajectories of Light in Isotropic Media. Suppose that according to the Fermat Principle the trajectory in an isotropic medium minimizes the travel time. Saying that the medium is isotropic means that the speed of light does not depend on the direction. However, it may depend on the point: in deserts, the air is warmer near the ground and the trajectory of light is curved, giving rise to possible mirages. Let us take the Lagrangian in the form
\[ L(e, v) := c(e)\|v\|, \]
where $c$ is a smooth function representing the physical properties of the medium. If $c$ does not depend on its $i$th variable, one obtains that along an extremal $x$ the function
\[ t \mapsto c(x(t))\,\frac{x'(t).e_i}{\|x'(t)\|} = c(x(t))\,\frac{x_i'(t)}{\sqrt{x_1'(t)^2 + \ldots + x_d'(t)^2}} \]
is constant. From this one can recover the Descartes-Snell law of refraction (Fig. 5.1). If the plane $x_3 = 0$ separates two media whose refractive indexes are constants $n_1$ and $n_2$ respectively, the angles $\theta_1$ and $\theta_2$ of $x'(t_-)$ and $x'(t_+)$ with the plane $x_3 = 0$ at the point where the trajectory crosses it satisfy the relation
\[ \frac{1}{n_1}\cos\theta_1 = \frac{1}{n_2}\cos\theta_2. \]
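The conservation of $c(x(t))\,x_i'(t)/\|x'(t)\|$ across an interface can be observed on a toy problem. The sketch below rests on assumptions not taken from the text: the trajectory is a broken line in the plane from $A = (0, 1)$ to $B = (1, -1)$, the weight $c$ takes the constant values $c_1$ above the interface and $c_2$ below it, and the crossing point $(p, 0)$ is located by a ternary search on the convex cost. At the minimizer, the quantities $c_1\cos\theta_1$ and $c_2\cos\theta_2$ should agree, where $\theta_1$, $\theta_2$ are the angles of the two segments with the interface.

```python
import math

def travel_time(p, c1, c2):
    """Cost of the broken line A=(0,1) -> P=(p,0) -> B=(1,-1) for L(e, v) = c(e)*||v||."""
    return c1 * math.hypot(p, 1.0) + c2 * math.hypot(1.0 - p, 1.0)

def best_crossing(c1, c2, iterations=200):
    """Ternary search for the minimizer of the convex function p -> travel_time(p) on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if travel_time(m1, c1, c2) < travel_time(m2, c1, c2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

def first_integrals(p, c1, c2):
    """Values of c*cos(theta) on each side; theta is the angle of the ray with the interface."""
    cos1 = p / math.hypot(p, 1.0)
    cos2 = (1.0 - p) / math.hypot(1.0 - p, 1.0)
    return c1 * cos1, c2 * cos2

if __name__ == "__main__":
    p = best_crossing(1.0, 1.5)
    print(p, first_integrals(p, 1.0, 1.5))
```

Ternary search is used only because the cost is convex in $p$; any one-dimensional minimization routine would serve the same purpose.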
Fig. 5.6 Lobatchevski’s geometry in Poincaré’s half-space
Lobatchevski's Geometry. In the half-space $P := \mathbb{R} \times \mathbb{P}$ endowed with the Riemannian metric given by $g_e(v) := (1/e_2)\|v\|$, the length of a curve $x(\cdot) : [0, 1] \to P$ is given by
\[ \ell(x) := \int_0^1 \frac{\sqrt{x_1'(t)^2 + x_2'(t)^2}}{x_2(t)}\,dt. \]
The Lagrangian $L$ is as in the preceding example with $c(e) := 1/e_2$, hence is independent of $e_1$. Thus, for any extremal $x$ we obtain from Corollary 5.22 that there exists some $c \in \mathbb{R}$ such that
\[ \frac{x_1'(t)}{x_2(t)\sqrt{x_1'(t)^2 + x_2'(t)^2}} = c. \]
The solutions of such equations are either the lines $x_1(t) = x_0$ or the arcs of the circle $x_1(t) = r\cos t + a$, $x_2(t) = r\sin t$ centered on $\mathbb{R} \times \{0\}$. Two points of $P$ can be joined by a geodesic, but such lines do not satisfy all the axioms of Euclid (Fig. 5.6).

The Brachistochrone Problem (J. Bernoulli, 1696). This consists in finding a curve joining two points of $\mathbb{R}^3$ so that a particle moving along this curve starting from the highest of these two points reaches the other one in the shortest time, friction and resistance of the medium being neglected. We may assume the highest point is $(0, 0, h)$ and the lowest point is $(1, 0, 0)$. We admit the curve remains in the plane $\mathbb{R} \times \{0\} \times \mathbb{R}$ and can be parameterized as $(x(r), 0, z(r))$, $r \in [0, 1]$, with $x(r) = r$, i.e. that it is the graph of a function $z(\cdot)$. According to Galileo's law, the velocity $\frac{ds}{dt} = \frac{ds}{dr}\frac{dr}{dt}$ of the particle is independent of the shape of the curve and is given by $\sqrt{2gz(r)}$ where $g$ is the acceleration due to gravity. The travel time is given by
\[ T(z) := \frac{1}{\sqrt{2g}} \int_0^1 \frac{\sqrt{1 + z'(r)^2}}{\sqrt{z(r)}}\,dr, \qquad z \in R^1([0, 1], \mathbb{R}). \]
Since the Lagrangian is autonomous, i.e. does not depend on $r$ (which replaces $t$ here in order to avoid confusion with time), Corollary 5.23 yields some $c \in \mathbb{R}$ such that
\[ \frac{\sqrt{1 + z'(r)^2}}{\sqrt{z(r)}} - \frac{z'(r)^2}{\sqrt{z(r)(1 + z'(r)^2)}} = c \]
or equivalently $1/\sqrt{z(r)(1 + z'(r)^2)} = c$, or $z(r)(1 + z'(r)^2) = c^{-2}$, with the boundary conditions $z(0) = h$, $z(1) = 0$. Let us rename the constant $c^{-2}$ as $c$. Introducing the slope $s(\cdot) : r \mapsto s(r)$ of the tangent $(1, z'(r))$ to the curve by $s(r) := \tan^{-1} z'(r)$, so that $z'(r) = \tan s(r)$, we get
\[ z(r) = \frac{c}{1 + \tan^2 s(r)} = c\cos^2 s(r). \]
Differentiating each side of this relation, we obtain
\[ z'(r) = -2cs'(r)\cos s(r)\sin s(r), \]
hence, since $z'(r) = \tan s(r)$,
\[ 1 = -2cs'(r)\cos^2 s(r) = -cs'(r)(1 + \cos 2s(r)). \]
Thus, by integration, $x = r = r_0 - c(s + (1/2)\sin 2s)$, $z = (1/2)c(1 + \cos 2s)$. The curve has the shape of a cycloid, a curve traced out by a point on the edge of a wheel, as the wheel is rolling along a straight line.

The Minimal Surface Problem. Given a function $x : [0, 1] \to \mathbb{R}_+$ of class $C^2$, the area of the surface obtained by rotating its graph about the $z$-axis (considering $t := z$) is given by
\[ A(x) := 2\pi \int_0^1 x(t)\sqrt{x'(t)^2 + 1}\,dt. \]
Setting $L(e, v) := e\sqrt{v^2 + 1}$, we can find extremals of $A(\cdot)/2\pi$ by applying Corollary 5.23. The latter asserts the existence of a constant $k$ such that
\[ \frac{x(t)x'(t)^2}{\sqrt{x'(t)^2 + 1}} - x(t)\sqrt{x'(t)^2 + 1} = -k \]
or (if $x(t) > 0$) $-k\sqrt{x'(t)^2 + 1} = x(t)x'(t)^2 - x(t)(x'(t)^2 + 1) = -x(t)$, hence
\[ x'(t)^2 = k^{-2}x(t)^2 - 1. \]
This differential equation written by separation of variables as
\[ \int \frac{k\,dx}{\sqrt{x^2 - k^2}} = \int dt \]
Fig. 5.7 A minimal surface of revolution
can be solved by finding a primitive in the left-hand side: $k\cosh^{-1}(x/k)$. Thus $x(t) = k\cosh(t/k + c)$, where the constants $c$ and $k$ have to be determined by the values $x(0)$ and $x(1)$. The graph of such a function is called a catenary. Its shape can be seen in the cooling towers of power plants or in soap bubbles between two rings (Fig. 5.7).

Classical Mechanics. Let us consider a solid with mass $m$ whose position is determined by parameters $(q_1, \ldots, q_n) \in E := \mathbb{R}^n$. It is subject to a force $F(q_1, \ldots, q_n)$ deriving from a potential $U(q_1, \ldots, q_n)$ in the sense that $F(q_1, \ldots, q_n) = \nabla U(q_1, \ldots, q_n)$. Its kinetic energy is given by $T(v_1, \ldots, v_n) = (1/2)m(v_1^2 + \ldots + v_n^2)$. Let $L$ be the Lagrangian given by
\[ L(q, v) := L(q_1, \ldots, q_n, v_1, \ldots, v_n) = T(v_1, \ldots, v_n) + U(q_1, \ldots, q_n). \]
Using the isomorphism between $E$ and $E^*$, the Euler-Lagrange equations turn out to be
\[ mq''(t) = F(q(t)), \]
the Newton equation, in which $q''(t) := (q_1''(t), \ldots, q_n''(t))$ is the acceleration.
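The first integrals used in the brachistochrone and minimal surface examples can be checked numerically. In the sketch below, the function names and sample constants are illustrative assumptions. It evaluates Erdmann's quantity $D_2L.v - L$ of Corollary 5.23 along a catenary $x(t) = k\cosh(t/k + c)$, where the value $-k$ is expected, and the quantity $z(1 + z'^2)$ along the cycloid parametrization, where the constant called $c$ in the parametrization is expected.

```python
import math

def erdmann(lagrangian, d2_lagrangian, e, v):
    """First integral of Corollary 5.23: D_2 L(e, v) * v - L(e, v)."""
    return d2_lagrangian(e, v) * v - lagrangian(e, v)

def area_lagrangian(e, v):
    """L(e, v) = e sqrt(v^2 + 1), the integrand of A(.)/(2 pi)."""
    return e * math.sqrt(v * v + 1.0)

def area_d2(e, v):
    """Partial derivative of area_lagrangian with respect to v."""
    return e * v / math.sqrt(v * v + 1.0)

def catenary(t, k, c):
    """Value and derivative of x(t) = k cosh(t/k + c)."""
    return k * math.cosh(t / k + c), math.sinh(t / k + c)

def cycloid(s, c, r0=0.0):
    """Parametrization r = r0 - c (s + sin(2s)/2), z = c (1 + cos(2s))/2."""
    return r0 - c * (s + 0.5 * math.sin(2.0 * s)), 0.5 * c * (1.0 + math.cos(2.0 * s))

def cycloid_invariant(s, c, h=1e-6):
    """z (1 + z'^2) with z' = dz/dr obtained from central differences in the parameter s."""
    r_plus, z_plus = cycloid(s + h, c)
    r_minus, z_minus = cycloid(s - h, c)
    slope = (z_plus - z_minus) / (r_plus - r_minus)
    _, z = cycloid(s, c)
    return z * (1.0 + slope * slope)

if __name__ == "__main__":
    x, v = catenary(0.5, 0.7, -0.4)
    print(erdmann(area_lagrangian, area_d2, x, v), cycloid_invariant(-0.8, 2.0))
```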
5.7.3 The Legendre Transform

The Legendre transform is a classical method used in the calculus of variations and in the study of differential equations. We give a short account of it as an illustration of inversion techniques. We present a slight refinement of it, replacing a continuous differentiability assumption or a local Lipschitz assumption with a stability assumption. We say that a map $g : U \to V$ between two metric spaces is stable or is Stepanovian if for any $\bar{u} \in U$ there exist some $r > 0$, $c \in \mathbb{R}_+$ such that
for every $u \in B(\bar{u}, r)$ one has $d(g(u), g(\bar{u})) \le c\,d(u, \bar{u})$.

Definition 5.21 A function $f : U \to \mathbb{R}$ on an open subset $U$ of a Banach space $X$ is a (classical) Legendre function if it is differentiable, if its derivative $f' : U \to Y := X^*$ is a stable bijection onto an open subset $V$ of $Y$ whose inverse $h$ is stable too. Then one defines the Legendre transform of $f$ as the function $f^L : V \to \mathbb{R}$ given by
\[ f^L(y) := \langle h(y), y\rangle - f(h(y)), \qquad y \in V. \]
If $h$ is of class $C^1$, $f^L$ is clearly of class $C^1$. However, we do not make this assumption.

Proposition 5.41 If $f$ is a Legendre function on $U$, then its Legendre transform $f^L$ is of class $C^1$ on $V := f'(U)$. It is of class $C^k$ ($k \ge 1$) if $f$ is of class $C^k$. Moreover, $f^L$ is a Legendre function, $(f^L)^L = f$ and for all $(u, v) \in U \times V$ one has
\[ v = Df(u) \iff u = Df^L(v). \]
Furthermore, for $k \ge 2$ one has $D^2f^L(v) = (D^2f(u))^{-1}$ for $v \in V$, $u = Df^L(v)$. Here $D^2f(u)$ is considered as an element of $L(X, X^*)$ and $D^2f^L(v)$ as an element of $L(X^*, X)$.

Proof Given $u \in U$, $v := Df(u) \in V$, let $y \in V - v$, let $x := x(y) := h(v + y) - h(v) \in U - u$ and let $r(x) = f(u + x) - f(u) - Df(u)x$. Then, since $h(v) = u$, $h(v + y) = u + x$, the definition of $f^L$ yields
\[ f^L(v + y) - f^L(v) - \langle u, y\rangle = \langle u + x, v + y\rangle - f(u + x) - \langle u, v\rangle + f(u) - \langle u, y\rangle \]
\[ = \langle x, v + y\rangle - Df(u)(x) - r(x) = \langle x(y), y\rangle - r(x(y)). \]
Since there exists a $c \in \mathbb{R}_+$ such that $\|x(y)\| \le c\|y\|$ for $\|y\|$ small enough, the last right-hand side is a remainder as a function of $y$. Thus $f^L$ is differentiable at $v$ and $Df^L(v) = u = h(v)$. Therefore $(f^L)' = h$ is a bijection with inverse $f'$ and $f^L$ is a Legendre function. Now
\[ (f^L)^L(u) = \langle Df^L(v), v\rangle - f^L(v) = \langle u, v\rangle - (\langle u, v\rangle - f(u)) = f(u). \]
Suppose now that $f$ is of class $C^2$. Let $g := f'$, $u \in U$, $v := g(u) \in V$, $x \in X$, $A := g'(u)$. For $t > 0$ small enough to ensure $u + tx \in U$, we set $y_t := t^{-1}(g(u + tx) - g(u))$, so that $v + ty_t = g(u + tx)$, $h(v + ty_t) = u + tx$, $h(v) = u$ and $\|tx\| = \|h(v + ty_t) - h(v)\| \le tc\|y_t\|$ for $t$ small enough. Since $(y_t) \to y := A(x)$ as $t \to 0_+$, we get $\|x\| \le c\|A(x)\|$. Thus $A$ is injective and its image is a complete
subspace of $Y$, as is easily seen. Let us show that this image is dense, which will prove that $A$ is an isomorphism. Given $y \in Y$, let us set $x_t := t^{-1}(h(v + ty) - h(v))$, so that $v + ty = g(h(v) + tx_t)$ and $ty = g(u + tx_t) - v = g(u + tx_t) - g(u) = tA(x_t) + tz_t$, where $(z_t) \to 0$ as $t \to 0_+$ since $g$ is differentiable at $u$ and $(x_t)$ is bounded by the assumption that $h$ is stable at $v$. Thus $d(y, A(X)) \le \lim_t \|z_t\| = 0$ and $y \in \mathrm{cl}(A(X)) = A(X)$, which is closed in $Y$. Thus $A$ is an isomorphism and the inverse mapping theorem shows that the inverse $h$ of $g$ is differentiable at $v$ with derivative $A^{-1} := g'(u)^{-1}$. Thus $h$ is of class $C^1$ and since $(f^L)' = h$, we get that $f^L$ is of class $C^2$. Moreover,
\[ D^2f^L(v) = Dh(v) = (Dg(u))^{-1} = (D^2f(u))^{-1} = (D^2f(h(v)))^{-1}. \]
When $f$ is of class $C^k$, $(f^L)' = h = g^{-1}$ is of class $C^{k-1}$. Thus, $f^L$ is of class $C^k$. □

It will be shown that the Legendre transform can be used to reduce the nonlinear second order partial differential equation of minimal surfaces in $\mathbb{R}^3$
\[ \operatorname{div}\left(\frac{\nabla f(x)}{(1 + \|\nabla f(x)\|^2)^{1/2}}\right) = 0, \qquad x \in \Omega \subset \mathbb{R}^2 \tag{5.42} \]
to a linear equation.

Exercise Let $X$ be a Banach space, let $A : X \to X^*$ be a linear isomorphism, let $b \in X^*$, $c \in \mathbb{R}$ and let $f$ be given by $f(x) := (1/2)\langle Ax, x\rangle + \langle b, x\rangle + c$ for $x \in X$. Show that $f$ is a Legendre function and compute $f^L$.

Exercise Given $a, b \in \mathbb{R}$, let $f : \mathbb{R} \to \mathbb{R}$ be given by $f(x) := (1/4)x^4 + ax + b$. Show that $f$ is a Legendre function and find its Legendre transform.
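The conclusions of Proposition 5.41 can be illustrated on a one-dimensional example. The sketch below uses $f(x) := e^x$ on $U := \mathbb{R}$, a choice made here and not taken from the text: then $f' = \exp$ maps $\mathbb{R}$ onto $V := \,]0, \infty[$, its inverse is $h = \log$, and $f^L(y) = y\log y - y$. Finite differences (an assumption of the sketch) are used to compare $Df^L(v)$ with $u$, $D^2f^L(v)$ with $(D^2f(u))^{-1}$, and $(f^L)^L$ with $f$.

```python
import math

def legendre_transform(f, h):
    """Return f^L : y -> <h(y), y> - f(h(y)), where h is the inverse of f'."""
    return lambda y: h(y) * y - f(h(y))

def derivative(fun, x, step=1e-5):
    """Central difference approximation of fun'(x)."""
    return (fun(x + step) - fun(x - step)) / (2.0 * step)

def second_derivative(fun, x, step=1e-4):
    """Central difference approximation of fun''(x)."""
    return (fun(x + step) - 2.0 * fun(x) + fun(x - step)) / (step * step)

f = math.exp                                # f(x) = e^x, so f' = exp and h = log
f_L = legendre_transform(f, math.log)       # f^L(y) = y log y - y on ]0, +inf[
f_LL = legendre_transform(f_L, math.exp)    # the inverse of (f^L)' = log is exp

if __name__ == "__main__":
    u = 0.7
    v = math.exp(u)                         # v = Df(u)
    print(derivative(f_L, v), u)            # Df^L(v) should be close to u
    print(second_derivative(f_L, v), 1.0 / second_derivative(f, u))
    print(f_LL(u), f(u))                    # (f^L)^L should reproduce f
```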
5.7.4 The Hamiltonian Formalism

When $L$ is of class $C^2$, the Euler-Lagrange equation (5.41) is an implicit ordinary differential equation of order two. Let us show how it can be reduced to an explicit first order differential system under the assumption that for $(e, t) \in E \times T$ the function $L_{e,t} : v \mapsto L(e, v, t)$ is a Legendre function on $U_{e,t} := \{v \in E : (e, v, t) \in U\}$. We set $V_{e,t} := DL_{e,t}(U_{e,t})$ and we denote by
\[ V := \bigcup_{(e,t) \in E \times T} \{e\} \times V_{e,t} \times \{t\} \]
the image of $U$ under the diffeomorphism $(e, v, t) \mapsto (e, DL_{e,t}(v), t)$. We introduce the Hamiltonian $H : V \to \mathbb{R}$ by
\[ H(e, p, t) = \langle p, v\rangle - L(e, v, t) \quad \text{for } p := D_2L(e, v, t), \tag{5.43} \]
so that $H_{e,t} := H(e, \cdot, t)$ is the Legendre transform of $L_{e,t}$. The passage from the Euler-Lagrange equation to the Hamiltonian system is described in the next result.

Theorem 5.32 (Hamilton) Suppose $L$ is continuous on $U$, has continuous partial derivatives $D_1L$ and $D_2L$ and that for all $(e, t) \in E \times T$, the map $D_2L(e, \cdot, t)$ is a diffeomorphism of class $C^1$ from $U_{e,t}$ onto its image $V_{e,t}$. Let $x$ be an extremal and let $y(t) := D_2L(x(t), x'(t), t)$. Then the pair $(x, y)$ satisfies the Hamilton differential system
\[ x'(t) = D_2H(x(t), y(t), t) \tag{5.44} \]
\[ y'(t) = -D_1H(x(t), y(t), t). \tag{5.45} \]

Proof Since $L_{e,t}$ is a Legendre function, the relation $p = D_2L(e, v, t)$ is equivalent to the relation $v = D_2H(e, p, t)$:
\[ v = D_2H(e, p, t) \iff p = D_2L(e, v, t). \tag{5.46} \]
Plugging $e := x(t)$, $v := x'(t)$, $p := y(t) := D_2L(x(t), x'(t), t)$ in (5.46), we get (5.44). Since for all $(e, t) \in E \times T$, the map $D_2L(e, \cdot, t)$ is a diffeomorphism of class $C^1$ from $U_{e,t}$ onto $V_{e,t}$, the map $w \mapsto D_2^2L(e, v, t).w$ is invertible and its inverse is the derivative at $p := D_2L(e, v, t)$ of $D_2L(e, \cdot, t)^{-1} = D_2H(e, \cdot, t)$. Thus, the map $v_t : (e, p) \mapsto v_t(e, p)$ defined by the implicit equation
\[ p - D_2L(e, v_t(e, p), t) = 0 \]
is of class $C^1$. The expression of $H$ in (5.43) yields
\[ H_t(e, p) := H(e, p, t) = \langle p, v_t(e, p)\rangle - L_t(e, v_t(e, p)), \]
where $L_t(e, v_t(e, p)) = L(e, v_t(e, p), t)$. Differentiating both sides with respect to $e$, abbreviating $v_t(e, p)$ as $v_t$, one has
\[ D_1H_t(e, p)e' = \langle p, D_1v_t(e, p).e'\rangle - D_1L_t(e, v_t)e' - D_2L_t(e, v_t)(D_1v_t(e, p)e') = -D_1L(e, v(e, p, t), t)e' \]
for all $e' \in E$, or
\[ D_1H(e, p, t) = -D_1L(e, v(e, p, t), t). \tag{5.47} \]
Plugging $e = x(t)$, $v = x'(t)$, $p := y(t)$ into this relation and taking into account the Euler-Lagrange equation (5.41), we get
\[ y'(t) := \frac{d}{dt} D_2L(x(t), x'(t), t) = D_1L(x(t), x'(t), t) = -D_1H(x(t), y(t), t). \]
□
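The Hamilton system (5.44), (5.45) can be tried out on a simple autonomous Lagrangian. The sketch below makes several assumptions not found in the text: $E := \mathbb{R}$, the Lagrangian $L(e, v) := \frac{1}{2}Mv^2 - \frac{1}{2}Ke^2$ with arbitrary constants $M$, $K$ (so that $U(e) = -\frac{1}{2}Ke^2$ in the convention $L = T + U$ used above), partial derivatives of $H$ approximated by central differences, and a classical Runge-Kutta scheme for the integration. The Hamiltonian is obtained from (5.43) by solving $p = D_2L(e, v) = Mv$ for $v$.

```python
import math

M, K = 2.0, 8.0  # hypothetical mass and stiffness; the pulsation is sqrt(K / M) = 2

def lagrangian(e, v):
    """L(e, v) = M v^2 / 2 - K e^2 / 2."""
    return 0.5 * M * v * v - 0.5 * K * e * e

def hamiltonian(e, p):
    """H(e, p) = <p, v> - L(e, v) with v solving p = D_2 L(e, v) = M v, as in (5.43)."""
    v = p / M
    return p * v - lagrangian(e, v)

def hamilton_rhs(x, y, h=1e-5):
    """Right-hand sides (D_2 H, -D_1 H) of (5.44), (5.45), by central differences."""
    d1 = (hamiltonian(x + h, y) - hamiltonian(x - h, y)) / (2.0 * h)
    d2 = (hamiltonian(x, y + h) - hamiltonian(x, y - h)) / (2.0 * h)
    return d2, -d1

def rk4_step(x, y, dt):
    """One classical fourth-order Runge-Kutta step for the Hamilton system."""
    k1x, k1y = hamilton_rhs(x, y)
    k2x, k2y = hamilton_rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
    k3x, k3y = hamilton_rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
    k4x, k4y = hamilton_rhs(x + dt * k3x, y + dt * k3y)
    return (x + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0,
            y + dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0)

def integrate(x, y, t_end=1.0, steps=1000):
    """Integrate the system from time 0 to t_end starting at (x, y)."""
    dt = t_end / steps
    for _ in range(steps):
        x, y = rk4_step(x, y, dt)
    return x, y

if __name__ == "__main__":
    x1, y1 = integrate(1.0, 0.0)
    print(x1, math.cos(2.0))            # extremal x(t) = cos(2t)
    print(y1, -4.0 * math.sin(2.0))     # y(t) = M x'(t)
    print(hamiltonian(x1, y1), hamiltonian(1.0, 0.0))
```

For an autonomous Lagrangian the value of $H$ along the computed pair $(x, y)$ coincides with Erdmann's first integral of Corollary 5.23, which is why it is printed alongside the solution.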
Corollary 5.24 (Legendre) If $L$ is of class $C^2$ and if for an extremal $x$ and all $t \in T$ the second derivative $D_2^2L(x(t), x'(t), t)$ is an invertible element of $L(E, E^*)$, then $x$ is of class $C^2$.

Proof When $L$ is of class $C^2$ the definition of $H$ in (5.43) shows that the Hamiltonian $H$ is of class $C^2$. Moreover, our assumption ensures that for all $t \in T$ there is an open neighborhood $U$ of $(x(t), x'(t), t)$ such that $v \mapsto D_2L(x(t), v, t)$ is a $C^1$ diffeomorphism from $U_{x(t),t}$ onto its image $V_{x(t),t}$, where $U_{e,t} := \{v \in E : (e, v, t) \in U\}$. We may suppose $U$ contains the compact set $\{(x(t), x'(t), t) : t \in T\}$. Thus, we can apply Theorem 5.32. By the regularity and uniqueness results for ordinary differential equations we get that $(x(\cdot), y(\cdot))$ is of class $C^1$ and in fact of class $C^2$ since the right-hand sides of (5.44), (5.45) are then of class $C^1$. □
Exercises

1. Let $(e, v) \mapsto L(e, v)$ be a nonnegative autonomous Lagrangian on some open subset $U$ of $E \times E$. Show that if $x(\cdot)$ is an extremal of $\sqrt{L}$ such that the Lagrangian $L(x(\cdot), x'(\cdot))$ is constant, then $x$ is an extremal of $L$. Suppose that for all $(e, v) \in U$ and all $r > 0$ one has $(e, rv) \in U$ and $L(e, rv) = r^2L(e, v)$. Show that any extremal of $L$ is also an extremal of $\sqrt{L}$. [Hint: use Corollary 5.23 and the homogeneity assumption in order to get that $L$ is constant along an extremal.]

2. Let $E$ be a Euclidean space and let $L$ be the Lagrangian given by $L(e, v) := \|v\|^2$. Show that if $x$ minimizes $j : x \mapsto \int_0^1 \|x'(t)\|^2\,dt$ over the set $W(e_0, e_1) := \{x \in R^1(T, E) : x(0) = e_0,\ x(1) = e_1,\ x'(T) \subset E \setminus \{0\}\}$ with $T := [0, 1]$, then $t \mapsto x'(t)$ is constant on $T$ and $x$ is also an extremal of the length functional $\ell : x \mapsto \int_0^1 \|x'(t)\|\,dt$ over the set $W(e_0, e_1)$. Use the preceding exercise to show that conversely, if $x$ is an extremal of the length functional $\ell$ and if for some change of parameter $\sigma$ the function $s \mapsto x'(\sigma(s))$ is constant, then $x \circ \sigma$ is an extremal of $j$.
5.7.5 The Several Variables Case

The Euler-Lagrange necessary condition for the minimization of integral functionals with unknown functions of several variables is the source of numerous partial differential equations. However, we just give a short treatment, considering that the one-variable case gives the main ideas. Now we suppose that $\Omega$ is a bounded open subset of $\mathbb{R}^d$ with smooth boundary, that $E$ is still a Banach space (usually $E := \mathbb{R}$), and that
\[ L : E^d \times E \times \Omega \to \mathbb{R}, \]
the so-called Lagrangian, gives rise to the functional
\[ j(w) = \int_\Omega L(Dw(x), w(x), x)\,dx, \qquad w \in C^2(\Omega, E). \]
Note that here we exchange the place of $w$ and $Dw(x)$ for the convenience of notation. We look for conditions implied by the assumption that a function $u : \Omega \to E$ of class $C^2$ is a local minimizer of $j$ on a space of functions from $\Omega$ to $E$ that contains the affine subspace $C_c^1(\Omega, E) + u$, where $C_c^1(\Omega, E)$ is the space of functions of class $C^1$ with compact support from $\Omega$ into $E$. Taking an arbitrary $v \in C_c^1(\Omega, E)$ and assuming that $L$ is of class $C^2$ and that the conditions for differentiating the integral $j(u + tv)$ with respect to $t \in \mathbb{R}$ for $t = 0$ are satisfied, we obtain the relation $j'(u)(v) = 0$, with
\[ j'(u)(v) = \int_\Omega \Big[\sum_{i=1}^d D_iL(Du(x), u(x), x)D_iv(x) + D_{d+1}L(Du(x), u(x), x)v(x)\Big]\,dx. \]
Since $v$ has compact support, with our smoothness assumption the Green's formula of Theorem 9.9 yields
\[ \int_\Omega \Big[-\sum_{i=1}^d D_i(D_iL(Du(x), u(x), x))v(x) + D_{d+1}L(Du(x), u(x), x)v(x)\Big]\,dx = 0. \]
Since $v$ is arbitrary in $C_c^1(\Omega, E)$ we obtain the following second-order partial differential equation in divergence form
\[ -\sum_{i=1}^d D_i(D_iL(Du(x), u(x), x)) + D_{d+1}L(Du(x), u(x), x) = 0, \qquad x \in \Omega, \]
known as the Euler-Lagrange equation. In the next examples we take $E := \mathbb{R}$. In the case $E := \mathbb{R}^n$ we would obtain systems.

Example Assuming that $L(p, e, x) = \frac{1}{2}\|p\|^2 := \frac{1}{2}p_1^2 + \ldots + \frac{1}{2}p_d^2$ for $(p, e, x) \in \mathbb{R}^d \times \mathbb{R} \times \Omega$ we obtain
\[ \Delta u(x) = 0, \qquad x \in \Omega, \]
where $\Delta$ is the Laplacian; this is Dirichlet's Principle.

Example Assuming that for a matrix $A(x) := (a_{i,j}(x))$ and a continuous function $f : \Omega \to \mathbb{R}$ the Lagrangian is the quadratic form $L(p, e, x) = \frac{1}{2}\sum_{i,j=1}^d a_{i,j}(x)p_ip_j - f(x)e$ for $(p, e, x) \in \mathbb{R}^d \times \mathbb{R} \times \Omega$ we obtain
\[ -\sum_{i,j=1}^d D_i(a_{i,j}(x)D_ju(x)) = f(x), \qquad x \in \Omega, \]
a linear (or rather affine) second-order equation.
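A discrete counterpart of Dirichlet's Principle can be sketched as follows. The assumptions are illustrative and not from the text: $\Omega$ is the unit square, the energy $\frac{1}{2}\int_\Omega \|Du\|^2$ is replaced by half the sum of squared differences between neighbouring grid values, and the boundary values are those of $g(x, y) := x^2 - y^2$. Stationarity of the discrete energy with respect to an interior value says that this value is the mean of its four neighbours, which is the five-point discretization of $\Delta u = 0$. Gauss-Seidel sweeps (a choice made here) enforce that relation.

```python
def boundary_value(x, y):
    """Boundary datum g(x, y) = x^2 - y^2, a harmonic polynomial."""
    return x * x - y * y

def solve_discrete_dirichlet(n=8, sweeps=500):
    """Minimize the discrete Dirichlet energy on an (n+1) x (n+1) grid by Gauss-Seidel sweeps."""
    h = 1.0 / n
    u = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(n + 1):
            if i in (0, n) or j in (0, n):
                u[i][j] = boundary_value(i * h, j * h)
    for _ in range(sweeps):
        for i in range(1, n):
            for j in range(1, n):
                # Stationarity in u[i][j]: the value equals the mean of its four neighbours.
                u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1])
    return u

def max_gap(u):
    """Largest difference between the grid solution and g at the grid points."""
    n = len(u) - 1
    h = 1.0 / n
    return max(abs(u[i][j] - boundary_value(i * h, j * h))
               for i in range(n + 1) for j in range(n + 1))

if __name__ == "__main__":
    print(max_gap(solve_discrete_dirichlet()))
```

The datum $x^2 - y^2$ is convenient because it satisfies the five-point relation exactly, so the iteration should reproduce it at the grid points up to rounding.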
It can be shown that the second-order necessary condition
\[ j''(u)vv \ge 0 \qquad \forall v \in C_c^1(\Omega) \]
leads to the Legendre condition
\[ \sum_{i,j=1}^d \frac{\partial^2 L}{\partial p_i \partial p_j}(Du(x), u(x), x)\,q_iq_j \ge 0, \qquad x \in \Omega,\ q := (q_i) \in \mathbb{R}^d. \]
Such a condition is related to the assumption that for all $(e, x) \in \mathbb{R} \times \Omega$ the function $L(\cdot, e, x)$ is convex. Such an assumption can be used to get existence results. For such results it is advisable to use the Sobolev space $W_q^1(\Omega)$ (with $q \in\, ]1, \infty[$) considered in Chap. 9 rather than $C^1(\Omega)$: not only is it a larger space but it is also reflexive. The first step in this method (called the direct method) is the following semicontinuity result.

Proposition 5.42 Assume that for some $q \in\, ]1, \infty[$ and some $a \in \mathbb{P}$, $b \in \mathbb{R}_+$ one has $L(p, e, x) \ge a\|p\|^q - b$ for all $(p, e, x) \in \mathbb{R}^d \times \mathbb{R} \times \Omega$ and that $L(\cdot, e, x)$ is convex. Then the functional $w \mapsto j(w)$ is sequentially weakly lower semicontinuous on $W_q^1(\Omega)$.

Note that the assumption on $L$ ensures that for all $w \in W_q^1(\Omega)$
\[ j(w) \ge a\|Dw\|_{L_q(\Omega)}^q - b\lambda_d(\Omega) \]
where $\lambda_d(\Omega)$ is the Lebesgue measure of $\Omega$. Thus $j(w) \to \infty$ as $\|Dw\|_{L_q(\Omega)} \to \infty$. Fixing the value $g$ of $w$ on the boundary $\partial\Omega$ of $\Omega$, one can deduce from the theory of traces that $j$ is coercive on the affine space $W_q^1(\Omega)_g$ of those $w \in W_q^1(\Omega)$ whose traces on $\partial\Omega$ are $g$. Existence ensues (see [117, Section 8.2]).

Theorem 5.33 Assuming that the space $W_q^1(\Omega)_g$ is not empty and that the assumptions of the preceding proposition are satisfied, the functional $j$ attains its minimum on $W_q^1(\Omega)_g$.
A thing of beauty is a joy for ever:
Its loveliness increases; it will never
Pass into nothingness.
John Keats, Endymion
Abstract This chapter can be conceived as a substantial course on convex analysis. But it appears here in view of its relationships with other subjects such as optimization and differential calculus. Convex functions have remarkable continuity and differentiability properties. They offer a substitute to the derivative, the subdifferential, whose calculus rules are delineated. Moreover, convexity allows rich duality properties that are displayed along two classical lines: the Lagrangian one and the perturbational one.
convex functions exhibits a substitute for the derivative called the subdifferential. It serves as a prototype for nonsmooth analysis. The main differences with classical analysis are the one-sided character of the subdifferential and the fact that a set of linear forms is substituted for the derivative. Still, nice calculus rules can be devised. Some of them, for instance for the subdifferential of the maximum of two functions, go beyond usual calculus rules. Besides classical rules of convex analysis, we sketch some fuzzy rules for the calculus of subdifferentials in the exercises of Sect. 6.3.1 (see also [208, Section 3.5]).

Subdifferentials are closely linked with duality, so we provide a short account of this important topic. We also gather some elements of the geometry of normed spaces that may be useful. Even if we do not insist on that point, it appears that duality plays some role in the interplay between convexity and differentiability of norms or powers of norms.

Convex analysis illustrates a typical feature of nonsmooth analysis that shows a spectacular difference with classical analysis: the study of functions of this class is intimately tied to the study of a class of sets. The many passages from functions to sets and vice versa represent a fruitful and attractive approach that exemplifies the unity and the flexibility of mathematics.

The usefulness of convex analysis in various fields (combinatorics, game theory, geometry, mathematical economics, mechanics, optimization...) is undeniable. Here we illustrate it with just two examples.

Example (Linear Programming) Given $A \in L(\mathbb{R}^m, \mathbb{R}^n)$, $b \in \mathbb{R}^n$, $c \in \mathbb{R}^m$, let us consider the problem
\[ (P) \qquad \text{minimize } \langle c \mid x\rangle \text{ subject to } x \in \mathbb{R}^m,\ x \ge 0,\ Ax = b. \]
If $m$ is large and $n$ is small, it may be of interest to consider the dual problem
\[ (D) \qquad \text{maximize } \langle b \mid y\rangle \text{ subject to } y \in \mathbb{R}^n,\ A^*y \le c \]
for which the number $n$ of unknowns is reduced. Here $A^*$ is the adjoint of $A$. In the case $n = 2$, one can even solve $(D)$ graphically: the constraint set $C := \{y \in \mathbb{R}^n : A^*y \le c\}$ is a polyhedron of $\mathbb{R}^2$ and it suffices to move the line $L_r := \{y \in \mathbb{R}^2 : \langle b \mid y\rangle = r\}$ as much as possible so that it still meets $C$. In the case $m = 12$ and $C$ is the convex hull of the twelve vertices of a regular planar crystal or of a clock, one obtains in such a way either a vertex or a segment joining two consecutive vertices. If for instance $b = (1, 1)$ is the direction of the light, the segment joining the vertices I and II of the clock in Fig. 6.1 is the set of solutions to $(D)$. Such a graphical solution can be found by young children. It remains to apply the known relations between the solutions of $(D)$ and the solutions of $(P)$ provided by the theory of linear programming, a special case of convex programming. In particular, the values of the two problems are equal unless one of the feasible sets is empty, and if a solution $y$ of $(D)$ is such that $(A^*y)_i < c_i$ then any solution $x$ of $(P)$ must satisfy $x_i = 0$. This is often enough to find a
6 A Touch of Convex Analysis
Fig. 6.1 The dual of a linear programming problem
solution of the given problem $(P)$. Of course, in the case of industrial or trading problems the values of $m$ and $n$ may be very large (say $10^5$ or $10^6$) and one has to use algorithms. Such algorithms exist (for instance the simplex algorithm and interior point algorithms) and are very efficient.

Example (The Fermat-Torricelli Problem) Given a finite family $a_1, \ldots, a_n$ of distinct points in a normed vector space $(X, \|\cdot\|)$ this problem consists in finding a point $x \in X$ that minimizes the function $f$ given by
\[ f(x) := \|x - a_1\| + \ldots + \|x - a_n\|, \qquad x \in X. \]
It appears when one wants to find the best location for a warehouse that serves several shops or factories. It can be generalized to the case when some weights $w_i$ affect the terms $\|x - a_i\|$ or when such terms are replaced by $h_i(\|x - a_i\|)$, where $h_i : \mathbb{R}_+ \to \mathbb{R}_+$. It was solved by Torricelli in the case $X = \mathbb{R}^2$ and $n = 3$. The methods of the present chapter can efficiently give the solution to this problem. One first remarks that $f$ is a convex continuous function and that the solution $x$ is characterized by the relation
\[ 0 \in \partial f(x) = \partial\|\cdot\|(x - a_1) + \partial\|\cdot\|(x - a_2) + \partial\|\cdot\|(x - a_3), \]
where $\partial$ is the subdifferential introduced in this chapter. Since $\partial\|\cdot\|(0) = B_{\mathbb{R}^2}$ and $\partial\|\cdot\|(z) = z/\|z\|$ for $z \in \mathbb{R}^2 \setminus \{0\}$, $x = a_1$ satisfies this relation if and only if one has
\[ 0 \in B_{\mathbb{R}^2} + u_2(a_1) + u_3(a_1) \]
where $u_i(x) := (x - a_i)/\|x - a_i\|$ for $x \in \mathbb{R}^2 \setminus \{a_i\}$. Since $u_i(x)$ is a unit vector, this relation can be written as $\|u_2(a_1) + u_3(a_1)\|^2 \le 1$ or $\langle u_2(a_1) \mid u_3(a_1)\rangle \le -1/2$. This means that the cosine of the angle between the vectors $a_1 - a_2$ and $a_1 - a_3$ is not greater than $-1/2$ or that the angle between these two vectors is at least $120°$. When all the angles of the triangle $a_1a_2a_3$ with vertices $a_1$, $a_2$, $a_3$ are less than $120°$ the solution $x$ is characterized by
\[ u_1 + u_2 + u_3 = 0 \quad \text{with } u_i := u_i(x) := \frac{x - a_i}{\|x - a_i\|}. \]
Since $\|u_i\|^2 = 1$ for $i \in \mathbb{N}_3$, taking scalar products, this relation implies that
\[ \langle u_2 \mid u_1\rangle + \langle u_3 \mid u_1\rangle = -1 \]
\[ \langle u_1 \mid u_2\rangle + \langle u_3 \mid u_2\rangle = -1 \]
\[ \langle u_1 \mid u_3\rangle + \langle u_2 \mid u_3\rangle = -1. \]
Solving this system by taking the sums of the sides of two different lines yields $\langle u_i \mid u_j\rangle = -1/2$ for $i, j \in \mathbb{N}_3$ with $i \ne j$. Conversely, when these relations hold one gets
\[ \|u_1 + u_2 + u_3\|^2 = \|u_1\|^2 + \|u_2\|^2 + \|u_3\|^2 + \sum_{i \ne j = 1}^3 \langle u_i \mid u_j\rangle = 0 \]
since $\|u_i\|^2 = 1$ for $i \in \mathbb{N}_3$. Thus $x$ is characterized by these relations and can be constructed by drawing equilateral triangles on each side of the triangle $a_1a_2a_3$ and taking the point $x$ common to the 3 segments joining the vertex $a_i$ to the vertex $a_i'$ of the new equilateral triangle opposite to the segment $a_ja_k$ with $j \ne i$, $k \ne i$ as in the next figure (Fig. 6.2).
Fig. 6.2 The Fermat-Torricelli problem
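The characterization $u_1 + u_2 + u_3 = 0$ can be observed numerically. The sketch below does not follow the geometric construction of the text: it uses Weiszfeld's fixed-point iteration, a classical scheme swapped in here for convenience, on a hypothetical triangle whose angles are all less than $120°$. It then checks that the unit vectors $u_i$ sum to zero and have pairwise scalar products close to $-1/2$.

```python
import math

def weiszfeld(points, iterations=1000):
    """Weiszfeld iteration for the point minimizing the sum of distances to the given points."""
    x = [sum(p[0] for p in points) / len(points),
         sum(p[1] for p in points) / len(points)]  # start from the centroid
    for _ in range(iterations):
        num_x = num_y = den = 0.0
        for ax, ay in points:
            dist = math.hypot(x[0] - ax, x[1] - ay)
            if dist == 0.0:
                # The update is undefined at a data point; stop there.
                return tuple(x)
            num_x += ax / dist
            num_y += ay / dist
            den += 1.0 / dist
        x = [num_x / den, num_y / den]
    return tuple(x)

def unit_vectors(x, points):
    """Vectors u_i = (x - a_i)/||x - a_i|| for x different from every a_i."""
    result = []
    for ax, ay in points:
        dist = math.hypot(x[0] - ax, x[1] - ay)
        result.append(((x[0] - ax) / dist, (x[1] - ay) / dist))
    return result

if __name__ == "__main__":
    triangle = [(0.0, 0.0), (4.0, 0.0), (1.0, 3.0)]  # hypothetical data, angles below 120 degrees
    x = weiszfeld(triangle)
    us = unit_vectors(x, triangle)
    print(x, sum(u[0] for u in us), sum(u[1] for u in us))
```

When one angle of the triangle is at least $120°$, the minimizer is the corresponding vertex and the relation $u_1 + u_2 + u_3 = 0$ no longer applies; the sketch does not treat that case.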
6.1 Continuity Properties of Convex Functions A nice semicontinuity property of convex functions is given in the following statement. Theorem 6.1 If f W X ! R1 is a convex function that is lower semicontinuous on a normed space X for the topology associated with the norm, then f is lower semicontinuous on X endowed with the weak topology. Proof This is an immediate consequence of Mazur’s Theorem: for every real number r the sublevel set Œf r WD fx 2 X W f .x/ rg of f is closed and convex, hence weakly closed. t u The preceding proof shows that the same property holds for quasiconvex functions, i.e. functions whose sublevel sets are convex. The following consequence follows from Lemma 2.6, since bounded, closed, convex subsets of a reflexive Banach space are weakly compact. Corollary 6.1 A coercive lower semicontinuous convex function f on a reflexive Banach space X attains its infimum. Since the epigraph of a lower semicontinuous convex function is the intersection of a family of closed half-spaces, one may guess that such a function is the supremum of a family of continuous affine forms. However, some care is in order. Hereafter we say that a convex function is closed if it is lower semicontinuous and either it is identically equal to 1 (in that case we denote it by 1X ) or it takes X its values in R1 WD R [ fC1g. Recall that f 2 R is proper if f does not take the value 1 and if it is not the constant function 1X . Then its epigraph is a proper subset of X R (i.e. is nonempty and different from the whole space). We observed that a lower semicontinuous convex function assuming the value X 1 cannot take a finite value (see Sect. 3.3). Thus a closed convex function f 2 R is either proper, or 1X or 1X . Note that, given a nonempty proper closed convex subset C of X, the valley function C given by C .x/ D 1 for x 2 C, C .x/ D C1 for x 2 XnC is an example of a lower semicontinuous convex function that is not closed and not proper. 
Note that the lower semicontinuous hull $\overline{f}$ of a proper function $f$ may be not proper: consider the function $f : \mathbb{R} \to \mathbb{R}$ given by $f(0) := 0$ and $f(x) = -1/|x|$ for $x \in \mathbb{R} \setminus \{0\}$.

If $f$ is the supremum of a nonempty family of continuous affine functions, then $f$ is either $+\infty_X$ or a closed proper convex function. In both cases, and in the case of $f = -\infty_X$ (which corresponds to the empty family), it is a closed convex function. A remarkable converse holds.

Theorem 6.2 Any closed convex function is the supremum of a family of continuous affine functions (the ones it majorizes). If $f$ is proper, this family is nonempty.
Clearly, if $f = +\infty_X$, one can take the family of all continuous affine functions on $X$, while if $f = -\infty_X$ one takes the empty family. The following lemma is the first step of the proof of this result for the case $f \ne -\infty_X$.

Lemma 6.1 For any lower semicontinuous convex function $f : X \to \mathbb{R}_\infty$ there exists a continuous affine function $g$ such that $g \le f$. Moreover, if $w \in \operatorname{dom} f$ and $r < f(w)$ we may require that $g(w) > r$.

Proof The case $f = +\infty_X$ is obvious. Let us suppose $f \ne +\infty_X$, so that the epigraph $E_f$ of $f$ is nonempty. Let $w \in \operatorname{dom} f$ and $r < f(w)$. The Hahn-Banach Theorem allows us to separate the compact set $\{(w, r)\}$ from the closed convex set $E_f$: there exist $(h, c) \in X^* \times \mathbb{R} = (X \times \mathbb{R})^*$ and $b \in \mathbb{R}$ such that
$$\forall (x, s) \in E_f \qquad \langle h, x\rangle + cs > b > \langle h, w\rangle + cr. \tag{6.1}$$
Taking $x = w$, $s > f(w) > r$, we see that $c > 0$. Dividing each side of these inequalities by $c$, we get
$$s > -c^{-1} h(x) + c^{-1} b \qquad \forall x \in \operatorname{dom} f,\ \forall s \ge f(x).$$
It follows that $f \ge g$ for $g$ given by $g(x) := -c^{-1} h(x) + c^{-1} b$. Moreover, the second inequality in relation (6.1) can be written as $g(w) > r$. $\square$

Now let us prove Theorem 6.2. Again, the cases $f = +\infty_X$, $f = -\infty_X$ being obvious, we may suppose $f$ is proper. Let $w \in X$ and $r < f(w)$. If $w \in \operatorname{dom} f$, the preceding lemma provides us with a continuous affine function $g \le f$ with $g(w) > r$. Now, let us consider the case $w \in X \setminus \operatorname{dom} f$. Separating $\{(w, r)\}$ from $E_f$, we get some $(h, c) \in (X \times \mathbb{R})^*$ and $b \in \mathbb{R}$ such that relation (6.1) holds. Taking $x \in \operatorname{dom} f$ and $s$ large, we see that $c \ge 0$. If $c > 0$, we can conclude as in the preceding proof. If $c = 0$, observing that $b - h(w) > 0$, taking a continuous affine function $k$ such that $k \le f$ (such a function exists, by the lemma) and setting $g := k + n(b - h)$, with $n \ge 0$ and $n > (b - h(w))^{-1}(r - k(w))$, we see that $g(w) > r$ and $g \le f$, as $k \le f$ and $b - h(x) \le 0$ for $x \in \operatorname{dom} f$ by relation (6.1) with $c = 0$. $\square$

Since lower semicontinuity is preserved under taking suprema, one can deduce Theorem 6.1 from Theorem 6.2.

For convex functions, remarkably simple continuity criteria are available.

Proposition 6.1 Let $f : X \to \mathbb{R}_\infty$ be a convex function on a normed space (or topological vector space) $X$. If $f$ is finite at some $\overline{x} \in X$, the following assertions are equivalent:
(a) $f$ is bounded above on some neighborhood $V$ of $\overline{x}$;
(b) $f$ is upper semicontinuous at $\overline{x}$;
(c) $f$ is continuous at $\overline{x}$.
Proof The implications (c)$\Rightarrow$(b)$\Rightarrow$(a) are obvious. Let us show (a)$\Rightarrow$(b) and (b)$\Rightarrow$(c). We may suppose that $\overline{x} = 0$, $f(\overline{x}) = 0$ by performing a translation and adding a constant. Given $\varepsilon > 0$, let $m \ge \sup f(V)$, $m \ge \varepsilon$. Let $U := \varepsilon m^{-1} V$. Then, for $u \in U$, setting $v := \varepsilon^{-1} m u \in V$, by convexity we have
$$f(u) \le \varepsilon m^{-1} f(v) + (1 - \varepsilon m^{-1}) f(0) \le \varepsilon$$
and, as $U$ is a neighborhood of $0$, $f$ is upper semicontinuous at $0$. In order to deduce (c) from (b), i.e. that $f$ is continuous at $0$, we note that for $w \in W := U \cap (-U)$ we have
$$0 = f(0) \le \tfrac12 f(w) + \tfrac12 f(-w) \le \tfrac12 f(w) + \tfrac12 \varepsilon,$$
hence $f(w) \ge -\varepsilon$. $\square$

Remark If $V$ is the ball $B[\overline{x}, r]$ and $\sup f(V) \le m$, for $c := r^{-1}(m - f(\overline{x}))$ one has
$$\forall x \in B[0, r] \qquad f(\overline{x} + x) - f(\overline{x}) \le c \|x\|$$
since for $x \in B[0, r]$, setting $t := r^{-1}\|x\|$, taking $u$ such that $\|u\| = r$, $x = tu$, one gets
$$f(\overline{x} + x) - f(\overline{x}) = f((1 - t)\overline{x} + t(\overline{x} + u)) - f(\overline{x}) \le t\,(f(\overline{x} + u) - f(\overline{x})) \le r^{-1}(m - f(\overline{x})) \|x\|,$$
a property called quietness at $\overline{x}$. In fact, for all $x \in B[0, r]$, since $f(\overline{x} + x) - f(\overline{x}) \ge -(f(\overline{x} - x) - f(\overline{x})) \ge -c\|x\|$, we have $|f(\overline{x} + x) - f(\overline{x})| \le c\|x\|$, and we say that $f$ is stable at $\overline{x}$. Later on this property will be reinforced into a local Lipschitz property. $\square$

The following results illustrate the uses of the preceding criteria.

Proposition 6.2 Suppose $f : X \to \mathbb{R}_\infty$ is a convex function on a finite dimensional space $X$. Then $f$ is continuous on the interior of its domain $D_f := \operatorname{dom} f := f^{-1}(\mathbb{R})$.

Proof Given $\overline{x} \in \operatorname{int} D_f$, let $x_1, \dots, x_n \in D_f$ be such that $\overline{x}$ belongs to the interior of the convex hull $C$ of $\{x_1, \dots, x_n\}$ (for instance, one can take for $C$ a ball with center $\overline{x}$ for some polyhedral norm, $X$ being identified with some $\mathbb{R}^d$). Then $f$ is bounded above on $C$ by $m := \max(f(x_1), \dots, f(x_n))$, hence is continuous at $\overline{x}$. $\square$

Proposition 6.3 Let $f : X \to \mathbb{R}_\infty$ be a lower semicontinuous convex function on a Banach space $X$. Then $f$ is continuous on the core of its domain $D_f$ (which coincides with the interior of $D_f$).

Proof Given $\overline{x} \in \operatorname{core} D_f$, let $m > f(\overline{x})$ and let $C := \{x \in X : f(x) \le m\}$. Again we may suppose $\overline{x} = 0$. Then $C$ is a closed convex subset of $X$ that is absorbing: for all $x \in X$ we can find $r > 0$ such that $rx \in D_f$ and for $s > 0$ small enough we have $f(rsx) \le (1 - s) f(0) + s f(rx) < m$, so that $rsx \in C$. Thus $C$ is a neighborhood of $0$ by Lemma 3.31 and $f$ is continuous at $0$ by Proposition 6.1. $\square$

Convex functions enjoy a "miraculous" propagation property.
Proposition 6.4 Let $f : X \to \mathbb{R}_\infty$ be a convex function on a normed space $X$. If $f$ is continuous at some $\overline{x} \in D_f := \operatorname{dom} f$ then $f$ is continuous on the interior of $D_f$.

Proof Given $x_0 \in \operatorname{int} D_f$, let us prove that $f$ is continuous at $x_0$. Using a translation, we may suppose $x_0 = 0$. Then, as $D_f$ is a neighborhood of $0$, there exists some $r > 0$ such that $y := -r\overline{x} \in D_f$. Let $V$ be a neighborhood of $0$ such that $f$ is bounded above by some $m$ on $\overline{x} + V$. Then, by convexity, $f$ is bounded above on
$$r(1 + r)^{-1}(\overline{x} + V) + (1 + r)^{-1} y = r(1 + r)^{-1} V \in \mathcal{N}(0)$$
by $r(1 + r)^{-1} m + (1 + r)^{-1} f(y)$. Then, by Proposition 6.1, $f$ is continuous at $x_0$. $\square$

Local boundedness of a convex function entails a regularity property stronger than continuity: a local Lipschitz property. In fact, the result is not just a local one. The following statement and its corollary make this precise: the corollary shows that a Lipschitz property is available on balls that may be big, provided the function is bounded above on a larger ball. One even gets a quantitative estimate of the Lipschitz rate.

Proposition 6.5 Let $f$ be a convex function on a convex subset $C$ of a normed space $X$ and let $\alpha, \beta \in \mathbb{R}$, $\rho > 0$. Suppose $f$ is bounded below by $\beta$ on a subset $B$ of $C$ and is bounded above by $\alpha$ on a subset $A$ of $C$ such that $B + \rho U_X \subset A$, where $U_X$ is the open unit ball of $X$. Then $f$ is Lipschitzian on $B$ with rate $\rho^{-1}(\alpha - \beta)$.

Proof Given $x, y \in B$ and $\delta > \|x - y\|$, let $z := y + \rho\delta^{-1}(y - x) \in A$, since $B + \rho U_X \subset A$. Then $y = x + t(z - x)$ where $t := \delta(\delta + \rho)^{-1} \in [0, 1]$, hence
$$f(y) - f(x) \le t\,(f(z) - f(x)) \le t(\alpha - \beta) \le \delta\rho^{-1}(\alpha - \beta).$$
Interchanging the roles of $x$ and $y$ and taking the infimum over $\delta$ in $]\|x - y\|, \infty[$, we get $|f(y) - f(x)| \le \rho^{-1}(\alpha - \beta)\|x - y\|$. $\square$

The preceding statement is versatile enough to apply in a variety of geometric cases. The simplest one is the case of balls.

Corollary 6.2 Suppose the convex function $f$ on the normed space $X$ is bounded above by $\alpha$ on some ball $B(\overline{x}, r)$. Then, for any $s \in\, ]0, r[$ the function $f$ is Lipschitzian on the ball $B(\overline{x}, s)$ with rate $2(r - s)^{-1}(\alpha - f(\overline{x}))$.

Proof Taking $A := B(\overline{x}, r)$, $B := B(\overline{x}, s)$, $\rho := r - s$, $\beta := 2f(\overline{x}) - \alpha$, it suffices to observe that for all $x \in B$ one has $f(x) \ge \beta$ by convexity. $\square$

Corollary 6.3 Any convex function which is continuous on an open convex subset $U$ of a normed space is locally Lipschitzian on $U$.

Convex functions enjoy nice properties as far as optimization is concerned. A simple example is as follows.

Proposition 6.6 Any local minimizer of a convex function $f : X \to \mathbb{R}_\infty := \mathbb{R} \cup \{+\infty\}$ on a normed space (or topological vector space) $X$ is a global minimizer.
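Proof Let $\overline{x} \in X$ be a local minimizer of $f$ and let $V$ be a neighborhood of $\overline{x}$ such that $f(\overline{x}) \le f(v)$ for all $v \in V$. Given $x \in X$, one can find $t \in\, ]0, 1[$ such that $v := \overline{x} + t(x - \overline{x}) \in V$. Then, by convexity, we have $t f(x) + (1 - t) f(\overline{x}) \ge f(v) \ge f(\overline{x})$, hence $f(x) \ge f(\overline{x})$. $\square$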
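The quantitative estimate of Corollary 6.2 can be checked numerically in a simple case. The sketch below is only an illustration under assumptions that are not in the text: $X = \mathbb{R}$, $f(x) = x^2$, center $0$, $r = 2$, $s = 1$, and the helper names `lipschitz_bound` and `max_slope` are hypothetical. It compares sampled difference quotients on the smaller ball with the rate $2(r - s)^{-1}(\alpha - f(\overline{x}))$.

```python
import random

def lipschitz_bound(alpha, f_center, r, s):
    """Rate 2 (r - s)^{-1} (alpha - f(center)) given by Corollary 6.2."""
    return 2.0 * (alpha - f_center) / (r - s)

def max_slope(f, center, s, samples=2000, seed=0):
    """Largest sampled difference quotient of f on [center - s, center + s]."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        x = center + s * rng.uniform(-1.0, 1.0)
        y = center + s * rng.uniform(-1.0, 1.0)
        if x != y:
            worst = max(worst, abs(f(y) - f(x)) / abs(y - x))
    return worst

def f(x):
    return x * x                 # a convex function on R (illustrative choice)

center, r, s = 0.0, 2.0, 1.0
alpha = r * r                    # f is bounded above by alpha on B(center, r)
bound = lipschitz_bound(alpha, f(center), r, s)
observed = max_slope(f, center, s)
print(f"observed slope {observed:.3f} <= bound {bound:.3f}")
```

In this example the bound is far from sharp, which is consistent with the corollary being a general estimate valid for every convex function bounded by $\alpha$ on the larger ball.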
Exercises

1. Let $X$ be a separable Hilbert space with Hilbertian basis $\{e_n : n \in \mathbb{N}\}$ and let the function $f : X \to \mathbb{R}$ be given by
$$f(x) := \sum_{n=0}^{\infty} |x_n|^{n+2} \qquad \text{for } x = \sum_{n=0}^{\infty} x_n e_n.$$
(a) Show that $f$ is well defined on $X$, bounded above by $1$ on the unit ball and everywhere bounded below by $0$.
(b) Show that the Lipschitz rate of $f$ around $e_k$ is at least $k + 2$.
(c) Deduce from the preceding that $f$ is not Lipschitzian on the ball $rB_X$ with $r > 1$. Observe that $f$ is not bounded above on such a ball.

2. Using the data and the notation of Corollary 6.2 and noting that $f$ is bounded above on $B(\overline{x}, s)$ by $(1 - r^{-1}s) f(\overline{x}) + r^{-1}s\alpha$, hence is bounded below by $\beta := (1 + r^{-1}s) f(\overline{x}) - r^{-1}s\alpha$ on this ball, show that the Lipschitz rate of $f$ on $B(\overline{x}, s)$ is at most $(1 + r^{-1}s)(r - s)^{-1}(\alpha - f(\overline{x}))$.

3. Prove a similar estimate of the Lipschitz rate of $f$ when one supposes that $f$ is bounded above by some $\alpha$ on the sphere with center $\overline{x}$ and radius $r$.

4. (a) Let $f : X \to \mathbb{R}$ be a uniformly continuous function on a normed space. Show that for any $\delta > 0$ there exists a $k > 0$ such that $d(f(x), f(y)) \le k\,d(x, y)$ for all $x, y \in X$ satisfying $d(x, y) \ge \delta$. [Hint: use a subdivision of the segment $[x, y]$ by points $u_i$ such that $d(u_i, u_{i+1}) \le \alpha$, where $\alpha > 0$ is such that $d(f(u), f(v)) \le 1$ whenever $u, v \in X$ satisfy $d(u, v) \le \alpha$.]
(b) Prove that any uniformly continuous convex function $f$ on $X$ is Lipschitzian. [Hint: use (a) and Proposition 6.5.]

5. (The log barrier) Prove that $f : \mathbb{R}^{n^2} \to \mathbb{R}_\infty$ given by $f(u) = -\log(\det u)$ if $u$ is a symmetric positive definite matrix, $+\infty$ otherwise, is a convex function.

6. Deduce from Proposition 6.3 that for any closed convex subset $C$ of a Banach space one has $\operatorname{int} C = \operatorname{core} C$. [Hint: use the indicator function $\iota_C$ of $C$.]

7. Prove that on the dual $X^*$ of a non-reflexive Banach space $X$ one can find a convex function $f$ that is continuous in the topology associated with the dual norm, but that is not lower semicontinuous in the weak$^*$ topology. [Hint: take $f \in X^{**} \setminus X$.]
6.2 Differentiability Properties of Convex Functions

Convexity of a function entails particular differentiability properties. The case of a one-variable function, which is our starting point, will provide our first evidence. However, it is a substitute for the derivative that will be the main point of this section. Later on, we will see that this new object, called the subdifferential, enjoys useful calculus rules.
6.2.1 Derivatives of Convex Functions

We first observe that if $f : T \to \mathbb{R}$ is a finite convex function on some interval $T$ of $\mathbb{R}$, then for $r < s < t$ in $T$ the following inequalities hold:
$$\frac{f(s) - f(r)}{s - r} \le \frac{f(t) - f(r)}{t - r} \le \frac{f(t) - f(s)}{t - s}. \tag{6.2}$$
They express that the slope of a secant to the graph of $f$ is a nondecreasing function of the abscissas of its extremities and stem from the convexity inequality
$$f(s) = f\Big(\frac{t - s}{t - r}\, r + \frac{s - r}{t - r}\, t\Big) \le \frac{t - s}{t - r}\, f(r) + \frac{s - r}{t - r}\, f(t)$$
(since the coefficients of $f(r)$ and $f(t)$ are in $[0, 1]$ and have sum $1$), which yields
$$f(s) - f(r) \le \frac{s - r}{t - r}\,(f(t) - f(r)), \qquad f(t) - f(s) \ge \frac{t - s}{t - r}\,(f(t) - f(r)).$$
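The slope inequalities (6.2) lend themselves to a quick numerical sanity check. The following sketch is an illustration only: the convex test function `math.exp`, the sampling interval, and the helper names `slope` and `three_slopes_hold` are assumptions made for the example, not part of the text.

```python
import math
import random

def slope(f, a, b):
    """Slope of the secant to the graph of f between abscissas a < b."""
    return (f(b) - f(a)) / (b - a)

def three_slopes_hold(f, r, s, t, tol=1e-9):
    """Check both inequalities of (6.2) for r < s < t, up to a rounding tolerance."""
    return (slope(f, r, s) <= slope(f, r, t) + tol
            and slope(f, r, t) <= slope(f, s, t) + tol)

rng = random.Random(1)
checks = []
for _ in range(1000):
    r, s, t = sorted(rng.uniform(-3.0, 3.0) for _ in range(3))
    if s - r > 1e-3 and t - s > 1e-3:      # skip nearly coincident points
        checks.append(three_slopes_hold(math.exp, r, s, t))
all_ok = all(checks)

# A concave function violates (6.2): for -x^2 the secant slopes decrease.
concave_fails = not three_slopes_hold(lambda x: -x * x, -1.0, 0.0, 1.0)
```

A sampled check of this kind can refute convexity but cannot prove it; it merely illustrates the monotonicity of secant slopes.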
Lemma 6.2 If $f : T \to \mathbb{R}$ is a finite convex function on some interval $T$ of $\mathbb{R}$, then, for any $s \in T \setminus \{\sup T\}$ the right derivative $f'_r(s) := D_r f(s)$ of $f$ at $s$ exists in $\mathbb{R} \cup \{-\infty\}$ and is given by
$$D_r f(s) := \lim_{t \to s_+} \frac{f(t) - f(s)}{t - s} = \inf_{t > s} \frac{f(t) - f(s)}{t - s}.$$
If moreover, $s$ is in the interior of $T$ then $D_r f(s)$ is finite, the left derivative $D_\ell f(s)$ exists, is finite and $D_\ell f(s) \le D_r f(s)$. Furthermore, the functions $s \mapsto D_r f(s)$ and $s \mapsto D_\ell f(s)$ are nondecreasing.

Proof The first assertion is a direct consequence of the existence of a limit for the nondecreasing function $t \mapsto (t - s)^{-1}(f(t) - f(s))$ on $]s, \sup T[$. Changing $f$ into $g$ given by $g(u) := f(-u)$, we get the assertions about the left derivative. The second assertion stems from the fact that when $s \in \operatorname{int} T$, the limit is finite since, by (6.2), for $r < u < s$ the quotient $(s - u)^{-1}(f(s) - f(u))$ is bounded below by $(s - r)^{-1}(f(s) - f(r))$. Thus
$$\frac{f(s) - f(r)}{s - r} \le D_\ell f(s) = \sup_{u < s} \frac{f(u) - f(s)}{u - s} \le \inf_{t > s} \frac{f(t) - f(s)}{t - s} = D_r f(s).$$

[...] $\inf_{t > 0} t^{-1}[f(x + tv) - f(x)]$. It is finite if $x \in \operatorname{core}(\operatorname{dom} f)$. If $X$ is a Banach space, if $x \in \operatorname{int}(\operatorname{dom} f)$, and if $f$ is lower semicontinuous, then the
directional derivative
$$df(x, v) := \lim_{(t, w) \to (0_+, v)} \frac{f(x + tw) - f(x)}{t}$$
exists and coincides with the radial derivative. In particular, if $u : [0, a] \to X$ with $a > 0$ is right differentiable at $0$ with $u(0) = x$, then $f \circ u$ is right differentiable at $0$ and $(f \circ u)'_r(0) = d_r f(x, u'_r(0))$.

Proof Let $g$ be given by $g(t) = f(x + tv)$. Then $g$ is convex and its right derivative at $0$ is $d_r f(x, v)$. It exists in $[-\infty, +\infty[$ if $(x + \mathbb{P}v) \cap \operatorname{dom} f$ is nonempty and it is $+\infty$ otherwise. Even in the latter case, this right derivative is $\inf_{t > 0} t^{-1}(g(t) - g(0)) = \inf_{t > 0} t^{-1}(f(x + tv) - f(x))$. When $x$ belongs to $\operatorname{core}(\operatorname{dom} f)$, for any $v \in X$, $0$ is in the interior of $\operatorname{dom} g$ and we can conclude with Lemma 6.2. When $X$ is a Banach space, $f$ is lower semicontinuous, and $x \in \operatorname{int}(\operatorname{dom} f)$, the function $f$ is Lipschitzian on a neighborhood of $x$, so that there exists some $c \in \mathbb{R}_+$ such that
$$|(1/t)(f(x + tw) - f(x + tv))| \le c\,\|w - v\| \to 0 \quad \text{as } (t, w) \to (0_+, v).$$
Thus $df(x, v) = d_r f(x, v)$. The last assertion ensues. $\square$

Proposition 6.10 If $f : X \to \mathbb{R}_\infty$ is a convex function on a vector space $X$, then for all $x \in \operatorname{dom} f$, the radial derivative $d_r f(x, \cdot)$ is a sublinear function.

Proof Clearly $d_r f(x, \cdot)$ is positively homogeneous. Let us prove it is subadditive: for any $v, w \in X$ we have $f(x + \tfrac12 t(v + w)) \le \tfrac12 f(x + tv) + \tfrac12 f(x + tw)$, hence
$$d_r f(x, v + w) = \lim_{t \to 0_+} \frac{2}{t}\Big[f\Big(x + \frac{t}{2}(v + w)\Big) - f(x)\Big] \le \lim_{t \to 0_+} \frac{1}{t}\,(f(x + tv) - f(x)) + \lim_{t \to 0_+} \frac{1}{t}\,(f(x + tw) - f(x)) = d_r f(x, v) + d_r f(x, w). \qquad \square$$

The preceding statement can also be justified by checking that
$$d_r f(x, v) = \inf\{s : (v, s) \in T^r(E_f, x_f)\},$$
where $E_f$ is the epigraph of $f$, $x_f := (x, f(x))$ and $T^r(E_f, x_f)$ is the radial tangent cone to $E_f$ at $x_f$, where the radial tangent cone to a convex set $C$ at $z \in C$ is the set $T^r(C, z) := \mathbb{R}_+(C - z)$. When $X$ is a normed space, $T^r(C, z)$ is not closed in general, as simple examples show. Therefore, it is advisable to replace it with the tangent cone $T(C, z)$ to $C$ at $z$. In the case when $C$ is convex, $T(C, z)$ is just the closure of $T^r(C, z)$. In the case when $C$ is the epigraph of a convex function $f$ finite at $x$ and $z := x_f := (x, f(x))$, it
can be shown that $T(C, z)$ is the epigraph of the (lower) directional derivative of $f$ at $x$ defined by
$$f'(x, v) := df(x, v) := \liminf_{(t, u) \to (0_+, v)} \frac{f(x + tu) - f(x)}{t}.$$
Since $f'(x, \cdot) = df(x, \cdot)$ is lower semicontinuous, it has better duality properties than $d_r f(x, \cdot)$. Moreover, it is as closely connected to the notion of a subdifferential of $f$ at $x$ as $d_r f(x, \cdot)$. We consider this notion in the next subsection.
6.2.2 Subdifferentials of Convex Functions

Since a general convex function $f$ may have kinks, it may happen that there is not just one affine function minorizing $f$ and taking the value $f(x)$ at a given point $x \in \operatorname{dom} f$. For example, this occurs with $f := |\cdot| : \mathbb{R} \to \mathbb{R}$ and $x := 0$: every linear form $x^* \in [-1, 1]$ satisfies $x^* \le f$, $x^*(0) = f(0)$. It is worth considering the set of such continuous linear forms.

Definition 6.1 (Fenchel, Moreau) If $f : X \to \mathbb{R}_\infty$ is a function on a normed space $X$ and $x \in X$, then the subdifferential of $f$ at $x$ is the empty set if $x \in X \setminus \operatorname{dom} f$ and if $x \in \operatorname{dom} f$, it is the set $\partial f(x)$ of $x^* \in X^*$ such that
$$\forall w \in X \qquad f(w) \ge f(x) + \langle x^*, w - x\rangle. \tag{6.3}$$

This is a global notion which is very restrictive for an arbitrary function. For a convex function it turns into a crucial tool that is a useful substitute for the derivative, as we will shortly see. A strong advantage of the subdifferential is that it yields a characterization of minimizers.

Proposition 6.11 A function $f$ on a normed space $X$ attains its minimum at $x \in \operatorname{dom} f$ if and only if $0 \in \partial f(x)$.

The result is an immediate consequence of the definition. Calculus rules will make it efficient. In particular, they enable us to give optimality conditions for problems with constraints.

A first consequence of the next result is that the subdifferential of a convex function $f$ is not just a global notion, but also a local notion. Let us recall that $f$ is said to be Gateaux differentiable at $x$ with derivative $Df(x) := \ell \in X^*$ if $f$ is finite at $x$ and if for all $v \in X$
$$\frac{f(x + tv) - f(x)}{t} \to \ell(v) \quad \text{as } t \to 0,\ t \ne 0.$$
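Definition 6.1 and Proposition 6.11 can be illustrated on the example $f := |\cdot|$ discussed above. The sketch below tests inequality (6.3) on a finite grid of points $w$; the grid, the sample slopes, and the helper name `is_subgradient` are assumptions made for the example. A finite sample can only refute membership in $\partial f(x)$, not establish it, so this is a plausibility check rather than a proof.

```python
def is_subgradient(f, x, slope, test_points, tol=1e-12):
    """Test inequality (6.3) on a finite sample: f(w) >= f(x) + slope * (w - x)."""
    return all(f(w) >= f(x) + slope * (w - x) - tol for w in test_points)

grid = [k / 10.0 for k in range(-50, 51)]    # sample points w in [-5, 5]

# Slopes in [-1, 1] pass the test at x = 0; slopes outside [-1, 1] fail.
inside = all(is_subgradient(abs, 0.0, s, grid) for s in (-1.0, -0.3, 0.0, 0.7, 1.0))
outside = any(is_subgradient(abs, 0.0, s, grid) for s in (-1.5, 1.2))

# 0 passes at the minimizer x = 0 but fails at x = 2, which is not a minimizer.
zero_at_min = is_subgradient(abs, 0.0, 0.0, grid)
zero_at_two = is_subgradient(abs, 2.0, 0.0, grid)
```

The last two lines mirror Proposition 6.11: the slope $0$ satisfies (6.3) exactly at the points where the minimum is attained.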
Theorem 6.3 If $f$ is a convex function on a normed space $X$ and $x \in \operatorname{dom} f$ then
$$x^* \in \partial f(x) \iff \forall v \in X \ \ \langle x^*, v\rangle \le df(x, v) \iff \forall v \in X \ \ \langle x^*, v\rangle \le d_r f(x, v).$$
If $x \in \operatorname{core}(\operatorname{dom} f)$ and $f$ is Gateaux differentiable at $x$, then $\partial f(x) = \{Df(x)\}$.

Proof Given $x^* \in \partial f(x)$, for any $t \in \mathbb{P}$, $u \in X$ we have $\langle x^*, tu\rangle \le f(x + tu) - f(x)$. Dividing by $t$ and taking the lower limit as $(t, u) \to (0_+, v)$, we get $\langle x^*, v\rangle \le df(x, v) \le d_r f(x, v)$. Now, if $f$ is convex and if $x^*$ satisfies the inequality $\langle x^*, v\rangle \le d_r f(x, v)$ for all $v \in X$, then, for $v \in X$, $t \in\, ]0, 1[$, by the monotonicity observed in relation (6.2), we have
$$\langle x^*, v\rangle \le d_r f(x, v) \le \frac{1}{t}\,(f(x + tv) - f(x)) \le f(x + v) - f(x).$$
Setting $v = w - x$, we obtain relation (6.3). The last assertion is obvious: if $x^* \le \ell := Df(x)$ then one has $x^* = \ell$. $\square$

A geometric interpretation of the subdifferential of a function can be given in terms of the normal cone to its epigraph. Recall that the normal cone to a convex subset $C$ of a normed space $X$ at some $z \in C$ is defined as the set $N(C, z)$ of $z^* \in X^*$ such that $\langle z^*, w - z\rangle \le 0$ for every $w \in C$; thus it is the polar cone to the radial tangent cone $T^r(C, z)$ and also, by density, it is the polar cone to $T(C, z)$.

Proposition 6.12 For a convex function $f$ on a normed space $X$ and $x \in \operatorname{dom} f$, one has the following equivalence in which $E_f$ is the epigraph of $f$ and $x_f := (x, f(x))$:
$$x^* \in \partial f(x) \iff (x^*, -1) \in N(E_f, x_f).$$

The proof is immediate from the definition of $\partial f(x)$:
$$x^* \in \partial f(x) \iff \forall (w, r) \in E_f \ \ \langle x^*, w - x\rangle \le r - f(x) \iff (x^*, -1) \in N(E_f, x_f).$$

On the other hand, the normal cone to a convex set can be described in terms of subdifferentials.

Proposition 6.13 For a convex subset $C$ of a normed space $X$, the normal cone to $C$ at $x \in C$ is the subdifferential of the indicator function $\iota_C$ of $C$ at $x$. It is also the cone $\mathbb{R}_+ \partial d_C(x)$ generated by the subdifferential of the distance function to $C$ at $x$.

Proof By definition, $x^* \in N(C, x)$ if and only if $\langle x^*, w - x\rangle \le 0$ for all $w \in C$. Since $\iota_C(w) = 0$ for $w \in C$ and $\iota_C(w) = +\infty$ for $w \in X \setminus C$, this property is equivalent to $x^* \in \partial \iota_C(x)$.
The inclusion $\mathbb{R}_+ \partial d_C(x) \subset N(C, x)$ is obvious: when $r \in \mathbb{R}_+$, $x^* \in \partial d_C(x)$, one has
$$\forall w \in C \qquad \langle r x^*, w - x\rangle \le r d_C(w) - r d_C(x) = 0.$$
Conversely, when $x^* \in N(C, x)$, the function $-x^*$ attains its infimum on $C$ at $x$, and is Lipschitzian with rate $c = \|x^*\|$, so that, by the Penalization Lemma, $-x^* + c\,d_C$ attains its infimum on $X$ at $x$; then $0 \in \partial(-x^* + c\,d_C)(x)$, which is equivalent to $x^* \in c\,\partial d_C(x)$. $\square$

The last argument shows the value of having calculus rules at one's disposal. Such rules will be considered in the next section.

A simple consequence of the subdifferentiability of a convex function $f$ at a point $x$ (i.e., of the nonemptiness of $\partial f(x)$) is the lower semicontinuity of $f$ at $x$. The converse is not true, as shown by the example of $f : \mathbb{R} \to \mathbb{R}_\infty$ given by $f(r) := -\sqrt{1 - r^2}$ for $r \in [-1, 1]$, $+\infty$ otherwise, and $x := 1$. Still, continuity entails subdifferentiability. This is a remarkable criterion!

Theorem 6.4 (Moreau) If a convex function $f$ on a normed space $X$ is finite and continuous at $x$, then $\partial f(x)$ is nonempty and weak$^*$ compact. Moreover, for all $u \in X$
$$df(x, u) = \max\{\langle x^*, u\rangle : x^* \in \partial f(x)\}.$$

Proof For every $r > f(x)$ there exists a neighborhood $V$ of $x$ such that $V \times [r, \infty[$ is contained in the epigraph $E_f$ of $f$. Thus, the interior of $E_f$ is convex and nonempty. It does not contain $x_f := (x, f(x))$ since for $s < f(x)$ close to $f(x)$ one has $(x, s) \notin E_f$. The geometric Hahn-Banach theorem yields some $(u^*, c) \in (X \times \mathbb{R})^*$ such that
$$\langle u^*, w\rangle + cr > \langle u^*, x\rangle + c f(x) \qquad \forall (w, r) \in \operatorname{int} E_f.$$
This implies (by taking $w = x$, $r = f(x) + 1$) that $c > 0$ and, by Lemma 3.12, that
$$\langle u^*, w - x\rangle + c\,(r - f(x)) \ge 0 \qquad \forall (w, r) \in E_f.$$
In turn, this relation, which can be written as
$$f(w) - f(x) \ge \langle -c^{-1} u^*, w - x\rangle \qquad \forall w \in X,$$
shows that $x^* := -c^{-1} u^* \in \partial f(x)$. Thus $\partial f(x)$ is nonempty. Since $\partial f(x)$ is the intersection of the weak$^*$ closed half-spaces
$$D_w := \{x^* \in X^* : \langle x^*, w - x\rangle \le f(w) - f(x)\}, \qquad w \in \operatorname{dom} f,$$
it is always weak$^*$ closed. When $f$ is continuous at $x$, taking $\rho > 0$ such that $\sup f(B(x, \rho)) \le f(x) + 1$, for all $x^* \in \partial f(x)$ we have
$$\|x^*\| = \rho^{-1} \sup\{\langle x^*, w - x\rangle : w \in B(x, \rho)\} \le \rho^{-1}.$$
Thus $\partial f(x)$ is bounded and weak$^*$ closed, hence weak$^*$ compact by the Banach-Alaoglu Theorem. The second assertion will be proved with the alternative proof that follows. $\square$

Alternative proof By the remark following Proposition 6.1 we can find $c \in \mathbb{R}_+$ and $r > 0$ such that $|f(x + v) - f(x)| \le c\,\|v\|$ for $v \in B(0, r)$. It follows that $|df(x, w)| \le c\,\|w\|$ for $w \in X$. Given $u \in X$ the Hahn-Banach Theorem yields some linear form $x^*$ such that $x^* \le df(x, \cdot) \le c\,\|\cdot\|$ and $\langle x^*, u\rangle = df(x, u)$. Thus, $x^*$ is continuous and $x^* \in \partial f(x)$. $\square$

Remark Without the continuity assumption, $\partial f(x)$ may be unbounded. This is the case for the indicator function of $\mathbb{R}_+$ on $X = \mathbb{R}$, for which $\partial f(0) = -\mathbb{R}_+$.

Examples (a) For $f := \|\cdot\|$ one has $\partial \|\cdot\|(0) = B_{X^*}$ and $\partial \|\cdot\|(x) = \{x^* \in X^* : \|x^*\| = 1,\ \langle x^*, x\rangle = \|x\|\}$ for $x \in X \setminus \{0\}$.
(b) Let $X$ be a normed space and let $j(\cdot) := \tfrac12 \|\cdot\|^2$. Then $\partial j(x) = J(x)$, the duality (multi)map defined by
$$J(x) := \{x^* \in X^* : \|x^*\| = \|x\|,\ \langle x^*, x\rangle = \|x\|^2\},$$
and $J(x)$ is nonempty, as shown by applying Corollary 3.14 or Theorem 6.4. $\square$

Corollary 6.4 Let $f : X \to \mathbb{R}_\infty$ be a convex function finite and continuous at $x \in X$. Then $f$ is Gateaux and Hadamard differentiable at $x$ if and only if $\partial f(x)$ is a singleton $\{x^*\}$. Moreover, $Df(x) = x^*$.

Proof Suppose $\partial f(x) = \{x^*\}$. The preceding theorem ensures that $df(x, \cdot) = x^*$. Thus $f$ is Gateaux differentiable. Since $f$ is Lipschitzian around $x$, it is Hadamard differentiable. The converse is an obvious consequence of Theorem 6.3. $\square$

Corollary 6.5 Let $f$ be a convex function on a normed space $X$. Suppose the restriction of $f$ to the affine subspace $A$ generated by $\operatorname{dom} f$ is continuous at $x \in \operatorname{dom} f$. Then $\partial f(x)$ is nonempty.

Proof Without loss of generality, we may suppose $x = 0$, so that $A$ is the vector subspace generated by $\operatorname{dom} f$. The preceding theorem ensures that the restriction $f|_A$ of $f$ to $A$ is subdifferentiable at $0$. Then, any continuous linear extension of any element of $\partial(f|_A)(0)$ belongs to $\partial f(0)$, and such extensions exist by the Hahn-Banach Theorem. $\square$

Recall that for a subset $D$ of a normed vector space $X$, $\operatorname{ri} D$ is the set of points that belong to the interior of $D$ in the affine subspace $Y$ generated by $D$.
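The max formula of Theorem 6.4 can be tried out on Example (a) in a concrete setting. The sketch below assumes $X = \mathbb{R}^2$ with the norm $\|x\|_1 = |x_1| + |x_2|$ and the point $x = (1, 0)$; by Example (a), and since the dual norm is the max norm, $\partial \|\cdot\|_1(x)$ is then the set of pairs $(1, s)$ with $|s| \le 1$. The space, the point, the step `t`, and all function names are illustrative assumptions.

```python
def norm1(p):
    """The l1 norm on R^2 (illustrative choice of normed space)."""
    return abs(p[0]) + abs(p[1])

def directional_derivative(f, x, u, t=1e-6):
    """Difference quotient approximating df(x, u) for a small step t > 0."""
    return (f((x[0] + t * u[0], x[1] + t * u[1])) - f(x)) / t

def support_of_subdifferential(u, steps=200):
    """max of <x*, u> over x* = (1, s), |s| <= 1, sampled on a grid in s."""
    return max(u[0] + (-1.0 + 2.0 * k / steps) * u[1] for k in range(steps + 1))

x = (1.0, 0.0)
directions = [(1.0, 0.0), (0.0, 1.0), (0.3, -2.0), (-0.5, 0.25)]
gaps = [abs(directional_derivative(norm1, x, u) - support_of_subdifferential(u))
        for u in directions]
```

Both sides equal $u_1 + |u_2|$ here, so the entries of `gaps` should be at the level of rounding error; the kink of the norm in the second coordinate is what makes the subdifferential a segment rather than a point.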
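Corollary 6.6 Let $f$ be a convex function on a finite dimensional normed space $X$ and let $x \in \operatorname{ri} \operatorname{dom} f$ (i.e., be such that $\mathbb{R}_+(\operatorname{dom} f - x)$ is a linear subspace). Then $\partial f(x)$ is nonempty.

Proof Taking $D = \operatorname{dom} f$, by Proposition 6.2 the restriction $g$ of $f$ to $Y$ is continuous at $x$. The preceding corollary applies. $\square$

In the general case of a closed proper convex function $f$ on a Banach space, a density result for the set $\operatorname{dom} \partial f$ of subdifferentiability points of $f$ can be given.

Theorem 6.5 (Brøndsted-Rockafellar) For a closed proper convex function $f$ on a Banach space $X$, the set of points $x \in X$ such that $\partial f(x)$ is nonempty is dense in $\operatorname{dom} f$. More precisely, for any $\overline{x} \in \operatorname{dom} f$ there exists a sequence $(x_n) \to \overline{x}$ such that $(f(x_n)) \to f(\overline{x})$ and $\partial f(x_n) \ne \emptyset$ for all $n \in \mathbb{N}$.

Proof Given $\overline{x} \in \operatorname{dom} f$ and a sequence $(\varepsilon_n) \to 0_+$, Lemma 6.1 provides some $x_n^* \in X^*$, $r_n \in \mathbb{R}$ such that $g_n := \langle x_n^*, \cdot\rangle - r_n \le f$ and $g_n(\overline{x}) > f(\overline{x}) - \varepsilon_n$. Then we have $\inf(f - g_n) \ge 0$ and $f(\overline{x}) - g_n(\overline{x}) < \varepsilon_n$. Ekeland's variational principle yields some $x_n \in B(\overline{x}, \varepsilon_n)$ such that, for all $x \in X$,
$$f(x) - g_n(x) + \|x - x_n\| \ge f(x_n) - g_n(x_n).$$
Thus, $0 \in \partial(f - g_n + \|\cdot - x_n\|)(x_n)$. The sum rule we shall see shortly (Theorem 6.8) ensures that there exists some $u_n^* \in \partial \|\cdot - x_n\|(x_n) = \partial \|\cdot\|(0) = B_{X^*}$ such that $x_n^* - u_n^* \in \partial f(x_n)$.

Exercises

1. Establish the inequality $xy \le p^{-1} x^p + q^{-1} y^q$ for any $x, y \in \mathbb{R}_+$, $p, q > 1$ satisfying $p^{-1} + q^{-1} = 1$ by minimizing the function $x \mapsto p^{-1} x^p - xy$ for a fixed $y > 0$. Deduce from this inequality Hölder's inequality:
$$\forall a := (a_i),\ b := (b_i) \in \mathbb{R}^n \qquad \sum_{i=1}^{n} |a_i b_i| \le \Big(\sum_{i=1}^{n} |a_i|^p\Big)^{1/p} \Big(\sum_{i=1}^{n} |b_i|^q\Big)^{1/q}.$$
[Hint: set $s_i := a_i / \|a\|_p$, $t_i := b_i / \|b\|_q$, with $\|a\|_p := (\sum_{1 \le i \le n} |a_i|^p)^{1/p}$, $\|b\|_q := (\sum_{1 \le i \le n} |b_i|^q)^{1/q}$ and note that $\sum_{1 \le i \le n} |s_i|^p = 1$, $\sum_{1 \le i \le n} |t_i|^q = 1$, $\sum_{1 \le i \le n} |s_i t_i| \le \sum_{1 \le i \le n} (p^{-1} |s_i|^p + q^{-1} |t_i|^q)$.]

2. (a) Let $A$ be a positive definite matrix, let $\lambda_1$ (resp. $\lambda_n$) be its smallest (resp. largest) eigenvalue and let $\gamma := \sqrt{\lambda_1 \lambda_n}$. Verify that the function $f : t \mapsto t/\gamma + \gamma/t$ is convex on $[\lambda_1, \lambda_n]$ and satisfies $f(\lambda_1) = \sqrt{\lambda_1/\lambda_n} + \sqrt{\lambda_n/\lambda_1} = f(\lambda_n)$, hence $f(t) \le \sqrt{\lambda_1/\lambda_n} + \sqrt{\lambda_n/\lambda_1}$ for all $t \in [\lambda_1, \lambda_n]$.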
(b) Show that $\lambda/\gamma + \gamma/\lambda$ is an eigenvalue of $\gamma^{-1} A + \gamma A^{-1}$ if and only if $\lambda$ is an eigenvalue of $A$. [Hint: reduce $A$ to a diagonal form.]
(c) Prove that $2\sqrt{\langle Ax, x\rangle \cdot \langle A^{-1}x, x\rangle} \le \gamma^{-1}\langle Ax, x\rangle + \gamma \langle A^{-1}x, x\rangle$ for all $x \in \mathbb{R}^n$. [Hint: use the inequality $2\sqrt{ab} \le a + b$ for $a, b > 0$.]
(d) Deduce from this Kantorovich's inequality:
$$\forall x \in \mathbb{R}^n \qquad \langle Ax, x\rangle \cdot \langle A^{-1}x, x\rangle \le \frac14 \Big(\sqrt{\lambda_1/\lambda_n} + \sqrt{\lambda_n/\lambda_1}\Big)^2 \|x\|^4.$$

3. (Calmness subdifferentiability criterion). A function $f : X \to \mathbb{R}_\infty$ finite at $\overline{x} \in X$ is said to be calm at $\overline{x}$ if $-f$ is quiet at $\overline{x}$, i.e. if there exist $c \in \mathbb{R}_+$ and a neighborhood $V$ of $\overline{x}$ such that $f(x) - f(\overline{x}) \ge -c\,\|x - \overline{x}\|$ for all $x \in V$. The calmness rate of $f$ at $\overline{x}$ is the infimum $\gamma_f(\overline{x})$ of the constants $c > 0$ for which the preceding inequality is satisfied on some neighborhood of $\overline{x}$. Show that a convex function $f : X \to \mathbb{R}_\infty$ finite at some $\overline{x} \in X$ is subdifferentiable at $\overline{x}$ if and only if it is calm at $\overline{x}$. Verify that the calmness rate of $f$ at $\overline{x}$ is equal to the remoteness $\rho(\partial f(\overline{x}))$ of $\partial f(\overline{x})$, where the remoteness of a nonempty subset $S$ of $X$ or $X^*$ is the number $\rho(S) := \inf\{\|s\| : s \in S\}$.

4. (Subdifferential determination of convex functions). Given two continuous proper convex functions $f$, $g$ on an open convex subset $W$ of a Banach space $X$ satisfying $\partial f \subset \partial g$, prove that there exists some $c \in \mathbb{R}$ such that $f(\cdot) = g(\cdot) + c$ on $W$. [Hint: reduce the question to the case $X = \mathbb{R}$: given $w, x \in W$ show that $f(w) - f(x) = g(w) - g(x)$ by taking compositions of $f$ and $g$ with the affine map $h : t \mapsto h(t) := w + t(x - w)$ from $\mathbb{R}$ to $X$ and note that the functions $f \circ h$ and $g \circ h$ have nondecreasing derivatives satisfying $(f \circ h)' \le (g \circ h)'$.]

5. Prove that the subdifferential $M := \partial f$ of a proper convex function $f$ is cyclically monotone in the sense that for all $n \in \mathbb{N} \setminus \{0\}$ and any family $\{(x_i, x_i^*) : i \in \{0\} \cup \mathbb{N}_n\}$ in (the graph of) $M$ one has
$$\langle x_0^* - x_1^*, x_0\rangle + \cdots + \langle x_{n-1}^* - x_n^*, x_{n-1}\rangle + \langle x_n^* - x_0^*, x_n\rangle \ge 0.$$

6. Deduce from Theorem 9.23 that $\partial f$ is maximally cyclically monotone in the sense that any multimap $T : X \rightrightarrows X^*$ whose graph is cyclically monotone and contains the graph of $\partial f$ coincides with $\partial f$.

7. Prove that if a multimap $T : X \rightrightarrows X^*$ is cyclically monotone there exists a proper convex function $f$ on $X$ such that $T \subset \partial f$. Deduce from this and the two preceding exercises that a multimap $T : X \rightrightarrows X^*$ is the subdifferential $\partial f$ of a proper convex function if and only if it is maximally cyclically monotone. [See [218].]
6.2.3 Differentiability of Convex Functions

Convex functions also enjoy particular differentiability properties. A first instance is the next result displaying an easy differentiability test using the functions
$$r_x(w) := f(x + w) + f(x - w) - 2f(x), \tag{6.4}$$
$$r_{x,u}(t) := f(x + tu) + f(x - tu) - 2f(x) = r_x(tu). \tag{6.5}$$
This criterion enables us to prove differentiability without knowing the derivative.

Proposition 6.14 Let $f : X \to \mathbb{R}_\infty$ be a convex function finite and continuous (or more generally subdifferentiable) at some point $x \in X$. Then $f$ is Fréchet (resp. Hadamard) differentiable at $x$ if and only if $r_x$ is a remainder (resp. if for all $u \in S_X$ the one-variable function $r_{x,u}$ is a remainder).

Proof Necessity is obtained by addition directly from the definitions. Let us prove sufficiency in the Fréchet case. Let $x^* \in \partial f(x)$. Then the definition of $\partial f(x)$ and (6.4) yield
$$0 \le f(x + w) - f(x) - \langle x^*, w\rangle = f(x) - f(x - w) + \langle x^*, -w\rangle + r_x(w) \le r_x(w).$$
This shows that $f$ is Fréchet differentiable at $x$ with derivative $x^*$. The Gateaux case follows by a reduction to one-dimensional subspaces. Since $f$ is continuous at $x$, it is Lipschitzian around $x$, so that Gateaux differentiability coincides with Hadamard differentiability. $\square$

Other instances arise with continuity properties of derivatives or closure properties of subdifferentials. Hereafter, for a net $(x_i^*)_{i \in I}$ weak$^*$ converging to some $x^*$ in $X^*$ we write $(x_i^*)_{i \in I} \xrightarrow{*} x^*$.

Proposition 6.15 Let $f$ be a convex function on a normed space $X$, let $x \in \operatorname{dom} f$, $(x_i)_{i \in I} \to x$ and let $x_i^* \in \partial f(x_i)$ be such that $(f(x_i))_{i \in I} \to f(x)$, $(x_i^*)_{i \in I} \xrightarrow{*} x^*$, and $(\langle x_i^*, x_i - x\rangle)_{i \in I} \to 0$. Then $x^* \in \partial f(x)$.

Note that the assumption $(\langle x_i^*, x_i - x\rangle)_{i \in I} \to 0$ is satisfied when $(x_i^*)_{i \in I}$ is bounded.

Proof It suffices to observe that for all $w \in X$ one has
$$\langle x^*, w - x\rangle = \lim_i \langle x_i^*, w - x\rangle = \lim_i \langle x_i^*, w - x_i\rangle \le \lim_i (f(w) - f(x_i)) = f(w) - f(x). \qquad \square$$

Taking for $f$ the indicator function of a convex set $C$, we get the following consequence which can be given an easy direct proof.
Corollary 6.7 Let $C$ be a convex subset of a normed space $X$, let $(x_i)_{i \in I}$ be a net in $C$ with limit $x \in C$ and let $x_i^* \in N(C, x_i)$ be such that $(x_i^*)_{i \in I}$ weak$^*$ converges to some $x^*$ and $(\langle x_i^*, x_i - x\rangle)_{i \in I} \to 0$. Then $x^* \in N(C, x)$.

Proposition 6.16 If $f : W \to \mathbb{R}$ is continuous and convex on an open convex subset $W$ of a normed space $X$, then $df$ is upper semicontinuous on $W \times X$. If, moreover, $f$ is Gateaux differentiable at $x \in W$ then $f$ is Hadamard differentiable at $x$ and for all $v \in X$, $df$ is continuous at $(x, v)$. If, moreover, $f$ is Gateaux differentiable around $x$, then $f$ is of class $D^1$ around $x$.

Proof For any $r > df(x, v)$ one can find a positive number $s$ such that $r > s^{-1}[f(x + sv) - f(x)]$. Thus, for $(x', v')$ close enough to $(x, v)$ one has $r > s^{-1}[f(x' + sv') - f(x')] \ge df(x', v')$, so that
$$df(x, v) \ge \limsup_{(x', v') \to (x, v)} df(x', v').$$
If $f$ is Gateaux differentiable at $x$, since $df(x', v') \ge -df(x', -v')$, the linearity of $df(x, \cdot)$ implies that
$$\liminf_{(x', v') \to (x, v)} df(x', v') \ge -\limsup_{(x', v') \to (x, v)} df(x', -v') \ge -df(x, -v) = df(x, v).$$
These inequalities prove our continuity assertion. Hadamard differentiability ensues (and can be deduced from the local Lipschitz property of $f$). $\square$

In the next statement the continuity of the derivative of $f$ is reinforced and for a subset $A$ of $X^*$ and $r \in \mathbb{P}$, we use the notation $B(A, r) := \{x^* : d(x^*, A) < r\}$.

Proposition 6.17 Let $f : W \to \mathbb{R}$ be a convex function on some open convex subset $W$ of a normed space $X$. If $f$ is Fréchet differentiable at some $x \in W$ and Gateaux differentiable on $W$, then its derivative is continuous at $x$. More generally, if $f$ is Fréchet differentiable at some $x \in W$, then its subdifferential $\partial f$ is continuous at $x$ in the following sense: for all $\varepsilon > 0$, there exists an $\eta > 0$ such that $\partial f(w) \cap B(f'(x), \varepsilon) \ne \emptyset$ and $\partial f(w) \subset B(f'(x), \varepsilon)$ for all $w \in B(x, \eta)$.

Proof It suffices to prove the second assertion. The differentiability of $f$ at $x$ entails the continuity of $f$ on $W$, hence that $\partial f(w) \ne \emptyset$ for all $w \in W$. Let $x^* := Df(x)$. Given $\varepsilon \in\, ]0, d(x, X \setminus W)[$, $\alpha \in\, ]0, \varepsilon[$, let $\delta > 0$ be such that
$$\forall u \in B(0, \delta) \qquad f(x + u) - f(x) - \langle x^*, u\rangle \le \alpha \|u\|.$$
Let $c := \alpha\varepsilon^{-1} \in\, ]0, 1[$. For all $w \in B(x, (1 - c)\delta)$, $w^* \in \partial f(w)$, $v \in X$ one has
$$f(w) - f(w + v) + \langle w^*, v\rangle \le 0. \tag{6.6}$$
Setting $u := w - x + v$ in (6.6) with $v \in B(0, c\delta)$, one has $u \in B(0, \delta)$, $x + u = w + v$ and, adding the respective sides of the preceding inequalities, one gets
$$f(w) - f(x) - \langle x^*, u\rangle + \langle w^*, v\rangle \le \alpha \|u\|.$$
Using the relation $\langle x^*, u - v\rangle = \langle x^*, w - x\rangle \le f(w) - f(x)$, this inequality yields $\langle w^* - x^*, v\rangle \le \alpha \|u\| \le \alpha\delta$. Taking the supremum over $v \in B(0, c\delta)$, one gets $\|w^* - x^*\| \le c^{-1}\alpha = \varepsilon$. $\square$

Corollary 6.8 A Fréchet differentiable convex function on an open convex subset of a normed space is of class $C^1$.

Let us mention some density properties of the set of points of differentiability of a convex function; see [95, 119, 120, 187, 209].

Theorem 6.6 (Asplund, Lindenstrauss) Let $f : W \to \mathbb{R}$ be a continuous convex function on some open convex subset $W$ of a Banach space $X$ whose dual is separable. Then the set $F$ of points of differentiability of $f$ is dense in $W$.

Theorem 6.7 (Mazur) If $X$ is a separable Banach space, the set $H$ of Hadamard differentiability points of a continuous convex function $f : W \to \mathbb{R}$ is dense in $W$.

Let us recall that, as far as subdifferentiability is concerned, no restriction on the space is required in view of the Brøndsted-Rockafellar Theorem.
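Returning to the differentiability test of Proposition 6.14, the behavior of the symmetric remainder $r_x$ of (6.4) is easy to observe on the real line. The sketch below is an illustration under assumed data: the functions $z \mapsto z^2$ (at $x = 1$) and $|\cdot|$ (at $x = 0$), the step sizes, and the helper name `symmetric_remainder` are choices made for the example. It prints nothing and only collects the ratios $r_x(w)/|w|$.

```python
def symmetric_remainder(f, x, w):
    """r_x(w) = f(x + w) + f(x - w) - 2 f(x), as in (6.4)."""
    return f(x + w) + f(x - w) - 2.0 * f(x)

steps = [10.0 ** (-k) for k in range(1, 7)]     # w = 0.1, 0.01, ..., 1e-6

# Differentiable case: for z -> z^2 one has r_x(w) = 2 w^2, so r_x(w)/w -> 0.
smooth_ratios = [symmetric_remainder(lambda z: z * z, 1.0, w) / w for w in steps]

# Kink at 0: for the absolute value r_0(w) = 2 |w|, so the ratio stays at 2.
kink_ratios = [symmetric_remainder(abs, 0.0, w) / w for w in steps]
```

The first list tends to $0$, in line with $r_x$ being a remainder at a point of differentiability, while the second stays at $2$, reflecting the failure of differentiability of $|\cdot|$ at $0$.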
Exercises
1. (a) Let $f : \mathbb{R} \to \mathbb{R}$ be given by $f(x) = |x|$. Show that $\partial f(0) = [-1, 1]$. (b) Verify that the subdifferential at $0$ of a sublinear function $f$ on a normed space $X$ is given by $\partial f(0) = \{x^* \in X^* : x^* \le f\}$. Prove that $\partial f(\overline{x}) = \{x^* \in X^* : x^* \le f,\ \langle x^*, \overline{x} \rangle = f(\overline{x})\}$ for $\overline{x} \in X$.
2. For a convex function $f$ on $\mathbb{R}$ finite at $x$ show that $\partial f(x) = [D_\ell f(x), D_r f(x)]$.
3. Prove that the closure of the radial tangent cone at $x \in C$ to a convex subset $C$ of a normed space coincides with the tangent cone to $C$ as defined in Chap. 5.
4. Prove that the normal cone $N(C, x)$ to a convex subset $C$ of a normed space at $x \in C$ coincides with the normal cone to $C$ as defined in Chap. 5. Compute $N(C, x)$ for $X := \mathbb{R}^m$, $C := \mathbb{R}^m_+$, $x \in C$.
5. (Ubiquitous convex sets) Exhibit a proper convex subset $C$ of a Banach space $X$ such that $T(C, \overline{x}) = X$ for some $\overline{x} \in C$. Show that $X$ must be infinite dimensional. [Hint: take for $X$ a separable Hilbert space with Hilbert basis $(e_n)$ and set $C := \{x = \Sigma_n x_n e_n : |x_n| \le 2^{-n}\ \forall n\}$, $\overline{x} = 0$.]
6. Let $f : \mathbb{R}^2 \to \mathbb{R}_\infty$ be given by $f(x_1, x_2) := \max(|x_1|, 1 - \sqrt{x_2})$ for $(x_1, x_2) \in \mathbb{R} \times \mathbb{R}_+$, $+\infty$ otherwise. Prove that $f$ is convex but that $\operatorname{dom} \partial f$ is not convex.
6 A Touch of Convex Analysis
7. Let $f : X \to \mathbb{R}_\infty$ be a proper convex function on a normed space $X$ and let $x, y \in \operatorname{dom} f$. Show that $df(x, y - x) + df(y, x - y) \le 0$. Deduce from this relation that when $f$ is Gateaux differentiable at $x$ and $y$ one has $\langle f'(x) - f'(y), x - y \rangle \ge 0$.
8. Let $X$ be a Hilbert space, let $C$ be a nonempty closed convex subset of $X$ and let $f : X \to \mathbb{R}$ be given by $f(x) := (1/2)[\|x\|^2 - \|x - P(x)\|^2]$, where $P$ is the metric projection of $X$ onto $C$: $P(x) := \{u\}$, where $u \in C$, $\|x - u\| = d(x, C)$. Show that $f$ is convex and that $f$ is everywhere Fréchet differentiable, with gradient given by $\nabla f(x) = P(x)$ for all $x \in X$. [Hint: note that $f(x) = \sup\{\langle x, y \rangle - (1/2)\|y\|^2 : y \in C\}$, i.e. $f$ is the conjugate of $(1/2)\|\cdot\|^2 + \iota_C(\cdot)$; use the estimates $\|x + u - P(x + u)\|^2 \le \|x + u - P(x)\|^2$ and $\|x - P(x)\|^2 \le \|x - P(x + u)\|^2$ to prove $f$ is differentiable at $x$.]
9. Let $f : W \to \mathbb{R}$ be convex on some open convex subset of $\mathbb{R}^d$ and such that the partial derivatives $D_i f(x)$ ($i \in \mathbb{N}_d$) of $f$ at some $x \in W$ exist. Show that $f$ is differentiable at $x$.
10*. Prove that a convex function $f : W \to \mathbb{R}$ on an open subset $W$ of $\mathbb{R}^d$ is differentiable almost everywhere on $W$.
11. Show that for a convex function $f : X \to \mathbb{R}_\infty$ on a normed space $X$, the multimap $\partial f : X \rightrightarrows X^*$ is monotone, i.e. satisfies $\langle w^* - x^*, w - x \rangle \ge 0$ for all $w, x \in X$, $w^* \in \partial f(w)$, $x^* \in \partial f(x)$.
6.2.4 Elementary Calculus Rules for Subdifferentials

The fact that the calculus of subdifferentials satisfies structured rules is part of the value of convex analysis. We first consider simple rules. Then we turn to more general rules. Convex functions enjoy several subdifferential calculus rules that are akin to the classical rules of differential calculus. Nonetheless, there are some differences: in general a technical assumption is needed to get the interesting inclusion. Moreover, one does not have $\partial(-f)(x) = -\partial f(x)$ in general. On the other hand, some rules of convex analysis have no analogues in the differentiable case. An example of such a new rule is the following obvious observation.

Lemma 6.3 Suppose $f \le g$ and $f(\overline{x}) = g(\overline{x})$ for some $\overline{x} \in X$. Then $\partial f(\overline{x}) \subset \partial g(\overline{x})$.

This observation easily yields the following (rather inessential) rule for infima.

Lemma 6.4 Let $(f_i)_{i \in I}$ be a family of functions and let $\overline{x} \in \bigcap_{i \in I} \operatorname{dom} f_i$. If $f := \inf_{i \in I} f_i$ and if $f_i(\overline{x}) = f(\overline{x})$ for all $i \in I$, then $\partial f(\overline{x}) = \bigcap_{i \in I} \partial f_i(\overline{x})$.

Proof The inclusion $\partial f(\overline{x}) \subset \bigcap_{i \in I} \partial f_i(\overline{x})$ stems from Lemma 6.3. For the opposite inclusion, we note that for all $\overline{x}^* \in \bigcap_{i \in I} \partial f_i(\overline{x})$, for all $i \in I$ and all $x \in X$ one has $f_i(\overline{x}) + \langle \overline{x}^*, x - \overline{x} \rangle \le f_i(x)$, hence $f(\overline{x}) + \langle \overline{x}^*, x - \overline{x} \rangle \le f(x)$ as $f(\overline{x}) = f_i(\overline{x})$. □

A general rule can be given for value functions of parameterized problems.
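Proposition 6.18 Let $f : W \times X \to \mathbb{R}_\infty$, where $W$ and $X$ are normed spaces. Let $p$ be the performance function given by $p(w) := \inf\{f(w, x) : x \in X\}$ and let $S : W \rightrightarrows X$ be the solution multimap given by $S(w) := \{x \in X : f(w, x) = p(w)\}$. Suppose that for some $\overline{w} \in W$ one has $S(\overline{w}) \neq \emptyset$. Then one has the equivalences
$$\overline{w}^* \in \partial p(\overline{w}) \iff \forall \overline{x} \in S(\overline{w})\ \ (\overline{w}^*, 0) \in \partial f(\overline{w}, \overline{x}) \iff \exists \overline{x} \in S(\overline{w})\ \ (\overline{w}^*, 0) \in \partial f(\overline{w}, \overline{x}).$$

Proof For all $\overline{x} \in S(\overline{w})$, $(w, x) \in W \times X$, one has $p(\overline{w}) = f(\overline{w}, \overline{x})$, $p(w) \le f(w, x)$, whence
$$\overline{w}^* \in \partial p(\overline{w}) \iff \forall w \in W\ \ p(w) \ge p(\overline{w}) + \langle \overline{w}^*, w - \overline{w} \rangle$$
$$\Longrightarrow \forall (w, x) \in W \times X\ \ f(w, x) \ge f(\overline{w}, \overline{x}) + \langle (\overline{w}^*, 0), (w - \overline{w}, x - \overline{x}) \rangle,$$
or $(\overline{w}^*, 0) \in \partial f(\overline{w}, \overline{x})$. Conversely, if this last relation holds for some $\overline{x} \in S(\overline{w})$ and some $\overline{w}^* \in W^*$, then, taking the infimum over $x \in X$ in the last inequality, one gets
$$\forall w \in W\ \ p(w) \ge p(\overline{w}) + \langle \overline{w}^*, w - \overline{w} \rangle,$$
i.e. $\overline{w}^* \in \partial p(\overline{w})$. □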
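As a sanity check of Proposition 6.18 on a smooth example, the following sketch uses the illustrative function $f(w, x) = (w - x)^2 + x^2$ (our own choice, not taken from the text), for which $S(w) = \{w/2\}$ and $p(w) = w^2/2$. It verifies that the partial derivative of $f$ with respect to $x$ vanishes at the solution and that the partial derivative with respect to $w$ agrees with a finite-difference estimate of $p'(\overline{w})$. All function names are hypothetical.

```python
# Illustrative data (an assumption, not from the text): f(w, x) = (w - x)^2 + x^2.

def f(w, x):
    return (w - x) ** 2 + x ** 2

def grad_f(w, x):
    # Partial derivatives of f with respect to w and to x.
    return (2.0 * (w - x), -2.0 * (w - x) + 2.0 * x)

def solution(w):
    # Unique minimizer of x -> f(w, x), found by setting the x-derivative to zero.
    return w / 2.0

def p(w):
    # Performance function p(w) = inf_x f(w, x), here equal to w^2 / 2.
    return f(w, solution(w))

def p_prime(w, h=1e-5):
    # Central finite difference approximating p'(w).
    return (p(w + h) - p(w - h)) / (2.0 * h)

w_bar = 1.5
x_bar = solution(w_bar)
dw, dx = grad_f(w_bar, x_bar)
print("gradient of f at (w_bar, x_bar):", (dw, dx))
print("finite-difference estimate of p'(w_bar):", p_prime(w_bar))
```

Here the gradient of $f$ at $(\overline{w}, \overline{x})$ has the form $(\overline{w}^*, 0)$ with $\overline{w}^* = p'(\overline{w})$, as the proposition predicts in the differentiable case.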
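Corollary 6.9 Given functions $g, h : X \to \mathbb{R}_\infty$, $\overline{w} \in \operatorname{dom}(g \,\square\, h)$, $\overline{x} \in X$ such that $(g \,\square\, h)(\overline{w}) = g(\overline{w} - \overline{x}) + h(\overline{x})$, one has
$$\partial(g \,\square\, h)(\overline{w}) = \partial g(\overline{w} - \overline{x}) \cap \partial h(\overline{x}).$$

Proof Setting $f(w, x) := g(w - x) + h(x)$, $p := g \,\square\, h$, it suffices to see that for any $\overline{w}^* \in \partial g(\overline{w} - \overline{x}) \cap \partial h(\overline{x})$ one has $(\overline{w}^*, 0) \in \partial f(\overline{w}, \overline{x})$ and that conversely for $(\overline{w}^*, 0) \in \partial f(\overline{w}, \overline{x})$ one has $\overline{w}^* \in \partial g(\overline{w} - \overline{x})$ and by symmetry $\overline{w}^* \in \partial h(\overline{x})$. □

The case of the supremum of a finite family of convex functions is more likely to occur than the case of the infimum. For the case of an infinite family we refer to [208, Section 3.3.1].

Proposition 6.19 Let $(f_i)_{i \in I}$ be a finite family of convex functions on a normed space $X$ and let $f := \sup_{i \in I} f_i$. Let $\overline{x} \in \bigcap_{i \in I} \operatorname{dom} f_i$ and let $I(\overline{x}) := \{i \in I : f_i(\overline{x}) = f(\overline{x})\}$. Suppose that for all $i \in I$ the function $f_i$ is continuous at $\overline{x}$. Then one has
$$df(\overline{x}, \cdot) = \max_{i \in I(\overline{x})} df_i(\overline{x}, \cdot), \tag{6.7}$$
$$\partial f(\overline{x}) = \operatorname{co}\Big(\bigcup_{i \in I(\overline{x})} \partial f_i(\overline{x})\Big). \tag{6.8}$$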
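Proof Let $u \in X$. Since $f_i$ and $f$ are continuous at $\overline{x}$, by Proposition 6.9 $df(\overline{x}, \cdot)$ coincides with the radial derivative. For $i \in I(\overline{x})$, since $f_i \le f$ and $f_i(\overline{x}) = f(\overline{x})$, we have $df_i(\overline{x}, u) \le df(\overline{x}, u)$. Thus $s := \max_{i \in I(\overline{x})} df_i(\overline{x}, u) \le df(\overline{x}, u)$ and equality holds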
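when $s = +\infty$. Let us suppose that $s < +\infty$ and let us show that for every $r > s$ we have $r \ge f'(\overline{x}, u)$; this will prove that $s = f'(\overline{x}, u)$. For $i \in I(\overline{x})$, let $t_i > 0$ be such that
$$(1/t)(f_i(\overline{x} + tu) - f_i(\overline{x})) < r \quad \text{for } t \in \,]0, t_i[.$$
Since for $j \in I \setminus I(\overline{x})$ the function $f_j$ is continuous at $\overline{x}$, given $\varepsilon > 0$ such that $f_j(\overline{x}) + \varepsilon < f(\overline{x})$ for all $j \in I \setminus I(\overline{x})$, we can find $t_j > 0$ such that
$$f_j(\overline{x} + tu) < f(\overline{x}) - \varepsilon \quad \text{for } t \in \,]0, t_j[.$$
Then, for $t \in \,]0, t_0[$, with $t_0 := \min(|r|^{-1}\varepsilon, \min_{i \in I} t_i)$, we have $-\varepsilon \le tr$, hence
$$f(\overline{x} + tu) = \max_{i \in I} f_i(\overline{x} + tu) \le \max\Big(\max_{i \in I(\overline{x})} (f_i(\overline{x}) + tr),\ f(\overline{x}) - \varepsilon\Big) = f(\overline{x}) + tr.$$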
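Thus $df(\overline{x}, u) \le r$ and $df(\overline{x}, u) = \max_{i \in I(\overline{x})} df_i(\overline{x}, u)$.

For $i \in I(\overline{x})$, the inclusion $\partial f_i(\overline{x}) \subset \partial f(\overline{x})$ follows from Lemma 6.3 or from the inequality $df_i(\overline{x}, \cdot) \le df(\overline{x}, \cdot)$. Denoting by $C$ the right-hand side of (6.8), and observing that $\partial f(\overline{x})$ is convex, the inclusion $C \subset \partial f(\overline{x})$ ensues. Let us show that assuming there exists some $\overline{w}^* \in \partial f(\overline{x}) \setminus C$ leads to a contradiction. Since $C$ is weak$^*$ closed (in fact weak$^*$ compact), the Hahn-Banach theorem yields some $c \in \mathbb{R}$ and $u \in X$ (the dual of $X^*$ endowed with the weak$^*$ topology in view of Proposition 3.18) such that
$$\langle \overline{w}^*, u \rangle > c \ge \langle x^*, u \rangle \quad \forall x^* \in C.$$
Since $df(\overline{x}, u) \ge \langle \overline{w}^*, u \rangle$ we get
$$df(\overline{x}, u) > c \ge \sup_{x^* \in C} \langle x^*, u \rangle = \sup_{i \in I(\overline{x})} \sup_{x^* \in \partial f_i(\overline{x})} \langle x^*, u \rangle = \sup_{i \in I(\overline{x})} df_i(\overline{x}, u),$$
contradicting the equality we established. □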
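Relation (6.7) can be observed numerically. The sketch below is only an illustration on data of our own choosing: three convex functions on $\mathbb{R}^2$, of which the first two are active at the origin and the third is not. It compares difference quotients of the maximum with the maximum of the difference quotients over the active indices; the names `f1`, `f2`, `f3` and `quotient` are hypothetical.

```python
# Three convex functions on R^2 (illustrative choices, not from the text).
def f1(x, y):
    return x * x + y

def f2(x, y):
    return x - y

def f3(x, y):
    return x + y - 1.0  # f3(0, 0) = -1 < 0, so index 3 is inactive at the origin

def f(x, y):
    return max(f1(x, y), f2(x, y), f3(x, y))

def quotient(g, u, t=1e-6):
    # Difference quotient (g(t u) - g(0)) / t approximating the directional
    # derivative of g at the origin in the direction u.
    return (g(t * u[0], t * u[1]) - g(0.0, 0.0)) / t

directions = [(1.0, 0.0), (0.0, 1.0), (-1.0, 2.0), (3.0, -1.0)]
gaps = []
for u in directions:
    lhs = quotient(f, u)
    # Maximum over the active index set I(0) = {1, 2} only.
    rhs = max(quotient(f1, u), quotient(f2, u))
    gaps.append(abs(lhs - rhs))
    print(u, lhs, rhs)
max_gap = max(gaps)
```

The inactive function plays no role for small steps, which is exactly the mechanism used in the proof through the choice of $\varepsilon$ and $t_0$.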
Now we state classical and convenient sum and composition rules. They are special cases of a mixed rule given later in Corollary 6.14. We invite the reader to devise direct proofs using the Hahn-Banach Theorem. Such proofs are available in most books dealing with convex analysis. Moreover, each result can be deduced from the other one.

Theorem 6.8 Let $f$ and $g$ be convex functions on a normed space $X$. If $f$ and $g$ are finite at $\overline{x}$ and if $g$ is continuous at some point of $\operatorname{dom} f \cap \operatorname{dom} g$ then
$$\partial(f + g)(\overline{x}) = \partial f(\overline{x}) + \partial g(\overline{x}).$$
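The sum rule can be checked on the real line, where the subdifferential of a convex function is the interval between its left and right derivatives. The following sketch takes $f(x) = |x|$ and $g(x) = (x - 1)^2$ (illustrative choices of ours) and estimates the one-sided derivatives at $0$ by difference quotients; the helper `one_sided` is hypothetical.

```python
def f(x):
    return abs(x)

def g(x):
    return (x - 1.0) ** 2

def h(x):
    return f(x) + g(x)

def one_sided(fun, x, t=1e-6):
    # Approximations of the left and right derivatives of fun at x.
    left = (fun(x) - fun(x - t)) / t
    right = (fun(x + t) - fun(x)) / t
    return left, right

lf, rf = one_sided(f, 0.0)   # close to (-1, 1)
lg, rg = one_sided(g, 0.0)   # both close to -2
lh, rh = one_sided(h, 0.0)   # close to (-3, -1)
print("subdifferential of f at 0 ~", (lf, rf))
print("subdifferential of g at 0 ~", (lg, rg))
print("subdifferential of f + g at 0 ~", (lh, rh))
```

Up to the discretization error, the interval obtained for $f + g$ is the sum of the interval $[-1, 1]$ and the singleton $\{-2\}$.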
Theorem 6.9 (Chain Rule) Let $X$ and $Y$ be normed spaces, let $A : X \to Y$ be a linear continuous map and let $g : Y \to \mathbb{R}_\infty$ be finite at $\overline{y} := A(\overline{x})$ and continuous at some point of $A(X)$. Then, for $f := g \circ A$ one has
$$\partial f(\overline{x}) = A^{\intercal}(\partial g(\overline{y})) := \partial g(\overline{y}) \circ A.$$

In Banach spaces, one can get rid of the continuity assumptions in the preceding two rules, replacing them by a so-called "qualification condition", as will be shown in a forthcoming subsection devoted to duality results. The statement we present gathers a chain rule and a sum rule. It will be proved in Theorem 6.19.

Theorem 6.10 (Attouch-Brézis) Let $X$, $Y$ be Banach spaces, let $A \in L(X, Y)$, and let $f : X \to \mathbb{R}_\infty$, $g : Y \to \mathbb{R}_\infty$ be closed proper convex functions. If the cone $Z := \mathbb{R}_+(A(\operatorname{dom} f) - \operatorname{dom} g)$ is closed and symmetric (i.e., $\operatorname{cl} Z = Z = -Z$), in particular if $Z = Y$, then, for all $\overline{x} \in \operatorname{dom} f \cap A^{-1}(\operatorname{dom} g)$ one has
$$\partial(f + g \circ A)(\overline{x}) = \partial f(\overline{x}) + A^{\intercal}(\partial g(A\overline{x})).$$
Exercises
1. Let $f$, $g$ be two convex functions on a normed space $X$ that are finite at some $\overline{x} \in X$. Suppose $g$ is Fréchet differentiable at $\overline{x}$ and show that $\partial(f + g)(\overline{x}) = \partial f(\overline{x}) + g'(\overline{x})$.
2. Let $f$ be a convex function on a normed space $X$ that is finite at some $\overline{x} \in X$. Suppose there exists some $\ell \in X^*$ such that $r$ defined by $r(x) := \max(f(\overline{x} + x) - f(\overline{x}) - \ell(x), 0)$ is a remainder. Show that $f$ is Fréchet differentiable at $\overline{x}$.
3. Prove that a convex function $f$ on a normed space $X$ finite at some $\overline{x} \in X$ is subdifferentiable at $\overline{x}$ if and only if it is calm at $\overline{x}$ in the sense that there exist $c > 0$ and a neighborhood $V$ of $\overline{x}$ such that $f(w) - f(\overline{x}) \ge -c\|w - \overline{x}\|$ for all $w \in V$, if and only if it is globally calm at $\overline{x}$ in the sense that the preceding inequality is valid for $V = X$. Show that in such a case one has $\partial f(\overline{x}) \cap cB_{X^*} \neq \emptyset$ but that one may have $\partial f(\overline{x}) \not\subset cB_{X^*}$.
4. Prove that a differentiable function $f : W \to \mathbb{R}$ defined on an open convex subset of a normed space $X$ is convex if and only if $f' : W \to X^*$ is monotone, i.e. satisfies $\langle f'(w) - f'(x), w - x \rangle \ge 0$ for all $w, x \in W$.
5. Assuming that $X$ is complete (resp. $X$ and $Y$ are complete) show that Theorem 6.8 (resp. 6.9) is a consequence of the Attouch-Brézis theorem.
6. Let $X$ and $Y$ be reflexive Banach spaces, let $A : X \to Y$ be linear and continuous and let $C := A^{-1}(D)$, where $D$ is a closed convex subset of $Y$. Let $\overline{x} \in C$, $\overline{y} := A(\overline{x})$. Show that $\overline{x}^* \in N(C, \overline{x})$ if and only if there exist sequences $(x_n) \to \overline{x}$,
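$(y_n) \to \overline{y} := A\overline{x}$ in $D$, $(y_n^*)$ in $Y^*$ such that $y_n^* \in N(D, y_n)$ for all $n$ and
$$(A^{\intercal} y_n^* - \overline{x}^*)_n \to 0, \qquad (\|y_n^*\| \cdot \|y_n - Ax_n\|)_n \to 0.$$
[Hint: introduce the penalized decoupling function $p_n : X \times Y \to \mathbb{R}_\infty$ given by $p_n(x, y) := \iota_D(y) - \langle \overline{x}^*, x \rangle + n\|Ax - y\|^2 + \|x\|^2$ and take a minimizer $(x_n, y_n)$ of $p_n$ on $B_{X \times Y}$.]
7. Let $X$ and $Y$ be reflexive Banach spaces, let $A \in L(X, Y)$ and let $f := g \circ A$, where $g : Y \to \mathbb{R}_\infty$ is lower semicontinuous and convex. Let $\overline{x} \in \operatorname{dom} f$, $\overline{x}^* \in X^*$. Show that $\overline{x}^* \in \partial f(\overline{x})$ if and only if there exist sequences $(x_n) \to \overline{x}$ in $X$, $(y_n) \to \overline{y} := A\overline{x}$ in $Y$, $(y_n^*)$ in $Y^*$ such that $y_n^* \in \partial g(y_n)$ for all $n$, $(g(y_n)) \to g(\overline{y})$, and
$$(A^{\intercal} y_n^* - \overline{x}^*)_n \to 0, \qquad (\|y_n^*\| \cdot \|y_n - Ax_n\|)_n \to 0.$$
8. Let $h$, $k$ be lower semicontinuous proper convex functions on a reflexive Banach space $X$ and let $f := h + k$ be finite at $\overline{x} \in X$. Show that $\overline{x}^* \in X^*$ belongs to $\partial f(\overline{x})$ if and only if there exist sequences $(w_n)$, $(z_n) \to \overline{x}$ in $X$, $(w_n^*)$, $(z_n^*)$ in $X^*$ such that $w_n^* \in \partial h(w_n)$, $z_n^* \in \partial k(z_n)$ for all $n$, $(h(w_n)) \to h(\overline{x})$, $(k(z_n)) \to k(\overline{x})$ and
$$(w_n^* + z_n^*)_n \to \overline{x}^*, \qquad ((\|w_n - \overline{x}\| + \|z_n - \overline{x}\|) \cdot (w_n^* + z_n^*))_n \to 0.$$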
6.2.5 Application to Optimality Conditions

Let us apply the above calculus rules to the constrained optimization problem
$$(\mathcal{C}) \qquad \text{minimize } f(x) \quad \text{subject to } x \in C,$$
where $f : X \to \mathbb{R}_\infty$ is convex and $C$ is a convex subset of $X$. We assume that $f$ takes at least one finite value on $C$, so that $\inf(\mathcal{C})$ is not $+\infty$. Then $(\mathcal{C})$ is equivalent to the minimization of $f_C := f + \iota_C$ on $X$. Optimality conditions for problem $(\mathcal{C})$ involve the notion of a normal cone to $C$ at some $\overline{x} \in C$; in the convex case we are dealing with presently, its simple definition has been given before Proposition 6.12. We recall it for the reader's convenience: the normal cone to $C$ at $\overline{x} \in C$ is the set $N(C, \overline{x})$ of continuous linear forms on $X$ which attain their maximum on $C$ at $\overline{x}$:
$$N(C, \overline{x}) := \partial \iota_C(\overline{x}) := \{\overline{x}^* \in X^* : \forall x \in C\ \ \langle \overline{x}^*, x - \overline{x} \rangle \le 0\}.$$

Example If $\overline{x}$ is in the interior of $C$, one has $N(C, \overline{x}) = \{0\}$ since a continuous linear form that has a local maximum is null. □
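For a polytope in $\mathbb{R}^2$ the definition of the normal cone can be tested directly, since a linear form attains its maximum over the convex hull of finitely many points at one of these points. The sketch below applies this observation to the unit square; it is a minimal illustration, and the function name `in_normal_cone` and the data are our own assumptions.

```python
def in_normal_cone(x_star, x_bar, vertices, tol=1e-12):
    # x_star belongs to N(C, x_bar) iff <x_star, v - x_bar> <= 0 for every
    # vertex v of the polytope C (a linear form is maximized at a vertex).
    for v in vertices:
        pairing = sum(s * (vi - xi) for s, vi, xi in zip(x_star, v, x_bar))
        if pairing > tol:
            return False
    return True

square = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
boundary_point = (1.0, 0.5)   # on the edge {1} x [0, 1]
interior_point = (0.5, 0.5)

print(in_normal_cone((1.0, 0.0), boundary_point, square))   # outward normal of the edge
print(in_normal_cone((1.0, 0.1), boundary_point, square))   # tilted form
print(in_normal_cone((0.0, 0.0), interior_point, square))   # the null form
print(in_normal_cone((1.0, 0.0), interior_point, square))   # a nonzero form
```

At the interior point only the null form passes the test, in agreement with the example above, while at the boundary point the forms $r(1, 0)$ with $r \ge 0$ are accepted.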
Example Let $g \in X^* \setminus \{0\}$, $c \in \mathbb{R}$ and let $D := \{x \in X : g(x) \le c\}$. Then, if $\overline{x}$ is such that $g(\overline{x}) < c$ one has $\overline{x} \in \operatorname{int} D$, hence $N(D, \overline{x}) = \{0\}$, while for all $\overline{x}$ such that $g(\overline{x}) = c$, one has $N(D, \overline{x}) = \mathbb{R}_+ g$. In fact, for any $r \in \mathbb{R}_+$ and all $x \in D$ one has $rg(x - \overline{x}) \le 0$, hence $rg \in N(D, \overline{x})$. Conversely, let $h \in N(D, \overline{x})$. Then, for all $u \in \operatorname{Ker} g$, one has $\overline{x} + u \in D$, hence $h(u) \le 0$. Changing $u$ into $-u$, we see that $\operatorname{Ker} g \subset \operatorname{Ker} h$, so that there exists an $r \in \mathbb{R}$ such that $h = rg$: picking $u \in X$ satisfying $g(u) = 1$ (this is possible since $g \neq 0$), we have $r = h(u)$, and since $\overline{x} - u \in D$ we get that $-r = -h(u) = h((\overline{x} - u) - \overline{x}) \le 0$, hence $r \in \mathbb{R}_+$. □

Theorem 6.11 A sufficient condition for $\overline{x} \in C$ to be a solution to $(\mathcal{C})$ is
$$0 \in \partial f(\overline{x}) + N(C, \overline{x}).$$
Under one of the following assumptions, this condition is necessary:
(a) $f$ is finite and continuous at some point of $C$;
(b) $f$ is finite at some point of the interior of $C$;
(c) $f$ is lower semicontinuous, $\mathbb{R}_+(\operatorname{dom} f - C) = -\operatorname{cl}(\mathbb{R}_+(\operatorname{dom} f - C))$, $C$ is closed and $X$ is complete.

Proof Suppose $\overline{x} \in C$ is such that $0 \in \partial f(\overline{x}) + N(C, \overline{x})$. Let $\overline{x}^* \in \partial f(\overline{x})$ be such that $-\overline{x}^* \in N(C, \overline{x})$. Then, $f(\overline{x})$ is finite and, for all $x \in C$, one has $f(x) - f(\overline{x}) \ge \langle \overline{x}^*, x - \overline{x} \rangle \ge 0$: $\overline{x}$ is a solution to $(\mathcal{C})$. The necessary condition stems from the relations $0 \in \partial(f + \iota_C)(\overline{x}) = \partial f(\overline{x}) + \partial \iota_C(\overline{x})$ valid under each of the assumptions (a)–(c). □

Using a fuzzy sum rule one can give a necessary and sufficient optimality condition that does not require additional assumptions (see [205, 251]). In order to apply the conditions of Theorem 6.11 to the important case in which $C$ is defined by inequalities, let us give a means to compute the normal cone to $C$ in such a case. We start with the case of a single inequality, generalizing the second example of this subsection.

Lemma 6.5 Let $g : X \to \mathbb{R}_\infty$ be a convex function and let $C := \{x \in X : g(x) \le 0\}$, $\overline{x} \in g^{-1}(0)$. Suppose $C_0 := \{x \in X : g(x) < 0\}$ is nonempty and $g$ is continuous at $\overline{x}$ and at some point $x_0 \in C_0$. Then one has $N(C, x_0) = \{0\}$ and $N(C, \overline{x}) = \mathbb{R}_+ \partial g(\overline{x})$.
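Proof For all $x_0 \in C_0$ the set $C$ is a neighborhood of $x_0$, so that $N(C, x_0) = \{0\}$. The inclusion $N(C, \overline{x}) \supset \mathbb{R}_+ \partial g(\overline{x})$ is obvious: given $r \in \mathbb{R}_+$ and $\overline{x}^* \in \partial g(\overline{x})$, for all $x \in C$ one has $\langle r\overline{x}^*, x - \overline{x} \rangle \le r(g(x) - g(\overline{x})) \le 0$, hence $r\overline{x}^* \in N(C, \overline{x})$. Conversely, let $\overline{x}^* \in N(C, \overline{x}) \setminus \{0\}$. The interior of $C$ is nonempty since it contains $x_0$. Since $\langle \overline{x}^*, x \rangle \le \langle \overline{x}^*, \overline{x} \rangle$ for all $x \in C$, we have $\langle \overline{x}^*, x \rangle < \langle \overline{x}^*, \overline{x} \rangle$ for all $x \in \operatorname{int}(C)$ (otherwise $\overline{x}^*$ would have a local maximum, hence would be $0$). In particular, $g(x) < 0$ implies $\langle \overline{x}^*, x \rangle < \langle \overline{x}^*, \overline{x} \rangle$ since for $y$ in the segment $]x, \overline{x}[$ we have $g(y) < 0$ and $g$ is bounded above near $y$, hence $y \in \operatorname{int}(C)$. Thus, $g(x) \ge 0$ for all $x$ such that $\langle \overline{x}^*, x \rangle \ge \langle \overline{x}^*, \overline{x} \rangle$. Therefore $\overline{x}$ is a minimizer of $g$ on $D := \{x \in X : \langle \overline{x}^*, x \rangle \ge \langle \overline{x}^*, \overline{x} \rangle\}$. Since $g$ is continuous at $\overline{x} \in D$, we have $0 \in \partial g(\overline{x}) + N(D, \overline{x})$ by assertion (a) of the preceding theorem. But the second example of the present section, with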
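$c := -\langle \overline{x}^*, \overline{x} \rangle$ and $-\overline{x}^*$ instead of $g$, ensures that $N(D, \overline{x}) = -\mathbb{R}_+ \overline{x}^*$. Since $0 \notin \partial g(\overline{x})$ because $\overline{x}$ is not a minimizer of $g$, we get some $s > 0$ such that $s\overline{x}^* \in \partial g(\overline{x})$, hence $\overline{x}^* \in s^{-1} \partial g(\overline{x})$. □

The case of a finite number of inequalities is a consequence of Lemma 6.5 and of the following rule for the calculus of normal cones.

Lemma 6.6 Let $C_1, \ldots, C_k$ be convex subsets of $X$ and let $\overline{x} \in C := C_1 \cap \ldots \cap C_k$. Then
$$N(C, \overline{x}) = N(C_1, \overline{x}) + \ldots + N(C_k, \overline{x})$$
whenever one of the following assumptions is satisfied:
(a) there exist $j \in \mathbb{N}_k$ and some $z \in C_j$ that belongs to $\operatorname{int} C_i$ for all $i \neq j$;
(b) $X$ is complete, $C_1, \ldots, C_k$ are closed and for $D := \{(x, \ldots, x) : x \in X\}$, $P := C_1 \times \ldots \times C_k$, the cone $\mathbb{R}_+(P - D)$ is a closed linear subspace of $X^k$.

Proof Assumption (a) ensures that $\partial(\iota_{C_1} + \ldots + \iota_{C_k})(\overline{x}) = \partial \iota_{C_1}(\overline{x}) + \ldots + \partial \iota_{C_k}(\overline{x})$ since for $i \neq j$ the function $\iota_{C_i}$ is finite and continuous at $z \in \operatorname{dom} \iota_{C_j}$. The Attouch-Brézis theorem gives the conclusion under assumption (b) since, if $A$ is the diagonal map $x \mapsto (x, \ldots, x)$ from $X$ into $X^k$, one has $C = A^{-1}(P)$, hence $\iota_C = \iota_P \circ A$ and
$$\partial \iota_C(\overline{x}) = A^{\intercal}(\partial \iota_P(A\overline{x})) = A^{\intercal}(\partial \iota_{C_1}(\overline{x}) \times \ldots \times \partial \iota_{C_k}(\overline{x})) = \partial \iota_{C_1}(\overline{x}) + \ldots + \partial \iota_{C_k}(\overline{x}),$$
as easily checked. □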
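The next example shows the necessity of requiring some additional assumptions traditionally called "qualification conditions".

Example For $i = 1, 2$, let $C_i := B[c_i, 1]$ with $c_i := (0, (-1)^i)$ in $X := \mathbb{R}^2$ endowed with the Euclidean norm. Then $C := C_1 \cap C_2 = \{\overline{x}\}$, with $\overline{x} := (0, 0)$, hence $N(C, \overline{x}) = X^*$, but $N(C_i, \overline{x}) = \{0\} \times (-1)^{i+1}\mathbb{R}_+$ and $N(C_1, \overline{x}) + N(C_2, \overline{x}) = \{0\} \times \mathbb{R}$. □

Lemma 6.7 Let $g_i : X \to \mathbb{R}_\infty$ be convex, let $C_i := \{x \in X : g_i(x) \le 0\}$ for $i \in I := \mathbb{N}_k$, let $\overline{x} \in C := C_1 \cap \ldots \cap C_k$ and let $I(\overline{x}) := \{i \in I : g_i(\overline{x}) = 0\}$. Suppose that for all $i \in I$ the function $g_i$ is continuous at $\overline{x}$. Assume Slater's condition: there exists some $x_0 \in C_i^0 := \{x \in X : g_i(x) < 0\}$ for all $i \in I(\overline{x})$. Then, for $\overline{x}^* \in X^*$, one has $\overline{x}^* \in N(C, \overline{x})$ if and only if there exist $y_1, \ldots, y_k$ in $\mathbb{R}_+$ such that
$$\overline{x}^* \in y_1 \partial g_1(\overline{x}) + \ldots + y_k \partial g_k(\overline{x}), \qquad y_1 g_1(\overline{x}) = 0, \ldots, y_k g_k(\overline{x}) = 0. \tag{6.9}$$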
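Proof The sufficient condition is immediate: if $\overline{x}^* = y_1 x_1^* + \ldots + y_k x_k^*$ with $x_i^* \in \partial g_i(\overline{x})$ and $y_i \in \mathbb{R}_+$ with $y_i g_i(\overline{x}) = 0$, for all $x \in C$ we get $\langle \overline{x}^*, x - \overline{x} \rangle \le 0$ as the sum of the terms $y_i \langle x_i^*, x - \overline{x} \rangle \le 0$ since $x \in C_i$ and $x_i^* \in \partial g_i(\overline{x})$.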
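Let us suppose now that $\overline{x}^* \in N(C, \overline{x})$. For $i \in I \setminus I(\overline{x})$, since $g_i$ is continuous at $\overline{x}$ and $g_i(\overline{x}) < 0$, one has $\overline{x} \in \operatorname{int}(C_i)$, hence $N(C, \overline{x}) = N(C', \overline{x})$ where $C'$ is the intersection of the family $(C_i)$ for $i \in I(\overline{x})$. Given $x_0 \in C_i^0$ for all $i \in I(\overline{x})$, since $g_i$ is continuous at $\overline{x}$, for $t \in \,]0, 1[$ the point $z := (1 - t)x_0 + t\overline{x}$ belongs to $\operatorname{int} C_i$ because $g_i$ is bounded above around $z$ and $g_i(z) < 0$; hence $z \in \operatorname{int}(C')$. Thus Lemma 6.6 yields some $w_i^* \in N(C_i, \overline{x})$ such that $\overline{x}^* = w_1^* + \ldots + w_k^*$ (with $w_i^* = 0$ for $i \in I \setminus I(\overline{x})$). For $i \in I(\overline{x})$, Lemma 6.5 provides some $y_i \in \mathbb{R}_+$ and some $x_i^* \in \partial g_i(\overline{x})$ satisfying $w_i^* = y_i x_i^*$. Since, for $i \in I \setminus I(\overline{x})$, $g_i$ is continuous at $\overline{x}$, we can write $w_i^* = y_i x_i^*$ with $y_i = 0$, $x_i^*$ arbitrary in $\partial g_i(\overline{x})$, which is nonempty by Theorem 6.4. Thus relation (6.9) holds. □

This characterization and Theorem 6.11 give immediately a necessary and sufficient optimality condition for the mathematical programming problem
$$(\mathcal{M}) \qquad \text{minimize } f(x) \quad \text{subject to } x \in C := \{x \in X : g_1(x) \le 0, \ldots, g_k(x) \le 0\},$$
where $f$ is convex and $g_1, \ldots, g_k$ are as above.

Theorem 6.12 (Karush-Kuhn-Tucker Theorem) Let $f : X \to \mathbb{R}_\infty$, $g_1, \ldots, g_k$ be as in the preceding lemma and let $\overline{x} \in C$. Suppose $f$ is convex, continuous at some point of $C$, and Slater's condition holds: there exists some $x_0$ such that $g_i(x_0) < 0$ for $i \in I(\overline{x})$. Then $\overline{x}$ is a solution to $(\mathcal{M})$ if and only if there exist $y_1, \ldots, y_k$ in $\mathbb{R}_+$ such that
$$0 \in \partial f(\overline{x}) + y_1 \partial g_1(\overline{x}) + \ldots + y_k \partial g_k(\overline{x}), \qquad y_1 g_1(\overline{x}) = 0, \ldots, y_k g_k(\overline{x}) = 0.$$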
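To see the conditions of Theorem 6.12 at work, here is a minimal sketch on a small differentiable program of our own choosing (not taken from the text): minimize $(x_1 - 2)^2 + (x_2 - 1)^2$ subject to $x_1 + x_2 - 1 \le 0$ and $-x_1 \le 0$. The candidate solution $(1, 0)$ and the candidate multipliers $(2, 0)$ were computed by hand; the code only verifies stationarity, complementarity and feasibility, and compares with a brute-force search on a grid. Slater's condition holds, for instance at the point $(0.25, 0.25)$.

```python
def f(x):
    return (x[0] - 2.0) ** 2 + (x[1] - 1.0) ** 2

def grad_f(x):
    return (2.0 * (x[0] - 2.0), 2.0 * (x[1] - 1.0))

# Constraints g_i(x) <= 0, each paired with its (constant) gradient.
constraints = [
    (lambda x: x[0] + x[1] - 1.0, (1.0, 1.0)),
    (lambda x: -x[0], (-1.0, 0.0)),
]

x_bar = (1.0, 0.0)   # candidate solution (computed by hand)
y = (2.0, 0.0)       # candidate multipliers (computed by hand)

def kkt_residuals(x, y):
    gf = grad_f(x)
    # Stationarity: gradient of f plus the weighted gradients of the constraints.
    stationarity = [
        gf[j] + sum(yi * grad[j] for yi, (_, grad) in zip(y, constraints))
        for j in range(2)
    ]
    complementarity = [yi * g(x) for yi, (g, _) in zip(y, constraints)]
    feasibility = [g(x) for g, _ in constraints]
    return stationarity, complementarity, feasibility

stationarity, complementarity, feasibility = kkt_residuals(x_bar, y)
print(stationarity, complementarity, feasibility)

# Brute-force comparison on a grid of the square [-2, 2]^2.
grid = [k / 20.0 for k in range(-40, 41)]
best = min(
    f((a, b)) for a in grid for b in grid
    if all(g((a, b)) <= 0.0 for g, _ in constraints)
)
print("best grid value:", best, "value at x_bar:", f(x_bar))
```

The second multiplier vanishes because the second constraint is inactive at the candidate point, which is the complementarity relation of the theorem.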
Introducing the Lagrangian function $\ell$ by
$$\ell(x, y) := \ell_y(x) := f(x) + y_1 g_1(x) + \ldots + y_k g_k(x), \qquad x \in X,\ y \in \mathbb{R}^k,$$
and the set $K(\overline{x})$ of Karush-Kuhn-Tucker multipliers at $\overline{x}$,
$$K(\overline{x}) := \{y := (y_1, \ldots, y_k) \in \mathbb{R}^k_+ : 0 \in \partial \ell_y(\overline{x}),\ y \cdot g(\overline{x}) = 0\},$$
the above condition can be written as $y \in K(\overline{x})$. Here we use the fact that $y_i g_i(\overline{x}) \le 0$ for all $i$, so that $y_1 g_1(\overline{x}) + \ldots + y_k g_k(\overline{x}) = 0$ is equivalent to $y_i g_i(\overline{x}) = 0$ for all $i$; we also use the continuity assumption on the $g_i$'s at $\overline{x}$. Thus, in order to take the constraints into account, the condition $0 \in \partial f(\overline{x})$ of the unconstrained problem has been replaced by a similar condition with $\ell_y$ in place of $f$. Despite this justification, the multipliers $y_i$ seem to be artificial ingredients. However they cannot be neglected, as shown by Exercise 1 below, even if in solving practical problems one is led to get rid of them as soon as possible. In fact, the "marginal" interpretation we provide below shows that their knowledge is not without interest, as they provide useful information about the behavior of the value of perturbed problems. In order to shed some light on such an interpretation, let us introduce for $w := (w_1, \ldots, w_k) \in \mathbb{R}^k$ the perturbed problem
$$(\mathcal{M}_w) \qquad \text{minimize } f(x) \quad \text{subject to } x \in C_w := \{x \in X : g_i(x) + w_i \le 0,\ i \in \mathbb{N}_k\}$$
and set $G := \{(x, w) \in X \times \mathbb{R}^k : g_1(x) + w_1 \le 0, \ldots, g_k(x) + w_k \le 0\}$,
$$p(w) := \inf\{f(x) : x \in C_w\}.$$
Since $p(w) = \inf_{x \in X} P(w, x)$, with $P(w, x) := f(x) + \iota_G(x, w)$, $p$ is convex, $G$ and $P$ being convex. Let us also introduce the set $M$ of Lagrange multipliers:
$$M := \{y \in \mathbb{R}^k_+ : \inf_{x \in C} f(x) = \inf_{x \in X} \ell_y(x)\}.$$
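Theorem 6.13 Suppose $p(0)$ is finite. Then, the set $M$ of Lagrange multipliers is contained in $\partial p(0)$. If the functions $g_i$ are finite, then $M = \partial p(0)$ and for all $\overline{x}$ in the set $S$ of solutions to $(\mathcal{M})$ one has $K(\overline{x}) = M$.

It follows that the set $K(\overline{x})$ is independent of the choice of $\overline{x}$ in $S$.

Proof Let $y \in M$. Given $w \in \mathbb{R}^k$, for all $x \in C_w$ and $i = 1, \ldots, k$, we have $y_i g_i(x) \le -y_i w_i$ since $y_i \in \mathbb{R}_+$ and $g_i(x) \le -w_i$. Thus, by definition of $M$,
$$p(0) = \inf_{x \in X} \ell_y(x) \le \inf_{x \in C_w} (f(x) - \langle y, w \rangle) = \inf_{x \in C_w} f(x) - \langle y, w \rangle = p(w) - \langle y, w \rangle,$$
so that $y \in \partial p(0)$. Conversely, assume the functions $g_i$ are finite and let $y \in \partial p(0)$. We first observe that $y \in \mathbb{R}^k_+$ since for $w \in -\mathbb{R}^k_+$ we have $C \subset C_w$, hence $p(w) \le p(0)$, so that, taking for $w$ the opposites of the elements of the canonical basis of $\mathbb{R}^k$, the inequalities $\langle y, w \rangle \le p(w) - p(0) \le 0$ imply that the components of $y$ are nonnegative. Now, given $x \in X$, taking $w_i := -g_i(x)$ for $i = 1, \ldots, k$, one has $x \in C_w$, hence $f(x) \ge p(w)$ and
$$f(x) + \langle y, g(x) \rangle \ge p(w) + \langle y, g(x) \rangle \ge p(0) + \langle y, w \rangle + \langle y, g(x) \rangle = p(0),$$
so that $\inf_{x \in X} \ell_y(x) \ge p(0)$. Since for $x \in C$ one has $\langle y, g(x) \rangle \le 0$, hence $\inf_{x \in X} \ell_y(x) \le \inf_{x \in C} \ell_y(x) \le \inf_{x \in C} f(x) = p(0)$, we get $\inf_{x \in X} \ell_y(x) = p(0)$, hence $y \in M$.

Finally, let $\overline{x} \in S$ and let $y \in K(\overline{x})$. Then, as $0 \in \partial \ell_y(\overline{x})$, or $\ell_y(\overline{x}) = \inf_{x \in X} \ell_y(x)$, and $y \cdot g(\overline{x}) = 0$, we have $\ell_y(\overline{x}) = f(\overline{x}) = \inf_{x \in C} f(x)$ and we get $y \in M$. Conversely, let $y \in M$. Then $f(\overline{x}) = p(0) = \inf_{x \in X} \ell_y(x) \le \ell_y(\overline{x})$, so that $y \cdot g(\overline{x}) \ge 0$. Since for all $i = 1, \ldots, k$ we have $y_i \ge 0$ and $g_i(\overline{x}) \le 0$, the reverse inequality holds, hence $y \cdot g(\overline{x}) = 0$. Moreover, the relations $\inf_{x \in X} \ell_y(x) = p(0) = f(\overline{x}) = \ell_y(\overline{x})$ imply that $0 \in \partial \ell_y(\overline{x})$. Therefore $y \in K(\overline{x})$. □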
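The marginal interpretation of Theorem 6.13 can be illustrated on a one-dimensional program chosen for this purpose (an assumption of ours): minimize $x^2$ subject to $1 - x \le 0$. The perturbed constraint $1 - x + w \le 0$ gives $p(w) = (1 + w)^2$ for $w \ge -1$ and $p(w) = 0$ otherwise; the solution is $\overline{x} = 1$ and the Karush-Kuhn-Tucker multiplier is $y = 2$. The sketch checks on a grid that $y$ satisfies the subgradient inequality for $p$ at $0$ and that the infimum of the Lagrangian equals $p(0)$.

```python
def p(w):
    # Value of: minimize x^2 subject to (1 - x) + w <= 0, i.e. x >= 1 + w.
    return (1.0 + w) ** 2 if 1.0 + w >= 0.0 else 0.0

def lagrangian(x, y):
    return x * x + y * (1.0 - x)

y_bar = 2.0  # multiplier at the solution x_bar = 1, from 0 = 2 * x_bar - y_bar

ws = [k / 100.0 for k in range(-300, 301)]
# Subgradient inequality p(w) >= p(0) + y_bar * w on the grid.
subgradient_ok = all(p(w) >= p(0.0) + y_bar * w - 1e-12 for w in ws)
# The value 1.0 is not a subgradient of p at 0: the inequality fails for some w.
other_slope_ok = all(p(w) >= p(0.0) + 1.0 * w for w in ws)

xs = [k / 100.0 for k in range(-500, 501)]
inf_lagrangian = min(lagrangian(x, y_bar) for x in xs)

print(subgradient_ok, other_slope_ok, inf_lagrangian, p(0.0))
```

In this example $p$ is differentiable at $0$, so the multiplier is the rate of change of the optimal value with respect to the perturbation.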
Exercises
1. (a) Compute the normal cone to $\mathbb{R}_+$. (b) Given a convex function $f : \mathbb{R} \to \mathbb{R}$ give a necessary and sufficient condition in order that it attains its minimum on $C := \{x \in \mathbb{R} : x \ge 0\}$ at $0$. Taking $f(x) = x$, note that the condition $f'(0) = 0$ is not satisfied. (c) Compute the normal cone to $\mathbb{R}^n_+$ at some $\overline{x} \in \mathbb{R}^n_+$.
2. Show that the sufficient condition of the Karush-Kuhn-Tucker Theorem holds without the Slater condition and continuity assumptions.
3. State and prove a necessary and sufficient optimality condition for a program including equality constraints given by continuous affine functions.
4. (a) Use the Lagrangian $\ell : X \times \mathbb{R}^k \to \mathbb{R} \cup \{-\infty, +\infty\}$ given by
$$\ell(x, y) = f(x) + y_1 g_1(x) + \ldots + y_k g_k(x) \ \text{ for } (x, y) \in X \times \mathbb{R}^k_+, \qquad \ell(x, y) = -\infty \ \text{ if } (x, y) \in X \times (\mathbb{R}^k \setminus \mathbb{R}^k_+),$$
to formulate optimality conditions for the problem $(\mathcal{M})$. (b) Introduce a Lagrangian $\ell : X \times Y \to \overline{\mathbb{R}}$ adapted to the problem of minimizing a convex function $f$ under the constraint $x \in C := \{x \in X : g(x) \in -Z_+\}$, where $Z_+$ is a closed convex cone in a Banach space $Z$ and $g : X \to Z$ is a map whose epigraph $E := \{(x, z) \in X \times Z : z \in Z_+ + g(x)\}$ is closed and convex.
5. Let $X := \mathbb{R}$ and let $f : X \to \mathbb{R}_\infty$, $g : X \to \mathbb{R}$ be given by $f(x) := -x^\alpha$ for $x \in \mathbb{R}_+$, with $\alpha \in \,]0, 1[$, $f(x) := +\infty$ for $x < 0$, $g(x) = x$ for $x \in \mathbb{R}$. Show that there is no Karush-Kuhn-Tucker multiplier at the solution of $(\mathcal{M})$ with such data.
6. (Minimum volume ellipsoid problem) Let $(e_1, \ldots, e_n)$ be the canonical basis of $\mathbb{R}^n$ and let $S^n_{++}$ be the set of positive definite matrices of size $(n, n)$. (a) Show that the identity matrix $I$ is the unique optimal solution of the problem
$$\text{Minimize } -\log \det u, \qquad u \in S^n_{++}, \qquad \|u(e_i)\|^2 - 1 \le 0,\ i = 1, \ldots, n.$$
[Hint: Use Theorem 6.12 and a compactness argument; see [44, pp. 32, 48].] (b) Deduce from (a) the following special form of Hadamard's inequality: for $u \in S^n_{++}$ and $u_i := u(e_i)$, one has $\det(u_1, \ldots, u_n) \le \|u_1\| \ldots \|u_n\|$.
7*. Characterize the tangent cone to the positive cone $L_p(S)_+$ of $L_p(S)$ for $p \in [1, \infty[$, $S$ being a finite measure space.
6.3 The Legendre-Fenchel Transform and Its Applications

There are several instances in mathematics in which a duality can be used to transform a given problem into an associated one called the dual problem. The dual problem may appear to be more tractable and may yield useful information about
the original problem and even help to solve it entirely. For an abstract approach, see [143]. For optimization problems, the Legendre-Fenchel conjugacy is certainly the most useful duality. We present its main properties and we show how it can be used to get duality results for minimization problems. In the last subsection we show that duality enables us to give calculus rules for subdifferentials.
6.3.1 The Legendre-Fenchel Transform

Given a normed space $X$ in duality with its topological dual $X^*$ through the usual pairing $\langle \cdot, \cdot \rangle$ and a function $f : X \to \overline{\mathbb{R}}$, the knowledge of the performance function
$$f_*(x^*) := \inf_{x \in X} (f(x) - \langle x^*, x \rangle) \tag{6.10}$$
associated with the natural perturbation of $f$ by continuous linear forms is likely to give precious information about $f$, at least when $f$ is closed proper convex. Notice that a pair $(x^*, r) \in X^* \times \mathbb{R}$ is in the hypograph $H_{f_*} := \{(x^*, r) : f_*(x^*) \ge r\}$ of $f_*$ if and only if one has $f \ge x^* + r$. Thus, when $f$ is closed and proper convex, $f_*$ is related to the characterization of the epigraph of $f$ as the intersection of the upper half-spaces determined by the continuous affine forms that are less than $f$, as in Theorem 6.2. Since $f_*$ is concave (it is called the concave conjugate of $f$) and upper semicontinuous, one usually prefers to deal with the convex conjugate or Legendre-Fenchel
[Fig. 6.3 The Young-Fenchel transform in one dimension; the figure marks the value $f_*(x^*) = -f^*(x^*)$.]
conjugate (or simply Fenchel conjugate) $f^*$ of $f$ given by $f^* := -f_*$:
$$f^*(x^*) := \sup_{x \in X} (\langle x^*, x \rangle - f(x)). \tag{6.11}$$
We note that whenever the domain $\operatorname{dom} f$ of $f$ is nonempty, $f^*$ takes its values in $\mathbb{R}_\infty := \mathbb{R} \cup \{+\infty\}$; in contrast, if $f = +\infty_X$ then $f^* = -\infty_{X^*}$, the function whose sole value is $-\infty$. We also observe that $f^*$ is convex and lower semicontinuous with respect to the weak$^*$ topology on $X^*$ as a supremum of continuous affine functions. Notice that we could replace $X^*$ with another space $Y$ in duality with $X$. Then, for $g : Y \to \overline{\mathbb{R}}$ we use a similar notation for the conjugate $g^* : X \to \overline{\mathbb{R}}$ of $g$ defined by
$$g^*(x) := \sup_{y \in Y} (\langle x, y \rangle - g(y)), \qquad x \in X.$$
The computation of conjugates is eased by the calculus rules we give below. The following examples illustrate the interest of this transformation.

Examples
(a) Let $f$ be the indicator function $\iota_C$ of a subset $C$ of $X$. Then $f^*$ is the support function $h_C$ or $\sigma_C$ of $C$ given by $h_C(x^*) := \sigma_C(x^*) := \sup_{x \in C} \langle x^*, x \rangle$.
(b) Let $h_S : X^* \to \mathbb{R}_\infty$ be the support function of $S \subset X$, $S$ nonempty. Then $h_S^* = \iota_C$, where $C := \operatorname{cl}\operatorname{co}(S)$ is the closed convex hull of $S$.
(c) If $f$ is linear and continuous, then $f^*$ is the indicator function of $\{f\}$.
(d) For $f = \frac{1}{p}\|\cdot\|^p$ with $p \in \,]1, \infty[$, $q := (1 - \frac{1}{p})^{-1}$, one has $f^* = \frac{1}{q}\|\cdot\|_*^q$, where $\|\cdot\|_*$ is the dual norm.
(e) If $f = \|\cdot\|$, then $f^* = \iota_{B^*}$, the indicator function of the closed unit ball $B^*$ of $X^*$.
(f) More generally, if $f$ is positively homogeneous and $f(0) = 0$, then $f^*$ is the indicator function of $\partial f(0)$.

Other examples are given in the exercises. Examples (e) and (f) point out a connection between subdifferentials and conjugates; we will consider this question with more generality later on. Examples (a) and (b) illustrate the close relationships between functions and sets. Let us point out the potential generality of Example (a), which shows that the computation of conjugate functions can be reduced to the calculus of support functions: for any function $f : X \to \overline{\mathbb{R}}$ with epigraph $E := E_f$, the value at $x^*$ of $f^*$ satisfies the relation
$$f^*(x^*) = h_E(x^*, -1), \tag{6.12}$$
as an immediate interpretation of the definition shows (see also Exercise 1 below). The next example in the simple one-dimensional case points out a link with the Legendre transform (Fig. 6.3).
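Examples (d) and (e) can be checked numerically on $X = \mathbb{R}$ by replacing the supremum in (6.11) with a maximum over a finite grid. The sketch below is only an approximation scheme of our own devising (the helper `conjugate` and the grid are assumptions): the grid is bounded, so the value $+\infty$ of the indicator function in Example (e) shows up as a finite number that grows with the radius of the grid.

```python
def conjugate(f, y, xs):
    # Discrete approximation of f*(y) = sup_x (x * y - f(x)) over the sample points xs.
    return max(x * y - f(x) for x in xs)

p_exp = 3.0
q_exp = 1.0 / (1.0 - 1.0 / p_exp)   # conjugate exponent, here 3/2

def f_power(x):
    return abs(x) ** p_exp / p_exp

def f_abs(x):
    return abs(x)

xs = [k / 1000.0 for k in range(-5000, 5001)]   # grid on [-5, 5] with step 0.001

errors = []
for y in (-2.0, -0.5, 0.0, 1.0, 2.0):
    approx = conjugate(f_power, y, xs)
    exact = abs(y) ** q_exp / q_exp              # value predicted by Example (d)
    errors.append(abs(approx - exact))
print("largest error for Example (d):", max(errors))

inside = conjugate(f_abs, 0.7, xs)    # |y| <= 1: the supremum is 0
outside = conjugate(f_abs, 1.5, xs)   # |y| > 1: equals (|y| - 1) times the grid radius
print(inside, outside)
```

Enlarging the grid leaves the value at $y = 0.7$ unchanged and increases the value at $y = 1.5$ without bound, which is how the indicator function of $[-1, 1]$ manifests itself in this discretization.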
Example-Exercise Let $g : \mathbb{R}_+ \to \mathbb{R}$ be an increasing continuous function such that $g(0) = 0$ and $g(r) \to \infty$ as $r \to \infty$. Let $h$ be the inverse of $g$ and let $f$ be given by $f(x) := \int_0^x g(r)\,dr$. Show that $f^*$ is given by $f^*(y) = \int_0^y h(s)\,ds$.

This transform enjoys nice properties. We leave their easy proofs as exercises. Here the infimal convolution $f \,\square\, g$ of two functions $f$, $g$ is defined by $(f \,\square\, g)(x) := \inf_w (f(x - w) + g(w))$.

Proposition 6.20 The Fenchel transform satisfies the following properties:
It is antitone: for any functions $f$, $g$ with $f \le g$ one has $f^* \ge g^*$.
For any function $f$ and $c \in \mathbb{R}$, $(f + c)^* = f^* - c$.
For any function $f$ and $c > 0$, $(cf)^*(x^*) = cf^*(c^{-1}x^*)$ for all $x^* \in X^*$.
For any function $f$ and $c > 0$, if $g := f(c\,\cdot)$, then $g^* = f^*(c^{-1}\,\cdot)$.
For any function $f$ and $\ell \in X^*$, $(f + \ell)^* = f^*(\cdot - \ell)$.
For any function $f$ and $\overline{x} \in X$, $(f(\cdot + \overline{x}))^* = f^* - \langle \cdot, \overline{x} \rangle$.
For all functions $f$, $g$ one has $(f \,\square\, g)^* = f^* + g^*$.
For $p(\cdot) := \inf_{x \in X} f(\cdot, x)$, where $f : W \times X \to \overline{\mathbb{R}}$, one has $p^*(\cdot) = f^*(\cdot, 0)$.
If $h$ is positively homogeneous one has $h^* = \iota_S$, where $S := \{x^* \in X^* : x^* \le h\}$.

Let us examine whether $f^*$ enables one to recover $f$. With this aim in mind we introduce the biconjugate of $f$ as the function $f^{**} := (f^*)^*$. As above we use the same symbol for the conjugate $g^*$ of a function $g$ on $X^*$:
$$g^*(x) := \sup_{x^* \in X^*} (\langle x^*, x \rangle - g(x^*)).$$
In doing so we commit some abuse of notation since in fact we consider the restriction of $g^*$ to $X \subset X^{**}$. However, the notation is compatible with the choice of the pairing between $X$ and $X^*$. In fact, our study could be cast in the framework of topological vector spaces $X$, $Y$ in separated duality; taking for $Y$ the dual of $X$ endowed with the weak$^*$ topology, one would get $X$ as the dual of $Y$.

Theorem 6.14 For any function $f : X \to \overline{\mathbb{R}}$ one has $f^{**} \le f$. If $f$ is closed proper convex one has $f^{**} = f$. This relation also holds if $f = +\infty_X$ or $f = -\infty_X$, the constant functions with values $+\infty$ and $-\infty$, respectively.

Proof Given $x \in X$, for any function $f : X \to \overline{\mathbb{R}}$ and any $x^* \in X^*$ we have $-f^*(x^*) \le f(x) - \langle x^*, x \rangle$, hence $f^{**}(x) = \sup\{\langle x^*, x \rangle - f^*(x^*) : x^* \in X^*\} \le f(x)$. Let us suppose $f$ is closed proper convex. For any $w \in X$ and $r < f(w)$ we can find $x^* \in X^*$ and $c \in \mathbb{R}$ such that $r < \langle x^*, w \rangle - c$, $\langle x^*, x \rangle - c \le f(x)$ for all $x \in X$. Then we have $f^*(x^*) \le c$, hence $f^{**}(w) \ge \langle x^*, w \rangle - c > r$. As $w \in X$ and $r < f(w)$ are arbitrary, $f^{**} \ge f$, hence $f^{**} = f$. The cases of the constant functions $-\infty_X$, $+\infty_X$ with values $-\infty$ and $+\infty$, respectively, are immediate. □

Corollary 6.10 For any function $f : X \to \overline{\mathbb{R}}$ bounded below by a continuous affine function and with nonempty domain, the greatest closed proper convex function on $X$ bounded above by $f$ is $f^{**} \mid X$.
If $f$ is not bounded below by a continuous affine function, then $f^{**} = -\infty_X$.

Proof Let us note that the epigraph of $f^*$ is the set of $(w^*, r) \in X^* \times \mathbb{R}$ such that $f(x) \ge \langle w^*, x \rangle - r$ for all $x \in X$. The last assertion is an immediate consequence of this observation. If $g$ is a closed proper convex function satisfying $g \le f$, we have $g^* \ge f^*$ since the Fenchel transform is antitone; then $g = g^{**} \le f^{**}$. Thus, when $f \neq +\infty_X$ and $f$ is bounded below by a continuous affine function, $f^{**}$ is proper and clearly lower semicontinuous and convex, hence closed proper convex and $f^{**}$ is the greatest such function bounded above by $f$. □

Corollary 6.11 For any function $f : X \to \overline{\mathbb{R}}$ one has $f^{***} = f^*$.
Proof The result is obvious if $f^* = +\infty_{X^*}$ or if $f^* = -\infty_{X^*}$; otherwise $f^*$ is closed proper convex. □

A crucial relationship between the Fenchel conjugate and the Fenchel-Moreau subdifferential is given by the Young-Fenchel relation that follows.

Theorem 6.15 (Young-Fenchel) For any function $f : X \to \overline{\mathbb{R}}$ and for any $x \in X$, $x^* \in X^*$ one has $f(x) + f^*(x^*) \ge \langle x^*, x \rangle$. When $f(x) \in \mathbb{R}$ equality holds if and only if $x^* \in \partial f(x)$. Moreover, $x^* \in \partial f(x)$ implies $x \in \partial f^*(x^*)$.

Proof The first assertion is a direct consequence of the definition. When $f(x) \in \mathbb{R}$ the equality $f(x) + f^*(x^*) = \langle x^*, x \rangle$ is equivalent to each of the following assertions:
$$f(x) + f^*(x^*) \le \langle x^*, x \rangle,$$
$$f(x) - f(w) + \langle x^*, w \rangle \le \langle x^*, x \rangle \quad \forall w \in X,$$
$$x^* \in \partial f(x).$$
Moreover, they imply the inequality $f^{**}(x) + f^*(x^*) \le \langle x^*, x \rangle$ equivalent to $x \in \partial f^*(x^*)$. □

Theorem 6.16 For any function $f : X \to \overline{\mathbb{R}}$ and $x \in X$ one has $f^{**}(x) = f(x)$ whenever $\partial f(x) \neq \emptyset$. Moreover, when $f^{**}(x) = f(x) \in \mathbb{R}$, one has $\partial f(x) = \partial f^{**}(x)$ and $x^* \in \partial f(x)$ if and only if $x \in \partial f^*(x^*)$.

Proof Given $x^* \in \partial f(x)$, let $g : w \mapsto \langle x^*, w - x \rangle + f(x)$. Then $g$ is a continuous affine function satisfying $g \le f$, so that $g \le f^{**}$ and $g(x) = f(x) \ge f^{**}(x)$, so that $f(x) = f^{**}(x)$ and $x^* \in \partial f^{**}(x)$. Moreover, when $f^{**}(x) = f(x) \in \mathbb{R}$, the reverse inclusion $\partial f^{**}(x) \subset \partial f(x)$ follows from the relations $f^{**} \le f$, $f^{**}(x) = f(x)$. The last assertion is a consequence of the last assertion of Theorem 6.15 and of the preceding. □

Corollary 6.12 When $f^{**} = f$ the multimap $\partial f^*$ is the inverse of the multimap $\partial f$:
$$x^* \in \partial f(x) \iff x \in \partial f^*(x^*).$$
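The gap between $f$ and $f^{**}$, and the equality case of the Young-Fenchel relation, can be observed on a nonconvex function of one variable. The sketch below uses the double-well function $f(x) = (x^2 - 1)^2$, an illustrative choice of ours, and replaces both suprema by maxima over finite grids (an approximation, not the exact transform). The grids are chosen so that the points used in the comparison are grid points.

```python
def f(x):
    # Nonconvex double-well function (illustrative choice).
    return (x * x - 1.0) ** 2

xs = [j / 100.0 for j in range(-200, 201)]   # grid on [-2, 2]
ys = [k / 10.0 for k in range(-100, 101)]    # grid on [-10, 10]

# Discrete conjugate f*(y) = max_x (x * y - f(x)), tabulated on ys.
f_star = {y: max(x * y - f(x) for x in xs) for y in ys}

def f_bistar(x):
    # Discrete biconjugate f**(x) = max_y (x * y - f*(y)).
    return max(x * y - f_star[y] for y in ys)

for x in (0.0, 0.5, 1.0, 1.5):
    print(x, f(x), f_bistar(x))

# Equality case of the Young-Fenchel relation at x = 1.5 with y = f'(1.5) = 7.5.
young_fenchel_gap = f(1.5) + f_star[7.5] - 1.5 * 7.5
print("Young-Fenchel gap:", young_fenchel_gap)
```

Between the two wells the discrete biconjugate vanishes although $f$ is positive there, whereas at $x = 1.5$, where $f$ has the subgradient $7.5$, the two functions coincide, in line with Theorem 6.16.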
The following special case is of great importance for dual problems.

Corollary 6.13 When $f^{**}(0) = f(0) \in \mathbb{R}$ the set of minimizers of $f^*$ is $\partial f(0)$. For any function $g$ with finite infimum, the set $\partial g^*(0)$ is the set of minimizers of $g^{**}$. When $f = f^{**}$ and $f^*(0)$ is finite, the set $\partial f^*(0)$ is the set of minimizers of $f$.

Proof The first assertion follows from the equivalences
\[ x^* \in \partial f(0) \iff 0 \in \partial f^*(x^*) \iff x^* \text{ is a minimizer of } f^*. \]
The second one ensues by applying the first one to $g^*$, as $g^*(0) = -\inf g(X)$ and $g^{***} = g^*$. Taking $g := f$, one gets the last assertion. □
Exercises

1. Show that for any function $f : X \to \overline{\mathbb{R}}$ with nonempty domain, the support function $h_E$ of the epigraph $E := E_f$ of $f$ satisfies
\[ h_E(x^*, -1) = f^*(x^*) \quad\text{and}\quad h_E(x^*, r) = -r f^*(-r^{-1} x^*) \ \text{ for } r < 0, \]
\[ h_E(x^*, 0) = h_{\operatorname{dom} f}(x^*), \qquad h_E(x^*, r) = +\infty \ \text{ for } r > 0. \]
2. Given a function $f : X \to \mathbb{R}_\infty$ with nonempty domain, verify that
\[ \operatorname{epi} f^* \times \{-1\} = (S(Q))^0 \cap (X^* \times \mathbb{R} \times \{-1\}), \]
where $Q := \mathbb{R}_+(\operatorname{epi} f \times \{-1\})$ and $S$ is the map $(x, r, s) \mapsto (x, s, r)$, a linear isometry.
3. Show that for any function $f : X \to \overline{\mathbb{R}}$, the greatest lower semicontinuous convex function bounded above by $f$ is either $f^{**}$ or the "valley function" associated with the closed convex hull $C$ of $\operatorname{dom} f$, which takes the value $-\infty$ on $C$ and $+\infty$ on $X \setminus C$.
4. If $X$ is a normed space and $f = g \circ \|\cdot\|$, where $g : \mathbb{R}_+ \to \mathbb{R}_\infty$ is extended by $+\infty$ to the negative reals, show that $f^* = g^* \circ \|\cdot\|_*$, where $\|\cdot\|_*$ is the dual norm of $\|\cdot\|$.
5. For $X = \mathbb{R}$ and $f(x) = \exp x$, verify that $f^*(y) = y \log y - y$ for $y > 0$, $f^*(0) = 0$, $f^*(y) = +\infty$ for $y < 0$.
6. Let $f : \mathbb{R} \to \mathbb{R}_\infty$ be given by $f(x) := -\log x$ for $x \in \mathbb{P}$, $f(x) := +\infty$ for $x \in \mathbb{R}_-$. Verify that $f^*(x^*) = -\log |x^*| - 1$ for $x^* \in -\mathbb{P}$, $f^*(x^*) := +\infty$ for $x^* \in \mathbb{R}_+$.
7. Let $f : X \to \mathbb{R}_\infty$ and let $g$ be the convex hull of $f$. Show that $g^* = f^*$.
8. Let $f : X \to \mathbb{R}_\infty$ and let $h$ be the lower semicontinuous hull of $f$. Show that $h^* = f^*$.
9. (Legendre transform) Let $f : X \to \mathbb{R}_\infty$ be a lower semicontinuous proper convex function that is differentiable on its open domain $W$ and such that its derivative $f' : W \to X^*$ realizes a bijection between $W$ and $W^* := f'(W)$, with inverse $h$. Let $f^L : W^* \to \mathbb{R}$ be the Legendre transform of $f$: $f^L(w^*) := \langle w^*, h(w^*)\rangle - f(h(w^*))$. Show that $f^L$ coincides with the restriction to $W^*$ of the conjugate $f^*$ of $f$.
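The closed form for the conjugate of the exponential can be sanity-checked numerically before attempting a proof. The following sketch is an added illustration under stated assumptions: the supremum defining $f^*(y) = \sup_x (xy - f(x))$ is replaced by a maximum over a finite grid on $[-20, 5]$, and the helper names are hypothetical. A grid can only underestimate the supremum, so only values $y \ge 0$ (where the supremum is finite) are compared.

```python
import math


def conjugate_on_grid(f, y, xs):
    """Approximate f*(y) = sup_x (x*y - f(x)) by a maximum over the sample points xs."""
    return max(x * y - f(x) for x in xs)


def exp_conjugate_closed_form(y):
    """Candidate closed form: y*log(y) - y for y > 0, 0 at y = 0, +infinity for y < 0."""
    if y < 0:
        return math.inf
    if y == 0:
        return 0.0
    return y * math.log(y) - y


# Sample points x in [-20, 5] with step 0.001 (an arbitrary choice for this sketch).
XS = [i / 1000 for i in range(-20000, 5001)]

if __name__ == "__main__":
    for y in (0.0, 0.5, 1.0, 2.0, 5.0):
        approx = conjugate_on_grid(math.exp, y, XS)
        print(y, approx, exp_conjugate_closed_form(y))
```

For $f = \exp$ the derivative is its own inverse up to the logarithm, so the Legendre transform of the last exercise gives the same expression $w^* \log w^* - w^*$ on $W^* = \mathbb{P}$.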
6.3.2 A Brief Account of Convex Duality Theories

There are two main theories of convex duality. The perturbational approach is probably the most natural one, so we shall start with it. An alternative is the Lagrangian theory, which can also be cast in a general framework. We relate the two approaches and we apply them to classical problems. Given a nonempty set $X$ and an objective function $f : X \to \mathbb{R}_\infty$ which may incorporate constraints, we consider the problem

(P) minimize $f(x)$ subject to $x \in X$.

A perturbation of (P) is the data of a normed space $W$, named the space of parameters, coupled with a normed space $Y$, and of a function $P : X \times W \to \overline{\mathbb{R}}$ called a perturbation function satisfying
\[ \forall x \in X, \qquad P(x, 0) = f(x). \]
If $P(x, 0) \le f(x)$ holds for all $x \in X$ one says that $P$ is a sub-perturbation of (P). The performance function $p : W \to \overline{\mathbb{R}}$ (or value function) is given by
\[ \forall w \in W, \qquad p(w) := \inf_{x \in X} P(x, w). \]
The relation $p^{**}(0) := \sup\{-p^*(y) : y \in Y\}$ is an incentive to introduce the dual problem:

$(D_P)$ maximize $d_P(y) := -p^*(y)$ subject to $y \in Y$,

so that $\sup(D_P) = p^{**}(0)$ and $\inf(P) = p(0)$. The so-called weak duality property $\sup(D_P) \le \inf(P)$ consists in the obvious inequality $p^{**}(0) \le p(0)$. Strong duality is said to hold whenever $\sup(D_P) = \inf(P)$ and $(D_P)$ has a solution. Let us note that the "opposite" of $(D_P)$, that is the adjoint problem

$(P^*)$ minimize $-d_P(y) = p^*(y)$, $y \in Y$,

is a convex problem even if $f$ is nonconvex.
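A one-dimensional instance makes the perturbational scheme concrete. The sketch below is an added illustration, not taken from the text: it assumes $X = W = Y = \mathbb{R}$, the problem of minimizing $x^2$ subject to $x \ge 1$, and the perturbation $P(x, w) := x^2$ if $x \ge 1 + w$, $+\infty$ otherwise, so that $p(w) = \max(1 + w, 0)^2$. The infimum defining $-p^*(y) = \inf_w (p(w) - wy)$ is approximated on a finite grid, and the multipliers are restricted to $y \ge 0$, since for $y < 0$ the true dual objective is $-\infty$. All names are hypothetical.

```python
def p(w: float) -> float:
    """Performance function of the sample problem: inf of x**2 over x >= 1 + w."""
    return max(1.0 + w, 0.0) ** 2


# Grids for the parameter w and the dual variable y (arbitrary choices for this sketch).
W_GRID = [i / 100 for i in range(-500, 501)]
Y_GRID = [i / 50 for i in range(0, 201)]


def dual_objective(y: float) -> float:
    """Grid approximation of d_P(y) = -p*(y) = inf_w (p(w) - w*y)."""
    return min(p(w) - w * y for w in W_GRID)


def dual_value() -> float:
    """Grid approximation of sup(D_P) over the sampled multipliers y >= 0."""
    return max(dual_objective(y) for y in Y_GRID)


if __name__ == "__main__":
    print("inf(P) = p(0) =", p(0.0))
    print("sup(D_P) on the grid =", dual_value())
    print("dual objective at y = 2:", dual_objective(2.0))
```

In this instance $p$ is convex and differentiable at $0$ with $p'(0) = 2$, so the dual value is attained at $y = 2$ and there is no duality gap.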
To obtain duality results, one may require that $X$ be a normed vector space and $P$ be convex. Then
\[ p^*(y) = \sup_{w \in W} \Bigl(\langle y, w\rangle - \inf_{x \in X} P(w, x)\Bigr) = \sup_{(w,x) \in W \times X} \bigl(\langle (y, 0), (w, x)\rangle - P(w, x)\bigr) = P^*(y, 0) \]
and $p^{**}(0) = \sup\{-P^*(y, 0) : y \in Y\}$. But duality results can be obtained under the weaker assumption that the performance function $p$ associated with $P$ is convex, $X$ being an arbitrary set. Let us give an interpretation of the set $S^*$ of solutions to the dual problem $(D_P)$. Here, although $p$ is not necessarily convex, we set
\[ \partial p(0) := \{y \in Y : \forall w \in W, \ p(w) \ge p(0) + \langle w, y\rangle\}. \]

Proposition 6.21 Suppose $\sup(D_P) = \inf(P) \in \mathbb{R}$. Then one has $\partial p(0) = S^*$, the set of solutions to $(D_P)$. Moreover, if $p(0)$ is finite the relation $\sup(D_P) = \inf(P)$ holds whenever $\partial p(0)$ is nonempty.

Proof Suppose $\sup(D_P) = \inf(P)$, i.e. $p^{**}(0) = p(0)$. Given $\bar y \in S^*$, we have $-p^*(\bar y) = \sup_{y \in Y} (-p^*(y)) = p^{**}(0) = p(0)$, so that $\inf_{w \in W} (p(w) - \langle w, \bar y\rangle) = p(0)$, or $p(w) \ge p(0) + \langle w, \bar y\rangle$ for all $w \in W$, and $\bar y \in \partial p(0)$. Given $\bar y \in \partial p(0)$, we have $\sup_{w \in W} (\langle w, \bar y\rangle - p(w)) \le -p(0)$, hence $p^{**}(0) \ge -p^*(\bar y) \ge p(0)$. Since the inequality $p^{**}(0) \le p(0)$ is always satisfied, we obtain $p^{**}(0) = -p^*(\bar y) = p(0)$ and $\bar y \in S^*$ since $p^{**}(0) = \sup_{y \in Y} (-p^*(y))$. □

Example-Exercise (Linear Programming) Given $A \in L(\mathbb{R}^m, \mathbb{R}^n)$, $b \in \mathbb{R}^n$, $c \in \mathbb{R}^m$, consider the problem

(P) minimize $\langle c \mid x\rangle$ subject to $x \in \mathbb{R}^m$, $x \ge 0$, $Ax = b$.

Define a perturbation whose associated dual problem is as in the first example of this chapter:

(D) maximize $\langle b \mid y\rangle$ subject to $y \in \mathbb{R}^n$, $A^\top y \le c$.

Compare it to the Lagrangian dual problem described hereafter. □
The Lagrangian scheme consists in replacing (P) with a family $(P_y)_{y \in Y}$ of simpler problems

$(P_y)$ minimize $L_y(x)$ subject to $x \in X$,

where $L : X \times Y \to \overline{\mathbb{R}}$ is called a Lagrangian if $\sup_{y \in Y} L(x, y) = f(x)$ for all $x \in X$ and a sub-Lagrangian if $\sup_{y \in Y} L(x, y) \le f(x)$ for all $x \in X$. Here $L_y := L(\cdot, y)$ for all $y \in Y$. In both cases, setting
\[ d_L(y) := \inf_{x \in X} L(x, y), \tag{6.13} \]
the value $\sup(D_L)$ of the dual problem

$(D_L)$ maximize $d_L(y)$ subject to $y \in Y$

satisfies the estimate $\sup(D_L) \le \inf(P)$ (weak duality). One says that there is no duality gap when $\sup(D_L) = \inf(P)$ and one says that strong duality holds when there is no duality gap and $(D_L)$ has a solution. An element $\bar y$ of $Y$ is called a multiplier if $\inf_{x \in X} L(x, \bar y) = \inf_{x \in X} f(x)$, i.e. if $d_L(\bar y) = \inf_{x \in X} f(x)$. Since for all $y \in Y$ one has $d_L(y) \le \inf_{x \in X} f(x)$, we see that $\bar y \in Y$ is a multiplier if and only if $\bar y$ is a solution to the dual problem $(D_L)$ and there is no duality gap. Multipliers can be used to detect solutions of (P), as shown by the next statement.

Proposition 6.22 Let $L$ be a Lagrangian or a sub-Lagrangian of (P). If $\bar y$ is an element of the set $M$ of multipliers, then the set $S$ of solutions of (P) is contained in the set $S_{\bar y}$ of solutions of $(P_{\bar y})$ and $\inf(P) = \inf(P_{\bar y})$. Conversely, if for some $\bar y \in Y$ and some $\bar x \in S_{\bar y}$ one has $L(\bar x, \bar y) = f(\bar x)$, then $\bar x \in S$ and $\bar y \in M$. Under each of the preceding conditions, $(\bar x, \bar y)$ is a saddle point of $L$:
\[ L(\bar x, y) \le L(\bar x, \bar y) \le L(x, \bar y) \qquad \forall (x, y) \in X \times Y. \tag{6.14} \]

Proof Let $\bar x \in S$ and $\bar y \in M$. Since $L(\bar x, \bar y) \le f(\bar x) = \inf(P) = \inf_{x \in X} L(x, \bar y)$ by definition of a multiplier, we see that $\bar x$ minimizes $L(\cdot, \bar y)$ and $L(\bar x, \bar y) = f(\bar x) = \inf(P_{\bar y})$. Conversely, given $\bar y \in Y$ such that $f(\bar x) = L(\bar x, \bar y)$ for some $\bar x \in S_{\bar y}$, i.e. such that $L(\bar x, \bar y) \le L(x, \bar y)$ for all $x \in X$, then, since $L(x, \bar y) \le f(x)$ for all $x \in X$, we get that $\bar x$ minimizes $f$ and $f(\bar x) = \inf_{x \in X} L(x, \bar y)$: $\bar x \in S$ and $\bar y \in M$. Since in such a case the inequalities $L(\bar x, \bar y) \le L(x, \bar y)$ for all $x \in X$ can be completed with the relations $L(\bar x, y) \le f(\bar x) = L(\bar x, \bar y)$ by definition of a sub-Lagrangian, $(\bar x, \bar y)$ is a saddle point of $L$. □

Now let us show that any perturbation or sub-perturbation $P$ of (P) yields a sub-Lagrangian of (P) by setting
\[ L(x, y) := \inf_{w \in W} \bigl(P(x, w) - \langle w, y\rangle\bigr) \qquad \forall (x, y) \in X \times Y, \tag{6.15} \]
or $L_x := -(P_x)^*$ for all $x \in X$, where $L_x := L(x, \cdot)$ and $P_x := P(x, \cdot)$.
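The passage from a perturbation to a Lagrangian and the saddle point property (6.14) can be illustrated on a one-dimensional instance. The sketch below is an added illustration under stated assumptions: $X = W = Y = \mathbb{R}$, the problem of minimizing $x^2$ subject to $x \ge 1$, and the perturbation $P(x, w) := x^2$ if $x \ge 1 + w$, $+\infty$ otherwise. Carrying out the infimum in (6.15) by hand gives $L(x, y) = x^2 + y(1 - x)$ for $y \ge 0$ and $L(x, y) = -\infty$ for $y < 0$. The function names and grids are hypothetical.

```python
import math


def lagrangian(x: float, y: float) -> float:
    """L(x, y) = inf_w (P(x, w) - w*y) for the sample perturbation, evaluated in closed form."""
    if y < 0:
        return -math.inf
    return x * x + y * (1.0 - x)


# Sample points used to test the two saddle inequalities (arbitrary choices for this sketch).
X_GRID = [i / 10 for i in range(-50, 51)]
Y_GRID = [i / 10 for i in range(-20, 61)]


def is_saddle_point(x_bar: float, y_bar: float, tol: float = 1e-12) -> bool:
    """Check L(x_bar, y) <= L(x_bar, y_bar) <= L(x, y_bar) on the sample grids."""
    center = lagrangian(x_bar, y_bar)
    left = all(lagrangian(x_bar, y) <= center + tol for y in Y_GRID)
    right = all(center <= lagrangian(x, y_bar) + tol for x in X_GRID)
    return left and right


if __name__ == "__main__":
    print("(1, 2) is a saddle point on the grid:", is_saddle_point(1.0, 2.0))
    print("(1, 0) is a saddle point on the grid:", is_saddle_point(1.0, 0.0))
```

Here $\bar x = 1$ solves the primal problem and $\bar y = 2$ is a multiplier; the pair $(1, 0)$ fails the right-hand inequality because $x \mapsto L(x, 0) = x^2$ is minimized at $0$, not at $1$.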
Proposition 6.23
(a) The function $L$ deduced from a sub-perturbation $P$ via (6.15) is a sub-Lagrangian of (P). If $P$ is a perturbation and if $P_x^{**}(0) = P_x(0)$ for all $x \in X$, in particular if the function $P_x := P(x, \cdot)$ is closed proper convex, then $L$ is a Lagrangian of (P).
(b) Moreover, the objective $d_L : Y \to \overline{\mathbb{R}}$ of the sub-Lagrangian dual problem given by (6.13) coincides with the objective $d_P := -p^* : Y \to \overline{\mathbb{R}}$ of the perturbational dual problem. Thus, the values and the optimal solutions of the two dual problems are the same.
(c) The set $M$ of multipliers of the sub-Lagrangian $L$ coincides with $\partial p(0)$.

Proof (a) For all $x \in X$, by definition of $L$, we have
\[ \sup_{y \in Y} L(x, y) = \sup_{y \in Y} \bigl(-P_x^*(y)\bigr) = P_x^{**}(0) \le P_x(0) \le f(x) \]
and these relations are equalities whenever $P_x^{**}(0) = P_x(0) = f(x)$ for all $x \in X$.
(b) For all $y \in Y$ the definitions of $d_L$, $L$ and $d_P$ yield
\[ d_L(y) := \inf_{x \in X} L(x, y) := \inf_{x \in X} \inf_{w \in W} \bigl(P(x, w) - \langle w, y\rangle\bigr) = \inf_{w \in W} \bigl(p(w) - \langle w, y\rangle\bigr) = -p^*(y) =: d_P(y). \]
(c) If $\bar y \in M$ we have $\sup(D_P) = \sup(D_L) = \inf(P)$ and $\bar y \in S^* = \partial p(0)$ in view of Proposition 6.21. Conversely, if $\bar y \in \partial p(0)$, Proposition 6.21 shows that $\sup(D_P) = \inf(P)$ and $\bar y \in S^*$, so that $\bar y \in M$ since $\sup(D_L) = \sup(D_P)$. □

Now let us consider the passage from a Lagrangian to a perturbation. Given a sub-Lagrangian $L$ let us set
\[ P(x, w) := \sup_{y \in Y} \bigl(\langle y, w\rangle + L(x, y)\bigr) = (-L_x)^*(w) \qquad (x, w) \in X \times W. \tag{6.16} \]

Proposition 6.24 Given a Lagrangian (resp. a sub-Lagrangian) $L$, the function $P$ defined by (6.16) is a perturbation (resp. a sub-perturbation) of (P). Moreover, the function $K : X \times Y \to \overline{\mathbb{R}}$ given by $-K_x := (-L_x)^{**}$ for all $x \in X$ is a sub-Lagrangian of (P) and $K$ is the sub-Lagrangian associated with $P$. In particular, the dual objective function of $K$ coincides with the dual objective function of $P$. Thus, when $(-L_x)^{**} = -L_x$ for all $x \in X$ the dual problem $(D_P)$ associated with $P$ coincides with the Lagrangian dual problem $(D_L)$.

Proof Since for all $x \in X$ one has $P(x, 0) := \sup_{y \in Y} L(x, y) \le f(x)$, the first assertion is an immediate consequence of the definitions. Since for all $(x, y) \in X \times Y$ one has $-L_x(\cdot) \ge -f(x)$, taking biconjugates one gets
\[ -K(x, y) := (-L_x)^{**}(y) \ge -f(x), \]
so that $K$ is a sub-Lagrangian of (P) and $K \ge L$. Since for all $x \in X$ one has $P_x = (-L_x)^*$ and $-K_x = (-L_x)^{**}$, we have $-K_x = (P_x)^*$, so that $K$ is the sub-Lagrangian associated with $P$. Then, by Proposition 6.23 (b), the dual objective function $d_K$ associated with $K$ coincides with $d_P$. When $(-L_x)^{**} = -L_x$ for all $x \in X$ one has $K = L$, so that the dual problem associated with $P$, which coincides with the dual problem associated with $K$, also coincides with the dual problem associated with $L$. □

Thus, the Lagrangians $L$ such that $-L_x = (-L_x)^{**}$ for all $x \in X$ are in one-to-one correspondence with the perturbations $P$ of (P) such that $P_x = (P_x)^{**}$.

Mathematical programming problems can be cast in such frameworks. Given a nonempty set $X$, a normed vector space $W$, a multimap $G : X \rightrightarrows W$ with inverse $F : W \rightrightarrows X$ and a function $f : X \to \mathbb{R}_\infty$, let us consider the problem

(M) minimize $f(x)$ subject to $x \in F(0) := G^{-1}(0)$.

It is natural to take as a perturbation of (M) the function $P : X \times W \to \mathbb{R}_\infty$ given by
\[ \forall (x, w) \in X \times W, \qquad P(x, w) := f(x) + \iota_{F(w)}(x) = f(x) + \iota_G(x, w), \]
where $\iota_G$ is the indicator function of (the graph of) $G$. In such a case, the (sub-)Lagrangian $L$ associated with $P$ is given by
\[ L(x, y) = -(P_x)^*(y) = \inf_{w \in W} \bigl(f(x) + \iota_{G(x)}(w) - \langle w, y\rangle\bigr) = f(x) - \sigma_{G(x)}(y), \]
where for a subset $C$ of $W$, $\sigma_C$ is the support function of $C$. If the values of $G$ are closed convex, then $P_x = P_x^{**}$ for all $x \in X$ and $L$ is a Lagrangian. When $G$ is given by $G(x) := C - g(x)$, where $C$ is a subset of $W$ and $g : X \to W$ is a map, one gets $P(x, w) = f(x) + \iota_C(g(x) + w)$ and
\[ L(x, y) = f(x) + \langle g(x), y\rangle - \sigma_C(y), \]
a familiar form when $C$ is a convex cone of $W$, since in such a case one has $\sigma_C = \iota_{C^0}$, where $C^0$ is the polar cone of $C$. The classical mathematical programming problem

$(M_{f,g})$ minimize $f(x)$ subject to $g_i(x) = 0$, $g_j(x) \le 0$, $i \in \mathbb{N}_k$, $j \in \mathbb{N}_m + k$

corresponds to the case $W := \mathbb{R}^{k+m}$ and $C := \{0_{\mathbb{R}^k}\} \times (-\mathbb{R}^m_+)$; then $C^0 := \mathbb{R}^k \times \mathbb{R}^m_+$ and the Lagrangian is given by
\[ L(x, y) = f(x) + \sum_{i=1}^{k+m} y_i g_i(x) \qquad (x, y) := (x, y_1, \dots, y_{k+m}) \in X \times \mathbb{R}^k \times \mathbb{R}^m_+, \]
\[ L(x, y) = -\infty \quad \text{if } y \in \mathbb{R}^{k+m} \setminus (\mathbb{R}^k \times \mathbb{R}^m_+). \]
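Example Let us consider the linear programming problem

(P) minimize $\langle c, x\rangle$ under the constraints $x \in \mathbb{R}^n_+$, $Ax \le b$,

where $A \in L(\mathbb{R}^n, \mathbb{R}^m)$, $b \in \mathbb{R}^m$ and $c \in \mathbb{R}^n$ is identified with an element of the dual of $\mathbb{R}^n$. Setting $f := c$, $C := -\mathbb{R}^m_+$, $X := \mathbb{R}^n_+$, $g(x) := Ax - b$, since $C^0 = \mathbb{R}^m_+$ and since for $y \in \mathbb{R}^m_+$ one has
\[ d(y) := \inf_{x \in \mathbb{R}^n_+} \bigl(\langle c, x\rangle + \langle y, Ax - b\rangle\bigr) = \begin{cases} -\langle y, b\rangle & \text{if } c + A^\top y \in \mathbb{R}^n_+, \\ -\infty & \text{otherwise,} \end{cases} \]
the dual problem can be written as

(D) maximize $-\langle b, y\rangle$ under the constraints $y \in \mathbb{R}^m_+$, $A^\top y \ge -c$.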
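A tiny instance shows the primal and dual values of this pair agreeing. The sketch below is an added illustration, not part of the text: the data $c = (-1, -2)$, $A = (1 \ \ 1)$, $b = 4$ are assumptions chosen so that both optima lie on a coarse grid, the dual is read as maximizing $-\langle b, y\rangle$ over $y \ge 0$ with $A^\top y \ge -c$, and brute-force enumeration stands in for a real linear programming solver. All names are hypothetical.

```python
import math
from itertools import product

# Sample data (assumptions of this sketch): n = 2 variables, m = 1 constraint.
C = (-1.0, -2.0)
A = ((1.0, 1.0),)
B = (4.0,)


def primal_value(step: float = 0.5, upper: float = 6.0) -> float:
    """Minimize <c, x> over grid points x >= 0 with A x <= b (brute force)."""
    points = [k * step for k in range(int(upper / step) + 1)]
    best = math.inf
    for x in product(points, repeat=len(C)):
        feasible = all(
            sum(a * xi for a, xi in zip(row, x)) <= b_i for row, b_i in zip(A, B)
        )
        if feasible:
            best = min(best, sum(c * xi for c, xi in zip(C, x)))
    return best


def dual_function(y) -> float:
    """d(y) = inf over x >= 0 of <c, x> + <y, A x - b>, for y >= 0."""
    reduced = [C[j] + sum(A[i][j] * y[i] for i in range(len(B))) for j in range(len(C))]
    if any(r < 0 for r in reduced):
        return -math.inf  # the infimum over x >= 0 is unbounded below
    return -sum(b_i * y_i for b_i, y_i in zip(B, y))


def dual_value(step: float = 0.25, upper: float = 5.0) -> float:
    """Maximize d(y) over grid points y >= 0 (brute force)."""
    points = [k * step for k in range(int(upper / step) + 1)]
    return max(dual_function(y) for y in product(points, repeat=len(B)))


if __name__ == "__main__":
    print("primal value:", primal_value())
    print("dual value:  ", dual_value())
```

In this instance the primal optimum is attained at $x = (0, 4)$ and the dual optimum at $y = 2$, the smallest multiplier for which $c + A^\top y$ has no negative component.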
Thus the dual problem has a form similar to the form of the primal problem. If $m$ is smaller than $n$, the dual problem may be easier to solve than the primal problem. Then, using Proposition 6.22, one may use a solution to (D) to solve the primal problem. Moreover, the interpretation of the set of multipliers in terms of $\partial p(0)$ yields precious sensitivity results. In the exercises other examples are provided.

A case of special interest is the minimization problem
\[ (P_{f,g,h}) \qquad \text{minimize } f(x) + h(g(x)), \quad x \in D, \tag{6.17} \]
where $f : X \to \mathbb{R}_\infty$, $g : D \to W$ and $h : W \to \mathbb{R}_\infty$, $X$, $W$ being Banach spaces, $D$ being a subset of $X$. When $h$ is the indicator function $\iota_C$ of a subset $C$ of $W$, $(P_{f,g,h})$ amounts to the minimization of $f$ over $D \cap g^{-1}(C)$. A natural perturbation of problem $(P_{f,g,h})$ is given by $P(x, w) := f(x) + h(g(x) + w)$. The objective function $-p^*$ of (D) can easily be expressed in terms of the data:
\begin{align*}
-p^*(y) &= \inf_{w \in W} \bigl(p(w) - \langle y, w\rangle\bigr) = \inf_{w \in W} \inf_{x \in D} \bigl(f(x) + h(g(x) + w) - \langle y, w\rangle\bigr) \\
&= \inf_{x \in D} \inf_{w \in W} \bigl(f(x) + \langle y, g(x)\rangle + h(g(x) + w) - \langle y, g(x) + w\rangle\bigr) \\
&= \inf_{x \in D} \Bigl[f(x) + \langle y, g(x)\rangle + \inf_{z \in W} \bigl(h(z) - \langle y, z\rangle\bigr)\Bigr] \\
&= \inf_{x \in D} \bigl[f(x) + \langle y, g(x)\rangle\bigr] - h^*(y).
\end{align*}
When $D := X$ and $h = \iota_C$, with $C$ a convex cone in $W$, as in problem (M) above with $G(x) := C - g(x)$, $h^*$ is the indicator function of the polar cone $C^0$ and the function $\ell$ given by
\[ \ell(x, y) := f(x) + \langle y, g(x)\rangle - \iota_{C^0}(y), \]
called the Lagrangian of (M), corresponds to the Lagrangian $L$ introduced in Proposition 6.23.
Exercises 1. Let C be a closed convex subset of a normed space X, and let C be the support function of C given by C .x / WD supfhx ; xi W x 2 Cg for x 2 X . Prove that dC D .C C BX / . 2. If C is a subset of a normed space X, the signed distance to C is the function dC˙ given by dC˙ .x/ WD dC .x/ if x 2 XnC, dC˙ .x/ WD dXnC .x/ for x 2 C. (a) Show that dC˙ is convex when C is convex. (b) Let C be a closed convex subset of X, let S be the indicator function of the unit sphere in X , and let C be the support function of C. Prove that dC˙ D .C C S / . (c) Suppose C is a nonempty open convex subset of X and let w 2 WnC. Let s W x 7! 2w x. Verify that C \ s.C/ D ¿ and use a separation theorem. Prove the relation inffkw xk W x 2 XnCg D supfhx ; wi C .x / W x 2 X nB.0; 1/g: (d) Show that if the infimum is attained at some x 2 XnC, then there exists some x 2 X such that x 2 S.x w/ WD fx 2 SX W hx ; x wi D kx wkg (Fig. 6.4).
Fig. 6.4 Duality between distance and support functions
3. Assuming $W$ is a normed vector space, show that the weak duality inequality $\inf(P) + \inf(P^*) \ge 0$ stems from the Fenchel inequality $P(0, x) + P^*(w^*, 0) \ge \langle 0, x\rangle + \langle w^*, 0\rangle = 0$.
4. Verify that there is no Lagrange multiplier for the problem of minimizing the function $f : D := \mathbb{R}_+ \to \mathbb{R}$ given by $f(x) = -\sqrt{x}$ under the constraint $g(x) \le 0$ for $g(x) = x$, although it has $0$ as optimal solution.
5. Show that the dual problem of the quadratic programming problem

(P) minimize $\tfrac12 \langle Qx, x\rangle + \langle c, x\rangle$ under the constraints $x \in \mathbb{R}^n$, $Ax \le b$

when $Q$ is positive definite can be written as

(D) maximize $-\tfrac12 \langle A Q^{-1} A^\top y, y\rangle - \langle b + A Q^{-1} c, y\rangle - \tfrac12 \langle Q^{-1} c, c\rangle$.

Show how the solution to the dual problem can be used to solve the primal problem.
6. (Bourass-Giner) Let $p$ be the performance function associated to the perturbation $P$ of problem $(P_{f,g,h})$. A criterion for the convexity of $p$ can be given by considering the set
\[ E_{f,g} := \{(w, r) \in W \times \mathbb{R} : \exists x \in g^{-1}(w) \cap D, \ f(x) < r\}. \]
The pair $(f, g)$ is said to be convex-like if $E_{f,g}$ is convex. Verify that $E_{f,g}$ is the strict epigraph of the performance function $q$ given by $q(w) := \inf\{f(x) : x \in D, \ g(x) = w\}$, so that $(f, g)$ is convex-like if and only if $q$ is convex. Show that if $(f, g)$ is convex-like and if $h$ is convex, then $p$ is convex. [Hint: $p$ is the infimal convolution $h \,\square\, \tilde q$ of $h$ and $\tilde q$, where $\tilde q(w) := q(-w)$ for $w \in W$.]
7. (Geometric programming) Let $G(X)$ be the class of functions on $X$ that are finite sums of functions of the form $x \mapsto c \log(\exp\langle a_1, x\rangle + \dots + \exp\langle a_m, x\rangle)$ for some $a_i \in X^*$ ($i \in \mathbb{N}_m$), $c > 0$. Given $g_0, g_1, \dots, g_k$ in $G(X)$, write down a Lagrangian dual problem for the problem of minimizing $g_0(x)$ under the constraints $g_i(x) \le 0$ ($i \in \mathbb{N}_k$) and give a duality result.
8. (Allocation problem) Assume that a given quantity $x_0$ of some commodity has to be allocated among $n$ distinct activities or agents. Suppose the return of activity $i$ for an allocation $x_i$ is $g_i(x_i)$, where $g_i$ is an increasing concave function (in view of the law of diminishing marginal returns). The allocation problem (A) consists in finding some $x := (x_1, \dots, x_n) \in \mathbb{R}^n$ that maximizes the total return $g(x) := g_1(x_1) + \dots + g_n(x_n)$ subject to $x_1 + \dots + x_n = x_0$, $x_i \in \mathbb{R}_+$. Convert this problem into a convex minimization problem and use duality to show that it can be reduced to a one-variable maximization problem (D) once the conjugates of the functions $f_i := -g_i$ have been computed, and then choosing $x_i$ in such a way that $\bar y x_i - g_i(x_i) = \inf_s \{\bar y s - g_i(s)\}$, where $\bar y$ is the solution to (D). [See [189, p. 202].]
6.3.3 Duality and Subdifferentiability Results

Let us first give criteria for duality results. Then we shall apply them to subdifferentiability rules. We start with a perturbation $P$ of problem (P) and its associated performance function $p$.

Proposition 6.25 Suppose $p$ is convex and $\inf(P)$ is finite. Let $V$ be the vector space generated by $\operatorname{dom} p$. Suppose there exist some $r > 0$, $m \in \mathbb{R}$ and some map $w \mapsto x(w)$ from $rB_W \cap V$ to $X$ such that $P(w, x(w)) \le m$ for all $w \in rB_W \cap V$. Then $p|_V$ is continuous, $p$ is subdifferentiable at $0$ and strong duality holds. In particular, if for some $r > 0$, $m \in \mathbb{R}$, and $\bar x \in X$ one has $P(w, \bar x) \le m$ for all $w \in rB_W \cap V$, then strong duality holds.

Proof Under our assumption, $p$ is bounded above by $m$ on $rB_W \cap V$ since for $w \in rB_W \cap V$ one has $p(w) \le P(w, x(w))$. Thus $p|_V$ is continuous at $0$ and, by Corollary 6.5, $p$ is subdifferentiable at $0$. □

From the preceding proposition one can get the subdifferentiability rules under continuity assumptions we have seen previously (exercise). We rather deduce new subdifferentiability rules under closedness assumptions and algebraic assumptions that are quite convenient.

Theorem 6.17 Let $W$, $X$ be Banach spaces and let $p$ be the performance function associated with a perturbation $P : W \times X \to \mathbb{R}_\infty$ that is convex, lower semicontinuous and such that
\[ Z := \bigcup_{x \in X} \mathbb{R}_+ \operatorname{dom} P(\cdot, x) = -Z = \operatorname{cl}(Z). \tag{6.18} \]
Then, if $p(0) \in \mathbb{R}$, $p$ is subdifferentiable at $0$ and strong duality holds.

Note that assumption (6.18) means that $Z$ is a closed vector subspace of $W$. It is obviously satisfied if $Z = W$, i.e. if $\operatorname{dom} p$ is absorbing or $0 \in \operatorname{core}(\operatorname{dom} p)$.

Proof By Corollary 6.5, we may suppose $Z = W$. The set
\[ F := \{(x, r, w) \in X \times \mathbb{R} \times W : P(w, x) \le r\} \]
is closed and convex as the image of the epigraph of $P$ under the isomorphism $(x, w, r) \mapsto (x, r, w)$. Relation (6.18) means that the projection $C := p_W(F)$ of $F$ is absorbing, i.e. $0 \in \operatorname{core} C$. The Robinson-Ursescu Theorem ensures that $F$, considered as a multimap from $X \times \mathbb{R}$ to $W$, is open at any $(x, r) \in X \times \mathbb{R}$ such that $(x, r, 0) \in F$; in particular, there exists some $c > 0$ such that
\[ cB_W \subset F\bigl((x, r) + B_{X \times \mathbb{R}}\bigr). \]
Thus, for all $w \in cB_W$ there exists some $(x_w, r_w) \in B[(x, r), 1]$ such that $(x_w, r_w, w) \in F$, i.e. $P(w, x_w) \le r_w \le m := |r| + 1$. Thus $p$ is bounded above by $m$ on $cB_W$, hence $p$ is continuous at $0$ and subdifferentiable at $0$. The preceding proposition entails that strong duality holds. □

Let us present some consequences for subdifferential calculus and the calculus of conjugates. In the next theorem we gather a sum rule and a composition rule. It generalizes Theorems 6.8 and 6.9 (Exercise 1). Again, for a function $h : Z \to \overline{\mathbb{R}}$ and $s \in \mathbb{R}$, we set $\{h \le s\} := h^{-1}(]-\infty, s])$. Here we have to make a change of notation in order to capture this classical result.

Theorem 6.18 (Fenchel-Rockafellar) Let $X$, $Y$ be normed spaces, let $A : X \to Y$ be a continuous linear map and let $f : X \to \mathbb{R}_\infty$, $g : Y \to \mathbb{R}_\infty$ be convex functions such that there exist $r > 0$, $s \in \mathbb{R}_+$ for which
\[ rB_Y \subset A(\{f \le s\} \cap sB_X) - \{g \le s\}. \tag{6.19} \]
Then, with the usual convention that an infimum is denoted by min whenever it is attained when finite, for all $x^* \in X^*$ one has
\[ (f + g \circ A)^*(x^*) = \min_{y^* \in Y^*} \bigl(f^*(x^* - A^\top y^*) + g^*(y^*)\bigr). \tag{6.20} \]
Moreover, for any $x \in \operatorname{dom} f \cap A^{-1}(\operatorname{dom} g)$ one has
\[ \partial (f + g \circ A)(x) = \partial f(x) + A^\top(\partial g(Ax)). \tag{6.21} \]
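Formula (6.20) can be checked numerically in the simplest setting. The sketch below is an added illustration under stated assumptions: $X = Y = \mathbb{R}$, $f(x) = x^2/2$, $g(y) = y^2/2$ (so that $f^* = f$ and $g^* = g$), and $A$ is multiplication by a scalar $a$. Both sides of (6.20) are approximated on a finite grid, and the function names are hypothetical.

```python
def conjugate_of_sum(s: float, a: float, xs) -> float:
    """Grid approximation of (f + g o A)*(s) = sup_x (s*x - x**2/2 - (a*x)**2/2)."""
    return max(s * x - 0.5 * x * x - 0.5 * (a * x) ** 2 for x in xs)


def dual_side(s: float, a: float, ys) -> float:
    """Grid approximation of min_y (f*(s - a*y) + g*(y)) with f* = f and g* = g."""
    return min(0.5 * (s - a * y) ** 2 + 0.5 * y * y for y in ys)


def closed_form(s: float, a: float) -> float:
    """Value of both sides for this quadratic example: s**2 / (2 * (1 + a**2))."""
    return s * s / (2.0 * (1.0 + a * a))


# Sample points in [-5, 5] with step 0.001 (an arbitrary choice for this sketch).
GRID = [i / 1000 for i in range(-5000, 5001)]

if __name__ == "__main__":
    s, a = 3.0, 2.0
    print("left-hand side :", conjugate_of_sum(s, a, GRID))
    print("right-hand side:", dual_side(s, a, GRID))
    print("closed form    :", closed_form(s, a))
```

For these data the minimum on the right-hand side is attained at $y^* = as/(1 + a^2)$, which illustrates why the theorem writes min rather than inf.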
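Proof Let $W := Y$, let $x^* \in X^*$ and let $P : W \times X \to \mathbb{R}_\infty$ be given by
\[ P(w, x) := f(x) - \langle x^*, x\rangle + g(Ax + w). \]
For all $w \in rB_W$, (6.19) yields $x_w \in \{f \le s\} \cap sB_X$ and $y_w \in \{g \le s\}$ such that $w = y_w - A x_w$. Then, the performance function $p$ given by $p(w) := \inf_{x \in X} P(w, x)$ satisfies
\[ p(w) \le P(w, x_w) = f(x_w) - \langle x^*, x_w\rangle + g(y_w) \le 2s + s\|x^*\|, \]
and strong duality holds. Now, for $y^* \in Y^*$, setting $y := Ax + w$, one has
\begin{align*}
P^*(y^*, 0) &= \sup_{(w,x) \in W \times X} \bigl(\langle y^*, w\rangle + \langle x^*, x\rangle - f(x) - g(Ax + w)\bigr) \\
&= \sup_{x \in X} \bigl(\langle x^*, x\rangle - \langle y^*, Ax\rangle - f(x)\bigr) + \sup_{y \in W} \bigl(\langle y^*, y\rangle - g(y)\bigr) \\
&= f^*(x^* - A^\top y^*) + g^*(y^*),
\end{align*}
so that (6.20) follows from the relation
\[ (f + g \circ A)^*(x^*) = -\inf_{x \in X} P(0, x) = -\inf(P) = \min(P^*) = \min_{y^* \in Y^*} P^*(y^*, 0) = \min_{y^* \in Y^*} \bigl(f^*(x^* - A^\top y^*) + g^*(y^*)\bigr). \]
Now if $x^* \in \partial j(x)$, with $j := f + g \circ A$, one has $j(x) + j^*(x^*) - \langle x^*, x\rangle = 0$ and there exists some $y^* \in Y^*$ such that $j^*(x^*) = f^*(x^* - A^\top y^*) + g^*(y^*)$, hence
\[ 0 = \bigl[f(x) + f^*(x^* - A^\top y^*) - \langle x^* - A^\top y^*, x\rangle\bigr] + \bigl[g(Ax) + g^*(y^*) - \langle A^\top y^*, x\rangle\bigr]. \]
Since both terms between brackets are nonnegative, they are null. Thus $x^* - A^\top y^* \in \partial f(x)$, $y^* \in \partial g(Ax)$, and the non-trivial inclusion of equality (6.21) holds. □

Corollary 6.14 Let $X$, $Y$ be normed spaces, let $A : X \to Y$ be a continuous linear map and let $f : X \to \mathbb{R}_\infty$, $g : Y \to \mathbb{R}_\infty$ be convex functions such that for some $\bar x \in \operatorname{dom} f$ the function $g$ is finite and continuous at $A\bar x$. Then, for all $x^* \in X^*$ and $x \in \operatorname{dom} f \cap A^{-1}(\operatorname{dom} g)$ one has
\[ (f + g \circ A)^*(x^*) = \min_{y^* \in Y^*} \bigl(f^*(x^* - A^\top y^*) + g^*(y^*)\bigr), \tag{6.22} \]
\[ \partial (f + g \circ A)(x) = \partial f(x) + A^\top(\partial g(Ax)). \tag{6.23} \]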
Proof Let $s > \max(\|\bar x\|, f(\bar x), g(A\bar x))$. Since $g$ is continuous at $A\bar x$, one can find $r > 0$ such that $g(y) < s$ for all $y \in B[A\bar x, r]$. Then $rB_Y \subset A(\bar x) - \{g \le s\}$ and condition (6.19) is satisfied. The preceding theorem applies. □

In Banach spaces, under semicontinuity assumptions a similar conclusion can be obtained under a transversality condition that is often just of an algebraic nature.

Theorem 6.19 (Attouch-Brézis) Let $X$, $Y$ be Banach spaces, let $A \in L(X, Y)$, and let $f : X \to \mathbb{R}_\infty$, $g : Y \to \mathbb{R}_\infty$ be closed proper convex functions. If the cone
\[ Z := \mathbb{R}_+ \bigl(A(\operatorname{dom} f) - \operatorname{dom} g\bigr) \]
is closed and symmetric (i.e., $Z = -Z = \operatorname{cl} Z$) then, for all $x^* \in X^*$ and $x \in \operatorname{dom} f \cap A^{-1}(\operatorname{dom} g)$, relations (6.22) and (6.23) hold.

Note that the assumption on $Z$ means that $Z$ is a closed linear subspace. It is obviously satisfied when the simple algebraic condition that follows is fulfilled:
\[ Y = \mathbb{R}_+ \bigl(A(\operatorname{dom} f) - \operatorname{dom} g\bigr). \tag{6.24} \]

Proof Taking $W := Y$, we define the perturbation function $P$ as in the preceding proof. Then, for $x \in X$, we have $w \in \operatorname{dom} P(\cdot, x)$ if and only if $x \in \operatorname{dom} f$ and $w \in \operatorname{dom} g - Ax$, so that the cone generated by the union over $x$ of $\operatorname{dom} P(\cdot, x)$ is $\mathbb{R}_+(\operatorname{dom} g - A(\operatorname{dom} f))$, the closed linear subspace $Z$. Then Theorem 6.17 ensures that strong duality holds and the proof can be finished in the same way as the preceding one. □

As an application of the Fenchel-Rockafellar theorem let us give a dual expression of the distance function to a closed convex subset.

Proposition 6.26 The distance function $d_C$ to a nonempty closed convex subset $C$ of a normed space $X$ and its support function $h_C := \sigma_C$ are linked by the following relations for all $\bar x \in X$, $B^*$ denoting the closed unit ball $B_{X^*}$ of $X^*$:
\[ d_C(\bar x) = \max_{x^* \in B^*} \bigl(\langle \bar x, x^*\rangle - h_C(x^*)\bigr), \]
\[ \tfrac12 d_C^2(\bar x) = \max_{x^* \in X^*} \inf_{x \in C} \bigl(\langle \bar x - x, x^*\rangle - \tfrac12 \|x^*\|^2\bigr). \]
If $C$ is a cone with polar cone $C^0$ then
\[ d_C(\bar x) = \max_{x^* \in B^* \cap C^0} \langle \bar x, x^*\rangle, \qquad \tfrac12 d_C^2(\bar x) = \max_{x^* \in C^0} \bigl(\langle \bar x, x^*\rangle - \tfrac12 \|x^*\|^2\bigr). \]
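The two dual expressions of the distance can be tested on an interval of the real line. The sketch below is an added illustration under stated assumptions: $X = \mathbb{R}$, $C = [lo, hi]$, so that $h_C(y) = \max(lo \cdot y, hi \cdot y)$ and the dual unit ball is $[-1, 1]$. The maxima are taken over finite grids and the function names are hypothetical. For the squared distance the maximizer has absolute value $d_C(\bar x)$, which exceeds $1$ in the first sample below, so the search range is taken larger than the unit ball.

```python
def support_interval(lo: float, hi: float, y: float) -> float:
    """Support function of the interval [lo, hi]: sup of x*y over x in [lo, hi]."""
    return max(lo * y, hi * y)


def distance_interval(lo: float, hi: float, x: float) -> float:
    """Distance from x to the interval [lo, hi], computed directly."""
    return max(lo - x, 0.0, x - hi)


def dual_distance(lo: float, hi: float, x: float, n: int = 2000) -> float:
    """Grid maximum of x*y - h_C(y) over the dual unit ball y in [-1, 1]."""
    ys = [k / n for k in range(-n, n + 1)]
    return max(x * y - support_interval(lo, hi, y) for y in ys)


def dual_half_squared_distance(lo: float, hi: float, x: float, radius: int = 5, n: int = 400) -> float:
    """Grid maximum of x*y - y**2/2 - h_C(y) over y in [-radius, radius]."""
    ys = [k / n for k in range(-radius * n, radius * n + 1)]
    return max(x * y - 0.5 * y * y - support_interval(lo, hi, y) for y in ys)


if __name__ == "__main__":
    for x in (3.0, -2.0, 0.5):
        d = distance_interval(0.0, 1.0, x)
        print(x, d, dual_distance(0.0, 1.0, x), 0.5 * d * d, dual_half_squared_distance(0.0, 1.0, x))
```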
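Proof Considering the closed convex functions $f_1$, $f_2 : X \to \mathbb{R}$ given by $f_1(x) := \|x - \bar x\|$, $f_2(x) := (1/2)\|x - \bar x\|^2$, the Fenchel-Rockafellar theorem yields
\begin{align*}
d_C(\bar x) &= \inf_{x \in X} \bigl(f_1(x) + \iota_C(x)\bigr) = \max_{x^* \in X^*} \bigl(-f_1^*(-x^*) - \iota_C^*(x^*)\bigr) \\
&= \max_{x^* \in B^*} \bigl(\langle \bar x, x^*\rangle - h_C(x^*)\bigr),
\end{align*}
\begin{align*}
\tfrac12 d_C^2(\bar x) &= \inf_{x \in X} \bigl(f_2(x) + \iota_C(x)\bigr) = \max_{x^* \in X^*} \bigl(-f_2^*(-x^*) - \iota_C^*(x^*)\bigr) \\
&= \max_{x^* \in X^*} \bigl(\langle \bar x, x^*\rangle - \tfrac12 \|x^*\|^2 - h_C(x^*)\bigr) \\
&= \max_{x^* \in X^*} \inf_{x \in C} \bigl(\langle \bar x - x, x^*\rangle - \tfrac12 \|x^*\|^2\bigr).
\end{align*}
When $C$ is a cone, one has $h_C(x^*) = \iota_{C^0}(x^*)$ and the expressions can be simplified as stated. □

Proposition 6.27 For a nonempty closed convex subset $C$ of $X$ the conjugates and the subdifferentials of the functions $d_C$ and $\tfrac12 d_C^2$ are such that
\[ (d_C)^* = \iota_{B_{X^*}} + h_C, \qquad (\tfrac12 d_C^2)^* = \tfrac12 \|\cdot\|_*^2 + h_C, \]
\[ \partial d_C(x) = \partial \|\cdot\|(x - w) \cap N(C, w), \qquad \partial(\tfrac12 d_C^2)(x) = J(x - w) \cap N(C, w) \]
for $w \in P_C(x) := \{z \in C : \|x - z\| = d_C(x)\}$ assumed to be nonempty and $J(x - w) = \partial(\tfrac12 \|\cdot\|^2)(x - w)$. In particular, for $x \in C$ one has
\[ \partial d_C(x) = B_{X^*} \cap N(C, x), \qquad \partial(\tfrac12 d_C^2)(x) = \{0\}. \]

Proof Since $d_C = \|\cdot\| \,\square\, \iota_C$ and $\tfrac12 d_C^2 = (\tfrac12 \|\cdot\|^2) \,\square\, \iota_C$, we have $d_C^* = \iota_{B_{X^*}} + h_C$ and $(\tfrac12 d_C^2)^* = \tfrac12 \|\cdot\|_*^2 + h_C$ by Proposition 6.20 and the relation $(\tfrac12 \|\cdot\|^2)^* = \tfrac12 \|\cdot\|_*^2$. For the calculus of $\partial d_C$ and $\partial(\tfrac12 d_C^2)$ one applies Corollary 6.9. □

Let us give a generalization of a result about infimal convolution we shall use later on. Given Banach spaces $X$, $Y$ and $h, k : X \times Y \to \mathbb{R}_\infty$ we introduce the partial convolutions $h \,\square_1\, k$ and $h \,\square_2\, k$ by
\[ (h \,\square_1\, k)(x, y) := \inf\{h(u, y) + k(v, y) : u + v = x\} \qquad (x, y) \in X \times Y, \]
\[ (h \,\square_2\, k)(x, y) := \inf\{h(x, w) + k(x, z) : w + z = y\} \qquad (x, y) \in X \times Y \]
and, denoting by $p_X : X \times Y \to X$ the canonical projection, the condition
\[ X_0 := \mathbb{R}_+ \bigl(p_X(\operatorname{dom} h) - p_X(\operatorname{dom} k)\bigr) = -X_0 = \operatorname{cl} X_0, \tag{6.25} \]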
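meaning that $X_0$ is a closed linear subspace.

Proposition 6.28 Let $X$, $Y$ be Banach spaces, let $h, k : X \times Y \to \mathbb{R}_\infty$ be proper, convex, lower semicontinuous functions satisfying condition (6.25) and $(h \,\square_2\, k)(x, y) > -\infty$ for all $(x, y) \in X \times Y$. Then for $(x^*, y^*) \in X^* \times Y^*$
\[ (h \,\square_2\, k)^*(x^*, y^*) = (h^* \,\square_1\, k^*)(x^*, y^*) \]
and if this value is finite there exist some $u^*$, $v^* \in X^*$ such that $u^* + v^* = x^*$ and $(h \,\square_2\, k)^*(x^*, y^*) = h^*(u^*, y^*) + k^*(v^*, y^*)$.

Proof Since $h \,\square_2\, k$ is the performance function associated with a convex function of $(x, w, z)$, it is convex. Moreover, since Lemma 3.19 and (6.25) imply that $p_X(\operatorname{dom} h) \cap p_X(\operatorname{dom} k) \ne \emptyset$, we can find some $(x, y) \in X \times Y$ such that $(h \,\square_2\, k)(x, y) < \infty$. Furthermore, for all $(x^*, y^*) \in X^* \times Y^*$ and all $u^*$, $v^* \in X^*$ satisfying $u^* + v^* = x^*$ we have
\begin{align*}
(h \,\square_2\, k)^*(x^*, y^*) &\le \sup_{x \in X} \sup_{w, z \in Y} \bigl(\langle x, u^*\rangle + \langle x, v^*\rangle + \langle w, y^*\rangle + \langle z, y^*\rangle - h(x, w) - k(x, z)\bigr) \\
&\le h^*(u^*, y^*) + k^*(v^*, y^*),
\end{align*}
hence $(h \,\square_2\, k)^*(x^*, y^*) \le (h^* \,\square_1\, k^*)(x^*, y^*)$. It remains to prove that when $(h \,\square_2\, k)^*(x^*, y^*) < \infty$ there exist some $u^*$, $v^* \in X^*$ such that $u^* + v^* = x^*$ and
\[ h^*(u^*, y^*) + k^*(v^*, y^*) \le (h \,\square_2\, k)^*(x^*, y^*). \tag{6.26} \]
For such a purpose, we introduce the functions $f, g : X \times Y \times Y \to \mathbb{R}_\infty$ given by
\[ f(u, w, z) := h(u, w) - \langle u, x^*\rangle - \langle w, y^*\rangle, \]
\[ g(u, w, z) := k(u, z) - \langle z, y^*\rangle + (h \,\square_2\, k)^*(x^*, y^*). \]
We note that $\operatorname{dom} f = \operatorname{dom} h \times Y$ and $\operatorname{dom} g = \{(u, w, z) : (u, z) \in \operatorname{dom} k, \ w \in Y\}$, so that for any $(x, y_1, y_2) \in X_0 \times Y \times Y$ we can find $r \in \mathbb{P}$, $u$, $v \in X$, $w$, $z \in Y$ such that $(u, w) \in \operatorname{dom} h$, $(v, z) \in \operatorname{dom} k$ and
\[ (x, y_1, y_2) = r(u, w, r^{-1} y_2 + z) - r(v, w - r^{-1} y_1, z). \]
Thus $X_0 \times Y \times Y \subset \mathbb{P}(\operatorname{dom} f - \operatorname{dom} g)$. The reverse inclusion being obvious, we see that $\mathbb{P}(\operatorname{dom} f - \operatorname{dom} g)$ is a closed linear subspace. Moreover, for all $(u, w, z) \in X \times Y \times Y$ we have
\begin{align*}
(f + g)(u, w, z) &= h(u, w) + k(u, z) - \langle u, x^*\rangle - \langle w, y^*\rangle - \langle z, y^*\rangle + (h \,\square_2\, k)^*(x^*, y^*) \\
&\ge (h \,\square_2\, k)(u, w + z) + (h \,\square_2\, k)^*(x^*, y^*) - \langle u, x^*\rangle - \langle w + z, y^*\rangle \ge 0.
\end{align*}
The sandwich theorem (or the Attouch-Brézis theorem) ensures that there exists some $(u^*, w^*, z^*) \in X^* \times Y^* \times Y^*$ such that
\[ f^*(u^*, w^*, z^*) + g^*(-u^*, -w^*, -z^*) \le 0. \]
Then both terms of this sum are finite, so that $z^* = 0$, $w^* = 0$ and
\[ f^*(u^*, w^*, z^*) = h^*(u^* + x^*, w^* + y^*), \]
\[ g^*(-u^*, -w^*, -z^*) = k^*(-u^*, y^* - z^*) - (h \,\square_2\, k)^*(x^*, y^*). \]
Thus, the preceding inequality amounts to relation (6.26). □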
Exercises

1. Show that Corollary 6.14 generalizes Theorems 6.8 and 6.9.
2. Let $P$ be a closed convex cone of a Banach space $X$, let $Q$ be its polar cone and let $B$ be the closed unit ball. Prove that the distance function to $Q$ and the support function to $P \cap B$ are equal.
3. Show that if $C$ is a nonempty closed convex subset of $X$ containing the origin, the conjugate $\mu_C^*$ of the gauge function $\mu_C$ of $C$ is given by $\mu_C^* = \iota_{C^0}$, where $C^0 := \{x^* \in X^* : \langle x^*, x\rangle \le 1 \ \forall x \in C\}$ is the polar set of $C$ and $\mu_C$ is given by $\mu_C(x) := \inf\{r \in \mathbb{P} : x \in rC\}$.
4. Let $A : X \to W$ be a continuous linear operator. Suppose $W$ is ordered by a closed convex cone $W_+$ and $Y := W^*$ is ordered by the cone $Y_+ = -W_+^0$. Let $b \in W$ and let $f : X \to \mathbb{R}_\infty$.
(a) Find the dual problem of the mathematical programming problem

(P) minimize $f(x)$, $x \in X$, $Ax \le b$

by using the perturbation function $P$ given by $P(x, w) := f(x) + \iota_{F(w)}(x)$, where $F(w) := \{x \in X : Ax \le b - w\}$.
(b) Show that the function $L : X \times Y \to \overline{\mathbb{R}}$ given by $L(x, y) := -(P_x)^*(y)$ is a Lagrangian of (P) in the sense that $\sup\{L(x, y) : y \in Y\} = f(x) + \iota_{F(0)}(x)$.
(c) (Quadratic programming) Give an explicit form of the dual problem when $X = \mathbb{R}^n$, $W := \mathbb{R}^m$, $W_+ = \mathbb{R}^m_+$ and $f$ is a quadratic form: $f(x) = (1/2)\langle Qx, x\rangle + \langle q, x\rangle$, with $Q$ positive definite. Generalize to the case when $Q$ is positive semidefinite. [See [189].]
5. (General Fenchel equality) Given a family $f_1, \dots, f_k$ of convex lower semicontinuous functions that are finite and continuous at some point of $X$, prove that
\[ \inf_{x \in X} \bigl(f_1(x) + \dots + f_k(x)\bigr) = -\inf\{f_1^*(x_1^*) + \dots + f_k^*(x_k^*) : x_1^* + \dots + x_k^* = 0\}. \]
6.3.4 The Interplay Between a Function and Its Conjugate Much information can be drawn from the study of the conjugate of a convex function. In the present subsection we consider growth properties versus boundedness properties. Then we deal with reinforced convexity properties versus smoothness properties. In the next subsection we give applications to optimization problems and in a following subsection we display some applications to the geometry of normed spaces. We first study the correspondence between some simple properties of onevariable functions. For that purpose, we introduce some classes of functions on RC that will play a role in the sequel. Each such function is extended by 1 on 1; 0Œ (or by setting ˛.r/ WD ˛.r/ for r < 0) so that, for ˛ 2 A; where A WD f˛ W RC ! RC W ˛.0/ D 0g one has ˛ .s/ D supfrs ˛.r/ W r 2 RC g
s 2 RC
368
6 A Touch of Convex Analysis
and $\alpha \in A$. We denote by $A_0$ the set of gages, i.e. the set of $\alpha \in A$ that are nondecreasing and firm, i.e. such that $(r_n) \to 0$ whenever $(\alpha(r_n)) \to 0$, and we denote by $A_c$ the set of $\alpha \in A_0$ that are convex and lower semicontinuous. Note that $\alpha \in A$ is in $A_0$ if and only if $\alpha$ is nondecreasing and $\alpha^{-1}(0) = \{0\}$. A function $\alpha \in A$ is said to be starshaped if $\alpha(r)/r \le \alpha(s)/s$ for $r \le s$ in $\mathbb{P}$. Any $\alpha \in A_c$ is starshaped and increasing.
Lemma 6.8 For any starshaped $\alpha \in A_0$, in particular for any $\alpha \in A_c$, $\alpha^*$ is a remainder on $\mathbb{R}_+$: $\alpha^*(s)/s \to 0$ as $s \to 0_+$. For any remainder $\rho : \mathbb{R}_+ \to \mathbb{R}_+$ one has $\rho^* \in A_c$ (hence $\rho^* \in A_0$ and $\rho^*$ is convex, hence starshaped). For any starshaped $\alpha \in A_0$, one has $\alpha^{**} \in A_c$.
This lemma shows that one can often replace an element $\alpha$ of $A_0$ by an element $\beta = \alpha^{**} \in A_c$.
Proof Let $\alpha \in A_0$ be starshaped and let $\varepsilon > 0$, so that $\delta := \alpha(\varepsilon)/\varepsilon$ is positive. We claim that for all $s \in ]0, \delta]$ we have $\alpha^*(s)/s \le \varepsilon$. Indeed, since for $r \ge \varepsilon$ we have $\alpha(r) \ge r\alpha(\varepsilon)/\varepsilon$, we get
\[ \alpha^*(s) \le \max\Big( \sup_{r \in [0,\varepsilon]} (rs - \alpha(r)),\ \sup_{r \in ]\varepsilon,\infty[} (rs - r\alpha(\varepsilon)/\varepsilon) \Big) \le \max\big(\varepsilon s,\ \sup\{rs - r\delta : r \in ]\varepsilon, \infty[\}\big) = \varepsilon s. \]
Now, let $\rho : \mathbb{R}_+ \to \mathbb{R}_+$ be a remainder. Since $\rho \in A$, we have $\rho^* \in A$. Since $\rho^*$ is lower semicontinuous and convex with $\rho^*(0) = 0$, it suffices to verify that for all $r > 0$ we have $\rho^*(r) > 0$. Let $\varepsilon \in ]0, r[$. Since $\rho$ is a remainder, we can find $\delta > 0$ such that $\rho(s)/s \le \varepsilon$ for all $s \in [0, \delta]$. Setting $\alpha(s) = \varepsilon s$ for $s \in [0, \delta]$ and $\alpha(s) = \infty$ for $s \in ]\delta, \infty[$ we have $\rho \le \alpha$, hence $\rho^*(r) \ge \alpha^*(r) = \sup\{rs - \varepsilon s : s \in [0, \delta]\} = (r - \varepsilon)\delta > 0$. The last assertion is a consequence of the two previous ones. $\square$
Now let us turn to functions defined on normed vector spaces. In order to obtain symmetry in the properties below, we assume that we have two normed spaces $X$, $Y$ in metric duality, i.e. that there exists a continuous bilinear coupling $c := \langle \cdot, \cdot \rangle : X \times Y \to \mathbb{R}$ such that $\|y\| = \sup\{\langle x, y\rangle : x \in B_X\}$ for all $y \in Y$ and $\|x\| = \sup\{\langle x, y\rangle : y \in B_Y\}$ for all $x \in X$. That is the case when $Y$ is the dual of $X$ or when $X$ is the dual of $Y$. In order to fix the terminology, we say that $f : X \to \mathbb{R}$ is coercive if $\lim_{\|x\| \to \infty} f(x) = \infty$; it is called supercoercive if its coercivity rate
\[ c_f := \liminf_{\|x\| \to +\infty} f(x)/\|x\| \]
is positive and hypercoercive if $c_f = \infty$. The distinctness of these notions allows us to give different perturbation results.
Lemma 6.9 For $f, g : X \to \mathbb{R}_\infty$ one has $c_{f+g} \ge c_f + c_g$. In particular, if $f$ is hypercoercive and if $c_g > -\infty$, i.e. if there exists some $c \in \mathbb{R}$ such that $g(x) \ge c\|x\|$ for $\|x\|$ large enough, then $f + g$ is hypercoercive.
If $f$ is supercoercive and if $c_g > -c_f$ then $f + g$ is supercoercive.
Proof The first assertion is a consequence of the sum rule for a lower limit. In fact, given $a < c_f$, $b < c_g$, for $x \in X$ with $\|x\|$ large enough one has $f(x) > a\|x\|$, $g(x) > b\|x\|$, hence $(f+g)(x) > (a+b)\|x\|$. The other two assertions are direct consequences. $\square$
In the next lemma, for $g : Y \to \mathbb{R}$, we set $r_g := \sup\{s \in \mathbb{R}_+ : \sup g(sB_Y) < \infty\}$. This lemma completes the fact that $f$ is bounded below on $X$ if and only if $f^*(0) < \infty$. We recall that the assumption that $f$ is bounded below on bounded subsets is automatically fulfilled if $f$ is closed proper convex since then $f$ is bounded below by a continuous affine function.
Lemma 6.10 Let $f : X \to \mathbb{R}_\infty$ be proper and let $r, s \in \mathbb{R}_+$, $a, b \in \mathbb{R}$.
(a) If $f$ is such that $f \ge a$ on $rB_X$ and $f(\cdot) \ge s\|\cdot\| - b$ on $X \setminus rB_X$ then, for $y \in sB_Y$ one has $f^*(y) \le r\|y\| - \min(a, rs - b)$. In particular, if $f(\cdot) \ge s\|\cdot\| - b$ on $X$ then $f^*(\cdot) \le b$ on $sB_Y$.
(b) If $f$ is supercoercive and bounded below on bounded sets, then for all $s \in ]0, c_f[$ there exist some $c \in \mathbb{R}$, $r \in \mathbb{R}_+$ such that $f^*(\cdot) \le r\|\cdot\| - c$ on $sB_Y$, so that $f^*$ is bounded above on $sB_Y$. Moreover $r_{f^*} \ge c_f$.
(c) If $f$ is hypercoercive and bounded below on bounded sets, then $f^*$ is bounded above on bounded sets.
(d) If $f$ is such that $f(\cdot) \le s\|\cdot\| - b$ on $rB_X$, then one has $f^* \ge b$ on $sB_Y$ and $f^*(\cdot) \ge r\|\cdot\| + b - rs$ on $Y \setminus sB_Y$. If $f^*$ is such that $f^*(\cdot) \le s\|\cdot\| - b$ on $rB_Y$, then one has $f \ge b$ on $sB_X$ and $f(\cdot) \ge r\|\cdot\| + b - rs$ on $X \setminus sB_X$. In particular, if $f^*$ is bounded above by $c \in \mathbb{R}$ on $rB_Y$ then $f(\cdot) \ge r\|\cdot\| - c$ on $X$ and $c_f \ge r_{f^*}$.
(e) If $f^*$ is bounded above on bounded sets then $f$ is hypercoercive and bounded below on bounded sets.
Proof (a) For $y \in sB_Y$, $x \in X$, separating the cases $x \in rB_X$ and $x \in X \setminus rB_X$, we have
\[ f^*(y) \le \max\Big( \sup_{x \in rB_X} (\langle y, x\rangle - a),\ \sup_{x \in X \setminus rB_X} (\langle y, x\rangle - s\|x\| + b) \Big) \le \max\Big( r\|y\| - a,\ \sup_{t \ge r} t(\|y\| - s) + b \Big) = r\|y\| + \max(-a, b - rs). \]
Taking $r = 0$, $a = -b$, we get $f^*(\cdot) \le b$ on $sB_Y$.
(b) For all $s \in ]0, c_f[$ one can find $r > 0$ such that $f(x)/\|x\| \ge s$ for all $x \in X \setminus rB_X$. Setting $b = 0$ and $a := \inf f(rB_X)$ in (a), one gets $f^*(\cdot) \le r\|\cdot\| - \min(a, sr) \le \max(sr - a, 0)$ on $sB_Y$.
(c) This is an immediate consequence of (b) with $c_f = \infty$.
(d) If $f$ is such that $f(\cdot) \le s\|\cdot\| - b$ on $rB_X$, then for $y \in Y$ one has $f^*(y) \ge \sup_{x \in rB_X} (\langle x, y\rangle - s\|x\| + b) = \sup_{0 \le t \le r} (t\|y\| - st + b)$, hence $f^*(y) \ge b$ for $y \in sB_Y$ and $f^*(y) \ge r\|y\| + b - rs$ for $y \in Y \setminus sB_Y$. Interchanging the roles of $X$ and $Y$ and applying the obtained implication to $f^*$ we get $f \ge f^{**} \ge b$ on $sB_X$ and $f(\cdot) \ge f^{**}(\cdot) \ge r\|\cdot\| + b - rs$ on $X \setminus sB_X$. Taking $s := 0$, $b := -c$, we obtain that $f(\cdot) \ge r\|\cdot\| - c$ on $X$ when $f^*$ is such that $f^* \le c$ on $rB_Y$.
(e) This is an immediate consequence of (d) since $r$ and $s$ can be arbitrary in $\mathbb{R}_+$. $\square$
For a closed proper convex function, the relationships between growth properties of $f$ and boundedness properties of $f^*$ are more striking.
Proposition 6.29 For $f : X \to \mathbb{R}_\infty$ closed, proper and convex the following assertions are equivalent:
(a) $f$ is coercive;
(b) the sublevel sets of $f$ are bounded;
(c) there exist $b \in \mathbb{R}$, $c \in \mathbb{P}$ such that $f \ge c\|\cdot\| + b$;
(d) $f$ is supercoercive: $c_f := \liminf_{\|x\| \to +\infty} f(x)/\|x\| > 0$;
(e) $f^*$ is bounded above on a neighborhood of $0$.
If $Y$ is complete the preceding assertions are equivalent to the next one:
(f) $0 \in \operatorname{int}(\operatorname{dom} f^*)$.
Proof (a)$\Leftrightarrow$(b) is easy and (c)$\Rightarrow$(a) is obvious.
(d)$\Rightarrow$(c) Since $f$ is bounded below by a continuous affine function, it is bounded below on balls. Given $c \in ]0, c_f[$, we can find $r > 0$ such that $f(\cdot) \ge c\|\cdot\|$ on $X \setminus rB_X$ and $a \in \mathbb{R}$ such that $f(\cdot) \ge a$ on $rB_X$. Taking $b := \min(a - cr, 0)$, we get $f(\cdot) \ge c\|\cdot\| + b$ on $rB_X$ and $X \setminus rB_X$, hence on $X$.
(a)$\Rightarrow$(d) Suppose $c_f \le 0$. Given a sequence $(\varepsilon_n) \to 0_+$ in $]0, 1[$, one can find $x_n \in X$ such that $\|x_n\| \ge n/\varepsilon_n$ and $f(x_n) \le \varepsilon_n \|x_n\|$. Let $t_n := 1/(\varepsilon_n \|x_n\|) \le 1/n$. Then, given $w \in \operatorname{dom} f$, for $u_n := (1 - t_n)w + t_n x_n$, one has $f(u_n) \le (1 - t_n)f(w) + t_n f(x_n) \le |f(w)| + 1$ but $(u_n)$ is unbounded since $\|u_n\| \ge t_n\|x_n\| - (1 - t_n)\|w\| \ge 1/\varepsilon_n - \|w\|$, a contradiction with (a).
(d)$\Leftrightarrow$(e) has been proved in the preceding lemma and (e)$\Rightarrow$(f) is obvious. (f)$\Rightarrow$(e) in the case when $Y$ is complete is a consequence of Proposition 6.3 since $f^*$ is convex and lower semicontinuous. $\square$
Now let us point out relationships between the rotundity properties of $f$ and the smoothness of $f^*$. This can be done in a quantitative manner. The rotundity properties we introduce are strengthenings of the notion of strict convexity: $f$ is said to be strictly convex if for all $t \in ]0, 1[$ and distinct $x_0$, $x_1 \in X$, one has
\[ (1-t)f(x_0) + tf(x_1) > f((1-t)x_0 + tx_1). \]
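The link between the coercivity rate of $f$ and the boundedness of $f^*$ near $0$ (Lemma 6.10 and Proposition 6.29) can be observed numerically. The sketch below is only an illustration under stated assumptions: it takes $X = Y = \mathbb{R}$, the sample function $f(x) = \sqrt{1 + x^2}$ (for which $c_f = 1$ and $f^*(y) = -\sqrt{1 - y^2}$ when $|y| \le 1$), and replaces the supremum defining $f^*$ by a maximum over a truncated grid. The helper names `truncated_conjugate` and `coercivity_rate_estimate` are hypothetical and not taken from the text.

```python
import numpy as np

def f(x):
    """Sample supercoercive convex function on R with coercivity rate 1."""
    return np.sqrt(1.0 + x**2)

def truncated_conjugate(y, radius, n=200001):
    """Approximate f*(y) = sup_x (x*y - f(x)) by a maximum over a grid on [-radius, radius]."""
    x = np.linspace(-radius, radius, n)
    return float(np.max(x * y - f(x)))

def coercivity_rate_estimate(radius):
    """Evaluate f(x)/|x| at |x| = radius as a crude stand-in for the lower limit c_f."""
    return float(f(radius) / radius)

# Inside the ball of radius s < c_f = 1 the truncated values stabilise (f* is bounded above there);
# for |y| > 1 they keep growing with the truncation radius, reflecting f*(y) = +infinity.
for y in (0.5, 0.9, 1.5):
    print(y, truncated_conjugate(y, 100.0), truncated_conjugate(y, 1000.0))
print("estimated coercivity rate:", coercivity_rate_estimate(1e6))
```

The truncation radius plays the role of the ball $rB_X$ in Lemma 6.10: enlarging it does not change the values for $|y| < c_f$ once the maximizer lies inside the grid.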
Given $\rho \in A$ and $x_0 \in X$, a function $f : X \to \mathbb{R}_\infty$ is said to be $\rho$-convex at $x_0$ if for all $x_1 \in X$, $t \in ]0, 1[$ one has
\[ (1-t)f(x_0) + tf(x_1) \ge f((1-t)x_0 + tx_1) + t(1-t)\rho(\|x_0 - x_1\|). \tag{6.27} \]
It is -convex if it is -convex at x0 for all x0 2 X. The greatest function satisfying (6.27) is called the index of (uniform) convexity of f and is denoted by f : Note that f is given by f .r/ WD
.1 t/f .x0 / C tf .x1 / f ..1 t/x0 C tx1 / : t.1 t/ x0 ; x1 2domf ; kx1 x0 kDr; sup
t20;1Œ
If $\rho_f \in A_0$ one says that $f$ is uniformly convex. When $\rho_f = c|\cdot|^2$ for some $c \in \mathbb{P}$ one says that $f$ is strongly convex. The function $f$ is said to be $\rho$-convex on a subset $B$ of $X$ if $f_B := f + \iota_B$ is $\rho$-convex. It is uniformly convex on a subset $B$ of $X$ if $f_B := f + \iota_B$ is $\rho$-convex for some $\rho \in A_0$.
Given $\sigma \in A$, a function $f : X \to \mathbb{R}_\infty$ is said to be $\sigma$-smooth at $x_0 \in X$ if for all $x_1 \in X$, $t \in [0, 1]$ one has
\[ f((1-t)x_0 + tx_1) + t(1-t)\sigma(\|x_0 - x_1\|) \ge (1-t)f(x_0) + tf(x_1). \tag{6.28} \]
The function f W X ! R1 is said to be -smooth if for all x0 2 X it is -smooth at x0 . The least function satisfying such a property is called the index of (uniform) smoothness of f and is denoted by f : Note that setting xt WD .1 t/x0 C tx1 , the function f is given by f .r/ WD
inf
x0 ;x1 2X; kx0 x1 kr t20;1Œ
f
.1 t/f .x0 / C tf .x1 / f .xt / W xt 2 domf g: t.1 t/
If $\sigma_f \in o(\mathbb{R}_+)$, the set of remainders on $\mathbb{R}_+$, one says that $f$ is uniformly smooth.
Theorem 6.20 If for some $\sigma \in A$ a function $f : X \to \mathbb{R}_\infty$ is $\sigma$-smooth, then $f^*$ is $\sigma^*$-convex. If $f$ is uniformly smooth, then $f^*$ is uniformly convex and $\rho_{f^*} \ge (\sigma_f)^*$. If for some $\rho \in A$ the function $f$ is $\rho$-convex, then $f^*$ is $\rho^*$-smooth. If $f$ is uniformly convex, then $f^*$ is uniformly smooth and $\sigma_{f^*} \le (\rho_f)^*$.
Proof Suppose $f$ is $\sigma$-smooth. Given $y_0$, $y_1 \in Y$, $t \in ]0, 1[$ and $w, x \in X$, let $x_1 := x + w$, $x_t := x + tw$, $y_t := (1-t)y_0 + ty_1$. Then, using (6.28) we have
\[ (1-t)f^*(y_0) + tf^*(y_1) \ge (1-t)\langle x, y_0\rangle + t\langle x_1, y_1\rangle - (1-t)f(x) - tf(x_1) \ge \langle x_t, y_t\rangle + t(1-t)\langle w, y_1 - y_0\rangle - f(x_t) - t(1-t)\sigma(\|w\|). \]
Setting $w := \|w\| u$ with $u \in S_X$, taking the supremum on $x \in X$, $u \in S_X$, then on $\|w\| \in \mathbb{R}_+$, we get
\[ (1-t)f^*(y_0) + tf^*(y_1) \ge f^*(y_t) + t(1-t)\sigma^*(\|y_1 - y_0\|), \]
so that $f^*$ is $\sigma^*$-convex. If $f$ is uniformly smooth, $\sigma := \sigma_f$ is a remainder on $\mathbb{R}_+$, so that, by Lemma 6.8, $\sigma^*$ is in $A_c$ and $f^*$ is uniformly convex.
Now suppose $f$ is $\rho$-convex. Given $y_0$, $y_1 \in Y$, $t \in [0, 1]$, for any $r_0 < f^*(y_0)$, $r_1 < f^*(y_1)$ we can pick $x_0$, $x_1 \in X$ such that
\[ r_0 < \langle x_0, y_0\rangle - f(x_0), \qquad r_1 < \langle x_1, y_1\rangle - f(x_1). \]
Multiplying both sides of the first (resp. second) inequality by $(1-t)$ (resp. $t$) and adding to both sides of the Young-Fenchel inequality $0 \le f^*(y_t) + f(x_t) - \langle x_t, y_t\rangle$ with $x_t := (1-t)x_0 + tx_1$, $y_t := (1-t)y_0 + ty_1$, we get that $(1-t)r_0 + tr_1$ is bounded above by
\[ f^*(y_t) + f(x_t) + t(1-t)\langle x_0 - x_1, y_0 - y_1\rangle - (1-t)f(x_0) - tf(x_1) \le f^*(y_t) + t(1-t)\|x_0 - x_1\| \cdot \|y_0 - y_1\| - t(1-t)\rho(\|x_0 - x_1\|) \le f^*(y_t) + t(1-t)\rho^*(\|y_0 - y_1\|). \]
Since $r_0$ and $r_1$ are arbitrarily close to $f^*(y_0)$ and $f^*(y_1)$ respectively, we get
\[ (1-t)f^*(y_0) + tf^*(y_1) \le f^*(y_t) + t(1-t)\rho^*(\|y_0 - y_1\|), \]
so that $f^*$ is $\rho^*$-smooth. If $f$ is uniformly convex, $(\rho_f)^*$ being a remainder by Lemma 6.8, $f^*$ is uniformly smooth. The estimate for $\sigma_{f^*}$ stems from its minimality property; similarly, the estimate for $\rho_{f^*}$ stems from its maximality property. $\square$
Since $X$ and $Y$ play symmetric roles, we get the following corollary.
Corollary 6.15 A function $f : X \to \mathbb{R}_\infty$ is uniformly smooth (resp. uniformly convex) if and only if $f^*$ is uniformly convex (resp. uniformly smooth).
In [17, 46, 256, 264, 265] the reader will find more information about rotundity properties and smoothness properties of convex functions. We shall return to such properties in the next subsection after considering some applications to minimization.
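The duality between convexity and smoothness indices can be checked on the simplest strongly convex example. The sketch below is an illustration under stated assumptions, not part of the original argument: it takes $X = Y = \mathbb{R}$ and $f(x) = cx^2$, for which the quotient appearing in (6.27) and (6.28) equals $c\,|x_0 - x_1|^2$ exactly, while $f^*(y) = y^2/(4c)$ has quotient $|y_0 - y_1|^2/(4c)$, which is the conjugate of $r \mapsto cr^2$ evaluated at $|y_0 - y_1|$. The conjugate is approximated by a maximum over a grid, and the helper names `conjugate_on_grid` and `defect_quotient` are hypothetical.

```python
import numpy as np

def conjugate_on_grid(f, y, x_grid):
    """Approximate f*(y) = sup_x (x*y - f(x)) by a maximum over the points of x_grid."""
    return float(np.max(x_grid * y - f(x_grid)))

def defect_quotient(g, u0, u1, t):
    """Return ((1-t) g(u0) + t g(u1) - g((1-t) u0 + t u1)) / (t (1-t)) for 0 < t < 1."""
    ut = (1.0 - t) * u0 + t * u1
    return ((1.0 - t) * g(u0) + t * g(u1) - g(ut)) / (t * (1.0 - t))

# Assumed sample data: f(x) = c x^2 with c = 2 on a grid covering the maximizers x = y / (2c).
c = 2.0
f = lambda x: c * x**2
grid = np.linspace(-10.0, 10.0, 200001)
f_star = lambda y: conjugate_on_grid(f, y, grid)

print("quotient for f :", defect_quotient(f, 1.0, -2.0, 0.3))       # compare with c * 3**2
print("quotient for f*:", defect_quotient(f_star, 1.0, -2.0, 0.3))  # compare with 3**2 / (4c)
```

For this quadratic example both inequalities of Theorem 6.20 hold with equality, which makes it a convenient sanity check; for a general $f$ one should only expect the one-sided estimates.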
Exercises 1. Let X be a Hilbert space with scalar product h j i and let A W X ! X be a symmetric, linear, continuous map such that the quadratic form q associated with A is positive on Xnf0g. Let b 2 X and let f be given by f .x/ D q.x/ hb j xi. (a) Show that A and the square root A1=2 of A are injective and that their images satisfy R.A/ R.A1=2 /. (b) Using Theorem 6.9 and the relation q D g ı A1=2 for g WD 12 kk2 show 2 that q .x / D 12 .A1=2 /1 .x / for x 2 R.A1=2 / and q .x / D C1 otherwise. (c) Show that if b 2 R.A/, then f attains its minimum at A1 .b/. (d) Show that if b 2 R.A1=2 /nR.A/, then f is bounded below but does not attain its infimum. (e) Show that if b … R.A1=2 /, then infx2X f .x/ D 1. (f) Deduce from the preceding questions that when R.A/ is closed, then R.A1=2 / D R.A/. [Hint: when R.A/ ¤ X take b 2 XnR.A/ and pick some u 2 R.A/? such that hb j ui > 0; then verify that infr>0 f .ru/ D 1.] 2. (a) Show that if f W X ! R1 is such that f b and domf rBX then one has f ./ r kk b. (b) Show that if f is such that f b and f ./ c kk on XnrBX then one has cBX dom f . 3. Give an example of a coercive function that is not supercoercive. 4. Give an example of a supercoercive function that is not hypercoercive. 5. If f .x/ D 1p kxkp with p 21; 1Œ show that f .x / D 1q kx kq with q WD .1 1 1 / , where kk is the dual norm. Observe that for p D 2 one has q D 2. p 6. Let X be a Hilbert space identified with its dual. Show that f D f if and only if f WD 12 kk2 . 7. Let h W X ! R1 be a positively homogeneous function such that h.0/ D 0; let b 2 R, and let f W X ! R1 be such that f h b: Verify that f b on S WD @h.0/: Show that conversely if for some g W X ! R1 and some S X one has g b on S; then g hS b where hS .x/ WD supfhx; x i W x 2 Sg: Conclude that for b 2 R and some weak closed convex subset S of X one has f hS b if and only if f b on S. 8. 
Given a Banach space X and a convex function f defined on it, show that the Fenchel conjugate f of f is Gateaux differentiable at some x 2 X if and only if any sequence .xn / such that .f .xn / x .xn // ! inf.f x / converges. 9*. (Figiel [121]) If f is convex, show that the function r 7! f .r/=r is starshaped, i.e. f is such that f .0/ D 0 and r 7! f .r/=r2 is nondecreasing [see [121] or [265, Prop. 3.5.1]].
10. For f convex and x 2 dom f , one defines the index of uniform convexity of f at x as x .r/ WD inff
.1 t/f .x/ C tf .x/ f ..1 t/x C tx/ W t 20; 1Œ; x 2 rSX C x/; t.1 t/
so that f is uniformly convex at x if and only if x 2 A0 : One also defines x by
x .r/ WD infff .x/ f .x/ f 0 .x; x x/ W x 2 .dom f / \ .rSX C x/g: Show that x x , so that x 2 A0 whenever f is uniformly convex at x. 11. With the notation of the preceding exercise, prove that when x 2 A0 and f is Fréchet differentiable at x; then f is uniformly convex at x. [See [265, Prop. 3.4.5].]
6.3.5 Conditioning and Well-Posedness
Given a nonempty closed subset $S$ of a normed space $X$ and $\alpha \in A$, we say that $f : X \to \mathbb{R}_\infty$ whose set of minimizers $\arg\min f$ contains $S$ is $\alpha$-conditioned for $S$ if
\[ \forall x \in X \qquad f(x) \ge \inf f(X) + \alpha(d_S(x)), \tag{6.29} \]
where $d_S(x) := \inf\{d(w, x) : w \in S\}$, and that $f$ is $\alpha$-conditioned if this relation holds and $S$ is the set of minimizers of $f$. This is the case whenever (6.29) holds with $\alpha$ firm, or such that $\alpha^{-1}(0) = \{0\}$. We define the conditioning index $\gamma_f$ of $f$ with respect to $S$ by
\[ \gamma_f(r) := \inf\{f(x) - \inf f(X) : x \in X,\ d_S(x) \ge r\}, \qquad r \in \mathbb{R}_+. \]
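The definition of the conditioning index can be made concrete on a one-dimensional example. The following sketch is an illustration under stated assumptions: it takes $X = \mathbb{R}$, $S = [-1, 1]$ and $f := d_S^2$, so that $\inf f(X) = 0$, $S = \arg\min f$, and the conditioning index at $r$ is $r^2$; the infimum is replaced by a minimum over a finite grid, so the computed values are only approximations. The helper names `dist_to_interval` and `conditioning_index` are hypothetical.

```python
import numpy as np

def dist_to_interval(x, lo=-1.0, hi=1.0):
    """Distance d_S(x) from x to the interval S = [lo, hi]."""
    return np.maximum(np.maximum(lo - x, x - hi), 0.0)

def f(x):
    """Sample function f = d_S**2, whose set of minimizers is S."""
    return dist_to_interval(x) ** 2

def conditioning_index(r, x_grid):
    """Grid approximation of inf{ f(x) - inf f : d_S(x) >= r }."""
    values = f(x_grid)
    far = dist_to_interval(x_grid) >= r
    return float(np.min(values[far] - np.min(values)))

grid = np.linspace(-5.0, 5.0, 100001)
for r in (0.25, 0.5, 1.0, 2.0):
    print(r, conditioning_index(r, grid))  # values close to r**2, nondecreasing in r
```

Since the computed index is positive for every $r > 0$, this sample function is conditioned by a firm nondecreasing function, which is the situation called well-conditioned in the text.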
Clearly, $\gamma_f \in A$ and $\gamma_f$ is nondecreasing. Moreover, $\gamma_f$ is the greatest nondecreasing element $\alpha$ of $A$ such that $f$ is $\alpha$-conditioned. When $\gamma_f \in A_0$ we say that $f$ is well-conditioned. Let us first note the following observation.
Proposition 6.30 The conditioning index $\gamma_f$ of a convex function $f$ with a nonempty set $S$ of minimizers is starshaped.
Proof Without loss of generality we suppose $\inf f(X) = 0$. Given $r > 0$ and $c > 1$ we have to prove that $\gamma_f(cr)/c \ge \gamma_f(r)$. Let $x \in X$ be such that $s := d_S(x) \ge cr$. Given $(q_n) \to 1$ with $q_n \in ]0, 1[$ for all $n \in \mathbb{N}$, we can find $a_n \in S$ such that $q_n \|x - a_n\| \le s$. Let $s_n := \|x - a_n\|$, let $t_n := 1 - q_n(1 - 1/c) \in [0, 1[$, and let
wn D .1 tn /an C tn x: Then kwn xk D .1 tn / kx an k D .1 tn /sn , so that dS .wn / dS .x/ kwn xk D s .1 tn /sn D s sn qn .1 1=c/ s s.1 1=c/ D s=c r: Since f .an / D 0, by convexity of f we have
f .r/ f .wn / tn f .x/: Taking the limit as n ! 1 we get f .r/ .1=c/f .x/. Since x is arbitrary in fx W dS .x/ crg; taking the infimum we obtain f .r/ .1=c/ f .cr/: t u As usual, in the next statement, a sequence .xn / of X such that .f .xn // ! m WD inf f .X/ is said to be minimizing. Lemma 6.11 ([206]) The following assertions on f W X ! R1 and a nonempty closed subset S of arg min f are equivalent and imply that S WD arg min f : (a) any minimizing sequence .xn / satisfies .dS .xn // ! 0; (b) f belongs to A0 , the set of nondecreasing firm elements of A; (c) there exists some ˛ 2 A0 such that f is ˛-conditioned for S. Proof (a))(b) Suppose f .r/ D 0 for some r > 0: Then there exists a sequence .xn / such that dS .xn / r and .f .xn // ! m WD inf f .X/; a contradiction with (a). The implication (b))(c) is obvious since f is f -conditioned. (c))(a) Suppose f .x/ mC˛.dS .x// for some ˛ 2 A0 : Let .xn / be a minimizing sequence in f and let " > 0 be given. For r " we have ˛.r/ ˛."/ > 0: Thus, taking k 2 N such that f .xn / m < ˛."/ for all n k; we have dS .xn / < " for all n k: We already observed that S D arg min f when f is ˛-conditioned for S with ˛ 2 A0 . t u In the convex case one disposes of a larger list of characterizations. Proposition 6.31 For f W X ! R1 convex and S X closed convex, the three assertions of the preceding lemma are equivalent to the following ones: (d) there exists some convex lower semicontinuous remainder such that for all x 2 X one has f .x / f .0/ C S .x / C .kx k/I (e) there exists some starshaped ˛ 2 A0 such that for all z 2 S, .x; x / 2 @f one has ˛.dS .x// hx ; x ziI (f) there exists some ˇ 2 A0 such that for all .x; x / 2 @f one has ˇ.dS .x// kx k I (g) .dS .xn // ! 0 for any sequence .xn / in X satisfying .d@f .xn / .0// ! 0: Proof (c))(d) Given ˛ 2 A0 such that f inf f .X/ C ˛ ı dS , using the relation .˛ ı dS / D S C ˛ ı kk
obtained by observing that for all x 2 S there exist sequences .xn / ! x, wn in S satisfying kxn wn k dS .x/, so that .˛ ı dS / .x / D
sup .hx ; xi ˛.kx wk// D
.x;w/2XS
sup .hx ; w C zi ˛.kzk//;
.w;z/2SX
passing to the conjugates and taking WD ˛ , we get the inequality in (d). (d))(c) Given as in (d), using the relation .S ./ C ı kk/ D ı dS obtained by writing sup inf .hx ; x wi .kx k// D sup inf .rkx wk .r//;
x 2X w2S
r2RC w2S
we get the inequality in (c) since f f and since ˛ WD belongs to Ac by Lemma 6.8. (c))(e) Given ˛ 2 A0 such that f is ˛-conditioned: 8x 2 X
f .x/ inf f .X/ C ˛.dS .x//;
adding to each side of this inequality the respective sides of the relation f .z/f .x/ hx ; z xi; we get the inequality in (e). (e))(f) Since ˛ 2 A0 is starshaped, ˇ defined by ˇ.r/ WD ˛.r/=r for r 2 P, ˇ.0/ WD 0 is in A0 and assertion (e) implies that dS .x/ˇ.dS .x// infz2S kx k : kx zk for all .x; x / 2 @f : Assertion (f) follows after simplification, the case x 2 S being obvious. (f))(g) being obvious, let us show that (g))(f). For this purpose, let us set ˇ.r/ WD inffkx k W 9x 2 X; dS .x/ r; x 2 @f .x/g: Clearly, ˇ is nondecreasing and for any .x; x / 2 @f we have ˇ.dS .x// kx k : Let us show that ˇ 2 A0 , i.e. that ˇ is firm. Suppose on the contrary that ˇ.r/ D 0 for some r > 0: Then, there exists a sequence ..xn ; xn // in @f such that .xn / ! 0 and dS .xn / r for all n 2 N. This is a contradiction with (g). t u The following consequence reminds us of the contents of the preceding subsection since a reinforced convexity property of f at some x corresponds to a smoothness property of f at x : Corollary 6.16 For f W X ! R1 convex, x 2 dom f , x 2 X , the following assertions are equivalent: .a/ 9˛ 2 Ac W 8x 2 X
f .x/ f .x/ C hx x; x i C ˛.kx xk/
.b/ 9 2 o.R/ W 8x 2 X
f .x / f .x / C hx; x x i C .kx x k/:
Moreover, one can take WD ˛ and conversely ˛ WD and each of these assertions implies that x 2 @f .x/. Proof Using Lemmas 6.8, 6.11, this equivalence can be deduced from the equivalence (c),(d) of the preceding proposition by taking S WD fxg and changing f into g WD f h; x i, observing that S WD fxg is the set of minimizers of g and that f f with f .x/ D f .x/. One can also give a direct proof. This equivalence can be completed with other assertions (see the exercises). t u For f convex and x 2 dom f , condition (a) is satisfied whenever f is uniformly convex at x in the following sense: there exists an ˛ 2 Ac such that for all x 2 X, t 2 Œ0; 1 one has .1 t/f .x/ C tf .x/ f ..1 t/x C tx/ C t.1 t/˛.kx xk/: In fact, this condition implies that for all x 2 X, t 20; 1 one has f .x/ f .x/ .1=t/Œf .x C t.x x// f .x/ C .1 t/˛.kx xk/; whence, for all x 2 @f .x/; x 2 X f .x/ f .x/ df .x; x x/ C ˛.kx xk/ hx ; x xi C ˛.kx xk/: Theorem 6.21 Let f be a closed proper convex function finite and continuous at x 2 X. If f is strictly convex (resp. uniformly convex), then f is Hadamard (resp. Fréchet) differentiable at x. Proof For x 2 @f .x/ one has x 2 @f .x / by Theorem 6.16, hence 0 2 @.f x/.x / and x is a minimizer of f x. When f is strictly convex, f x is strictly convex too and it has at most one minimizer. Thus @f .x/ is a singleton and f is Hadamard differentiable at x in view of Corollary 6.4. When g WD f is uniformly convex, taking x 2 @f .x/, so that x 2 @g.x /, using the preceding observation and the implication (a))(b) of the preceding corollary we obtain that g is Fréchet differentiable at x, the reverse inequality in (b) being a consequence of subdifferentiability. Since f is the restriction to X of g , we get that f is Fréchet differentiable at x. t u
Exercises 1. Let c 2 P and let ˛ 2 Ac be given by ˛.r/ WD cr for r 2 RC : Verify that ˛ D Œ0;c : If for f W X ! R one has f D ˛ one says that f is linearly conditioned and one says that c is the (linear) rate of conditioning of f : 2. (Conditioning of a matrix) Let Q 2 L.X; X/ be a symmetric, positive linear p operator on a (finite dimensional) Euclidean space X and let f WD q, where
p q.x/ WD hQx j xi for x 2 X: Show that the rate of conditioning of f is where is the smallest eigenvalue of Q. Note that when kQk D 1; is the conditioning rate of Q defined as the ratio between its smallest eigenvalue and its largest eigenvalue. 3. Show that the assertions of Proposition 6.31 are equivalent to the following one: (h) there exists a 2 A continuous at 0 such that dS .x/ .kx k/ for all .x; x / 2 @f : 4. Deduce from Proposition 6.31 that each of the two assertions of Corollary 6.16 is equivalent to any of the following ones: (c) there exists a 2 Ac such that hx x; x x i .kx xk/ for all .x; x / 2 @f ; (d) there exists a 2 A0 such that kx x k .kx xk/ for all .x; x / 2 @f ; (e) there exists a ˇ 2 A continuous at 0 such that kx xk ˇ.kx x k/ for all .x; x / 2 @f ; (f) for any sequence .xn / of X .d@f .xn / .x // ! 0 implies that .xn / ! x; (g) x 2 int.dom f / and f is Fréchet differentiable at x :
6.4 *Applications to the Geometry of Normed Spaces
Several interesting properties of normed spaces depend on the geometry of their unit balls. We already noted the fact that uniform convexity implies reflexivity. In this section we give various complements, some of which use functions associated with the norm through weight functions. A function $h : \mathbb{R}_+ \to \mathbb{R}_+$ will be called a weight function if it is continuous, increasing, and such that $h(0) = 0$, $h(r) \to \infty$ as $r \to \infty$. To any such function $h$ we associate the function $j_h : X \to \mathbb{R}_+$ by
\[ j_h(x) := k(\|x\|), \qquad \text{where } k(s) := \int_0^s h(r)\,dr, \quad s \in \mathbb{R}_+. \tag{6.30} \]
The choice $h(r) := r^{p-1}$ for $r \in \mathbb{R}_+$ is convenient for the study of $L_p$ spaces, with $p \in ]1, \infty[$ (so that $h(0) = 0$ and $h$ is increasing); then $j_h(x) = (1/p)\|x\|^p$. In particular, for $p := 2$ one has $j_h(\cdot) = (1/2)\|\cdot\|^2$.
Lemma 6.12 Let $h$ be a weight function and let $k$ and $j_h$ be defined by (6.30). Then $k$ is strictly convex, differentiable with $k' = h$ on $\mathbb{R}_+$ and $k^*(t) = \int_0^t h^{-1}(s)\,ds$. Moreover, $k(s) + k^*(t) = st$ if and only if $t = h(s)$. The function $j_h$ is convex continuous on $X$ and $(j_h)^* = k^* \circ \|\cdot\|_*$ where $\|\cdot\|_*$ is the dual norm of $\|\cdot\|$ (or the norm of $Y$ if $X$ and $Y$ are in metric duality). Moreover,
\[ y \in \partial j_h(x) \iff y \in h(\|x\|)\,\partial\|\cdot\|(x) \iff \langle x, y\rangle = \|x\| \cdot \|y\|,\ \|y\| = h(\|x\|). \]
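The relations between $h$, $k$ and $k^*$ in Lemma 6.12 can be checked numerically for the power weight. The sketch below is an illustration under stated assumptions: it fixes $p = 3$, so that $h(r) = r^2$, $h^{-1}(t) = \sqrt{t}$, $k(s) = s^3/3$ and $k^*(t) = t^q/q$ with $q = p/(p-1)$, and it evaluates the integrals by a hand-written trapezoidal rule. The names `integral`, `k_star` and `young_gap` are hypothetical helpers introduced for this example.

```python
import numpy as np

P = 3.0              # assumed exponent p > 1
Q = P / (P - 1.0)    # conjugate exponent q

def h(r):
    """Weight function h(r) = r**(p-1)."""
    return r ** (P - 1.0)

def h_inv(t):
    """Inverse weight function h^{-1}(t) = t**(1/(p-1))."""
    return t ** (1.0 / (P - 1.0))

def integral(func, upper, n=100001):
    """Trapezoidal approximation of the integral of func over [0, upper]."""
    grid = np.linspace(0.0, upper, n)
    vals = func(grid)
    step = grid[1] - grid[0]
    return float(step * (vals.sum() - 0.5 * (vals[0] + vals[-1])))

def k(s):
    return integral(h, s)

def k_star(t):
    return integral(h_inv, t)

def young_gap(s, t):
    """k(s) + k*(t) - s*t, which is nonnegative and vanishes when t = h(s)."""
    return k(s) + k_star(t) - s * t

print("k*(4) vs 4**q / q:", k_star(4.0), 4.0 ** Q / Q)
print("gap at t = h(s):  ", young_gap(2.0, h(2.0)))
print("gap at t != h(s): ", young_gap(2.0, 1.0))
```

The first gap is zero up to quadrature error, while the second is strictly positive, in line with the equality case $t = h(s)$ of the lemma.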
Thus, since $h^{-1}$ is a weight function, one sees that $(j_h)^* = k^* \circ \|\cdot\|_*$, being the function associated with $h^{-1}$ and the dual norm, has the same structure as $j_h$.
Proof The fundamental theorem of calculus (Corollary 5.4) ensures that $k$ is differentiable with $k' = h$. Since $h$ is positive and increasing on $\mathbb{P}$, $k$ is increasing and strictly convex. The relation $k(s) + k^*(t) = st$ means that $t \in \partial k(s)$ or $t = k'(s) = h(s)$. It is equivalent to $s \in \partial k^*(t)$. Since for a given $t$ there exists only one $s$ ($s = h^{-1}(t)$) such that this relation holds, we get that $k^*$ is differentiable and $(k^*)'(t) = s = h^{-1}(t)$. Since $h^{-1}$ is continuous, we have $k^*(t) = \int_0^t h^{-1}(s)\,ds$. Since $k$ is convex, increasing, and continuous, it follows that $j_h$ is convex and continuous. Let us compute $(j_h)^*$. Assuming $(Y, \|\cdot\|_*)$ is in metric duality with $(X, \|\cdot\|)$, for $y \in Y$ one has
\[ (j_h)^*(y) := \sup\{\langle x, y\rangle - k(\|x\|) : x \in X\} = \sup\{\langle x, y\rangle - k(r) : r \in \mathbb{R}_+,\ x \in rS_X\} = \sup\{r\|y\|_* - k(r) : r \in \mathbb{R}_+\} = k^*(\|y\|_*). \]
Finally $y \in \partial j_h(x) \iff j_h(x) + k^*(\|y\|_*) = \langle x, y\rangle \iff k(\|x\|) + k^*(\|y\|_*) \le \langle x, y\rangle$ since the reverse inequality is always valid in view of the relation $\langle x, y\rangle \le \|x\| \cdot \|y\|_*$. This means that $\langle x, y\rangle = \|x\| \cdot \|y\|_*$ and $\|y\|_* \in \partial k(\|x\|) = \{h(\|x\|)\}$. $\square$
The following notions are weakened versions of uniform convexity, as is easily seen.
Definition 6.2 A norm $\|\cdot\|$ on a vector space $X \ne \{0\}$ is said to be rotund (or strictly convex) if for every $w \ne x$ in its unit sphere $S_X$ and $t \in ]0, 1[$ one has $(1-t)w + tx \notin S_X$. A norm $\|\cdot\|$ on a vector space $X$ is said to be locally uniformly rotund (LUR), or locally uniformly convex, if for all $x$, $x_n \in X$ satisfying $(\|x_n\|) \to \|x\|$, $(\|x + x_n\|) \to 2\|x\|$ one has $(x_n) \to x$.
Thus $(X, \|\cdot\|)$ is rotund if any $u \in S_X$ is an extremal point of the unit ball $B_X$ in the sense that $u$ cannot be the midpoint of a segment of $B_X$ not reduced to a singleton. Taking $x_n = w$, we see that a locally uniformly rotund norm is rotund. Let us display characterizations of these properties.
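Definition 6.2 can be tested on the familiar norms of $\mathbb{R}^2$. The sketch below is an illustration, assuming the $\ell^1$, $\ell^2$ and $\ell^\infty$ norms as examples: it computes the norm of the midpoint of two distinct unit vectors. A midpoint of norm $1$ exhibits a failure of rotundity, whereas for the Euclidean norm the midpoint falls strictly inside the unit ball. The helper name `midpoint_norm` is hypothetical.

```python
import numpy as np

def midpoint_norm(w, x, p):
    """Norm of (w + x)/2 in the l^p norm of R^n; p may be 1, 2 or np.inf."""
    mid = 0.5 * (np.asarray(w, dtype=float) + np.asarray(x, dtype=float))
    return float(np.linalg.norm(mid, ord=p))

# Distinct unit vectors whose midpoint stays on the unit sphere: the norm is not rotund.
print("l^1  :", midpoint_norm([1, 0], [0, 1], 1))
print("l^inf:", midpoint_norm([1, 1], [1, -1], np.inf))
# For the Euclidean norm the midpoint of distinct unit vectors has norm < 1.
print("l^2  :", midpoint_norm([1, 0], [0, 1], 2))
```

A single pair of points can only disprove rotundity; establishing it for the Euclidean norm requires an argument valid for all pairs, such as the parallelogram identity.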
Lemma 6.13 For a normed space $(X, \|\cdot\|)$ the following assertions are equivalent:
(a) $\|\cdot\|$ is rotund;
(b) if $x, w \in S_X$ satisfy $\|x + w\| = 2$, then $x = w$;
(c) if $x, w \in X$ satisfy $\|x + w\|^2 = 2\|x\|^2 + 2\|w\|^2$, then $x = w$;
(d) if $x, w \in X \setminus \{0\}$ satisfy $\|x + w\| = \|x\| + \|w\|$, then $x = \lambda w$ for some $\lambda \in \mathbb{R}_+$;
(e) for any weight function $h$, the function $j_h$ is strictly convex.
Proof (a)$\Leftrightarrow$(b): (a)$\Rightarrow$(b) is immediate. For the reverse implication suppose that for some $t \in ]0, 1[$, $w$, $x \in S_X$ we have $\|z_t\| = 1$ for $z_t := (1-t)w + tx$. Let $s := \min(t, 1-t) > 0$, so that $t + s \in [0, 1]$, $t - s \in [0, 1]$ and $z_{t+s} + z_{t-s} = 2z_t$. Since $\|z_{t+s}\| \le 1$, $\|z_{t-s}\| \le 1$ our assumption implies that $z_{t+s} = z_{t-s}$; hence $2sw = 2sx$ and $w = x$.
(b)$\Leftrightarrow$(c): (c)$\Rightarrow$(b) is immediate. (b)$\Rightarrow$(c). For $x$, $w \in X$, since
\[ 2\|x\|^2 + 2\|w\|^2 - \|x + w\|^2 \ge 2\|x\|^2 + 2\|w\|^2 - (\|x\| + \|w\|)^2 = (\|x\| - \|w\|)^2, \]
the relation $2\|x\|^2 + 2\|w\|^2 - \|x + w\|^2 = 0$ implies $\|x\| = \|w\|$. Setting $x := ru$, $w := rv$ with $r := \|x\| = \|w\|$, $u, v \in S_X$, for $r > 0$ we get $\|u + v\| = 2$, so that $u = v$ and $x = w$, whereas for $r = 0$ we have $x = w = 0$.
(d)$\Rightarrow$(b) is immediate. Let us prove (b)$\Rightarrow$(d). Suppose $\|x + w\| = \|x\| + \|w\|$ for $x, w \in X \setminus \{0\}$ and $r := \|x\| \le s := \|w\|$. Then
\[ 2 \ge \|r^{-1}x + s^{-1}w\| \ge r^{-1}\|x + w\| - \|r^{-1}w - s^{-1}w\| = r^{-1}(\|x\| + \|w\|) - (r^{-1} - s^{-1})\|w\| = r^{-1}\|x\| + s^{-1}\|w\| = 2. \]
Thus $\|r^{-1}x + s^{-1}w\| = 2$ and $r^{-1}x = s^{-1}w$.
(a)$\Rightarrow$(e) For $w, x \in X$, $t \in ]0, 1[$ the relation $(1-t)j_h(w) + tj_h(x) = j_h((1-t)w + tx)$ implies that $\|w\| = \|x\| = \|(1-t)w + tx\|$ since $k$ is strictly convex and increasing and since $\|(1-t)w + tx\| \le (1-t)\|w\| + t\|x\|$. Then, either $\|w\| = \|x\| = 0$ and $w = x = 0$, or $0 < \|w\| = \|x\| = \|(1-t)w + tx\|$ and $w = x$ by strict convexity of the norm.
(e)$\Rightarrow$(b) Let $x, w \in S_X$ be such that $\|x + w\| = 2$. Then we have $j_h((1/2)x + (1/2)w) = k((1/2)\|x + w\|) = k(1) = (1/2)j_h(x) + (1/2)j_h(w)$. Since $j_h$ is strictly convex we must have $x = w$. $\square$
Let us turn to characterizations of locally uniformly convex normed spaces. Again, we recall that a function $f$ is said to be uniformly convex at $w \in X$ if there exists some $\alpha \in A_0$ such that $f$ is $\alpha$-convex at $w$, i.e. if
\[ \forall x \in X \qquad (1-t)f(w) + tf(x) \ge f((1-t)w + tx) + t(1-t)\alpha(\|w - x\|) \]
and $f$ is said to be locally uniformly convex if for all $w \in X$ it is uniformly convex at $w$. A normed space $(X, \|\cdot\|)$ is said to be locally uniformly convex or locally uniformly rotund (LUR) if the square of its norm is locally uniformly rotund.
Lemma 6.14 For a normed space $(X, \|\cdot\|)$ the following assertions are equivalent:
(a) $(x_n) \to x$ whenever $(\|x_n\|) \to \|x\|$ and $(\|x + x_n\|) \to 2\|x\|$;
(b) if $x$, $x_n \in S_X$ for $n \in \mathbb{N}$ satisfy $(\|x + x_n\|) \to 2$, then $(x_n) \to x$;
(c) if $x$, $x_n \in X$ satisfy $(2\|x\|^2 + 2\|x_n\|^2 - \|x + x_n\|^2) \to 0$, then $(x_n) \to x$;
(d) for any weight function $h$ the function $j_h$ is locally uniformly convex.
Proof (a)$\Rightarrow$(b) is obvious. The converse is obtained by considering (in the nontrivial case $x \ne 0$) $u := x/\|x\|$, $u_n := x_n/\|x_n\|$ (for $n$ large enough). (c)$\Rightarrow$(a) is immediate. (a)$\Rightarrow$(c) For $x$, $x_n \in X$, since
\[ 2\|x\|^2 + 2\|x_n\|^2 - \|x + x_n\|^2 \ge 2\|x\|^2 + 2\|x_n\|^2 - (\|x\| + \|x_n\|)^2 = (\|x\| - \|x_n\|)^2, \]
the relation $\lim_n (2\|x\|^2 + 2\|x_n\|^2 - \|x + x_n\|^2) = 0$ implies $(\|x_n\|) \to \|x\|$. Since we may suppose $x \ne 0$, we can write $x_n = t_n w_n$, with $\|w_n\| = \|x\|$ and $(t_n) \to 1$; observing that $(2\|x\|^2 + 2\|w_n\|^2 - \|x + w_n\|^2) \to 0$, we obtain that $(w_n) \to x$ and $(x_n) \to x$. The equivalence (a)$\Leftrightarrow$(d) can be obtained by fixing $w$ in the proof of Proposition 6.33 below. $\square$
The LUR property has interesting consequences, as the next proposition shows.
Proposition 6.32 If $\|\cdot\|$ is a LUR norm, then $X$ has the (sequential) Kadec-Klee Property: a sequence $(x_n)$ of $X$ converges to $x \in X$ whenever it weakly converges to $x$ and $(\|x_n\|) \to \|x\|$.
Proof Let $x \in X$ and let $(x_n)_{n \in \mathbb{N}}$ be a weakly convergent sequence whose limit $x$ is such that $(\|x_n\|) \to \|x\|$. Then $\limsup_n \|x + x_n\| \le \limsup_n (\|x\| + \|x_n\|) = 2\|x\|$. On the other hand, since the norm is weakly lower semicontinuous, we have $\liminf_n \|x + x_n\| \ge \|2x\|$. Thus $(\|x + x_n\|) \to 2\|x\|$ and, since the norm is LUR, we get $(x_n) \to x$. $\square$
Let us turn to uniform convexity. The next result enables us to obtain several characterizations by using Corollary 6.16 and the exercises following it.
Proposition 6.33 For any weight function $h$, $j_h$ is uniformly convex on bounded subsets if and only if $(X, \|\cdot\|)$ is uniformly convex.
Proof Suppose that for some $r > 0$ and some weight function $h$ the function $j_h$ is uniformly convex on $rB_X$. Let $\rho \in A_0$ be such that for $w$, $x \in rB_X$ and $t \in [0, 1]$ one has
\[ (1-t)j_h(w) + tj_h(x) \ge j_h((1-t)w + tx) + t(1-t)\rho(\|x - w\|). \]
Given $\varepsilon > 0$ and $u$, $v \in S_X$ satisfying $\|u - v\| \ge \varepsilon$, taking $w := ru$, $x := rv$, $t := \tfrac12$ we get
\[ k(r) - \tfrac14 \rho(r\|u - v\|) \ge k(\tfrac12 r\|u + v\|). \]
Taking $r' \in \mathbb{R}_+$ such that $k(r') = k(r) - \tfrac14 \rho(r\varepsilon)$ and $\delta := 1 - r'/r$ we see that $\|\tfrac12 (u + v)\| \le 1 - \delta$ because $k$ is increasing. Thus $(X, \|\cdot\|)$ is uniformly convex.
Conversely, suppose $(X, \|\cdot\|)$ is uniformly convex. Let $h$ be a weight function and let $r \in \mathbb{P}$. If $j_h$ is not uniformly convex on $rB_X$, there exists some $\varepsilon > 0$ such that for
any $\delta > 0$ one can find $w$, $x \in rB_X$ such that
\[ \|w - x\| = \varepsilon \quad \text{and} \quad j_h(\tfrac12 w + \tfrac12 x) > \tfrac12 j_h(w) + \tfrac12 j_h(x) - \delta. \]
Taking a sequence $(\delta_n) \to 0_+$, one can find sequences $(w_n)$, $(x_n)$ in $rB_X$ such that $\|w_n - x_n\| = \varepsilon$ and $j_h(\tfrac12 w_n + \tfrac12 x_n) > \tfrac12 j_h(w_n) + \tfrac12 j_h(x_n) - \delta_n$ for all $n \in \mathbb{N}$. Let $s_n$, $t_n \in [0, r]$, $u_n$, $v_n \in S_X$ be such that $w_n = s_n u_n$, $x_n = t_n v_n$. The preceding inequality implies that
\[ k(\tfrac12 s_n + \tfrac12 t_n) \ge k(\tfrac12 \|s_n u_n + t_n v_n\|) > \tfrac12 k(s_n) + \tfrac12 k(t_n) - \delta_n \tag{6.31} \]
for all $n \in \mathbb{N}$. Taking subsequences if necessary, one may assume that $(s_n)$ and $(t_n)$ converge to $s$ and $t$ respectively in $[0, r]$ and that $(\tfrac12 \|u_n + v_n\|)$ converges to some $q \in [0, 1]$. Passing to the limits in the preceding inequalities one gets $k(\tfrac12 s + \tfrac12 t) \ge \tfrac12 k(s) + \tfrac12 k(t)$. By strict convexity of $k$ one obtains $s = t$. Since $\varepsilon = \|s_n u_n - t_n v_n\| \le s_n \|u_n - v_n\| + |s_n - t_n| \le 2s_n + |s_n - t_n|$ and $(s_n - t_n) \to 0$, one must have $s > 0$. Since $(\|s_n u_n + t_n v_n\| - \|s u_n + t v_n\|) \to 0$ and since $k$ is continuous, the second inequality in (6.31) yields $k(sq) \ge k(s)$; if $q < 1$ this inequality contradicts the assumption that $k$ is increasing. Thus $(\tfrac12 \|u_n + v_n\|) \to 1$, contradicting the assumption that $(X, \|\cdot\|)$ is uniformly convex. $\square$
Now let us turn to differentiability properties of the norm.
Definition 6.3 The space $(X, \|\cdot\|)$ is said to be smooth if for all $x \in X \setminus \{0\}$ there is only one $x^* \in X^*$ such that $\|x^*\| = 1$ and $\langle x^*, x\rangle = \|x\|$.
This condition means that the normalized duality mapping $S : X \to \mathcal{P}(X^*)$ given by $S(x) := \{x^* \in X^* : \langle x^*, x\rangle = \|x\|,\ \|x^*\| = 1\}$ and the duality map $J := \|\cdot\| S(\cdot)$ are single-valued, or equivalently, by Corollary 6.4, that the norm and $(1/2)\|\cdot\|^2$ are Gateaux (or Hadamard) differentiable on $X \setminus \{0\}$.
In order to give some versatility to the following famous differentiability test for a norm, we adopt the framework of normed spaces $X$, $Y$ in metric duality for a continuous bilinear coupling $c := \langle \cdot, \cdot\rangle : X \times Y \to \mathbb{R}$. We say that a sequence $(y_n)$ in $Y$ $c$-weakly converges (or simply, weakly converges) to $y \in Y$ if for every $x \in X$ we have $(\langle x, y_n\rangle) \to \langle x, y\rangle$. This notion coincides with weak$^*$ convergence when $Y := X^*$ and with weak convergence when $X := Y^*$.
Proposition 6.34 (Šmulian Test) Let X and Y be normed spaces in metric duality and let x 2 SX . The following assertions (a) and (b) are equivalent and are implied by (c). If Y is the dual of X, then (a), (b) and (c) are equivalent: (a) the norm of X is Fréchet (resp. Hadamard) differentiable at xI (b) for any sequences .yn /, .zn / in SY such that .hx; yn i/ ! 1, .hx; zn i/ ! 1, one has .kyn zn k/ ! 0 (resp. .yn zn / c-weakly converges to 0); (c) a sequence .yn / of SY is convergent (resp. c-weakly convergent) whenever .hx; yn i/ ! 1. Proof (a))(b) Suppose the norm kk of X is Hadamard differentiable at x 2 SX . By Lemma 6.14, for any given " > 0 and any u 2 SX there exists some ı > 0 such that kx C tuk C kx tuk 2 C "t when t 2 Œı; ı. Let .yn / and .zn / be sequences in SY such that .hx; yn i/ ! 1 and .hx; zn i/ ! 1. Then, for t WD ı, one can find k 2 N such that for all n k one has thu; yn zn i D hx C tu; yn i C hx tu; zn i hx; yn i hx; zn i kx C tuk C kx tuk 2 C 2ı" 3ı": Thus hu; yn zn i 3" for n k. Changing u into u, we see that .hu; yn zn i/ ! 0. The Fréchet case is similar, using uniformity in u 2 SX . (b))(a) Suppose the norm kk of X is not Hadamard differentiable at x 2 SX . By Lemma 6.14 there exist some u 2 SX , some " > 0 and some sequence .tn / ! 0C such that kx C tn uk C kx tn uk 2 3tn " for all n. Let us pick yn , zn in SY such that hx C tn u; yn i kx C tn uk tn ";
hx tn u; zn i kx tn uk tn ":
(6.32)
Then hx; yn i kx C tn uk tn " tn kuk : kyn k and hx; yn i 1, so that .hx; yn i/ ! 1 and similarly, .hx; zn i/ ! 1. Since kxk D 1, kyn k D 1, kzn k D 1, we get tn hu; yn zn i D hx C tn u; yn i C hx tn u; zn i hx; yn i hx; zn i kx C tn uk C kx tn uk 2tn " kxk .kyn k C kzn k/ tn "; hence hu; yn zn i ", contradicting the assumption that .yn zn / c-weakly converges to 0. When the norm kk of X is not Fréchet differentiable at x 2 SX one can find " > 0 and sequences .tn / ! 0C , .un / in SX such that kx C tn un k C kx tn un k 2 3tn " for all n 2 N. Then, taking .yn /, .zn / 2 SY as in relation (6.32) with u replaced with un , the preceding computation reads hun ; yn zn i ", hence kyn zn k ", a contradiction with the assumption that .yn zn / ! 0. (c))(b) Let .yn /, .zn / be sequences of SY such that .hx; yn i/ ! 1, .hx; zn i/ ! 1. Let wn D yp when n WD 2p, wn D zp when n WD 2p C 1. Then .hx; wn i/ ! 1, so
6 A Touch of Convex Analysis
that, by (c), $(w_n)$ converges (resp. c-weakly converges). Thus $(y_n - z_n) \to 0$ (resp. c-weakly converges to $0$).

(a)$\Rightarrow$(c) when $Y = X^*$. Let $y := \|\cdot\|'(x)$. One has $\|y\| \le 1$ since the norm is Lipschitzian with rate 1 and, by homogeneity, $\langle y, x\rangle = \lim_{t \to 0}(1/t)(\|x + tx\| - \|x\|) = 1$, so that $y \in S_Y$ and we can take $z_n := y$ in assertion (b). Thus (c) holds. □

Let us turn to duality results.

Proposition 6.35 Let $\|\cdot\|$ be a norm on $X$ and let $\|\cdot\|_*$ be its dual norm.
(a) If $\|\cdot\|_*$ is a rotund norm, then $\|\cdot\|$ is Hadamard differentiable on $X \setminus \{0\}$.
(b) If $\|\cdot\|_*$ is Hadamard differentiable on $X^* \setminus \{0\}$, then $\|\cdot\|$ is a rotund norm.
In particular, a compatible norm on a reflexive Banach space $X$ is Hadamard differentiable on $X \setminus \{0\}$ if and only if its dual norm is rotund.

Proof (a) By Corollary 6.4, it suffices to show that for every $x \in X \setminus \{0\}$, $S(x) := \partial\|\cdot\|(x) = \{x^* \in X^* : \|x^*\|_* = 1, \langle x^*, x\rangle = \|x\|\}$ is a singleton. Let $x^*, y^* \in S(x)$. Then
\[ 2\|x\| = \langle x^*, x\rangle + \langle y^*, x\rangle \le \|x^* + y^*\|_* \cdot \|x\| \le 2\|x\|, \]
hence $\|x^* + y^*\|_* = 2$, and by assertion (b) of Lemma 6.13, we have $x^* = y^*$.
(b) If $\|\cdot\|$ is not strictly convex, one can find $x, y \in S_X$ such that $x \ne y$ and $z := (1 - t)x + ty \in S_X$ with $t := 1/2$. Taking $f \in S_{X^*}$ such that $f(z) = 1$, we see that $1 = f(z) = (1 - t)f(x) + tf(y) \le 1$, so that this inequality is an equality and $f(x) = f(y) = 1$. Viewing $x$ and $y$ as elements of $X^{**}$, we have $x, y \in \partial\|\cdot\|_*(f)$, so that $\|\cdot\|_*$ is not differentiable at $f$. □

Proposition 6.36 Let $(X, \|\cdot\|)$ be a normed space. If the dual norm $\|\cdot\|_*$ is LUR, then $\|\cdot\|$ is Fréchet differentiable on $X \setminus \{0\}$.

Proof We use the Šmulian Test (c). Let $x \in S_X$. Using a corollary of the Hahn-Banach theorem, we pick $f \in S_{X^*}$ such that $f(x) = 1$. Let $(f_n)$ be a sequence in $S_{X^*}$ such that $(f_n(x)) \to 1$. Since
\[ 2 \ge \|f + f_n\|_* \ge (f + f_n)(x) \to 2, \]
we have $\lim_n (2\|f\|_*^2 + 2\|f_n\|_*^2 - \|f + f_n\|_*^2) = 0$, hence, by the LUR property, $(f_n) \to f$. Then, by Proposition 6.34, $\|\cdot\|$ is Fréchet differentiable at $x$, hence on $]0, +\infty[\,x$. □

So, it will be useful to detect when a norm on the dual of $X$ is a dual norm.
6.4 *Applications to the Geometry of Normed Spaces
Lemma 6.15 An equivalent norm $\|\cdot\|_*$ on the dual $X^*$ of a Banach space $X$ is the dual norm of an equivalent norm $\|\cdot\|_X$ on $X$ if and only if it is weak$^*$ lower semicontinuous.

Proof If $\|\cdot\|_*$ is the dual norm of an equivalent norm $\|\cdot\|_X$, then $\|\cdot\|_* = \sup\{\langle x, \cdot\rangle : x \in X, \|x\|_X = 1\}$ is weak$^*$ lower semicontinuous as a supremum of weak$^*$ continuous linear forms. Conversely, if $\|\cdot\|_*$ is weak$^*$ lower semicontinuous, its unit ball $B^*$ is convex and weak$^*$ closed, hence coincides with its bipolar. Then one can see that $\|\cdot\|_*$ is the dual norm of the Minkowski gauge of the polar set of $B^*$, a compatible norm on $X$. □

In order to deal with quantitative properties, it is useful to introduce the function $\rho_X : \mathbb{R}_+ \to \mathbb{R}_+$ associated with a norm $\|\cdot\|$ on $X$ given by
\[ \rho_X(t) := \sup\{(1/2)(\|x + tu\| + \|x - tu\|) - \|x\| : x, u \in S_X\}, \qquad t \in \mathbb{R}_+. \]
It is called the modulus of smoothness of $(X, \|\cdot\|)$; it is a modulus since $\rho_X(t) \le t$ for $t \in \mathbb{R}_+$. Moreover, since $\|x + tu\| + \|x - tu\| \ge \|2tu\|$, one has $\rho_X(t) \ge t - 1$ for all $t \in \mathbb{R}_+$.

Definition 6.4 The space $(X, \|\cdot\|)$ is uniformly smooth if the function $\rho_X$ is a remainder (i.e., $\rho_X(t)/t \to 0$ as $t \to 0_+$).

Proposition 6.37 The function $\rho_X$ is starshaped, i.e. $t \mapsto \rho_X(t)/t$ is nondecreasing.

It can be shown (see [265, Prop. 3.5.1]) that even $t \mapsto \rho_X(t)/t^2$ is nondecreasing.

Proof For $t > s > 0$, $x, u \in S_X$, by a property of convex functions, one has
\[ \frac{1}{s}(\|x + su\| - \|x\|) \le \frac{1}{t}(\|x + tu\| - \|x\|) \]
and a similar inequality with $u$ changed into $-u$; hence the result. □
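In order to give a quantitative measure of rotundity, it is useful to introduce the function $\gamma_X : \mathbb{R} \to \mathbb{R}_\infty$ given by $\gamma_X(s) := +\infty$ for $s \in \mathbb{R} \setminus [0, 1]$ and
\[ \gamma_X(s) := \inf\{1 - \|(x + y)/2\| : x, y \in S_X,\ \|(x - y)/2\| \ge s\}, \qquad s \in [0, 1]. \]
It can be shown that $\gamma_X$ can be given other expressions among which are
\[ \gamma_X(s) = \inf\{1 - \|(x + y)/2\| : x, y \in S_X,\ \|(x - y)/2\| = s\}, \qquad s \in [0, 1], \]
\[ \gamma_X(s) = \inf\{1 - \|(x + y)/2\| : x, y \in B_X,\ \|(x - y)/2\| \ge s\}, \qquad s \in [0, 1]. \]
Let us relate this function to the definition of uniform convexity of the norm we gave in Sect. 3.4.3:
\[ \forall \varepsilon > 0\ \exists \delta > 0 : \quad x, y \in B_X,\ \|(x - y)/2\| \ge \varepsilon \ \Longrightarrow\ \|(x + y)/2\| < 1 - \delta. \tag{6.33} \]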
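As a numerical illustration of the definition above, the following sketch estimates $\gamma_X(s)$ for the Euclidean plane by a grid search and compares it with the closed form $1 - \sqrt{1 - s^2}$ quoted for Hilbert spaces in the exercises of this section. The function name `gamma_euclid`, the grid size, and the reduction to $x = (1, 0)$ by rotation invariance are assumptions made for this illustration only.

```python
import math

def gamma_euclid(s, samples=4000):
    """Grid estimate of inf{1 - ||(x+y)/2|| : x, y in S_X, ||(x-y)/2|| >= s}
    for X = R^2 with the Euclidean norm (hypothetical helper).

    By rotation invariance we fix x = (1, 0) and let y run over the
    upper half of the unit circle."""
    best = math.inf
    for k in range(samples + 1):
        a = math.pi * k / samples
        y = (math.cos(a), math.sin(a))
        half_diff = math.hypot(1.0 - y[0], -y[1]) / 2.0
        if half_diff >= s:
            half_sum = math.hypot(1.0 + y[0], y[1]) / 2.0
            best = min(best, 1.0 - half_sum)
    return best

if __name__ == "__main__":
    for s in (0.1, 0.5, 0.9):
        print(s, gamma_euclid(s), 1.0 - math.sqrt(1.0 - s * s))
```

The estimate is positive for every $s > 0$, which is the numerical counterpart of the uniform rotundity of the Euclidean norm.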
Proposition 6.38 A norm $\|\cdot\|$ on a vector space $X \ne \{0\}$ is uniformly rotund (or uniformly convex) if and only if $\gamma_X$ is firm, i.e. such that $\gamma_X(s) > 0$ for all $s > 0$.

Proof If $\gamma_X$ is firm, for $\varepsilon > 0$ we can take $\delta := \gamma_X(\varepsilon)$ in (6.33): given $x, y \in B_X$ such that $1 - \|(x + y)/2\| < \gamma_X(\varepsilon)$ we must have $\|(x - y)/2\| < \varepsilon$. Conversely, if the norm $\|\cdot\|$ is uniformly rotund, given $s \in\, ]0, 1]$ we can find $\delta > 0$ such that for all $x, y \in B_X$ satisfying $\|(x - y)/2\| \ge s$ we have $1 - \|(x + y)/2\| > \delta$, hence $\gamma_X(s) \ge \delta$: $\gamma_X$ is firm. □

Note that since $\gamma_X$ is nondecreasing, $(X, \|\cdot\|)$ is uniformly rotund if and only if $\gamma_X$ is forcing, i.e. $(s_n) \to 0$ whenever $(\gamma_X(s_n)) \to 0$. We shall see (Theorem 8.27) that the usual norm on $L_p(S)$ for $p > 1$ and $S$ a measure space is uniformly rotund.

Exercise Show that for $p \in\, ]1, \infty[$ the space $\ell_p$ of real sequences $x := (x_n)$ such that $\|x\|_p := (\sum_{n \ge 0} |x_n|^p)^{1/p}$ is finite is uniformly rotund for $\|\cdot\|_p$.

The function $\delta_X : [0, 2] \to \mathbb{R}$ given by $\delta_X(t) := \gamma_X(t/2)$ for $t \in [0, 2]$ is classically called the modulus of rotundity of $(X, \|\cdot\|)$, but it seems to us that $\gamma_X$ is preferable to $\delta_X$ in view of the following remarkable property. We phrase it in the framework of normed spaces in metric duality, which enables us to take for $Y$ either the dual space of $X$ or a predual of $X$. In fact, we use the crucial properties $\|x\| = \sup\{\langle x, y\rangle : y \in S_Y\}$, $\|y\| = \sup\{\langle x, y\rangle : x \in S_X\}$.

Proposition 6.39 (Lindenstrauss) If $X$ and $Y$ are normed vector spaces in metric duality, then $\rho_Y = \gamma_X^*$ and $\rho_X = \gamma_Y^*$, the Fenchel conjugate $\gamma^*$ of a function $\gamma$ on $\mathbb{R}_+$ being the conjugate of the extension of $\gamma$ by $+\infty$ on $\mathbb{R} \setminus \mathbb{R}_+$; i.e. $\gamma^*$ is given by $\gamma^*(t) := \sup\{st - \gamma(s) : s \in \mathbb{R}_+\}$.

Proof By metric duality, for $t > 0$ one has
\begin{align*}
\rho_Y(t) &:= (1/2)\sup\{\|y + tv\| + \|y - tv\| - 2 : y, v \in S_Y\} \\
&= (1/2)\sup\{\langle y + tv, x\rangle + \langle y - tv, w\rangle - 2 : w, x \in S_X,\ y, v \in S_Y\} \\
&= (1/2)\sup\{\langle y, x + w\rangle + \langle tv, x - w\rangle - 2 : w, x \in S_X,\ y, v \in S_Y\} \\
&= (1/2)\sup\{\|x + w\| + t\|x - w\| - 2 : w, x \in S_X\} \\
&= \sup\{\|(x + w)/2\| + ts - 1 : w, x \in S_X,\ s \in \mathbb{R}_+,\ s \le \|(x - w)/2\|\} \\
&= \sup\{st - \inf\{1 - \|(x + w)/2\| : w, x \in S_X,\ \|(x - w)/2\| \ge s\} : s \in \mathbb{R}_+\}.
\end{align*}
This last expression is nothing but $\sup\{st - \gamma_X(s) : s \in \mathbb{R}_+\} = \gamma_X^*(t)$. Finally, the roles of $X$ and $Y$ are symmetric. □

Corollary 6.17 (Šmulian) A normed space is uniformly rotund if and only if its dual space is uniformly smooth. A normed space is uniformly smooth if and only if its dual space is uniformly rotund.
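The conjugacy relation of Proposition 6.39 can be sanity-checked numerically in a Hilbert space $H$, which is in metric duality with itself. The sketch below assumes the closed forms $\gamma_H(s) = 1 - \sqrt{1 - s^2}$ (stated in the exercises of this section) and $\rho_H(t) = \sqrt{1 + t^2} - 1$ (a standard consequence of the parallelogram law, not stated in the text), and approximates the supremum defining the Fenchel conjugate on a grid; all function names are hypothetical.

```python
import math

def gamma_H(s):
    """Closed form of gamma_H on [0, 1] for a Hilbert space (assumed)."""
    return 1.0 - math.sqrt(1.0 - s * s)

def rho_H(t):
    """Closed form of the modulus of smoothness of a Hilbert space (assumed)."""
    return math.sqrt(1.0 + t * t) - 1.0

def conjugate_of_gamma_H(t, grid=20000):
    """Grid approximation of sup{s*t - gamma_H(s) : s in [0, 1]}.

    Outside [0, 1] the function gamma_H is +infinity, so those values of s
    do not contribute to the supremum."""
    return max(s * t - gamma_H(s) for s in (k / grid for k in range(grid + 1)))

if __name__ == "__main__":
    for t in (0.1, 1.0, 3.0):
        print(t, conjugate_of_gamma_H(t), rho_H(t))
```

The two columns agree up to the grid error, as the proposition predicts for $X = Y = H$.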
Proof It suffices to show that if $X$ and $Y$ are two normed spaces in metric duality, then $X$ is uniformly rotund if and only if $Y$ is uniformly smooth. This follows from the last proposition and Lemma 6.8. □

One can deduce from the last corollary an analogue to Proposition 6.33.

Proposition 6.40 The space $(X, \|\cdot\|)$ is uniformly smooth if and only if for any weight function $h$, $j_h$ is uniformly smooth.

Proof Since the unit sphere of $X$ is dense in the unit sphere of the completion $\widehat{X}$ of $X$ we may suppose $X$ is complete. Given an arbitrary weight function $h$, the function $j_h$ is uniformly smooth if and only if $(j_h)^*$ is uniformly convex (Theorem 6.20). By Lemma 6.12, $(j_h)^* = k \circ \|\cdot\|_*$ and $k$ is a weight function. Then, by Proposition 6.33, $(j_h)^*$ is a uniformly convex function if and only if $(X^*, \|\cdot\|_*)$ is uniformly convex. Combining these equivalences we get the announced assertion. □

The restriction of the duality map $J$ to the unit sphere of a uniformly smooth Banach space is uniformly continuous. More precisely, one can give a modulus of local uniform continuity of $S(\cdot) := J(\cdot)/\|\cdot\|$ on $X \setminus \{0\}$, remembering that $\rho_X$ is a remainder on $\mathbb{R}_+$ or that $\gamma_{X^*}$ is firm.

Proposition 6.41 The duality map $J$ of a uniformly smooth Banach space $X$ is uniformly continuous on bounded subsets of $X$.

Proof By Corollary 6.17, $X^*$ is uniformly convex. Thus, given $\varepsilon > 0$ and $x^*, y^* \in B_{X^*}$ such that $1 - \|(x^* + y^*)/2\| < \gamma_{X^*}(\varepsilon/2)$, we have $\|x^* - y^*\| < \varepsilon$. Given $x, y \in S_X$ such that $\|x - y\| < \delta := 2\gamma_{X^*}(\varepsilon/2)$ we have
\[ \|J(x) + J(y)\| \ge \langle y, J(x) + J(y)\rangle = \langle x, J(x)\rangle + \langle y, J(y)\rangle - \langle x - y, J(x)\rangle \ge 2 - \|x - y\| > 2 - \delta = 2(1 - \gamma_{X^*}(\varepsilon/2)), \]
hence $\|J(x) - J(y)\| < \varepsilon$. This shows that $J$ is uniformly continuous on $S_X$.

Let us prove that $J$ is uniformly continuous on $B_X$. Given $\varepsilon > 0$, for $x, y \in (\varepsilon/2)B_X$ we have $\|J(x) - J(y)\| \le \|J(x)\| + \|J(y)\| \le \varepsilon$ since $\|J(w)\| = \|w\|$ for all $w \in X$. Assuming $x \in B_X \setminus (\varepsilon/2)B_X$, $y \in B_X$, with $\|x - y\| < \delta\varepsilon/4 \le \varepsilon/2$ (since $\delta := 2\gamma_{X^*}(\varepsilon/2) \le 2$), we have $y \ne 0$, so that, setting $u := x/\|x\|$, $v := y/\|y\|$ and using the inequalities $|\|y\| - \|x\|| \le \|x - y\|$,
\[ \|u - v\| \le \frac{1}{\|x\|}\|x - y\| + \frac{1}{\|x\|}\bigl|\|y\| - \|x\|\bigr| \le \frac{4}{\varepsilon}\|x - y\| < \delta, \]
we see that
\[ \|J(x) - J(y)\| \le \|x\| \cdot \|S(u) - S(v)\| + \bigl\|\|x\| S(v) - \|y\| S(v)\bigr\| \le 2\varepsilon \]
since $J(x) = \|x\| S(u)$ and $\|S(v)\| = 1$. This shows that $J$ is uniformly continuous on $B_X$. By homogeneity, the same holds on any bounded set. □

Exercise Verify that for $\gamma_J(r) := (r/4)\gamma_{X^*}(r/4)$ one has $\|J(x) - J(y)\| \le \varepsilon$ whenever $x, y \in B_X$ are such that $\|x - y\| \le \gamma_J(\varepsilon)$.

Remark The unit duality map $S(\cdot)$ of a uniformly smooth Banach space $(X, \|\cdot\|)$ satisfies for all $x, y \in X \setminus \{0\}$ the following relation in which $u_x := x/\|x\|$, $u_y := y/\|y\|$:
\[ \|S(x) - S(y)\| \le \frac{2\rho_X(2\|u_x - u_y\|)}{\|u_x - u_y\|}. \tag{6.34} \]
In particular, for $J(\cdot) := \|\cdot\| S(\cdot)$, for $x, y \in S_X$ one has
\[ \|J(x) - J(y)\| \le \frac{2\rho_X(2\|x - y\|)}{\|x - y\|}. \]

Proof Since $S(x) = J(u_x)$ and $S(y) = J(u_y)$, to prove (6.34) we may assume that $x, y \in S_X$. By a general property of convex functions, since $S(\cdot) = \|\cdot\|'$ and $\langle S(u), v\rangle \le \|u + v\| - \|u\|$ for all $u \in X \setminus \{0\}$, $v \in X$, we have $\langle S(x), y - x\rangle \le \|y\| - \|x\| = 0$. Let $r := \|x - y\|$ and let $z \in rB_X$. By the preceding inequalities we have
\begin{align*}
\langle S(y), z\rangle - \langle S(x), z\rangle &\le \|y + z\| - \|x\| - \langle S(x), z\rangle \\
&\le \|y + z\| - 1 + \langle S(x), x - y - z\rangle \\
&\le \|y + z\| - 1 + \|2x - y - z\| - \|y\| \\
&= \|x + (y - x + z)\| + \|x - (y - x + z)\| - 2 \\
&\le 2\rho_X(\|y - x + z\|) \le 2\rho_X(2r).
\end{align*}
Taking the supremum over $z \in rB_X$ we get $r\|S(y) - S(x)\| \le 2\rho_X(2r)$ and relation (6.34). □

Example For a measure space $S$ and $p \in\, ]1, \infty[$, the duality map of $L_p(S)$ with its usual norm is given by $J(x)(s) := \|x\|_p^{2-p} |x(s)|^{p-2} x(s)$ for $x \in L_p(S)$, $s \in S$. For $p = 1$ one has $J(x) = \{y \in L_\infty(S) : y(s) \in \|x\|_1 \operatorname{sign} x(s)\}$.

Example For $X = W^1_{p,0}(\Omega)$, where $\Omega$ is a bounded open subset of $\mathbb{R}^d$,
\[ J(x) = -\|x\|_{1,p}^{2-p} \sum_{i=1}^{d} D_i\bigl(|D_i x|^{p-2} D_i x\bigr). \]
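The formula for the duality map of $L_p(S)$ can be checked in the simplest case where $S$ is a finite set with the counting measure, so that $L_p(S)$ is $\mathbb{R}^n$ with the $\ell_p$ norm. The following sketch (the helper names `p_norm` and `duality_map` are hypothetical) verifies the two relations characterizing $J(x)$, namely $\langle J(x), x\rangle = \|x\|_p^2$ and $\|J(x)\|_q = \|x\|_p$ with $1/p + 1/q = 1$.

```python
import math

def p_norm(x, p):
    """l_p norm of a finite sequence."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

def duality_map(x, p):
    """J(x)_i = ||x||_p^(2-p) |x_i|^(p-2) x_i for 1 < p < infinity.

    Written as |x_i|^(p-1) * sign(x_i) to avoid 0 ** negative when p < 2."""
    n = p_norm(x, p)
    if n == 0.0:
        return [0.0 for _ in x]
    return [n ** (2.0 - p) * abs(v) ** (p - 1.0) * math.copysign(1.0, v) for v in x]

if __name__ == "__main__":
    x, p = [3.0, -4.0, 0.5], 3.0
    q = p / (p - 1.0)
    jx = duality_map(x, p)
    print(sum(a * b for a, b in zip(jx, x)), p_norm(x, p) ** 2)
    print(p_norm(jx, q), p_norm(x, p))
```

For $p = 2$ the map reduces to the identity, in accordance with the Hilbert space case.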
Finally, let us note that in any normed space $X$ one can define a kind of substitute to an inner product by setting
\[ \langle x \mid y\rangle_+ := \sup_{x^* \in J(x)} \langle x^*, y\rangle = \lim_{t \to 0_+} \frac{1}{2t}\bigl(\|x + ty\|^2 - \|x\|^2\bigr), \qquad x, y \in X, \]
as $J(x) = \partial j(x)$, with $j(\cdot) := \frac{1}{2}\|\cdot\|^2$, using Theorem 6.4. It is called the semi-inner product of $X$. This definition is related to the notion of a semi-scalar product introduced in Proposition 3.15 by the inequality
\[ [x, y] \le \langle x \mid y\rangle_+ \qquad \forall x, y \in X \]
since $[x, \cdot]$ is an element of $J(x) = \partial j(x)$ for all $x \in X$. When $X$ is a smooth space one has $\langle x \mid y\rangle_+ = \langle J(x), y\rangle$ for all $x, y \in X$ and $\langle x \mid \cdot\rangle_+$ is linear and continuous, not just sublinear and continuous. In the general case the following properties hold. They are left as exercises stemming from Proposition 6.9.

Lemma 6.16 For any Banach space $(X, \|\cdot\|)$ the following properties hold.
(a) $|\langle x \mid y\rangle_+| \le \|x\| \cdot \|y\|$ and $(x, y) \mapsto \langle x \mid y\rangle_+$ is upper semicontinuous.
(b) $\langle x \mid y + z\rangle_+ \le \langle x \mid y\rangle_+ + \langle x \mid z\rangle_+$.
(c) $\langle x \mid y + rx\rangle_+ = \langle x \mid y\rangle_+ + r\|x\|^2$ for all $r \in \mathbb{R}$.
(d) If $u : T \to X$ is right differentiable at some $t$ in the interior of an interval $T$, then $f(\cdot) := (1/2)\|u(\cdot)\|^2$ is right differentiable at $t$ and its right derivative is $f'_r(t) = \langle u(t) \mid u'_r(t)\rangle_+$.
(e) If $X^*$ is uniformly convex, then $\langle \cdot \mid \cdot\rangle_+$ is uniformly continuous on bounded subsets of $X \times X$.
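To see concretely why $\langle x \mid \cdot\rangle_+$ is only sublinear in a nonsmooth space, one may approximate the limit defining the semi-inner product by a difference quotient. The sketch below does this in $\mathbb{R}^2$ for the $\ell_1$ norm and for the Euclidean norm; the function `semi_inner` and the step size `t` are assumptions of this illustration, and the quotient is only an approximation of the limit.

```python
import math

def norm1(x):
    """l_1 norm on R^n."""
    return sum(abs(v) for v in x)

def norm2(x):
    """Euclidean norm on R^n."""
    return math.sqrt(sum(v * v for v in x))

def semi_inner(norm, x, y, t=1e-6):
    """Difference quotient (||x + t y||^2 - ||x||^2) / (2 t) approximating <x | y>_+."""
    shifted = [a + t * b for a, b in zip(x, y)]
    return (norm(shifted) ** 2 - norm(x) ** 2) / (2.0 * t)

if __name__ == "__main__":
    x, y, minus_y = (1.0, 0.0), (0.0, 1.0), (0.0, -1.0)
    # In l_1 both values are close to 1, so <x | y>_+ + <x | -y>_+ is not 0.
    print(semi_inner(norm1, x, y), semi_inner(norm1, x, minus_y))
    # In the Euclidean plane both values are close to the inner product 0.
    print(semi_inner(norm2, x, y), semi_inner(norm2, x, minus_y))
```

The same quotient also illustrates property (c) of Lemma 6.16: replacing $y$ by $y + rx$ shifts the value by approximately $r\|x\|^2$.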
The following renorming theorem is of interest. We refer to specialized monographs (e.g. [119, Thm 8.20] as a recent reference) for the proof of its second assertion.

Theorem 6.22 (a) Every separable Banach space $X$ has an equivalent norm that is Hadamard differentiable on $X \setminus \{0\}$.
(b) Every Banach space $X$ whose dual is separable has an equivalent norm that is Fréchet differentiable on $X \setminus \{0\}$.

Proof (a) Let $(e_n)_{n \in \mathbb{N}}$ be a countable dense subset of $B_X$. Define a norm on $X^*$ by
\[ \|f\|_* = \Bigl[\|f\|_0^2 + \sum_{n=0}^{\infty} 2^{-n} f^2(e_n)\Bigr]^{1/2}, \qquad f \in X^*, \]
where $\|\cdot\|_0$ is the original norm of $X^*$. The norm $\|\cdot\|_*$ is easily seen to be weak$^*$ lower semicontinuous, so that it is the dual norm of some norm $\|\cdot\|$ on $X$. In view of Proposition 6.35, it suffices to show that $\|\cdot\|_*$ is strictly convex. Let $f, g \in X^*$ be such that $\|f + g\|_*^2 = 2\|f\|_*^2 + 2\|g\|_*^2$. Since $2\|f\|_0^2 + 2\|g\|_0^2 \ge \|f + g\|_0^2$ and $2f^2(e_n) + 2g^2(e_n) \ge (f + g)^2(e_n)$ for all $n$, we get that these last inequalities are equalities, so that $(f - g)^2(e_n) = 2f^2(e_n) + 2g^2(e_n) - (f + g)^2(e_n) = 0$ for all $n$. Thus $f(e_n) = g(e_n)$ for all $n$ and, by density, $f = g$. □

The proofs of the following theorems are beyond the scope of this book. However, the results may be useful.

Theorem 6.23 (Asplund) Every reflexive Banach space can be provided with an equivalent norm for which both $X$ and $X^*$ endowed with the dual norm are strictly rotund.

Theorem 6.24 (Kadec, Troyanski) Every reflexive Banach space can be provided with an equivalent norm for which both $X$ and $X^*$ are locally uniformly rotund, and uniformly smooth.

The notion of an obtuse angle between two vectors can be extended to normed spaces in a number of equivalent ways, as shown in the next lemma.

Lemma 6.17 (Kato) For given elements $x, y$ of a Banach space $X$ the following assertions are equivalent:
(a) $\|x\| \le \|x - ty\|$ for all $t \in \mathbb{R}_+$;
(b) there exists an $x^* \in J(x)$ such that $\langle x^*, y\rangle \le 0$;
(c) there exists a semi-scalar product $[\cdot, \cdot]$ on $X$ such that $[x, y] \le 0$;
(d) $\langle x \mid -y\rangle_+ \ge 0$, where $\langle u \mid v\rangle_+ := \lim_{t \to 0_+} (1/2t)(\|u + tv\|^2 - \|u\|^2)$.

Proof (a)$\Rightarrow$(b) Without loss of generality we may suppose $\|x\| = 1$. For $t > 0$ small enough let us take $u_t^* \in S(x - ty) := \|x - ty\|^{-1} J(x - ty)$, so that $\|u_t^*\| = 1$ and
\[ \|x\| \le \|x - ty\| = \langle u_t^*, x - ty\rangle = \langle u_t^*, x\rangle - t\langle u_t^*, y\rangle \le \|x\| - t\langle u_t^*, y\rangle. \]
By the Banach-Alaoglu theorem there exists a weak$^*$ limit point $u^* \in B_{X^*}$ of the net $(u_t^*)_{t > 0}$ and the preceding relations imply $\langle u^*, y\rangle \le 0$ and $\langle u^*, x\rangle = \|x\|$ (replacing $y$ with $x$), so that $u^* \in S(x)$ and $x^* := \|x\| u^* \in J(x)$ satisfies $\langle x^*, y\rangle \le 0$.

(b)$\Rightarrow$(a) Given $x^* \in J(x)$ such that $\langle x^*, y\rangle \le 0$ and $t \in \mathbb{R}_+$, we have
\[ \|x^*\| \cdot \|x\| = \langle x^*, x\rangle \le \langle x^*, x - ty\rangle \le \|x^*\| \cdot \|x - ty\|, \]
hence $\|x\| \le \|x - ty\|$ if $x \ne 0$. If $x = 0$ the inequality is obvious.
(b)$\Rightarrow$(c) This follows from the fact that we can choose a selection $j$ of $J$ such that $j(x) = x^*$; then, setting $[u, v] = \langle j(u), v\rangle$ for $(u, v) \in X \times X$, we have $[x, y] = \langle x^*, y\rangle$.

(c)$\Rightarrow$(b) Given a semi-scalar product $[\cdot, \cdot]$ on $X$ and $x \in X$, we have $x^* := [x, \cdot] \in J(x)$.

(c)$\Rightarrow$(d) Since for all $(u, v) \in X \times X$ we have $\langle u \mid v\rangle_+ \ge [u, v]$, taking $u = x$, $v := -y$ we get $\langle x \mid -y\rangle_+ \ge [x, -y] = -[x, y] \ge 0$.

(d)$\Rightarrow$(a) Since $f : t \mapsto (1/2)\|x - ty\|^2$ is convex, with $j := (1/2)\|\cdot\|^2$, for $t \in \mathbb{R}_+$ we have $\|x - ty\|^2 - \|x\|^2 \ge 2t\, dj(x, -y) = 2t\langle x \mid -y\rangle_+ \ge 0$. □

Exercises

1. Show that the spaces $\ell_1$, $c_0$ and $\ell_\infty$ are not strictly convex. [Hint: for $e_1 := (1, 0, \ldots)$ and $e_2 := (0, 1, 0, \ldots)$ one has $\|(1/2)(e_1 + e_2)\|_1 = 1$ and for $u := e_1 + e_2$, $v := e_1 - e_2$ one has $u, v \in S_X$ for $X := c_0$ and $(1/2)(u + v) \in S_X$.]
2. Show that the space $C([0, 1])$ is not strictly convex for the norm $\|\cdot\|_\infty$.
3. Show that for a measure space $S$, the spaces $L_1(S)$ and $L_\infty(S)$ are not strictly convex.
4. Show that a Hilbert space $H$ is uniformly convex and $\gamma_H(s) = 1 - \sqrt{1 - s^2}$. It can be shown that for any normed space $X$ one has $\gamma_X \le \gamma_H$, a result due to Nordlander (1960).
5. Using the Šmulian Test, show that if the norm of a normed space is Fréchet differentiable on $X \setminus \{0\}$, then it is of class $C^1$ there.
6. Show that a normed space $X$ is strictly convex if and only if each point $x$ of its unit sphere $S_X$ is an exposed point of the unit ball $B_X$, i.e. for each $x \in S_X$ there exists an $f \in X^*$ such that $f(x) > f(u)$ for all $u \in B_X \setminus \{x\}$.
7. Show that a normed space $(X, \|\cdot\|)$ is uniformly rotund if for any sequences $(x_n)$, $(y_n)$ in $B_X$ such that $(\|x_n + y_n\|) \to 2$ one has $(\|x_n - y_n\|) \to 0$.
8*. Let $S$ be a locally compact topological space and let $X := C_0(S)$ be the space of bounded continuous functions on $S$ converging to $0$ at infinity: $x \in C_0(S)$ if and only if $x(\cdot)$ is bounded, continuous on $S$ and if, for any $\varepsilon > 0$, one can find a compact subset $K$ of $S$ such that $\sup |x(S \setminus K)| \le \varepsilon$.
(a) Show that the supremum norm $\|\cdot\|_\infty$ is Hadamard differentiable at $x \in X$ if and only if $M_x := \{s \in S : |x(s)| = \|x\|_\infty\}$ is a singleton.
(b) Show that $\|\cdot\|_\infty$ is Fréchet differentiable at $x \in X$ if and only if $M_x$ is a singleton $\{s\}$ such that $s$ is an isolated point of $S$.
9*. Let $X := \ell_1(I)$ be the space of absolutely summable families $x := (x_i)_{i \in I}$ endowed with the norm $\|x\|_1 := \sum_{i \in I} |x_i|$.
(a) Show that $\|\cdot\|_1$ is nowhere Hadamard differentiable if $I$ is uncountable.
(b) If $I := \mathbb{N}$, show that $\|\cdot\|_1$ is Hadamard differentiable at $x$ if and only if $x_i \ne 0$ for all $i \in I$.
(c) If $I := \mathbb{N}$, show that $\|\cdot\|_1$ is nowhere Fréchet differentiable.
10. Show that the space $X := L_p(S, \mu)$ ($p > 1$) is uniformly rotund and uniformly smooth (Hanner, 1956) with
$\gamma_X(s) = (p - 1)s^2/2 + o(s^2)$ for $p \in\, ]1, 2]$, $\gamma_X(s) = s^p/p + o(s^p)$ for $p \ge 2$;
$\rho_X(t) = t^p/p + o(t^p)$ for $p \in\, ]1, 2]$, $\rho_X(t) = (p - 1)t^2/2 + o(t^2)$ for $p \ge 2$.
11. Let $S$ be a subset of a normed space $X$, let $x \in X$ and let $\bar{x} \in S$ be such that $\|x - \bar{x}\| \le \|x - w\|$ for all $w \in S$. Let $T(S, \bar{x})$ be the set of $v \in X$ such that there exist sequences $(v_n) \to v$, $(t_n) \to 0_+$ satisfying $\bar{x} + t_n v_n \in S$ for all $n \in \mathbb{N}$. Show that $\langle x - \bar{x} \mid -v\rangle_+ \ge 0$ for all $v \in T(S, \bar{x})$. In this sense one can say that $x - \bar{x}$ is normal to $S$ at $\bar{x}$.
12*. Show that for $p \in\, ]1, \infty[$ there exists some $\alpha_p \in \mathbb{P}$ such that for all $r, s \in \mathbb{R}$ one has
\[ \bigl| |r|^{p-2} r - |s|^{p-2} s \bigr| \le \alpha_p (|r| + |s|)^{p-2} |r - s|. \]
Verify that the function $g : t \mapsto (1 + t)^{2-p}(1 - t)^{-1}(1 - t^{p-1})$ on $[0, 1[$ has limit $2^{2-p}(p - 1)$ as $t \to 1$ and is bounded on $[0, 1[$. Deduce from this the fact that there exist constants $c_p$, $c'_p$ such that for all $a \in \mathbb{P}$, $b \in\, ]0, a]$ one has
\[ c'_p (a + b)^{p-2}(a - b)^2 \le (a^{p-1} - b^{p-1})(a - b) \le c_p (a + b)^{p-2}(a - b)^2. \]
Conclude that for $p \in [2, \infty[$, the function $k_p : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$ given by $k_p(x, y) := (\|x\| + \|y\|)^{p-2}\|x - y\|^2$ for $x, y \in \mathbb{R}^d$ endowed with the Euclidean norm satisfies the inequalities
\[ 2^{1-p} k_p(x, y) \le \langle \|x\|^{p-2} x - \|y\|^{p-2} y \mid x - y\rangle \le c_p k_p(x, y). \]
[See [74, Prop. 17.3].] Find some consequences for the space $L_p(S, \mu)$, where $(S, \mathcal{S}, \mu)$ is a finite measure space.
6.5 Regularization of Convex Functions

It may be useful to approximate a function by a sequence of regular functions. For a locally integrable function on $\mathbb{R}^d$ a standard means is to use integral convolution with a smooth bump function. In the case of a convex function on a normed space, classical processes are the Baire regularization and the Moreau regularization. Although some of the following results can be extended to the case of functions defined on the dual of a Banach space, for the sake of simplicity we only consider functions defined on a reflexive space. In fact, our first result can be given in the framework of metric spaces.

Proposition 6.42 (Baire, Hausdorff, McShane, Pasch) Let $f : X \to \mathbb{R}_\infty$ be a proper, lower semicontinuous function on a metric space $X$. Suppose there exist $b, c \in \mathbb{R}$, $\bar{x} \in X$ such that $f(\cdot) \ge b - c\,d(\cdot, \bar{x})$. Then, for all $n \in \mathbb{N}$ the function $f_n : X \to \mathbb{R}$ given by $f_n(x) := \inf_{u \in X}(f(u) + n\,d(u, x))$ for $x \in X$ is Lipschitzian and the sequence $(f_n)$ pointwise converges to $f$ on $X$.

Since a lower semicontinuous proper convex function on a normed space is bounded below by a continuous affine function, this approximation result holds for such a function. Also, note that when $X$ is a normed space, one has $f_n := f \,\square\, n\|\cdot\|$, the infimal convolution of $f$ and $n\|\cdot\|$.

Proof For $n \in \mathbb{N}$, $u \in X$, let $g_{n,u} : X \to \mathbb{R}_\infty$ be given by $g_{n,u}(x) = f(u) + n\,d(u, x)$ for $x \in X$. Since $f_n(x) := \inf_{u \in X} g_{n,u}(x)$ for all $x \in X$, taking $u = x$ in the infimum, we see that $f_n(x) \le f(x)$. Moreover, taking some $u \in \operatorname{dom} f$, we see that $f_n(x) \le f(u) + n\,d(u, x) < \infty$ for all $x \in X$. Since for $n \ge c$
\[ g_{n,u}(x) \ge b - c\,d(u, \bar{x}) + n\,d(u, x) \ge b + (n - c)\,d(u, \bar{x}) - n\,d(x, \bar{x}) \tag{6.35} \]
we have $f_n(x) := \inf_u g_{n,u}(x) \ge b - n\,d(x, \bar{x}) > -\infty$ and since the Lipschitz rate of $g_{n,u}$ is $n$, the function $f_n := \inf_{u \in X} g_{n,u}$ is Lipschitzian with rate at most $n$.

Given $x \in X$ and $s < f(x)$, since $f$ is lower semicontinuous at $x$, we can find $r > 0$ such that $f(u) > s$ for all $u \in B(x, r)$. Then, for $u \in B(x, r)$ we have $g_{n,u}(x) \ge f(u) > s$, whereas for $u \in X \setminus B(x, r)$ we have $g_{n,u}(x) \ge b - c\,d(x, \bar{x}) + (n - c)r \ge s$ provided $n$ is large enough. Thus, for such an $n$ we get $f_n(x) \ge s$. This shows that $\lim_n f_n(x) = f(x)$. □

Exercise Let $f : X \to \mathbb{R}_\infty$ be the indicator function $\iota_C$ of a nonempty subset $C$ of $X$. Identify $f_n$ and transcribe the conclusion of the proposition.

Exercise Let $f : X \to \mathbb{R}_\infty$ be a closed convex function on a uniformly convex Banach space and let $C := \operatorname{cl}(\operatorname{dom} f)$. Show that for every $x \in X$ and $n \in \mathbb{N} \setminus \{0\}$ there exists a unique point $w_n \in X$ such that $f(w_n) + n\|x - w_n\| = \inf\{f(w) + n\|x - w\| : w \in X\}$ and that $(w_n) \to P_C(x)$, the unique point $w \in C$ such that $\|w - x\| = \inf_{u \in C}\|u - x\|$.

Exercise (Subdifferential Determination: The Approach of Zlateva [269]) Let $f : X \to \mathbb{R}_\infty$ be a closed convex function on a Banach space $X$. For $n \in \mathbb{N} \setminus \{0\}$ and $\varepsilon \in \mathbb{P}$, setting $f_n := f \,\square\, n\|\cdot\|$, let
\[ M_{n,\varepsilon}(x) := \{w \in X : f(w) + n\|x - w\| \le \inf_{u \in X}(f(u) + n\|x - u\| + \varepsilon)\}, \]
\[ \partial_\varepsilon f(x) := \{x^* \in X^* : f^*(x^*) + f(x) - \langle x^*, x\rangle \le \varepsilon\}. \]
(a) Show that for all $x \in \operatorname{dom} f$, $n \in \mathbb{N} \setminus \{0\}$, $\varepsilon \in \mathbb{P}$, $w \in M_{n,\varepsilon}(x)$ one has $\partial f_n(x) \subset \partial_\varepsilon f(w) \cap \partial_\varepsilon n\|\cdot\|(x - w)$.
(b) Assume that $f(0) = 0$ and $0 \in \partial f(0)$. Show that for all $r \in \mathbb{P}$, $x \in B[0, r]$, $\varepsilon \in\, ]0, 1]$, $n \ge 1/r$ one has $M_{n,\varepsilon}(x) \subset B[0, 3r]$.
(c) Assume that $f$ is as in (b) and let $g : X \to \mathbb{R}_\infty$ be closed proper convex and such that $g(0) = 0$ and $\partial f \subset \partial g$, i.e. $\partial f(x) \subset \partial g(x)$ for all $x \in X$. Show that for all $r \in \mathbb{P}$, $x \in B[0, r]$, $n \ge 1/r$ one has $\partial f_n(x) \subset \partial g_n(x)$ with $g_n := g \,\square\, n\|\cdot\|$.
(d) Deduce from Exercise 4 of Sect. 2.2 that for all $r > 0$ and $n \ge 1/r$ there exist some $c_{r,n} \in \mathbb{R}$ such that $f_n = g_n + c_{r,n}$ and $f = g + c_r$ on $B(0, r)$. Conclude that there exists some $c \in \mathbb{R}$ such that $f = g + c$ on $X$.

Let us turn to the most useful approximation result for convex functions. We use a parameter $r \in \mathbb{P}$ and an increasing function $h : \mathbb{R}_+ \to \mathbb{R}_+$ satisfying $h(0) = 0$ and the condition
\[ \forall c \in \mathbb{R} \qquad \liminf_{t \to \infty} \frac{k(t - c)}{k(t)} > 0 \tag{6.36} \]
for $k(t) := \int_0^t h(s)\,ds$. Such a condition is fulfilled if, for some $p > 1$, $k(t) := (1/p)t^p$ or if $k(t) := e^t$ for $t \in \mathbb{R}_+$. Given a function $f : X \to \mathbb{R}_\infty$, we set $j_h := k \circ \|\cdot\|$, $f_r := f \,\square\, r^{-1} j_h$:
\[ f_r(x) := \inf_{u \in X} f_{r,x}(u) \quad \text{with} \quad f_{r,x}(u) := f(u) + \frac{1}{r} j_h(x - u), \qquad x \in X. \]
In the usual case of the Moreau regularization, one takes $h(t) = t$ for $t \in \mathbb{R}_+$, so that $f_r(x) = \inf_{u \in X}(f(u) + (2r)^{-1}\|x - u\|^2)$; but the choice $h(t) := t^{p-1}$ can be convenient, for instance when dealing with $L_p$ spaces, with $p \in\, ]1, \infty[$. Also, the use of a weight $h$ enables us to take into account the growth properties of $f$. In the sequel we denote by $r_{f,h}$ or simply by $r_f$ the extended real given by
\[ r_f := \sup\{r \in \mathbb{P} : \exists m \in \mathbb{R} \quad f \ge m - \frac{1}{r} j_h\}. \]

Exercise Show that, when $f$ is bounded below on bounded sets, $1/r_{f,h}$ coincides with the $h$-coercivity rate
\[ c_{f,h} := \liminf_{\|x\| \to \infty} \frac{f(x)}{j_h(x)}. \]

Proposition 6.43 Let $f : X \to \mathbb{R}_\infty$ be a proper function on a Banach space $X$ and let $h : \mathbb{R}_+ \to \mathbb{R}_+$ be a weight function satisfying condition (6.36). Assume that $\inf(f - c j_h) > -\infty$ for some $c \in \mathbb{R}$. For $r \in\, ]0, r_f[$ let $f_r := f \,\square\, r^{-1} j_h$:
\[ f_r(x) := \inf_{u \in X}\Bigl(f(u) + \frac{1}{r} j_h(x - u)\Bigr) \quad \text{with} \quad j_h(v) := \int_0^{\|v\|} h(s)\,ds. \]
Then, denoting by $\bar{f}$ the lower semicontinuous hull of $f$, one has
\[ \lim_{r \to 0_+} f_r(x) = \sup_{r > 0} f_r(x) = \bar{f}(x) := \liminf_{w \to x} f(w), \qquad x \in X. \]
For any bounded subset $B$ of $X$ there exists some $r_B \in\, ]0, r_f[$ such that for all $r \in\, ]0, r_B]$ the function $f_r$ is finite and Lipschitzian on $B$. Moreover, for $x \in X$, denoting by $P_r(x)$ the set of minimizers of $f_{r,x} : u \mapsto f(u) + r^{-1} j_h(x - u)$, given $x \in \operatorname{dom} f$ one has $e(P_r(x), x) := \sup\{\|u - x\| : u \in P_r(x)\} \to 0$ as $r \to 0_+$.

The map $P_r$ is called the proximal map of $f$. When $f$ is convex, $P_r$ is characterized by the relation $0 \in \partial f_{r,x}(P_r(x))$ or, in view of Theorem 6.8, $j_h$ being convex continuous,
\[ \frac{1}{r} J_h(x - P_r(x)) \in \partial f(P_r(x)) \tag{6.37} \]
where $J_h := \partial j_h$ is characterized by $\langle v, J_h(v)\rangle = h(\|v\|)\|v\|$, $\|J_h(v)\| = h(\|v\|)$ (see Lemma 6.12). Moreover, since the function $(x, u) \mapsto f_{r,x}(u)$ is then convex, $f_r$ is convex.

Proof We first observe that for all $r \in \mathbb{P}$ the function $f_r$ is bounded above on bounded subsets: fixing $\bar{x} \in f^{-1}(\mathbb{R})$, $c \in \mathbb{R}_+$, for all $x \in cB_X$ we have $f_r(x) \le m_{r,c} := f(\bar{x}) + r^{-1} k(\|\bar{x}\| + c) < \infty$.

Let us show that for all $x \in X$, $r > 0$ we have $f_r(x) \le \bar{f}(x)$: given a sequence $(x_n)$ converging to $x$ such that $\lim_n f(x_n) = \bar{f}(x)$, since $j_h$ is continuous at $0$, we have
\[ f_r(x) \le \liminf_n f_{r,x}(x_n) = \liminf_n \bigl(f(x_n) + r^{-1} j_h(x - x_n)\bigr) = \bar{f}(x). \]

Let us prove that for any bounded subset $B$ there exists some $r_B \in\, ]0, r_f[$ such that for all $r \in\, ]0, r_B[$ the function $f_r$ is bounded below on $B$. Let $c > 0$ be such that $B \subset cB_X$ and let $\bar{r} \in\, ]0, r_f[$. By definition of $r_f$ there exists an $m \in \mathbb{R}$ such that $f + (1/\bar{r}) j_h \ge m$. Also, there exist some $a \in\, ]0, 1]$ and $\bar{s} \in \mathbb{P}$ such that for $s \ge \bar{s}$ one has $k(s - c) \ge a k(s)$. Thus, setting $r_B := a\bar{r}$, for $r \in\, ]0, r_B[$, $x \in cB_X$, $u \in X$, and $s := \|u\| \ge \bar{s}$, we have $j_h(u - x) \ge k(\|u\| - \|x\|) \ge k(s - c)$ since $k$ is nondecreasing, $j_h(u) = k(s)$, and
\[ f(u) + \frac{1}{r} j_h(u - x) \ge f(u) + \frac{1}{\bar{r}} j_h(u) - \frac{1}{\bar{r}} k(s) + \frac{1}{r} k(s - c) \ge m + \Bigl(\frac{a}{r} - \frac{1}{\bar{r}}\Bigr) k(s) \ge m + \Bigl(\frac{a}{r} - \frac{1}{\bar{r}}\Bigr) k(\bar{s}). \tag{6.38} \]
For $s := \|u\| \le \bar{s}$, by definition of $r_f$ one has $f(u) + (1/r) j_h(x - u) \ge m - (1/\bar{r}) j_h(u) \ge m - (1/\bar{r}) k(\bar{s})$. Thus $f_{r,x}$ is bounded below on $X$ by $m - (1/\bar{r}) k(\bar{s})$, uniformly on $x \in cB_X$: $\inf_{x \in cB_X} f_r(x) > -\infty$.
Since $k(s) \ge h(\bar{s})(s - \bar{s}) + k(\bar{s})$, with $h(\bar{s}) > 0$, estimate (6.38) shows that the function $f_{r,x} : u \mapsto f(u) + (1/r) j_h(x - u)$ is coercive, uniformly for $x \in cB_X$. Since for $x \in cB_X$ we have $f_r(x) \le m_{r,c}$, there exists some $\rho > 0$ such that
\[ f_r(x) = \inf\{f_{r,x}(u) : u \in \rho B_X\} \qquad \forall x \in cB_X. \tag{6.39} \]
Since $j_h$ is Lipschitzian on balls, we deduce from this relation that $f_r$ is Lipschitzian on $cB_X$, hence on $B$.

Given $x \in X$, let us show that $(f_r(x)) \to \bar{f}(x)$ as $r \to 0_+$ and that $e(P_r(x), x) \to 0$ as $r \to 0_+$ if $x \in \operatorname{dom} f$. Let $c \ge \|x\|$, $B := cB_X$, $r \in\, ]0, r_B[$ and let $\varepsilon > 0$, $\lambda \in \mathbb{R}$ be given. Let $u \in X$ be such that $t := \|u - x\| \ge \varepsilon$. If $t > \bar{s} + c$ we have $s := \|u\| \ge \bar{s}$ hence, for $r > 0$ small enough, namely $r < r_\lambda := a k(\bar{s})(\lambda - m + k(\bar{s})/\bar{r})^{-1}$, by (6.38) we get
\[ f(u) + (1/r) j_h(u - x) \ge m + (a/r - 1/\bar{r}) k(\bar{s}) > \lambda. \]
If $t := \|u - x\| \le \bar{s} + c$ we have $\|u\| \le \bar{s} + 2c$ and the assumption $f \ge m - (1/\bar{r}) j_h$ yields
\[ f(u) + (1/r) j_h(u - x) \ge m - (1/\bar{r}) k(\bar{s} + 2c) + (1/r) k(\varepsilon) > \lambda \]
for $r$ small enough, namely $r < r_{\lambda,\varepsilon}$ for some $r_{\lambda,\varepsilon} \in \mathbb{P}$. Taking $\lambda < \bar{f}(x)$ and choosing $\varepsilon > 0$ such that $f(w) \ge \lambda$ for all $w \in B(x, \varepsilon)$, we get $f_r(x) := \inf_{u \in X} f_{r,x}(u) \ge \lambda$ for $r$ as above. Thus $\liminf_{r \to 0_+} f_r(x) \ge \bar{f}(x)$ and since $f_r(x) \le \bar{f}(x)$ we obtain $(f_r(x)) \to \bar{f}(x)$. Now, for $x \in \operatorname{dom} f$, taking $\lambda \ge f(x)$, we see that for $r < \min(r_\lambda, r_{\lambda,\varepsilon})$, for all $u \in P_r(x)$ we must have $u \in B(x, \varepsilon)$. □

Additional assumptions enhance the interest of this regularization.

Theorem 6.25 Suppose $(X, \|\cdot\|)$ is reflexive and that $f$ is a weakly lower semicontinuous function such that $\inf(f - c j_h) > -\infty$ for some $c \in \mathbb{R}$. Then, for any bounded subset $B$ of $X$ there exists some $r_B > 0$ such that for $r \in\, ]0, r_B[$ the restriction to $B$ of the proximal multimap $P_r$ is closed and with nonempty values. If, moreover, $(X, \|\cdot\|)$ is strictly convex and $f$ is convex, then $P_r$ is a continuous map. Furthermore, if $(X^*, \|\cdot\|_*)$ satisfies the sequential dual Kadec-Klee Property, then, for $r$ large enough, $f_r$ is of class $C^1$ with derivative given by $Df_r(x) = (1/r) J(x - P_r(x))$.

Proof When $f$ is weakly lower semicontinuous, $f_{r,x}$ is weakly lower semicontinuous too and coercive, hence $f_{r,x}$ attains its infimum when $X$ is reflexive. Moreover, when $(X, \|\cdot\|)$ is strictly convex and $f$ is convex, $f_{r,x}$ is strictly convex too and its set of minimizers $P_r(x)$ is a singleton.

Given a bounded subset $B$ of $X$, $x \in B$, a convergent sequence $(x_n) \to x$ in $B$, $r \in\, ]0, r_B[$, the proof of relation (6.39) shows that any sequence $(u_n)$ satisfying $u_n \in P_r(x_n)$ for all $n$ is bounded. From any subsequence of $(u_n)$ we extract a subsequence $(u_{k(n)})$ that has a weak limit $u$. Passing to the limit in the relation $f_r(x_n) = f(u_n) + (1/r) j_h(x_n - u_n)$, we get
\[ f_r(x) \ge \liminf_n f(u_{k(n)}) + \liminf_n (1/r) j_h(x_{k(n)} - u_{k(n)}) \ge f(u) + (1/r) j_h(u - x). \]
This shows that $u \in P_r(x)$. When $P_r(x)$ is a singleton $\{z\}$, we get that $(u_n) \to u := z$.

Given $x \in \operatorname{dom} f$, we set $Q_r(x) := x - P_r(x)$. Proposition 6.18 ensures that for all $x' \in X$
\[ f_r(x') - f_r(x) - \frac{1}{r}\langle J_h(Q_r(x)), x' - x\rangle \ge 0. \]
In order to prove that $f_r$ is differentiable at $x$ with derivative $(1/r) J(Q_r(x))$ let us denote by $o_r(x')$ the left-hand side of this inequality and let us show that $o_r(\cdot)$ is a remainder at $x$. Using relation (6.37) and thus the inequalities
\[ \frac{1}{r}\langle J_h(Q_r(x')), P_r(x) - P_r(x')\rangle \le f(P_r(x)) - f(P_r(x')), \]
\[ \frac{1}{r}\langle J_h(Q_r(x')), Q_r(x) - Q_r(x')\rangle \le \frac{1}{r} j_h(Q_r(x)) - \frac{1}{r} j_h(Q_r(x')), \]
and using the relation $f_r(x) = f(P_r(x)) + \frac{1}{r} j_h(Q_r(x))$ and the similar one with $x$ changed into $x'$, by adding side by side we obtain
\[ f_r(x') - f_r(x) \le \frac{1}{r}\langle J_h(Q_r(x')), (P_r(x') - P_r(x)) + (Q_r(x') - Q_r(x))\rangle, \]
\[ o_r(x') \le \frac{1}{r}\langle J_h(Q_r(x')) - J_h(Q_r(x)), x' - x\rangle. \]
Since $P_r$ and $J_h$ are continuous (as easy changes in the proof of Proposition 3.29 show), we get that $|o_r(x')| \le \varepsilon(x')\|x' - x\|$ where $\varepsilon(x') \to 0$ as $x' \to x$. That proves that $f_r$ is differentiable at $x$ with derivative $(1/r) J(Q_r(x))$. □

Several algorithms use the properties of proximal maps. The Yosida regularization process for monotone operators described in Sect. 9.4.3 is related to this Moreau type regularization via subdifferentials of convex functions.
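The Moreau regularization can be made explicit in the simplest nonsmooth case $X = \mathbb{R}$, $f = |\cdot|$, $h(t) = t$. The sketch below assumes the well-known closed forms for this example (the envelope $f_r$ is the Huber function and the proximal map $P_r$ is soft-thresholding; neither is stated in the text) and compares them with a brute-force minimization over a grid. All function names and grid parameters are hypothetical.

```python
import math

def moreau_abs(x, r):
    """Closed form of f_r(x) = inf_u (|u| + (x - u)^2 / (2 r)) (Huber function, assumed)."""
    return x * x / (2.0 * r) if abs(x) <= r else abs(x) - r / 2.0

def prox_abs(x, r):
    """Proximal point P_r(x) of |.| (soft-thresholding, assumed)."""
    if x > r:
        return x - r
    if x < -r:
        return x + r
    return 0.0

def moreau_by_grid(x, r, lo=-5.0, hi=5.0, steps=20001):
    """Brute-force approximation of the infimum defining f_r(x) on a uniform grid."""
    best = math.inf
    for k in range(steps):
        u = lo + (hi - lo) * k / (steps - 1)
        best = min(best, abs(u) + (x - u) ** 2 / (2.0 * r))
    return best

def derivative_formula(x, r):
    """Df_r(x) = (1/r)(x - P_r(x)); on the real line J is the identity."""
    return (x - prox_abs(x, r)) / r

if __name__ == "__main__":
    r = 0.5
    for x in (-2.0, -0.3, 0.0, 0.4, 3.0):
        print(x, moreau_abs(x, r), moreau_by_grid(x, r), derivative_formula(x, r))
```

In this example $f_r \le f$, $f_r$ is of class $C^1$ although $f$ is not differentiable at $0$, and $f_r(x) \to |x|$ as $r \to 0_+$, in line with Proposition 6.43 and Theorem 6.25.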
In the case of continuous functions, the notion of integral coincides with the notion of primitive. Riemann has defined the integral of some discontinuous functions, but not all derivative functions are integrable in the Riemann sense. Thus, the problem of searching for primitive functions through integration is not solved, and one may wish for a definition of an integral including Riemann’s which allows one to solve the problem of primitive functions. Henri Lebesgue, Sur une généralisation de l’intégrale définie, Comptes-rendus de l’Académie des Sciences de Paris 132, pp. 128–132, April 29th 1901.
Abstract Using the notions of measure theory introduced in Chap. 1, an integration process is introduced for functions with values in normed vector spaces. Such an extension does not require much supplementary effort but can be bypassed in a first reading. Convergence results and calculus rules form the bulk of the chapter.
In this chapter we deal with a crucial tool of analysis, namely integration. Its birth is contemporary with the surge of differential calculus at the end of the seventeenth century. Its first appearances concerned the calculation of areas or volumes. Probability questions gave it a further impetus. But it is during the twentieth century that firm grounds were given to the topic by the use of measure theory. Throughout this chapter $(S, \mathcal{S}, \mu)$ is a measure space. We treat the case of vector-valued functions in order to obtain in a single stroke the case of complex-valued functions and the case of real-valued functions. In a first reading the reader may assume that $E$ is $\mathbb{R}$ or $\mathbb{C}$. However, the construction in the case when $E$ is a general Banach space is similar to that for the scalar case: one starts with a class of simple functions for which the definition of the integral is undeniable. Then one passes to a more general class by a kind of completion process. We suggest that in a first step the reader replaces the notation $\|\cdot\|_E$ for the norm of the Banach space $(E, \|\cdot\|_E)$ by the notation $|\cdot|_E$ or even $|\cdot|$ in order to be easily convinced that the construction of the integral in the vectorial case is not different from the construction in the scalar case.
7.1 Step Functions and -Measurable Functions Given a Banach space .E; kkE ), a map f W .S; S; / ! E is called a step map or a simple map if it is measurable and if it takes its values in a finite subset fe1 ; : : : ; en g of E. It is a -step map if, moreover, it is 0 outside a set of finite measure. We denote by St.; E/ the set of -step maps from S to E: The functions we shall integrate are not necessarily measurable, but they are -measurable in the sense of the next definition. Definition 7.1 A function f W S ! E is said to be -measurable if there exists a sequence . fn / of St.; E/ that converges a.e. to f . We denote by L0 .; E/ the set of -measurable functions from S to E. It is easy to show that the set L0 .; E/ is a vector space and that L0 .; R/ has pleasant stability properties. Let us give a characterization. We recall that T 2 S is said to be of -finite measure if there exists a sequence .Tn / of S with union T such that all Tn ’s have finite measures. We also recall that a subset of a metric space is said to be separable if it contains a dense countable subset. Proposition 7.1 A function f W S ! E is -measurable if and only if it satisfies the following conditions: (a) there exists a measurable map g W S ! E such that f D g a.e.; (b) there exists a T 2 S with -finite measure such that f D 0 on SnT; (c) there exists a null set N of S such that f .SnN/ is separable. Proof Let f 2 L0 .; E/ so that there exist a null set N of S and a sequence . fn / of St.; E/ that converges to f on SnN. Replacing N with a larger set if necessary, we may suppose N 2 S and .N/ D 0: Setting gn WD 1SnN fn we see that gn 2 St.; E/ and .gn / ! g WD 1SnN f : Then g is measurable and f D g a.e., so that (a) is satisfied. Since gn 2 St.; E/ there exists an element Sn of S with finite measure such that gn vanishes on SnSn : Then g vanishes on Sn [n Sn , hence f vanishes on SnT for T WD [n Sn [ N and .Sn [ N/ is finite. Thus (b) holds. 
Since $g_n$ is a $\mu$-step function, the set $F_n := g_n(S)$ is finite. Then the union $F := \bigcup_n F_n$ is countable and its closure contains $f(S \setminus N)$, so that (c) holds.

Let us prove the converse. First, we assume that $\mu(S)$ is finite. Let $f : S \to E$ satisfy conditions (a), (b), (c) and let $D := \{e_k : k \in \mathbb{N}\}$ be a countable dense subset of $f(S)$. Given a decreasing sequence $(r_k)$ of positive numbers with limit $0$, for all $n \in \mathbb{N}$ the union $\bigcup_k B(e_k, r_n)$ of the open balls with radius $r_n$ and centers in $D$ contains $f(S)$, so that $S = \bigcup_k f^{-1}(B(e_k, r_n))$ and $\mu\big(\bigcup_{k=0}^{m} f^{-1}(B(e_k, r_n))\big) \to \mu(S)$ as $m \to +\infty$. Thus, we can find some $m(n) \in \mathbb{N}$ such that $\mu(S \setminus S_n) \le 2^{-n}$ for $S_n := \bigcup_{k=0}^{m(n)} f^{-1}(B(e_k, r_n))$. Let
\[ Z_n := \bigcup_{p > n} (S \setminus S_p), \qquad Z := \bigcap_{n \ge 0} Z_n, \]
so that $\mu(Z_n) \le 2^{-n+1}$ and $\mu(Z) = 0$. Let us define a $\mu$-step map $f_n : S \to E$ by setting $f_n(s) = 0$ for $s \in S \setminus S_n$ and $f_n(s) = e_0$ if $f(s) \in B(e_0, r_n)$, $f_n(s) = e_k$ if $f(s) \in B(e_k, r_n) \setminus \big(\bigcup_{i=0}^{k-1} B(e_i, r_n)\big)$ for $k = 1, \dots, m(n)$. Then, for $s \in S_n$ we have $\|f_n(s) - f(s)\|_E < r_n$. Given $s \in S \setminus Z$ we can find $n_s \in \mathbb{N}$ such that $s \in S \setminus Z_{n_s}$, hence $s \in S_n$ for $n > n_s$: this shows that $(f_n(s)) \to f(s)$ and that $f$ is $\mu$-measurable.

Now let us reduce the general case to the case when $S$ has finite measure. By assumption (b) there exists a sequence $(T_k)$ in $\mathcal{S}$ such that $f = 0$ on $S \setminus T$ with $T := \bigcup_k T_k$ and $\mu(T_k) < \infty$ for all $k \in \mathbb{N}$. We may suppose $T_k \subset T_{k+1}$ for all $k$. Using the special case we established, for each $k \in \mathbb{N}$ we can find a sequence $(f_{k,n})_n$ of $\mu$-step functions on $T_k$ converging to $f|_{T_k}$ on $T_k \setminus Y_k$ with $Y_k$ a null set of $T_k$. Let us define inductively a sequence $(f_n)$ of $\mu$-step functions on $S$ by setting $f_n(s) = 0$ for $s \in S \setminus T_n$, $f_n(s) := f_{0,n}(s)$ for $s \in T_0$, ..., $f_n(s) = f_{k,n}(s)$ for $s \in T_k \setminus T_{k-1}$ ($k \le n$). Then $(f_n) \to f$ on $S \setminus \bigcup_k Y_k$.
Corollary 7.1 Let $(f_n)$ be a sequence of $\mu$-measurable maps converging almost everywhere to a map $f : S \to E$. Then $f$ is $\mu$-measurable.

Proof This follows from Lemma 1.3 and from the characterization we have just proved, using the following facts: a countable union of null subsets is a null subset, a countable union of $\sigma$-finite subsets is $\sigma$-finite, and the union of a countable family $(A_n)$ of subsets of $E$ having countable dense subsets $D_n$ has a countable dense subset $D := \bigcup_n D_n$, since $\mathrm{cl}(D) \supset \bigcup_n \mathrm{cl}(D_n) \supset \bigcup_n A_n$.

Corollary 7.2* If $(S, \mathcal{S}, \mu)$ is complete and $\sigma$-finite and if $E$ is separable, then $f : S \to E$ is $\mu$-measurable if and only if $f$ is measurable.

Proof The sufficient condition is immediate. The necessary condition stems from the fact that if $g : S \to E$ is measurable and if $f = g$ a.e., then $f$ is measurable when $\mu$ is complete.

Besides the preceding stability result, one can show that the family of $\mu$-measurable functions with values in $\overline{\mathbb{R}}$ is a complete lattice. We omit the proof.

Proposition 7.2* Any family $F := \{f_i : i \in I\}$ of $\mu$-measurable functions from $(S, \mathcal{S}, \mu)$ to $\overline{\mathbb{R}}$ has a supremum $f$ called the essential supremum of $F$. This means that there exists some $\mu$-measurable function $f : S \to \overline{\mathbb{R}}$ such that $f \ge f_i$ a.e. for all $i \in I$ and $f \le g$ a.e. for any $\mu$-measurable function $g : S \to \overline{\mathbb{R}}$ satisfying $g \ge f_i$ a.e. for all $i \in I$.

Of course, a similar result holds for the essential infimum.
7.2 Integrable Functions and Their Integrals

One defines the integral of a $\mu$-step map $f$ with distinct values $\{e_1, \dots, e_m\}$ by
\[ \int_S f\,d\mu := \int_S f(s)\,d\mu(s) := \sum_{k=1}^{m} \mu(S_k)\,e_k \qquad \text{for } S_k := f^{-1}(e_k), \]
using a classical notation. Thus $\int_S f\,d\mu$ is a weighted sum of the values of $f$, and if $f$ is constant with value $e$ on a subset $T$ of $S$ with finite measure and is null on $S \setminus T$, one has $\int_S f\,d\mu = \mu(T)e$. More generally, the additivity of $\mu$ shows that if $(S_i)_{i \in I}$ is a finite partition of $S$ (i.e. $S = \bigcup_{i \in I} S_i$ and $S_i \cap S_j = \emptyset$ for $i \ne j$ in $I$) by elements of $\mathcal{S}$ and if $f$ is constant on $S_i$, with value $e_i \in E$ satisfying $e_i = 0$ if $\mu(S_i) = +\infty$, using the convention $(+\infty) \cdot 0 = 0$, one has
\[ \int_S f\,d\mu = \sum_{i \in I} \mu(S_i)\,e_i. \]
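The definition can be tried out on a toy example. The sketch below is only an illustration under stated assumptions: the measure space is a hypothetical six-point set with point masses $\mu(\{s\}) = 1/(s+1)$, the maps `f` and `g` are arbitrary choices, and `integral_step` is a name introduced here, not one used in the text. It computes $\sum_k \mu(S_k) e_k$ over the level sets $S_k = f^{-1}(e_k)$ and compares the result with the sum over the finer partition into singletons.

```python
from collections import defaultdict
from fractions import Fraction


def integral_step(f, mu):
    """Integral of a real step map f on a finite set S carrying point masses mu.

    Implements sum_k mu(S_k) * e_k, where the e_k are the distinct values of f
    and S_k = f^{-1}(e_k) are its level sets.
    """
    level_mass = defaultdict(Fraction)      # e_k -> mu(S_k), starting from 0
    for s, mass in mu.items():
        level_mass[f[s]] += mass
    return sum(mass * value for value, mass in level_mass.items())


# Hypothetical measure space: S = {0, ..., 5} with mu({s}) = 1 / (s + 1).
mu = {s: Fraction(1, s + 1) for s in range(6)}
f = {0: 2, 1: 2, 2: 0, 3: -1, 4: -1, 5: 2}  # distinct values 2, 0, -1
g = {s: s % 2 for s in range(6)}             # distinct values 0, 1

int_f = integral_step(f, mu)
int_g = integral_step(g, mu)

# The same integral computed on the finer partition of S into singletons.
pointwise = sum(mu[s] * f[s] for s in mu)

# Linearity on this example: integral of f + 3g versus int_f + 3 * int_g.
h = {s: f[s] + 3 * g[s] for s in mu}
int_h = integral_step(h, mu)

print(int_f, pointwise, int_h == int_f + 3 * int_g)
```

Exact rational arithmetic is used so that the two ways of computing the integral can be compared with `==` rather than up to rounding.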
The map $f \mapsto \int_S f\,d\mu$ is easily seen to be linear from the space $\mathrm{St}(\mu, E)$ of $\mu$-step maps from $(S, \mathcal{S}, \mu)$ to $E$ (if $f := \sum_{i \in I} 1_{A_i} a_i$, $g := \sum_{j \in J} 1_{B_j} b_j$ with $I$, $J$ finite, $a_i$, $b_j \in E$, and $A_i$, $B_j \in \mathcal{S}$, take a finite family $(C_k)_{k \in K}$ of disjoint elements of $\mathcal{S}$ such that all $A_i$'s and all $B_j$'s are unions of subfamilies of $(C_k)_{k \in K}$). If $E = \mathbb{R}$ and if $f \ge 0$ one has $\int_S f\,d\mu \ge 0$. If $T \in \mathcal{S}$ and if $f$ is a $\mu$-step function on $(S, \mathcal{S}, \mu)$, the function $1_T f$ is a $\mu$-step function on $(S, \mathcal{S}, \mu)$ and $f|_T$ is a $\mu_T$-step function on $(T, \mathcal{S}_T, \mu_T)$, where $\mathcal{S}_T$ is the $\sigma$-algebra induced by $\mathcal{S}$ on $T$, $\mu_T$ is the measure induced by $\mu$ on $\mathcal{S}_T$, and we have
\[ \int_T f\,d\mu := \int_T (f|_T)\,d\mu_T = \int_S 1_T f\,d\mu. \]
If $(T, T')$ is a measurable partition of $S$ (i.e. $T, T' \in \mathcal{S}$, $S = T \cup T'$, and $T \cap T' = \emptyset$) one has
\[ \int_S f\,d\mu = \int_T f\,d\mu + \int_{T'} f\,d\mu. \]
Observing that for all $f \in \mathrm{St}(\mu, E)$ the function $\|f(\cdot)\|_E$ is a $\mu$-step function, one easily sees that the function $\|\cdot\|_1 : \mathrm{St}(\mu, E) \to \mathbb{R}$ given by
\[ \|f\|_1 := \int_S \|f(s)\|_E\,d\mu(s) \]
is a seminorm. Then the map $f \mapsto \int_S f\,d\mu$ is continuous from $\mathrm{St}(\mu, E)$ to $E$, since for all $f := \sum_{i=1}^{m} 1_{S_i} e_i \in \mathrm{St}(\mu, E)$ the triangle inequality $\big\|\sum_{i=1}^{m} \mu(S_i) e_i\big\|_E \le \sum_{i=1}^{m} \mu(S_i) \|e_i\|_E$ yields
\[ \Big\| \int_S f\,d\mu \Big\|_E \le \int_S \|f(s)\|_E\,d\mu(s) = \|f\|_1. \tag{7.1} \]
It follows from Corollary 3.2 that the map $f \mapsto \int_S f\,d\mu$ can be extended into a linear continuous map to the completion of $(\mathrm{St}(\mu, E), \|\cdot\|_1)$. However, this completed space is abstract. It is the purpose of the following analysis to represent it by a concrete space $L_1(\mu, E)$ of (equivalence classes of) functions from $S$ to $E$ called $\mu$-integrable functions.

Definition 7.2 A function $f : S \to E$ (resp. $f : S \to \mathbb{R}$) is said to be $\mu$-integrable (or in short integrable) if there exists a Cauchy sequence $(f_n)$ in $\mathrm{St}(\mu, E)$ (resp. $\mathrm{St}(\mu, \mathbb{R})$) for the seminorm $\|\cdot\|_1$ which converges a.e. to $f$. The space of $\mu$-integrable functions from $S$ to $E$ is denoted by $\mathcal{L}_1(S, \mathcal{S}, \mu, E)$ or $\mathcal{L}_1(\mu, E)$ or $\mathcal{L}_1(S, E)$ if there is no ambiguity. For $E := \mathbb{R}$ one writes $\mathcal{L}_1(\mu)$ or $\mathcal{L}_1(S)$.

Remark If $f \in \mathcal{L}_1(\mu, E)$ and if $g : S \to E$ coincides with $f$ a.e., then $g \in \mathcal{L}_1(\mu, E)$, since if $(f_n)$ is a Cauchy sequence in $\mathrm{St}(\mu, E)$ converging a.e. to $f$ then $(f_n)$ converges to $g$ a.e.

To prove that the map $f \mapsto \int f\,d\mu$ can be extended from $\mathrm{St}(\mu, E)$ to $\mathcal{L}_1(\mu, E)$ we need some preliminary results. First, we observe that $\mathcal{L}_1(\mu, E)$ is a linear space: if $(f_n)$ and $(g_n)$ are Cauchy sequences in $\mathrm{St}(\mu, E)$ and if $(f_n) \to f$, $(g_n) \to g$ a.e., and $c \in \mathbb{R}$ (or $c \in \mathbb{C}$ if $E$ is a complex space), then $(f_n + c g_n) \to f + c g$ a.e. and, by the triangle inequality, $(f_n + c g_n)$ is a Cauchy sequence. Next, we give a refined convergence property.

Proposition 7.3 Any Cauchy sequence in $\mathrm{St}(\mu, E)$ has a subsequence which converges almost everywhere. More precisely, if $(f_n)$ is an Abel sequence in $\mathrm{St}(\mu, E)$, then $(f_n)$ converges almost uniformly to some $f \in \mathcal{L}_1(S, E)$ in the sense that $(f_n) \to f$ a.e. and for every $\varepsilon > 0$ there exists $T \in \mathcal{S}$ such that $\mu(T) < \varepsilon$ and $(f_n)$ converges uniformly to $f$ on $S \setminus T$.

Proof Since any Cauchy sequence has an Abel subsequence, it suffices to prove the second assertion. Suppose $(f_n)$ is an Abel sequence in $\mathrm{St}(\mu, E)$: let $c > 0$ and $q \in [0, 1[$ be such that $\|f_{n+1} - f_n\|_1 \le c q^n$ for all $n \in \mathbb{N}$. Let $r \in ]q, 1[$ and let $R_n := \{s \in S : \|f_{n+1}(s) - f_n(s)\|_E \ge r^n\}$. Since
\[ r^n \mu(R_n) \le \int_{R_n} \|f_{n+1}(s) - f_n(s)\|_E\,d\mu(s) \le \int_S \|f_{n+1} - f_n\|_E\,d\mu \le c q^n, \]
one has $\mu(R_n) \le c (q/r)^n$. For $m \in \mathbb{N}$ let $T_m := \bigcup_{n \ge m} R_n$, so that $T_m \in \mathcal{S}$. For any given $\varepsilon > 0$, for $m$ large enough one has $\mu(T_m) \le c r (r - q)^{-1} (q/r)^m < \varepsilon$. Let $T := T_m$. For $s \in S \setminus T$ and $n \ge m$ one has $\|f_{n+1}(s) - f_n(s)\|_E < r^n$, so that $(f_n)$ converges uniformly on $S \setminus T$. Then $(f_n)$ converges on $S \setminus N$, where $N \in \mathcal{S}$ is the intersection of the family $(T_m)$, so that $N$ has measure $0$. By definition, the limit $f$ of $(f_n)$ (extended by $0$ on $N$) is in $\mathcal{L}_1(S, E)$.

Lemma 7.1 Let $(f_n)$ and $(g_n)$ be Cauchy sequences of $\mathrm{St}(\mu, E)$ converging a.e. to the same map $f$. Then $(\int_S f_n\,d\mu)$ and $(\int_S g_n\,d\mu)$ converge and their limits are equal. Moreover, $(\|f_n - g_n\|_1) \to 0$.

Proof The sequence $(\int_S f_n\,d\mu)$ is a Cauchy sequence in $E$ since for $n, p \in \mathbb{N}$
\[ \Big\| \int_S f_n\,d\mu - \int_S f_p\,d\mu \Big\|_E \le \int_S \|f_n - f_p\|_E\,d\mu \to 0 \quad \text{as } n, p \to +\infty. \]
Thus $(\int_S f_n\,d\mu)$ converges in $E$. Similarly $(\int_S g_n\,d\mu)$ converges. Let us show the limits are the same. By the triangle inequality, the sequence $(h_n) := (f_n - g_n)$ is a Cauchy sequence in $(\mathrm{St}(\mu, E), \|\cdot\|_1)$ and it converges almost everywhere to $0$. Let us prove that $(\|h_n\|_1) \to 0$. Since
\[ \Big\| \int_S f_n\,d\mu - \int_S g_n\,d\mu \Big\|_E \le \int_S \|h_n\|_E\,d\mu = \|h_n\|_1, \]
this will prove that $\lim_n \int_S f_n\,d\mu = \lim_n \int_S g_n\,d\mu$. We may suppose $(h_n)$ is an Abel sequence rather than a Cauchy sequence, since $(\|h_n\|_1) \to 0$ whenever $(\|h_{k(n)}\|_1) \to 0$ for some subsequence $(h_{k(n)})_n$ of $(h_n)_n$. Given $\varepsilon > 0$ there exists an $m \in \mathbb{N}$ such that $\|h_n - h_p\|_1 < \varepsilon$ for $n, p \ge m$. Let $A \in \mathcal{S}$ be a set of finite measure outside of which $h_m$ vanishes. Then, for $n \ge m$ we have
\[ \int_{S \setminus A} \|h_n\|_E\,d\mu = \int_{S \setminus A} \|h_n - h_m\|_E\,d\mu \le \int_S \|h_n - h_m\|_E\,d\mu \le \varepsilon. \tag{7.2} \]
Let $c > \|h_m\|_\infty := \sup_{s \in S} \|h_m(s)\|$. The preceding proposition yields a subset $T \in \mathcal{S}$ such that $\mu(T) < \varepsilon / c$ and such that $(h_n)$ converges to $0$ uniformly on $A \setminus T$. Let $m' \ge m$ be such that for $n \ge m'$ we have
\[ \int_{A \setminus T} \|h_n\|_E\,d\mu \le \varepsilon \tag{7.3} \]
(we use the fact that $\mu(A \setminus T) < +\infty$). Then, for $n \ge m'$ we have
\[ \int_T \|h_n\|_E\,d\mu \le \int_T \big( \|h_n - h_m\|_E + \|h_m\|_E \big)\,d\mu \tag{7.4} \]
\[ \le \|h_n - h_m\|_1 + \mu(T)\,\|h_m\|_\infty < 2\varepsilon. \tag{7.5} \]
Gathering relations (7.2), (7.3), (7.5), for $n \ge m'$ we get
\[ \|h_n\|_1 \le \int_{S \setminus A} \|h_n\|_E\,d\mu + \int_{A \setminus T} \|h_n\|_E\,d\mu + \int_T \|h_n\|_E\,d\mu < 4\varepsilon, \]
so that $(\|h_n\|_1) \to 0$.
The last lemma implies that we can set without ambiguity
\[ \int_S f(s)\,d\mu(s) := \lim_n \int_S f_n(s)\,d\mu(s), \]
since the right-hand side does not depend on the choice of the Cauchy sequence $(f_n)$ in $\mathrm{St}(\mu, E)$ converging to $f$ a.e.; moreover, this definition is compatible with the definition of $\int_S f\,d\mu$ for $f \in \mathrm{St}(\mu, E)$, since one can take $f_n = f$ for all $n$ in this case. It implies more.

Proposition 7.4 Let $f \in \mathcal{L}_1(\mu, E)$ and let $(f_n)$ be a Cauchy sequence of $\mu$-step maps of $S$ into $E$ converging a.e. to $f$. Then $\|f(\cdot)\|_E$ is integrable and
\[ \int_S \|f(s)\|_E\,d\mu(s) = \lim_n \int_S \|f_n(s)\|_E\,d\mu(s) = \lim_n \|f_n\|_1, \tag{7.6} \]
\[ \Big\| \int_S f(s)\,d\mu(s) \Big\|_E \le \int_S \|f(s)\|_E\,d\mu(s). \tag{7.7} \]
Proof The sequence $(\|f_n(\cdot)\|_E)$ clearly converges to $\|f(\cdot)\|_E$ a.e. and it is a Cauchy sequence in $\mathrm{St}(\mu, \mathbb{R})$ since
\[ \big| \|f_n(s)\|_E - \|f_p(s)\|_E \big| \le \|f_n(s) - f_p(s)\|_E \qquad \forall s \in S, \]
\[ \Big| \int_S \|f_n\|_E\,d\mu - \int_S \|f_p\|_E\,d\mu \Big| \le \int_S \big| \|f_n\|_E - \|f_p\|_E \big|\,d\mu \le \int_S \|f_n - f_p\|_E\,d\mu. \]
Thus $\|f(\cdot)\|_E$ is integrable and the definition of $\int_S \|f(\cdot)\|_E\,d\mu$ yields the first announced relation. The second one follows by a passage to the limit in the relation $\big\| \int_S f_n\,d\mu \big\|_E \le \int_S \|f_n\|_E\,d\mu$.

Proposition 7.5 The function $f \mapsto \|f\|_1 := \int_S \|f(s)\|_E\,d\mu(s)$ is a semi-norm on $\mathcal{L}_1(\mu, E)$. Moreover, $\mathrm{St}(\mu, E)$ is dense in $\mathcal{L}_1(\mu, E)$ equipped with $\|\cdot\|_1$: if $f \in \mathcal{L}_1(\mu, E)$ and if $(f_n)$ is a Cauchy sequence in $\mathrm{St}(\mu, E)$ converging a.e. to $f$, then one has $(\|f - f_n\|_1) \to 0$.

Proof The relations $\|cf\|_1 = |c|\,\|f\|_1$, $\|f + g\|_1 \le \|f\|_1 + \|g\|_1$ for $c \in \mathbb{R}$, $f, g \in \mathcal{L}_1(\mu, E)$ are obtained from the similar relations in $\mathrm{St}(\mu, E)$ by a passage to the limit using relation (7.6). Let $f \in \mathcal{L}_1(\mu, E)$ and let $(f_n)$ be a Cauchy sequence in $\mathrm{St}(\mu, E)$ converging to $f$ a.e. Given $\varepsilon > 0$ we can find $k \in \mathbb{N}$ such that $\|f_m - f_n\|_1 \le \varepsilon$ for $n \ge m \ge k$. Then, by relation (7.6), for $m \ge k$ we have $\|f - f_m\|_1 = \lim_{n \to \infty} \|f_n - f_m\|_1 \le \varepsilon$, so that $(\|f - f_m\|_1)_m \to 0$.

By (7.6) the semi-norm $\|\cdot\|_1$ on $\mathcal{L}_1(\mu, E)$ extends the semi-norm $\|\cdot\|_1$ on $\mathrm{St}(\mu, E)$. Since $\mathrm{St}(\mu, E)$ is dense in $\mathcal{L}_1(\mu, E)$, the next result shows that $\mathcal{L}_1(\mu, E)$ can be considered as the completion of $\mathrm{St}(\mu, E)$.

Theorem 7.1 (Fischer-Riesz) The space $(\mathcal{L}_1(\mu, E), \|\cdot\|_1)$ is complete, as is its quotient space $(L_1(\mu, E), \|\cdot\|_1)$ by the subspace
\[ \mathcal{N}(\mu, E) := \Big\{ f \in \mathcal{L}_1(\mu, E) : \int_S \|f\|\,d\mu = 0 \Big\}. \]
Proof Let $(f_n)$ be an Abel sequence in $(\mathcal{L}_1(\mu, E), \|\cdot\|_1)$: for some $c > 0$, $r \in ]0, 1[$ we have $\|f_n - f_{n+1}\|_1 \le c r^n$. Since $\mathrm{St}(\mu, E)$ is dense in $\mathcal{L}_1(\mu, E)$, for all $n \in \mathbb{N}$ we can pick some $g_n \in \mathrm{St}(\mu, E)$ such that $\|f_n - g_n\|_1 \le r^n$. Then, since
\[ \|g_n - g_{n+1}\|_1 \le \|g_n - f_n\|_1 + \|f_n - f_{n+1}\|_1 + \|f_{n+1} - g_{n+1}\|_1 \le (c + 2) r^n, \]
the sequence $(g_n)$ is an Abel sequence in $(\mathrm{St}(\mu, E), \|\cdot\|_1)$. Proposition 7.3 ensures that $(g_n)$ converges a.e. to some function $f$ in $\mathcal{L}_1(\mu, E)$. Then
\[ \|f - f_n\|_1 \le \|f - g_n\|_1 + \|g_n - f_n\|_1 \le \|f - g_n\|_1 + r^n, \]
and since $(\|f - g_n\|_1) \to 0$ by Proposition 7.5, we get that $(f_n)$ converges to $f$ in $\mathcal{L}_1(\mu, E)$. The last assertion is a general fact about complete spaces.

It is useful to characterize the subspace $\mathcal{N}(\mu, E)$ and the equivalence relation it induces. We need an extension of Proposition 7.3.

Proposition 7.6 Let $(f_n)$ be an Abel sequence in $(\mathcal{L}_1(\mu, E), \|\cdot\|_1)$ with limit $f$. Then $(f_n)$ converges a.e. to $f$, and given $\varepsilon > 0$, there exists a subset $T$ of $S$ of measure less than $\varepsilon$ such that the convergence is uniform on $S \setminus T$.

Proof Let $c > 0$, $r \in ]0, 1[$ be such that $\|f - f_n\|_1 \le c r^{2n}$ for all $n \in \mathbb{N}$. Changing $f$ and all the $f_n$'s on sets of measure $0$, we may assume that they are all measurable. Let $S_n$ be the set of $s \in S$ such that $\|f(s) - f_n(s)\|_E \ge c r^n$. Then
\[ c r^n \mu(S_n) \le \int_{S_n} \|f - f_n\|_E\,d\mu \le \int_S \|f - f_n\|_E\,d\mu \le c r^{2n}, \]
so that $\mu(S_n) \le r^n$. Setting $T_k := \bigcup_{n \ge k} S_n$ and $N := \bigcap_k T_k$ we have $\mu(T_k) \le r^k / (1 - r)$ and $\mu(N) = 0$. For $s \in S \setminus T_k$ and $n \ge k$ we have $\|f(s) - f_n(s)\|_E < c r^n$, so that $(f_n)$ converges to $f$ uniformly on $S \setminus T_k$ and $(f_n)$ converges pointwise to $f$ on $S \setminus N$.

We are ready to give a characterization of $\mathcal{N}(\mu, E)$.
Corollary 7.3 For a function $f : S \to E$, the following assertions are equivalent:
(a) $f \in \mathcal{N}(\mu, E)$, i.e. $f \in \mathcal{L}_1(S, E)$ and $\|f\|_1 = 0$;
(b) $f = 0$ almost everywhere.

Proof If $f = 0$ a.e., the sequence $(f_n)$ of $\mathrm{St}(\mu, E)$ given by $f_n = 0$ for all $n$ is a Cauchy sequence and converges to $f$ a.e., so that $f \in \mathcal{L}_1(S, E)$ and $\|f\|_1 = \lim_n \|f_n\|_1 = 0$ by Proposition 7.5. Conversely, if $f \in \mathcal{N}(\mu, E)$, the sequence $(f_n)$ with $f_n = 0$ for all $n$ is an Abel sequence in $\mathrm{St}(\mu, E)$ and $(\|f - f_n\|_1) \to 0$ since $\|f - f_n\|_1 = \|f\|_1 = 0$ for all $n$. Proposition 7.6 ensures that $(f - f_n) \to 0$ a.e., so that $f = 0$ a.e.

Corollary 7.4 The space $L_1(S, E)$ is the quotient space of $\mathcal{L}_1(S, E)$ by the equivalence relation: $f, g \in \mathcal{L}_1(S, E)$ are equivalent if and only if $f = g$ almost everywhere.

It is usual and convenient to identify a function $f$ with its equivalence class with respect to the relation of equality a.e. (although, strictly speaking, that should not be done). If $T \in \mathcal{S}$ and if $f \in \mathcal{L}_1(\mu, E)$, the function $1_T f$ belongs to $\mathcal{L}_1(\mu, E)$, as is easily seen. Setting
\[ \int_T f\,d\mu = \int_S 1_T f\,d\mu, \]
and using the linearity of the integral and the relation $1_{A \cup B} f = 1_A f + 1_B f$ when $A, B \in \mathcal{S}$, $A \cap B = \emptyset$, we can see that the property
\[ A, B \in \mathcal{S},\ A \cap B = \emptyset \implies \int_{A \cup B} f\,d\mu = \int_A f\,d\mu + \int_B f\,d\mu \]
is still valid for $f \in \mathcal{L}_1(\mu, E)$. Let us give some calculus rules.

Proposition 7.7 If $A : E \to F$ is a continuous linear map with values in another Banach space $F$, then for any $f \in \mathcal{L}_1(S, E)$ the function $A \circ f$ belongs to $\mathcal{L}_1(S, F)$ and
\[ \int_S A(f(s))\,d\mu(s) = A\Big( \int_S f(s)\,d\mu(s) \Big), \qquad \Big\| \int_S A(f(s))\,d\mu(s) \Big\|_F \le \|A\| \cdot \|f\|_1. \]

Proof When $f \in \mathrm{St}(\mu, E)$ the result is an easy consequence of the definition and of the triangle inequality. The general case is obtained by a passage to the limit.

Corollary 7.5 Given Banach spaces $E_1, \dots, E_k$, a function $f : S \to E := E_1 \times \dots \times E_k$ belongs to $\mathcal{L}_1(\mu, E)$ if and only if its components $f_1, \dots, f_k$ belong to $\mathcal{L}_1(\mu, E_1), \dots, \mathcal{L}_1(\mu, E_k)$, respectively.
Proof Since $f_i = p_i \circ f$, where $p_i : E \to E_i$ is the canonical projection, one has $f_i \in \mathcal{L}_1(\mu, E_i)$ whenever $f \in \mathcal{L}_1(\mu, E)$. The converse is a consequence of the definition, observing that if $(f_{i,n})_n$ is a Cauchy sequence in $\mathrm{St}(\mu, E_i)$ converging a.e. to $f_i$ for $i = 1, \dots, k$, then $f_n := (f_{1,n}, \dots, f_{k,n})$ is a $\mu$-step function, $(f_n)$ is a Cauchy sequence and $(f_n) \to f$ a.e.

Thus, the integration of functions with values in $\mathbb{R}^d$ or $\mathbb{C}$ can be reduced to the integration of real-valued functions. The next consequence shows that the vectorial integral is determined by scalar integrals.

Corollary 7.6 If $f, g \in \mathcal{L}_1(S, E)$ and if for all $e^* \in E^*$ one has $\int_S e^* \circ f\,d\mu = \int_S e^* \circ g\,d\mu$, then one has $\int_S f\,d\mu = \int_S g\,d\mu$.

Proof Since for all $h \in \mathcal{L}_1(S, E)$ and all $e^* \in E^*$ one has $e^*\big(\int_S h\,d\mu\big) = \int_S e^* \circ h\,d\mu$, the result follows from a consequence of the separation theorem ensuring that for $e_1, e_2 \in E$ one has $e_1 = e_2$ whenever $e^*(e_1) = e^*(e_2)$ for all $e^* \in E^*$.

Let us end this subsection with an extension of the construction of the integral to the case of functions with values in $\overline{\mathbb{R}}$. Such an extension is useful, in particular for convergence questions.

Definition 7.3 A function $f : S \to \overline{\mathbb{R}}$ is said to be integrable if there exists some $g \in \mathcal{L}_1(S, \mathbb{R})$ such that $f = g$ almost everywhere. More generally, given a Banach space $E$, a null subset $N$ of $S$ and a function $f : S \setminus N \to E$, one says that $f$ is integrable if there exists some $g \in \mathcal{L}_1(S, E)$ such that $f = g$ a.e. on $S \setminus N$. In both cases one sets
\[ \int_S f\,d\mu = \int_S g\,d\mu. \]
Clearly this value does not depend on the choice of $g$. One can verify that the properties of the present subsection can be extended to the present case. However, one has to be careful with calculus rules and take into account the convention $(+\infty) + (-\infty) := +\infty$. If $f : S \to \overline{\mathbb{R}}$ is an almost measurable function (in short, a.m. function) in the sense that it coincides a.e. with a real-valued measurable function, we define $\int_S f\,d\mu$ to be $\int_S g\,d\mu$ if there is an integrable function $g : S \to \mathbb{R}$ such that $f = g$ a.e., and $\int_S f\,d\mu = +\infty$ if there is no such function. This definition of the (upper) integral of $f$ does not depend on the choice of $g$ in $\mathcal{L}_1(S, \mathbb{R})$ among those satisfying $g = f$ a.e. Moreover, one sees that for a.m. functions $f$, $f'$ with $f \le f'$ one has $\int_S f\,d\mu \le \int_S f'\,d\mu$. Alternatively, one sees that the integral of an a.m. function $f : S \to \overline{\mathbb{R}}$ is
\[ \int_S f\,d\mu := \int_S f(s)\,\mu(ds) = \inf\Big\{ \int_S h(s)\,d\mu(s) : h \in \mathcal{L}_1(S, \mathbb{R}),\ h \ge f \text{ a.e.} \Big\}, \]
with our standing convention $\inf \emptyset := +\infty$. In fact, if $g \in \mathcal{L}_1(S, \mathbb{R})$ is such that $f = g$ a.e., we can take $h = g$ in the right-hand side, so that the infimum is less than or equal to $\int_S f\,d\mu$; on the other hand, if $h \in \mathcal{L}_1(S, \mathbb{R})$ is such that $h \ge f$ a.e. and if $g : S \to \mathbb{R}$ is integrable with $f = g$ a.e., changing $g$, $h$ to some measurable functions $g'$, $h'$ such that $g' = g$, $h' = h$ a.e., the preceding corollary ensures that $\int_S h\,d\mu \ge \int_S g\,d\mu$, so that the infimum is greater than or equal to $\int_S f\,d\mu$. Moreover, with the convention $(+\infty) - (+\infty) = +\infty$, one can easily verify that
\[ \int_S f(s)\,\mu(ds) = \int_S \max(f(s), 0)\,d\mu(s) - \int_S \max(-f(s), 0)\,d\mu(s). \]
Thus, the only case in which $\int_S f\,d\mu$ differs from $-\int_S (-f)\,d\mu$ is when both $\int_S f\,d\mu$ and $\int_S (-f)\,d\mu$ are $+\infty$.

The next lemmas will be useful for the study of convergence properties. They are also of independent interest.

Lemma 7.2 For $f \in \mathcal{L}_1(\mu, E)$, $c > 0$ let $S_c := \{s \in S : \|f(s)\|_E \ge c\}$. Then there exist a $T \in \mathcal{S}$ with $\mu(T) < +\infty$ and a null set $N$ such that $S_c = T \cup N$. Moreover, if $f$ is measurable then $S_c$ is measurable with a finite measure.

Taking a sequence $(c_n) \to 0_+$ we see that for every $f \in \mathcal{L}_1(\mu, E)$ there exist a sequence $(T_n)$ in $\mathcal{S}$ and a null set $N$ such that $\mu(T_n) < +\infty$ for all $n \in \mathbb{N}$ and $f = 0$ on $S \setminus (T \cup N)$, where $T = \bigcup_n T_n$. When $(S, \mathcal{S}, \mu)$ is complete, the preceding assertions can be simplified.

Proof Let us first suppose $f \in \mathcal{L}_1(\mu, E)$ is measurable. Then $\|f(\cdot)\|_E$ is measurable, so that $S_c$ is measurable. Let $(f_n)$ be an Abel sequence in $\mathrm{St}(\mu, E)$ converging a.e. to $f$. Proposition 7.3 ensures that for every $\varepsilon > 0$ there exists some $Z \in \mathcal{S}$ with $\mu(Z) < \varepsilon$ such that $(f_n)$ converges to $f$ uniformly on $S \setminus Z$. Thus, for $n$ large enough and all $s \in S_c \setminus Z$, we have $\|f_n(s)\|_E \ge c/2$, so that
\[ \int_S \|f_n\|_E\,d\mu \ge \int_{S_c \setminus Z} \|f_n\|_E\,d\mu \ge (c/2)\,\mu(S_c \setminus Z), \]
and $S_c \setminus Z$ and $S_c$ have finite measures. Now let $f$ be an arbitrary element of $\mathcal{L}_1(\mu, E)$. Since $f$ is $\mu$-measurable there exist a measurable map $g : S \to E$ and a null set $M \in \mathcal{S}$ such that $f = g$ on $S \setminus M$. Let $T_c := \{s \in S : \|g(s)\|_E \ge c\}$, so that $T_c \in \mathcal{S}$ and $\mu(T_c)$ is finite. We have $S_c \setminus M = T_c \setminus M$, so that, setting $T := T_c \setminus M \in \mathcal{S}$, we get $\mu(T) < \infty$, $S_c = T \cup (S_c \cap M)$, and $N := S_c \cap M$ is a null set.

Lemma 7.3 Let $f : S \to \mathbb{R}_+$ and let $(f_n)$ be an increasing sequence of $\mu$-step functions converging a.e. to $f$. Then $f$ is integrable if and only if the sequence $(\int_S f_n\,d\mu)$ is bounded above. In such a case, one has $\int_S f\,d\mu = \sup_n \int_S f_n\,d\mu = \lim_n \int_S f_n\,d\mu$.
Proof Let $(f_n)$ be an increasing sequence of $\mu$-step functions converging a.e. to $f$. Suppose first that $m := \sup_n \big(\int_S f_n\,d\mu\big) < +\infty$. Then, for $p > n$ in $\mathbb{N}$ one has $\int_S |f_p - f_n|\,d\mu = \int_S f_p\,d\mu - \int_S f_n\,d\mu$, so that $(f_n)$ is a Cauchy sequence and $f$ is integrable. Then, by definition, $\int_S f\,d\mu = \lim_n \int_S f_n\,d\mu$. Now suppose $m = +\infty$ and let us show that assuming that $f$ is integrable leads to a contradiction. In fact, if $(g_n)$ is a Cauchy sequence of $\mu$-step functions converging a.e. to $f$, fixing $k \in \mathbb{N}$ and setting $h_n := \max(f_k, g_n)$, for $p \ge n$ one has $|h_p - h_n| \le |g_p - g_n|$, so that $(h_n)$ is a Cauchy sequence, hence is convergent in $\mathcal{L}_1(S, \mathbb{R})$, and $\int_S h_n\,d\mu$ converges in $\mathbb{R}$. Since $(h_n) \to \max(f_k, f) = f$ a.e., this limit is $\int_S f\,d\mu$. Since $\int_S h_n\,d\mu \ge \int_S f_k\,d\mu$ for all $n$ and $\sup_k \int_S f_k\,d\mu = +\infty$, we get a contradiction.

Remark If $f : S \to \overline{\mathbb{R}}_+ := [0, +\infty]$ is measurable, one can always find an increasing sequence $(f_n)$ of step functions which converges a.e. to $f$: setting for $n \in \mathbb{N}$, $k \in \mathbb{N}$ with $0 \le k \le 4^n$
\[ S_{k,n} := f^{-1}\big([k 2^{-n}, (k+1) 2^{-n}[\big) \ \text{ if } k < 4^n, \qquad S_{k,n} := f^{-1}\big([2^n, +\infty]\big) \ \text{ if } k = 4^n, \]
\[ f_n(s) = k 2^{-n} \ \text{ if } s \in S_{k,n}, \]
one gets a partition $(S_{k,n})_k$ of $S$ by measurable sets and a step function $f_n$. The sequence $(f_n)$ is increasing. The step function $f_n$ is a $\mu$-step function if $\mu(S_{k,n}) < +\infty$ for $k \in [1, 4^n]$.

Corollary 7.7 If $f : S \to \overline{\mathbb{R}}_+$ is measurable, if $g : S \to \overline{\mathbb{R}}_+$ is integrable and if $f \le g$ a.e., then $f$ is integrable and $\int_S f\,d\mu \le \int_S g\,d\mu$.

Proof Let us first observe that if $k : S \to \overline{\mathbb{R}}_+$ is integrable, then $\int_S k\,d\mu \ge 0$, since for any Cauchy sequence $(k_n) \to k$ a.e. the sequence $(k_n^+)$ with $k_n^+ := \max(k_n, 0)$ is still Cauchy, $(k_n^+) \to k$ a.e., and $k_n^+ \in \mathrm{St}(\mu, \mathbb{R})$ if $k_n \in \mathrm{St}(\mu, \mathbb{R})$. Now, let $h$ be a step function such that $0 \le h \le f$ and let $T := \{s : h(s) > 0\}$, $\alpha := \min\{h(t) : t \in T\}$. Since $h \le g$ a.e. and $g$ is integrable, $T$ is contained, up to a null set, in $\{s \in S : g(s) \ge \alpha\}$, and $T$ has finite measure (Lemma 7.2). Hence $h$ is a $\mu$-step function. Thus there exists an increasing sequence $(f_n)$ of $\mu$-step functions that converges a.e. to $f$. Let $k_n := g - f_n$. Then $k_n$ is integrable by the preceding remark, and $\int_S k_n\,d\mu \ge 0$ by the first observation of the proof. Thus, the sequence $(\int_S f_n\,d\mu) = (\int_S g\,d\mu - \int_S k_n\,d\mu)$ is bounded above by $\int_S g\,d\mu$. The preceding lemma asserts that $f$ is integrable and $\int_S f\,d\mu = \sup_n \int_S f_n\,d\mu \le \int_S g\,d\mu$.
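The dyadic construction of the Remark before Corollary 7.7 is explicit enough to be computed. The sketch below is an illustration only: `dyadic_step` is a name introduced here, and the sample values stand for hypothetical values $f(s)$ of a measurable function at a few points. It returns $k 2^{-n}$ when $f(s) \in [k 2^{-n}, (k+1) 2^{-n}[$ with $k < 4^n$, and $2^n$ when $f(s) \ge 2^n$.

```python
import math


def dyadic_step(value, n):
    """Value f_n(s) of the n-th dyadic step function, given value = f(s) in [0, +inf].

    Returns k * 2**-n if value lies in [k 2^-n, (k+1) 2^-n[ with k < 4**n,
    and 2**n if value >= 2**n (this branch also covers value == +inf).
    """
    if value >= 2 ** n:
        return float(2 ** n)
    k = math.floor(value * 2 ** n)   # the k with k 2^-n <= value < (k+1) 2^-n
    return k / 2 ** n


# Hypothetical values of f at a few points of S.
samples = [0.0, 0.3, 1.0, math.pi, 7.25, math.inf]

for v in samples:
    print(v, [dyadic_step(v, n) for n in range(6)])
```

On each row the printed values increase with $n$ and stay below $f(s)$; once $2^n > f(s)$ the gap $f(s) - f_n(s)$ is smaller than $2^{-n}$, while for $f(s) = +\infty$ the values $2^n$ tend to $+\infty$.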
Exercises

1. Using a common refinement of two finite partitions, show that the integral of a $\mu$-step map $f$ does not depend on the choice of the partition used to write $f$.

2. Verify that the set $\mathrm{St}(\mu, E)$ of $\mu$-step maps from $S$ into a normed space $E$ is a linear space and that the integral is linear on $\mathrm{St}(\mu, E)$.

3. If $(S, \mathcal{S}, \mu)$ is a $\sigma$-finite measure space, show that there exists an integrable function $u : S \to \mathbb{R}$ with positive values such that $\int_S u(s)\,d\mu(s) = 1$. [Hint: Given a sequence $(S_n)$ in $\mathcal{S}$ such that $S = \bigcup_n S_n$ and $c_n := \mu(S_n) \in ]0, +\infty[$ for all $n$, set $u = \sum_n (2^{-n}/c_n) 1_{S_n}$.]

4. Let $S := \mathbb{R}$ equipped with the Lebesgue measure $\lambda$. Give an example of a sequence $(f_n)$ of integrable functions uniformly converging to a function $f$ that is not integrable. Show that if all the functions $f_n$ are null on the complement of a measurable subset $T$ of $S$ of finite measure, then $f$ is integrable. Give an example of a sequence of integrable functions that are null on the complement of a measurable subset $T$ of $S$ of finite measure and that converges pointwise to a function $f$ that is not integrable.

5. Let $(S, \mathcal{S}, \mu)$ be a measure space and let $f$, $g$ be two integrable functions on $S$ such that
\[ \mu\big(\{f > r\} \,\Delta\, \{g > r\}\big) = 0 \qquad \text{for a.e. } r \in \mathbb{R}. \]
Show that $f = g$ a.e.

6. Let $\mu_c$ be the counting measure of $(\mathbb{N}, \mathcal{P}(\mathbb{N}))$ defined by $\mu_c(A) = \mathrm{card}(A)$ if $A \in \mathcal{P}(\mathbb{N})$ is finite and $\mu_c(A) := +\infty$ if $A$ is infinite. Show that for $f : \mathbb{N} \to \mathbb{R}_+$ one has $\int f\,d\mu_c = \sum_n f(n)$.

7. (Jensen's Inequality) Let $(S, \mathcal{S}, \mu)$ be a measure space with finite measure, let $T$ be an open interval of $\mathbb{R}$ and let $g : T \to \mathbb{R}$ be a convex function. Prove that for every integrable function $f : S \to \mathbb{R}$ with values in $T$ the following inequality holds:
\[ g\Big( \frac{1}{\mu(S)} \int_S f(s)\,d\mu(s) \Big) \le \frac{1}{\mu(S)} \int_S g(f(s))\,d\mu(s). \]
Interpret it in terms of averages and reduce it to the case in which $\mu$ is a probability. [Hint: assuming $\mu(S) = 1$, let $a := \int_S f\,d\mu$ and let $c := g'_\ell(a)$, the left derivative of $g$ at $a$. Verify that $g(t) - g(a) \ge c(t - a)$ for all $t \in T$; take $t := f(s)$ and integrate.]

8. Let $g : \mathbb{R} \to \mathbb{R}$ be continuous and such that for any bounded measurable function $f$ on a bounded interval $S$ of $\mathbb{R}$ endowed with the measure induced by the Lebesgue measure the inequality of the preceding exercise holds. Prove that $g$ is convex.

9. (Markov's Inequality) Let $(S, \mathcal{S}, \mu)$ be a measure space and let $f : S \to \overline{\mathbb{R}}_+$ be measurable. Show that for all $c > 0$ one has $\mu(\{f \ge c\}) \le (1/c) \int f\,d\mu$. Deduce from this inequality that if $f$ is integrable then $\mu(\{f = +\infty\}) = 0$. [Hint: use the continuity property of a measure and note that $\{f = +\infty\} = \bigcap_n \{f \ge n\}$.]
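7.3 Approximation of Integrable Functions

It is natural to wonder whether $\mathcal{L}_1(\mu, E) := \mathcal{L}_1(S, \mathcal{S}, \mu, E)$ remains the same if we replace $\mathcal{S}$ with a smaller family $\mathcal{R}$ of subsets of $S$ and $\mu$ with the restriction of $\mu$ to $\mathcal{R}$. If $S = \mathbb{R}^d$ for instance, instead of taking for $\mathcal{S}$ the Borel $\sigma$-algebra, one may wish to use the ring $\mathcal{A}$ generated by the semi-ring $\mathcal{R}$ of semi-closed rectangles, i.e. by Lemma 1.4, the ring $\mathcal{A}$ formed by the unions of finite families of disjoint elements of the semi-ring $\mathcal{R}$. Let us denote by $\mathrm{St}(\rho, E)$ (resp. $\mathrm{St}(\rho, \mathbb{R})$) the set of step maps from $S$ to $E$ (resp. $\mathbb{R}$) with respect to the restriction $\rho$ of $\mu$ to $\mathcal{R}$. We say that $\mu$ is $\sigma$-finite with respect to $\rho$ (or $\mathcal{R}$) if every $T \in \mathcal{S}$ with finite measure is contained in the union of a countable family of elements of $\mathcal{R}$ with finite measures. An answer to the above question is as follows.

Theorem 7.2 Let $\mathcal{R}$ be a semi-ring generating the $\sigma$-algebra $\mathcal{S}$. If $\mu$ is $\sigma$-finite with respect to its restriction $\rho$ to $\mathcal{R}$, then $\mathrm{St}(\rho, E)$ is dense in $\mathcal{L}_1(\mu, E)$ and $\mathcal{L}_1(\rho, E) = \mathcal{L}_1(\mu, E)$.

The proof relies on two lemmas.

Lemma 7.4 Let $R$ be an element of $\mathcal{R}$ with finite measure and let $\mathcal{S}_R := \{T \in \mathcal{S} : T \subset R,\ 1_T \in \mathrm{cl}(\mathrm{St}(\rho, \mathbb{R}))\}$. Then $\mathcal{S}_R$, considered as a family of subsets of $R$, is a $\sigma$-algebra.

Proof Since $1_R \in \mathrm{St}(\rho, \mathbb{R})$ as $\mu(R) < +\infty$, we have $R \in \mathcal{S}_R$. Since $1_{R \setminus T} = 1_R - 1_T$ for $T \subset R$ and since $L_R := \mathrm{cl}(\mathrm{St}(\rho, \mathbb{R}))$ is a linear subspace of $\mathcal{L}_1(\mu, \mathbb{R})$, we have $R \setminus T \in \mathcal{S}_R$ whenever $T \in \mathcal{S}_R$. By Proposition 1.5 it suffices to prove that $\bigcup_n T_n \in \mathcal{S}_R$ whenever $T_n \in \mathcal{S}_R$ for all $n \in \mathbb{N}$. We first show that $T \cup T' \in \mathcal{S}_R$ if $T \in \mathcal{S}_R$ and $T' \in \mathcal{S}_R$. Given $\varepsilon > 0$, we pick $g, g' \in \mathrm{St}(\rho, \mathbb{R})$ such that $\|1_T - g\|_1 < \varepsilon/2$, $\|1_{T'} - g'\|_1 < \varepsilon/2$. We use the relation
\[ |\sup(r, s) - \sup(r', s')| \le |r - r'| + |s - s'| \qquad \text{for all } r, r', s, s' \in \mathbb{R}_+ \]
to get that $\|\sup(1_T, 1_{T'}) - \sup(g, g')\|_1 < \varepsilon$. Since $\sup(1_T, 1_{T'}) = 1_{T \cup T'}$ and $\sup(g, g') \in \mathrm{St}(\rho, \mathbb{R})$ we see that $T \cup T' \in \mathcal{S}_R$. Thus $T \cap T' = R \setminus \big((R \setminus T) \cup (R \setminus T')\big) \in \mathcal{S}_R$ and $T' \setminus T = T' \cap (R \setminus T) \in \mathcal{S}_R$. Therefore, when considering $T = \bigcup_n T_n$ we may suppose the sets $T_n \in \mathcal{S}_R$ are disjoint. Then, given $\varepsilon > 0$, we pick $g_n \in \mathrm{St}(\rho, \mathbb{R})$ such that $\|1_{T_n} - g_n\|_1 < \varepsilon/2^{n+1}$, so that, for $n$ large enough, we have $|\mu(T) - \mu(T_1 \cup \dots \cup T_n)| \le \varepsilon/2$ and
\[ \Big\| 1_T - \sum_{k=1}^{n} g_k \Big\|_1 \le \big\| 1_T - 1_{T_1 \cup \dots \cup T_n} \big\|_1 + \Big\| 1_{T_1 \cup \dots \cup T_n} - \sum_{k=1}^{n} g_k \Big\|_1 \le \varepsilon. \]
Thus $T \in \mathcal{S}_R$ and $\mathcal{S}_R$ is a $\sigma$-algebra.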
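Let us denote by $\mathcal{S}'$ the family of subsets $T$ of $S$ such that $T \cap R \in \mathcal{S}_R$ for all $R \in \mathcal{R}$. Lemma 1.2 shows that $\mathcal{S}'$ is a $\sigma$-algebra. Since it obviously contains $\mathcal{R}$, we have $\mathcal{S}' = \mathcal{S}$. Let us prove more.

Lemma 7.5 If $\mu$ is $\sigma$-finite with respect to $\rho$, then for all $T \in \mathcal{S}$ with finite measure one has $1_T \in \mathrm{cl}(\mathrm{St}(\rho, \mathbb{R}))$.

Proof Let $T \in \mathcal{S}$ with finite measure. By assumption one can find a sequence $(R_n)$ in $\mathcal{R}$ such that $T \subset \bigcup_n R_n$ and $\mu(R_n) < +\infty$ for all $n \in \mathbb{N}$; since $\mathcal{R}$ is a semi-ring, we may suppose the sets $R_n$ are disjoint. Since for all $n$ we have $T \cap R_n \in \mathcal{S}_{R_n}$ by the preceding remark, given $\varepsilon > 0$ we can find $g_n \in \mathrm{St}(\rho, \mathbb{R})$ such that $\|1_{T \cap R_n} - g_n\|_1 \le \varepsilon/2^{n+1}$. Since $T = \bigcup_n (T \cap R_n)$ we can find some $n$ such that $\mu\big(T \setminus (R_1 \cup \dots \cup R_n)\big) < \varepsilon/2$. Thus
\[ \Big\| 1_T - \sum_{k=1}^{n} g_k \Big\|_1 \le \Big\| 1_T - \sum_{k=1}^{n} 1_{T \cap R_k} \Big\|_1 + \Big\| \sum_{k=1}^{n} 1_{T \cap R_k} - \sum_{k=1}^{n} g_k \Big\|_1 \le \varepsilon, \]
as $\|1_T - (1_{T \cap R_1} + \dots + 1_{T \cap R_n})\|_1 = \mu\big(T \setminus (R_1 \cup \dots \cup R_n)\big) < \varepsilon/2$. Therefore $1_T \in \mathrm{cl}(\mathrm{St}(\rho, \mathbb{R}))$.

Now let us show that $\mathrm{St}(\mu, E) \subset \mathrm{cl}(\mathrm{St}(\rho, E))$. Since $\mathrm{St}(\mu, E)$ is dense in $\mathcal{L}_1(\mu, E)$, this will prove that $\mathrm{cl}(\mathrm{St}(\rho, E)) = \mathcal{L}_1(\mu, E)$ and that $\mathcal{L}_1(\rho, E) = \mathcal{L}_1(\mu, E)$. Let $f \in \mathrm{St}(\mu, E)$: $f = 1_{S_1} e_1 + \dots + 1_{S_m} e_m$ with $e_i \in E \setminus \{0\}$ and $S_i \in \mathcal{S}$ with $\mu(S_i) < +\infty$ for $i \in \mathbb{N}_m$. Given $\varepsilon > 0$, for all $i \in \mathbb{N}_m$ we can find some $g_i \in \mathrm{St}(\rho, \mathbb{R})$ such that $\|1_{S_i} - g_i\|_1 \le \varepsilon/(m \|e_i\|)$. Then we get $\|f - (g_1 e_1 + \dots + g_m e_m)\|_1 \le \varepsilon$. Thus $f \in \mathrm{cl}(\mathrm{St}(\rho, E))$.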
Let us apply Theorem 7.2 to the approximation of integrable functions on $\mathbb{R}^d$ by smooth functions. Given $r < s$ in $\mathbb{R}$ we denote by $b$ the bell-shaped function on $\mathbb{R}$ defined by
\[ b(t) := e^{-\frac{1}{(t - r)(s - t)}} \ \text{ if } t \in ]r, s[, \qquad b(t) = 0 \ \text{ otherwise.} \]
It can be checked that $b$ is of class $C^\infty$ and $a := \int_{\mathbb{R}} b > 0$. The function $c : \mathbb{R} \to \mathbb{R}$ given by
\[ c(x) := \frac{1}{a} \int_{-\infty}^{x} b(t)\,dt \]
starts with the value $0$ until $r$ and then climbs to the constant value $1$ reached for $x \ge s$. Thus, for $q \in [r, s]$, it can be considered as an approximation of $1_{[q, +\infty[}$, the shifted Heaviside function. Changing $r$ and $s$ we can make the climb as steep as necessary. Combining $c$ with a shifted function of $x \mapsto c(-x)$, we get an approximation $g$ of the characteristic function of $[r, s]$ which differs from $1_{[r,s]}$ on small intervals around the extremities of $[r, s]$ and takes its values in $[0, 1]$. Therefore, given $\varepsilon > 0$, we can choose $g$ in such a way that $\|1_{[r,s]} - g\|_1 \le \varepsilon$ and $g$ is of class $C^\infty$ with a compact support. If $R$ is the rectangle $[r_1, s_1] \times \dots \times [r_d, s_d]$ in $\mathbb{R}^d$, using approximations $g_i$ of $1_{[r_i, s_i]}$ for $i \in \mathbb{N}_d$ and setting $g(x) := g_1(x_1) \cdots g_d(x_d)$ for $x := (x_1, \dots, x_d)$, we get an approximation of $1_R$. If $D$ belongs to the ring generated by the class of semi-closed rectangles, we easily get an approximation of $1_D$, since $D$ is the union of a finite family of disjoint semi-closed rectangles. Combining this construction with Theorem 7.2, we get the following useful result.

Theorem 7.3 The space $C_c^\infty(\mathbb{R}^d)$ of functions of class $C^\infty$ with compact support is dense in the space $(L_1(\mathbb{R}^d), \|\cdot\|_1)$.
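The functions $b$ and $c$ used in this construction can be evaluated numerically. The following sketch is a rough illustration under stated assumptions: the integral is approximated by a midpoint rule with an arbitrary number of steps, the names `bell`, `smooth_step` and `plateau` are introduced here, and `plateau` is only one possible way (with a hypothetical transition width `delta`) of combining a rising and a falling copy of $c$ into an approximation $g$ of $1_{[r,s]}$.

```python
import math


def bell(t, r, s):
    """b(t) = exp(-1 / ((t - r)(s - t))) on ]r, s[ and 0 elsewhere."""
    if r < t < s:
        return math.exp(-1.0 / ((t - r) * (s - t)))
    return 0.0


def integral_bell(x, r, s, steps=2000):
    """Midpoint-rule approximation of the integral of b over ]-inf, x]."""
    upper = min(max(x, r), s)          # b vanishes outside ]r, s[
    h = (upper - r) / steps
    return h * sum(bell(r + (i + 0.5) * h, r, s) for i in range(steps))


def smooth_step(x, r, s):
    """c(x) = (1/a) * integral of b over ]-inf, x], where a is the integral of b over R."""
    return integral_bell(x, r, s) / integral_bell(s, r, s)


def plateau(x, r, s, delta):
    """One possible g: climbs on [r - delta, r], equals 1 on [r, s], falls on [s, s + delta]."""
    return smooth_step(x, r - delta, r) * (1.0 - smooth_step(x, s, s + delta))


grid = [i / 10 for i in range(11)]
values = [smooth_step(x, 0.0, 1.0) for x in grid]
print([round(v, 4) for v in values])
```

With this choice $g$ differs from $1_{[r,s]}$ only on the two intervals of length `delta` next to $r$ and $s$, and takes its values in $[0, 1]$, so that $\|1_{[r,s]} - g\|_1 \le 2\delta$.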
Exercises

1. Let $(X, \mathcal{S}, \mu)$ be a measure space, let $E$ be a Banach space, and let $\mathcal{R}$ be a semiring generating the $\sigma$-algebra $\mathcal{S}$. Assume that $\mu$ is $\sigma$-finite with respect to the restriction of $\mu$ to $\mathcal{R}$. Let $f \in L_1(\mu, E)$ be such that $\int_R f\,d\mu = 0$ for all $R \in \mathcal{R}$. Show that $f = 0$ a.e. [Hint: use Theorem 7.2 and the Dominated Convergence Theorem of the next section.]

2. Let $\lambda_d$ be the Lebesgue measure on $\mathbb{R}^d$ and let $E$ be a Banach space. Prove that if $f \in L_1(\lambda_d, E)$ is such that $\int f g\,d\lambda_d = 0$ for all $g \in C_c^\infty(\mathbb{R}^d)$, then $f = 0$ a.e. [Hint: use the preceding exercise, the Dominated Convergence Theorem, and observe that the characteristic function of a semi-closed rectangle $R$ can be approximated by functions of $C_c^\infty(\mathbb{R}^d)$ taking their values in $[0, 1]$ and pointwise converging to $1_R$.]

3. Prove that when $\mu(S) > 0$, for $u \in L_1(S, \mathbb{R})$ with $u(s) > 0$ for all $s \in S$, one has $\int_S u(s)\,d\mu(s) > 0$. [Hint: consider the family of sets $S_n := \{s \in S : u(s) \ge 2^{-n}\}$ for $n \in \mathbb{N}$ and observe that $\int_S u(s)\,d\mu(s) \ge 2^{-n} \mu(S_n)$ and that for at least one $n \in \mathbb{N}$ one has $\mu(S_n) > 0$.]

4. (Lusin's Theorem) Let $\mu$ be a regular Borel measure on a metric space $(X, d)$ and let $f \in L_1(\mu)$. Prove that for every $\varepsilon > 0$ there exists some $g \in C(X) \cap L_1(\mu)$ such that $\|f - g\|_1 \le \varepsilon$, $\|g\|_\infty \le \|f\|_\infty$ and $\mu(\{f \ne g\}) \le \varepsilon$. [Hint: use Urysohn's Lemma.]
7.4 Convergence Results For the study of integrals of real-valued functions on a measure space .S; S; / one can make use of order properties. This is also the case for functions with values in R WD R [ f1; C1g: In view of the compactness of R it is convenient to consider such functions, in particular for convergence questions. Then one can use the fact that any increasing sequence . fn / of functions with values in R has a limit, with limn fn .x/ D C1 if . fn .x//n is unbounded. We devote this section to convergence results for such functions. However, in order to avoid complications with null sets, we start with the simple case of measurable functions for a notion that is more often used in probability theory than in analysis. Definition 7.4 Let .S; S; / be a measure space and let .E; kk/ be a Banach space. A sequence . fn / of measurable maps from S to E is said to converge to f W S ! E in measure if for all c > 0 one has limn .fkfn f k > cg/ D 0, where fkfn f k > cg stands for fx 2 S W kfn .x/ f .x/k > cg: Proposition 7.8 If f 2 L1 .S; E/ and if . fn / is a sequence in L1 .S; E/ such that .kfn f k1 / ! 0; then . fn / converges to f in measure. Proof This stems from the inequality .fkfn f k > cg/ .1=c/ kfn f k1 for all n. Proposition 7.9 If . fn / ! f a.e. and if .S/ < C1; then . fn / ! f in measure. Proof Given c > 0; let An WD fkfn f k > cg and let Bk WD [nk An : The sequence .Bk / is decreasing and its intersection is contained in the set N of s 2 S such that . fn .s// does not converge to f .s/: Thus .\k Bk / D 0; hence limk .Bk / D 0 (Lemma 1.5). Since Ak Bk we get limk .Ak / D 0 W . fn / converges to f in measure. Proposition 7.10 If . fn / ! f in measure then a subsequence of . fn / converges to f a.e. 
Proof Using the definition of convergence in measure, we construct inductively an increasing sequence $(n(k))_{k \ge 1}$ of $\mathbb{N}$ such that
$$n \ge n(k) \implies \mu(\{\|f_n - f\| > 1/k\}) \le 2^{-k}.$$
Setting $A_k := \{\|f_{n(k)} - f\| > 1/k\}$ and $A := \bigcap_{j \ge 1} \bigcup_{k \ge j} A_k$ we see that for all $s \in S \setminus A$ there exists $j \ge 1$ such that $s \notin A_k$ for all $k \ge j$, i.e. $\|f_{n(k)}(s) - f(s)\| \le 1/k$ for $k \ge j$. This means that $(f_{n(k)}(s))_k \to f(s)$ for all $s \in S \setminus A$. Since for all $j \ge 1$
$$\mu\Big(\bigcup_{k \ge j} A_k\Big) \le \sum_{k \ge j} \mu(A_k) \le \sum_{k \ge j} 2^{-k} = 2^{-j+1}$$
we get that $\mu(A) = \lim_j \mu(\bigcup_{k \ge j} A_k) = 0$: $(f_{n(k)}) \to f$ almost everywhere.
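The gap between convergence in measure and convergence a.e. in Proposition 7.10 can be seen on the classical "typewriter" sequence on $[0, 1[$, which is not taken from the text: for $n = 2^k + j$ with $0 \le j < 2^k$ let $f_n$ be the characteristic function of $[j 2^{-k}, (j+1) 2^{-k}[$. The Python sketch below (all function names are ours, chosen for illustration) checks on exact rationals that the measures of the sets $\{|f_n| > c\}$ shrink, that $f_n(x) = 1$ for one index in every dyadic block, and that the subsequence $n(k) := 2^k$ vanishes at any fixed $x > 0$ for $k$ large.

```python
from fractions import Fraction

def typewriter_interval(n):
    """Return (a, b) such that f_n is the indicator of [a, b), for n >= 1.

    Writing n = 2**k + j with 0 <= j < 2**k, one has [a, b) = [j/2**k, (j+1)/2**k).
    """
    k = n.bit_length() - 1
    j = n - (1 << k)
    return Fraction(j, 1 << k), Fraction(j + 1, 1 << k)

def f(n, x):
    """Value of the n-th typewriter function at x."""
    a, b = typewriter_interval(n)
    return 1 if a <= x < b else 0

def measure_of_exceedance(n, c):
    """Lebesgue measure of {x in [0, 1) : |f_n(x) - 0| > c}, for 0 < c < 1."""
    assert 0 < c < 1
    a, b = typewriter_interval(n)
    return b - a  # the exceedance set is exactly the support [a, b)

x = Fraction(1, 3)

# Convergence in measure to 0: the exceedance sets have measure 2**-k.
measures = [measure_of_exceedance(n, Fraction(1, 2)) for n in (1, 2, 4, 8, 1024)]

# No convergence at x: every dyadic block of indices contains one n with f_n(x) = 1.
hits = [n for n in range(1, 64) if f(n, x) == 1]

# Subsequence n(k) = 2**k: the supports [0, 2**-k) shrink to {0}.
sub = [f(1 << k, x) for k in range(8)]
```

Here the subsequence is chosen by hand; the proof above selects $n(k)$ from the bound $\mu(\{\|f_n - f\| > 1/k\}) \le 2^{-k}$ instead, which works for an arbitrary sequence converging in measure.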
Let us gather some more classical convergence results. They are most useful. They can be extended to the case when the functions $(f_n)$ are almost measurable with values in $\overline{\mathbb{R}}$.

Theorem 7.4 (Monotone Convergence Theorem) Let $(f_n)$ be a sequence in $L_1(S, \mathbb{R})$ such that $f_n \le f_{n+1}$ a.e. and let $f(s) := \lim_n f_n(s)$. Then $f$ is integrable if and only if $(\int_S f_n\,d\mu)$ is bounded in $\mathbb{R}$. In such a case $(\|f_n - f\|_1) \to 0$ as $n \to \infty$ and $(\int_S f_n\,d\mu) \to \int_S f\,d\mu$ as $n \to \infty$.

Proof Suppose $f$ is integrable (in the sense of Definition 7.3). Then for all $n \in \mathbb{N}$ one has $\int_S f_0\,d\mu \le \int_S f_n\,d\mu \le \int_S f\,d\mu$, so that $(\int_S f_n\,d\mu)$ is bounded. Conversely, if the nondecreasing sequence $(\int_S f_n\,d\mu)$ is bounded, it is convergent, hence Cauchy. Thus, given $\varepsilon > 0$, for $n, p$ in $\mathbb{N}$ large enough with $p \ge n$, one has
$$\|f_p - f_n\|_1 = \int_S (f_p - f_n)\,d\mu = \int_S f_p\,d\mu - \int_S f_n\,d\mu \le \varepsilon.$$
Since $L_1(S, \mathbb{R})$ is complete, $(f_n)$ converges to some $g \in L_1(S, \mathbb{R})$ for the semi-norm $\|\cdot\|_1$. Proposition 7.6 ensures that $(f_n)$ has a subsequence converging a.e. to $g$. The monotonicity of $(f_n)$ implies that the whole sequence $(f_n)$ converges a.e. to $g$. Thus $g = f$ a.e. and $f$ is integrable. The convergence $(\int_S f_n\,d\mu) \to \int_S f\,d\mu$ follows from $\left|\int_S f_n\,d\mu - \int_S f\,d\mu\right| \le \int_S |f_n - f|\,d\mu$ and $(\|f_n - f\|_1) \to 0$ as $n \to \infty$.

Corollary 7.8 (Fatou's Lemma) Let $(f_n)$ be a sequence in $L_1(S, \mathbb{R})$ such that $f_n(s) \ge 0$ a.e. for all $n \in \mathbb{N}$. If $\liminf_n \|f_n\|_1 < +\infty$, then $f := \liminf_n f_n$ is integrable and
$$\int_S f(s)\,d\mu(s) \le \liminf_n \int_S f_n(s)\,d\mu(s).$$
Consequently, the semi-norm $\|\cdot\|_1$ is lower semicontinuous on the positive cone of $L_1(S, \mathbb{R})$ with respect to a.e. convergence.

Proof For $m, p \in \mathbb{N}$ with $m \le p$, let $g_{m,p} := \inf_{m \le n \le p} f_n$, let $g_m := \inf_{p \ge m} g_{m,p} = \inf_{n \ge m} f_n$, so that $f = \lim_m g_m$. Since $(g_{m,p})_{p \ge m}$ is decreasing and $(\int_S g_{m,p}\,d\mu)_{p \ge m}$ is bounded below by $0$, the Monotone Convergence Theorem ensures that $g_m$ is integrable. Moreover,
$$\int_S g_m\,d\mu \le \inf_{n \ge m} \int_S f_n\,d\mu = \inf_{n \ge m} \|f_n\|_1 \le c := \liminf_n \|f_n\|_1.$$
Again the Monotone Convergence Theorem ensures that $f = \lim_m g_m$ is integrable and $\int_S f\,d\mu \le c$.

The next result is a cornerstone of Lebesgue integration theory.

Theorem 7.5 (Lebesgue's Dominated Convergence Theorem) Let $(f_n)$ be a sequence in $L_1(S, E)$ such that there exists an $h \in L_1(S, \mathbb{R})$ satisfying $\|f_n(s)\|_E \le h(s)$ a.e. $s \in S$ for all $n \in \mathbb{N}$. If $(f_n)$ converges a.e. to some map $f$, then $f$ is in $L_1(S, E)$ and $(f_n)$ converges to $f$ in $L_1(S, E)$. Moreover, $(\int_S f_n\,d\mu) \to \int_S f\,d\mu$.

Proof For $k, p \in \mathbb{N}$ with $k \le p$, let $g_{k,p} := \sup_{k \le m, n \le p} \|f_m - f_n\|_E$, $g_k := \sup_{m, n \ge k} \|f_m - f_n\|_E$, so that $0 \le g_{k,p} \le g_k \le 2h$. Since $(g_{k,p})_p$ is nondecreasing and $(g_k)$ is nonincreasing, the Monotone Convergence Theorem ensures that $g_k = \sup_{p \ge k} g_{k,p}$ and $g := \inf_k g_k$ are integrable and $(\|g_k - g\|_1) \to 0$ as $k \to \infty$. However, since $(f_n)$ converges a.e. to some map $f$, we have $g = 0$ a.e. Thus $(\|g_k\|_1) \to 0$ as $k \to \infty$ and $(f_n)$ is a Cauchy sequence in $L_1(S, E)$. By Proposition 7.6 $(f_n)$ has a subsequence which converges a.e. and in $L_1(S, E)$ to some element $f'$ of $L_1(S, E)$. Then $f' = f$ a.e. and $f \in L_1(S, E)$.

Corollary 7.9 Let $f : S \to E$ be $\mu$-measurable. If there exists some $g \in L_1(S, \mathbb{R})$ such that $\|f(\cdot)\|_E \le g(\cdot)$ almost everywhere then $f \in L_1(S, E)$. In particular, $f \in L_1(S, E)$ if and only if $\|f(\cdot)\|_E$ is in $L_1(S, \mathbb{R})$.

Proof Changing $f$ and $g$ on a set of measure $0$, we may assume $g$ is measurable. Let $N$ be a measurable set of measure $0$ and let $(f_n)$ be a sequence in $St(\mu, E)$ such that $(f_n) \to f$ on $S \setminus N$. Given $a > 1$, for $n \in \mathbb{N}$ let $S_n := \{x \in S : a g(x) - \|f_n(x)\| \ge 0\}$ and let $f'_n := 1_{S_n} f_n$. Then $S_n$ is measurable and one has $f'_n \in L_1(S, E)$. Let us observe that for all $x \in S \setminus N$ we have $(f'_n(x)) \to f(x)$: if $g(x) = 0$ we have $f(x) = 0$ and either $x \in S_n$ so that $f'_n(x) = f_n(x) = 0$ or $x \notin S_n$ and $f'_n(x) = 0$, whereas if $g(x) > 0$, for $n$ large enough we have $x \in S_n$ and $f'_n(x) = f_n(x)$. Since $\|f'_n(x)\| \le a g(x)$ for all $x \in S$ the Dominated Convergence Theorem entails that $f \in L_1(S, E)$. The second assertion is a consequence of the first one and of Proposition 7.4.

Corollary 7.10 Let $(f_n)$ be a sequence in $L_1(S, E)$ which converges a.e. to some map $f$. If there exists some $c \in \mathbb{R}$ such that $\|f_n\|_1 \le c$ for all $n \in \mathbb{N}$, then $f \in L_1(S, E)$ and $\|f\|_1 \le c$.

Proof Since $(g_n) := (\|f_n(\cdot)\|_E) \to g := \|f(\cdot)\|_E$ a.e. and $\|g_n\|_1 = \|f_n\|_1 \le c$ for all $n$, Fatou's lemma ensures that $g \in L_1(S, \mathbb{R})$ and $\|g\|_1 \le c$. Since all $f_n$ are $\mu$-measurable, $f$ is $\mu$-measurable by Corollary 7.1. Then, the preceding corollary implies that $f \in L_1(S, E)$. Moreover, $\|f\|_1 = \|g\|_1 \le c$.
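The inequality $\|f\|_1 \le c$ of Corollary 7.10, like the one in Fatou's Lemma, may be strict. A minimal sketch, assuming the standard example $f_n := n 1_{]0, 1/n[}$ on $[0, 1]$ with the Lebesgue measure (this example and the helper names below are ours, not the text's): every $f_n$ has norm $1$, the pointwise limit is $f = 0$, and the mass of $f_n$ concentrates near $0$.

```python
from fractions import Fraction

def l1_norm(n):
    """Exact integral over [0, 1] of f_n = n * 1_{]0, 1/n[}: height n times width 1/n."""
    return n * Fraction(1, n)

def f(n, x):
    """Value of f_n at x."""
    return n if 0 < x < Fraction(1, n) else 0

def first_index_where_zero(x):
    """Smallest N with f_n(x) = 0 for all n >= N, for x in ]0, 1]."""
    # f_n(x) = 0 as soon as 1/n <= x.
    n = 1
    while Fraction(1, n) > x:
        n += 1
    return n

norms = [l1_norm(n) for n in range(1, 50)]   # all equal to 1, so one may take c = 1
x = Fraction(3, 100)
N = first_index_where_zero(x)                # from this index on, f_n(x) = 0
tail_values = [f(n, x) for n in range(N, N + 20)]
```

Since $\|f_n - f\|_1 = \|f_n\|_1 = 1$ for all $n$, the bound on the norms alone gives integrability of the limit but not convergence in $L_1(S, E)$.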
Corollary 7.11 (Beppo Levi) Let $\sum_{n=1}^{\infty} f_n$ be an infinite series whose terms are $\overline{\mathbb{R}}_+$-valued $\mu$-measurable functions on $S$. Then $f := \sum_{n=1}^{\infty} f_n$ is $\mu$-measurable and
$$\int \sum_{n=1}^{\infty} f_n\,d\mu = \sum_{n=1}^{\infty} \int f_n\,d\mu. \qquad (7.8)$$
If $E$ is a Banach space and if the terms of a series $\sum_{n=1}^{\infty} f_n$ are $E$-valued integrable functions such that $\sum_{n=1}^{\infty} \|f_n\|_1$ converges, and if the series $\sum_{n=1}^{\infty} f_n$ converges a.e., then its sum $f$ is integrable and the preceding relation holds.

Proof The first assertion stems from the Monotone Convergence Theorem applied to the partial sums of the series. If the terms of the series $\sum_{n=1}^{\infty} f_n$ are $E$-valued integrable functions such that $c := \sum_{n=1}^{\infty} \|f_n\|_1 < +\infty$, the partial sums $g_n$ satisfy $\|g_n\|_1 \le \sum_{k=1}^{n} \|f_k\|_1 \le c$ and the result follows from the preceding corollary and from Theorem 7.5.

Corollary 7.12 Let $F$, $G$, $H$ be Banach spaces and let $b : F \times G \to H$ be a continuous bilinear map. Let $f \in L_1(S, F)$ and let $g : S \to G$ be a bounded $\mu$-measurable map. Then $h(\cdot) := b(f(\cdot), g(\cdot))$ is in $L_1(S, H)$ and $\|h\|_1 \le \|b\| \cdot \|g\|_\infty \|f\|_1$. In particular, one has $gf \in L_1(S, E)$ whenever $f \in L_1(S, E)$ and $g : S \to \mathbb{R}$ is bounded and $\mu$-measurable.

Proof Let $c := \|g\|_\infty := \sup_{x \in S} \|g(x)\|_G$, let $(f_n)$ be a Cauchy sequence of $\mu$-step maps converging a.e. to $f$ and let $(g_n)$ be a sequence of step functions converging a.e. to $g$. As in the proof of Corollary 7.9, given $a > 1$ we may suppose $\|g_n(x)\|_G \le a \|g(x)\|_G$ for all $x \in S$. Then $h_n(\cdot) := b(f_n(\cdot), g_n(\cdot))$ is a $\mu$-step map, $(h_n) \to h$ a.e., and $\|h_n\|_1 \le ac \|b\| \|f_n\|_1$ for all $n$. Corollary 7.10 enables us to conclude that $h \in L_1(S, H)$ and $\|h\|_1 \le ac \|b\| \|f\|_1$. Since $a$ is arbitrarily close to $1$, we get the announced estimate.

The next theorem has a probabilistic interpretation in terms of the averages
$$m_f(A) := \frac{1}{\mu(A)} \int_A f\,d\mu.$$
Theorem 7.6 Let $f \in L_1(S, E)$ and let $F$ be a closed subset of $E$. If for all $A \in \mathcal{S}$ with positive finite measure one has $m_f(A) \in F$, then $f(x) \in F$ for almost all $x \in S$.

Proof Changing $f$ on a set of measure $0$ and replacing $E$ with a separable subspace $E_0$ and $F$ with $F \cap E_0$, we may suppose $E$ is separable and that $f$ is null outside of a set $S_0$ which is a countable union of measurable subsets of finite measures. Thus we are reduced to the case when $S$ has finite measure. Given $e \in E \setminus F$ we pick $r > 0$ such that the closed ball $B := B[e, r]$ with center $e$ and radius $r$ does not meet $F$. Let $A := f^{-1}(B)$. If $\mu(A)$ is positive we have
$$\|m_f(A) - e\|_E = \frac{1}{\mu(A)} \left\| \int_A f\,d\mu - \int_A e\,d\mu \right\|_E \le \frac{1}{\mu(A)} \int_A \|f - e\|_E\,d\mu \le r,$$
a contradiction with $m_f(A) \in F \subset E \setminus B$. Thus $\mu(A) = 0$. Taking a countable dense subset $D$ of $E \setminus F$ and closed balls with centers in $D$ and rational radius contained in $E \setminus F$ we get that the set $S \setminus f^{-1}(F)$ has measure $0$.

Important consequences are obtained by taking $F := B[0, b]$, $F := \{0\}$, $F := [a, b]$ in $\mathbb{R}$, $F := \mathbb{R}_+$ respectively.

Corollary 7.13 For $f \in L_1(S, E)$, $b \in \mathbb{R}_+$ one has $\|f\|_E \le b$ a.e. if and only if
$$\forall A \in \mathcal{S} \qquad \left\| \int_A f\,d\mu \right\|_E \le b\,\mu(A).$$

Corollary 7.14 If $f \in L_1(S, E)$ and if for all measurable subsets $A$ of $S$ with finite positive measure one has $m_f(A) = 0$, then $f = 0$ almost everywhere.

Corollary 7.15 If $f \in L_1(S, \mathbb{R})$ and if for all measurable subsets $A$ of $S$ with finite positive measure one has $a \le m_f(A) \le b$, then $f(x) \in [a, b]$ almost everywhere. If for all measurable subsets $A$ of $S$ with finite positive measure one has $0 \le \int_A f\,d\mu$, then $f(x) \ge 0$ almost everywhere.

A kind of converse is also of interest.

Corollary 7.16 If $f \in L_1(S, \mathbb{R})$, if $f \ge 0$ a.e., and if $\int_S f\,d\mu = 0$ then $f = 0$ almost everywhere.
Proof This follows from Corollary 7.14 since for all $A \in \mathcal{S}$ one has $1_A f \le f$ hence
$$0 \le \int_A f\,d\mu = \int_S 1_A f\,d\mu \le \int_S f\,d\mu = 0.$$
The next result is in the same vein as the preceding ones. It will be used later on.

Theorem 7.7 Let $(S, \mathcal{S}, \mu)$ be a measure space and let $h : S \to \overline{\mathbb{R}}_+$ be a $\mu$-measurable function. A measure $\nu : \mathcal{S} \to \overline{\mathbb{R}}_+$ coincides with $A \mapsto \int_A h\,d\mu$ if and only if for all $A \in \mathcal{S}$ one has (with the convention $+\infty \cdot 0 = 0$)
$$\mu(A) \inf h(A) \le \nu(A) \le \mu(A) \sup h(A). \qquad (7.9)$$

Proof Condition (7.9) is clearly necessary. Let us prove it is sufficient. We may suppose $h$ is measurable and in a first step we assume $h$ is positive everywhere. We fix $c \in \,]0, 1[$ for a moment and for $A \in \mathcal{S}$ and $n \in \mathbb{Z}$ we set $A_n := \{x \in A : c^{n+1} \le h(x) < c^n\}$. These sets form a measurable partition of $A$. Relation (7.9) yields $c^{n+1} \mu(A_n) \le \nu(A_n) \le c^n \mu(A_n)$ for all $n \in \mathbb{Z}$. On the other hand, integrating $h$ on $A_n$ we have
$$c^{n+1} \mu(A_n) \le \int_{A_n} h\,d\mu \le c^n \mu(A_n).$$
Summing these inequalities over $n \in \mathbb{Z}$ we get
$$c \int_A h\,d\mu \le \sum_n c^{n+1} \mu(A_n) \le \sum_n \nu(A_n) \le \sum_n c^n \mu(A_n) \le \frac{1}{c} \int_A h\,d\mu.$$
Using the $\sigma$-additivity of $\nu$, these relations imply $c \int_A h\,d\mu \le \nu(A) \le \frac{1}{c} \int_A h\,d\mu$. Since $c$ is arbitrarily close to $1$ we get $\nu(A) = \int_A h\,d\mu$. In the general case we apply this partial result to the induced measure on $S_0 := \{x \in S : h(x) > 0\}$ and we obtain that $\nu(A) = \int_A h\,d\mu$ for all $A \in \mathcal{S}$ contained in $S_0$. Now (7.9) ensures that $\nu(A) = 0 = \int_A h\,d\mu$ for all $A \in \mathcal{S}$ contained in $S \setminus S_0$. Using the additivity of $\nu$ and of $A \mapsto \int_A h\,d\mu$ we get the result in the general case.
Exercises
1. Show that under the assumption of Corollary 7.10 it is not always true that $(f_n)$ converges to $f$ in $L_1(S, E)$. [Hint: take $f_n := 2^n 1_{I_n}$ with $I_n := [2^{-n}, 2^{-n+1}]$ and $f = 0$.]
2. If $(S, \mathcal{S}, \mu)$ is a $\sigma$-finite measure space, show that there exists an integrable function $u : S \to \mathbb{R}$ with positive values such that $\int_S u(s)\,d\mu(s) = 1$. [Hint: Given a sequence $(S_n)$ of disjoint subsets in $\mathcal{S}$ such that $S = \bigcup_n S_n$ and $c_n := \mu(S_n) \in \,]0, +\infty[$ for all $n$, set $u = \sum_n (2^{-n}/c_n) 1_{S_n}$.]
3. Let $(S, \mathcal{S}) := ([0, 1], \mathcal{B}([0, 1]))$ and let $\mu$ be the restriction of the Lebesgue measure to $\mathcal{B}([0, 1])$. For $n \ge 1$ let $f_n$ be given by $f_n(r) := \min(r^{-1/2} e^{-nr}, n)$. Show that $(f_n) \to 0$ a.e. and that $|f_n| \le h$ with $h(r) := r^{-1/2}$, so that $(\int_S f_n\,d\mu) \to 0$ by Theorem 7.5 but that $(f_n)$ does not converge uniformly to $0$.
4. The following example shows that Theorem 7.5 is not a universal tool. Let $(S, \mathcal{S}, \mu) := (\mathbb{R}, \mathcal{B}(\mathbb{R}), \lambda)$ and let $g : S \to \mathbb{R}_+$ be a continuous function null on $\mathbb{R} \setminus [0, 1]$ with a non-null integral and for $n \in \mathbb{N} \setminus \{0\}$ let $f_n$ be given by $f_n(r) := g(r + n)/n$. Show that $(f_n) \to 0$ uniformly, that $(\int_{\mathbb{R}} f_n\,d\lambda) \to 0$ but that $\int_{\mathbb{R}} h\,d\lambda = +\infty$ for any measurable function $h$ satisfying $f_n \le h$ for all $n \in \mathbb{N}$.
5. Let $(S, \mathcal{S}, \mu)$ be a measure space, and let $(f_n)$ be a sequence in $L_1(\mu, E)$ that pointwise converges to a function $f$ and such that $(\|f_n\|_1)$ is bounded. Show that $f \in L_1(\mu, E)$. Give an example showing that the sequence $(\int f_n\,d\mu)$ may not converge and another one showing that this sequence converges but that its limit is not $\int f\,d\mu$.
6. Show the convergence of the sequences whose general terms are:
$$\int_0^n \Big(1 - \frac{x}{n}\Big)^n e^{x/2}\,dx, \qquad \int_0^n \Big(1 + \frac{x}{n}\Big)^n e^{-2x}\,dx, \qquad \int_0^n \Big(1 - \frac{x}{n}\Big)^n \cos^2 nx\,dx.$$
7. Let $(r_n)$ be a sequence in $[0, 1]$ such that $\{r_n : n \in \mathbb{N}\}$ is dense in $[0, 1]$ and let $S_n := \{r_k : k \in \mathbb{N}_n\}$. Show that the function $f_n := 1_{S_n}$ is Riemann-integrable with null integral and that the pointwise limit $f$ of the increasing sequence $(f_n)$ is not Riemann-integrable (its upper Riemann integral is equal to $1$ while its lower Riemann integral is $0$).
8. Show that convergence in measure satisfies the conditions of a space with limits (Definition 2.1). Is this the case for convergence almost everywhere?
9. Verify that given a measure space $(S, \mathcal{S}, \mu)$, for any $A \in \mathcal{S}$ with finite measure the function $p_A$ on the space $L_0(\mu, E)$ of $\mu$-measurable maps from $S$ into $E$ given by $p_A(f) := \inf\{c \in \mathbb{R}_+ : \mu(\{|1_A f| > c\}) \le c\}$ is a seminorm. The convergence associated with this family of seminorms is called local convergence in measure. Show that when $\mu$ is finite, this convergence coincides with convergence in measure.
10. With the notation of the preceding exercise, show that for all $f \in L_0(\mu, E)$ the intersection of the family of semi-balls $\{g \in L_0(\mu, E) : p_A(g - f) < r\}$ for $r > 0$, $A \in \mathcal{S}$ with finite measure, is $\{f\}$.
11. With the notation of Exercise 9, let $p$ be the generalized semi-norm on $L_0(\mu, E)$ given by $p(f) := p_S(f)$ if there exists $c \in \mathbb{R}_+$ such that $\mu(\{|f| > c\}) \le c$, $p(f) = +\infty$ otherwise. Show that the convergence associated to $p$ coincides with convergence in measure.
12. Show that if $(S, \mathcal{S}, \mu)$ is a probability space, i.e. a measure space with $\mu(S) = 1$, a sequence $(f_n) \to f$ in measure if and only if one has $\left(\int \frac{|f_n - f|}{1 + |f_n - f|}\,d\mu\right) \to 0$.
7.5 Integrals Depending on a Parameter

Since many functions are defined by integrals, it is useful to learn how one can ensure continuity or differentiability of integrals depending on some parameters.

Proposition 7.11 Let $(S, \mathcal{S}, \mu)$ be a measure space, let $(T, d)$ be a metric space, let $\bar{t} \in T$, let $E$ be a Banach space and let $f : S \times T \to E$ be a map such that
(a) for all $t \in T$ the partial map $f_t := f(\cdot, t)$ belongs to $L_1(S, \mu, E)$,
(b) for a.e. $s \in S$ the partial map $t \mapsto f(s, t)$ is continuous at $\bar{t} \in T$,
(c) there exists an $h \in L_1(S, \mu, \mathbb{R})$ such that $\|f(s, t)\| \le h(s)$ for all $(s, t) \in S \times T$.
Then the map $g : T \to E$ given by $g(t) := \int_S f_t\,d\mu$ is continuous at $\bar{t}$.

Proof Given a sequence $(t_n) \to \bar{t}$ in $T$ we have to show that $(g(t_n)) \to g(\bar{t})$. We have
$$\|g(t_n) - g(\bar{t})\| = \left\| \int_S (f_{t_n} - f_{\bar{t}})\,d\mu \right\| \le \int_S \|f_{t_n} - f_{\bar{t}}\|\,d\mu,$$
$(f_{t_n} - f_{\bar{t}}) \to 0$ a.e. and $\|f_{t_n} - f_{\bar{t}}\| \le 2h$. The Dominated Convergence Theorem ensures that $(\int_S \|f_{t_n} - f_{\bar{t}}\|\,d\mu) \to 0$. Thus $(g(t_n)) \to g(\bar{t})$.

For simplicity, the differentiability result we give now is presented for one-variable functions. It leads to a differentiability result in the sense of Hadamard. A Fréchet differentiability result can be devised as in the following exercise.

Proposition 7.12 Let $(S, \mathcal{S}, \mu)$ be a measure space, let $T$ be an open interval of $\mathbb{R}$, let $\bar{t} \in T$, let $E$ be a Banach space and let $f : S \times T \to E$ be a map such that
(a) for all $t \in T$ the partial map $f_t := f(\cdot, t)$ belongs to $L_1(S, \mu, E)$,
(b) for all $s \in S$ the partial map $t \mapsto f(s, t)$ is differentiable at $\bar{t} \in T$,
(c) there exists some $h \in L_1(S, \mu, \mathbb{R})$ such that $\|f(s, t) - f(s, \bar{t})\| \le h(s)\,|t - \bar{t}|$ for all $(s, t) \in S \times T$.
Then the map $g : T \to E$ given by $g(t) := \int_S f_t\,d\mu$ is differentiable at $\bar{t}$ and
$$g'(\bar{t}) = \int_S \frac{\partial}{\partial t} f(s, \bar{t})\,d\mu(s).$$

Proof Let us denote by $v$ the right-hand side of this relation. Given a sequence $(t_n) \to \bar{t}$ in $T \setminus \{\bar{t}\}$ we have to show that $((1/r_n)(g(t_n) - g(\bar{t})))_n \to v$ for $r_n := t_n - \bar{t}$. Now
$$(q_n(s)) := \Big(\frac{1}{r_n}\big(f(s, \bar{t} + r_n) - f(s, \bar{t})\big)\Big) \to_{n \to +\infty} q(s) := \frac{\partial}{\partial t} f(s, \bar{t})$$
and $\|q_n(s)\| \le h(s)$ for all $n \in \mathbb{N}$ and $s \in S$. Invoking again Theorem 7.5, we get $(\int_S q_n\,d\mu)_n \to \int_S q\,d\mu$.

Exercise Suppose $W$ is an open convex subset of a normed space $X$, $\bar{w} \in W$, the map $f : S \times W \to E$ is such that for all $w \in W$ one has $f_w := f(\cdot, w) \in L_1(S, \mu, E)$, and such that for all $s \in S$ the partial map $w \mapsto f(s, w)$ is Fréchet differentiable on $W$ (or around $\bar{w}$) with $\frac{\partial}{\partial w} f(\cdot, \bar{w}) \in L_1(S, \mu, L(X, E))$. Assume there exists some $h \in L_1(S, \mu, \mathbb{R})$ such that $\|f(s, w) - f(s, \bar{w})\|_E \le h(s)\,\|w - \bar{w}\|_X$ for all $(s, w) \in S \times W$. Then the map $g : W \to E$ given by $g(w) := \int_S f_w\,d\mu$ is Fréchet differentiable at $\bar{w}$ and
$$Dg(\bar{w}) = \int_S \frac{\partial}{\partial w} f(s, \bar{w})\,d\mu(s).$$
[Hint: using Theorem 7.5, show that for every sequence $(x_n) \to 0$ in $X \setminus \{0\}$ one has $\lim_n \frac{1}{\|x_n\|}\big(g(\bar{w} + x_n) - g(\bar{w}) - \int_S \frac{\partial}{\partial w} f(s, \bar{w}).x_n\,d\mu(s)\big) = 0$.]
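The differentiation formula of Proposition 7.12 can be checked numerically on a simple case. The sketch below is an illustration under assumptions of ours: $S := [0, 1]$ with the Lebesgue measure, $f(s, t) := \sin(ts)$, for which condition (c) holds with $h(s) := s$, and a midpoint rule standing in for the integral. It compares a difference quotient of $g$ at $\bar{t} = 1$ with the integral of the partial derivative $s \cos(\bar{t}s)$ and with the closed form $\sin 1 + \cos 1 - 1$.

```python
import math

def midpoint(func, a, b, n=2000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def g(t):
    """g(t) = integral over [0, 1] of f(s, t) = sin(t * s), approximated numerically."""
    return midpoint(lambda s: math.sin(t * s), 0.0, 1.0)

def integral_of_partial_derivative(t):
    """Integral over [0, 1] of the partial derivative in t, namely s * cos(t * s)."""
    return midpoint(lambda s: s * math.cos(t * s), 0.0, 1.0)

t_bar = 1.0
step = 1e-5

# Central difference quotient of g at t_bar.
difference_quotient = (g(t_bar + step) - g(t_bar - step)) / (2 * step)

# Right-hand side of the formula of Proposition 7.12.
under_integral = integral_of_partial_derivative(t_bar)

# Exact value of the integral of s * cos(s) over [0, 1].
closed_form = math.sin(t_bar) + math.cos(t_bar) - 1.0
```

The agreement is only up to discretization error; the proposition itself is what guarantees that the two quantities coincide.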
7.6 Integration on a Product

In elementary calculus, integrals of continuous functions of several variables are often computed by iterating one-variable integrals. It is the purpose of the present section to consider a similar device in a general framework. The subject has a long history. Cavalieri (1598–1647) observed that if two bodies in space have the same height and are such that their horizontal sections have the same area, then they have the same volume. But such a result was known to Chinese mathematicians about one millennium earlier. Moreover, Archimedes (circa 287–212 BC) was proud to have established that the volume of a ball is 2/3 the volume of the circumscribed cylinder with the same height. Cicero claimed that he discovered an engraving on Archimedes' grave with a verse and a drawing of a ball and a circumscribed cylinder. Use such a drawing to deduce that the volume of the ball is equal to the volume of the cylinder minus two cones with apex at the center of the ball and with the bottom and the top of the cylinder as bases (Fig. 7.1).

Fig. 7.1 The ball and the circumscribed cylinder
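We first consider product measures with a fresh (and simpler) approach. Given sets $X$, $Y$ and rings $\mathcal{A}$, $\mathcal{B}$ on $X$ and $Y$ respectively, we denote by $\mathcal{C}$ the collection of sets of the form $C := A \times B$ with $A \in \mathcal{A}$, $B \in \mathcal{B}$ and by $\mathcal{A} \boxtimes \mathcal{B}$ the collection of unions of finite families of disjoint elements of $\mathcal{C}$. The relations
$$(A \times B) \cap (A' \times B') = (A \cap A') \times (B \cap B'),$$
$$(A \times B) \setminus (A' \times B') = [(A \setminus A') \times B] \cup [(A \cap A') \times (B \setminus B')],$$
$$(A \times B)^c = (A^c \times Y) \cup (X \times B^c)$$
show that $\mathcal{C}$ is a semi-ring and Lemma 1.4 entails that $\mathcal{A} \boxtimes \mathcal{B}$ is the ring generated by $\mathcal{C}$. Moreover, $\mathcal{A} \boxtimes \mathcal{B}$ is an algebra if $\mathcal{A}$ and $\mathcal{B}$ are algebras. The following proposition shows that the definition of $\mathcal{A} \otimes \mathcal{B}$ given there is consistent with the usual one when $\mathcal{A}$ and $\mathcal{B}$ are $\sigma$-algebras. Given a class $\mathcal{G}$ of subsets of a set $Z$, we denote by $\mathcal{G}_\sigma$ the $\sigma$-algebra generated by $\mathcal{G}$. Let us state again Proposition 1.16.

Proposition 7.13 Given rings $\mathcal{A}$, $\mathcal{B}$ on the sets $X$ and $Y$ respectively, the $\sigma$-algebra $\mathcal{A} \otimes \mathcal{B} := (\mathcal{A} \boxtimes \mathcal{B})_\sigma$ generated by the ring $\mathcal{A} \boxtimes \mathcal{B}$ coincides with the $\sigma$-algebra $\mathcal{A}_\sigma \otimes \mathcal{B}_\sigma$ generated by the ring $\mathcal{A}_\sigma \boxtimes \mathcal{B}_\sigma$.

Let us also recall a fundamental measurability result (Lemma 1.8). If $f : X \times Y \to Z$ is a map with values in a measurable space $(Z, \mathcal{S})$ and if $P \in \mathcal{P}(X \times Y)$, for $x \in X$ we denote by $f_x : Y \to Z$ the partial map of $f$ and by $P_x \in \mathcal{P}(Y)$ the slice of $P$ defined by
$$f_x(y) := f(x, y), \quad y \in Y, \qquad P_x := \{y \in Y : (x, y) \in P\}.$$

Lemma 7.6 Let $X$, $Y$ be sets and let $\mathcal{A}$, $\mathcal{B}$ be rings (resp. $\sigma$-algebras) on $X$ and $Y$, respectively. Then, for all $P \in \mathcal{A} \boxtimes \mathcal{B}$ (resp. $\mathcal{A} \otimes \mathcal{B}$) and for all $x \in X$ one has $P_x \in \mathcal{B}$. For any measurable map $f : (X \times Y, \mathcal{A} \otimes \mathcal{B}) \to (Z, \mathcal{S})$ and any $x \in X$, $f_x$ is measurable. If $(W, \mathcal{W})$ is a measurable space, a map $g : W \to X \times Y$ is measurable with respect to $\mathcal{W}$ and $\mathcal{A} \otimes \mathcal{B}$ if and only if $p_X \circ g$ and $p_Y \circ g$ are measurable.

For the rest of this section, $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ are $\sigma$-finite measure spaces. We want to construct a natural measure on $(X \times Y, \mathcal{M} \otimes \mathcal{N})$. We denote by $\mathcal{A}$ (resp. $\mathcal{B}$) the ring formed by those $A \in \mathcal{M}$ (resp. $B \in \mathcal{N}$) such that $\mu(A) < +\infty$ (resp. $\nu(B) < +\infty$). We observe that for $A \in \mathcal{A}$, $B \in \mathcal{B}$, $e \in E$, $x \in X$ and $f := 1_{A \times B}\,e$ we have $f_x = 1_A(x) 1_B\,e$,
$$\int_Y f_x(y)\,d\nu(y) = 1_A(x)\,\nu(B)\,e, \qquad \int_X \Big(\int_Y f_x(y)\,d\nu(y)\Big)\,d\mu(x) = \mu(A)\,\nu(B)\,e.$$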
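By linearity, we conclude that if $f$ is a step function with respect to $\mathcal{A} \boxtimes \mathcal{B}$ then $f_x$ is a step function with respect to $\mathcal{B}$ and $g : x \mapsto \int_Y f_x(y)\,d\nu(y)$ is a step function with respect to $\mathcal{A}$. Moreover, the map
$$f \mapsto \int_X \Big(\int_Y f\,d\nu\Big)\,d\mu := \int_X \Big(\int_Y f_x(y)\,d\nu(y)\Big)\,d\mu(x)$$
is linear on the space of step functions with respect to $\mathcal{A} \boxtimes \mathcal{B}$ since each of the integrals is linear. These observations lead to a simple proof of the existence result.

Theorem 7.8 Given $\sigma$-finite measure spaces $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ there exists a unique measure $\mu \otimes \nu$ on the $\sigma$-algebra $\mathcal{M} \otimes \mathcal{N}$ generated by the semi-ring $\mathcal{C}$ of subsets of the form $A \times B$ with $A \in \mathcal{A}$, $B \in \mathcal{B}$ (i.e., $A \in \mathcal{M}$, $B \in \mathcal{N}$ of finite measures) such that for all $A \in \mathcal{A}$, $B \in \mathcal{B}$ one has $(\mu \otimes \nu)(A \times B) = \mu(A)\,\nu(B)$.

Proof Let $\mu \times \nu : \mathcal{C} \to \mathbb{R}_+$ be the function given by
$$(\mu \times \nu)(A \times B) := \mu(A)\,\nu(B), \qquad A \in \mathcal{A},\ B \in \mathcal{B}.$$
Let us check that $\mu \times \nu$ is additive on $\mathcal{C}$. Let $A \in \mathcal{A}$, $B \in \mathcal{B}$, $A_i \in \mathcal{A}$, $B_i \in \mathcal{B}$ for $i = 1, \ldots, n$ be such that $A_j \times B_j$ and $A_k \times B_k$ are disjoint for $j \ne k$ and
$$A \times B = \bigcup_{i=1}^{n} (A_i \times B_i).$$
Then for $f := 1_{A \times B}$, $f_i = 1_{A_i \times B_i}$ we have $f = \sum_{i=1}^{n} f_i$ hence
$$(\mu \times \nu)(A \times B) = \int_X \Big(\int_Y f\,d\nu\Big)\,d\mu = \sum_{i=1}^{n} \int_X \Big(\int_Y f_i\,d\nu\Big)\,d\mu = \sum_{i=1}^{n} (\mu \times \nu)(A_i \times B_i).$$
Then, by Lemma 1.6, $\mu \times \nu$ has a unique extension to an additive function (still denoted by $\mu \times \nu$) on the ring $\mathcal{A} \boxtimes \mathcal{B}$ generated by $\mathcal{C}$. Let us show that in fact $\mu \times \nu$ is countably additive. We make use of Lemma 1.5. Let $(P_n)$ be an increasing sequence in $\mathcal{A} \boxtimes \mathcal{B}$ whose union $P$ is in $\mathcal{A} \boxtimes \mathcal{B}$. Let $f_n$ be the characteristic function of $P_n$. Then $(f_n)_n$ is increasing and converges to the characteristic function $f$ of $P$. Moreover, for all $x \in X$ the sequence $(f_{n,x})_n$ of partial functions is increasing and converges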
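to $f_x$. Furthermore, $f_{n,x}$ (resp. $f_x$) is in $St(\nu, E)$. The Monotone Convergence Theorem ensures that
$$(g_n(x))_n := \Big(\int_Y f_{n,x}(y)\,d\nu(y)\Big)_n \to_{n \to +\infty} g(x) := \int_Y f_x(y)\,d\nu(y).$$
Since $g_n$ and $g$ are $\mu$-step functions as we observed above and since $(g_n)$ is increasing, another application of the Monotone Convergence Theorem yields that
$$\Big(\int_X g_n(x)\,d\mu(x)\Big)_n \to_{n \to +\infty} \int_X g(x)\,d\mu(x).$$
Since $\int_X g_n(x)\,d\mu(x) = (\mu \times \nu)(P_n)$ and $\int_X g(x)\,d\mu(x) = (\mu \times \nu)(P)$ by the observations above, we get $(\mu \times \nu)(P) = \lim_n (\mu \times \nu)(P_n)$. Then Hahn's Theorem ensures that $\mu \times \nu$ can be extended to a measure $\mu \otimes \nu$ on the $\sigma$-algebra $\mathcal{M} \otimes \mathcal{N}$ generated by $\mathcal{A} \boxtimes \mathcal{B}$ or $\mathcal{C}$.

The values of $\mu \otimes \nu$ can be computed. We need a preliminary result.

Lemma 7.7 Given $\sigma$-finite measure spaces $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$, for all $P \in \mathcal{M} \otimes \mathcal{N}$ the function $x \mapsto \nu(P_x)$ is measurable.

Proof First suppose $\nu$ is finite. Let $\mathcal{D}$ be the class of $P \in \mathcal{M} \otimes \mathcal{N}$ such that the function $f_P : x \mapsto \nu(P_x)$ is measurable (we know that $P_x \in \mathcal{N}$, so that $\nu(P_x)$ is well defined in $\mathbb{R}_+$). For all $A \in \mathcal{M}$ and all $B \in \mathcal{N}$ one has $A \times B \in \mathcal{D}$ since $\nu((A \times B)_x) = 1_A(x)\,\nu(B)$. In particular $X \times Y \in \mathcal{D}$. Let $P$, $Q \in \mathcal{D}$ with $Q \subset P$. Since for all $x \in X$ we have $\nu((P \setminus Q)_x) = \nu(P_x \setminus Q_x) = \nu(P_x) - \nu(Q_x)$ we see that $P \setminus Q \in \mathcal{D}$. If $P$ is the union of an increasing sequence $(P_n)$ of $\mathcal{D}$ we have $P_x = \bigcup_n (P_n)_x$ hence $\nu(P_x) = \lim_n \nu((P_n)_x)$, so that $x \mapsto \nu(P_x)$ is measurable and $P \in \mathcal{D}$. Thus $\mathcal{D}$ is a complemented increasing class containing the class $\mathcal{C}$ of rectangles $A \times B$ with $A \in \mathcal{M}$, $B \in \mathcal{N}$. Since
$$(A \times B) \cap (A' \times B') = (A \cap A') \times (B \cap B'),$$
$\mathcal{C}$ is closed under finite intersections and Proposition 1.9 ensures that $\mathcal{M} \otimes \mathcal{N} \subset \mathcal{D}$: for all $P \in \mathcal{M} \otimes \mathcal{N}$ the function $x \mapsto \nu(P_x)$ is measurable.

Now let us suppose $\mu$ and $\nu$ are $\sigma$-finite. Let $(Y_n)$ be a sequence of disjoint measurable subsets of $Y$ with finite measures and with union $Y$. For each $n \in \mathbb{N}$ define a finite measure $\nu_n$ on $\mathcal{N}$ by setting $\nu_n(B) := \nu(B \cap Y_n)$. According to the first part of the proof, for all $P \in \mathcal{M} \otimes \mathcal{N}$ the function $x \mapsto \nu_n(P_x)$ is measurable. Since $\nu(P_x) = \sum_n \nu_n(P_x)$ the function $x \mapsto \nu(P_x)$ is measurable too.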
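Proposition 7.14 Given $\sigma$-finite measure spaces $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$, for all $P \in \mathcal{M} \otimes \mathcal{N}$ one has
$$(\mu \otimes \nu)(P) = \int_X \nu(P_x)\,d\mu(x).$$
Of course, the roles of $(X, \mu)$ and $(Y, \nu)$ can be interchanged.

Proof We already know that for all $P \in \mathcal{M} \otimes \mathcal{N}$ the function $x \mapsto \nu(P_x)$ is measurable so that the function $\pi : \mathcal{M} \otimes \mathcal{N} \to \overline{\mathbb{R}}_+$ given by
$$\pi(P) = \int_X \nu(P_x)\,d\mu(x)$$
is well defined. By additivity of $\nu$ and linearity of integration, $\pi$ is additive. If $P$ is the union of an increasing sequence $(P_n)$ of $\mathcal{M} \otimes \mathcal{N}$ we have seen that $\nu(P_x) = \lim_n \nu((P_n)_x)$ and the Monotone Convergence Theorem ensures that
$$\pi(P) = \int_X \nu(P_x)\,d\mu(x) = \lim_n \int_X \nu((P_n)_x)\,d\mu(x) = \lim_n \pi(P_n).$$
Thus, by Lemma 1.5, $\pi$ is $\sigma$-additive. Since $\pi(P) = \mu(A)\,\nu(B) = (\mu \otimes \nu)(P)$ for $P = A \times B \in \mathcal{C}$, $\pi$ and $\mu \otimes \nu$ coincide on the algebra $\mathcal{A} \boxtimes \mathcal{B}$ generated by $\mathcal{C}$, hence on the $\sigma$-algebra $\mathcal{M} \otimes \mathcal{N}$ generated by $\mathcal{A} \boxtimes \mathcal{B}$ in view of Theorem 1.11.

Lemma 7.8 Let $Z$ be a set of $(\mu \otimes \nu)$-measure $0$ in $X \times Y$. Then, for almost all $x \in X$ one has $\nu(Z_x) = 0$.

Proof We may suppose $Z$ belongs to $\mathcal{M} \otimes \mathcal{N}$. Then for all $x$ the set $Z_x$ is measurable and $f : x \mapsto \nu(Z_x)$ is measurable. Since $\int_X f\,d\mu = (\mu \otimes \nu)(Z) = 0$ and $f \ge 0$ we have $f = 0$ a.e. by Corollary 7.16.

Alternative proof Since by Theorem 1.12 $\mu \otimes \nu$ is the restriction to $\mathcal{M} \otimes \mathcal{N}$ of the outer measure deduced from $\mu \times \nu$, the construction of Proposition 1.9 yields for every $\varepsilon > 0$ and $k \in \mathbb{N} \setminus \{0\}$ a sequence $(C_n)$ of elements of $\mathcal{C}$ whose union contains $Z$ and satisfies
$$\sum_{n=1}^{\infty} (\mu \times \nu)(C_n) \le \frac{\varepsilon}{2^k k}.$$
For all $x \in X$ the slice $Z_x$ is contained in the union of the slices $C_{n,x} := (C_n)_x$. Let
$$S_k := \Big\{x \in X : \nu(Z_x) \ge \frac{1}{k}\Big\}, \qquad T_k := \Big\{x \in X : f(x) := \sum_{n=1}^{\infty} \nu(C_{n,x}) \ge \frac{1}{k}\Big\}.$$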
Then $f$ is measurable and by Beppo Levi's Theorem one has
$$\frac{1}{k}\,\mu(T_k) \le \int_X f(x)\,d\mu(x) = \sum_{n=1}^{\infty} \int_X \nu(C_{n,x})\,d\mu(x) = \sum_{n=1}^{\infty} (\mu \times \nu)(C_n) \le \frac{\varepsilon}{2^k k}$$
hence $\mu(T_k) \le 2^{-k}\varepsilon$. Since $\nu(Z_x) \le f(x)$ we have $S_k \subset T_k$ hence $\mu(S_k) \le 2^{-k}\varepsilon$. Thus, since $S := \{x \in X : \nu(Z_x) > 0\}$ is $\bigcup_{k \ge 1} S_k$ we get $\mu(S) \le \varepsilon$ hence $\mu(S) = 0$, $\varepsilon$ being arbitrarily small.

Now, let us turn to functions. We start with nonnegative functions.

Theorem 7.9 (Tonelli) Given $\sigma$-finite measure spaces $(X, \mathcal{M}, \mu)$, $(Y, \mathcal{N}, \nu)$ and a $(\mu \otimes \nu)$-measurable function $f : X \times Y \to \overline{\mathbb{R}}_+$, for almost all $x \in X$ the function $f_x$ is $\nu$-measurable and
$$\int_{X \times Y} f\,d(\mu \otimes \nu) = \int_X \Big(\int_Y f_x\,d\nu\Big)\,d\mu(x).$$

Proof Let us first consider the case when $f$ is the characteristic function of some $P \in \mathcal{M} \otimes \mathcal{N}$. Then, for all $x \in X$, $f_x$ is the characteristic function of $P_x$, $\int_Y f_x\,d\nu = \nu(P_x)$ and Proposition 7.14 yields the result. By linearity we get the result for every nonnegative step function. Now if $f$ is $(\mu \otimes \nu)$-measurable, using the assumption that $X \times Y$ is $\sigma$-finite and the remark following Lemma 7.3 we can find an increasing sequence of step functions $(f_n)$ converging a.e. to $f$. Then, using the Monotone Convergence Theorem, we get the result by a passage to the limit. Finally, let $Z \in \mathcal{M} \otimes \mathcal{N}$ be a null set and let $g : X \times Y \to \overline{\mathbb{R}}_+$ be $(\mu \otimes \nu)$-measurable and such that $f = g$ off $Z$. The preceding lemma shows that there exists a null set $N$ of $(X, \mathcal{M}, \mu)$ such that for $x \in X \setminus N$ we have $f_x = g_x$ off the $\nu$-null set $Z_x$, so that $f_x$ is $\nu$-measurable and $\int_Y f_x\,d\nu = \int_Y g_x\,d\nu$ for all $x \in X \setminus N$. Thus
$$\int_X \Big(\int_Y f_x\,d\nu\Big)\,d\mu(x) = \int_X \Big(\int_Y g_x\,d\nu\Big)\,d\mu(x) = \int_{X \times Y} g\,d(\mu \otimes \nu) = \int_{X \times Y} f\,d(\mu \otimes \nu).$$
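The role of nonnegativity in Tonelli's Theorem can be seen on $X = Y = \mathbb{N}$ with the counting measures, where integrals are sums. The following sketch uses a signed array of our own choosing (it does not appear in the text): $b(m, n) := 1$ if $n = m$, $-1$ if $n = m + 1$, $0$ otherwise. Each inner sum has finite support, so the iterated sums can be evaluated exactly; they differ, while the iterated sum of $|b|$ grows without bound, so that $b$ is not integrable.

```python
def b(m, n):
    """Signed array on N x N: +1 on the diagonal, -1 just to its right, 0 elsewhere."""
    if n == m:
        return 1
    if n == m + 1:
        return -1
    return 0

M = 50  # number of outer terms kept

# Inner sums are exact: for fixed m the support in n is {m, m + 1}, and for
# fixed n the support in m is {n - 1, n}; both lie inside range(M + 2).
rows_first = sum(sum(b(m, n) for n in range(M + 2)) for m in range(M))
cols_first = sum(sum(b(m, n) for m in range(M + 2)) for n in range(M))

# Partial iterated sum of |b|: each row contributes 2, so there is no finite bound.
abs_partial = sum(sum(abs(b(m, n)) for n in range(M + 2)) for m in range(M))
```

Every row sums to $0$ while the column $n = 0$ sums to $1$ and the others to $0$, so the two iterated sums are $0$ and $1$ whatever the value of `M`. For a nonnegative array such a discrepancy cannot occur, by the theorem above.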
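Corollary 7.17 (Tonelli) Given $\sigma$-finite measure spaces $(X, \mathcal{M}, \mu)$, $(Y, \mathcal{N}, \nu)$ and a $(\mu \otimes \nu)$-measurable function $f : X \times Y \to \mathbb{R}$, if for almost all $x \in X$ the $\nu$-measurable function $|f_x|$ is such that $\int_Y |f_x(y)|\,d\nu(y) < +\infty$, and if $\int_X \big(\int_Y |f_x|\,d\nu\big)\,d\mu(x) < +\infty$ then $f$ is integrable on $(X \times Y, \mu \otimes \nu)$ and
$$\int_{X \times Y} f\,d(\mu \otimes \nu) = \int_X \Big(\int_Y f_x\,d\nu\Big)\,d\mu(x).$$

Theorem 7.10 (Fubini) Given $\sigma$-finite measure spaces $(X, \mathcal{M}, \mu)$, $(Y, \mathcal{N}, \nu)$, a Banach space $E$ and $f \in L_1(X \times Y, \mu \otimes \nu, E)$ then for almost all $x \in X$ the function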
$f_x$ is in $L_1(Y, \nu, E)$ and the map $x \mapsto \int_Y f_x\,d\nu$ is integrable with
$$\int_{X \times Y} f\,d(\mu \otimes \nu) = \int_X \Big(\int_Y f_x\,d\nu\Big)\,d\mu(x). \qquad (7.10)$$

Proof Since $f \in L_1(X \times Y, \mu \otimes \nu, E)$ we have $\|f\|_E \in L_1(X \times Y, \mu \otimes \nu, \mathbb{R})$. Tonelli's Theorem yields a null set $N$ in $(X, \mu)$ such that for $x \in X \setminus N$ the function $\|f_x\|_E$ is $\nu$-measurable and
$$\int_{X \times Y} \|f\|_E\,d(\mu \otimes \nu) = \int_X \Big(\int_Y \|f_x\|_E\,d\nu\Big)\,d\mu(x).$$
Thus, by Corollary 7.9, for all $x \in X \setminus N$ we have $f_x \in L_1(Y, \nu, E)$. Let $(f^n)$ be a sequence in $St(\mu \otimes \nu, E)$ converging to $f$ on $(X \times Y) \setminus Z$, where $Z$ is a null set of $(X \times Y, \mu \otimes \nu)$. Setting $S_n := \{w \in X \times Y : \|f^n(w)\|_E \le 2 \|f(w)\|_E\}$ and replacing $f^n$ with $f'_n := 1_{S_n} f^n$ if necessary, we may suppose $\|f^n\|_E \le 2 \|f\|_E$ for all $n$. We may suppose the $\mu$-null set $N$ of $X$ is such that $\nu(Z_x) = 0$ for all $x \in X \setminus N$, enlarging $N$ if necessary. Then, for all $x \in X \setminus N$ and all $y \in Y \setminus Z_x$ we have $(f^n_x(y)) \to f_x(y)$ and $\|f^n_x\|_E \le 2 \|f_x\|_E$. Theorem 7.5 ensures that
$$\forall x \in X \setminus N \qquad (g_n(x))_n := \Big(\int_Y f^n_x\,d\nu\Big)_n \to g(x) := \int_Y f_x\,d\nu.$$
Moreover,
$$\forall x \in X \setminus N \qquad \|g_n(x)\|_E \le \int_Y \|f^n_x\|_E\,d\nu \le 2 \int_Y \|f_x\|_E\,d\nu.$$
Since the function $x \mapsto \int_Y \|f_x\|_E\,d\nu$ is integrable, another application of Theorem 7.5 yields
$$\Big(\int_X g_n(x)\,d\mu(x)\Big)_n \to \int_X g(x)\,d\mu(x).$$
Since $\int_{X \times Y} f^n\,d(\mu \otimes \nu) = \int_X g_n\,d\mu$ and $(\int_{X \times Y} f^n\,d(\mu \otimes \nu))_n \to \int_{X \times Y} f\,d(\mu \otimes \nu)$, we get relation (7.10).

The roles of $X$ and $Y$ being symmetric, relation (7.10) ensures that the two iterated integrals coincide. However, this coincidence may not occur if $f$ is not supposed to be integrable for $\mu \otimes \nu$ (Exercise 3). Fubini's Theorem allows an interpretation of the integral of a nonnegative integrable function $f : X \to \mathbb{R}_+$ as the "area" of the subset of $X \times \mathbb{R}_+$ under the graph of $f$.

Corollary 7.18 Let $(X, \mathcal{M}, \mu)$ be a $\sigma$-finite measure space, let $f \in L_1(X, \mu)$ be nonnegative, and let $H := \{(x, r) \in X \times \mathbb{R}_+ : r \le f(x)\}$. Then $H$ is measurable if
and only if $f$ is measurable. If this occurs, providing $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ with the Lebesgue measure $\lambda$, one has
$$(\mu \otimes \lambda)(H) = \int_X f(x)\,d\mu(x). \qquad (7.11)$$

Proof If $f$ is measurable, one can show that $g : X \times \mathbb{R} \to \mathbb{R}$ given by $g(x, y) := f(x) - y$ is measurable. Since $H = g^{-1}(\mathbb{R}_+) \cap (X \times \mathbb{R}_+)$, $H$ is measurable. Conversely, suppose $H$ is measurable. Then, by Proposition 7.14 and the lemma preceding it, for all $x \in X$, $r \in \mathbb{R}$ the slices $H_x := [0, f(x)]$ and $H^r := \{x \in X : (x, r) \in H\} = f^{-1}([r, \infty[)$ are measurable, so that $f$ is measurable and since $\lambda(H_x) = f(x)$, relation (7.11) stems from the relation $(\mu \otimes \lambda)(H) = \int_X \lambda(H_x)\,d\mu(x)$ of Proposition 7.14.

Tonelli's Theorem and the Stieltjes measure can be used to give a practical means to compute integrals.

Proposition 7.15 (Integration by Parts) Let $S := [a, b]$ with $a$, $b \in \mathbb{R}$, $a < b$ and let $g : [a, b] \to \mathbb{R}$ be nondecreasing and left-continuous. Setting $g(r) := g(a)$ for $r < a$, $g(r) = g(b)$ for $r > b$, let $\mu_g$ be the Stieltjes measure associated with $g$. Given $f \in L_1(S, \lambda, \mathbb{R})$ let $F : S \to \mathbb{R}$ be defined by $F(r) := \int_a^r f(s)\,d\lambda(s)$. Then
$$\int_a^b f(s)\,g(s)\,d\lambda(s) = F(b)\,g(b) - \int_a^b F(r)\,d\mu_g(r).$$
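Note that since $g$ is bounded and measurable, $fg \in L_1(S, \lambda, \mathbb{R})$ and since $F$ is continuous it is $\mu_g$-integrable.

Proof Let us endow $S^2 := S \times S$ with its Borel $\sigma$-algebra and the measure $\mu_g \otimes \lambda$ and let us define $h : S \times S \to \mathbb{R}$ by
$$h(r, s) := f(s) \ \text{for } s \le r, \qquad h(r, s) := 0 \ \text{for } s > r.$$
Tonelli's Theorem ensures that $h$ is in $L_1(S^2, \mu_g \otimes \lambda, \mathbb{R})$ since
$$\int_{S \times S} |h|\,d(\mu_g \otimes \lambda) = \int_a^b \Big(\int_a^r |f(s)|\,d\lambda(s)\Big)\,d\mu_g(r) \le \|f\|_1\,(g(b) - g(a)).$$
Then, Fubini's Theorem applied to $h$ yields
$$\int_a^b F(r)\,d\mu_g(r) = \int_a^b \Big(\int_a^r f(s)\,d\lambda(s)\Big)\,d\mu_g(r) = \int_a^b \Big(\int_s^b f(s)\,d\mu_g(r)\Big)\,d\lambda(s)$$
$$= \int_a^b f(s)\,(g(b) - g(s))\,d\lambda(s) = F(b)\,g(b) - \int_a^b f(s)\,g(s)\,d\lambda(s).$$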
Another application of Fubini's Theorem is noteworthy.

Proposition 7.16 In $\mathbb{R}^d$ endowed with the Lebesgue measure $\lambda_d$ every hyperplane $H$ has measure $0$.

Proof We use an induction on $d$. For $d = 1$, the result is obvious since a hyperplane is just a point. We assume the result holds for $d - 1$. Since $\lambda_d$ is invariant by translation, we may suppose the hyperplane $H$ contains $0$. If $(e_1, \ldots, e_d)$ is the canonical basis, there exists a $k \in \mathbb{N}_d$ such that $e_k \notin H$. Since $\lambda_d$ is invariant under the isomorphism $u$ given by $u(e_d) = e_k$, $u(e_k) = e_d$, $u(e_i) = e_i$ for $i \in \mathbb{N}_d \setminus \{k, d\}$, as a product of intervals is changed into a product of intervals with the same measure, we may suppose $k = d$. For $r \in \mathbb{R}$ the slice $H_r := H \cap (\mathbb{R}^{d-1} \times \{r\})$ is a hyperplane of $\mathbb{R}^{d-1}$, hence is a null set. Integrating over $r$, we get $\lambda_d(H) = 0$ by Proposition 7.14.

Another application of Fubini's Theorem concerns the convolution operation.

Proposition 7.17 Let $f$, $g \in L_1(\mathbb{R}^d, \lambda_d, \mathbb{R})$, where $\lambda_d$ is the Lebesgue measure on $\mathcal{B}(\mathbb{R}^d)$. Then there exists a measurable subset $S$ of $\mathbb{R}^d$ whose complement is a null set such that for all $x \in S$ the function $y \mapsto f(x - y)\,g(y)$ belongs to $L_1(\mathbb{R}^d, \lambda_d, \mathbb{R})$ and the function $h := f * g : \mathbb{R}^d \to \mathbb{R}$ given by $h(x) = \int_{\mathbb{R}^d} f(x - y)\,g(y)\,d\lambda_d(y)$ for $x \in S$, $h(x) = 0$ for $x \in \mathbb{R}^d \setminus S$ is integrable. Moreover, $\|f * g\|_1 \le \|f\|_1 \|g\|_1$.

Proof We begin by showing that the function $k : (x, y) \mapsto f(x - y)\,g(y)$ is measurable when $f$ and $g$ are measurable. In fact, $k$ is the composition of the continuous map $(x, y) \mapsto (x - y, y)$ with $f \times g$ and $p : (r, s) \mapsto rs$. Then, Tonelli's Theorem and the translation invariance of the Lebesgue measure yield
$$\int_{\mathbb{R}^d \times \mathbb{R}^d} |k|\,d(\lambda_d \otimes \lambda_d) = \int_{\mathbb{R}^d} \Big(\int_{\mathbb{R}^d} |f(x - y)\,g(y)|\,d\lambda_d(x)\Big)\,d\lambda_d(y)$$
$$= \int_{\mathbb{R}^d} \Big(\int_{\mathbb{R}^d} |f(x)|\,d\lambda_d(x)\Big)\,|g(y)|\,d\lambda_d(y) = \|f\|_1 \int_{\mathbb{R}^d} |g|\,d\lambda_d$$
or $\||k|\|_1 = \|f\|_1 \|g\|_1$. Then, by Corollary 7.9, $k \in L_1(\mathbb{R}^d \times \mathbb{R}^d, \lambda_d \otimes \lambda_d, \mathbb{R})$ and Fubini's Theorem implies that for almost every $x \in \mathbb{R}^d$ the function $k_x : y \mapsto f(x - y)\,g(y)$ belongs to $L_1(\mathbb{R}^d, \lambda_d, \mathbb{R})$. Since $\left|\int_{\mathbb{R}^d} k_x(y)\,d\lambda_d(y)\right| \le \int_{\mathbb{R}^d} |k_x(y)|\,d\lambda_d(y)$ we deduce from the preceding equalities that
$$\int_{\mathbb{R}^d} |h(x)|\,d\lambda_d(x) \le \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} |f(x - y)\,g(y)|\,d\lambda_d(x)\,d\lambda_d(y) = \|f\|_1 \|g\|_1.$$

Let us note that the preceding result is valid when $(\mathbb{R}^d, \lambda_d)$ is replaced by a topological group $G$ endowed with a translation invariant measure. Let us also observe that the class a.e. of $f * g$ depends only on the classes a.e. of $f$ and $g$ so that the convolution can be considered as an operation from $L_1(\mathbb{R}^d) \times L_1(\mathbb{R}^d)$ into $L_1(\mathbb{R}^d)$.
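The estimate $\|f * g\|_1 \le \|f\|_1 \|g\|_1$ can be tried out in the simplest group setting mentioned above, namely $\mathbb{Z}$ with the counting measure, where the convolution of finitely supported sequences is a finite sum. The following sketch is ours (the helper names and the sample sequences are illustrative assumptions): it computes $(f * g)(x) = \sum_y f(x - y)\,g(y)$ for sequences indexed from $0$ and compares the norms.

```python
def convolve(f, g):
    """Convolution of two finitely supported sequences indexed from 0:
    (f * g)[x] = sum over y of f[x - y] * g[y]."""
    h = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            h[i + j] += fi * gj  # the pair (x - y, y) = (i, j) contributes at x = i + j
    return h

def norm1(seq):
    """Norm in L_1 for the counting measure: sum of absolute values."""
    return sum(abs(v) for v in seq)

f = [1, -2, 3]
g = [4, 0, -1, 2]
h = convolve(f, g)

# Signed case: cancellations make the inequality strict here.
signed_norm = norm1(h)

# Nonnegative case: no cancellation, so the norm of the convolution is the product.
nonneg_norm = norm1(convolve([abs(v) for v in f], [abs(v) for v in g]))
```

In the nonnegative case equality holds, which mirrors the equality $\||k|\|_1 = \|f\|_1 \|g\|_1$ obtained in the proof from Tonelli's Theorem.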
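Exercises
1. (Commutativity of products) Given $\sigma$-finite measure spaces $(X, \mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ and an element $P$ of $\mathcal{M} \otimes \mathcal{N}$, show that $P^{\intercal} := \{(y, x) : (x, y) \in P\}$ belongs to $\mathcal{N} \otimes \mathcal{M}$ and that $(\nu \otimes \mu)(P^{\intercal}) = (\mu \otimes \nu)(P)$.
2. (Associativity of products) Given $\sigma$-finite measure spaces $(X, \mathcal{M}, \mu)$, $(Y, \mathcal{N}, \nu)$ and $(Z, \mathcal{P}, \varpi)$ show that $(\mu \otimes \nu) \otimes \varpi = \mu \otimes (\nu \otimes \varpi)$.
3. Let $\mu$ be the Lebesgue measure on $X := (\mathbb{R}, \mathcal{B}(\mathbb{R}))$, let $\nu$ be the counting measure on $Y := (\mathbb{R}, \mathcal{B}(\mathbb{R}))$, and let $f$ be the characteristic function of the line $L := \{(x, x) : x \in \mathbb{R}\}$. Show that
$$\int_{X \times Y} f\,d(\mu \otimes \nu) \ne \int_X \Big(\int_Y f_x\,d\nu\Big)\,d\mu(x).$$
4. Let $f : [0, 1]^2 \to \mathbb{R}$ be given by $f(0, 0) = 0$, $f(x, y) := (x^2 - y^2)(x^2 + y^2)^{-2}$. Prove that $\int_0^1 \big(\int_0^1 f(x, y)\,dy\big)\,dx = \pi/4$. Deduce from this result that $f$ is not integrable on $[0, 1]^2$.
5. Let $X$ and $Y$ be two topological spaces and let $\mathcal{B}(X)$ and $\mathcal{B}(Y)$ be the associated Borelian $\sigma$-algebras. Verify that $\mathcal{B}(X) \otimes \mathcal{B}(Y) \subset \mathcal{B}(X \times Y)$. Prove that equality holds when $X$ and $Y$ have countable bases of open sets.
6. Let $(S, \mathcal{S}, \mu)$ be a measure space, $\mu$ being $\sigma$-finite and let $f \in L_1(S)$. For $t \in \mathbb{R}_+$ one sets $E_t := \{x \in S : |f(x)| \ge t\}$, $m(t) := \mu(E_t)$, so that $m(\cdot)$ is nonincreasing. Show that for every $p \in \mathbb{P} := \,]0, +\infty[$ one has $\int |f|^p\,d\mu = \int_0^{+\infty} p\,t^{p-1}\,m(t)\,dt$. [Hint: use Fubini's Theorem for $(x, t) \mapsto p\,t^{p-1}$ on $E := \{(x, t) \in S \times \mathbb{R}_+ : t \le |f(x)|\}$.] Deduce from the preceding relation that for all $t > 0$ and all $p \ge 1$ one has Tchebychev's inequality $t^p\,m(t) \le \|f\|_p^p$. Verify that for $S := \,]0, 1]$ endowed with $\lambda_1$, for $f$ given by $f(x) := 1/(x \log x)$ one has $\sup_{t \ge 0} t\,m(t) < +\infty$, but $f \notin L_1(S)$.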
7.7 Change of Variables

Let us start with a general overview of image measures. This notion is important in probability theory.

Proposition 7.18 Let $h : (X, \mathcal{A}) \to (Y, \mathcal{B})$ be a measurable map between two measurable spaces and let $\mu$ be a measure on $(X, \mathcal{A})$. Then the map $\nu : \mathcal{B} \to \overline{\mathbb{R}}_+$ given by $\nu(B) := \mu(h^{-1}(B))$ is a measure on $(Y, \mathcal{B})$ called the image measure of $\mu$ by $h$; it is denoted by $h(\mu)$ or $h_\sharp \mu$.

Proof Clearly, $\nu(\emptyset) := \mu(h^{-1}(\emptyset)) = \mu(\emptyset) = 0$. If $(B_n)$ is a sequence of disjoint elements of $\mathcal{B}$ with union $B$, one has $h^{-1}(B) = \bigcup_n h^{-1}(B_n)$ and for $m \ne n$, $h^{-1}(B_m) \cap h^{-1}(B_n) = h^{-1}(B_m \cap B_n) = \emptyset$. Thus $\nu(B) = \sum_n \mu(h^{-1}(B_n)) = \sum_n \nu(B_n)$. Thus $\nu$ is a measure on $(Y, \mathcal{B})$ and even a measure on $\mathcal{A}_h := \{B \subset Y : h^{-1}(B) \in \mathcal{A}\}$.

Theorem 7.11 Let $h$ and $\mu$ be as in the preceding proposition and let $f : (Y, \mathcal{B}) \to \mathbb{R}$ be $h(\mu)$-integrable. Then $f \circ h$ is $\mu$-integrable and
\[
\int_Y f \, dh(\mu) = \int_X f \circ h \, d\mu. \tag{7.12}
\]
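Relation (7.12) is easy to test on a measure with finitely many atoms, where integrals are finite sums. The sketch below is a hedged illustration only: the dictionary encoding of a measure and the names `pushforward` and `integrate` are assumptions made for this example, not notation from the text.

```python
from collections import defaultdict


def pushforward(mu, h):
    """Image measure h(mu): the mass of each atom x is moved to h(x)."""
    nu = defaultdict(float)
    for x, mass in mu.items():
        nu[h(x)] += mass
    return dict(nu)


def integrate(f, mu):
    """Integral of f with respect to a finitely supported measure mu."""
    return sum(f(x) * mass for x, mass in mu.items())


# A measure on X = {-2, -1, 0, 1, 2} given by its atoms (arbitrary masses).
mu = {-2: 0.5, -1: 1.0, 0: 0.25, 1: 2.0, 2: 0.75}
h = lambda x: x * x          # measurable map X -> Y
f = lambda y: 3.0 * y + 1.0  # function on Y

nu = pushforward(mu, h)
left = integrate(f, nu)                     # integral of f with respect to h(mu)
right = integrate(lambda x: f(h(x)), mu)    # integral of f o h with respect to mu
print(nu, left, right)
```

Both integrals coincide because each atom of $\mu$ at $x$ contributes the same term $f(h(x))\,\mu(\{x\})$ to either side.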
Proof Both assertions are obvious when $f := 1_B$ with $B \in \mathcal{B}$ and $h(\mu)(B) < \infty$ since $1_{h^{-1}(B)} = 1_B \circ h$. By linearity these assertions are extended to the case when $f$ is an $h(\mu)$-step function on $(Y, \mathcal{B})$. If $f$ is the limit a.e. of a Cauchy sequence $(f_n)$ of $h(\mu)$-step functions of $Y$, then $f \circ h$ is the limit a.e. of the sequence $(f_n \circ h)$, which is a Cauchy sequence of $\mu$-step functions on $X$. Relation (7.12) is obtained by a passage to the limit.

Remark If $(Y, \mathcal{B}) = (X, \mathcal{A})$ and if $\mu$ is invariant under $h$ in the sense that $h(\mu) = \mu$, then for every $f \in L_1(X, \mu)$ one has $\int f \, d\mu = \int f \circ h \, d\mu$.

Fubini's Theorem can be used to show that the Lebesgue measure $\lambda_d$ on $\mathbb{R}^d$ is invariant under linear isometries. We need a preliminary algebraic result.

Lemma 7.9 For $d \ge 2$, any linear map $u : \mathbb{R}^d \to \mathbb{R}^d$ is obtained as the composition of a finite family $\{u_1, \dots, u_k\}$ of linear maps of the following types described in terms of the canonical basis $e_1, \dots, e_d$:
(a) for some permutation $\sigma$ of $\mathbb{N}_d := \{1, \dots, d\}$ one has $u(e_i) = e_{\sigma(i)}$;
(b) for some $r \in \mathbb{R}$ one has $u(e_1) = r e_1$, $u(e_i) = e_i$ for $i \in \mathbb{N}_d \setminus \{1\}$;
(c) $u(e_1) = e_1 + e_2$, $u(e_i) = e_i$ for $i \in \mathbb{N}_d \setminus \{1\}$.

Proof If $u = 0$ one can write $u = p_d \circ p_1$ where $p_1$ (resp. $p_d$) is the first (resp. the last) projection. If $u \ne 0$ one of the coefficients of the matrix of $u$ is non-null. Using permutations if necessary we may suppose it is $u_{1,1}$. Composing $u$ with maps of the type (a), (b) and (c) in order to get $u_{i,1} = 0$ for $i \in \mathbb{N}_d \setminus \{1\}$ we can suppose there exist $c := u_{1,1}$, $v \in L(\mathbb{R}^{d-1}, \mathbb{R}^{d-1})$, $w \in L(\mathbb{R}^{d-1}, \mathbb{R})$ such that for $(r, y) \in \mathbb{R} \times \mathbb{R}^{d-1}$ we have $u(r, y) = (cr + w(y), v(y))$. Let $u' \in L(\mathbb{R}^d, \mathbb{R}^d)$, $v' \in L(\mathbb{R}^d, \mathbb{R}^d)$ be defined by
\[
u'(s, z) := (s, v(z)), \qquad v'(r, y) := (cr + w(y), y),
\]
so that $u = u' \circ v'$. Clearly $v'$ is a composition of maps of types (b) and (c). Then, using an induction assumption on $d$ applied to $v$, we get the result.

Theorem 7.12 For any linear map $u : \mathbb{R}^d \to \mathbb{R}^d$ and any (Lebesgue) measurable subset $S$ of $\mathbb{R}^d$ the set $u(S)$ is Lebesgue measurable and $\lambda_d(u(S)) = |\det(u)|\, \lambda_d(S)$. In particular, the Lebesgue measure $\lambda_d$ is invariant under the orthogonal group. When $u$ is an isomorphism, the preceding relation means that the image measure of $\lambda_d$ by $u^{-1}$ is $|\det(u)|\, \lambda_d$.

Proof The result follows from Proposition 7.16 when $\det(u) = 0$ since then $u(S)$ is contained in a hyperplane. Thus, we may suppose $u$ is an isomorphism. Since $u^{-1}$ is continuous, it is measurable, so that $u(S) = (u^{-1})^{-1}(S)$ is measurable whenever $S$ is measurable. Since for two linear isomorphisms $v$, $w$ from $\mathbb{R}^d$ into itself we have $\det(v \circ w) = \det(v)\det(w)$, it suffices to prove the result for each map described in the preceding lemma. We already observed that $\lambda_d$ is invariant under $u$ when $u$ permutes two coordinates, hence when $u$ is any permutation of the coordinates. For isomorphisms of the type (b) and (c) of the lemma, it suffices to prove that the measures $\lambda_d$ and $B \mapsto (1/|\det(u)|)\, \lambda_d(u(B))$ coincide on the class $\mathcal{C}$ of products of intervals. For type (b) this is obvious and the case of type (c) can be reduced to the case $d = 2$. Now if $u : \mathbb{R}^2 \to \mathbb{R}^2$ is the map $(x, y) \mapsto (x + y, y)$ and if $A$, $B$ are intervals of $\mathbb{R}$ we observe that the slice $(u(A \times B))_y := \{r : (r, y) \in u(A \times B)\}$ is just $A + y$. Tonelli's Theorem and the invariance of the Lebesgue measure under translations yield the conclusion.

Corollary 7.19 Let $E$ be a Banach space, let $f \in L_1(\mathbb{R}^d, \lambda_d, E)$ and let $u : \mathbb{R}^d \to \mathbb{R}^d$ be a linear isomorphism. Then $f \circ u \in L_1(\mathbb{R}^d, \lambda_d, E)$ and
\[
\int_{\mathbb{R}^d} f(x)\, d\lambda_d(x) = \int_{\mathbb{R}^d} f(u(w))\, |\det(u)|\, d\lambda_d(w). \tag{7.13}
\]
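The scaling rule $\lambda_d(u(S)) = |\det(u)|\,\lambda_d(S)$ behind relation (7.13) can be checked by hand in the plane, where the image of a rectangle under a linear map is a parallelogram whose area is given by the shoelace formula. The sketch below is an illustration under these assumptions; the helper names and the sample matrices are hypothetical.

```python
def apply(u, p):
    """Apply the 2x2 matrix u (given by rows) to the point p."""
    (a, b), (c, d) = u
    x, y = p
    return (a * x + b * y, c * x + d * y)


def det2(u):
    (a, b), (c, d) = u
    return a * d - b * c


def polygon_area(pts):
    """Area of a simple polygon via the shoelace formula."""
    s = 0.0
    n = len(pts)
    for i in range(n):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0


# The rectangle [0, 2] x [0, 3], listed by its vertices.
rectangle = [(0.0, 0.0), (2.0, 0.0), (2.0, 3.0), (0.0, 3.0)]

general = ((2.0, 1.0), (-1.0, 3.0))   # determinant 7
shear = ((1.0, 0.0), (1.0, 1.0))      # u(e1) = e1 + e2, u(e2) = e2, as in type (c)

area = polygon_area(rectangle)
area_general = polygon_area([apply(general, p) for p in rectangle])
area_shear = polygon_area([apply(shear, p) for p in rectangle])
print(area, area_general, area_shear)
```

The shear of type (c) leaves the area unchanged, which is the two-dimensional computation the proof of Theorem 7.12 reduces to.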
Proof For $f := 1_T e$ with $e \in E$, $T$ measurable, relation (7.13) follows from the preceding theorem. By additivity, relation (7.13) holds for $f \in St(\lambda_d, E)$. Taking a Cauchy sequence $(f_n)$ in $St(\lambda_d, E)$ such that $(f_n) \to f$ a.e. we see that $(f_n \circ u\, |\det(u)|)$ is a Cauchy sequence in $St(\lambda_d, E)$ converging a.e. to $f \circ u\, |\det(u)|$. Then relation (7.13) is obtained by a passage to the limit.

Now let us pass to nonlinear changes of variables. We take on $\mathbb{R}^d$ the norm given by $\|x\| := \|x\|_\infty := \max(|x_1|, \dots, |x_d|)$ for $x := (x_1, \dots, x_d) \in \mathbb{R}^d$.

Theorem 7.13 Let $h : W \to X$ be a $C^1$-diffeomorphism between two open subsets of $\mathbb{R}^d$ and let $f \in L_1(X, \lambda_X, \mathbb{R})$ where $\lambda_X$ is the induced Lebesgue measure on $(X, \mathcal{B}(X))$. Then $w \mapsto f(h(w))\, |\det(Dh(w))|$ is integrable on $W$ with respect to the
measure $\lambda_W$ on $(W, \mathcal{B}(W))$ induced by the Lebesgue measure and
\[
\int_X f(x)\, d\lambda_X(x) = \int_W f(h(w))\, |\det(Dh(w))|\, d\lambda_W(w). \tag{7.14}
\]
Proof Using Proposition 7.5 it suffices to prove the result in the case when $f$ is the characteristic function of some $B \in \mathcal{B}(X)$. In such a case, setting $A := h^{-1}(B)$ and introducing the Jacobian $J_h$ of $h$ given by $J_h(w) = \det(Dh(w))$, relation (7.14) takes the form
\[
\lambda_X(h(A)) = \int_A |J_h|\, d\lambda_W \qquad \forall A \in \mathcal{B}(W).
\]
Let us set $\nu(A) := \lambda_X(h(A)) = h^{-1}(\lambda_X)(A)$ for $A \in \mathcal{B}(W)$, so that $\nu = h^{-1}(\lambda_X)$ is a measure on $\mathcal{B}(W)$. In view of Theorem 7.7, the preceding relation is a consequence of the estimates
\[
\inf |J_h(A)|\, \lambda_W(A) \le \nu(A) \le \sup |J_h(A)|\, \lambda_W(A) \qquad \forall A \in \mathcal{B}(W),
\]
where $\inf |J_h(A)|$ and $\sup |J_h(A)|$ denote the infimum and the supremum of $|J_h|$ over $A$. Taking into account the relations $h(\nu) = \lambda_X$, $J_{h^{-1}}(x) = 1/J_h(w)$ for $x := h(w)$, it suffices to prove the right-hand inequality and to apply it to the map $h^{-1}$. We first prove that for any closed cube $C$ (i.e. a ball for the norm $\|\cdot\|_\infty$) contained in $W$ we have $\nu(C) \le \sup |J_h(C)|\, \lambda_W(C)$. Suppose, on the contrary, that we have $\nu(C) > \sup |J_h(C)|\, \lambda_W(C)$ for some closed cube $C$ contained in $W$. Let $c > \sup |J_h(C)|$ be such that $\nu(C) > c\, \lambda_W(C)$. Taking $2^d$ cubes the edges of which have lengths that are half the length of the edges of $C$, we get a $\lambda_W$-partition of $C$, i.e. a covering of $C$ by measurable subsets whose mutual intersections are null sets. By additivity we get that one of the new cubes, which we call $C_1$, is such that $\nu(C_1) > c\, \lambda_W(C_1)$. Repeating this division, we inductively get a closed cube $C_n$ contained in $C$ such that $\operatorname{diam}(C_n) = 2^{-n} \operatorname{diam}(C)$ and
\[
c\, \lambda_W(C_n) < \nu(C_n). \tag{7.15}
\]
The intersection $\bigcap_n C_n$ is a singleton $\{\overline{w}\}$. The derivative $u := Dh(\overline{w})$ of $h$ at $\overline{w}$ is a linear isomorphism satisfying $|\det u| < c$ since $\overline{w} \in C$. Let $k : W \to \mathbb{R}^d$ be given by
\[
k(w) = \overline{w} + u^{-1}(h(w) - h(\overline{w})), \qquad w \in W.
\]
Since $k$ is of class $C^1$, $k(\overline{w}) = \overline{w}$ and $Dk(w) = u^{-1} \circ Dh(w)$ is close to $I_{\mathbb{R}^d}$ for $w$ close to $\overline{w}$, for all $\varepsilon > 0$ we can find $\delta > 0$ such that for all $w \in B(\overline{w}, \delta)$, by the Mean Value Theorem applied to $k - I_{\mathbb{R}^d}$, we have $\|k(w) - w\| \le \varepsilon \|w - \overline{w}\|$.
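Since $\overline{w} \in C_n$ for all $n$ and $(\operatorname{diam}(C_n)) \to 0$, for $n$ large enough we have $C_n \subset B(\overline{w}, \delta)$ and $k(C_n) \subset C_n + \varepsilon B[0, \operatorname{diam}(C_n)/2]$. This last set is a cube whose diameter is $(1 + \varepsilon) \operatorname{diam}(C_n)$. Thus $\lambda_d(k(C_n)) \le (1 + \varepsilon)^d \lambda_W(C_n)$. Using the invariance by translation of the Lebesgue measure on $\mathbb{R}^d$ and Theorem 7.12 we get
\[
\lambda_d(k(C_n)) = \lambda_d(u^{-1}(h(C_n))) = |\det u^{-1}|\, \lambda_d(h(C_n)) = \frac{\nu(C_n)}{|\det u|}.
\]
Then, relation (7.15) yields
\[
c\, \lambda_W(C_n) < \nu(C_n) = |\det u|\, \lambda_d(k(C_n)) \le (1 + \varepsilon)^d\, |\det u|\, \lambda_W(C_n).
\]
Thus $c < (1 + \varepsilon)^d |\det u|$ for all $\varepsilon > 0$, hence $c \le |\det u|$, contradicting $c > \sup |J_h(C)|$. Therefore, for every closed ball $C$ of $W$ we have $\nu(C) \le \sup |J_h(C)|\, \lambda_W(C)$. Since any open subset $O$ of $W$ is the union of a countable family of closed balls contained in $W$ whose mutual intersections are null sets, we get the estimate $\nu(O) \le \sup |J_h(O)|\, \lambda_W(O)$ for every open subset $O$ of $W$. Now, by the regularity of the Lebesgue measure, for all measurable subsets $A$ of $W$ we get
\[
\nu(A) \le \inf\{\nu(O) : O \in \mathcal{O}_W,\ A \subset O\} \le \inf\{\sup |J_h(O)|\, \lambda_W(O) : O \in \mathcal{O}_W,\ A \subset O\} = \lambda_W(A)\, \sup |J_h(A)|
\]
since $w \mapsto |J_h(w)|$ is continuous. This proves the required inequality and the theorem.

Example (Polar Coordinates) Let $W := \,]-\pi, \pi[\, \times P$, $X := \mathbb{R}^2 \setminus (\mathbb{R}_- \times \{0\})$, with $P := \,]0, \infty[$, and let $h : W \to X$ be given by $h(\theta, r) := (r \cos\theta, r \sin\theta)$. Since $\lambda_2(\mathbb{R}^2 \setminus X) = 0$ and since $|\det(Dh(\theta, r))| = r(\cos^2\theta + \sin^2\theta) = r$, for any $f \in L_1(\mathbb{R}^2)$ one has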
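\[
\int_{\mathbb{R}^2} f(x, y)\, dx\, dy = \int_{-\pi}^{\pi} \int_0^{+\infty} f(h(\theta, r))\, r\, dr\, d\theta.
\]
As an application, let us show that $\int_{\mathbb{R}} e^{-x^2} dx = \sqrt{\pi}$. This follows from the use of the Fubini-Tonelli Theorem and of polar coordinates:
\[
\Big( \int_{\mathbb{R}} e^{-x^2} dx \Big)^2 = \int_{\mathbb{R}^2} e^{-(x^2 + y^2)}\, dx\, dy = \int_{-\pi}^{\pi} \int_0^{+\infty} e^{-r^2} r\, dr\, d\theta = 2\pi \Big[ -\tfrac{1}{2} e^{-r^2} \Big]_0^{+\infty} = \pi.
\]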
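The identity $\int_{\mathbb{R}} e^{-x^2}dx = \sqrt{\pi}$ can be sanity-checked numerically on both sides of the polar change of variables. The sketch below uses a plain midpoint rule on truncated intervals; the helper `midpoint`, the truncation at 8 and the number of nodes are arbitrary choices made for this illustration and are not part of the text.

```python
import math


def midpoint(fn, a, b, n):
    """Midpoint-rule approximation of the integral of fn over [a, b]."""
    step = (b - a) / n
    return step * sum(fn(a + (i + 0.5) * step) for i in range(n))


# Cartesian side: the one-dimensional Gaussian integral, truncated to [-8, 8]
# (the neglected tails are of order exp(-64)).
gauss_1d = midpoint(lambda x: math.exp(-x * x), -8.0, 8.0, 4000)

# Polar side: the integrand exp(-r^2) r does not depend on theta, so the
# theta-integral over ]-pi, pi[ contributes the factor 2 * pi.
radial = midpoint(lambda r: math.exp(-r * r) * r, 0.0, 8.0, 4000)
polar = 2.0 * math.pi * radial

print(gauss_1d ** 2, polar, math.pi)
```

Both printed approximations agree with $\pi$ up to the discretization error of the quadrature rule.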
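By a similar calculation, for any $\delta > 0$, using the inclusion $(\mathbb{R}_+ \setminus [0, \delta])^2 \subset \mathbb{R}_+^2 \setminus \delta B_2$, where $B_2$ is the Euclidean unit ball of $\mathbb{R}^2$, we get
\[
\Big( \int_\delta^{\infty} e^{-x^2} dx \Big)^2 = \int_{(\mathbb{R}_+ \setminus [0, \delta])^2} e^{-(x^2 + y^2)}\, dx\, dy \le \int_\delta^{+\infty} \int_0^{\pi/2} e^{-r^2} r\, d\theta\, dr = \frac{\pi}{4} e^{-\delta^2}.
\]
Then, for $t > 0$ one has
\[
\int_\delta^{\infty} t^{-1/2} e^{-u^2/t}\, du = \int_{\delta/\sqrt{t}}^{\infty} e^{-x^2} dx \le \frac{\sqrt{\pi}}{2}\, e^{-\delta^2/2t}.
\]
Moreover, for $d \in \mathbb{N} \setminus \{0, 1\}$, using the fact that for $x \in \mathbb{R}^d \setminus \delta B_d$ there exists $k \in \mathbb{N}_d$ such that $|x_k| > \delta/d^{1/2}$, and summing over the $d$ possible indices $k$ and the two possible signs of $x_k$, we get
\[
\int_{\mathbb{R}^d \setminus \delta B_d} t^{-d/2} e^{-\|x\|^2/t}\, dx \le 2d\, \Big( \int_{\mathbb{R}} t^{-1/2} e^{-u^2/t}\, du \Big)^{d-1} \int_{\delta/d^{1/2}}^{\infty} t^{-1/2} e^{-u^2/t}\, du \le 2d\, \pi^{(d-1)/2}\, \frac{\sqrt{\pi}}{2}\, \big( e^{-\delta^2/td} \big)^{1/2} = d\, \pi^{d/2}\, e^{-\delta^2/2td}.
\]
Hence
\[
\int_{\mathbb{R}^d \setminus \delta B_d} t^{-d/2} e^{-\|x\|^2/t}\, dx \to 0 \quad \text{as } t \to 0_+.
\]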
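Example (Spherical Coordinates) Let $W := \,]-\pi, \pi[\, \times P \times \,]-\pi/2, \pi/2[$, $X := \mathbb{R}^3 \setminus (\mathbb{R}_- \times \{0\} \times \mathbb{R})$, and let $h : W \to X$ be given by $h(\theta, r, \varphi) := (r \cos\theta \cos\varphi, r \sin\theta \cos\varphi, r \sin\varphi)$. It can be proved that $h$ is a diffeomorphism of class $C^1$ of $W$ onto $X$ and $|\det(Dh(\theta, r, \varphi))| = r^2 \cos\varphi$. Then, for any $f \in L_1(\mathbb{R}^3)$ one has
\[
\int_{\mathbb{R}^3} f(x, y, z)\, dx\, dy\, dz = \int_{-\pi}^{\pi} \int_0^{+\infty} \int_{-\pi/2}^{\pi/2} f(h(\theta, r, \varphi))\, r^2 \cos\varphi\, d\varphi\, dr\, d\theta.
\]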
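As a quick plausibility check of the spherical-coordinates formula, one can integrate over the unit ball, where the exact values $\lambda_3(B_3) = 4\pi/3$ and $\int_{B_3} z^2 = 4\pi/15$ are classical. The following sketch uses a triple midpoint rule; the function names, the grid size and the restriction of $r$ to $]0, 1[$ are assumptions made for this illustration.

```python
import math


def h(theta, r, phi):
    """Spherical coordinates map of the example."""
    return (r * math.cos(theta) * math.cos(phi),
            r * math.sin(theta) * math.cos(phi),
            r * math.sin(phi))


def integrate_over_ball(f, radius, n=40):
    """Midpoint rule for the integral of f over the ball of given radius,
    written in spherical coordinates with Jacobian r^2 cos(phi)."""
    d_theta = 2.0 * math.pi / n
    d_r = radius / n
    d_phi = math.pi / n
    total = 0.0
    for i in range(n):
        theta = -math.pi + (i + 0.5) * d_theta
        for j in range(n):
            r = (j + 0.5) * d_r
            for k in range(n):
                phi = -math.pi / 2.0 + (k + 0.5) * d_phi
                x, y, z = h(theta, r, phi)
                total += f(x, y, z) * r * r * math.cos(phi)
    return total * d_theta * d_r * d_phi


volume = integrate_over_ball(lambda x, y, z: 1.0, 1.0)
second_moment = integrate_over_ball(lambda x, y, z: z * z, 1.0)
print(volume, 4.0 * math.pi / 3.0)
print(second_moment, 4.0 * math.pi / 15.0)
```

The agreement is only approximate, since the midpoint rule on a coarse grid carries a discretization error of order $n^{-2}$.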
Exercises 1. Let E be the ellipsoid E D fx 2 Rd W .x1 =a1 /2 C : : : C .xd =ad /2 1g: Compute d .E/: [Hint: use either an induction and Fubini’s Theorem or a linear isomorphism transforming E into the unit ball Bd of Rd .] 2. (Cylindrical coordinates). Let X WD .R2 n.R f0g// R, W WD ; ŒP R, and let h W W ! X be given by h. ; r; z/ WD .r cos ; r sin ; z/. Show that for any f 2 L1 .R3 / one has Z Z Z C1 Z C1 f .x; y; z/dxdydz D f .h. ; r; z//dzrdrd : R3
0
1
3. Given r > s > 0, compute the volume of the solid torus T WD f.x; y; z/ 2 R3 W ..x2 C y2 /1=2 r/2 C z2 s2 g: Compare 3 .T/ with the volume of the cylinder C WD B.0; r s/ Œ0; 2s:
4. Show that the measure bd of the unit ball Bd WD BRd of Rd is given by bd D .1=.d=2/Š/ d=2 if d is even and bd WD .kŠ=dŠ/2d k if dRis odd, d D 2k C 1. [Hint: for t > 1; using polar coordinates, compute I.t/ WD R2 .1 x2 y2 /t dxdy and show that for d 1 one has bdC2 D I.d=2/bd .] 2 5. (First Guldin’s Theorem). Let S be a Borel R subset of RR and let s WD .x; y/ be 1 its center of inertia given by s WD 2 .S/ . S xd2 .x; y/; S yd2 .x; y//: Let V WD f.x; y; z/ 2 R3 W 9.u; v/ 2 S W x D u; y2 C z2 D v 2 g: Prove that 3 .V/ D 2 jyj 2 .S/: [Hint: use cylindrical coordinates with axis R f.0; 0/g.] 6. Show that if h W W ! X is a diffeomorphism of class C1 between two open subsets of R and if C WD Œa; b is a compact interval of W one has Z
b
X .h.C// D
ˇ 0 ˇ ˇh .w/ˇ d.w/:
a
[Hint: use the fact that h0 does not vanish on C and that h can be assumed to be increasing or decreasing.] R1 7. (Euler’s beta For s; t 2 P, let B.s; t/ WD 0 xs1 .1 x/t1 dx: Check that R 1 function) B.s; t/ D 0 ws1 .1 C w/st dw by making the Rsubstitution x D w.1 C w/1 . 1 8. (Euler’s gamma function) For t 2 P, let .t/ WD 0 xt1 ex dx: Verify that .t/ is well defined and that satisfies the relation .t C 1/ D t .t/ for t > 0; so that .n/ D .n 1/Š for n 2 N. 9. Given an open subset W of Rd , with d 2, and a C1 -diffeomorphism h W W ! h.W/ Rd ; show that for all x 2 W there exists a neighborhood V of x such that h j V is obtained as the composition of permutations of coordinates and of C1 -diffeomorphisms of the form .x1 ; : : : ; xd / 7! .j.x1 ; : : : ; xd /; x2 ; : : : ; xd / and .v1 ; : : : ; vd / 7! .v1 ; k2 .v1 ; : : : ; vd /; : : : ; kd .v1 ; : : : ; vd //, where j and ki are of class C1 . Using the preceding exercise, show that one can avoid the use of Theorem 7.7 in a proof of Theorem 7.13.
7.8 Measures on Spheres We intend to define a natural measure on the unit sphere Sd1 of Rd endowed with its Euclidean structure. In fact Sd1 is a compact Riemannian manifold i.e., a compact manifold on which the tangent spaces are given a smoothly varying scalar product and on any such manifold one can associate a canonical measure. In the present case one can use the homeomorphism h W P Sd1 !Rd nf0g given by h.r; s/ WD rs, with P WD0; C1Œ. Given a subset B of Sd1 ; we set C.B/ WD h.T B/ with T WD0; 1Œ: It is a Borel subset of Rd if and only if B is a Borel subset of Sd1 since a subset G of Sd1 is open if and only if the set C.G/ is open in Rd nf0g or in Rd . Let us define a Borel measure on Sd1 by setting for B 2 B.Sd1 / d1 .B/ WD dd .C.B//
where d is the Lebesgue measure on Rd : Since d is invariant under the orthogonal group Od and since C.u.B// D u.C.B// for u 2 Od ; B 2 B.Sd1 /, d1 is invariant under Od . Note that taking B D Sd1 we get that d1 .Sd1 / D dbd ; where bd WD d .BRd / is the measure of the unit ball of Rd : Lemma 7.10 Let d1 be the measure on .P; B.P// with density r 7! rd on P, i.e. the measure induced by the Stieltjes measure on .R; B.R// associated with the function r 7! .rC /d on R, with rC WD max.r; 0/, so that d1 .Œa; bŒ/ WD bd ad for a < b in P. Then, for any Borel subset A of P Sd1 , one has d .h.A// D .d1 ˝ d1 /.A/: Proof We first observe that B.P Sd1 / is generated by the class C of products of the form A WD Œa; bŒB with a; b 2 P, a b, B 2 B.Sd1 /: For such a product A we have h.A/ D bC.B/naC.B/; hence d .h.A// D d .bC.B// d .aC.B//; D bd d1 .B/ ad d1 .B/ D d1 .Œa; bŒ/ d1 .B/ D .d1 ˝ d1 /.A/: Since the family A of sets A 2 B.P Sd1 / satisfying the relation d .h.A// D .d1 ˝ d1 /.A/ is a -algebra, A coincides with B.P Sd1 /. Proposition 7.19 For any integrable function f on Rd one has Z
Z Rd
fdd D
PSd1
f .rs/rd1 d1 .r/dd1 .s/:
Proof This follows from Theorem 7.7 and the fact that the measure A 7! d .h.A// on B.P Sd1 / has the density .r; s/ 7! rd1 with respect to the measure 1 ˝ d1 :
Exercises 1. Given a function g W RC ! RC , let f WD g ı kk W Rd ! R. Verify that if g is measurable (resp. integrable with respect to the measure rd1 dr) so is f for d and Z
Z Rd
fdd D dbd
C1
g.r/rd1 dr 0
with bd WD d .BRd /:
2. Given d 2 Nnf0g and t 2 R, let f WD kkt : Show that for t > d the function f is integrable on the unit ball Bd of Rd but f is not integrable on Rd nBd : Show that
for t < d the function f is integrable on Rd nBd but f is not integrable on Bd : Verify that for t D d the function f is not integrable on Bd or on Rd nBd . 3. Let g W R ! RC be given by g.r/ WD exp.r2 / and let f WD g ı kk W R2 ! R. 2 Show that the functions f and g are integrable and that kf k1 D kgkp 1 . Compute kf k1 using polar coordinates. Deduce from this the relation kgk1 D : 4. Let ft .x; y/R WD .1 x2 y2 /t 1B2 .x; y/ with t 2 1; C1Œ. Using coordinates polar compute ft d2 . Prove that for d 1 one has bdC2 D bd fd=2 1 . Deduce from this the values of b2k and b2kC1 .
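The closed forms for the measure $b_d$ of the unit ball stated in the exercises of Section 7.7 can be compared numerically with the recursion $b_{d+2} = I(d/2)\, b_d$. The sketch below assumes the value $I(t) = \pi/(t+1)$ for the integral of $(1 - x^2 - y^2)^t$ over the unit disk, which is what a polar-coordinates computation should give but which the reader is asked to verify in the exercise; the function names are hypothetical.

```python
import math


def ball_volume_closed_form(d):
    """b_d from the closed forms: pi^(d/2) / (d/2)! for even d,
    (k! / d!) 2^d pi^k for odd d = 2k + 1."""
    if d % 2 == 0:
        return math.pi ** (d // 2) / math.factorial(d // 2)
    k = (d - 1) // 2
    return math.factorial(k) / math.factorial(d) * 2 ** d * math.pi ** k


def ball_volume_recursive(d):
    """b_d from b_1 = 2, b_2 = pi and b_{n+2} = b_n * 2 pi / (n + 2),
    i.e. b_{n+2} = I(n/2) b_n with the assumed value I(t) = pi / (t + 1)."""
    if d % 2 == 1:
        n, b = 1, 2.0
    else:
        n, b = 2, math.pi
    while n < d:
        b *= 2.0 * math.pi / (n + 2)
        n += 2
    return b


for d in range(1, 9):
    closed = ball_volume_closed_form(d)
    recursive = ball_volume_recursive(d)
    sphere = d * closed  # measure of the unit sphere, equal to d * b_d
    print(d, closed, recursive, sphere)
```

For $d = 2$ and $d = 3$ the last column gives $2\pi$ and $4\pi$, the familiar length of the unit circle and area of the unit sphere.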
Abstract The aim of this chapter is twofold. In the first part vectorial measures are introduced and the question of the representation of a measure in terms of another one is tackled. In the second part, Lebesgue Lp spaces are studied and their main properties established. The main properties of the Fourier transform and the Radon transform are displayed in view of their important applications.
In this chapter we consider some advanced subjects of measure theory and integration such as vectorial measures and the derivatives of a measure with respect to another one. We also introduce and study important spaces of functions known as Lebesgue spaces. They serve as models for several questions in functional analysis. We devote attention to some useful transforms, the most important one being the Fourier transform, another being the Radon transform used in medical tomography.
8.1 Vectorial Measures In the sequel .S; S/ is a fixed measurable space. What we called “measure” will be called “positive measure” whenever a risk of confusion may appear. The reason is that we intend to deal with measures with values in R, C or even a Banach space E: A map WS!E is called a vectorial measure or an E-valued measure if .¿/ D 0 and if it is countably additive in the sense that for any A 2 S and any countable (measurable) partition fAn W n 2 Ng of A; i.e. any sequence in disjoint measurable sets whose union is A; the family . .An // is summable and one has .A/ D
sense that the preceding relation holds for any countable partition of A and any A 2 S. Note that the sum is unambiguously defined since we exclude the case when takes both values 1 and C1: For simplicity, in the sequel we always assume a signed measure takes its values in R1 : The total variation j j of a vectorial measure is the function j j W S !RC given by j j .A/ WD sup
1 X
k .An /kE
A 2 S;
nD0
the supremum being taken over all countable partitions fAn W n 2 Ng of A: If is a signed measure we replace k .An /kE with j .An /j, with the convention that jC1j D C1. If is a positive measure the -additivity of yields j j D : If is a vectorial measure and A 2 S, taking the partition fAn W n 2 Ng of A given by A0 WD A; An WD ¿ for n 1 we see that j j .A/ k .A/kE :
(8.1)
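On a finite set the supremum defining the total variation can be computed by brute force, since every countable partition reduces to a finite one once empty sets are discarded. The sketch below does this for a real-valued measure given by its atoms; the name `mu`, the dictionary encoding and the helper functions are assumptions made for this illustration only.

```python
def set_partitions(items):
    """Yield all partitions of the list `items` into nonempty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # `first` forms a block of its own ...
        yield [[first]] + partition
        # ... or joins one of the existing blocks.
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]


def measure(mu, block):
    """Value of the measure on a block, by additivity over its atoms."""
    return sum(mu[s] for s in block)


def total_variation(mu, subset):
    """Supremum of sum |mu(A_n)| over all partitions (A_n) of `subset`."""
    return max(
        sum(abs(measure(mu, block)) for block in partition)
        for partition in set_partitions(list(subset))
    )


# A real-valued measure on S = {a, b, c, d} given by its atoms.
mu = {"a": 1.5, "b": -2.0, "c": 0.5, "d": -0.25}
S = ["a", "b", "c", "d"]

tv = total_variation(mu, S)
print(tv, abs(measure(mu, S)))
```

Here the supremum is attained by the partition into singletons, so the total variation of $S$ is the sum of the absolute values of the atoms, and inequality (8.1) is strict because of cancellations in the value of the measure on $S$.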
Similarly, if is a signed measure, for all A 2 S we have j j .A/ j .A/j. It is easy to see that the space M.S; S; E/ (also denoted by M.S; E/ if there is no risk of confusion) of E-valued measures on .S; S/ and its subset Mb .S; E/ WD f 2 M.S; E/ W j j .S/ < C1g are linear spaces and that the function 7! k k WD j j .S/ is a norm on Mb .S; E/ (see Exercise 3). In Exercise 1 you are invited to prove that if E is finite dimensional then any E-valued measure is bounded in the sense that k k WD j j .S/ is finite, i.e. 2 Mb .S; E/: Proposition 8.1 Let W S ! E be an E-valued measure (resp. W S ! R be a signed measure). Then j j is a positive measure. Moreover, if is a positive measure satisfying k .A/kE .A/ for all A 2 S, then one has j j : Proof Clearly j j .¿/ D 0: Let us show j j is countably additive. Let fAk g be a countable partition of A 2 S by sets in S. For any sequence .rk / of nonnegative numbers satisfying rk D 0 if j j .Ak / D 0 and rk < j j .Ak / whenever j j .Ak / > 0; and for any k 2 N we pick a countable partition fAk;n W n 2 Ng of Ak such that rk ˙n k .Ak;n /kE (taking Ak;0 D Ak , Ak;n D ¿ for n 1 when j j .Ak / D 0). Then XX X rk k .Ak;n /kE j j .A/ k
k
n
since fAk;n W .k; n/ 2 N2 g can be viewed as a countable partition of A: Taking the supremum over the sequences .rk / chosen as above we get ˙k j j .Ak / j j .A/:
To get the reverse inequality we consider an arbitrary countable partition fBn g of A: Then fBn \ Ak W k 2 Ng is a countable partition of Bn ; so that .Bn / D ˙k .Bn \ Ak /; k .Bn /kE ˙k k .Bn \ Ak /kE and X
k .Bn /kE
n
XX n
XX k
k .Bn \ Ak /kE
k
k .Ak \ Bn /kE
n
X
j j .Ak /
k
since fAk \ Bn W n 2 Ng is a countable partition of Ak . Taking the supremum over the set of countable partitions fBn g of A we get j j .A/ ˙k j j .Ak / and equality holds. If is a positive measure satisfying k .A/kE .A/ for all A 2 S, then for any countable partition fAn g of A 2 S one has ˙n k .An /kE ˙n .An / D .A/, hence, taking the supremum over all partitions of A; j j .A/ .A/: It follows from this result that if A, B 2 S are such that A B then one has j j .A/ j j .B/: In particular, if j j .B/ is finite, then for all A 2 S included in B, j j .A/ is finite. Let us give an important example of vectorial measure. Example Given a Banach space E, a positive measure on .S; S/, and h 2 L1 .; E/, one defines an E-valued measure on S by setting for A 2 S Z h .A/ WD
hd:
(8.2)
A
Clearly h .¿/ D 0 and the countable additivity of h follows from Corollary 7.11. Moreover, h is absolutely continuous with respect to in the sense of the next definition. Definition 8.1 A vectorial measure or a signed measure is said to be absolutely continuous with respect to a vectorial measure or a signed measure if for all A 2 S one has .A/ D 0 whenever jj .A/ D 0: Then one writes and one also says that is -continuous. Let us observe that one has if and only if, for all A 2 S, jj .A/ D 0 implies j j .A/ D 0: In fact, if , when jj .A/ D 0, for any measurable partition .An / of A one has .An / D 0 for all n, hence j j .A/ D 0: The converse stems from the relation k .A/kE j j .A/ for all A 2 S. Another characterization explains the terminology. Proposition 8.2 If is a finite positive measure, a vectorial measure or a signed measure is absolutely continuous with respect to if and only if for every " > 0 there exists a ı > 0 such that for all A 2 S satisfying jj .A/ < ı one has j j .A/ < ":
8 Differentiation and Integration
Proof Since k .A/kE j j .A/ for all A 2 S, the condition implies that . Conversely, assume that . If the condition fails, one can find " > 0 such that for any positive integer n there exists some An 2 S with jj .An / < 2n but j j .An / ": Let Bn WD [pn Ap and let B WD \n Bn . Then, jj being a positive measure, we have X jj .B/ jj .Bn / jj .Ap / 2nC1 ; pn
hence jj .B/ D 0. But since An Bn we have j j .Bn / j j .An / ", hence j j .B/ D limn j j .Bn / ": By definition of j j one can find some C 2 S such that C B and .C/ ¤ 0; contradicting the assumption that is absolutely continuous with respect to and the fact that 0 jj .C/ jj .B/ D 0. Let us study more closely the preceding example in order to get some familiarity with the notions we have introduced. Proposition 8.3 Given a Banach space E, a positive -finite measure on .S; S/, and h 2 L1 .; E/, let h be the E-valued measure defined by (8.2). Then one has jh j. More generally, for all A 2 S one has Z jh j .A/ D
khkE d:
(8.3)
A
Moreover, if g 2 L1 .; E/ is such that g D h , then g D h almost everywhere. Proof We have already observed that relation (8.2) implies that h : if A 2 S is such thatR jj .A/ D 0; then we have .A/ D 0 as is a positive measure, hence h .A/ WD A hd D 0. Let us prove relation (8.3). GivenRA 2 S and partition .An / of A by a countable R members of S, since kh .An /kE D 1An hdE 1An khkE d; Corollary 7.11 yields X
kh .An /kE
n
XZ
1An khkE d D
Z X
n
Z 1An khkE d D
khkE d: A
n
RTaking the supremum over all countable partitions of A we get jh j .A/ A khkE d: R The opposite inequality jh j .A/ A khkE d holds if h is a -step function, h WD 1B1 e1 C : : : C 1Bk ek with ei 2 E, Bi 2 S, Bi \ Bj D ¿ for i ¤ j since any (measurable) countable partition .An /n0 of A can be refined in a partition .An \ Bi /.i;n/2Nk N of A such that h is constant on each An \ Bi so that jh j .A/
k Z XX n
iD1
An \Bi
k Z XX hd D E
n
iD1
Z An \Bi
khkE d D
khkE d: A
R In order to prove the inequality jh j .A/ A khkE d in the case h 2 L1 .; E/, let R us observe that we can reduce the task to the case when .A/ is finite: if we had A khkE d jh j .A/ C " for some " > 0; using theRassumption that R is -finite, > we would pick B 2 S with B A, .B/ < C1 and B khkE d A khkE d "; R so that, since jh j .A/ jh j .B/, the inequality jh j .B/ B khkE d would be impossible. Assuming now that ˛ WD .A/ is finite, given " > 0, it suffices to show R that jh j .A/ A khkE d ": Let k./ WD kh./kE : Since k we can find ı > 0 such that for all Z 2 S satisfying .Z/ < ı we have jk j .Z/ < "=3: By Proposition 7.3 we can find a -step function g and Z 2 S such that .Z/ < ı and kg hk1 0 there exists some A 2 S satisfying .A/ < " and .SnA/ < ": Show that ? . [Hint: consider a sequence .An / of S such that .An / < 2n ; .SnAn / < 2n and set T WD \m [nm An .] 6. Let be a finite signed or complex measure on a measurable space .S; S/: d Show that the Radon-Nikodým derivative dj j of with respect to j j satisfies ˇ ˇ ˇ d ˇ ˇ dj j ˇ D 1 j j-almost everywhere on S: Let 0 , 00 be finite signed or complex measures on a measure space .S; ; S/ with -finite. Verify that if 0 and 00 one has 0 00 d WD 0 C 00 and d D d C d : d d 7. Let and be -finite measures on a measurable space .S; S/ such that and let be a -finite signed measure such that : Show that d d d D d : d (-almost everywhere). and that d 8. Let E WD `1 be the space of sequences x WD .xn / of real numbers such that kxk1 WD ˙n jxn j < C1. Prove that E has the RNP. [Hint: note that if en is the element of E whose components are all 0 except for the nth which is 1; the following property holds: if .rn / is a sequence of real numbers such that supn k˙kn rk ek k1 < 1 then ˙kn rk ek converges in `1 . Then, given a measure space .S; S; / and a measure W S ! E of bounded variation satisfying apply the Radon-Nikodým Theorem to the measures n given by .A/ WD ˙n n .A/en to get a Radon-Nikodým derivative of .] 9 . 
Prove that the space c0 of sequences x WD .xn / with limit 0 endowed with the supremum norm does not have the RNP. 10 . Let S WD Œ0; 1 endowed with the restriction of the Lebesgue measure and let E WD C.S/ endowed with the supremum norm. Define W S ! E by .A/.t/ WD .A \ Œ0; t/ for t 2 S: Verify that and that is of bounded variation. Prove that has no Radon-Nikodým derivative. [Hint: for this counterexample due to Lewis, see [99, p. 73].]
8.3 Differentiation of Measures on Rd In this section we want to give a concrete notion of the derivative of a measure on the Borel algebra B WD Bd of Rd with respect to the Lebesgue measure WD d on B. We shall use the notion of a Vitali covering of Rd . Recall that a family F of subsets of a set S is said to be disjoint if distinct members of F are disjoint. If, moreover, F is a covering of S in the sense that the union of the members of F is S, then F is called a partition of S. Definition 8.4 A Vitali covering of a subset A of Rd is a family V of measurable bounded subsets of Rd such that there exists some c > 0 for which one has ..V//1=d c diam.V/ > 0 for all V 2 V and such that for all x 2 A and all r > 0 one can find some V 2 V containing x with diameter diam.V/ < r: The condition .V/ c diam.V/ > 0 for all V 2 V is satisfied if V is a family of balls or cubes (which are balls for the norm kk1 ). This condition discards sets that are too thin. Sometimes Vitali coverings are defined as families of balls, but we prefer to dispose of a more versatile definition. We observe that if V is a Vitali covering of A, then fcl.V/ W V 2 Vg is also a Vitali covering of A because for any subset V of Rd one has diam.cl.V// D diam.V/: We also note that if V is a Vitali covering of A with associated constant c; for any c0 20; cŒ one can find an open Vitali covering V 0 of A with constant at least c0 by replacing each member V of V with V 0 WD V C B.0; "/ where " > 0 is such that c diam.V/ c0 .diam.V/ C 2"/: The following result is crucial but its proof is not easy. We advise the reader to consider first the case d D 1; even if the general case is not much different. Theorem 8.9.(Vitali) Let A be an arbitrary nonempty subset of Rd and let V be a Vitali covering of A. 
Then there exist a finite or countable subfamily W WD fVn W n 2 Ng of V and a null set N of Rd such that the sets cl.Vn / are disjoint and A .[n cl.Vn // [ N: Proof Let us first suppose A is bounded; let b > supfkxk W x 2 Ag. Taking into account the preceding remarks we may assume the members of V are closed. Taking a subfamily of V if necessary, we may assume that all the members of V meet A and are contained in B.0; b/. If for a finite subfamily U WD fVk W k 2 Nn g of V the members of U are disjoint and A AU WD [k Vk we can take W WD U. Thus we suppose henceforth that there is no such family. Then, for any finite disjoint subfamily U of V there is some point x 2 AnAU . Since AU is closed, we can find ı > 0 such that B.x; ı/ \ AU D ¿ and B.x; ı/ B.0; b/: Taking some V 2 V containing x with diameter less than ı we get a member of V disjoint from the members of U, hence a larger disjoint subfamily of V: In order to make this construction more precise and more efficient, we pick V 2 V such that diamV =2; where
WD supfdiamV W V 2 V; V \ AU D ¿g;
noting that 0 < 2b since each element of V is contained in B.0; b/: We set
0 WD supfdiamV W V 2 Vg; and we start with U0 WD fV0 g, where V0 2 V is such that diamV0 0 =2: Assuming inductively that a disjoint subfamily Un1 WD fV0 ; : : : ; Vn1 g of V has been chosen, we set
n WD supfdiamV W V 2 V; V \ AUn1 D ¿g; and we pick some Vn in fV 2 V W V \AUn1 D ¿g such that diamVn > n =2: Taking Un WD Un1 [ fVn g completes our induction step. Since the family W WD fVn W n 2 Ng is disjoint and its members are contained in B.0; b/, we have ˙n .Vn / .B.0; b// < 1, hence ..Vn // ! 0. It follows that .diamVn / ! 0 and . n / ! 0: For each n 2 N we pick a closed ball Bn with center in Vn and radius 2 n : Since diamVn n we have Vn C BŒ0; n Bn . Since .Vn / cd .diamVn /d 2d cd . n /d we even have ˙n . n /d < C1 and ˙n .Bn / < C1 since .Bn / D bd .2 n /d where bd WD .BŒ0; 1/: We claim that for all m 2 N we have [ [ Vj Bk : (8.5) An jm
k>m
In fact, given x 2 An [jm Vj , we can find some V 2 V such that x 2 V and V \ [jm Vj D ¿: Since . n / ! 0 and diamV > 0, there are integers n such that V \ .[jn Vj / ¤ ¿: Let k be the smallest such integer, so that k > m and V \ Vk ¤ ¿: Let y 2 V \ Vk , so that d.x; y/ diamV k by definition of k and x 2 BŒy; k Vk C BŒ0; k Bk : Thus relation (8.5) holds. Denoting by the outer measure associated with ; we deduce from (8.5) that for all m 2 N we have [ [ X .An Vj / .An Vj / .Bk /: j0
jm
k>m
Since the series ˙k .Bk / is convergent, we have .˙k>m .Bk // ! 0 as m ! 1; hence .N/ D 0 for N WD An.[j0 Vj /: When A is unbounded, taking a countable disjoint family fGi W i 2 Ng of bounded open subsets such that .Rd n [i Gi / D 0, for all i 2 N we get a countable disjoint subfamily fVi;n W n 2 Ng of members of V contained in Gi covering A \ Gi up to a null set. Merging these families into a single family we obtain the required countable family covering A up to a null set.
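The selection procedure of the proof can be imitated on a finite family of closed intervals of $\mathbb{R}$. The sketch below is a simplified variant, not the construction of the proof itself: at each step it picks the longest interval disjoint from those already chosen (so its diameter is certainly more than half the supremum of the admissible diameters), and it then checks that every interval of the family lies in the enlargement of a chosen interval by its own length on each side. The function names and the sample family are hypothetical.

```python
def disjoint(u, v):
    """True if the closed intervals u = (a, b) and v = (c, d) do not meet."""
    return u[1] < v[0] or v[1] < u[0]


def greedy_disjoint(family):
    """Repeatedly pick the longest interval disjoint from the chosen ones."""
    remaining = list(family)
    chosen = []
    while remaining:
        best = max(remaining, key=lambda iv: iv[1] - iv[0])
        chosen.append(best)
        # Keep only the intervals that are still disjoint from every chosen one.
        remaining = [iv for iv in remaining if disjoint(iv, best)]
    return chosen


def enlarge(iv):
    """Interval with the same center and three times the length."""
    length = iv[1] - iv[0]
    return (iv[0] - length, iv[1] + length)


def contained(u, v):
    return v[0] <= u[0] and u[1] <= v[1]


family = [(0.0, 1.0), (0.5, 3.0), (2.5, 4.0), (3.9, 4.2),
          (5.0, 5.5), (5.4, 7.0), (6.9, 7.1)]
chosen = greedy_disjoint(family)
print(chosen)
```

A discarded interval meets a chosen interval that was selected while it was still admissible, hence one at least as long, which is why the enlargements cover the whole family; this plays the role of the balls $B_n$ in the proof.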
Definition 8.5 Given a Vitali covering V of Rd and a signed measure on B, the upper derivative and the lower derivative of at x 2 Rd are defined respectively by D.x/ D inf supf
.V/ W V 2 V, x 2 int.V/; diam.V/ < rg; .V/
D.x/ D sup inff
.V/ W V 2 V, x 2 int.V/; diam.V/ < rg: .V/
r>0
r>0
The measure is said to be differentiable at x when D.x/ D D.x/ 2 RC : Then this value is denoted by D.x/ and is called the derivative of at x. Note that in this definition infr>0 and supr>0 can be replaced with limr!0C and when is differentiable at x; for V.x/ WD fV 2 V W x 2 int.V/g; one has D.x/ D
lim
diam.V/!0C
f
.V/ W V 2 V.x/g: .V/
When $\mu$ is a vectorial measure we adopt this definition for the derivative of $\mu$ at $x$.

Lemma 8.2 For any signed measure $\mu$ on $\mathcal{B}$ and any open Vitali covering $\mathcal{V}$ of $\mathbb{R}^d$ the functions $\overline{D}\mu$ and $\underline{D}\mu$ are measurable.

Proof Since $\underline{D}\mu(x) = -\overline{D}(-\mu)(x)$ for all $x \in \mathbb{R}^d$, it suffices to prove the measurability of $\overline{D}\mu$. Taking a sequence $(r_n)$ of positive rational numbers with limit $0$, we have $\overline{D}\mu = \lim_n D_{r_n}\mu$ with
$$D_r\mu(x) := \sup\Big\{ \frac{\mu(V)}{\lambda(V)} : V \in \mathcal{V},\ x \in \operatorname{int}(V),\ \operatorname{diam}(V) < r \Big\}.$$
For all $t \in \mathbb{R}$ the set $\{x \in \mathbb{R}^d : D_r\mu(x) > t\}$ is open. Thus $D_r\mu$ is a lower semicontinuous function, hence is measurable. Then $\overline{D}\mu$ is measurable as the limit of a sequence of measurable functions.

The definitions of lower and upper derivatives enable us to obtain estimates.

Lemma 8.3 Let $\mu$ be a positive measure on $\mathcal{B}$ finite on compact sets and let $B \in \mathcal{B}$. (a) If for some $c > 0$ one has $\overline{D}\mu(x) \ge c$ for all $x \in B$ then one has $\mu(B) \ge c\lambda(B)$. (b) If $\mu$ is absolutely continuous with respect to the Lebesgue measure $\lambda$ of $\mathbb{R}^d$ and if for some $c > 0$ one has $\underline{D}\mu(x) \le c$ for all $x \in B$, then one has $\mu(B) \le c\lambda(B)$.

Proof (a) Since $\mu$ is regular by Theorem 1.14, it suffices to show that for all open subsets $U$ of $\mathbb{R}^d$ containing $B$ and all $b \in \,]0, c[$ one has $\mu(U) \ge b\lambda(B)$. Given a Vitali covering $\mathcal{V}$ of $\mathbb{R}^d$ let $\mathcal{V}_b$ be the family of those $V \in \mathcal{V}$ contained in $U$ and such that $\mu(V) \ge b\lambda(V)$. Since $\overline{D}\mu(x) \ge c$ for all $x \in B$, the
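definition of $\overline{D}\mu(x)$ ensures that $\mathcal{V}_b$ is a Vitali covering of $B$. Theorem 8.9 provides a countable disjoint subfamily $\mathcal{W} := \{V_n : n \in \mathbb{N}\}$ of $\mathcal{V}_b$ such that $\lambda(B \setminus \bigcup_n V_n) = 0$. Since the sets $V_n$ are disjoint, contained in $U$, and such that $\mu(V_n) \ge b\lambda(V_n)$, we get
$$\mu(U) \ge \sum_n \mu(V_n) \ge \sum_n b\lambda(V_n) = b\lambda\Big(\bigcup_n V_n\Big) \ge b\lambda(B).$$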
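(b) Given $a > c$ and a Vitali covering $\mathcal{V}$ of $\mathbb{R}^d$ let $\mathcal{V}_a$ be the family of those $V \in \mathcal{V}$ such that $\mu(V) \le a\lambda(V)$. Since $\underline{D}\mu(x) < a$ for all $x \in B$, the definition of $\underline{D}\mu(x)$ ensures that $\mathcal{V}_a$ is a Vitali covering of $B$. Theorem 8.9 provides a countable disjoint subfamily $\mathcal{W} := \{V_n : n \in \mathbb{N}\}$ of $\mathcal{V}_a$ such that $\lambda(B \setminus \bigcup_n V_n) = 0$. Since the sets $V_n$ are disjoint and such that $\mu(V_n) \le a\lambda(V_n)$ and since $\lambda(B \setminus \bigcup_n V_n) = 0$, hence $\mu(B \setminus \bigcup_n V_n) = 0$ as $\mu \ll \lambda$, we get
$$a\lambda(B) \ge a \sum_n \lambda(V_n) \ge \sum_n \mu(V_n) = \mu\Big(\bigcup_n V_n\Big) \ge \mu(B).$$
Since $a > c$ is arbitrarily close to $c$, we get $\mu(B) \le c\lambda(B)$.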
Let us turn to differentiability results. It is natural to compare the definition of derivative we just introduced with the notion of Radon-Nikodým derivative. We start with some special cases.

Lemma 8.4 Let $\mu$ be a finite Borel measure on $\mathbb{R}^d$ that is singular with respect to the Lebesgue measure $\lambda$. Then $\mu$ is differentiable, with derivative $0$, almost everywhere.

Proof Let $N \in \mathcal{B}$ be such that $\lambda(N) = 0$ and $\mu(N^c) = 0$ for $N^c := \mathbb{R}^d \setminus N$. Let $B := \{x \in \mathbb{R}^d : \overline{D}\mu(x) > 0\}$ and for $n \in \mathbb{N} \setminus \{0\}$ let $B_n := \{x \in N^c : \overline{D}\mu(x) \ge 1/n\} \in \mathcal{B}$. Lemma 8.3 ensures that $\lambda(B_n) \le n\mu(B_n) \le n\mu(N^c) = 0$. Since $B \subset N \cup (\bigcup_n B_n)$ we have $\lambda(B) = 0$. Since $0 \le \underline{D}\mu \le \overline{D}\mu$, we get that $\underline{D}\mu = \overline{D}\mu = 0$ on $\mathbb{R}^d \setminus B$.

Theorem 8.10 Given $h \in L_1(\mathbb{R}^d)$, let $\mu_h$ be the associated measure with density $h$ given by $\mu_h(A) := \int_A h\, d\lambda$. Then $\mu_h$ is differentiable almost everywhere and $D\mu_h = h$ almost everywhere.
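As a hedged numerical illustration of Theorem 8.10 in dimension $d = 1$, the sketch below takes for $h$ the indicator function of $[0, 1]$ (an assumption chosen only because $\mu_h$ of an interval then has a closed form) and computes the ratios $\mu_h(V)/\lambda(V)$ for intervals $V$ containing $x$ in their interior. The helper names `mu_h` and `ratio` are hypothetical and not part of the text.

```python
def mu_h(a, b):
    """mu_h([a, b]) for h = indicator of [0, 1]: the length of [a, b] ∩ [0, 1]."""
    return max(0.0, min(b, 1.0) - max(a, 0.0))


def ratio(x, r, alpha):
    """mu_h(V) / lambda(V) for V = [x - alpha*r, x + (1 - alpha)*r], 0 < alpha < 1."""
    a, b = x - alpha * r, x + (1.0 - alpha) * r
    return mu_h(a, b) / (b - a)


# At x = 0.5, a point where h is continuous, every small interval gives h(0.5) = 1.
inside = [ratio(0.5, 10.0 ** -k, alpha)
          for k in range(1, 6) for alpha in (0.25, 0.5, 0.75)]

# At the jump x = 0 the ratio depends on how V sits around x, so the upper and
# lower derivatives differ there; the exceptional set {0, 1} is Lebesgue-null.
at_jump = [ratio(0.0, 1e-3, alpha) for alpha in (0.25, 0.5, 0.75)]
```

The second list shows why the conclusion can only hold almost everywhere: at the discontinuity the quotients take every value $1 - \alpha$, depending on the position of $V$ around $x$.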
Proof Without loss of generality we suppose $h$ is measurable. For $t \in \mathbb{R}$ we denote by $\mu_t$ the measure with density $(h - t)^+$ with respect to $\lambda$:
$$\mu_t(B) = \int_B (h(x) - t)^+ \, d\lambda(x) \qquad \text{for } B \in \mathcal{B}.$$
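Then, for $S_t := \{h < t\}$, $\mu_t$ is a positive measure satisfying $\mu_t(B) = 0$ for all $B \in \mathcal{B}$ contained in $S_t$. Taking a sequence $(\alpha_n) \to 0_+$ and setting $B_n := \{x \in S_t : \overline{D}\mu_t(x) > \alpha_n\} \in \mathcal{B}$, we deduce from Lemma 8.3 that $\lambda(B_n) = 0$. Thus $D\mu_t = 0$ a.e. on $S_t$. Moreover, since $h \le (h - t)^+ + t$, for any $B \in \mathcal{B}$ we have
$$\mu_h(B) := \int_B h \, d\lambda \le \mu_t(B) + t\lambda(B).$$
In particular, for $x \in \mathbb{R}^d$ and $V \in \mathcal{V}(x)$ we have
$$\frac{\mu_h(V)}{\lambda(V)} \le \frac{\mu_t(V)}{\lambda(V)} + t,$$
so that $\overline{D}\mu_h(x) \le \overline{D}\mu_t(x) + t$. Since $D\mu_t = 0$ a.e. on $S_t$, for all $t \in \mathbb{Q}$, the set $N_t := \{x \in \mathbb{R}^d : h(x) < t < \overline{D}\mu_h(x)\}$ is a $\lambda$-null set. Since $\mathbb{Q}$ is countable, the set $\{h < \overline{D}\mu_h\}$ is a $\lambda$-null set. Since $\mu_{-h} = -\mu_h$ and $\underline{D}\mu_h = -\overline{D}(\mu_{-h})$, we see that $\{h > \underline{D}\mu_h\}$ is a $\lambda$-null set and $\overline{D}\mu_h \le h \le \underline{D}\mu_h$ a.e. Since $\underline{D}\mu_h \le \overline{D}\mu_h$, as is easily seen, we obtain that $\underline{D}\mu_h = h = \overline{D}\mu_h$ a.e. and $\mu_h$ is differentiable a.e. with derivative $h$.

The preceding result can be extended to vectorial measures.

Theorem 8.11 Let $h \in L_1(\mathbb{R}^d, \lambda, E)$, where $E$ is a Banach space. Then the measure $\mu_h$ with density $h$ is almost everywhere differentiable on $\mathbb{R}^d$ and its derivative is $h$: $D\mu_h = h$.

Proof Modifying $h$ on a null set if necessary, we may suppose $h$ is measurable and that $h(\mathbb{R}^d)$ is separable. Let $\mathcal{V}$ be a Vitali covering of $\mathbb{R}^d$. For $w \in \mathbb{R}^d$ and $r > 0$ let $\mathcal{V}_r(w) := \{V \in \mathcal{V} : w \in \operatorname{int}(V),\ \operatorname{diam} V < r\}$. Let
$$W := \Big\{ w \in \mathbb{R}^d : \forall \varepsilon > 0\ \exists r > 0\ \forall V \in \mathcal{V}_r(w) \quad \int_V \|h - h(w)\| \, d\lambda \le \varepsilon\lambda(V) \Big\}.$$
For $w \in W$, since
$$\Big\| \frac{\mu_h(V)}{\lambda(V)} - h(w) \Big\| = \Big\| \frac{1}{\lambda(V)} \int_V h \, d\lambda - h(w) \Big\| \le \frac{1}{\lambda(V)} \int_V \|h - h(w)\| \, d\lambda,$$
the measure $\mu_h$ is differentiable at $w$ with derivative $h(w)$. It remains to show that $\lambda(\mathbb{R}^d \setminus W) = 0$. Let $\{e_n : n \in \mathbb{N}\}$ be a dense countable subset of $h(\mathbb{R}^d)$. For $n \in \mathbb{N}$ let $W_n := \bigcap_{\varepsilon > 0} W_{n,\varepsilon}$, where
$$W_{n,\varepsilon} := \Big\{ w \in \mathbb{R}^d : \exists r > 0\ \forall V \in \mathcal{V}_r(w) \quad \Big| \int_V \frac{\|h - e_n\|}{\lambda(V)} \, d\lambda - \|h(w) - e_n\| \Big| \le \frac{\varepsilon}{3} \Big\}.$$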
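Applying Theorem 8.10 to the function $\|h - e_n\|$ we get that $\lambda(\mathbb{R}^d \setminus W_n) = 0$. Let us show that $\bigcap_n W_n \subset W$; this will prove that $\lambda(\mathbb{R}^d \setminus W) = 0$. For $w \in \bigcap_n W_n$, given $\varepsilon > 0$ let $k \in \mathbb{N}$ be such that $\|h(w) - e_k\| < \varepsilon/3$. Then there exists an $r > 0$ such that for all $V \in \mathcal{V}_r(w)$ we have
$$\int_V \|h - h(w)\| \, d\lambda \le \int_V \|h - e_k\| \, d\lambda + \lambda(V) \|e_k - h(w)\| \le \Big| \int_V \|h - e_k\| \, d\lambda - \lambda(V) \|h(w) - e_k\| \Big| + 2\lambda(V) \|e_k - h(w)\| \le \varepsilon\lambda(V).$$
Thus $w \in W$.

Another differentiability result for a vectorial measure follows.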
Proposition 8.5 Let $\mu : \mathcal{B} \to E$ be a vectorial measure with values in a Banach space $E$. Suppose the total variation $|\mu|$ is finite on every compact subset of $\mathbb{R}^d$. If $\mu$ is singular with respect to the Lebesgue measure $\lambda$ then $\mu$ is almost everywhere differentiable and its derivative is $0$.

Proof Let us first show that if $B \in \mathcal{B}$ is such that $|\mu|(B) = 0$ then $\mu$ is differentiable a.e. on $B$ and its derivative is $0$. For $n \in \mathbb{N}$ let $B_n := \{x \in B : \overline{D}|\mu|(x) \ge 2^{-n}\}$. Lemma 8.3 shows that $\lambda(B_n) = 0$. It follows that for $A := \{x \in B : \overline{D}|\mu|(x) > 0\}$ one has $\lambda(A) = 0$. Thus $\mu$ is differentiable a.e. on $B$ and its derivative is $0$. If $\mu \perp \lambda$ there exists a $B \in \mathcal{B}$ such that $|\mu|(B) = 0$ and $\lambda(\mathbb{R}^d \setminus B) = 0$. The preceding shows that $\mu$ is differentiable a.e. on $B$, hence a.e. on $\mathbb{R}^d$, with derivative $0$.

Theorem 8.12 Let $\mu : \mathcal{B} \to \mathbb{R}$ be a finite measure on the Borel $\sigma$-algebra of $\mathbb{R}^d$. Then there exists a Lebesgue null set $N$ such that $\mu$ is differentiable on $\mathbb{R}^d \setminus N$ and the function $h$ given by $h(x) = D\mu(x)$ for $x \in \mathbb{R}^d \setminus N$, $h(x) = 0$ for $x \in N$ is a Radon-Nikodým derivative of the absolutely continuous part $\mu_a$ of $\mu$.

Proof Let us first suppose $\mu \ll \lambda$ and let $h$ be a Radon-Nikodým derivative of $\mu$ with respect to $\lambda$. For rational numbers $r$, $s$ satisfying $r < s$ let $A(r, s) := \{x \in \mathbb{R}^d : \underline{D}\mu(x) \le r < s \le h(x)\}$. Lemma 8.3 ensures that
$$s\lambda(A(r, s)) \le \int_{A(r,s)} h \, d\lambda = \mu(A(r, s)) \le r\lambda(A(r, s)).$$
The first inequality shows that $\lambda(A(r, s)) < +\infty$. Then, since $r < s$, we get that $\lambda(A(r, s)) = 0$. Since $A := \{x \in \mathbb{R}^d : \underline{D}\mu(x) < h(x)\}$ is the countable union
of the sets $A(r, s)$ with $r, s \in \mathbb{Q}$, $r < s$, we get that $\lambda(A) = 0$. Similarly, for $B := \{x \in \mathbb{R}^d : h(x) < \overline{D}\mu(x)\}$ we have $\lambda(B) = 0$. Since $\underline{D}\mu(x) \le \overline{D}\mu(x)$, we conclude that $\underline{D}\mu(x) = h(x) = \overline{D}\mu(x)$ almost everywhere. Now suppose $\mu$ is an arbitrary finite Borel measure and let $\mu = \mu_a + \mu_s$ be its Lebesgue decomposition. Let $h$ be a Radon-Nikodým derivative of $\mu_a$ with respect to $\lambda$. Since $D\mu_a = h$ by what precedes and $D\mu_s = 0$ by Proposition 8.5, by additivity we have
$$D\mu = D\mu_a + D\mu_s = h \quad \text{almost everywhere.}$$

Given a Lebesgue measurable subset $M$ of $\mathbb{R}^d$, let us consider the Borel measure $\mu$ defined by $\mu(B) := \lambda(B \cap M)$. We say that a point $x$ of $\mathbb{R}^d$ is a point of density of $M$ if $\mu$ is differentiable at $x$ and $D\mu(x) = 1$. We say that $x$ is a point of dispersion of $M$ if $x$ is a point of density of $\mathbb{R}^d \setminus M$, or, equivalently, if $\mu$ is differentiable at $x$ and $D\mu(x) = 0$.

Corollary 8.4 Let $M$ be a Lebesgue measurable subset of $\mathbb{R}^d$. Then $\lambda$-almost every point of $M$ is a point of density of $M$ and $\lambda$-almost every point of $M^c := \mathbb{R}^d \setminus M$ is a point of dispersion of $M$.

For spaces with the Radon-Nikodým property, a representation of vectorial measures of finite variation on the Borel $\sigma$-algebra $\mathcal{B}$ of $\mathbb{R}^d$ can be given.

Theorem 8.13 Let $E$ be a Banach space with the Radon-Nikodým Property. Any vectorial measure $\mu : \mathcal{B} \to E$ on the Borel $\sigma$-algebra $\mathcal{B}$ of $\mathbb{R}^d$ whose total variation is finite is almost everywhere differentiable and there exists a measure $\mu_s : \mathcal{B} \to E$ that is singular with respect to the Lebesgue measure $\lambda$ such that, denoting by $h$ the derivative of $\mu$ with respect to $\lambda$, one has $\mu = \mu_h + \mu_s$.

Proof Let $\mu = \mu_a + \mu_s$ be the Lebesgue decomposition of $\mu$, with $\mu_a \ll \lambda$ and $\mu_s \perp \lambda$. Since $E$ has the RNP and the total variation of $\mu_a$ is finite, $\mu_a$ is differentiable a.e. and for $h := D\mu_a \in L_1(\lambda, E)$ one has $\mu_a = \mu_h : B \mapsto \int_B h \, d\lambda$. By Proposition 8.5, $\mu_s$, whose total variation is finite too, is differentiable a.e. with a null derivative. Since $D\mu = D\mu_a + D\mu_s$, we get that $\mu$ is differentiable a.e. and $D\mu = D\mu_a = h$.
Exercises

1. Given a Vitali covering $\mathcal{V}$ of $\mathbb{R}^d$ show that there exists some $b > 0$ such that for any finite subfamily $\{V_i : i \in I\}$ of $\mathcal{V}$ one can find a subfamily $\{V_j : j \in J\}$ of $\{V_i : i \in I\}$ satisfying $V_j \cap V_{j'} = \emptyset$ for $j \ne j'$ in $J$ and $\lambda(\bigcup_{j \in J} V_j) \ge b\lambda(\bigcup_{i \in I} V_i)$.
2. Let $(S, \mathcal{S}, \mu)$ be a finite measure space without atoms and let $E := L_1(\mu)$. Show that the identity map $I : L_1(\mu) \to E$ is not representable with respect to $(S, \mathcal{S}, \mu)$. [Hint: see [99, p. 61].]
8.4 Derivatives of One-Variable Functions

In this section we use the preceding notions of derivative of a measure to deal with the usual concept of derivative for a one-variable function. Let us first recall the definitions of the Dini derivatives of a function $f : [a, b] \to \mathbb{R}$ at $x \in \,]a, b[$:
$$D_- f(x) := \liminf_{u \to 0_-} \frac{f(x + u) - f(x)}{u}, \qquad D_+ f(x) := \liminf_{u \to 0_+} \frac{f(x + u) - f(x)}{u},$$
$$D^- f(x) := \limsup_{u \to 0_-} \frac{f(x + u) - f(x)}{u}, \qquad D^+ f(x) := \limsup_{u \to 0_+} \frac{f(x + u) - f(x)}{u}.$$
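The right Dini derivatives can be explored numerically. The following minimal sketch (the function name `right_dini` and the sampling scheme are assumptions made for illustration) samples the difference quotient on a geometric grid of increments $u \in [r/1000, r]$ and returns its smallest and largest values. These are finite-sample stand-ins for $D_+ f(x)$ and $D^+ f(x)$, not the limits themselves.

```python
import math


def right_dini(f, x, r=1e-3, n=200_000):
    """Return (min, max) of (f(x+u) - f(x)) / u over a geometric grid of u in [r/1000, r]."""
    lo, hi = math.inf, -math.inf
    for k in range(n + 1):
        u = r * 10.0 ** (-3.0 * k / n)   # u decreases from r to r / 1000
        q = (f(x + u) - f(x)) / u
        lo, hi = min(lo, q), max(hi, q)
    return lo, hi


def osc(x):
    """x * sin(1/x), extended by 0 at x = 0 (continuous, not differentiable at 0)."""
    return x * math.sin(1.0 / x) if x != 0.0 else 0.0


# |x| at 0: both right Dini derivatives equal 1 (the right derivative exists).
lo_abs, hi_abs = right_dini(abs, 0.0)

# x sin(1/x) at 0: the quotient is sin(1/u), so the estimates approach -1 and 1.
lo_osc, hi_osc = right_dini(osc, 0.0)
```

The left Dini derivatives can be estimated in the same way by applying the helper to $t \mapsto -f(-t)$ at $-x$.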
The left (resp. right) derivative of $f$ at $x$ exists if and only if $D_- f(x) = D^- f(x)$ (resp. $D_+ f(x) = D^+ f(x)$) and $f$ is differentiable at $x$ if and only if these four quantities are finite and coincide.

In the next lemma we use the fact that the set $S$ of strict local minimizers of a real one-variable function $g$ is countable. Here we say that $x$ is a strict local minimizer of $g$ if there exists some $\varepsilon > 0$ such that $g(w) > g(x)$ for all $w \in [x - \varepsilon, x + \varepsilon] \setminus \{x\}$. To prove the assertion, for $n \in \mathbb{N} \setminus \{0\}$ let $S_n := \{x : g(w) > g(x)\ \forall w \in [x - 1/n, x + 1/n] \setminus \{x\}\}$, so that for $x$, $y \in S_n$ we have $|x - y| > 1/n$ if $x \ne y$, and $S_n$ is countable. Since $S = \bigcup_{n \ge 1} S_n$ our assertion ensues.

Lemma 8.5 For an arbitrary one-variable function $f$ the following sets are at most countable:
$$E := \{x : D^+ f(x) < D_- f(x)\}, \qquad F := \{x : D^- f(x) < D_+ f(x)\}.$$
Moreover, the set of points at which the right and left derivatives of $f$ exist, but are not equal, is at most countable.

Proof For each $r \in \mathbb{Q}$ let $f_r(x) := f(x) - rx$ and let $F_r := \{x : D^- f(x) < r < D_+ f(x)\}$, so that $F = \bigcup_{r \in \mathbb{Q}} F_r$. Then, each $x \in F_r$ is a strict local minimizer of $f_r$ since $D^- f_r(x) < 0 < D_+ f_r(x)$. Thus $F_r$ is countable and $F$ is countable too. The proof for $E$ is similar. The set $G$ of points at which the right and left derivatives of $f$ exist, but are not equal, is contained in $E \cup F$, hence is at most countable.

Dini's derivatives can be used to prove that a function is nondecreasing.
Theorem 8.14 Let $T$ be an interval of $\mathbb{R}$, let $f : T \to \mathbb{R}$ be continuous and let $D$ be a countable subset of $T$. Then $f$ is nondecreasing on $T$ if and only if for all $x \in T \setminus D$ one has $D^+ f(x) \ge 0$.

The condition $D^+ f(x) \ge 0$ (and even $D_+ f(x) \ge 0$) for all $x \in T \setminus \{\sup T\}$ is obviously necessary. We derive the sufficiency of the condition from the following lemma.

Lemma 8.6 (Zygmund) Let $f : T \to \mathbb{R}$ be a continuous function and let $D$ be a subset of $T$ such that $\operatorname{int} f(D)$ is empty. If for all $x \in T \setminus D$ one has $D^+ f(x) > 0$ then $f$ is nondecreasing on $T$.

Assuming $D$ is countable, for all $\varepsilon > 0$ the function $f_\varepsilon$ given by $f_\varepsilon(x) = f(x) + \varepsilon x$ satisfies the assumptions of the lemma since $D^+ f_\varepsilon(x) = D^+ f(x) + \varepsilon > 0$ for all $x \in T \setminus D$ and since $f_\varepsilon(D)$ is countable. Thus, for all $u, v \in T$ with $u < v$ we have $f_\varepsilon(u) \le f_\varepsilon(v)$, hence $f(u) \le f(v)$, $\varepsilon$ being arbitrarily small.

Proof of the lemma Given $u < v$ in $T$, let us prove that $f(u) \le f(v)$, or equivalently that for any $c < f(u)$ we have $c < f(v)$. Since $f(D)$ does not contain an interval, we may assume that $c \notin f(D)$, replacing $c$ by some $c' \in \,]c, f(u)[$ if necessary. Let $S := \{t \in [u, v] : f(t) \ge c\}$, so that $u \in S$, and $s := \sup S$. Since $S$ is closed it suffices to show that $s = v$ or that the inequality $s < v$ leads to a contradiction. By continuity of $f$, we have $f(s) \ge c$ and we cannot have $f(s) > c$ since otherwise $s$ would be in an open interval contained in $S$. Thus $f(s) = c$ and since $c \notin f(D)$ we have $D^+ f(s) > 0$. That implies that there exists a sequence $(s_n) \to s$ in $]s, v[$ such that $f(s_n) > f(s) = c$ for all $n \in \mathbb{N}$, contradicting the definition of $s$ as $\sup S$. Thus $s = v$ and $f(v) \ge c$ and we conclude that $f(v) \ge f(u)$.

Corollary 8.5 (Dini) Let $f : T \to \mathbb{R}$ be a continuous function on an interval $T$ of $\mathbb{R}$ such that for some $c \in \mathbb{R}$ and some countable subset $D$ of $T$ one has $D^{\pm} f(t) \ge c$ for all $t \in T \setminus D$, where $D^{\pm} f$ is one of the four Dini derivatives of $f$. Then, for any pair $(s, t)$ of distinct points of $T$ one has
$$\frac{f(t) - f(s)}{t - s} \ge c.$$
Applying the result to the function $-f$, one obtains an upper bound for the quotient from an upper bound of $D^{\pm} f$.

Proof We may suppose $s < t$ and $D^{\pm} f = D^+ f$ (otherwise we use the function $g$ given by $g(r) = -f(-r)$ for $r \in -T$). Then the result stems from the theorem applied to the function $f_c : r \mapsto f(r) - cr$.

Exercise Assume that one of the four Dini derivatives of $f$ is finite and continuous at some $r \in \operatorname{int} T$. Show that $f$ is differentiable at $r$.

We need to make clear some continuity properties of nondecreasing functions.
Lemma 8.7 Let $f : \mathbb{R} \to \mathbb{R}$ be a nondecreasing function. Then, for all $x \in \mathbb{R}$ the one-sided limits $f(x_+) := \lim_{y \to x,\ y > x} f(y)$ and $f(x_-) := \lim_{y \to x,\ y < x} f(y)$ [...] $\lim_{y \to x,\ y > x} f(y)$ is an easy consequence of the fact that $f$ is nondecreasing, so that this limit $f(x_+)$ is $\inf f(]x, \infty[) \ge f(x)$. A similar argument holds for $f(x_-)$. Thus $f$ is continuous at $x$ if and only if $f(x_-) = f(x_+)$. For each $x \in C$ we pick a rational number $q_x \in \,]f(x_-), f(x_+)[$. Since for $x < y$ we have $q_x < q_y$, the countability of $C$ stems from the countability of $\mathbb{Q}$. The function $g$ is such that $g(x) = \sup f(]-\infty, x[)$, so that $g$ is nondecreasing and left-continuous since for any sequence $(x_n)$ in $]-\infty, x[$ with limit $x$ we have $]-\infty, x[ \, = \bigcup_n \,]-\infty, x_n[$. If $f$ is left-continuous at $x$ we have $g(x) := f(x_-) = f(x)$. Since $g \le f$, for all $x \in \mathbb{R}$ we have $g(x_+) \le f(x_+)$ and in fact $g(x_+) = f(x_+)$ since for all $c > g(x_+) = \inf g(]x, \infty[)$, we can find some $y \in \,]x, \infty[$ such that $c > g(y)$, so that for all $z \in \,]x, y[$ we have $c > f(z)$, hence $c > f(x_+)$. If $g$ is right-continuous at $r$, we have $g(r) = g(r_+) = f(r_+) \ge f(r)$, hence $g(r) = f(r)$.

Let us pass to differentiability properties. We need a comparison of the derivative of the Stieltjes measure associated with a nondecreasing function $f$ and the derivative of $f$ when it exists.

Lemma 8.8 Let $\mu$ be a finite signed measure on $\mathcal{B}(\mathbb{R})$ and let $f : \mathbb{R} \to \mathbb{R}$ be given by $f(x) := \mu(]-\infty, x[)$ for $x \in \mathbb{R}$. If $\mu$ is differentiable at some $t \in \mathbb{R}$, then $f$ is differentiable at $t$ and $f'(t) = D\mu(t)$.

Proof By definition, for $s \in \mathbb{R}$, $r > 0$ we have
$$\frac{f(s + r) - f(s)}{r} = \frac{\mu([s, s + r[)}{\lambda([s, s + r[)}, \qquad \frac{f(s - r) - f(s)}{-r} = \frac{\mu([s - r, s[)}{\lambda([s - r, s[)}.$$
Taking the Vitali covering $\mathcal{V} := \{[s, s + r[ \, : s \in \mathbb{R},\ r > 0\}$, setting $T_{r,s} := [s, s + r[$, $T'_{r,s} := [s - r, s[$, when $t$ is a point of differentiability of $\mu$ we have $D\mu(t) = \lim_{r \to 0_+,\ s \ldots}$ [...] $> 0$ one can find some $\delta > 0$ such that for any measurable subset $A$ of $T$ satisfying $\lambda(A) < \delta$ one has $|\mu_h(A)| < \varepsilon$. In particular, taking for $A$ the union of a finite family $(T_i)_{i \in \mathbb{N}_m}$ of disjoint open intervals $T_i := \,]a_i, b_i[$ of $T$ satisfying $\sum_i (b_i - a_i) < \delta$ we have $\sum_i |f(b_i) - f(a_i)| \le |\mu_h|(A) < \varepsilon$. For the second assertion we may assume $T$ is a compact interval $[a, b]$ and extend $f'$ to a function $h$ on $\mathbb{R}$ by $0$ on $\mathbb{R} \setminus T$. Then the assertion follows from Lemma 8.8 applied to the measure $\mu_h$ with density $h$ with respect to the Lebesgue measure, so that for $x \in [a, b]$ one has $\mu_h(]-\infty, x[) = \mu_h([a, x[) = \int_a^x h(t) \, d\lambda(t) = f(x)$.

Proposition 8.6 If $T$ is the compact interval $[a, b]$ for some $a < b$ in $\mathbb{R}$, any absolutely continuous function $f$ on $T$ is of bounded variation: $AC(T) \subset BV(T)$.

The Cantor-Lebesgue function shows that the reverse inclusion does not hold (see Exercise 1).

Proof Let $f \in AC(T)$ and let $\delta \in \mathbb{P}$ correspond to $\varepsilon = 1$ in the preceding definition. Introduce a subdivision $\tau := \{t_0 = a < t_1 < \cdots < t_n = b\}$ such that $t_i - t_{i-1} < \delta$ for $i \in \mathbb{N}_n$ and $n < (b - a)/\delta + 1$. Given a subdivision $\sigma := \{s_0 = a < s_1 < \cdots < s_m = b\}$, since the sum
$$S_\sigma := \sum_{i=1}^{m} |f(s_i) - f(s_{i-1})|$$
does not decrease when some additional points are introduced, we may assume that $\sigma$ contains all the points of $\tau$. Gathering the terms of this sum corresponding to intervals contained in some interval $[t_{j-1}, t_j]$, we see that $S_\sigma \le n$. Thus $f$ is in $BV(T)$.

The following condition characterizes absolutely continuous functions, but we only prove it is a necessary condition.
Proposition 8.7 (Lusin) Let $f$ be an absolutely continuous function on $T := [a, b]$. Then $f$ satisfies the following condition:

(N) $S \subset T$, $\lambda(S) = 0 \implies \lambda(f(S)) = 0$.

Proof Let $f \in AC(T)$. Given $\varepsilon \in \mathbb{P}$, let $\delta \in \mathbb{P}$ correspond to $\varepsilon$ as in the definition of absolute continuity. Given $S \subset T$ such that $\lambda(S) = 0$, we consider an open subset $G$ containing $S$ with $\lambda(G) < \delta$. Now $G$ is the union of a countable family $(T_n)$ of open intervals. Since $f$ is continuous, $f(\operatorname{cl}(T_n))$ is an interval whose endpoints are points $f(a_n)$, $f(b_n)$ with $a_n$, $b_n \in \operatorname{cl}(T_n)$, $\{f(a_n), f(b_n)\} = \{\min f(\operatorname{cl}(T_n)), \max f(\operatorname{cl}(T_n))\}$, $a_n \le b_n$, so that $\lambda(f(T_n)) \le |f(b_n) - f(a_n)|$. Since
$$\sum_n (b_n - a_n) \le \sum_n \lambda(T_n) = \lambda(G) < \delta,$$
an extension of Definition 8.6 to countable families shows that
$$\lambda(f(S)) \le \lambda(f(G)) \le \sum_n \lambda(f(T_n)) = \sum_n |f(b_n) - f(a_n)| \le \varepsilon.$$
Since $\varepsilon$ is arbitrarily small, we have $\lambda(f(S)) = 0$.
Proposition 8.8 Let $f : T \to \mathbb{R}$ be a continuous function on $T := [a, b]$ satisfying condition (N). Let $S$ be the set of $s \in \,]a, b[$ such that $f$ is differentiable at $s$ with derivative $0$. Then $\lambda(f(S)) = 0$.

Proof Given $\varepsilon \in \mathbb{P}$, for each $s \in S$ we can find some $\delta_s \in \mathbb{P}$ such that $T_s := [s - \delta_s, s + \delta_s] \subset T$ and
$$|f(s + r) - f(s)| \le \varepsilon |r| \qquad \forall r \in [-\delta_s, \delta_s].$$
For arbitrary points $a, b$ in $T_s$ we have
$$|f(b) - f(a)| \le |f(b) - f(s)| + |f(a) - f(s)| \le 2\varepsilon\delta_s = \varepsilon\lambda(T_s). \tag{8.7}$$
The family $\mathcal{V} := \{T_s : s \in S\}$ is a Vitali covering of $S$, so that by Vitali's theorem there exists a finite or countable subfamily $\mathcal{W} := \{T_s : s \in C\}$ and a null set $N$ such that the sets $T_s$ with $s \in C$ are disjoint and
$$S \subset \Big( \bigcup_{s \in C} T_s \Big) \cup N.$$
By condition (N) we have $\lambda(f(N)) = 0$ and by relation (8.7), for all $s \in C$ we have $\lambda(f(T_s)) \le \varepsilon\lambda(T_s)$. Then, the preceding inclusion yields
$$\lambda(f(S)) \le \lambda\Big( \Big( \bigcup_{s \in C} f(T_s) \Big) \cup f(N) \Big) \le \sum_{s \in C} \lambda(f(T_s)) \le \sum_{s \in C} \varepsilon\lambda(T_s) \le \varepsilon\lambda(T)$$
since the intervals $T_s$ with $s \in C$ are disjoint and contained in $T$. Since $\varepsilon$ is arbitrarily small we see that $\lambda(f(S)) = 0$.

Corollary 8.6 Let $f$ be an absolutely continuous function on an interval $T$. If the derivative $f'$ of $f$ is nonnegative a.e. then $f$ is nondecreasing.

Proof Given $\varepsilon > 0$ let $f_\varepsilon$ be given by $f_\varepsilon(t) := f(t) + \varepsilon t$. Let $D$ be the set of $t \in T$ such that either $f'(t)$ does not exist or is negative, so that $D^+ f_\varepsilon(t) = f'(t) + \varepsilon > 0$ for all $t \in T \setminus D$. By assumption $\lambda(D) = 0$. Then, by Proposition 8.7 applied to the absolutely continuous function $f_\varepsilon$, $\lambda(f_\varepsilon(D)) = 0$, so that the interior of $f_\varepsilon(D)$ is empty. Then Zygmund's lemma applies and $f_\varepsilon$ is nondecreasing. It follows that $f$ is nondecreasing.

Corollary 8.7 Let $f : T \to \mathbb{R}$ be an absolutely continuous function on $T := [a, b]$ whose derivative $f'$ is $0$ a.e. Then $f$ is constant.

Such an assertion is not valid for an arbitrary function, even if it is nondecreasing (see Exercise 1).

Proof Let $S$ be the set of points of $]a, b[$ at which $f$ is differentiable with derivative $0$, so that, by the inclusion $AC(T) \subset BV(T)$ and Theorem 8.16, $T = S \cup N$, where $N$ is of measure $0$. Since $f$ satisfies condition (N) the preceding proposition shows that $\lambda(f(T)) \le \lambda(f(S)) + \lambda(f(N)) = 0$. Thus, the interval $f(T)$ is a singleton and $f$ is constant.

The following theorem is often called the Fundamental Theorem of Calculus. It shows the power of Lebesgue integration theory.

Theorem 8.17 A function $f : T \to \mathbb{R}$ on an interval $T$ of $\mathbb{R}$ is absolutely continuous if and only if it is differentiable a.e., its derivative $f'$ is locally integrable and for all $a, x \in T$
$$f(x) = f(a) + \int_a^x f'(t) \, dt.$$

Proof By Lemma 8.9 the condition is sufficient. Let us prove it is necessary. Since the restriction of $f$ to any compact interval is of bounded variation, $f$ is differentiable a.e. and its derivative $f'$ is Lebesgue measurable. Relation (8.6) shows that $f'$ is Lebesgue integrable on any compact interval, so that it is meaningful to set
$$g(x) := f(a) + \int_a^x f'(t) \, dt.$$
Lemma 8.9 ensures that $g$ is absolutely continuous and a.e. differentiable with derivative $f'$. Thus the absolutely continuous function $f - g$ is a.e. differentiable with derivative $0$. By the preceding corollary it is constant with value $f(a) - g(a) = 0$: $f = g$.

Let us end this section by quoting the following result for which we refer to [60, 226].
Theorem 8.18 (Lusin) Let $g : [a, b] \to \mathbb{R}$ be a measurable function that is finite almost everywhere. Then there exists a continuous function $f : [a, b] \to \mathbb{R}$ that is differentiable a.e. and such that $f' = g$ almost everywhere.
Exercises

1. (The Cantor-Lebesgue function) Recall that the Cantor set $C$ is the image of the set $\{0, 1\}^{\mathbb{N}}$ of sequences $(k_n)_n$ with $k_n = 0$ or $1$ under the map $(k_n) \mapsto \sum_n 2k_n/3^{n+1}$. Let $f : C \to \mathbb{R}$ be given by $f(x) = \sum_n k_n/2^{n+1}$ for $x = \sum_n 2k_n/3^{n+1}$. Show that $f$ is well defined (in spite of the nonuniqueness of the representation of $x$) and is continuous, with $f(0) = 0$, $f(1) = 1$. Show that $f(C) = [0, 1]$. Check that if $]a, b[$ is an open interval in $[0, 1] \setminus C$, then $f(a) = f(b)$, so that $f$ can be extended into a continuous function on $[0, 1]$ by giving it a constant value on each such interval $]a, b[$, so that $f'(x) = 0$ on $[0, 1] \setminus C$ with $\lambda(C) = 0$, in spite of the fact that $f$ is increasing on $C$.
2. Point out two results of the present subsection showing that the function of the preceding exercise is not absolutely continuous.
3. Let $f : [0, 1] \to \mathbb{R}$ be given by $f(0) = 0$, $f(x) = x^2 \sin(x^{-2})$ for $x \in \,]0, 1]$. Show that $f$ is differentiable everywhere, but is not absolutely continuous.
4. Let $T$ be a compact interval of $\mathbb{R}$ and let $f : T \to \mathbb{R}$ be a continuous function that is differentiable at all except countably many of the points of $T$, with an integrable derivative $f'$. Prove that $f$ is absolutely continuous.
5. Let $\mu$ be a finite signed measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$, and let $f : \mathbb{R} \to \mathbb{R}$ be given by $f(x) = \mu(]-\infty, x[)$. Show that $f$ is absolutely continuous if and only if $\mu$ is absolutely continuous with respect to Lebesgue measure. [See Lemma 8.9 and [80, Prop. 4.4.5].]
6. Let $f$ and $g$ be absolutely continuous functions on the compact interval $[a, b]$. Prove the following version of integration by parts:
$$f(b)g(b) - f(a)g(a) = \int_a^b f(t)g'(t) \, dt + \int_a^b f'(t)g(t) \, dt.$$
8.5 Lebesgue $L_p(S, E)$ Spaces

We devote the present section and the next chapter to two classes of normed spaces that play a crucial role in analysis. They are closely related. In this section, unless otherwise specified, $p$ is an element of the interval $[1, +\infty]$ and $q \in [1, +\infty]$ is the so-called conjugate exponent given by $q = (1 - 1/p)^{-1}$ if $p \in \,]1, +\infty[$, $q = +\infty$ if $p = 1$, and $q = 1$ if $p = +\infty$, so that the relation $1/p + 1/q = 1$ holds by convention.
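8.5.1 Basic Facts About Lebesgue Spaces

We start with some classical inequalities.

Lemma 8.10 For $p \in \,]1, +\infty[$ let $q := (1 - \frac{1}{p})^{-1}$. Then for $r, s \in \mathbb{R}_+$ one has
$$rs \le \frac{1}{p} r^p + \frac{1}{q} s^q, \tag{8.8}$$
$$\Big( \frac{1}{2} r + \frac{1}{2} s \Big)^p \le \frac{1}{2} r^p + \frac{1}{2} s^p. \tag{8.9}$$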
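Inequalities (8.8) and (8.9) are easy to sanity-check numerically. The sketch below (the helper names and the sample grid are assumptions chosen for illustration) evaluates the gap between the two sides of each inequality on a small grid of values of $r$, $s$ and $p$, and also evaluates (8.8) in a case of equality, $r^p = s^q$.

```python
def conjugate(p):
    """Conjugate exponent q = (1 - 1/p)^(-1) for 1 < p < +infinity."""
    return 1.0 / (1.0 - 1.0 / p)


def young_gap(r, s, p):
    """Right-hand side minus left-hand side of (8.8); should be >= 0."""
    q = conjugate(p)
    return r ** p / p + s ** q / q - r * s


def convexity_gap(r, s, p):
    """Right-hand side minus left-hand side of (8.9); should be >= 0."""
    return 0.5 * r ** p + 0.5 * s ** p - (0.5 * r + 0.5 * s) ** p


grid = [0.0, 0.1, 0.5, 1.0, 2.0, 7.5]
exponents = [1.5, 2.0, 3.0, 10.0]

young_min = min(young_gap(r, s, p) for p in exponents for r in grid for s in grid)
convex_min = min(convexity_gap(r, s, p) for p in exponents for r in grid for s in grid)

# Equality in (8.8) when r**p == s**q, for instance p = q = 2 and r = s.
equality_gap = young_gap(3.0, 3.0, 2.0)
```

Up to rounding, both minima are nonnegative, and the gap vanishes in the equality case.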
Proof Since relation (8.8) is satisfied if $r = 0$ or $s = 0$, we may assume $r$ and $s$ are positive. Setting $u = r^p$ and $v = s^q$ we are reduced to showing that
$$u^{1/p} v^{1/q} \le \frac{1}{p} u + \frac{1}{q} v. \tag{8.10}$$
Let us consider the function $g : \,]0, +\infty[ \, \to \mathbb{R}$ given by $g(t) = t/p + 1/q - t^{1/p}$. Its derivative $g'$, given by $g'(t) = (1/p)(1 - t^{-1/q})$, is negative on $]0, 1[$ and positive on $]1, +\infty[$, so that $g$ attains its minimum at $t = 1$. Since $g(1) = 0$, we have $g(t) \ge 0$ for all $t > 0$. Setting $t := u/v$ and then multiplying by $v$, we get inequality (8.10), the so-called Young's inequality. Relation (8.9) is a consequence of the convexity of the function $t \mapsto t^p$ (its derivative $t \mapsto p t^{p-1}$ is nondecreasing).

Relation (8.9) is an improvement of the obvious estimate $(\frac{1}{2} r + \frac{1}{2} s)^p \le (r \vee s)^p \le r^p + s^p$.

Definition 8.7 Given a measure space $(S, \mathcal{S}, \mu)$, $p \in \,]1, +\infty[$, and a Banach space $E$, let $\mathcal{L}_p(S, \mathcal{S}, \mu, E)$, or in short $\mathcal{L}_p(S, E)$ or $\mathcal{L}_p(\mu, E)$, be the set of $\mu$-measurable maps $f : S \to E$ such that $\|f(\cdot)\|_E^p$ is integrable. For $E := \mathbb{R}$ the notation $\mathcal{L}_p(S)$ replaces $\mathcal{L}_p(S, \mathbb{R})$.

This set is a vector space since $\|cf\|_E^p = |c|^p \|f\|_E^p$ for $c \in \mathbb{R}$ (or $c \in \mathbb{C}$) and $f \in \mathcal{L}_p(S, E)$ and since $\|f + g\|_E^p \le 2^{p-1} \|f\|_E^p + 2^{p-1} \|g\|_E^p$ for $f$, $g \in \mathcal{L}_p(S, E)$ in view of relation (8.9). In the sequel, for $f \in \mathcal{L}_p(S, \mathcal{S}, \mu, E)$, we set
$$\|f\|_p := \Big( \int_S \|f\|_E^p \, d\mu \Big)^{1/p}.$$
Given a $\mu$-measurable map $f : S \to E$, let $\|f\|_\infty$ be the infimum of the set of $c \in \mathbb{R}_+$ such that $\|f(s)\|_E \le c$ a.e. (with $\|f\|_\infty = \infty$ if there is no such $c$). When finite, this infimum is attained since for any sequence $(c_n) \to c := \|f\|_\infty$ with $c_n > c$ for all $n$ one can find a $\mu$-null set $N_n$ such that $\|f(s)\|_E \le c_n$ for all $s \in S \setminus N_n$, so that $\|f(s)\|_E \le c$ for all $s \in S \setminus N$ with $N := \bigcup_n N_n$. We denote by $\mathcal{L}_\infty(S, \mathcal{S}, \mu, E)$, or in short $\mathcal{L}_\infty(S, E)$, the set of $\mu$-measurable maps $f : S \to E$ such that $\|f\|_\infty < +\infty$. The set $\mathcal{L}_\infty(S, E)$ is a vector space and $f \mapsto \|f\|_\infty$ is easily seen to be a semi-norm on $\mathcal{L}_\infty(S, E)$. The next proposition prepares the proof that $f \mapsto \|f\|_p$ is a semi-norm on $\mathcal{L}_p(S, E)$.

Proposition 8.9 (Hölder's Inequality) Let $p$, $q \in [1, +\infty]$ satisfy $1/p + 1/q = 1$ in the extended sense described above and let $(S, \mathcal{S}, \mu)$ be a measure space. Given $f \in \mathcal{L}_p(S, \mathcal{S}, \mu)$, $g \in \mathcal{L}_q(S, \mathcal{S}, \mu)$ one has $fg \in \mathcal{L}_1(S, \mathcal{S}, \mu)$ and
$$\int |fg| \, d\mu \le \|f\|_p \|g\|_q. \tag{8.11}$$
If $F$, $G$, $H$ are Banach spaces, if $b : F \times G \to H$ is a continuous bilinear map and if $f \in \mathcal{L}_p(S, F)$, $g \in \mathcal{L}_q(S, G)$ one has $b \circ (f, g) \in \mathcal{L}_1(S, H)$ and
$$\|b \circ (f, g)\|_1 \le \|b\| \, \|f\|_p \|g\|_q. \tag{8.12}$$

Proof The first assertion is a special case of the second one with $F = G = H = \mathbb{R}$, $b(r, s) = rs$. The map $h := b \circ (f, g)$ is $\mu$-measurable when $f$ and $g$ are $\mu$-measurable. In view of Corollary 7.12 we may suppose $p$, $q \in \,]1, +\infty[$. Since $f = 0$ a.e. and $h = 0$ a.e. when $\|f\|_p = 0$, we may suppose $\alpha := \|f\|_p \ne 0$ and $\beta := \|g\|_q \ne 0$. Then, for $t \in S$, the relation $\|h(t)\|_H \le \|b\| \, \|f(t)\|_F \|g(t)\|_G$ and inequality (8.8) with $r := (1/\alpha) \|f(t)\|_F$, $s := (1/\beta) \|g(t)\|_G$ yield
$$\frac{1}{\alpha\beta} \|h(t)\|_H \le \|b\| \Big( \frac{1}{p\alpha^p} \|f(t)\|_F^p + \frac{1}{q\beta^q} \|g(t)\|_G^q \Big).$$
Integrating over $S$, we get $(1/\alpha\beta) \int_S \|h(t)\|_H \, d\mu(t) \le \|b\|$ and relation (8.12).
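For a concrete check of inequality (8.11), one may take for $S$ a finite set carrying a weighted counting measure, so that integrals become finite sums. The following sketch rests on that assumption; the weights, the sample functions and the helper name `norm_p` are illustrative choices, not taken from the text.

```python
def norm_p(values, weights, p):
    """(sum_i w_i |v_i|^p)^(1/p): the L_p semi-norm for the measure with masses w_i."""
    return sum(w * abs(v) ** p for v, w in zip(values, weights)) ** (1.0 / p)


# A four-point measure space: mu({i}) = weights[i].
weights = [0.5, 1.0, 2.0, 0.25]
f = [1.0, -2.0, 0.5, 4.0]
g = [3.0, 0.5, -1.0, 2.0]

p = 3.0
q = 1.0 / (1.0 - 1.0 / p)   # conjugate exponent, here 1.5

# Left-hand side of (8.11): the integral of |f g| with respect to mu.
integral_fg = sum(w * abs(a * b) for a, b, w in zip(f, g, weights))

# Right-hand side of (8.11).
holder_bound = norm_p(f, weights, p) * norm_p(g, weights, q)
```

Here the integral equals $5.5$, and the product of the norms is larger, as (8.11) predicts.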
Corollary 8.8 For $k \in \mathbb{N} \setminus \{0, 1\}$ and $p, p_1, \ldots, p_k \in [1, +\infty]$ satisfying $1/p = 1/p_1 + \cdots + 1/p_k$ and $f_i \in \mathcal{L}_{p_i}(S)$ for $i \in \mathbb{N}_k$, one has $f := f_1 \cdots f_k \in \mathcal{L}_p(S)$ and
$$\|f\|_p \le \|f_1\|_{p_1} \cdots \|f_k\|_{p_k}.$$

Proof Setting $q_i := p_i/p$ and $g_i := |f_i|^p$ we reduce the result to the case $p = 1$. Then an induction starting with the case $k = 2$ and Hölder's inequality gives the result.
Example (Interpolation) For $m, p, q \in [1, +\infty]$ and $t \in [0, 1]$ with $m \le p \le q$, $1/p = t/m + (1 - t)/q$ and $f \in \mathcal{L}_m(S) \cap \mathcal{L}_q(S)$ one has $f \in \mathcal{L}_p(S)$ and $\|f\|_p \le \|f\|_m^t \|f\|_q^{1-t}$.

In the sequel we write $\|\cdot\|$ instead of $\|\cdot\|_E$ when no confusion may arise.

Corollary 8.9 If $\mu$ is a finite measure and if $f \in \mathcal{L}_p(S, E)$ for some $p \in \,]1, +\infty]$, then for all $p' \in [1, p]$ one has $f \in \mathcal{L}_{p'}(S, E)$ and
$$\|f\|_{p'} \le \mu(S)^{1/p' - 1/p} \|f\|_p.$$

Proof Setting $r := p/p'$, $s = (1 - 1/r)^{-1}$, $f' := \|f\|^{p'}$, $g := 1_S$, relation (8.11) with $(p, q, f)$ changed into $(r, s, f')$ yields $\int_S \|f\|^{p'} \, d\mu \le \|f\|_p^{p'} \|1_S\|_s$.

Proposition 8.10 The function $\|\cdot\|_p : f \mapsto (\int \|f\|^p \, d\mu)^{1/p}$ is a semi-norm on $\mathcal{L}_p(S, E)$. In particular, for $f$, $g \in \mathcal{L}_p(S, E)$ we have the Minkowski inequality
$$\|f + g\|_p \le \|f\|_p + \|g\|_p. \tag{8.13}$$

Proof The relation $\|cf\|_p = |c| \, \|f\|_p$ for $c \in \mathbb{C}$ (or $\mathbb{R}$) and $f \in \mathcal{L}_p(S, E)$ is immediate. Let $f$, $g \in \mathcal{L}_p(S, E)$. Writing $\|f + g\|^p \le \|f\| \, \|f + g\|^{p-1} + \|g\| \, \|f + g\|^{p-1}$, integrating over $S$ and using Hölder's inequality on each term we get
$$\int_S \|f + g\|^p \, d\mu \le \|f\|_p \Big( \int_S \|f + g\|^{q(p-1)} \, d\mu \Big)^{1/q} + \|g\|_p \Big( \int_S \|f + g\|^{q(p-1)} \, d\mu \Big)^{1/q}.$$
When $\|f + g\|_p$ is non-null, dividing both sides by $\|f + g\|_p^{p/q} = (\int_S \|f + g\|^p \, d\mu)^{1/q}$ and using the relations $p - p/q = 1$, $q(p - 1) = p$, we get $(\int_S \|f + g\|^p \, d\mu)^{1/p} \le \|f\|_p + \|g\|_p$. When $\|f + g\|_p = 0$, the inequality $\|f + g\|_p \le \|f\|_p + \|g\|_p$ is obvious.

If $f \in \mathcal{L}_p(S, E)$ is such that $\|f\|_p = 0$ we have $\|f\|^p = 0$ a.e. by Corollary 7.16, hence $f = 0$ a.e. This fact prompts us to consider the space $L_p(S, E)$ of equivalence classes of maps in $\mathcal{L}_p(S, E)$ with respect to the relation of equality almost everywhere. Then we have at our disposal properties similar to those in $L_1(S, E)$. Moreover, the semi-norm $\|\cdot\|_p$ induces a norm on $L_p(S, E)$. We first prove an analogue of Egoroff's Theorem.

Proposition 8.11 Let $(f_n)$ be an Abel sequence in $(L_p(\mu, E), \|\cdot\|_p)$ with $p \in [1, +\infty[$. Then $(f_n)$ converges a.e. to some $\mu$-measurable function $f$, and given $\varepsilon > 0$, there exists a subset $T$ of $S$ of measure less than $\varepsilon$ such that the convergence is uniform on $S \setminus T$. Moreover, $f$ is in $L_p(\mu, E)$ and $(f_n) \to f$ in $L_p(\mu, E)$.

Proof Let $c > 0$ and $r \in \,]0, 1[$ be such that $\|f_{n+1} - f_n\|_p \le c r^{2n}$ for all $n \in \mathbb{N}$. Changing all the $f_n$'s on a set of measure $0$, we may assume that they are all
measurable. Let $S_n$ be the set of $s \in S$ such that $\|f_{n+1}(s) - f_n(s)\|_E \ge c r^n$. Then
$$c^p r^{np} \mu(S_n) \le \int_{S_n} \|f_{n+1} - f_n\|_E^p \, d\mu \le \int_S \|f_{n+1} - f_n\|_E^p \, d\mu \le c^p r^{2np},$$
so that $\mu(S_n) \le r^{np}$. Setting $T_k := \bigcup_{n \ge k} S_n$ and $N := \bigcap_k T_k$ we have $\mu(T_k) \le r^{kp}/(1 - r^p)$ and $\mu(N) = 0$. For $s \in S \setminus T_k$ and $n \ge k$ we have $\|f_{n+1}(s) - f_n(s)\|_E < c r^n$, so that $(f_n)$ converges uniformly on $S \setminus T_k$ and $(f_n)$ converges pointwise to some function $f$ on $S \setminus N$. Extending $f$ by $0$ on $N$, we get a $\mu$-measurable function on $S$. We still denote it by $f$. Moreover, for $n \ge k$ in $\mathbb{N}$ we have
$$\|f_n - f_k\|_p = \Big( \int_S \|f_n - f_k\|^p \, d\mu \Big)^{1/p} \le \sum_{j=k}^{n-1} \|f_{j+1} - f_j\|_p \le c \, \frac{r^{2k}}{1 - r^2}.$$
Applying Fatou's Lemma to the sequence $(\|f_n - f_k\|^p)_{n \ge k}$, which converges almost everywhere to $\|f - f_k\|^p$, we get that
$$\int_S \|f - f_k\|^p \, d\mu \le \liminf_n \int_S \|f_n - f_k\|^p \, d\mu \le c^p \, \frac{r^{2kp}}{(1 - r^2)^p}.$$
Since $f_k \in L_p(S, E)$ and $\|f - f_k\|_p < +\infty$ we obtain that $f \in L_p(S, E)$ and $\|f - f_k\|_p \le c r^{2k}/(1 - r^2)$, so that $(f_k) \to f$ in $L_p(S, E)$.

Theorem 8.19 For $p \in [1, +\infty]$, any measure space $(S, \mathcal{S}, \mu)$, and any Banach space $E$, the spaces $(\mathcal{L}_p(S, E), \|\cdot\|_p)$ and $(L_p(S, E), \|\cdot\|_p)$ are complete.

Proof Since completeness can be established by using Abel sequences, the case $p \in [1, +\infty[$ is established by Proposition 8.11. Thus, we turn to the case $p = +\infty$. Let $(f_n)$ be a Cauchy sequence in $(\mathcal{L}_\infty(S, E), \|\cdot\|_\infty)$. From the comments following the definition of $\|\cdot\|_\infty$ we know that for all $m$, $n \in \mathbb{N}$ there exists a null set $N_{m,n}$ such that $\|f_m - f_n\|_\infty = \sup\{\|f_m(s) - f_n(s)\| : s \in S \setminus N_{m,n}\}$. Let $N := \bigcup_{m,n} N_{m,n}$; it is a null set and on $S \setminus N$ the sequence $(f_n)$ is a Cauchy sequence for the norm of uniform convergence. Since $E$ is complete, this sequence converges to some function $f$. We extend $f$ by $0$ on $N$. Then $f$ is $\mu$-measurable and $f \in \mathcal{L}_\infty(S, E)$. Moreover, since $(\sup\{\|f_n(s) - f(s)\| : s \in S \setminus N\}) \to 0$, we get $(\|f_n - f\|_\infty)_n \to 0$.

Theorem 8.20 (Monotone Convergence Theorem in $L_p$) For $p \in [1, +\infty[$, let $(f_n)$ be an increasing sequence in $L_p(S, \mathbb{R})$ such that there exists some $c \in \mathbb{R}_+$ satisfying $\|f_n\|_p \le c$ for all $n \in \mathbb{N}$. Then there exists some $f \in L_p(S, \mathbb{R})$ such that $(f_n) \to f$ a.e. and $(\|f_n - f\|_p) \to 0$.

Proof Let $g_n := f_n - f_0$, so that $(g_n)$ is increasing and $g_n$ takes its values in $\mathbb{R}_+$. Moreover, the Minkowski inequality yields $\|g_n\|_p \le \|f_n\|_p + \|f_0\|_p \le 2c$. By the Monotone Convergence Theorem, the sequence $(g_n^p)_n$ converges a.e. to some element $h \in L_1(S, \mathbb{R})$ and $(\|g_n^p - h\|_1)_n \to 0$. Since $h \ge 0$ a.e. we can set $g := h^{1/p}$
and $f := f_0 + g$. Then $(g_n) \to g$ a.e., so that $g$ is $\mu$-measurable and even $g \in L_p(S, \mathbb{R})$ since $g^p \in L_1(S, \mathbb{R})$. For $c \in \mathbb{R}_+$ the function $r \mapsto r^p - (r - c)^p$ being nondecreasing on $[c, +\infty[$, we have $r^p \ge (r - c)^p + c^p$ for all $r \ge c$, hence $(g - g_n)^p \le g^p - g_n^p$. Integrating over $S$, we get $(\|g - g_n\|_p)^p \le \|g^p\|_1 - \|g_n^p\|_1$. Since $(\|g_n^p\|_1)_n \to \|h\|_1 = \|g^p\|_1$ we get $(\|g - g_n\|_p)_n \to 0$, hence $(\|f - f_n\|_p)_n \to 0$.

Theorem 8.21 (Dominated Convergence Theorem in $L_p$) For $p \in [1, +\infty[$, let $(f_n)$ be a sequence in $L_p(S, E)$ such that there exists some $h \in L_p(S, \mathbb{R})$ satisfying $\|f_n\| \le h$ for all $n \in \mathbb{N}$. If $(f_n) \to f$ a.e. for some function $f$, then $f \in L_p(S, E)$ and $(\|f_n - f\|_p) \to 0$.

Proof Since $(f_n) \to f$ a.e. and all $f_n$'s are $\mu$-measurable functions, $f$ is $\mu$-measurable and satisfies $\|f\| \le h$ a.e., so that $\|f\|_p \le \|h\|_p < \infty$ and $f \in L_p(S, E)$. Now $\|f_n - f\|^p \in L_1(S, \mathbb{R})$ and since $\|f_n - f\|^p \le 2^p h^p$, the Dominated Convergence Theorem ensures that $(\|f_n - f\|^p)_n \to 0$ in $L_1(S, \mathbb{R})$, so that $(\|f_n - f\|_p)_n \to 0$.

Let us give a kind of converse.

Theorem 8.22 For $p \in [1, +\infty[$, let $(f_n) \to f$ be a convergent sequence in $(L_p(S, E), \|\cdot\|_p)$. Then there exist $h \in L_p(S, \mathbb{R})$ and a subsequence $(f_{k(n)})$ of $(f_n)$ such that $(f_{k(n)}) \to f$ a.e. and $\|f_{k(n)}(x)\| \le h(x)$ a.e. for all $n \in \mathbb{N}$.

Proof In fact, the conclusion is valid for any Abel subsequence of $(f_n)$. Thus we may suppose that the sequence $(f_n)$ satisfies
$$\|f_n - f_{n+1}\|_p \le 2^{-n} \qquad \forall n \in \mathbb{N}.$$
Setting
$$g_n(x) := \sum_{k=1}^{n} \|f_{k+1}(x) - f_k(x)\|$$
we see that $\|g_n\|_p \le 1$. The Monotone Convergence Theorem implies that there exists some $g \in L_p(S, \mathbb{R})$ such that $(g_n) \to g$ a.e. and $(\|g_n - g\|_p) \to 0$. For $n > m \ge 2$ we have
$$\|f_n(x) - f_m(x)\| \le \sum_{j=m}^{n-1} \|f_{j+1}(x) - f_j(x)\| \le g(x) - g_{m-1}(x),$$
so that $(f_n(x))$ is a.e. a Cauchy sequence in $E$, hence has a limit. Denoting this limit by $f_\infty(x)$ and passing to the limit on $n$ in the preceding inequalities we get
$$\|f_\infty(x) - f_m(x)\| \le g(x) - g_{m-1}(x) \le g(x).$$
8.5 Lebesgue Lp .S; E/ Spaces
477
Thus f1 2 Lp .S; R/: Using again the Dominated Convergence Theorem we get .k f1 fm kp / ! 0 as m ! 1; so that f1 D f a.e. and k fm .x/k k f .x/k C g.x/ with h WD k f k C g 2 Lp .S; R/:
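The need to pass to a subsequence in Theorem 8.22 can be seen on the classical "typewriter" sequence of indicator functions of dyadic intervals of $[0,1]$. The following Python sketch is only an illustration; the function names and the sample point $x = 1/3$ are choices made here. It uses exact rational arithmetic, indexes the functions by $n = 2^m + k$ with $0 \le k < 2^m$, and shows that the $L_p$ norms tend to $0$ while the values at $x$ return to $1$ once per dyadic level, whereas the subsequence $(f_{2^m})$ vanishes at $x$ for $m \ge 2$.

```python
from fractions import Fraction

def level_and_offset(n):
    """Write n = 2**m + k with 0 <= k < 2**m (valid for n >= 2) and return (m, k)."""
    m = n.bit_length() - 1
    return m, n - 2**m

def f(n, x):
    """Value at x of the indicator function of [k / 2**m, (k + 1) / 2**m[."""
    m, k = level_and_offset(n)
    return 1 if Fraction(k, 2**m) <= x < Fraction(k + 1, 2**m) else 0

def lp_norm(n, p):
    """L_p norm of f_n on [0, 1]: (length of the interval) ** (1 / p)."""
    m, _ = level_and_offset(n)
    return 2.0 ** (-m / p)

x = Fraction(1, 3)
# Indices n < 2**10 at which f_n(x) = 1: exactly one per dyadic level m = 1, ..., 9.
hits = [n for n in range(2, 2**10) if f(n, x) == 1]
# Along the subsequence n = 2**m, f_n is the indicator of [0, 2**-m[.
subsequence_values = [f(2**m, x) for m in range(2, 10)]
print(len(hits), subsequence_values, lp_norm(2**9, 2))
```

Here the whole sequence converges to $0$ in $L_p$ but converges at no point of $[0,1[$, while the chosen subsequence converges to $0$ at every $x > 0$.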
Exercises 1. Let S WD Œ0; 1 be endowed with the Lebesgue measure. For m 2 Nnf0g and k 2 f0; : : : ; 2m 1g let Tm;k WD Œ2m k; 2m .k C 1/Œ and let f2m Ck D 1Tm;k : Show that k fn kp D 2m=p for n D 2m Ck with k 2 f0; : : : ; 2m 1g; hence that . fn / ! 0 in Lp .S/: Given x 2 Œ0; 1Œ and m 2 Nnf0g, let km .x/ 2 f0; : : : ; 2m 1g be such that km .x/ 2m x < km .x/C1; so that fn.x/ .x/ D 1 for n.x/ WD 2m Ckm .x/: Show that for all x 2 S there exists a subsequence in . fn .x//n that does not converge to 0: 2. For p 1 give an example of a measure space .S; S; / and of a sequence . fn / of Lp .S/ such that . fn / ! 0 a.e. that does not converge to 0 in Lp .S/: 3. Show that the Monotone Convergence Theorem does not hold in L1 .S; R/. 4. Show that the Dominated Convergence Theorem does not hold in L1 .S; R/. 5. Given a finite measure space .S; S; / and p; q; r such that 1 p q r and t 2 Œ0; 1 such that 1=q D t=p C .1 t/=r show that Lp .S/ \ Lr .S/ Lq .S/ and k f kq k f ktp k f kr1t for all f 2 Lp .S/ \ Lr .S/. [Hint: set s WD p=tq, s0 WD r=.1 t/q so that 1=s C 1=s0 D 1 and apply Hölder’s inequality to j f jq D j f jtq j f j.1t/q .] 6. Given p 2 Œ1; 1Œ, a finite measure space .S; S; /; and f 2 Lp .S/ show that k f k1 D lim
p!1
1 k f kp : .S/
[Hint: set h. p/ WD .S/1=p k f kp so that h is increasing and bounded above by k f k1 I given r < k f k1 let SRr WD fs 2 S W j f .s/j rg and observe that .Sr / < C1; h. p/ .S/1=p . Sr j f jp /1=p .S/1=p .Sr /1=p r.] For r 2 RC let f 2 Lp .S/ be such that k f kp r for all p 2 Œ1; 1Œ: Show that f 2 L1 .S/: Exhibit an example of a function f 2 Lp .S/nL1 .S/ for all p 2 Œ1; 1Œ. [Hint: take f WD ln on S WD Œ0; 1.] 7. For d 2 N, d 2; let f1 ; ; fd 2 Ld1 .Rd1 /. Then, setting x0i WD .x1 ; ; xi1 ; xiC1 ; ; xd / 2 Rd1 for x 2 Rd and i 2 Nd ; for f given by f .x/ WD f1 .x01 / fd .x0d / show that f 2 L1 .Rd / and k f k1 k f1 kd1 k fd kd1 : 8. (Young’s inequality) Given p, q, r 2 Œ1; C1 satisfying 1=p C 1=q D 1 C 1=r and f 2 Lp .R/; g 2 Lq .R/ show that f g 2 Lr .R/ and k f gkr k f kp : kgkq :
9. Show that the convolution f g of f 2 Lp .R/ with g 2 Lq .R/ may not Rexist if the condition 1=p C 1=q 1 is not satisfied. [Hint: one may have R f .x w/g.w/dw D C1 for all x 2 R.] 10. Given a -finite measure space .X; S; / and a measurable function f W X ! R one defines mf W RC !R by mf .r/ WD .fx 2 X W j f .x/j > rg/. Verify that mf is R R C1 nonincreasing. Show that for all p 1 one has j f jp d D 0 prp1 mf .r/dr. [Hint: apply Fubini’s Theorem to the function .x; r/ 7! prp1 mf .r/ on the set S WD f.x; r/ 2 X R W r 2 Œ0; f .x/g]. Deduce from this Tchebychev’s inequality rp mf .r/ k f kp : 11. With the notation of the preceding exercise let Mp ./ be the set of measurable functions f such that r 7! rp mf .r/ is bounded on RC . Verify that for X WD0; 1 endowed with the restriction of the Lebesgue measure the function f W X ! R given by f .x/ WD 1=x ln x belongs to M1 ./nL1 ./. 12. With the notation of the two preceding exercises, for f 2 L1 ./; where is the Lebesgue measure on R, x 2 RR and if T is an open interval of R containing x one sets Tf .x/ WD .1=.T// T fd: Let hf W R !R be the Hardy-Littlewood function given by hf .x/ D supfTf .x/ W T 2 T .x/g, where T .x/ is the set of open intervals containing x: Show that hf is measurable. [Hint: check that for all r > 0 the set Hr WD fhf R> rg is open as for all x 2 Hr there exists some T 2 T .x/ such that r.T/ T fd.] Given a Borel measure on R finite on bounded intervals, show that for every finite family .Ti /i2I of bounded open intervals of R one can find a subfamily .Tj /j2J of disjoint intervals such that .[i2I Ti / 2˙j2J .Tj /: Show that for all r 2 RC one has r.Hr / 2 k f k1 : [Hint: given a compact subset K ofR Hr take a finite covering .Ti /i2I of K by open intervals Ti satisfying r.Ti / Ti fd and apply the preceding question.] Conclude that hf is finite a.e. and belongs to M1 ./: 13. Prove that for all p 2 Œ1; 1 the space Lp .S/ is a lattice and that the map . f ; g/ 7! f _ g WD max. f ; g/ is continuous. 
[Hint: if hn WD fn _ gn ; note that hn D .1=2/.j fn gn j C fn C gn /.] 14. Given p 2 Œ1; 1Œ; a sequence . fn / ! f in Lp .S/; a bounded sequence .gn / in L1 .S/ such that .gn / ! g a.e., prove that . fn gn / ! fg in Lp .S/: [Hint: note that fn gn fg D . fn f /gn C f .gn g/ and that f .gn g/ ! 0 in Lp .S/ by dominated convergence.] 15. Given p 2 Œ1; 1Œ; f 2 Lp .Rd / show that, setting fu .x/ WD f .x u/ for u, x 2 Rd ; one has fu ! f as u ! 0 in Rd : 16. (Riesz convexity theorem) Let .S; S; / be a measure space, let X be a Banach space and let A be a linear map from X into the space L0 .S/ of classes a.e. of -measurable functions on S. Let R be an interval of Œ0; 1 such that for all r 2 R the map A is continuous from X into L1=r .S/. Prove that r 7! ln.kAk1=r / is convex. Compare this result with Exercise 5. [See [106, p.524].]
8.5.2 Nemytskii Maps

Given $p, q \in\, ]1,\infty[$ with $p \ge q$, separable Banach spaces $E$, $F$, a measure space $(S,\mathcal{S},\mu)$, and a map $g : S \times E \to F$, let us consider conditions ensuring that for $u \in L_p(S,E)$ the map $v := g \diamond u := g(\cdot, u(\cdot))$ belongs to $L_q(S,F)$. We assume that $g$ is a Caratheodory map, i.e. for all $e \in E$ the map $g(\cdot, e)$ is measurable and for a.e. $s \in S$ the map $g_s := g(s,\cdot)$ is continuous from $E$ to $F$.

Lemma 8.11 If $g : S \times E \to F$ is a Caratheodory map, then, for all $\mu$-measurable maps $u : S \to E$, the map $g \diamond u := g(\cdot, u(\cdot))$ is $\mu$-measurable.

Proof Let $u : S \to E$ be a $\mu$-measurable map. By definition there is a Cauchy sequence $(u_n)$ of $\mu$-step functions that converges to $u$ a.e., so that $(g \diamond u_n)$ converges to $g \diamond u$ a.e. In view of Corollary 7.1, it suffices to prove that $g \diamond v$ is $\mu$-measurable for each $\mu$-step function $v$. Writing
\[ v = \sum_{i=1}^{k} 1_{A_i} e_i \]
with $e_i \in E$, $A_i \in \mathcal{S}$ with $\mu(A_i) < \infty$, and $A_i \cap A_j = \varnothing$ for $i \ne j$, we see that
\[ (g \diamond v)(s) = \sum_{i=1}^{k} 1_{A_i}(s)\, g(s, e_i) \qquad s \in S, \]
so that $g \diamond v$ is $\mu$-measurable.

We need the growth condition:

(G) there exist $a \in L_q(S,\mathbb{R})$, $b \in L_\infty(S,\mathbb{R})$ and a null set $N$ such that
\[ \|g(s,e)\| \le a(s) + b(s)\,\|e\|^{p/q} \qquad \forall (s,e) \in (S \setminus N) \times E. \]

Theorem 8.23 (Krasnoselskii) Let $p, q \in [1,\infty[$ with $p \ge q$. For a Caratheodory map $g : S \times E \to F$ the following assertions are equivalent:
(a) $g$ satisfies the growth condition (G);
(b) for all $u \in L_p(S,E)$ the map $v := g \diamond u := g(\cdot, u(\cdot))$ belongs to $L_q(S,F)$ and the Nemytskii map $G : L_p(S,E) \to L_q(S,F)$ given by $G(u) = g(\cdot, u(\cdot))$ is bounding, i.e. maps bounded subsets into bounded subsets;
(c) the Nemytskii map $G : L_p(S,E) \to L_q(S,F)$ given by $G(u) = g(\cdot, u(\cdot))$ is well defined and continuous.

Proof We just prove the most useful implications. We use the inequality $|r + s|^q \le 2^{q-1}|r|^q + 2^{q-1}|s|^q$ for all $r, s \in \mathbb{R}$, which we have derived from the convexity of $t \mapsto |t|^q$.
(a)$\Rightarrow$(b) Clearly, for $u \in L_p(S,E)$ the map $v := g(\cdot, u(\cdot))$ is $\mu$-measurable and satisfies for $s \in S$
\[ \|g(s,u(s))\|^q \le \bigl| a(s) + b(s)\,\|u(s)\|^{p/q} \bigr|^q \le 2^{q-1}|a(s)|^q + 2^{q-1}|b(s)|^q \|u(s)\|^p. \]
Integrating, we get
\[ \int_S \|v(s)\|^q \, d\mu(s) \le 2^{q-1}\|a\|_q^q + 2^{q-1}\|b\|_\infty^q \|u\|_p^p < \infty \]
so that $v \in L_q(S,F)$. Moreover, this inequality shows that $G$ maps bounded subsets into bounded subsets.

(a)$\Rightarrow$(c) To prove that $G$ is continuous, we use Proposition 2.19. Given $u \in L_p(S,E)$ and a sequence $(u_n) \to u$ in $L_p(S,E)$, we pick an Abel subsequence which we do not relabel. Then by Theorem 8.22 $(u_n) \to u$ a.e. and there exists some $h \in L_p(S,\mathbb{R})$ such that $\|u_n(s)\| \le h(s)$ a.e. Using assumption (G), in which we may assume that $a$ and $b$ are nonnegative, we get
\begin{align*}
\|g(s,u_n(s)) - g(s,u(s))\|^q &\le 2^{q-1}\|g(s,u_n(s))\|^q + 2^{q-1}\|g(s,u(s))\|^q \\
&\le 2^{q-1}\bigl(2^{q} a(s)^q + 2^{q-1} b(s)^q (\|u_n(s)\|^p + \|u(s)\|^p)\bigr) \\
&\le 2^{2q-1} a(s)^q + 2^{2q-1} b(s)^q h(s)^p.
\end{align*}
Observing that $(g(s,u_n(s))) \to g(s,u(s))$ a.e. and applying the Dominated Convergence Theorem, we obtain that $(g \diamond u_n) \to g \diamond u$ in $L_q(S,F)$.

Corollary 8.10 Let $g : S \times E \to F$ be a Caratheodory map satisfying the growth condition (G) for some $p \ge 1$ and $q = 1$. Then the map $I_g$ given by
\[ I_g(u) := \int_S g(s,u(s))\, d\mu(s) \qquad u \in L_p(S,E) \]
is well defined and continuous from $L_p(S,E)$ into $F$.

Proof The map $I_g$ is just the composition of $G : u \mapsto g \diamond u$ with integration $v \mapsto \int_S v\, d\mu$.

Remark For $p \in [1,\infty[$ and $q = \infty$, and $g$ satisfying for some $a \in L_\infty(S)$ the condition $\|g(s,e)\| \le a(s)$ for all $(s,e) \in S \times E$, clearly one has $G(u) := g \diamond u \in L_\infty(S,F)$ for all $u \in L_p(S,E)$. However, unless $G$ is constant, $G$ is not continuous from $L_p(S,E)$ into $L_\infty(S,F)$ in general. A counterexample can be given whenever for any measurable subset $T$ of $S$ with $\mu(T) > 0$ one can find a sequence $(T_n)$ of measurable subsets of $T$ such that $\mu(\bigcap_n T_n) = 0$ and $\mu(T_n) > 0$ for all $n$, as is the case when $(S,\mathcal{S},\mu)$ is non-atomic. In fact, for $u, v \in L_p(S,E)$ such that $G(u) \ne G(v)$, taking $\varepsilon > 0$ satisfying $\varepsilon < \|G(u) - G(v)\|_\infty$, setting $T := \{s \in S : \|G(u)(s) - G(v)(s)\| \ge \varepsilon\}$, taking an associated sequence $(T_n)$, and setting $u_n := 1_{S \setminus T_n} u + 1_{T_n} v$, we see that $(u_n) \to u$ in $L_p(S,E)$ but since $T_n \subset \{s : \|G(u_n)(s) - G(u)(s)\| \ge \varepsilon\}$, we do not have $(G(u_n)) \to G(u)$ in $L_\infty(S,F)$.

Exercise Let $g : S \times E \to \mathbb{R}$ be a measurable map. Suppose $g$ takes nonnegative values and is lower semicontinuous with respect to its second variable. Using Fatou's Lemma, show that the map $I_g$ given by
\[ I_g(u) := \int_S g(s,u(s))\, d\mu(s) \qquad u \in L_p(S,E) \]
is well defined and lower semicontinuous from $L_p(S,E)$ into $\mathbb{R}_+$.

Exercise Prove the same conclusion when $g$ is lower semicontinuous with respect to its second variable and satisfies the growth condition (G ): there exist $a \in L_1(S,\mathbb{R})$, $b \in L_\infty(S,\mathbb{R})$ and a null set $N$ such that
\[ g(s,e) \ge a(s) - b(s)\,\|e\|^p \qquad \forall (s,e) \in (S \setminus N) \times E. \]
See Sect. 2.2.3.
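The integral estimate behind the implication (a)$\Rightarrow$(b) of Theorem 8.23 can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the text: Lebesgue measure on $[0,1[$ is replaced by a midpoint grid with equal weights (a discrete measure, for which the estimate holds as well), and the functions `a`, `b`, `u` and the map `g` are hypothetical choices, with $g(s,e) = a(s) + b(s)|e|^{p/q}$ meeting the growth condition (G) with equality.

```python
import math

def nemytskii_bound_check(p, q, grid_size=1000):
    """Return (lhs, rhs), where lhs is the discrete integral of |g(s, u(s))|**q and
    rhs = 2**(q-1) * ||a||_q**q + 2**(q-1) * ||b||_inf**q * ||u||_p**p."""
    weight = 1.0 / grid_size
    points = [(i + 0.5) * weight for i in range(grid_size)]

    def a(s):                      # hypothetical element of L_q
        return 1.0 + s

    def b(s):                      # hypothetical bounded coefficient
        return 2.0 + math.cos(s)

    def u(s):                      # unbounded near 0, yet |u|**p is integrable
        return s ** (-1.0 / (2.0 * p))

    def g(s, e):                   # satisfies (G) with equality
        return a(s) + b(s) * abs(e) ** (p / q)

    lhs = sum(abs(g(s, u(s))) ** q * weight for s in points)
    norm_a_q_pow = sum(abs(a(s)) ** q * weight for s in points)
    sup_b = max(abs(b(s)) for s in points)
    norm_u_p_pow = sum(abs(u(s)) ** p * weight for s in points)
    rhs = 2 ** (q - 1) * norm_a_q_pow + 2 ** (q - 1) * sup_b ** q * norm_u_p_pow
    return lhs, rhs

for p, q in [(4, 2), (3, 3), (3, 1)]:
    lhs, rhs = nemytskii_bound_check(p, q)
    print(p, q, lhs <= rhs)
```

The comparison succeeds for each pair because the pointwise inequality $|r+s|^q \le 2^{q-1}|r|^q + 2^{q-1}|s|^q$ is simply summed against the weights.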
Let us turn to differentiability properties. The case of a map of class $D^1$ is easier than the case of a map of class $C^1$. We just treat the case $p = q$, the case $p > q$ being obtained by slight changes from the proof of Theorem 8.25.

Theorem 8.24 Let $p \in [1,\infty[$. Let $g : S \times E \to F$ be a Caratheodory map that is of class $D^1$ with respect to its second variable and such that $D_2 g : S \times E \to L(E,F)$ is a Caratheodory map satisfying for some $b \in L_\infty(S)$ the conditions $a := g(\cdot,0) \in L_p(S,F)$ and, for a.e. $s$, $\|D_2 g(s,\cdot)\| \le b(s)$. Then the Nemytskii map $G : u \mapsto g \diamond u$ is of class $D^1$ on $L_p(S,E)$, hence is Hadamard differentiable, and
\[ DG(u)(v) = D_2 g(\cdot, u(\cdot))\, v(\cdot) \qquad \forall u, v \in L_p(S,E). \]

Proof It follows from the growth assumption on $D_2 g$ and from the Mean Value Theorem that $g$ satisfies condition (G). Since $g$ is of class $D^1$ with respect to its second variable, the map $h : S \times E^3 \to F$ given by
\[ h(s,e,e',e'') := \int_0^1 D_2 g(s, (1-t)e + te')\, e'' \, dt \]
is a Caratheodory map such that $\|h(s,e,e',e'')\| \le b(s)\,\|e''\|$ and for a.e. $s$,
\[ g(s,e) - g(s,e') = h(s,e,e',e-e') \qquad \forall e, e' \in E. \tag{8.14} \]
Taking q D 1 in the preceding theorem, we get that the Nemytskii operator H associated with h maps Lp .S; E3 / continuously into L1 .S; F/: Since for w, x 2 Lp .S; E/ one has G.w/ G.x/ D H.w; x; w x/ with H continuous and H.w; x; / linear, we get from the characterization of Corollary 5.8 that G is of class D1 and that DG.u/.v/ D H.u; u; v/ D D2 g.; u.//v./ for all u; v 2 Lp .S; E/: Remark For p D q 2 Œ1; 1Œ; unless for some null subset N of S for all s 2 SnN the map D2 g.s; / is constant, the Nemytskii map G is not of class C1 since its Hadamard derivative DG.u/ at u 2 Lp .S; E/ is the map D2 g ˘ u 2 L1 .S; L.E; F// considered as a linear subspace of L.Lp .S; E/; Lp .S; F// and u 7! D2 g ˘ u is not continuous in view of the preceding remark. Note that the fact that L1 .S; L.E; F// is isometrically embedded in L.Lp .S; E/; Lp .S; F// requires a measurable selection theorem, so we admit it. Theorem 8.25 Let p; q 2 Œ1; 1Œ with p > q and let r WD pq=. p q/: Let g W S E ! F be a Caratheodory map that is Fréchet differentiable with respect to its second variable and such that D2 g W S E ! L.E; F/ is a Caratheodory map satisfying for some a1 2 Lr .S/ and b1 2 L1 .S/ the conditions g.; 0/ 2 Lq .S; F/ and kD2 g.s; e/k a1 .s/ C b1 .s/ kekp=r
a.e. s; 8e 2 E:
Then the Nemytskii map G W u 7! g ˘ u is Fréchet differentiable from Lp .S; E/ to Lp .S; F/ and DG.u/.v/ D D2 g.; u.//v./
8u; v 2 Lp .S; E/:
Proof It follows from the growth assumption on D2 g; from the relation p=r C 1 D p=q, and from the Mean Value Theorem that g satisfies condition (G) with a D g.; 0/ and b D .q=p/b1. Since g is of class C1 with respect to its second variable, R1 the map h W S E2 ! L.E; F/ given by h.s; e; e0 / WD 0 D2 g.s; .1 t/e C te0 /dt is a Caratheodory map such that for a.e. s, g.s; e/ g.s; e0 / D h.s; e; e0 /.e e0 /
8e; e0 2 E:
(8.15)
Then h induces a Nemytskii operator H W Lp .S; E2 / ! Lr .S; L.E; F// given by H.u; v/ WD h.; u./; v.//: Now, by a result similar to Corollary 8.8 the bilinear evaluation map L.E; F/ E ! F given by .`; e/ 7! `.e/ induces a continuous bilinear map Lr .S; L.E; F// Lp .S; E/ ! Lq .S; F/ given by .z; w/ 7! z:w with .z:w/.s/ WD z.s/:w.s/: Thus, an element z of Lr .S; L.E; F// can be considered as an
element of L.Lp .S; E/; Lq .S; F//. Taking z WD h.; u./; v.// WD H.u; v/; we deduce from (8.15) that G.u/ G.v/ D H.u; v/:.u v/ and considering H.u; v/ as an element of L.Lp .S; E/; Lq .S; F// and observing that .u; v/ 7! H.u; v/ is continuous, we obtain that G is of class C1 in view of Lemma 5.6. Exercise Let S be a compact interval of R and let X be the space of Lipschitzian functions on S endowed with the norm given by kxk WD sups2S jx.s/j C sup.s;t/; s¤t jx.s/x.t/j jstj : For g W S R ! R such that G W x 7! g.; x.// sends X into X and is Lipschitzian, prove that there are a; b 2 X such that g.s; t/ D a.s/t C b.s/:
8.6 Duality and Reflexivity of Lebesgue Spaces

In this section, the measure space $(S,\mathcal{S},\mu)$ being fixed, we simplify the notation $L_p(S,\mathcal{S},\mu,\mathbb{R})$ into $L_p(S)$ and we pass from $\mathcal{L}_p(S)$ to $L_p(S)$ without making the necessary comments about equivalence classes. Let us first establish a geometric property of Lebesgue spaces.

Lemma 8.12 (Clarkson) For $p \in [2,+\infty[$ and for all $f, g \in L_p(S)$ one has
\[ \Bigl\| \tfrac12 (f+g) \Bigr\|_p^p + \Bigl\| \tfrac12 (f-g) \Bigr\|_p^p \le \tfrac12 \|f\|_p^p + \tfrac12 \|g\|_p^p. \tag{8.16} \]
For $p \in\, ]1,2]$, $q := (1 - 1/p)^{-1}$, and for all $f, g \in L_p(S)$ one has
\[ \Bigl\| \tfrac12 (f+g) \Bigr\|_p^q + \Bigl\| \tfrac12 (f-g) \Bigr\|_p^q \le \Bigl( \tfrac12 \|f\|_p^p + \tfrac12 \|g\|_p^p \Bigr)^{q/p}. \tag{8.17} \]
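Inequality (8.16) can be tested on a finite measure space, where functions are finite lists and the measure is a list of positive weights. The sketch below is a numerical illustration only; the helper names, the seed and the sample sizes are arbitrary choices made here. It computes the difference between the right-hand and left-hand sides of (8.16), which should be nonnegative for $p \ge 2$ and vanish for $p = 2$ (parallelogram identity).

```python
import random

def norm_pow(values, weights, p):
    """p-th power of the weighted L_p norm: sum of w_i * |v_i|**p."""
    return sum(w * abs(v) ** p for v, w in zip(values, weights))

def clarkson_gap(f, g, weights, p):
    """Right-hand side minus left-hand side of (8.16)."""
    half_sum = [(x + y) / 2 for x, y in zip(f, g)]
    half_diff = [(x - y) / 2 for x, y in zip(f, g)]
    lhs = norm_pow(half_sum, weights, p) + norm_pow(half_diff, weights, p)
    rhs = 0.5 * norm_pow(f, weights, p) + 0.5 * norm_pow(g, weights, p)
    return rhs - lhs

rng = random.Random(0)                      # fixed seed for reproducibility
weights = [rng.uniform(0.1, 1.0) for _ in range(50)]
f = [rng.uniform(-3.0, 3.0) for _ in range(50)]
g = [rng.uniform(-3.0, 3.0) for _ in range(50)]
gaps = {p: clarkson_gap(f, g, weights, p) for p in (2, 3, 4.5, 10)}
print(gaps)
```

The restriction $p \ge 2$ matters: on a one-point space with $f = 1$, $g = 0$ and $p = 1$ the left-hand side of (8.16) is $1$ and the right-hand side is $1/2$.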
Proof For $p \in [2,+\infty[$ and $a \in \mathbb{R}_+$, $b \in \mathbb{P}$, setting $t := a/b$ and using the fact that $h(t) := (t^2+1)^{p/2} - t^p \ge h(0) = 1$ for $t \in \mathbb{R}_+$ since $h' \ge 0$, we have $a^p + b^p \le (a^2 + b^2)^{p/2}$. This relation is still valid if $b = 0$. On the other hand, given $r, s \in \mathbb{R}_+$, the convexity of $t \mapsto |t|^{p/2}$ yields the relation
\[ \Bigl( \bigl| \tfrac12 (r+s) \bigr|^2 + \bigl| \tfrac12 (r-s) \bigr|^2 \Bigr)^{p/2} = \Bigl( \tfrac12 r^2 + \tfrac12 s^2 \Bigr)^{p/2} \le \tfrac12 r^p + \tfrac12 s^p. \]
Taking a WD 12 .r C s/, b WD 12 .r s/, relation (8.16) follows from the properties of the integral. Relation (8.17) is more delicate. We refer the reader to [97, 159]. Theorem 8.26 Given a coupling function c W E F ! R between two separable 1 RBanach spaces and p 2 Œ1; C1Œ; q WD .1 1=p/ , the mapping c W . f ; g/ 7! S c. f .s/; g.s//d.s/ is a coupling between Lp .S; E/ and Lq .S; F/: If c is a metric coupling between two Banach spaces, then c is a metric coupling. Proof Since c W E F ! R is bilinear and continuous, for f 2 Lp .S; E/ and g 2 Lq .S; F/ the function h WD c ı . f ; g/ W s 7! c. f .s/; g.s// D h f .s/; g.s/i is -measurable by Proposition 7.1. Moreover, Hölder’s inequality (8.11) yields Z
Z
jc. f ; g/j
k f k : kgk d kck k f kp : kgkq :
jhj d kck S
(8.18)
S
Let g 2 Lq .S; F/ be such that c. f ; g/ D 0 for all f 2RLp .S; E/: Then, for all A 2 S with finite positive measure and all e 2 E we have A c.e; g.s//d.s/ D 0: Then, by Corollary 7.14, for all e 2 E we get c.e; g.s// D 0 a.e. Taking a countable dense subset of E and using the fact that c is a coupling we get g D 0 a.e. A similar implication holds for f : Thus c is a coupling function between X WD Lp .S; E/ and Y WD Lq .S; F/: Now let us suppose c is a metric coupling. Given a -step function f D ˙i2I 1Ai ai as above with k f kp ¤ 0, for each i 2 I we pick a sequence .bi;n /n0 in the unit sphere SY of Y such that .c.ai ; bi;n //n ! kai k and we set gn WD ˙i2I kai kp1 bi;n 1Ai : Then, since q. p 1/ D p; we have kgn kqq D
X
kai kq. p1/ kbi;n kq .Ai / D
i2I
c. f ; gn / D
Z X
X
kai kp .Ai / D k f kpp ;
i2I
kai kp1 c.ai ; bi;n /1Ai d !
S i2I
X
kai kp .Ai / D k f kpp :
i2I
It follows that kgn kq D k f kp=q p and sup n
k f kpp c. f ; gn / D k f kp : kgn kq k f kp=q p
Then kc. f ; /kLq .S;F/ k f kp and since the reverse inequality follows from (8.18), we get kc. f ; /kLq .S;F/ D k f kp : Since both sides of this equality are continuous functions of f and since the space St.; E/ is dense in Lp .S; E/ with respect to the norm kkp , this equality holds for all f 2 Lp .S; E/: Similarly, one can show that kc.; g/kLp .S;E/ D kgkq for all g 2 Lq .S; F/: Thus c is a metric coupling.
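In the scalar case $E = F = \mathbb{R}$ with $c(r,s) = rs$ and $p > 1$, a single function already attains the supremum in the preceding argument, namely $g := |f|^{p-2} f$. The following sketch checks the two identities $\int_S fg\,d\mu = \|f\|_p^p$ and $\|g\|_q = \|f\|_p^{p-1}$ on a hypothetical four-point measure space; the weights and the values of $f$ are arbitrary illustrative data.

```python
import math

def norming_function(f, p):
    """g := |f|**(p-2) * f, written as sign(f) * |f|**(p-1) so that zeros of f are harmless."""
    return [math.copysign(abs(x) ** (p - 1), x) for x in f]

def norm(values, weights, r):
    """Weighted L_r norm on a finite measure space."""
    return sum(w * abs(v) ** r for v, w in zip(values, weights)) ** (1.0 / r)

p = 3.0
q = 1.0 / (1.0 - 1.0 / p)           # conjugate exponent
weights = [0.5, 0.25, 2.0, 1.0]     # measures of the four points
f = [1.0, -2.0, 0.0, 0.5]
g = norming_function(f, p)

pairing = sum(w * x * y for x, y, w in zip(f, g, weights))
print(pairing, norm(f, weights, p) ** p)            # both equal the integral of |f|**p
print(norm(g, weights, q), norm(f, weights, p) ** (p - 1))
```

Dividing the first identity by the second gives $\int_S fg\,d\mu / \|g\|_q = \|f\|_p$, which is the equality case of Hölder's inequality.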
Exercise Give a simplified proof of the last assertion of the theorem in the case R E D F D R. [Hint: given f 2 Lp .S/ take g WD j f jp2 f and note that S fgd D k f kpp and kgkq D k f kpp1 .] Theorem 8.27 (Clarkson) For p 21; C1Œ the space Lp .S/ is uniformly convex, hence reflexive. Proof Let us first consider the case p 2 Œ2; C1Œ: By Clarkson’s inequality (8.16) for f ; g 2 Lp .S/ satisfying k f kp 1, kgkp 1, k f gkp > " we have k. f C g/=2kpp < 1 ."=2/p: This shows that p W t 7! 1 .1 .t=2/p /1=p is a gage of convexity of .Lp .S/; kkp /: When p 21; 2 we use the second Clarkson’s inequality (8.17) showing that for f ; g 2 Lp .S/ satisfying k f kp 1, kgkp 1, k f gkp > " we have k. f C g/=2kqp < 1 ."=2/q . Thus, p given by p .t/ WD 1 .1 .t=2/q /1=q is a gage of convexity of .Lp .S/; kkp /: The reflexivity of Lp .S/ then follows from Theorem 3.19. Remark If E is a Banach space of type p in the sense that there exists some c > 0 such that 8u; v 2 E
ku C vkp C ku vkp 2 kukp C c kvkp
then Lp .S; E/ is of type p: one has 8f ; g 2 Lp .S; E/
k f C gkpp C k f gkpp 2 k f kpp C c kgkpp :
It can be shown that any reflexive Banach space can be endowed with an equivalent norm that is of type p: Remark If S is an open subset of Rd and is the restriction of the Lebesgue measure , then the space L1 .S/ is not reflexive (see Exercise 1). It is important to identify the dual of Lp .S/: Theorem 8.28 (Riesz) For p 21; C1Œ, q WD .1 1=p/1 , Y WD Lq .S/ Rcan be identified with the dual of X WD Lp .S/ via the map cY given by cY .g/. f / D S fgd for f 2 Lp .S/, g 2 Lq .S/: When .S; S; / is -finite, Y WD L1 .S/ can be identified with the dual of L1 .S/ via the map cY : Proof For p 21; C1Œ this is a consequence in Proposition 3.23 and Theorem 8.26. Let us consider the case p D 1: We start with the additional assumption that .S/ < C1: We already know that the map cY is an isometry from L1 .S/ onto its image in the dual of L1 .S/: Hölder’s inequality ensures that for p 21; C1Œ the space Lp .S/ is contained in L1 .S/ and the canonical injection jp W Lp .S/ ! L1 .S/ is continuous since for q WD .11=p/1 one has k f k1 k1S kq k f kp D ..S//1=q k f kp for all f 2 Lp .S/: Given R ` 2 .L1 .S// ; by the preceding case we can find gq2 Lq .S/ such that .`ıjp /. f / D S fgq d for all f 2 Lp .S/I moreover gq is unique and gq q D ` ı jp c1=q k`k with c WD .S/: For r > p and s WD .1 1=r/1 , denoting by
$j_{q,s} : L_q(S) \to L_s(S)$ the canonical injection of Corollary 8.9, by uniqueness we get that $j_{q,s}(g_q) = g_s$. Taking a representant $g$ of the common class of $g_q$ and $g_s$, let us show that $g \in L_\infty(S)$ and $\|g\|_\infty \le \|\ell\|$. Given $b > \|\ell\|$, let $T := \{t \in S : |g(t)| \ge b\}$. If $a := \mu(T)$ is positive, we have $\|g\|_q \ge a^{1/q} b$. Since $a^{1/q} b \to b$ and $c^{1/q}\|\ell\| \to \|\ell\| < b$ as $q \to \infty$, for $q$ large enough we have $a^{1/q} b > c^{1/q}\|\ell\|$, which is impossible since $\|g_q\|_q \le c^{1/q}\|\ell\|$. Thus $\mu(T) = 0$, $g \in L_\infty(S)$ and $\ell(f) = \int_S fg\, d\mu$ for all $p > 1$ and all $f \in L_p(S)$. Now, given $f \in L_1(S)$ and $k \in \mathbb{N}$, setting $f_k(s) = f(s)$ whenever $|f(s)| \le k$ and $f_k(s) = 0$ otherwise, we see that $f_k \in L_p(S)$, that $|f_k g| \le |fg|$ for all $k \in \mathbb{N}$ with $fg \in L_1(S)$, and that $(f_k g) \to fg$ a.e., so that, by the Dominated Convergence Theorem, we have $(f_k g) \to fg$ in $L_1(S)$. Thus
\[ \ell(f) = \lim_k \ell(f_k) = \lim_k \int_S f_k g \, d\mu = \int_S fg \, d\mu \]
and $\ell = c_Y(g)$ since $f$ is arbitrary in $L_1(S)$.

Now let us consider the case when $(S,\mathcal{S},\mu)$ is $\sigma$-finite. Let $(S_n)_n$ be a partition of $S$ into sets of finite measure. Given $\ell \in (L_1(S))^*$, for any $n \in \mathbb{N}$ we can find some $g_n \in L_\infty(S_n)$ such that for all $h \in L_1(S_n)$ we have $\int_{S_n} h g_n = \ell(j_n(h))$, where $j_n(h) \in L_1(S)$ is the extension of $h$ by $0$ on $S \setminus S_n$, and moreover $\|g_n\|_\infty \le \|\ell\|$ since $\|j_n(h)\|_1 = \|h\|_1$. Let $g \in L_\infty(S)$ be such that $g_n = g \mid S_n$. Then, for all $f \in L_1(S)$, observing that for $f_n := f \mid S_n \in L_1(S_n)$ we have $j_n(f_n) = 1_{S_n} f$ and $f = \sum_n 1_{S_n} f$, since $\ell$ is linear and continuous, we get
\[ \ell(f) = \sum_n \ell(1_{S_n} f_n) = \sum_n \int_{S_n} f_n g_n = \sum_n \int_S 1_{S_n} fg = \int_S fg \]
by Corollary 7.11, using the fact that $\sum_n 1_{S_n} fg = fg$, $\sum_n |1_{S_n} fg| = |fg|$, and the fact that an absolutely convergent series is convergent and $\sum_{n \le k} |1_{S_n} fg| \le |fg|$.

The following general equivalence result is outside the scope of this book. We quote it from [99, p. 98] in order to give perspective. We shall just prove an easier result.

Theorem 8.29 Let $(S,\mathcal{S},\mu)$ be a finite measure space, $p \in [1,\infty[$, $q \in\, ]1,\infty]$ with $1/p + 1/q = 1$, and let $E$ be a Banach space. Then $L_p(\mu,E)^* = L_q(\mu,E^*)$ if and only if $E^*$ has the Radon-Nikodým property.

Note that the surjectivity of the map $c_Y : L_\infty(S) \to L_1(S)^*$ when $(S,\mathcal{S},\mu)$ is finite can be rephrased by saying that any continuous linear form $\ell : L_1(S) \to \mathbb{R}$ is representable. Theorem 8.30 below gives a generalization yielding an alternative proof (due to J. von Neumann) of this statement in view of the equivalence of Proposition 8.4. We start with an analogue of the surjectivity result for $c_Y$ valid for Hilbert spaces.

Proposition 8.12 If $E$ is a Hilbert space, for every $\sigma$-finite measure space $(S,\mathcal{S},\mu)$, $L_\infty(\mu,E)$ can be identified with the dual space of $L_1(\mu,E)$.
Proof Using the arguments of the last proof, it suffices to assume that is finite and to R show that every continuous linear map ` W L1 .; E/ ! R is of the form f 7! h f j gid for some g 2 L1 .; E/. As already observed, by Corollary 8.9 (or the Cauchy-Schwarz inequality) we have a continuous linear injection j of L2 .; E/ into L1 .; E/ since Z
j f j d k f k2 k1S k2 D .S/1=2 k f k2
for all f 2 L2 .; S/:
Thus ` ı j is a continuous linear form on the Hilbert space L2 .; E/: The Riesz representation theorem in Hilbert spaces provides some g 2 L2 .; E/ such that kgk2 D k` ı jk and Z h f j gid
`. j. f // D
for all f 2 L2 .; E/:
(8.19)
S
In particular, for all A 2 S, e 2 E one has 1A e 2 L2 .; E/ and j.1A e/ 2 L1 .; E/, hence ˇZ ˇ ˇZ ˇ ˇ ˇ ˇ ˇ ˇ h e j gidˇ D ˇ h1A e j gidˇ D j`. j.1A e//j k`k kek .A/: ˇ ˇ ˇ ˇ A
S
For all e 2 E Corollary 7.13 ensures that jhe j gij k`k kek a.e. Using the fact that for a representant g0 of g the set g0 .S/ is contained in a separable subspace of E, we get that kgk k`k a.e. so that g 2 L1 .; E/: Since L2 .; E/ (and R even St.; E/) is dense in L1 .; E/; relation (8.19) can be extended into `. f / D S h f j gid for all f 2 L1 .; E/: Theorem 8.30 Hilbert spaces satisfy the Radon-Nikodým Property: if E is a Hilbert space, if .S; S; / is a finite measure space, then for any -continuous vectorial measure W S ! ERof bounded variation, there exists an h 2 L1 .; E/ such that D h , i.e. .A/ D A hd for all A 2 S. Proof Let W S ! E be a -continuous vectorial measure on the finite measure space .S; S; / with total variation j j : Since j j is -continuous, it has a RadonNikodým derivative k 2 L1 ./ with respect to W for all ' 2 L1 .j j/ one has Z
Z 'd j j D
'kd:
(8.20)
Now, by Theorem 8.1, one can associate to the scalar product b D h j i W E RE ! R a unique continuous linear map bO W L1 .j j ; E/ ! R denoted ˇ ˇ by f 7! S fd ˇ O A e/ D he j .A/i for all A 2 S, e 2 E and ˇb. O f /ˇˇ k f k1 for all such that b.1 f 2 L1 .j j R ; E/: The preceding proposition yields a unique g 2 L1 .j j ; E/ such that O f / D h f j gid j j for all f 2 L1 .j j ; E/: In particular, for all A 2 S, e 2 E we b.
have O A e/ D he j .A/i D b.1
Z
Z h1A e j gid j j D he j S
gd j ji: A
Setting h WD kg 2 RL1 .; E/ and taking ' WD h f j gi in relation R(8.20) we get he j .A/i D he j A hdi for all A 2 S, e 2 E, hence .A/ D A hd for all A 2 S. Now let us turn to approximation properties of Lebesgue spaces. Given p 2 Œ1; C1Œ and an open subset ˝ of Rd (with d 2 Nnf0g) equipped with the measure induced by the Lebesgue measure d , we want to show that the elements of Lp .˝/ can be approximated by some simple functions. For this purpose, let us introduce the space L1;loc .˝/ of locally integrable functions on ˝, that is, the space of -measurable functions such that for every compact subset K of ˝ one has 1K f 2 L1 .˝; /. It is easy to see that L1;loc .˝/ is a linear space. It contains Lp .˝/ R for all p 2 Œ1; C1 since for f 2 Lp .˝/ and a compact subset K of ˝ one has j1K f j d .K/1=q k f kp with q WD .1 1=p/1 : We start with a preliminary result of interest. RLemma 8.13 A function f 2 L1;loc .˝/ is null a.e. if and only if it is such that fg D 0 for all g in the space Cc .˝/ of continuous functions with compact support in ˝: Proof IfRf 2 L1;loc .˝/ is null a.e., obviously for any g 2 Cc .˝/ we R have fg D 0 a.e. and fg D 0: Conversely, suppose f 2 L1;loc .˝/ is such that fg D 0 for all g 2 Cc .˝/: In a first step we assume that f 2 L1 .˝/ and that .˝/ < C1: By Theorem 7.3, given " > 0 we can find h 2 Cc .˝/ such that k f hk1 < ": Let K WD KC [ K with KC WD fx 2 ˝ W h.x/ "g;
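$K_- := \{x \in \Omega : h(x) \le -\varepsilon\}$.

Taking a compact subset $L$ of $\Omega$ whose interior contains $K$, the Tietze-Urysohn Theorem (Theorem 2.8) yields some continuous function $g$ on $\Omega$ such that
\[ g \mid (\Omega \setminus L) = 0, \qquad g \mid K_+ = 1, \qquad g \mid K_- = -1, \qquad \sup_{x \in \Omega} |g(x)| \le 1. \]
Since $|hg|(x) \le |h(x)| \le \varepsilon$ for all $x \in \Omega \setminus K$, we have
\[ \int_{\Omega \setminus K} |hg| \le \int_{\Omega \setminus K} |h| \le \varepsilon\lambda(\Omega \setminus K), \qquad \Bigl| \int_\Omega hg \Bigr| = \Bigl| \int_\Omega (h - f) g \Bigr| \le \|g\|_\infty \|h - f\|_1 \le \varepsilon, \]
\[ \int_K |h| = \int_K hg = \int_\Omega hg - \int_{\Omega \setminus K} hg \le \varepsilon + \Bigl| \int_{\Omega \setminus K} hg \Bigr| \le \varepsilon + \varepsilon\lambda(\Omega \setminus K), \]
so that
\[ \int_\Omega |h| = \int_{\Omega \setminus K} |h| + \int_K |h| \le \varepsilon + 2\varepsilon\lambda(\Omega \setminus K) \le \varepsilon(2\lambda(\Omega) + 1). \]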
Thus $\|f\|_1 \le \|f - h\|_1 + \|h\|_1 \le 2\varepsilon(\lambda(\Omega) + 1)$. Since $\varepsilon$ is arbitrarily small, we get $\|f\|_1 = 0$ and $f = 0$ a.e. In the general case we take a sequence $(\Omega_n)$ of open subsets of $\Omega$ whose closures are compact and whose union is $\Omega$ (for instance $\Omega_n := \{x \in \Omega : \|x\| < n,\ d(x, \mathbb{R}^d \setminus \Omega) > 1/n\}$). Then $f_n := f \mid \Omega_n \in L_1(\Omega_n)$ and $\int_{\Omega_n} f_n g = 0$ for all $g \in C_c(\Omega_n)$, so that $f_n = 0$ a.e. It follows that $f = 0$ a.e.

Theorem 8.31 For $p \in [1,+\infty[$ and for an open subset $\Omega$ of $\mathbb{R}^d$, the space $C_c(\Omega)$ of continuous functions with compact support in $\Omega$ is dense in $L_p(\Omega)$.

Proof In view of Theorem 7.3, it suffices to consider the case $p \in\, ]1,+\infty[$. Since for $q := (1 - 1/p)^{-1}$ the dual of $L_p(\Omega)$ is $L_q(\Omega)$, it also suffices to prove that for all $h \in L_q(\Omega)$ satisfying $\int hg = 0$ for all $g \in C_c(\Omega)$ we have $h = 0$ a.e. Since $h \in L_{1,loc}(\Omega)$, this follows from the preceding lemma.

Let us give a useful application. Given $f \in L_p(\mathbb{R}^d)$ with $p \in [1,+\infty]$ and $w \in \mathbb{R}^d$ we denote by $T_w f$ the function defined by $T_w f := f \circ t_w$ where $t_w : \mathbb{R}^d \to \mathbb{R}^d$ is the translation $x \mapsto x - w$. Since $t_w$ is an isometry, $f \circ t_w$ is $\lambda$-measurable when $f$ is measurable. Moreover, if $f = g$ a.e. then $\{x : T_w f(x) \ne T_w g(x)\} = \{y : f(y) \ne g(y)\} + w$, so that $T_w f = T_w g$ a.e. and we can regard $T_w$ as operating on equivalence classes of functions with respect to equality a.e. We observe that $T_w$ is a linear isometry from $L_p(\mathbb{R}^d)$ onto $L_p(\mathbb{R}^d)$. Moreover, we have a convergence result.

Lemma 8.14 For $p \in [1,+\infty[$ and $f \in L_p(\mathbb{R}^d)$ one has $\|T_w f - f\|_p \to 0$ as $w \to 0$.

Proof Let us first suppose $f \in C_c(\mathbb{R}^d)$, so that $f$ is uniformly continuous: given $\varepsilon > 0$ there exists some $\delta \in\, ]0,1[$ such that $|f(x-w) - f(x)| \le \varepsilon$ when $\|w\| \le \delta$. Denoting by $K$ the support of $f$, for $w \in \delta B_{\mathbb{R}^d}$ we have
\[ \|T_w f - f\|_p^p = \int_{K \cup (K+w)} |f(x-w) - f(x)|^p \, d\lambda_d(x) \le 2\lambda_d(K)\varepsilon^p, \]
so that $\|T_w f - f\|_p \to 0$ as $w \to 0$. Now let us suppose $f \in L_p(\mathbb{R}^d)$. Given $\varepsilon > 0$, since $C_c(\mathbb{R}^d)$ is dense in $L_p(\mathbb{R}^d)$, there exists a $g \in C_c(\mathbb{R}^d)$ such that $\|g - f\|_p < \varepsilon/3$. Then we choose $\delta > 0$ such that $\|T_w g - g\|_p < \varepsilon/3$ whenever $w \in \delta B_{\mathbb{R}^d}$, so that we have $\|T_w f - f\|_p \le \|T_w f - T_w g\|_p + \|T_w g - g\|_p + \|g - f\|_p < \varepsilon$ since $\|T_w f - T_w g\|_p = \|f - g\|_p$ by the change of variables theorem.
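Lemma 8.14 can be observed on a discontinuous function. For $d = 1$ and $f := 1_{[0,1]}$, the difference $T_w f - f$ has absolute value $1$ on two intervals of length $|w|$ when $|w| \le 1$, so that $\|T_w f - f\|_p = (2|w|)^{1/p}$. The sketch below approximates the integral by a midpoint Riemann sum; the integration window $[-2,3]$ and the grid size are assumptions of this illustration.

```python
def indicator_unit_interval(x):
    """f := indicator function of [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def translation_gap(w, p, grid_size=200000):
    """Midpoint Riemann approximation of ||T_w f - f||_p, with (T_w f)(x) = f(x - w)."""
    lower, upper = -2.0, 3.0            # contains the supports of f and T_w f for |w| <= 1
    step = (upper - lower) / grid_size
    total = 0.0
    for i in range(grid_size):
        x = lower + (i + 0.5) * step
        total += abs(indicator_unit_interval(x - w) - indicator_unit_interval(x)) ** p * step
    return total ** (1.0 / p)

for w in (0.4, 0.1, 0.025):
    print(w, translation_gap(w, 2), (2 * w) ** 0.5)
```

The computed values decrease to $0$ with $w$ although $\|T_w f - f\|_\infty = 1$ for every $w \ne 0$, which is consistent with the restriction $p < +\infty$ in the lemma.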
Proposition 8.13 For $p \in [1,+\infty[$ and for an open subset $\Omega$ of $\mathbb{R}^d$, the space $L_p(\Omega)$ is separable.

Proof Let $(K_n)$ be an increasing sequence of compact subsets of $\Omega$ whose interiors cover $\Omega$ (take for instance $K_n := \{x \in \Omega \cap nB_{\mathbb{R}^d} : d(x, \mathbb{R}^d \setminus \Omega) \ge 1/n\}$). Let $\mathcal{L}_n$ be the set of functions on $\Omega$ which are null on $\Omega \setminus K_n$ and whose restrictions to $K_n$ are restrictions of polynomial functions over $\mathbb{Q}$. Then $\mathcal{L} := \bigcup_n \mathcal{L}_n$ is countable. Since $C_c(\Omega)$ is dense in $L_p(\Omega)$, it suffices to prove that for any $g \in C_c(\Omega)$ and any $\varepsilon > 0$ we can find some $h \in \mathcal{L}$ such that $\|h - g\|_p \le \varepsilon$. There is some $m \in \mathbb{N}$ such that the support $K$ of $g$ is contained in the interior of $K_m$. Using Weierstrass' Theorem we can find some $h \in \mathcal{L}_m$ such that $\|(h - g) \mid K_m\|_\infty \le \varepsilon/\lambda_d(K_m)^{1/p}$. Since $g \mid \Omega \setminus K_m = 0$ and $h \mid \Omega \setminus K_m = 0$ we have $\|h - g\|_p \le \varepsilon$.
Exercises 1. Prove that if ˝ is an open subset of Rd ; if S is the Borel -algebra of ˝ and if is the restriction of the Lebesgue measure, then L1 .˝/ is not reflexive. [Hint: Given a 2 ˝ and a sequence .rn / ! 0C with rn < d.a; Rd n˝/; let fn WD cn 1B.a;rn/ with cn WD 1=.B.a; rn //: Assuming L1 .˝/ Ris reflexive one R can find f 2 L1 .˝/ and a subsequence . fk.n/ /n of . fn / such that . ˝ fk.n/ g/ ! ˝ fg for all g 2 L1 .˝/: Taking for g an element of the space Cc .˝a / of continuous functions with compact support in ˝a WD ˝nfag; R by density this entails that f j ˝a D 0; hence that f D 0 a.e., a contradicting ˝ f 1˝ d D 1.] 2. Prove that if ˝ is an open subset of Rd ; if S is the Borel -algebra of ˝ and if is the restriction of the Lebesgue measure, then L1 .˝/ is not separable. [Hint: For all a 2 ˝; given ra > 0; ra < d.a; Rd n˝/; let fa WD 1B.a;ra/ and let Ga be the open ball of .L1 .˝/; kk1 / with center fa and radius 1=2: Then for a ¤ b in ˝ one has Ga \ Gb D ¿ and since the family .Ga /a2˝ is uncountable, L1 .˝/ cannot have a countable base of open sets.] 3. Let .S; S; / and .T; T ; / be two measure spaces with -finite measures. Let K 2 L2 .S T/; S T being endowed with the measure ˝ : Given f 2 L2 .S/, show that for almost every t 2 T the function s 7! K.s; t/f .s/ is in L2 .T/: [Hint: use Fubini’s Theorem to show that for almost every t 2 T one has K.; t/ 2 L2 .S/ and apply the Cauchy-Schwarz inequality to f and K.; t/.] R Prove that g given by g.t/ WD S K.s; t/f .s/d.s/ is in L2 .T/; that A W f 7! g is linear from L2 .S/ to L2 .T/ and that kgk2 kKk2 k f k2 : Under appropriate assumptions on S and T, Hilbert-Schmidt operators between L2 .S/ and L2 .T/ can be represented by operators as above.
8.7 Compactness in Lebesgue Spaces The following notion can be used for convergence criteria; it is also useful for compactness theorems and existence results. In this section, .E; kk/ is a Banach space; for simplicity we also denote by jj the norm of E; observing that the whole study can be reduced to the case E D R.
Definition 8.8 Let $p \in [1, +\infty[$. A subset $F$ of $L_p(S, E)$ is said to be $p$-equi-integrable if it satisfies the following two conditions:
(a) for every $\varepsilon > 0$ there exists a $\delta > 0$ such that for every $T \in \mathcal{S}$ satisfying $\mu(T) < \delta$ one has $\sup_{f \in F} (\int_T |f|^p \, d\mu)^{1/p} \le \varepsilon$;
(b) for every $\varepsilon > 0$ there exists a $B \in \mathcal{S}$ such that $\mu(B) < +\infty$ and $\sup_{f \in F} (\int_{S \setminus B} |f|^p \, d\mu)^{1/p} \le \varepsilon$.

When $\mu(S) < +\infty$, condition (b) (which is omitted by some authors) is trivially satisfied. For $p = 1$ one simply says that $F$ is equi-integrable. Thus $F$ is $p$-equi-integrable if and only if the family $\{\|f\|^p : f \in F\}$ is equi-integrable. Let us give some examples.

Example 1 If $F := \{f\}$ with $f \in L_p(S, \mathbb{R})$, then $F$ is $p$-equi-integrable. Let us verify this. Since $|f|^p$ is integrable, by Proposition 8.2, condition (a) is satisfied. Moreover, since by Lemma 7.2 $|f|^p$ is null off a set of $\sigma$-finite measure $S_f := \bigcup_n S_n$, where $(S_n)$ is an increasing sequence in $\mathcal{S}$ with $\mu(S_n) < \infty$ for all $n$, taking $B := S_n$ with $n$ large enough, we get assertion (b).

Example 2 Let $F \subset L_p(S, E)$ be such that there exists an $h \in L_p(S, \mathbb{R})$ satisfying $|f| \le h$ for all $f \in F$. Then $F$ is $p$-equi-integrable. This easily follows from the preceding example.

Our next example is important, so we state it as a proposition.

Proposition 8.14 Let $(f_n)$ be a convergent sequence in $L_p(S, E)$. Then $F := \{f_n : n \in \mathbb{N}\}$ is $p$-equi-integrable.

Proof Given $\varepsilon > 0$ let $k \in \mathbb{N}$ be such that $\|f_n - f\|_p < \varepsilon/2$ for $n > k$, where $f := \lim_n f_n$. Then, for $n > k$ and for any $T \in \mathcal{S}$ the Minkowski inequality yields
\[
\Big(\int_T |f_n|^p \, d\mu\Big)^{1/p} \le \Big(\int_T |f_n - f|^p \, d\mu\Big)^{1/p} + \Big(\int_T |f|^p \, d\mu\Big)^{1/p} \le \frac{\varepsilon}{2} + \Big(\int_T |f|^p \, d\mu\Big)^{1/p}.
\]
Using Example 1 we can find $\delta > 0$ such that for any $T \in \mathcal{S}$ satisfying $\mu(T) < \delta$ we have $(\int_T |f|^p \, d\mu)^{1/p} < \varepsilon/2$, hence $\sup_{n > k} (\int_T |f_n|^p \, d\mu)^{1/p} \le \varepsilon$. Taking $h := \max_{n \le k} |f_n|$ and using Example 2, we get $\sup_{n \le k} (\int_T |f_n|^p \, d\mu)^{1/p} \le \varepsilon$ if $T \in \mathcal{S}$ satisfies $\mu(T) < \delta$ (with a possibly smaller $\delta$). Thus condition (a) is satisfied. Condition (b) can be established similarly.

We have a converse, provided $(f_n) \to f$ a.e. or $(f_n) \to f$ in measure or converges locally in measure.

Proposition 8.15 (Vitali) For a sequence $(f_n)$ of $L_p(S, E)$ and $f \in L_p(S, E)$ one has $(f_n) \to f$ in $L_p(S, E)$ if and only if $(f_n) \to f$ in measure and the set $F := \{f_n : n \in \mathbb{N}\}$ is $p$-equi-integrable.

Proof If $(\|f_n - f\|_p)_n \to 0$, Proposition 7.8 implies that $(f_n) \to f$ in measure and the preceding proposition shows that the set $F$ is $p$-equi-integrable.
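Conversely, suppose the set $F$ is $p$-equi-integrable and $(f_n) \to f$ in measure. Then $G := F \cup \{f\}$ is $p$-equi-integrable in view of Example 1. Given $\varepsilon > 0$, by condition (a) of the preceding definition there exists a $\delta > 0$ such that for all $T \in \mathcal{S}$ satisfying $\mu(T) < \delta$ one has $\int_T |g|^p \, d\mu \le \varepsilon^p / 2^p$ for all $g \in G$, hence $(\int_T |f_n - f|^p \, d\mu)^{1/p} \le \varepsilon$ for all $n \in \mathbb{N}$ by the Minkowski inequality. Also, by condition (b) there exists a $B \in \mathcal{S}$ such that $\mu(B) < +\infty$ and $(\int_{S \setminus B} |f_n - f|^p \, d\mu)^{1/p} \le \varepsilon$ for all $n \in \mathbb{N}$. Since $(f_n) \to f$ in measure, given $c := \varepsilon / \mu(B)^{1/p}$, setting $T_n := \{s \in B : |f_n(s) - f(s)| > c\}$ we have $\mu(T_n) < \delta$ for $n$ large enough. Then
\[
\int_S |f_n - f|^p \, d\mu = \int_{S \setminus B} |f_n - f|^p \, d\mu + \int_{B \setminus T_n} |f_n - f|^p \, d\mu + \int_{T_n} |f_n - f|^p \, d\mu \le \varepsilon^p + \mu(B \setminus T_n) c^p + \varepsilon^p \le 3 \varepsilon^p
\]
since $\mu(B \setminus T_n) c^p \le \mu(B) c^p \le \varepsilon^p$. This shows that $(\|f_n - f\|_p) \to 0$.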
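To see why equi-integrability cannot be dropped in Vitali's criterion, here is a minimal Python sketch on $S = [0,1]$ with the Lebesgue measure and $p = 1$. The two sequences $f_n := n 1_{[0,1/n]}$ and $g_n := \sqrt{n}\, 1_{[0,1/n]}$ and the helper names are illustrative assumptions, not taken from the text. Both sequences tend to $0$ in measure, but only $(g_n)$ satisfies condition (a) of Definition 8.8 and converges in $L_1$.

```python
def lp_norm_step(height: float, width: float, p: float) -> float:
    """L_p norm on [0, 1] of the step function height * 1_[0, width]."""
    return height * width ** (1.0 / p)


def worst_small_set_norm(height: float, width: float, delta: float, p: float) -> float:
    """Supremum of (integral over T of |f|^p)^(1/p) over sets T of measure at most delta,
    for f = height * 1_[0, width]; it is attained at T = [0, min(width, delta)]."""
    return height * min(width, delta) ** (1.0 / p)


def measure_above(height: float, width: float, c: float) -> float:
    """Lebesgue measure of the set {|f| > c} for f = height * 1_[0, width]."""
    return width if height > c else 0.0


if __name__ == "__main__":
    p, delta, c = 1.0, 0.01, 0.5
    for n in (10, 100, 1000, 10000):
        f_n = (float(n), 1.0 / n)          # f_n = n * 1_[0, 1/n]
        g_n = (float(n) ** 0.5, 1.0 / n)   # g_n = sqrt(n) * 1_[0, 1/n]
        print(
            n,
            measure_above(*f_n, c),                 # tends to 0: convergence in measure
            lp_norm_step(*f_n, p),                  # stays equal to 1: no L_1 convergence
            worst_small_set_norm(*f_n, delta, p),   # reaches 1 once n >= 1/delta
            lp_norm_step(*g_n, p),                  # tends to 0
            worst_small_set_norm(*g_n, delta, p),   # bounded by sqrt(delta) for all n
        )
```

For $(f_n)$ the mass concentrates on sets of arbitrarily small measure, so condition (a) fails, whereas for $(g_n)$ the quantity in condition (a) is at most $\sqrt{\delta}$ uniformly in $n$.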
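The following characterization sheds more light on the notion of $p$-equi-integrability.

Proposition 8.16 (De la Vallée Poussin) Let $p \in [1, +\infty[$ and let $F$ be a subset of $L_p(S, E)$. Among the following assertions one has (a)$\Leftrightarrow$(b)$\Rightarrow$(c). If $F$ is bounded, these assertions are all equivalent:
(a) for some increasing function $\varphi : \mathbb{R}_+ \to \mathbb{R}_+$ such that $\lim_{r \to \infty} \varphi(r)/r = \infty$ one has $m := \sup\{\int_S \varphi(|f|^p) \, d\mu : f \in F\} < \infty$;
(b) $F$ is uniformly $p$-integrable: $\lim_{r \to \infty} \sup\{\int_{\{|f| > r\}} |f|^p \, d\mu : f \in F\} = 0$;
(c) assertion (a) of Definition 8.8 holds: for some function $\delta : \mathbb{P} \to \mathbb{P}$ one has $\sup_{f \in F} (\int_T |f|^p \, d\mu)^{1/p} \le \varepsilon$ whenever $T \in \mathcal{S}$ satisfies $\mu(T) < \delta(\varepsilon)$.

Proof Without loss of generality we may suppose $p = 1$ and $E = \mathbb{R}$.

(a)$\Rightarrow$(b) Since $\lim_{r \to \infty} \varphi(r)/r = \infty$, for every $\varepsilon > 0$ there exists an $r_\varepsilon > 0$ such that $r \ge r_\varepsilon \Longrightarrow \varphi(r) \ge (m/\varepsilon) r$. Then, for $r \ge r_\varepsilon$ and all $f \in F$ we have
\[
\int_{\{|f| > r\}} |f| \, d\mu \le \frac{\varepsilon}{m} \int_{\{|f| > r\}} \varphi(|f|) \, d\mu \le \frac{\varepsilon}{m} \int_S \varphi(|f|) \, d\mu \le \varepsilon.
\]

(b)$\Rightarrow$(a) We choose an increasing sequence $(k(n))_n$ of $\mathbb{N}$ such that, for all $n \in \mathbb{N}$,
\[
a_n := \sup_{f \in F} \int_{\{|f| > k(n)\}} |f| \, d\mu \le 2^{-n}.
\]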
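For $i \in \mathbb{N}$ let $N_i := \{n \in \mathbb{N} : i - 1 \le k(n) < i\}$ and let $b_i$ be the number of elements of $N_i$, $c_i := b_0 + \ldots + b_i$. Note that $\lim_i c_i = +\infty$. Define $\varphi : \mathbb{R}_+ \to \mathbb{R}_+$ by
\[
\varphi(r) := r c_i \qquad \text{for } r \in [i, i+1[.
\]
Then, denoting by $[r]$ the integer part of $r$, so that $[r] \le r < [r] + 1$ and $[r] \in \mathbb{N}$, we have $\varphi(r)/r = c_{[r]} \to +\infty$ as $r \to +\infty$. Using associativity for series of nonnegative numbers and the fact that for $n \in N_j$ we have $k(n) < j$, for all $f \in F$ we get
\[
\int_S \varphi(|f|) \, d\mu = \sum_{i=0}^{\infty} \int_{\{i \le |f| < i+1\}} c_i |f| \, d\mu = \sum_{j=0}^{\infty} b_j \int_{\{|f| \ge j\}} |f| \, d\mu \le \sum_{j=0}^{\infty} \sum_{n \in N_j} \int_{\{|f| > k(n)\}} |f| \, d\mu \le \sum_{n} a_n \le \sum_{n} 2^{-n} < +\infty.
\]

(c)$\Rightarrow$(b) when $F$ is bounded. Let $m := \sup\{\|f\|_1 : f \in F\}$. Given $\varepsilon > 0$, let $\delta(\varepsilon) > 0$ be such that for all $T \in \mathcal{S}$ satisfying $\mu(T) < \delta(\varepsilon)$ we have $\sup\{\int_T |f| \, d\mu : f \in F\} \le \varepsilon$. Let $r_\varepsilon := m/\delta(\varepsilon)$. Then for $r > r_\varepsilon$ and $f \in F$ we have
\[
\mu(\{|f| > r\}) \le \|f\|_1 / r \le m/r < \delta(\varepsilon),
\]
hence
\[
\sup\Big\{\int_{\{|f| > r\}} |f| \, d\mu : f \in F\Big\} \le \varepsilon.
\]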
The next result explains the importance of equi-integrability for existence questions. We just quote it; see [106, 112].

Theorem 8.32 (Dunford-Pettis Criterion) If $p \in [1, +\infty[$ and if $E$ is a reflexive Banach space, the weak closure of a $p$-equi-integrable subset $F$ of $L_p(S, E)$ is weakly compact. For $p > 1$ every bounded subset of $L_p(S, E)$ is weakly relatively compact.

In order to give a compactness criterion for the strong topology in $L_p(S)$, where $S$ is a measurable subset of $\mathbb{R}^d$, let us introduce a notation for the shift $f_u := T_u f$ of a function $f \in L_p(S)$ by $u \in \mathbb{R}^d$ given by $(T_u f)(x) := f(x - u)$ for $x \in S_u := S + u$.

Theorem 8.33 (Fréchet-Kolmogorov) Let $p \in [1, +\infty[$, let $S$ be a bounded measurable subset of $\mathbb{R}^d$ and let $\Omega \subset \mathbb{R}^d$ be a measurable subset of $\mathbb{R}^d$ such that $S + r B_{\mathbb{R}^d} \subset \Omega$ for some $r > 0$. Let $G$ be a bounded subset of $L_p(\Omega)$ satisfying the condition:
\[
\forall \varepsilon > 0 \ \exists \delta \in \,]0, r] : \quad g \in G, \ u \in \delta B_{\mathbb{R}^d} \Longrightarrow \|(T_u g - g) \mid S\|_p < \varepsilon. \tag{8.21}
\]
Then the set $F$ of restrictions to $S$ of the functions $g$ in $G$ is relatively compact in $L_p(S)$.

Proof We observe that for $\delta \in \,]0, r]$ and $u \in \delta B_{\mathbb{R}^d}$, $x \in S$ we have $x - u \in \Omega$ so that, setting $g_u := T_u g$, $g_u \mid S$ is well defined. Shrinking $\Omega$ if necessary, we may assume $\Omega$ is bounded. Moreover, extending the functions in $G$ or $L_p(\Omega)$ by $0$ on $\mathbb{R}^d \setminus \Omega$, we may consider $G$ as a bounded subset of $L_p(\mathbb{R}^d) \cap L_1(\mathbb{R}^d)$. Let $R : L_p(\Omega) \to L_p(S)$ be the restriction map $g \mapsto g \mid S$. For all $f \in F := R(G)$ we pick some $g_f \in G$ such that $R(g_f) = f$. Let $(j_n)$ be a mollifier: $j_n \in C_c^\infty(\mathbb{R}^d)$ has its support in $r_n B_{\mathbb{R}^d}$, where $(r_n) \to 0$ in $]0, r]$ and $\int j_n = 1$ for all $n$. For $f \in F$ and $n \in \mathbb{N}$, writing $|T_u g_f - f| \, j_n = |T_u g_f - f| \, j_n^{1/p} j_n^{1/q}$, using Hölder's inequality, and noting that $\int j_n = 1$, for $x \in S$ we get
\[
|(g_f * j_n)(x) - f(x)| \le \int_{\mathbb{R}^d} |g_f(x - u) j_n(u) - g_f(x) j_n(u)| \, du \le \Big(\int_{\mathbb{R}^d} |g_f(x - u) - g_f(x)|^p \, |j_n(u)| \, du\Big)^{1/p}.
\]
Thus, given $\varepsilon > 0$, taking $\delta > 0$ as in (8.21) and $n \in \mathbb{N}$ such that $r_n \le \delta$, denoting by $B_d$ the closed unit ball of $\mathbb{R}^d$, and using Fubini's Theorem, we get
\[
\int_S |(g_f * j_n)(x) - f(x)|^p \, dx \le \int_S \int_{r_n B_d} |g_f(x - u) - g_f(x)|^p \, |j_n(u)| \, du \, dx = \int_{r_n B_d} |j_n(u)| \int_S |g_f(x - u) - g_f(x)|^p \, dx \, du \le \varepsilon^p,
\]
hence $\|g_f * j_n - f\|_{L_p(S)} \le \varepsilon$. Let us show that the family $H_n := \{(g_f * j_n) \mid \mathrm{cl}(S) : f \in F\}$ satisfies the assumptions of Ascoli's theorem: for all $x, x' \in K := \mathrm{cl}(S)$, $f \in F$ we have
\[
\sup_{f \in F} |(g_f * j_n)(x)| \le \sup_{f \in F} \|j_n\|_\infty \|g_f\|_1 < +\infty,
\]
\[
|(g_f * j_n)(x) - (g_f * j_n)(x')| \le c_n \|g_f\|_1 \|x - x'\|,
\]
where $c_n$ is the Lipschitz constant of $j_n$. Thus, $H_n$ is relatively compact in $C(K)$, hence in $L_p(S)$. Given $\varepsilon > 0$, we can cover $H_n$ by a finite family of balls with radius $\varepsilon$. Since $\|g_f * j_n - f\|_{L_p(S)} \le \varepsilon$ for all $f \in F$, we see that $F$ is covered by a finite number of balls with radius $2\varepsilon$: $F$ is precompact. Since $L_p(S)$ is complete, $F$ is relatively compact.

Corollary 8.11 Let $p \in [1, +\infty[$, let $G$ be a bounded subset of $L_p(\mathbb{R}^d)$ satisfying the condition:
\[
\forall \varepsilon > 0 \ \exists \delta > 0 : \quad g \in G, \ u \in \delta B_{\mathbb{R}^d} \Longrightarrow \|T_u g - g\|_p < \varepsilon. \tag{8.22}
\]
Suppose that for every $\varepsilon > 0$ there exists a bounded measurable subset $S_\varepsilon$ of $\mathbb{R}^d$ such that $\int_{\mathbb{R}^d \setminus S_\varepsilon} |g|^p < \varepsilon^p$ for all $g \in G$. Then $G$ is relatively compact in $L_p(\mathbb{R}^d)$.
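Proof Given $\varepsilon > 0$, in the preceding theorem let us take $\Omega := \mathbb{R}^d$, $S := S_\varepsilon$. Since for all $g \in G$ we have $\|(T_u g - g) \mid S\|_p \le \|T_u g - g\|_p$, condition (8.22) entails condition (8.21) and the set $F_\varepsilon$ of restrictions to $S_\varepsilon$ of the elements of $G$ is precompact. Let $g_1, \ldots, g_n \in G$ be such that $F_\varepsilon$ is covered by the balls $B(f_i, \varepsilon)$ with radius $\varepsilon$ and centers $f_i := g_i \mid S_\varepsilon$. Given $g \in G$ we can find $i \in \mathbb{N}_n$ such that $f := g \mid S_\varepsilon \in B(f_i, \varepsilon)$. Then, since $\int_{\mathbb{R}^d \setminus S_\varepsilon} |g|^p < \varepsilon^p$ and $\int_{\mathbb{R}^d \setminus S_\varepsilon} |g_i|^p < \varepsilon^p$ we have
\[
\|g - g_i\|_p^p = \int_{\mathbb{R}^d \setminus S_\varepsilon} |g - g_i|^p + \int_{S_\varepsilon} |f - f_i|^p \le 2^{p-1} \varepsilon^p + 2^{p-1} \varepsilon^p + \varepsilon^p,
\]
so that $G$ is covered by a finite number of balls of radius $(2^p + 1)^{1/p} \varepsilon$; hence $G$ is precompact, or equivalently relatively compact.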
Exercises

1. Show that condition (b) of Definition 8.8 is not a consequence of condition (a). [Hint: take $S := \mathbb{R}$ with the Lebesgue measure, $f_n := (1/n) 1_{[n, 2n]}$.]

2. Show that the two conditions of Definition 8.8 imply the following assertion: (c) for any decreasing sequence $(S_n)$ of $\mathcal{S}$ satisfying $\mu(\bigcap S_n) = 0$ and for any $\varepsilon > 0$ there exists a $k \in \mathbb{N}$ such that $\sup_{f \in F} (\int_{S_k} |f|^p \, d\mu)^{1/p} < \varepsilon$. Conversely, when $\mu$ is $\sigma$-finite or when $F$ is countable, show that assertion (c) implies $F$ is equi-integrable.

3. Let $(g_n)$, $(h_n)$ be convergent sequences of $L_p(S, \mathbb{R})$. Show that the following set is $p$-equi-integrable: $F := \{f \in L_p(S, \mathbb{R}) : \exists n \in \mathbb{N}, \ g_n \le f \le h_n\}$.

4. Let $F$ be a $p$-equi-integrable subset of $L_p(S, E)$ and let $(f_n)$, $(g_n)$ be two sequences of $F$. Show that $(\|f_n - g_n\|_p) \to 0$ if and only if $(f_n - g_n) \to 0$ in measure.

5*. (DiPerna-Lions) Let $(S, \mathcal{S}, \mu)$ be a measure space with finite measure, let $(f_n) \to f$ weakly in $L_1(S, \mu)$, and let $(g_n)$ be a bounded sequence in $L_\infty(S, \mu)$ such that $(g_n) \to g$ a.e. Prove that $(f_n g_n) \to fg$ weakly in $L_1(S, \mu)$. [Hint: Use Egorov's theorem.]

6*. (Chacon's Biting lemma) Let $(S, \mathcal{S}, \mu)$ be a measure space with finite measure and let $(f_n)$ be a bounded sequence in $L_1(S, \mu)$. Prove that there exist a subsequence $(f_{k(n)})$ of $(f_n)$ and $f \in L_1(S, \mu)$ such that for every $\varepsilon > 0$ there exists some $T \in \mathcal{S}$ such that $\mu(S \setminus T) < \varepsilon$ and $(f_{k(n)})_n \to f$ weakly in $L_1(T, \mu_T)$. [See: [55, 124, p. 184].]

7. Given a measure space $(S, \mathcal{S}, \mu)$ with finite measure, prove that a subset $F$ of $L_1(S)$ is uniformly integrable if and only if $F$ is bounded and equi-integrable.

8. Given a measure space $(S, \mathcal{S}, \mu)$ with finite measure, prove that a sequence $(f_n)$ of $L_1(S)$ converges to some $f \in L_1(S)$ if and only if it converges to $f$ in measure and $\{f_n : n \in \mathbb{N}\}$ is uniformly integrable.
8.8 Convolution and Regularization

Given a (Lebesgue) measurable function $f : \mathbb{R}^d \to \mathbb{R}$ and $w \in \mathbb{R}^d$ we denote by $f_w$ the function obtained by composing the translation $T_w : f \mapsto f(\cdot - w)$ with the symmetry $S : f \mapsto \tilde{f}$ defined by $\tilde{f}(x) := f(-x)$ for $x \in \mathbb{R}^d$:
\[
f_w(x) := f(w - x) \qquad x \in \mathbb{R}^d.
\]
Clearly, $f_w$ is measurable and $f_w \in L_p(\mathbb{R}^d)$ if $f \in L_p(\mathbb{R}^d)$ with $p \in [1, +\infty]$. If $f$ and $g$ are equal a.e., then $f_w$ and $g_w$ are equal a.e., so that we regard $f_w$ as an element of $L_p(\mathbb{R}^d)$ if $f \in L_p(\mathbb{R}^d)$. Here $\mathbb{R}^d$ is equipped with the Lebesgue measure $\lambda_d$, but we write $\int f(x) \, dx$ rather than $\int_{\mathbb{R}^d} f(x) \, d\lambda_d(x)$.
Given two nonnegative measurable functions $f$, $g$ on $\mathbb{R}^d$ we define their (integral) convolution $h := f * g$ by
\[
(f * g)(w) := \int_{\mathbb{R}^d} f(w - x) g(x) \, dx. \tag{8.23}
\]
Such a definition is justified (but one may have $(f * g)(w) = +\infty$) because the function $x \mapsto f(w - x) g(x)$ is measurable. In fact, it is easy to see that $(w, x) \mapsto f(w - x) g(x)$ is measurable. Then the Fubini-Tonelli Theorem shows that the function $w \mapsto \int f(w - x) g(x) \, dx$ is measurable and
\[
\int\!\!\int f(w - x) g(x) \, dx \, dw = \int\!\!\int f(w - x) g(x) \, dw \, dx = \int g(x) \Big(\int f(w - x) \, dw\Big) dx = \int g(x) \, dx \int f(w) \, dw
\]
or
\[
\int (f * g)(w) \, d\lambda_d(w) = \Big(\int f \, d\lambda_d\Big) \Big(\int g \, d\lambda_d\Big). \tag{8.24}
\]
Given $f$, $g \in L_1(\mathbb{R}^d)$ and $w \in \mathbb{R}^d$ the function $f_w g$ is integrable if and only if $\int_{\mathbb{R}^d} |f(w - x) g(x)| \, dx < +\infty$, if and only if one has $(|f| * |g|)(w) < +\infty$. Then one defines $(f * g)(w)$ by relation (8.23). One easily sees that $(f * g)(w)$ is independent of the choice of $f$ and $g$ in their equivalence classes with respect to a.e. equality. Moreover, one has $(f * g)(w) = (g * f)(w)$ but the operation is not associative (Exercise 1).

In order to present an existence criterion for $f * g$ on the whole of $\mathbb{R}^d$, let us make precise the notion of the support (or essential support) $\mathrm{supp}\, f$ of a measurable function $f$ defined up to a null set $N$. It is the complement of the greatest open subset $O$ of $\mathbb{R}^d$ such that $f = 0$ a.e. on $O$. Such an open set exists: choosing a countable base $\mathcal{G}$ of open subsets of $\mathbb{R}^d$, $O$ is the union of the family $\mathcal{G}_f$ whose members are the members $G$ of $\mathcal{G}$ such that $f = 0$ a.e. on $G$. This set does not depend on the choice of the base $\mathcal{G}$. It is also independent of the choice of a representative of $f$ in its equivalence class with respect to a.e. equality. If $f$ is continuous, this notion of support coincides with the usual one: the support of $f$ is then the closure of the set of points at which $f$ is non-null.
Lemma 8.15 Let $f$ and $g$ be two measurable functions such that $f * g$ is defined almost everywhere. Then $\mathrm{supp}\,(f * g) \subset \mathrm{cl}(\mathrm{supp}\, f + \mathrm{supp}\, g)$.

Proof Let $F := \mathrm{supp}\, f$, $G := \mathrm{supp}\, g$ and let $U := \mathbb{R}^d \setminus \mathrm{cl}(F + G)$. Let $N$ be the (null) set of points $w \in \mathbb{R}^d$ such that $f_w g : x \mapsto f(w - x) g(x)$ is not integrable. If $w \in U \setminus N$ and $x \in \mathbb{R}^d$ either we have $x \notin G$ (and then $g(x) = 0$) or $x \in G$ and then $w - x \notin F$ since $(w - x) + x \notin (F + G)$; in both cases we have $f(w - x) g(x) = 0$. Since $f_w g = 0$ for $w \in U \setminus N$, we have $(f * g)(w) = \int f_w g = 0$ for $w \in U \setminus N$, hence $\mathrm{supp}\,(f * g) \subset \mathbb{R}^d \setminus U = \mathrm{cl}(F + G)$.
Proposition 8.17 Let $f \in L_{1,\mathrm{loc}}(\mathbb{R}^d)$ and let $g \in L_\infty(\mathbb{R}^d)$ with compact support. Then $(f * g)(w)$ is defined for all $w \in \mathbb{R}^d$. The same conclusion holds when $f \in L_1(\mathbb{R}^d)$ and $g \in L_\infty(\mathbb{R}^d)$ and then $f * g \in L_\infty(\mathbb{R}^d)$ with $\|f * g\|_\infty \le \|f\|_1 \|g\|_\infty$.

Proof Let $K$ be the support of $g$ and let $w \in \mathbb{R}^d$. Then $w - K$ is compact and
\[
(|f| * |g|)(w) = \int_K |f(w - x)| \, |g(x)| \, dx \le \|g\|_\infty \int_K |f(w - x)| \, dx = \|g\|_\infty \int_{w - K} |f(u)| \, du < +\infty,
\]
so that the function $f_w g$ is integrable. For the second assertion, in the preceding inequalities one replaces $K$ with $\mathbb{R}^d \setminus N$, where $N$ is the null set $N := \{x \in \mathbb{R}^d : |g(x)| > \|g\|_\infty\}$ and one gets the estimate $\|f * g\|_\infty \le \|f\|_1 \|g\|_\infty$.

Another result asserting that the convolution is well defined is as follows.

Theorem 8.34 Let $p \in [1, +\infty]$, $f \in L_1(\mathbb{R}^d)$, and $g \in L_p(\mathbb{R}^d)$. Then, for almost every $w \in \mathbb{R}^d$ the function $f_w g$ is integrable and for $h := f * g$ defined by relation (8.23), i.e. $h(w) := \int f_w g$, one has $h \in L_p(\mathbb{R}^d)$ and
\[
\|f * g\|_p \le \|f\|_1 \|g\|_p. \tag{8.25}
\]
Proof We take functions rather than classes since $f * g$ depends only on the classes of $f$ and $g$. The case $p = +\infty$ is treated by Proposition 8.17. Let us first consider the case $p = 1$. Setting $k(w, x) := f_w(x) g(x) := f(w - x) g(x)$, for $x \in \mathbb{R}^d$ we have
\[
\int |k(w, x)| \, dw = |g(x)| \int |f(w - x)| \, dw = |g(x)| \, \|f\|_1 < +\infty,
\]
hence $k \in L_1(\mathbb{R}^d \times \mathbb{R}^d)$ by Corollary 7.17. Then Fubini's Theorem yields
\[
\int dw \int |k(w, x)| \, dx = \int dx \int |k(w, x)| \, dw \le \|f\|_1 \|g\|_1,
\]
so that for almost every $w \in \mathbb{R}^d$ one has $f_w g \in L_1(\mathbb{R}^d)$ and inequality (8.25) holds for $p = 1$.

Now let $p \in \,]1, \infty[$, $q := (1 - 1/p)^{-1}$, and let $f \in L_1(\mathbb{R}^d)$, $g \in L_p(\mathbb{R}^d)$. By the preceding, $|f_w| \, |g|^p \in L_1(\mathbb{R}^d)$ for almost every $w \in \mathbb{R}^d$. Thus $|f_w|^{1/p} |g| \in L_p(\mathbb{R}^d)$, $|f_w|^{1/q} \in L_q(\mathbb{R}^d)$, and since $|f_w g| = |f_w|^{1/q} (|f_w|^{1/p} |g|)$, Hölder's inequality yields
\[
\int |f_w(x)| \, |g(x)| \, dx \le \big\| |f_w|^{1/q} \big\|_q \, \big\| |f_w|^{1/p} |g| \big\|_p = \|f_w\|_1^{1/q} \, \big\| f_w |g|^p \big\|_1^{1/p} < +\infty.
\]
Using the relations $\| f_w |g|^p \|_1 = \int |f_w| \cdot |g|^p = (|f| * |g|^p)(w)$ and $\|f_w\|_1 = \|f\|_1$ we get
\[
|h(w)|^p \le \Big(\int |f_w(x)| \, |g(x)| \, dx\Big)^p \le \|f_w\|_1^{p/q} \, \big\| f_w |g|^p \big\|_1 = \|f\|_1^{p/q} \cdot (|f| * |g|^p)(w).
\]
The relation $\int (|f| * |g|^p)(w) \, dw \le \|f\|_1 \, \| |g|^p \|_1$ obtained in the case $p = 1$ yields
\[
\|f * g\|_p^p = \int |h(w)|^p \, dw \le \|f\|_1^{p/q} \int (|f| * |g|^p)(w) \, dw \le \|f\|_1^{p/q} \, \|f\|_1 \, \|g\|_p^p.
\]
Since $p/q + 1 = p$, inequality (8.25) ensues.
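Inequality (8.25) can be checked numerically on a grid. The following Python sketch is only an illustration under stated assumptions: the test functions $f = 1_{[0,1[}$ and $g(x) = e^{-|x|}$ on $[-2, 2[$, the step `dx`, and all helper names are hypothetical choices, and integrals are replaced by Riemann sums. For such sums the inequality holds exactly (it is the discrete form of the same inequality), so the check does not depend on the quadrature error.

```python
import math


def lp_norm(values, dx, p):
    """Riemann-sum approximation of the L_p norm of a sampled function."""
    return (sum(abs(v) ** p for v in values) * dx) ** (1.0 / p)


def convolve(f_vals, g_vals, dx):
    """Riemann-sum approximation of f * g, sampled on the grid of sums of sample points."""
    out = [0.0] * (len(f_vals) + len(g_vals) - 1)
    for i, fv in enumerate(f_vals):
        for j, gv in enumerate(g_vals):
            out[i + j] += fv * gv * dx
    return out


def young_sides(p, dx=0.01):
    """Return (||f * g||_p, ||f||_1 ||g||_p) for the sample functions described above."""
    f_vals = [1.0] * 100                                             # f = 1_[0,1[
    g_vals = [math.exp(-abs(-2.0 + j * dx)) for j in range(400)]     # g on [-2, 2[
    lhs = lp_norm(convolve(f_vals, g_vals, dx), dx, p)
    rhs = lp_norm(f_vals, dx, 1.0) * lp_norm(g_vals, dx, p)
    return lhs, rhs


if __name__ == "__main__":
    for p in (1.0, 2.0, 3.0):
        lhs, rhs = young_sides(p)
        print(p, lhs, rhs, lhs <= rhs * (1.0 + 1e-9))
```

For $p = 1$ and nonnegative functions the two sides coincide up to rounding, in agreement with relation (8.24).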
Now let us start to describe some regularizing effects of convolution.

Theorem 8.35 Let $p$, $q \in [1, +\infty]$ with $1/p + 1/q = 1$, $f \in L_p(\mathbb{R}^d)$, and $g \in L_q(\mathbb{R}^d)$. Then, for all $w \in \mathbb{R}^d$ the function $f_w g$ is integrable and the function $h := f * g$ defined by relation (8.23), i.e. $h(w) := \int f_w g$, is uniformly continuous and bounded and one has
\[
\|f * g\|_\infty \le \|f\|_p \|g\|_q. \tag{8.26}
\]
If, moreover, $p$, $q \in \,]1, +\infty[$ one has $(f * g)(w) \to 0$ as $\|w\| \to +\infty$.

Proof For all $w \in \mathbb{R}^d$ Hölder's inequality and a change of variables yield
\[
(|f| * |g|)(w) = \int |f_w| \, |g| \, d\lambda_d \le \|f_w\|_p \|g\|_q = \|f\|_p \|g\|_q.
\]
Since $p$ or $q$ is finite, we may suppose $p$ is finite. Then, for all $u$, $w$, $x$ we have $f_{w-u}(x) = f(w - u - x) = (T_u f)(w - x) = (T_u f)_w(x)$, so that, by the preceding inequality,
\[
|(f * g)(w - u) - (f * g)(w)| \le \int |((T_u f)_w - f_w) g| \, d\lambda_d \le \|T_u f - f\|_p \|g\|_q.
\]
Since $\|T_u f - f\|_p \to 0$ as $u \to 0$, this proves that $f * g$ is uniformly continuous.

Now let us prove the second assertion, assuming that $p$, $q \in \,]1, +\infty[$. Let us first suppose $f$ belongs to the space $C_c(\mathbb{R}^d)$ of continuous functions with compact support. Let $K$ be the support of $f$, so that $f = 0$ on $\mathbb{R}^d \setminus K$ and by Hölder's inequality, for $c := \lambda_d(w - K) = \lambda_d(K)$, we have
\[
|f * g|(w) \le \int_{w - K} |f(w - x)| \, |g(x)| \, dx \le c^{1/p} \Big(\int_{\mathbb{R}^d} |f(w - x)|^q \, |g(x)|^q \, dx\Big)^{1/q}.
\]
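Now, noting that $|f_w|^q |g|^q \le \|f\|_\infty^q |g|^q \in L_1(\mathbb{R}^d)$ and that for any sequence $(w_n)$ satisfying $(\|w_n\|) \to +\infty$ we have $(|f_{w_n}(x)|^q |g(x)|^q)_n \to 0$, the Dominated Convergence Theorem yields $(|f * g|(w_n))_n \to 0$. Finally, let $f \in L_p(\mathbb{R}^d)$. Since $C_c(\mathbb{R}^d)$ is dense in $L_p(\mathbb{R}^d)$, we pick a sequence $(f_n)$ in $C_c(\mathbb{R}^d)$ with limit $f$ in $L_p(\mathbb{R}^d)$. Since
\[
|(f_n * g)(w) - (f * g)(w)| = |(f - f_n) * g|(w) \le \|f - f_n\|_p \|g\|_q \to 0,
\]
given $\varepsilon > 0$ we pick $m \in \mathbb{N}$ such that $\|f - f_n\|_p \|g\|_q \le \varepsilon/2$ for $n \ge m$. Then we choose $r > 0$ such that $|f_m * g|(w) < \varepsilon/2$ for $\|w\| > r$ and for such a $w \in \mathbb{R}^d$ we get
\[
|(f * g)(w)| \le |(f * g)(w) - (f_m * g)(w)| + |(f_m * g)(w)| \le \varepsilon.
\]

Let us turn to differentiability properties. In the sequel, $\alpha := (\alpha_1, \ldots, \alpha_d) \in \mathbb{N}^d$ being a multi-index, we denote by $D^\alpha f$ the partial derivative
\[
D^\alpha f(x_1, \ldots, x_d) := D_1^{\alpha_1} \ldots D_d^{\alpha_d} f(x_1, \ldots, x_d) = \frac{\partial^{\alpha_1}}{\partial x_1^{\alpha_1}} \cdots \frac{\partial^{\alpha_d}}{\partial x_d^{\alpha_d}} f(x_1, \ldots, x_d)
\]
with the convention $D_i^{\alpha_i} g = g$ if $\alpha_i = 0$. We set $|\alpha| := \alpha_1 + \ldots + \alpha_d$. We denote by $C^k(\Omega)$ the space of functions of class $C^k$ (i.e. that have continuous partial derivatives of order at most $k$) on the open subset $\Omega$ of $\mathbb{R}^d$ and we set
\[
C_c^k(\Omega) := C^k(\Omega) \cap C_c(\Omega), \qquad C^\infty(\Omega) := \bigcap_{k \ge 0} C^k(\Omega), \qquad C_c^\infty(\Omega) := C^\infty(\Omega) \cap C_c(\Omega).
\]

Proposition 8.18 For $f \in C_c^k(\mathbb{R}^d)$, $g \in L_{1,\mathrm{loc}}(\mathbb{R}^d)$ one has $f * g \in C^k(\mathbb{R}^d)$ and
\[
\forall \alpha \in \mathbb{N}^d, \ |\alpha| \le k \qquad D^\alpha (f * g) = (D^\alpha f) * g.
\]
In particular, $f * g \in C^\infty(\mathbb{R}^d)$ when $f \in C_c^\infty(\mathbb{R}^d)$ and $g \in L_{1,\mathrm{loc}}(\mathbb{R}^d)$.

Proof As an induction shows, it suffices to prove the case $k = 1$. Let $i \in \mathbb{N}_d$ and $\alpha_i = 1$, $\alpha_j = 0$ for $j \neq i$ and let $e_i \in \mathbb{R}^d$ be such that all its components are null except the $i$-th which is $1$. Given $f \in C_c^k(\mathbb{R}^d)$, $g \in L_{1,\mathrm{loc}}(\mathbb{R}^d)$, and a fixed $w \in \mathbb{R}^d$, we denote by $K$ the support of $f$ and for $x \in \mathbb{R}^d$ and $t \in T := [-1, 1]$ we write
\[
f(w + t e_i - x) - f(w - x) = t \, h_i(w - x, t)
\]
with $h_i(w - x, t) := \int_0^1 D_i f(w + s t e_i - x) \, ds$, a continuous function of $t$, $x$ ($w$ being fixed) that is $0$ if $x \notin K_i := w + T e_i - K$, a compact subset. Let $m_i := \sup\{|h_i(w - x, t)| : x \in K_i, \ t \in T\}$.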
Since $g \in L_{1,\mathrm{loc}}(\mathbb{R}^d)$ the function $x \mapsto m_i g(x) 1_{K_i}(x)$ is integrable and we have $|h_i(w - x, t) g(x)| \le m_i |g(x)| 1_{K_i}(x)$ for almost every $x$. Applying Theorem 7.12, we get that $t \mapsto (f * g)(w + t e_i)$ is differentiable at $t = 0$ and its derivative is $\int_{\mathbb{R}^d} D_i f(w - x) g(x) \, dx = (D_i f * g)(w)$.

We take advantage of the preceding result with the aim of regularization. We say that a sequence $(g_n)$ of $C_c^\infty(\mathbb{R}^d)$ is a mollifier if for all $n$ the support $\mathrm{supp}\, g_n$ of $g_n$ is contained in $B(0, r_n)$ with $(r_n) \to 0_+$, if $g_n \ge 0$ and if $\int g_n = 1$. A common means to get a mollifier consists in taking a nonnegative function $g \in C_c^\infty(\mathbb{R}^d)$ with support in $B_{\mathbb{R}^d}$ and, for a given sequence $(r_n) \to 0_+$, in setting
\[
g_n(x) = c_n \, g(x / r_n), \qquad \text{where } c_n := 1 \Big/ \Big(r_n^d \int g\Big).
\]
For $g$ one can take the function given by
\[
g(x) := \exp\big(1/(\|x\|^2 - 1)\big) \ \text{ for } x \in B(0, 1), \qquad g(x) = 0 \ \text{ for } x \in \mathbb{R}^d \setminus B(0, 1).
\]
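The construction of a mollifier from the bump function above can be tried out numerically. The following Python sketch works in dimension $d = 1$ and is only a rough illustration: the midpoint quadrature, the number of steps, the test function $f(x) = |x|$, and all helper names are assumptions made for this example. It normalizes $g_n(x) = c_n g(x/r_n)$ and measures how far $f * g_n$ is from $f$ on a few points of $[-1, 1]$.

```python
import math


def bump(x):
    """The function g(x) = exp(1 / (x^2 - 1)) on ]-1, 1[, extended by 0."""
    return math.exp(1.0 / (x * x - 1.0)) if abs(x) < 1.0 else 0.0


def integrate(func, a, b, steps=2000):
    """Midpoint rule on [a, b]; a crude quadrature chosen for simplicity."""
    h = (b - a) / steps
    return sum(func(a + (k + 0.5) * h) for k in range(steps)) * h


BUMP_MASS = integrate(bump, -1.0, 1.0)   # numerical value of the integral of g


def mollifier(r):
    """g_r(x) = c * g(x / r) with c = 1 / (r * integral of g), so that g_r integrates to 1."""
    c = 1.0 / (r * BUMP_MASS)
    return lambda x: c * bump(x / r)


def mollify(f, r, w):
    """Approximate (f * g_r)(w), integrating over the support [-r, r] of g_r."""
    g_r = mollifier(r)
    return integrate(lambda x: f(w - x) * g_r(x), -r, r)


def sup_error(f, r, points):
    """Largest value of |(f * g_r)(w) - f(w)| over the given points."""
    return max(abs(mollify(f, r, w) - f(w)) for w in points)


if __name__ == "__main__":
    points = [-1.0 + 0.1 * k for k in range(21)]
    for r in (0.5, 0.1, 0.02):
        print(r, integrate(mollifier(r), -r, r), sup_error(abs, r, points))
```

Since $f = |\cdot|$ is $1$-Lipschitz and $g_r$ is supported in $[-r, r]$, the error is at most $r$, which is the kind of uniform estimate used for continuous functions on compact sets.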
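Lemma 8.16 Given $f \in C(\mathbb{R}^d)$ and a mollifier $(g_n)$ one has $(f * g_n) \to f$ uniformly on every compact subset of $\mathbb{R}^d$.

Proof Let $f \in C(\mathbb{R}^d)$, let $(g_n)$ be a mollifier, and let $K$ be a compact subset of $\mathbb{R}^d$. The function $f$ is uniformly continuous around $K$ in the sense that for every $\varepsilon > 0$ there exists a $\delta > 0$ such that for all $w \in K$ and all $x \in B(0, \delta)$ one has $|f(w - x) - f(w)| \le \varepsilon$. Since $\int g_n = 1$ and $\mathrm{supp}\, g_n \subset B(0, r_n)$, we have
\[
(f * g_n)(w) - f(w) = \int (f(w - x) - f(w)) g_n(x) \, dx = \int_{B(0, r_n)} (f(w - x) - f(w)) g_n(x) \, dx.
\]
For $w \in K$ and $r_n < \delta$ we get $|(f * g_n)(w) - f(w)| \le \varepsilon \int g_n = \varepsilon$.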
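Theorem 8.36 For $p \in [1, +\infty[$, $f \in L_p(\mathbb{R}^d)$, and a mollifier $(g_n)$, one has $(f * g_n)_n \to f$ in $L_p(\mathbb{R}^d)$.

Proof Given $\varepsilon > 0$, Theorem 8.31 yields some $h \in C_c(\mathbb{R}^d)$ such that $\|f - h\|_p < \varepsilon$. Let $(r_n) \to 0_+$ be such that $\mathrm{supp}\, g_n \subset r_n B_{\mathbb{R}^d}$, so that, for $r := \max(r_n)$, by Lemma 8.15, $\mathrm{supp}\,(h * g_n) \subset \mathrm{supp}\, h + r_n B_{\mathbb{R}^d} \subset K := \mathrm{supp}\, h + r B_{\mathbb{R}^d}$. Since $(h * g_n) \to h$ uniformly on $K$ by the preceding lemma, we have $(\varepsilon_n) := (\|h * g_n - h\|_p) \to 0$. Then, since $\|g_n\|_1 = 1$, relation (8.25) entails the conclusion:
\[
\|f * g_n - f\|_p \le \|(f - h) * g_n\|_p + \|h * g_n - h\|_p + \|h - f\|_p \le \|f - h\|_p \|g_n\|_1 + \varepsilon_n + \varepsilon \le 2\varepsilon + \varepsilon_n
\]
and $\|f * g_n - f\|_p \le 3\varepsilon$ for $n$ large enough.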
Corollary 8.12 For $p \in [1, +\infty[$ and an open subset $\Omega$ of $\mathbb{R}^d$, the space $C_c^\infty(\Omega)$ is dense in $L_p(\Omega)$.

Proof Given $\varepsilon > 0$ and $f \in L_p(\Omega)$, using Theorem 8.31 we pick $g \in C_c(\Omega)$ such that $\|f - g\|_p < \varepsilon$. Extending $g$ by $0$ on $\mathbb{R}^d \setminus \Omega$ we get an element $h \in L_p(\mathbb{R}^d)$. Taking a mollifier $(g_n)$ and $(r_n) \to 0_+$ such that $\mathrm{supp}\, g_n \subset r_n B_{\mathbb{R}^d}$ we see that for $n$ large enough we have $\mathrm{supp}\, h * g_n \subset \mathrm{supp}\, h + r_n B_{\mathbb{R}^d} \subset \Omega$. The restriction $f_n$ of $h * g_n$ to $\Omega$ belongs to $C_c^\infty(\Omega)$ by Proposition 8.18 and $\|f_n - g\|_{L_p(\Omega)} = \|h * g_n - h\|_p \le \varepsilon$ for $n$ large enough. Thus $\|f - f_n\|_p \le 2\varepsilon$ for $n$ large enough.

Let us note a convergence result for the convolution with the Gaussian functions $g_t(\cdot) := t^{-d/2} e^{-\pi \|\cdot\|^2 / t}$ that will be used in the next section. It bears some analogy with the preceding theorem, albeit $g_t$ does not have a compact support. From the example at the end of Sect. 7.7 we know that for all $\delta > 0$ the following two properties are satisfied:
\[
\int_{\mathbb{R}^d \setminus \delta B_d} g_t(x) \, dx \to 0 \quad \text{as } t \to 0_+, \tag{8.27}
\]
\[
\int_{\mathbb{R}^d} g_t(x) \, dx = 1. \tag{8.28}
\]
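These facts enable us to prove the following result.

Lemma 8.17 For all $f \in L_1(\mathbb{R}^d)$ one has $\|f * g_t - f\|_1 \to 0$ as $t \to 0_+$.

Proof By (8.28), we have
\[
(f * g_t)(w) - f(w) = \int_{\mathbb{R}^d} f(w - x) g_t(x) \, dx - f(w) = \int_{\mathbb{R}^d} (f(w - x) - f(w)) g_t(x) \, dx.
\]
Using Fubini's Theorem we get the estimate
\[
\|f * g_t - f\|_1 = \int_{\mathbb{R}^d} |(f * g_t)(w) - f(w)| \, dw \le \int_{\mathbb{R}^d} \Big(\int_{\mathbb{R}^d} |f(w - x) - f(w)| \, dw\Big) g_t(x) \, dx = \int_{\mathbb{R}^d} \|T_x f - f\|_1 \, g_t(x) \, dx.
\]
Given $\varepsilon > 0$, Lemma 8.14 yields some $\delta > 0$ such that $\|T_x f - f\|_1 \le \varepsilon$ for every $x \in \delta B_d$. On the other hand, by (8.27) we can find $\tau > 0$ such that for $t \in \,]0, \tau]$ we have
\[
\int_{\mathbb{R}^d \setminus \delta B_d} g_t(x) \, dx \le \varepsilon.
\]
Using this inequality, noting that $\|T_x f - f\|_1 \le \|T_x f\|_1 + \|f\|_1 = 2 \|f\|_1$, and taking (8.28) into account, we get
\[
\|f * g_t - f\|_1 \le \int_{\delta B_d} \|T_x f - f\|_1 \, g_t(x) \, dx + \int_{\mathbb{R}^d \setminus \delta B_d} \|T_x f - f\|_1 \, g_t(x) \, dx \le \int_{\delta B_d} \varepsilon \, g_t(x) \, dx + \int_{\mathbb{R}^d \setminus \delta B_d} 2 \|f\|_1 \, g_t(x) \, dx \le \varepsilon (1 + 2 \|f\|_1).
\]
Since $\varepsilon > 0$ is arbitrarily small, we have $\|f * g_t - f\|_1 \to 0$ as $t \to 0_+$.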
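The convergence $\|f * g_t - f\|_1 \to 0$ can be observed on a simple example. The Python sketch below takes $d = 1$ and $f := 1_{[0,1]}$, an illustrative choice that is not in the text; for this $f$ the convolution with $g_t(x) = t^{-1/2} e^{-\pi x^2 / t}$ has a closed form in terms of the error function, and the $L_1$ distance is approximated by a midpoint rule on $[-5, 6]$ (the interval and the step count are assumptions of the sketch).

```python
import math


def gauss_conv_indicator(w, t):
    """(f * g_t)(w) for f = 1_[0,1] and g_t(x) = t^(-1/2) exp(-pi x^2 / t).

    The integral of g_t over [w - 1, w] equals
    (erf(w sqrt(pi/t)) - erf((w - 1) sqrt(pi/t))) / 2.
    """
    c = math.sqrt(math.pi / t)
    return 0.5 * (math.erf(w * c) - math.erf((w - 1.0) * c))


def indicator(w):
    """The function f = 1_[0,1]."""
    return 1.0 if 0.0 <= w <= 1.0 else 0.0


def l1_error(t, a=-5.0, b=6.0, steps=22000):
    """Midpoint-rule approximation of the L_1 norm of f * g_t - f over [a, b]."""
    h = (b - a) / steps
    total = 0.0
    for k in range(steps):
        w = a + (k + 0.5) * h
        total += abs(gauss_conv_indicator(w, t) - indicator(w))
    return total * h


if __name__ == "__main__":
    for t in (1.0, 0.1, 0.01, 0.001):
        print(t, l1_error(t))
```

The printed errors decrease as $t$ decreases, although $f * g_t$ does not converge to $f$ uniformly because $f$ is discontinuous.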
Exercises

1. Consider the following example showing that the convolution operation is not associative. For $d = 1$ take the functions $f := 1_{\mathbb{R}_+}$, $g := 1_{[-1,0]} - 1_{[0,1]}$, $h := 1$. Verify that $f * g$, $g * h$ are everywhere defined and that $(f * g) * h = 1$ whereas $f * (g * h) = 0$.

2. Prove that if $f$ and $g$ are nonnegative measurable functions, then $f * g$ is lower semicontinuous.

3. Show that if $f$, $g \in C_c(\mathbb{R}^d)$, then $f * g \in C_c(\mathbb{R}^d)$, the space of continuous functions with compact supports.

4. Given $p$, $q$, $r$ in $\overline{\mathbb{R}}_+$ with $p$, $q \in [1, \infty]$ and $1/p + 1/q - 1/r = 1$ and $f \in L_p(\mathbb{R}^d)$, $g \in L_q(\mathbb{R}^d)$ show that $f * g \in L_r(\mathbb{R}^d)$ and that $\|f * g\|_r \le \|f\|_p \|g\|_q$. [Hint: see [185, p. 98].]

5. Show that $L_1(\mathbb{R}^d)$ is a Banach algebra with respect to the convolution operation but without a unit element.

6. (Müntz) Given an increasing sequence $(s_n)_{n \ge 0}$ of positive numbers, let $L$ be the linear subspace of $X := C([0,1], \mathbb{R})$ generated by the functions $x_n : t \mapsto t^{s_n}$. Show that $L$ is dense in $X$ for the norm induced by $L_2([0,1], \mathbb{R})$.

7. Let $p \in [1, \infty[$, let $h \in L_1(\mathbb{R}^d)$, and let $G$ be a bounded subset of $L_p(\mathbb{R}^d)$. Show that $F := G * h := \{g * h : g \in G\}$ is such that for any measurable subset $S$ of $\mathbb{R}^d$ with finite measure, the set $F_S := \{f \mid S : f \in F\}$ is relatively compact in $L_p(S)$. [Hint: use Exercise 15 of Sect. 8.5.1 and the Fréchet-Kolmogorov theorem.]
8.9 Some Useful Transforms

We devote this section to two important transformations using integration processes: the Fourier transform and the Radon transform. In a later section we deal with the Laplace transform.
8.9.1 The Fourier Transform

The Fourier transform we introduce in this section is a widely used tool. It can be defined on various spaces. We limit our study to the most elementary properties of this transform. We denote the scalar product of two vectors $x := (x_1, \ldots, x_d)$, $y := (y_1, \ldots, y_d)$ of $\mathbb{R}^d$ by $x \cdot y := \langle x \mid y \rangle = \sum_{k=1}^{d} x_k y_k$. The functions we consider take their values in $\mathbb{C}$; for simplicity, in the present section we just write $L_p(\mathbb{R}^d)$ instead of $L_p(\mathbb{R}^d, \mathbb{C})$. In some sources, the definitions of the Fourier transform $\mathcal{F}$ differ by some scaling factors. That does not change the essence of the transform. Our choice is dictated by the properties that $\mathcal{F}$ can be extended to an isometry of $L_2(\mathbb{R}^d)$ changing convolution into product.

Definition 8.9 The Fourier transform of a function $f \in L_1(\mathbb{R}^d)$ is the function $\mathcal{F} f := \hat{f}$ given by
\[
\hat{f}(y) := \int_{\mathbb{R}^d} e^{-2\pi i \langle x \mid y \rangle} f(x) \, dx \qquad y \in \mathbb{R}^d.
\]
Since $|e^{-2\pi i \langle x \mid y \rangle} f(x)| = |f(x)|$ for all $x$, $y \in \mathbb{R}^d$, the right-hand side is the integral of an integrable function. If $f = g$ a.e. then $\hat{f}(y) = \hat{g}(y)$ for all $y \in \mathbb{R}^d$, so that $\hat{f}(y)$ just depends on the class of $f$ in $L_1(\mathbb{R}^d)$. The Dominated Convergence Theorem and an elementary change of variables justify the following property in which $(T_w f)(x) := f(x - w)$ as above.

Proposition 8.19 For all $f \in L_1(\mathbb{R}^d)$ the function $\hat{f}$ is continuous and bounded and the map $f \mapsto \hat{f}$ is linear and continuous from $L_1(\mathbb{R}^d)$ into the space $C_b(\mathbb{R}^d)$ of bounded continuous functions endowed with the norm $\|\cdot\|_\infty$. Moreover, for all $f \in L_1(\mathbb{R}^d)$, $w$, $y \in \mathbb{R}^d$, $t > 0$ one has
\[
\|\hat{f}\|_\infty \le \|f\|_1, \qquad \widehat{T_w f}(y) = e^{-2\pi i \langle w \mid y \rangle} \hat{f}(y), \qquad \widehat{f(t\,\cdot)}(y) = t^{-d} \hat{f}(y/t), \qquad \widehat{f(\cdot/t)}(y) = t^{d} \hat{f}(ty).
\]
The next properties explain the success of the Fourier transform.
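Proposition 8.20 For all $f$, $g \in L_1(\mathbb{R}^d)$ and all $y \in \mathbb{R}^d$ one has $\widehat{f * g}(y) = \hat{f}(y) \hat{g}(y)$.

Proof This follows from Fubini's Theorem:
\[
\widehat{f * g}(y) = \int_{\mathbb{R}^d} e^{-2\pi i \langle x \mid y \rangle} \Big(\int_{\mathbb{R}^d} f(x - u) g(u) \, du\Big) dx = \int_{\mathbb{R}^d} e^{-2\pi i \langle u \mid y \rangle} g(u) \Big(\int_{\mathbb{R}^d} e^{-2\pi i \langle x - u \mid y \rangle} f(x - u) \, dx\Big) du = \hat{f}(y) \hat{g}(y)
\]
since $(u, x) \mapsto |e^{-2\pi i \langle x \mid y \rangle} f(x - u) g(u)|$ is integrable on $\mathbb{R}^d \times \mathbb{R}^d$.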
In the sequel, given $k \in \mathbb{N}_d$ and $\alpha := (\alpha_1, \ldots, \alpha_d) \in \mathbb{N}^d$, we denote by $m_k$ and $m^\alpha$ the functions $x \mapsto -2\pi i x_k$ and $x \mapsto (-2\pi i)^{|\alpha|} x_1^{\alpha_1} \ldots x_d^{\alpha_d}$ respectively. Our notation stems from the fact that we want to see the effect of the Fourier transform after multiplication by one of these functions. For $\alpha := (\alpha_1, \ldots, \alpha_d)$, $\beta := (\beta_1, \ldots, \beta_d) \in \mathbb{N}^d$ we write $\beta \le \alpha$ if $\beta_k \le \alpha_k$ for $k \in \mathbb{N}_d$.

Proposition 8.21 If $f \in L_1(\mathbb{R}^d)$ and $m_k f \in L_1(\mathbb{R}^d)$ for some $k \in \mathbb{N}_d$, then $\widehat{m_k f} = D_k \hat{f}$. If $m^\beta f \in L_1(\mathbb{R}^d)$ for every $\beta \in \mathbb{N}^d$ such that $\beta \le \alpha$, then one has $\widehat{m^\alpha f} = D^\alpha \hat{f}$.

Proof The first assertion is obtained by applying the differentiability criterion for a parameterized integral (Proposition 7.12). The second one is obtained by iterating the first relation.

Corollary 8.13 If $f$ is measurable and such that for all $n \in \mathbb{N}_m$ (resp. $n \in \mathbb{N}$) one has $\int_{\mathbb{R}^d} (1 + \|x\|^n) |f(x)| \, dx < +\infty$, then $\hat{f} \in C_b^m(\mathbb{R}^d)$ (resp. $C_b^\infty(\mathbb{R}^d)$), the space of functions of class $C^m$ (resp. $C^\infty$) with bounded partial derivatives of order not greater than $m$.

Example Let us show that the Fourier transform of the Gaussian density $g_t(x) := t^{-d/2} e^{-\pi \|x\|^2 / t}$ is given by $\hat{g}_t(y) = h_t(y)$ with $h_t(y) := e^{-\pi t \|y\|^2}$ and the Fourier transform of $h_t$ is given by $\hat{h}_t(z) = g_t(z) := t^{-d/2} e^{-\pi \|z\|^2 / t}$. In view of Proposition 8.19, it suffices to prove the case $t = 1$. Since $g_1(x) = e^{-\pi x_1^2} \ldots e^{-\pi x_d^2}$ we may suppose that $d = 1$. Setting $k(y) := \int_{\mathbb{R}} e^{-\pi (x + iy)^2} \, dx$, we have
\[
\hat{g}_1(y) = \int_{\mathbb{R}} e^{-2\pi i x y} e^{-\pi x^2} \, dx = g_1(y) k(y).
\]
Using the criterion for differentiating a parameterized integral we get
\[
k'(y) = -2\pi i \int_{\mathbb{R}} (x + iy) e^{-\pi (x + iy)^2} \, dx = i \int_{\mathbb{R}} \frac{d}{dx} e^{-\pi (x + iy)^2} \, dx = \big[ i e^{-\pi (x + iy)^2} \big]_{-\infty}^{+\infty} = 0.
\]
Thus, $k(y) = k(0) = \int_{\mathbb{R}} e^{-\pi x^2} \, dx = 1$ as computed in an example at the end of Sect. 7.7 and $\hat{g}_1 = g_1 = h_1$.

Proposition 8.22 For all $f$, $g \in L_1(\mathbb{R}^d)$ one has $\int_{\mathbb{R}^d} f \hat{g} = \int_{\mathbb{R}^d} \hat{f} g$.

Proof Since for $f$, $g \in L_1(\mathbb{R}^d)$ the functions $\hat{f}$ and $\hat{g}$ are continuous and bounded, the functions $f \hat{g}$ and $\hat{f} g$ are integrable. Now the function $h$ on $\mathbb{R}^d \times \mathbb{R}^d$ given by $h(x, y) := e^{-2\pi i \langle x \mid y \rangle} f(x) g(y)$ is measurable and
\[
\int_{\mathbb{R}^d} \int_{\mathbb{R}^d} |h(x, y)| \, dx \, dy = \int_{\mathbb{R}^d} |f(x)| \, dx \int_{\mathbb{R}^d} |g(y)| \, dy < +\infty.
\]
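Applying Fubini's Theorem in writing $\int_{\mathbb{R}^d} \int_{\mathbb{R}^d} h(x, y) \, dx \, dy$ as an iterated integral in two different orders, we get the relation $\int_{\mathbb{R}^d} f \hat{g} = \int_{\mathbb{R}^d} \hat{f} g$.

The question of the inversion of the Fourier transform is crucial. A first answer follows.

Theorem 8.37 If $f \in L_1(\mathbb{R}^d)$ is such that $\hat{f} \in L_1(\mathbb{R}^d)$ then $f(x) = \hat{\hat{f}}(-x)$ a.e.

Proof When $f \in L_1(\mathbb{R}^d)$ is such that $\hat{f} \in L_1(\mathbb{R}^d)$, for all $x \in \mathbb{R}^d$ one has
\[
\hat{\hat{f}}(-x) = \int_{\mathbb{R}^d} e^{2\pi i \langle x \mid y \rangle} \hat{f}(y) \, dy.
\]
Since for $k_{t,x}(y) := e^{2\pi i \langle x \mid y \rangle} e^{-\pi t \|y\|^2}$ we have $|\hat{f}(y) k_{t,x}(y)| \le |\hat{f}(y)|$ and since $k_{t,x}(y) \hat{f}(y) \to e^{2\pi i \langle x \mid y \rangle} \hat{f}(y)$ as $t \to 0_+$, the Dominated Convergence Theorem yields
\[
\hat{\hat{f}}(-x) = \lim_{t \to 0_+} \int_{\mathbb{R}^d} k_{t,x}(y) \hat{f}(y) \, dy.
\]
Applying the preceding proposition to $f$ and the function $k_{t,x}$ we get
\[
\int_{\mathbb{R}^d} k_{t,x}(y) \hat{f}(y) \, dy = \int_{\mathbb{R}^d} f(w) \widehat{k_{t,x}}(w) \, dw.
\]
Since the preceding example shows that the Fourier transform of the function $y \mapsto h_t(y) := e^{-\pi t \|y\|^2}$ is given by $\hat{h}_t(z) = g_t(z) := t^{-d/2} e^{-\pi \|z\|^2 / t}$, we have
\[
\widehat{k_{t,x}}(w) = \int_{\mathbb{R}^d} e^{-2\pi i \langle w - x \mid y \rangle} e^{-\pi t \|y\|^2} \, dy = \hat{h}_t(w - x) = g_t(x - w),
\]
\[
\hat{\hat{f}}(-x) = \lim_{t \to 0_+} \int_{\mathbb{R}^d} f(w) \widehat{k_{t,x}}(w) \, dw = \lim_{t \to 0_+} \int_{\mathbb{R}^d} f(w) g_t(x - w) \, dw.
\]
Since $(f * g_t) \to f$ in $L_1(\mathbb{R}^d)$ by Lemma 8.17, any sequence $(t_n) \to 0$ has a subsequence $(t_{k(n)})$ such that $(f * g_{t_{k(n)}}) \to f$ a.e. by Proposition 7.6. Thus $\hat{\hat{f}}(-x) = f(x)$ a.e.

Corollary 8.14 Let $f \in L_1(\mathbb{R}^d)$ be such that $\hat{f} = 0$. Then $f = 0$.

Corollary 8.15 If $f \in C^k(\mathbb{R}^d)$ is such that $D^\beta f \in L_1(\mathbb{R}^d)$ and $x^\beta \hat{f} \in L_1(\mathbb{R}^d)$ for all $\beta \in \mathbb{N}^d$ satisfying $|\beta| \le k$, then one has $\widehat{D^\alpha f}(y) = (2\pi i)^k y^\alpha \hat{f}(y)$ for all $\alpha \in \mathbb{N}^d$ satisfying $|\alpha| = k$ and all $y \in \mathbb{R}^d$.

Proof Setting $g := \hat{f}$ and applying the relation $\widehat{m^\alpha g} = D^\alpha \hat{g}$ of Proposition 8.21 to $g$, we get, since $\hat{g}(x) = f(-x)$ by Theorem 8.37 and $m^\alpha(x) g(x) = (-2\pi i)^k x^\alpha \hat{f}(x)$, a relation between the Fourier transform of $x \mapsto (-2\pi i)^k x^\alpha \hat{f}(x)$ and $D^\alpha$ applied to $x \mapsto f(-x)$. Applying the inverse Fourier transform, one sees that this relation is equivalent to the one in the statement.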
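Theorem 8.38 (Plancherel) The map $f \mapsto \hat{f}$ from $L_2(\mathbb{R}^d) \cap L_1(\mathbb{R}^d)$ into $L_2(\mathbb{R}^d)$ has a unique extension as a linear isometry from $L_2(\mathbb{R}^d)$ into $L_2(\mathbb{R}^d)$ still denoted by $\mathcal{F} : f \mapsto \hat{f}$.

Proof Let us first show that for $f \in L_2(\mathbb{R}^d) \cap L_1(\mathbb{R}^d)$ we have $\hat{f} \in L_2(\mathbb{R}^d)$ and $\|\hat{f}\|_2 = \|f\|_2$. By Proposition 8.19, $\hat{f}$ is bounded, so that, for $t \in \mathbb{P}$, $h_t(y) := e^{-\pi t \|y\|^2}$, the function $|\hat{f}|^2 h_t$ is integrable. Since $f \in L_1(\mathbb{R}^d)$ the function $(w, x, y) \mapsto \overline{f(w)} f(x) h_t(y)$ is in $L_1(\mathbb{R}^{3d})$. Since $\hat{h}_t(z) = g_t(z) = t^{-d/2} e^{-\pi \|z\|^2 / t}$ as we have seen, applying Fubini's Theorem, we get
\[
\int_{\mathbb{R}^d} |\hat{f}(y)|^2 h_t(y) \, dy = \int_{\mathbb{R}^d} \Big(\int_{\mathbb{R}^d} \overline{f(w)} e^{2\pi i \langle w \mid y \rangle} \, dw\Big) \Big(\int_{\mathbb{R}^d} f(x) e^{-2\pi i \langle x \mid y \rangle} \, dx\Big) h_t(y) \, dy
\]
\[
= \int_{\mathbb{R}^{3d}} \overline{f(w)} f(x) e^{2\pi i \langle w - x \mid y \rangle} h_t(y) \, dw \, dx \, dy = \int_{\mathbb{R}^{2d}} \overline{f(w)} f(x) g_t(x - w) \, dw \, dx.
\]
Lemma 8.17 ensures that $f * g_t \to f$ in $L_1(\mathbb{R}^d)$ as $t \to 0_+$. Thus there exists a sequence $(t_n) \to 0_+$ such that $((f * g_{t_n})(w)) \to f(w)$ for almost every $w$. We also have $\overline{f(w)} (f * g_{t_n})(w) \to \overline{f(w)} f(w)$ for almost every $w$ and since $|\int_{\mathbb{R}^d} \overline{f(w)} f(x) g_{t_n}(w - x) \, dx| \le \|f\|_\infty |\overline{f(w)}|$ and since $\|f\|_\infty |\overline{f(\cdot)}|$ is integrable, the Dominated Convergence Theorem yields
\[
\int_{\mathbb{R}^d} |\hat{f}(y)|^2 h_{t_n}(y) \, dy = \int_{\mathbb{R}^{2d}} \overline{f(w)} f(x) g_{t_n}(x - w) \, dx \, dw \to \int_{\mathbb{R}^d} \overline{f(w)} f(w) \, dw.
\]
Since $h_t \to 1$ pointwise as $t \to 0_+$, by the Monotone Convergence Theorem we get
\[
\|\hat{f}\|_2^2 = \int_{\mathbb{R}^d} |\hat{f}(y)|^2 \, dy = \lim_n \int_{\mathbb{R}^d} |\hat{f}(y)|^2 h_{t_n}(y) \, dy = \int_{\mathbb{R}^d} \overline{f(w)} f(w) \, dw = \|f\|_2^2.
\]
Thus $\hat{f} \in L_2(\mathbb{R}^d)$. Since $L_2(\mathbb{R}^d) \cap L_1(\mathbb{R}^d)$ (and even $C_c(\mathbb{R}^d)$) is dense in $L_2(\mathbb{R}^d)$, Theorem 3.2 entails that $f \mapsto \hat{f}$ can be extended to $L_2(\mathbb{R}^d)$ in such a way that $\|\hat{f}\|_2 = \|f\|_2$.

Remark Using the polarization identity for the scalar product $\langle f \mid g \rangle := \int f \overline{g}$, i.e. the relation
\[
4 \langle f \mid g \rangle = \|f + g\|^2 - \|f - g\|^2 + i \|f + ig\|^2 - i \|f - ig\|^2,
\]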
one deduces from the relation $\|\hat{f}\|_2 = \|f\|_2$ the Parseval identity:
\[
\forall f, g \in L_2(\mathbb{R}^d) \qquad \langle f \mid g \rangle = \langle \hat{f} \mid \hat{g} \rangle.
\]
The preceding theorem can be completed.
Theorem 8.39 The Fourier transform $\mathcal{F} : L_2(\mathbb{R}^d) \to L_2(\mathbb{R}^d)$ defined by $\mathcal{F}(f) := \hat{f}$ is an isometry onto $L_2(\mathbb{R}^d)$ and its inverse is $S \circ \mathcal{F} : g \mapsto \check{g}$, where $S$ is the symmetry defined by $S(f)(x) := f(-x)$ for $f \in L_2(\mathbb{R}^d)$, $x \in \mathbb{R}^d$.

Proof Let us use again the Gaussian functions $g_t := t^{-d/2} e^{-\pi \|\cdot\|^2 / t}$ whose Fourier transform $h_t := \hat{g}_t$ is given by $h_t(y) = e^{-\pi t \|y\|^2}$. We claim that for all $t > 0$, all $y \in \mathbb{R}^d$, and all $f \in L_2(\mathbb{R}^d)$ we have
\[
\int_{\mathbb{R}^d} f(x) g_t(y - x) \, dx = \int_{\mathbb{R}^d} \hat{f}(w) e^{2\pi i \langle w \mid y \rangle} h_t(w) \, dw. \tag{8.29}
\]
Since $e^{2\pi i \langle \cdot \mid y \rangle} h_t(\cdot) \in L_1(\mathbb{R}^d)$ and its Fourier transform is $T_y \hat{h}_t := g_t(\cdot - y)$, Proposition 8.22 asserts that this relation holds when $f \in L_1(\mathbb{R}^d)$. Taking $f \in L_2(\mathbb{R}^d)$ and a sequence $(f_n)$ in $L_1(\mathbb{R}^d) \cap L_2(\mathbb{R}^d)$ with $L_2$-limit $f$ and using the fact that $(\hat{f}_n) \to \hat{f}$ in $L_2(\mathbb{R}^d)$ since $\mathcal{F}$ is continuous, we see that relation (8.29) still holds, both sides of it being continuous functions of $f$ in $L_2(\mathbb{R}^d)$ since $T_y g_t := g_t(\cdot - y)$ and $e^{2\pi i \langle \cdot \mid y \rangle} h_t(\cdot)$ are in $L_2(\mathbb{R}^d)$. As $t \to 0_+$ the left-hand side $(f * g_t)(y)$ of relation (8.29) considered as a function of $y$ converges to $f$ in $L_1(\mathbb{R}^d)$ by Lemma 8.17. Thus, we can find a sequence $(t_n) \to 0_+$ such that $((f * g_{t_n})(y)) \to f(y)$ for almost every $y \in \mathbb{R}^d$. By the Dominated Convergence Theorem, the right-hand side converges to $\int_{\mathbb{R}^d} \hat{f}(w) e^{2\pi i \langle w \mid y \rangle} \, dw = \mathcal{F}(\hat{f})(-y) = (S \circ \mathcal{F})(\hat{f})(y)$. Thus $f = (S \circ \mathcal{F})(\hat{f})$ a.e.

As an application of the Fourier transform, let us point out its use for the partial differential equation
\[
-\Delta u + u = f \tag{8.30}
\]
where $f \in L_2(\mathbb{R}^d)$ is given. Taking the Fourier transform of both sides of this equation, we get
\[
(1 + 4\pi^2 \|y\|^2) \hat{u}(y) = \hat{f}(y) \qquad y \in \mathbb{R}^d.
\]
Thus
\[
u = \Big(\frac{\hat{f}}{1 + 4\pi^2 \|\cdot\|^2}\Big)^{\vee} = f * b,
\]
8.9 Some Useful Transforms
509
where b, the Fourier transform of 1=.1 C 4 2 kk2 /, is called the Bessel potential. For the computation of b, see [117, p. 187] for instance.
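As a numerical sanity check of the last formula, the following sketch treats the case $d = 1$. It assumes the classical closed form $b(x) = \tfrac12 e^{-|x|}$ for the one-dimensional kernel (a standard fact that is not derived in the text) and the convention $\hat f(y) = \int f(x) e^{-2\pi i x y}\,dx$. Since both functions are even, checking that the transform of $b$ is the multiplier $1/(1 + 4\pi^2 y^2)$ is equivalent to the statement above. The function names are illustrative only.

```python
import math

def bessel_kernel_1d(x):
    """Assumed closed form of the Bessel potential on R: b(x) = exp(-|x|) / 2."""
    return 0.5 * math.exp(-abs(x))

def fourier_transform_even(func, y, half_width=40.0, steps=80_000):
    """Midpoint-rule approximation of int func(x) exp(-2 pi i x y) dx for an even real func.

    For an even func the imaginary part vanishes, so only the cosine part is summed.
    The number of steps is even, so the kink of func at 0 sits on a cell boundary.
    """
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        x = -half_width + (k + 0.5) * h
        total += func(x) * math.cos(2.0 * math.pi * x * y)
    return total * h

def symbol(y):
    """Multiplier 1 / (1 + 4 pi^2 y^2) obtained from (8.30) on the Fourier side."""
    return 1.0 / (1.0 + 4.0 * math.pi ** 2 * y * y)

# Compare the numerical transform of b with the multiplier at a few frequencies.
for y in (0.0, 0.3, 1.0):
    print(y, fourier_transform_even(bessel_kernel_1d, y), symbol(y))
```

The two printed columns should agree up to the quadrature error; in higher dimensions the kernel is no longer elementary, which is why the text refers to the literature for its computation.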
Exercises
1. Prove the properties asserted in Proposition 8.19 using the given hints.
2. (Riemann-Lebesgue Lemma) Show that for all $f \in L_1(\mathbb{R}^d)$ one has $\hat f(y) \to 0$ as $\|y\| \to +\infty$.
3. Show that for $p \in\, ]2, +\infty[$ there are no $q \in [1, +\infty[$ and $c \in \mathbb{R}_+$ such that $\|\hat f\|_q \le c \|f\|_p$ for all $f \in L_p(\mathbb{R}^d) \cap L_1(\mathbb{R}^d)$. [Hint: use the function $h_z := e^{-\pi z \|\cdot\|^2}$ with $z := a + ib$, $a > 0$.]
4. Show that if for $p \in\, ]1, 2[$ there are some $q \in [1, +\infty[$ and $c \in \mathbb{R}_+$ such that $\|\hat f\|_q \le c \|f\|_p$ for all $f \in L_p(\mathbb{R}^d) \cap L_1(\mathbb{R}^d)$, then one must have $q = (1 - 1/p)^{-1}$. [Hint: use a scaling argument, replacing $f$ with its dilate $x \mapsto f(tx)$. The fact that such a constant $c$ exists for $q := (1 - 1/p)^{-1}$ is true, but not easy to prove.]
5. Show that for $f \in L_1(\mathbb{R}^d)$ satisfying $D_k f \in L_1(\mathbb{R}^d)$ one has $\widehat{D_k f}(x) = (m_k \hat f)(x)$. [Hint: use an integration by parts.] Generalize this result to any partial derivative $D^\alpha$.
6. Let $\mathcal{S}(\mathbb{R}^d)$ be the space of functions $f$ of class $C^\infty$ such that for all $\alpha \in \mathbb{N}^d$ and all $n \in \mathbb{N}$ the function $x \mapsto \|x\|^n D^\alpha f(x)$ is bounded. Show that the Fourier transform maps $\mathcal{S}(\mathbb{R}^d)$ into $\mathcal{S}(\mathbb{R}^d)$.
8.9.2 Introduction to the Radon Transform
The Radon transform is used in a number of fields, in particular in tomography, as is the Ray or X-ray transform (see the exercises for the latter), [89, 90, 122, 157, 201] (Fig. 8.1). Given $u \in S^{d-1}$, the unit sphere in $\mathbb{R}^d$, we denote by $H_u$ the hyperplane $H_u := u^\perp := \{x \in \mathbb{R}^d : \langle u \mid x \rangle = 0\}$ of $\mathbb{R}^d$. Taking an orthonormal basis $b := (b_1, \dots, b_{d-1})$ of $H_u$ we get an isometry $h_b : \mathbb{R}^{d-1} \to H_u$ given by $h_b(x_1, \dots, x_{d-1}) := x_1 b_1 + \dots + x_{d-1} b_{d-1}$. The image measure $\lambda_u$ on $H_u$ of $\lambda_{d-1}$ by $h_b$ does not depend on the basis $b$: if $a := (a_1, \dots, a_{d-1})$ is another orthonormal basis, then for every measurable subset $A$ of $H_u$ we have
$$\lambda_u(A) := \lambda_{d-1}(h_b^{-1}(A)) = \lambda_{d-1}(T(h_a^{-1}(A))) = \lambda_{d-1}(h_a^{-1}(A)),$$
where $T$ is the isometry $T := h_b^{-1} \circ h_a$ of $\mathbb{R}^{d-1}$, as $\lambda_{d-1}$ is invariant under linear isometries.
Fig. 8.1 The Radon transform (a set $E$, the hyperplane $H_u$ with unit normal $u$, and the parallel affine hyperplane $H_{u,t}$)
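For $t \in \mathbb{R}$ we denote by $H_{u,t}$ the affine hyperplane of $\mathbb{R}^d$ given by $H_{u,t} := H_u + tu = \{x \in \mathbb{R}^d : \langle u \mid x \rangle = t\}$. Since for any orthonormal basis $b := (b_1, \dots, b_{d-1})$ of $H_u := H_{u,0}$ the map $h_{b,tu} : \mathbb{R}^{d-1} \to H_{u,t}$ given by $h_{b,tu}(x) := h_b(x) + tu$ is an isometry, the set $H_{u,t}$ can be equipped with the measure $\lambda_{u,t}$, the image of $\lambda_{d-1}$ by $h_{b,tu}$.
Definition 8.10 If $f$ is a measurable function on $\mathbb{R}^d$, its Radon transform is the function $\tilde f := Rf : S^{d-1} \times \mathbb{R} \to \mathbb{R}$ given by
$$\tilde f(u, t) := \int_{H_{u,t}} (f|_{H_{u,t}})\,d\lambda_{u,t} = \int_{\mathbb{R}^{d-1}} f \circ h_{b,tu}\,d\lambda_{d-1}.$$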
If $f$ is in the space $C_c(\mathbb{R}^d)$ of continuous functions with compact support, then $\tilde f(u, t)$ is defined for all $(u, t)$ and is a continuous function of $(u, t)$. If $f$ is just in $L_1(\mathbb{R}^d)$, then $\tilde f$ is not defined for all $(u, t)$.
Proposition 8.23 For $f \in C_c(\mathbb{R}^d)$ and for all $(u, t) \in S^{d-1} \times \mathbb{R}$ one has $\tilde f(-u, -t) = \tilde f(u, t)$ and if $f(x) = 0$ for $\|x\| > r$, then $\tilde f(u, t) = 0$ for $|t| > r$. For $f \in C_c^\infty(\mathbb{R}^d)$, a multi-index $\alpha$, and $(u, t) \in S^{d-1} \times \mathbb{R}$, one has
$$\widetilde{D^\alpha f}(u, t) = u^\alpha \frac{\partial^{|\alpha|}}{\partial t^{|\alpha|}} \tilde f(u, t).$$
In particular, for $f \in C_c^\infty(\mathbb{R}^d)$ one has
$$\widetilde{\Delta f} = \frac{\partial^2}{\partial t^2} \tilde f.$$
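Proof The first assertion is obvious since $H_{-u,-t} = H_{u,t}$ for all $(u, t) \in S^{d-1} \times \mathbb{R}$ and since $H_{u,t} \subset \mathbb{R}^d \setminus B[0, r]$ if $|t| > r$. Given $(u, t) \in S^{d-1} \times \mathbb{R}$, taking an orthonormal basis $(b_1, \dots, b_{d-1})$ of $H_u := H_{u,0}$, since $(b_1, \dots, b_{d-1}, u)$ is an orthonormal basis of $\mathbb{R}^d$ we have
$$D_i f = e_i \cdot \nabla f = e_i \cdot \Big( \sum_{j=1}^{d-1} (\nabla f \cdot b_j)\, b_j + (\nabla f \cdot u)\, u \Big).$$
Since the function $(\nabla f \cdot b_j) \circ h_{b,tu} = D_j(f \circ h_{b,tu})$ has compact support and $\int_{\mathbb{R}} D_j(f \circ h_{b,tu})\,dx_j = 0$ we get
$$\widetilde{D_i f}(u, t) = \sum_{j=1}^{d-1} e_i \cdot b_j \int_{\mathbb{R}^{d-1}} (\nabla f \cdot b_j) \circ h_{b,tu}\,d\lambda_{d-1} + (e_i \cdot u) \int_{\mathbb{R}^{d-1}} (\nabla f \cdot u) \circ h_{b,tu}\,d\lambda_{d-1}$$
$$= u_i \int_{\mathbb{R}^{d-1}} (\nabla f \cdot u)(h_b(x) + tu)\,d\lambda_{d-1}(x) = u_i \frac{d}{dt} \int_{\mathbb{R}^{d-1}} f(h_b(x) + tu)\,d\lambda_{d-1}(x)$$
by differentiating an integral depending on a parameter. Iterating this formula, we get the second assertion.
Let us point out the interplay between the Fourier transform and the Radon transform. We just consider the case $f \in C_c(\mathbb{R}^d)$. For fixed $u \in S^{d-1}$ we denote by $\widehat{Rf}_u$ the Fourier transform of the function $(Rf)_u := \tilde f_u : t \mapsto \tilde f(u, t)$:
$$\widehat{Rf}_u(s) := \int_{\mathbb{R}} e^{-2\pi i s t}\, \tilde f(u, t)\,dt.$$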
Lemma 8.18 If $f \in C_c(\mathbb{R}^d)$ the following relation holds between the partial Fourier transform of the Radon transform $Rf$ of $f$ and the Fourier transform $\hat f$ of $f$:
$$\forall u \in S^{d-1},\ \forall s \in \mathbb{R} \qquad \widehat{Rf}_u(s) = \hat f(su). \tag{8.31}$$
Proof For $u \in S^{d-1}$ and an orthonormal basis $b := (b_1, \dots, b_{d-1})$ of $H_u$, let us use the isomorphism $h_{b,u} : \mathbb{R}^{d-1} \times \mathbb{R} \to \mathbb{R}^d$ given by
$$h_{b,u}(x, t) := h_{b,u}(x_1, \dots, x_{d-1}, t) := x_1 b_1 + \dots + x_{d-1} b_{d-1} + tu$$
for $x := (x_1, \dots, x_{d-1}) \in \mathbb{R}^{d-1}$, $t \in \mathbb{R}$. Applying Fubini's Theorem and this orthogonal change of coordinates, since $t = \langle h_{b,u}(x, t) \mid u \rangle$, setting $w := h_{b,u}(x, t)$, we get
$$\widehat{Rf}_u(s) = \int_{\mathbb{R}} e^{-2\pi i s t} \int_{\mathbb{R}^{d-1}} f(h_{b,u}(x, t))\,dx\,dt = \int_{\mathbb{R}^d} e^{-2\pi i s \langle h_{b,u}(x,t) \mid u \rangle} f(h_{b,u}(x, t))\, |\det h_{b,u}|\,dx\,dt$$
$$= \int_{\mathbb{R}^d} e^{-2\pi i \langle w \mid su \rangle} f(w)\,dw = \hat f(su)$$
since $|\det h_{b,u}| = 1$.
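Relation (8.31) can be observed numerically. The sketch below works in $\mathbb{R}^2$ with the Gaussian $f(x) = e^{-\pi \|x\|^2}$, which equals its own Fourier transform under the convention $\hat f(y) = \int f(x) e^{-2\pi i \langle x \mid y \rangle}\,dx$. This $f$ is not compactly supported, so it is only a rapidly decaying stand-in for an element of $C_c(\mathbb{R}^d)$; the quadrature windows and all function names are assumptions made for the illustration.

```python
import cmath
import math

def f(x1, x2):
    """Test function on R^2: the Gaussian exp(-pi |x|^2), equal to its own Fourier transform."""
    return math.exp(-math.pi * (x1 * x1 + x2 * x2))

def radon(u, t, half_width=6.0, steps=300):
    """Midpoint-rule approximation of (Rf)(u, t), the integral of f over {x : <u|x> = t}."""
    b1 = (-u[1], u[0])                      # orthonormal basis of the line H_u
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        r = -half_width + (k + 0.5) * h     # coordinate along H_u
        total += f(r * b1[0] + t * u[0], r * b1[1] + t * u[1])
    return total * h

def radon_hat(u, s, half_width=6.0, steps=300):
    """Approximate the Fourier transform in t of t -> (Rf)(u, t), evaluated at s."""
    h = 2.0 * half_width / steps
    total = 0.0 + 0.0j
    for k in range(steps):
        t = -half_width + (k + 0.5) * h
        total += cmath.exp(-2j * math.pi * s * t) * radon(u, t)
    return total * h

def f_hat(y1, y2):
    """Closed-form Fourier transform of the Gaussian above."""
    return math.exp(-math.pi * (y1 * y1 + y2 * y2))

theta = 0.7
u = (math.cos(theta), math.sin(theta))
# The printed differences between the two sides of (8.31) should be negligible.
for s in (0.0, 0.5, 1.0):
    print(s, abs(radon_hat(u, s) - f_hat(s * u[0], s * u[1])))
```

The inner routine plays the role of the integral over $H_{u,t}$ and the outer one that of the one-dimensional Fourier transform in $t$, exactly as in the Fubini argument of the proof.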
Lemma 8.19 If $f \in C_c(\mathbb{R}^d)$ and if $\sigma_{d-1}$ denotes the measure on $S^{d-1}$ defined in Sect. 7.8 one has
$$\int_{S^{d-1}} \int_{\mathbb{R}} \big| \widehat{Rf}_u(s) \big|^2 |s|^{d-1}\,ds\,d\sigma_{d-1}(u) = 2 \int_{\mathbb{R}^d} |f(x)|^2\,dx. \tag{8.32}$$
Proof Using Plancherel's formula $\|f\|_2 = \|\hat f\|_2$ and polar coordinates $y = su$ with $(s, u) \in \mathbb{P} \times S^{d-1}$ to compute $\|\hat f\|_2$ with the help of Proposition 7.19 yields
$$\int_{\mathbb{R}^d} |f(x)|^2\,dx = \int_{S^{d-1}} \int_0^{+\infty} \big| \hat f(su) \big|^2 s^{d-1}\,ds\,d\sigma_{d-1}(u).$$
Now, using the change of variables $r := -s$, $v := -u$, we have
$$\int_{S^{d-1}} \int_0^{+\infty} \big| \hat f(su) \big|^2 s^{d-1}\,ds\,d\sigma_{d-1}(u) = \int_{S^{d-1}} \int_{-\infty}^0 \big| \hat f(rv) \big|^2 |r|^{d-1}\,dr\,d\sigma_{d-1}(v).$$
Adding the two sides and invoking relation (8.31), we get relation (8.32).
We can deduce from Lemma 8.18 an inversion formula for the Radon transform. Thus $f$ can be recovered from its Radon transform.
Theorem 8.40 If $f \in C_c(\mathbb{R}^d)$ then for $x \in \mathbb{R}^d$ one has
$$f(x) = \frac12 \int_{\mathbb{R}} \int_{S^{d-1}} \widehat{Rf}_u(s)\, |s|^{d-1} e^{2\pi i s\, u \cdot x}\,d\sigma_{d-1}(u)\,ds.$$
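Proof Let us denote by $g(x)$ the right-hand side of this relation. By relation (8.31), using again the change of variables $r := -s$, $v := -u$ and then polar coordinates, by Theorem 7.19, we have
$$g(x) = \frac12 \int_{\mathbb{R}} \int_{S^{d-1}} \hat f(su)\, |s|^{d-1} e^{2\pi i s\, u \cdot x}\,d\sigma_{d-1}(u)\,ds = \int_0^{+\infty} \int_{S^{d-1}} \hat f(su)\, s^{d-1} e^{2\pi i s\, u \cdot x}\,d\sigma_{d-1}(u)\,ds = \int_{\mathbb{R}^d} \hat f(y)\, e^{2\pi i\, y \cdot x}\,dy = f(x)$$
by the Fourier inversion theorem.
Corollary 8.16 For $d = 2k + 1$ and $f \in C_c(\mathbb{R}^d)$, for $x \in \mathbb{R}^d$ one has
$$f(x) = \frac{(-1)^k}{2 (2\pi)^{2k}} \int_{S^{d-1}} (Rf_u)^{(2k)}(u \cdot x)\,d\sigma_{d-1}(u),$$
where $(Rf_u)^{(2k)}$ denotes the derivative of order $2k$ of $t \mapsto Rf(u, t)$.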
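Proof Using Corollary 8.15 we have
$$(Rf_u)^{(2k)}(s) = (-1)^k (2\pi)^{2k} \int_{\mathbb{R}} t^{2k} e^{2\pi i s t}\, \widehat{Rf}_u(t)\,dt.$$
Given $x \in \mathbb{R}^d$, setting $s = u \cdot x$ for $u \in S^{d-1}$ and integrating over $S^{d-1}$, since $t^{2k} = |t|^{d-1}$ we get
$$\int_{S^{d-1}} (Rf_u)^{(2k)}(u \cdot x)\,d\sigma_{d-1}(u) = (-1)^k (2\pi)^{2k} \int_{\mathbb{R}} \int_{S^{d-1}} |t|^{d-1} e^{2\pi i t\, u \cdot x}\, \widehat{Rf}_u(t)\,d\sigma_{d-1}(u)\,dt = (-1)^k (2\pi)^{2k}\, 2 f(x)$$
in view of Theorem 8.40.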
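The inversion formula of Corollary 8.16 can be tested numerically for $d = 3$ (so $k = 1$). The sketch below reads the integrand as the second derivative of $t \mapsto Rf(u, t)$ and uses the Gaussian $f(x) = e^{-\pi \|x\|^2}$ on $\mathbb{R}^3$, for which $Rf(u, t) = e^{-\pi t^2}$ for every direction $u$. As before, this $f$ is only a rapidly decaying substitute for a compactly supported function, and the names and grid sizes are assumptions of the illustration.

```python
import math

def radon_profile_dd(t):
    """Second t-derivative of (Rf)_u(t) = exp(-pi t^2), the Radon transform of the
    Gaussian f(x) = exp(-pi |x|^2) on R^3 (the same profile for every direction u)."""
    return (4.0 * math.pi ** 2 * t * t - 2.0 * math.pi) * math.exp(-math.pi * t * t)

def invert_radon_3d(x, n_theta=200, n_phi=200):
    """Evaluate (-1)^k / (2 (2 pi)^(2k)) * int over S^2 of (Rf_u)''(u.x) d sigma(u), k = 1,
    with a midpoint rule in spherical coordinates (d sigma = sin(theta) d theta d phi)."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            u = (sin_t * math.cos(phi), sin_t * math.sin(phi), cos_t)
            dot = u[0] * x[0] + u[1] * x[1] + u[2] * x[2]
            total += radon_profile_dd(dot) * sin_t
    total *= d_theta * d_phi
    return -total / (2.0 * (2.0 * math.pi) ** 2)

x = (0.3, -0.2, 0.5)
# Reconstructed value next to the exact value f(x).
print(invert_radon_3d(x), math.exp(-math.pi * sum(c * c for c in x)))
```

Only the values of $t \mapsto Rf(u, t)$ near $t = u \cdot x$ enter the formula, which is the locality property exploited in the application that follows.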
Application. It follows from this corollary that if $d$ is odd and $(Rf)_u(s) = 0$ for $|s| \le r$, then $f|_{B(0,r)} = 0$. We observe that if $f$ is the characteristic function $1_E$ of a measurable subset $E$ of $\mathbb{R}^d$ then $\tilde f(u, t) = \lambda_{u,t}(E_{u,t})$, where $E_{u,t}$ is the slice $E_{u,t} := E \cap H_{u,t}$. Thus $\tilde f(u, t)$ gives valuable information about the size of $E$, or rather of $E_{u,t}$ (in particular when $E$ is a tumor). It follows from Lemma 7.7 that for all $u \in S^{d-1}$ the function $t \mapsto \lambda_{u,t}(E_{u,t})$ is measurable if $E$ is measurable. In general, not much more can be said about this function. However, for $d \ge 3$ we have the following remarkable regularity result.
Theorem 8.41 For $d \ge 3$ and for any measurable subset $E$ of $\mathbb{R}^d$ with finite measure there exists a subset $N$ of $S^{d-1}$ with null measure such that for all $u \in S^{d-1} \setminus N$ and for all $t \in \mathbb{R}$ the set $E_{u,t}$ is $\lambda_{u,t}$-measurable and $t \mapsto \lambda_{u,t}(E_{u,t})$ satisfies a Hölder condition for any $\alpha \in\, ]0, 1/2[$ (hence is continuous): for some $c := c(u, \alpha)$ one has
$$\forall s, t \in \mathbb{R} \qquad |\lambda_{u,s}(E_{u,s}) - \lambda_{u,t}(E_{u,t})| \le c\, |s - t|^\alpha.$$
It has been shown by Besicovitch that such a result is not valid for $d = 2$. We shall deduce it from a similar result pertaining to functions.
Theorem 8.42 For $d \ge 3$ and for any $f \in L_1(\mathbb{R}^d) \cap L_2(\mathbb{R}^d)$ there exists a subset $N$ of $S^{d-1}$ with null measure such that for all $u \in S^{d-1} \setminus N$ the function $f|_{H_{u,t}}$ is $\lambda_{u,t}$-integrable. Moreover, there exists some $c > 0$ such that for all $f \in C_c(\mathbb{R}^d)$ one has
$$\int_{S^{d-1}} \sup_t Rf(u, t)\,d\sigma_{d-1}(u) \le c \|f\|_1 + c \|f\|_2.$$
Furthermore, for any $u \in S^{d-1} \setminus N$, $\alpha \in\, ]0, 1/2[$, $\tilde f_u := Rf_u : t \mapsto Rf(u, t)$ is continuous and satisfies a Hölder condition: for some $c := c(u, \alpha)$ one has
$$\forall t, t' \in \mathbb{R} \qquad \big| \tilde f_u(t) - \tilde f_u(t') \big| \le c\, |t - t'|^\alpha.$$
To prove this result we need a criterion for the Hölderian behavior of $\tilde f_u := Rf_u$ or, more generally, for a function $h : \mathbb{R} \to \mathbb{R}$.
Lemma 8.20 Let $h : \mathbb{R} \to \mathbb{R}$ be such that $\hat h$ is defined, belongs to $L_1(\mathbb{R})$ and satisfies $h(t) = \hat{\hat h}(-t)$ a.e. Then for $d > 2$ there exists some $c = c(d) > 0$ such that whenever
$$\int_{-1}^{1} \big| \hat h(s) \big|\,ds \le a, \qquad \Big( \int_{\mathbb{R}} \big| \hat h(s) \big|^2 |s|^{d-1}\,ds \Big)^{1/2} \le b$$
for some $a$, $b > 0$ one has
$$\sup_{t \in \mathbb{R}} |h(t)| \le a + bc. \tag{8.33}$$
Moreover, for all $\alpha \in\, ]0, d/2 - 1[$, $\alpha \le 1$, there exists some $c_\alpha > 0$ such that
$$t, t' \in \mathbb{R} \implies |h(t) - h(t')| \le 4\pi (a + b c_\alpha)\, |t - t'|^\alpha. \tag{8.34}$$
Proof Since $h(t) = \hat{\hat h}(-t)$, setting $S := \mathbb{R} \setminus [-1, 1]$ we have
$$h(t) = \int_{\mathbb{R}} \hat h(s)\, e^{2\pi i s t}\,ds = \int_{-1}^{1} \hat h(s)\, e^{2\pi i s t}\,ds + \int_S \hat h(s)\, e^{2\pi i s t}\,ds.$$
The modulus of the first term is bounded by $a$. We estimate the second one by applying the Cauchy-Schwarz inequality:
$$\int_S \big| \hat h(s) \big|\,ds \le \Big( \int_S \big| \hat h(s) \big|^2 |s|^{d-1}\,ds \Big)^{1/2} \Big( \int_S |s|^{1-d}\,ds \Big)^{1/2} \le bc$$
for $c := \big( \int_S |s|^{1-d}\,ds \big)^{1/2} < +\infty$ since $d - 1 > 1$. Thus relation (8.33) holds.
To get relation (8.34) we use the estimate $|e^{ir} - 1| \le |r| \le |r|^\alpha$ for all $r \in [-1, 1]$, obtained by applying the Mean Value Theorem, identifying $\mathbb{C}$ with $\mathbb{R}^2$, and the estimate $|e^{ir} - 1| \le 2 \le 2 |r|^\alpha$ for all $r \in \mathbb{R} \setminus [-1, 1]$. Then, for $s$, $t$, $t' \in \mathbb{R}$ we have
$$\big| e^{2\pi i s t} - e^{2\pi i s t'} \big| \le 4\pi\, |s|^\alpha\, |t - t'|^\alpha,$$
$$\Big| \int_{-1}^{1} \hat h(s) \big( e^{2\pi i s t} - e^{2\pi i s t'} \big)\,ds \Big| \le 4\pi\, |t - t'|^\alpha \int_{-1}^{1} \big| \hat h(s) \big|\, |s|^\alpha\,ds \le 4\pi a\, |t - t'|^\alpha.$$
We estimate the second term $\big| \int_S \hat h(s) ( e^{2\pi i s t} - e^{2\pi i s t'} )\,ds \big|$ in the decomposition of $|h(t) - h(t')|$ by applying again the Cauchy-Schwarz inequality: it is bounded above by
$$4\pi\, |t - t'|^\alpha \Big( \int_S \big| \hat h(s) \big|^2 |s|^{d-1}\,ds \Big)^{1/2} \Big( \int_S |s|^{1-d+2\alpha}\,ds \Big)^{1/2} \le 4\pi b c_\alpha\, |t - t'|^\alpha$$
for $c_\alpha := \big( \int_S |s|^{1-d+2\alpha}\,ds \big)^{1/2} < +\infty$ since $1 - d + 2\alpha < -1$. Gathering the two estimates, we get relation (8.34).
Proof of Theorem 8.42 We replace $h$ with $h_u := Rf_u$ in the preceding lemma, remembering that $\widehat{Rf}_u(s) = \hat f(su)$ by (8.31), and we set
$$a_u := \int_{-1}^{1} \big| \hat h_u(s) \big|\,ds \le 2 \sup_s \big| \hat h_u(s) \big| = 2 \sup_s \big| \widehat{Rf}_u(s) \big| = 2 \sup_s \big| \hat f(su) \big| \le 2 \|f\|_1,$$
$$b_u := \Big( \int_{\mathbb{R}} \big| \hat h_u(s) \big|^2 |s|^{d-1}\,ds \Big)^{1/2}.$$
Relation (8.32) ensures that
$$\int_{S^{d-1}} b_u^2\,d\sigma_{d-1}(u) = \int_{S^{d-1}} \int_{\mathbb{R}} \big| \hat h_u(s) \big|^2 |s|^{d-1}\,ds\,d\sigma_{d-1}(u) = 2 \|f\|_2^2.$$
Thus, there exists a set $N$ of null measure in $S^{d-1}$ such that $b_u < +\infty$ for all $u \in S^{d-1} \setminus N$. Therefore, for $u \in S^{d-1} \setminus N$ we get
$$\sup_{t \in \mathbb{R}} |h_u(t)| \le a_u + b_u c,$$
hence, by the Cauchy-Schwarz inequality,
$$\int_{S^{d-1}} \sup_{t \in \mathbb{R}} |h_u(t)|\,d\sigma_{d-1}(u) \le \int_{S^{d-1}} a_u\,d\sigma_{d-1}(u) + c \int_{S^{d-1}} b_u\,d\sigma_{d-1}(u) \le 2 \sigma_{d-1}(S^{d-1}) \|f\|_1 + 2^{1/2} c\, (\sigma_{d-1}(S^{d-1}))^{1/2} \|f\|_2.$$
The last assertion of Theorem 8.42 is a consequence of relation (8.34).
Exercises
1. The Ray or X-ray transform of a measurable function $f$ on $\mathbb{R}^d$ is defined on the tangent bundle $TS^{d-1}$ of the unit sphere $S^{d-1}$ of $\mathbb{R}^d$. This set is the set of pairs $(u, v)$ with $u \in S^{d-1}$ and $v \in T_u S^{d-1}$, the tangent space to $S^{d-1}$ at $u$, i.e. the orthogonal subspace $H_u := u^\perp$ to $u$. For a measurable function $f$ on $\mathbb{R}^d$ it is given by
$$(Pf)(u, v) := (P_u f)(v) := \int_{\mathbb{R}} f(v + tu)\,dt.$$
Thus, $Pf$ takes into account the behavior of $f$ on the ray issued from $v$ and passing through $u + v$. Given a function $f$ on $TS^{d-1}$ and $u \in S^{d-1}$, denote by $f_u$ the restriction of $f$ to the tangent space to $S^{d-1}$ at $u$: $f_u(v) := f(u, v)$. If $g$ is another measurable function on $TS^{d-1}$ define the partial convolution of $f$ and $g$ by
$$(f *_u g)(u, v) = \int_{H_u} f_u(v - w)\, g_u(w)\,d\lambda_u(w), \qquad (u, v) \in TS^{d-1},$$
$H_u = T_u S^{d-1}$ being endowed with the image measure $\lambda_u$ of $\lambda_{d-1}$ described above. For $f, g \in C_c^\infty(\mathbb{R}^d)$, $u \in S^{d-1}$ show that $Pf *_u Pg = P(f *_u g)$.
2. With the notation of the preceding exercise, define the Fourier transform of $f_u$ by:
$$\hat f_u(w) := \int_{H_u} e^{-2\pi i \langle v \mid w \rangle} f_u(v)\,d\lambda_u(v), \qquad (u, w) \in TS^{d-1}.$$
Prove that for $f \in C_c^\infty(\mathbb{R}^d)$, $(u, w) \in TS^{d-1}$ one has $\widehat{P_u f}(w) = \hat f(w)$.
3. Given $f, g \in C_c^\infty(\mathbb{R}^d)$, let $h := f * g$. Compute $Rh_u$ in terms of $Rf_u$ and $Rg_u$.
The most practical solution is a good theory. Albert Einstein
Abstract A large part of this chapter is devoted to Sobolev spaces, which are convenient spaces for handling partial differential equations. The weakened notion of derivative they convey is related to the question of transposition. Such a notion gives a natural approach to the concept of a weak solution to a partial differential equation. The question of regularity for such a solution is given a concise treatment. On the other hand, some nonlinear problems are considered. In particular, monotone operators are viewed through recent advances using representations by convex functions.
We devote this chapter to an introduction to the study of partial differential equations. Such equations or systems are numerous and serve to model various physical or biological phenomena: impressive lists of such equations are given in the books [88, 117, 248] among many others. These lists are constantly extended by the study of new phenomena or processes. As an example, the mathematical study of hydraulic fracture (fracking) emerges from the knowledge of equations for porous media and thin film equations. Also, more and more examples stem from the progress of mathematical biology. Such equations involve partial derivatives of order one or higher of an unknown function $u$ on an open subset $\Omega$ of $\mathbb{R}^d$ or of a finite family of unknown functions. Thus, we use again the notation
$$D_i u = \frac{\partial u}{\partial x_i}, \qquad D^\alpha u = \frac{\partial^{|\alpha|} u}{\partial x_1^{\alpha_1} \cdots \partial x_d^{\alpha_d}}$$
for $i \in \mathbb{N}_d$ or $\alpha := (\alpha_1, \dots, \alpha_d) \in \mathbb{N}^d$, with $|\alpha| := \alpha_1 + \dots + \alpha_d$. The most famous operator obtained by combining such partial derivatives is the Laplacian
$$\Delta := \sum_{i=1}^{d} \frac{\partial^2}{\partial x_i^2}.$$
It plays an important role in Riemannian geometry and in physics and it can be considered as the prototype of a large class of linear partial differential equations, the class of elliptic equations. Among the deep questions centered around the Laplacian is the following surprising one: can one hear the shape of a drum ([141, 171])? Such a question is motivated by the fact that one hears some harmonics that are linked with the eigenvalues of the Laplacian. Thus one may wonder whether two bounded domains $\Omega$ and $\Omega'$ of $\mathbb{R}^2$ are isometric when the eigenvalues of $\Delta$ on $\Omega$ and $\Omega'$ are the same. The answer is positive if $\Omega$ is a disc but negative if $\Omega$ has corners or if $\Omega$ and $\Omega'$ are smooth domains of the sphere $S^{d-1}$ of $\mathbb{R}^d$; it is still open for smooth domains. A general problem consists in finding as much information as possible on $\Omega$ from the knowledge of the spectrum of $\Delta$ (see [32, 36, 69, 141, 203, 235] among hundreds of studies).
Solving such equations is often difficult and it is rare that explicit solutions can be found. Thus, since one must seek approximate solutions, this topic is closely connected to numerical analysis. Because nonlinear equations require particular tools, we essentially restrict our approach to linear equations. Even for this class of equations, the proofs of the fundamental results are not simple. Thus, we just present the main lines of some of these results.
The choice of the class of functions in which one would like to find a solution is part of the problem. The classical classes of continuously differentiable functions on $\Omega$ are not the most appropriate classes. Some results can be given in spaces of functions satisfying a Hölder property. But the most important class of functions on $\Omega$ for the study of partial differential equations is the class of Sobolev functions. They are (a.e. equality equivalence classes of) functions in $L_p(\Omega)$ that have partial derivatives in $L_p(\Omega)$ in a weak sense we make precise in Sect. 9.1. Such classes of spaces have good compactness and completeness properties and can be embedded in some classical spaces such as $C^k(\Omega)$ or $L_q(\Omega)$. Moreover, the reflexivity of the Sobolev space $W_p^m(\Omega)$ is a great advantage.
In the sequel $\Omega$ denotes an open subset of $\mathbb{R}^d$ and $\mathcal{K}(\Omega)$ stands for the family of compact subsets of $\Omega$. In some cases we require boundedness or smoothness of $\Omega$.
9.1 Definition and Basic Properties of Sobolev Spaces
This section deals with the definition of a class of function spaces that is well suited for the study of partial differential equations and for the calculus of variations. In the case $\Omega = \mathbb{R}^d$, one can introduce this class by using the Fourier transform. However, we are interested in the case of an arbitrary open subset $\Omega$ of $\mathbb{R}^d$.
9.1.1 Test Functions and Weak Derivatives
Let us recall that for $k \in \mathbb{N} \cup \{\infty\}$ the space of functions of class $C^k$ with compact support in an open subset $\Omega$ of $\mathbb{R}^d$ is denoted by $C_c^k(\Omega)$, the support $\operatorname{supp} \varphi$ of a continuous function $\varphi$ being the closure of the set $\varphi^{-1}(\mathbb{R} \setminus \{0\})$. Such functions are called test functions. The notation $\mathcal{D}(\Omega)$ is also classical for $C_c^\infty(\Omega)$. This vector space can be endowed with a topology induced by a family of seminorms (Exercise 1), but it is easier to use the associated convergence defined by: $(\varphi_i)_{i \in I} \to \varphi$ if and only if there exist $K \in \mathcal{K}(\Omega)$, $\bar\imath \in I$ and $k \in \mathbb{N}$ such that $\operatorname{supp} \varphi_i \subset K$ for all $i \ge \bar\imath$ and $(p_{K,k}(\varphi_i - \varphi))_{i \in I} \to 0$, where
$$p_{K,k}(\psi) := \sup_{|\alpha| \le k}\, \sup_{x \in K} |D^\alpha \psi(x)|.$$
One can verify the axioms of convergence (Definition 2.2). It is also easy to verify that for every multi-index $\alpha := (\alpha_1, \dots, \alpha_d) \in \mathbb{N}^d$ the map $D^\alpha : \varphi \mapsto D^\alpha \varphi$ is continuous from $C_c^\infty(\Omega)$ into $C_c^\infty(\Omega)$ endowed with the convergence defined above. One must be aware that the convergence on $C_c^\infty(\Omega)$ is not the convergence associated with the seminorms $p_{K,k}$, even if for all $K \in \mathcal{K}(\Omega)$ the induced convergence on the subspace $\mathcal{D}(K)$ formed by the functions $\varphi \in C_c^\infty(\Omega)$ with support in $K$ coincides with the convergence associated with the seminorms $p_{K,k}$. The construction of a family of seminorms on $C_c^\infty(\Omega)$ inducing the above convergence is rather sophisticated; it is proposed as an exercise (Exercise 1 at the end of this section).
A distribution on $\Omega$ is a continuous linear form on $C_c^\infty(\Omega)$ for the convergence just defined. The space of distributions on $\Omega$ will be denoted by $C_c^\infty(\Omega)^*$; it is often also denoted by $\mathcal{D}'(\Omega)$.
Proposition 9.1 A linear form $T$ on $C_c^\infty(\Omega)$ is a distribution if and only if for all $K \in \mathcal{K}(\Omega)$ there exist $c > 0$ and $k \in \mathbb{N}$ such that $|T(\varphi)| \le c \sum_{|\alpha| \le k} \sup_K |D^\alpha \varphi|$ for all $\varphi \in C_c^\infty(\Omega)$ satisfying $\operatorname{supp} \varphi \subset K$. If the same $k$ can be used for all $K \in \mathcal{K}(\Omega)$, $T$ is said to be a distribution of order $k$.
Proof The condition is obviously sufficient. Let us show it is necessary. If it is not satisfied, there exist some $K \in \mathcal{K}(\Omega)$ and a sequence $(\varphi_n)$ in the space $C_c^\infty(K)$ of functions $\varphi$ in $C_c^\infty(\Omega)$ satisfying $\operatorname{supp} \varphi \subset K$ such that $p_{K,n}(\varphi_n) \le 1$ and $T(\varphi_n) \ge n$ for all $n \in \mathbb{N}$. Then $(\frac1n \varphi_n) \to 0$ in $C_c^\infty(\Omega)$ but $(T(\frac1n \varphi_n))$ does not converge to 0.
Example Let $\mu$ be a Radon measure, i.e. a continuous linear form on $C_c^0(\Omega)$ equipped with a convergence similar to the one on $C_c^\infty(\Omega)$, but with the seminorms $p_{K,0}$ instead of the seminorms $p_{K,k}$. Then $\mu|_{C_c^\infty(\Omega)}$ is a distribution of order 0.
Example Let $f \in L_{1,loc}(\Omega)$ and let $T_f$ be given by $T_f(\varphi) := \int_\Omega f(x) \varphi(x)\,dx$ for $\varphi \in C_c^\infty(\Omega)$. Then $T_f$ is a Radon measure, hence a distribution. By Corollary 8.12
9 Partial Differential Equations
and Theorem 8.26, for any $f \in L_q(\Omega)$ with $q \in [1, \infty[$ we have $f = 0$ whenever $T_f(\varphi) = 0$ for all $\varphi \in C_c^\infty(\Omega)$. Thus one can identify $f$ and $T_f$.
Example The Dirac measure $\delta_a$ associated with $a \in \Omega$ is the distribution $\varphi \mapsto \varphi(a)$.
Transposition enables us to define the derivative of a distribution.
Definition 9.1 Given a multi-index $\alpha$ and a distribution $T$, the $\alpha$-derivative $D^\alpha T$ is the distribution defined by
$$(D^\alpha T)(\varphi) := (-1)^{|\alpha|}\, T(D^\alpha \varphi), \qquad \varphi \in C_c^\infty(\Omega).$$
Since $\varphi \mapsto D^\alpha \varphi$ is continuous, the linear form $\varphi \mapsto T(D^\alpha \varphi)$ is continuous on $C_c^\infty(\Omega)$, so that $D^\alpha T$ is indeed a distribution. The sign in front of $T(D^\alpha \varphi)$ is justified by the following coherence result.
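Proposition 9.2 If $f \in C^k(\Omega)$ with $k \in \mathbb{N} \setminus \{0\}$ and if $\alpha$ is a multi-index satisfying $|\alpha| \le k$, then $D^\alpha T_f = T_g$ with $g := D^\alpha f$.
Proof By the next exercise, it suffices to prove the result for $k = 1$ and $\alpha := (0, \dots, 0, 1, 0, \dots, 0)$. This follows from the integration by parts formula:
$$\forall \varphi \in C_c^\infty(\Omega) \qquad \int_\Omega D_i f(x)\, \varphi(x)\,dx = -\int_\Omega f(x)\, D_i \varphi(x)\,dx.$$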
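Exercise For any $T \in C_c^\infty(\Omega)^*$ and any multi-indices $\alpha$, $\beta$, verify that $D^\beta(D^\alpha T) = D^{\alpha + \beta} T$.
Definition 9.2 Given a multi-index $\alpha$ and $u \in L_{1,loc}(\Omega)$, an element $w$ of $L_{1,loc}(\Omega)$ is said to be the weak $\alpha$-partial derivative of $u$ if for every $\varphi \in C_c^\infty(\Omega)$ one has
$$\int_\Omega w(x)\, \varphi(x)\,dx = (-1)^{|\alpha|} \int_\Omega u(x)\, D^\alpha \varphi(x)\,dx. \tag{9.1}$$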
The preceding definition avoids distributions. Nonetheless, one recognizes in it that the distribution $T_w$ associated with $w$ is just $D^\alpha T_u$ in the sense of distributions.
Example For any open interval $\Omega := ]a, b[$ of $\mathbb{R}$ containing 0, the (a.e. equality equivalence class of the) Heaviside function $w : \mathbb{R} \to \mathbb{R}$ given by $w(x) = -1$ for $x < 0$, $w(x) = 1$ for $x > 0$, $w(0)$ being arbitrary, is the weak derivative $Du$ of $u(\cdot) = |\cdot|$. Indeed, for every $\varphi \in C_c^\infty(\Omega)$, since $a\varphi(a) = 0 = 0\varphi(0)$ and $b\varphi(b) = 0$ when $\varphi$ is extended by 0 on $\mathbb{R} \setminus \Omega$, one has
$$\int_a^b u(x)\, D\varphi(x)\,dx = -\int_a^0 x\, D\varphi(x)\,dx + \int_0^b x\, D\varphi(x)\,dx = \int_a^0 \varphi(x)\,dx - \int_0^b \varphi(x)\,dx = -\int_a^b w(x)\, \varphi(x)\,dx.$$
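Relation (9.1) for this example can be checked by quadrature. The sketch below takes $\Omega = ]-2, 2[$ and an off-centre bump as test function; the centre, the radius and the function names are choices made for the illustration only. An off-centre bump is used because for a function $\varphi$ that is even about 0 both sides of (9.1) vanish, which would make the check uninformative.

```python
import math

CENTER, RADIUS = 0.3, 0.8   # support of the test function: [-0.5, 1.1], inside ]-2, 2[

def phi(x):
    """Smooth bump exp(-1 / (1 - s^2)) with s = (x - CENTER) / RADIUS, extended by 0."""
    s = (x - CENTER) / RADIUS
    gap = 1.0 - s * s
    return math.exp(-1.0 / gap) if gap > 0.0 else 0.0

def dphi(x):
    """Classical derivative of phi."""
    s = (x - CENTER) / RADIUS
    gap = 1.0 - s * s
    if gap <= 1e-8:          # phi and its derivatives vanish in floating point there
        return 0.0
    return math.exp(-1.0 / gap) * (-2.0 * s) / (gap * gap) / RADIUS

def u(x):
    return abs(x)

def w(x):
    """Candidate weak derivative of u: -1 on the left of 0, +1 on the right."""
    return -1.0 if x < 0.0 else 1.0

def integrate(g, a=-2.0, b=2.0, steps=40_000):
    """Midpoint rule; with an even number of steps, 0 is a cell boundary, so the
    kink of u and the jump of w never fall inside a cell."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

lhs = integrate(lambda x: w(x) * phi(x))      # integral of w * phi
rhs = -integrate(lambda x: u(x) * dphi(x))    # (-1)^1 times the integral of u * D(phi)
print(lhs, rhs)
```

The two printed numbers should coincide up to the quadrature error, in agreement with the integration by parts carried out above.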
Example Not all elements $u \in L_{1,loc}(\Omega)$ have a weak derivative in $L_{1,loc}(\Omega)$. Taking $\Omega := \mathbb{R}$, for $u$ the function given by $u(x) = 0$ for $x \in \mathbb{R}_- \cup [1, \infty[$, $u(x) = 1$ for $x \in [0, 1[$, we easily see that if $w \in L_{1,loc}(\Omega)$ is a weak derivative of $u$, then we must have $w = 0$ a.e. on $\mathbb{R}_- \cup [1, \infty[$ and on $]0, 1[$, hence $\int_\Omega w(x) \varphi(x)\,dx = 0$ for all $\varphi \in C_c^\infty(\Omega)$. However, for $\varphi \in C_c^\infty(\Omega)$ such that $D\varphi = 1$ on $[0, 1]$ we have $\int_\Omega u(x)\, D\varphi(x)\,dx > 0$.
The assertions of the following lemma are left as exercises.
Lemma 9.1 If $u \in C^{|\alpha|}(\Omega)$, then $D^\alpha u$ is the weak $\alpha$-partial derivative of $u$. For $u \in L_{1,loc}(\Omega)$ there is at most one weak $\alpha$-partial derivative of $u$. Let $u \in L_{1,loc}(\Omega)$ be such that the weak derivatives $D_i u$ exist and belong to $L_{1,loc}(\Omega)$ for $i \in \mathbb{N}_d$. Then for $f \in C^1(\Omega)$ the weak derivatives $D_i(fu)$ exist in $L_{1,loc}(\Omega)$ and $D_i(fu) = u D_i f + f D_i u$.
These assertions justify the notation $D^\alpha u$ for the weak $\alpha$-partial derivative of $u$.
For $p \in [1, \infty[$, $u$, $v \in L_{p,loc}(\Omega)$, and a multi-index $\alpha$, one says that $v = D^\alpha u$ in the strong $L_p$ sense if for any $K \in \mathcal{K}(\Omega)$ there exists a sequence $(u_n)$ in $C^{|\alpha|}(\Omega)$ such that
$$\Big( \int_K |u_n(x) - u(x)|^p\,dx \Big) \to 0, \qquad \Big( \int_K |D^\alpha u_n(x) - v(x)|^p\,dx \Big) \to 0. \tag{9.2}$$
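Proposition 9.3 For $p \in [1, \infty[$, $u$, $v \in L_{p,loc}(\Omega)$, and a multi-index $\alpha$, one has $v = D^\alpha u$ in the strong $L_p$ sense if and only if $v = D^\alpha u$ in the weak sense.
Proof Suppose $v = D^\alpha u$ in the strong $L_p$ sense. Given $\varphi \in C_c^\infty(\Omega)$, let $K := \operatorname{supp} \varphi$ and let $(u_n)$ be a sequence in $C^{|\alpha|}(\Omega)$ satisfying (9.2). Since $w \mapsto \int_K w \varphi$ is continuous with respect to the norm of $L_p(K)$, we have $\int_\Omega v \varphi = \int_K v \varphi = \lim_n \int_K D^\alpha u_n\, \varphi$ and similarly $\int_\Omega u D^\alpha \varphi = \int_K u D^\alpha \varphi = \lim_n \int_K u_n D^\alpha \varphi$. Since $\int_K u_n D^\alpha \varphi = (-1)^{|\alpha|} \int_K D^\alpha u_n\, \varphi$, we get (9.1). The converse is obtained in a more precise form in the next theorem.
Theorem 9.1 (Friedrichs) Let $p \in [1, \infty[$ and let $u$, $v_\alpha \in L_p(\Omega)$ for $\alpha \in \mathbb{N}^d$ satisfying (9.1) with $w := v_\alpha$. Then, there exists a sequence $(u_n)$ in $C_c^\infty(\mathbb{R}^d)$ such that for all $K \in \mathcal{K}(\Omega)$ one has
$$(u_n|_\Omega)_n \to u \text{ in } L_p(\Omega), \qquad (D^\alpha u_n|_K)_n \to v_\alpha|_K \text{ in } L_p(K).$$
Proof We extend every $w \in L_p(\Omega)$ by 0 on $\mathbb{R}^d \setminus \Omega$. Let $\rho : \mathbb{R}^d \to \mathbb{R}$ be defined by
$$\rho(x) = c\, e^{1/(\|x\|^2 - 1)} \text{ for } x \in B_d := B_{\mathbb{R}^d}(0, 1), \qquad \rho(x) = 0 \text{ for } x \in \mathbb{R}^d \setminus B_d,$$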
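the constant $c$ being adjusted so that $\int_{\mathbb{R}^d} \rho = 1$. Given a sequence $(r_n) \to 0_+$, let $\rho_n(\cdot) := r_n^{-d} \rho(\cdot / r_n)$. Define the regularization operators $R_n : L_p(\Omega) \to C(\mathbb{R}^d)$ by
$$(R_n w)(x) := \int_\Omega \rho_n(x - y)\, w(y)\,dy = \int_{B_d} \rho(z)\, w(x - r_n z)\,dz, \qquad w \in L_p(\Omega),\ x \in \mathbb{R}^d.$$
Given $K \in \mathcal{K}(\Omega)$, let $n_K \in \mathbb{N}$ be such that for $n \ge n_K$ one has
$$r_n < r := \operatorname{gap}(K, \mathbb{R}^d \setminus \Omega) := \inf\{\|x - y\| : x \in K,\ y \in \mathbb{R}^d \setminus \Omega\},$$
hence $x - r_n z \in \Omega$ for $x \in K$, $z \in B_d$. By Hölder's inequality and the relation $\int_{B_d} \rho = 1$, for $n \ge n_K$, $x \in K$ one has
$$|(R_n w)(x)|^p \le \Big( \int_{B_d} \rho(z)\,dz \Big)^{p-1} \int_{B_d} \rho(z)\, |w(x - r_n z)|^p\,dz = \int_{B_d} \rho(z)\, |w(x - r_n z)|^p\,dz,$$
hence, using the Fubini-Tonelli Theorem,
$$\int_K |(R_n w)(x)|^p\,dx \le \int_{B_d} \Big( \int_K |w(x - r_n z)|^p\,dx \Big) \rho(z)\,dz,$$
so that
$$\forall n \ge n_K,\ w \in L_p(\Omega) \qquad \|R_n w\|_{L_p(K)} \le \|w\|_{L_p(\Omega)}. \tag{9.3}$$
Given $\varepsilon > 0$, using the density of $C(\Omega)$ in $L_p(\Omega)$ (Corollary 8.12), we pick $v \in C(\Omega)$ such that $\|u - v\|_{L_p(\Omega)} < \varepsilon$. The estimate (9.3) with $w := v - u$ yields
$$\|R_n v - R_n u\|_{L_p(K)} \le \|v - u\|_{L_p(\Omega)} < \varepsilon.$$
Observing that for $x \in K$ we have $v(x) = \int_{B_d} \rho(z)\, v(x)\,dz$ and
$$|(R_n v)(x) - v(x)| \le \int_{B_d} \rho(z)\, |v(x - r_n z) - v(x)|\,dz,$$
we see that $(R_n v - v)_n \to 0$ uniformly on $K$ and $\|R_n v - v\|_{L_p(K)} \le \varepsilon$ for $n$ large enough. Thus, for $n$ large enough,
$$\|R_n u - u\|_{L_p(K)} \le \|R_n u - R_n v\|_{L_p(K)} + \|R_n v - v\|_{L_p(K)} + \|v - u\|_{L_p(K)} \le 3\varepsilon.$$
Now, for $x \in K$ the function $\rho_{n,x} : y \mapsto \rho_n(x - y)$ belongs to $C_c^\infty(\Omega)$ and $(-1)^{|\alpha|} r_n^d D^\alpha \rho_{n,x}(y) = r_n^{-|\alpha|} D^\alpha \rho((x - y)/r_n)$, so that the definition of the weak derivative yields, after differentiating under the integral symbol,
$$D^\alpha (R_n u)(x) = \frac{r_n^{-|\alpha|}}{r_n^d} \int_\Omega D^\alpha \rho\Big( \frac{x - y}{r_n} \Big) u(y)\,dy = \int_\Omega \rho_{n,x}(y)\, v_\alpha(y)\,dy = (R_n v_\alpha)(x).$$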
Replacing $u$ with $v_\alpha$ in the previous estimate we get $(\|R_n v_\alpha - v_\alpha\|_{L_p(K)})_n \to 0$ and $(\|D^\alpha R_n u - v_\alpha\|_{L_p(K)})_n \to 0$.
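The regularization operators of this proof can be experimented with in dimension one. The sketch below mollifies the indicator of $[0, 1[$ with the bump $\rho$ and measures the $L_1$ distance to the original function on $K = [-1, 2]$ for several radii. The normalizing constant $c$ is replaced by a discrete normalization of the quadrature weights, and the grid sizes and function names are assumptions of the illustration.

```python
import math

def bump(z):
    """Unnormalized mollifier exp(1 / (z^2 - 1)) on ]-1, 1[, extended by 0."""
    gap = 1.0 - z * z
    return math.exp(-1.0 / gap) if gap > 0.0 else 0.0

STEPS = 200
NODES = [-1.0 + (k + 0.5) * 2.0 / STEPS for k in range(STEPS)]
_raw = [bump(z) for z in NODES]
_mass = sum(_raw)
WEIGHTS = [v / _mass for v in _raw]   # discrete stand-in for rho(z) dz, total mass 1

def regularize(w, r):
    """Return the function x -> int rho(z) w(x - r z) dz, approximated on NODES."""
    return lambda x: sum(q * w(x - r * z) for q, z in zip(WEIGHTS, NODES))

def lp_distance(g, h, a, b, p=1.0, steps=600):
    """Midpoint-rule approximation of the L_p(]a, b[) norm of g - h."""
    dx = (b - a) / steps
    total = 0.0
    for k in range(steps):
        x = a + (k + 0.5) * dx
        total += abs(g(x) - h(x)) ** p
    return (total * dx) ** (1.0 / p)

def step(x):
    """Indicator of [0, 1[, an element of L_p(R) that is not continuous."""
    return 1.0 if 0.0 <= x < 1.0 else 0.0

# The distance between the regularized function and the original shrinks with the radius.
for r in (0.4, 0.2, 0.1):
    print(r, lp_distance(regularize(step, r), step, -1.0, 2.0))
```

Because the weights are nonnegative and sum to one, the regularized function takes values between the infimum and the supremum of $w$, which is the discrete counterpart of the Hölder estimate leading to (9.3).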
9.1.2 Definition and First Properties of Sobolev Spaces
We are ready to describe the important class of Sobolev spaces.
Definition 9.3 Given $m \in \mathbb{N} \setminus \{0\}$ and $p \in [1, \infty]$, the Sobolev space $W_p^m(\Omega)$ is the set of $w \in L_p(\Omega)$ such that for every multi-index $\alpha$ satisfying $|\alpha| \le m$ the weak derivative $D^\alpha w$ of $w$ is an element of $L_p(\Omega)$. The norm of $W_p^m(\Omega)$ is given by
$$\|w\|_{m,p} := \|w\|_{W_p^m(\Omega)} := \Big[ \|w\|_p^p + \sum_{|\alpha| \le m} \|D^\alpha w\|_p^p \Big]^{1/p}.$$
For $p = 2$, $W_p^m(\Omega)$ is often denoted by $H^m(\Omega)$.
Theorem 9.2 For all $m \in \mathbb{N} \setminus \{0\}$ and $p \in [1, \infty]$ the Sobolev space $W_p^m(\Omega)$ is a Banach space and $H^m(\Omega)$ is a Hilbert space. Moreover, for $p \in [1, \infty[$ the space $W_p^m(\Omega)$ is separable and for $p \in\, ]1, \infty[$ it is reflexive.
Proof In view of the properties of the $L_p(\Omega)$ space, it suffices to show that the map $J_{m,p} : w \mapsto (w, D^\alpha w)_{|\alpha| \le m}$ is an isometry of $W_p^m(\Omega)$ onto a closed subspace of $L_p(\Omega)^{m(d)}$, where $m(d) := \operatorname{card} N(m, d)$, with $N(m, d) := \{\alpha \in (\{0\} \cup \mathbb{N}_m)^d : 0 \le |\alpha| \le m\}$. The fact that $J_{m,p}$ is isometric is obvious. Let $(w_n)$ be a sequence in $W_p^m(\Omega)$ such that $(J_{m,p}(w_n))_n$ converges to some $(w, u_\alpha) \in L_p(\Omega)^{m(d)}$. Then the sequences $(w_n)$ and $(D^\alpha w_n)$ are Cauchy sequences in $L_p(\Omega)$ and
$$\forall \varphi \in C_c^\infty(\Omega),\ \forall \alpha \in N(m, d) \qquad \int_\Omega w_n D^\alpha \varphi = (-1)^{|\alpha|} \int_\Omega D^\alpha w_n\, \varphi,$$
so that, passing to the limit in this relation, we get
$$\forall \varphi \in C_c^\infty(\Omega),\ \forall \alpha \in N(m, d) \qquad \int_\Omega w D^\alpha \varphi = (-1)^{|\alpha|} \int_\Omega u_\alpha\, \varphi.$$
This shows that $u_\alpha$ is the weak $\alpha$-derivative of $w$, hence that $w \in W_p^m(\Omega)$. Finally, the norm of $H^m(\Omega)$ is clearly associated with the scalar product induced by the one in $L_2(\Omega)^{m(d)}$.
Friedrichs' Theorem can be refined. Hereafter, for $k \in \mathbb{N} \cup \{\infty\}$, we adopt the usual (but somewhat queer) notation $C^k(\overline\Omega)$ for the space of restrictions to $\Omega$ of functions in $C^k(\mathbb{R}^d)$. Note that for $k = 0$ the space $C^k(\overline\Omega)$ is just the space $C(\overline\Omega)$ of continuous functions on $\overline\Omega$ and that when $\Omega$ is smooth, $C^k(\overline\Omega)$ can be given an intrinsic definition.
Theorem 9.3 (Meyers-Serrin) For $m \in \mathbb{N} \setminus \{0\}$ and $p \in [1, \infty[$, the space $C^\infty(\Omega) \cap W_p^m(\Omega)$ is dense in $W_p^m(\Omega)$.
Proof Let $(\Omega_n)_n$ be a sequence of open subsets of $\Omega$ covering $\Omega$ such that for all $n$ the set $K_n := \operatorname{cl}(\Omega_n)$ is compact and contained in $\Omega_{n+1}$. For instance, one can take $\Omega_n := \{x \in \Omega : d(x, \mathbb{R}^d \setminus \Omega) > 2^{-n}\} \cap B(0, n)$. Let $K_n' := K_n \setminus \Omega_{n-1}$ and $V_n := \Omega_{n+1} \setminus K_{n-1}$, so that $K_n' \subset V_n$. Since $K_n'$ is compact and $V_n$ is open, there exists a function $q_n$ of class $C^\infty$ whose support is contained in $V_n$ satisfying $q_n|_{K_n'} = 1$. Since $(V_n)_n$ is a covering of $\Omega$ and since $V_{n+1} \cap V_{n-1} = \emptyset$, the families $(V_n)_n$ and $(\operatorname{supp}(q_n))_n$ are locally finite. Thus $q := \sum_n q_n$ is of class $C^\infty$ and $q \ge 1$ on $\Omega$. Let $p_n := q_n / q$, so that $(p_n)_n$ is a partition of unity subordinated to $(V_n)_n$. Given $u \in W_p^m(\Omega)$ and $\varepsilon > 0$ we have $\operatorname{supp}(p_n u) \subset V_n$, so that there exists some $r_n > 0$ such that $\operatorname{supp}(R_{r_n}(p_n u)) \subset V_n$ and
$$\|R_{r_n}(p_n u) - p_n u\|_p < \varepsilon / 2^n, \qquad \|R_{r_n}(D_i(p_n u)) - D_i(p_n u)\|_p < \varepsilon / 2^n$$
for $i \in \mathbb{N}_d$, $n \in \mathbb{N} \setminus \{0\}$. Let us set
$$u_\varepsilon := \sum_{n \ge 1} R_{r_n}(p_n u).$$
Since the family $(\operatorname{supp}(R_{r_n}(p_n u)))_n$ is locally finite, $u_\varepsilon \in C^\infty(\mathbb{R}^d)$ and since $u = \sum_{n \ge 1} p_n u$ we have
$$\|u_\varepsilon|_\Omega - u\|_p \le \sum_{n \ge 1} \|R_{r_n}(p_n u) - p_n u\|_p \le \varepsilon,$$
$$\|D_i u_\varepsilon|_\Omega - D_i u\|_p \le \sum_{n \ge 1} \|R_{r_n}(D_i(p_n u)) - D_i(p_n u)\|_p \le \varepsilon$$
since $D_i(R_{r_n}(p_n u)) = R_{r_n}(D_i(p_n u))$ by Lemma 9.1 and the end of the proof of Theorem 9.1. Thus $u_\varepsilon \in W_p^m(\Omega)$ and $(u_\varepsilon) \to u$ in $W_p^m(\Omega)$ as $\varepsilon \to 0_+$.
The following characterizations will be useful for the study of regularity properties of solutions to elliptic partial differential equations.
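Proposition 9.4 For $p \in\, ]1, \infty]$, $q := (1 - 1/p)^{-1}$ (with $q = 1$ when $p = \infty$) and $u \in L_p(\Omega)$ the following assertions are equivalent:
(a) $u \in W_p^1(\Omega)$;
(b) there exists some constant $c \in \mathbb{R}_+$ such that for all $i \in \mathbb{N}_d$
$$\Big| \int_\Omega u D_i \varphi \Big| \le c \|\varphi\|_{L_q(\Omega)} \qquad \forall \varphi \in C_c^\infty(\Omega);$$
(c) there exists some constant $c \in \mathbb{R}_+$ such that for all $K \in \mathcal{K}(\Omega)$ and all $w \in \delta B_{\mathbb{R}^d}$ satisfying $K + \delta B_{\mathbb{R}^d} \subset \Omega$, one has for $t_w(x) := x - w$
$$\|u \circ t_w - u\|_{L_p(K)} \le c \|w\|.$$
Moreover, one can take $c = \|u\|_{W_p^1(\Omega)}$ in (b) and (c) when (a) holds.
Proof (a)$\Rightarrow$(b) Since $\int_\Omega u D_i \varphi = -\int_\Omega \varphi D_i u$ this follows from Hölder's inequality with $c = \max_{i \in \mathbb{N}_d} \|D_i u\|_p$. This implication is also a consequence of the other ones below.
(a)$\Rightarrow$(c) We first observe that by Theorem 9.1 and a passage to the limit it suffices to prove (c) with $c := \|u\|_{W_p^1(\Omega)}$ when $u \in C_c^\infty(\Omega)$. Then, given $K \in \mathcal{K}(\Omega)$ and $w \in \mathbb{R}^d$ satisfying $t_w(K) \subset \Omega$, for $x \in K$ we have
$$u(x - w) - u(x) = -\int_0^1 \nabla u(x - tw) \cdot w\,dt,$$
hence, for $p \in\, ]1, \infty[$,
$$|u(x - w) - u(x)|^p \le \|w\|^p \int_0^1 \|\nabla u(x - tw)\|^p\,dt,$$
$$\int_K |u(x - w) - u(x)|^p\,dx \le \|w\|^p \int_K \int_0^1 \|\nabla u(x - tw)\|^p\,dt\,dx \le \|w\|^p \int_\Omega \|\nabla u(x)\|^p\,dx.$$
Passing to the limit as $p \to \infty$ in the inequality $\|u \circ t_w - u\|_{L_p(K)} \le \|\nabla u\|_{L_p(\Omega)} \|w\|$ we just obtained, we extend this inequality to the case $p = \infty$.
(c)$\Rightarrow$(b) Given $i \in \mathbb{N}_d$, for $t > 0$ small enough, a change of variables shows that
$$\int_\Omega u(x)\, \frac1t \big( \varphi(x + te_i) - \varphi(x) \big)\,dx = -\int_\Omega \frac1t \big( u(x) - u(x - te_i) \big) \varphi(x)\,dx$$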
for all $\varphi \in C_c^\infty(\Omega)$. By Hölder's inequality and (c), the absolute value of the right-hand side is bounded above by $c \|\varphi\|_{L_q(\Omega)}$. Since the left-hand side converges to $\int_\Omega u D_i \varphi$ as $t \to 0_+$, we get $\big| \int_\Omega u D_i \varphi \big| \le c \|\varphi\|_{L_q(\Omega)}$.
(b)$\Rightarrow$(a) Assertion (b) shows that the linear map $\varphi \mapsto \int_\Omega u D_i \varphi$ is continuous on $C_c^\infty(\Omega)$ endowed with the $L_q$ norm. Since $C_c^\infty(\Omega)$ is dense in $L_q(\Omega)$, this map can be extended to $L_q(\Omega)$ into a continuous linear map $\ell$. Then, the Riesz representation theorem yields some $v_i \in L_p(\Omega)$ such that $\ell(\varphi) = -\int_\Omega v_i \varphi$ for all $\varphi \in L_q(\Omega)$, in particular for all $\varphi \in C_c^\infty(\Omega)$. This shows that $v_i$ is the $i$-partial weak derivative of $u$, so that $u \in W_p^1(\Omega)$.
Remark The preceding proof shows that if $w \in \mathbb{R}^d$ is such that $t_w(\Omega) \subset \Omega$, then
$$\forall u \in W_p^1(\Omega) \qquad \|u \circ t_w - u\|_{L_p(\Omega)} \le \|w\| \cdot \|\nabla u\|_{L_p(\Omega)}.$$
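The estimate of this remark is easy to observe numerically. The sketch below takes $\Omega = \mathbb{R}$ (so that $t_w(\Omega) \subset \Omega$ for every $w$), $p = 2$ and $u(x) = e^{-x^2}$; the integration window and the function names are choices made for the illustration.

```python
import math

def u(x):
    return math.exp(-x * x)

def du(x):
    """Derivative of u, which is also its weak derivative since u is smooth."""
    return -2.0 * x * math.exp(-x * x)

def l2_norm(g, half_width=10.0, steps=20_000):
    """Midpoint-rule approximation of the L_2 norm of g on ]-half_width, half_width[."""
    h = 2.0 * half_width / steps
    return math.sqrt(h * sum(g(-half_width + (k + 0.5) * h) ** 2 for k in range(steps)))

def translation_gap(w):
    """L_2 norm of u o t_w - u, where t_w(x) = x - w."""
    return l2_norm(lambda x: u(x - w) - u(x))

grad_norm = l2_norm(du)
# Left-hand side of the estimate next to the bound |w| * ||Du||_2.
for w in (0.1, 0.5, 2.0):
    print(w, translation_gap(w), abs(w) * grad_norm)
```

For small $w$ the two columns are close to each other, which reflects the fact that the difference quotients $(u \circ t_w - u)/\|w\|$ approximate a directional derivative; for large $w$ the bound becomes crude.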
The next result presents another characterization in the case $p := \infty$. We admit it (see [118] for instance).
Proposition 9.5 (Rademacher) For a bounded open subset $\Omega$ of class $C^1$ of $\mathbb{R}^d$, the space $W_\infty^1(\Omega)$ is the space of Lipschitzian functions on $\Omega$. Moreover, every element $u \in W_\infty^1(\Omega)$ is differentiable a.e. and its partial derivatives are its weak partial derivatives.
A characterization of $H^k(\mathbb{R}^d)$ using the Fourier transform can be given.
Proposition 9.6 Given $k \in \mathbb{N} \setminus \{0\}$, $u \in L_2(\mathbb{R}^d)$ one has $u \in H^k(\mathbb{R}^d)$ if and only if $(1 + \|\cdot\|^k)\, \hat u(\cdot) \in L_2(\mathbb{R}^d)$, where $\hat u$ is the Fourier transform of $u$. Moreover, there exists some $c > 0$ such that
$$\frac1c \|u\|_{H^k(\mathbb{R}^d)} \le \big\| (1 + \|\cdot\|^k)\, \hat u(\cdot) \big\|_{L_2(\mathbb{R}^d)} \le c \|u\|_{H^k(\mathbb{R}^d)}.$$
Proof Let $u \in H^k(\mathbb{R}^d)$. For any multi-index $\alpha$ satisfying $|\alpha| \le k$, approximating $u$ with a sequence in $C_c^k(\mathbb{R}^d)$ we get that $y \mapsto \widehat{D^\alpha u}(y) = (2\pi i y)^\alpha \hat u(y)$ is in $L_2(\mathbb{R}^d)$ by Plancherel's theorem, hence, taking $\alpha := (0, \dots, 0, k, 0, \dots, 0)$ with $k$ at the $j$-th place, $j \in \mathbb{N}_d$, we obtain
$$\int_{\mathbb{R}^d} (1 + \|y\|^k)^2\, |\hat u(y)|^2\,dy \le c^2 \int_{\mathbb{R}^d} \sum_{j=1}^{d} \big| D_j^k u(x) \big|^2\,dx \le c^2 \|u\|_{H^k(\mathbb{R}^d)}^2$$
for some $c > 0$ and $(1 + \|\cdot\|^k)\, \hat u(\cdot) \in L_2(\mathbb{R}^d)$.
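Conversely, let $u \in L_2(\mathbb{R}^d)$ be such that $(1 + \|\cdot\|^k)\hat u(\cdot) \in L_2(\mathbb{R}^d)$. Denote by $v := \check u$ the inverse image of $u$ by the Fourier transform, so that $u = \hat v$. Setting $m_\alpha(y) := (-2\pi i)^{|\alpha|} y^\alpha$ and $u_\alpha := \widehat{m_\alpha v}$, the relation $\widehat{m_\alpha v} = D^\alpha \hat v$ obtained in Proposition 8.21 yields $u_\alpha = D^\alpha u$. Then, for all $\varphi \in C_c^\infty(\mathbb{R}^d)$, by Proposition 8.22 we have
$$\int_{\mathbb{R}^d} D^\alpha\varphi\, u = \int_{\mathbb{R}^d} D^\alpha\varphi\, \hat v = \int_{\mathbb{R}^d} \widehat{D^\alpha\varphi}\, v = \int_{\mathbb{R}^d} \hat\varphi\, (-1)^{|\alpha|} m_\alpha v = (-1)^{|\alpha|} \int_{\mathbb{R}^d} \varphi\, \widehat{m_\alpha v}.$$
This means that $u_\alpha := \widehat{m_\alpha v}$ is the weak $\alpha$-partial derivative of $u$. Since
$$\left\|\widehat{m_\alpha v}\right\|_2^2 = \|m_\alpha v\|_2^2 = (2\pi)^{2|\alpha|} \int_{\mathbb{R}^d} |y^\alpha|^2\, |v(y)|^2\, dy \le (2\pi)^{2|\alpha|} \left\|(1 + \|\cdot\|^k)\hat u(\cdot)\right\|_2^2,$$
we have $u_\alpha := \widehat{m_\alpha v} \in L_2(\mathbb{R}^d)$ and $u \in H^k(\mathbb{R}^d)$. $\square$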
The preceding characterization shows that the following definition of fractional Sobolev spaces is compatible with the one we gave for integer values of $s$.

Definition 9.4 Given $s \in \mathbb{P} := \,]0, \infty[$, the space $H^s(\mathbb{R}^d)$ is the set of $u \in L_2(\mathbb{R}^d)$ such that $(1 + \|\cdot\|^s)\hat u \in L_2(\mathbb{R}^d)$.

Properties of such spaces, which can be used to obtain trace results on boundaries, can be found in more advanced books.
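As a quick illustration of Definition 9.4, here is a sketch in dimension $d = 1$. It assumes the normalization $\hat u(y) = \int u(x) e^{-2\pi i x y}\,dx$ of the Fourier transform, which may differ from the convention fixed earlier in the book; the choice of $u$ is ours.

```latex
u := 1_{[-1/2,\,1/2]}, \qquad
\hat u(y) = \int_{-1/2}^{1/2} e^{-2\pi i x y}\,dx = \frac{\sin(\pi y)}{\pi y},
\\[4pt]
\int_{\mathbb{R}} (1+|y|^{s})^{2}\,|\hat u(y)|^{2}\,dy < \infty
\iff \int_{1}^{\infty} y^{2s-2}\,\sin^{2}(\pi y)\,dy < \infty
\iff s < \tfrac12 .
```

Under this assumption, the indicator function of an interval belongs to $H^s(\mathbb{R})$ for $s \in \,]0, 1/2[$ but not for $s \ge 1/2$. In particular it is not in $H^1(\mathbb{R})$, which agrees with the fact that its distributional derivative is a difference of two Dirac measures.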
9.1.3 Calculus Rules in Sobolev Spaces

Up to now we have used some product rules when one of the factors is in $C^\infty(\Omega)$. More general rules should be given.

Proposition 9.7 Given $p \in [1, \infty]$ and $u, v \in W_p^1(\Omega) \cap L_\infty(\Omega)$, one has $uv \in W_p^1(\Omega) \cap L_\infty(\Omega)$ and $D_i(uv) = u D_i v + v D_i u$ for $i \in \mathbb{N}_d$.

Proof Let us first suppose $p \in [1, \infty[$. The Meyers-Serrin Theorem yields sequences $(u_n) \to u$, $(v_n) \to v$ in $W_p^1(\Omega)$ such that $u_n, v_n \in C^\infty(\Omega) \cap W_p^1(\Omega)$ for all $n \in \mathbb{N}$. In fact, for some sequence $(r_n) \to 0_+$, one can take $u_n = R_{r_n} u$, $v_n = R_{r_n} v$, and then $D_i u_n = R_{r_n} D_i u$, $D_i v_n = R_{r_n} D_i v$ for all $n \in \mathbb{N}$ and $i \in \mathbb{N}_d$. The family of all these functions $u_n$, $v_n$ is bounded in $L_\infty(\Omega)$ since $\|R_r w\|_\infty \le \|w\|_\infty$ for all $r > 0$ and $w \in L_\infty(\Omega)$. Thus, for all $\varphi \in C_c^\infty(\Omega)$, using the estimates
$$\|u_n v_n D_i\varphi - uv D_i\varphi\|_1 \le \|u_n - u\|_p\, \|v_n\|_\infty\, \|D_i\varphi\|_q + \|u\|_\infty\, \|v_n - v\|_p\, \|D_i\varphi\|_q$$
with $q := p/(p-1)$ and similar ones, we can pass to the limit in the relations
$$\int_\Omega u_n v_n D_i\varphi = -\int_\Omega (u_n D_i v_n + v_n D_i u_n)\varphi \qquad \forall \varphi \in C_c^\infty(\Omega),\ i \in \mathbb{N}_d$$
and get similar relations with $u$, $v$ replacing $u_n$, $v_n$, proving the assertion for $p \in [1, \infty[$ (taking $q = \infty$ when $p = 1$).

Now let us suppose $p = \infty$. Then, given $\varphi \in C_c^\infty(\Omega)$, we pick some bounded open subset $\Omega'$ of $\Omega$ satisfying $\operatorname{supp}(\varphi) \subset \Omega'$, so that for all finite $p \ge 1$ we have $u, v \in W_p^1(\Omega') \cap L_\infty(\Omega')$, the measure of $\Omega'$ being finite, and
$$\int_{\Omega'} uv D_i\varphi = -\int_{\Omega'} (u D_i v + v D_i u)\varphi \qquad \forall i \in \mathbb{N}_d$$
by what precedes. In this equality we can replace $\Omega'$ with $\Omega$ and we get the result. $\square$

Let us give a composition result.

Proposition 9.8 Let $g : \mathbb{R} \to \mathbb{R}$ be a Lipschitz function and let $p \in [1, \infty]$. Then, for all $u \in W_p^1(\Omega)$ such that $g \circ u \in L_p(\Omega)$ (this occurs when $p = \infty$ or when $g(0) = 0$), one has $f := g \circ u \in W_p^1(\Omega)$ and $D_i f = (g' \circ u) D_i u$ for $i \in \mathbb{N}_d$.

Proof Such a result relies on Rademacher's Theorem (Proposition 9.5). We prove it in the case when $g$ is of class $C^1$ with a bounded derivative and $g(0) = 0$. Let $c := \|g'\|_\infty$. Then for all $r \in \mathbb{R}$ we have $|g(r)| \le c\,|r|$ by the Mean Value Theorem, hence $|g(u(x))| \le c\,|u(x)|$ for $x \in \Omega$ and $g \circ u \in L_p(\Omega)$. Also $(g' \circ u) D_i u \in L_p(\Omega)$ since $|g'(u(x)) D_i u(x)| \le c\,|D_i u(x)|$.

When $p \in [1, \infty[$ we pick a sequence $(u_n)$ in $C^\infty(\Omega) \cap W_p^1(\Omega)$ such that $(\|u_n - u\|_{1,p}) \to 0$. The inequality $|g(u_n(x)) - g(u(x))| \le c\,|u_n(x) - u(x)|$ ensures that $(\|g \circ u_n - g \circ u\|_p)_n \to 0$, whereas the estimate
$$\left|g' \circ u_n\, D_i u_n - g' \circ u\, D_i u\right| \le \left|g' \circ u_n\right| \cdot |D_i u_n - D_i u| + \left|g' \circ u_n - g' \circ u\right| \cdot |D_i u|$$
implies that $(\|g' \circ u_n\, D_i u_n - g' \circ u\, D_i u\|_p)_n \to 0$ by dominated convergence. Thus, for all $\varphi \in C_c^\infty(\Omega)$, using Hölder's inequality, one can pass to the limit in the relation
$$\int_\Omega g \circ u_n\, D_i\varphi = -\int_\Omega g' \circ u_n\, D_i u_n\, \varphi$$
and obtain that $D_i f = (g' \circ u) D_i u$ in the weak sense.

For $p = \infty$, given $\varphi \in C_c^\infty(\Omega)$ one takes a bounded open subset $\Omega' \subset \Omega$ such that $\operatorname{supp}(\varphi) \subset \Omega'$ so that, for all finite $p \ge 1$, one has $u \in W_p^1(\Omega')$, $(g' \circ u) D_i u \in L_p(\Omega')$ and
$$\int_{\Omega'} g \circ u\, D_i\varphi = -\int_{\Omega'} g' \circ u\, D_i u\, \varphi$$
by the preceding case. In this relation one can replace $\Omega'$ with $\Omega$ and get $D_i f = (g' \circ u) D_i u$, $\varphi$ being arbitrary. $\square$

Corollary 9.1 Given $p \in [1, \infty]$, for all $u \in W_p^1(\Omega)$ one has $u^+ := \max(u, 0) \in W_p^1(\Omega)$ and $D_i u^+ = D_i u$ a.e. in $\Omega^+ := \{x : u(x) > 0\}$, $D_i u^+ = 0$ a.e. in $\Omega^- := \Omega \setminus \Omega^+$. Similar assertions are valid for $u^- := (-u)^+$ and $|u|$.

Proof Apply the proposition with $g(r) := r^+$. One can also use the $C^1$ version, taking $g_\varepsilon(r) := (r^2 + \varepsilon^2)^{1/2} - \varepsilon$ for $r \in \mathbb{R}_+$, $g_\varepsilon(r) = 0$ for $r \in \mathbb{R}_-$, and passing to the limit as $\varepsilon \to 0_+$. For $|u|$ one uses the fact that $|u| = u^+ + u^-$. $\square$

We deduce from the preceding corollary that $W_p^1(\Omega)$ has a lattice structure.

Corollary 9.2 Given $p \in [1, \infty]$, for all $u, v \in W_p^1(\Omega)$ one has $u \vee v := \max(u, v) \in W_p^1(\Omega)$ and $u \wedge v := \min(u, v) \in W_p^1(\Omega)$. If $\Omega$ is bounded, then for all $c \in \mathbb{R}$ one has $u \wedge c \in W_p^1(\Omega)$ and $u \vee c \in W_p^1(\Omega)$.

Proof The first assertion stems from the relation $u \vee v = (1/2)(u + v + |u - v|)$. The second one follows from $u \wedge v = -((-u) \vee (-v))$. If $\Omega$ is bounded, then any constant $c$ belongs to $W_p^1(\Omega)$. $\square$

Now let us give a change of variable result.

Theorem 9.4 Let $h : \Omega' \to \Omega$ be a bijection between two open subsets of $\mathbb{R}^d$, $h$ and $h^{-1}$ being Lipschitzian. Then for all $u \in W_p^1(\Omega)$ one has $v := u \circ h \in W_p^1(\Omega')$ and $D_i v = \sum_{j=1}^d (D_j u \circ h) D_i h_j$ for $h := (h_1, \ldots, h_d)$.

Proof We give the proof in the case $h$ and $h^{-1}$ are Lipschitzian and of class $C^1$. When $p \in [1, \infty[$ one picks a sequence $(u_n)_n$ in $C^\infty(\Omega) \cap W_p^1(\Omega)$ such that $(\|u_n - u\|_{1,p})_n \to 0$. Then, by the change of variable theorem (Theorem 7.13), one has $(u_n \circ h) \to u \circ h$ and $((D_j u_n \circ h) D_i h_j)_n \to (D_j u \circ h) D_i h_j$ in $L_p(\Omega')$. For all $\varphi \in C_c^\infty(\Omega')$ we have
$$\int_{\Omega'} (u_n \circ h) D_i\varphi = -\int_{\Omega'} \sum_{j=1}^d (D_j u_n \circ h) D_i h_j\, \varphi. \tag{9.4}$$
Passing to the limit, one gets a similar relation with $u_n$ replaced by $u$. It shows that the weak derivative $D_i v$ of $v := u \circ h$ is $\sum_{j=1}^d (D_j u \circ h) D_i h_j$. Thus $v \in W_p^1(\Omega')$.

When $p = \infty$, given $\varphi \in C_c^\infty(\Omega')$ one takes a bounded open subset $\Omega'' \subset \Omega'$ such that $\operatorname{supp}(\varphi) \subset \Omega''$, and using the fact that $u_n|_{h(\Omega'')} \in C^\infty(h(\Omega'')) \cap W_p^1(h(\Omega''))$ one has a relation similar to (9.4) with $\Omega'$ replaced by $\Omega''$, and the result follows as above. $\square$
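To see Corollary 9.1 at work on a case that can be computed by hand, here is a short verification. The choices $\Omega := \,]-1, 1[$ and $u(x) := x$ are ours and serve only as an illustration.

```latex
\text{For } \varphi \in C_c^\infty(]-1,1[): \qquad
\int_{-1}^{1} u^{+}\,\varphi'
 = \int_{0}^{1} x\,\varphi'(x)\,dx
 = \bigl[x\,\varphi(x)\bigr]_{0}^{1} - \int_{0}^{1}\varphi(x)\,dx
 = -\int_{-1}^{1} 1_{]0,1[}\,\varphi .
```

Since $\varphi(1) = 0$, the boundary term vanishes, so the weak derivative of $u^+$ is $1_{]0,1[}$: it equals $Du = 1$ on $\Omega^+ = \,]0, 1[$ and $0$ on $\Omega^- = \,]-1, 0]$, as the corollary predicts.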
9.1.4 Extension

The study of the Sobolev spaces $W_p^m(\Omega)$ is easier in the case $\Omega = \mathbb{R}^d$ than in the case of a general open subset $\Omega$. For example, we have seen that for $\Omega = \mathbb{R}^d$ and $p = 2$ we can use the Fourier transform. Thus, it may be useful to derive properties of $W_p^m(\Omega)$ by extending its elements to functions in $W_p^m(\mathbb{R}^d)$. This is possible provided $\Omega$ is smooth enough. We start with a simple situation that captures the essence of the process, especially in the case $m = 1$. We recommend a rewriting of the proof for this special case. Then the extended function $E(u)$ of $u \in W_p^1(\mathbb{R}^{d-1} \times \mathbb{P})$, where $\mathbb{P} := \,]0, \infty[$, is obtained by reflection: $(E(u))(s, t) = u(s, -t)$ for $t < 0$.

Proposition 9.9 Given $m \in \mathbb{N}\setminus\{0\}$, $p \in [1, \infty]$ and $r, b > 0$, let $U := B_{\mathbb{R}^{d-1}}(0, r) \times \,]-b, b[$ and $U^+ := U \cap (\mathbb{R}^{d-1} \times \mathbb{P})$. Then there is a continuous linear map $E := E_{m,p} : W_p^m(U^+) \to W_p^m(U)$ such that for all $u \in W_p^m(U^+)$ one has $E(u)|_{U^+} = u$. In other terms, $E(u)$ is an extension of $u$.

Proof Since $C^\infty(U^+) \cap W_p^m(U^+)$ is dense in $W_p^m(U^+)$, it suffices to define $E(u)$ for $u \in C^\infty(U^+) \cap W_p^m(U^+)$. Let $(c_j)_{j \in \mathbb{N}_m}$ be the solution of the linear system
$$c_1(-1)^k + c_2(-2)^k + \ldots + c_m(-m)^k = m^k \qquad k \in \mathbb{N}_m$$
whose determinant is non-zero, as it is a Vandermonde determinant. Then, for $u \in C^\infty(U^+) \cap W_p^m(U^+)$, we set
$$E(u)(s, t) = u(s, t) \ \text{ for } (s, t) \in U^+, \qquad E(u)(s, t) = \sum_{j=1}^m c_j\, \tilde u(s, -jt/m) \ \text{ for } (s, t) \in U \setminus U^+,$$
where $\tilde u \in C^\infty(\mathbb{R}^d)$ is an extension of $u$. Then, for $k \in \mathbb{N}_m$ we have
$$D_d^k E(u)(s, 0) = \frac{1}{m^k} \sum_{j=1}^m c_j (-j)^k D_d^k \tilde u(s, 0) = D_d^k \tilde u(s, 0) = \lim_{t \to 0_+} D_d^k u(s, t)$$
and since for $\alpha = (\beta, k) \in \mathbb{N}^d$ with $|\alpha| = |\beta| + k \le m$ we have $D^\alpha E(u) = D_d^k E(D^\beta u)$, we see that $E(u) \in C^m(U)$ and that for some $c > 0$ we have $\|E(u)\|_{m,p} \le c\,\|u\|_{m,p}$ for all $u \in C^\infty(U^+) \cap W_p^m(U^+)$, hence for all $u \in W_p^m(U^+)$. This assertion can be checked directly or by using Theorem 9.4 with the map $(s, t) \mapsto (s, -jt/m)$. $\square$

Theorem 9.5 Let $\Omega$ be an open subset of class $C^m$ of $\mathbb{R}^d$, i.e. the interior of a submanifold with boundary of dimension $d$ and of class $C^m$ of $\mathbb{R}^d$. Assume the boundary of $\Omega$ is bounded. Then there exists a continuous linear map $E : W_p^m(\Omega) \to W_p^m(\mathbb{R}^d)$ such that $E(u)|_\Omega = u$ for all $u \in W_p^m(\Omega)$.
Proof We first suppose that for some $r, b > 0$ and some function $g : V \to \mathbb{R}$ of class $C^m$ whose derivatives of order not greater than $m$ are bounded, with $V := B_{\mathbb{R}^{d-1}}(0, r)$, the set $\Omega$ is given by
$$\Omega := \{(s, t) \in V \times \mathbb{R} : g(s) < t < g(s) + b\}. \tag{9.5}$$
Then the map $h : (s, t) \mapsto (s, t + g(s))$ is a bijection from $U' := V \times T$, with $T := \,]-b, b[$, onto an open subset $U$; it is of class $C^m$ with bounded derivatives of order not greater than $m$, as is its inverse $(s, t') \mapsto (s, t' - g(s))$, and $\Omega = U \cap h(V \times T^+)$ with $T^+ := \,]0, b[$. Setting $E(u) := \tilde v \circ h^{-1}$ for $v := u \circ h$, $\tilde v := E_{U'}(v)$, $E_{U'}$ being the extension operator defined in the preceding proposition, we get an extension $\tilde u \in W_p^m(U)$ of $u$.

In the general case we take an open covering $(U_i)_{i \in \mathbb{N}_k}$ of the boundary $\Gamma$ of $\Omega$ by open subsets that are the images by some orthogonal maps $\ell_i$ of some sets $U_i' := \{(s, g_i(s) + t) : s \in V_i := B(a_i, r_i),\ t \in \,]-b_i, b_i[\}$ with $g_i : V_i \to \mathbb{R}$ of class $C^m$ with bounded derivatives of order not greater than $m$, such that $\Omega \cap U_i = \ell_i(U_i'^+)$ with $U_i'^+ := \{(s, g_i(s) + t) : s \in V_i,\ t \in \,]0, b_i[\}$ as in (9.5). We take a $C^\infty$ partition of unity $(p_i)_{i \in \mathbb{N}_k}$ subordinated to this covering. We complete it with $p_0 := 1 - \sum_{i=1}^k p_i$ and we set
$$E(u) = p_0 u + \sum_{i=1}^k p_i E_i(u|_{\Omega \cap U_i})$$
where $E_i$ is the extension operator $W_p^m(\Omega \cap U_i) \to W_p^m(U_i)$ defined in the first step. Here $p_0 u$ stands for $p_0 u$ on $\Omega$ and $0$ on $\mathbb{R}^d \setminus \Omega$, so that $p_0 u \in L_p(\mathbb{R}^d)$ and $D_i(p_0 u) = u D_i p_0 + p_0 D_i u$ in the weak sense, and $p_0 u \in W_p^m(\mathbb{R}^d)$ with $\|p_0 u\|_{m,p} \le c\,\|u\|_{m,p}$ for some constant $c > 0$. Since $\|u|_{\Omega \cap U_i}\|_{m,p} \le \|u\|_{m,p}$ and since for some constant $c_i$ we have $\|E_i(w_i)\|_{m,p} \le c_i\,\|w_i\|_{m,p}$ for all $w_i \in W_p^m(\Omega \cap U_i)$, the rule for products completes the proof since $p_i \in C_c^\infty(\mathbb{R}^d)$ and obviously $E(u)|_\Omega = u$. $\square$
9.1.5 Traces

Again, let $\Gamma$ be the boundary $\partial\Omega$ of $\Omega$. In order to define the trace on $\Gamma$ of $u \in W_p^m(\Omega)$ one has to assume some regularity of $\Omega$. For the sake of simplicity, we assume that $\Omega$ is of class $C^1$. For a Lipschitz open set one has to use Hausdorff measures and we wish to leave aside this refinement. Some care is required since $\Gamma$ has Lebesgue measure 0 whereas $u$ is defined up to a set of measure 0! Again, we start with a simple case for which
$$\Omega := \{(s, t) \in V \times \mathbb{R} : g(s) < t < g(s) + b\}, \tag{9.6}$$
$$U := \{(s, t) \in V \times \mathbb{R} : g(s) - b < t < g(s) + b\}, \tag{9.7}$$
where $V := B_{\mathbb{R}^{d-1}}(0, r)$ and $g : V \to \mathbb{R}$ is of class $C^1$. We endow $\Gamma_g := \{(s, g(s)) : s \in V\}$ with the measure $\sigma_g$ transported by the diffeomorphism $s \mapsto (s, g(s))$ from the measure with density $(1 + \|\nabla g\|^2)^{1/2}$ with respect to the measure induced by the Lebesgue measure $\lambda_{d-1}$ on $V$.

Lemma 9.2 For $p \in [1, \infty[$ there exists a $c > 0$ such that for all $u \in C_c^1(U) \cap W_p^1(\Omega)$ one has
$$\|u|_{\Gamma_g}\|_{L_p(\Gamma_g)} \le c\, \|D_d u\|_{L_p(\Omega)}^{1/p}\, \|u\|_{L_p(\Omega)}^{1 - 1/p}, \tag{9.8}$$
$$\|u|_{\Gamma_g}\|_{L_p(\Gamma_g)} \le c\, \|u\|_{W_p^1(\Omega)}. \tag{9.9}$$
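Proof Given $\varphi \in C_c^1(]-r, r[)$ we note that
$$|\varphi(0)|^p = \left|\int_0^r (|\varphi|^p)'(t)\, dt\right| \le p \int_0^r |\varphi(t)|^{p-1} \cdot |\varphi'(t)|\, dt.$$
For $u \in C_c^1(U) \cap W_p^1(\Omega)$, applying this relation to $\varphi : t \mapsto u(s, g(s) + t)$, integrating and using Hölder's inequality, since $c := p \sup_{s \in V} (1 + \|\nabla g(s)\|^2)^{1/2} < \infty$, $g$ being Lipschitzian, we get
$$\int_V |u(s, g(s))|^p (1 + \|\nabla g(s)\|^2)^{1/2}\, ds \le c \int_V \int_0^r \left(|u|^{p-1} \cdot |D_d u|\right)(s, g(s) + t)\, dt\, ds,$$
$$\left(\int_{\Gamma_g} |u|^p\, d\sigma_g\right)^{1/p} \le c\, \|D_d u\|_{L_p(\Omega)}^{1/p}\, \|u\|_{L_p(\Omega)}^{1 - 1/p}.$$
The inequality $\|u|_{\Gamma_g}\|_{L_p(\Gamma_g)} \le c\,\|u\|_{W_p^1(\Omega)}$ follows from the relations
$$\|D_d u\|_{L_p(\Omega)} \le \|u\|_{1,p}, \qquad \|u\|_{L_p(\Omega)} \le \|u\|_{1,p}.$$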
Given $u \in C_c(U) \cap W_p^1(\Omega)$, we can use a mollifier to get a sequence $(u_n) := (R_n u) \to u$ pointwise and in $W_p^1(\Omega)$. Then $(u_n|_{\Gamma_g})_n$ is a Cauchy sequence in $L_p(\Gamma_g)$ pointwise converging to $u|_{\Gamma_g}$. Taking limits in the inequality $\|u_n|_{\Gamma_g}\|_{L_p(\Gamma_g)} \le c\,\|u_n\|_{W_p^1(\Omega)}$ we obtain (9.9). $\square$

Now let us suppose $\Omega$ is an open subset of class $C^1$ with a bounded boundary $\Gamma := \partial\Omega$. As in the preceding subsection, we take an open covering $(U_i)_{i \in \mathbb{N}_k}$ of $\Gamma$ by open subsets that are the images under some orthogonal maps $\ell_i$ of some sets $U_i' := \{(s, g_i(s) + t) : s \in V_i := B(a_i, r_i),\ t \in \,]-b_i, b_i[\}$ with $g_i : V_i \to \mathbb{R}$ Lipschitz, such that $\Omega \cap U_i = \ell_i(U_i'^+)$ with $U_i'^+ := \{(s, g_i(s) + t) : s \in V_i,\ t \in \,]0, b_i[\}$. We admit that there is a Borel measure $\sigma$ on $\Gamma$ inducing on all $U_i \cap \Gamma$ the measure $\ell_i(\sigma_{g_i})$ transported by $\ell_i$ of the measure $\sigma_{g_i}$ on the graph of $g_i$, the measure $\sigma_{g_i}$ being itself transported by the diffeomorphism $s \mapsto (s, g_i(s))$ from the measure with density $(1 + \|\nabla g_i\|^2)^{1/2}$ with respect to $\lambda_{d-1}|_{V_i}$. Let $(p_i)_{i \in \mathbb{N}_k}$ be a partition of unity of class $C^1$ subordinated to the covering $(U_i)_{i \in \mathbb{N}_k}$. A measurable function $f : \Gamma \to \mathbb{R}$ is integrable for $\sigma$ if for all $i \in \mathbb{N}_k$ the function $p_i f$ is integrable with respect to $\ell_i(\sigma_{g_i})$, and then
$$\int_\Gamma f\, d\sigma = \sum_{i=1}^k \int_{U_i \cap \Gamma} p_i f\, d\ell_i(\sigma_{g_i}) = \sum_{i=1}^k \int_{V_i} ((p_i f) \circ \ell_i)(s, g_i(s))\, (1 + \|\nabla g_i(s)\|^2)^{1/2}\, d\lambda_{d-1}(s).$$
It can be shown (see [202] for example) that this definition does not depend on the choice of the covering $(U_i)_{i \in \mathbb{N}_k}$ or on $(p_i)_{i \in \mathbb{N}_k}$. For $p \in [1, \infty[$, the space $L_p(\Gamma)$ is the space of measurable (equivalence classes of) functions $f$ such that $|f|^p$ is integrable, and $\|f\|_p := (\int_\Gamma |f|^p\, d\sigma)^{1/p}$. The (unit) normal vector $n(x)$ to $\Gamma$ at $x \in U_i \cap \Gamma$ is the vector $\ell_i(n_i(\ell_i^{-1}(x)))$, with
$$n_i(s, g_i(s)) := \frac{(\nabla g_i(s), -1)}{(\|\nabla g_i(s)\|^2 + 1)^{1/2}}.$$
The operator $T$ of the next statement is called the trace operator.

Theorem 9.6 Let $\Omega$ be an open subset of class $C^1$ of $\mathbb{R}^d$ whose boundary $\Gamma$ is bounded and let $p \in [1, \infty[$. Then there exists a unique continuous linear map $T := T_\Gamma : W_p^1(\Omega) \to L_p(\Gamma)$ such that $T(u) = u|_\Gamma$ for all $u \in C^\infty(\Omega) \cap W_p^1(\Omega)$.

Proof Since $C^\infty(\Omega) \cap W_p^1(\Omega)$ is dense in $W_p^1(\Omega)$ by the Meyers-Serrin Theorem, it suffices to show that there exists some $c > 0$ such that for all $u \in C^\infty(\Omega) \cap W_p^1(\Omega)$ one has $\|u|_\Gamma\|_{L_p(\Gamma)} \le c\,\|u\|_{W_p^1(\Omega)}$ or even
$$\|u|_\Gamma\|_{L_p(\Gamma)} \le c\, \|\nabla u\|_{L_p(\Omega)}^{1/p}\, \|u\|_{L_p(\Omega)}^{1 - 1/p}.$$
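We pick a covering $(U_i)_{i \in \mathbb{N}_k}$ of $\Gamma$ by open subsets as above and we take a $C^\infty$ partition of unity $(p_i)_{i \in \mathbb{N}_k}$ subordinated to this covering. We complete it with $p_0 := 1 - \sum_{i=1}^k p_i$. For $i \in \mathbb{N}_k$, the preceding lemma yields some $c_i > 0$ such that
$$\|p_i u|_\Gamma\|_{L_p(\Gamma)} \le c_i\, \|\nabla(p_i u)\|_{L_p(\Omega)}^{1/p}\, \|p_i u\|_{L_p(\Omega)}^{1 - 1/p}$$
since $p_i u \in C_c^1(U_i) \cap W_p^1(\Omega)$. Now, $\|p_i u\|_{L_p(\Omega)} \le \|u\|_{L_p(\Omega)}$ and, for some constant $c_i'$ (depending on $p_i$),
$$\|\nabla(p_i u)\|_{L_p(\Omega)} \le \|u \nabla p_i\|_{L_p(\Omega)} + \|p_i \nabla u\|_{L_p(\Omega)} \le c_i'\, \|u\|_{W_p^1(\Omega)}.$$
Since $(p_0 u)|_\Gamma = 0$ and $u|_\Gamma = \sum_{i=1}^k (p_i u)|_\Gamma$, by definition of the measure on $\Gamma$ we get
$$\|u|_\Gamma\|_{L_p(\Gamma)} \le \sum_{i=1}^k \|p_i u|_\Gamma\|_{L_p(\Gamma)} \le c\, \|\nabla u\|_{L_p(\Omega)}^{1/p}\, \|u\|_{L_p(\Omega)}^{1 - 1/p}$$
for $c := c_1 c_1'^{1/p} + \ldots + c_k c_k'^{1/p}$. $\square$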
The operator $T$ is not surjective from $W_p^1(\Omega)$ onto $L_p(\Gamma)$. Characterizing its image would require the definition of Sobolev spaces $W_p^s(\Gamma)$ with $s \in \mathbb{R}_+ \setminus \mathbb{N}$, since the image of $T$ is $W_p^{1-1/p}(\Gamma)$. In the model case $\Omega := \mathbb{R}^{d-1} \times \mathbb{P}$, $\Gamma := \mathbb{R}^{d-1} \times \{0\}$, the space $W_p^{1-1/p}(\Gamma)$ can be defined as the set of $u \in L_p(\mathbb{R}^{d-1})$ such that
$$\|u\|_{1-1/p,\,p} := \left(\|u\|_p^p + \int_{\mathbb{R}^{d-1}} \int_{\mathbb{R}^{d-1}} \frac{|u(x) - u(y)|^p}{\|x - y\|^{p+d-2}}\, dx\, dy\right)^{1/p} < \infty.$$
It can be proved that $W_p^{1-1/p}(\Gamma)$ endowed with this norm is a Banach space, and a Hilbert space $H^{1/2}(\Gamma)$ for $p = 2$ (see [94, Thm 3.5 p. 115]; the relation $T(W_p^1(\Omega)) = W_p^{1-1/p}(\Gamma)$ is proved in [94, Thm 3.9 p. 117]).

Definition 9.5 Given an open subset $\Omega$ of $\mathbb{R}^d$, the closure of $C_c^m(\Omega)$ in $W_p^m(\Omega)$ (resp. $H^m(\Omega)$) is denoted by $W_{p,0}^m(\Omega)$ (resp. $H_0^m(\Omega)$).

Since one can approach in the norm $\|\cdot\|_{m,p}$ any function of $C_c^m(\Omega)$ by a sequence of functions in $C_c^\infty(\Omega)$, the space $W_{p,0}^m(\Omega)$ is also the closure of $C_c^\infty(\Omega)$ in $W_p^m(\Omega)$.

Proposition 9.10 For all $p \in [1, \infty[$ one has $W_{p,0}^1(\mathbb{R}^d) = W_p^1(\mathbb{R}^d)$.
Proof For $n \in \mathbb{N}\setminus\{0\}$, let $\psi_n \in C_c^\infty(\mathbb{R}^d)$ be given by $\psi_n(x) = 1$ for $x \in B(0, n-1)$, $\psi_n(x) = c_n e^{1/(\|x\|^2 - n^2)}$ for $x \in B(0, n) \setminus B(0, n-1)$, $\psi_n(x) = 0$ for $x \in \mathbb{R}^d \setminus B(0, n)$, with $c_n$ adjusted so that $c_n e^{1/(1 - 2n)} = 1$. For $w \in C^\infty(\mathbb{R}^d) \cap W_p^1(\mathbb{R}^d)$, by dominated convergence, we have $(\|\psi_n w - w\|_p) \to 0$ as $n \to \infty$ and
$$\|D_i(\psi_n w) - D_i w\|_p \le \|\psi_n D_i w - D_i w\|_p + \|w D_i \psi_n\|_p \to 0 \quad \text{as } n \to \infty$$
since both $((\psi_n - 1) D_i w)$ and $(w D_i \psi_n) \to 0$ a.e. and are dominated by a function in $L_p(\mathbb{R}^d)$. Since $C^\infty(\mathbb{R}^d) \cap W_p^1(\mathbb{R}^d)$ is dense in $W_p^1(\mathbb{R}^d)$, we get that $C_c^\infty(\mathbb{R}^d)$ is dense in $W_p^1(\mathbb{R}^d)$. $\square$

The following characterization of $W_{p,0}^1(\Omega)$ is valid if $\Omega$ is Lipschitz, but we prove it when $\Omega$ is of class $C^1$.

Theorem 9.7 If $\Omega$ is an open subset of class $C^1$ of $\mathbb{R}^d$, then $W_{p,0}^1(\Omega) = \ker T$.
Proof Since $C_c^\infty(\Omega)$ is contained in the kernel $\ker T$ of the trace map $T$, and since $\ker T$ is closed in $W_p^1(\Omega)$, we have $W_{p,0}^1(\Omega) \subset \ker T$.

Conversely, let us prove that any $u \in \ker T$ belongs to $W_{p,0}^1(\Omega)$ when $\Omega$ is of class $C^1$. Taking a partition of unity, and using a diffeomorphism of class $C^1$, we reduce the problem to the case $\Omega := V \times \mathbb{P}$, where $V$ is the open unit ball of $\mathbb{R}^{d-1}$ and $\mathbb{P} := \,]0, \infty[$. Since the space $C^\infty(\Omega) \cap W_p^1(\Omega)$ is dense in $W_p^1(\Omega)$ by the Meyers-Serrin Theorem (Theorem 9.3), we take a sequence $(u_n)_n$ in $C^\infty(\Omega) \cap W_p^1(\Omega)$ that converges to $u$ in $W_p^1(\Omega)$. For $k \in \mathbb{N}\setminus\{0\}$ and $(x', r) \in \Omega_k := V \times \,]0, 1/k[$, we have $u_n(x', r) = u_n(x', 0) + v_n(x', r)$ with
$$v_n(x', r) := \int_0^r D_d u_n(x', t)\, dt = \int_0^{1/k} 1_{[0,r]}(t)\, D_d u_n(x', t)\, dt,$$
and, by Hölder's inequality with $q := p/(p-1)$,
$$|v_n(x', r)| \le r^{1/q} \left(\int_0^{1/k} |D_d u_n(x', t)|^p\, dt\right)^{1/p},$$
$$\int_{\Omega_k} |v_n(x', r)|^p\, dx'\, dr \le \int_0^{1/k} r^{p/q}\, dr \int_V \int_0^{1/k} |D_d u_n(x', t)|^p\, dt\, dx' = \frac{1}{p k^p}\, \|D_d u_n\|^p_{L_p(\Omega_k)},$$
hence
$$\|u_n\|_{L_p(\Omega_k)} \le \left(\int_0^{1/k} \|u_n(\cdot, 0)\|^p_{L_p(V)}\, dt\right)^{1/p} + \frac{1}{p^{1/p} k}\, \|D_d u_n\|_{L_p(\Omega_k)}.$$
Passing to the limit as $n \to \infty$, since $(T(u_n))_n \to 0$ in $L_p(V)$ by continuity of $T$, we get
$$\|u\|_{L_p(\Omega_k)} \le \frac{1}{p^{1/p} k}\, \|D_d u\|_{L_p(\Omega_k)}. \tag{9.10}$$
Now, let $h \in C^\infty(\mathbb{R})$ be such that $h(\mathbb{R}) \subset [0, 1]$, $h|_{[0,1/2]} = 1$, $h|_{[1,\infty[} = 0$ and let us set
$$w_n(x', t) := u(x', t)(1 - h_n(t)),$$
with $h_n(t) := h(nt)$, so that
$$D_i w_n = D_i u\,(1 - h_n),\ i \in \mathbb{N}_{d-1}, \qquad D_d w_n = D_d u\,(1 - h_n) - u h_n'.$$
Thus
$$\int_\Omega |D_i w_n - D_i u|^p \le \int_\Omega h_n\, |D_i u|^p, \quad i \in \mathbb{N}_{d-1},$$
$$\int_\Omega |D_d w_n - D_d u|^p \le \int_\Omega \left(h_n\, |D_d u| + |u h_n'|\right)^p.$$
Since the support of $h_n$ is contained in $[0, 1/n]$, the first integrals converge to 0 by the Dominated Convergence Theorem. Since $w_n = u$ on $V \times [1/n, \infty[$, the last integral can be estimated with the help of (9.10) with $k = n$, $c := \|h'\|_\infty$:
$$\|D_d w_n - D_d u\|_{L_p(\Omega)} = \|D_d w_n - D_d u\|_{L_p(\Omega_n)} \le \|h_n D_d u\|_{L_p(\Omega_n)} + \|h_n' u\|_{L_p(\Omega_n)}$$
$$\le \|D_d u\|_{L_p(\Omega_n)} + cn\, \|u\|_{L_p(\Omega_n)} \le \|D_d u\|_{L_p(\Omega_n)} + \frac{c}{p^{1/p}}\, \|D_d u\|_{L_p(\Omega_n)}$$
and each of these terms tends to 0. Since we also have $(w_n) \to u$ in $L_p(\Omega)$, we see that $(w_n) \to u$ in $W_p^1(\Omega)$. Now, since $w_n|_{V \times ]0, 1/2n[} = 0$, using a mollifier we see that $w_n \in W_{p,0}^1(\Omega)$. Thus $u \in W_{p,0}^1(\Omega)$ since this subspace of $W_p^1(\Omega)$ is closed. $\square$

In the sequel we say that a subset $\Omega$ of $\mathbb{R}^d$ has a bounded width if there exist some $b > 0$ and some unit vector $e \in \mathbb{R}^d$ such that $|e \cdot x| \le b$ for all $x \in \Omega$. Then the width of $\Omega$ is $2b$. Of course, any bounded subset has a bounded width. For such open subsets the following inequality is often useful; it was used by Poincaré in his study of tides.

Theorem 9.8 (Poincaré) Let $\Omega$ be an open subset of $\mathbb{R}^d$ with a bounded width and let $p \in [1, \infty[$. Then there exists some $c > 0$ such that
$$\|w\|_{L_p(\Omega)} \le c\, \|\nabla w\|_{L_p(\Omega)} \qquad \forall w \in W_{p,0}^1(\Omega).$$
Thus $w \mapsto \|\nabla w\|_{L_p(\Omega)}$ is a norm on $W_{p,0}^1(\Omega)$ equivalent to the usual norm.
Proof By density, it suffices to prove this inequality for $w \in C_c^\infty(\Omega)$. Using an orthogonal transformation mapping $e_1$ onto the vector $e$ above, we may suppose $|x_1| \le b$ for all $x := (x_1, x') \in \Omega$ with $x' \in \mathbb{R}^{d-1}$. Given $w \in C_c^\infty(\Omega)$ (extended by 0 outside $\Omega$) we have
$$w(x_1, x') = \int_{-b}^{x_1} D_1 w(t, x')\, dt,$$
hence, by Hölder's inequality with $q := (1 - 1/p)^{-1}$,
$$|w(x_1, x')|^p \le (2b)^{p/q} \int_{-b}^{b} |D_1 w(t, x')|^p\, dt,$$
$$\int_{\mathbb{R}^{d-1}} |w(x_1, x')|^p\, dx' \le (2b)^{p/q} \int_{\mathbb{R}^{d-1}} \int_{-b}^{b} |D_1 w(t, x')|^p\, dt\, dx'.$$
Since $|D_1 w| \le \|\nabla w\|$, integrating on $x_1$ from $-b$ to $b$, we get the announced inequality
$$\int_\Omega |w|^p = \int_{-b}^{b} \int_{\mathbb{R}^{d-1}} |w(x_1, x')|^p\, dx'\, dx_1 \le (2b)^{1 + p/q} \int_{\mathbb{R}^d} |D_1 w|^p \le (2b)^p \int_\Omega \|\nabla w\|^p.$$
Thus we can take $c = 2b$, where $2b$ is the width of $\Omega$. $\square$
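As a sanity check of the constant $c = 2b$, here is a one-dimensional computation. The choices $d = 1$, $p = 2$, $\Omega := \,]-b, b[$ and $w(x) := \cos(\pi x/(2b))$ are ours; $w$ vanishes at $\pm b$, so it lies in $W_{2,0}^1(\Omega)$.

```latex
\|w\|_{L_2(\Omega)}^{2} = \int_{-b}^{b}\cos^{2}\!\Bigl(\frac{\pi x}{2b}\Bigr)dx = b,
\qquad
\|w'\|_{L_2(\Omega)}^{2} = \Bigl(\frac{\pi}{2b}\Bigr)^{2}\int_{-b}^{b}\sin^{2}\!\Bigl(\frac{\pi x}{2b}\Bigr)dx
 = \frac{\pi^{2}}{4b},
\\[4pt]
\|w\|_{L_2(\Omega)} = \frac{2b}{\pi}\,\|w'\|_{L_2(\Omega)} \le 2b\,\|w'\|_{L_2(\Omega)} .
```

For this particular function the inequality holds with the smaller factor $2b/\pi$, so the constant $2b$ given by the proof is valid here but not sharp.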
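We admit the following formulas (which can be generalized to the case when $\partial\Omega$ is just Lipschitzian).

Theorem 9.9 (Green's Formulas) Let $\Omega$ be a bounded open subset of class $C^1$ of $\mathbb{R}^d$ with boundary $\Gamma$. Then
$$\int_\Omega v \nabla u = -\int_\Omega u \nabla v + \int_\Gamma uv\, n\, d\sigma \qquad \forall u, v \in H^1(\Omega),$$
$$\int_\Omega v \Delta u = -\int_\Omega \nabla u \cdot \nabla v + \int_\Gamma \frac{\partial u}{\partial n}\, v\, d\sigma \qquad \forall u, v \in H^2(\Omega).$$
Taking the scalar product of each side with the vector $e_i$ from the canonical basis of $\mathbb{R}^d$, the first relation can be written componentwise as
$$\int_\Omega u D_i v + \int_\Omega v D_i u = \int_\Gamma uv\, n_i\, d\sigma \qquad \forall u, v \in H^1(\Omega),$$
where $n_i := n \cdot e_i$ is the $i$-th component of the normal vector $n$. Replacing $u$ by $D_i u$ and summing over $i \in \mathbb{N}_d$ we get the second relation of the statement. Also, taking $v = 1$, this relation yields the Gauss-Green formula
$$\int_\Omega D_i u = \int_\Gamma u\, n_i\, d\sigma \qquad \forall u \in H^1(\Omega).$$
In turn, replacing $u$ by $uv$ we recover the preceding relation. Moreover, replacing $u$ by $D_i u$ in the Gauss-Green formula and summing over $i \in \mathbb{N}_d$ we obtain
$$\int_\Omega \Delta u = \int_\Gamma \frac{\partial u}{\partial n}\, d\sigma.$$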
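In dimension one these formulas reduce to integration by parts. The following sketch assumes $d = 1$, $\Omega := \,]a, b[$, so that $\Gamma = \{a, b\}$, the outward normal is $n(a) = -1$, $n(b) = 1$, and the boundary measure is the counting measure on $\Gamma$ (this identification of the measure is our assumption for the illustration).

```latex
\int_{a}^{b} u\,v' + \int_{a}^{b} v\,u'
 = u(b)\,v(b)\cdot 1 + u(a)\,v(a)\cdot(-1)
 = \bigl[uv\bigr]_{a}^{b},
\qquad
\int_{a}^{b} u'' = u'(b) - u'(a) .
```

The first identity is the componentwise Green formula and the second is the relation for the Laplacian, with $\partial u/\partial n$ equal to $u'(b)$ at $b$ and $-u'(a)$ at $a$.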
Given $p, q \in [1, d[$ with $1/p + 1/q = 1 + 1/d$ and $u \in W_p^1(\Omega)$, $v \in W_q^1(\Omega)$, Green's formula can be extended to:
$$\int_\Omega u D_i v + \int_\Omega v D_i u = \int_\Gamma T(u) T(v)\, n_i\, d\sigma \qquad i \in \mathbb{N}_d.$$

Exercises

1. A family of seminorms on $C_c^\infty(\Omega)$ inducing the convergence we defined can be described as follows. Take a sequence $(K_n)$ in $\mathcal{K}(\Omega)$ such that $K_n \subset \operatorname{int}(K_{n+1})$, set $\Omega_n := \Omega \setminus K_n$, take a sequence $(k_n) \to \infty$, and set, for $\varphi \in C_c^\infty(\Omega)$,
$$p_{(K_n),(k_n)}(\varphi) := \sup_n \sup\{|D^\alpha \varphi(x)| : |\alpha| \le k_n,\ x \in \Omega_n\}.$$
Note that such a family is uncountable. Show that the associated convergence on $C_c^\infty(\Omega)$ is the one described in this section. [Hint: show that for every $K \in \mathcal{K}(\Omega)$ the family of seminorms $p_{(K_n),(k_n)}$ induces on $D(K) := \{\varphi \in C_c^\infty(\Omega) : \operatorname{supp}(\varphi) \subset K\}$ the same topology as the one defined by the seminorms $p_{K,k}$.]

2. Show that for $f, g \in L_{1,\mathrm{loc}}(\Omega)$ satisfying $T_f = T_g$ one has $f = g$.

3. Show that the weak derivative of the Heaviside function $h$ defined by $h(r) = 0$ for $r \in \mathbb{R}_-$, $h(r) = 1$ for $r \in \mathbb{P}$ is the Dirac measure $\delta_0$.

4. Given $f \in C^\infty(\Omega)$ and a distribution $T$, verify that setting $(fT)(\varphi) := T(f\varphi)$ one defines a distribution $fT$. Show that $f(T_1 + T_2) = fT_1 + fT_2$ and that $(f_1 + f_2)T = f_1 T + f_2 T$, $f_1(f_2 T) = (f_1 f_2)T$ for $T_1, T_2 \in C_c^\infty(\Omega)^*$, $f_1, f_2 \in C^\infty(\Omega)$.

5. For $u \in L_p(\mathbb{R}^d)$ show that $u \in W_p^1(\mathbb{R}^d)$ if and only if $u$ has a representative $v$ that is absolutely continuous in each of its variables and whose partial derivatives are in $L_p(\mathbb{R}^d)$. [See [268, Thm 2.1.4].]

6. Prove that if $\Omega$ is an open subset of class $C^1$ of $\mathbb{R}^d$ with a bounded boundary, for every $\beta \in \mathbb{N}^d$ satisfying $0 < |\beta| < m$ and every $\varepsilon > 0$ there exists some $c > 0$ such that
$$\|D^\beta u\|_p \le \varepsilon \sum_{|\alpha| = m} \|D^\alpha u\|_p + c\, \|u\|_p \qquad \forall u \in W_p^m(\Omega).$$
Deduce from this that the following norm is equivalent to the usual norm on $W_p^m(\Omega)$:
$$u \mapsto \|u\|_p + \sum_{|\alpha| = m} \|D^\alpha u\|_p.$$
[See [1, 3].]
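7*. Prove the following characterization of $W_{p,0}^1(\Omega)$ when $\Omega$ is of class $C^1$ and $p \in \,]1, \infty[$, $q := (1 - 1/p)^{-1}$. The following assertions are equivalent:
(a) $u \in W_{p,0}^1(\Omega)$;
(b) $u \in L_p(\Omega)$ and there exists some constant $c \in \mathbb{R}_+$ such that
$$\left|\int_\Omega u D_i\varphi\right| \le c\, \|\varphi\|_{L_q(\Omega)} \qquad \forall \varphi \in C_c^\infty(\mathbb{R}^d),\ i \in \mathbb{N}_d;$$
(c) the function $\bar u$ given by $\bar u|_\Omega = u$, $\bar u|_{\mathbb{R}^d \setminus \Omega} = 0$ belongs to $W_p^1(\mathbb{R}^d)$ and, in this case, $D_i \bar u$ is given by a similar extension of $D_i u$.
[See [52, Prop. 9.18].]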
9.2 Embedding Results

The purpose of this section is to prove that, for an open subset $\Omega$ of $\mathbb{R}^d$, $m \in \mathbb{N}$, $p \in [1, \infty]$, the functions in $W_p^m(\Omega)$ belong to more usual spaces such as $L_q(\Omega)$ or $C^k(\Omega)$. In particular, we start by showing that for an appropriate $q$, the norm $\|f\|_q$ of a function $f \in C_c^1(\mathbb{R}^d)$ is dominated by $\|\nabla f\|_p$, where $\nabla f := (D_1 f, \ldots, D_d f)$ denotes the gradient of $f$. For $d = 1$, $p = 1$, $q := +\infty$, this follows from the relations $f(x) = \int_{-\infty}^x f'(r)\, dr = -\int_x^{+\infty} f'(r)\, dr$, which imply
$$\|f\|_\infty = \sup_{x \in \mathbb{R}} |f(x)| \le \frac{1}{2} \int_{-\infty}^{+\infty} |f'(r)|\, dr. \tag{9.11}$$
In the case $d > 1$, the following general estimate is of tremendous importance.

Theorem 9.10 (Gagliardo-Nirenberg-Sobolev) For $f \in C_c^1(\mathbb{R}^d)$, $p \in [1, d[$, $q := q_d(p) := (1/p - 1/d)^{-1}$, $c := p(d-1)/(d-p)$ one has the following inequalities:
$$\|f\|_q \le \frac{c}{2}\, \|D_1 f\|_p^{1/d} \cdots \|D_d f\|_p^{1/d} \le \frac{c}{2}\, \|\nabla f\|_p. \tag{9.12}$$
Thus for any Lipschitzian open subset $\Omega$ of $\mathbb{R}^d$ there exists a continuous linear injection of $W_p^1(\Omega)$ into $L_q(\Omega)$ extending the identity map on $C_c^1(\mathbb{R}^d)$.

The value $(1/p - 1/d)^{-1} = dp/(d-p)$ of the Sobolev exponent $q_d(p)$ can be recovered by using a scaling argument. Taking $f_\lambda := f(\lambda\, \cdot)$ with $\lambda > 0$ instead of $f$ in the relation $\|f\|_q \le \frac{c}{2} \|\nabla f\|_p$ we obtain
$$\|f\|_q \le \frac{c}{2}\, \lambda^{1 + d/q - d/p}\, \|\nabla f\|_p,$$
which implies $1 + d/q - d/p = 0$, i.e. $q = (1/p - 1/d)^{-1}$. We start the proof with the case $p = 1$, so that $c = 1$, $q := d/(d-1)$.
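Lemma 9.3 Let $d \in \mathbb{N}\setminus\{0, 1\}$, let $q := d/(d-1)$, and let $f \in C_c^1(\mathbb{R}^d)$. Then
$$\left(\int_{\mathbb{R}^d} |f(x)|^q\, dx\right)^{1/q} \le \frac{1}{2} \left(\int_{\mathbb{R}^d} |D_1 f(x)|\, dx\right)^{1/d} \cdots \left(\int_{\mathbb{R}^d} |D_d f(x)|\, dx\right)^{1/d}. \tag{9.13}$$

Proof We know the result for $d = 1$ and we prove it by induction on $d$, admitting the result for a function of $d - 1 \ge 1$ variables. For $k \in \mathbb{N}_{d-1}$, $x := (s, t) \in \mathbb{R}^{d-1} \times \mathbb{R}$ we set
$$I_k(t) := \int_{\mathbb{R}^{d-1}} |D_k f(s, t)|\, ds, \qquad J_d(s) := \int_{\mathbb{R}} |D_d f(s, t)|\, dt.$$
Besides the exponent $q := d/(d-1)$, we need the exponent $r := (d-1)/(d-2)$ corresponding to the dimension $d - 1$, so that the induction assumption can be written as
$$\left(\int_{\mathbb{R}^{d-1}} |f(s, t)|^r\, ds\right)^{1/r} \le \frac{1}{2}\, I_1(t)^{\frac{1}{d-1}} \cdots I_{d-1}(t)^{\frac{1}{d-1}} \tag{9.14}$$
since $f(\cdot, t) \in C_c^1(\mathbb{R}^{d-1})$. By (9.11), for all $(s, t) \in \mathbb{R}^{d-1} \times \mathbb{R}$ we have $|f(s, t)| \le (1/2) J_d(s)$, hence
$$|f(s, t)|^q \le 2^{-(q-1)}\, |f(s, t)|\, J_d(s)^{1/(d-1)}$$
since $q - 1 = 1/(d-1)$. Applying Hölder's inequality with the exponent $r$ and using the relation $1 - 1/r = 1/(d-1)$, we get
$$\int_{\mathbb{R}^{d-1}} |f(s, t)|^q\, ds \le 2^{1-q} \left(\int_{\mathbb{R}^{d-1}} |f(s, t)|^r\, ds\right)^{\frac{1}{r}} \left(\int_{\mathbb{R}^{d-1}} J_d(s)\, ds\right)^{\frac{1}{d-1}}.$$
Taking into account the induction assumption (9.14) and integrating with respect to $t$, we obtain
$$\int_{\mathbb{R}} \int_{\mathbb{R}^{d-1}} |f(s, t)|^q\, ds\, dt \le 2^{-q} \left(\int_{\mathbb{R}} I_1(t)^{\frac{1}{d-1}} \cdots I_{d-1}(t)^{\frac{1}{d-1}}\, dt\right) \left(\int_{\mathbb{R}^{d-1}} J_d(s)\, ds\right)^{\frac{1}{d-1}}.$$
Using Hölder's inequality for $d - 1$ functions (Corollary 8.8 with $p_i := d - 1$), we estimate the first integral and obtain a relation equivalent to (9.13):
$$\int_{\mathbb{R}^d} |f(x)|^q\, dx \le 2^{-q} \left(\int_{\mathbb{R}} I_1(t)\, dt\right)^{\frac{1}{d-1}} \cdots \left(\int_{\mathbb{R}} I_{d-1}(t)\, dt\right)^{\frac{1}{d-1}} \left(\int_{\mathbb{R}^{d-1}} J_d(s)\, ds\right)^{\frac{1}{d-1}}. \qquad \square$$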
Proof of the theorem It remains to consider the case $p \in \,]1, d[$, for which $c := p(d-1)/(d-p) > 1$. We apply the lemma to the function $g := |f|^c$ rather than $f$, obtaining, since $|D_k g| = c\, |f|^{c-1} |D_k f|$ for $k \in \mathbb{N}_d$,
$$\left(\int_{\mathbb{R}^d} |g|^{\frac{d}{d-1}}\right)^{\frac{d-1}{d}} \le \frac{1}{2} \left(\int_{\mathbb{R}^d} |D_1 g|\right)^{\frac{1}{d}} \cdots \left(\int_{\mathbb{R}^d} |D_d g|\right)^{\frac{1}{d}} \le \frac{c}{2} \left(\int_{\mathbb{R}^d} |f|^{c-1} \cdot |D_1 f|\right)^{\frac{1}{d}} \cdots \left(\int_{\mathbb{R}^d} |f|^{c-1} \cdot |D_d f|\right)^{\frac{1}{d}}.$$
We use Hölder's inequality with exponent $p$ and the relations $(d-1)/d = c/q$ and $(c-1)p/(p-1) = q$ to estimate each integral in the last product and we obtain
$$\left(\int_{\mathbb{R}^d} |f|^q\right)^{\frac{c}{q}} \le \frac{c}{2} \left(\int_{\mathbb{R}^d} |f|^q\right)^{\frac{p-1}{p}} \cdot \left(\int_{\mathbb{R}^d} |D_1 f|^p\right)^{\frac{1}{dp}} \cdots \left(\int_{\mathbb{R}^d} |D_d f|^p\right)^{\frac{1}{dp}}.$$
Since $c/q - (p-1)/p = (d-p)/dp = 1/q$, simplifying by $(\int_{\mathbb{R}^d} |f(x)|^q\, dx)^{(p-1)/p}$ we get the first inequality of (9.12). The second one follows since $|D_k f(x)| \le \|\nabla f(x)\|$ for $k \in \mathbb{N}_d$. The last assertion is obtained by using the extension theorem 9.5. $\square$

The next embedding theorem shows the interest of using $W_p^m(\Omega)$ spaces instead of just $H^m(\Omega)$ spaces.

Theorem 9.11 (Morrey) Let $\Omega$ be a Lipschitzian open subset of $\mathbb{R}^d$ whose boundary is bounded and let $p > d$, $h := 1 - d/p$. Then there exists some $c > 0$ such that every $w \in W_p^1(\Omega)$ is the class for a.e. equality of a continuous function still denoted by $w$ satisfying the Hölder condition
$$|w(x) - w(x')| \le c\, \|\nabla w\|_{L_p(\mathbb{R}^d)}\, \|x - x'\|^h \qquad \forall x, x' \in \Omega.$$
Proof In this proof we denote by $B$ the closed ball with center 0 and radius $r/2 > 0$ for the norm $\|\cdot\|_\infty$. We first suppose $w \in C_c^1(\mathbb{R}^d) \cap W_p^1(\mathbb{R}^d)$. For $x, y \in B$, using the relation
$$w(x) - w(y) = \int_0^1 \nabla w(x + t(y - x)) \cdot (x - y)\, dt,$$
the inequality $|\nabla w(z) \cdot (x - y)| \le r \sum_{i=1}^d |D_i w(z)|$, and integrating over $B$, we get
$$\left|r^d w(x) - \int_B w(y)\, dy\right| \le r \int_B \int_0^1 \sum_{i=1}^d |D_i w(x + t(y - x))|\, dt\, dy = r \sum_{i=1}^d \int_0^1 \int_B |D_i w(x + t(y - x))|\, dy\, dt.$$
Using the change of variables $z = ty$ and Hölder's inequality, we obtain
$$\left|r^d w(x) - \int_B w\right| \le r \sum_{i=1}^d \int_0^1 \int_{tB} |D_i w((1-t)x + z)|\, t^{-d}\, dz\, dt$$
$$\le r \sum_{i=1}^d \int_0^1 \left(\int_{tB} |D_i w((1-t)x + z)|^p\, dz\right)^{\frac{1}{p}} (tr)^{\frac{d(p-1)}{p}}\, t^{-d}\, dt \le \frac{r^{1 + d - d/p}}{1 - d/p} \sum_{i=1}^d \left(\int_B |D_i w|^p\right)^{1/p}.$$
The triangle inequality ensures that for $x, x' \in B$ we have
$$|w(x) - w(x')| \le \left|w(x) - \frac{1}{r^d} \int_B w\right| + \left|w(x') - \frac{1}{r^d} \int_B w\right| \le \frac{2\, r^{1 - d/p}}{1 - d/p}\, \|\nabla w\|_p.$$
Now, given $x, x' \in \mathbb{R}^d$, applying what precedes to the translated function $w_a := w(\cdot - a)$ with $a := -(x + x')/2$ and setting $r = \|x - x'\|$, we get the announced inequality with $c := 2/(1 - d/p)$. Since the space $C_c^1(\mathbb{R}^d)$ is dense in $W_p^1(\mathbb{R}^d)$, taking a sequence in $C_c^1(\mathbb{R}^d)$ that converges in $W_p^1(\mathbb{R}^d)$ and a.e., this inequality is valid for $w \in W_p^1(\mathbb{R}^d)$ on the complement of a null set $N$, hence on $\mathbb{R}^d$ since $\mathbb{R}^d \setminus N$ is dense in $\mathbb{R}^d$ and $w$ is uniformly continuous on $\mathbb{R}^d \setminus N$. When $\Omega$ is a Lipschitzian open subset of $\mathbb{R}^d$ with a bounded boundary, we use Theorem 9.5 to get the result, with another constant depending on $\Omega$. $\square$

The preceding embedding results can be completed with some compactness properties.

Theorem 9.12 (Rellich-Kondrachov) Suppose $\Omega$ is bounded and of class $C^1$. Then the following injections are compact:
$$W_p^1(\Omega) \to L_r(\Omega) \quad \text{for } p \in [1, d[,\ r \in [1, q[ \text{ with } q = \left(\frac{1}{p} - \frac{1}{d}\right)^{-1} = \frac{pd}{d - p};$$
$$W_p^1(\Omega) \to L_r(\Omega) \quad \text{for } p = d,\ r \in [p, \infty[;$$
$$W_p^1(\Omega) \to C(\overline{\Omega}) \quad \text{for } p > d.$$
The proof relies on extension and regularization methods and on the Arzelà-Ascoli theorem (see [52, 117]).
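In dimension one, the Hölder estimate of Morrey's theorem can be obtained directly. The following sketch takes $d = 1$, $p = 2$ (so $h = 1/2$) and a function $w \in C_c^1(\mathbb{R})$; these choices are ours and the constant differs from the one in the general proof.

```latex
\text{For } x' < x: \qquad
|w(x) - w(x')| = \Bigl|\int_{x'}^{x} w'(t)\,dt\Bigr|
 \le \Bigl(\int_{x'}^{x} 1^{2}\,dt\Bigr)^{1/2}\Bigl(\int_{x'}^{x} |w'(t)|^{2}\,dt\Bigr)^{1/2}
 \le |x - x'|^{1/2}\,\|w'\|_{L_2(\mathbb{R})} .
```

The exponent $1/2 = 1 - d/p$ comes from the Cauchy-Schwarz inequality applied on an interval of length $|x - x'|$, which is the one-dimensional counterpart of the Hölder step in the proof of Theorem 9.11.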
9.3 Elliptic Problems

In this section we consider linear partial differential equations that do not involve time. Since our study relies on the Lax-Milgram Theorem, we only use the Hilbertian Sobolev spaces $H^m(\Omega) := W_p^m(\Omega)$ with $p = 2$, assuming $\Omega$ is an open subset of $\mathbb{R}^d$ whose boundary is of class $C^1$. Because $p = 2$ is fixed, we simply denote by $\|\cdot\|_m$ the norm $\|\cdot\|_{m,p}$ when no confusion may arise. For $m = 0$, for the sake of clarity, we often write $\|\cdot\|_{L_2(\Omega)}$ or $\|\cdot\|_{L_2}$ instead of $\|\cdot\|_0$. Also, we concentrate on elliptic problems of the form
$$(Lu)(x) := \sum_{|\alpha| \le 2m} c_\alpha(x) D^\alpha u(x) = f(x) \tag{9.15}$$
where $f$ and the coefficients $c_\alpha$ are given functions on $\Omega$, with $m \in \mathbb{N}\setminus\{0\}$. To such equations are usually adjoined boundary conditions. For $m = 1$ they are usually of one of the forms
$$u|_{\partial\Omega} = g \quad \text{(Dirichlet condition)} \qquad \text{or} \qquad \frac{\partial u}{\partial n}\Big|_{\partial\Omega} = g \quad \text{(Neumann condition)},$$
where $g$ is a given function on $\Gamma := \partial\Omega$ and $\frac{\partial u}{\partial n} := n \cdot \nabla u$ is the normal derivative of $u$. For $m > 1$ the traces of higher order derivatives may be involved. In the sequel we assume that the operator $L$ can be written in divergence form:
$$(Lu)(x) = \sum_{|\alpha|, |\beta| = 0}^{m} (-1)^{|\alpha|} D^\alpha(a_{\alpha\beta}(x) D^\beta u(x)).$$
For $m = 1$ the multi-indices are just indices and $L$ can be written as
$$Lu = -\operatorname{div}(A(\cdot) \nabla u(\cdot)) + a(\cdot) \cdot \nabla u(\cdot) + a_0(\cdot) u(\cdot), \tag{9.16}$$
where $A := (a_{ij}(\cdot)) : \Omega \to L(\mathbb{R}^d, \mathbb{R}^d)$, $a : \Omega \to \mathbb{R}^d$, and $a_0 : \Omega \to \mathbb{R}$ are in $L_\infty(\Omega, L(\mathbb{R}^d, \mathbb{R}^d))$, $L_\infty(\Omega, \mathbb{R}^d)$, and $L_\infty(\Omega)$ respectively. Also, for differentiable maps $u : \Omega \to \mathbb{R}$, $v := (v^i) : \Omega \to \mathbb{R}^d$ or $u \in H^1(\Omega)$, $v \in H^1(\Omega)^d$, we write
$$\nabla u := (D_i u)_{i \in \mathbb{N}_d}, \qquad \operatorname{div}(v(\cdot)) := \sum_{i=1}^d D_i v^i(\cdot).$$
Then $L$ is a continuous linear map from $H^1(\Omega)$ into the dual $H^{-1}(\Omega)$ of $H_0^1(\Omega)$. The operator $L$ is associated with the restriction to $H^1(\Omega) \times H_0^1(\Omega)$ of the bilinear form $b : H^1(\Omega) \times H^1(\Omega) \to \mathbb{R}$ defined for $u, v \in H^1(\Omega)$ by
$$b(u, v) := \sum_{i,j=1}^d \int_\Omega a_{ij} D_i u D_j v + \sum_{i=1}^d \int_\Omega a_i D_i u\, v + \int_\Omega a_0 uv, \tag{9.17}$$
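so that for $(u, v) \in H^1(\Omega) \times H_0^1(\Omega)$ one has $b(u, v) = \langle Lu, v \rangle$.

9.3.1 Ellipticity

The operator $L$ is said to be uniformly elliptic, or just elliptic (over $\Omega$), if the coefficients $(a_{\alpha\beta})_{|\alpha|,|\beta| \le m}$ are measurable and essentially bounded and if there exists a constant $c_E > 0$ such that
$$\sum_{|\alpha| = |\beta| = m} a_{\alpha\beta}(x) v_\alpha v_\beta \ge c_E\, \|v\|^2 \qquad \forall v := (v_\alpha) \in \mathbb{R}^{d(m)},\ x \in \Omega \tag{9.18}$$
where $d(m) := \operatorname{card}\{\alpha : |\alpha| = m\}$; moreover, in the case $m > 1$ we require that the coefficients $(a_{\alpha\beta})_{|\alpha|,|\beta| = m}$ are uniformly continuous with some modulus of uniform continuity $c_\Omega$. For the sake of simplicity, we only treat the case $m = 1$. We are essentially interested in the model problem in which $L = -\Delta + a_0 I$, where $\Delta$ is the Laplacian
$$\Delta u := \sum_{i=1}^d \frac{\partial^2 u}{\partial x_i^2}$$
for which the preceding conditions are clearly satisfied. Note that here and in the rest of this section we commit an abuse of notation, writing $I$ instead of the injection $I_1 : H^1(\Omega) \to H^{-1}(\Omega)$ defined by
$$\langle I_1(u), v \rangle := \langle u \mid v \rangle_{L_2(\Omega)} := \langle j(u) \mid j_0(v) \rangle_{L_2(\Omega)}$$
for $u \in H^1(\Omega)$, $v \in H_0^1(\Omega)$, $j$ (resp. $j_0$) being the canonical injection of $H^1(\Omega)$ (resp. $H_0^1(\Omega)$) into $L_2(\Omega)$. Thus,
$$I_1 = j_0^\top \circ R \circ j = R_1 \circ j_0^* \circ j,$$
where $R$ (resp. $R_1$) is the Riesz isometry from $L_2(\Omega)$ (resp. $H_0^1(\Omega)$) onto its dual space:
$$\begin{array}{ccccc} & & L_2(\Omega)^* & \xrightarrow{\ j_0^\top\ } & H^{-1}(\Omega) \\ & & \uparrow R & & \uparrow R_1 \\ H^1(\Omega) & \xrightarrow{\ j\ } & L_2(\Omega) & \xrightarrow{\ j_0^*\ } & H_0^1(\Omega) \end{array}$$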
Note that since $j_0(H_0^1(\Omega))$ is dense in $L_2(\Omega)$, the maps $j_0^\top$ and $j_0^*$ are injective, as is $I_1$, so that identifying the spaces with their images is not a great abuse. Such an abuse is justified when for $u \in H^1(\Omega)$ one views $u$ and $\Delta u$ as distributions.
9.3.2 Energy Estimates and Existence Results

As already mentioned, existence results for the equation $Lu = f$ along with some boundary condition rely on the Lax-Milgram Theorem, so that we are led to consider the bilinear form $b$ associated with $L$. The proofs we give are for the case $m = 1$, even if we state a more general result (see for instance [3, 216] for the first assertion).

Theorem 9.13 (Gårding) If $L$ is uniformly elliptic of order $2m$, there exist constants $c_m$, $c_0 > 0$ such that
\[ \sum_{|\alpha|,|\beta| \le m} \int_\Omega a_{\alpha\beta} D^\alpha u\, D^\beta u \ge c_m \|u\|_m^2 - c_0 \|u\|_0^2 \qquad \forall u \in H_0^m(\Omega). \tag{9.19} \]
If $m = 1$, if the operator $L$ of (9.16) is uniformly elliptic and if for some $\beta > 0$ one has $a_0 \ge \beta$ a.e., then there exists some $c > 0$ such that
\[ \sum_{i,j=1}^{d} \int_\Omega a_{ij} D_i u\, D_j u + \int_\Omega a_0 u^2 \ge c \|u\|_1^2 \qquad \forall u \in H_0^1(\Omega). \tag{9.20} \]
If $m = 1$ and if the operator $L$ of (9.16) is uniformly elliptic, then there exist $c_0 > 0$, $c > 0$ such that for all $u \in H_0^1(\Omega)$
\[ \int_\Omega \sum_{i,j=1}^{d} a_{ij} D_i u\, D_j u + \int_\Omega \sum_{i=1}^{d} a_i u\, D_i u + \int_\Omega a_0 u^2 \ge c \|u\|_1^2 - c_0 \|u\|_0^2. \tag{9.21} \]
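Proof For $m = 1$, $u \in H_0^1(\Omega)$, $x \in \Omega$, taking $\alpha := (0, \ldots, 1, 0, \ldots, 0)$, $v_\alpha := v_i := D_i u(x)$ in the ellipticity condition (9.18) we obtain, with $c_1 := c_E$, $c_0 := c_E + \|a_0\|_\infty$,
\[ \sum_{i,j=1}^{d} \int_\Omega a_{ij} D_i u\, D_j u + \int_\Omega a_0 u^2 \ge c_E \int_\Omega \sum_{i=1}^{d} (D_i u)^2 + \int_\Omega a_0 u^2 \ge c_1 \|u\|_1^2 - c_0 \|u\|_0^2. \]
When for some $\beta > 0$ we have $a_0 \ge \beta$ a.e., setting $c := \min(c_E, \beta)$, we have the estimate
\[ \sum_{i,j=1}^{d} \int_\Omega a_{ij} D_i u\, D_j u + \int_\Omega a_0 u^2 \ge c_E \int_\Omega \sum_{i=1}^{d} (D_i u)^2 + \beta \int_\Omega u^2 \ge c \|u\|_1^2. \]
To get the final assertion we take $\varepsilon > 0$ such that $\varepsilon d \|a_i\|_\infty < 2 c_E$, we use the estimate
\[ \Big| \int_\Omega a_i u\, D_i u \Big| \le \|a_i\|_\infty \|u\|_0 \|D_i u\|_0 \le \tfrac{1}{2} \|a_i\|_\infty \big( \varepsilon^{-1} \|u\|_0^2 + \varepsilon \|D_i u\|_0^2 \big), \]
and we take $c_0 > \|a_0\|_\infty + (1/2\varepsilon) \sum_{i=1}^{d} \|a_i\|_\infty$, $c := c_E - \tfrac{1}{2} \varepsilon d \|a_i\|_\infty$.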
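The second inequality in the estimate of the first-order terms is the elementary inequality of Young with a parameter. For convenience, here is a short derivation, written for arbitrary real numbers $s, t \ge 0$ (these two letters are introduced only for this remark) and then applied with $s := \|u\|_0$, $t := \|D_i u\|_0$:

```latex
\[
0 \le \bigl(\varepsilon^{-1/2} s - \varepsilon^{1/2} t\bigr)^2
  = \varepsilon^{-1} s^2 - 2 s t + \varepsilon t^2
\quad\Longrightarrow\quad
s t \le \tfrac{1}{2}\bigl(\varepsilon^{-1} s^2 + \varepsilon t^2\bigr)
\qquad (\varepsilon > 0).
\]
```

The first inequality of the estimate is the Cauchy-Schwarz inequality in $L_2(\Omega)$ combined with the bound $|a_i| \le \|a_i\|_\infty$ a.e.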
The preceding inequalities incite us to transform the given problem into a more tractable one. We first consider the case of a Dirichlet condition with $\Omega$ bounded and of class $C^1$. We assume that for all $i, j \in \mathbb{N}_d$ we have $a_{ij} \in C_b^1(\Omega)$ and $a_i \in L_\infty(\Omega)$, and that $f$ and $g$ are given. We say that $u \in C^2(\Omega) \cap C(\overline{\Omega})$ is a classical solution of the system
\[ Lu = f, \qquad u|_{\partial\Omega} = g \]
if these equalities are pointwise relations. A weak solution of this system is an element $u \in H^1(\Omega)$ such that the trace $T(u)$ of $u$ on $\Gamma := \partial\Omega$ satisfies $T(u) = g$ and, for all $v \in H_0^1(\Omega)$, $b(u, v) = \int_\Omega f v$, where $b$ is the bilinear form introduced in (9.17). Let us note that the relation $b(u, v) = \int_\Omega f v$ or $b(u, v) = \langle f \mid j_0 v \rangle_{L_2(\Omega)} = \langle Rf, j_0 v \rangle$ can be written
\[ \langle Lu, v \rangle = \langle j_0^{\top} R f, v \rangle \]
for the coupling between $H_0^1(\Omega)$ and its dual. When it is satisfied for all $v \in H_0^1(\Omega)$, it means that $Lu = j_0^{\top} R f$, or $Lu = f$ when we identify $f$ with $j_0^{\top} R f \in H^{-1}(\Omega)$.

Proposition 9.11 Let $f \in L_2(\Omega)$, $g \in C^1(\Gamma)$ and let $L$ be a uniformly elliptic differential operator of order 2 in divergence form, $\Omega$ being bounded and of class $C^1$. A classical solution $u \in C_b^2(\Omega) \cap C^1(\overline{\Omega})$ of the system $Lu = f$, $u|_\Gamma = g$ is a weak solution of this system. Conversely, if $u \in H^1(\Omega)$ is a weak solution belonging to $C_b^2(\Omega) \cap C^1(\overline{\Omega})$, then $u$ is a classical solution of this system.

Proof If $u \in C_b^2(\Omega) \cap C^1(\overline{\Omega})$ is a classical solution, then $u \in H^1(\Omega)$ (and even $W_\infty^1(\Omega)$) and $T(u)$ is just the restriction of $u$ to $\Gamma$. Since $a_{ij} D_j u \in C^1(\Omega)$ we can use Green's formula:
\[ \int_\Omega D_i(a_{ij} D_j u)\, v + \int_\Omega a_{ij} D_j u\, D_i v = 0 \qquad i, j \in \mathbb{N}_d \]
for all $v \in C_c^\infty(\Omega)$, hence also for all $v \in H_0^1(\Omega)$ by density, since the left-hand side is a continuous linear form on $C_c^\infty(\Omega)$ with respect to the topology induced by $H^1(\Omega)$ as a function of $v$; here we use the relations $D_i a_{ij} D_j u \in C_b(\Omega) \subset L_2(\Omega)$ and $a_{ij} D_i D_j u \in L_2(\Omega)$. Summing on $i$, $j$ and using the relation $Lu = f$ we get $b(u, v) = \int_\Omega f v$ for all $v \in H_0^1(\Omega)$.

Conversely, let $u \in H^1(\Omega)$ belong to $C_b^2(\Omega) \cap C^1(\overline{\Omega})$ and satisfy $T(u) = g$ and $b(u, v) = \int_\Omega f v$ for all $v \in H_0^1(\Omega)$. Then we have $u|_\Gamma = g$. Using Green's formula again we get
\[ -\int_\Omega \sum_{i,j=1}^{d} D_i(a_{ij} D_j u)\, v + \int_\Omega \Big( \sum_{i=1}^{d} a_i D_i u + a_0 u \Big) v - \int_\Omega f v = 0 \qquad \forall v \in C_c^\infty(\Omega) \]
or $\langle Lu - f \mid v \rangle_{L_2} = 0$ for all $v \in C_c^\infty(\Omega)$, where $\langle \cdot \mid \cdot \rangle_{L_2}$ is the scalar product in $L_2(\Omega)$. Since $C_c^\infty(\Omega)$ is dense in $L_2(\Omega)$, we get $Lu = f$ in $L_2(\Omega)$ and $Lu$ is a continuous representative of $f$.

Now we observe that given $f_0 \in L_2(\Omega)$ and $g \in T(H^1(\Omega))$, the inhomogeneous weak problem

find $u \in H^1(\Omega)$ such that $T(u) = g$ and $b(u, v) = \langle f_0 \mid v \rangle_{L_2}$ for all $v \in H_0^1(\Omega)$

can be reduced to the homogeneous weak problem
\[ \text{find } w \in H_0^1(\Omega) \text{ such that } b(w, v) = \langle f \mid v \rangle_{L_2} \text{ for all } v \in H_0^1(\Omega) \tag{9.22} \]
where $f := f_0 - Lh$, with $h \in H^1(\Omega)$ such that $T(h) = g$. Setting $w := u - h$, this stems from the linearity of $b(\cdot, v)$ and from the fact that for all $v \in H_0^1(\Omega)$ one has $b(h, v) = \langle Lh \mid v \rangle_{L_2}$, as shown above. Thus, we concentrate on the Dirichlet problem (9.22) in a case incorporating the Laplace equation
\[ -\Delta u + u = f, \qquad u \in H_0^1(\Omega). \]
Theorem 9.14 (Dirichlet, Riemann, Hilbert) For all $f \in L_2(\Omega)$ the problem $Lu = f$, $u \in H_0^1(\Omega)$ in which $a = 0$ and $a_0 \ge \beta > 0$ has a unique weak solution. If, moreover, $a_{ij} = a_{ji}$ for all $(i, j) \in \mathbb{N}_d^2$ then the solution $u$ minimizes on $H_0^1(\Omega)$ the functional
\[ v \mapsto \tfrac{1}{2} b(v, v) - \langle f \mid v \rangle_{L_2}. \]

Proof The bilinear form $b$ is obviously continuous on $H_0^1(\Omega) \times H_0^1(\Omega)$, and even on $H^1(\Omega) \times H^1(\Omega)$. Moreover, when $a = 0$ and $a_0 \ge \beta > 0$, Gårding's inequality shows that $b$ is coercive on $H_0^1(\Omega)$. The Lax-Milgram Theorem ensures that there is a unique $u \in H_0^1(\Omega)$ such that $b(u, v) = \langle f \mid v \rangle_{L_2}$ for all $v \in H_0^1(\Omega)$. Thus $u$ is a solution of (9.22) with $h := 0$, hence is a weak solution of $Lu = f$, $u \in H_0^1(\Omega)$. The characterization of $u$ in the symmetric case is part of the Lax-Milgram Theorem.

Corollary 9.3 For all $f \in L_2(\Omega)$ the problem $Lu + \lambda u = f$, $u \in H_0^1(\Omega)$ in which $a = 0$ and $\lambda > \|a_0\|_\infty$ has a unique weak solution.

Proof This follows from the fact that when $\lambda > \|a_0\|_\infty$ there exists some $\beta > 0$ such that $a_0 + \lambda \ge \beta$.

Let us return to the inhomogeneous case.
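Before doing so, the content of Theorem 9.14 can be illustrated on the model problem $-u'' + u = f$ on $]0, 1[$ with $u(0) = u(1) = 0$. The sketch below is not the method of the text: it replaces $H_0^1(\Omega)$ by grid functions and $b$ by a finite-difference analogue, solves the resulting symmetric positive definite tridiagonal system, and evaluates a discrete version of the functional $v \mapsto \tfrac12 b(v, v) - \langle f \mid v \rangle_{L_2}$. The function names, the grid size and the right-hand side (chosen so that the exact solution is $\sin(\pi x)$) are assumptions made for the example.

```python
import math

def solve_dirichlet_1d(f, n, a0=1.0):
    """Finite-difference solution of -u'' + a0*u = f on (0, 1), u(0) = u(1) = 0.

    Returns the interior nodes, the nodal values and the sampled right-hand side.
    """
    h = 1.0 / n
    xs = [i * h for i in range(1, n)]
    diag = 2.0 / h**2 + a0          # diagonal entry of the tridiagonal matrix
    off = -1.0 / h**2               # sub- and super-diagonal entry
    rhs = [f(x) for x in xs]
    m = n - 1
    c = [0.0] * m
    d = [0.0] * m
    # Thomas algorithm: forward elimination
    c[0] = off / diag
    d[0] = rhs[0] / diag
    for i in range(1, m):
        pivot = diag - off * c[i - 1]
        c[i] = off / pivot
        d[i] = (rhs[i] - off * d[i - 1]) / pivot
    # back substitution
    u = [0.0] * m
    u[-1] = d[-1]
    for i in range(m - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return xs, u, rhs

def energy(v, rhs, h, a0=1.0):
    """Discrete analogue of v -> (1/2) b(v, v) - <f | v>."""
    ext = [0.0] + list(v) + [0.0]
    grad = sum((ext[i + 1] - ext[i]) ** 2 for i in range(len(ext) - 1)) / h
    mass = a0 * h * sum(t * t for t in v)
    load = h * sum(r * t for r, t in zip(rhs, v))
    return 0.5 * (grad + mass) - load

n = 100
f = lambda x: (math.pi ** 2 + 1.0) * math.sin(math.pi * x)
xs, u, rhs = solve_dirichlet_1d(f, n)
err = max(abs(ui - math.sin(math.pi * x)) for x, ui in zip(xs, u))
perturbed = [ui + 0.01 * math.sin(2 * math.pi * x) for x, ui in zip(xs, u)]
gap = energy(perturbed, rhs, 1.0 / n) - energy(u, rhs, 1.0 / n)
```

Here `err` measures the distance to the exact solution at the nodes, and `gap` is positive: perturbing the discrete solution increases the discrete energy, in agreement with the minimization property in the symmetric case.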
Corollary 9.4 Given $f \in L_2(\Omega)$, $g := T(h)$ with $h \in H^1(\Omega)$, the problem $Lu = f$, $u \in H^1(\Omega)$, $T(u) = g$ in which $a = 0$ and $a_0 \ge \beta > 0$ has a unique weak solution. If, moreover, $a_{ij} = a_{ji}$ for all $(i, j) \in \mathbb{N}_d^2$ then the solution $u$ minimizes on the affine space $H_0^1(\Omega) + h$ the functional
\[ F : v \mapsto \tfrac{1}{2} b(v, v) - \langle f \mid v \rangle_{L_2}. \]

Proof The first assertion follows from the theorem using the translation $v \mapsto v + h$. The second one stems from the fact that $u \in H_0^1(\Omega) + h$, $F(u) \le F(v)$ for all $v \in H_0^1(\Omega) + h$ is equivalent to $w := u - h \in H_0^1(\Omega)$, $F(w) \le F(z)$ for all $z \in H_0^1(\Omega)$, as one can see by using the relation $b(w, \cdot) = \langle f \mid \cdot \rangle_{L_2}$.

Now let us consider the general (homogeneous) case in which $L$ is given by (9.16) and the bilinear form $b$ is given by (9.17) with $a \ne 0$. The operator
\[ L : u \mapsto -\sum_{i,j=1}^{d} D_i(a_{ij} D_j u) + \sum_{i=1}^{d} a_i D_i u + a_0 u \]
can be seen as an unbounded operator with domain $H_0^2(\Omega) \subset L_2(\Omega)$. We rather view it as a continuous linear map $L : H_0^1(\Omega) \to H_0^1(\Omega)^{*}$ associated with the bilinear form $b$.

Theorem 9.15 For $f \in L_2(\Omega)$ and $L$ given by (9.16), the set of $u \in H_0^1(\Omega)$ such that $Lu = f$ is either a singleton for all $f \in L_2(\Omega)$ or else a finite dimensional affine subspace of $H_0^1(\Omega)$ provided $f$ satisfies a finite number of linear relations; otherwise it is empty.

Proof We fix $\lambda \in [c_0, \infty[$, where $c_0$ is as in the last assertion of Theorem 9.13, and we set
\[ L_\lambda := L + \lambda I, \qquad b_\lambda(u, v) := b(u, v) + \lambda \langle u \mid v \rangle_{L_2}, \]
writing $I$ instead of the injection $I_1 : H^1(\Omega) \to H^{-1}(\Omega)$ or its restriction to $H_0^1(\Omega)$ and $u$, $v$ instead of $j_0(u)$, $j_0(v) \in L_2(\Omega)$. Since $b_\lambda$ is coercive on $H_0^1(\Omega)$, by the Lax-Milgram Theorem, for all $f_0 \in L_2(\Omega)$ there exists a unique $u_\lambda \in H_0^1(\Omega)$ such that
\[ b_\lambda(u_\lambda, v) = \langle f_0 \mid v \rangle_{L_2} \qquad \forall v \in H_0^1(\Omega). \]
Equivalently, $u_\lambda = L_\lambda^{-1}(j_0^{\top}(R f_0))$, but we simply write $u_\lambda = L_\lambda^{-1} f_0$, identifying $f_0$ with its image under $j_0^{\top} \circ R$ in $H^{-1}(\Omega)$. We note that, given $f \in L_2(\Omega)$, $u \in H_0^1(\Omega)$ is a weak solution of $Lu = f$, $T(u) = 0$ if and only if
\[ b_\lambda(u, v) = \langle f + \lambda u \mid v \rangle_{L_2} \qquad \forall v \in H_0^1(\Omega) \]
or $u = L_\lambda^{-1}(f + \lambda u)$, or
\[ u - \lambda L_\lambda^{-1} u = L_\lambda^{-1} f. \]
Now the operator $L_\lambda^{-1} : L_2(\Omega) \to H_0^1(\Omega)$ (standing for $L_\lambda^{-1} \circ j_0^{\top} \circ R = L_\lambda^{-1}|_{L_2(\Omega)}$) is continuous since by the coercivity property of Theorem 9.13 there exists some $c > 0$ such that
\[ c \|u_\lambda\|_1^2 \le b_\lambda(u_\lambda, u_\lambda) = \langle f \mid u_\lambda \rangle_{L_2} \le \|f\|_{L_2} \|u_\lambda\|_1, \quad \text{hence} \quad \|u_\lambda\|_1 \le c^{-1} \|f\|_{L_2}. \]
Composing $L_\lambda^{-1}$ (or rather $L_\lambda^{-1} \circ j_0^{\top} \circ R = L_\lambda^{-1}|_{L_2(\Omega)}$) with the canonical injection $j_0 : H_0^1(\Omega) \to L_2(\Omega)$, by the Rellich-Kondrachov Theorem we get a compact operator $A_\lambda := \lambda\, j_0 \circ L_\lambda^{-1}$ in $L_2(\Omega)$. Thus $u$ is a weak solution if and only if for $f_\lambda := j_0(L_\lambda^{-1} f)$ we have
\[ u - A_\lambda u = f_\lambda, \]
$u$ being considered as the element $j_0(u)$ of $L_2(\Omega)$, and we can apply the Fredholm alternative: $N(I - A_\lambda) = R(I - A_\lambda^{\top})^{\perp}$ is finite dimensional and if $N(I - A_\lambda) = \{0\}$, then $I - A_\lambda$ is an isomorphism. Moreover, $R(I - A_\lambda)$ is a finite codimensional subspace, and the Fredholm alternative asserts that $f_\lambda \in R(I - A_\lambda)$ if and only if
\[ \langle f_\lambda \mid v \rangle_{L_2} = 0 \qquad \forall v \in N(I - A_\lambda^{\top}). \]
Corollary 9.5 Let $L$ be given by (9.16). Then there exists a finite or countable subset $C$ of $\mathbb{R}$ such that for all $f \in L_2(\Omega)$ the equation $Lu - \mu u = f$ has a unique weak solution in $H_0^1(\Omega)$ if and only if $\mu \in \mathbb{R} \setminus C$. If $C$ is infinite it is of the form $C = \{\mu_n : n \in \mathbb{N}\}$ where $(\mu_n)$ is an increasing sequence with limit $+\infty$.

Proof We take $\lambda \in [c_0, \infty[$ and set $L_\lambda := L + \lambda I$ as in the preceding proof. The Fredholm alternative asserts that the equation $Lu - \mu u = f$ has a unique weak solution in $H_0^1(\Omega)$ if and only if $N(L - \mu I) = \{0\}$. This relation is equivalent to $N(L_\lambda - (\lambda + \mu) I) = \{0\}$. Setting $K_\lambda := L_\lambda^{-1}$, considered as a compact linear operator from $L_2(\Omega)$ into itself, the relation $L_\lambda(u) = (\lambda + \mu) u$ is equivalent to $u = (\lambda + \mu) K_\lambda(u)$. Thus $u \in N(L - \mu I) \setminus \{0\}$ if and only if $\lambda + \mu \ne 0$ and $u$ is an eigenvector of $K_\lambda$ associated with the eigenvalue $(\lambda + \mu)^{-1}$. The spectral analysis of the compact operator $K_\lambda$ expounded in Theorem 3.31 yields the conclusion.

We can derive from the preceding corollary a characterization of the spectrum of the principal part $u \mapsto L_0(u) := -\sum_{i,j=1}^{d} D_i(a_{ij} D_j u)$ of the operator $L$ in the case $a_{ij} = a_{ji}$ and $\Omega$ is bounded. Then, by ellipticity and by Poincaré's inequality, the operator $L_0$ is symmetric and positive. The eigenvalues of $K_0 := L_0^{-1}$ form a decreasing sequence $(\kappa_n) \to 0_+$. Thus, the eigenvalues of $L_0$ form the increasing sequence $(\mu_n)$ with $\mu_n := 1/\kappa_n$. Taking a basis of the eigenspace associated with $\mu_n$, we obtain the following result.
Proposition 9.12 When $\Omega$ is bounded, $a_{ij} = a_{ji}$, $a_i = 0$, $a_0 = 0$, the eigenvalues of the elliptic operator $L = L_0$ form an increasing sequence of positive numbers with limit $+\infty$ and there exists an orthonormal sequence in $L_2(\Omega)$ formed of eigenvectors of $L$.

Exercise Interpret the adjoint of $A_\lambda$ with the help of the formal adjoint operator
\[ L^{*} u = -\sum_{i,j=1}^{d} D_i(a_{ji} D_j u) + a_0 u, \qquad u \in H_0^1(\Omega), \]
that is associated with the bilinear form $b^{\top}$ given by
\[ b^{\top}(u, v) := b(v, u) \qquad \forall u, v \in H_0^1(\Omega). \]
[Hint: use Green's formula
\[ \langle Lu \mid v \rangle_{L_2} = b(u, v) = b^{\top}(v, u) = \langle L^{*} v \mid u \rangle_{L_2} \qquad \forall u, v \in H_0^1(\Omega).] \]
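Proposition 9.12 can be observed on the simplest example, $L_0 u = -u''$ on $\Omega = ]0, 1[$, whose eigenvalues are $(k\pi)^2$ with eigenfunctions $\sin(k\pi x)$. The following sketch works with the standard three-point finite-difference matrix instead of the operator itself (an assumption of the illustration, as are the names and the grid size); the sampled sine functions are exact eigenvectors of that matrix, so the check only involves a Rayleigh quotient and a residual.

```python
import math

def apply_dirichlet_laplacian(v, h):
    """Apply the finite-difference matrix of u -> -u'' with zero boundary values."""
    ext = [0.0] + list(v) + [0.0]
    return [(2 * ext[i] - ext[i - 1] - ext[i + 1]) / h**2
            for i in range(1, len(ext) - 1)]

def dot(x, y):
    return sum(s * t for s, t in zip(x, y))

n = 200
h = 1.0 / n
xs = [i * h for i in range(1, n)]
# Sampled candidates sin(k*pi*x), k = 1, ..., 5
modes = [[math.sin(k * math.pi * x) for x in xs] for k in range(1, 6)]

eigs, residuals = [], []
for v in modes:
    Av = apply_dirichlet_laplacian(v, h)
    mu = dot(Av, v) / dot(v, v)                      # Rayleigh quotient
    eigs.append(mu)
    residuals.append(math.sqrt(sum((s - mu * t) ** 2 for s, t in zip(Av, v))))
```

The list `eigs` is increasing and positive, its first entry is close to $\pi^2$, and distinct modes are orthogonal for the discrete scalar product, mirroring the orthonormal sequence of eigenvectors in the proposition.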
9.3.3 Regularity of Solutions

The question of regularity of the weak solutions of the equation $Lu = f$, or even of the solutions to the Laplace equation
\[ -\Delta u + u = f \tag{9.23} \]
is delicate. We just state typical results and give an idea of the proofs.

Theorem 9.16 (Agmon-Douglis-Nirenberg) If $\Omega$ is an open subset of class $C^2$ of $\mathbb{R}^d$ and $p \ge 2$, there exists some constant $c > 0$ such that if $f \in L_p(\Omega) \cap L_2(\Omega)$ then the unique weak solution $u \in H_0^1(\Omega)$ of equation (9.23) belongs to $W_p^2(\Omega)$ and one has
\[ \|u\|_{2,p} \le c \|f\|_p \qquad \forall f \in L_p(\Omega). \]
In particular, if $\Omega$ is bounded and if $f \in C(\overline{\Omega})$, then $u \in C^1(\overline{\Omega})$.

Proof Let us give a proof in the case $p = 2$, $\Omega = \mathbb{R}^d$ and sketch the case $\Omega := \mathbb{R}^{d-1} \times \mathbb{P}$ and the case of an arbitrary open subset of class $C^2$. The existence and uniqueness of the weak solution $u$ stems from Theorem 9.14 or Corollary 9.3. For $v \in L_2(\Omega)$, $w \in \mathbb{R}^d \setminus \{0\}$ let us set
\[ q_w(v) := \frac{1}{\|w\|} (T_w(v) - v), \]
where $T_w(v)(x) := (v \circ t_w)(x) := v(x - w)$. By Proposition 9.4 it suffices to prove that the norm $\|q_w(u)\|_p$ in $L_p(\Omega)$ of the difference quotient $q_w(u)$ is bounded above by $c \|f\|_p$ for some $c > 0$. Since $\Omega = \mathbb{R}^d$, for all $v \in L_2(\Omega)$ we have
\[ \int_\Omega u\, T_w(v) = \int_\Omega v\, T_{-w}(u), \tag{9.24} \]
\[ \int_\Omega u\, q_w(v) = \int_\Omega v\, q_{-w}(u). \tag{9.25} \]
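Taking $v := q_w(u)$ in relation (9.25) (written with $-w$ in place of $w$) and then observing that $q_w(D_i u) = D_i q_w(u)$ for $i \in \mathbb{N}_d$, we get
\[ \int_\Omega u\, q_{-w}(q_w(u)) = \int_\Omega q_w(u)\, q_w(u), \qquad \int_\Omega \nabla u \cdot \nabla q_{-w}(q_w(u)) = \int_\Omega \nabla q_w(u) \cdot \nabla q_w(u), \]
hence, using the relation
\[ \int_\Omega \nabla u \cdot \nabla \varphi + \int_\Omega u \varphi = \int_\Omega f \varphi \qquad \forall \varphi \in H_0^1(\Omega) \tag{9.26} \]
expressing that $u \in H_0^1(\Omega)$ is the weak solution of equation (9.23), taking $\varphi := q_{-w}(q_w(u))$, we obtain
\[ \int_\Omega \|\nabla q_w(u)\|^2 + \int_\Omega |q_w(u)|^2 = \int_\Omega f\, q_{-w}(q_w(u)). \]
Thus
\[ \|q_w(u)\|_{1,2}^2 \le \|f\|_2\, \|q_{-w}(q_w(u))\|_2. \]
By Proposition 9.4, for all $v \in H^1(\Omega)$ we have $\|q_{-w}(v)\|_2 \le \|v\|_{1,2}$. Taking $v := q_w(u)$ and simplifying by $\|q_w(u)\|_{1,2}$, we obtain
\[ \|q_w(u)\|_{1,2} \le \|f\|_2. \tag{9.27} \]
Since $w$ is arbitrary in $\mathbb{R}^d$, applying again Proposition 9.4, and observing that $\|q_w(D_i u)\|_2 = \|D_i q_w(u)\|_2 \le \|f\|_2$, we get $D_i u \in H^1(\Omega)$ for all $i \in \mathbb{N}_d$, hence $u \in H^2(\Omega)$ and $\|u\|_{H^2(\Omega)} \le c \|f\|_2$ for some $c > 0$.

In the case when $\Omega := \mathbb{R}^{d-1} \times \mathbb{P}$, for $w$ horizontal, i.e. $w \in \mathbb{R}^{d-1} \times \{0\}$, by the remark following Proposition 9.4, relation (9.27) is still valid. Since for $i \in \mathbb{N}_d$ one has $q_w(D_i u) = D_i q_w(u)$ one obtains
\[ \|q_w(D_i u)\|_{L_2(\Omega)} \le \|f\|_2. \]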
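Given $\varphi \in C_c^\infty(\Omega)$, taking $v := D_i \varphi$ in relation (9.25) we see that
\[ \Big| \int_\Omega u\, q_w(D_i \varphi) \Big| = \Big| \int_\Omega \varphi\, q_{-w}(D_i u) \Big| \le \|\varphi\|_2 \|f\|_2. \]
Replacing $w$ with $t e_k$, $k \in \mathbb{N}_{d-1}$, $t > 0$ and passing to the limit as $t \to 0_+$, we get
\[ \Big| \int_\Omega u\, D_k D_i \varphi \Big| \le \|\varphi\|_2 \|f\|_2 \qquad \forall \varphi \in C_c^\infty(\Omega),\ i \in \mathbb{N}_d. \tag{9.28} \]
Let us prove that, for some $c > 0$,
\[ \Big| \int_\Omega u\, D_d^2 \varphi \Big| \le c \|\varphi\|_2 \|f\|_2 \qquad \forall \varphi \in C_c^\infty(\Omega). \]
Returning to equation (9.26) and using relation (9.28) we get
\[ \Big| \int_\Omega u\, D_d^2 \varphi \Big| \le \sum_{k=1}^{d-1} \Big| \int_\Omega u\, D_k^2 \varphi \Big| + \Big| \int_\Omega (f - u) \varphi \Big| \le c \|\varphi\|_2 \|f\|_2 \qquad \forall \varphi \in C_c^\infty(\Omega) \]
for some $c > 0$. Combining these inequalities with relation (9.28) we obtain
\[ \Big| \int_\Omega u\, D_j D_i \varphi \Big| \le c \|\varphi\|_2 \|f\|_2 \qquad \forall \varphi \in C_c^\infty(\Omega),\ i, j \in \mathbb{N}_d. \]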
Riesz's Theorem yields some $u_{i,j} \in L_2(\Omega)$ such that $\int_\Omega u\, D_j D_i \varphi = \int_\Omega u_{i,j} \varphi$ for all $\varphi \in C_c^\infty(\Omega)$. It follows that $u \in H^2(\Omega)$. The case of a general open subset is treated by using a partition of unity. We refer to specialized books about partial differential equations for the proof.

This result can be generalized to the case of a weak solution of the equation $Lu = f$ for $L$ a uniformly elliptic differential operator of order two. We start with a glance at interior regularity. We denote by $H^m_{loc}(\Omega)$ the space of $u \in L_{2,loc}(\Omega)$ whose weak derivatives up to order $m$ are in $L_{2,loc}(\Omega)$. For $m = 0$ we set $H^m(\Omega) := L_2(\Omega)$.

Theorem 9.17 Assume that for some $m \in \mathbb{N}$ the coefficients of the uniformly elliptic operator $L$ are in $C^{m+1}(\Omega)$ and $f \in H^m(\Omega)$. If $u \in H^1(\Omega)$ is a weak solution of the equation $Lu = f$, then $u$ belongs to $H^{m+2}_{loc}(\Omega)$ and for any open set $\Omega'$ whose closure is compact and contained in $\Omega$, for some $c > 0$ depending only on $m$, $\Omega$, $\Omega'$, and the coefficients of $L$ one has the estimate
\[ \|u\|_{H^{m+2}(\Omega')} \le c \|f\|_{H^m(\Omega)} + c \|u\|_{L_2(\Omega)}. \]
Note that since $u \in H^2_{loc}(\Omega)$, we can use Green's formula to integrate by parts and transform the relation $b(u, v) = \langle f \mid v \rangle$ for all $v \in C_c^\infty(\Omega)$ into
\[ \langle Lu \mid v \rangle = \langle f \mid v \rangle \qquad \forall v \in C_c^\infty(\Omega). \tag{9.29} \]
This relation implies that $Lu = f$ a.e. in $\Omega$. Thus, under additional assumptions as in the next corollary, $u$ is a classical solution, the relation $Lu = f$ holding everywhere.

Corollary 9.6 Assume that the coefficients of the uniformly elliptic operator $L$ are in $C^\infty(\Omega)$ and $f \in C^\infty(\Omega)$. If $u \in H^1(\Omega)$ is a weak solution of the equation $Lu = f$, then $u$ belongs to $C^\infty(\Omega)$.

Under refined assumptions, one can get regularity up to the boundary; see [3, 52, 117, 128, 139, 202] for instance. We denote by $\overline{\Omega}$ the closure of $\Omega$.

Theorem 9.18 Let $\Omega$ be of class $C^{m+2}$ for some $m \in \mathbb{N}$. Suppose the coefficients of $L$ belong to $C^{m+1}(\overline{\Omega})$, $f \in H^m(\Omega)$, and $u \in H_0^1(\Omega)$ is a weak solution of the equation $Lu = f$. Then $u \in H^{m+2}(\Omega)$ and for some $c > 0$ depending only on $m$, $\Omega$, and the coefficients of $L$, one has the estimate
\[ \|u\|_{H^{m+2}(\Omega)} \le c \|f\|_{H^m(\Omega)} + c \|u\|_{L_2(\Omega)}. \]
If, moreover, the weak solution is unique, as in Theorem 9.14, one has the estimate
\[ \|u\|_{H^{m+2}(\Omega)} \le c \|f\|_{H^m(\Omega)}. \]
Also one can get regularity results in the class $C^{2,s}(\overline{\Omega})$ of functions that are twice differentiable and whose second-order derivatives are $s$-Hölderian with $s \in ]0, 1[$, i.e. satisfy inequalities of the form $|v(x) - v(x')| \le c \|x - x'\|^s$ for $x$, $x' \in \overline{\Omega}$. See [136], [139].

Theorem 9.19 (Schauder) If $\Omega$ is of class $C^{2,s}$, with $s \in ]0, 1[$ and if $f \in C^{0,s}(\overline{\Omega})$, then a weak solution $u$ of the equation $Lu = f$ is a classical solution in $C^{2,s}(\overline{\Omega})$ and there exists some $c > 0$ such that
\[ \|u\|_{C^{2,s}(\overline{\Omega})} \le c \|f\|_{C^{0,s}(\overline{\Omega})}. \]
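The difference-quotient argument in the proof of Theorem 9.16 rests on the bound of the $L_2$ norm of $q_w(v)$ by the norm of the derivative of $v$, uniformly in $w$. The sketch below checks this numerically in dimension one for the Gaussian $u(x) = e^{-x^2}$; the test function, the truncation of $\mathbb{R}$ to $[-12, 12]$ and the trapezoidal quadrature are assumptions of the illustration, not part of the text.

```python
import math

def l2_norm(fn, lo=-12.0, hi=12.0, n=24000):
    """Trapezoidal approximation of the L2 norm of fn over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (fn(lo) ** 2 + fn(hi) ** 2)
    total += sum(fn(lo + k * h) ** 2 for k in range(1, n))
    return math.sqrt(h * total)

def u(x):
    return math.exp(-x * x)

def du(x):
    return -2.0 * x * math.exp(-x * x)

def q(w):
    """Difference quotient q_w(u) = (T_w u - u)/|w| with T_w u(x) = u(x - w)."""
    return lambda x: (u(x - w) - u(x)) / abs(w)

norm_du = l2_norm(du)
norms_q = {w: l2_norm(q(w)) for w in (1.0, 0.5, 0.1, 0.01)}
```

Each value in `norms_q` stays below `norm_du`, and the value for the smallest shift is close to it, which is the behaviour used to pass from bounded difference quotients to the existence of a weak derivative.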
9.3.4 Maximum Principles

We have seen that for an open subset $\Omega$ of $\mathbb{R}^d$, when a function $u : \Omega \to \mathbb{R}$ attains a local maximum at some point $x \in \Omega$ and is twice differentiable there, one has
\[ Du(x) = 0, \qquad D^2 u(x) \le 0. \tag{9.30} \]
The results we present in this subsection make use of such a fact. They have important consequences; see [52, 117, 214, 248]. Again, we consider an elliptic differential operator of order two:
\[ (Lu)(x) := -\sum_{i,j=1}^{d} a_{i,j}(x) D^2_{i,j} u(x) + \sum_{i=1}^{d} a_i(x) D_i u(x) + a_0(x) u(x) \qquad \forall x \in \Omega. \]
As for Poincaré's theorem, we say that the set $\Omega$ has a bounded width in a direction $y \in \mathbb{R}^d \setminus \{0\}$ if for some $b \in \mathbb{R}_+$ and all $x \in \Omega$ one has $|x \cdot y| \le b \|y\|$.

Theorem 9.20 (Weak Maximum Principle)
(a) Let $u \in C^2(\Omega) \cap C(\overline{\Omega})$ be such that $(Lu)(x) < 0$ for all $x \in \Omega$. Then, if $a_0(\cdot) = 0$, one has
\[ u(x) < \sup u(\partial\Omega) \qquad \forall x \in \Omega. \tag{9.31} \]
(b) If $\Omega$ has a bounded width in some direction $y$, if $a_0(\cdot) = 0$ on $\Omega$, and if $u \in C^2(\Omega) \cap C(\overline{\Omega})$ is such that $(Lu)(x) \le 0$ for all $x \in \Omega$ then one has
\[ \sup u(\Omega) \le \sup u(\partial\Omega). \tag{9.32} \]
(c) If $\Omega$ has a bounded width in some direction $y \in \mathbb{R}^d$, if $a_0(\cdot) \ge 0$ on $\Omega$, and if $u \in C^2(\Omega) \cap C(\overline{\Omega})$ is such that $(Lu)(x) \le 0$ for all $x \in \Omega$ then one has
\[ \sup u(\Omega) \le \sup u^{+}(\partial\Omega). \tag{9.33} \]

Proof (a) We assume that there is some $x \in \Omega$ such that $u(x) \ge \sup u(\partial\Omega)$ and we show that a contradiction occurs. Changing $u$ into $u - \sup u(\partial\Omega)$ we may suppose $u(x) \ge 0$ and $\sup u(\partial\Omega) = 0$. Then, taking $\varphi \in C_c^\infty(\Omega)$ with $\varphi = 1$ around $x$ and changing $u$ into $\varphi u$ and $x$ into another point, we may suppose $u(x) = \sup u(\Omega)$. Since $D^2_{i,j} u(x) = D^2_{j,i} u(x)$, there is no loss of generality in assuming $(a_{i,j})$ is symmetric, replacing $a_{i,j}$ with $\tfrac{1}{2}(a_{i,j} + a_{j,i})$ if necessary. Then relation (9.30) holds. Since the matrix $A(x)$ is symmetric, there exist an orthogonal matrix $Q$ and a diagonal matrix $B$ such that $A(x) = Q B Q^{\top}$. Let $(e_1, \ldots, e_d)$ be the canonical basis of $\mathbb{R}^d$ and let $H := (D^2_{i,j} u(x))$. Since for two matrices $M$, $N$ one has $\operatorname{tr}(MN) = \operatorname{tr}(NM)$, the relation $Lu(x) < 0$ reads
\[ 0 < \operatorname{tr}(A(x) H) = \operatorname{tr}(Q B Q^{\top} H) = \operatorname{tr}(B Q^{\top} H Q). \]
Since $B$ is diagonal and its elements are positive by ellipticity, for some $b_i > 0$ we obtain $\sum_{1 \le i \le d} b_i \langle H Q e_i \mid Q e_i \rangle > 0$. This contradicts the fact that $D^2 u(x) \le 0$.

(b) Suppose $u \in C^2(\Omega) \cap C(\overline{\Omega})$ is such that $(Lu)(x) \le 0$ for all $x \in \Omega$. Given $r > 0$ such that $r c_E > \|a\|_\infty$, where $c_E$ is the uniform ellipticity constant of $L$, $a := (a_1, \ldots, a_d)$, and $y := (y_1, \ldots, y_d)$, for $\varepsilon > 0$ we set
\[ u_\varepsilon(x) := u(x) + \varepsilon e^{r x \cdot y} \qquad x \in \Omega. \]
Then, assuming without loss of generality that $\|y\| = 1$, for all $x \in \Omega$ we have $A(x) y \cdot y \ge c_E$, hence $-\sum_{1 \le i,j \le d} r^2 a_{i,j}(x) y_i y_j + \sum_{1 \le i \le d} r a_i(x) y_i < 0$ and
\[ (L u_\varepsilon)(x) = (Lu)(x) + \varepsilon e^{r x \cdot y} \Big( -r^2 \sum_{1 \le i,j \le d} a_{i,j}(x) y_i y_j + r \sum_{1 \le i \le d} a_i(x) y_i \Big) < 0. \]
Part (a) ensures that $\sup u_\varepsilon(\Omega) < \sup u_\varepsilon(\partial\Omega)$. Since the function $x \mapsto e^{r x \cdot y}$ is bounded on $\Omega$, passing to the limit as $\varepsilon \to 0_+$, we get $\sup u(\Omega) \le \sup u(\partial\Omega)$.

(c) See [74].

One can give a more striking maximum principle (see [117]).

Theorem 9.21 (Hopf) Let $\Omega$ be a connected, bounded, open subset of $\mathbb{R}^d$. Suppose $a_0 \ge 0$ on $\Omega$ and $u \in C^2(\Omega) \cap C(\overline{\Omega})$ satisfies $Lu \le 0$ on $\Omega$. Then, if $u$ attains its maximum over $\overline{\Omega}$ at some $x \in \Omega$ and if $u(x) \ge 0$, then $u$ is constant on $\Omega$.

Of course, one has the same conclusion if $Lu \ge 0$ on $\Omega$ and if $u$ attains its minimum on $\overline{\Omega}$ at some $x \in \Omega$, with $u(x) \le 0$.
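Part (a) of Theorem 9.20 can be observed on an explicit example. In the sketch below, $\Omega$ is the open unit square of $\mathbb{R}^2$, $L u = -\Delta u + a_1 D_1 u$ with $a_1 = 0.5$, $a_0 = 0$, and $u(x, y) = e^x + y^2$, so that $Lu = -0.5 e^x - 2 < 0$. All of these choices, the grid and the finite-difference evaluation of $Lu$ are assumptions made for the illustration; the code samples $u$ on a grid and compares its interior and boundary suprema.

```python
import math

def u(x, y):
    return math.exp(x) + y * y

def L_u(x, y, h=1e-3, a1=0.5, a2=0.0):
    """(Lu)(x, y) = -(u_xx + u_yy) + a1*u_x + a2*u_y by central differences."""
    uxx = (u(x + h, y) - 2 * u(x, y) + u(x - h, y)) / h**2
    uyy = (u(x, y + h) - 2 * u(x, y) + u(x, y - h)) / h**2
    ux = (u(x + h, y) - u(x - h, y)) / (2 * h)
    uy = (u(x, y + h) - u(x, y - h)) / (2 * h)
    return -(uxx + uyy) + a1 * ux + a2 * uy

n = 20
interior = [(i / n, j / n) for i in range(1, n) for j in range(1, n)]
boundary = [(i / n, j / n) for i in range(n + 1) for j in range(n + 1)
            if i in (0, n) or j in (0, n)]

max_Lu = max(L_u(x, y) for x, y in interior)        # negative: u is a strict subsolution
sup_interior = max(u(x, y) for x, y in interior)
sup_boundary = max(u(x, y) for x, y in boundary)    # attained at the corner (1, 1)
```

As expected from (9.31), `sup_interior` is strictly smaller than `sup_boundary`.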
Exercises

1. Write down explicitly the passages from the general form of a partial differential operator to its divergence form and vice versa in the case when the operator is of order 2 and the coefficients are smooth enough.

2. For the homogeneous Neumann problem of finding a function $u : \Omega \to \mathbb{R}$ satisfying
\[ -\Delta u + u = f \ \text{in } \Omega, \qquad \frac{\partial u}{\partial n} = 0 \ \text{on } \partial\Omega, \]
where $\Omega$ is a bounded open subset of $\mathbb{R}^d$ of class $C^1$, one says that $u$ is a classical solution if $u \in C^2(\overline{\Omega})$ and the preceding relations are satisfied, where $\frac{\partial u}{\partial n}(x) := \nabla u(x) \cdot n(x)$, $n(x)$ being the outward normal to $\Omega$ at $x \in \partial\Omega$. A weak solution is an element $u$ of $H^1(\Omega)$ satisfying
\[ \int_\Omega \nabla u \cdot \nabla v + \int_\Omega u v = \int_\Omega f v \qquad \forall v \in H^1(\Omega). \]
Show that a classical solution is a weak solution. Prove that for any given $f \in L_2(\Omega)$ there exists a unique weak solution $u$ of the Neumann problem and that $u$ is given as the solution to the minimization problem of the function $v \mapsto \tfrac{1}{2} \int_\Omega (|\nabla v|^2 + v^2) - \int_\Omega f v$ on $H^1(\Omega)$.

3. (Poisson's integral formula for the upper half-space) Given a sufficiently smooth function $g : \mathbb{R} \to \mathbb{R}$, show that the function $u : \mathbb{R} \times \mathbb{P} \to \mathbb{R}$ given by
\[ u(x, y) := \frac{y}{\pi} \int_{-\infty}^{\infty} \frac{g(r)}{(x - r)^2 + y^2}\, dr \]
satisfies Laplace's equation and can be continuously extended to $\mathbb{R} \times \mathbb{R}_+$ so that it satisfies the Dirichlet condition $u(x, 0) = g(x)$.
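4. (Poisson's integral formula for the unit ball $B_d$ of $\mathbb{R}^d$) Given a continuous function $g$ on the unit sphere $S^{d-1}$ show that $u$ given by
\[ u(x) := \int_{S^{d-1}} \frac{1 - \|x\|^2}{d\, \lambda_d(B_d)\, \|x - y\|^d}\, g(y)\, d\sigma(y) \]
satisfies the equation $\Delta u = 0$ in $B_d$ and $\lim_{x \to \bar{x}} u(x) = g(\bar{x})$ for $\bar{x} \in S^{d-1}$. [See [151, Section 4.1.3].]

5. (Hopf's Lemma) Let $a_0 = 0$ and let $u \in C^2(\Omega) \cap C^1(\overline{\Omega})$ be such that $Lu \le 0$ on $\Omega$. Suppose that for some $\bar{x} \in \partial\Omega$ there exists an open ball $B \subset \Omega$ such that $\bar{x} \in \partial B$ and $u(\bar{x}) > u(x)$ for all $x \in \Omega$. Prove that $\frac{\partial u}{\partial n}(\bar{x}) > 0$, where $n$ is the outer normal to $\Omega$ at $\bar{x}$. [See [117, p. 330].]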
9.4 Nonlinear Problems

Nonlinear problems are much more intricate than linear problems. For this reason, some results are obtained via linearization; in general, however, they are only local results. In some cases it is possible to reduce nonlinear problems to linear problems by using adapted transformations. Besides such reductions, in view of the abundance of techniques for dealing with nonlinear problems (see [7, 8, 58, 62, 68, 77, 92, 103, 188, 266]), we restrict our attention to two important methods: order methods and dissipativity methods.
9.4.1 Transforming Equations

In some cases the Legendre transform is a powerful means to pass from a nonlinear equation to a linear equation. This is so for the minimal surface equation
\[ \operatorname{div}\Big( \frac{\nabla u}{(1 + \|\nabla u\|^2)^{1/2}} \Big) = 0. \]
In dimension 2, setting $p := p(r, s) := \frac{\partial u}{\partial r}(r, s)$, $q := q(r, s) := \frac{\partial u}{\partial s}(r, s)$, this equation can be rewritten as
\[ (1 + q^2) \frac{\partial^2 u}{\partial r^2}(r, s) - 2pq \frac{\partial^2 u}{\partial r \partial s}(r, s) + (1 + p^2) \frac{\partial^2 u}{\partial s^2}(r, s) = 0. \tag{9.34} \]
Assume that on some open subset $U$ of $\mathbb{R}^2$ the map $(r, s) \mapsto \nabla u(r, s) := (p(r, s), q(r, s))$ is a $C^1$ diffeomorphism from $U$ onto the open subset $V = \nabla u(U)$ of $\mathbb{R}^2$. We denote by $(p, q) \mapsto x(p, q) := (r(p, q), s(p, q))$ its inverse. The Legendre transform $v$ of $u$ is given by
\[ v(p, q) := p\, r(p, q) + q\, s(p, q) - u(x(p, q)) \qquad (p, q) \in V. \]
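We know from Theorem 5.41 that $Dv(p, q) = (r(p, q), s(p, q))$ for all $(p, q) \in V$ and $D^2 u(r, s) = (D^2 v(p, q))^{-1}$ for all $(r, s) \in U$. Thus, setting $D(r, s) := \frac{\partial r}{\partial p}(p, q) \frac{\partial s}{\partial q}(p, q) - \big( \frac{\partial r}{\partial q}(p, q) \big)^2$, one has
\[ \frac{\partial^2 u}{\partial r^2}(r, s) = \frac{1}{D(r, s)} \frac{\partial^2 v}{\partial q^2}(p, q), \qquad \frac{\partial^2 u}{\partial s^2}(r, s) = \frac{1}{D(r, s)} \frac{\partial^2 v}{\partial p^2}(p, q), \qquad \frac{\partial^2 u}{\partial r \partial s}(r, s) = -\frac{1}{D(r, s)} \frac{\partial^2 v}{\partial p \partial q}(p, q). \]
Substituting these expressions for the second-order partial derivatives of $u$ in equation (9.34) we get the linear equation
\[ (1 + p^2) \frac{\partial^2 v}{\partial p^2}(p, q) + 2pq \frac{\partial^2 v}{\partial p \partial q}(p, q) + (1 + q^2) \frac{\partial^2 v}{\partial q^2}(p, q) = 0. \]
If one obtains a solution $v$ to this equation, one gets $u$ as the Legendre transform of $v$ as seen in Theorem 5.41.

The hodograph transform consists in reversing the roles of unknown functions and independent variables in order to convert certain quasilinear systems of partial differential equations into linear systems. As an example, let us consider the case of the equations of steady, irrotational fluid flow in two dimensions:
\[ (\sigma^2(w) - u^2) \frac{\partial u}{\partial r} - uv \Big( \frac{\partial u}{\partial s} + \frac{\partial v}{\partial r} \Big) + (\sigma^2(w) - v^2) \frac{\partial v}{\partial s} = 0 \tag{9.35} \]
\[ \frac{\partial u}{\partial s} - \frac{\partial v}{\partial r} = 0. \tag{9.36} \]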
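Here we have omitted the variables $(r, s)$ and the unknown function is the velocity field $w := (u, v)$ whereas the sound speed $\sigma(w)$ is given. Let us assume the map $x := (r, s) \mapsto w := (u(r, s), v(r, s))$ defines a diffeomorphism from an open subset $U$ of $\mathbb{R}^2$ onto an open subset $V$ of $\mathbb{R}^2$. By the inverse function theorem this occurs (locally) when the Jacobian $J$ satisfies
\[ J := J(r, s) := \det Dw(r, s) := \frac{\partial u}{\partial r} \frac{\partial v}{\partial s} - \frac{\partial u}{\partial s} \frac{\partial v}{\partial r} \ne 0. \]
Then $Dx(u, v) = (Dw(r, s))^{-1}$ for $(r, s) := (r(u, v), s(u, v))$, i.e.
\[ \frac{\partial r}{\partial u} = \frac{1}{J} \frac{\partial v}{\partial s}, \qquad \frac{\partial r}{\partial v} = -\frac{1}{J} \frac{\partial u}{\partial s}, \qquad \frac{\partial s}{\partial u} = -\frac{1}{J} \frac{\partial v}{\partial r}, \qquad \frac{\partial s}{\partial v} = \frac{1}{J} \frac{\partial u}{\partial r}. \]
We intend to look for the equations satisfied by the inverse map $(u, v) \mapsto (r(u, v), s(u, v))$. Inserting the preceding expressions of the elements of the matrix $Dx(u, v)$ into equations (9.35)-(9.36), we get the linear system
\[ (\sigma^2(w) - u^2) \frac{\partial s}{\partial v} + uv \Big( \frac{\partial r}{\partial v} + \frac{\partial s}{\partial u} \Big) + (\sigma^2(w) - v^2) \frac{\partial r}{\partial u} = 0, \qquad \frac{\partial r}{\partial v} - \frac{\partial s}{\partial u} = 0. \]
The last equation suggests that we look for a function $(u, v) \mapsto z(u, v)$ such that
\[ r = \frac{\partial z}{\partial u}(u, v), \qquad s = \frac{\partial z}{\partial v}(u, v). \]
Then, the first equation above becomes the equation
\[ (\sigma^2(w) - u^2) \frac{\partial^2 z}{\partial v^2} + 2uv \frac{\partial^2 z}{\partial u \partial v} + (\sigma^2(w) - v^2) \frac{\partial^2 z}{\partial u^2} = 0, \]
a linear second-order partial differential equation.

In some cases, it may be useful to associate to the unknown solution $u$ of a nonlinear equation a related function $w$ that is a solution to a simpler equation, for instance a linear equation. Such a process can be applied to evolution problems as well as to stationary problems. Taking a smooth function $h : \mathbb{R} \to \mathbb{R}$, let us set $w := h \circ u$. The case $h(r) := e^{cr}$ for $r \in \mathbb{R}$, for some constant $c \in \mathbb{R} \setminus \{0\}$, is called the Hopf-Cole transformation. Let us consider its effect on the nonlinear parabolic equation
\[ \frac{\partial u}{\partial t}(x, t) - a \Delta u(x, t) + b \|\nabla u(x, t)\|^2 = 0 \qquad (x, t) \in \mathbb{R}^d \times \mathbb{R}_+ \tag{9.37} \]
\[ u(x, 0) = g(x) \qquad x \in \mathbb{R}^d, \tag{9.38} \]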
where $g$ is a given function, $a$, $b \in \mathbb{R}$, the Laplacian $\Delta = \sum_i \partial^2/\partial x_i^2$ and the gradient $\nabla$ bearing on the space variable $x$. Using the relations
\[ \frac{\partial w}{\partial t}(x, t) = h'(u(x, t)) \frac{\partial u}{\partial t}(x, t), \tag{9.39} \]
\[ \Delta w(x, t) = h'(u(x, t)) \Delta u(x, t) + h''(u(x, t)) \|\nabla u(x, t)\|^2 \tag{9.40} \]
we see by multiplication of both sides of (9.37), i.e. of $\partial_t u - a \Delta u + b \|\nabla u\|^2 = 0$, by $h'(u(x, t))$ that
\[ \frac{\partial w}{\partial t}(x, t) - a \Delta w(x, t) + \big[ a h''(u(x, t)) + b h'(u(x, t)) \big] \|\nabla u(x, t)\|^2 = 0. \]
Thus, if we choose $h$ in such a way that $a h''(r) + b h'(r) = 0$ for all $r \in \mathbb{R}$, the equation satisfied by $w$ is simply the heat equation:
\[ \frac{\partial w}{\partial t}(x, t) - a \Delta w(x, t) = 0, \tag{9.41} \]
\[ w(x, 0) = h(g(x)) \tag{9.42} \]
with conductivity $a$ and initial condition $h \circ g$. Assuming $a > 0$, $b \ne 0$, setting $c := -b/a$ and taking $h(r) = e^{cr}$, the equation $a h''(r) + b h'(r) = 0$ is satisfied. Then, the unique bounded solution of the equation satisfied by $w$ is given by
\[ w(x, t) = \frac{1}{(4 \pi a t)^{d/2}} \int_{\mathbb{R}^d} e^{-\|x - y\|^2 / 4at}\, e^{-b g(y)/a}\, dy \]
and we have $u(x, t) = -(a/b) \log w(x, t)$, with $w$ as above, an explicit solution.

A further transformation allows us to solve the viscous Burgers' equation
\[ \frac{\partial v}{\partial t}(x, t) - a \frac{\partial^2 v}{\partial x^2}(x, t) + v(x, t) \frac{\partial v}{\partial x}(x, t) = 0 \qquad (x, t) \in \mathbb{R} \times \mathbb{P} \]
\[ v(x, 0) = k(x) \qquad x \in \mathbb{R} \]
that serves as a model in fluid dynamics. Here $k$ is a given smooth function. This equation can be reduced to (9.37) with $d = 1$ and $b := 1/2$ by setting
\[ u(x, t) = \int_{-\infty}^{x} v(s, t)\, ds, \qquad g(x) = \int_{-\infty}^{x} k(s)\, ds, \]
so that $u(\cdot, 0) = g$. In fact, if $u$ is a solution to the system
\[ \frac{\partial u}{\partial t}(x, t) - a \Delta u(x, t) + \frac{1}{2} (\nabla u(x, t))^2 = 0 \qquad (x, t) \in \mathbb{R} \times \mathbb{R}_+, \]
$u(\cdot, 0) = g$, then $v(x, t) := \frac{\partial u}{\partial x}(x, t)$ is a solution to Burgers' equation, as one sees by differentiating with respect to $x$. Thus, Burgers' equation can be solved explicitly, a rather exceptional situation for partial differential equations.
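The Hopf-Cole transformation can be checked numerically in dimension one. With the usual Laplacian, if $w > 0$ solves the heat equation $w_t = a w_{xx}$, then $u := -(a/b) \log w$ should satisfy $u_t - a u_{xx} + b u_x^2 = 0$. The sketch below uses the explicit heat solution $w(x, t) = 1 + \tfrac12 e^{-a k^2 t} \cos(kx)$; this particular $w$, the parameter values and the finite-difference step are assumptions chosen for the illustration.

```python
import math

a, b, k = 0.7, 0.5, 2.0          # hypothetical parameters with a > 0, b != 0

def w(x, t):
    """A positive solution of the heat equation w_t = a * w_xx (d = 1)."""
    return 1.0 + 0.5 * math.exp(-a * k * k * t) * math.cos(k * x)

def u(x, t):
    """Hopf-Cole transform of w: u = -(a/b) log w."""
    return -(a / b) * math.log(w(x, t))

def residual(x, t, h=1e-3):
    """u_t - a*u_xx + b*u_x**2, approximated by central differences."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2 * h)
    u_xx = (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / h**2
    return u_t - a * u_xx + b * u_x**2

worst = max(abs(residual(0.1 * i, 0.2 + 0.1 * j))
            for i in range(-20, 21) for j in range(10))
```

The quantity `worst` is of the order of the discretization error of the difference quotients, which is consistent with $u$ being a solution of the nonlinear equation.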
Exercises

1. Let $g : \mathbb{P} \times ]-\pi, \pi[ \to W := \mathbb{R}^2 \setminus D$, with $D := \mathbb{R}_- \times \{0\}$, be the homeomorphism given by $g(r, \theta) := (r \cos\theta, r \sin\theta)$. Let $h : W \to \mathbb{R}$ be a harmonic function, i.e. a function of class $C^\infty$ satisfying $D_1^2 h + D_2^2 h = 0$. Verify that $f := h \circ g$ satisfies
\[ D_{11}^2 f + r^{-2} D_{22}^2 f + r^{-1} D_1 f = 0. \]
Characterize those harmonic functions $h$ such that $f = h \circ g$ is independent of $\theta$.

2. (The vibrating string equation) Find those functions $f : \mathbb{R}^2 \to \mathbb{R}$ of class $C^2$ satisfying the equation $D_{11}^2 f - D_{22}^2 f = 0$.

3. Let $v : \mathbb{R}^2 \to \mathbb{R}_+$ be a function of class $C^3$ satisfying the heat equation $D_2 v(x, t) = D_{11}^2 v(x, t)$. Given $c \in \mathbb{R}$, set $u = 2c D_1 v/(1 - cv)$. Show that $u$ is of class $C^2$ on the open set $W := \{(x, t) \in \mathbb{R} \times \mathbb{P} : v(x, t) \ne 1/c\}$ and satisfies on $W$ Burgers' equation
\[ D_2 u = D_{11}^2 u - u D_1 u. \]
Study the reverse passage. Deduce from this particular solutions of Burgers' equation of the form $2c D_1 v/(1 - cv)$ where $v$ is a solution of the heat equation.
9.4.2 Using Potential Functions

Using a potential function may also enable one to reduce a system of nonlinear equations to a single linear equation. We illustrate this method with the Euler equations for inviscid, incompressible fluid flows. We denote by $p : \mathbb{R}^3 \times \mathbb{R} \to \mathbb{R}$ the pressure of the flow and by $u : \mathbb{R}^3 \times \mathbb{R} \to \mathbb{R}^3$ the velocity vector field, the two unknown functions, and by $f : \mathbb{R}^3 \times \mathbb{R} \to \mathbb{R}^3$ the external force and by $g : \mathbb{R}^3 \to \mathbb{R}^3$ the initial velocity, which are given. The Euler system is as follows:
\[ \frac{\partial u}{\partial t}(x, t) + Du(x, t) \cdot u(x, t) = -\nabla p(x, t) + f(x, t) \qquad (x, t) \in \mathbb{R}^3 \times \mathbb{R} \tag{9.43} \]
\[ \operatorname{div} u(x, t) = 0 \qquad (x, t) \in \mathbb{R}^3 \times \mathbb{R} \tag{9.44} \]
\[ u(x, 0) = g(x) \qquad x \in \mathbb{R}^3. \tag{9.45} \]
Here the gradient $\nabla$, the divergence $\operatorname{div}$, and the derivative $D$ are taken with respect to the spatial variable $x = (x_1, x_2, x_3)$, so that (9.43) means that for $i \in \mathbb{N}_3$
\[ \frac{\partial u^i}{\partial t}(x, t) + \sum_{j=1}^{3} D_j u^i(x, t)\, u^j(x, t) = -D_i p(x, t) + f^i(x, t) \qquad (x, t) \in \mathbb{R}^3 \times \mathbb{R}. \]
It is natural to assume that $\operatorname{div} g(x) = 0$ in order to ensure compatibility of (9.44) and (9.45). We consider the case when the external force $f$ is derived from a potential $h : \mathbb{R}^3 \times \mathbb{R} \to \mathbb{R}$: $f(x, t) = \nabla h(x, t)$. We look for a solution $(u, p)$ of this system for which the velocity $u$ is also derived from a potential $v : \mathbb{R}^3 \times \mathbb{R} \to \mathbb{R}$: $u(x, t) := \nabla v(x, t)$. Then equation (9.44) requires $v$ to be a harmonic function:
\[ \Delta v = \operatorname{div} u = 0. \]
Besides this condition, let us see how equation (9.43) is transformed. Assuming $v$ is twice differentiable, we note that
\[ D_j u^i\, u^j = D_j D_i v\, D_j v = D_i D_j v\, D_j v = \tfrac{1}{2} D_i (D_j v)^2, \]
so that (9.43) turns out to be
\[ \nabla \Big( \frac{\partial v}{\partial t}(x, t) + \frac{1}{2} \|\nabla v(x, t)\|^2 \Big) = -\nabla \big( p(x, t) - h(x, t) \big). \]
Thus, assuming that we have found a function $v$ that is harmonic in $x$ and satisfies $\nabla v(\cdot, 0) = g(\cdot)$, we can take
\[ p(x, t) = h(x, t) - \frac{\partial v}{\partial t}(x, t) - \frac{1}{2} \|\nabla v(x, t)\|^2. \]
This is Bernoulli's law.

Now let us illustrate the utility of the Mountain Pass Theorem by sketching an investigation of the semilinear boundary-value problem consisting in finding a weak solution $u \in H_0^1(\Omega)$ (in the sense of Chap. 9) to the equation
\[ -\Delta u = \varphi \circ u \tag{9.46} \]
where $\Omega$ is a bounded open subset of $\mathbb{R}^d$ ($d \ge 3$) and $\varphi : \mathbb{R} \to \mathbb{R}$ is a smooth function satisfying for some $p \in ]1, \frac{d+2}{d-2}[$ and $a$, $b \in \mathbb{R}_+$, $\alpha, \beta \in \mathbb{P}$, $\theta \in ]0, 1/2[$ the growth conditions
\[ |\varphi(t)| \le a + a |t|^p, \qquad |\varphi'(t)| \le b + b |t|^{p-1} \qquad \forall t \in \mathbb{R} \tag{9.47} \]
\[ 0 \le \Phi(t) := \int_0^t \varphi(s)\, ds \le \theta\, t \varphi(t) \qquad \forall t \in \mathbb{R} \tag{9.48} \]
\[ \alpha |t|^{p+1} \le |\Phi(t)| \le \beta |t|^{p+1} \qquad \forall t \in \mathbb{R}. \tag{9.49} \]
Note that the function $\varphi$ given by $\varphi(t) := t |t|^{p-1}$ satisfies the preceding conditions and that the last one implies that $\varphi(0) = 0$, so that $u = 0$ is a solution to equation (9.46). Let us show that there exists a weak solution $u \ne 0$ in the space $X := H_0^1(\Omega)$ endowed with the scalar product given by $\langle u \mid v \rangle := \int_\Omega Du \cdot Dv$ and the associated norm. Let us note that the Sobolev injection theorem (Theorem 9.10) ensures that $H_0^1(\Omega)$ is embedded in $L_q(\Omega)$ with $q := 2d/(d - 2)$. Thus, using Sect. 8.5.2, we see that condition (9.47) and the inequality
\[ \frac{p}{q} = p\, \frac{d - 2}{2d} < \frac{d + 2}{d - 2} \cdot \frac{d - 2}{2d} = \frac{d + 2}{2d} \]
ensure that for $u \in H_0^1(\Omega)$ or even $u \in L_{2d/(d-2)}(\Omega)$ one has $\varphi \circ u \in L_s(\Omega)$ for $s := \frac{2d}{d+2}$. Now, since $1/q + 1/s = 1$, i.e. $(d - 2)/2d + (d + 2)/2d = 1$, one has $L_s(\Omega) = (L_q(\Omega))^{*}$, so that $\varphi \circ u$ can be considered as a continuous linear form on $H_0^1(\Omega)$, i.e. an element of $X^{*} := H^{-1}(\Omega)$. On the other hand, by condition (9.49), we have $\Phi \circ u \in L_1(\Omega)$ since $p + 1 < \frac{2d}{d-2}$. Thus the function $f$ given by
\[ f(w) := \int_\Omega \Big( \frac{1}{2} |Dw|^2 - \Phi \circ w \Big) \qquad w \in H_0^1(\Omega) \]
is well defined on the space X WD H01 .˝/: We shall show that the Mountain Pass Theorem can be applied to f and yields some critical point u ¤ 0 of f : We leave to the reader the task of proving (with the help of the Sobolev inequalities as in [117, pp. 483–484]) that f is of class C1;1 with derivative given by 0
Z
f .u/v D
˝
.DuDv .' ı u/v/
u, v 2 H01 .˝/:
Thus, if $u$ is a critical point of $f$ one has $\int_\Omega (Du \cdot Dv - (\varphi \circ u)v) = 0$ for all $v \in H_0^1(\Omega)$, i.e. $u$ is a weak solution to (9.46). We denote by $G : H^{-1}(\Omega) := X^* \to X$ the isometry defined by $G(w^*) = w$ for $w^* \in H^{-1}(\Omega)$, where $w$ is the unique solution of the equation
\[ -\Delta w = w^*, \qquad w \in H_0^1(\Omega). \]
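Thus, the relation $f'(u) = 0$ can be written as $-\Delta u = \varphi \circ u$. Here $-\Delta = D^{\intercal} D$, where $D : H_0^1(\Omega) \to L_2(\Omega)^d$ is the map $w \mapsto \nabla w$ and $D^{\intercal} : (L_2(\Omega)^d)^* \to H^{-1}(\Omega)$ is the transpose map of $D$ when $(L_2(\Omega)^d)^*$ is identified with $L_2(\Omega)^d$. We set $W := X$, $w_0 = 0$. For $w \in X$ with $\|w\| = r$, the upper bound in (9.49) and the Sobolev embedding yield, for some $\beta', \beta'' \ge \beta$,
\[ \Big| \int_\Omega \psi \circ w \Big| \le \beta \int_\Omega |w|^{p+1} \le \beta' \Big( \int_\Omega |w|^{q} \Big)^{\frac{p+1}{q}} \le \beta'' \|w\|^{p+1} = \beta'' r^{p+1}, \]
hence $f(w) \ge \frac{1}{2} r^2 - \beta'' r^{p+1} \ge \frac{1}{4} r^2$ for $r > 0$ small enough, since $p + 1 > 2$. Now let us fix $w \in X$ with $\|w\| = r$ and look for some $t \ge 1$ such that $w_1 := tw$ satisfies $f(w_1) < m := \frac{1}{4} r^2$. Such a $t$ can be found since, by the lower bound in (9.49) and the fact that $\psi \ge 0$,
\[ f(tw) = \int_\Omega \Big( \frac{1}{2} t^2 |Dw|^2 - \psi \circ (tw) \Big) \le t^2 \int_\Omega \frac{1}{2} |Dw|^2 - t^{p+1} \alpha \int_\Omega |w|^{p+1} < 0 \]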
for $t \ge 1$ large enough.

Finally, let us verify condition (PS$_c$). Let $(x_n)$ be a sequence in $X$ such that $(f(x_n)) \to c$ and $(f'(x_n)) \to 0$. Thus, setting $\varepsilon_n := \|f'(x_n)\|$, for all $v \in X$ we have
\[ \Big| \int_\Omega \big( Dx_n \cdot Dv - (\varphi \circ x_n)v \big) \Big| \le \varepsilon_n \|v\| . \]
Taking $v = x_n$ we get
\[ \int_\Omega (\varphi \circ x_n)\,x_n \le \varepsilon_n \|x_n\| + \|x_n\|^2 . \]
But since $(c_n) := (f(x_n)) \to c$ and since $f(x_n) = \frac{1}{2}\|x_n\|^2 - \int_\Omega \psi \circ x_n$, condition (9.48) gives
\[ \|x_n\|^2 = 2c_n + 2\int_\Omega \psi \circ x_n \le 2c_n + 2\theta \int_\Omega (\varphi \circ x_n)\,x_n \le 2c_n + 2\theta \varepsilon_n \|x_n\| + 2\theta \|x_n\|^2 . \]
Since $2\theta < 1$ and $(\varepsilon_n) \to 0$ we get that $(\|x_n\|)$ is bounded. Since $X$ is reflexive, taking a subsequence of $(x_n)$ if necessary, we may suppose $(x_n)$ has a weak limit $x$. By compactness of the Sobolev embedding we get $(x_n) \to x$ in $L_q(\Omega)$. Since $(p+1)/q \le 1$, the continuity of the Nemytskii operator $v \mapsto \varphi \circ v$ from $L_q(\Omega)$ into $L_s(\Omega)$ implies that $(\varphi \circ x_n) \to \varphi \circ x$ in $L_s(\Omega) = (L_q(\Omega))^*$, hence in $H^{-1}(\Omega)$. Let $z_n := D^{\intercal} D x_n - \varphi \circ x_n = f'(x_n)$. Since $(z_n) \to 0$ in $H^{-1}(\Omega)$ and $(\varphi \circ x_n) \to \varphi \circ x$ in $H^{-1}(\Omega)$, we have $(D^{\intercal} D x_n) \to \varphi \circ x$ in $H^{-1}(\Omega)$. Since $x_n = G(D^{\intercal} D x_n)$ and since $G$ is continuous, we get that $(x_n) \to G(\varphi \circ x)$. Thus condition (PS$_c$) is satisfied.
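The mountain-pass geometry used above (a positive barrier near $0$, negative values far along a ray) can be observed numerically. The following is only a sketch under assumptions that are not in the text: it replaces $\Omega$ by the interval $]0,1[$ (the text requires $d \ge 3$), takes the model nonlinearity $\varphi(t) = t|t|^{p-1}$ with the hypothetical choice $p = 3$, and uses $w(x) = \sin(\pi x)$ as the direction of the ray $t \mapsto tw$. The helper names are illustrative.

```python
import math

P = 3.0  # hypothetical exponent p > 1 (assumption for this sketch)


def midpoint_integral(func, a, b, n=2000):
    """Composite midpoint rule for the integral of func over [a, b]."""
    step = (b - a) / n
    return step * sum(func(a + (k + 0.5) * step) for k in range(n))


def psi(t):
    """Primitive of phi(t) = t|t|^(p-1) that vanishes at 0."""
    return abs(t) ** (P + 1) / (P + 1)


def w(x):
    """Direction of the ray; vanishes at both end points of ]0,1[."""
    return math.sin(math.pi * x)


def dw(x):
    """Derivative of w."""
    return math.pi * math.cos(math.pi * x)


def norm_squared(t):
    """Squared H_0^1 norm of t*w, i.e. the integral of |D(tw)|^2."""
    return midpoint_integral(lambda x: (t * dw(x)) ** 2, 0.0, 1.0)


def energy(t):
    """f(t*w) = integral of (1/2)|D(tw)|^2 - psi(tw)."""
    return midpoint_integral(
        lambda x: 0.5 * (t * dw(x)) ** 2 - psi(t * w(x)), 0.0, 1.0
    )
```

With these choices one can compare `energy(t)` with `0.25 * norm_squared(t)` for small `t` and check the sign of `energy(t)` for large `t`; the crossover happens because the quadratic term grows like $t^2$ while the $\psi$ term grows like $t^{p+1}$.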
9.4.3 Order Methods

For nonlinear problems, uniqueness is usually lost. For instance, if $\Omega = \,]0, \pi[^2$ the equation
\[ \Delta v + 2|v| = 0, \qquad v \in H_0^1(\Omega) \]
admits an infinity of solutions given by $v_n(x,y) = n \sin x \sin y$ for all $n \in \mathbb{N}$. Another elementary example is the equation $x^3 + px + q = 0$ in $\mathbb{R}$, which may have 1, 2, or 3 solutions. In this subsection we give a uniqueness result and a useful notion of subsolution and supersolution. For an existence result for the so-called logistic equation see [74, Thm 11.3].

Given a bounded open subset $\Omega$ of $\mathbb{R}^d$, $f \in H^{-1}(\Omega) := H_0^1(\Omega)^*$ and $h : \mathbb{R} \to \mathbb{R}$ (globally) Lipschitzian with rate $\ell$, let us consider the equation
\[ -\Delta v + h \circ v = f, \qquad v \in H_0^1(\Omega). \tag{9.50} \]
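Let us note that since $|h \circ v| \le \ell |v| + |h(0)|$, for $v \in H_0^1(\Omega)$ or even $v \in L_2(\Omega)$ we have $h \circ v \in L_2(\Omega)$ since $\Omega$ is bounded. Given $v_1, v_2 \in H^1(\Omega)$ we write $v_1 \le v_2$ on $\partial\Omega$ if $(v_1 - v_2)^+ \in H_0^1(\Omega)$, with $t^+ := \max(t, 0)$. We endow $H_0^1(\Omega)$ with the order induced by $L_2(\Omega)$ and we provide $H^{-1}(\Omega) := H_0^1(\Omega)^*$ with the dual order, i.e. $v_1^* \le v_2^*$ if and only if $\langle v_1^*, v\rangle \le \langle v_2^*, v\rangle$ for all $v \in H_0^1(\Omega)$, $v \ge 0$.

Definition 9.6 One says that $u$ is a supersolution to equation (9.50) if $u \in H^1(\Omega)$, $u \ge 0$ on $\partial\Omega$ and $-\Delta u + h \circ u \ge f$ in the weak sense that
\[ \int_\Omega \nabla u \cdot \nabla v + \langle h \circ u, v\rangle \ge \langle f, v\rangle \qquad \forall v \in H_0^1(\Omega),\ v \ge 0. \]
One says that $w$ is a subsolution to equation (9.50) if $w \in H^1(\Omega)$, $w \le 0$ on $\partial\Omega$ and $-\Delta w + h \circ w \le f$ in the weak sense that
\[ \int_\Omega \nabla w \cdot \nabla v + \langle h \circ w, v\rangle \le \langle f, v\rangle \qquad \forall v \in H_0^1(\Omega),\ v \ge 0. \]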
Let us first show the existence of a greatest solution and a smallest solution lying between a subsolution and a supersolution.

Theorem 9.22 Let $u$ be a supersolution and let $w$ be a subsolution of (9.50) such that $u \ge w$. Then there exist solutions $\hat{u}$ and $\hat{w}$ of (9.50) such that $u \ge \hat{u} \ge \hat{w} \ge w$ and even such that for any solution $v$ of (9.50) satisfying $u \ge v \ge w$ one has $u \ge \hat{u} \ge v \ge \hat{w} \ge w$.

Proof We pick $c > \ell$ and we consider the map $S : L_2(\Omega) \to L_2(\Omega)$ which assigns to every $y \in L_2(\Omega)$ the weak solution $z \in H_0^1(\Omega) \subset L_2(\Omega)$ of the equation
\[ -\Delta z + cz = cy - h \circ y + f. \tag{9.51} \]
A solution to equation (9.50) is a fixed point of $S$. We note that $S$ is well defined since $cy - h \circ y + f \in H^{-1}(\Omega)$ for all $y \in L_2(\Omega)$. Moreover, $S$ is homotone, i.e. order preserving: given $y \le y'$ in $L_2(\Omega)$ we have $S(y) \le S(y')$ by the weak maximum principle since for $z := S(y)$, $z' := S(y')$ we have $-\Delta z + cz \le -\Delta z' + cz'$ as
\[ (cy' - h \circ y') - (cy - h \circ y) \ge c(y' - y) - \ell (y' - y) \ge 0. \]
Furthermore, $S$ is continuous from $L_2(\Omega)$ to itself: given $y$, $y'$ in $L_2(\Omega)$, for $z := S(y)$, $z' := S(y')$ we have $-\Delta(z - z') + c(z - z') = (cy - cy') - (h \circ y - h \circ y')$ and applying the weak form of equation (9.51) with $z - z'$ as test function we get
\[ c\,\|z - z'\|_2^2 \le \int_\Omega \big( c(y - y') - (h \circ y - h \circ y') \big)(z - z') \le \big( c\,\|y - y'\|_2 + \|h \circ y - h \circ y'\|_2 \big)\,\|z - z'\|_2 \]
and it follows that $\|z - z'\|_2 \le (1 + \ell/c)\,\|y - y'\|_2$: $S$ is even Lipschitzian.

Starting with $y_0 := w$ we inductively define a sequence $(y_n)$ by $y_{n+1} := S(y_n)$. Since $w$ is a subsolution, we have $-\Delta y_1 + cy_1 = cy_0 - h \circ y_0 + f \ge -\Delta y_0 + cy_0$ and the maximum principle entails that $y_1 \ge y_0$. Applying $S^{(n)} := S \circ \dots \circ S$ to each side of this inequality, we get $y_{n+1} \ge y_n$ for all $n$. Similarly, using the fact that $u$ is a supersolution, we get $y_n \le u$ for all $n$ and even $y_n \le z_n := S^{(n)}(u)$. Passing to the limit using the Dominated Convergence Theorem, we get elements $\hat{u}$, $\hat{w}$ of $L_2(\Omega)$ such that $(y_n) \to \hat{w}$ and $(z_n) \to \hat{u}$ in $L_2(\Omega)$. Since $S$ is continuous, we see that $\hat{u}$, $\hat{w}$ are fixed points of $S$, hence are solutions. Moreover, if $v$ is a solution to (9.50) satisfying $u \ge v \ge w$, applying $S^{(n)}$ we get $z_n \ge S^{(n)}(v) = v \ge y_n$ and passing to the limit we obtain $\hat{u} \ge v \ge \hat{w}$.
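The monotone iteration $y_{n+1} := S(y_n)$ of the proof can be imitated on a grid. The sketch below is an illustration under assumptions that do not come from the text: a one-dimensional problem $-v'' + \sin v = 1$ on $]0,1[$ with zero boundary values, the standard three-point finite-difference Laplacian, the constant $c = 2 > \ell = 1$, the subsolution $w = 0$ and the supersolution $u(x) = x(1-x)$. All function names are hypothetical.

```python
import math


def solve_tridiagonal(sub, diag, sup, rhs):
    """Thomas algorithm for a tridiagonal system (sub[0] and sup[-1] unused)."""
    n = len(rhs)
    sup_mod = [0.0] * n
    rhs_mod = [0.0] * n
    sup_mod[0] = sup[0] / diag[0]
    rhs_mod[0] = rhs[0] / diag[0]
    for i in range(1, n):
        pivot = diag[i] - sub[i] * sup_mod[i - 1]
        sup_mod[i] = sup[i] / pivot
        rhs_mod[i] = (rhs[i] - sub[i] * rhs_mod[i - 1]) / pivot
    sol = [0.0] * n
    sol[-1] = rhs_mod[-1]
    for i in range(n - 2, -1, -1):
        sol[i] = rhs_mod[i] - sup_mod[i] * sol[i + 1]
    return sol


def apply_S(y, h, f, c, dx):
    """Discrete analogue of S: solve -z'' + c z = c y - h(y) + f, z = 0 at the ends."""
    n = len(y)
    off = -1.0 / dx ** 2
    main = 2.0 / dx ** 2 + c
    rhs = [c * yi - h(yi) + fi for yi, fi in zip(y, f)]
    return solve_tridiagonal([off] * n, [main] * n, [off] * n, rhs)


def monotone_iteration(n=99, steps=60, c=2.0):
    """Iterate S from a subsolution and from a supersolution; report the ordering."""
    dx = 1.0 / (n + 1)
    grid = [(i + 1) * dx for i in range(n)]
    f = [1.0] * n
    h = math.sin                       # Lipschitz with rate 1 < c
    y = [0.0] * n                      # subsolution w = 0
    z = [x * (1.0 - x) for x in grid]  # supersolution u = x(1 - x)
    ordered = True
    tol = 1e-12                        # slack for rounding errors
    for _ in range(steps):
        y_new = apply_S(y, h, f, c, dx)
        z_new = apply_S(z, h, f, c, dx)
        ordered = ordered and all(
            a <= a_new + tol and b_new <= b + tol and a_new <= b_new + tol
            for a, a_new, b, b_new in zip(y, y_new, z, z_new)
        )
        y, z = y_new, z_new
    return y, z, ordered
```

The flag `ordered` records whether the lower iterates increased, the upper iterates decreased, and the lower ones stayed below the upper ones at every step, which is the discrete counterpart of $y_n \le y_{n+1} \le z_{n+1} \le z_n$. In this example $h$ is not monotone, yet the two limits agree numerically because the discrete map happens to be a contraction for these parameters; that is a feature of the example, not of the theorem.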
A uniqueness result can be obtained under an additional assumption.

Corollary 9.7 Let $f \in L_2(\Omega)$ and let $h : \mathbb{R} \to \mathbb{R}$ be nondecreasing, Lipschitzian with rate $\ell$ and such that $h(0) = 0$. Then equation (9.50) has at most one solution.

Proof Let $u$ and $w$ be the weak solutions in $H_0^1(\Omega)$ of
\[ -\Delta u = f^+, \qquad -\Delta w = -f^- \]
respectively, with $f^+ = \max(f, 0)$, $f^- = \max(-f, 0)$. By the weak maximum principle (Theorem 9.20) we have $u \ge 0 \ge w$, hence, since $h$ is nondecreasing,
\[ -\Delta u + h \circ u \ge -\Delta u = f^+ \ge f \ge -f^- = -\Delta w \ge -\Delta w + h \circ w. \]
If $v_1$ and $v_2$ are two solutions to equation (9.50), by subtraction we have $-\Delta(v_1 - v_2) = -(h \circ v_1 - h \circ v_2)$ in $\Omega$, i.e.
\[ \int_\Omega \nabla(v_1 - v_2) \cdot \nabla v = -\int_\Omega (h \circ v_1 - h \circ v_2)\,v \qquad \forall v \in H_0^1(\Omega). \]
Taking $v := v_1 - v_2$, since $h$ is nondecreasing the right-hand side is nonpositive, so that $\int_\Omega |\nabla(v_1 - v_2)|^2 = 0$ and $v_1 = v_2$ by Poincaré's inequality.
9.4.4 Monotone Multimaps

We devote this subsection to a class of multivalued maps of great importance (see, for instance, [43, 51, 207–209, 219, 230, 233, 234]). This class is used in the modern theory of partial differential equations. We encourage the reader to return to Sect. 1.3 on multivalued analysis when necessary. In the sequel we often identify a multimap $M$ with its graph $G(M)$.

Definition 9.7 A multimap (or multivalued operator) $M : D(M) \rightrightarrows X^*$ from its domain $D(M) := \operatorname{dom} M$, a subset of a Banach space $X$, into the dual $X^*$ of $X$ is said to be monotone if for all $w, x \in D(M)$ and $w^* \in M(w)$, $x^* \in M(x)$ one has
\[ \langle w^* - x^*, w - x\rangle \ge 0. \]
A multimap $M : D(M) \rightrightarrows X^*$ is said to be dissipative if $-M$ is monotone.

Definition 9.8 A multimap $M : D(M) \rightrightarrows X^*$ is said to be maximally monotone if for any monotone multimap $N : D(N) \rightrightarrows X^*$ whose graph $G(N)$ satisfies $G(M) \subset G(N)$ one has $G(M) = G(N)$.

A similar definition holds for maximally dissipative multimaps: maximality means maximality in terms of inclusion of graphs (Fig. 9.1). In Hilbert spaces there is a tight relationship between monotone multimaps and nonexpansive maps. Here a multimap $F : W \rightrightarrows Z$ between two metric spaces is said to be nonexpansive if $d_Z(z_1, z_2) \le d_W(w_1, w_2)$ for all $w_1, w_2 \in D(F)$, $z_1 \in F(w_1)$, $z_2 \in F(w_2)$. Such a condition implies that $F$ is single-valued (take $w_1 = w_2$).

Proposition 9.13 Given a Hilbert space $X$ and $c > 0$, let $S : X^2 \to X^2$ be defined by
\[ S(x, y) := \big( c(x + y),\ c(x - y) \big), \qquad S^{-1}(w, z) = \Big( \frac{1}{2c}(w + z),\ \frac{1}{2c}(w - z) \Big). \]
Let $M : X \rightrightarrows X$, $P : X \rightrightarrows X$ be such that $G(P) = S(G(M))$. Then $P$ is a nonexpansive map if and only if $M$ is a monotone multimap.
Fig. 9.1 A monotone multimap and a maximally monotone multimap
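Before the proof, the correspondence of Proposition 9.13 can be tried out numerically. The sketch below works in $X = \mathbb{R}^2$ with $c = 1/\sqrt{2}$; the two linear maps are hypothetical examples chosen for illustration (one monotone, one not), and the function names are not from the text. It also evaluates the polarization-type identity $\|w_1 - w_2\|^2 - \|z_1 - z_2\|^2 = 4c^2 \langle x_1 - x_2 \mid y_1 - y_2\rangle$ on which the equivalence rests.

```python
import math
import random

C = 1.0 / math.sqrt(2.0)  # for this value S is an isometry


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def diff(u, v):
    return [a - b for a, b in zip(u, v)]


def transform(x, y, c=C):
    """S(x, y) = (c(x + y), c(x - y)), computed componentwise."""
    w = [c * (a + b) for a, b in zip(x, y)]
    z = [c * (a - b) for a, b in zip(x, y)]
    return w, z


def monotone_map(x):
    """x -> A x with A = [[1, 2], [-2, 0.5]]; <x, A x> = x1^2 + 0.5 x2^2 >= 0."""
    return [x[0] + 2.0 * x[1], -2.0 * x[0] + 0.5 * x[1]]


def non_monotone_map(x):
    """x -> -x, for which <x1 - x2, y1 - y2> < 0 whenever x1 != x2."""
    return [-x[0], -x[1]]


def check(mapping, trials=200, seed=0):
    """Return (is P nonexpansive on the samples?, largest error in the identity)."""
    rng = random.Random(seed)
    nonexpansive = True
    worst = 0.0
    for _ in range(trials):
        x1 = [rng.uniform(-1.0, 1.0) for _ in range(2)]
        x2 = [rng.uniform(-1.0, 1.0) for _ in range(2)]
        y1, y2 = mapping(x1), mapping(x2)
        w1, z1 = transform(x1, y1)
        w2, z2 = transform(x2, y2)
        dw, dz = diff(w1, w2), diff(z1, z2)
        lhs = dot(dw, dw) - dot(dz, dz)
        rhs = 4.0 * C * C * dot(diff(x1, x2), diff(y1, y2))
        worst = max(worst, abs(lhs - rhs))
        if dot(dz, dz) > dot(dw, dw) + 1e-12:
            nonexpansive = False
    return nonexpansive, worst
```

Calling `check(monotone_map)` and `check(non_monotone_map)` contrasts the two situations: sampling can only refute nonexpansiveness, never prove it, so the first call is a sanity check rather than a verification.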
Note that for $c := 1/\sqrt{2}$ the bijection $S$ is an isometry.

Proof Given $(x_i, y_i) \in G(M)$ for $i = 1, 2$, let $(w_i, z_i) := S(x_i, y_i)$. Then
\[ \|w_1 - w_2\|^2 - \|z_1 - z_2\|^2 = \langle (w_1 - w_2) + (z_1 - z_2) \mid (w_1 - w_2) - (z_1 - z_2)\rangle = \langle (w_1 + z_1) - (w_2 + z_2) \mid (w_1 - z_1) - (w_2 - z_2)\rangle = 4c^2 \langle x_1 - x_2 \mid y_1 - y_2\rangle \]
and thus
\[ \|z_1 - z_2\| \le \|w_1 - w_2\| \iff \langle x_1 - x_2 \mid y_1 - y_2\rangle \ge 0. \]
In particular, $z_1 = z_2$ whenever $w_1 = w_2$: $P$ is a single-valued map on its domain $D(P) := \{c(x + y) : (x, y) \in G(M)\}$. This equivalence means that $P$ is nonexpansive if and only if $M$ is monotone.

Example Let $f : \mathbb{R} \to \mathbb{R}$ be nondecreasing. Then $f$ is monotone. Note that this is not the case if $f$ is nonincreasing! Thus the terminology "monotone" is not really satisfactory, but it is well established.

Example Let $M := \partial f$ be the subdifferential of a convex function. Then the very definition of $\partial f$ shows (by adding the two subgradient inequalities side by side) that $M$ is monotone. The maximal monotonicity is not as obvious, however. In the case $f := (1/2)\|\cdot\|^2$, the norm being smooth on $X \setminus \{0\}$, the result is a consequence of a general fact pointed out in Proposition 9.14. In the general case, in view of the importance of this example, we give two different proofs. The first one is in the next theorem; the second one is in Corollary 9.8.

Theorem 9.23 (Rockafellar) Let $f : X \to \mathbb{R}_\infty$ be a closed proper convex function. Then $M := \partial f$ is maximally monotone.

Proof (M. Ivanov and N. Zlateva [167]) Let $(w, w^*) \in X \times X^*$ be monotonically related to $M$ in the sense that
\[ \forall (x, x^*) \in M \qquad \langle x - w, x^* - w^*\rangle \ge 0. \tag{9.52} \]
We have to prove that $(w, w^*) \in \partial f$. Changing $f$ into $g : x \mapsto f(w + x) - \langle x, w^*\rangle + m$ for a suitable constant $m$, we may suppose $(w, w^*) = (0, 0)$ and $f(0) > 0$. Then relation (9.52) reads
\[ \forall (x, x^*) \in \partial f \qquad \langle x, x^*\rangle \ge 0. \tag{9.53} \]
Since $f$ is the supremum of the family of continuous affine functions bounded above by $f$, there exists some $z^* \in X^*$ such that $f \ge z^*$. Let $c := \|z^*\|$ and let $(\varepsilon_n)$ be a sequence in $]0, c/2[$ with limit $0$. Let $g_n : X \to \mathbb{R}$ be given by
\[ g_n(x) := 2\varepsilon_n \|x\| \ \text{ for } x \in B_X, \qquad g_n(x) := 2\varepsilon_n + c(\|x\| - 1) \ \text{ for } x \in X \setminus B_X, \]
so that $f_n := f + g_n$ is bounded below. It can be checked that $g_n$ is convex and continuous ($g_n = k_n \circ \|\cdot\|$, where $k_n : \mathbb{R} \to \mathbb{R}$ has an increasing right derivative). Let $w_n \in X$ be such that $f_n(w_n) \le \inf_{x \in X} f_n(x) + \varepsilon_n$ for all $n \in \mathbb{N}$. The Brøndsted-Rockafellar theorem (Theorem 6.5) with $\delta_n := 1$, $w_n^* := 0$ for all $n$ yields sequences $(x_n)$ in $X$, $(x_n^*)$ in $X^*$ such that $x_n \in B[w_n, \delta_n]$ and $x_n^* \in \partial f_n(x_n) \cap B[0, \varepsilon_n]$ for all $n \in \mathbb{N}$. By the sum rule we have $x_n^* \in \partial f(x_n) + \partial g_n(x_n)$, and by definition of $g_n$, for all $y^* \in \partial g_n(x_n)$ we have $\langle x_n, y^*\rangle \ge 2\varepsilon_n \|x_n\|$. Thus, taking (9.53) into account, we get
\[ \langle x_n, x_n^*\rangle \ge 2\varepsilon_n \|x_n\| . \]
This relation implying $\|x_n^*\| \ge 2\varepsilon_n$ if $x_n \ne 0$, the inclusion $x_n^* \in B[0, \varepsilon_n]$ leads to the conclusion that we must have $x_n = 0$. Since $\partial f_n(0) = \partial f(0) + 2\varepsilon_n B_{X^*}$ and since $\partial f(0)$ is closed, we obtain $0 \in \partial f(0)$, the expected conclusion.

Another criterion for maximal monotonicity can be given under a mild continuity assumption. We say that a map $F : X \to X^*$ is radially (weak$^*$) continuous (often called hemicontinuous) if for all $x_0, x_1 \in X$, $t \in \mathbb{R}$, $x_t := (1 - t)x_0 + t x_1$, one has $F(x_t) \to F(x_0)$ weakly$^*$ when $t \to 0$. If $J$ is the duality map of a smooth Banach space $X$, this condition is satisfied since $J(x_t)$ is bounded for $t$ in any compact interval and any weak$^*$ limit point $x^*$ of $J(x_t)$ as $t \to 0$ satisfies $\|x^*\| \le \liminf_{t \to 0} \|J(x_t)\| = \liminf_{t \to 0} \|x_t\| = \|x_0\|$ and $\langle x^*, x_0\rangle \ge \liminf_{t \to 0} \langle J(x_t), x_t\rangle = \lim_{t \to 0} \|x_t\|^2 = \|x_0\|^2$, hence $x^* = J(x_0)$.

Proposition 9.14 Let $F : X \to X^*$ be a radially weak$^*$ continuous (single-valued) monotone map. Then $F$ is maximally monotone. In particular, if $X$ is strictly convex, then the duality map $J : X \to X^*$ is maximally monotone.

In fact, by Theorem 9.23, the duality multimap $J : X \rightrightarrows X^*$ is always maximally monotone since $J = \partial j$ with $j(\cdot) := (1/2)\|\cdot\|^2$.
Proof (Standard Trick for Monotone Operators) Let $(x, x^*) \in X \times X^*$ be such that $\langle x - u, x^* - u^*\rangle \ge 0$ for all $(u, u^*) \in F$. Fixing $\bar{x} \in X$ and taking $u := u_t := x + t(\bar{x} - x)$ with $t \in \,]0, 1]$, after simplification by $t$ we get $\langle x - \bar{x}, x^* - F(u_t)\rangle \ge 0$. Passing to the limit as $t \to 0_+$, by radial weak$^*$ continuity of $F$, we get $\langle x - \bar{x}, x^* - F(x)\rangle \ge 0$. Since $\bar{x}$ is arbitrary in $X$, we obtain $x^* = F(x)$.

Let us display some remarkable properties of monotone and maximally monotone multimaps. We first observe that if $M : X \rightrightarrows X^*$ is monotone, then $M^{-1}$ is monotone when one considers that it takes its values in $X^{**}$. Moreover, if $M$ is maximally monotone and if $X$ is reflexive, then $M^{-1}$ is maximally monotone. The next result we present is reminiscent of the fact that the subdifferential of a convex function $f$ on a Banach space is locally bounded on the interior of the domain of $\partial f$.

Theorem 9.24 Any monotone multimap $M : X \rightrightarrows X^*$ is locally bounded on $\operatorname{int} D(M)$. In particular, any linear monotone map $A : X \to X^*$ is continuous.
The proof we present (due to Brezis-Crandall-Pazy) uses a surprising lemma.

Lemma 9.4 (Fitzpatrick) Let $(x_n) \to 0$ in $X$, $(y_n^*)$ in $X^*$ with $(\|y_n^*\|) \to \infty$. Then, for all $r > 0$ there exist $w \in rB_X$ and subsequences $(x_{k(n)})$, $(y_{k(n)}^*)$ of $(x_n)$ and $(y_n^*)$ respectively such that
\[ \lim_{n \to \infty} \langle w - x_{k(n)}, y_{k(n)}^*\rangle = +\infty. \]

Proof Suppose on the contrary that there exists an $r > 0$ such that for all $u \in rB_X$ one can find some $c_u \in \mathbb{R}_+$ such that
\[ \forall n \in \mathbb{N} \qquad \langle u - x_n, y_n^*\rangle \le c_u . \]
For $k \in \mathbb{N}$, let $C_k := \{u \in rB_X : \forall n \in \mathbb{N}\ \langle u - x_n, y_n^*\rangle \le k\}$. Our assumption ensures that $rB_X = \bigcup_{k \in \mathbb{N}} C_k$. Since $C_k$ is closed for all $k$, Baire's Theorem ensures that for some $j \in \mathbb{N}$ the interior of $C_j$ in $rB_X$ is nonempty. Let $u_0 \in rB_X$ and $s_0 > 0$ be such that $B[u_0, s_0] \cap rB_X \subset C_j$. We can find $s \in \,]0, s_0[$ and $q \in \,]0, 1[$ such that $B[qu_0, s] \subset B[u_0, s_0] \cap rB_X$. Then, $u_1 := -qu_0 \in rB_X$ and for all $v \in sB_X$ we have
\[ \forall n \in \mathbb{N} \qquad \langle -u_1 + v - x_n, y_n^*\rangle \le j, \qquad \langle u_1 - x_n, y_n^*\rangle \le c := c_{u_1}. \]
Combining these inequalities, for all $v \in sB_X$ we get $\langle v - 2x_n, y_n^*\rangle \le j + c$. For $n$ large enough we have $\|2x_n\| \le s/2$ and we get
\[ (s/2)\,\|y_n^*\| \le (-\|2x_n\| + s)\,\|y_n^*\| \le \sup_{v \in sB_X} \langle v - 2x_n, y_n^*\rangle \le j + c, \]
a contradiction with $(\|y_n^*\|) \to \infty$.
Proof of Theorem 9.24 Replacing $M$ with $M_w = \{(y - w, y^*) : (y, y^*) \in M\}$, we may suppose $0 \in \operatorname{int} D(M)$. Let $r_0 > 0$ be such that $r_0 B_X \subset D(M)$. Let us show that there exists an $r \in \,]0, r_0]$ such that $M(rB_X)$ is bounded. Assume on the contrary that there exist sequences $(x_n) \to 0$, $(y_n^*)$ in $X^*$ such that $(\|y_n^*\|) \to \infty$ and $y_n^* \in M(x_n)$ for all $n \in \mathbb{N}$. The lemma provides $w \in r_0 B_X$ and subsequences $(x_{k(n)})$, $(y_{k(n)}^*)$ of $(x_n)$ and $(y_n^*)$ respectively such that
\[ \lim_{n \to \infty} \langle w - x_{k(n)}, y_{k(n)}^*\rangle = +\infty. \]
Then $w \in D(M)$ and the monotonicity of $M$ implies that for $z^* \in M(w)$ one has
\[ \liminf_{n \to \infty} \langle w - x_{k(n)}, z^*\rangle \ge \lim_{n \to \infty} \langle w - x_{k(n)}, y_{k(n)}^*\rangle = +\infty, \]
a contradiction.
It is easy to see that (the graph of) a maximally monotone multimap is closed. In general it is not closed in the weak $\times$ weak$^*$ topology. However, some related properties are presented in the next proposition.

Proposition 9.15 Let $M : X \rightrightarrows X^*$ be maximally monotone. Then
(a) For all $x \in D(M)$ the set $M(x)$ is weak$^*$ closed and convex.
(b) For any sequence $(x_n) \to x$ weakly in $X$ and any sequence $(x_n^*) \to x^*$ weakly$^*$ in $X^*$ with $(x_n, x_n^*) \in M$ for all $n$ one has $(x, x^*) \in M$ and $\langle x, x^*\rangle = \lim_n \langle x_n, x_n^*\rangle$ whenever one of the next equivalent conditions is satisfied:
\[ \limsup_n \langle x_n, x_n^*\rangle \le \langle x, x^*\rangle, \tag{9.54} \]
\[ \limsup_n \langle x_n - x, x_n^* - x^*\rangle \le 0. \tag{9.55} \]
(c) For any sequence $(x_n) \to x$ weakly in $X$ and any sequence $(x_n^*) \to x^*$ strongly in $X^*$ with $(x_n, x_n^*) \in M$ for all $n$, one has $(x, x^*) \in M$.
(d) For any sequence $(x_n) \to x$ in $X$ and any sequence $(x_n^*) \to x^*$ weakly$^*$ in $X^*$ such that $x_n^* \in M(x_n)$ for all $n$, one has $(x, x^*) \in M$.
(e) For any sequence $(x_n) \to x$ in $X$, any bounded sequence $(x_n^*)$ in $X^*$ such that $x_n^* \in M(x_n)$ for all $n$, and any weak$^*$ limit point $x^*$ of $(x_n^*)$, one has $(x, x^*) \in M$. In particular, if $M$ is single-valued at $x \in \operatorname{int} D(M)$, then $M$ is demi-continuous at $x$, i.e. continuous at $x$ from $D(M)$ endowed with its strong topology into $X^*$ endowed with its weak$^*$ topology.

Proof (a) Given $x \in D(M)$, by maximal monotonicity of $M$ we have
\[ M(x) = \{x^* \in X^* : \forall (u, u^*) \in M \ \ \langle x - u, x^*\rangle \ge \langle x - u, u^*\rangle\}. \]
These inequalities clearly define a weak$^*$ closed convex subset of $X^*$.
(b) Given sequences $(x_n) \to x$ weakly in $X$, $(x_n^*) \to x^*$ weakly$^*$ in $X^*$ with $(x_n, x_n^*) \in M$ for all $n$, we first observe that (9.54) and (9.55) are equivalent since
\[ \langle x_n - x, x_n^* - x^*\rangle = \langle x_n, x_n^*\rangle - \langle x, x_n^* - x^*\rangle - \langle x_n, x^*\rangle. \]
Now, passing to the limit in the relation
\[ 0 \le \langle x_n, x_n^*\rangle - \langle x_n, u^*\rangle - \langle u, x_n^*\rangle + \langle u, u^*\rangle \qquad \forall (u, u^*) \in M \]
and assuming that (9.54) is satisfied, we get
\[ 0 \le \langle x, x^*\rangle - \langle x, u^*\rangle - \langle u, x^*\rangle + \langle u, u^*\rangle \qquad \forall (u, u^*) \in M \]
or $0 \le \langle x - u, x^* - u^*\rangle$ for all $(u, u^*) \in M$. By maximal monotonicity we get $(x, x^*) \in M$.
(c) Since $(x_n)$ is bounded when $(x_n) \to x$ weakly, if $(x_n^*) \to x^*$ strongly we have
\[ \limsup_n \langle x_n - x, x_n^* - x^*\rangle \le \limsup_n \|x_n - x\| \cdot \|x_n^* - x^*\| = 0 \]
and (9.55) holds.
(d) When $(x_n) \to x$ and $(x_n^*) \to x^*$ weakly$^*$, the sequence $(x_n^*)$ being bounded, we have $(\langle x_n - x, x_n^*\rangle) \to 0$, so that $\limsup_n \langle x_n, x_n^*\rangle \le \limsup_n \langle x, x_n^*\rangle = \langle x, x^*\rangle$.
(e) Given sequences $(x_n) \to x$ in $X$, $(x_n^*)$ bounded in $X^*$ such that $x_n^* \in M(x_n)$ for all $n$, and the limit $x^*$ of a subnet $(x_{n(i)}^*)_{i \in I}$ of $(x_n^*)$, we have $(\langle x_{n(i)} - x, x_{n(i)}^*\rangle)_{i \in I} \to 0$, hence, for all $(u, u^*) \in M$,
\[ 0 \le \liminf_{i \in I} \langle x_{n(i)} - u, x_{n(i)}^* - u^*\rangle = \liminf_{i \in I} \langle x - u, x_{n(i)}^* - u^*\rangle = \langle x - u, x^* - u^*\rangle. \]
By maximality we get $(x, x^*) \in M$. If $x \in \operatorname{int} D(M)$, for any sequences $(x_n)$ in $D(M)$ with limit $x$, $(x_n^*)$ in $X^*$ such that $x_n^* \in M(x_n)$ for all $n$, the sequence $(x_n^*)$ is bounded by Theorem 9.24 and the preceding argument shows that any weak$^*$ limit point $x^*$ of $(x_n^*)$ is in $M(x)$. The last assertion ensues.

Single-valued monotone maps enjoy remarkable continuity properties.

Proposition 9.16 (a) Let $F : X \to X^*$ be a radially continuous monotone single-valued map on a finite dimensional space $X$. Then $F$ is continuous.
(b) For any Banach space $X$, any radially weak$^*$ continuous monotone map $F : X \to X^*$ is demicontinuous, i.e. continuous from $X$ endowed with its strong topology into $X^*$ endowed with its weak$^*$ topology.

Proof (a) Let us first prove that any monotone map $F : X \to X^*$ is bounding, i.e. bounded on bounded subsets (we prefer this term to the more usual term "bounded", which is confusing since the image of $F$ is not necessarily bounded). If $F$ is not bounding there exists a bounded sequence $(x_n)$ in $X$ such that $(\|F(x_n)\|) \to \infty$. We may suppose $(x_n)$ has a limit $\bar{x}$ and that $(u_n^*) := (F(x_n)/\|F(x_n)\|)$ has a limit $u^* \in S_{X^*}$. Passing to the limit in the relation
\[ \frac{1}{\|F(x_n)\|} \langle x_n - x, F(x_n) - F(x)\rangle \ge 0 \qquad \forall x \in X,\ \forall n \in \mathbb{N}, \]
we obtain
\[ \langle \bar{x} - x, u^*\rangle \ge 0 \qquad \forall x \in X. \]
Thus $u^* = 0$, contradicting $\|u^*\| = 1$. Thus $F$ is bounding. Now let us suppose a sequence $(x_n)$ converges to $x \in X$ and let $y^*$ be a limit point of the bounded sequence $(F(x_n))$. The monotonicity of $F$ yields
\[ \langle x - w, y^* - F(w)\rangle \ge 0 \qquad \forall w \in X. \]
Taking $u$ arbitrary in $X$, $t \in \,]0, 1]$ and $w := x - t(x - u)$, we get $\langle x - u, y^* - F(x)\rangle \ge 0$ by the radial continuity of $F$. Since $u$ is arbitrary in $X$, this means that $y^* = F(x)$. Thus $(F(x_n)) \to F(x)$.
(b) The map $F$ is maximally monotone by Proposition 9.14 and locally bounded by Theorem 9.24. Thus, the assertion is a consequence of Proposition 9.15(e).
Exercises

1. Let $F : C \to X^*$ be a single-valued monotone map whose domain $C$ is a convex subset of a Banach space $X$. Suppose that $F$ is (radially) differentiable at some point $x$ of the interior of $C$. Show that for all $v \in X$ one has $\langle F'(x).v, v\rangle \ge 0$.
2. Let $L : X \to X$ be a linear monotone operator on a Hilbert space. Show that $L$ is maximally monotone if and only if its graph is closed and $L^{\intercal}$ is monotone. [See [152, Thm 10 p. 48].]
3. (Debrunner-Flor) Let $M : X \rightrightarrows X$ be a monotone operator with graph $G(M)$ in a Hilbert space and let $C$ be a closed convex subset of $X$ containing the domain $D(M)$ of $M$. Show that for all $y \in X$ there exists some $x \in C$ such that
\[ \langle v + x \mid u - x\rangle \ge \langle y \mid u - x\rangle \qquad \forall (u, v) \in G(M). \]
[See [29, Thm 21.7].]
9.4.5 Representation of Monotone Multimaps

It is the purpose of this subsection to show that convex analysis can be used to study monotonicity. Thus, following [207], given a monotone operator $M$, we look for a convex function $f : X \times X^* \to \overline{\mathbb{R}}$ representing $M$ in such a way that some properties of $f$ can be transferred to $M$ and some operations on monotone operators correspond to operations on functions. In the sequel $c$ and $b$ denote the coupling functions $c : X \times X^* \to \mathbb{R}$, $b : (X \times X^*) \times (X \times X^*) \to \mathbb{R}$ given by
\[ c(x, x^*) := \langle x^*, x\rangle, \qquad (x, x^*) \in X \times X^*, \]
\[ b((w, w^*), (x, x^*)) = \langle w^*, x\rangle + \langle x^*, w\rangle, \qquad (w, w^*), (x, x^*) \in X \times X^*. \]
Note that $b$ is a symmetric bilinear function that realizes a metric coupling of $X \times X^*$ with itself (exercise). Such a fact expresses the particular structure of $X \times X^*$. If $f$ is a function on $X \times X^*$, we denote by $f^b$ its conjugate with respect to the coupling function $b$:
\[ f^b(x, x^*) := \sup\{ b((w, w^*), (x, x^*)) - f(w, w^*) : (w, w^*) \in X \times X^* \} \]
for $(x, x^*) \in X \times X^*$ and we set $f^{\intercal}(x^*, x) := f(x, x^*)$. Considering $X \times X^*$ as a subset of $X^{**} \times X^* = (X^* \times X)^*$, we note that $f^b$ is the restriction to $X \times X^*$ of $(f^{\intercal})^*$, where for $g : X^* \times X \to \overline{\mathbb{R}}$, $g^* : X^{**} \times X^* \to \overline{\mathbb{R}}$ is the usual conjugate of $g$ for the usual coupling function between $X^* \times X$ and $X^{**} \times X^*$. Equivalently, $f^b = h^{\intercal}$, where $h$ is the restriction to $X^* \times X$ of the usual conjugate $f^*$ of $f$ and where $h^{\intercal}(x, x^*) := h(x^*, x)$. In the sequel we find it convenient to use $f^b$ rather than $f^*$ (in particular because it is defined on the same space as $f$) and to identify a multimap with its graph.

Since a multimap $M : X \rightrightarrows X^*$ is characterized by (or even identified with) its graph $G(M)$ or $M \subset X \times X^*$, it is faithfully represented by the indicator function $\iota_M$ of its graph $M$. However, $\iota_M$ has no differentiability or convexity property. Thus it is natural to replace $\iota_M$ or closely related functions such as $c_M := \iota_M + c$ with their convex envelopes or their closed convex envelopes. The following observation enlightens our route. It can be formulated as follows: if $M : X \rightrightarrows X^*$ is monotone, then $c$ is convex on (the graph of) $M$, a striking fact. Thus, it is natural to use convexity to study monotone multimaps.

Proposition 9.17 If $M : X \rightrightarrows X^*$ is a monotone multimap, then the restriction to $M$ of the convex envelope $g_M := \operatorname{co} c_M$ of $c_M := c + \iota_M$ coincides with the restriction to $M$ of $c$.
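Proof Given $(x_i, x_i^*) \in M$ for $i$ in a finite set $I$ and $(t_i)_{i \in I} \in \mathbb{R}_+^I$ satisfying $\sum_{i \in I} t_i = 1$ and $(x, x^*) := (\sum_{i \in I} t_i x_i, \sum_{i \in I} t_i x_i^*) \in M$, by monotonicity we have
\[ \sum_{i \in I} t_i \langle x_i, x_i^*\rangle - \langle x, x^*\rangle = \sum_{i \in I} t_i \langle x_i, x_i^* - x^*\rangle \ge \sum_{i \in I} t_i \langle x, x_i^* - x^*\rangle = \Big\langle x, \sum_{i \in I} t_i x_i^* - x^* \Big\rangle = 0. \]
By construction of $\operatorname{co} c_M$, this shows that for any $(x, x^*) \in M$ we have $(\operatorname{co} c_M)(x, x^*) \ge \langle x, x^*\rangle$. Since the reverse inequality $\operatorname{co} c_M \le c_M$ holds, we get $\operatorname{co} c_M |_M = c |_M$.

Let us describe an easy way of obtaining monotone multimaps that is a kind of converse of the preceding observation.

Lemma 9.5 Let $\mathcal{G}$ be the set of proper convex functions $g : X \times X^* \to \mathbb{R}_\infty$ such that $g \ge c$. For $g \in \mathcal{G}$ let
\[ M_g := G(M_g) := \{(x, x^*) \in X \times X^* : g(x, x^*) = c(x, x^*)\}. \]
Then $M_g$ is monotone, as is any multimap $M$ such that $M \subset M_g$.

Proof Clearly, for $g \in \mathcal{G}$ one has $M_g := \{(x, x^*) \in X \times X^* : g(x, x^*) \le c(x, x^*)\}$. Thus, given $(x, x^*)$, $(y, y^*) \in M_g$, by convexity of $g$ and this relation we have
\[ \Big\langle \tfrac{1}{2}(x + y), \tfrac{1}{2}(x^* + y^*) \Big\rangle \le g\Big( \tfrac{1}{2}(x + y), \tfrac{1}{2}(x^* + y^*) \Big) = g\Big( \tfrac{1}{2}(x, x^*) + \tfrac{1}{2}(y, y^*) \Big) \le \tfrac{1}{2} g(x, x^*) + \tfrac{1}{2} g(y, y^*) = \tfrac{1}{2}\langle x, x^*\rangle + \tfrac{1}{2}\langle y, y^*\rangle. \]
Whence $\langle x - y, x^* - y^*\rangle \ge 0$ and $M_g$ is monotone; so is any multimap $M$ such that $M \subset M_g$.

The next proposition completes the preceding lemma.

Proposition 9.18 For a nonempty subset $M$ of $X \times X^*$ the following assertions are equivalent:
(a) $M$ is the graph of a monotone multimap;
(b) $c_M := c + \iota_M \ge c_M^b$;
(c) $g_M := \operatorname{co} c_M \ge c$;
(d) the function $g_M := \operatorname{co} c_M$ belongs to $\mathcal{G}$ and satisfies $g_M + \iota_M = c_M$;
(e) there exists some $g \in \mathcal{G}$ such that $g + \iota_M = c_M$, i.e. $M \subset M_g$.

Proof (a)$\Rightarrow$(b) Given $(x, x^*) \in M$, for all $(w, w^*) \in M$ we have
\[ \langle x, x^*\rangle \ge \langle w, x^*\rangle + \langle x, w^*\rangle - \langle w, w^*\rangle = b((w, w^*), (x, x^*)) - c(w, w^*), \]
hence $\langle x, x^*\rangle \ge (c + \iota_M)^b(x, x^*) =: c_M^b(x, x^*)$. Thus $c_M \ge c_M^b$.
(b)$\Rightarrow$(c) Let $g_M := \operatorname{co} c_M$, the convex envelope of $c_M$. Then $g_M^* = c_M^*$ and $g_M^b = c_M^b$. Moreover, since $c_M^b$ is convex and $c_M \ge c_M^b$, by definition of a convex envelope, we have $g_M \ge c_M^b = (c_M^*)^{\intercal}$. Thus, for all $(x, x^*) \in X \times X^*$ we have
\[ 2 g_M(x, x^*) \ge g_M(x, x^*) + (c_M^*)^{\intercal}(x, x^*) = g_M(x, x^*) + g_M^*(x^*, x) \ge \langle (x, x^*), (x^*, x)\rangle = 2\langle x, x^*\rangle. \]
Therefore $g_M \ge c$.
(c)$\Rightarrow$(d) We observe that (c) means that $g_M := \operatorname{co} c_M \in \mathcal{G}$. Thus $g_M + \iota_M \ge c + \iota_M = c_M$; on the other hand, $g_M + \iota_M \le c_M + \iota_M = c_M$, so that $g_M + \iota_M = c_M$.
(d)$\Rightarrow$(e) Taking $g = g_M$, it suffices to observe that $M \subset M_g$: since $g + \iota_M = c_M$, for $(x, x^*) \in M$ we have $g(x, x^*) = c(x, x^*)$, hence $M \subset M_g$.
(e)$\Rightarrow$(a) is a consequence of the preceding lemma.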
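For a finite set $M \subset \mathbb{R} \times \mathbb{R}$ both the monotonicity of $M$ and the inequality $c_M \ge c_M^b$ of assertion (b) reduce to finitely many comparisons, so the equivalence (a)$\Leftrightarrow$(b) can be tested directly. The sketch below assumes $X = \mathbb{R}$ (so that $\langle x, x^*\rangle$ is an ordinary product); the sample sets and function names are hypothetical.

```python
def conjugate_of_c_M(points, x, x_star):
    """c_M^b(x, x*) = max over (w, w*) in M of  x*w* + w*x* - w*w*  (finite M)."""
    return max(x * w_star + w * x_star - w * w_star for w, w_star in points)


def is_monotone(points):
    """Check <x - w, x* - w*> >= 0 for every pair of points of M."""
    return all(
        (x - w) * (x_star - w_star) >= 0.0
        for x, x_star in points
        for w, w_star in points
    )


def satisfies_b(points, tol=1e-12):
    """Check c(x, x*) >= c_M^b(x, x*) on M (outside M, c_M is +infinity)."""
    return all(
        x * x_star >= conjugate_of_c_M(points, x, x_star) - tol
        for x, x_star in points
    )


# Samples of the graph of the nondecreasing function t -> t^3.
monotone_set = [(0.5 * k, (0.5 * k) ** 3) for k in range(-4, 5)]

# Two points violating monotonicity: (0 - 1) * (0 - (-1)) < 0.
non_monotone_set = [(0.0, 0.0), (1.0, -1.0)]
```

On `monotone_set` both predicates hold, while on `non_monotone_set` both fail; at the point $(0,0)$ of the second set one finds $c_M^b(0,0) = 1 > 0 = c(0,0)$.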
The set $\mathcal{G}$ of proper convex functions on $X \times X^*$ bounded below by $c$ can serve to define maximally monotone multimaps, as the next theorem shows. We need some preliminary results in which we set
\[ q(x, x^*) := \frac{1}{2}\big( \|x\|^2 + \|x^*\|^2 \big), \qquad (x, x^*) \in X \times X^*. \]
We observe that for all $(x, x^*) \in X \times X^*$ we have $q(x, x^*) \ge c(x, x^*)$ and $q(x, x^*) \ge -c(x, x^*)$. Moreover, if $q(x, x^*) = c(x, x^*)$ we have
\[ 0 = q(x, x^*) - c(x, x^*) \ge \frac{1}{2}\big( \|x\|^2 - 2\|x\| \cdot \|x^*\| + \|x^*\|^2 \big) \ge 0, \]
hence $\|x\| = \|x^*\|$ and $\langle x, x^*\rangle = \|x\|^2 = \|x^*\|^2$. Therefore $M_q = J$, the duality multimap, the inclusion $J \subset M_q$ being a consequence of the definition of $J$. This fact explains the importance of the relation $q(x, x^*) = c(x, x^*)$.

Lemma 9.6 Let $g \in \mathcal{G}$ and let $M \subset M_g$ be such that for all $(x, x^*) \in X \times X^*$ there exists some $(w, w^*) \in M$ such that
\[ g(w, w^*) - \langle w, w^*\rangle + q(w - x, w^* - x^*) + c(w - x, w^* - x^*) \le 0. \tag{9.56} \]
Then $M$ is maximally monotone (hence $M = M_g$).

Note that condition (9.56) implies that $(g - c) \,\square\, (q + c) \le 0$, where $\square$ denotes the infimal convolution.

Proof Since $M \subset M_g$, $M$ is monotone. Let $(x, x^*)$ be monotonically related to $M$ in the sense that
\[ \langle w - x, w^* - x^*\rangle \ge 0 \qquad \forall (w, w^*) \in M. \]
Picking $(w, w^*) \in M$ as in the statement and using the assumption $g \ge c$ we get $q(w - x, w^* - x^*) \le 0$, hence $w = x$, $w^* = x^*$. Thus $(x, x^*) \in M$ and $M$ is maximally monotone.

Theorem 9.25 Let $g \in \mathcal{G}$ be such that $g^{bb} = g$ and $g^b \in \mathcal{G}$. Then $M_g := \{(x, x^*) \in X \times X^* : g(x, x^*) = c(x, x^*)\}$ is (the graph of) a maximally monotone multimap.

Proof Given $(x, x^*) \in X \times X^*$, setting
\[ h(u, u^*) := q(u - x, u^* - x^*) + c(u - x, u^* - x^*) - \langle u, u^*\rangle, \qquad (u, u^*) \in X \times X^*, \]
and performing an easy computation, we see that
\[ h^b(u, u^*) = h(-u, -u^*) \qquad \forall (u, u^*) \in X \times X^*. \]
Since $g^b \ge c$, for all $(u, u^*) \in X \times X^*$ we have
\[ g^b(u, u^*) + h(u, u^*) \ge q(u - x, u^* - x^*) + c(u - x, u^* - x^*) \ge 0. \]
The sandwich theorem and the continuity of $h$ (or Corollary 6.14) yield some $(u, u^*) \in X \times X^*$ such that $(g^b)^*(u^*, u) + h^*(-u^*, -u) \le 0$. This amounts to $g^{bb}(u, u^*) + h(u, u^*) \le 0$ and means that (9.56) is satisfied since $g^{bb} = g$. The preceding lemma ensures that $M_g$ is maximally monotone.

Corollary 9.8 Let $f : X \to \mathbb{R}_\infty$ be a closed, proper convex function. Then $M := \partial f$ is a maximally monotone multimap.

Proof Let $g : X \times X^* \to \mathbb{R}_\infty$ be given by $g(x, x^*) := f(x) + f^*(x^*)$. Then, for all $(x, x^*) \in X \times X^*$ we have $g(x, x^*) \ge c(x, x^*)$ and
\[ g^b(x, x^*) = \sup_{(w, w^*) \in X \times X^*} \big( \langle x, w^*\rangle + \langle w, x^*\rangle - f(w) - f^*(w^*) \big) = f^{**}(x) + f^*(x^*) = g(x, x^*). \]
Thus $g^{bb} = g$ and $g^b \in \mathcal{G}$. Since $M_g = G(\partial f)$, Theorem 9.25 ensures that $\partial f$ is maximally monotone.

Now let us consider the reverse passage, from monotone subsets of $X \times X^*$ to functions. Since in Banach spaces closed proper convex functions have a rather nice calculus, it is sensible to pass to the families
\[ \mathcal{H} := \{h \in \mathcal{G} : h = h^{bb}\} = \{h : h = h^{bb},\ h \ge c\}, \qquad \mathcal{H}_M := \{h \in \mathcal{H} : h + \iota_M = c_M := c + \iota_M\}. \]
The set $\mathcal{H}_M$ is called the set of representative functions of $M$. Note that $h \in \mathcal{H}_M$ if and only if $h = h^{bb}$, $h \ge c$ and $M \subset M_h$. In Proposition 9.18, to a nonempty subset $M$ of $X \times X^*$ we have associated the function $g_M := \operatorname{co}(c + \iota_M)$. If $M$ is monotone, $g_M$ belongs to $\mathcal{G}$, hence satisfies $g_M \ge c$, so that its lower semicontinuous hull
\[ p_M := g_M^{bb} := (\operatorname{co}(c + \iota_M))^{bb} = (c + \iota_M)^{bb} =: c_M^{bb} \]
also satisfies $p_M \ge c$ and belongs to $\mathcal{H}$. It seems to be closely related to $c_M$: its construction is given by simple operations and in some cases it is possible to give an explicit expression for it. We also consider its conjugate $f_M$:
\[ f_M(x, x^*) := p_M^b(x, x^*), \qquad p_M(x, x^*) = f_M^b(x, x^*), \]
\[ f_M(x, x^*) := c_M^b(x, x^*) := \sup\{ \langle x, w^*\rangle + \langle w, x^*\rangle - \langle w, w^*\rangle : (w, w^*) \in M \}. \]
The function $f_M$ is called the Fitzpatrick function of $M$. We call the function $p_M$ the predominant function of $M$ in view of the following proposition.

Proposition 9.19 For any subset $M$ of $X \times X^*$ the function $p_M$ is the greatest closed proper convex function on $X \times X^*$ bounded above by $c_M$. In fact, $p_M$ is the lower semicontinuous hull of $g_M := \operatorname{co} c_M$. For any monotone multimap $M$ one has $p_M \ge c$, hence $p_M \in \mathcal{H}_M$, and
\[ p_M \ge f_M, \qquad c_M \ge f_M + \iota_M . \tag{9.57} \]
The function $p_M$ is also the greatest element of $\mathcal{H}_M$.

Proof The first assertion is a general fact about the biconjugate of a function applied to $c_M$. Let $M$ be monotone. The implication (a)$\Rightarrow$(c) of Proposition 9.18 ensures that $g_M \ge c$, hence $p_M = g_M^{bb} \ge c$ since $c$ is lower semicontinuous and in fact continuous. The implication (a)$\Rightarrow$(b) of Proposition 9.18 yields $c_M \ge c_M^b = f_M$, hence $c_M = c_M + \iota_M \ge f_M + \iota_M$. Since $f_M$ is closed proper convex, from the inequality $c_M \ge f_M$ we deduce the relations $p_M := c_M^{bb} \ge f_M^{bb} = f_M$. Finally, for $h \in \mathcal{H}_M$ we have $h \le h + \iota_M = c_M$, hence $h = h^{bb} \le c_M^{bb} = p_M$.

Note that for a monotone multimap $M$ the function $f_M$ is not necessarily a representative function of $M$. However, $f_M$ is of interest since it satisfies the properties $f_M^{bb} = f_M$ and $f_M = c_M$ on $M$ if $M$ is monotone, since
\[ f_M(x, x^*) = \langle x, x^*\rangle - \inf_{(w, w^*) \in M} \langle w - x, w^* - x^*\rangle. \]
Moreover, when $M$ is monotone, one has $f_M(x, x^*) \le \langle x, x^*\rangle$ if and only if $M \cup \{(x, x^*)\}$ is monotone. The function $p_M$ is often close to $g_M$, hence is often easier to compute than $f_M$.

Examples
(a) Let $M := \{(0, 0)\}$. Then $M$ is monotone, $p_M = g_M = c_M = \iota_M$ and $f_M = \iota_M^b = 0$. More generally, when $M$ is (the graph of) a continuous linear map satisfying $M = -M^{\intercal}|_X$, where $M^{\intercal}$ is the transpose of $M$, one has $p_M = \iota_M = f_M$ since for all $(x, x^*) \in M$ one has $c(x, x^*) = \langle x, M(x)\rangle = -\langle M(x), x\rangle$, hence $c(x, x^*) = 0$ or $c_M = \iota_M$.
(b) More generally, let $M$ be a linear subspace of $X \times X^*$ such that $\langle x, x^*\rangle \ge 0$ for all $(x, x^*) \in M$. In such a case $M$ is the graph of a monotone multimap since for all $(w, w^*)$, $(x, x^*) \in M$ one has $(w - x, w^* - x^*) \in M$. Proposition 9.17 asserts that $c_M$ is convex, so that $g_M = c_M$ and $p_M = c_{\operatorname{cl}(M)}$ since $\operatorname{cl}(M)$ also is a linear subspace of $X \times X^*$ and $c|_{\operatorname{cl}(M)}$ is convex and continuous. In particular, when $M$ is the graph of a monotone, continuous, linear map, one has $p_M = c_M$, a remarkably simple fact, whereas, setting $q_M(w) = (1/2)\langle w, Mw\rangle$,
\[ f_M(x, x^*) = \sup_{w \in X} \big( \langle w, x^* + M^{\intercal} x\rangle - \langle w, Mw\rangle \big) = (2q_M)^*(x^* + M^{\intercal} x). \]
(c) A special case of the preceding example concerns the identity map $I$ on a Hilbert space $X$ identified to $X^*$ by the Riesz isomorphism. Then $p_I(x, x^*) = g_I(x, x^*) = c_I(x, x^*)$ and $f_I(x, x^*) = (1/4)\|x + x^*\|^2$.
(d) Let $M := J$, the duality map of $X$, and let $q$ be the quadratic form on $X \times X^*$ given by $q(x, x^*) := (1/2)\|x\|^2 + (1/2)\|x^*\|^2$ as above. The definitions of $p_M$ and $f_M := c_M^b$ yield $p_M \ge q$ and $f_M \le q$.
(e) Let $M := \partial s$, where $s : X \to \mathbb{R}$ is a continuous sublinear function on $X$. Let $S := \partial s(0)$, so that $s(x) = \sup\{\langle x, w^*\rangle : w^* \in S\}$, $\partial s(w) = \{w^* \in S : c(w, w^*) = s(w)\}$ and $s^*(w^*) = \iota_S(w^*)$. Thus one has $c_M(x, x^*) = s(x) + \iota_S(x^*) = s(x) + s^*(x^*)$, hence $p_M(x, x^*) = s(x) + s^*(x^*) = f_M(x, x^*)$.

Exercise Let $M$ be a monotone multimap with nonempty graph and let $(x, x^*) \in X \times X^*$, $r \in \mathbb{P}$. Show that $p_{rM}(x, x^*) = r\,p_M(x, x^*/r)$ and $f_{rM}(x, x^*) = r f_M(x, x^*/r)$.

Exercise Let $X$ be reflexive and let $M : X \rightrightarrows X^*$. Show that for all $(x, x^*) \in X \times X^*$ one has $p_{M^{-1}}(x^*, x) = p_M(x, x^*)$ and $f_{M^{-1}}(x^*, x) = f_M(x, x^*)$, i.e. $p_{M^{-1}} = p_M^{\intercal}$ and $f_{M^{-1}} = f_M^{\intercal}$.

Let us show that a maximally monotone multimap $M$ is represented by the functions $f_M$ and $p_M$.

Theorem 9.26 For a nonempty subset $M$ of $X \times X^*$ and $h : X \times X^* \to \mathbb{R}_\infty$, the following assertions are equivalent:
(a) $M$ is maximally monotone;
(b) $c_M \ge p_M \ge f_M \ge c$ and $M = M_{f_M} = M_{p_M}$;
(c) if $h$ satisfies $h = h^{bb}$ and $p_M \ge h \ge f_M$, then $h \in \mathcal{H}_M$ and $M_h = M$;
(d) for $h := f_M$ one has $h \in \mathcal{H}_M$ and $M_h = M$;
(e) the function $h := f_M$ satisfies $h \ge c$ and $M_h = M$.

Proof (a)$\Rightarrow$(b) When $M$ is monotone we already know that $c_M \ge c_M^{bb} =: p_M \ge f_M$ and $\inf\{\langle x - w, x^* - w^*\rangle : (w, w^*) \in M\} = 0$ for all $(x, x^*) \in M$. Since
\[ f_M(x, x^*) := c_M^b(x, x^*) := \sup\{\langle x, w^*\rangle + \langle w, x^*\rangle - \langle w, w^*\rangle : (w, w^*) \in M\} = \langle x, x^*\rangle - \inf\{\langle x - w, x^* - w^*\rangle : (w, w^*) \in M\} \]
we get $f_M(x, x^*) = \langle x, x^*\rangle$ for all $(x, x^*) \in M$. When $M$ is maximally monotone, for $(x, x^*) \in (X \times X^*) \setminus M$ we have $\inf\{\langle x - w, x^* - w^*\rangle : (w, w^*) \in M\} < 0$, hence $f_M(x, x^*) > \langle x, x^*\rangle$. Thus we have $M = M_{f_M}$, hence $M_{p_M} \subset M_{f_M} = M$; conversely, for $(x, x^*) \in M$ we have $c(x, x^*) = c_M(x, x^*) \ge p_M(x, x^*) \ge c(x, x^*)$, hence $(x, x^*) \in M_{p_M}$.
(b)$\Rightarrow$(c) and (c)$\Rightarrow$(d) are obvious.
(d)$\Rightarrow$(e) This follows from the inclusion $\mathcal{H}_M \subset \mathcal{G}$.
(e)$\Rightarrow$(a) For $g := h := f_M$ one has $g^{bb} = g$ and $g^b = p_M \in \mathcal{H}_M \subset \mathcal{G}$ and $M = M_g$, so that Theorem 9.25 ensures that $M$ is maximally monotone.
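The identity map on $\mathbb{R}$ gives a case where assertion (b) of Theorem 9.26 can be read off a closed formula: by Example (c), $f_I(x, x^*) = \frac{1}{4}(x + x^*)^2$, so that $f_I(x, x^*) - x x^* = \frac{1}{4}(x - x^*)^2$ vanishes exactly on the graph of $I$. The sketch below compares this formula with a brute-force supremum over a grid; the grid bounds and the function names are assumptions made for the illustration, and the grid value is only an approximation from below.

```python
def fitzpatrick_identity_exact(x, x_star):
    """Closed form f_I(x, x*) = (1/4)(x + x*)^2 on X = R."""
    return 0.25 * (x + x_star) ** 2


def fitzpatrick_identity_grid(x, x_star, half_width=10.0, n=20001):
    """Approximate sup over w of (x*w + w*x_star - w*w) on a uniform grid of w."""
    best = float("-inf")
    for k in range(n):
        w = -half_width + 2.0 * half_width * k / (n - 1)
        best = max(best, (x + x_star) * w - w * w)
    return best


def gap(x, x_star):
    """f_I(x, x*) - c(x, x*); zero exactly when (x, x*) lies on the graph of I."""
    return fitzpatrick_identity_exact(x, x_star) - x * x_star
```

Evaluating `gap` at points on and off the diagonal $x^* = x$ illustrates $M = M_{f_M}$ for $M = I$: the gap is zero on the diagonal and positive elsewhere.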
9.4 Nonlinear Problems
Corollary 9.9 If M is maximally monotone, then for h W X X ! R1 satisfying h D hbb , pM h fM , one has pM hb fM and, denoting by X the canonical projection from X X onto X, one has co D.M/ X .dom h/ \ X .dom hb /: Proof Since pM D fMb and fM D pbM , the relations pM hb fM follow from the fact that h 7! hb is antitone. Thus, assertion (c) of the preceding theorem ensures that h; hb 2 HM and h and hb play symmetric roles. Since dom h is convex, it suffices to prove that D.M/ X .dom h/. Given x 2 D.M/; we pick x 2 M.x/; then .x; x / 2 M D Mh , so that h.x; x / D hx; x i 2 R and x 2 X .dom h/:
9.4.6 Surjectivity of Maximally Monotone Multimaps Let us give important characterizations of maximal monotonicity using the duality multimap J of X or rather its graph G.J/: Theorem 9.27 (Simons) Let M W X X be a monotone multimap such that G.M/ C G.J/ D X X . Then M is maximally monotone. If X is reflexive, the converse holds. Proof Let .x; x / 2 X X be monotonically related to M in the sense that hx w; x w i 0
8.w; w / 2 M:
Assuming M is monotone and G.M/ C G.J/ D X X , we can find .u; u / 2 M and .v; v / 2 G.J/ such that .u; u / C .v; v / D .x; x /. Then we have 0 hx u; x u i D hv; v i D kvk2 D kv k2 hence v D 0, v D 0 and .x; x / D .u; u / 2 M: Therefore M is maximally monotone. Conversely, let M be maximally monotone and let .z; z / 2 X X ; X being reflexive. Since N given by G.N/ WD G.M/ .z; z / is maximally monotone, replacing M with N, it suffices to show that .0; 0/ 2 G.N/ C G.J/: Observing that q given by q.x; x / WD .1=2/ kxk2 C .1=2/ kx k2 for .x; x / 2 X X is in H and .1=2/ kxk2 C.1=2/ kx k2 kxk : kx k hx; x i, by maximal monotonicity of M and the implication (a))(b) of Theorem 9.26, we get that fN .x; x / C q.x; x / c.x; x / hx; x i D 0: Since q is convex, finite and continuous, the sandwich theorem yields some .w ; w/ 2 X X D .X X / , r 2 R such that fN .x; x / hx; w i C hw; x i C r q.x; x /
8.x; x / 2 X X ;
9 Partial Differential Equations
or fNb .w; w / r qb .w; w / (this also follows from the Fenchel-Rockafellar theorem or the Attouch-Brézis theorem). Thus fNb .w; w /Cqb .w; w / 0. Since hw; w i pN .w; w / D fNb .w; w / and hw; w i qb .w; w / D q.w; w /, we get . pN .w; w / hw; w i/ C .q.w; w / hw; w i/ 0; whence .w; w / 2 MpN D G.N/ and .w; w / 2 Mq D G.J/: Therefore .0; 0/ 2 G.N/ C G.J/ and .z; z / 2 G.M/ C G.J/: Let us give a more striking form to the preceding result in the reflexive case. Theorem 9.28 (Minty, Rockafellar) Let X be a reflexive Banach space. Then a monotone multimap M W X X is maximally monotone if and only if one has R.M C J/ D X . Proof We first note that the relation G.M/ C G.J/ D X X implies the equality R.M CJ/ D X since for all x 2 X there exist some .u; u / 2 G.M/ and .v; v / 2 G.J/ such that .u; u / C .v; v / D .0; x /; hence v D u and v 2 J.v/ D J.u/ with x D u C v 2 .M C J/.u/: Thus, if X is reflexive and if M is maximally monotone, the preceding theorem entails that R.M C J/ D X : Now let us suppose R.M C J/ D X : In order to prove that M is maximally monotone, we assume that the reflexive space X is endowed with a compatible strictly convex norm, as the Kadec-Troyanski Renorming Theorem (Theorem 6.24) allows us to do. Then, the duality map J is single-valued and strictly monotone in the sense that for u; v 2 X with u ¤ v one has hu v; J.u/ J.v/i > 0: Let .x; x / be monotonically related to M W hw x; w x i 0
8.w; w / 2 M:
Since R.M C J/ D X there exists some .w; w / 2 M such that x C J.x/ D w C J.w/:
(9.58)
Then hw x; J.w/ J.x/i D hw x; w x i 0: By strict monotonicity of J we get w D x, hence w D x by (9.58) and .x; x / 2 M: Thus M is maximally monotone. Remark Let us give a direct proof of the relation R.J C M/ D X, assuming that M is maximally monotone and X is reflexive. By Theorem 9.26 there exists h 2 H
such that $M = M_h$. Since $h \ge c$, the Cauchy inequality $\langle x, x^*\rangle \ge -\frac{1}{2}(\|x\|^2 + \|x^*\|^2)$ implies that
\[ h(x,x^*) \ge -\tfrac{1}{2}\left(\|x\|^2 + \|x^*\|^2\right) \qquad \forall (x,x^*) \in X \times X^*. \]
The sandwich theorem yields some $(w,w^*) \in X \times X^*$ and $r \in \mathbb{R}$ such that
\[ h(x,x^*) \ge \langle x, w^*\rangle + \langle w, x^*\rangle + r \ge -\tfrac{1}{2}\left(\|x\|^2 + \|x^*\|^2\right) \qquad \forall (x,x^*) \in X \times X^*. \tag{9.59} \]
Choosing $(x,x^*) \in (-J^{-1}w^*, -Jw)$ (recall that $J$ is onto), we get
\[ r \ge \tfrac{1}{2}\left(\|w\|^2 + \|w^*\|^2\right). \]
Then, for any $(x,x^*) \in M$, relation (9.59) implies that
\[ \langle x, x^*\rangle \ge \langle x, w^*\rangle + \langle w, x^*\rangle + \tfrac{1}{2}\left(\|w\|^2 + \|w^*\|^2\right). \]
Adding $\langle w, w^* - x^*\rangle - \langle x, w^*\rangle$ to both sides of this inequality, we get
\[ \langle x - w, x^* - w^*\rangle \ge \langle w, w^*\rangle + \tfrac{1}{2}\left(\|w\|^2 + \|w^*\|^2\right) \ge 0 \qquad \forall (x,x^*) \in M. \]
By maximality of $M$ we conclude that $(w,w^*) \in M$. Now, taking $(x,x^*) = (w,w^*)$ in the last inequalities, we get
\[ 0 \ge \tfrac{1}{2}\left(\|w\|^2 + \|w^*\|^2\right) + \langle w, w^*\rangle \ge 0, \]
hence $0 \ge \|w\|^2 + \|w^*\|^2 - 2\|w\| \cdot \|w^*\| = (\|w\| - \|w^*\|)^2$, or $\|w^*\| = \|w\|$ and $\langle w, w^*\rangle \le -\|w\|^2$, so that $-w^* \in J(w)$. Thus $0 \in (M + J)(w)$. Given $z^* \in X^*$ and applying what precedes to the maximally monotone multimap with graph $M + (0, -z^*)$, we get $z^* \in (M + J)(X)$; hence $R(M + J) = X^*$.

The following consequence clarifies the structure of the graph of a maximally monotone multimap.

Corollary 9.10 (Minty, Rockafellar) The graph $G(M)$ of a maximally monotone multimap on a Hilbert space $X$ is a Lipschitz submanifold of $X \times X$. More precisely, setting $P := (I + M)^{-1}$, the map
\[ w \mapsto \left(\tfrac{1}{2}(w + P(w)),\ \tfrac{1}{2}(w - P(w))\right) \]
is a bijective parameterization of $G(M)$ by $X$ that is Lipschitzian as is its inverse.
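As an illustration only, the following sketch checks the parameterization numerically in the simplest Hilbert space $X = \mathbb{R}$ for the maximally monotone multimap $M := \partial|\cdot|$. It takes $P$ in the form used in the proof, $G(P) := \{(x+y, x-y) : (x,y) \in M\}$. The function names (`soft_threshold`, `minty_point`, `in_graph`) and the choice of $M$ are assumptions made for the example, not part of the text.

```python
def soft_threshold(w, lam=1.0):
    """Solve w in x + lam * M(x) for M = subdifferential of |.| on R."""
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0


def minty_point(w):
    """Return (x, y, P(w)) where (x, y) is the point of G(M) with x + y = w."""
    x = soft_threshold(w)   # x is the unique solution of w in x + M(x)
    y = w - x               # y is the element of M(x) selected by w
    return x, y, x - y      # P(w) = x - y, as in the proof


def in_graph(x, y, tol=1e-12):
    """Membership test for the graph of M = subdifferential of |.|."""
    if x > 0:
        return abs(y - 1.0) <= tol
    if x < 0:
        return abs(y + 1.0) <= tol
    return -1.0 - tol <= y <= 1.0 + tol


samples = [k / 4.0 for k in range(-20, 21)]
points = [minty_point(w) for w in samples]

# Each parameter w is sent to a point of the graph ...
all_on_graph = all(in_graph(x, y) for x, y, _ in points)
# ... which coincides with ((w + P(w)) / 2, (w - P(w)) / 2).
halves_match = all(
    abs(x - 0.5 * (w + p)) < 1e-12 and abs(y - 0.5 * (w - p)) < 1e-12
    for w, (x, y, p) in zip(samples, points)
)
# P is nonexpansive: |P(w1) - P(w2)| <= |w1 - w2|.
nonexpansive = all(
    abs(p1 - p2) <= abs(w1 - w2) + 1e-12
    for w1, (_, _, p1) in zip(samples, points)
    for w2, (_, _, p2) in zip(samples, points)
)
print(all_on_graph, halves_match, nonexpansive)
```

The vertical segment of the graph over $x = 0$ is swept by the parameters $w \in [-1, 1]$, which is why the graph is a Lipschitz curve although $M$ is not single-valued.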
Proof Since M and M 1 are maximally monotone, we know from Theorem 9.28 and Proposition 9.13 that P given by G.P/ WD f.x C y; x y/ W .x; y/ 2 Mg is defined on the whole of X and is nonexpansive. For any w 2 X there exists some .x; y/ 2 M such that w D x C y. Setting z WD x y D P.w/ we have . 12 .w C z/; 12 .w z// D .x; y/ 2 G.M/. Conversely, for any .x; y/ 2 G.M/, setting w WD xCy, z WD xy we have .x; y/ D . 12 .wCP.w//; 12 .wP.w///. Moreover, w 7! . 12 .w C P.w//; 12 .w P.w/// is nonexpansive and its inverse .x; y/ 7! .x C y; x y/ is also Lipschitzian. Remark The result can also be deduced from Proposition 9.13 since G.M/ is the image of G.P/ under the inverse S1 of the isomorphism of Proposition 9.13 and since G.P/ is a Lipschitzian submanifold of X X naturally parameterized by X as is the graph of any nonexpansive map. The following theorem is the prototype of several results in the same vein. Here a map F W X ! X is said to be scalarly coercive if hF.x/; xi= kxk ! 1 as kxk ! 1: This condition is satisfied when F is a continuous linear map A W X ! X that is positive definite in the sense that there exists some c > 0 such that hAx; xi c kxk2 for all x 2 X: Note that a scalarly coercive map is coercive in the sense that kF.x/k ! 1 as kxk ! 1 since kF.x/k : kxk hF.x/; xi: Theorem 9.29 (Browder, Minty) Let X be a reflexive Banach space, let F W X ! X be a monotone, radially continuous map, and let b, r 2 RC , y 2 bBX : If hF.x/; xi= kxk b for x 2 rSX , then the equation F.x/ D y has a solution x 2 rBX : In particular, a monotone, radially continuous, scalarly coercive map F is surjective. In fact, any monotone, radially continuous, and coercive map F is surjective. For the sake of simplicity we give the proof in the case when X is separable and reflexive, using the Galerkin method. Proof When X is finite dimensional, the first assertion is just a rephrasing of Corollary 2.11. 
Let .Xn /n0 be an increasing sequence of finite dimensional linear subspaces whose union W is dense in X. Let jn W Xn ! X be the canonical injection | and let pn WD jn W X ! Xn be its transpose. We identify the elements of Xn with their images by jn when it is convenient. Setting Fn WD pn ı F ı jn , we see that Fn is monotone and radially continuous, hence is continuous and for x 2 rSXn we have hFn .x/; xi D hF. jn .x//; jn .x/i b kjn .x/k D b kxk. By Corollary 2.11, given y 2 rBX , there exists some xn 2 Xn such that Fn .xn / D pn . y / and kxn k r: Since X 2 is reflexive and since F is bounding, we can find a subsequence ..xn.k/ ; F.xn.k/ /// of ..xn ; F.xn /// that weakly converges to some .x; x / 2 X X . Let us show that F.x/ D y :
For w 2 W; let m 2 N be such that w 2 Xm ; so that, for n m; we have 0 hF.xn / F.w/; jn .xn / jn .w/i D hFn .xn / Fn .w/; xn wi D h pn . y / Fn .w/; xn wi D h y F.w/; jn .xn / wi: Taking the limit over the subsequence, we get 0 h y F.w/; x wi: Since W is dense in X, for any z 2 X we can find a sequence .wn / in W with limit zI moreover, since F is maximally monotone by Proposition 9.14 and demicontinuous by Proposition 9.15, we may suppose that .F.wn // weak converges to F.z/: Passing to the limit in the last inequality with w changed into wn we obtain 0 h y F.z/; x zi: By maximal monotonicity of F we conclude that y D F.x/. Now, let us suppose F is just coercive (and monotone, radially continuous). Using Theorem 6.24 and endowing X with a norm whose dual norm is rotund, we see that for all " 20; 1 the map F" WD F C "J is scalarly coercive since for all x 2 X hF" .x/; xi hF.x/ F.0/; xi C hF.0/; xi C " kxk2 kF.0/k : kxk C " kxk2 : Since J is demicontinuous and monotone, F" is radially continuous and monotone. By the preceding we can find x" 2 X such that F.x" /C"J.x" / D 0: Then, the relation hF.x" / F.0/; x" i 0 implies that " kx" k2 D "hJ.x" /; x" i D hF.x" /; x" i kF.0/k : kx" k so that kF.x" /k D k"J.x" /k D k"x" k is bounded. Since F is coercive, we get that .x" / is bounded for " 20; 1 and .F.x" // ! 0 as " ! 0C . Then, for any weak limit point x of .x" /; by the standard trick for monotone operators we get F.x/ D 0: Changing F into F y we see that F is onto. Remark The assumption that X is reflexive cannot be dropped as the case F D J shows. Exercise If X is a finite dimensional space, show that a surjective monotone map F W X ! X is coercive. [Hint: assuming the contrary, find sequences .rn / ! 1; .un / ! u in SX such that .F.rn un // converges to some v 2 X ; take x 2 X such that u C v D F.x/ and get a contradiction to the relation hF.rn un / F.x/; rn un xi 0:] Exercise Let X be a uniformly convex and uniformly smooth Banach space and let F W X ! 
X be a continuous monotone map such that F 1 is locally bounded in the
sense that each $x^* \in X^*$ has a neighborhood $V$ such that $F^{-1}(V)$ is bounded. Prove that $F(X) = X^*$. [See [57, Thm 7].]

Let us apply the preceding general solvability result to the case of partial differential equations in divergence form. For simplicity, we restrict our attention to the case of operators of order two over an open subset $\Omega$ of $\mathbb{R}^d$:
\[ F(u)(x) := -\sum_{i=1}^{d} D_i F_i(x, Du(x)), \qquad x \in \Omega,\ u \in W^1_p(\Omega), \]
where for $i \in \mathbb{N}_d$ the function $F_i : \Omega \times \mathbb{R}^d \to \mathbb{R}$ is measurable in its first variable, continuous in its second variable, and satisfies the following conditions for some $p > 1$, $b$, $c > 0$, some $g$, $h \in L_q(\Omega)$ with $q := p/(p-1)$:
\[ |F_i(x,y)| \le b\|y\|^{p-1} + g(x), \qquad x \in \Omega,\ y \in \mathbb{R}^d, \tag{9.60} \]
\[ \sum_{i=1}^{d} F_i(x,y)\,y_i \ge c\|y\|^{p} - h(x), \qquad x \in \Omega,\ y \in \mathbb{R}^d, \tag{9.61} \]
\[ \sum_{i=1}^{d} \left(F_i(x,y) - F_i(x,z)\right)(y_i - z_i) \ge 0, \qquad x \in \Omega,\ y, z \in \mathbb{R}^d, \tag{9.62} \]
with $y := (y_i)$, $z := (z_i)$. Note that (9.62) replaces the ellipticity condition of linear elliptic equations. We introduce the generalized Dirichlet form $a(\cdot,\cdot)$ on $W^1_p(\Omega)$ given by
\[ a(u,v) = \sum_{i=1}^{d} \int_\Omega F_i(x, Du(x))\,D_i v(x)\,dx = \sum_{i=1}^{d} \langle F_i(\cdot, Du(\cdot)), D_i v(\cdot)\rangle \]
for $u, v \in W^1_p(\Omega)$. Our assumptions ensure that it is well defined since $F_i(\cdot, Du(\cdot)) \in L_q(\Omega)$. Moreover, it satisfies the inequality
\[ |a(u,v)| \le k(\|u\|_{1,p})\,\|v\|_{1,p} \tag{9.63} \]
for some function $k$ depending on $b$ and $g$. In particular, for each $u \in W^1_p(\Omega)$ the map $F(u) := a(u,\cdot)$ is a continuous linear form on $W^1_p(\Omega)$ and on any closed linear subspace $X$ containing $W^1_{p,0}(\Omega)$. Note that the choice of $X$ incorporates boundary conditions (Dirichlet conditions for $X = W^1_{p,0}(\Omega)$).

Let us show that the map $F : X \to X^*$ defined above satisfies the assumptions of Theorem 9.29. Inequality (9.60) ensures that $G : w \mapsto (F_i(\cdot, w(\cdot)))$ is bounding, i.e. maps bounded subsets of $L_p(\Omega)$ into bounded subsets of $L_q(\Omega)^d$. By a general property of Nemytskii operators, this implies that $G$ is continuous from $L_p(\Omega)$ into $L_q(\Omega)^d$, hence that $F$ is continuous from $X$ into $X^*$. Assumption (9.61) entails scalar coercivity of $F$:
\[ a(u,u) \ge c \int_\Omega \|Du(x)\|^p\,dx - \|h\|_q \|Du\|_p, \]
\[ \frac{\langle F(u), u\rangle}{\|u\|_{1,p}} = \frac{a(u,u)}{\|u\|_{1,p}} \ge c\,\|u\|_{1,p}^{p-1} - \|h\|_q \to \infty \quad \text{as } \|u\|_{1,p} \to \infty. \]
Finally, the monotonicity of $F$ can be derived from (9.62): for $u, v \in X$,
\[ \langle F(u) - F(v), u - v\rangle = a(u, u - v) - a(v, u - v) = \sum_{i=1}^{d} \int_\Omega \left(F_i(\cdot, Du(\cdot)) - F_i(\cdot, Dv(\cdot))\right)\left(D_i u(\cdot) - D_i v(\cdot)\right) \ge 0. \]
Thus Theorem 9.29 can be applied. We can conclude as in the following statement.

Corollary 9.11 Under assumptions (9.60), (9.61) and (9.62), for any $f$ in the dual of $W^1_{p,0}(\Omega)$ the equation $F(u) = f$ has a solution.

In particular, this result applies to the nonlinear model equation
\[ -\sum_{i=1}^{d} D_i\left(|D_i u|^{p-2} D_i u\right) + u = f. \]
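The monotonicity of the operator attached to the model equation can be observed numerically. The sketch below is only an assumption-laden illustration: it replaces $\Omega$ by the interval $]0,1[$, uses a uniform finite-difference grid with zero boundary values, and evaluates a discrete analogue of the Dirichlet form, including the zero-order term $u$. The names `phi`, `dirichlet_form` and `random_grid_function` are hypothetical helpers introduced for the example.

```python
import random


def phi(s, p):
    """Model flux s -> |s|^(p-2) * s (one space dimension)."""
    return 0.0 if s == 0.0 else abs(s) ** (p - 2) * s


def dirichlet_form(u, v, p, h):
    """Discrete a(u, v): sum of phi(Du) * Dv * h plus the zero-order term u * v * h."""
    total = 0.0
    for j in range(len(u) - 1):
        du = (u[j + 1] - u[j]) / h
        dv = (v[j + 1] - v[j]) / h
        total += phi(du, p) * dv * h
    total += h * sum(a * b for a, b in zip(u, v))
    return total


def random_grid_function(n, rng):
    """Random nodal values on n + 1 nodes, vanishing at both end points."""
    return [0.0] + [rng.uniform(-1.0, 1.0) for _ in range(n - 1)] + [0.0]


rng = random.Random(0)      # fixed seed so that the run is reproducible
n, p = 50, 3.0
h = 1.0 / n

min_gap = float("inf")
for _ in range(200):
    u = random_grid_function(n, rng)
    v = random_grid_function(n, rng)
    d = [a - b for a, b in zip(u, v)]
    # <F(u) - F(v), u - v> = a(u, u - v) - a(v, u - v)
    gap = dirichlet_form(u, d, p, h) - dirichlet_form(v, d, p, h)
    min_gap = min(min_gap, gap)

print(min_gap >= -1e-9)
```

The test is a sanity check on random samples, not a proof; the inequality itself follows from the monotonicity of $s \mapsto |s|^{p-2}s$ applied to each difference quotient.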
Exercises 1. Let H be a Hilbert space, let X WD L2 .RC ; H/, and let A be the densely defined operator with domain D.A/ WD H 1 .RC ; H/ given by A.x/ WD x0 : Show that A is maximally monotone. [Hint: to prove maximality show that for any u 2 X there exists some x 2 D.A/ such that x0 C x D u.] 2 . (Debrunner-Flor) Let X be a Banach space, let M W X X be a monotone operator and let C be a weak compact, convex subset of X containing the range R.M/ of M: Let f W C ! X be a weak to norm continuous map. Show that there exists some x 2 C such that hu C f .x /; v x i 0
8.u; v/ 2 M:
[See [209].] 3. Prove that the result of the preceding exercise implies Brouwer’s Fixed Point Theorem. [Hint: given a compact convex subset K of a Euclidean space E and a continuous map g W K ! K; take for C a closed ball of E containing K in its
interior, M WD IC and f WD g ı pK ; where pK is the projection map onto K and show that if x 2 C is such that hu C f .x /; u x i 0 for all u 2 C; then one has x D f .x / and x 2 K; g.x / D x .] 4 . (Hammerstein equation) Let f W RC R ! R be continuous, nondecreasing in its second variable and measurable in its first variable, with f .; 0/ 2 X WD L2 .RC /; let g 2 X and let k 2 L1 .R/: Prove that the integral equation Z u.s/ C 0
1
k.r s/f .r; u.r//dr D g.s/
s 2 RC
has a solution u 2 X when the following assumptions are satisfied: (a) there exist a 2 X and b 2 R such that j f .r; t/j a.r/ C b jtj for all .r; t/ 2 RC R; (b) there exist c 2 P and a0 2 X such that f .r; t/ ct2 a0 .r/ jtj for all .r; t/ 2 RC R; R1 (c) Rsetting K.v/.s/ WD 0 k.r s/v.r/dr for v 2 X, s 2 RC one has 1 0 K.v/.s/v.s/ds 0. [See [92, Example 11.2] [57, 58, 266].] 5 . (Kenderov) Let M W X X be a maximally monotone multimap on a reflexive Banach space such that int(D.M/) is nonempty. Show that there exists a dense Gı subset, i.e. a countable intersection of open subsets, G of int(D.M/) such that M is single-valued on G and every selection of M is continuous at each point of G.
9.4.7 Sums of Maximally Monotone Multimaps Now we intend to show that representative functions can be used to deal with sums of maximally monotone operators. They are also useful for compositions with linear maps, but we skip such a (related) subject. We recall that for h; k W X X ! R1 the function .h2 k/ W X X ! R1 is defined by .h2 k/.x; x / WD inffh.x; u / C k.x; v / W u C v D x g: Theorem 9.30 Let X be a reflexive Banach space, let M, N W X X be maximally monotone operators satisfying the condition RC .co D.M/ co D.N// D X;
(9.64)
and let h 2 HM , k 2 HN be such that fM h pM ; fN k pN . Then h2 k D .hb 2 kb /b is in HMCN and x 2 .M C N/.x/ if and only if .h2 k/.x; x / D hx; x i: Moreover, M C N is maximally monotone.
Proof Since h 2 HM , k 2 HN we have h2 k c2 c D c: Moreover, for x 2 X, x 2 .M C N/.x/; say x D u C v with u 2 M.x/, v 2 N.x/, we have .h2 k/.x; x / h.x; u / C k.x; v / D c.x; u / C c.x; v / D c.x; x /, hence .h2 k/.x; x / D c.x; x /: In Proposition 6.28 we take Y WD X . Since co D.M/ X .dom hb / and co D.N/ X .dom kb / as seen by Corollary 9.9, condition (9.64) entails condition (6.25). Proposition 6.28 ensures that .hb 2 kb /b D hbb 2 kbb D h2 k;
(9.65)
so that h2 k D .h2 k/bb and h2 k 2 HMCN : If .h2 k/.x; x / D c.x; x /; by Proposition 6.28 there exist u , v such that x D u C v and .h2 k/.x; x / D h.x; u / C k.x; v /: Then 0 D .h2 k/.x; x / hx; x i D fh.x; u / hx; u ig C fk.x; v / hx; v ig and since each of the bracketed terms is nonnegative, both of them are zero and so, by the implication (a))(b) of Theorem 9.26, u 2 M.x/, v 2 N.x/ and x 2 .M C N/.x/: We also have pM D fMb hb pbM D fM and similarly pN kb fN : Then hb 2 kb c2 c D c and since co D.M/ X .dom hb /, co D.N/ X .dom kb /, as above we have hb 2 kb D .hb 2 kb /bb , hence hb 2 kb 2 H. Since .hb 2 kb /b D h2 k 2 H it follows from Theorem 9.25 that M C N D f.x; x / W .hb 2 kb /b .x; x / D c.x; x /g is maximally monotone. Corollary 9.12 (Rockafellar) Let X be a reflexive Banach space, let M, N W X X be maximally monotone multimaps. If the following condition is satisfied, in particular if D.M/ \ int.D.N// ¤ ¿, then M C N is maximally monotone: co.D.M// \ intco.D.N// ¤ ¿: Proof Let C WD co.D.M//, D WD co.D.N// and let a 2 C \ int.D/. Let r > 0 be such that a C rBX D. Then we have rBX C D, hence RC .C D/ D X and the theorem applies. Let us give an extension of Theorem 9.29 to multivalued maps. For this purpose, we say that a multimap M W X X is coercive if there exists a function W RC ! R satisfying limr!1 .r/ D 1 such that kx k .kxk/ for all .x; x / 2 M with kxk large:
(9.66)
This condition is obviously a generalization of coercivity to the multivalued case. Theorem 9.31 Let X be a reflexive space. If M W X X is maximally monotone and coercive then M is surjective. Moreover, if F W X ! X is a single-valued, monotone, radially weak* continuous map and if M C F is coercive, then, M C F is surjective: for all x 2 X
there exists some x 2 D.M/ such that x 2 M.x/ C F.x/: Proof We endow X with a strictly convex norm whose dual norm is strictly convex, so that the duality map J is single-valued. Without loss of generality we assume that .0; 0/ 2 M: Given x 2 X we have to find some x 2 D.M/ such that x 2 M.x/ C F.x/. By Theorem 9.28, since "1 M is maximally monotone, for any " > 0 there exists some x" 2 D.M/ such that x 2 M.x" / C "J.x" /: Then, since M is monotone and .0; 0/ 2 M, we have hx "J.x" /; x" i 0; hence " kx" k2 kx k : kx" k and " kx" k kx k. Then, the net .x "J.x" //">0 is bounded, and, since x "J.x" / 2 M.x" /, by coercivity, we infer that .x" / is bounded as " ! 0: We can find a sequence ."n / ! 0 and a weak limit x of the sequence .x"n /. Then, by Proposition 9.15 (c), x D limn .x "n J.x"n // belongs to M.x/: The second assertion follows from the fact that F is maximally monotone by Proposition 9.14 and that M C F is maximally monotone too by Corollary 9.12 since the domain of F is X.
Exercises 1. Show that in Theorem 9.28 the duality map J can be replaced with Jp WD 1p @ kkp with p 21; 1Œ. Such a modified duality map is adapted to spaces such as Lp spaces. 2. For a maximally monotone multimap M W X X ; show that for all h 2 HM one has fM h pM : 3. Suppose that for some monotone multimap M one has fM 2 HM : Then prove that for some .x; x / 2 X X one has fM .x; x / D hx; x i if and only if pM .x; x / D hx; x i. [Hint: use (9.57).] 4. Let h 2 H be such that hb 2 H. Show that Mh WD f.x; x / W h.x; x / D c.x; x /g is maximally monotone. 5. (Kirszbraun-Valentine) Let D be a nonempty subset of a Hilbert space X: Show bW that for any nonexpansive map F W D ! X there exists a nonexpansive map F X ! X whose restriction to D is F. [Hint: associate to F a monotone multimap b extending M as in Proposition 9.13 and take a maximally monotone multimap M b M and the associated nonexpansive map F.] 6. Let X WD L2 .R/, D.F/ WD fx 2 W21 .R/ W limjtj!1 x.t/ D 0g and let F W x 7! x0 for x 2 D.F/; G WD F: Verify that F and G are maximally monotone but F C G
is not maximally monotone. [Hint: given y 2 X one has x C F.x/ D y for x given Rt by x.t/ D 1 est y.s/ds, so that F is maximally monotone, whereas F C G has a strict extension by 0 on X.]
9.4.8 Variational Inequalities

Several phenomena or processes have a one-sided character, either because time is involved or because obstacles are present. Models of such phenomena cannot be designed in the form of equations. Inequalities are more appropriate: see the motivation concerning industrial mathematics and the historical remarks provided in [88, 144, 145, 175, 188], [231, Section 9.1]. In this subsection we study such inequalities in which a subset $C$ of a Banach space $X$ and a map $F : C \to X^*$ are involved. Hereafter we say that $F : C \to X^*$ is $C$-(scalarly) coercive if there exists some $v \in C$ such that
\[ \frac{\langle F(w) - F(v), w - v\rangle}{\|w - v\|} \to +\infty \quad \text{as } \|w\| \to +\infty,\ w \in C. \tag{9.67} \]
This condition is independent of the choice of $v \in C$ since it is equivalent to $(1/\|w\|)\langle F(w), w\rangle \to +\infty$ as $\|w\| \to +\infty$, $w \in C$. Thus, when $C = X$ it coincides with scalar coercivity. As already observed, this condition implies that $F$ is coercive on $C$, i.e. that $\|F(x)\| \to \infty$ as $\|x\| \to \infty$ with $x \in C$. It is satisfied when $F$ is the restriction to $C$ of a continuous linear map $A : X \to X^*$ that is positive definite in the sense that there exists some $c > 0$ such that $\langle Ax, x\rangle \ge c\|x\|^2$ for all $x \in X$. This condition plays a role in one of the steps of the proof of the main result concerning such inequations. It is as follows.

Theorem 9.32 Let $C$ be a nonempty, closed, convex subset of a reflexive Banach space $X$, let $f \in X^*$, and let $F : C \to X^*$ be a monotone map that is radially weak* continuous. If either $C$ is bounded or $F$ is $C$-coercive, there exists some $u \in C$ such that
\[ \langle F(u), w - u\rangle \ge \langle f, w - u\rangle \qquad \forall w \in C. \tag{9.68} \]
Such an inequation is called a variational inequality. It can be interpreted as the search of $u \in C$ satisfying $f - F(u) \in N(C, u)$, the normal cone to $C$ at $u$. When $C = X$ relation (9.68) is reduced to the equation $F(u) = f$.

Remark When $F$ is strictly monotone in the sense that $\langle F(w) - F(z), w - z\rangle > 0$ whenever $w \ne z$, the solution is unique: if $u$ and $u'$ are solutions, taking $w := u'$ in (9.68) and then writing (9.68) with $u'$ in place of $u$ and taking $w = u$ we obtain the two inequalities
\[ \langle F(u), u' - u\rangle \ge \langle f, u' - u\rangle \quad \text{and} \quad \langle F(u'), u - u'\rangle \ge \langle f, u - u'\rangle, \]
hence $\langle F(u) - F(u'), u' - u\rangle \ge 0$ and $u = u'$.
When $X$ is a Hilbert space and $F$ is Lipschitzian and strongly monotone, a simple proof can be given by using the contraction fixed point theorem.

Theorem 9.33 (Stampacchia) Suppose $X$ is a Hilbert space and $F$ is Lipschitzian and strongly monotone in the sense that there exists some $c > 0$ such that for all $v$, $w \in C$ one has $\langle F(v) - F(w), v - w\rangle \ge c\|v - w\|^2$. Then (9.68) has a unique solution $u \in C$. In particular, when $F$ is the restriction to $C$ of a positive definite continuous linear map $A : X \to X$, (9.68) has a unique solution $u \in C$.

Proof Let $\ell$ be the Lipschitz rate of $F$ and let $c > 0$ be as in the monotonicity assumption. We pick $t > 0$ in such a way that $0 < 1 - 2ct + \ell^2 t^2 < 1$ and we set $k := (1 - 2ct + \ell^2 t^2)^{1/2}$. We observe that (9.68) can be rewritten
\[ \langle tf - tF(u) + u - u, w - u\rangle \le 0 \qquad \forall w \in C \]
or $u = p_C(tf - tF(u) + u)$, where $p_C$ is the projection map from $X$ to $C$. Let $g : C \to C$ be given by $g(v) := p_C(tf - tF(v) + v)$ for $v \in C$. Since $p_C$ is nonexpansive, for $v$, $w \in C$ we have $\|g(v) - g(w)\| \le \|v - w - t(F(v) - F(w))\|$, hence
\[ \|g(v) - g(w)\|^2 \le \|v - w\|^2 - 2t\langle v - w \mid F(v) - F(w)\rangle + t^2\|F(v) - F(w)\|^2 \le (1 - 2ct + \ell^2 t^2)\|v - w\|^2 = k^2\|v - w\|^2 \]
with $k \in\, ]0,1[$ by our choice of $t$. The Contraction Theorem ensures that there exists some $u \in C$ such that $g(u) = u$. That means that $u$ is a solution to (9.68). Uniqueness stems from uniqueness of the solution of the equation $g(u) = u$.

Remark The Contraction Theorem ensures that the solution $u$ is the limit of the sequence $(u_n)$ obtained by the following algorithm, in which $t$ is chosen in such a way that $1 - 2ct + \ell^2 t^2 \in\, ]0,1[$:
\[ u_{n+1} = p_C\left(u_n + t(f - F(u_n))\right). \]
In the general case, a preliminary result plays a key role.
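Before turning to the general case, the projected iteration $u_{n+1} = p_C(u_n + t(f - F(u_n)))$ of the Hilbert space setting can be tried on a small example. The sketch below is a minimal illustration under assumed data: $X = \mathbb{R}^2$, $C = [0,1]^2$, $F(u) = Au$ with $A = \begin{pmatrix} 2 & 1 \\ -1 & 2 \end{pmatrix}$ (so that $c = 2$ and $\ell = \sqrt{5}$), $f = (4, 0)$ and $t = c/\ell^2 = 0.4$, which gives $1 - 2ct + \ell^2 t^2 = 0.2$. The function names are hypothetical.

```python
def project_box(x, lo=0.0, hi=1.0):
    """Projection p_C onto the box C = [lo, hi]^2 (componentwise clipping)."""
    return [min(max(xi, lo), hi) for xi in x]


def F(u):
    """Strongly monotone linear map u -> A u with A = [[2, 1], [-1, 2]]."""
    return [2.0 * u[0] + u[1], -u[0] + 2.0 * u[1]]


def solve_vi(f, t, steps=100):
    """Iterate u <- p_C(u + t * (f - F(u))) starting from the origin."""
    u = [0.0, 0.0]
    for _ in range(steps):
        Fu = F(u)
        u = project_box([u[i] + t * (f[i] - Fu[i]) for i in range(2)])
    return u


f = [4.0, 0.0]
u = solve_vi(f, t=0.4)

# Check (9.68) at the vertices of C; this suffices since w -> <F(u) - f, w - u> is affine.
residual = [F(u)[i] - f[i] for i in range(2)]
vertices = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
vi_values = [sum(residual[i] * (w[i] - u[i]) for i in range(2)) for w in vertices]
print(u, min(vi_values) >= -1e-9)
```

Here the first constraint is active at the limit (the first component sits on the boundary of $C$) while the second component is determined by the equation $(F(u) - f)_2 = 0$, so the example shows both regimes of a variational inequality at once.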
Lemma 9.7 (Minty) Under the assumptions of Theorem 9.32 the variational inequality (9.68) is equivalent to finding $u \in C$ such that
\[ \langle F(w), w - u\rangle \ge \langle f, w - u\rangle \qquad \forall w \in C. \tag{9.69} \]
Proof Let $u \in C$ be a solution to (9.68). Then by the monotonicity of $F$ we have
\[ \langle F(w), w - u\rangle = \langle F(w) - F(u), w - u\rangle + \langle F(u), w - u\rangle \ge \langle F(u), w - u\rangle \ge \langle f, w - u\rangle \]
for all $w \in C$, so that $u$ is a solution to (9.69). Conversely, if $u$ is a solution to (9.69), for all $v \in C$ and $t \in\, ]0,1[$, taking $w := u + t(v - u) \in C$ in (9.69) and simplifying, we get $\langle F(u + t(v - u)), v - u\rangle \ge \langle f, v - u\rangle$. Using weak* radial continuity of $F$ we get $\langle F(u), v - u\rangle \ge \langle f, v - u\rangle$ for all $v \in C$.

A first step in the proof of Theorem 9.32 is given in the next lemma.

Lemma 9.8 If $X$ is finite dimensional and if $C$ is a nonempty compact convex subset of $X$, then, for all $f \in X^*$, the variational inequality (9.68) has a solution $u$.

Proof Using a scalar product $\langle \cdot \mid \cdot \rangle$ on $X$ and identifying $X$ and $X^*$ we see that the continuous map $x \mapsto p_C(x - F(x) + f)$ from $C$ into $C$ has a fixed point $u$ by Brouwer's Theorem. The characterization of $p_C(u - F(u) + f)$ yields
\[ \langle (u - F(u) + f) - u, w - u\rangle \le 0 \qquad \forall w \in C, \]
so that $u$ is a solution to (9.68).
Lemma 9.9 If $X$ is finite dimensional, if $C$ is a nonempty closed convex subset of $X$, and if $F$ is monotone, continuous and $C$-coercive, then for all $f \in X^*$ the variational inequality (9.68) has a solution.

Proof Changing $F$ into $F - f$, which is still monotone and $C$-coercive, we may suppose $f = 0$. For $r \in \mathbb{R}_+$ large enough we set $C_r := C \cap rB_X$ and we denote by $u_r \in C_r$ a solution to the variational inequality
\[ \langle F(u_r), v - u_r\rangle \ge 0 \qquad \forall v \in C_r. \tag{9.70} \]
Let $v \in C$ be as in (9.67) and let us show that for $r \ge \|v\|$ large enough we have $\|u_r\| < r$. If, on the contrary, we have $\|u_r\| = r$ we get
\[ \langle F(u_r), v - u_r\rangle = \langle F(v), v - u_r\rangle - \langle F(v) - F(u_r), v - u_r\rangle \le \|v - u_r\| \left( \|F(v)\| - \frac{\langle F(v) - F(u_r), v - u_r\rangle}{\|v - u_r\|} \right) < 0 \]
by the $C$-coercivity assumption, a contradiction with (9.70) since $v \in C_r$. Thus $\|u_r\| < r$ for $r$ large enough and for all $w \in C$, taking $t \in\, ]0,1[$ small enough we have $v := u_r + t(w - u_r) \in C_r$, hence $t\langle F(u_r), w - u_r\rangle = \langle F(u_r), v - u_r\rangle \ge 0$ and $u_r$ is a solution to (9.68) with $f = 0$.
Proof of Theorem 9.32 This uses a method known as the Ritz-Galerkin method. We first consider the case when $C$ is bounded, and without loss of generality we suppose $f = 0$. For $v \in C$ let $C(v) := \{w \in C : \langle F(v), v - w\rangle \ge 0\}$. Since $C$ is weakly compact and $C(v)$ is weakly closed in $C$, $C(v)$ is weakly compact. We want to find some $u$ in $C(v)$ for all $v \in C$, i.e. we want to prove that $\bigcap_{v \in C} C(v)$ is nonempty. If, on the contrary, this intersection is empty, then by compactness there exists a finite family $v_1, \dots, v_n$ of elements of $C$ such that $C(v_1) \cap \dots \cap C(v_n) = \emptyset$. Let $Y$ be the linear space spanned by $v_1, \dots, v_n$ and let $j : Y \to X$ be the canonical injection. Its transpose $j^{\top}$ is the restriction map $r : X^* \to Y^*$ given by $r(x^*) = x^*|_Y$. The map $F_Y := r \circ F \circ j : Y \to Y^*$ is still monotone and radially continuous. Thus, the preceding lemmas yield some $u \in C \cap Y$ such that $\langle F_Y(v), v - u\rangle \ge 0$ for all $v \in C \cap Y$. Considering $u$ as an element of $X$, this relation implies that $u \in C(v_1) \cap \dots \cap C(v_n)$, a contradiction. Thus $\bigcap_{v \in C} C(v) \ne \emptyset$ and any element $u$ of this intersection is a solution to (9.69), hence to (9.68) by Lemma 9.7. When $C$ is unbounded, we use the $C$-coercivity assumption as in the preceding lemma to prove the existence of a solution.

Corollary 9.13 If $X$ is a reflexive Banach space and $F : X \to X^*$ is a monotone scalarly coercive map that is radially weak* continuous then $F$ is surjective.

Proof Taking $C := X$, for any $f \in X^*$ we can find $u \in X$ such that $\langle F(u) - f, u - w\rangle \le 0$ for all $w \in X$. Thus $F(u) = f$.

Exercise If in Stampacchia's theorem the map $A := F$ is linear, continuous, and symmetric in the sense that $\langle Av, w\rangle = \langle Aw, v\rangle$ for all $v, w \in X$, show that the solution $u$ of (9.68) is the minimizer on $C$ of the convex function $h : v \mapsto \frac{1}{2}\langle Av, v\rangle - \langle f, v\rangle$.

Exercise (Penalty Method) Assume $X$, $C$, $f$ are as in Theorem 9.32, with $0 \in C$. Suppose $j : X \to \mathbb{R}$ is a convex function of class $C^1$, $F := J := Dj$ and $P : X \to X^*$ is a continuous monotone operator such that $P(w) = 0$ is equivalent to $w \in C$. Show that for every $r > 0$ the equation
\[ F(u_r) + \frac{1}{r} P(u_r) = 0 \]
has a solution $u_r \in C$ and that for a sequence $(r_n) \to 0_+$ the sequence $(u_{r_n})$ converges weakly to a solution $u$ of (9.68). [See [231, Thm 9.3.7].]

Application Let $\Omega$ be an open subset of $\mathbb{R}^d$, let $p \in\, ]2, \infty[$ and let $F : W^1_{p,0}(\Omega) \to (W^1_{p,0}(\Omega))^*$ be given by
\[ \langle F(u), v\rangle := \int_\Omega \|\nabla u\|^{p-2} \langle \nabla u \mid \nabla v\rangle, \qquad u, v \in W^1_{p,0}(\Omega). \]
Using the fact that the function $j : x \mapsto (1/p)\|x\|^p$ is convex, so that its subdifferential $\partial j : x \mapsto \|x\|^{p-2}x$ is monotone, one can see that $F$ is monotone. The nonlinear map $F$ plays an important role as an example of a partial differential equation.
Fleet the time carelessly, as they did in the golden world. W. Shakespeare, As you like it. I have measured out my life with coffee spoons. T.S. Eliot.
Abstract In this chapter, problems involving time are considered. Those expressed by means of ordinary differential equations are the simplest ones. In contrast to problems involving partial derivatives, they do not require the function spaces introduced in the preceding chapter. But for parabolic problems and hyperbolic problems, Sobolev spaces are again crucial tools. The notion of a semigroup forms a natural and unifying framework for such problems. Notions of dissipativity and monotonicity again show their usefulness.
We devote this chapter to problems involving time. The equations describing the evolution problem being studied may be ordinary differential equations or partial differential equations. As in [2] we gather these two types of problem together in order to underline the analogies and the differences between them; on the other hand, we consider neither integral equations nor delay differential equations or equations with deviated arguments. We pay particular attention to two equations (or rather systems). The first one is the heat equation that rules the propagation of heat in a medium:
\[ \frac{\partial u}{\partial t}(x,t) - \Delta u(x,t) = 0, \quad u(x,0) = g(x), \qquad (x,t) \in \Omega \times P, \]
where $g$ is the temperature of the medium at time $t = 0$, which is supposed to be known, and $u(x,t)$ represents the temperature of point $x$ in the open subset $\Omega$ of $\mathbb{R}^d$ at time $t \in P :=\, ]0, \infty[$. Here the Laplacian $\Delta$ is taken with respect to the space variable $x$. This equation is a typical example of a parabolic second-order equation. Other examples are the Fokker-Planck equation and the Kolmogorov equation for the probabilistic study of diffusion processes.

The second type of equation we study is the wave equation
\[ \frac{\partial^2 u}{\partial t^2}(x,t) - \Delta u(x,t) = 0, \quad u(x,0) = g(x), \quad \frac{\partial u}{\partial t}(x,0) = h(x), \qquad (x,t) \in \Omega \times P, \]
that is so often studied in geophysics for petrol exploration. Under the form of the Schrödinger equation it governs the wave-like behavior of matter in quantum mechanics. This equation is a higher dimensional version of the simpler vibrating string equation @2 u @2 u @u .x; 0/ D h.x/ .x; t/ 2 T P .x; t/ .x; t/ D 0; u.x; 0/ D g.x/; @t2 @x2 @t giving the shape of a vibrating string over an interval T WD Œ0; Œ of R; it was solved by J. Le Rond d’Alembert in the eighteenth century. Such equations are typical examples of hyperbolic equations. We just describe elementary aspects of the approach to such equations, referring to specialized monographs for a complete study. We discard the many nonlinear evolution problems involving partial differential equations besides the ones involving dissipativity.
10.1 Ordinary Differential Equations
Whole books have been devoted to ordinary differential equations; see for instance [11, 79, 91, 93, 152, 154, 194, 200, 244]. Thus, in this section we only present the main facts about this field, which has been very active during the last three centuries and which still experiences important discoveries. The understanding of turbulence and chaos are examples of these new insights. We also discard stability questions that are so important for mechanical phenomena. We concentrate our study on first-order differential equations since higher order equations can be reduced to first-order equations by introducing auxiliary unknown maps. In the case of the second-order differential equation
$$x''(t) = f(t, x(t), x'(t)), \qquad x(t_0) = x_0, \quad x'(t_0) = y_0$$
we substitute it with the differential equation
$$(x'(t), y'(t)) = (y(t), f(t, x(t), y(t))), \qquad (x(t_0), y(t_0)) = (x_0, y_0)$$
and we observe that the so-called right-hand side of the latter has the same regularity as the right-hand side of the former.
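The reduction just described can be sketched numerically. The following is only an illustration under stated assumptions: the helper names `reduce_second_order` and `rk4` are hypothetical, the classical Runge-Kutta scheme is an arbitrary choice of integrator, and the test equation $x'' = -x$ with $x(0) = 1$, $x'(0) = 0$ (solution $\cos$) is not taken from the text.

```python
import math

def reduce_second_order(f):
    """Turn x'' = f(t, x, x') into the first-order field (x, y)' = (y, f(t, x, y))."""
    def field(t, state):
        x, y = state
        return (y, f(t, x, y))
    return field

def rk4(field, t0, state0, t1, steps):
    """Integrate state' = field(t, state) from t0 to t1 (classical Runge-Kutta)."""
    h = (t1 - t0) / steps
    t, state = t0, tuple(state0)
    for _ in range(steps):
        k1 = field(t, state)
        k2 = field(t + h / 2, tuple(s + h / 2 * k for s, k in zip(state, k1)))
        k3 = field(t + h / 2, tuple(s + h / 2 * k for s, k in zip(state, k2)))
        k4 = field(t + h, tuple(s + h * k for s, k in zip(state, k3)))
        state = tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
        t += h
    return state

# Assumed test case: x'' = -x, x(0) = 1, x'(0) = 0, whose solution is cos.
field = reduce_second_order(lambda t, x, y: -x)
x1, y1 = rk4(field, 0.0, (1.0, 0.0), 1.0, 1000)
print(x1, math.cos(1.0))    # position at t = 1 versus the exact value
print(y1, -math.sin(1.0))   # auxiliary unknown y = x' versus the exact value
```

The auxiliary unknown $y$ plays the role of $x'$, so the integrator only ever sees a first-order system.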
10.1.1 Separation of Variables
In some cases it is possible to get an explicit solution to a scalar differential equation of the form
$$x'(t) = f(x(t)), \qquad x(t_0) = x_0. \tag{10.1}$$
In the traditional formalism one rewrites this equation in the form
$$\frac{dx}{f(x)} = dt \tag{10.2}$$
and if one knows a primitive $g$ of $1/f$, one gets $g(x) - g(x_0) = t - t_0$. If, moreover, $g$ has an inverse $h$ on some interval $T$, one obtains $x(t) = h(t - t_0 + g(x_0))$. Relation (10.2) could be given a rigorous meaning by using the formalism of differential forms. We rather assume that a solution $x$ of equation (10.1) has an inverse $y$ on some interval containing $t_0$. Then we have
$$y'(x(t)) = \frac{1}{x'(t)} = \frac{1}{f(x(t))}.$$
Thus, if we can compute a primitive $g$ of $1/f$, we may take $y(x) = g(x) - g(x_0) + t_0$ and then get $x(\cdot)$ by taking the inverse of $y(\cdot)$. The following simple example shows that in general one cannot expect to get a solution defined everywhere.
Example Let us consider equation (10.1) with $f$ given by $f(x) = x^2$, $t_0 := 0$. If $x_0 = 0$, then $x(\cdot) = 0$ is a solution to (10.1). If $x_0 \neq 0$, for $t$ close to $0$ one has $x(t) \neq 0$ and $g : x \mapsto -1/x + 1/x_0$ is the primitive of $1/f$ that takes the value $0$ for $x = x_0$. The relation $-1/x(t) + 1/x_0 = t$ yields $x(t) = x_0 (1 - x_0 t)^{-1}$. We observe that although $f$ is of class $C^\infty$ on $\mathbb{R}$, the solution is just defined on the interval $]-\infty, 1/x_0[$ if $x_0 > 0$ and on the interval $]1/x_0, \infty[$ if $x_0 < 0$.
The classical method of separation of variables for ordinary differential equations can be extended to partial differential equations. As an example, let us consider the heat equation on some bounded open subset $\Omega$ of $\mathbb{R}^d$:
$$\frac{\partial u}{\partial t}(x,t) - \Delta u(x,t) = 0, \qquad (x,t) \in \Omega \times \mathbb{P}, \tag{10.3}$$
$$u(x,t) = 0, \qquad (x,t) \in \partial\Omega \times \mathbb{P}, \tag{10.4}$$
$$u(x,0) = g(x), \qquad x \in \Omega, \tag{10.5}$$
where $g : \Omega \to \mathbb{R}$ is given and the Laplacian bears on the space variable $x$ as above. We look for a solution of the form
$$u(x,t) := v(t) w(x), \qquad (x,t) \in \Omega \times \mathbb{R}_+. \tag{10.6}$$
Computing $\frac{\partial u}{\partial t}$ and $\Delta u$ we get
$$v'(t) w(x) - v(t) \Delta w(x) = 0.$$
Assuming that $w$ is an eigenfunction of $\Delta$, i.e. that for some $\lambda \in \mathbb{R}$ we have $\Delta w(x) = \lambda w(x)$, and taking a solution $v$ to the differential equation $v'(t) = \lambda v(t)$, so that $v(t) = c e^{\lambda t}$ for some $c \in \mathbb{R}$, we obtain that the function $u$ given by (10.6) solves (10.3). Assuming $g$ can be represented as the sum of a series
$$g = \sum_{n=0}^{\infty} c_n w_n$$
with $c_n \in \mathbb{R}$, $w_n$ being an eigenfunction of $\Delta$ with associated eigenvalue $\lambda_n$, we can expect that $u$ given by
$$u(x,t) = \sum_{n=0}^{\infty} c_n e^{\lambda_n t} w_n(x)$$
is a solution of the heat equation. Of course, the meaning of this series has to be made precise, as does the one for the initial value $g$.
As a variant, let us consider the porous medium equation (see [130])
$$\frac{\partial u}{\partial t}(x,t) - \Delta (u^\gamma)(x,t) = 0, \qquad (x,t) \in \Omega \times \mathbb{P},$$
where $\gamma > 1$ is a fixed constant and $\Omega$ is an open subset of $\mathbb{R}^d$. Again, we look for a solution $u$ given by relation (10.6), so that for some constant $c$ we must have
$$\frac{v'(t)}{v^\gamma(t)} = c = \frac{\Delta w^\gamma(x)}{w(x)}, \qquad (x,t) \in \Omega \times \mathbb{P},\ v(t) \neq 0,\ w(x) \neq 0.$$
The differential equation $v'(t) = c v^\gamma(t)$ yields $v(t) = ((1-\gamma) c t + b)^{1/(1-\gamma)}$ for some $b \in \mathbb{R}_+$. Looking for a solution $w$ of the equation $\Delta w^\gamma(x) = c w(x)$ in the
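form $w(x) := a \|x\|^\beta$ with $a$, $\beta \in \mathbb{P}$, we see that we must have
$$0 = c w(x) - \Delta w^\gamma(x) = a c \|x\|^\beta - a^\gamma \gamma\beta (d + \gamma\beta - 2) \|x\|^{\gamma\beta - 2},$$
hence $\beta = \gamma\beta - 2$, $\beta = 2/(\gamma - 1)$ and $ac = a^\gamma \gamma\beta (d + \gamma\beta - 2)$ or $c = a^{\gamma - 1} \gamma\beta (d + \gamma\beta - 2)$.
Another example of the method of separation of variables occurs with the Hamilton-Jacobi equation
$$\frac{\partial u}{\partial t}(x,t) + H(\nabla u(x,t)) = 0, \qquad (x,t) \in \Omega \times \mathbb{R}_+, \tag{10.7}$$
where the gradient $\nabla$ bears on the space variable $x$ and $H : \mathbb{R}^d \to \mathbb{R}$, the Hamiltonian, is given. This time we decompose $u$ in an additive manner by taking $u(x,t) = v(t) + w(x)$. Equation (10.7) then becomes
$$v'(t) + H(\nabla w(x)) = 0, \qquad (x,t) \in \Omega \times \mathbb{R}_+.$$
Such a relation requires that there exists some constant $c \in \mathbb{R}$ such that $v'(t) = c$, $H(\nabla w(x)) = -c$ for all $(x,t) \in \Omega \times \mathbb{R}_+$. In particular, when the initial data is $u(x,0) := \langle a \mid x \rangle + b$ for some $a \in \mathbb{R}^d$, $b \in \mathbb{R}$, we get that for $c := -H(a)$ we have $u(x,t) = \langle a \mid x \rangle + ct + b$.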
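The product ansatz (10.6) for the heat equation can be checked on a one-dimensional instance. The sketch below rests on assumptions not made in the text: it takes $\Omega = \,]0, \pi[$, the eigenfunction $w(x) = \sin(nx)$ of the second derivative with eigenvalue $\lambda = -n^2$, and it replaces the derivatives by central differences; the names `product_solution` and `heat_residual` are hypothetical.

```python
import math

def product_solution(n, c=1.0):
    """u(x, t) = c * exp(lam * t) * w(x) with w(x) = sin(n x) and lam = -n**2.

    On ]0, pi[ the function w satisfies w'' = lam * w and vanishes at 0 and pi.
    """
    lam = -float(n * n)
    return lambda x, t: c * math.exp(lam * t) * math.sin(n * x)

def heat_residual(u, x, t, h=1e-4):
    """Central-difference approximation of du/dt - d^2u/dx^2 at (x, t)."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    u_xx = (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / (h * h)
    return u_t - u_xx

u = product_solution(3)
print(heat_residual(u, 1.0, 0.2))     # close to 0: (10.3) holds up to discretization error
print(u(0.0, 0.2), u(math.pi, 0.2))   # boundary values: (10.4) holds up to rounding
```

A finite sum of such products with coefficients $c_n$ gives a truncated version of the series considered above.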
10.1.2 Existence Results
We already proved the existence of a local solution to the differential equation
$$x'(t) = f(t, x(t)), \qquad x(0) = x_0$$
when the right-hand side $f$ is Lipschitzian in the state variable $x$. In this subsection and the next one we prove such a result by using a different method and we consider the dependence on the initial condition and on parameters. We also present a more general existence result involving partial derivatives that proves to be useful in differential geometry and in the consumer theory of mathematical economics.
Theorem 10.1 (Frobenius) Let $G$ be an open subset of the product $X \times Y$ of two Banach spaces and let $f : G \to L(X,Y)$ be a map of class $C^1$. For all $(x_0, y_0) \in G$
the differential equation
$$Dw_y(x) = f(x, w_y(x)), \qquad w_y(x_0) = y \tag{10.8}$$
has a local solution $w_y$ of class $C^1$ defined on some neighborhood $U$ of $x_0$ whenever $y$ belongs to some neighborhood $V$ of $y_0$ if and only if $f$ satisfies the relation
$$D_1 f(x,y).u.v + D_2 f(x,y).(f(x,y).u).v = D_1 f(x,y).v.u + D_2 f(x,y).(f(x,y).v).u \tag{10.9}$$
for all $(x,y) \in G$, $(u,v) \in X^2$. In this case, the map $w : (x,y) \mapsto w_y(x)$ is of class $C^1$ from $U \times V$ into $Y$.
When $X := \mathbb{R}^d$, Theorem 10.1 can be formulated in terms of partial derivatives; we encourage the reader to write down such a formulation. We observe that when $X := \mathbb{R}$ equation (10.8) reduces to an ordinary differential equation since we identify $L(\mathbb{R}, Y)$ with $Y$ via $\ell \mapsto \ell.1$ and $Dw_y(x)$ with $w_y'(x) := Dw_y(x)(1)$. In such a case, relation (10.9) is automatically satisfied since $D_1 f(x,y).u.v = uv\, D_1 f(x,y).1.1$ and $D_2 f(x,y).(f(x,y).u).v = uv\, D_2 f(x,y).(f(x,y).1).1$, with similar relations obtained by interchanging $u$ and $v$. These remarks lead to the next statement in which the notation is slightly changed.
Theorem 10.2 Let $E$ be a Banach space, let $G$ be an open subset of $\mathbb{R} \times E$ and let $f : G \to E$ be a map of class $C^k$, $k \geq 1$. Then, for all $(t_0, x_0) \in G$ there exist an open interval $T$ containing $t_0$, an open neighborhood $V$ of $x_0$, and a map $w : T \times V \to E$ of class $C^k$ such that $(t, w(t,x)) \in G$ for all $(t,x) \in T \times V$ and
$$\frac{d}{dt} w(\cdot, x)(t) = f(t, w(t,x)) \qquad \forall (t,x) \in T \times V, \tag{10.10}$$
$$w(t_0, x) = x \qquad \forall x \in V. \tag{10.11}$$
Proof of Theorem 10.1 Let us first show that condition (10.9) is necessary in order that the assertion about the existence of a local solution be satisfied. Equation (10.8) shows that $Dw_y$ is of class $C^1$, hence that the local solution $w_y$ is of class $C^2$. In such a case, Schwarz' Theorem ensures that $D^2 w_y(x)$ is a symmetric bilinear map. Differentiating $x \mapsto f(x, w_y(x))$, we deduce (10.9) from this symmetry property.
Now let us show that relation (10.9) is sufficient to obtain the existence of a local solution. Without loss of generality we suppose that $(x_0, y_0) = (0,0)$ and that $G := X_0 \times Y_0$, with $X_0 = B(0,r)$, $Y_0 = B(0,r)$. We denote by $T$ the interval $[0,1]$ and by $C(T,Y)$ the space of continuous maps from $T$ to $Y$ endowed with the norm $\|\cdot\|_\infty : z \mapsto \sup_{t \in T} \|z(t)\|$; by $C_0^1(T,Y)$ the space of maps $z$ of class $C^1$ from $T$ to $Y$ satisfying $z(0) = 0$ endowed with the norm $\|\cdot\|_{1,\infty} : z \mapsto \sup_{t \in T} \|z'(t)\|$. By Corollary 5.14 both $C(T,Y)$ and $C_0^1(T,Y)$ are Banach spaces and the derivation $D : z \mapsto z'$ is an isometry from $Z := C_0^1(T,Y)$ onto $C(T,Y)$.
Let $V_0 := B(0, r/2) \subset Y$ and let $Z_0 := \{z \in Z := C_0^1(T,Y) : z(T) \subset V_0\}$, so that for all $y \in V_0$ and $z \in Z_0$ we have $y + z(t) \in Y_0$ for all $t \in T$. Let us define a map $g : X_0 \times V_0 \times Z_0 \to C(T,Y)$ by
$$g(x,y,z)(t) := z'(t) - f(tx, y + z(t)).x, \qquad (x,y,z) \in X_0 \times V_0 \times Z_0,\ t \in T.$$
As in Sect. 5.7.1, it can be shown that $g$ is of class $C^1$ and since for all $t \in T$ the evaluation map $c \mapsto c(t)$ from $C(T,Y)$ to $Y$ is linear and continuous, setting $g(x,y,z,t) := g(x,y,z)(t)$, we see that the partial derivatives $D_i g$ of $g$ are characterized by
$$(D_1 g(x,y,z).u)(t) = D_1 g(x,y,z,t).u \tag{10.12}$$
$$(D_2 g(x,y,z).v)(t) = D_2 g(x,y,z,t).v \tag{10.13}$$
$$(D_3 g(x,y,z).w)(t) = D_3 g(x,y,z,t).w \tag{10.14}$$
for all $(x,y,z,t) \in X_0 \times Y_0 \times Z_0 \times T$ and all $(u,v,w) \in X \times Y \times Z$, so that in particular, $(D_3 g(x,y,z).w)(t) = w'(t) - D_2 f(tx, y + z(t)).w(t).x$ and $D_3 g(0,0,0).w = w'$. Thus $D_3 g(0,0,0)$ is the isomorphism $D : C_0^1(T,Y) \to C(T,Y)$ given by $Dw := w'$. The implicit function theorem yields open balls $U$, $V$ centered at $0$ in $X_0$ and $Y_0$ respectively and a map $h : U \times V \to Z_0$ of class $C^1$ such that $h(0,0) = 0$ and $g(x,y,h(x,y)) = 0$ for all $(x,y) \in U \times V$. Without loss of generality, we assume that for all $(x,y) \in U \times V$ the linear map $D_3 g(x,y,h(x,y))$ is invertible. Taking the partial derivative with respect to $y$ in the relation $g(0,y,h(0,y)) = 0$ and using the fact that $D_2 g(0,y,z) = 0$ for all $(y,z) \in V \times Z_0$, we obtain $D_3 g(0,y,h(0,y)) \circ D_2 h(0,y) = 0$, hence $D_2 h(0,y) = 0$ and $h(0,y) = h(0,0) = 0$ for all $y \in V$. Setting for $(x,y) \in U \times V$, $t \in T$
$$h_{x,y}(t) := h(x,y)(t), \qquad w_{x,y}(t) := y + h_{x,y}(t), \qquad w_y(x) := w_{x,y}(1),$$
we see that $w_y(0) = y$ for all $y \in V$ and, by definition of $h(x,y)$, we have $w_{x,y}'(t) = h_{x,y}'(t) = f(tx, y + h_{x,y}(t)).x$. It remains to show that
$$Dw_y(x).u = f(x, w_y(x)).u \qquad \forall (x,y) \in U \times V,\ u \in X. \tag{10.15}$$
For $t \in T$, $(x,y) \in U \times V$, $u \in X$, let us set $j(t) := j_{x,y,u}(t) := t f(tx, w_{x,y}(t)).u$. The derivative $j'(t)$ of $j$ at $t$ is given by
$$f(tx, w_{x,y}(t)).u + t D_1 f(tx, w_{x,y}(t)).x.u + t D_2 f(tx, w_{x,y}(t)).(f(tx, w_{x,y}(t)).x).u$$
or, in view of assumption (10.9), by
$$f(tx, w_{x,y}(t)).u + t D_1 f(tx, w_{x,y}(t)).u.x + t D_2 f(tx, w_{x,y}(t)).(f(tx, w_{x,y}(t)).u).x.$$
On the other hand, taking the derivative with respect to $x$ in the relation $g(x,y,h(x,y)) = 0$, we get for all $(x,y) \in U \times V$, $u \in X$, $t \in T$
$$D_3 g(x,y,h(x,y))(D_1 h(x,y).u) = -D_1 g(x,y,h(x,y)).u.$$
Using relations (10.12) and (10.14), this equality means that
$$D_3 g(x,y,h_{x,y}(t),t).(D_1 h(x,y).u)(t) = f(tx, w_{x,y}(t)).u + t D_1 f(tx, w_{x,y}(t)).u.x.$$
Plugging the second expression of $j'(t)$ in the computation of $D_3 g$ yields
$$D_3 g(x,y,h(x,y)(t),t).j(t) = j'(t) - D_2 f(tx, y + h_{x,y}(t)).j(t).x = f(tx, w_{x,y}(t)).u + t D_1 f(tx, w_{x,y}(t)).u.x.$$
By the injectivity of $D_3 g(x,y,h(x,y)(t),t)$ we obtain $(D_1 h(x,y).u)(t) = j(t)$ for all $t \in T$, and in particular, for $t := 1$. With relation (10.15) we get $Dw_y(x).u = (D_1 h(x,y).u)(1) = j(1) = f(x, w_y(x)).u$, so that $Dw_y(x) = f(x, w_y(x))$.
Theorem 10.2 includes a dependence statement upon the initial position $x$ or $x_0$, and a similar result is obtained in Theorem 10.1. This dependence can be extended to the dependence upon the initial condition $(t_0, x_0)$ and even upon a parameter $p$.
Theorem 10.3 Let $E$, $P$ be Banach spaces, let $G$ be an open subset of $\mathbb{R} \times E \times P$ and let $f : G \to E$ be a map of class $C^k$, $k \geq 1$. Then, for all $(t_0, x_0, p_0) \in G$ there exist an open interval $T$ containing $t_0$, open neighborhoods $U$, $V$ of $p_0$ and $x_0$ respectively, and a map $w : T^2 \times V \times U \to E$ of class $C^k$ such that $(t, w(t,s,x,p), p) \in G$ for all $(t,s,x,p) \in T^2 \times V \times U$ and
$$\frac{dw}{dt}(\cdot, s, x, p)(t) = f(t, w(t,s,x,p), p) \qquad \forall (t,s,x,p) \in T^2 \times V \times U, \tag{10.16}$$
$$w(s,s,x,p) = x \qquad \forall (s,x,p) \in T \times V \times U. \tag{10.17}$$
Proof This follows from the proof of Theorem 10.1 and the extension of the implicit function theorem to the case when the equation depends on a parameter. One can also deduce this result from the study of the differential equation $(x'(t), p'(t)) = (f(t, x(t), p(t)), 0)$.
Let us also observe that equations (10.10)–(10.11) can be reduced to the case of an autonomous differential equation $(s'(t), x'(t)) = (1, f(s(t), x(t)))$ whose solution is of the form $(s(t), x(t)) = (t + s_0, x(t))$.
In finite dimensions, a local existence result can be given under a continuity assumption, but it does not assert local uniqueness. We state it without proof; the proof relies on the Ascoli-Arzela Theorem and Brouwer's Fixed Point Theorem.
Theorem 10.4 (Arzela, Peano) Let $E$ be a finite dimensional normed space, let $G$ be an open subset of $\mathbb{R} \times E$ and let $f : G \to E$ be a continuous map. Then, for any $(t_0, x_0) \in G$ there exist an interval $T$ of $\mathbb{R}$ containing $t_0$ and a map $x : T \to E$ such that $x(t_0) = x_0$, $(t, x(t)) \in G$, and $x'(t) = f(t, x(t))$ for all $t \in T$.
A local existence and uniqueness result can be given following the lines of Theorem 2.14 under the assumption that the right-hand side $f$ is locally Lipschitzian. Its conclusion is similar to that of Theorem 10.2, but with less regularity. We return to such a question in the next subsection.
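The reduction to an autonomous equation mentioned above can be illustrated as follows. This is a minimal sketch under assumptions of our own: the helper names `autonomize` and `euler` are hypothetical, the explicit Euler scheme is chosen only for brevity, and the test equation $x'(t) = t\,x(t)$, $x(0) = 1$ (solution $e^{t^2/2}$) is not taken from the text.

```python
import math

def autonomize(f):
    """Field (s, x) -> (1, f(s, x)) of the autonomous system attached to x' = f(t, x)."""
    return lambda state: (1.0, f(state[0], state[1]))

def euler(field, state0, h, steps):
    """Explicit Euler steps for the autonomous system state' = field(state)."""
    state = tuple(state0)
    for _ in range(steps):
        state = tuple(s + h * k for s, k in zip(state, field(state)))
    return state

# Assumed test case: x'(t) = t * x(t), x(0) = 1, with solution exp(t**2 / 2).
f = lambda t, x: t * x
s_end, x_end = euler(autonomize(f), (0.0, 1.0), 1e-4, 10000)
print(s_end)                    # the added component is the time itself: s(t) = t + s0
print(x_end, math.exp(0.5))     # Euler approximation versus the exact value at t = 1
```

The first component merely reproduces the time variable, so no information is lost by passing to the autonomous system.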
Exercises
1. Verify that the solution of the equation $x'(t) = x(t)^3$ taking the value $x_0 \in E := \mathbb{R}$ for $t = t_0$ is given by $x(t) = x_0 (1 - 2x_0^2 (t - t_0))^{-1/2}$ for $t \in \,]-\infty, t_0 + x_0^{-2}/2[$ and $x_0 \neq 0$ and $x(t) = 0$ if $x_0 = 0$. Draw the integral curves in the phase space $\mathbb{R} \times E$.
2. Verify that the solution of the equation $x'(t) = x(t)^{2/3}$ taking the value $x_0 \in E := \mathbb{R}$ for $t = t_0$ is given by $x(t) = (x_0^{1/3} + (t - t_0)/3)^3$ for $t \in \,]t_0 - 3x_0^{1/3}, +\infty[$.
3. Let $E := \mathbb{R}$ and let $f : E \to E$ be given by $f(e) = 2e^{1/2}$ for $e \in \mathbb{R}_+$, $f(e) = -2|e|^{1/2}$ for $e \in \mathbb{R} \setminus \mathbb{R}_+$. Verify that the equation $w'(t) = f(w(t))$ has two solutions on $t \in \,]t_0, +\infty[$ taking the value $x_0 := 0$ for $t = t_0$, given by $x(t) = (t - t_0)^2$, $y(t) = -x(t)$. Note that $f$ is not Lipschitzian near $0$.
4. Let $E$ be a Banach space and let $T$ be an open interval of $\mathbb{R}$. Suppose that for some continuous function $k : T \to \mathbb{R}_+$ a map $f : T \times E \to E$ satisfies $\|f(t, e_1) - f(t, e_2)\| \leq k(t) \|e_1 - e_2\|$ for $(t, e_1, e_2) \in T \times E^2$. Prove that for any $(t_0, x_0) \in T \times E$ there exists a unique solution $x(\cdot)$ on $T$ of the equations $x'(\cdot) = f(\cdot, x(\cdot))$, $x(t_0) = x_0$.
5. Let $E := c_0$ be the space of sequences $x := (x_n)$ of real numbers satisfying $\lim_n x_n = 0$. Suppose $E$ is endowed with the norm $x \mapsto \|x\| = \sup_n |x_n|$. Let $f : E \to E$ be given by $f(x) = (y_n)$ with $y_n := |x_n|^{1/2} + (n+1)^{-1}$. Verify that $f$ is continuous but that there is no solution to the equation $x'(t) = f(x(t))$ on some open interval containing $0$ satisfying $x(0) = 0$. Note that $E$ is infinite dimensional.
10.1.3 Uniqueness and Globalization of Solutions
A globalization of Theorem 10.1 can be devised, but we restrict our attention to the more classical globalization of Theorem 10.2. We retain its notation and we assume its conclusion, namely local existence rather than smoothness assumptions. Given $(t_0, x_0) \in G$, we define an order relation on the set of pairs $(T, u)$ where $T$ is an open interval of $\mathbb{R}$ containing $t_0$ and $u : T \to E$ is a map of class $C^1$ such that $u(t_0) = x_0$, $(t, u(t)) \in G$ and $u'(t) = f(t, u(t))$ for all $t \in T$: we write $(T_1, u_1) \leq (T_2, u_2)$ if $T_1 \subset T_2$ and $u_2|_{T_1} = u_1$. It is easy to see that these conditions define an order and that this order is inductive: if $((T_i, u_i))_{i \in I}$ is a totally ordered family, the pair $(T, u)$ given by $T := \bigcup_{i \in I} T_i$, $u|_{T_i} = u_i$ is a majorant of the family $((T_i, u_i))_{i \in I}$. Thus, by Zorn's Lemma, there exists a maximal solution of the equation $x'(t) = f(t, x(t))$ for all $t \in T$, $x(t_0) = x_0$. Note that this existence result does not assume uniqueness.
Example Let $E := \mathbb{R}$, $f(e) = 3e^{2/3}$ for $e \in E$, and let $x_0 = 0$. Then the functions $t \mapsto x(t) = (t - t_0)^3$ and $y = 0$ are two distinct maximal solutions defined on $\mathbb{R}$.
When local existence and local uniqueness are ensured, one can obtain existence and uniqueness of maximal solutions without using Zorn's Lemma. In fact one has a largest solution $(T_0, u_0)$ for the order defined above. It is obtained by taking for $T_0$ the union of the open intervals $T$ containing $t_0$ on which a solution $u$ exists and by setting $u_0(t) = u(t)$ for $t \in T$. This defines $u_0$ unambiguously since if $u$ and $v$ are solutions on intervals $S$ and $T$ containing $t_0$ respectively one has $u(t) = v(t)$ for $t \in S \cap T$, as a connectedness argument shows along with local uniqueness. The domain of this maximal (or rather maximum) solution is as large as expected in the particular case of the next statement.
Proposition 10.1 Let $f : \mathbb{R} \times E \to E$ be continuous and such that for any compact interval $T$ of $\mathbb{R}$ there exists some $c_T \in \mathbb{R}_+$ such that $\|f(t, e_1) - f(t, e_2)\| \leq c_T \|e_1 - e_2\|$ for all $t \in T$, $e_1$, $e_2 \in E$. Then for every $(t_0, x_0) \in \mathbb{R} \times E$ the maximal solution of the equation $w'(t) = f(t, w(t))$ satisfying $w(t_0) = x_0$ is defined on the whole of $\mathbb{R}$. In particular, this conclusion holds whenever for some continuous maps $A : \mathbb{R} \to L(E) := L(E,E)$, $b : \mathbb{R} \to E$, one has $f(t,e) = A(t).e - b(t)$ for all $(t,e) \in \mathbb{R} \times E$.
Proof This follows from Theorem 2.14 asserting that $w(\cdot)$ is uniquely defined on any bounded interval of $\mathbb{R}$.
The following lemma is useful when seeking estimates for solutions to evolution problems.
Lemma 10.1 (Gronwall) Let $T$ be an interval of $\mathbb{R}_+$ with least element $0$ and let $a, b : T \to \mathbb{R}$ be two continuous (or regulated) nonnegative functions. If $u : T \to \mathbb{R}$ is a continuous (or regulated) nonnegative function satisfying the inequality
$$u(t) \leq \int_0^t a(s) u(s)\,ds + b(t), \qquad t \in T, \tag{10.18}$$
then, setting $A(t) := \int_0^t a(s)\,ds$, $u$ is bounded above by
$$u(t) \leq e^{A(t)} \int_0^t e^{-A(s)} a(s) b(s)\,ds + b(t), \qquad t \in T. \tag{10.19}$$
In particular, if $a(\cdot)$ is a constant function with value $\alpha \in \mathbb{P} := \,]0, \infty[$ and if $b(t) := \beta t + \gamma$ for $t \in T$, with $\beta$, $\gamma \in \mathbb{R}_+$, then one has
$$u(t) \leq \Big(\gamma + \frac{\beta}{\alpha}\Big) e^{\alpha t} - \frac{\beta}{\alpha}, \qquad t \in T.$$
Proof We just prove the estimate (10.19), the special case being obtained by an integration by parts. We set
$$y(t) := \int_0^t a(s) u(s)\,ds, \qquad z(t) := e^{-A(t)} y(t).$$
On $T$ (or the complement of a countable subset of $T$) relation (10.18) implies that $y'(t) - a(t) y(t) \leq a(t) b(t)$, so that $z'(t) \leq e^{-A(t)} a(t) b(t)$ and $z(t) \leq \int_0^t e^{-A(s)} a(s) b(s)\,ds$ since $z(0) = 0$. Thus
$$y(t) = e^{A(t)} z(t) \leq e^{A(t)} \int_0^t e^{-A(s)} a(s) b(s)\,ds$$
and (10.19) follows from the fact that $u(t) \leq y(t) + b(t)$.
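The special case of the lemma can be checked numerically. The sketch below evaluates the right-hand side of (10.19) by the trapezoidal rule for a constant $a$ and an affine $b$, and compares it with the closed form $(\gamma + \beta/\alpha)e^{\alpha t} - \beta/\alpha$. The function name `gronwall_bound`, the quadrature rule and the sample values of $\alpha$, $\beta$, $\gamma$, $t$ are illustrative assumptions.

```python
import math

def gronwall_bound(a, b, t, n=20000):
    """Right-hand side of (10.19), with both integrals computed by the trapezoidal rule.

    Returns exp(A(t)) * int_0^t exp(-A(s)) a(s) b(s) ds + b(t), where A(s) = int_0^s a.
    """
    h = t / n
    A = 0.0                      # running value of A(s)
    integral = 0.0               # running value of int_0^s exp(-A) a b
    prev_a = a(0.0)
    prev_g = prev_a * b(0.0)     # integrand at s = 0, where A(0) = 0
    for i in range(1, n + 1):
        s = i * h
        cur_a = a(s)
        A += h * (prev_a + cur_a) / 2
        cur_g = math.exp(-A) * cur_a * b(s)
        integral += h * (prev_g + cur_g) / 2
        prev_a, prev_g = cur_a, cur_g
    return math.exp(A) * integral + b(t)

# Assumed sample data: a = alpha constant, b(t) = beta * t + gamma.
alpha, beta, gamma, t = 0.7, 2.0, 1.5, 3.0
numeric = gronwall_bound(lambda s: alpha, lambda s: beta * s + gamma, t)
closed = (gamma + beta / alpha) * math.exp(alpha * t) - beta / alpha
print(numeric, closed)   # the two values agree up to quadrature error
```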
With the help of Gronwall's Lemma we shall obtain an estimate of the distance between two approximate solutions starting from two different points.
Proposition 10.2 Let $c$, $\delta$, $\varepsilon$, $\eta$ be positive numbers, let $E$ be a Banach space, let $G$ be an open subset of $\mathbb{R} \times E$, and let $f, g : G \to E$ be continuous and such
that $\|f(t,e) - g(t,e)\| \leq \delta$ for all $(t,e) \in G$. Suppose that $\|f(t,e) - f(t,e')\| \leq c \|e - e'\|$ for all $(t,e) \in G$ and $(t,e') \in G$. If $t_0 \in T$, an interval of $\mathbb{R}$, and if $x$ and $y$ are two maps from $T$ to $E$ such that $(t, x(t)) \in G$, $(t, y(t)) \in G$, $\|x'(t) - f(t, x(t))\| \leq \varepsilon$, $\|y'(t) - g(t, y(t))\| \leq \eta$ for all $t \in T$, setting $x_0 := x(t_0)$, $y_0 := y(t_0)$, one has the estimate
$$\|x(t) - y(t)\| \leq \|x_0 - y_0\| e^{c|t - t_0|} + \frac{\varepsilon + \eta + \delta}{c} \big(e^{c|t - t_0|} - 1\big).$$
Proof Changing $t$ into $t - t_0$ we may assume that $t_0 = 0$. Our assumptions ensure that for all $t \in T$, $e, e' \in E$ we have
$$\|f(t,e) - g(t,e')\| \leq \|f(t,e) - f(t,e')\| + \|f(t,e') - g(t,e')\| \leq c \|e - e'\| + \delta,$$
hence
$$\|x'(t) - y'(t)\| \leq \|x'(t) - f(t, x(t))\| + c \|x(t) - y(t)\| + \delta + \|g(t, y(t)) - y'(t)\| \leq \varepsilon + c \|x(t) - y(t)\| + \delta + \eta.$$
Setting $u(t) := \|x(t) - y(t)\|$ for $t \in T$, $t \geq 0$, this relation implies
$$u(t) - \|x_0 - y_0\| \leq \|(x(t) - y(t)) - (x_0 - y_0)\| = \Big\|\int_0^t (x'(s) - y'(s))\,ds\Big\| \leq \int_0^t \|x'(s) - y'(s)\|\,ds \leq \int_0^t c\,u(s)\,ds + (\varepsilon + \delta + \eta) t,$$
hence the result, by Gronwall's Lemma. The case $t \leq 0$ is obtained by changing $t$ into $-t$.
Corollary 10.1 Let $E$ be a Banach space, let $G$ be an open subset of $\mathbb{R} \times E$ and let $f : G \to E$ be continuous and such that for all $(t_0, e_0) \in G$ there exist $c \in \mathbb{R}_+$ and a neighborhood $G_0$ of $(t_0, e_0)$ in $G$ such that $\|f(t,e) - f(t,e')\| \leq c \|e - e'\|$ for all $(t,e), (t,e') \in G_0$. Let $x(\cdot)$ and $y(\cdot)$ be two solutions of the equation $w'(t) = f(t, w(t))$ on an open interval $T$. Then, for any $t_0$, $t \in T$ one has
$$\|x(t) - y(t)\| \leq e^{c|t - t_0|} \|x(t_0) - y(t_0)\|.$$
In particular, if $x(\cdot)$ and $y(\cdot)$ coincide at some $t_0 \in T$, then they coincide all over $T$. Moreover, for all $(t_0, x_0) \in G$ there exists a largest open interval $T_{(t_0, x_0)}$ on which there exists a solution.
Of course here $w(\cdot)$ is a solution on $T$ if $(t, w(t)) \in G$ and if $w'(t) = f(t, w(t))$ for all $t \in T$.
Proof The local estimate is obtained by taking $\varepsilon = \eta = \delta = 0$ in the preceding proposition. It yields local uniqueness. A connectedness argument entails global uniqueness.
Note that the assumption on $f$ is satisfied when the partial derivative $D_2 f$ exists and is continuous. This follows from the Mean Value Theorem.
Proposition 10.3 Let $G$ be an open subset of $\mathbb{R} \times E$ and let $f : G \to E$ be as in the preceding corollary. Let $D_f$ be the set of $(t, t_0, x_0) \in \mathbb{R} \times G$ such that $t$ belongs to the largest interval $T_{(t_0, x_0)}$ containing $t_0$ on which a solution $x(\cdot) := x(\cdot, t_0, x_0)$ of the equation $x'(t) = f(t, x(t))$, $x(t_0) = x_0$ exists. Then $D_f$ is open. Moreover, the flow of $f$, i.e. the map $(t, t_0, x_0) \mapsto x(t, t_0, x_0)$, is continuous and of class $C^k$ if $f$ is of class $C^k$. Furthermore, for all $(t_1, t_0, x_0) \in D_f$, $(t, t_1, x_1) \in D_f$ with $x_1 := x(t_1, t_0, x_0)$, one has $T_{(t_0, x_0)} = T_{(t_1, x_1)}$ and $x(\cdot, t_0, x_0) = x(\cdot, t_1, x_1)$ on this interval.
Proof We have seen local existence and uniqueness of solutions. Thus $D_f$ coincides with the set of $(t, t_0, x_0) \in \mathbb{R} \times G$ such that there exist a neighborhood $U$ of $x_0$ and an open interval $T$ containing $t$ and $t_0$ on which a solution $w$ of the equation $w'(t) = f(t, w(t))$, $w(t_0) = u$ exists for all $u \in U$. This yields openness of $D_f$ and the existence and uniqueness of the largest solution $x(\cdot, t_0, x_0)$ of the equation $x'(t) = f(t, x(t))$, $x(t_0) = x_0$. The regularity of $(t, t_0, x_0) \mapsto x(t, t_0, x_0)$, being a local property, ensues.
Given $(t_1, t_0, x_0) \in D_f$, $(t, t_1, x_1) \in D_f$ with $x_1 := x(t_1, t_0, x_0)$ we note that $x(\cdot, t_0, x_0)$ and $x(\cdot, t_1, x_1)$ are two solutions of $x'(\cdot) = f(\cdot, x(\cdot))$ that coincide at $t_1$. Thus, $T_{(t_0, x_0)}$ is a subset of the largest interval of existence $T_{(t_1, x_1)}$ of the solution issued from $x_1$ at $t_1$ and for $t \in T_{(t_0, x_0)}$ one has $x(t, t_0, x_0) = x(t, t_1, x_1)$; in particular $x(t_0, t_1, x_1) = x_0$. Thus, the roles of $(t_0, x_0)$ and $(t_1, x_1)$ can be interchanged and $T_{(t_0, x_0)} = T_{(t_1, x_1)}$ with $x(\cdot, t_0, x_0) = x(\cdot, t_1, x_1)$ on this interval.
The following result explains why the maximal interval of existence of the solution may differ from $\mathbb{R}$.
Proposition 10.4 Let $f$, $G$ be as in the preceding proposition and for $(t_0, x_0) \in G$ let $T_{(t_0, x_0)} := \,]\alpha, \omega[$ be the largest open interval on which a solution of the equation $x'(t, t_0, x_0) = f(t, x(t, t_0, x_0))$, $x(t_0, t_0, x_0) = x_0$ is defined. Then any limit point $x$ of $x(\cdot, t_0, x_0)$ as $t \to \omega$ is such that $(\omega, x)$ belongs to the boundary of $G$, and a similar statement is valid for $\alpha$ instead of $\omega$. When $G := \mathbb{R} \times E$ and $\omega < \infty$, $x(\cdot, t_0, x_0)$ has no limit point as $t \to \omega$ and if, moreover, $E$ is finite dimensional, $x(\cdot, t_0, x_0)$ is unbounded.
Proof For the first assertion we may assume $\omega < \infty$. Suppose on the contrary that there exists some sequence $(t_n) \to \omega$ such that $(x(t_n)) \to x$ for some $x \in E$ satisfying $(\omega, x) \in G$. Let $\sigma \in \,]t_0, \omega[$ and $\tau > \omega$ and let $U$ be an open neighborhood of $x$ such that for all $(s, u) \in \,]\sigma, \tau[ \times U$ there exists a unique solution of the equation $x'(t, u) = f(t, x(t, u))$, $x(s, u) = u$. We pick $n \in \mathbb{N}$ large enough such that $t_n > \sigma$ and $x(t_n) \in U$. Setting $y(t) = x(t)$ for $t \in \,]\alpha, \sigma]$ and $y(t) = x(t, x(t_n))$ for $t \in \,]\sigma, \tau[$, we get a solution of the equation $x'(t, x_0) = f(t, x(t, x_0))$, $x(t_0, x_0) = x_0$ whose domain is larger than the domain of $x(\cdot, x_0)$. This contradicts the maximality of the interval $]\alpha, \omega[$.
The second assertion stems from the emptiness of the boundary of $G$ when $G := \mathbb{R} \times E$ and, when $E$ is finite dimensional, from the local compactness of $E$.
The flow of an autonomous differential equation (also called a vector field) enjoys a striking property.
Proposition 10.5 Let $U$ be an open subset of a Banach space $E$ and let $f : U \to E$ be a map of class $C^k$ with $k \in \mathbb{N} \setminus \{0\}$. For $x_0 \in U$ let $T(x_0)$ be the largest open interval of $\mathbb{R}$ containing $0$ on which the maximal solution $x(\cdot, x_0)$ of $x'(\cdot, x_0) = f(x(\cdot, x_0))$, $x(0, x_0) = x_0$ is defined. Then, for $s \in T(x_0)$ and $t \in T(x(s, x_0))$ one has $s + t \in T(x_0)$ and
$$x(s + t, x_0) = x(t, x(s, x_0)).$$
In particular, if the solutions are defined on all of $\mathbb{R}$, the family of maps $(\varphi_t)_{t \in \mathbb{R}} := (x(t, \cdot))_{t \in \mathbb{R}} : U \to U$ is a group of diffeomorphisms of $U$ in the sense that $\varphi_{t+s} = \varphi_t \circ \varphi_s$ for all $s, t \in \mathbb{R}$.
Proof The derivative at $t$ of $x(s + \cdot, x_0)$ is $f(x(s + t, x_0))$ and $x(s + \cdot, x_0)(0) = x(s, x_0)$, so that, by uniqueness, $x(s + \cdot, x_0)(t) = x(t, x(s, x_0))$. When $x(\cdot, x_0)$ is defined on $\mathbb{R}$, setting $\varphi_t = x(t, \cdot)$, the relation $\varphi_{t+s} = \varphi_t \circ \varphi_s$ follows from the preceding. Since $\varphi_0 = x(0, \cdot) = I_U(\cdot)$, the identity map, and since $\varphi_{-t} = \varphi_t^{-1}$, we see that $\varphi_t$ is a diffeomorphism and $(\varphi_t)_{t \in \mathbb{R}}$ is a group of transformations of $U$.
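The relation $x(s+t, x_0) = x(t, x(s, x_0))$ can be checked exactly on the vector field $f(x) = x^2$ of Sect. 10.1.1, whose flow is $x(t, x_0) = x_0 (1 - x_0 t)^{-1}$ on its maximal interval. The following sketch uses exact rational arithmetic; the function name `flow` and the sample values of $x_0$, $s$, $t$ are illustrative assumptions.

```python
from fractions import Fraction

def flow(t, x0):
    """Flow of x' = x**2: x(t, x0) = x0 / (1 - x0 * t), defined while 1 - x0 * t > 0."""
    denom = 1 - x0 * t
    if denom <= 0:
        raise ValueError("t lies outside the maximal interval T(x0)")
    return x0 / denom

# Assumed sample data; s lies in T(x0) and t lies in T(x(s, x0)).
x0, s, t = Fraction(1, 2), Fraction(1, 3), Fraction(3, 4)
lhs = flow(s + t, x0)          # x(s + t, x0)
rhs = flow(t, flow(s, x0))     # x(t, x(s, x0))
print(lhs, rhs)                # both equal 12/11
```

Since the maximal intervals here are not all of $\mathbb{R}$, the check illustrates the local form of the statement rather than the group property.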
Exercises
1. (Hadamard-Levy Theorem) Let $f : \mathbb{R}^d \to \mathbb{R}^d$ be of class $C^2$ such that $f(0) = 0$ and that for some $c > 0$ and all $x \in \mathbb{R}^d$, $Df(x)$ is an isomorphism satisfying $\|Df(x)^{-1}\| \leq c$. Given $x \in \mathbb{R}^d$ show that the solution $w$ of the differential equation $\frac{\partial w}{\partial t}(t,x) = [Df(w(t,x))]^{-1}.x$ with $w(0,x) = 0$ is defined on $\mathbb{R} \times \mathbb{R}^d$. Prove that $f(w(1,x)) = x$ by showing that $f(w(t,x)) - tx$ does not depend on $t$. Conclude that $f$ is a diffeomorphism from $\mathbb{R}^d$ onto $\mathbb{R}^d$.
2. (Arnold) Let $E$ be a Euclidean space and let $U : E \to \mathbb{R}$ be a smooth potential with nonnegative values. Consider the Newton equation $q''(t) = -\nabla U(q(t))$. Setting $p(\cdot) := q'(\cdot)$, show that $H$ given by $H(p,q) := U(q) + \frac{1}{2}\|p\|^2$ is constant along a solution $(p(\cdot), q(\cdot))$ of the equation $(p'(t), q'(t)) = (-\nabla U(q(t)), p(t))$ issued from $(p_0, q_0) \in E^2$ for $t_0 = 0$. Deduce from this fact that $\|q'(t)\|^2 \leq 2H(p_0, q_0)$ and that $\|q(t) - q_0\| \leq (2H(p_0, q_0))^{1/2} |t|$. Conclude that the solution $q(\cdot)$ of the Newton equation is defined on the whole of $\mathbb{R}$.
3. Prove the same conclusion as in the preceding exercise in the case when there exists some $c \in \mathbb{R}_+$ such that $U(q) \geq -c \|q\|^2$ for all $q \in E$.
4. Verify that for $E := \mathbb{R}$, $U(q) := -q^4/2$ the solution $q$ of the Newton equation is given by $q(t) = (t - 1)^{-1}$ and cannot be extended after $t = 1$.
5. Let $E_0$ be an open subset of a Banach space $E$ and let $f : E_0 \to E$ be locally Lipschitzian or of class $C^1$. Let $h : E_0 \to \mathbb{R}$ be such that $h^{-1}(\{r\})$ is compact for all $r \in \mathbb{R}$. Suppose that for any solution $x(\cdot)$ of the equation $x'(\cdot) = f(x(\cdot))$ the function $h(x(\cdot))$ is constant. Prove that the maximal solution $x(\cdot)$ of this equation satisfying $x(0) = x_0$ is defined on the whole of $\mathbb{R}$.
6. Applying the conclusion of Exercise 5 in the case $E := E_0 := \mathbb{R}^3$, $f(x,y,z) := (y - z, z - x, x - y)$ for $(x,y,z) \in E$, $h(x,y,z) := x^2 + y^2 + z^2$, verify that the maximal solutions of the equation $w'(t) = f(w(t))$ are defined on the whole of $\mathbb{R}$.
7. Let $E := \mathbb{R}$, $E_0 := \mathbb{P}$, and let $f : \mathbb{P} \times E \to E$ be given by $f(t,e) := t^{-2} \sin t^{-1}$. Find the limit points of its solution $x(t) := \cos t^{-1}$ on $]0, \infty[$ as $t \to 0_+$.
10.1.4 The Exponential Map
The usual exponential map $\exp : t \mapsto e^t$ on $\mathbb{R}$ can be extended to the space $L(X) := L(X,X)$ of continuous linear operators on a Banach space $X$ by means of the formula
$$e^A := \sum_{n=0}^{\infty} \frac{1}{n!} A^n, \qquad A \in L(X),$$
where $A^0 = I_X$ and $A^n$ is defined inductively by $A^{n+1} = A \circ A^n$. The series is absolutely convergent since $\|A^n\| \leq \|A\|^n$ for all $n \in \mathbb{N}$. Considering partial sums and passing to the limit, we get
$$\|e^A\| \leq e^{\|A\|}. \tag{10.20}$$
The next lemma will be used to show the following classical result.
Lemma 10.2 Let $B, C : X \to X$ be two continuous linear operators on a Banach space $X$. If $B$ and $C$ commute, i.e. $B \circ C = C \circ B$, then one has $e^{B+C} = e^B \circ e^C = e^C \circ e^B$.
Proof Since $B$ and $C$ commute, one has
$$(B + C)^k = \sum_{j=0}^{k} \frac{k!}{(k-j)!\,j!} B^{k-j} \circ C^j,$$
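hence
$$\sum_{k=0}^{n} \frac{1}{k!} (B + C)^k = \sum_{k=0}^{n} \sum_{j=0}^{k} \frac{1}{j!\,(k-j)!} B^{k-j} \circ C^j = \sum_{i+j \leq n} \frac{1}{i!\,j!} B^i \circ C^j.$$
Thus
$$\Big(\sum_{i=0}^{n} \frac{1}{i!} B^i\Big) \circ \Big(\sum_{j=0}^{n} \frac{1}{j!} C^j\Big) - \sum_{k=0}^{n} \frac{1}{k!} (B + C)^k = \sum_{i,j \leq n,\ i+j > n} \frac{1}{i!\,j!} B^i \circ C^j.$$
Setting $b := \|B\|$, $c := \|C\|$, the norm of the right-hand side is bounded above by
$$\sum_{i,j \leq n,\ i+j > n} \frac{1}{i!\,j!} b^i c^j = \Big(\sum_{i=0}^{n} \frac{1}{i!} b^i\Big)\Big(\sum_{j=0}^{n} \frac{1}{j!} c^j\Big) - \sum_{k=0}^{n} \frac{1}{k!} (b + c)^k$$
and the limit as $n \to \infty$ of this expression is $e^b e^c - e^{b+c} = 0$. Passing to the limit in the left-hand side of the preceding relation, we get $e^B \circ e^C - e^{B+C} = 0$.
Theorem 10.5 For a Banach space $X$, $A \in L(X)$, and any $u_0 \in X$, $u(t) := e^{tA} u_0$ is the solution to the equation
$$u'(t) = A u(t), \qquad u(0) = u_0. \tag{10.21}$$
Proof Given $r \in \mathbb{R}$ and $s \in \mathbb{R} \setminus \{0\}$, we note that for $u(t) = e^{tA} u_0$ we have
$$\frac{1}{s}(u(r + s) - u(r)) - A u(r) = \Big(\frac{e^{sA} - I}{s} - A\Big) u(r)$$
with
$$\frac{e^{sA} - I}{s} - A = \sum_{n=2}^{\infty} \frac{1}{n!} s^{n-1} A^n,$$
and, for $a := \|A\|$,
$$\Big\|\frac{e^{sA} - I}{s} - A\Big\| \leq \sum_{n=2}^{\infty} \frac{1}{n!} |s|^{n-1} \|A\|^n = \frac{e^{|s| a} - 1}{|s|} - a \to 0 \quad \text{as } s \to 0.$$
That shows that $u'(r)$ exists and equals $A u(r)$ for all $r \in \mathbb{R}$. Since $e^{0A} = I$, $u$ is a solution to equation (10.21). Uniqueness yields the conclusion.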
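The exponential series and Lemma 10.2 can be illustrated on $2 \times 2$ matrices. The sketch below truncates the series at a fixed number of terms; the helper names (`mat_mul`, `mat_add`, `mat_scale`, `identity`, `expm`), the truncation order and the sample matrices are illustrative assumptions, not part of the text.

```python
import math

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_scale(c, A):
    return [[c * a for a in row] for row in A]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(A, terms=40):
    """Partial sum of the exponential series: sum of A^n / n! for n < terms."""
    result = identity(len(A))
    term = identity(len(A))
    for n in range(1, terms):
        term = mat_scale(1.0 / n, mat_mul(A, term))   # term is now A^n / n!
        result = mat_add(result, term)
    return result

# Assumed example: A generates rotations, exp(tA) = [[cos t, -sin t], [sin t, cos t]].
A = [[0.0, -1.0], [1.0, 0.0]]
t = 0.8
E = expm(mat_scale(t, A))
print(E[0][0], math.cos(t), E[1][0], math.sin(t))

# B and C commute because C is a polynomial in B, so exp(B + C) = exp(B) exp(C).
B = [[0.3, 0.1], [0.0, 0.3]]
C = mat_add(mat_scale(2.0, B), identity(2))
lhs = expm(mat_add(B, C))
rhs = mat_mul(expm(B), expm(C))
print(lhs, rhs)
```

For matrices that do not commute the two sides generally differ, which is why the commutation hypothesis appears in Lemma 10.2.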
Since $e^{sA} \circ e^{tA} = e^{(s+t)A}$, we get a one-parameter group of transformations of $X$. We shall see an important generalization of this property.
The following estimate will be used in the case $k = 0$, which is slightly simpler than the general case.
Lemma 10.3 Let $B$ and $C$ be two continuous linear operators on a Banach space $X$, let $u(t) = e^{tB}$, $v(t) = e^{tC}$ and let $k \in \mathbb{R}_+$ be such that $\|e^{tB}\| \leq e^{kt}$, $\|e^{tC}\| \leq e^{kt}$ for all $t \in \mathbb{R}_+$. If $B \circ C = C \circ B$ then
$$\|u(t) x - v(t) x\| \leq t e^{kt} \|Bx - Cx\| \qquad \forall t \in \mathbb{R}_+,\ x \in X.$$
Proof Since $B \circ u(t) = u(t) \circ B$, $B \circ v(t) = v(t) \circ B$ for all $t \in \mathbb{R}_+$ by the expansion of $\exp$, an easy computation shows that
$$\frac{d}{ds}[u(st) v(t - st) x] = t\, u(st) v(t - st)(Bx - Cx)$$
for all $t \in \mathbb{R}_+$, $x \in X$. It follows that for all $t \in \mathbb{R}_+$, $x \in X$ we have
$$\|u(t) x - v(t) x\| = \Big\|\int_0^1 t\, u(st) v(t - st)(Bx - Cx)\,ds\Big\| \leq t \|Bx - Cx\| \int_0^1 e^{kst} e^{k(t - st)}\,ds = t e^{kt} \|Bx - Cx\|.$$
Exercises
1. (Trotter) Given a Banach space $X$ and $A$, $B \in L(X)$, show that $e^{t(A+B)} = \lim_{n \to \infty} (e^{(t/n)A} \circ e^{(t/n)B})^n$ for all $t \in \mathbb{R}_+$.
2. Let $X$ be a Banach space and let $F : \mathbb{R}_+ \to L(X)$ be such that $F(0) = I$, $\|F(t)\| \leq 1$ for all $t \in \mathbb{R}_+$ and such that the right derivative $A := F'(0)$ of $F$ at $0$ exists in $L(X)$. Show that $e^{tA} = \lim_{n \to \infty} (F(t/n))^n$ for all $t \in \mathbb{R}_+$.
3. (Fibonacci) Let $A \in L(\mathbb{R}^2)$ be given by $A(e_1) := e_2$, $A(e_2) = e_1 + e_2$. Show that there exist sequences $(a_n)$, $(b_n)$ in $\mathbb{R}$ such that $A^n = a_n A + b_n I$. Given $a$, $b \in \mathbb{R}$, compute $u_n$, where $(u_n)$ is the sequence defined by $u_0 := a$, $u_1 := b$, $u_{n+1} = u_{n-1} + u_n$. Taking $a = 1$, $b = 2$ prove that the sequence of ratios $(u_{n+1}/u_n)$ converges to the golden ratio $\phi := (1/2)(1 + \sqrt{5})$. Since antiquity, many philosophers, mathematicians and artists have been intrigued by this number; during the Renaissance it was called the divine proportion. It has been used in many monuments, from the Great Pyramid and the Parthenon to the United Nations building. Note that if two numbers $r, s$ are such that $s = \phi r$ then $r + s = \phi s$.
4. Show that the map $A \mapsto e^A$ from $L(\mathbb{R}^2)$ into $GL(\mathbb{R}^2) := \mathrm{Iso}(\mathbb{R}^2)$ is not onto. [Hint: verify that $B \in GL(\mathbb{R}^2)$ defined by $B(e_1) := -2e_1$, $B(e_2) := -e_2$ is not in the image of this map.]
5. Find the solution $u(t) := (x(t), y(t), z(t))$ to the system $u'(t) = A u(t)$, $u(0) = u_0$ given by
$$x'(t) = y(t) + z(t), \qquad y'(t) = -z(t), \qquad z'(t) = -x(t) + z(t)$$
by computing $e^{tA}$, where $A$ is the matrix of this system. [Hint: verify that the characteristic polynomial of $A$ is $\lambda^3 - \lambda^2 + \lambda - 1$, so that $A^3 = A^2 - A + I$.]
6. Consider the system
$$x'(t) = y(t) + z(t), \qquad x(0) = x_0,$$
$$y'(t) = x(t), \qquad y(0) = y_0,$$
$$z'(t) = x(t) + y(t) + z(t), \qquad z(0) = z_0.$$
Verify that for some $c \in \mathbb{R}$ one has $z(t) = x(t) + y(t) + c$ for $t \in \mathbb{R}$. Verify that the solution to the system
$$x'(t) = x(t) + 2y(t) + c, \qquad x(0) = x_0,$$
$$y'(t) = x(t), \qquad y(0) = y_0$$
satisfies $x(t) = a e^{-t} + b e^{2t}$, $y(t) = -a e^{-t} + (b/2) e^{2t} - c/2$. Deduce from this the solution to the first system and the expression of $e^{tA}$ where $A$ is the matrix of this system. From this expression compute $A^n$.
10.1.5 The Laplace Transform
Whereas the Fourier transform is adapted to functions defined on all of $\mathbb{R}$ or $\mathbb{R}^d$, the Laplace transform is suited to functions defined on $\mathbb{R}_+ := [0, \infty[$ or $\mathbb{P} := \,]0, \infty[$. We deal with functions with values in $\mathbb{R}$ or $\mathbb{C}$ but an extension to functions with values in a complex Banach space could be considered. This transform enables us to change an evolution equation into another equation, a polynomial equation if the evolution equation is a linear differential equation of order $n$. It is defined as follows.
Definition 10.1 Given a measurable function $f : \mathbb{P} \to \mathbb{C}$, its Laplace transform is the function $f^\#$ on $\operatorname{dom} f^\# := \{z \in \mathbb{C} : t \mapsto |e^{-tz} f(t)| \in L_1(\mathbb{R}_+)\}$ given by
$$f^\#(z) := \int_0^\infty e^{-tz} f(t)\,dt, \qquad z \in \operatorname{dom} f^\#.$$
One says that $f$ is Laplace transformable if $\operatorname{dom} f^\#$ is nonempty.
Let us observe that if $z := r + is \in \operatorname{dom} f^\#$ with $r, s \in \mathbb{R}$, then for all $w := p + iq$ with $p \geq r$ one has $w \in \operatorname{dom} f^\#$ since $|e^{-(p+iq)t} f(t)| \leq |e^{-(r+is)t} f(t)|$. Thus, for $\sigma_f := \inf\{\operatorname{Re} z : t \mapsto e^{-tz} f(t) \in L_1(\mathbb{R}_+)\}$, one has $]\sigma_f, \infty[ \times \mathbb{R} \subset \operatorname{dom} f^\# \subset [\sigma_f, \infty[ \times \mathbb{R}$, where $\mathbb{C}$ is identified with $\mathbb{R} \times \mathbb{R}$. If $f$ is locally integrable and of exponential order in the sense that there exist $a$, $b$, $c$ in $\mathbb{R}$ such that $|f(t)| \leq b e^{at}$ for $t \in [c, \infty[$, then one has $]a, \infty[ + i\mathbb{R} \subset \operatorname{dom} f^\#$. It can be shown that the function $f^\#$ is analytic on the half-space $]\sigma_f, \infty[ \times \mathbb{R}$ and such that $f^\#(p + iq) \to 0$ as $p \to \infty$. We leave the proof of the following propositions to the reader as exercises (see also [229]).
Proposition 10.6 Let $f : \mathbb{P} \to \mathbb{R}$ be Laplace transformable and for $c \in \mathbb{R}$ let $g$ be given by $g(t) := e^{ct} f(t)$. Then, for $s > c + \sigma_f$ one has $g^\#(s) = f^\#(s - c)$.
Proposition 10.7 Let $f$, $g : \mathbb{P} \to \mathbb{R}$ be Laplace transformable. Then, for the convolution $f * g$ given by $(f * g)(t) := \int_0^t f(r) g(t - r)\,dr$, one has $\operatorname{dom} f^\# \cap \operatorname{dom} g^\# \subset \operatorname{dom} (f * g)^\#$ and for $s \in \operatorname{dom} f^\# \cap \operatorname{dom} g^\#$ one has $(f * g)^\#(s) = f^\#(s).g^\#(s)$.
Proposition 10.8 Let $f : \mathbb{P} \to \mathbb{R}$ be continuous and such that $\sigma_f = 0$ and $f^\#(z) = 0$ for all $z \in \mathbb{P}$. Then $f = 0$.
Since the Laplace transform is linear, this last result shows that the map $f \mapsto f^\#$ is injective on the space $C_b(\mathbb{P})$ of bounded continuous functions on $\mathbb{P}$. Moreover, if $f$ and $g$ are two regulated functions of exponential order on $\mathbb{R}_+$ and if $f^\# = g^\#$, then $f = g$ at any point of continuity of $f$ and $g$. We shall not determine the image of the Laplace transform; we just quote the next result from [229, Thm 5.42].
Proposition 10.9 Let $a$, $b \in \mathbb{P}$, $h \in L_1(\mathbb{R}, \mathbb{R}_+)$ and let $g : \,]a, \infty[ + i\mathbb{R} \to \mathbb{C}$ be such that $|g(z) - b/z| \leq h(y)$ for $z := x + iy$ with $x > a$. Then there exists some continuous function $f$ on $\mathbb{P}$ such that $f(t) \leq b e^{at}$ for all $t \in \mathbb{P}$ and $f^\# = g$.
The following result can be used to solve linear differential equations.
Proposition 10.10 Let $f : \mathbb{P} \to \mathbb{R}$ be of class $C^1$, of exponential order and such that $f'$ is Laplace transformable. Then, for $a > \sigma_{f'}$, $a > 0$ one has
$$(f')^\#(s) = s f^\#(s) - f(0_+), \qquad s \in [a, +\infty[.$$
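Proof An integration by parts yields
\[ (f')^{\#}(s) = \lim_{\varepsilon \to 0_+} \Big( \big[ e^{-st} f(t) \big]_{\varepsilon}^{1/\varepsilon} + s \int_{\varepsilon}^{1/\varepsilon} e^{-st} f(t)\,dt \Big) = -f(0_+) + s f^{\#}(s). \]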
Example Given $a_0, a_1 \in \mathbb{R}$, let us consider the second-order differential equation $f'' + a_1 f' + a_0 f = 0$. Since $(f'')^{\#}(s) = s^2 f^{\#}(s) - s f(0_+) - f'(0_+)$, one can show that
\[ f^{\#}(s) = \frac{s f(0_+) + f'(0_+) + a_1 f(0_+)}{s^2 + a_1 s + a_0}, \]
from which one can get $f$. In particular, when $f$ is of class $C^n$ and when $f^{(k)}(0_+) := \lim_{r \to 0_+} f^{(k)}(r)$ exists for $k \in \mathbb{N}_n \cup \{0\}$, we see that if $f$ is a solution to a linear differential equation of order $n$, its Laplace transform satisfies a polynomial equation of degree $n$. Such results explain the interest of using the Laplace transform for ordinary differential equations, the differentiation being transformed into a multiplication by $s$ along with the subtraction of the initial value. Moreover, tables containing the Laplace transforms of commonly occurring functions are available in several books, for instance [140, 229], and on //mathworld.wolfram.com/LaplaceTransform.html.

As an example of the use of the Laplace transform, let us consider the heat equation (in which the Laplacian $\Delta$ bears on the space variable $x$ in an open subset $\Omega$ of $\mathbb{R}^d$)
\[ \frac{\partial v}{\partial t}(x, t) - \Delta v(x, t) = 0, \qquad (x, t) \in \Omega \times \mathbb{P}, \]
with initial condition $v(x, 0) = f(x)$. Performing a Laplace transform with respect to $t$:
\[ v^{\#}(x, r) := v_x^{\#}(r) := \int_0^{\infty} e^{-rt} v(x, t)\,dt, \]
we see that (under appropriate assumptions) the heat equation implies that
\[ \Delta v^{\#}(x, r) = \int_0^{\infty} e^{-rt} \Delta v(x, t)\,dt = \int_0^{\infty} e^{-rt} \frac{\partial v}{\partial t}(x, t)\,dt = r \int_0^{\infty} e^{-rt} v(x, t)\,dt + \big[ e^{-rt} v(x, t) \big]_{t=0}^{t=\infty} = r v^{\#}(x, r) - f(x). \]
Thus, the function $u_r$ on $\Omega$ given by $u_r(x) := v^{\#}(x, r)$ satisfies the stationary equation $\Delta u_r - r u_r = -f$.
Therefore, the rich knowledge about the solutions to this stationary equation can be transferred to the solution of the heat equation.
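The formula for $f^{\#}$ obtained in the second-order example above can be checked numerically. The following is only a minimal sketch: the coefficients $a_1 = 3$, $a_0 = 2$, the initial values $f(0_+) = 1$, $f'(0_+) = 0$, and the helper names `laplace_numeric`, `solution`, `transform_formula` are illustrative assumptions, not taken from the text. The Laplace integral is truncated and approximated by Simpson's rule.

```python
import math

def laplace_numeric(f, s, upper=40.0, steps=40000):
    """Approximate int_0^upper e^{-s t} f(t) dt by composite Simpson's rule.

    `steps` must be even; `upper` truncates the integral over [0, infinity[.
    """
    h = upper / steps
    total = f(0.0) + math.exp(-s * upper) * f(upper)
    for k in range(1, steps):
        t = k * h
        weight = 4.0 if k % 2 else 2.0
        total += weight * math.exp(-s * t) * f(t)
    return total * h / 3.0

# Illustrative data (an assumption): f'' + 3 f' + 2 f = 0, f(0+) = 1, f'(0+) = 0.
A1, A0 = 3.0, 2.0
F0, DF0 = 1.0, 0.0

def solution(t):
    """Explicit solution 2 e^{-t} - e^{-2t} of the illustrative initial value problem."""
    return 2.0 * math.exp(-t) - math.exp(-2.0 * t)

def transform_formula(s):
    """(s f(0+) + f'(0+) + a1 f(0+)) / (s^2 + a1 s + a0), as in the example."""
    return (s * F0 + DF0 + A1 * F0) / (s * s + A1 * s + A0)

if __name__ == "__main__":
    for s in (1.0, 2.5):
        print(s, laplace_numeric(solution, s), transform_formula(s))
```

For each sampled $s$ the two printed values should agree up to the quadrature error, which illustrates how the transform turns the differential equation into an algebraic one.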
Exercises
1. For $f$ given by $f(t) := e^{ct}$ verify that $f^{\#}(z) = 1/(z - c)$ and that $(\cosh)^{\#}(z) = z/(z^2 - 1)$.
2. For $f : \mathbb{R} \to \mathbb{R}$ null on $\mathbb{R}_-$ and $a \in \mathbb{R}_+$ let $f_a$ be given by $f_a(t) := f(t - a)$. Show that $(f_a)^{\#}(z) = e^{-az} f^{\#}(z)$ when both sides are defined.
3. Let $f : \mathbb{P} \to \mathbb{R}$ be locally integrable and Laplace transformable and let $g$ be the primitive of $f$ satisfying $g(0) = 0$. Show that $g^{\#}(s) = f^{\#}(s)/s$ for $s \in \mathbb{P}$.
4. Show by induction that if $f$ is of class $C^n$ on $\mathbb{P}$ and if $f^{(k)}$ is Laplace transformable for all $k \in \mathbb{N}_n$ then $(f^{(n)})^{\#}(s) = s^n f^{\#}(s) - s^{n-1} f(0_+) - \cdots - f^{(n-1)}(0_+)$.
5. Using the Laplace transform, find the solution $y$ of the differential equation $y''(t) + y(t) = t$, $t \in \mathbb{R}_+$, satisfying the initial conditions $y(0) = 1$, $y'(0) = 2$. [Hint: verify that $y(t) = t + \cos t + \sin t$ by showing that $y^{\#}(s) = s(s^2 + 1)^{-1} + 2(s^2 + 1)^{-1} + s^{-2}(s^2 + 1)^{-1}$.]
6. Let $f : \mathbb{P} \to \mathbb{R}$ be Laplace transformable and such that $f(0_+) := \lim_{t \to 0_+} f(t)$ (resp. $f(\infty) := \lim_{t \to \infty} f(t)$) exists. Show that $s f^{\#}(s) \to f(0_+)$ as $s \to \infty$ (resp. $s f^{\#}(s) \to f(\infty)$ as $s \to 0_+$).
10.2 Semigroups
The mathematical complexity of (nonlinear) semigroups corresponds to the complexity of time-dependent processes in nature. One may observe turbulence, shock waves, and explosions. Mathematically, these phenomena are reflected by instability, strange attractors, and blowing up effects that are outside the scope of this book. Thus, we essentially limit our study to central results in the case of linear semigroups.

When $A$ is a densely defined unbounded linear operator on a Banach space $X$, it is not obvious how to generalize the definition and the properties of $e^{tA} := \exp tA$ considered in the preceding section for $A \in L(X)$. In this section we first study the consequences of a property weaker than the group property $e^{(s+t)A} = e^{sA} \circ e^{tA}$ for $s, t \in \mathbb{R}$. We show that under an assumption on the resolvent of $A$ the solvability of (10.21) with $u_0 \in D(A)$ can be established. After a subsection devoted to a general class of multimaps, we close the section with some views on the nonlinear multivalued case. We start with a study of continuous linear semigroups. In this section, for linear maps $F, G : X \to X$ and $x \in X$ we often write $Fx$ instead of $F(x)$ and $FG$ instead of $F \circ G$.
10.2.1 Continuous Linear Semigroups and Their Generators
We adopt the notion of a continuous semigroup of operators in the following sense as an appropriate generalization of the notion of a group of operators.

Definition 10.2 A family $S(\cdot) := (S(t))_{t \in \mathbb{R}_+}$ of maps $S(t) : X \to X$ is called a semigroup if it satisfies the conditions
(SG0) $S(0) = I_X$;
(SG) for all $s, t \in \mathbb{R}_+$, $S(s) \circ S(t) = S(s + t)$.
If, moreover, the following condition holds, then $(S(t))_{t \in \mathbb{R}_+}$ is said to be a (strongly) continuous semigroup:
(SGC) the map $(t, x) \mapsto S(t)x$ is continuous on $\mathbb{R}_+ \times X$.
The following weakening of condition (SGC) is often adopted:
(SGC0) for all $x \in X$ the map $S(\cdot)x$ is continuous at $0$ from $\mathbb{R}_+$ into $X$.
Lemma 10.4 When $X$ is a Banach space and the maps $S(t)$ are linear and continuous, conditions (SGC) and (SGC0) are equivalent. Moreover, when the semigroup $S(\cdot)$ satisfies these conditions there exist $\omega \in \mathbb{R}$ and $c \in \mathbb{R}_+$ such that
\[ \|S(t)\| \le c e^{\omega t} \qquad \forall t \in \mathbb{R}_+. \tag{10.22} \]
If $c = 1$ one says that $(S(t))_{t \in \mathbb{R}_+}$ is an $\omega$-contraction semigroup. If, moreover, $\omega = 0$, one says that $(S(t))_{t \in \mathbb{R}_+}$ is a contraction semigroup, or more correctly, a continuous nonexpansive semigroup.

Since in this subsection and the next one we only consider semigroups of linear operators, we often omit any mention of linearity. Moreover, the composition of two operators $A$, $B$ is often written $AB$ rather than $A \circ B$.

Proof Let us first show that condition (SGC0) implies that there exist some $\delta > 0$ and $c \in \mathbb{R}_+$ such that $\|S(t)\| \le c$ for all $t \in [0, \delta]$. If that assertion were false, we could find a sequence $(t_n) \to 0_+$ such that $\|S(t_n)\| > n$ for all $n \in \mathbb{N}$. Then, by the uniform boundedness theorem there would exist some $x \in X$ such that $\sup_n \|S(t_n)x\| = \infty$, contradicting the assumption that $(S(t_n)x) \to x$.

Let $\delta > 0$, $c \in \mathbb{R}_+$ be such that $\|S(t)\| \le c$ for all $t \in [0, \delta]$ (one has $c \ge 1$ since $S(0) = I_X$). Given $t \in \mathbb{R}_+$ let $s \in [0, 1[$ and $n \in \mathbb{N}$ be such that $t/\delta = n + s$. Then, for $\omega := (1/\delta) \log c \ge 0$ we have
\[ \|S(t)\| \le \|S(s\delta)\| \cdot \|S(\delta)\|^n \le c \cdot c^n = c e^{\omega n \delta} \le c e^{\omega t}. \]
Given $x \in X$, the continuity from the right of the map $t \mapsto S(t)x$ stems from the relation $S(t + s)x - S(t)x = S(t)(S(s)x - x)$ for $s \in \mathbb{R}_+$. The continuity from the left of this map is a consequence of the inequality
\[ \|S(t - s)x - S(t)x\| \le \|S(t - s)\| \cdot \|x - S(s)x\| \le c e^{\omega t} \|x - S(s)x\|. \]
Given $(t, x) \in \mathbb{R}_+ \times X$ and a sequence $((t_n, x_n)) \to (t, x)$, we have
\[ \|S(t_n)x_n - S(t)x\| \le \|S(t_n)(x_n - x)\| + \|S(t_n)x - S(t)x\| \to 0 \quad \text{as } n \to \infty, \]
since $(\|S(t_n)\|)_n$ is bounded. Thus (SGC) holds.
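The growth estimate (10.22) and the semigroup law can be illustrated in finite dimensions, where $S(t) = e^{tA}$ for a matrix $A$. The sketch below is an illustration under assumptions, not part of the text: the matrix `A`, the truncated Taylor series used for the exponential, the max-row-sum operator norm, and the helper names are all hypothetical choices. It follows the proof of Lemma 10.4: estimate $c$ as the supremum of $\|S(t)\|$ on $[0, \delta]$ (here on a grid) and set $\omega := (1/\delta) \log c$.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_exp(a, t, terms=60):
    """Truncated Taylor series of exp(tA); adequate for moderate values of ||tA||."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term becomes (tA)^k / k!
        term = [[t * v / k for v in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def op_norm(m):
    """Operator norm induced by the max norm on R^n (largest absolute row sum)."""
    return max(sum(abs(v) for v in row) for row in m)

def growth_constants(a, delta=1.0, samples=200):
    """Grid estimate of c = sup ||S(t)|| over [0, delta], and omega = log(c) / delta."""
    c = max(op_norm(mat_exp(a, delta * k / samples)) for k in range(samples + 1))
    return c, math.log(c) / delta

# Illustrative generator (an assumption, not taken from the text).
A = [[-1.0, 2.0], [0.0, -1.0]]

if __name__ == "__main__":
    c, omega = growth_constants(A)
    for t in (0.3, 1.7, 4.2):
        print(t, op_norm(mat_exp(A, t)), c * math.exp(omega * t))
```

Since $c$ is only estimated on a grid, the bound should be read up to a small tolerance; for this particular matrix the supremum on $[0, 1]$ is attained at the grid point $t = 1/2$.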
Lemma 10.5 Let $(S(t))_{t \in \mathbb{R}_+}$ be a semigroup of continuous linear operators on a Banach space $X$ and let $x \in X$, $u(t) := S(t)x$. Then $u(\cdot)$ is right differentiable on $\mathbb{R}_+$ if and only if $u(\cdot)$ is right differentiable at $0$. Moreover, for $t \in \mathbb{R}_+$ the right derivative $u'_r$ of $u := u(\cdot)$ satisfies $u'_r(t) = S(t)u'_r(0)$. If condition (SGC) is satisfied, then $u(\cdot)$ is differentiable on $\mathbb{P} := {]0, \infty[}$ whenever $u'_r(0)$ exists.

Proof For the first assertion it suffices to prove that $u(\cdot)$ is right differentiable at $t \in \mathbb{P}$ when $u'_r(0)$ exists. Since $S(t)$ is linear and continuous we have
\[ \frac{1}{r}\big(S(t + r)x - S(t)x\big) = S(t)\Big(\frac{S(r)x - x}{r}\Big) \to S(t)u'_r(0) \quad \text{as } r \to 0_+. \]
When (SGC) holds, to prove that $u(\cdot)$ has $S(t)u'_r(0)$ as left derivative at $t$ we write
\[ \frac{1}{-r}\big(S(t - r)x - S(t)x\big) = S(t - r)\Big(\frac{1}{r}\big(S(r)x - x\big)\Big), \qquad r \in {]0, t[}, \]
and we see from (SGC) that the right-hand side converges to $S(t)u'_r(0)$ as $r \to 0_+$.

The preceding lemma incites us to consider the set $D(A)$ of $x \in X$ such that $u(\cdot) := S(\cdot)x$ is right differentiable at $0$ and the operator $A : x \mapsto u'_r(0) = (S(\cdot)x)'_r(0)$. The latter is called the infinitesimal generator (or just the generator) of the semigroup $S(\cdot)$:
\[ D(A) := \Big\{ x \in X : \exists v \in X,\ \frac{S(t)x - x}{t} \to v \text{ as } t \to 0_+ \Big\}, \qquad Ax := \lim_{t \to 0_+} \frac{S(t)x - x}{t} \quad \text{if } x \in D(A). \]
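The definition of the generator as a limit of difference quotients can be observed on a simple example. The sketch below uses the rotation group on $\mathbb{R}^2$, $S(t) = e^{tA}$ with $A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$; this example and the helper names are illustrative assumptions, not taken from the text. Here $D(A)$ is the whole space and the error of the difference quotient is of order $t$.

```python
import math

def rotation_semigroup(t, x):
    """S(t)x for S(t) = exp(tA) = [[cos t, sin t], [-sin t, cos t]], A = [[0, 1], [-1, 0]]."""
    c, s = math.cos(t), math.sin(t)
    return (c * x[0] + s * x[1], -s * x[0] + c * x[1])

def generator(x):
    """Ax for A = [[0, 1], [-1, 0]]."""
    return (x[1], -x[0])

def difference_quotient(t, x):
    """(S(t)x - x) / t, whose limit as t -> 0+ defines Ax."""
    sx = rotation_semigroup(t, x)
    return ((sx[0] - x[0]) / t, (sx[1] - x[1]) / t)

def quotient_error(t, x):
    """Euclidean distance between the difference quotient and Ax."""
    q, g = difference_quotient(t, x), generator(x)
    return math.hypot(q[0] - g[0], q[1] - g[1])

if __name__ == "__main__":
    x = (1.0, 2.0)
    for t in (1e-1, 1e-2, 1e-3, 1e-4):
        print(t, quotient_error(t, x))
```

The printed errors should shrink roughly in proportion to $t$, consistent with $u'_r(0) = Ax$.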
Proposition 10.11 Let $(S(t))_{t \in \mathbb{R}_+}$ be a continuous semigroup of linear operators on a Banach space $X$ and let $A$ be its generator. Then the domain $D(A)$ of $A$ is a linear subspace, $A$ is linear; for all $u_0 \in D(A)$ and all $t \in \mathbb{R}_+$ one has $u(t) := S(t)u_0 \in D(A)$ and $u : t \mapsto S(t)u_0$ is continuously differentiable with $u'(t) = Au(t)$ for all $t \in \mathbb{R}_+$. Moreover, $Au(t) = S(t)Au_0$ and $S(t)u_0 = u_0 + \int_0^t S(r)Au_0\,dr$.
Proof The linearity of $A$ is obvious. From the lemma we know that if $u_0 \in D(A)$ then, for all $t \in \mathbb{P}$,
\[ \frac{S(s)S(t)u_0 - S(t)u_0}{s} = \frac{S(t)(S(s)u_0 - u_0)}{s} \to S(t)u'_r(0) \quad \text{as } s \to 0_+. \]
This means that $u(t) \in D(A)$, that $u(\cdot) := S(\cdot)u_0$ is differentiable at $t$, and that $Au(t) = S(t)Au_0$. Since $S(\cdot)Au_0$ is continuous, $u(\cdot) := S(\cdot)u_0$ is continuously differentiable on $\mathbb{R}_+$ and $u(t) - u_0 = \int_0^t u'(r)\,dr = \int_0^t S(r)Au_0\,dr$ since $t \mapsto S(t)Au_0$ is continuous.

Theorem 10.6 Let $A$ be the generator of a continuous semigroup $(S(t))_{t \in \mathbb{R}_+}$ of linear operators on a Banach space $X$. Then $D(A)$ is a dense linear subspace of $X$ and $A$ is closed in the sense that its graph is a closed subset of $X \times X$. Moreover, $S(\cdot)$ is determined by $A$ in the sense that if $T(\cdot)$ is another continuous semigroup with generator $A$, then $T(\cdot) = S(\cdot)$.

Proof Given $x \in X$ and $t > 0$, let $y_t := \int_0^t S(s)x\,ds$, $x_t := t^{-1} y_t$. Since the map $s \mapsto S(s)x$ is continuous, we have $(x_t) \to x$ as $t \to 0_+$. To prove the first assertion, let us show that $x_t \in D(A)$ for all $t > 0$, or, equivalently (since $D(A)$ is a linear subspace), that $y_t \in D(A)$. Now for $r \in {]0, t[}$ we have
\[ \begin{aligned} r^{-1}(S(r)y_t - y_t) &= r^{-1}\Big[ S(r)\int_0^t S(s)x\,ds - \int_0^t S(s)x\,ds \Big] = r^{-1}\int_0^t \big(S(r + s)x - S(s)x\big)\,ds \\ &= r^{-1}\int_r^{r+t} S(s)x\,ds - r^{-1}\int_0^t S(s)x\,ds \\ &= r^{-1}\int_t^{t+r} S(s)x\,ds - r^{-1}\int_0^r S(s)x\,ds \to S(t)x - x \quad \text{as } r \to 0_+. \end{aligned} \]
This shows that $y_t \in D(A)$ and $Ay_t = S(t)x - x$.

In order to prove that the graph $G(A)$ of $A$ is closed, we consider the limit $(x, y)$ of a sequence $((x_n, y_n))$ in $G(A)$. Since $x_n \in D(A)$, the preceding proposition ensures that
\[ S(r)x_n - x_n = \int_0^r \frac{d}{dt}\big(S(t)x_n\big)\,dt = \int_0^r S(t)Ax_n\,dt \qquad \forall r > 0. \]
Since $(Ax_n)_n \to y$ and, by Lemma 10.4, there exist positive constants $c$, $\omega$ such that $\|S(t)Ax_n\| \le c e^{\omega t}(\|y\| + 1)$ for $n$ large enough, passing to the limit on $n$, we get
\[ S(r)x - x = \int_0^r S(t)y\,dt \qquad \forall r > 0. \]
Thus, $r^{-1}(S(r)x - x) = r^{-1}\int_0^r S(t)y\,dt \to y$ as $r \to 0_+$. By definition of $D(A)$ this means that $x \in D(A)$ and $y = Ax$.
Let us prove the last assertion. For $r$ fixed in $\mathbb{R}_+$ and $t \in [0, r]$, let us set $Q(t) := T(r - t)S(t)$. Then, for $x \in D(A)$, since $y := S(t)x \in D(A)$ by Proposition 10.11, $Q(\cdot)x$ is differentiable with
\[ (Q(\cdot)x)'(t) = -AT(r - t)S(t)x + T(r - t)AS(t)x = 0 \]
since $AT(r - t)y = T(r - t)Ay$. Observing that $Q(0)x = T(r)x$, $Q(r)x = S(r)x$, and $Q(r)x = Q(0)x$, we conclude that $S(r)x = T(r)x$. Since $D(A)$ is dense in $X$ and both $S(r)$ and $T(r)$ are linear and continuous, we get $S(r) = T(r)$.

It is natural to wonder whether any densely defined closed linear operator $A$ on $X$ is the generator of a continuous semigroup. The answer is negative, as the counterexample of Exercise 8 shows. Under an additional assumption we shall get a positive answer in the next subsection.
Exercises
1. Show that for a continuous linear semigroup $S(\cdot)$ with generator $A$ on a Banach space $X$ the following assertions are equivalent:
(a) $A$ is everywhere defined and continuous;
(b) the domain $D(A)$ of $A$ is $X$;
(c) the domain $D(A)$ is closed;
(d) the map $S(\cdot)$ is continuous from $\mathbb{R}_+$ into $L(X) := L(X, X)$.
2. Show that a linear semigroup $(S(t))_{t \ge 0}$ on $X$ is continuous if and only if it is weakly continuous in the sense that for all $x \in X$, $x^* \in X^*$ the function $t \mapsto \langle x^*, S(t)x \rangle$ is continuous on $\mathbb{R}_+$.
3. Prove that the translation semigroup $(T(t))_{t \ge 0}$ defined by $T(t)(f)(s) := f(s + t)$ for $s, t \in \mathbb{R}_+$, $f \in C_0(\mathbb{R}_+) := \{ f \in C(\mathbb{R}_+) : \lim_{s \to \infty} f(s) = 0 \}$ is a continuous semigroup when $C_0(\mathbb{R}_+)$ is endowed with the sup norm $\|\cdot\|_\infty$.
4. Consider the same question when $C_0(\mathbb{R}_+)$ is replaced with the space $C_{ub}(\mathbb{R}_+)$ of $f \in C(\mathbb{R}_+)$ that are bounded and uniformly continuous, $C_{ub}(\mathbb{R}_+)$ being endowed with $\|\cdot\|_\infty$.
5. Consider the same question when $C_0(\mathbb{R}_+)$ is replaced with the space $C_0^1(\mathbb{R}_+)$ of $f \in C_0(\mathbb{R}_+)$ that are of class $C^1$, with $f' \in C_0(\mathbb{R}_+)$, $C_0^1(\mathbb{R}_+)$ being endowed with the norm $f \mapsto \|f\|_\infty + \|f'\|_\infty$.
6. Let $c \in \mathbb{P}$, $\omega \in \mathbb{R}$, and let $(S(t))_{t \in \mathbb{R}_+}$ be a continuous semigroup of linear operators on a Banach space $X$ satisfying $\|S(t)\| \le c e^{\omega t}$ for all $t \in \mathbb{R}_+$. Setting $\|x\|' := \sup\{ e^{-\omega t}\|S(t)x\| : t \in \mathbb{R}_+ \}$, show that $\|\cdot\|'$ is a norm on $X$ satisfying $\|\cdot\| \le \|\cdot\|' \le c\|\cdot\|$. Verify that $S(\cdot)$ is a continuous semigroup on $(X, \|\cdot\|')$ satisfying $\|S(t)x\|' \le e^{\omega t}\|x\|'$ for all $x \in X$.
7. Extending operators on a real Banach space $X$ to its complexified space, prove that a closed, densely defined linear operator $A$ generates a continuous nonexpansive semigroup if and only if for all $\lambda \in \mathbb{C}$ satisfying $\operatorname{Re} \lambda > 0$ one has $\lambda \in \rho(A)$ and $\|(\lambda I - A)^{-1}\| \le 1/\operatorname{Re} \lambda$. [See [113, Thm 3.5] and the next subsection.]
8. Verify the following counterexample showing that a closed linear operator $A$ with dense domain in a Banach space $X$ and whose spectrum is contained in some interval ${]-\infty, \omega]}$ is not necessarily the generator of a continuous semigroup. Take for $X$ the space of continuous functions $f : \mathbb{R}_+ \to \mathbb{R}$ that are continuously differentiable on $[0, 1]$ and satisfy $\lim_{r \to \infty} f(r) = 0$, and endow $X$ with the norm $f \mapsto \|f\| := \sup_{r \in \mathbb{R}_+} |f(r)| + \sup_{r \in [0, 1]} |f'(r)|$. Consider the operator $A : D(A) \to X$ defined by $D(A) := \{ f \in X \cap C^1(\mathbb{R}_+) : f' \in X \}$, $Af := f'$. Show that $D(A)$ is dense, that $A$ is closed and that the resolvent set of $A$ contains $\mathbb{P}$, but that if $A$ generates a continuous semigroup $(S(t))_{t \ge 0}$ one must have $(S(t)f)(r) = f(r + t)$ for all $f \in X$, $r, t \in \mathbb{R}_+$, contradicting the fact that $S(t)f \in X$ for all $f \in X$.
10.2.2 Characterization of Generators of Continuous Semigroups
Recall that the resolvent set $\rho(A)$ of a closed (linear) operator $A$ is the set of $\lambda \in \mathbb{R}$ such that $\lambda I - A$ is a bijection from $D(A)$ onto $X$, $I$ standing for the identity map $I_X$ of $X$. Then, the resolvent operator $R_\lambda := R_\lambda^A := (\lambda I - A)^{-1}$ is a continuous linear operator, as seen in Proposition 3.38. The following proposition shows that $R_\lambda^A$ can be considered as the Laplace transform of the semigroup generated by $A$. The estimate it provides is the first step in the characterization of generators of continuous semigroups we have in view.

Proposition 10.12 If $R_\lambda$ is the resolvent of the generator $A$ of a continuous semigroup $(S(t))_{t \in \mathbb{R}_+}$ and if $\omega \in \mathbb{R}$ and $c \in \mathbb{R}_+$ are such that $\|S(t)\| \le c e^{\omega t}$ for all $t \in \mathbb{R}_+$ as in Lemma 10.4, then ${]\omega, \infty[} \subset \rho(A)$ and for all $\lambda \in {]\omega, \infty[}$ one has $\|R_\lambda\| \le c/(\lambda - \omega)$ and
\[ R_\lambda x = \int_0^{\infty} e^{-\lambda r} S(r)x\,dr. \]
In particular, if $A$ is the generator of a nonexpansive continuous semigroup, then one has $\mathbb{P} \subset \rho(A)$ and for all $\lambda \in \mathbb{P}$ one has $\|R_\lambda\| \le 1/\lambda$.

Proof By our assumption, for $\lambda > \omega$ we have $\|e^{-\lambda r} S(r)x\| \le c e^{-(\lambda - \omega)r}\|x\|$, so that the integral $T_\lambda(x) := \int_0^{\infty} e^{-\lambda r} S(r)x\,dr$ is well defined and $\|T_\lambda(x)\| \le c\|x\|/(\lambda - \omega)$.
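Setting $\tilde{S}(t) := e^{-\lambda t} S(t)$, we reduce the assertion of the statement to the case $\omega < 0$, $\lambda = 0$, the generator of the semigroup $(\tilde{S}(t))_{t \ge 0}$ being $\tilde{A} := A - \lambda I_X$, so that $R_\lambda = (-\tilde{A})^{-1}$. Then, for $x \in X$ and $t \in \mathbb{P}$, for $\tilde{T}_0 x := \int_0^{\infty} \tilde{S}(r)x\,dr = T_\lambda x$ we have
\[ \begin{aligned} \frac{1}{t}\big(\tilde{S}(t) - I_X\big)\tilde{T}_0 x &= \frac{1}{t}\int_0^{\infty} \big(\tilde{S}(t) - I_X\big)\big(\tilde{S}(r)x\big)\,dr = \frac{1}{t}\int_0^{\infty} \tilde{S}(t + r)x\,dr - \frac{1}{t}\int_0^{\infty} \tilde{S}(r)x\,dr \\ &= \frac{1}{t}\int_t^{\infty} \tilde{S}(s)x\,ds - \frac{1}{t}\int_0^{\infty} \tilde{S}(r)x\,dr = -\frac{1}{t}\int_0^t \tilde{S}(s)x\,ds. \end{aligned} \]
Since the right-hand side converges to $-x$ as $t \to 0_+$ we get that $\tilde{T}_0 x \in D(\tilde{A}) = D(A)$ and $\tilde{A}(\tilde{T}_0 x) = -x$. For $x \in X$ we have
\[ \tilde{T}_0 x = \lim_{t \to \infty} \int_0^t \tilde{S}(r)x\,dr \]
and similarly, for $x \in D(\tilde{A})$, using Lemma 3.21 and the fact that $\tilde{A}$ has a closed graph by Theorem 10.6,
\[ \tilde{T}_0 \tilde{A}x = \lim_{t \to \infty} \int_0^t \tilde{S}(r)\tilde{A}x\,dr = \lim_{t \to \infty} \int_0^t \tilde{A}\tilde{S}(r)x\,dr = \lim_{t \to \infty} \tilde{A}\int_0^t \tilde{S}(r)x\,dr. \]
Since $\tilde{A}$ is a closed linear operator, it follows that $\tilde{T}_0 \tilde{A}x = \tilde{A}\tilde{T}_0 x = -x$. These relations prove that $\tilde{T}_0 = -\tilde{A}^{-1}$, so that $0 \in \rho(\tilde{A})$ and $\lambda \in \rho(A)$, $R_\lambda = -\tilde{A}^{-1} = \tilde{T}_0 = T_\lambda$.

In the sequel, given $\omega \in \mathbb{R}_+$, we denote by $\mathcal{D}_\omega(X)$ the set of closed linear operators $A$ whose domains are dense in $X$ and that satisfy ${]\omega, \infty[} \subset \rho(A)$, the resolvent set of $A$, and $\|R_\lambda\| \le 1/(\lambda - \omega)$ for all $\lambda > \omega$, where $R_\lambda := (\lambda I - A)^{-1}$ is the resolvent operator of $A$. This class of operators will be studied later on and generalized in the next subsection. We introduce
\[ \omega(A) := \inf\{ \omega \in \mathbb{R}_+ : A \in \mathcal{D}_\omega(X) \} \qquad \text{for } A \in \mathcal{D}(X) := \bigcup_{\omega \in \mathbb{R}_+} \mathcal{D}_\omega(X). \]
Given $A \in \mathcal{D}_\omega(X)$ and $r \in {]0, 1/\omega[}$ we set
\[ P_r\ (:= P_r^A) := \frac{1}{r} R_{1/r} = (I - rA)^{-1}, \qquad A_r := \frac{1}{r}(P_r - I). \]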
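The representation of the resolvent in Proposition 10.12 as the Laplace transform of the semigroup can be tested numerically in a case where both sides are explicit. The following sketch again uses the rotation group on $\mathbb{R}^2$ (an illustrative assumption, as are the helper names, the truncation of the integral and the use of Simpson's rule); for it $c = 1$ and $\omega = 0$ in the Euclidean norm, so that $\|R_\lambda\| \le 1/\lambda$ is expected.

```python
import math

def semigroup(t, x):
    """S(t)x for the rotation group S(t) = exp(tA), A = [[0, 1], [-1, 0]]."""
    c, s = math.cos(t), math.sin(t)
    return (c * x[0] + s * x[1], -s * x[0] + c * x[1])

def resolvent_exact(lam, x):
    """(lam I - A)^{-1} x, computed with the 2x2 inverse formula."""
    d = lam * lam + 1.0
    return ((lam * x[0] + x[1]) / d, (-x[0] + lam * x[1]) / d)

def resolvent_laplace(lam, x, upper=30.0, steps=30000):
    """Simpson approximation of int_0^upper e^{-lam r} S(r) x dr (steps must be even)."""
    h = upper / steps
    acc = [0.0, 0.0]
    for k in range(steps + 1):
        r = k * h
        w = 1.0 if k in (0, steps) else (4.0 if k % 2 else 2.0)
        sx = semigroup(r, x)
        e = w * math.exp(-lam * r)
        acc[0] += e * sx[0]
        acc[1] += e * sx[1]
    return (acc[0] * h / 3.0, acc[1] * h / 3.0)

if __name__ == "__main__":
    x = (1.0, -2.0)
    for lam in (1.0, 2.0):
        print(lam, resolvent_exact(lam, x), resolvent_laplace(lam, x))
```

The truncation at `upper` is harmless here only because $\lambda \ge 1$ makes the integrand decay quickly; smaller values of $\lambda$ would need a longer interval.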
Note that since the graph of $A$ is closed, the graphs of $I - rA$ and $(I - rA)^{-1}$ are closed, as is easily seen, so that $P_r$ is a continuous linear operator from $X$ into $D(A)$. Moreover, since $(I - rA)(I - rA)^{-1} = I$, and hence $(I - rA)^{-1} - I = rA(I - rA)^{-1}$, one sees that
\[ A_r x = AP_r x \qquad \forall x \in X,\ \forall r \in {]0, 1/\omega[}. \]
Let us give some properties of the map $A_r$, called the Yosida approximation of $A$. We start with a study of the proximal operator $P_r$ of $A$. In Proposition 10.14 below we use the fact that $AP_r x = P_r Ax$ for $x \in D(A)$ (by Proposition 3.38 (a)), so that
\[ A_r x = AP_r x = P_r Ax. \tag{10.23} \]
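For a matrix $A$ the operators $P_r$ and $A_r$ can be computed directly, which gives a concrete check of relation (10.23). The sketch below is a finite-dimensional illustration under assumptions: the $2 \times 2$ matrix `A_EXAMPLE` and the helper names are hypothetical, and the inverse is taken with the explicit $2 \times 2$ formula.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def proximal(a, r):
    """P_r = (I - rA)^{-1} for a 2x2 matrix A (requires I - rA to be invertible)."""
    m = [[1.0 - r * a[0][0], -r * a[0][1]], [-r * a[1][0], 1.0 - r * a[1][1]]]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

def yosida(a, r):
    """Yosida approximation A_r = (P_r - I) / r."""
    p = proximal(a, r)
    return [[(p[i][j] - (1.0 if i == j else 0.0)) / r for j in range(2)] for i in range(2)]

# Illustrative matrix (an assumption); its symmetric part is negative definite.
A_EXAMPLE = [[-1.0, 1.0], [0.0, -2.0]]

if __name__ == "__main__":
    for r in (0.5, 0.05, 0.005):
        print(r, yosida(A_EXAMPLE, r))
```

As $r$ decreases the printed matrices approach `A_EXAMPLE`, in line with the convergence $A_r x \to Ax$ on $D(A)$ established below.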
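Proposition 10.13 Given $\omega \in \mathbb{R}_+$, $r \in {]0, 1/\omega[}$, with $1/\omega := \infty$ if $\omega = 0$, and $A \in \mathcal{D}_\omega(X)$ the following properties hold:
(a) $\|P_r\| \le (1 - \omega r)^{-1}$;
(b) $\|P_r x - x\| \le r(1 - r\omega)^{-1}\|Ax\|$ for all $x \in D(A)$;
(c) $\lim_{r \to 0_+} P_r x = x$ for all $x \in X$.

Proof (a) By definition of $\mathcal{D}_\omega(X)$ we have $\|P_r\| = \frac{1}{r}\|R_{1/r}\| \le (1 - \omega r)^{-1}$.
(b) For $x \in D(A)$ we have
\[ \|P_r x - x\| = r^{-1}\|R_{1/r}x - R_{1/r}(I - rA)x\| \le r^{-1}\|R_{1/r}\|\,\|x - (I - rA)x\| \le r(1 - r\omega)^{-1}\|Ax\|. \]
(c) By (b), assertion (c) holds for $x \in D(A)$. Since $\|P_r\| \le 2$ for $r \in {]0, 1/(2\omega)[}$ and since $D(A)$ is dense in $X$, given $x \in X$ and $\varepsilon > 0$, picking $w \in D(A)$ with $\|w - x\| \le \varepsilon/6$, the inequalities
\[ \|P_r x - x\| \le \|P_r x - P_r w\| + \|P_r w - w\| + \|w - x\| \le 3\|w - x\| + \frac{r}{1 - r\omega}\|Aw\| \]
yield $\|P_r x - x\| < \varepsilon$ provided $r$ is small enough.

Proposition 10.14 Given $\omega \in \mathbb{R}_+$, $r \in {]0, 1/\omega[}$, and $A \in \mathcal{D}_\omega(X)$ the following properties hold:
(a) $A_r$ is linear and continuous and $\|A_r\| \le r^{-1}(1 - \omega r)^{-1} + r^{-1}$;
(b) $\|A_r x\| \le (1 - r\omega)^{-1}\|Ax\|$ for all $x \in D(A)$;
(c) $\lim_{r \to 0_+} A_r x = Ax$ for all $x \in D(A)$.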
Proof (a) Since $A_r = \frac{1}{r}(P_r - I)$, this stems from Proposition 10.13(a). (b) Since $A_r x = \frac{1}{r}(P_r x - x)$, this follows from Proposition 10.13(b). (c) For $x \in D(A)$, by relation (10.23) and Proposition 10.13(c), we have $\|A_r x - Ax\| = \|P_r Ax - Ax\| \to 0$ as $r \to 0_+$.

Exercise For $\omega \in \mathbb{R}_+$ and $A \in \mathcal{D}_\omega(X)$ show that $A_r \in \mathcal{D}_{\omega(1 - r\omega)^{-1}}(X)$ for $r \in {]0, 1/\omega[}$.

We are ready to present the main result of this section. Its proof will be presented by starting with the special case of nonexpansive continuous semigroups, which is technically simpler than the general case (it corresponds to the case $\omega = 0$).

Theorem 10.7 (Hille-Yosida) If $A$ is a linear operator with domain $D(A)$ in a Banach space $X$ and if $\omega \in \mathbb{R}_+$, the following assertions are equivalent:
(a) $A$ is closed, densely defined, the resolvent set $\rho(A)$ of $A$ contains the interval ${]\omega, \infty[}$ and for all $\lambda > \omega$ one has $\|(\lambda I - A)^{-1}\| \le (\lambda - \omega)^{-1}$;
(b) $A$ is the infinitesimal generator of a continuous semigroup $(S(t))_{t \in \mathbb{R}_+}$ of linear continuous operators such that $\|S(t)\| \le e^{\omega t}$ for all $t \in \mathbb{R}_+$.

Corollary 10.2 For a linear operator $A$ with domain $D(A)$ in a Banach space $X$, the following assertions are equivalent:
(a) $A$ is closed, densely defined, $\mathbb{P} \subset \rho(A)$ and $\|(\lambda I - A)^{-1}\| \le \lambda^{-1}$ for all $\lambda \in \mathbb{P}$;
(b) $A$ is the generator of a nonexpansive continuous semigroup.

Proof of Corollary 10.2 In Theorem 10.6 and Proposition 10.12 we have seen the implication (b)$\Rightarrow$(a), so that it remains to show (a)$\Rightarrow$(b). For $r \in \mathbb{P}$ let
\[ S_r(t) := e^{tA_r}, \qquad t \in \mathbb{R}_+. \]
Since $A_r = r^{-1}P_r - r^{-1}I$, we have $e^{tA_r} = e^{-t/r} e^{(t/r)P_r}$, and since $\|P_r\| \le 1$ and $\|e^{(t/r)P_r}\| \le e^{\|(t/r)P_r\|} \le e^{t/r}$ we get $\|S_r(t)\| = \|e^{tA_r}\| \le e^{-t/r} e^{t/r} = 1$. Thus, for $r, s \in \mathbb{P}$, using the relation $A_r A_s = A_s A_r$, Lemma 10.3 yields
\[ \|S_r(t)x - S_s(t)x\| \le t\|A_r x - A_s x\|, \qquad x \in X,\ t \in \mathbb{R}_+. \tag{10.24} \]
For $x \in D(A)$, since $(A_{r(n)}x) \to Ax$ for any sequence $(r(n)) \to 0_+$, we see that $(S_{r(n)}(\cdot)x)$ is a Cauchy sequence with respect to the norm of uniform convergence on every compact interval of $\mathbb{R}_+$. Thus it converges uniformly to a limit $S(\cdot)x$ on
each such interval and the limit is independent of the choice of the sequence $(r(n))$. Since $D(A)$ is dense in $X$, taking $x \in X$ and a sequence $(x_n)$ in $D(A)$ with limit $x$ and using the estimates $\|S_r(t)\| \le 1$, $\|S_s(t)\| \le 1$ yielding the inequality
\[ \|S_r(t)(x) - S_s(t)(x)\| \le \|x - x_n\| + \|S_r(t)(x_n) - S_s(t)(x_n)\| + \|x_n - x\|, \]
we see that this convergence also holds for $x \in X$, uniformly on every compact interval of $\mathbb{R}_+$. Thus $t \mapsto S(t)x$ is continuous. Passing to the limit as $r \to 0_+$ in the relations $\|S_r(t)(x)\| \le \|x\|$ and $(S_r(t) \circ S_r(t'))(x) = S_r(t + t')(x)$, we get that $(S(t))_{t \ge 0}$ is a semigroup of nonexpansive linear maps. Lemma 10.4 shows that the semigroup $(S(t))_{t \ge 0}$ is continuous.

It remains to show that $A$ coincides with the generator $B$ of $(S(t))_{t \ge 0}$. We first note that for every $x \in D(A)$ and every $\tau > 0$ the maps $u_r : t \mapsto S_r(t)(x)$ and their derivatives $u'_r : t \mapsto S_r(t)(A_r x)$ converge uniformly on $[0, \tau]$ to $u : t \mapsto S(t)(x)$ and $t \mapsto S(t)(Ax)$ respectively as $r \to 0_+$. This shows that $u$ is of class $C^1$ on $\mathbb{R}_+$ and that $u'(t) = S(t)(Ax)$. Thus $D(A) \subset D(B)$ and $A = B$ on $D(A)$. Now, by Proposition 10.12, given $\lambda \in \mathbb{P}$ we have $\lambda \in \rho(B)$, so that $\lambda I - B$ is a bijection from $D(B)$ onto $X$. Since we also have $\lambda \in \rho(A)$ by assumption, $\lambda I - A$ also is a bijection from $D(A)$ onto $X$. Since $A = B$ on $D(A)$, we see that $\lambda I - B$ maps $D(A) \subset D(B)$ bijectively onto $X$. Thus $D(A) = D(B)$ and $A = B$.

Proof of Theorem 10.7 Again, we only have to show (a)$\Rightarrow$(b). Setting $B := A - \omega I$, for $\lambda > 0$ we have that $\lambda I - B = (\lambda + \omega)I - A$ is a bijection of $D(B) = D(A)$ onto $X$, so that $\mathbb{P} \subset \rho(B) = \rho(A) - \omega$ and
\[ \|(\lambda I - B)^{-1}\| = \|((\lambda + \omega)I - A)^{-1}\| \le ((\lambda + \omega) - \omega)^{-1} = \lambda^{-1}. \]
Thus $B$ is the generator of a nonexpansive continuous semigroup $(T(\cdot))$. Setting $S(t) := e^{\omega t}T(t)$ for $t \in \mathbb{R}_+$, we see that $\|S(t)\| \le e^{\omega t}$ for $t \in \mathbb{R}_+$, that $S(\cdot)$ is a continuous semigroup and that for all $x \in D(A) = D(B)$ we have
\[ \frac{d}{dt}\Big|_{t = 0_+} S(t)x = \omega x + \frac{d}{dt}\Big|_{t = 0_+} T(t)x = \omega x + Bx = Ax. \]
Thus $S(\cdot)$ is generated by $A$.
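The construction used in the proof, namely approximating the semigroup by $S_r(t) = e^{tA_r}$ and letting $r \to 0_+$, can be watched at work on a matrix. The following sketch makes several assumptions that are not in the text: a $2 \times 2$ matrix `A` that is dissipative for the Euclidean inner product, a truncated Taylor series for the exponential, and hypothetical helper names. It reports the distance between $S_r(t)x$ and $e^{tA}x$ and compares both sides of estimate (10.24).

```python
import math

def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def mat_exp(a, t, terms=60):
    """Truncated Taylor series of exp(tA); adequate for moderate values of ||tA||."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        term = [[t * v / k for v in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def yosida(a, r):
    """A_r = (P_r - I) / r with P_r = (I - rA)^{-1}, via the 2x2 inverse formula."""
    m = [[1.0 - r * a[0][0], -r * a[0][1]], [-r * a[1][0], 1.0 - r * a[1][1]]]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    p = [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]
    return [[(p[i][j] - (1.0 if i == j else 0.0)) / r for j in range(2)] for i in range(2)]

def apply(m, x):
    """Matrix-vector product for a 2x2 matrix."""
    return [m[0][0] * x[0] + m[0][1] * x[1], m[1][0] * x[0] + m[1][1] * x[1]]

def dist(x, y):
    """Euclidean distance in R^2."""
    return math.hypot(x[0] - y[0], x[1] - y[1])

def approximation_error(a, r, t, x):
    """|| exp(t A_r) x - exp(t A) x ||."""
    return dist(apply(mat_exp(yosida(a, r), t), x), apply(mat_exp(a, t), x))

def estimate_10_24(a, r, s, t, x):
    """Return (|| S_r(t)x - S_s(t)x ||, t || A_r x - A_s x ||)."""
    lhs = dist(apply(mat_exp(yosida(a, r), t), x), apply(mat_exp(yosida(a, s), t), x))
    rhs = t * dist(apply(yosida(a, r), x), apply(yosida(a, s), x))
    return lhs, rhs

# Illustrative matrix (an assumption); its symmetric part is negative definite.
A = [[-1.0, 1.0], [0.0, -2.0]]

if __name__ == "__main__":
    for r in (0.1, 0.01, 0.001):
        print(r, approximation_error(A, r, 1.0, [1.0, 1.0]))
    print(estimate_10_24(A, 0.1, 0.01, 1.5, [1.0, 1.0]))
```

In this bounded setting the exponential of $A$ is of course available directly; the point of the sketch is only to make the convergence $S_r(t)x \to S(t)x$ visible.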
Let us identify the class $\mathcal{D}_0$ in terms of the class of (single-valued) linear dissipative operators in the sense of the next definition. In the next subsection we will enlarge this class to nonlinear (multivalued) operators.

Definition 10.3 A linear map $A : D(A) \to X$ with domain $D(A)$ in a Banach space $X$ is said to be dissipative if for any $x \in D(A)$ there exists some $x^* \in J(x) := \{ x^* \in X^* : \|x^*\| = \|x\|,\ \langle x^*, x \rangle = \|x\|^2 \}$ such that
\[ \langle Ax, x^* \rangle \le 0. \tag{10.25} \]
A linear map $A : D(A) \to X$ is said to be accretive if $-A$ is dissipative.
Given $\omega \in \mathbb{R}$ one says that $A : D(A) \to X$ is $\omega$-dissipative (resp. $\omega$-accretive) if $A - \omega I$ is dissipative (resp. $A + \omega I$ is accretive). Kato's lemma (Lemma 6.17) shows the following characterization using a semi-scalar product $[\cdot, \cdot]$ and the semi-inner product $\langle \cdot \mid \cdot \rangle_+$ defined by $\langle x \mid y \rangle_+ := \lim_{t \to 0_+} (2t)^{-1}(\|x + ty\|^2 - \|x\|^2)$.

Proposition 10.15 A linear map $A : D(A) \to X$ is dissipative if and only if one of the following assertions holds:
(a) $\|x\| \le \|x - rAx\|$ for all $x \in D(A)$, $r > 0$;
(b) for some semi-scalar product $[\cdot, \cdot]$ one has $[x, Ax] \le 0$ for all $x \in D(A)$;
(c) $\langle x \mid -Ax \rangle_+ \ge 0$ for all $x \in D(A)$.

The first characterization shows that for all $\lambda > 0$ the map $\lambda I - A : D(A) \to X$ is injective and its inverse $R_\lambda := (\lambda I - A)^{-1} : R(\lambda I - A) \to D(A)$ is continuous with norm at most $r := 1/\lambda$: given $z \in R(\lambda I - A) := (\lambda I - A)(D(A))$, for $x \in D(A)$ such that $z = \lambda x - Ax$, one has $\|x\| \le \lambda^{-1}\|z\|$. In particular, if $z = 0$ then $x = 0$.

Proposition 10.16 Let $A$ be a dissipative linear operator. Then $A$ is closed if and only if for some (hence all) $\lambda > 0$ the set $R(\lambda I - A)$ is closed. In particular any linear dissipative operator such that $R(\lambda I - A) = X$ for some $\lambda > 0$ is closed.

Proof Clearly, $A$ is closed if and only if $\lambda I - A$ is closed. In turn this is equivalent to $R_\lambda := (\lambda I - A)^{-1}$ being closed. Since $(I, R_\lambda) : D(R_\lambda) \to G(R_\lambda)$, i.e. $z \mapsto (z, R_\lambda(z))$, is a continuous bijection whose inverse is the restriction to $G(R_\lambda)$ of the canonical projection, this is equivalent to $D(R_\lambda) = R(\lambda I - A)$ being complete or closed.

We conclude from the next theorem that the class $\mathcal{D}_0$ coincides with the set of densely defined dissipative operators $A$ satisfying $R(I - A) = X$.

Theorem 10.8 (Lumer-Phillips) Let $A$ be a densely defined linear operator on a Banach space $X$. Then $A$ is the generator of a nonexpansive continuous semigroup if and only if $A$ is dissipative and $R(I - A) = X$.

Proof Suppose $A$ generates a nonexpansive continuous semigroup $(S(t))_{t \ge 0}$. Then, for any semi-scalar product $[\cdot, \cdot]$ on $X$ and $x \in D(A)$ we have $[x, Ax] \le 0$ since
\[ [x, S(t)x - x] = [x, S(t)x] - [x, x] \le \|x\| \cdot \|S(t)x\| - \|x\|^2 \le 0, \]
so that $[x, Ax] = [x, \lim_{t \to 0_+}(1/t)(S(t)x - x)] \le 0$, $[x, \cdot]$ being a continuous linear form. Thus $A$ is dissipative. Moreover, $R(I - A) = D((I - A)^{-1}) = X$ since $A$ is the generator of a nonexpansive semigroup, so that $\mathbb{P} \subset \rho(A)$ by Corollary 10.2.

Conversely, let us suppose $A$ is dissipative and $R(I - A) = X$. Then, by the preceding proposition, $A$ is closed. Note that for $\lambda > 0$, $z \in X$, and $x := R_\lambda z$ we have $\lambda x - Ax = z$,
\[ \lambda\|x\|^2 \le \lambda[x, x] - [x, Ax] = [x, \lambda x - Ax] \le \|x\| \cdot \|z\|, \]
hence $\|R_\lambda z\| \le (1/\lambda)\|z\|$. In particular $\lambda := 1 \in \rho(A)$ and $\|R_1\| \le 1$. Moreover, if $\lambda$ is such that $|\lambda - 1| < 1$, by Proposition 3.17 we have $\lambda \in \rho(A)$, and for $\mu \in \mathbb{R}$ satisfying $|\mu - \lambda| < 1/\|R_\lambda\|$ we have $\mu \in \rho(A)$. Repeating this argument, we see that $\mathbb{P} \subset \rho(A)$ and $\|R_\lambda\| \le 1/\lambda$. Then Corollary 10.2 shows that $A$ is the generator of a nonexpansive continuous semigroup.

Corollary 10.3 Let $A$ be a densely defined linear dissipative operator on a smooth Banach space $X$ such that $R(I - A) = X$. Then, for all $u_0 \in X$ there exists a $u \in C^1(\mathbb{P}, X) \cap C(\mathbb{R}_+, X)$ such that $u(0) = u_0$, $u(\mathbb{P}) \subset D(A)$, and $u'(t) = Au(t)$ for all $t \in \mathbb{P}$. Moreover, $\|u(\cdot)\|$ is nonincreasing. In particular, one has $\|u(t)\| \le \|u_0\|$ for all $t \in \mathbb{R}_+$.

Proof The first assertion follows from the Lumer-Phillips Theorem and Proposition 10.11. Since
\[ \frac{d}{dt}\|u(t)\|^2 = 2\langle J(u(t)), u'(t) \rangle = 2\langle u(t) \mid Au(t) \rangle_+ \le 0, \]
$\|u(\cdot)\|^2$ is nonincreasing.

Corollary 10.4 Let $A$ be a densely defined linear operator on a Hilbert space $X$ such that $B := A - \omega I$ is maximally dissipative (i.e. $-B$ is maximally monotone) for some $\omega \in \mathbb{R}$. Then $A$ generates an $\omega$-contraction semigroup $(S(t))_{t \ge 0}$ in $X$.

Proof By maximal monotonicity (Proposition 9.15) $B$ is closed (see also Corollary 10.6 below for a direct proof). Since for all $x \in D(A)$ we have $\langle (\omega I - A)x \mid x \rangle \ge 0$, given $\lambda \in {]\omega, \infty[}$, we have $R(\lambda I - A) = R((\lambda - \omega)I - B) = X$ by Theorem 9.28, so that ${]\omega, \infty[} \subset \rho(A)$. Given $y \in X$ and $x := R_\lambda y$ we have
\[ \|x\| \cdot \|y\| \ge \langle y \mid x \rangle = \langle (\lambda I - A)x \mid x \rangle \ge (\lambda - \omega)\|x\|^2, \]
hence $(\lambda - \omega)\|x\| \le \|y\|$. Thus $\|(\lambda I - A)^{-1}\| \le (\lambda - \omega)^{-1}$ and the Hille-Yosida Theorem yields the conclusion.

The present section culminates in the following generalization of the Hille-Yosida Theorem.

Theorem 10.9 (Feller-Miyadera-Phillips) Given $\omega \in \mathbb{R}_+$ and some $c \in [1, \infty[$, a closed linear operator $A$ with dense domain $D(A)$ in a Banach space $X$ is the infinitesimal generator of a continuous semigroup $(S(t))_{t \in \mathbb{R}_+}$ of linear continuous operators such that $\|S(t)\| \le c e^{\omega t}$ for all $t \in \mathbb{R}_+$ if and only if the resolvent set $\rho(A)$ of $A$ contains the interval ${]\omega, \infty[}$ and
\[ \|(\lambda I_X - A)^{-n}\| \le c(\lambda - \omega)^{-n} \qquad \forall \lambda > \omega,\ \forall n \in \mathbb{N}. \]
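Proof The necessary assertion has been proved in Proposition 10.12 for $n = 1$. For $n = 0$ it is obvious. For $n \ge 2$ in $\mathbb{N}$ and $R_\lambda := (\lambda I_X - A)^{-1}$ we use the relations
\[ R_\lambda^n = \frac{(-1)^{n-1}}{(n - 1)!}\,\frac{d^{n-1}R_\lambda}{d\lambda^{n-1}}, \qquad R_\lambda x = \int_0^{\infty} e^{-\lambda r} S(r)x\,dr \]
obtained in Propositions 3.38 and 10.12 respectively to get, by induction and differentiation under the integral symbol, an integral representation of $R_\lambda^n$:
\[ R_\lambda^n = \frac{1}{(n - 1)!}\int_0^{\infty} t^{n-1} e^{-\lambda t} S(t)\,dt. \]
Then, an integration by parts yields the estimate
\[ \|R_\lambda^n\| \le \frac{c}{(n - 1)!}\int_0^{\infty} t^{n-1} e^{-(\lambda - \omega)t}\,dt = \frac{c}{(\lambda - \omega)^n}. \]
Let us prove the sufficiency assertion. Again, passing from $A$ to $B := A - \omega I_X$, we may suppose $\omega = 0$, so that $\mathbb{P} \subset \rho(A)$ and $\|(\lambda I_X - A)^{-n}\| \le c\lambda^{-n}$ for all $\lambda \in \mathbb{P}$ and all $n \in \mathbb{N}$. For every $\mu \in \mathbb{P}$ let us define a new norm $\|\cdot\|_\mu$ on $X$ by setting
\[ \|x\|_\mu := \sup_{n \in \mathbb{N}} \|\mu^n(\mu I_X - A)^{-n}x\|. \]
These norms have the following properties:
(a) $\|\cdot\| \le \|\cdot\|_\mu \le c\|\cdot\|$ for all $\mu \in \mathbb{P}$.
(b) $\|\cdot\|_\lambda \le \|\cdot\|_\mu$ for $\mu > \lambda > 0$.
(c) $\|(\mu I_X - A)^{-1}\|_\mu \le 1/\mu$ for all $\mu \in \mathbb{P}$.
(d) $\|(\lambda I_X - A)^{-n}\|_\mu \le \lambda^{-n}$ for all $n \in \mathbb{N}$ and all $\lambda, \mu \in \mathbb{P}$ satisfying $\lambda \le \mu$.
We leave the proofs to the reader except for (d). By induction, it suffices to prove the case $n = 1$. Proposition 3.38 ensures that $R_\lambda - R_\mu = (\mu - \lambda)R_\lambda R_\mu$, hence
\[ y := R_\lambda x = R_\mu x + (\mu - \lambda)R_\mu R_\lambda x = R_\mu x + (\mu - \lambda)R_\mu y. \]
This relation implies, using (c), that
\[ \|y\|_\mu \le \frac{1}{\mu}\|x\|_\mu + \frac{\mu - \lambda}{\mu}\|y\|_\mu, \]
hence $\lambda\|y\|_\mu \le \|x\|_\mu$. In view of these properties, one can define still another norm by
\[ \|x\|' := \sup_{\mu > 0} \|x\|_\mu \]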
which satisfies $\|\cdot\| \le \|\cdot\|' \le c\|\cdot\|$ and, for all $\lambda \in \mathbb{P}$, $\|(\lambda I_X - A)^{-1}\|' \le 1/\lambda$. Then one can apply Corollary 10.2 on the space $(X, \|\cdot\|')$, which shows that $A$ generates a nonexpansive continuous semigroup $(S(\cdot))$ on this space. Applying the relations $\|\cdot\| \le \|\cdot\|' \le c\|\cdot\|$, we get that $\|S(t)\| \le c$ for all $t \in \mathbb{R}_+$.

Exercises
1. Prove the properties of the norm $\|\cdot\|_\mu$.
2. Show that a dissipative (linear) operator $A$ on a Banach space is closable if $R(A) \subset \operatorname{cl}(D(A))$. Show that in such a case the closure $\overline{A}$ of $A$ is again dissipative and satisfies $R(\lambda I - \overline{A}) = \operatorname{cl}(R(\lambda I - A))$ for all $\lambda \in \mathbb{P}$.
3. Let $A$ be a dissipative (linear) operator on a Banach space $X$. Prove that if both $A$ and its transpose $A^{\mathsf{T}}$ are dissipative, then the closure of $A$ generates a nonexpansive semigroup on $X$.
10.2.3 Dissipative and Accretive Multimaps
Although we essentially focus our attention on linear single-valued dissipative operators, in the present subsection we make a detour through nonlinear, multivalued maps. A dissipativity assumption generalizing the one we considered gives rise to interesting properties that justify such a detour. It clearly generalizes the class of dissipative operators. Again $J$ denotes the duality map of the Banach space $X$. The class of multimaps we introduce plays a key role for evolution problems.

Definition 10.4 A multimap (or multivalued map) $M : X \rightrightarrows X$ on a Banach space $X$ is said to be dissipative if for any $(x_1, y_1)$, $(x_2, y_2) \in M$ there exists some $x^* \in J(x_1 - x_2)$ such that
\[ \langle y_1 - y_2, x^* \rangle \le 0. \tag{10.26} \]
A multimap $M : X \rightrightarrows X$ is said to be accretive if $-M$ is dissipative. Given $\omega \in \mathbb{R}_+$ one says that $M : X \rightrightarrows X$ is $\omega$-dissipative (resp. $\omega$-accretive) if $M - \omega I$ is dissipative (resp. $M + \omega I$ is accretive). The following characterization is similar to the one obtained for the case of a linear single-valued operator: it stems from Kato's lemma (Lemma 6.17).
Proposition 10.17 A multimap $M : X \rightrightarrows X$ is dissipative if and only if one of the following assertions holds:
(a) $\|x_1 - x_2\| \le \|(x_1 - x_2) - r(y_1 - y_2)\|$ for all $(x_1, y_1)$, $(x_2, y_2) \in M$, $r \in \mathbb{R}_+$;
(b) there exists a semi-scalar product $[\cdot, \cdot]$ on $X$ such that $[x_1 - x_2, y_1 - y_2] \le 0$ for all $(x_1, y_1)$, $(x_2, y_2) \in M$;
(c) $[x_1 - x_2, y_2 - y_1]_+ \ge 0$ for all $(x_1, y_1)$, $(x_2, y_2) \in M$.

The first characterization shows that for all $\lambda > 0$ the map $\lambda I_X - M : D(M) \rightrightarrows X$ is injective and $R_\lambda^M := (\lambda I_X - M)^{-1}$ is Lipschitzian with rate $1/\lambda$ on $R(\lambda I_X - M)$: given $z_1, z_2 \in R(\lambda I_X - M) := (\lambda I_X - M)(X)$, for $x_1, x_2 \in D(M)$ such that $z_i \in \lambda x_i - Mx_i$ for $i = 1, 2$, setting $r := \lambda^{-1}$, $y_i := \lambda x_i - z_i \in Mx_i$, one has $\|x_1 - x_2\| \le \lambda^{-1}\|z_1 - z_2\|$. In particular, if $z_1 = z_2$ then $x_1 = x_2$.

If $M$ is single-valued and linear, $M$ is dissipative if and only if for all $x \in D(M)$, $r \in \mathbb{R}_+$ one has $\|x\| \le \|x - rMx\|$, or equivalently if for all $x \in D(M)$, $\lambda \in \mathbb{P}$ one has $\|x\| \le (1/\lambda)\|(\lambda I - M)(x)\|$, or $\mathbb{P} \subset \rho(M)$ and $\|R_\lambda^M y\| \le (1/\lambda)\|y\|$ for all $y \in R(\lambda I - M)$, $\lambda \in \mathbb{P}$. Thus the class of closed linear dissipative operators with dense domains is $\mathcal{D}_0(X)$.

Proposition 10.18 A nonlinear (multivalued) operator $M : D(M) \rightrightarrows X$ on a Hilbert space $X$ (identified with its dual) is monotone if and only if $M$ is accretive. Thus, in a Hilbert space the definitions of dissipativity given in Definitions 9.7 and 10.4 coincide.

Proof Given $u, w \in D(M)$, $v \in Mu$, $z \in Mw$ and $r > 0$ one has
\[ \|(u + rv) - (w + rz)\|^2 - \|u - w\|^2 = 2r\langle v - z \mid u - w \rangle + r^2\|v - z\|^2. \]
If $M$ is monotone, the right-hand side is nonnegative, hence $M$ is accretive. Conversely, if $M$ is accretive, for all $r > 0$ the left-hand side is nonnegative, so that one must have $\langle v - z \mid u - w \rangle \ge 0$.

The resolvent operator associated with an $\omega$-dissipative multimap $M$ is defined by $R_\lambda := R_\lambda^M := (\lambda I - M)^{-1}$ for $\lambda > \omega$, where $I$ stands for $I_X$. In the sequel, passing from $\lambda$ to $r := 1/\lambda$, and assuming $M$ is $\omega$-dissipative, we generalize previous definitions in Sect. 10.2.2 by setting (with $1/\omega = \infty$ if $\omega = 0$)
\[ P_r(x) := (I - rM)^{-1}(x), \qquad x \in R(I - rM),\ r \in {]0, 1/\omega[}, \]
\[ M_r(x) := r^{-1}(P_r(x) - x), \qquad x \in R(I - rM),\ r \in {]0, 1/\omega[}. \]
The map $P_r$, called the proximal operator of $M$, is single-valued and one has
\[ P_r = R_{1/r}\Big(\frac{1}{r}\,\cdot\Big), \qquad r \in {]0, 1/\omega[}. \]
The map $M_r := r^{-1}(P_r - I)$ is called the Yosida operator of $M$. It can be seen as an approximation of $M$. We have seen that the closed graph theorem ensures that $P_r : X \to X$ is linear and continuous when $M$ is single-valued, linear and $\omega$-dissipative with $r \in {]0, 1/\omega[}$. In the general case we have the following properties. The reader may assume $\omega = 0$ to get a simpler view, since in the sequel we focus on this case.

Proposition 10.19 Let $M : X \rightrightarrows X$ be $\omega$-dissipative. Then for all $r \in {]0, 1/\omega[}$ the (single-valued) maps $M_r$ and $P_r$ satisfy the following properties. In particular, if $M$ is dissipative, $M_r$ is dissipative and $P_r$ is nonexpansive.
(a) $\|P_r x - P_r x'\| \le (1 - \omega r)^{-1}\|x - x'\|$ for all $x, x' \in R(I - rM)$.
(b) $M_r$ is Lipschitzian with rate $(2 - \omega r)/r(1 - \omega r)$ on $R(I - rM)$.
(c) $M_r x \in MP_r x$ for all $x \in R(I - rM)$.
(d) $(1 - \omega r)\|M_r x\| \le |Mx| := \inf\{ \|y\| : y \in Mx \}$ if $x \in D(M) \cap R(I - rM)$.
(e) $\lim_{r \to 0_+} P_r x = x$ for all $x \in D(M) \cap \bigcap_{r \in ]0, 1/\omega[} R(I - rM)$.
(f) If $M$ is $\omega$-dissipative and such that $D(M) \subset R(I - rM)$ for all $r \in {]0, 1/\omega[}$, then for all $x \in \operatorname{cl}(D(M))$ one has $(P_r x) \to x$ as $r \to 0_+$.

Proof In the sequel we take $r \in {]0, \omega^{-1}[}$ or $r \in \mathbb{P}$ if $M$ is dissipative.
(a) Since $M - \omega I$ is dissipative, for $(x_1, y_1)$, $(x_2, y_2) \in M$, one has
\[ \|x_1 - x_2\| \le \|(1 + \omega r)(x_1 - x_2) - r(y_1 - y_2)\|, \qquad (1 - \omega r)\|x_1 - x_2\| \le \|(x_1 - ry_1) - (x_2 - ry_2)\|, \]
so that $P_r := (I - rM)^{-1} : R(I - rM) \to X$ is single-valued and $(1 - \omega r)^{-1}$-Lipschitzian on $R(I - rM)$.
(b) Then the map $M_r := r^{-1}(P_r - I)$ is Lipschitzian on $R(I - rM)$ with rate $r^{-1}((1 - \omega r)^{-1} + 1) = r^{-1}(2 - \omega r)(1 - \omega r)^{-1}$. If $M$ is dissipative, $P_r$ being nonexpansive, $M_r$ is dissipative since for $x, x' \in X$, $z^* \in J(x - x')$ one has $\langle x - x', z^* \rangle = \|x - x'\|^2 = \|z^*\|^2$, hence
\[ \langle (P_r - I)x - (P_r - I)x', z^* \rangle = \langle P_r x - P_r x', z^* \rangle - \langle x - x', z^* \rangle \le \|z^*\| \cdot \|x - x'\| - \|x - x'\|^2 \le 0. \]
(c) Given $x \in R(I - rM)$, $w := P_r x$, we have $w - x \in rMw$, hence $M_r x = r^{-1}(w - x) \in Mw = MP_r x$.
(d) For $x \in D(M) \cap R(I - rM)$, $y \in Mx$, we have $x = P_r(x - ry)$, hence $M_r x = r^{-1}(P_r(x) - P_r(x - ry))$. Since $P_r$ is $(1 - \omega r)^{-1}$-Lipschitzian, we get $\|M_r x\| \le r^{-1}(1 - \omega r)^{-1}\|x - (x - ry)\| = (1 - \omega r)^{-1}\|y\|$. Passing to the infimum over $y \in Mx$, we get the announced inequality.
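(e) For $x \in D(M) \cap \bigcap_{r \in ]0, 1/\omega[} R(I - rM)$ we have
$$\|P_r x - x\| = r\|M_r x\| \le \frac{r\,|Mx|}{1 - \omega r},$$
so that $(P_r x) \to x$ when $r \to 0_+$.
(f) Suppose $M$ is $\omega$-dissipative and such that $D(M) \subset R(I - rM)$ for all $r \in\, ]0, 1/\omega[$. Given $x \in \mathrm{cl}(D(M))$ and $\varepsilon > 0$, we pick $x' \in D(M) = D(M) \cap \bigcap_{r \in ]0, 1/\omega[} R(I - rM)$ such that $\|x - x'\| < \varepsilon/4$ and we take $\delta \in\, ]0, 1/2\omega[$ such that $\|P_r x' - x'\| < \varepsilon/4$ for all $r \in\, ]0, \delta]$. Then, since $P_r$ is 2-Lipschitzian, we have $\|P_r x - P_r x'\| \le 2\|x - x'\| < \varepsilon/2$ and
$$\|P_r x - x\| \le \|P_r x - P_r x'\| + \|P_r x' - x'\| + \|x' - x\| < \varepsilon,$$
so that $(P_r x) \to x$ when $r \to 0_+$.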
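The properties of Proposition 10.19 can be checked numerically on a one-dimensional example. The sketch below assumes $X = \mathbb{R}$, $\omega = 0$ and $M := -\partial|\cdot|$ (so $Mx = \{-\mathrm{sign}(x)\}$ for $x \ne 0$ and $M0 = [-1, 1]$); this choice of $M$, and the helper names `prox`, `yosida`, `in_graph`, `min_norm` and `check_properties`, are illustrative assumptions that do not appear in the text. For this $M$ the proximal map $P_r$ is the soft-thresholding map.

```python
def prox(z, r):
    """Proximal map P_r = (I - rM)^{-1} for M = -d|.| on R (soft-thresholding)."""
    if z > r:
        return z - r
    if z < -r:
        return z + r
    return 0.0


def yosida(z, r):
    """Yosida operator M_r = (P_r - I)/r."""
    return (prox(z, r) - z) / r


def in_graph(x, y, tol=1e-9):
    """Test whether y belongs to Mx (up to the tolerance tol)."""
    if x > 0:
        return abs(y + 1.0) <= tol
    if x < 0:
        return abs(y - 1.0) <= tol
    return -1.0 - tol <= y <= 1.0 + tol


def min_norm(x):
    """|Mx| = inf{|y| : y in Mx}."""
    return 0.0 if x == 0 else 1.0


def check_properties(points, radii, tol=1e-9):
    """Check (a), (c), (d), (e) of Proposition 10.19 with omega = 0 on sample data."""
    for r in radii:
        for x in points:
            # (c): M_r x belongs to M(P_r x)
            if not in_graph(prox(x, r), yosida(x, r), tol):
                return False
            # (d): |M_r x| <= |Mx|
            if abs(yosida(x, r)) > min_norm(x) + tol:
                return False
            # (e): |P_r x - x| = r |M_r x| <= r |Mx|
            if abs(prox(x, r) - x) > r * min_norm(x) + tol:
                return False
            for y in points:
                # (a): P_r is nonexpansive
                if abs(prox(x, r) - prox(y, r)) > abs(x - y) + tol:
                    return False
    return True


points = [-2.0, -0.3, 0.0, 0.05, 0.7, 3.0]
radii = [1.0, 0.5, 0.1, 0.01]
print(check_properties(points, radii))
# Values of M_r(0.7) for decreasing r; they approach the element of least norm of M(0.7)
print([yosida(0.7, r) for r in radii])
```

In this example $M_r(x)$ coincides with $-\mathrm{sign}(x)$ as soon as $r < |x|$, while $M_r(0) = 0$ for every $r$: the Yosida operator selects the element of least norm of $Mx$ in the limit.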
A multimap $M : X \rightrightarrows X$ is said to be hyperdissipative (or m-dissipative) if it is dissipative and if $R(I_X - M) = X$. In view of the next result, for such a multimap, for all $r \in \mathbb{P}$ the approximate map $M_r$ and the proximal map $P_r$ are defined on the whole of $X$ and the last assertions of Proposition 10.19 can be simplified.

Proposition 10.20 A dissipative multimap $M : D(M) \rightrightarrows X$ on a Banach space is hyperdissipative if and only if for all (or, equivalently, for some) $r > 0$ one has $R(I_X - rM) = X$.

Proof Let $M$ be hyperdissipative. Given an arbitrary $r \in \mathbb{P}$ and $y \in X$ we have to prove that the equation $y \in x - rMx$ has a solution $x$. We rewrite it as $\frac{1}{r}y + (1 - \frac{1}{r})x \in (I - M)(x)$ or
$$x = P_1\Big(\frac{1}{r}y + \Big(1 - \frac{1}{r}\Big)x\Big).$$
For $r \in\, ]\frac{1}{2}, \infty[$ the map $x \mapsto P_1(\frac{1}{r}y + (1 - \frac{1}{r})x)$ is a contraction, so that this equation has a unique solution. Taking $s \in\, ]\frac{1}{2}, 1[$ we see that $sM$ is m-dissipative. Repeating the preceding argument with $M$ changed into $sM$ we see that for $r \in\, ]\frac{s}{2}, \infty[$ one has $R(I - rM) = X$, and iterating this procedure we get that $R(I - rM) = X$ for all $r > 0$.
If the dissipative multimap $M$ satisfies $R(I - sM) = X$ for some $s \in \mathbb{P}$, then for all $t \in \mathbb{P}$ the hyperdissipative multimap $sM$ satisfies $R(I - tsM) = X$. Given $r \in \mathbb{P}$, taking $t := r/s$ we get that $R(I - rM) = X$.

Corollary 10.5 Any hyperdissipative multimap $M$ is maximally dissipative in the sense that any dissipative multimap whose graph contains the graph of $M$ coincides with $M$.

Proof If $(x, y) \in X \times X$ is such that
$$\|x - u\| \le \|(x - ry) - (u - rv)\| \qquad \forall (u, v) \in M,\ \forall r > 0,$$
using the relation $R(I - rM) = X$, choosing $(u, v) \in M$ such that $u - rv = x - ry$, we see that $x = u$ and $y = v$, so that $(x, y) \in M$.

Corollary 10.6 (a) Any maximally dissipative multimap $M$ is closed in the sense that its graph is closed.
(b) Moreover, for any $x$, $y \in X$, any sequences $(r_n) \to 0_+$, $(x_n) \to x$ satisfying $x_n \in D(M_{r_n})$ for all $n$ and $(M_{r_n} x_n) \to y$, one has $(x, y) \in M$.
(c) If $J$ is single-valued and continuous, then $M$ is demi-closed, i.e. sequentially closed in $X \times X_w$, and if $x$, $y \in X$ are such that for some sequences $(r_n) \to 0_+$, $(x_n) \to x$ with $x_n \in D(M_{r_n})$ for all $n$ and $(M_{r_n} x_n) \to y$ weakly, then one has $(x, y) \in M$.

Proof (a) Given a sequence $((x_n, y_n))$ in (the graph of) $M$ converging to $(x, y)$, by dissipativity we have
$$\|x_n - u\| \le \|(x_n - r y_n) - (u - rv)\| \qquad \forall (u, v) \in M,\ r > 0. \tag{10.27}$$
Taking limits we get
$$\|x - u\| \le \|(x - ry) - (u - rv)\| \qquad \forall (u, v) \in M,\ r > 0.$$
By maximality of $M$, we obtain $(x, y) \in M$.
(b) Now suppose $(r_n) \to 0_+$, $(x_n) \to x$ with $x_n \in D(M_{r_n})$ for all $n$, and $(M_{r_n} x_n) \to y$. Since $(P_{r_n} x_n) = (x_n + r_n M_{r_n} x_n) \to x$ and since $M_{r_n} x_n \in M P_{r_n} x_n$ by Proposition 10.19, we get $(x, y) \in M$ by the closedness of $M$.
(c) From now on we assume $J$ is single-valued and continuous. We denote weak convergence by $\rightharpoonup$. Let $(x_n) \to x$, $(y_n) \rightharpoonup y$ with $(x_n, y_n) \in M$ for all $n$. For all $(u, v) \in M$, passing to the limit in the relation $\langle y_n - v, J(x_n - u)\rangle \le 0$ we obtain $\langle y - v, J(x - u)\rangle \le 0$. Then $M \cup \{(x, y)\}$ is dissipative, so that $(x, y) \in M$ by maximality. The proof of the second assertion is similar: setting $y_n := M_{r_n} x_n$, we have $(w_n) := (P_{r_n} x_n) = (x_n) + (r_n M_{r_n}(x_n)) \to x$ since $(r_n y_n) \to 0$, and $y_n \in M P_{r_n} x_n = M w_n$ (by Proposition 10.19(c)), so that $(x, y) \in M$.

If $M$ is maximally dissipative and if $X^*$ is strictly convex, so that $J$ is single-valued, then for all $x \in X$ the set $Mx$ is given by
$$Mx = \{y \in X : \langle y - v, J(x - u)\rangle \le 0 \ \ \forall (u, v) \in M\}.$$
Thus, it is closed and convex. If, moreover, $X$ is reflexive and strictly convex, there exists in $Mx := M(x)$ a unique element of minimum norm. We denote it
by $M^0 x$ or $M^0(x)$. In the sequel, if $X$ is a reflexive space, using Asplund's Theorem (Theorem 6.23) we endow it with a norm that is strictly convex along with its dual norm.

Proposition 10.21 Suppose $X$ is a reflexive Banach space whose duality map is single-valued and continuous. Let $M : X \rightrightarrows X$ be hyperdissipative. Then, for all $x \in D(M)$, $(M_r(x)) \to M^0(x)$ weakly as $r \to 0_+$. If the norm of $X$ satisfies the Kadec-Klee Property, for all $x \in D(M)$ one has $(M_r(w)) \to M^0(x)$ as $(r, w) \to (0_+, x)$ with $(\|w - x\|/r) \to 0$. In particular, $(M_r(x)) \to M^0(x)$ as $r \to 0_+$.

Proof Given $x \in D(M)$, $r > 0$, Proposition 10.19 (d) ensures that $\|M_r x\| \le |Mx| := \|M^0 x\|$. Now, for any sequence $(r_n) \to 0_+$, the preceding corollary yields $(M_{r_n} x) \rightharpoonup M^0 x$: since $(M_{r_n} x)$ has a weak limit point $y$ that belongs to $Mx$ and since the norm of $y$ is not greater than $\|M^0 x\|$, we get $y = M^0 x$, and the first assertion stems from the uniqueness of this limit. The Kadec-Klee Property ensures that $(M_{r_n} x) \to M^0 x$. Since $M_r$ is Lipschitzian with rate $2/r$, the last assertion stems from the relations
$$\|M_r w - M^0 x\| \le \|M_r w - M_r x\| + \|M_r x - M^0 x\| \le (2/r)\|w - x\| + \|M_r x - M^0 x\|.$$

If $X$ and $X^*$ are uniformly convex one can show that $\mathrm{cl}(D(M))$ is convex (see [92, Prop. 13.2]). In the case when $M$ is hyperdissipative one has the following result concerning the differential inclusion
$$u'(t) \in M(u(t)), \qquad u(0) = x_0 \in D(M). \tag{10.28}$$
Here $u$ is said to be a solution if for all $t \in \mathbb{R}_+$ one has $u(t) \in D(M)$, $u$ is continuous, weakly right differentiable at $t$, and if its weak right derivative (simply denoted by $u'$) satisfies relation (10.28).

Theorem 10.10 (Kōmura-Kato) Let $X$ be a uniformly convex Banach space whose dual $X^*$ is also uniformly convex. Let $M : D \rightrightarrows X$ be hyperdissipative. Then the differential inclusion (10.28) has a unique solution $u$. Moreover, $u$ is Lipschitzian, its right derivative $u'$ is continuous from the right, and $\|u'(\cdot)\|$ is nonincreasing, $u'(t)$ being the element $M^0(u(t))$ of least norm in $M(u(t))$ for all $t \in \mathbb{R}_+$. For all $t \in \mathbb{R}_+$ the map $x_0 \mapsto S_t(x_0) := u(t)$ is nonexpansive and $(S_t)_{t \ge 0}$ is a semigroup of (nonlinear) nonexpansive maps. If $M$ is single-valued, the right derivative of $u$ exists in the strong topology.

The property $u'(t) = M^0(u(t))$ is surprising: so to speak, solutions are lazy as they choose the speed that minimizes the norm!

Proof in the case $M$ is single-valued. See [23, 93, 194] for the general case. We first prove uniqueness. Let $u$ and $v$ be two solutions, $w := u - v$. Setting $j(\cdot) := (1/2)\|\cdot\|^2$, $f(\cdot) := j(w(\cdot))$, let us verify that $f$ is right differentiable, with $f'(t) = \langle u'(t) - v'(t), J(u(t) - v(t))\rangle \le 0$. Since $X^*$ is uniformly convex, $j$ is
differentiable on $X$, so that, setting $w_t(s) := (1/s)(w(t + s) - w(t)) = w'(t) + z_t(s)$ with $z_t(s) \to 0$ weakly as $s \to 0_+$, and taking a remainder $r_t(\cdot) := \varepsilon_t(\cdot)\|\cdot\| : X \to \mathbb{R}$ such that $j(w(t) + x) - j(w(t)) = \langle x, J(w(t))\rangle + \varepsilon_t(x)\|x\|$, we get
$$j(w(t + s)) - j(w(t)) = s\langle w_t(s), J(w(t))\rangle + s\,\varepsilon_t(s w_t(s))\|w_t(s)\|,$$
$$\frac{1}{s}\big(j(w(t + s)) - j(w(t))\big) \to \langle w'(t), J(w(t))\rangle \quad \text{as } s \to 0_+.$$
Thus $f$ is right differentiable, with $f'(t) = \langle w'(t), J(w(t))\rangle = \langle u'(t) - v'(t), J(u(t) - v(t))\rangle \le 0$ since $M$ is dissipative and $u'(t) \in M(u(t))$, $v'(t) \in M(v(t))$. Since $f(0) = 0$ and $f$ is continuous with $f(\cdot) \ge 0$, by the Mean Value Theorem we get $f(\cdot) = 0$ and $u = v$.
The preceding calculation also shows that if $u$ and $v$ are the solutions with initial conditions $x_0$ and $y_0$ respectively, then for all $t \in \mathbb{R}_+$ one has $\|u(t) - v(t)\| \le \|x_0 - y_0\|$. The fact that $S_t(x_0) := u(t)$ defines a semigroup stems from uniqueness.
Next we show that if $u$ is a solution to (10.28) and if the right derivative exists in the strong topology then the function $g(\cdot) := \|u'(\cdot)\|$ is nonincreasing. Since for a fixed $s > 0$ the map $t \mapsto v(t) := u(t + s)$ satisfies $v'(t) \in M(v(t))$, setting again $f(t) := j(u(t) - v(t))$, for $0 \le r \le t$ we get
$$\|u(t + s) - u(t)\| \le \|u(r + s) - u(r)\|.$$
Dividing by $s$ and taking limits as $s \to 0_+$, it follows that $\|u'(t)\| \le \|u'(r)\|$.
Let us turn to existence. Since for $r > 0$ the approximate map $M_r$ is single-valued and Lipschitzian with rate $2/r$ on $R(I - rM) = X$, the approximate equation
$$u_r'(t) = M_r(u_r(t)), \qquad u_r(0) = x_0$$
has a solution on $\mathbb{R}_+$. It is natural to wonder whether $(u_r)_r$ converges as $r \to 0_+$. Before considering this question we note that since $M_r$ is dissipative and $u_r$ is of class $C^1$, the preceding argument and Proposition 10.19 (d) show that
$$\|u_r'(t)\| \le \|u_r'(0)\| = \|M_r(x_0)\| \le c := \|M^0(x_0)\|. \tag{10.29}$$
Thus, combining (10.29) with the Mean Value Theorem, we have
$$\|u_r(t) - u_r(s)\| \le c\,|t - s| \qquad \forall r \in \mathbb{P},\ \forall s, t \in \mathbb{R}_+. \tag{10.30}$$
In particular, for every $\tau \in \mathbb{P}$, and every $r$, $t \in [0, \tau]$, we have $\|u_r(t) - x_0\| \le c\tau$. Since $u_r'(t) = M_r(u_r(t))$ and $M_r = r^{-1}(P_r - I)$, setting $v_r(t) := P_r(u_r(t))$, we have $r u_r'(t) = v_r(t) - u_r(t)$, so that relation (10.29) implies
$$\|u_r(t) - v_r(t)\| \le cr. \tag{10.31}$$
Since $u_r(t) \in B[x_0, c\tau]$, we see that $v_r(t) \in B[x_0, c\tau + cr]$. Let $\alpha$ be the modulus of uniform continuity of $J$ on $B[x_0, 2c\tau]$. Then for $r$, $s$, $t \in [0, \tau]$ we have
$$\|J(u_r(t) - u_s(t)) - J(v_r(t) - v_s(t))\| \le \alpha(cr + cs). \tag{10.32}$$
Since $u_r$ is the solution of $u_r' = M_r u_r = M v_r$, by dissipativity of $M$ we have
$$\langle u_r'(t) - u_s'(t), J(v_r(t) - v_s(t))\rangle \le 0.$$
Using this relation and estimate (10.32), we get for the right derivative of $f(\cdot) := \frac{1}{2}\|u_r(\cdot) - u_s(\cdot)\|^2$
$$f'(t) = \langle u_r'(t) - u_s'(t), J(u_r(t) - u_s(t))\rangle \le \langle u_r'(t) - u_s'(t), J(v_r(t) - v_s(t))\rangle + \alpha(cr + cs)\|u_r'(t) - u_s'(t)\| \le 2c\,\alpha(cr + cs).$$
Thus $\|u_r(t) - u_s(t)\|^2 \le 4c\tau\alpha(cr + cs)$ for $r$, $s$, $t \in [0, \tau]$ and, by the Cauchy criterion, $(u_r)$ converges to some map $u$ as $r \to 0_+$, uniformly on compact intervals. Moreover, by (10.30), the limit $u$ is Lipschitzian with rate $c$ on $\mathbb{R}_+$. For any $r \in \mathbb{P}$, $t \in \mathbb{R}_+$, $\|u_r'(t)\| = \|M_r(u_r(t))\|$ is bounded by $c$ in view of (10.29) and since $X$ is reflexive, given sequences $(r_n) \to 0_+$, $(t_n) \to t$ in $\mathbb{R}_+$, we can find a subsequence $((r_{k(n)}, t_{k(n)}))$ of $((r_n, t_n))$ such that $(u'_{r_{k(n)}}(t_{k(n)}))$ weakly converges to some $y(t)$. By Corollary 10.6 we have $u(t) \in D(M)$ and $y(t) = M(u(t))$. Thus, by uniqueness, $(u_r'(s))$ weakly converges to $M(u(t))$ as $(r, s) \to (0_+, t)$. In particular, $(u_r'(t)) \rightharpoonup M(u(t))$ and $M(u(\cdot))$ is weakly continuous. For all $x^* \in X^*$, passing to the limit as $r \to 0_+$ in the relation
$$\langle x^*, u_r(t) - x_0\rangle = \int_0^t \langle x^*, u_r'(s)\rangle\, ds$$
we get $u(t) = x_0 + \int_0^t y(s)\,ds$ by the Dominated Convergence Theorem, so that $u$ is weakly differentiable with derivative $u'(t) := y(t) = M(u(t))$. Proposition 10.21 ensures that $(M_r(x_0))_r \to M(x_0)$ as $r \to 0_+$. Moreover, by (10.29) and by weak lower semicontinuity of the norm, for all $t \in \mathbb{R}_+$ we get
$$\|M(u(t))\| = \|y(t)\| \le \liminf_{r \to 0_+} \|M_r(u_r(t))\| \le c = \|M(u(0))\|.$$
For all $s \in \mathbb{R}_+$, since $v : t \mapsto u(s + t)$ is the solution of $v'(t) = M(v(t))$, $v(0) = u(s)$, we get
$$\|M(u(s + t))\| = \|M(v(t))\| \le \|M(v(0))\| = \|M(u(s))\|,$$
i.e. $\|M(u(\cdot))\|$ is nonincreasing. It is also right continuous: given a sequence $(t_n) \to t_+$, since $(M(u(t_n))) \rightharpoonup M(u(t))$, the weak lower semicontinuity of the norm yields
$$\|M(u(t))\| \le \liminf_n \|M(u(t_n))\| \le \limsup_n \|M(u(t_n))\| \le \|M(u(t))\|$$
and these relations are equalities. The Kadec-Klee Property ensures that $(M(u(t_n)))$ converges to $M(u(t))$. Thus $M(u(\cdot))$ itself is right continuous on $\mathbb{R}_+$. The relation $u(t) = x_0 + \int_0^t M(u(s))\,ds$ then entails that
$$\frac{1}{s}\big(u(t + s) - u(t)\big) \to M(u(t))$$
as $s \to 0_+$. Thus $u$ is right differentiable for the strong topology.
The semigroup $(S_t)_{t \ge 0}$ we consider has a regularizing effect in the sense that $S_t$ can be extended to a nonexpansive map from $\mathrm{cl}(D(A))$ into $D(A)$; see [51, Thm 3.3]. Let us quote a result of interest in this direction. It bears some analogy with the classical gradient method for the numerical minimization of a differentiable convex function.

Theorem 10.11 Let $f : X \to \mathbb{R}_\infty$ be a closed proper convex function on a Hilbert space $X$ and let $M := -\partial f : X \rightrightarrows X$, $X$ being identified with $X^*$ via the Riesz isomorphism. Then the semigroup $(S_t)_{t \ge 0}$ generated by $M$ can be extended to a semigroup of maps $S_t : \mathrm{cl}(D(M)) \to D(M)$ for all $t \in \mathbb{P}$. Moreover, for all $\delta > 0$ the map $u : t \mapsto S_t(x_0)$ is Lipschitzian and right differentiable on $[\delta, \infty[$ and $f \circ u$ is convex, nonincreasing and Lipschitzian on $[\delta, \infty[$ with $(f \circ u)'(t) = -\|u'(t)\|^2$.

Several approximation methods are available for solving equation (10.21); see [71, 88, 144]. Let us end this section by quoting some results dealing with a numerical approach called the implicit time discretization scheme. This terminology can be explained as follows. Given $\tau > 0$ one divides the interval $[0, \tau]$ into $n$ intervals of equal length $\tau_n := \tau/n$ and, starting with $u_{0,n} := x_0$, one obtains $u_{k,n}$ inductively on $k$ by solving the inclusions
$$\frac{u_{k+1,n} - u_{k,n}}{\tau_n} \in M u_{k+1,n}$$
or, when $M$ is single-valued, $(I - \tau_n M)u_{k+1,n} = u_{k,n}$, so that
$$u_{k,n} = \Big(I - \frac{\tau}{n} M\Big)^{-k} x_0.$$
For $t \in [k\tau/n, (k + 1)\tau/n[$ one sets $u_n(t) := u_{k,n}$ or a natural convex combination of $u_{k,n}$ and $u_{k+1,n}$. One says that $u_n(\cdot)$ is an approximate solution to the differential inclusion
$$u'(t) \in M(u(t)), \qquad u(0) = x_0 \in D(M). \tag{10.33}$$
One says that a continuous map $u : \mathbb{R}_+ \to X$ is a mild solution of the preceding equation if there is a sequence $(u_n)$ of approximate solutions that converges uniformly to $u$ on any compact interval.

Theorem 10.12 (Crandall-Liggett) Let $M : X \rightrightarrows X$ be an $\omega$-dissipative multimap such that, for some $\rho > 0$,
$$\mathrm{cl}(D(M)) \subset R(I - rM) \qquad \forall r \in [0, \rho].$$
Then (10.33) has a unique mild solution $u$. Moreover, it is given by
$$u(t) = \lim_{n \to \infty} \Big(I - \frac{t}{n} M\Big)^{-n} x_0, \qquad t > 0,$$
the convergence being uniform on compact intervals.

One can also consider the case when $M$ depends on $t$, for instance when $M$ is replaced with $M + f(t)$, with $f \in L_1(\mathbb{R}_+, X)$. See [25, 173]. On the other hand, the proof of Corollary 10.2 shows that under its assumptions one has the convergence result
$$S(t)x = \lim_{n \to \infty} \exp\Big(tA\Big(I - \frac{1}{n}A\Big)^{-1}\Big)x.$$
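The exponential formula $u(t) = \lim_n (I - \frac{t}{n}M)^{-n} x_0$ can be illustrated on two scalar examples. The sketch below is only an illustration under assumptions not made in the text: $X = \mathbb{R}$, and $M$ is either the linear map $Mx = -x$ (whose semigroup is $e^{-t}x_0$) or $M = -\partial|\cdot|$ (whose solution starting from $x_0 = 1$ is $\max(1 - t, 0)$). The function names `implicit_euler`, `prox_linear` and `prox_abs` are hypothetical.

```python
import math


def implicit_euler(prox, x0, t, n):
    """Return (I - (t/n) M)^{-n} x0, where prox(z, r) evaluates (I - rM)^{-1} z."""
    step = t / n
    x = x0
    for _ in range(n):
        x = prox(x, step)
    return x


def prox_linear(z, r):
    """Resolvent of Mx = -x on R: solve x + r*x = z."""
    return z / (1.0 + r)


def prox_abs(z, r):
    """Resolvent of M = -d|.| on R: soft-thresholding with threshold r."""
    if z > r:
        return z - r
    if z < -r:
        return z + r
    return 0.0


# Linear case: the error against exp(-t) x0 shrinks as n grows
errors = [abs(implicit_euler(prox_linear, 1.0, 1.0, n) - math.exp(-1.0))
          for n in (10, 100, 1000)]
print(errors)

# Nonlinear case: the scheme follows max(1 - t, 0) and stops at 0
print(implicit_euler(prox_abs, 1.0, 0.5, 50))
print(implicit_euler(prox_abs, 1.0, 2.0, 50))
```

In the nonlinear example the iterates reach $0$ in finite time and stay there, in agreement with the fact that $0$ is the element of least norm of $M(0) = [-1, 1]$.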
Exercises

1. (Trotter-Kato) Let $S$, $S_n$ ($n \in \mathbb{N}$) be continuous semigroups on a Banach space $X$ satisfying for some $c > 0$, $\omega \in \mathbb{R}$ the estimates $\|S(t)\| \le ce^{\omega t}$, $\|S_n(t)\| \le ce^{\omega t}$ for $t \in \mathbb{R}_+$, $n \in \mathbb{N}$. Let $A_n$ (resp. $A$) be the generator of $S_n$ (resp. $S$). Prove that the following assertions are equivalent:
(a) for all $x \in D(A)$ there exists an $x_n \in D(A_n)$ such that $(x_n)_n \to x$ and $(A_n x_n)_n \to Ax$;
(b) $(R_\lambda^{A_n})_n \to R_\lambda^A$ for some $\lambda > \omega$, $R_\lambda^{A_n}$ (resp. $R_\lambda^A$) being the resolvent of $A_n$ (resp. $A$);
(c) $(R_\lambda^{A_n})_n \to R_\lambda^A$ for all $\lambda > \omega$;
(d) $(S_n(t))_n \to S(t)$ uniformly on compact intervals.
2. Let $A$ be the generator of a continuous semigroup $S(\cdot)$ of linear operators satisfying for some $c$, $\omega \in \mathbb{R}_+$ the relation $\|S(t)\| \le ce^{\omega t}$. Show that for all $t \in \mathbb{R}_+$, $x \in X$ one has $S(t)x = \lim_{n \to \infty} (I_X - (t/n)A)^{-n} x$, the limit being uniform on every compact interval of $\mathbb{R}_+$.
3. Let $M : X \rightrightarrows X$ be a maximally dissipative multimap on a Hilbert space $X$. Prove that the solution $u$ to the problem $u'(t) \in M(u(t))$, $u(0) = u_0$, where $u_0$ is a given element of $D(M)$, belongs to $C^1(\mathbb{R}_+, X)$ and $C(\mathbb{R}_+, D(M))$.
10.3 Parabolic Problems: The Heat Equation

In this section and the following one we consider evolution problems involving second-order partial differential equations. Following the classical classification of partial differential equations into elliptic, parabolic and hyperbolic equations, for which we refer to [83], we separate our study of evolution equations into two distinct sections. In both sections, $\Omega$ denotes a bounded open subset of class $C^1$ of $\mathbb{R}^d$, with boundary $\Gamma$, and $T$ is the interval $T :=\, ]0, \tau[$, with $\tau \in\, ]0, \infty]$. Given essentially bounded measurable functions $a_{i,j} : \Omega \times T \to \mathbb{R}$, $a_i : \Omega \times T \to \mathbb{R}$, $a_0 : \Omega \times T \to \mathbb{R}$, we denote by $L$ the operator defined by
$$(Lu)(x, t) := \sum_{i,j=1}^d D_i\big(a_{i,j}(x, t) D_j u(x, t)\big) + \sum_{i=1}^d a_i(x, t) D_i u(x, t) + a_0(x, t) u(x, t).$$
A simple example is the Laplacian operator $L := \Delta$ in the space variables $x_i$. In such a case, the general parabolic equation
$$\frac{\partial u}{\partial t}(x, t) - (Lu)(x, t) = f(x, t), \qquad (x, t) \in \Omega \times T, \tag{10.34}$$
$$u(x, t) = 0, \qquad (x, t) \in \Gamma \times T, \tag{10.35}$$
$$u(x, 0) = g(x), \qquad x \in \Omega, \tag{10.36}$$
in which $f \in L_2(\Omega \times T)$, $g \in H_0^1(\Omega) \cap H^2(\Omega)$ are given, turns out to be the heat equation when $f = 0$. It describes the distribution of the temperature in a medium $\Omega$ over time $t \in T$ with initial temperature $g$. The boundary condition (10.35) can be replaced with a Neumann condition
$$\frac{\partial u}{\partial n}(x, t) = 0, \qquad (x, t) \in \Gamma \times T,$$
but we limit our study to the Dirichlet condition. Many other diffusion phenomena can be described by parabolic equations, $u$ measuring the concentration of a chemical, for example.

Definition 10.5 The partial differential operator $\frac{\partial}{\partial t} - L$ is said to be uniformly parabolic if there exists a constant $c_E > 0$ such that
$$\sum_{i,j=1}^d a_{i,j}(x, t) v^i v^j \ge c_E \|v\|^2 \qquad \forall (x, t) \in \Omega \times T,\ v := (v^1, \ldots, v^d) \in \mathbb{R}^d.$$
Thus, for all $t \in T$ the operator $L$ is elliptic; but the above definition requires a lower bound for the ellipticity constant that is valid for the whole interval $T$. In the sequel, for the sake of simplicity, we suppose that the coefficients of $L$ are smooth
and do not depend on $t$. The Galerkin method allows one to get rid of this last restriction (see [117, Section 7.1] for instance). Here we want to easily derive an existence result from the Hille-Yosida Theorem. We consider the operator $A : D(A) \to H := L_2(\Omega)$ with domain $D(A) := H_0^1(\Omega) \cap H^2(\Omega) \subset L_2(\Omega)$ given by $A(v)(x) := (Lv)(x)$ for $x \in \Omega$ and we look for a continuous map $u : T \cup \{0\} \to H$ satisfying $u(T) \subset D(A)$ and
$$u'(t) = Au(t), \quad t \in T, \qquad u(0) = g,$$
and we set $u(x, t) := u(t)(x)$ for $(x, t) \in \Omega \times (T \cup \{0\})$. We introduce the bilinear form $b$ associated with $L$ as in (9.17) given by
$$b(u, v) := \int_\Omega \sum_{i,j=1}^d a_{ij} D_i u D_j v + \int_\Omega \sum_{i=1}^d a_i u D_i v + \int_\Omega a_0 uv, \qquad u, v \in H_0^1(\Omega).$$
By Gårding's inequality (9.21) there exist constants $c > 0$, $\omega \in \mathbb{R}$ such that
$$b(u, u) + \omega\|u\|_0^2 \ge c\|u\|_1^2 \qquad \forall u \in H_0^1(\Omega), \tag{10.37}$$
where $\|\cdot\|_0$ (resp. $\|\cdot\|_1$) is the norm of $L_2(\Omega)$ (resp. $H_0^1(\Omega)$).

Proposition 10.22 The operator $A$ generates an $\omega$-contraction semigroup $(S(t))_{t \ge 0}$ in $L_2(\Omega)$.

Proof In order to apply Theorem 10.8 or rather Corollary 10.3, we verify that $A - \omega I$ is a linear maximally dissipative operator or that $\omega I - A$ is a maximally monotone operator on $L_2(\Omega)$. By Green's formula and (10.37),
$$\langle (\omega I - A)u \mid u\rangle = b(u, u) + \omega\|u\|_0^2 \ge 0,$$
we see that the linear operator $\omega I - A$ is a monotone operator on $L_2(\Omega)$ endowed with its usual scalar product $\langle \cdot \mid \cdot \rangle$. Moreover, by Theorem 9.28, it is maximally monotone since for $\lambda > \omega$ and every $f \in L_2(\Omega)$ the equation $\lambda u - Au = f$ has a solution $u \in D(A) := H_0^1(\Omega) \cap H^2(\Omega)$ by Theorem 9.18, so that for $\mu := \lambda - \omega > 0$ one has $R(\mu I + (\omega I - A)) = R(\lambda I - A) = L_2(\Omega)$.

Given $f \in L_2(T, L_2(\Omega))$, $g \in L_2(\Omega)$, we say that a function $u \in L_2(T, H_0^1(\Omega))$ whose derivative $u'$ belongs to $L_2(T, H^{-1}(\Omega))$ is a weak solution to the system (10.34)–(10.36) if $u(0) = g$ and
$$\langle u'(t), v\rangle + b(u, v) = \langle f(t), v\rangle \qquad \forall v \in H_0^1(\Omega).$$
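The contraction property of Proposition 10.22 has a simple discrete counterpart. The sketch below is an illustration under assumptions that are not part of the text: $d = 1$, $\Omega = ]0, 1[$, $L = \Delta$ (so that one can take $\omega = 0$), the standard three-point finite difference Laplacian $A_h$ with Dirichlet conditions in place of $A$, and the resolvent step $(I - \tau A_h)^{-1}$ solved by the Thomas algorithm. All function names are hypothetical.

```python
import math


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (lower[0] and upper[-1] are unused)."""
    n = len(rhs)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def implicit_heat_step(u, tau, h):
    """One resolvent step u -> (I - tau*A_h)^{-1} u for the Dirichlet Laplacian A_h."""
    n = len(u)
    c = tau / (h * h)
    return solve_tridiagonal([-c] * n, [1.0 + 2.0 * c] * n, [-c] * n, u)


def l2_norm(u, h):
    """Discrete analogue of the L_2(0, 1) norm."""
    return math.sqrt(h * sum(v * v for v in u))


n_cells = 50
h = 1.0 / n_cells
tau = 0.01
n_steps = 20
g = [math.sin(math.pi * i * h) for i in range(1, n_cells)]  # initial temperature

u = list(g)
norms = [l2_norm(u, h)]
for _ in range(n_steps):
    u = implicit_heat_step(u, tau, h)
    norms.append(l2_norm(u, h))

# g is an eigenvector of -A_h with eigenvalue lam_h, so each step divides it by 1 + tau*lam_h
lam_h = 4.0 / (h * h) * math.sin(math.pi * h / 2.0) ** 2
expected = [v / (1.0 + tau * lam_h) ** n_steps for v in g]
print(norms[0], norms[-1])
```

The discrete norms decrease along the iteration, which mirrors the nonexpansiveness of the resolvents $(I - rA)^{-1}$ of a dissipative operator.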
It can be shown (see [117, Thm 3 p. 287]) that a function $u \in L_2(T, H_0^1(\Omega))$ whose derivative $u'$ belongs to $L_2(T, H^{-1}(\Omega))$ is in fact in $C(T \cup \{0\}, L_2(\Omega))$, so that $u(0)$ is well defined. Galerkin's method and regularity estimates enable one to obtain the following result (see [117, Section 7.1]). When $f = 0$ and $g \in H_0^1(\Omega) \cap H^2(\Omega)$ its second assertion is a consequence of the preceding proposition.

Theorem 10.13 Assume $f \in L_2(\Omega \times T)$ and $g \in L_2(\Omega)$. Then there exists a unique weak solution $u$ to the system (10.34)–(10.36). If $g \in H_0^1(\Omega)$, then the weak solution $u$ is in $L_2(T, H^2(\Omega)) \cap L_\infty(T, H_0^1(\Omega))$. Moreover, for some $c > 0$ one has the estimate
$$\|u\|_{L_2(T, H^2(\Omega))} + \|u\|_{L_\infty(T, H_0^1(\Omega))} + \|u'\|_{L_2(\Omega \times T)} \le c\|f\|_{L_2(\Omega \times T)} + c\|g\|_{H_0^1(\Omega)}.$$
Assuming more regularity on $f$ and $g$, one can get more regular solutions.

Exercise (Heat Kernel) For $x \in \mathbb{R}^d$, $t \in \mathbb{P}$ let
$$k_t(x) := \frac{1}{(4\pi t)^{d/2}} e^{-\|x\|^2/4t}.$$
Let $X := C_0(\mathbb{R}^d)$ be the space of continuous functions $w : \mathbb{R}^d \to \mathbb{R}$ such that $w(x) \to 0$ as $\|x\| \to \infty$, with the norm given by $\|w\| := \sup_x |w(x)|$. For $t \in \mathbb{P}$ and $w \in X$ let
$$S_t(w)(x) := (k_t * w)(x) := \int_{\mathbb{R}^d} k_t(x - y) w(y)\, d\lambda_d(y), \qquad x \in \mathbb{R}^d.$$
Prove that $S_t(w) \in X$ for all $t \in \mathbb{P}$ and $w \in X$ and that $\|S_t(w)\| \le \|w\|$. Set $S_0(w) := w$ for all $w \in X$. Prove the Chapman-Kolmogorov equation: for all $s$, $t \in \mathbb{P}$ one has
$$k_{s+t}(x) = S_t(k_s)(x) := \int_{\mathbb{R}^d} k_t(x - y) k_s(y)\, d\lambda_d(y), \qquad x \in \mathbb{R}^d.$$
From this deduce that $(S_t)_{t \ge 0}$ is a continuous semigroup on $X$. [See: [246, p. 162–169].]
It can be proved that for $g \in C_0(\mathbb{R}^d)$ the function $u : (x, t) \mapsto S_t(g)(x)$ is the solution of the heat equation
$$\frac{\partial u}{\partial t}(x, t) - \Delta u(x, t) = 0, \qquad u(x, 0) = g(x), \qquad x \in \mathbb{R}^d,\ t \in \mathbb{P}.$$
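Before attempting the exercise, one may test the Chapman-Kolmogorov equation numerically. The following sketch assumes $d = 1$ and replaces the integral over $\mathbb{R}$ by a trapezoidal sum over $[-12, 12]$, which is an approximation chosen here for illustration; the names `heat_kernel`, `trapezoid` and `convolve_kernels` and the sample values of $s$, $t$, $x$ are hypothetical. It is a plausibility check, not a proof.

```python
import math


def heat_kernel(t, x):
    """Heat kernel k_t(x) in dimension d = 1."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)


def trapezoid(func, a, b, n):
    """Composite trapezoidal rule with n subintervals on [a, b]."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return h * total


def convolve_kernels(s, t, x, half_width=12.0, n=2400):
    """Approximate the integral of k_t(x - y) k_s(y) over y in [-half_width, half_width]."""
    return trapezoid(lambda y: heat_kernel(t, x - y) * heat_kernel(s, y),
                     -half_width, half_width, n)


s, t, x = 0.3, 0.5, 0.7
mass = trapezoid(lambda y: heat_kernel(t, y), -12.0, 12.0, 2400)  # should be close to 1
lhs = heat_kernel(s + t, x)
rhs = convolve_kernels(s, t, x)
print(mass, lhs, rhs)
```

The total mass close to $1$ is what makes $\|S_t(w)\| \le \|w\|$ plausible, and the agreement of the last two printed values reflects the semigroup property $S_{s+t} = S_t \circ S_s$.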
10.4 Hyperbolic Problems: The Wave Equation

Retaining the notation and the assumptions of the preceding subsection, let us consider now the second-order hyperbolic problem
$$\frac{\partial^2 u}{\partial t^2}(x, t) - (Lu)(x, t) = f(x, t), \qquad (x, t) \in \Omega \times T, \tag{10.38}$$
$$u(x, t) = 0, \qquad (x, t) \in \Gamma \times T, \tag{10.39}$$
$$u(x, 0) = g(x), \quad \frac{\partial u}{\partial t}(x, 0) = h(x), \qquad x \in \Omega, \tag{10.40}$$
where $g \in H_0^1(\Omega)$, $h \in L_2(\Omega)$. We now suppose $a_{i,j} = a_{j,i}$ for $i$, $j \in \mathbb{N}_d$. Following a scheme used for second-order ordinary differential equations, we recast this problem as a first-order system in $(u, v)$ where $v := \frac{\partial u}{\partial t}$:
$$\Big(\frac{\partial u}{\partial t}(x, t), \frac{\partial v}{\partial t}(x, t)\Big) - \big(v(x, t), (Lu)(x, t)\big) = (0, f(x, t)), \qquad (x, t) \in \Omega \times T,$$
$$(u(x, t), v(x, t)) = (0, 0), \qquad (x, t) \in \Gamma \times T,$$
$$(u(x, 0), v(x, 0)) = (g(x), h(x)), \qquad x \in \Omega.$$
Note that we have added the condition $v(x, t) = 0$ for $(x, t) \in \Gamma \times T$; but this condition is a consequence of (10.39) obtained by differentiating $u$ with respect to $t$. Again, by Gårding's Theorem (Theorem 9.13) there exist constants $c > 0$, $\omega \in \mathbb{R}$ such that inequality (10.37) is satisfied, $b$ being the bilinear form of Theorem 9.13 associated with $L$ as above. We take $X := H_0^1(\Omega) \times L_2(\Omega)$ endowed with the norm $\|\cdot\|_X$ associated with the scalar product given by
$$\langle (u, v) \mid (x, z)\rangle_X := b(u, x) + 2\omega\langle u \mid x\rangle_0 + \langle v \mid z\rangle_0, \qquad (u, v), (x, z) \in X,$$
where $\langle \cdot \mid \cdot \rangle_0$ is the scalar product in $L_2(\Omega)$. By (9.19) with $m = 1$, the norm $\|\cdot\|_X$ is equivalent to the product norm $\|\cdot\|'$ of $\|\cdot\|_1 = \|\cdot\|_{H^1(\Omega)}$ with $\|\cdot\|_0 = \|\cdot\|_{L_2(\Omega)}$ given by
$$\|(u, v)\|' := (\|u\|_1^2 + \|v\|_0^2)^{1/2}, \qquad (u, v) \in H_0^1(\Omega) \times L_2(\Omega).$$
We consider the operator $A$ with domain $D(A) := (H^2(\Omega) \cap H_0^1(\Omega)) \times H_0^1(\Omega)$ and values in $X$ defined by
$$A(u, v) := (v, Lu), \qquad (u, v) \in D(A).$$
Theorem 10.14 The operator $A$ generates an $\omega$-contraction semigroup on $X := H_0^1(\Omega) \times L_2(\Omega)$. If the coefficients $a_i$ of $L$ are null, the operator $A$ generates a nonexpansive semigroup on $X$.

Proof Let us verify the assumptions of Corollary 10.4. Without loss of generality, we assume that $\omega \ge 1$. First, for $(u, v) \in D(A)$, by inequality (9.19) we see that $\omega I - A$ is a monotone operator:
$$\langle (\omega I - A)(u, v) \mid (u, v)\rangle_X = \omega\big(b(u, u) + 2\omega\|u\|_0^2 + \|v\|_0^2\big) - b(v, u) - 2\omega\langle v \mid u\rangle_0 - \langle v \mid Lu\rangle_0$$
$$\ge \omega\big(b(u, u) + \omega\|u\|_0^2\big) + \omega^2\|u\|_0^2 + \|v\|_0^2 - 2\omega\langle v \mid u\rangle_0 \ge 0.$$
Now let us show that for $\lambda > \omega^{1/2}$ and for every $(f, g) \in X$ the equation
$$\lambda(u, v) - A(u, v) = (f, g), \qquad (u, v) \in X \tag{10.41}$$
has a solution. This equation amounts to the system
$$\lambda u - v = f, \qquad \lambda v - Lu = g, \qquad u \in H^2(\Omega) \cap H_0^1(\Omega),\ v \in H_0^1(\Omega).$$
These two equations imply that $v = \lambda u - f$ and $\lambda^2 u - Lu = \lambda f + g$. Since $\lambda^2 > \omega$, Theorem 9.18 ensures that this equation has a unique solution $u \in H^2(\Omega) \cap H_0^1(\Omega)$. Then $(u, \lambda u - f) \in X$ is a solution of equation (10.41): $A - \lambda I$ is surjective and $\omega I - A$ is maximally monotone. If the coefficients $a_i$ of $L$ are null we can take $\omega = 0$ in the preceding and apply Corollary 10.4.

Given $f \in L_2(T, L_2(\Omega))$, $g \in H_0^1(\Omega)$, $h \in L_2(\Omega)$, we say that a function $u \in L_2(T, H_0^1(\Omega))$ whose derivatives $u'$ and $u''$ belong to $L_2(T, L_2(\Omega))$ and $L_2(T, H^{-1}(\Omega))$ respectively is a weak solution to the system (10.38)–(10.40) if $u(0) = g$, $u'(0) = h$ and
$$\langle u''(t), v\rangle + b(u, v) = \langle f(t), v\rangle \qquad \forall v \in H_0^1(\Omega).$$
Again, $u(0)$ and $u'(0)$ are well defined and, using Galerkin's method and regularity estimates, one can establish the following result (see [117, Section 7.2]).

Theorem 10.15 Assume $f \in L_2(T, L_2(\Omega))$, $g \in H_0^1(\Omega)$, $h \in L_2(\Omega)$. Then, the weak solution $u$ of the system (10.38)–(10.40) exists and is unique. Moreover, for
some $c > 0$ one has the estimate
$$\|u\|_{L_\infty(T, H_0^1(\Omega))} + \|u'\|_{L_\infty(T, L_2(\Omega))} \le c\|f\|_{L_2(T, L_2(\Omega))} + c\|g\|_{H_0^1(\Omega)} + c\|h\|_{L_2(\Omega)}.$$
If in addition $f$ has a weak derivative $f'$ in $L_2(T, L_2(\Omega))$, $g \in H^2(\Omega)$, $h \in H_0^1(\Omega)$, then $u \in L_\infty(T, H^2(\Omega))$, $u' \in L_\infty(T, H_0^1(\Omega))$, $u'' \in L_\infty(T, L_2(\Omega))$, $u''' \in L_2(T, H^{-1}(\Omega))$, and their norms are bounded above by a multiple of the sum of the norms of $f$, $f'$, $g$, $h$ in their respective spaces.
Exercises

1. Given $c \in \mathbb{R}$ and two functions $f$, $g$ of class $C^2$ on $\mathbb{R}$, verify that the function $u$ given by
$$u(x, t) := f(x - ct) + g(x + ct), \qquad (x, t) \in \mathbb{R}^2$$
is a solution of the equation
$$\frac{\partial^2 u}{\partial t^2}(x, t) - c^2 \frac{\partial^2 u}{\partial x^2}(x, t) = 0, \qquad (x, t) \in \mathbb{R}^2. \tag{10.42}$$
2. Let $C_c^2(\mathbb{R}^2)$ be the space of functions of class $C^2$ with compact support on $\mathbb{R}^2$. Show that any solution $u$ of class $C^2$ of equation (10.42) is a weak solution of this equation in the sense that for any $\varphi \in C_c^2(\mathbb{R}^2)$ the following relation holds:
$$\int_{\mathbb{R}^2} \Big(\frac{\partial^2 u}{\partial t^2} - c^2 \frac{\partial^2 u}{\partial x^2}\Big)(x, t)\,\varphi(x, t)\,dx\,dt = \int_{\mathbb{R}^2} \Big(\frac{\partial^2 \varphi}{\partial t^2} - c^2 \frac{\partial^2 \varphi}{\partial x^2}\Big)(x, t)\,u(x, t)\,dx\,dt.$$
3. Prove the converse of the assertion of the preceding exercise: if a function $u$ of class $C^2$ is a weak solution of equation (10.42) then $u$ is a (classical) solution of this equation.
4. Show that if $(f_n)$ and $(g_n)$ are two sequences of functions of class $C^2$ on $\mathbb{R}$ that converge in $L_2(\mathbb{R})$ to $f$, $g \in L_2(\mathbb{R})$ respectively, then $u$ given by $u(x, t) := f(x - ct) + g(x + ct)$ for $(x, t) \in \mathbb{R}^2$ is a weak solution to (10.42).
5. Given $u_0$, $u_1 \in C^2(\mathbb{R})$, verify that the function $u$ given by
$$u(x, t) := \frac{1}{2}\big(u_0(x + ct) + u_0(x - ct)\big) + \frac{1}{2c}\int_{x - ct}^{x + ct} u_1(s)\,ds$$
is a solution of equation (10.42) satisfying the initial conditions
$$u(x, 0) = u_0(x), \qquad \frac{\partial u}{\partial t}(x, 0) = u_1(x), \qquad x \in \mathbb{R}.$$
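The formula of Exercise 5 (d'Alembert's formula) can be checked numerically before proving it. The sketch below assumes the sample data $u_0 = \sin$, $u_1 = \cos$ and $c = 2$, and passes a primitive `U1` of $u_1$ so that the integral is evaluated in closed form; these choices and the function names are hypothetical, and central finite differences only approximate the derivatives.

```python
import math


def d_alembert(u0, U1, c):
    """Return u(x, t) from d'Alembert's formula; U1 is a primitive of the initial speed u1."""
    def u(x, t):
        return (0.5 * (u0(x + c * t) + u0(x - c * t))
                + (U1(x + c * t) - U1(x - c * t)) / (2.0 * c))
    return u


def wave_residual(u, c, x, t, h=1e-3):
    """Central-difference approximation of u_tt - c^2 u_xx at (x, t)."""
    u_tt = (u(x, t + h) - 2.0 * u(x, t) + u(x, t - h)) / (h * h)
    u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / (h * h)
    return u_tt - c * c * u_xx


def initial_speed(u, x, h=1e-3):
    """Central-difference approximation of the time derivative of u at (x, 0)."""
    return (u(x, h) - u(x, -h)) / (2.0 * h)


c = 2.0
u = d_alembert(math.sin, math.sin, c)  # u0 = sin, u1 = cos, whose primitive is sin

samples = [(0.0, 0.0), (0.4, 0.3), (-1.2, 0.9), (2.5, 1.7)]
residuals = [wave_residual(u, c, x, t) for (x, t) in samples]
print(residuals)
print([u(x, 0.0) - math.sin(x) for (x, _) in samples])
print([initial_speed(u, x) - math.cos(x) for (x, _) in samples])
```

All printed values should be small, up to the discretization and rounding errors of the finite differences.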
Appendix: The Brouwer’s Fixed Point Theorem
This appendix is devoted to a proof of Brouwer's Theorem. This simple but powerful result plays an important role in analysis and the proof we give uses techniques that are scattered throughout the book. Thus, it can serve as a conclusion. Moreover, we consider some striking related properties. This result has received numerous and very different proofs, the first ones being due to H. Poincaré (1886), J. Hadamard (1910), and L.E.J. Brouwer (1912). Whereas arguments from algebraic topology are natural for such a result, nice analytical proofs have been devised by J. Milnor (1978) and C.A. Rogers (1980); see [197], [222]. The proof we present is inspired by the latter. Throughout we denote by $B$ the closed unit ball of $\mathbb{R}^d$ and by $S$ its unit sphere, $\mathbb{R}^d$ being endowed with its Euclidean norm and its scalar product denoted by $x \cdot y$ for $x$, $y \in \mathbb{R}^d$ rather than $\langle x \mid y\rangle$ for the sake of brevity.

Theorem 10.16 (Brouwer) Any continuous map $f : B \to B$ has a fixed point: there exists some $z \in B$ such that $f(z) = z$.

We denote by (F) the assertion of this theorem and we consider some other assertions:
(Z) For any continuous map $v : B \to \mathbb{R}^d$ such that $v(x) \cdot x \le 0$ for all $x \in S$ there exists some $z \in B$ such that $v(z) = 0$;
(R) There exists no continuous map $r : B \to S$ such that $r(x) = x$ for all $x \in S$.

Assertion (R) is often called the Retraction Theorem (a map $r : B \to S$ such that $r(x) = x$ for all $x \in S$ being called a retraction of $B$ onto $S$) and assertion (Z) is sometimes called the Hairy Ball Theorem. In assertion (Z) $v$ can be considered as a vector field on $B$ since for all $x \in B$ one has $v(x) \in T(B, x)$, the tangent cone to the convex subset $B$ (which is also a smooth manifold with boundary). Recall that the definition yields $T(B, x) = \mathbb{R}^d$ for $x \in U := \mathrm{int}\,B = B \setminus S$ and $T(B, x) = \{w \in \mathbb{R}^d : w \cdot x \le 0\}$ for $x \in S$. The relationships between the preceding assertions are remarkable.
Proposition 10.23 The assertions (F), (Z), (R) are equivalent.

Proof (F)$\Rightarrow$(Z) Suppose (Z) does not hold: there exists some continuous $v : B \to \mathbb{R}^d$ such that $v(x) \ne 0$ for all $x \in B$ and $v(x) \cdot x \le 0$ for all $x \in S$. Then, setting
$$w(x) := \frac{v(x)}{\|v(x)\|},$$
one gets a continuous map $w : B \to S$ and by (F) there exists some $z \in B$ such that $w(z) = z$. Then one has $z \in S$ and one gets the contradiction
$$1 = \|z\|^2 = w(z) \cdot z = \frac{v(z) \cdot z}{\|v(z)\|} \le 0.$$
(Z)$\Rightarrow$(R) Suppose there exists a continuous map $r : B \to S$ such that $r(x) = x$ for all $x \in S$. Setting $v(x) := x - 2r(x)$ one defines a continuous vector field $v$ satisfying $v(x) \cdot x = -\|x\|^2 = -1$ for all $x \in S$ and $v(x) \cdot r(x) = x \cdot r(x) - 2 \le -1$ for $x \in B$ by the Cauchy-Schwarz inequality. Thus $v$ cannot have a zero in $B$, contradicting (Z).
(R)$\Rightarrow$(F) Suppose there exists a continuous map $f : B \to B$ such that $f(x) \ne x$ for all $x \in B$. Set $g(x) := x - f(x)$, $h_x(t) := \|x + tg(x)\|^2 - 1$ for $x \in B$, $t \in \mathbb{R}$. For all $x \in B$ one has $h_x(0) = \|x\|^2 - 1 \le 0$ and $\lim_{t \to \infty} h_x(t) = \infty$, so that there exists some $t_x \in \mathbb{R}_+$ such that $h_x(t_x) = 0$ (Fig. A.1). In fact, $t_x$ is the unique nonnegative root of the quadratic function $h_x : t \mapsto t^2\|g(x)\|^2 + 2t\,g(x) \cdot x + \|x\|^2 - 1$, hence
$$t_x = \|g(x)\|^{-2}\big((g(x) \cdot x)^2 + \|g(x)\|^2(1 - \|x\|^2)\big)^{1/2} - \|g(x)\|^{-2}\, g(x) \cdot x.$$
Since by compactness of $B$ there exists a $c > 0$ such that $\|g(x)\| \ge c$ for all $x \in B$, the function $x \mapsto t_x$ is continuous, and $r : x \mapsto x + t_x g(x)$ is continuous too. By construction, one has $r(B) \subset S$. For $x \in S$ one has $h_x(0) = 0$, so that $t_x = 0$; this also follows from the expression of $t_x$: $g(x) \cdot x = \|x\|^2 - f(x) \cdot x \ge 0$, $((g(x) \cdot x)^2)^{1/2} = g(x) \cdot x$ and $t_x = 0$, so that $r(x) = x$. Assertion (R) is thus denied. Thus (F) must hold.

[Fig. A.1 Intersecting the sphere with the half-line $x + \mathbb{R}_+(x - f(x))$]

Remark The implication (Z)$\Rightarrow$(F) is immediate: given a continuous map $f : B \to B$, setting $v(x) := f(x) - x$ one has $v(x) \cdot x \le 0$ for $x \in S$ by the Cauchy-Schwarz inequality, so that (Z) implies that there exists some $z \in B$ such that $f(z) - z = 0$. $\square$

We prove Brouwer's Theorem, or rather assertion (R), in two steps. The first one is a weakened version of (R).

Lemma 10.6 There is no retraction of class $C^1$ from $B$ onto $S$.

Proof Suppose $p : B \to S$ is a retraction of class $C^1$ from $B$ onto $S$. For $t \in T := [0, 1]$ let $p_t = (1 - t)I + tp$, where $I$ is the identity map on $B$. Since $p - I$ is of class $C^1$ and $B$ is compact, there exists some $c \ge 1$ such that $p - I$ is Lipschitzian with rate $c$ on $B$. Then, for $t \in [0, 1/c[$ the map $p_t$ is injective and its inverse is Lipschitzian since for $x$, $y \in B$ one has
$$\|p_t(x) - p_t(y)\| \ge \|x - y\| - t\|(p - I)(x) - (p - I)(y)\| \ge (1 - ct)\|x - y\|,$$
so that $x = y$ when $p_t(x) = p_t(y)$. Moreover, since $(t, x) \mapsto \det(Dp_t(x))$ is a continuous function on $T \times B$ and is $1$ on $\{0\} \times B$, there exists some $\varepsilon \in\, ]0, 1/c[$ such that $\det(Dp_t(x)) > 0$ for $(t, x) \in [0, \varepsilon] \times B$. The inverse function theorem ensures that for $t \in [0, \varepsilon]$ the image $p_t(U)$ of the open unit ball $U := B \setminus S$ is an open subset of $B$, hence is contained in $U$. Since $p_t(S) = S$ and $U \cap S = \emptyset$, we have $U \setminus p_t(U) = U \setminus p_t(B)$, so that $U \setminus p_t(U)$ is open, $p_t(B)$ being compact. Since $U$ is connected and is the union of the disjoint open subsets $p_t(U)$ and $U \setminus p_t(U)$, with $p_t(U)$ nonempty, we have $U = p_t(U)$. Let
$$f(t) := \int_U \det Dp_t \, d\lambda_d, \qquad t \in T.$$
The change of variable theorem ensures that for $t \in [0, \varepsilon]$, $f(t)$ is the measure of the open set $p_t(U) = U$. Since $f(\cdot)$ is a polynomial, it is constant on $T$, with value $f(0) = \lambda_d(U) > 0$. However, since $\|p(x)\|^2 = 1$ for $x \in B$ we have $(Dp(x)v) \cdot p(x) = 0$ for $x \in B$, $v \in \mathbb{R}^d$; since $p(x)$ is nonzero, this relation shows that $Dp(x)$ is not an isomorphism, hence that $\det Dp(x) = 0$. Thus $f(1) = 0$, a contradiction. $\square$

The second step of the proof is given by the next lemma.

Lemma 10.7 If there were a continuous retraction from $B$ onto $S$, then there would exist a retraction of class $C^1$ from $B$ onto $S$.

Proof Suppose there exists a continuous retraction $q$ from $B$ onto $S$. Approximating the components of $q$ by functions of class $C^1$, we would obtain a sequence $(q_n)$ of maps of class $C^1$ on $B$ such that $(\|q_n - q\|_\infty) \to 0$, where $\|\cdot\|_\infty$ is the norm of uniform convergence. Let $h : \mathbb{R} \to [0, 1]$ be a bump function of class $C^1$ satisfying $h(0) = 1$, $h(r) = 0$ for $r \in \mathbb{R} \setminus [-1, 1]$ and let $h_n : \mathbb{R}^d \to \mathbb{R}$ be given by
650
Appendix: The Brouwer’s Fixed Point Theorem
hn .x/ WD h.n kxk2 n/, so that hn .x/ D 1 for all x 2 S and .hn .x// ! 0 for x 2 Rd nS. Let pn W B ! Rd be given by pn .x/ WD hn .x/x C .1 hn .x//qn .x/
x 2 B:
Since hn .x/ 2 Œ0; 1; we have rn .x/ WD kpn .x/k max.kqn .x/k ; 1/: Let us show that there exist some c > 0 and m 2 N such that rn .x/ c for all x 2 B and n m. Otherwise we could find an infinite subset N of N and x 2 B, t 2 Œ0; 1 such that .rn .xn //n2N ! 0, .xn /n2N ! x, .hn .xn // ! t, hence tx C .1 t/q.x/ D 0. The case x 2 S would be excluded since then q.x/ D x whereas the case x 2 U would be excluded too since then we would have hn .xn / D 0 for n 2 N large enough, hence q.x/ D 0; a contradiction with q.x/ 2 S. We conclude that, for n m, .pn =rn / is a map of class C1 from B to S satisfying pn .x/=rn .x/ D x for all x 2 S since then hn .x/ D 1 and pn .x/ D x; rn .x/ D 1.
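The cutoff construction used in the proof of Lemma 10.7 can be explored numerically. The following Python sketch is only an illustration under stated assumptions: the bump function $h(r) = (1 - r^2)^2$ on $[-1, 1]$ is one possible $C^1$ choice (the text does not fix one), the dimension is taken to be $d = 2$, and `q_approx` is a hypothetical smooth map standing in for $q_n$ (it cannot be an actual retraction, since none exists). The sketch checks that $h_n = 1$ on $S$, that $h_n(x) = 0$ at a fixed point of $U$ once $n$ is large, and that $\|p_n(x)\| \le \max(\|q_n(x)\|, 1)$ on sampled points of $B$.

```python
import math
import random


def h(r):
    """A C^1 bump with h(0) = 1 and h(r) = 0 outside [-1, 1] (illustrative choice)."""
    return (1.0 - r * r) ** 2 if abs(r) <= 1.0 else 0.0


def norm(x):
    """Euclidean norm of a tuple of floats."""
    return math.sqrt(sum(c * c for c in x))


def h_n(n, x):
    """Cutoff h_n(x) = h(n * ||x||^2 - n)."""
    return h(n * norm(x) ** 2 - n)


def q_approx(x):
    """Hypothetical smooth map standing in for q_n; it is NOT a retraction."""
    return (math.cos(x[1]), math.sin(x[0]))


def p_n(n, x):
    """p_n(x) = h_n(x) x + (1 - h_n(x)) q_n(x)."""
    t = h_n(n, x)
    qx = q_approx(x)
    return tuple(t * xi + (1.0 - t) * qi for xi, qi in zip(x, qx))


def check_bound(n, samples=1000, seed=0):
    """Check ||p_n(x)|| <= max(||q_n(x)||, 1) on random points of the unit disk."""
    rng = random.Random(seed)
    for _ in range(samples):
        # Rejection sampling: draw from the square until the point lies in B.
        while True:
            x = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
            if norm(x) <= 1.0:
                break
        if norm(p_n(n, x)) > max(norm(q_approx(x)), 1.0) + 1e-12:
            return False
    return True
```

The check only concerns the upper bound on $r_n$; the lower bound $r_n \ge c$ in the proof relies on $q$ being a retraction and is not reproduced by this stand-in map.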
1. R.A. Adams, Sobolev Spaces, Academic Press, 1975. 2. R.P. Agarval and D. O’Regan, Ordinary and Partial Differential Equations With Special Functions, Fourier Series, and Boundary Value Problems, Universitext, Springer, New York, 2009. 3. S. Agmon, Lectures on Elliptic Boundary Value Problems, Van Nostrand, Princeton, 1965. 4. A.A. Agrachev, Geometry of optimal control problems and Hamiltonian systems, in Nonlinear and Optimal Control Theory, A.A. Agrachev et al. (eds) Lecture Notes in Mathematics #1932, 1–59, (2000). 5. N.I. Akhiezer, The Calculus of Variations, Harwood, Chur, 1988. 6. N. Akhiezer and I. Glazman, Theory of Linear Operators in Hilbert Spaces, Pitman, 1980. 7. A. Ambrosetti and G. Prodi, A Primer of Nonlinear Analysis, Cambridge University Press, London, 1993. 8. A. Ambrosetti and D. Arcoya Alvarez, An Introduction to Nonlinear Functional Analysis and Elliptic Problems, Progress in Nonlinear Differential Equations and Their Applications #82, Birkhäuser, Boston, 2011. 9. H. Amann, Analysis II, Birkhäuser, Basel, 1999, 2008. 10. T.M. Apostol, Mathematical Analysis, Addison-Wesley, Reading, Massachusetts, 1974. 11. V.I. Arnold, Ordinary Differential Equations, Universitext, Springer, Berlin, 1992. 12. J.-P. Aubin and I. Ekeland, Applied Nonlinear Analysis, Wiley Interscience, New York, 1984. 13. J.-P. Aubin and H. Frankowska, Set-Valued Analysis, Birkhäuser, Boston, 1990. 14. D. Azé, Eléments d’Analyse Convexe et Variationnelle, Ellipse, Paris, 1997. 15. D. Azé and J.-B. Hiriart-Urruty, Analyse Variationnelle et Optimisation, Cepadues, Toulouse, 2010. 16. D. Azé, G. Constans, J.-B. Hiriart-Urruty, Calcul Différentiel et Equations Différentielles, Dunod, Paris, 2002. 17. D. Azé and J.-P. Penot, Uniformly convex and uniformly smooth convex functions, Ann. Fac. Sciences Toulouse, 4 no 4, 705–730 (1995). 18. G. Bachman, L. Narici, E. Beckenstein, Fourier and Wavelet Analysis, Springer-Verlag, New York, 2000. 19. M. Badiale and E. 
Serra, Semilinear Elliptic Equations for Beginners, Existence Results via the Variational Approach, Universitext, Springer, 2011. 20. H. Bahouri, J.-Y. Chemin and R. Danchin, Fourier Analysis and Nonlinear Partial Differential Equations, Grundlehren der mathematischen Wissenschaften #343, Springer-Verlag, Berlin, 2011. 21. I.J. Bakelman, Convex Analysis and Nonlinear Geometric Elliptic Equations, SpringerVerlag, Berlin, 1994.
22. A. Balakrishnan, Applied Functional Analysis, Applications of Mathematics 3, Springer, New York, 1976. 23. V. Barbu, Nonlinear Semigroups and Differential Equations in Banach Spaces, Noordhoff, Leyden, 1978. 24. V. Barbu, Partial Differential Equations and Boundary Value Problems, Kluwer, Dordrecht, 1998. 25. V. Barbu, Nonlinear Differential Equations of Monotone Types in Banach Spaces, Springer Monographs in Mathematics, Springer, New York, 2010. 26. V. Barbu and G. Da Prato, Hamilton-Jacobi Equations in Hilbert Spaces, Research Notes in Mathematics #86, Pitman, Boston, 1983. 27. M. Bardi and I. Capuzzo-Dolcetta, Optimal Control and Viscosity Solutions to HamiltonJacobi-Bellman Equations, Birkhäuser, Basel, 1997, 2008. 28. T. Bascelli et al., Fermat, Leibniz, Euler, and the gang: the true history of the concept of limit and shadow, Notices of the American Mathematical Society, 61, no 8, 848–864 (2014). 29. H.H. Bauschke and P.-L. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces, CMS Books in Mathematics, Springer, 2011. 30. R. Beals, Advanced Mathematical Analysis, Graduate Texts in Mathematics #12, Springer, New York, 1973. 31. J.J. Benedetto-M.W. Frazier, Wavelets, Mathematics and Applications, CRC Press, Boca Raton, 1994. 32. P. Bérard, Spectral Geometry: Direct and Inverse Problems, Lecture Notes in Mathematics 1207, Springer, 1986. 33. S.K. Berberian, Lectures in Functional Analysis and Operator Theory, Graduate Texts in Mathematics #15, Springer, New York, 1974. 34. S.K. Berberian, A First Course in Real Analysis, Springer, New York, 1994. 35. S.K. Berberian and P. R. Halmos, Lectures in Functional Analysis and Operator Theory, Graduate Texts in Mathematics, Springer, New York, 1974, 2014. 36. M. Berger, Geometry of the Spectrum, in: Differential Geometry (S.S. Chern and R. Osserman eds.), pp. 129–152, Proc. Sympos. Pure Math. Vol 27, Part 2, American Mathematical Society, Providence, 1975. 37. L. Bers, F. John, M. 
Schechter, Partial Differential Equations, American Mathematical Society, Providence, Rhode Island, 1979. 38. P. Billingsley, Probability and Measure, Anniversary Edition, Wiley, Hoboken, New Jersey, 1979, 1986, 1995, 2012. 39. G. Birkhoff, A Source Book in Classical Analysis, Harvard University Press, Cambridge, MA, 1973. 40. G.A. Bliss, Lectures on the Calculus of Variations, The University of Chicago Press, Chicago, 1946, 1968. 41. V.I. Bogachev, Measure Theory I, Springer, Berlin, 2007. 42. V.I. Bogachev, Measure Theory II, Springer, Berlin, 2007. 43. J.M. Borwein, Maximal monotonicity via convex analysis, J. Convex Anal. 13, 561–586 (2006). 44. J.M. Borwein and A.S. Lewis, Convex Analysis and Nonlinear Optimization. Theory and Examples, Canadian Mathematical Society, Books in Mathematics, Springer-Verlag, New York, 2000. 45. J.M. Borwein and J.D. Vanderwerff, Convex Functions: Constructions, Characterizations and Counterexamples, Cambridge University Press, Cambridge, 2010. 46. J.M. Borwein and J. Vanderwerff, Fréchet-Legendre functions and reflexive Banach spaces, J. Convex Anal. 17 no 3&4, 915–924 (2010). 47. J.M. Borwein and Q.J. Zhu, Techniques of Variational Analysis, CMS Books in Mathematics, Springer, 2005 48. J.M. Borwein and Q.J. Zhu, Variational methods in convex analysis, J. Global Optim. 35 no 2, 197–213 (2006).
References
653
49. N. Bourbaki, Elements d’Histoire des Mathématiques, Masson, Paris, 1984, Springer, Berlin, 2007. 50. A. Bressan, Lecture Notes on Functional Analysis With Applications to Linear Partial Differential Equations, Graduate Studies in Mathematics #143, American Mathematical Society, Providence, Rhode Island, 2013. 51. H. Brézis, Opérateurs Maximaux Monotones et Semi-groupes de Contractions dans les Espaces de Hilbert, Mathematics Studies #5, North Holland, Amsterdam, 1973. 52. H. Brezis, Functional Analysis, Sobolev Spaces and Partial Differential Equations, Universitext, Springer, 2011; adapted from “Analyse Fonctionnelle”, Masson, Paris, 1983. 53. H. Brezis, J.M. Coron and L. Nirenberg, Free vibrations for a nonlinear wave equation and a theorem of Rabinowicz, Comm. Pure and Appl. Math. 33, 667–689 (1980). 54. M. Briane and G. Pagès, Analyse. Théorie de l’Intégration. Convolution et Transformée de Fourier, Vuibert, Paris 2012. 55. Brook and Chacon, Continuity and compactness of measures, Adv. in Math. 37, 16–26 (1980). 56. A. Browder, Mathematical Analysis. An Introduction, Undergraduate Texts in Mathematics, Springer-Verlag, New York, 1996. 57. F.E. Browder, Existence theorems for nonlinear partial differential equations, in “Global Analysis”, Proceedings of Symposia in Pure Mathematics 16, American Mathemetical Society, Providence, 1970. 58. F.E. Browder, Nonlinear Operators and Nonlinear Equations of Evolution in Banach Spaces, Proceedings of Symposia in Pure Mathematics 18, Part 2, American Mathematical Society, Providence, 1976. 59. A. Brown and C. Pearcy, An Introduction to Analysis, Graduate Texts in Mathematics #154, Springer-Verlag, New York, 1995. 60. A.M. Bruckner, Differentiation of Real Functions, Springer-Verlag, Berlin, 1978, American Mathematical Society, Providence, R.I., 1994. 61. G. Buttazzo, M. Giaquinta, and S. Hildebrandt, One-Dimensional Variational Problems, Oxford University Press, Oxford, 1998 62. E. Cancès, C. Le Bris, Y. 
Maday, Méthodes Mathématiques en Chimie Quantique. Une introduction, Mathématiques et Applications #53, Springer-Verlag, Berlin, 2006. 63. C. Canuto and A. Tabacco, Mathematical Analysis II, Universitext, Springer-Verlag Italia, Milan, 2010. 64. M. Capinski and E. Kopp, Measure, Integral and Probability, Springer Undergraduate Mathematics Series, Springer-Verlag, London 1999. 65. L. Carleson, On convergence and growth of partial sums of Fourier series, Acta Math. 116, 135–157 (1966). 66. H. Cartan, Differential Calculus, Mifflin Co, Boston, 1971, translated from “Cours de Calcul Différentiel”, Hermann, Paris, 1971. 67. J. Cerdà, Linear Functional Analysis, Graduate Studies in Mathematics #116, American Mathematical Society, Providence, 2010. 68. K.-C. Chang, Methods in Nonlinear Analysis, Springer Monographs in Mathematics, Springer-Verlag, Berlin, 2005. 69. I. Chavel, Eigenvalues in Riemannian Geometry, Academic Press, 1994. 70. Y.-Z. Chen and L.C. Wu, Second Order Elliptic Equations and Elliptic Systems, Translation of Mathematical Monographs #174, American Mathematical Society, Providence, 1998. 71. W. Cheney, Analysis for Applied Mathematics, Graduate Texts in Mathematics #208, Springer-Verlag, New York, 2001. 72. P. Cherrier and A. Milani, Linear and Quasi-linear Evolution Equations in Hilbert Spaces, Graduate Studies in Mathematics #135, American Mathematical Society, Providence, 2012. 73. Ch. Chidume, Geometric Properties of Banach Spaces and Nonlinear Iterations, Lecture Notes in Mathematics #1965, Springer-Verlag, London 2009. 74. M. Chipot, Elliptic Equations: An Introductory Course, Birkhäuser Advanced Texts, Birkhäuser, Basel 2009.
654
References
75. G. Choquet, Topology, Academic Press 1966, translated from: Cours d’Analyse, Tome II, Topologie, Masson, Paris 1969. 76. A. Cialdea and V. Maz’ya, Semi-bounded Differential Operators, Contractive Semigroups and Beyond, Birkhäuser, Basel, 2014. 77. Ph. Ciarlet, Linear and Nonlinear Functional Analysis With Applications, SIAM, Philadelphia, 2013. 78. F.H. Clarke, Functional Analysis, Calculus of Variations and Optimal Control, Graduate Texts in Mathematics #264, Springer, London, 2013. 79. E. Coddington and N. Levinson, Theory of Ordinary Differential Equations, McGraw Hill, 1955. 80. D.L. Cohn, Measure Theory, Birkhäuser, Boston, 1980, 1993, 1996, 1997. 81. P. L. Combettes, J.-B. Hiriart-Urruty and M. Théra, Preface to Modern Convex Analysis, special issue of Mathematical Programming, series B 148, no 1–2 (2014). 82. J.B. Conway, A Course in Functional Analysis, Graduate Texts in Mathematics 96, Springer, New York, 1990. 83. R. Courant and D. Hilbert, Methods of Mathematical Physics, I, II, Intersience, 1962. 84. R. Courant and F. John, Introduction to Calculus and Analysis I, Classics in Mathematics, Springer-Verlag, Berlin, 1946, 1989, 1999. 85. M.G. Crandall and A. Pazy, Nonlinear semi-groups of contractions and dissipative sets, J. Funct. Anal. 3, 376–418 (1969). 86. B. Dacorogna, Direct Methods in the Calculus of Variations, Applied Mathematical Sciences #78, Springer, New York, 2008. 87. M. Danesi, Discovery in Mathematics: An Interdisciplary Perspective, Lincom Europa, Muenchen, 2013. 88. R. Dautray and J.-L. Lions, Mathematical Analysis and Numerical Methods for Science and Technology, (6 volumes), Springer, 1988, Masson, Paris, 1988. 89. B. Davies, Integral Transforms and Their Applications, Texts in Applied Mathematics #41, Springer, 1978, 1985, 2002. 90. S.R. Deans, The Radon Transform and Some of its Applications, Wiley, New York, 1973. 91. K. 
Deimling, Ordinary Differential Equations in Banach Spaces, Lecture Notes in Mathematics, #596, Springer-Verlag, Berlin, 1977. 92. K. Deimling, Nonlinear Functional Analysis, Springer-Verlag, Berlin, 1985. 574 93. K. Deimling, Multivalued Differential Equations, De Gruyter Series in Nonlinear Analysis and Applications 1, De Gruyter, Berlin, 1992. 94. F. Demengel and G. Demengel, Functional Spaces for the Theory of Elliptic Partial Differential Equations, Universitext, Springer, London, 2012. 95. R. Deville, G. Godefroy and V. Zizler, Smoothness and Renormings in Banach Spaces, Pitman Monographs 64, Longman, 1993. 96. E. DiBenetto, Partial Differential Equations, Birkhäuser, Basel, 2009. 97. J. Diestel, Geometry of Banach Spaces: Selected Topics, Springer, New York, 1975. 98. J. Diestel, Sequences and Series in Banach Spaces, Graduate Texts in Mathematics vol. 92, Springer, New York, 1984. 99. J. Diestel and J.J. Uhl Jr., Vector Measures, Mathematical Surveys #15, American Mathematical Society, Providence, R.I., 1977. 100. J. Dieudonné, Treatise on Analysis, (8 volumes) Academic Press, New York and London, 1960, 1969. 101. J. Dieudonné, Panorama des Mathématiques Pures. Le Choix Bourbachique, Gauthier-Villars, Bordas, Paris, 1977. 102. A.L. Dontchev and R.T. Rockafellar, Implicit Functions and Solutions Mappings. A View from Variational Analysis, Springer Monographs in Mathematics, Springer Dordrecht, 2009. 103. P. Drábek and J. Milota, Methods of Nonlinear Analysis. Applications to Differential Equations, Birhäuser Advanced Texts, Birhäuser, Basel, 2007. 104. R.M. Dudley and R. Norvaiša, Concrete Functional Calculus, Springer Monographs in Mathematics, Springer, New York, 2011.
References
655
105. P. Dugac, Histoire de l’Analyse, Vuibert, Paris, 2003. 106. N. Dunford and J.T. Schwartz, Linear Operators, I, General Theory, Pure and Applied Mathematics vol. 7, Wiley Interscience, New York, 1958, 1967. 107. Dunkl, Ch. F., Xu, Y., Orthogonal Polynomials of Several Variables, Encyclopedia of Mathematics and its Applications 81, Cambridge University Press, Cambridge, 2001. 108. W.F. Eberlein, Weak compactness in Banach spaces, I, Proc. Nat. Acad. Sci. U.S.A. 33, 51– 53, 1947. 109. R. Edwards, Functional Analysis, Holt-Rinehart-Winston, 1965. 110. C.H. Edwards Jr., The Historical Development of the Calculus, Springer, 1979. 111. Y. Eidelman, V. Milman and A.Tsolomitis, Functional Analysis. An Introduction, Graduate Studies on Mathematics #66, American Mathematical Society, Providence, Rhode Island, 2004. 112. I. Ekeland and R. Temam, Convex Analysis and Variational Problems, Classics in Applied Mathematics #28, Society for Industrial and Applied Mathematics, Philadelphia, PA, 1999, translated from the French: Analyse Convexe et Problèmes Variationnels, Dunod, Paris, 1974. 113. K.-J. Engel and R. Nagel, One-parameter Semigroups for Linear Evolution Equations, Graduate Texts in Mathematics #194, Springer, New York, 2000. 114. K.-J. Engel and R. Nagel, A Short Course on Operator Semigroups, Universitext, Springer, New York, 2006. 115. B. Epstein, Introduction to Lebesgue Integration and Infinite Dimensional Problems, Saunders, Philadelphia, 1970. 116. G. Eskin, Lectures on Linear Partial Differential Equations, Graduate Studies in Mathematics #123, American Mathematical Society, Providence, Rhode Island, 2011. 117. L.C. Evans, Partial Differential Equations, Graduate Studies in Mathematics 19, American Mathematical Society, Providence, Rhode Island, 1998, 2002, 2008, 2010 . 118. L.C. Evans and R.F. Gariepy, Measure Theory and Fine Properties of Functions, Studies in Advanced Mathematics, CRC Press, Boca Raton FL, 1992. 119. M. Fabian, P. Habala, P. Hájek, J. 
Pelant, V. Montesinos, and V. Zizler, Functional Analysis and Infinite Dimensional Geometry, CMS Books # 8, Springer, New York 2001. 120. M. Fabian, P. Habala, P. Hájek, V. Montesinos, and V. Zizler, Banach Space Theory, The Basis for Linear and Nonlinear Analysis, CMS Books in Mathematics, Springer, New York, 2011. 121. T. Figiel, On the moduli of convexity and smoothness, Studia Math 56(2), 121–155 (1976). 122. T.G. Foeman, The Mathematics of Medical Imagery, Undergraduate Texts in Mathematics and Technology, Springer, 2010. 123. G.B. Folland, Introduction to Partial Differential Equations, Princeton University Press, 1976. 124. I. Fonseca and G. Leoni, Modern Methods in the Calculus of Variations: Lp Spaces, Springer Monographs in Mathematics, Springer, New York, 2007. 125. J. Foran, Fundamentals of Real Analysis, Marcel Dekker, New York, 1991. 126. D.H. Fremlin, Measure Theory, Volume 1, The Irreducible Minimum, Torres Fremlin, Colchester, 2000, 2001, 2004. 127. D.H. Fremlin, Measure Theory, Volume 2, Broad Foundations, Torres Fremlin, Colchester, 2001. 128. A. Friedman, Partial Differential Equations, Holt, Rinehart and Winston, New York, 1969. 129. B. Friedman, Lectures on Applications-Oriented Mathematics, Wiley Classics Library, John Wiley, New York, 1964, 1991. 130. G. Gagneux and M. Madaune-Tort, Analyse mathématique de modèles nonlinéaires de l’ingénierie pétrolière, Mathematics and Applications no 22, Springer, Berlin, 1996. 131. P. Garabedian, Partial Differential Equations, Wiley, 1964. 132. R.F. Gariepy and W.P. Ziemer, Modern Real Analysis, PWS Publishing company, Boston, 1994. 133. M.I. Garrido and J. Jaramillo, Homomorphisms on function lattices, Monatsh. Math. 141, 127–146, (2004) 134. I.M. Gelfand and S.V. Fomin, Calculus of Variations, Prentice Hall, Englewood Cliffs, N.J., 1963.
656
References
135. N. Ghossoub and D. Preiss, A general mountain pass principle for locating and classifying critical points, Ann. Inst. H. Poincaré, 6, 321–330, (1989). 136. M. Giaquinta, Multiple Integrals in The Calculus of Variations and Nonlinear Elliptic Systems, Princeton University Press, 1983. 137. M. Giaquinta and S. Hildebrandt, Calculus of Variations I: the Lagrangian Formalism, Grundlehren der mathematischen Wissenschaften #310, Springer-Verlag, Berlin, 1996. 138. M. Giaquinta and S. Hildebrandt, Calculus of Variations II: the Hamiltonian Formalism, Grundlehren der mathematischen Wissenschaften #311, Springer-Verlag, Berlin, 1996. 139. D. Gilbarg and N. Trudinger, Elliptic Partial Differential Equations of Second Order, Springer, Classics in Mathematics, 1977, 1998, 2001. 140. J.-M. Gilsinger and M. El Jai, Eléments d’Analyse Fonctionnelle. Fondements et Applications aux Sciences de l’Ingénieur, Presses polytechniques romandes, Lausanne, 2010. 141. O. Giraud and K. Thas, Hearing shapes of drums-mathematical and physical aspects of isospectrality, Mathematical Physics 82, 2213–2245, 2010. 142. E. Giusti, Direct Methods in the Calculus of Variations, World Scientific, Singapore, 2003. 143. S. Givant, Duality Theories for Boolean Algebras with Operators, Springer Monographs in Mathematics, Cham, 2014. 144. R. Glowinski, Numerical Methods for Nonlinear Variational Problems, Computational Physics Series, Springer Verlag, Heidelberg, 1984. 145. R. Glowinski, J.-L. Lions and R. Trémolières, Numerical Analysis of Variational inequalities, North Holland, Amsterdam, 1981. 146. R. Godement, Analysis I, II, Universitext, Springer-Verlag, Berlin, 2005. 147. J. Goldstein, Semigroups of Operators and Applications, Oxford University Press, Oxford, 1985. 148. H.H. Goldstine, A History of the Calculus of Variations, Springer-Verlag, New York, 1980. 149. L. Grafakos, Classical and Modern Fourier Analysis, Graduate Texts in Mathematics #249, Springer, New York, 2008. 150. P.R. 
Halmos, Measure Theory, Van Nostrand, Princeton, 1950, Graduate Texts in Mathematics # 18, Springer, New York, 1974. 151. Qing Han, A Basic Course in Partial Differential Equations, Graduate Studies in Mathematics #120, American Mathematical Society, Providence 2011. 152. A. Haraux, Nonlinear Evolution Equations_Global Behavior of Solutions, Lecture Notes in Mathematics #841, Springer-Verlag, Berlin, 1981. 153. D. Haroske and H. Triebel, Distributions, Sobolev Spaces, Elliptic Equations, European Mathematical Society, Zürich, 2008. 154. Ph. Hartman, Ordinary Differential Equations, Wiley, New York, 1964. 155. H. Hattori, Partial Differential Equations, Methods, Applications and Theories, World Scientific, Singapore, 2013. 156. B. Hauchecorne and D. Suratteau, Des Mathématiciens de A à Z, Ellipses, Paris, 2008. 157. S. Helgason, The Radon Transform, Birkhäuser, Basel, 1980. 158. E. Hernandez, G. Weiss, A First Course on Wavelets, Studies in Advanced Mathematics, CRC Press, Boca Raton, Florida, 1996. 159. E. Hewitt and K. Stromberg, Real and Abstract Analysis, Springer, New York (1965). 160. E. Hille, Methods in Classical and Functional Analysis, Addison-Wesley, 1972. 161. E. Hille and R.S. Phillips, Functional Analysis and Semi-groups, American Mathemetical Society, Providence, R.I., 1974. 162. J.-B. Hiriart-Urruty and C. Lemaréchal, Fundamentals of Convex Analysis, Springer, Berlin, 2001. 163. F. Hirsch and G. Lacombe, Elements of Functional Analysis, Graduate Texts in Mathematics #192, Springer, New York, 1999. 164. R.B. Holmes, A Course on Optimization and Best Approximation. Lecture Notes in Mathematics #257, Springer-Verlag, New York, 1972. 165. R.B. Holmes, Geometric Functional Analysis and its Applications, Graduate Texts in Mathematics #24, Springer-Verlag, New York 1975.
References
657
166. A.D. Ioffe and V.M. Tikhomirov, Theory of Extremal Problems, Studies in Mathematics and Its Applications, North Holland, Amsterdam, 1979, translated from the Russian, Nauka, Moscow, 1974. 167. M. Ivanov and N. Zlateva, A (hopefully) new proof of maximality of the subdifferential operator of a convex function, preprint, February 2015. 168. J. Jacod and P. Protter, Probability Essentials, Universitext, Springer-Verlag, Berlin, 2000, 2003, 2004. 169. J. Jost, Postmodern Analysis, Universitext, Springer, New York, 1997, 2002, 2005. 170. J. Jost and X. Li-Jost, Calculus of Variations, Cambridge studies in advanced mathematics #64, Cambridge University Press, Cambridge, 1998. 171. M. Kac, Can one hear the shape of a drum?, American Math. Monthly 73, pp. 1–23 (1966). 172. T. Kato, Perturbation Theory for Linear Operators, Grundlehren der mathematischen Wissenschaften #132, Springer-Verlag, Berlin, 1966, 1976. 173. T. Kato, Accretive Operators and Nonlinear Evolution Equations in Banach Spaces, Proceedings of Symposia in Pure Mathematics 18, Part 2, pp. 138–161, American Mathemetical Society, Providence, 1976. 174. J. Kelley, General Topology, Van Nostrand, New York, 1955. 175. D. Kinderlehrer and G. Stampacchia, An Introduction to Variational Inequalities and Their Applications, Academic Press, New York, 1980. 176. A. Komolgorov and S. Fomin, Introductory Real Analysis, Prentice-Hall, 1970. 177. T.W. Körner, A Companion to Analysis. A Second First and First Second Course in Analysis, Graduate Studies in Mathematics #62, American Mathematical Society, Providence, Rhode Island, 2004. 178. A.M. Krall, Applied Analysis, D.Reidel, Dordrecht, Holland, 1986. 179. M.A. Krasnoselskii, P.P. Zabreiko, E.I. Pustylnik, P.E. Sobolevskii, Integral operators in spaces of summable functions, Noordhoff, Groningen, 1976. 180. N.V. 
Krylov, Lectures on Elliptic and Parabolic Equations in Sobolev Spaces, Graduate Studies in Mathematics #96, American Mathematical Society, Providence, Rhode Island, 2008. 181. D. Labate, G. Weiss, E. Wilson, Wavelets, Notices Amer. Math. Soc. 60, no1, 66–76 (2013). 182. S. Lang, Analysis II, Addison-Wesley, Reading, Massachusetts, 1969. 183. S. Lang, Real and Functional Analysis, Springer-Verlag, New York, 1993. 184. P. Lax, S. Burstein and A. Lax, Calculus With Applications and Computing, I, II, Undergraduate Texts in Mathematics, Springer-Verlag, New York, 1976, 1984. 185. E. H. Lieb and M. Loss, Analysis, Graduate Studies in Mathematics #14, American Mathematical Society, Providence, 1997, 2001. 186. J. Lindenstrauss, A short proof of Liapounoff’s convexity theorem, J. Math. Mech. 15, 1966, 971–972. 187. J. Lindenstrauss and L. Tzafriri, Classical Banach Spaces, (two volumes), Springer, 1973, 1979 188. J-L. Lions, Quelques Méthodes de Résolution des Problèmes aux Limites Non-Linéaires, Dunod-Gauthier-Villars, Paris, 1969. 189. D. Luenberger, Optimization by Vector Spaces Methods, Wiley, New York, 1969. 190. B. Makarov and A. Podkorytov, Real Analysis: Measures, Integrals and Applications, Universitext, Springer-Verlag, London, 2013. 191. D.P. Maki and M. Thompson, Mathematical Models and Applications, Prentice Hall, Englewood Cliffs, N.J., 1973. 192. P.A. Markowich, Applied Partial Differential Equation. A Visual Approach, Springer, Berlin, 2007. 193. C.-M. Marle, Mesures et Probabilités, Hermann, Paris, 1974. 194. R.H. Martin Jr., Nonlinear Operators and Differential Equations in Banach Spaces, Wiley, New York, 1976. 195. V.G. Maz’ja, Sobolev Spaces, With Applications to Elliptic Partial Differential Equations, Grundlehren der mathematischen Wissenschaften #342, Springer-Verlag, Berlin, 1985, 2011.
658
References
196. R.E. Megginson, An Introduction to Banach Space Theory, Graduate Texts in Mathematics #183, Springer, New York, 1998. 197. J. Milnor, Analytic proofs of the “Hairy ball theorem” and the Brouwer fixed point theorem, Amer. Math. Monthly 85, 521–524, 1978. 198. J.–J. Moreau, Fonctionnelles Convexes, Collège de France, Paris, 1966–1967. 199. C.B. Morrey, Multiple Integrals in the Calculus of Variations, Springer-Verlag, New York, 1966. 200. D. Motreanu, N. Pavel, Tangency, Flow Invariance for Differential Equations, and Optimization Problems, Dekker, 1999. 201. F. Natterer, The Mathematics of Computerized Tomography, Teubner, Stuttgart, Wiley, Chichester, 1986, 1989. 202. J. Neˇcas, Direct Methods in the Theory of Elliptic Equations, Springer Monographs in Mathematics, Springer-Verlag, Berlin, 2012, translated from “Les Méthodes Directes en Théorie des Equations Elliptiques”, Academia, Praha and Masson, Paris, 1967. 203. R. Osserman, Isoperimetric inequalities and eigenvalues of the Laplacian, in Proc. International Congress of Mathematicians, Helsinki, 1978 and Bull. Amer. Math. Soc. 84, pp.1182–1238, 1978. 204. A. Pazy, Semigroups of Linear Operators and Applications to Partial Differential Equations, Springer, 1983. 205. J.-P. Penot, Subdifferential calculus without qualification assumptions, J. Convex Anal. 3 (2), 1–13, (1996). 206. J.-P. Penot, Well-behavior, well-posedness and nonsmooth analysis, Pliska Stud. Math. Bulgar. 12, 141–190 (1998). 207. J.-P. Penot, The relevance of convex analysis for the study of monotonicity, Nonlin. Anal. 58, 855–871 (2004). 208. J.-P. Penot, Calculus Without Derivatives, Graduate Texts in Mathematics #266, Springer, New York, 2013. 209. R.R. Phelps, Convex Functions, Monotone Operators, and Differentiability, Lecture Notes in Mathematics #1364, Springer, Berlin, 1988. 210. E.R. Pinch, Optimal Control and the Calculus of Variations, Oxford, 1993. 211. Y. Pinchover and J. 
Rubinstein, An Introduction to Partial Differential Equations, Cambridge University Press, 2005. 212. M.A. Pinsky, Introduction to Fourier Analysis and Wavelets, Graduate Studies in Mathematics #102, American Mathematical Society, Providence, 2002. 213. M.A. Pons, Real Analysis for the Undergraduate. With an Introduction to Functional Analysis, Springer, New York, 2014. 214. P. Pucci and J. Serrin, The Maximum Principle, Birkhäuser, Basel, 2007. 215. J. Rauch, Partial Differential Equations, Graduate Texts in Mathematics #128, Springer, New York, 1991. 216. M. Renardy and R.C. Rogers, An Introduction to Partial Differential Equations, Springer Texts in Applied Mathematics #13, Springer, New York, 1993, 2004. 217. F. Riesz and B.Sz.-Nagy, Functional Analysis, Dover Publications, 1990, translated from Leçons d’Analyse Fonctionnelle, Académie des Sciences de Hongrie, 1954. 218. R.T. Rockafellar, Characterization of the subdifferentials of convex functions, Pacific J. Math. 17 no 3, 497–510, (1966). 219. R.T. Rockafellar, On the maximal monotonicity of subdifferential mappings, Pacific J. Math. 33 no 1, 209–216, (1966). 220. R.T. Rockafellar, Convex Analysis, Princeton University Press, Princeton, 1970. 221. R.T. Rockafellar and R. J.-B. Wets, Variational Analysis, Grundlehren der mathematischen Wissenschaften #317, Springer-Verlag, Berlin, 1998. 222. C.A. Rogers, A less strange version of Milnor’s proof of Brouwer’s fixed point theorem, Amer. Math. Monthly 87, 525–527 (1980). 223. H. Royden and P. Fitzpatrick, Real Analysis, Pearson, Boston, 2010. 224. W. Rudin, Real and Complex Analysis, McGraw-Hill, 1966, 1974, 1987.
References
659
225. W. Rudin, Functional Analysis, McGraw-Hill, New York, 1973, 1991. 226. S. Saks, Theory of the integral, Monografje Matematyczne, Warsaw, 1937, Dover, New York, 1947. 227. H.H. Schaefer and M.P. Wolff, Topological Vector Spaces, Graduate Texts in Mathematics #3, Springer, New York 1966, 1999. 228. M. Schechter, Principles of Functional Analysis, Graduate studies in Mathematics #36, American Mathematical Society, Providence, 2002. 229. G.E. Shilov, Elementary Functional Analysis, Dover New York, 1974. 230. R.E. Showalter, Monotone Operators in Banach Spaces and Nonlinear Partial Differential Equations, Mathematical Surveys and Monographs #49, American Mathematical Society, Providence, 1997. 231. A.H. Siddiqi, Applied Functional Analysis. Numerical Methods, Wavelet Methods, and Image Processing, Marcel Dekker, New York, 2004. 232. S. Simons, From Hahn-Banach to Monotonicity, Lecture Notes in Mathematics #1693, Springer, New York, 2008. 233. S. Simons and C. Zalinescu, A new proof for Rockafellar’s characterization of maximal monotone operators, Proc. Amer. Math. Soc. 132, 2969–2972 (2004). 234. S. Simons and C. Zalinescu, Fenchel duality, Fitzpatrick functions and maximal monotonicity, J. Nonlin. Convex Anal. 6, 1–22, (2005). 235. I.M. Singer, Eigenvalues of the Laplacian and invariants of manifolds, Proc. International Congress of Mathematicians, Vancouver, 1974. 236. L. Sirovich, Introduction to Applied Mathematics, Texts in Applied Mathematics #1, Springer-Verlag, New York 1988. 237. M. Spivak, Calculus on Manifolds, Benjamin, New York, 1965. 238. G. Stampacchia, Equations elliptiques à coefficients discontinus, Presses de l’Université de Montréal, Montréal, 1966. 239. E.M. Stein and R. Shakarchi, Fourier Analysis, an Introduction, Princeton University Press, Princeton, 2003. 240. E.M. Stein and R. Shakarchi, Real Analysis, Princeton University Press, Princeton, 2005. 241. G. 
Strang, Introduction to Applied Mathematics, Wellesley-Cambridge, Cambridge, Mass., 1986. 242. W. Strauss, Partial Differential Equations: An Introduction, Wiley, New York,1972. 243. M. Struwe, Variational Methods: Applications to Nonlinear Partial Differential Equations and Hamiltonian Systems, Ergebnisse der Mathematik und der Grenzgebedte #34, Springer, Berlin, 2008. 244. J. Szarski, Differential Inequalities, Monografie Matematyczne #43, Panstwowe Wydawnictwo Naukowe, Warsaw, 1965. 245. P. Szekeres, A Course in Modern Mathematical Physics. Groups, Hilbert Spaces and Differential Geometry, Cambridge University Press, Cambridge, 2004. 246. K. Taira, Semigroups, Boundary Value Problems and Markov Processes, Springer Monographs in Mathematics, Springer, Berlin, 2004, 2014. 247. T. Tao, An Introduction to Measure Theory, American Mathematical Society, Providence, 2011. 248. M. Taylor, Partial Differential Equations, vol. I–III, Springer, 1966. 249. L. Tartar, An Introduction to Sobolev Spaces and Interpolation Spaces, Lecture Notes in Mathematics, Springer, Berlin Heidelberg, 2007. 250. R. Teman, Navier-Stokes Equations, North-Holland, Amsterdam, 1966. 251. L. Thibault, Sequential convex subdifferential calculus and sequential Lagrange multipliers, SIAM J. Contr. Optim. 35 no 4, 1434–1444 (1997). 252. H. Triebel, Theory of Function Spaces, (3 volumes) Birhäuser, 1983, 1992, 2006. 253. F. Tröltzsch, Optimal Control of Partial Differential Equations, Graduate Studies in Mathematics #112, American Mathematical Society, Providence, 2010. 254. M.M. Vainberg, Variational Method and Method of Monotone Operators in the Theory of Nonlinear Equations, Wiley, New York, 1973.
660
References
255. M. Väth, Topological Analysis. From the basics to the triple degree for Fredholm inclusions, De Gruyter, Berlin, 2012. 256. J.-P. Vial, Strong convexity of sets and functions, Math. Oper. Res. 8, 231–259 (1983). 257. A. A. Vretblad, Fourier Analysis and its Applications, Graduate Texts in Mathematics #223, Springer-Verlag, New York, 2003. 258. J.S. Walker, Fast Fourier Transform, CRC Press, Boca Raton, 1996. 259. H. Weinberger, A First Course on Partial Differential Equations, Blaisell, 1965. 260. J. Wloka, Partial Differential Equations, Cambridge University Press, 1987. 261. A. Yagi, Abstract Parabolic Evolution Equations and their Applications, Springer Monographs in Mathematics, Springer-Verlag, Berlin, 2010. 262. K. Yosida, Functional Analysis, Grundlehren der mathematischen Wissenschaften #123, Springer-Verlag, Berlin, 1965, 1980, 1995. 263. L.C. Young, Lectures on the Calculus of Variations and Optimal Control Theory, Saunders, Philadelphia, 1969. 264. C. Z˘alinescu, On uniformly convex functions, J. Math. Anal. Appl. 95, 344–374 (1983). 265. C. Z˘alinescu, Convex Analysis in General Vector Spaces, World Scientific, Singapore, 2002. 266. E. Zeidler, Nonlinear Functional Analysis and its Applications, (4 volumes) Springer, New York, 1994, 1990. 267. A.H. Zemanian, Distribution Theory and Transform Analysis, An Introduction to Generalized Functions, with Applications, Dover, New York, 1965. 268. W.P. Ziemer, Weakly Differentiable Functions, Graduate Texts in Mathematics #120, Springer-Verlag, New York, 1989. 269. N. Zlateva, Integrability through infimal regularization, C.R. Acad. bulgare des Sciences 68 (5), 551–560, (2015). 270. V.A. Zorich, Mathematical Analysis II, Universitext, Springer-Verlag, Berlin, 2004.