256 63 5MB
English Pages 800 Seiten [794] Year 2014;2015
Variational Analysis in Sobolev and BV Spaces
MO17_FM-07-11-14.indd 1
7/11/2014 12:25:27 PM
MOS-SIAM Series on Optimization This series is published jointly by the Mathematical Optimization Society and the Society for Industrial and Applied Mathematics. It includes research monographs, books on applications, textbooks at all levels, and tutorials. Besides being of high scientific quality, books in the series must advance the understanding and practice of optimization. They must also be written clearly and at an appropriate level for the intended audience. Editor-in-Chief Katya Scheinberg Lehigh University Editorial Board Santanu S. Dey Georgia Institute of Technology
Stefan Ulbrich Technische Universität Darmstadt
Maryam Fazel University of Washington
Luis Nunes Vicente University of Coimbra
Andrea Lodi University of Bologna
David Williamson Cornell University
Arkadi Nemirovski Georgia Institute of Technology
Stephen J. Wright University of Wisconsin
Series Volumes Attouch, Hedy, Buttazzo, Giuseppe, and Michaille, Gérard, Variational Analysis in Sobolev and BV Spaces: Applications to PDEs and Optimization, Second Edition Shapiro, Alexander, Dentcheva, Darinka, and Ruszczynski, Andrzej, Lectures on Stochastic Programming: ´ Modeling and Theory, Second Edition Locatelli, Marco and Schoen, Fabio, Global Optimization: Theory, Algorithms, and Applications De Loera, Jesús A., Hemmecke, Raymond, and Köppe, Matthias, Algebraic and Geometric Ideas in the Theory of Discrete Optimization Blekherman, Grigoriy, Parrilo, Pablo A., and Thomas, Rekha R., editors, Semidefinite Optimization and Convex Algebraic Geometry Delfour, M. C., Introduction to Optimization and Semidifferential Calculus Ulbrich, Michael, Semismooth Newton Methods for Variational Inequalities and Constrained Optimization Problems in Function Spaces Biegler, Lorenz T., Nonlinear Programming: Concepts, Algorithms, and Applications to Chemical Processes Shapiro, Alexander, Dentcheva, Darinka, and Ruszczynski, Andrzej, Lectures on Stochastic Programming: ´ Modeling and Theory Conn, Andrew R., Scheinberg, Katya, and Vicente, Luis N., Introduction to Derivative-Free Optimization Ferris, Michael C., Mangasarian, Olvi L., and Wright, Stephen J., Linear Programming with MATLAB Attouch, Hedy, Buttazzo, Giuseppe, and Michaille, Gérard, Variational Analysis in Sobolev and BV Spaces: Applications to PDEs and Optimization Wallace, Stein W. and Ziemba, William T., editors, Applications of Stochastic Programming Grötschel, Martin, editor, The Sharpest Cut: The Impact of Manfred Padberg and His Work Renegar, James, A Mathematical View of Interior-Point Methods in Convex Optimization Ben-Tal, Aharon and Nemirovski, Arkadi, Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications Conn, Andrew R., Gould, Nicholas I. M., and Toint, Phillippe L., Trust-Region Methods
MO17_FM-07-11-14.indd 2
7/11/2014 12:25:27 PM
Variational Analysis in Sobolev and BV Spaces Applications to PDEs and Optimization Second Edition
Hedy Attouch
Université Montpellier II Montpellier, France
Giuseppe Buttazzo
Università di Pisa Pisa, Italy
Gérard Michaille
Université Montpellier II Montpellier, France
Society for Industrial and Applied Mathematics Philadelphia
MO17_FM-07-11-14.indd 3
Mathematical Optimization Society Philadelphia
7/11/2014 12:25:27 PM
Copyright © 2014 by the Society for Industrial and Applied Mathematics and the Mathematical Optimization Society 10 9 8 7 6 5 4 3 2 1 All rights reserved. Printed in the United States of America. No part of this book may be reproduced, stored, or transmitted in any manner without the written permission of the publisher. For information, write to the Society for Industrial and Applied Mathematics, 3600 Market Street, 6th Floor, Philadelphia, PA 19104-2688 USA. Trademarked names may be used in this book without the inclusion of a trademark symbol. These names are used in an editorial context only; no infringement of trademark is intended. Library of Congress Cataloging-in-Publication Data Attouch, H. Variational analysis in Sobolev and BV spaces : applications to PDEs and optimization / Hedy Attouch, Université Montpellier II, Montpellier, France, Giuseppe Buttazzo, Università di Pisa, Pisa, Italy, Gérard Michaille, Université Montpellier II, Montpellier, France. -- Second edition. pages cm. -- (MOS-SIAM series on optimization ; 17) Includes bibliographical references and index. ISBN 978-1-611973-47-1 1. Mathematical optimization. 2. Function spaces. 3. Calculus of variations. 4. Sobolev spaces. 5. Functions of bounded variation. 6. Differential equations, Partial. I. Buttazzo, Giuseppe. II. Michaille, Gérard. III. Title. QA402.5.A84 2014 515’.782--dc23 2014012612
is a registered trademark.
MO17_FM-07-11-14.indd 4
is a registered trademark.
7/11/2014 12:25:27 PM
i
i
i
book 2014/8/6 page v i
Contents Preface to the Second Edition
ix
Preface to the First Edition
xi
1
Introduction
1
I
Basic Variational Principles
5
2
Weak solution methods in variational analysis 2.1 The Dirichlet problem: Historical presentation 2.2 Test functions and distribution theory . . . . . . 2.3 Weak solutions . . . . . . . . . . . . . . . . . . . . . 2.4 Weak topologies and weak convergences . . . .
3
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
7 7 14 30 39
Abstract variational principles 3.1 The Lax–Milgram theorem and the Galerkin method 3.2 Minimization problems: The topological approach . 3.3 Convex minimization theorems . . . . . . . . . . . . . . 3.4 Ekeland’s -variational principle . . . . . . . . . . . . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
65 65 74 87 94
. . . .
. . . .
. . . .
4
Complements on measure theory 105 4.1 Hausdorff measures and Hausdorff dimension . . . . . . . . . . . . . . 105 4.2 Set functions and duality approach to Borel measures . . . . . . . . . . 119 4.3 Introduction to Young measures . . . . . . . . . . . . . . . . . . . . . . . . 132
5
Sobolev spaces 5.1 Sobolev spaces: Definition, density results . . . . . . . . . . . . . . . 5.2 The topological dual of H01 (Ω). The space H −1 (Ω). . . . . . . . . . 1, p 5.3 Poincaré inequality and Rellich–Kondrakov theorem in W0 (Ω) 5.4 Extension operators from W 1, p (Ω) into W 1, p (RN ). Poincaré inequalities and the Rellich–Kondrakov theorem in W 1, p (Ω) . . . 5.5 The Fourier approach to Sobolev spaces. The space H s (Ω), s ∈ R 5.6 Trace theory for W 1, p (Ω) spaces . . . . . . . . . . . . . . . . . . . . . . 5.7 Sobolev embedding theorems . . . . . . . . . . . . . . . . . . . . . . . . 5.8 Capacity theory and elements of potential theory . . . . . . . . . .
6
145 . . 146 . . 159 . . 161 . . . . .
. . . . .
167 173 178 184 197
Variational problems: Some classical examples 219 6.1 The Dirichlet problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220 v
i
i i
i
i
i
i
vi
book 2014/8/6 page vi i
Contents
6.2 6.3 6.4 6.5 6.6 6.7 6.8 6.9 6.10 6.11 6.12 7
8
9
II 10
The Neumann problem . . . . . . . . . . . . . . . . Mixed Dirichlet–Neumann problems . . . . . . . Heterogeneous media: Transmission conditions . Linear elliptic operators . . . . . . . . . . . . . . . . The linearized elasticity system . . . . . . . . . . . Introduction to the Signorini problem . . . . . . . The Stokes system . . . . . . . . . . . . . . . . . . . . Convection-diffusion equations . . . . . . . . . . . Semilinear equations . . . . . . . . . . . . . . . . . . The nonlinear Laplacian Δ p . . . . . . . . . . . . . The obstacle problem . . . . . . . . . . . . . . . . . .
The finite element method 7.1 The Galerkin method: Further results . . . 7.2 Description of finite element methods . . . 7.3 An example . . . . . . . . . . . . . . . . . . . . 7.4 Convergence of the finite element method 7.5 Complements . . . . . . . . . . . . . . . . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
226 236 241 246 249 258 261 264 267 275 279
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
285 285 287 290 291 303
Spectral analysis of the Laplacian 8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.2 The Laplace–Dirichlet operator: Functional setting . . . . . . . . . . . 8.3 Existence of a Hilbertian basis of eigenvectors of the Laplace–Dirichlet operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.4 The Courant–Fisher min-max and max-min formulas . . . . . . . . . . 8.5 Multiplicity and asymptotic properties of the eigenvalues of the Laplace–Dirichlet operator . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.6 A general abstract theory for spectral analysis of elliptic boundary value problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
307 307 309
Convex duality and optimization 9.1 Dual representation of convex sets . . . . . . . . . . . . . . . . . . . . 9.2 Passing from sets to functions: Elements of epigraphical calculus . 9.3 Legendre–Fenchel transform . . . . . . . . . . . . . . . . . . . . . . . . 9.4 Legendre–Fenchel calculus . . . . . . . . . . . . . . . . . . . . . . . . . 9.5 Subdifferential calculus for convex functions . . . . . . . . . . . . . . 9.6 Mathematical programming: Multipliers and duality . . . . . . . . 9.7 A general approach to duality in convex optimization . . . . . . . . 9.8 Duality in the calculus of variations: First examples . . . . . . . . .
333 333 337 343 352 355 364 380 387
. . . . . . . .
. . . . . . . .
Advanced Variational Analysis Spaces 10.1 10.2 10.3 10.4 10.5
BV and S BV The space BV (Ω): Definition, convergences, and approximation The trace operator, the Green’s formula, and its consequences . The coarea formula and the structure of BV functions . . . . . . Structure of the gradient of BV functions . . . . . . . . . . . . . . . The space SBV (Ω) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
313 317 324 329
391
. . . . .
. . . . .
. . . . .
393 393 400 408 427 429
i
i i
i
i
i
i
Contents
book 2014/8/6 page vii i
vii
11
Relaxation in Sobolev, BV , and Young measures spaces 11.1 Relaxation in abstract metrizable spaces . . . . . . . . . . . . . . . . . . . 11.2 Relaxation of integral functionals with domain W 1, p (Ω, R m ), p > 1 . 11.3 Relaxation of integral functionals with domain W 1,1 (Ω, R m ) . . . . . 11.4 Relaxation in the space of Young measures in nonlinear elasticity . . 11.5 Mass transportation problems . . . . . . . . . . . . . . . . . . . . . . . . .
437 437 440 456 468 480
12
Γ -convergence and applications 12.1 Γ -convergence in abstract metrizable spaces . . . . . . . . . 12.2 Application to the nonlinear membrane model . . . . . . . 12.3 Application to homogenization of composite media . . . . 12.4 Stochastic homogenization . . . . . . . . . . . . . . . . . . . . 12.5 Application to image segmentation and phase transitions .
487 487 491 496 506 533
13
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
Integral functionals of the calculus of variations 13.1 Lower semicontinuity in the scalar case . . . . . . . . . . . . . . . . . . . 13.2 Lower semicontinuity in the vectorial case . . . . . . . . . . . . . . . . . 13.3 Lower semicontinuity for functionals defined on the space of measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13.4 Functionals with linear growth: Lower semicontinuity in BV and SBV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
547 547 552 559 562
14
Application in mechanics and computer vision 569 14.1 Problems in pseudoplasticity . . . . . . . . . . . . . . . . . . . . . . . . . . 569 14.2 Some variational models in fracture mechanics . . . . . . . . . . . . . . 576 14.3 The Mumford–Shah model . . . . . . . . . . . . . . . . . . . . . . . . . . . 595
15
Variational problems with a lack of coercivity 15.1 Convex minimization problems and recession functions . . . . 15.2 Nonconvex minimization problems and topological recession 15.3 Some examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15.4 Limit analysis problems . . . . . . . . . . . . . . . . . . . . . . . . .
. . . .
. . . .
. . . .
. . . .
599 599 617 625 632
An introduction to shape optimization problems 16.1 The isoperimetric problem . . . . . . . . . . . 16.2 The Newton problem . . . . . . . . . . . . . . . 16.3 Optimal Dirichlet free boundary problems . 16.4 Optimal distribution of two conductors . . . 16.5 Optimal potentials for elliptic operators . . .
. . . . .
. . . . .
. . . . .
. . . . .
643 644 645 648 651 654
16
17
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
Gradient flows 17.1 The classical continuous steepest descent . . . . . . . . . . . . . . . . . . 17.2 The gradient flow associated to a convex potential . . . . . . . . . . . . 17.3 Gradient flow associated to a tame function. Kurdyka–Łojasiewicz theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17.4 Sequences of gradient flow problems . . . . . . . . . . . . . . . . . . . . . 17.5 Steepest descent and gradient flow on general metric spaces . . . . . . 17.6 Minimizing movements and the implicit Euler scheme . . . . . . . . .
663 663 670 724 734 766 768
Bibliography
771
Index
791
i
i i
i
i
i
i
book 2014/8/6 page ix i
Preface to the Second Edition This second edition takes advantage of several comments received by colleagues and students. With respect to the first edition (published by SIAM in 2006) several new sections have been added and the organization of the material is now slightly different. The section on capacity theory and elements of potential theory has been completed by the notions of quasi-open sets and quasi-continuity. We also have increased the number of examples in Section 6 (the linearized elasticity system, obstacles problems, convection-diffusions and semilinear equations, . . . ). Section 11, devoted to the relaxation theory, has been completed by a section on mass transportation problems and the Kantorovich relaxed formulation of the Monge problem. We have added a subsection on stochastic homogenization to the section devoted to the Gamma-convergence: we establish the mathematical tools coming from ergodic theory and illustrate them in the scope of statistically homogeneous materials. Section 16 has been augmented by two examples illustrating the shape optimization procedure. The main novelty of this second edition is the new and very comprehensive section devoted to gradient flows, as well as the dynamical approach to equilibria.
ix
i
i i
i
i
i
i
book 2014/8/6 page xi i
Preface to the First Edition Most of the material in this book comes from graduate-level courses on variational analysis, PDEs, and optimization which have been given during the last decades by the authors, H. Attouch and G. Michaille at the University of Montpellier, France, and G. Buttazzo at the University of Pisa, Italy. Our objective is twofold. The first objective is to provide to students the basic tools and methods of variational analysis and optimization in infinite dimensional spaces together with applications to classical PDE problems. This corresponds to the first part of the book, Chapters 1 through 9, and takes place in classical Sobolev spaces. We have made an effort to provide, as much as possible, a self-contained exposition, and we try to introduce each new development from various perspectives (historical, numerical, . . .). The second objective, which is oriented more toward research, is to present new trends in variational analysis and some of the most recent developments and applications. This corresponds to the second part of the book, Chapters 10 through 16, where in particular are introduced the BV (Ω) spaces. This organization is intended to make the book accessible to a large audience, from students to researchers, with various backgrounds in mathematics, as well as physicists, engineers, and others. As a guideline, we try to portray direct methods in modern variational analysis—one century after D. Hilbert delineated them in his famous lecture at Collège de France, Paris, 1900. The extraordinary success of these methods is intimately linked with the development, throughout the 20th century, of new branches in mathematics: functional analysis, measure theory, numerical analysis, (nonlinear) PDEs, and optimization. We try to show in this book the interplay among all these theories and also between theory and applications. Variational methods have proved to be very flexible. In recent years, they have been developed to study a number of advanced problems of modern technology, like composite materials, phase transitions, thin structures, large deformations, fissures, and shape optimization. To grasp these often involved phenomena, the classical framework of variational analysis, which is presented in the first part, must be enlarged. This is the motivation for the introduction in the second part of the book of some advanced techniques, like BV and SBV spaces, Young measures, Γ -convergence, recession analysis, and relaxation methods. Finally, we wish to stress that variational analysis is a remarkable example of international collaboration. All mathematical schools have contributed to its success, and it is a modest symbol that this book has been written in collaboration between mathematicians of two schools, French and Italian. This book owes much to the support of the Universities of Montpellier and Pisa and of their mathematical departments, and the convention of cooperation that connects them.
xi
i
i i
i
i
i
i
xii
book 2014/8/6 page xii i
Preface to the First Edition
Acknowledgments. We would like to express our sincere thanks to all the students and colleagues whose comments and encouragement helped us in writing the final manuscript. Year by year, the redaction of the book profited much from their comments. We are grateful to our colleagues in the continuous optimization community, who strongly influenced the contents of the book and encouraged us from the very beginning in writing this book. The chapter on convex analysis benefited much from the careful reading of L. Thibault and M. Valadier. We would like to thank SIAM and the editorial board of the MPS-SIAM Series on Optimization for the quality of the editing process. We address special thanks to B. Lacan in Montpellier, who helped us when we started the project. Finally, we take this opportunity to express our consideration and gratitude to H. Brezis and E. De Giorgi, who were our first guides in the discovery of this fascinating world of variational methods and their applications.
i
i i
i
i
i
i
book 2014/8/6 page 1 i
Chapter 1
Introduction
Let us detail the contents of each of the two parts of the book. Part I: Basic Variational Principles. In Part I, we follow as a guideline the variational treatment of the celebrated Dirichlet problem. We show how the program of D. Hilbert, which was first delineated in his famous lecture at Collège de France in 1900 [241], has been progressively solved throughout the 20th century. We introduce the basic elements of variational analysis which allow one to solve this classical problem and closely related ones, like the Neumann problem and the Stokes problem. Chapter 2 contains an extensive exposition of weak solution methods in variational analysis and of the accompanying notions: test functions, the distribution theory of L. Schwartz, weak convergences, and topologies. Chapter 3 provides an exposition of the basic abstract variational principles. We enhance the importance of the direct method for solving minimization problems and put to the fore some of its basic topological ingredients: lower semicontinuity, coercivity, and inf-compactness. We show how weak topologies, reflexivity, and convexity properties come naturally into play. We insist on the modern approach to optimization theory where the concept of epigraph of a function plays a central role; see the monograph of Rockafellar and Wets [330] on variational analysis, where the epigraphical analysis is systematically developed. Chapter 4 contains some complements on geometric measure theory. We introduce in a self-contained way the notion of Hausdorff measure, which allows us to recover, as special cases, both the Lebesgue measure on an open set of RN and surface measures (which play an important role, for example, in the definition of the space trace of Sobolev spaces). These two basic ingredients, roughly speaking, the generalized differential calculus of distribution theory and the generalized integration theory of Lebesgue, allow us to introduce in Chapter 5 the classical Sobolev spaces which provide the right functional setting for the variational approach to the studied problems. In this new edition, we have completed the section related to capacity by introducing the notions of quasi-continuity, quasi-open sets, and capacitary measures. These play a central role in the analysis of the limiting behavior of variational problems in wildly varying domains (finely perforated domains, cloud of ice, etc.) and shape optimization (Chapter 16). All the ingredients of the variational approach to the model examples are now available: in Chapter 6 we describe some of them, including Dirichlet, Neumann, and mixed 1
i
i i
i
i
i
i
2
book 2014/8/6 page 2 i
Chapter 1. Introduction
problems. With regard to these examples, we are in the classical favorable situation: we have to minimize a convex coercive lower semicontinuous function on a reflexive Banach space. In such a situation, the direct method of Hilbert and Tonelli does apply, although for the model of linearized elasticity treated in this second edition, coercivity is a delicate point. We also have completed the first edition with three models slightly less classical: the reaction-diffusion equations for which we apply the Lax–Milgram theorem in a nonsymmetric case, the semilinear equations, and the obstacle problem, including the Signorini problem in the framework of linearized elasticity. Chapters 7 and 8 complement this classic portrait of variational methods by introducing two of the most powerful numerical tools which allow one to compute approximate solutions of variational problems: finite element methods and spectral analysis methods. Each of these two methods corresponds to a very specific type of Galerkin approximation of an infinite dimensional problem by a sequence of finite dimensional ones. Each method has its own advantages; for example, finite element methods allow one to treat engineering problems involving general domains, like the wing of a plane, which explains their great success. Around 1970, the study of constrained problems and variational inequalities led Stampacchia, Browder, Brezis, Moreau, Rockafellar, et al. to develop the elements of a unilateral variational analysis. In particular, convex variational analysis has known considerable success and has familiarized mathematicians with the idea that sets play a decisive role in analysis. The Fenchel duality, the subdifferential calculus of convex functions, and the extension of the Fermat rule are striking examples of this new approach. The role of the epigraph has progressively emerged as essential in the geometrical understanding of these concepts. Chapter 9 provides a thorough exposition of these elements of convex variational analysis in infinite dimensional spaces. We stress the importance of the Fenchel duality, which allows us to associate to each convex variational problem a dual one, whose solutions have in general a deep physical (or numerical or economical) interpretation as multipliers. Part II: Advanced Variational Analysis. This second part corresponds to Chapters 10 through 16 and deals with our second objective, which is to present new trends in variational analysis. Indeed, in recent years, variational methods have proved to be very flexible. They have been developed to study a number of advanced problems of modern technology, like composite materials, image processing, and shape optimization. To grasp these phenomena, the classical framework of variational analysis, which was studied in Part I, must be enlarged. Let us describe some of these extensions: 1. The modelization of a large number of problems in physics, image processing, requires the introduction of new functional spaces permitting discontinuities of the solution. In phase transitions, image segmentation, plasticity theory, and the study of cracks and fissures, in the study of the wake in fluid dynamics and the shock theory in mechanics, the solution of the problem presents discontinuities along one-codimensional manifolds. Its first distributional derivatives are now measures which may charge zero Lebesgue measure sets, and the solution of these problems cannot be found in classical Sobolev spaces. The classical theory of Sobolev spaces, which was developed in Chapter 5, is completed in Chapter 10 by a self-contained and detailed presentation of these spaces, BV (Ω), SBV (Ω), B D(Ω). The space BV (Ω), for example, is the space of functions with bounded variations, and a function u belongs to BV (Ω) iff its first distributional derivatives are bounded measures. The SBV (Ω) space is the subspace of
i
i i
i
i
i
i
book 2014/8/6 page 3 i
3
BV (Ω) which consists of functions whose first distributional derivatives are bounded measures with no Cantor part. 2. In Chapter 12, we introduce the concept of Γ -convergence, which provides a parametrized version of the direct method in variational analysis. Following Stampacchia’s work, Mosco [304], [305] and Joly [251] introduced the Mosco-epiconvergence (1970) of sequences of convex functions to study approximation and perturbation schemes in variational analysis and potential theory. The general topological concept, without any convexity assumption, has progressively emerged, and De Giorgi in 1975 introduced the notion of Γ -convergence for sequences of functions. It corresponds to the topological set convergence of the epigraphs, whence the equivalent terminology “epi-convergence.” This concept has been successfully applied to a large variety of approximation and perturbation problems in calculus of variations and mechanics: homogenization of composite materials, materials with many small holes and porous media, thin structures and reinforcement problems, and so forth. We illustrate the concept by describing some recent applications to thin structures, composite material, phase transitions, and image segmentation. We have completed the first edition with an introduction to the ergodic theory of subadditive processes and its application to stochastic homogenization. 3. Chapters 11 and 13 deal with the question of lower semicontinuity and relaxation of functionals of calculus of variations. Indeed, as a general rule, when applying the direct method to a functional F which is not lower semicontinuous, one obtains that minimizing sequences converge to solutions of the relaxed problem, which is the minimization of the lower semicontinuous envelope cl F of F . In the vectorial case, that is, when functionals are defined on Sobolev spaces W 1, p (Ω, R m ), Ω ⊂ RN , relaxation with respect to the weak topology of W 1, p (Ω, R m ) (or strong topology of L p (Ω, R m )) leads to the important concepts of quasi-convexity (in the sense of Morrey), polyconvexity, and rank-one convexity. We consider as well the case of functionals with linear growth and the corresponding lower semicontinuity and relaxation problems on BV and SBV spaces. All these notions play an important role in the modeling of large deformations in mechanics and plasticity, as described in Chapter 14. Following the microstructure school of Ball and James, in the modeling of the solid/solid phase transformations, the density energy possesses a multiwell structure. An alternative and appropriate procedure consists in relaxing the corresponding free energy functional in the space of Young measures generated by gradients. To complete Chapter 11, in this second edition, we have introduced the Kantorovich relaxed formulation of the Monge transport problem: the goal is to find a probability on the product space RN × RN , which minimizes a suitable relaxed transportation cost, i.e., the pth power of the so-called Wasserstein distance between two probability measures on RN . 4. Another important aspect of the direct method concerns the coercivity property. In Chapter 15, we examine how the method works when the variational problem has a lack of coercivity. In that case, existence of solutions relies on compatibility conditions, whose general formulation involves recession functions.
i
i i
i
i
i
i
4
book 2014/8/6 page 4 i
Chapter 1. Introduction
5. The next topic considered, in Chapter 16, is shape optimization, which is a good illustration of the powerfulness of direct methods in variational analysis and also of their limitations. This chapter has been completed, in this second edition, by the description of another interesting case of the shape optimization problem, consisting in establishing the existence of optimal potentials for some suitable cost functionals, as, for example, the integral cost functionals newly introduced in Chapter 5. 5. The final chapter is the main contribution of this new edition. Indeed, the previous edition was entirely devoted to the mathematical tools related to variational problems in a static framework. This new edition completes the previous one by treating in some depth the concept of gradient flows. We have chosen to introduce this notion through optimization: when the potential is continuously differentiable, the Cauchy problem governed by the associated gradient vector field is nothing but the classical continuous steepest descent implemented to minimize the potential. The analysis of the generalized steepest descent is valid for arbitrary convex lower semicontinuous potentials on a Hilbert space and also may be generalized to complete metric spaces. This last generalization is not fortuitous and is involved in some cases of cost functionals coming from mass transportation theory. Evolution equations in general describe the changing of a physical (or economic, or social, etc.) system with respect to the time. The Cauchy problem governed by a gradient flow arises, for instance, when one wants to model the heat equation or the Stefan-type problem. Another important aspect of the concept of gradient flows intervenes in the study of some homogenization problems, in which suitable variational convergences for sequences of gradient flows provide a powerful framework to describe the limit Cauchy problems. These problems arise, for example, in the analysis of the diffusion through heterogeneous media, or in first-order evolution problems with small parameters, provided that the potentials involved are convex.
i
i i
i
i
i
i
book 2014/8/6 page 7 i
Chapter 2
Weak solution methods in variational analysis
2.1 The Dirichlet problem: Historical presentation Throughout this book, we adopt the following notation: Ω is an open subset of RN (N ≤ 3 for applications in classical mechanics), and x = (x1 , x2 , . . . , xN ) is a generic point in Ω. The topological boundary of Ω is denoted by ∂ Ω. Let g : ∂ Ω −→ R be a given function which is defined on the boundary of Ω. The Dirichlet problem consists in finding a function u : Ω −→ R which satisfies Δu = 0
on
Ω,
(2.1)
u=g
on
∂ Ω.
(2.2)
The operator Δ is the Laplacian Δu =
N ∂ 2u i =1
∂ xi2
=
∂ 2u ∂ x12
+ ··· +
∂ 2u ∂ xN2
;
it is equal to the sum of the second partial derivatives of u with respect to each variable x1 , x2 , . . . , xN . Equation (2.1) is the Laplace equation, and a solution of this equation is said to be harmonic on Ω. Clearly, there are many harmonic functions. The following examples of harmonic functions illustrate how rich this family is: • u(x) = N a x + b (affine function) is harmonic on RN ; i =1 i i • u(x) = ax12 + b x22 + c x32 is harmonic on R3 iff a + b + c = 0; • u(x) = e x1 cos x2 and v(x) = e x1 sin x2 are harmonic on R2 (note that u(x1 , x2 ) = Re e (x1 +i x2 ) and v(x1 , x2 ) = Im e (x1 +i x2 ) ); • u(x) = (x12 + x22 + x32 )−1/2 is harmonic on R3 \{0} (Newtonian potential). The study of harmonic functions is a central topic of the so-called potential theory and of harmonic analysis. We will see further, as the above examples suggest, the close connections between this theory, the theory of the complex variable, and the potential theory. Thus, we can reformulate the Dirichlet problem by saying that we are looking for a harmonic function on Ω which satisfies the boundary data u = g on ∂ Ω. It is called a 7
i
i i
i
i
i
i
8
book 2014/8/6 page 8 i
Chapter 2. Weak solution methods in variational analysis
boundary value problem. The condition u = g on ∂ Ω, which consists in prescribing the value of the function u on the boundary of Ω, is called the Dirichlet boundary condition and gives rise to the name of the problem. Let us consider, for illustration, the elementary case N = 1. Take Ω = ]a, b [ an open bounded interval. Given two real numbers g1 and g2 , the Dirichlet problem reads as follows: find u : [a, b ] −→ R such that u = 0 on ]a, b [, u(a) = g1 , u(b ) = g2 . Clearly, this problem has a unique solution, whose graph in R2 is the line segment joining point (a, g1 ) to point (b , g2 ). We will see that for an open bounded subset Ω and under some regularity hypotheses on Ω and g covering most practical situations, one can prove existence and uniqueness of a solution of the Dirichlet problem. Indeed, this is a long story whose important steps are summarized below. It is in 1782 that the Laplace equation appears for the first time. When studying the orbits of the planets, Laplace discovered that the Newtonian gravitational potential of a distribution of mass of density ρ on a domain Ω ⊂ R3 , which is given by the formula 1 ρ(y) u(x) = d y, (2.3) 4π Ω |x − y| satisfies the equation
Δu = 0
¯ R3 \Ω.
on
(2.4)
Indeed, it is a good exercise to establish this formula. One first verifies that the Newtonian potential v(x1 , x2 , x3 ) = (x12 + x22 + x32 )−1/2 satisfies Δv = 0 on R3 \{0}. Then a direct derivation under the integral sign yields that u ¯ is harmonic on R3 \Ω. In 1813, Poisson establishes that on Ω the potential u satisfies −Δu = ρ
on Ω,
(2.5)
which is the so-called Poisson equation. The central role played by the Laplace and Poisson equations in mathematical physics appeared with more and more evidence, especially because of the work of Gauss. In 1813, Gauss established the following formula (which is often called the Gauss formula or the − → divergence theorem). Given a vector field V : Ω ⊂ R3 −→ R3 , − → − → → V (x) · − n (x) d s(x), (2.6) div V (x) d x = Ω
∂Ω
− → which states that the volume integral of the divergence of the vector field V is equal to − → the global outward flux of V through the boundary of Ω. In the above formula, if we denote − → V (x) = (v1 (x), v2 (x), v3 (x)), ∂ vi ∂ v1 ∂ v2 ∂ v3 − → (x) = (x) + (x) + (x) div V (x) = ∂ x1 ∂ x2 ∂ x3 i ∂ xi
i
i i
i
i
i
i
2.1. The Dirichlet problem: Historical presentation
book 2014/8/6 page 9 i
9
− → → is the divergence of V . The vector − n (x) is the unit vector which is orthogonal to ∂ Ω at x and which is oriented toward the outside of Ω. The measure d s is the two-dimensional Hausdorff measure on ∂ Ω. Let us briefly explain how the mathematical formulation of conservation laws in − → physics leads to the Laplace equation. Suppose that the vector field V derives from a potential u, that is, − → V (x) = D u(x) = gradu(x) = ∇u(x) =
∂u ∂ xi
(x)
.
(2.7)
i =1,...,N
(This is the most commonly used notation.) Suppose, moreover, that the vector field − − → → → V (x) is such that V ·− n d s = 0 for all closed surfaces ∂ G ⊂ Ω. ∂G By the Gauss theorem, it follows that − → div V d x = 0 ∀ G ⊂ Ω G
and hence
− → div V = 0
on Ω.
(2.8)
From (2.7) and (2.8), noticing that div(gradu) = Δu, we obtain Δu = 0, that is, u is harmonic. The above argument is valid both in the case of the gravitational vector field of Newton and in the case of the electrostatic field of Coulomb in the regions where there is no mass (respectively, charges). With the help of this formula, Gauss was able to prove a number of important properties of harmonic functions, like the mean value property; this was the beginning of the potential theory. Riemann, who was successively a student of Gauss (1846–1847 in Göttingen) and of Dirichlet (1847–1849 in Berlin), established the foundations of the theory of the complex variable and made the link (when N = 2) with the Laplace equation. Let us recall that for any function z ∈ C −→ f (z), which is assumed to be derivable as a function of the complex variable z ( f is then said to be holomorphic), its real and imaginary parts P and Q ( f (z) = P (x, y) + iQ(x, y), where z = x + i y) satisfy the socalled Cauchy–Riemann equations ⎧ ∂Q ∂P ⎪ ⎪ ⎪ ⎨∂ x = ∂ y , (2.9) ⎪ ∂P ∂Q ⎪ ⎪ =− . ⎩ ∂y ∂x It follows that ΔP =
∂ 2Q ∂ y∂ x
−
∂ 2Q ∂ x∂ y
= 0,
ΔQ = −
∂ 2P ∂ x∂ y
−
∂ 2P ∂ y∂ x
= 0.
i
i i
i
i
i
i
10
book 2014/8/6 page 10 i
Chapter 2. Weak solution methods in variational analysis
Thus, the real part and the imaginary part of a holomorphic function are harmonic. This approach allows, for example, solution in an elegant way of the Dirichlet problem in a disc. Take Ω = D(0, 1) = {z ∈ C : |z| < 1}, the unit disc centered at the origin in R2 . Given g : ∂ D −→ R a continuous function, we want to solve the Dirichlet problem Δu = 0 on D, (2.10) u = g on ∂ D. Let us start with the Fourier expansion of the 2π-periodic function cn ( g )e i nθ , g (e i θ ) =
(2.11)
n∈Z
+π 1 where cn ( g ) = 2π g (e i t )e −i n t d t . Note that the above Fourier series converges in −π an L2 (−π, +π) norm sense (Dirichlet theorem). Indeed, when starting with the Fourier expansion of the boundary data g , one can give an explicit formula for the solution u of the corresponding Dirichlet problem: cn ( g )r |n| e i nθ . (2.12) u(r e i θ ) := n∈Z
Clearly, when taking r = 1, one obtains u = g on ∂ D. Thus, the only point one has to verify is that u is harmonic on D. Take r < 1 and replace cn ( g ) by its integral expression in (2.12). By a standard argument based on the uniform convergence of the series, one can +π exchange the symbols n and −π to obtain iθ
u(r e ) = Let us introduce
1
2π
+π −π
P r (θ) =
g (e i t )
r |n| e i n(θ−t ) d t .
n∈Z
r |n| e i nθ ,
(2.13)
n∈Z
the so-called Poisson kernel, and observe that (we denote z = r e i θ )
it 1− r2 e +z = . P r (θ − t ) = Re i t 1 − 2r cos(θ − t ) + r 2 e −z Hence, u(r e i θ ) = Re
1 2π
+π −π
eit + z eit − z
g (e i t )d t ,
(2.14)
(2.15)
and u appears as the real part of a holomorphic function, which says that u is harmonic on D. Using (2.14) one obtains the Poisson formula +π 1 1− r2 u(r e i θ ) = g (e i t ) d t . (2.16) 2π −π 1 − 2r cos(θ − t ) + r 2 By similar arguments, one can explicitly solve the Dirichlet problem on a square ]0, a[×]0, a[ of R2 . Unfortunately, these methods can be used only in the two-dimensional case. The modern general treatment of the Dirichlet problem starts with the Dirichlet principle, whose formulation goes back to Gauss (1839), Lord Kelvin, and Dirichlet. It can be formulated as follows.
i
i i
i
i
i
i
2.1. The Dirichlet problem: Historical presentation
book 2014/8/6 page 11 i
11
The solution of the Dirichlet problem is the solution of the following minimization problem: N ∂v 2 d x : v = g on ∂ Ω . (2.17) min Ω i =1 ∂ xi The functional v → J (v) :=
N ∂v 2 Ω i =1
∂ xi
dx
is called the Dirichlet integral or Dirichlet energy. Thus, the Dirichlet principle states that the solution of the Dirichlet problem minimizes, over all functions v satisfying the boundary data v = g on ∂ Ω, the Dirichlet energy. Equivalently,
J (u) ≤ J (v) ∀v, u = g on ∂ Ω,
v = g on ∂ Ω,
that is, u is characterized by a minimization of the energy principle. One can easily verify, at least heuristically, that the solution u of the minimization problem (2.17) is a solution of the Dirichlet problem (2.1), (2.2). Indeed, the Laplace equation (2.1) can be seen as the optimality condition satisfied by the solution of the minimization problem (2.17). The celebrated Fermat’s rule which asserts that the derivative of a function is equal to zero at any point u where f achieves a minimum (respectively, maximum) can be developed in our situation by using the notion of directional derivative. When dealing with problems coming from variational analysis, the so-obtained optimality condition is called the Euler equation. Thus let us take v : Ω −→ R, which satisfies v = 0 on ∂ Ω. Then, for any t ∈ R, u + t v still satisfies u + t v = g on ∂ Ω and, since u minimizes J on the set {w = g on ∂ Ω}, we have J (u) ≤ J (u + t v). Let us compute for any t ∈ R, t = 0 1
1 J (u + t v) − J (u) = t t
|D u + t Dv|2 − |D u|2 Ω = 2 D u · Dv d x + t |Dv|2 . Ω
Ω
For any t > 0, this is a nonnegative quantity, and thus by letting t −→ 0+ D u · Dv d x ≥ 0. Ω
By taking t < 0 and letting t −→ 0− , we obtain the reverse inequality Ω
Finally
D u · Dv d x ≤ 0.
Ω
D u · Dv d x = 0
for any v = 0 on ∂ Ω.
i
i i
i
i
i
i
12
book 2014/8/6 page 12 i
Chapter 2. Weak solution methods in variational analysis
Taking v regular and after integration by parts, we obtain (Δu)v d x = 0 for any v regular, v = 0 on ∂ Ω, Ω
that is, Δu = 0 on ∂ Ω. Riemann recognized the importance of this principle but he did not discuss its validity. In 1870, Weierstrass, who was a very systematic and rigorous mathematician, discovered when studying some results of his friend Riemann that the Dirichlet principle raises some difficulties. Indeed, Weierstrass proposed the following example (apparently close to the Dirichlet problem!): +1 2 dv 2 x d x : v(−1) = a, v(+1) = b , (2.18) min dx −1 which fails to have a solution. A minimizing sequence for (2.18) can be obtained by considering the viscosity approximation problem +1 2 dv 1 2 min x + 2 d x : v(−1) = a, v(+1) = b , (2.19) dx n −1 which now has a unique solution un given by un (x) =
a+b
−
2
a − b arctan nx 2
arctan n
,
n = 1, 2, . . . .
One can directly verify that un satisfies the boundary data and that
+1
x −1
2
d un
2 d x −→ 0 as n −→ +∞.
dx
Thus the value of the infimum of (2.18) is zero. But there is no regular function (continuous, piecewise C1 ) which satisfies the boundary conditions and such that
+1
x −1
2
du dx
2 d x = 0.
Such a function would satisfy u = constant on ] − 1, +1[ which is incompatible with the boundary data when a = b . As we will see, the pathology of the Weierstrass example comes from the coefficient x 2 in front of ( dd vx )2 which vanishes (at zero) on the domain Ω = (−1, +1). As a result, there is a lack of uniform ellipticity or coercivity, which is not the case in the Dirichlet problem. Thus, the Weierstrass example is not a counterexample to the Dirichlet principle; its merit is to underline the shortcomings of this principle. Moreover, it raises a decisive question, which is to understand in which class of functions one has to look for the solution of the Dirichlet principle. Until that date, it was commonly admitted that the functions to consider have to be regular C1 or C2 (differentiation being taken in the classical sense), depending on the situation. It is only in 1900 with a famous conference at the Collège de France in Paris that Hilbert formulates the foundations of the modern variational approach to the Dirichlet principle and hence to the Dirichlet problem. These ideas, which have been worked out
i
i i
i
i
i
i
2.1. The Dirichlet problem: Historical presentation
book 2014/8/6 page 13 i
13
in the classical book of mathematical physics of Courant and Hilbert (1937) [181], can be summarized as follows. The basic idea of Hilbert is to enlarge the class of functions in which one looks for a solution of the Dirichlet principle and simultaneously to generalize the notion of solution. More precisely, Hilbert proposed the following general method for solving the Dirichlet principle: 1. First, construct a minimizing sequence of functions (un )n∈N . 2. Then, extract from this minimizing sequence a convergent subsequence, say, unk −→ u¯. The so-obtained function u¯ is the (generalized) solution of the original problem. This is what in modern terminology is called a compactness argument. Let us first notice that, even when starting with a sequence (un )n∈N of smooth functions, its limit u¯ may be no more differentiable in the classical sense. Such construction of a generalized solution, which corresponds to finding a space obtained by a completion procedure, is very similar to the one which consists of passing from the set of rational numbers to the set of real numbers or from the Riemann integrable functions to Lebesgue integrable functions. This was a decisive step, and this program was developed throughout the 20th century by a number of mathematicians from different countries. The functional space in which to find the generalized solution was only in an implicit form in the work of Courant and Hilbert (as a completion of piecewise C1 functions). The celebrated Sobolev spaces were gradually introduced in the work of Friedrichs (1934) [221] and, for the Soviet mathematical school, Sobolev (1936) [336], and Kondrakov. The compactness argument, that is, the compact embedding of the Sobolev space H 1 (Ω) into L2 (Ω) when Ω is bounded, was proved by Rellich (1930). The modern language of distributions which provides a generalized notion of derivatives for nonsmooth functions (and much more) was systematically developed by L. Schwartz (1950), who was teaching at the École Polytechnique in Paris. This has proved to be a very flexible tool for handling generalized solutions for PDEs. The ideas of the compactness method introduced by Hilbert to solve the Dirichlet principle were developed in a systematic way by the Italian school. Tonelli (1921) had the intuition to put together the semicontinuity notion of Baire and the Ascoli–Arzelà compactness theorem. So doing, he was able to transfer from real functions to functionals of the calculus of variations (like the Dirichlet integral) the classical compactness argument. He developed the so-called direct methods in the calculus of variations, whose basic topological ingredients are the lower semicontinuity of the functional and the compactness of the lower level sets of the functional. In the line of the Hilbert approach, he founded the basis of the topological method for minimization problems in infinite dimensional spaces. Thus, modern tools in variational analysis provide a general and quite simple approach to existence results of generalized solutions for a large number of boundary value problems from mathematical physics. The natural question which then arises is to study the regularity of such solutions and to establish under which conditions on the data and the domain we have a classical solution. A large number of contributions have been devoted to this difficult question. Let us say that in the case of the Dirichlet problem, if the do¯ For main Ω and g are sufficiently smooth, then there exists a classical solution u ∈ C2 (Ω). a detailed bibliography on the regularity problem, see Brezis [137, Chapter IX]. So far, we have considered the Dirichlet problem as it has been introduced historically. Indeed, one can reformulate it in an equivalent form which is more suitable for a variational treatment.
i
i i
i
i
i
i
14
book 2014/8/6 page 14 i
Chapter 2. Weak solution methods in variational analysis
Let us introduce g˜ : Ω −→ R, a function defined on the whole of Ω and whose restriction on ∂ Ω is equal to g : (2.20) g˜ |∂ Ω = g . One usually prescribes g˜ to preserve the regularity properties on g and ∂ Ω (for example, continuity or Lipschitz continuity) and, in most practical situations, this is quite easy to achieve. Take as a new unknown function v := u − g˜ .
(2.21)
Clearly it is equivalent to find v or u. The boundary value problem satisfied by v is −Δv = f on Ω, (2.22) v = 0 on ∂ Ω with f = Δ g˜ . The Dirichlet boundary data v = 0 on ∂ Ω is then said to be homogeneous, and problem (2.22) is often called the homogeneous Dirichlet problem. Note that a number of important physical situations lead to (2.22). For example, when describing the electrostatic potential u in a domain Ω with a density of charge f and whose boundary is connected with the earth, then −Δu = f on Ω, u = 0 on ∂ Ω. Let us consider an elastic membrane in the horizontal plane x3 = 0 occupying a domain Ω in the (x1 , x2 ) plane. Suppose that at each point x ∈ Ω a vertical force of intensity f (x) is exerted and that the membrane is fixed on its boundary. Let us denote by u(x) the vertical displacement of the point x of the membrane when the equilibrium is attained. Then −c Δu = f on Ω, u = 0 on ∂ Ω, where c > 0 is the elasticity coefficient of the membrane.
2.2 Test functions and distribution theory 2.2.1 Definition of distributions The concept of distribution is quite natural if we start from some simple physical observations. Let us first consider a function f ∈ L1l oc (Ω), where Ω is an open subset of RN . One cannot, for an arbitrary x ∈ Ω, give a meaning to f (x). But, from a physical point of view, it is meaningful to consider the average of f on a small ball with center x and radius > 0 and let go to zero. Indeed, it follows from the Lebesgue theory that for almost every x ∈ Ω, 1 lim f (ξ )d ξ −→0 |B(x, )| B(x,) exists and the limit is a representative of f . Such points x are called Lebesgue points of f . In the formula above B(x, ) denotes the ball with center x and radius and |B(x, )| its Lebesgue measure. Let us notice that 1 f (ξ )d ξ = f (ξ )v x, (ξ )d ξ , |B(x, )| B(x,) Ω
i
i i
i
i
i
i
2.2. Test functions and distribution theory
book 2014/8/6 page 15 i
15
where v x, (ξ ) =
1 |B(x,)|
if ξ ∈ B(x, ), elsewhere.
0
Thus, it is equivalent to know f as an element of L1l oc (Ω) or to know the value of the integrals Ω f (x)v(x)d x for v belonging to a sufficiently large class of functions. This is the starting point of the notion of distribution. Functions v will be called test functions. It is equivalent to know f as a function or as a distribution, the distribution being viewed as the mapping assigning the real number Ω f vd x to each test function v:
f : v →
Ω
f vd ξ .
If one is concerned only with L1l oc functions, there are many possibilities for the choice of the class of test functions. Let us go further and suppose we want to model the concept of a Dirac mass, that is, of a unit mass concentrated at a point. This is an important physical notion which can be viewed as the limiting case of the unit mass concentrated in a ball of radius > 0 with −→ 0. For example, consider the Dirac mass at the origin 0 ∈ Ω and the functions 1 if x ∈ B(0, ), f (x) = |B(0,)| 0 elsewhere. Then, the distribution attached to f is the mapping 1 v → f (ξ )v(ξ )d ξ = v(ξ )d ξ . |B(0, )| B(0,) Ω When passing to the limit as −→ 0, we need to take test functions at least continuous (at the origin) in order for the above limit to exist. The limiting distribution is the mapping v ∈ Cc (Ω) → v(0), where Cc (Ω) is the set of continuous real-valued functions, with compact support in Ω. This is the modern way to consider a Dirac mass (at the origin) as the linear mapping which to a regular test function v associates its value at the origin. Note that this distribution is no longer attached to a function. A similar device can be developed to attach a distribution to more general mathematical objects, such as the derivative of an L1l oc function. Take f ∈ L1l oc (Ω) and try to define the distribution attached to
∂f ∂ xi
. Let us approximate f by a sequence fn of smooth func-
tions fn . Then the distribution of
∂ fn ∂ xi
is the mapping
v →
∂ fn Ω
∂ xi
(x)v(x)d x.
But we cannot pass to the limit on this quantity just by taking v continuous, like in the previous step. So let us assume v to be of class C1 on Ω with a compact support. Then, let us integrate by parts ∂ fn ∂v vd x = − fn d x; ∂ xi Ω ∂ xi Ω
i
i i
i
i
i
i
16
book 2014/8/6 page 16 i
Chapter 2. Weak solution methods in variational analysis
we can now pass to the limit on this last expression. Finally, the distribution attached to ∂f is the mapping ∂ xi ∂v v ∈ C1c (Ω) → − f d x, Ω ∂ xi where C1c (Ω) is the set of real-valued functions of class C1 with compact support in Ω. We are now ready to define the concept of distribution. We consider the space of test functions (Ω), which is the vector space of real or complex valued functions on Ω which are indefinitely derivable and with compact support in Ω. (This allows us to cover all the previous situations and much more!) For v ∈ (Ω), we say that the support of v is contained in a compact subset K ⊂ Ω, and we write sptv ⊂ K if v = 0 on Ω \ K (equivalently, {v = 0} ⊂ K). We use the following notation. An element p ∈ NN , p = ( p1 , p2 , . . . , pN ), where N is the dimension of the space (Ω ⊂ RN ), is called a multi-index. The integer | p| = p1 + p2 + · · · + pN is called the length of the multi-index p. For v ∈ (Ω), we write D p v :=
∂ | p| v ∂
p x1 1 ∂
p
p
x2 2 . . . ∂ xNN
.
The operator D p can be viewed as the composition of elementary partial derivation operators p1 pN ∂ ∂ ◦ ··· ◦ , Dp = ∂ x1 ∂ xN where ( ∂∂x ) pi = ∂∂x ◦ ∂∂x ◦ · · · ◦ ∂∂x , pi times. i i i i Let us introduce the notion of sequential convergence on (Ω). It is the only topological notion on (Ω) that we use. Definition 2.2.1. A sequence (vn )n∈N of functions converges in the sense of the space (Ω) to a function v ∈ (Ω) if the two following conditions are satisfied: (i) There exists a compact subset K in Ω such that spt vn ⊂ K for all n ∈ N and spt v ⊂ K. (ii) For all multi-index p ∈ NN , D p vn −→ D p v uniformly on K. One can prove the existence of a locally convex topology on the space (Ω) with respect to which a linear functional F is continuous iff it is sequentially continuous, that is, F (vn ) −→ F (v) whenever vn −→ v in the sense of (Ω). But this topology is not easy to handle (it is not metrizable); we don’t really need to use it, so we will use only the notion of convergent sequence in (Ω) as defined above. Definition 2.2.2. A distribution T on Ω is a continuous linear form on (Ω). Equivalently, a linear form T : (Ω) −→ R is a distribution on Ω if for any sequence (vn )n∈N in (Ω), the following implication holds: vn −→ 0
in the sense of
(Ω) =⇒ T (vn ) −→ 0.
The space of distributions on Ω is denoted by (Ω). It is the topological dual space of (Ω) and we will write 〈T , v〉( (Ω), (Ω)) := T (v) the duality pairing between T ∈ (Ω) and v ∈ (Ω).
i
i i
i
i
i
i
2.2. Test functions and distribution theory
book 2014/8/6 page 17 i
17
Let us now give a practical criterion which allows us to verify that a linear form on
(Ω) is continuous (and hence is a distribution). Proposition 2.2.1. Let T be a linear form on (Ω). Then T is a distribution on Ω iff for all compact K in Ω, there exists n ∈ N and C ≥ 0, possibly depending on K, such that D p v∞ . ∀v ∈ (Ω) with spt v ⊂ K, |T (v)| ≤ C | p|≤n
PROOF. Clearly, the above condition implies that T is continuous on (Ω). To prove the converse statement, let us argue by contradiction. Thus, given T ∈ (Ω), let us assume that there exists a compact K in Ω and a sequence (vn )n∈N in (Ω) such that for each n∈N D p vn ∞ . spt vn ⊂ K and |T (vn )| > n | p|≤n
Let us define wn :=
n
| p|≤n
1 D p vn ∞
vn .
Then wn ∈ (Ω), spt wn ⊂ K, and for each m ∈ N D m wn =
n
| p|≤n
1 p
D vn ∞
D m vn ,
so that ∀n>m
D m wn ∞ ≤
1 n
,
and wn −→ 0 in (Ω). By linearity of T |T (wn )| > 1, so that T (wn ) does not tend to zero, a contradiction with the fact that T ∈ (Ω). Proposition 2.2.1 allows us to naturally introduce the notion of distribution with finite order. Definition 2.2.3. A distribution T ∈ (Ω) has a finite order if there exists an integer n ∈ N such that for each compact subset K ⊂ Ω, there exists a constant C (K) such that ∀ v ∈ (Ω)
with spt v ⊂ K, |T (v)| ≤ C (K) sup D p v∞ . | p|≤n
If T has a finite order, the order of T is the smallest integer n for which the above inequality holds. In Proposition 2.2.1, the integer n a priori depends on the compact set K. A distribution has a finite order if n can be taken independent of K. Let us describe some first examples of distributions.
i
i i
i
i
i
i
18
book 2014/8/6 page 18 i
Chapter 2. Weak solution methods in variational analysis
2.2.2 Locally integrable functions as distributions: Regularization by convolution and mollifiers Take f ∈ L1l oc (Ω), which means that for each compact subset K of Ω, One can associate to f the linear mapping
K
| f (x)| d x < +∞.
T f : v ∈ (Ω) −→
Ω
f (x)v(x) d x.
For any compact subset K ⊂ Ω, for any v ∈ (Ω) with spt v ⊂ K, the following inequality holds: |T f (v)| ≤ C (K)v∞ with C (K) = K | f (x)| d x < +∞. By Proposition 2.2.1 T f is a distribution of order zero. Indeed, a function f ∈ L1l oc (Ω) is uniquely determined by its corresponding distribution T f , as stated in the following. Theorem 2.2.1. Let f ∈ L1l oc (Ω), g ∈ L1l oc (Ω) be such that ∀v ∈ (Ω)
Ω
f (x)v(x) d x =
Ω
g (x)v(x) d x.
Then f = g almost everywhere (a.e.) on Ω. The above result allows us to identify f ∈ L1l oc with the corresponding distribution T f , which gives the injection L1l oc (Ω) → (Ω). The proof of Theorem 2.2.1 is a direct consequence of the density of the space of test functions (Ω) in the space Cc (Ω). Proposition 2.2.2. (Ω) is dense in Cc (Ω) for the topology of the uniform convergence. More precisely, for every v ∈ Cc (Ω), there exists a sequence (vn )n∈N , vn ∈ (Ω), and a compact set K ⊂ Ω such that vn → v uniformly and spt vn ⊂ K. PROOF. Take v ∈ Cc (Ω) and extend v by zero outside of Ω. We so obtain an element, which we still denote by v, which is continuous on RN and with compact support in Ω. Let us use the regularization method by convolution and introduce a smoothing kernel ρ: ⎧ ρ ∈ (RN ), ρ ≥ 0, ⎪ ⎪ ⎨spt ρ ⊂ B(0, 1), ⎪ ⎪ ⎩ ρ(x)d x = 1. RN
Take, for example,
ρ(x) =
m e −1/(1−|x| ) 0 2
if |x| ≤ 1, elsewhere,
m being chosen in order to have RN ρ(x)d x = 1. Then let us define for each integer n = 1, 2, . . . , ρn (x) := n N ρ(n x),
i
i i
i
i
i
i
2.2. Test functions and distribution theory
which satisfies
book 2014/8/6 page 19 i
19
⎧ ρn ∈ (RN ), ρn ≥ 0, ⎪ ⎪ ⎨spt ρ ⊂ B(0, 1/n), n ⎪ ⎪ ⎩ ρn (x)d x = 1. RN
The sequence (ρn )n∈N is said to be a mollifier. Given v ∈ Cc (RN ), let us define vn = v ρn , that is, vn (x) = We have
RN
v(y)ρn (x − y)d y.
spt vn ⊂ spt v + spt ρn ⊂ spt v + B(0, 1/n)
and vn has a compact support in Ω for n large enough. The classical derivation theorem under the integral sign yields D α vn = v D α ρ n
∀ α ∈ NN
and vn belongs to C∞ (Ω). Thus vn belongs to (Ω). Let us now prove that vn converges uniformly to v. To that end, we use the other equivalent formulation of vn , v(x − y)ρn (y)d y, vn (x) = RN
and the fact that
RN
ρn = 1 to obtain vn (x) − v(x) =
RN
[v(x − y) − v(x)]ρn (y)d y.
Using that spt ρn ∈ B(0, 1), we have sup |vn (x) − v(x)| ≤
x∈RN
sup
|v(z) − v(x)|.
x,z∈RN x−z≤1/n
This last quantity goes to zero as n → +∞; this is a consequence of the uniform continuity of v on RN . (Recall that v is continuous with compact support.) Let us now complete the proof. PROOF OF THEOREM 2.2.1. Take h = f − g , h ∈ L1l oc , which satisfies ∀v ∈ (Ω)
Ω
h(x)v(x) d x = 0.
By density of (Ω) in Cc (Ω) (see Proposition 2.2.2) for any v ∈ Cc (Ω) there exists a sequence (vn )n∈N in (Ω) such that
spt vn ⊂ K vn −→ v
for some fixed compact K in Ω, uniformly on K.
i
i i
i
i
i
i
20
book 2014/8/6 page 20 i
Chapter 2. Weak solution methods in variational analysis
Since Ω h(x)vn (x) d x = K h(x)vn (x) d x = 0, and h ∈ L1 (K), by passing to the limit as n −→ ∞, we obtain ∀ v ∈ Cc (Ω)
Ω
h(x)v(x)d x = 0.
The conclusion h = 0 follows by the Riesz–Alexandrov theorem (see Theorem 2.4.7) and the uniqueness of the representation. Let us give a direct independent proof of the fact that h = 0. It will mostly rely on the Tietze–Urysohn separation lemma. 1 One can first reduce to consider the case h ∈ L (Ω) with |Ω| < +∞ (write 1Ω = Ω with Ωn open and Ωn compact, and take h|Ωn ). By density of Cc (Ω) in L (Ω), n∈N n for each > 0 there exists some h ∈ Cc (Ω) such that h − h L1 (Ω) < . Hence (2.23) h (x)v(x) d x ≤ vL∞ (Ω) ∀v ∈ Cc (Ω). Ω Consider
K1 = {x ∈ Ω : h (x) ≥ },
K2 = {x ∈ Ω : h (x) ≤ −}.
These two sets are disjoint and compact. By the Tietze–Urysohn separation lemma, there exists a function ϕ ∈ Cc (Ω) such that ⎧ on K1 , ⎨ϕ(x) = 1 ϕ(x) = −1 on K , ⎩−1 ≤ ϕ(x) ≤ 1 ∀ x ∈2 Ω. Taking K = K1 ∪ K2 ,
Ω
|h |d x =
K
|h |d x +
Ω\K
|h | d x.
On K, we have |h | = h ϕ, so that |h | d x = h ϕ = h ϕ d x − K
By (2.23),
Ω
K
Ω\K
h ϕ d x.
h ϕ d x ≤ ϕL∞ ≤ and, since |h | ≤ on Ω \ K, h ϕ d x ≤ |Ω|. Ω\K
So,
K
Finally,
|h | d x ≤ (1 + |Ω|).
Ω
and
Ω
|h | d x ≤ (1 + |Ω|) + |Ω| = + 2|Ω|
Ω
|h| d x ≤
Ω
|h − h | d x +
Ω
|h | d x ≤ 2(1 + |Ω|).
This being true for any > 0, we conclude that h = 0. Noticing that L p (Ω) → L1l oc (Ω) for any 1 ≤ p ≤ +∞, as a direct consequence of Proposition 2.2.2, we obtain the following corollary.
i
i i
i
i
i
i
2.2. Test functions and distribution theory
book 2014/8/6 page 21 i
21
Corollary 2.2.1. Given 1 ≤ p ≤ +∞, let us suppose that f ∈ L p (Ω), g ∈ L p (Ω) satisfy f (x)v(x) d x = g (x)v(x) d x ∀v ∈ (Ω); Ω
Ω
then f = g a.e. on Ω. Let us notice that when taking 1 < p < +∞, we can obtain this result in a more direct way, by using the density of (Ω) in Lq (Ω), 1 ≤ q < +∞. Then take q = p the Hölder conjugate exponent of p, 1p + p1 = 1. Clearly the density of (Ω) in Lq (Ω) is a consequence of Proposition 2.2.2. Because of the importance of this result, let us give another proof of it, of independent interest, which relies only on L p techniques. Proposition 2.2.3. Let Ω be an arbitrary open subset of RN . Then (Ω) is dense in L p (Ω) for 1 ≤ p < +∞. This will result from the following. Proposition 2.2.4. Let f ∈ L p (RN ) with 1 ≤ p < +∞. Then, for any mollifier (ρn )n∈N , the following properties hold: (i) f ρn ∈ L p (RN ), (ii) f ρn L p (RN ) ≤ f L p (RN ) , (iii) f ρn −→ f in L p (RN ) as n → +∞. PROOF. To prove (i) and (ii) we omit the subscript n ∈ N. Let us consider the case 1 < p < ∞ and introduce p with 1p + p1 = 1: |( f ρ)(x)| ≤ | f (x − y)|ρ(y)d y RN | f (x − y)| ρ(y)1/ p ρ(y)1/ p d y. ≤ RN
Let us apply the Hölder inequality 1/ p p |( f ρ)(x)| ≤ | f (x − y)| ρ(y)d y Since
RN
RN
RN
ρ(y)d y = 1, we obtain
1/ p ρ(y) d y
.
|( f ρ)(x)| p ≤ R
N
| f (x − y)| p ρ(y)d y.
Let us integrate with respect to x ∈ R and apply the Fubini–Tonelli theorem |( f ρ)(x)| p d x ≤ | f (x − y)| p ρ(y) d y d x RN RN RN p ≤ ρ(y) | f (x − y)| d x d y N RN R ≤ | f (x)| p d x, N
RN
i
i i
i
i
i
i
22
book 2014/8/6 page 22 i
Chapter 2. Weak solution methods in variational analysis
where we have used again that ρ d x = 1 and the fact that the Lebesgue measure on RN is invariant by translation. Thus, f ρ ∈ L p and f ρL p ≤ f L p . The convergence of f ρn to f in L p (RN ) relies on a quite similar computation. Using that ρn d x = 1, we can write f (x) − ( f ρn )(x) = and
RN
[ f (x) − f (x − y)] ρn (y) d y
| f (x) − ( f ρn )(x)| ≤
Let us rewrite this last inequality as | f (x) − ( f ρn )(x)| ≤
| f (x) − f (x − y)| ρn (y) d y.
RN
R
N
| f (x) − f (x − y)| ρn (y)1/ p ρn (y)1/ p d y
and apply the Hölder inequality to obtain p | f (x) − f (x − y)| p ρn (y) d y. | f (x) − ( f ρn )(x)| ≤ RN
Integrating with respect to x on RN and applying the Fubini–Tonelli theorem, we obtain p p ρn (y) f − τy f L p d y. f − ( f ρn )L p ≤ RN
p
Let us introduce ϕ(y) := f − τy f L p . Since f ∈ L p (RN ), ϕ is a continuous function on RN such that ϕ(0) = 0. We conclude thanks to the following property: ∀ϕ : RN −→ RN continuous with ϕ(0) = 0, we have lim
n→+∞
RN
ϕ(y) ρn (y) d y = 0.
This results from the inequality ϕ(y) ρn (y) d y − ϕ(0) = (ϕ(y) − ϕ(0)) ρn (y) d y RN RN ≤ |ϕ(y) − ϕ(0)| ρn (y) d y B(0,1/n)
≤ sup |ϕ(y) − ϕ(0)|, |y|≤1/n
which tends to zero as n → +∞. Indeed, we will interpret this last result as a convergence in (Ω) of the sequence (ρn ) to δ0 , the Dirac mass at the origin. We can now complete the proof.
i
i i
i
i
i
i
2.2. Test functions and distribution theory
book 2014/8/6 page 23 i
23
PROOF OF PROPOSITION 2.2.3. Take f ∈ L p (Ω), > 0, and g ∈ Cc (Ω) such that f − g L p (Ω) < . Then let us extend g outside of Ω by zero to obtain a function that we still denote by g which belongs to Cc (RN ). Take fn = g ρn . Then, fn ∈ (RN ), and in fact fn ∈ (Ω) for n large enough, because spt fn ⊂ spt g + B(0, 1/n). Moreover, fn −→ g in L p (Ω), so that f − fn L p ≤ for n large enough.
2.2.3 Radon measures Let us recall that a Radon measure μ is a linear form on Cc (Ω) such that for each compact K ⊂ Ω, the restriction of μ to CK (Ω) is continuous, that is, for each K ⊂ Ω, K compact, there exists some C (K) ≥ 0 such that ∀ v ∈ Cc (Ω)
with spt v ⊂ K,
|μ(v)| ≤ C (K)v∞ .
To such a Radon measure, one can associate its restriction to D(Ω), Tμ : v ∈ (Ω) −→ v(x) d μ(x), Ω
which by the definition of μ is a distribution of order zero. Conversely, μ is completely determined by the corresponding distribution Tμ . This is a consequence of the density of (Ω) in Cc (Ω); see Proposition 2.2.2. As a consequence, we can identify any Radon measure with its corresponding distribution and → (Ω). As a typical example of a distribution measure which is not in L1l oc (Ω), if 0 ∈ Ω, take μ = δ0 the Dirac mass at the origin, with 〈μ, v〉( , ) := v(0). To describe further examples of great importance in applications, we need to introduce further notions, namely, the derivation of distributions and weak limits of distributions.
2.2.4 Derivation of distributions, introduction to Sobolev spaces Definition 2.2.4. Let T ∈ (Ω) be a distribution on Ω. Then ∂∂ xT is defined as the linear i mapping on (Ω), ∂T ∂v : v ∈ (Ω) −→ − T , . ∂ xi ∂ xi ( , ) More generally, for any multi-index p = ( p1 , . . . , pN ), we define D p T : v ∈ (Ω) −→ (−1)| p| 〈T , D p v〉( , ) . Proposition 2.2.5. For any distribution T on Ω, for any multi-index p ∈ NN , we have that D p T is still a distribution on Ω. Therefore, for any v ∈ (Ω) 〈D p T , v〉( (Ω), (Ω)) = (−1)| p| 〈T , D p v〉( (Ω), (Ω)) .
i
i i
i
i
i
i
24
book 2014/8/6 page 24 i
Chapter 2. Weak solution methods in variational analysis
PROOF. One just needs to notice that for p ∈ NN fixed, the mapping v −→ D p v is continuous from (Ω) into (Ω). This is an immediate consequence of the definition of the sequential convergence in (Ω), which, we recall, involves a compact support condition and the uniform convergence of the derivatives of arbitrary order. These two properties are clearly preserved by the operations D p . Therefore, every distribution in (Ω) possesses derivatives of arbitrary orders in
(Ω). Indeed, the notion of derivative D p T of a distribution has been defined so as to extend the classical notion of derivative for a smooth function. Let us recall the identification we make between f ∈ L1l oc and the corresponding distribution T f .
Proposition 2.2.6. Let f be some function in the set C m (Ω) of real-valued functions of class C m in Ω. Then, for any p ∈ NN with | p| ≤ m, the distribution derivative D p f coincides with the classical derivative D p f of functions. PROOF. It is a direct consequence of the integration by parts formula. If f ∈ C1 (Ω) and v ∈ (Ω), ∂v ∂f (x) v(x) d x = − f (x) (x) d x. ∂ x ∂ xi Ω Ω i Similarly, integration by parts | p| times gives the following formula: p | p| (D f )(x) v(x) d x = (−1) f (x) D p v(x) d x; Ω
Ω
this formula is valid for f ∈ C| p| (Ω) and v ∈ (Ω). The fact that the test function v has a compact support is essential in making, in the integration by parts formula, the integral term on ∂ Ω equal to zero. We stress the fact that the notion of derivative D p T of a distribution T ∈ (Ω) takes as a definition the integration by parts formula, the derivation operation being transferred, by this operation, on the test functions. This can be done at an arbitrary order since the test functions have been taken indefinitely differentiable. The two previous remarks justify the choice of test functions v ∈ (Ω). We can now describe a fundamental example of distribution coming from the theory of Sobolev spaces. This theory, which plays a central role in the variational approach to a large number of boundary value problems (like the Dirichlet problem) will be developed in detail in Chapter 5. Here we give some definitions and elementary examples. For any m ∈ R, p ∈ [1, +∞], W m, p (Ω) = { f ∈ L p (Ω) : D j f ∈ L p (Ω)
∀ j , | j | ≤ m}.
One of the most important Sobolev spaces is the space ∂f 1,2 1 2 2 W (Ω) = H (Ω) = f ∈ L (Ω) : ∈ L (Ω), i = 1, 2, . . . , N . ∂ xi ∂f
In the above definition, the derivation ∂ x (or more generally D j f ) is taken in the distrii bution sense. We will see that the choice of this notion of derivation is fundamental to obtain the desirable properties for the corresponding spaces.
i
i i
i
i
i
i
2.2. Test functions and distribution theory
book 2014/8/6 page 25 i
25
As an elementary example, let us consider Ω = (−1, 1) and f (x) = |x|. Clearly, f is not differentiable in the classical sense at the origin. The function f is continuous on Ω, it belongs to any L p (Ω), 1 ≤ p ≤ +∞, and thus it defines a distribution and we can compute its first distribution derivative D f 1 f (x)v (x)d x. 〈D f , v〉( (Ω), (Ω)〉 := − −1
Let us write
1 −1
f (x)v (x)d x =
0
−1
1
f (x)v (x)d x
f (x)v (x)d x + 0
and let us integrate by parts on each interval (−1, 0) and (0, 1). This is possible since now f ∈ C1 ([−1, 0]) and f ∈ C1 ([0, 1]). We have 0 0 f (x)v (x)d x = f (0)v(0) − f (−1)v(−1) − f (x)v(x)d x, −1
1
f (x)v (x)d x = f (1)v(1) − f (0)v(0) −
0
−1
1
f (x)v(x)d x.
0
Note that since v ∈ (−1, 1), we have v(−1) = v(1) = 0, but for a general v ∈ (−1, 1), v(0) = 0. By adding the two above equalities, the terms containing v(0) cancel and we obtain 0 1 1 f (x)v (x) d x = − −v(x) d x + v(x) d x , ∀v ∈ (−1, 1) −1
that is,
1
−1
−1
f (x)v (x)d x = −
where g (x) =
−1 1
0
1 −1
g (x)v(x)d x,
if − 1 < x < 0, if 0 < x < 1.
The above function g is then the distributional derivative of f (x) = |x| on Ω = (−1, 1). It belongs to L p (Ω) for any 1 ≤ p ≤ +∞, so that f ∈ W 1, p (Ω) for any 1 ≤ p ≤ +∞. The parameters m ∈ N and p ∈ [1, +∞] yield a scale of spaces which allow us to distinguish, for example, in our situation the different behavior of the functions fα (x) = |x|α and of their derivatives at zero. A similar computation as above yields D fα = α|x|α−2 x in (−1, 1). 1 1 Hence, fα ∈ W 1, p (−1, 1) iff −1 |x| p(α−1) d x < +∞, that is, p < 1−α . When p = 2, we 1 α 1 have that fα (x) = |x| belongs to H (−1, 1) iff α > 2 : this expresses that in some sense the derivative of fα at x does not tend to +∞ too rapidly when x goes to zero. Let us now examine the other parameter m, which is relative to the order of derivation. Take again f (x) = |x| and compute the second-order derivative of f on (−1, 1). This amounts to computing the first-order derivative of g (x) = sign x. Thus 1 2 〈D f , v〉( , ) = 〈D g , v〉( , ) = − g (x)v (x)d x. −1
i
i i
i
i
i
i
26
book 2014/8/6 page 26 i
Chapter 2. Weak solution methods in variational analysis
As before, let us split the integral over (−1, 1) into two parts,
1 −1
g (x)v (x) d x = −
0 −1
1
v (x) d x +
v (x) d x
0
= −[v(0) − v(−1)] + [v(1) − v(0)] = −2v(0), since v ∈ (−1, 1) and v(−1) = v(1) = 0. Thus 〈D g , v〉( (−1,1), (−1,1)) = 2v(0) and D g = 2δ0 , where δ0 is the Dirac mass at the origin. But the distribution δ0 is a measure which is not representable by a function: suppose that there exists a function h ∈ L1l oc such that ∀v ∈ (−1, 1)
v(0) =
1
h(x)v(x) d x. −1
Then, taking successively v ∈ (−1, 0) and v ∈ (0, 1), we conclude by Theorem 2.2.1 that h = 0 a.e. on (−1, 0) and on (0, 1). Hence h = 0 a.e. on (−1, 1), which would imply v(0) = 0 for every v ∈ (−1, 1), a clear contradiction. Therefore f ∈ W 1,2 (−1, 1) but f ∈ W 2,1 (−1, 1). The above computation of the distributional derivative of a function g which has a discontinuity is very important in a number of applications (phase transitions, plasticity, image segmentation, etc.). This situation will be considered in detail in Chapter 10 and will lead us to the introduction of the functional space BV (Ω), the space of functions with bounded variation, which can be characterized as the space of integrable functions whose first distributional derivatives are bounded measures. The next operation on distributions which is very useful for applications is the notion of limit of a sequence of distributions.
2.2.5 Convergence of sequences of distributions Definition 2.2.5. Let Tn ∈ (Ω) for all n ∈ N and T ∈ (Ω). The sequence (Tn )n∈N is said to converge to T in (Ω) if ∀v ∈ (Ω)
lim Tn (v) = T (v).
n→+∞
We will write Tn −→ T in (Ω) or limn→+∞ Tn = T in (Ω). In Section 2.4, we will interpret this convergence as a weak∗ convergence in the dual space (Ω) of (Ω). Let us recall that any distribution T ∈ (Ω) possesses derivatives or arbitrary orders. The following result expresses that for any multi-index p ∈ NN the mapping T −→ D p T is continuous. Proposition 2.2.7. Let p ∈ NN . The mapping T ∈ (Ω) −→ D p T ∈ (Ω)
i
i i
i
i
i
i
2.2. Test functions and distribution theory
book 2014/8/6 page 27 i
27
is continuous, which means that for any sequence (Tn )n∈N , T in (Ω), the following implication holds: Tn −→ T in (Ω) =⇒ D p Tn −→ D p T in (Ω). PROOF. The proof is a direct consequence of the definitions. Let us assume that Tn −→ T in (Ω) and take v ∈ (Ω). By definition of D p 〈D p Tn , v〉 = (−1)| p| 〈Tn , D p v〉
∀ v ∈ (Ω).
Since v ∈ (Ω), D p v still belongs to (Ω), and the convergence of Tn to T in (Ω) implies lim 〈Tn , D p v〉 = 〈T , D p v〉. n→+∞
Thus, again by the definition of D p , lim 〈D p Tn , v〉 = (−1)| p| 〈T , D p v〉 = 〈D p T , v〉,
n→+∞
which expresses that D p Tn −→ D p T in (Ω). The above proposition is one of the reasons for the success of the theory of distributions. It makes this theory a very flexible tool for the study of PDEs. We will often use this type of argument—for example, in the chapter on Sobolev spaces. Suppose (vn )n∈N is a sequence in H 1 (Ω) such that vn −→ v in L2 (Ω), ∂ vn −→ gi in L2 (Ω), ∂ xi Then, v ∈ H 1 (Ω) and gi =
∂v ∂ xi
i = 1, 2, . . . , N .
, i = 1, 2, . . . , N . This can be justified with the language of
distribution as follows. Since vn −→ v in L2 (Ω), vn −→ v in (Ω) and hence in (Ω). On the other hand,
∂ vn ∂ xi
∂ vn ∂ xi
−→
∂v ∂ xi
−→ gi in L2 (Ω) and hence in (Ω). The uniqueness
of the limit in (Ω) implies that ∂∂ xv = gi for all i = 1, 2, . . . , N and v ∈ H 1 (Ω). i Let us give another illustration of the above tools and compute the fundamental solution of the Laplacian in R3 . Proposition 2.2.8. Take N = 3 and consider the Newtonian potential f (x) = Then −Δ
1 4π
1 x12 + x22 + x32
.
f = δ.
PROOF. Let us denote by r (x) = x12 + x22 + x32 the Euclidean distance of x = (x1 , x2 , x3 ) 1 from the origin, and notice that f (x) = r (x) belongs to L1l oc (R3 ) and thus defines a dis-
tribution on R3 . Let us compute Δ f in (Ω). A standard approach consists in approximating f by a sequence f of smooth functions, then computing Δ f in a classical sense by Proposition 2.2.6 and passing to the limit in (Ω) as → 0. Clearly, the difficulty is
i
i i
i
i
i
i
28
book 2014/8/6 page 28 i
Chapter 2. Weak solution methods in variational analysis
at the origin where f has a singularity. Thus, the parameter is intended to isolate the origin and regularize f at the origin. At this point, there are two possibilities that lead to different computations and that we examine now. First, take for > 0 1/ if r (x) ≤ , f (x) = 1/r (x) if r (x) ≥ . Clearly, f is now a continuous, piecewise smooth function (it is not C1 ) on R3 and a ∂f standard computation yields that ∂ x belongs to L2 (R3 ) with i
∂ f ∂ xi
=
if r < , if r > .
0 −xi /r 3
Note that ∂∂ xr = ri for i = 1, 2, 3. i Let us now compute −Δ f . By definition, x
〈−Δ f , v〉( (R3 ), (R3 )) = 〈 f , −Δv〉( , ) 3 ∂ f ∂ v = , . ∂ xi ∂ xi ( , ) i =1 Since
∂ f ∂ xi
belongs to L1l oc , 〈−Δ f , v〉 =
3
∂ xi ∂ xi
R3
i =1
=−
∂ f ∂ v
3 i =1
r ≥
dx
xi ∂ v . d x. r 3 ∂ xi
Let us now integrate by parts this last expression. Noticing that on R3 \ {0}, ∂ xi i
∂ xi
r
3
= =
3 r
3
3 r
3
+
−3
xi ·
−3 xi · r4 r
xi2 r
5
=
3 r
3
−
3 r3
=0
(which means that Δ f = 0 on R3 \ {0}), we obtain 3 xi ! xi " 〈−Δ f , v〉 = − − v d x, 3 r S i =1 r where S = {x ∈ R3 : r (x) = } is the sphere of radius centered at the origin. Note that the unit normal to S at x which is oriented toward the outside of {r ≥ } is equal to − xr . Hence 1 〈−Δ f , v〉( , ) = (x)v(x) d x 2 S r = 〈μ , v〉( , ) ,
i
i i
i
i
i
i
2.2. Test functions and distribution theory
book 2014/8/6 page 29 i
29
where μ = −2 2 S and 2 S is the two-dimensional Hausdorff measure supported by S . An elementary calculus yields that lim →0
1 4π
μ = δ in (R3 ).
By definition of the convergence in (R3 ) we have 1 −Δ f −→ δ in (R3 ). 4π On the other hand, f converges to f in L1 (R3 ) (for example, by the dominated convergence theorem) and hence in (R3 ). By the continuity of 1the differential operator Δ in f = δ.
(R3 ) (cf. Proposition 2.2.7) we finally obtain that −Δ 4π Another regularization consists of building f a C1 (R3 ) function which approximates f . Take, for example, 2 a r (x) + b if r (x) ≤ , f (x) = 1/r (x) if r (x) ≥ , a and b being chosen in order to have f ∈ C1 (R3 ). This is equivalent to the system 2 a + b = 1/, a = −1/(23 ), which gives b = 3/(2). Noticing that a r 2 + b = −
r2 3
+
3
2 2 # 1 r2 $ = 3− 2 2 3 ≤ for r (x) ≤ , 2
we have that 0 ≤ f (x) ≤
3
=
2r (x)
3 2
f (x) on R3 .
Hence, by the Lebesgue dominated convergence, f −→ f in L1 (R3 ) as −→ 0. So f −→ f in (R3 ) and Δ f −→ Δ f in (R3 ). An elementary computation yields −Δ f = −6a 1B(0,) =+
Hence −Δ
1 4π
3 3
f =
1B(0,) . 1
1B(0,) . 4 π3 3
Noticing that 43 π3 is precisely the volume of the ball B(0, ), we have lim
1
1B(0,) →0 4 π3 3
= δ,
i
i i
i
i
i
i
30
book 2014/8/6 page 30 i
Chapter 2. Weak solution methods in variational analysis
which finally implies
−Δ
1 4π
f
=δ
and completes the proof. Remark 2.2.1. In the case when N = 2, using a similar calculation (see, for instance, [243, Theorem 3.2]), it can be shown that the fundamental solution of the Laplacian is given by the logarithmic potential. More precisely, consider the function f defined by " ! 1 . f (x) = ln x12 + x22 1 f ) = δ. Then −Δ( 2π
2.3 Weak solutions 2.3.1 Weak formulation of the model examples The Dirichlet problem. Let Ω be an open subset of RN and f : Ω −→ R a given function; take f ∈ L2 (Ω), for example. We recall that the Dirichlet problem is to find a function u : Ω −→ R which solves −Δu = f
on Ω,
u =0
on ∂ Ω.
(2.24)
In Section 2.1, we explained that it is difficult to prove directly the existence of a classical solution to this problem. By classical solution, we mean a function u which is continuous on Ω and of class C2 on Ω. So, the idea is to allow u to be less regular (at least in a first stage) and to interpret Δu in a weak sense, namely, in a distribution sense. Taking test functions v ∈ (Ω), (2.24) is equivalent to f vd x ∀ v ∈ (Ω). 〈−Δu, v〉( (Ω), (Ω)) = Ω
The definition of the derivation of distributions (see Definition 2.2.4) is precisely based on the integration by parts formula and allows us to transfer the derivation operation from u onto the test functions v ∈ (Ω). At this point, we have two possibilities. The two equalities N ∂u ∂v , , (2.25) 〈−Δu, v〉 = ∂ xi ∂ xi ( (Ω), (Ω)) i =1 〈−Δu, v〉 = 〈u, −Δv〉( (Ω), (Ω))
(2.26)
correspond, respectively, to a partial transfer and a global transfer of the derivatives on the test functions. They give rise to two distinct weak formulations of the initial problem, which indeed depend on the regularity properties which we expect the solution u to satisfy. If we expect the solution u to have first distribution derivatives which are integrable, then (2.25) gives rise to ⎧ N ∂u ∂v ⎪ ⎨ dx = f v d x ∀ v ∈ (Ω), (2.27) Ω i =1 ∂ xi ∂ xi Ω ⎪ ⎩ u = 0 on ∂ Ω.
i
i i
i
i
i
i
2.3. Weak solutions
book 2014/8/6 page 31 i
31
If we don’t expect u to have first derivatives which are integrable and just expect u to be L1 (Ω) or in C(Ω), then by using (2.26), we obtain the (very) weak formulation ⎧ ⎪ ⎨− uΔv d x = f v d x ∀ v ∈ (Ω), (2.28) Ω Ω ⎪ ⎩ u = 0 on ∂ Ω. For many reasons, the weak formulation (2.27) is the one which is well adapted to our situation: (a) As a general rule, we will see that it is preferable to perform the integrations by parts which are necessary and no more. Otherwise, the solution which is obtained satisfies the equation in a very weak sense, we have only poor information on this solution, and the study of its uniqueness and regularity becomes quite involved. (b) When trying to give a sense to the boundary condition u = 0 on ∂ Ω, it is useful to have some information on the derivatives of u on Ω. To find a weak solution u for which we know just that u belongs to some L p (Ω) space is not sufficient to give meaning to the trace of u on ∂ Ω. (Recall that ∂ Ω has a zero Lebesgue measure.) Since we will be able to find a weak solution of the Dirichlet problem in the space H 1 (Ω) (that is, with first-order distribution derivatives in L2 (Ω)), we will use (2.27) as a variational formulation of the Dirichlet problem. Note, too, that (2.27) has another advantage over (2.28): the left member of the equation, which is the important part and which involves the partial differential operator governing the equation, is symmetric with respect to u and v in (2.27), while it is not in (2.28). This has important consequences on the variational formulation of the problem; see Section 2.3.2 and Chapter 3. Let us summarize the above comments and give a first definition of the notion of a weak solution for the Dirichlet problem. It will be made precise later and solved in Chapter 6. Given f ∈ L2 (Ω), a weak solution u of the Dirichlet problem (2.24) is a function u ∈ H 1 (Ω) which satisfies ⎧ N ∂u ∂v ⎪ ⎨ dx = f v d x ∀ v ∈ (Ω), (2.29) ∂ xi ∂ xi Ω Ω i =1 ⎪ ⎩ u = 0 on ∂ Ω. Indeed, we will justify the choice of the functional space H 1 (Ω) and explain how to interpret the trace of such functions on ∂ Ω. We will reformulate (2.29) by introducing the subspace H01 (Ω) of H 1 (Ω): H01 (Ω) = {v ∈ H 1 (Ω) : v = 0 on ∂ Ω}. Indeed, H01 (Ω) is equal to the closure of (Ω) in H 1 (Ω). As a consequence, the equality (2.29) can be extended by a density and continuity argument to H01 (Ω). In this way we obtain the classical weak formulation of the Dirichlet problem. Definition 2.3.1. A weak solution of the Dirichlet problem is a solution of the following system: ⎧ N ∂u ∂v ⎪ ⎨ dx = f v d x ∀ v ∈ H01 (Ω), (2.30) Ω i =1 ∂ xi ∂ xi Ω ⎪ ⎩ 1 u ∈ H0 (Ω).
i
i i
i
i
i
i
32
book 2014/8/6 page 32 i
Chapter 2. Weak solution methods in variational analysis
Note that (2.30) can be written in the following abstract form: find u ∈ V such that a(u, v) = L(v) for all v ∈ V , where a : V ×V −→ R is a bilinear form which is symmetric and positive (a(v, v) ≥ 0 for every v ∈ V ) and L is a linear form on V . In Chapter 3, an existence result for such an abstract problem will be proved (Lax– Milgram theorem); in Chapter 5 the basic ingredients of the theory of Sobolev spaces will be developed, for example, to treat the case V = H01 (Ω). Thus, in Chapter 6 we will be able to prove the existence of a weak solution to the Dirichlet problem. The Neumann problem. We recall that the Neumann problem consists of finding a solution u to the boundary problem ∂u
u − Δu = f on Ω,
= 0 on ∂ Ω,
∂n
(2.31)
where ∂∂ nu = D u · n is the outward normal derivative of u on ∂ Ω. A major difference between the Dirichlet and the Neumann problem is that in the Neumann problem, the value of u on the boundary is not prescribed (it is ∂∂ nu which is prescribed). As a consequence, we have to test u on Ω and on ∂ Ω; it is not sufficient to take test functions v ∈ (Ω). We will take test functions v ∈ C1 (Ω). We are no longer in the setting of the distribution theory, but we can follow the lines of this theory. Let us first assume that u is regular and let us multiply (2.31) by v ∈ C1 (Ω) and integrate by parts. Recall that from the divergence theorem, div(v D u) d x = v D u · n d σ. Ω
Thus
∂Ω
Ω
(v Δu + D u · Dv) d x =
∂Ω
By using (2.31) and (2.32) we obtain (uv + D u · Dv) d x = f v dx Ω
Ω
∂u
d σ.
(2.32)
∀v ∈ C1 (Ω).
(2.33)
v
∂n
Now (2.33) makes sense even for a function u for which we are only able to define first generalized derivatives as functions. So, we will take (2.33) in a first step as a notion of the weak solution. Precisely, given f ∈ L2 (Ω), a weak solution of the Neumann problem is a function u ∈ H 1 (Ω) such that (uv + D u · Dv) d x = f v d x ∀v ∈ C1 (Ω). (2.34) Ω
Ω
The striking feature is that in this weak formulation (2.34), the Neumann boundary condition has disappeared! It is important to verify that, so doing, we have not lost any information. In other words, we need to show that, conversely, if u ∈ H 1 (Ω) verifies (2.34), then u satisfies (2.31). The second condition in (2.31) is the so-called Neumann boundary condition. First take v ∈ (Ω). Clearly (Ω) is a subspace of C1 (Ω) and so we obtain u − Δu = f
in (Ω).
(2.35)
i
i i
i
i
i
i
2.3. Weak solutions
book 2014/8/6 page 33 i
33
To recover the Neumann boundary condition, we have to perform the integration by parts in a reverse way. To do so, we assume that we have been able to prove that the weak solution u of (2.34) is in fact a regular function. So, by using (2.32) and (2.34), ∂u (u − Δu)vd x + v f v d x ∀v ∈ C1 (Ω). (2.36) dσ = ∂n Ω ∂Ω Ω By (2.35), u − Δu = f , so we can simplify (2.36) to obtain ∂u d σ = 0 ∀v ∈ C1 (Ω), v ∂ n ∂Ω which implies
∂u ∂n
= 0.
Thus, the Neumann boundary condition is implicitly contained in the weak variational formulation (2.34). Indeed, just like for the Dirichlet problem, we will prove a density result, namely, “C1 (Ω) is dense in H 1 (Ω).” As a consequence, the equality (2.34) can be extended to all v ∈ H 1 (Ω) and the final variational formulation of the Neumann problem will be the following. Definition 2.3.2. A weak solution of the Neumann problem is a solution u of the following system: ⎧ ⎨ (uv + D u · Dv) d x = f v d x ∀ v ∈ H 1 (Ω), (2.37) Ω Ω ⎩ u ∈ H 1 (Ω). Note again that the above problem can be written as find u ∈ V = H 1 (Ω) such that a(u, v) = L(v) ∀ v ∈ V , where a(u, v) = Ω (uv + D u · Dv) d x is a bilinear form, symmetric, and positive and L(v) = f v d x is a linear form on V . Ω
The basic difference between the weak variational formulations of the Dirichlet and Neumann problems is in the choice of the space V which reflects the choice of the test functions: V = H01 (Ω) in the Dirichlet problem; V = H 1 (Ω) in the Neumann problem. The Stokes system. Given f = ( f1 , f1 , . . . , fN ) ∈ L2 (Ω)N and μ > 0, we are looking for the velocity vector field of the fluid u = (u1 , u2 , . . . , uN ) and the pressure p : Ω −→ R of the fluid which satisfy − μΔui +
∂p
∂ xi div u = 0 ui = 0
The condition div u =
N
∂ ui i =1 ∂ xi
= fi on Ω, on Ω, on ∂ Ω,
i = 1, . . . , N , i = 1, . . . , N .
(2.38) (2.39) (2.40)
= 0 expresses that the fluid is incompressible.
i
i i
i
i
i
i
34
book 2014/8/6 page 34 i
Chapter 2. Weak solution methods in variational analysis
The choice of the test functions is not as immediate as in the two previous situations. A guideline is to choose the test functions smooth enough to perform the integration by parts and which look like the function or vector field we want to test. A clever choice ( J. Leray developed this method) is to take test fields v ∈ , where = {v = (v1 , . . . , vn ), vi ∈ (Ω), i = 1, . . . , N , and div v = 0}. One may require the vi to be C1 function with compact support as well. The important point is to assume that the divergence of v is equal to zero. Let us interpret (2.38) in the sense of distributions. If we expect to find ui with first partial derivatives in L2 (Ω) (i.e., ui ∈ H 1 (Ω)) and p ∈ L2 (Ω), this is equivalent to writing for each i = 1, 2, . . . , N ∂ vi p· dx = fi vi d x ∀vi ∈ (Ω). (2.41) μ D ui · Dvi d x − ∂ xi Ω Ω Ω The trick is now to add these equalities (i = 1, 2, . . . , N ). Since the test functions v1 , . . . , vN , ∂ vi by definition of , verify = 0, we obtain ∂x i
μ
N i =1
Ω
D ui · Dvi d x =
N i =1
Ω
fi vi d x
∀ v ∈ .
(2.42)
Conversely, it is easy to verify that if u is regular and satisfies (2.42), then N (−μΔui − fi )vi d x = 0 ∀ v ∈ . i =1
Ω
In other words, the vector (μΔui + fi )i =1,...,N is orthogonal to in L2 (Ω)N . One can prove—indeed, this is quite an involved result (see Chapter 6)—that this property implies the existence of p ∈ L2 (Ω) such that μ Δui + fi =
∂p ∂ xi
,
i = 1, 2, . . . , N .
Indeed, as in the previous examples, the equality (2.42) can be extended by a density and continuity argument to V = {v ∈ H01 (Ω)N : div v = 0}. Finally, the variational formulation of the Stokes system is given below. Definition 2.3.3. A weak solution of the Stokes system is a solution u = (u1 , u2 , . . . , uN ) of the system ⎧ N N ⎪ ⎨μ D ui · Dvi d x = fi vi d x ∀v ∈ V , (2.43) Ω i =1 i =1 Ω ⎪ ⎩ u ∈ V, where V = {v ∈ H01 (Ω)N : div v = 0}. The choice of the functional space V (which is obtained by a completion of the space of test functions) is of fundamental importance. The pressure p has apparently disappeared in this formulation. It is contained implicitly in it, since p can be interpreted as a Lagrange multiplier of the constraint div v = 0.
i
i i
i
i
i
i
2.3. Weak solutions
book 2014/8/6 page 35 i
35
Notice that, once more, the weak formulation we have obtained can be written in the following form: find u ∈ V such that ∀v ∈ V , where a(u, v) = μ Ω i =1 D ui Dvi d x and L(v) = Ω fi vi are, respectively, a bilinear form and a linear form on V . N
a(u, v) = L(v)
2.3.2 Positive quadratic forms and convex minimization The weak formulations of the model examples studied in the previous section have very similar structures. Indeed, they can be viewed as particular cases of the following abstract problem. Given V a linear vector space, a : V × V −→ R a bilinear form, and L : V −→ R a linear form, find u ∈ V such that a(u, v) = L(v)
∀v ∈ V .
(2.44)
In Chapter 3, we will study in detail the existence of solutions to such problems. This will require some topological assumptions on the data V , a, L. For the moment, we will examine algebraic properties of such problems and make the link, when a(·, ·) is symmetric and positive, with convex minimization problems. Let us first make precise these notions concerning bilinear and quadratic forms. Definition 2.3.4. Let V be a linear vector space and a : V × V −→ R a bilinear form, i.e., ∀ u ∈ V v −→ a(u, v) is a linear form, ∀ v ∈ V u −→ a(u, v) is a linear form. The bilinear form is said to be symmetric if ∀u, v ∈ V
a(u, v) = a(v, u).
When a is symmetric, one can associate to a(·, ·) the quadratic form q : V −→ R which is equal to q(v) = a(v, v). The bilinear form a is said to be positive (one can say as well that the associated quadratic form q is positive) if ∀ v ∈ V a(v, v) ≥ 0. We say that a(·, ·) (or q(·)) is positive definite if ∀v ∈V
a(v, v) ≥ 0
and
a(v, v) = 0 =⇒ v = 0.
We can now make the link between problem (2.44) and a minimization problem. All the notions used in the following statement are algebraic. Proposition 2.3.1. Let V be a linear vector space, L : V −→ R a linear form, and a : V × V −→ R a bilinear, symmetric, positive form. Then the two following statements are equivalent: (i) u ∈ V , a(u, v) = L(v) ∀v ∈ V ; (ii) u ∈ V , J (u) ≤ J (v) ∀ v ∈ V , where 1 J (v) := a(v, v) − L(v). 2
i
i i
i
i
i
i
36
book 2014/8/6 page 36 i
Chapter 2. Weak solution methods in variational analysis
PROOF. Let us first prove (i) =⇒ (ii). Since V is a linear space, it is equivalent to prove that J (u) ≤ J (u + v) ∀v ∈ V . A simple computation gives % J (u + v) − J (u) =
1 2
&
%
a(u + v, u + v) − L(u + v) −
1 2
& a(u, u) − L(u) .
Note that because of the symmetry assumption on the bilinear form a(·, ·), a(u + v, u + v) = a(u, u) + 2a(u, v) + a(v, v). Thus %
& 1 a(u, u) + a(u, v) + a(v, v) 2 2 1 − a(u, u) − [L(u) + L(v)] + L(u) 2 1 = [a(u, v) − L(v)] + a(v, v). 2
J (u + v) − J (u) =
1
Since, by assumption, u is a solution of (i), a(u, v) = L(v) and 1 J (u + v) − J (u) = a(v, v), 2 which is nonnegative, since a has been assumed to be positive. Let us now prove (ii) =⇒ (i). We know that u is a solution of the minimization problem, i.e., u minimizes J (·). One is naturally tempted to write an optimality condition which expresses that some derivative of J at u is equal to zero. Since V was only assumed to be a linear vector space, the only derivation notion we can use is the directional derivative which always makes sense since it relies only on the topological structure of the real line. Since u minimizes J , for any t ∈ R, for any v ∈ V , J (u + t v) − J (u) ≥ 0. Dividing by t > 0, we have 1 t
[J (u + t v) − J (u)] ≥ 0.
Before letting t go to zero, let us compute this last expression: 1 t
[J (u + t v) − J (u)] =
% 1 1
1
&
a(u + t v, u + t v) − L(u + t v) − a(u, u) + L(u) t 2 2 2 t 1 t a(u, v) + a(v, v) − t L(v) = t 2 t = a(u, v) + a(v, v) − L(v). 2
i
i i
i
i
i
i
2.3. Weak solutions
book 2014/8/6 page 37 i
37
Thus, by letting t go to zero, we obtain lim+
t →0
1 t
[J (u + t v) − J (u)] = a(u, v) − L(v) ≥ 0.
Then, one can either make the same argument by using t < 0 or replace v by −v in the above inequality to obtain the opposite inequality and conclude that a(u, v) = L(v)
∀v ∈ V .
Let us return to the model examples studied in Section 2.3.1 and use their weak formulations together with Proposition 2.3.1 to obtain the results below. Corollary 2.3.1. With the notation of Section 2.3.1, the following facts hold: (a) The weak solution u of the Dirichlet problem −Δu = f on Ω, u=0 on ∂ Ω is a solution of the minimization problem J (u) ≤ J (v) ∀ v ∈ H01 (Ω), u ∈ H01 (Ω), where J (v) := 12 Ω |Dv|2 d x − Ω f vd x. This is the Dirichlet variational principle. (b) The weak solution u of the Neumann problem u − Δu = f on Ω, ∂u = 0 on ∂Ω ∂n is a solution of the minimization problem J (u) ≤ J (v)
∀ v ∈ H 1 (Ω),
u ∈ H 1 (Ω), where J (v) := 12 Ω (|Dv|2 + v 2 )d x − Ω f v d x. (c) The weak solution u of the Stokes system ⎧ ∂p ⎪ ⎨−μΔui + ∂ xi = fi , i = 1, 2, . . . , N on Ω, div u = 0 on Ω, ⎪ ⎩ u = 0 on ∂ Ω is a solution of the minimization problem J (u) ≤ J (v) ∀ v ∈ V = {v ∈ H01 (Ω)N : div v = 0}, u ∈ V, where J (v) = 12 N |Dvi |2 d x − N f v d x. i =1 Ω i =1 Ω i i Note that in all these examples, the bilinear form a(·, ·) is symmetric and positive. We now come to the question of the nature of the minimization problem and the properties of the functional J given by 1 J (v) = a(v, v) − L(v). 2 Note that J is the sum of a quadratic form and of a linear form.
i
i i
i
i
i
i
38
book 2014/8/6 page 38 i
Chapter 2. Weak solution methods in variational analysis
We are ready to introduce a property of fundamental importance in the study of the minimization problems—convexity. Recall that a function J : V −→ R, where V is a linear vector space, is convex if ∀ u, v ∈ V ,
∀ λ ∈ [0, 1] J (λu + (1 − λ)v) ≤ λJ (u) + (1 − λ)J (v).
The role of convexity in minimization problems will be examined in detail in Chapters 3, 9, 13, and 15. The class of convex functionals is stable with respect to the sum, it contains the linear forms, and we are going to see that it contains the positive quadratic forms. Thus, functionals of the form J (v) = 12 a(v, v)−L(v) with a and L as above will be convex. Let us now formulate the convexity property for the positive quadratic forms. Proposition 2.3.2. Let V be a linear vector space and a : V × V −→ R a bilinear form which is symmetric and positive. Then, the quadratic form q : V −→ R which is associated with a, i.e., q(v) = a(v, v) is a convex function. PROOF. This is just an algebraic computation. For any u, v ∈ V and λ ∈ [0, 1], q(λu + (1 − λ)v) = a(λu + (1 − λ)v, λu + (1 − λ)v) = λ2 a(u, u) + (1 − λ)2 a(v, v) + 2λ(1 − λ)a(u, v). Thus, λq(u) + (1 − λ)q(v) − q(λu + (1 − λ)v) = (λ − λ2 )a(u, u) − 2λ(1 − λ)a(u, v) + [(1 − λ) − (1 − λ)2 ]a(v, v) = λ(1 − λ)[a(u, u) − 2a(u, v) + a(v, v)] = λ(1 − λ)a(u − v, u − v), which is nonnegative, because λ ∈ [0, 1] and a is positive. When examining the question of the uniqueness of the solution of the previous problems, the notion which plays a central role is the strict convexity. Recall that J : V −→ R is strictly convex if J is convex and the convexity inequality J (λu + (1 − λ)v) < λJ (u) + (1 − λ)J (v) is strict whenever u = v and λ ∈ ]0, 1[. The importance of this notion is justified by the following elementary result. Proposition 2.3.3. Let V be a linear space and J : V −→ R a strictly convex function. Then there exists at most one solution u to the minimization problem J (u) ≤ J (v) ∀ v ∈ V , u ∈ V. PROOF. Suppose that we have two distinct solutions u1 and u2 to the above minimization problem. Then u1 + u2 1 < [J (u1 ) + J (u2 )] = inf J (u), J 2 2 a clear contradiction. Hence u1 = u2 .
i
i i
i
i
i
i
2.4. Weak topologies and weak convergences
book 2014/8/6 page 39 i
39
Proposition 2.3.4. Let V be a linear vector space and a : V × V −→ R a bilinear form which is symmetric and positive definite. Then, the quadratic form q : V −→ R which is associated with a, i.e., q(v) = a(v, v) is strictly convex. PROOF. The proof is the same computation as in the proof of Proposition 2.3.2: for any u, v ∈ V , for any λ ∈ [0, 1], λq(u) + (1 − λ)q(v) − q(λu + (1 − λ)v) = λ(1 − λ)a(u − v, u − v). When taking λ ∈ ]0, 1[ we have λ(1−λ) > 0, and when taking u = v we have a(u − v, u − v) > 0 because a is positive definite. So, for λ ∈ ]0, 1[ and u = v, λq(u) + (1 − λ)q(v) > q(λu + (1 − λ)v) and q is strictly convex. Proposition 2.3.5. The sum of a convex function and a strictly convex function is strictly convex. PROOF. The proof is a direct consequence of the fact that when adding an inequality and a strict inequality, one obtains a strict inequality. Let us return to the model examples and their variational formulations as minimization problems as given in Corollary 2.3.1. Noticing that in all these situations the quadratic form q(v) = a(v, v) is positive definite, we obtain that the corresponding functional J is strictly convex. So, the weak solution of the problems under consideration, when it exists, is characterized as the unique solution of the associated minimization problems. This makes a natural transition to Chapter 3, where the existence question will be examined.
2.4 Weak topologies and weak convergences In recent decades, weak topologies have proved useful as a basic tool in variational analysis in the study of PDEs, and more generally in all fields using tools from functional analysis. Let us explain some of the reasons for the success of weak convergence methods. (a) Distributions are defined as continuous linear forms on (Ω). In other words, a distribution T ∈ (Ω) is viewed via its action on test functions v ∈ (Ω): T ∈ (Ω) : v ∈ (Ω) −→ 〈T , v〉( (Ω), (Ω)) . Given a sequence T1 , T2 , . . . , Tn , . . . of distributions (for example, functions, measures), a natural mode of convergence for such sequences is to assume that ∀ v ∈ (Ω)
lim 〈Tn , v〉( , ) = 〈T , v〉.
n→∞
This is a typical example of weak convergence. (b) A celebrated theorem from Riesz asserts that the closed unit ball of a normed linear space is compact iff the space has a finite dimension. Thus, when looking for topologies making bounded sets relatively compact in infinite dimensional spaces, one is naturally led to introduce new topologies which are weaker than the topology of the norm. This is why weak topologies play a decisive role. (c) Besides the importance of weak topologies from a theoretical point of view, we will see that weak convergences naturally occur when describing concrete situations. For example, weak convergences allow us to describe high oscillations of a sequence of functions, as well as concentration phenomena on zero Lebesgue measure sets.
i
i i
i
i
i
i
40
book 2014/8/6 page 40 i
Chapter 2. Weak solution methods in variational analysis
Before introducing weak topologies on normed linear spaces, let us recall some basic facts from general topology.
2.4.1 Topologies induced by functions in general topological spaces First we need to fix the notation. Recall that a topology on a space X is a family θ of subsets of X , called the family of the open sets of X , satisfying the axioms of the open sets, namely, (i) X and belong to θ; (ii) for all (Gi )i ∈I Gi ∈ θ, I arbitrary, ∪i ∈I Gi ∈ θ; (iii) for all (Gi )i ∈I Gi ∈ θ, I finite, ∩i ∈I Gi ∈ θ. In other words, the open sets of X for a given topology are a family of subsets of X which is stable with respect to arbitrary unions and finite intersection. We will often denote by τ a topology on a space X and by θτ the family of the τ-open sets. A topology can be seen as a subset of P (X ), where P (X ) is the family of all subsets of X . There is a natural partial ordering on the topologies on a given space X , which is induced by the inclusion ordering on the subsets of P (X ): we will say that a topology τ1 is coarser or weaker than a topology τ2 and we write τ1 < τ2 if θτ1 ⊂ θτ2 , that is, if any element G of θτ1 also belongs to θτ2 . Conversely, we will say that τ2 is stronger or finer than τ1 . Proposition 2.4.1. The family of the topologies on a set X forms a complete lattice for the relation τ1 < τ2 (τ1 weaker than τ2 ), that is, given an arbitrary collection of topologies (τi )i ∈I on X , the following hold: (a) There exists a lower bound, that is a topology which is the largest among all the topologies weaker than the τi , i ∈ I . We denote τ = ∧i ∈I τi the lower bound (or infimum) of the τi . Clearly θτ = ∩i ∈I θτi , that is, G ∈ θτ iff G belongs to θτi for all i ∈ I . (b) There exists an upper bound (or supremum) that is a topology which is the smallest among the topologies which are stronger than all the τi . We denote τ = ∨i ∈I τi the upper bound of the τi . We have that θτ is generated by ∪i ∈I θτi in the sense of Proposition 2.4.2. PROOF. (a) Clearly if (θτi )i ∈I is a family of topologies on X , then ∩i ∈I θτi = {G ∈ P (X ) : G ∈ θτi ∀i ∈ I } still satisfies the axioms of the open sets, and it is a topology. The topology τ attached to the family θ = ∩i ∈I θτi is weaker than all the topologies τi , i ∈ I , and clearly it is the largest among the weaker ones. (b) In contradiction to the previous case, if a topology τ is stronger than all the topologies τi , then θτ must contain all the families θτi , that is, θτ ⊃ ∪i ∈I θτi . But now = ∪i ∈I θτi does not satisfy (in general) the axioms of the open sets. So, one is naturally led to address the following question: given a class of subsets of an abstract space X , does there exist a smallest topology θτ on X which contains ? Clearly by (a), the answer is yes, one has to take for θτ the intersection (or, with an equivalent terminology, the infimum) of all the topologies containing = ∪i ∈I θτi . This is made precise in Proposition 2.4.2.
i
i i
i
i
i
i
2.4. Weak topologies and weak convergences
book 2014/8/6 page 41 i
41
Proposition 2.4.2. Let X be an abstract space and let be any class of subsets of X . Then there exists a smallest (weakest) topology on X containing , denoted by τ , called the topology generated by . It is equal to the intersection of all the topologies containing . It can be obtained via the following two-step procedure: 1. First, take the finite intersections of elements of . One so obtains a family of sets which we call . 2. Then is a base for the topology τA generated by , that is, any member of τ can be obtained as the union of a family of members of . We stress that in this construction, one has first to take finite intersections of elements of , then arbitrary unions of the so-obtained sets. When reversing the two operations, one obtains a family which is not stable by union. For a proof, see one of several books on general topology (for instance, Bourbaki [119]). We now come to the situation which is of interest when considering weak topologies. Suppose X is an abstract space and (Yi , τi )i ∈I is a family of topological spaces. Suppose that for each i ∈ I , a function fi : X → Yi is given. We want to investigate the topologies on X with respect to which all the functions fi are continuous, and, among these topologies, examine the question of the existence of a smallest (weakest) one. Noticing that for each i ∈ I , fi −1 (θτi ) := { fi −1 (Gi ) : Gi ∈ θτi } still satisfies the axioms of the open sets, we denote fi −1 (τi ) the corresponding topology on X , which is the weakest making fi (for i fixed) continuous. It follows from Propositions 2.4.1 and 2.4.2 that the answer to the previous question is given by τ = ∨i ∈I fi −1 (τi ), whose precise description is given in the following. Theorem 2.4.1. Let X be an abstract space and let (Yi , τi )i ∈I be an arbitrary collection of topological spaces with for each i ∈ I , fi : X → Yi a given function. Then, there exists a weakest topology τ on X making all the functions ( fi )i ∈I continuous, fi : (X , τ) −→ (Yi , τi ) i ∈ I . This topology τ is equal to ∨i ∈I fi −1 (τi ), that is, τ is generated by the class = { fi −1 (Gi ), Gi ∈ θτi , i ∈ I }. The class τ =
' i ∈J
fi
−1
(Gi ), Gi ∈ θτi , J ⊂ I , J finite
is a base for this topology, that is, each element of θτ can be written as a union of elements of τ . We say that the topology τ is induced by the family ( fi )i ∈I . The following properties are quite elementary. Proposition 2.4.3. Let fi : X −→ (Yi , τi ), i ∈ I , be given, and let τ = ∨ fi −1 (τi ) be the topology on X induced by the ( fi )i ∈I . For any sequence (xn )n∈N of elements of X , the two conditions are equivalent: τ
(i) xn −→ x as n → ∞; τi
(ii) for all i ∈ I fi (xn ) −→ fi (x) as n → ∞. PROOF. Since the topology τ makes each fi continuous, we have clearly (i) =⇒ (ii). Conversely, let us assume (ii) and prove (i). When considering a neighborhood of x, it is equivalent to take an element of the base τ which contains x. So let,
i
i i
i
i
i
i
42
book 2014/8/6 page 42 i
Chapter 2. Weak solution methods in variational analysis
x ∈
(
τi
f −1 (Gi ), Gi ∈ θτi , J ⊂ I , J finite. For each i ∈ J , since fi (xn ) −→ fi (x) we
i ∈J i
have that xn ∈ fi −1 (Gi ) for n ≥ ni . Take N = maxi ∈J ni , since J is finite, N is a finite ( τ integer, and xn ∈ i ∈J fi −1 (Gi ) for n ≥ N , which expresses that xn −→ x. A similar type argument yields the following result.
Proposition 2.4.4. Let (Z, ) be a topological space and let g : (Z, ) −→ (X , τ) be a given function, where τ is the topology induced by the family fi : X −→ (Yi , τi ). Then g is continuous iff fi ◦ g is continuous from (Z, ) into (Yi , τi ) for each i ∈ I .
2.4.2 The weak topology σ (V, V ∗ ) We now assume that X is a vector space. To enhance this property, we denote it by V (like vector) and assume that V is a normed linear space, the norm of v ∈ V being denoted by vV or v when no confusion is possible. We denote by V ∗ the topological dual of V , which is the set of all linear continuous forms on V . To avoid confusion, generic elements of V and V ∗ are denoted, respectively, by v ∈ V and v ∗ ∈ V ∗ . We will write 〈v ∗ , v〉 = v ∗ (v) for the canonical pairing between V ∗ and V , which is just the evaluation of v ∗ ∈ V ∗ at v ∈ V . Recall that V ∗ is a normed linear space (indeed, it is a Banach space) when equipped with the dual norm v ∗ V ∗ = sup{|v ∗ (v)| : vV ≤ 1}. With this definition, we have ∀v ∈ V , ∀v ∗ ∈ V ∗
|〈v ∗ , v〉| ≤ v ∗ v,
and v ∗ is precisely the smallest constant for which the above inequality holds. Definition 2.4.1. Let (V , · ) be a normed linear space with topological dual V ∗ . The topology σ(V ,V ∗ ), called the weak topology on V , is the weakest topology on V making continuous all the elements of V ∗ . Let us first comment on this definition. By definition, each element v ∗ ∈ V ∗ is a function from V into R, that is, v ∗ : V −→ R,
v −→ 〈v ∗ , v〉(V ∗ ,V ) .
The weak topology on V is defined as the weakest topology on V making all these functions {v ∗ : v ∗ ∈ V ∗ } continuous. By Theorem 2.4.1, such a topology exists, and it is weaker than the norm topology (since by definition all the elements v ∗ of V ∗ are continuous for the norm topology). We collect below some first results on the topology σ(V ,V ∗ ) which are direct consequences of its definition. Proposition 2.4.5. Let V be a normed space and σ(V ,V ∗ ) the weak topology on V . (i) A local base of neighborhoods of v0 ∈ V for σ(V ,V ∗ ) consists of all sets of the form N (v0 ) = {v ∈ V : |〈vi∗ , v − v0 〉| < ∀i ∈ I }, where I is a finite index set, vi∗ ∈ V ∗ for each i ∈ I , and > 0.
i
i i
i
i
i
i
2.4. Weak topologies and weak convergences
book 2014/8/6 page 43 i
43
(ii) (V , σ(V ,V ∗ )) is a Hausdorff topological space. (iii) The topology σ(V ,V ∗ ) is coarser than the topology of the norm on V . (iv) When V is finite dimensional, the weak topology and the norm topology coincide. (v) When V is infinite dimensional, the weak topology σ(V ,V ∗ ) is strictly coarser than the norm topology. PROOF. (i) We have N (v0 ) =
' i ∈I
(vi∗ )−1 (]αi − , αi + [),
〈vi∗ , v0 〉.
By definition of the weak topology σ(V ,V ∗ ), N (v0 ) is an open set where αi = for this topology. Let us prove that such sets form a local base of open neighborhoods of v0 for σ(V ,V ∗ ). Take A an open set for σ(V ,V ∗ ) containing v0 . By Theorem 2.4.1, there exists some open set B for σ(V ,V ∗ ) such that v0 ∈ B ⊂ A ( ∗ −1 ∗ ∗ with B = i ∈I (vi ) (Gi ), vi ∈ V , Gi open in R, I finite. Since vi∗ (v0 ) ∈ Gi and Gi is open in R, there exists some > 0 such that |vi∗ (v) − ∗ vi (v0 )| < for all i ∈ I implies vi∗ (v) ∈ Gi for all i ∈ I . Hence v0 ∈ N (v0 ) ⊂ B ⊂ A with
N (v0 ) =
' i ∈I
(vi∗ )−1 (]αi − , αi + [), αi = 〈vi∗ , v0 〉.
(ii) Let us prove that the topology σ(V ,V ∗ ) is Hausdorff. Take v1 and v2 two distinct elements of V and prove that there exist A1 and A2 two open sets for σ(V ,V ∗ ) such that v1 ∈ A1 , v2 ∈ A2 , and A1 ∩ A2 = . This is a direct consequence of the Hahn–Banach separation theorem. There exists a closed hyperplane which strictly separates v1 and v2 , that is, there exists some v ∗ ∈ V ∗ and α ∈ R such that 〈v ∗ , v1 〉 < α < 〈v ∗ , v2 〉. Take A1 = {v ∈ V : 〈v ∗ , v〉 < α}, A2 = {v ∈ V : 〈v ∗ , v〉 > α}. They are open for the topology σ(V ,V ∗ ) and separate v1 and v2 . Assertion (iii) is obvious since all the elements v ∗ of V ∗ are continuous for the norm topology. (iv) Since the weak topology σ(V ,V ∗ ) is coarser than the norm topology, it has fewer open sets. Let us prove that when V is a finite dimensional space, the opposite inclusion is true, that is, any open set A for the norm topology is also an open set for the weak topology. Take v0 ∈ B(v0 , ) ⊂ A, where B(v0 , ) is an open ball in (V , ·), and prove that there exists some open set U for σ(V ,V ∗ ) such that v0 ∈ U ⊂ B(v0 , ) ⊂ A.
i
i i
i
i
i
i
44
book 2014/8/6 page 44 i
Chapter 2. Weak solution methods in variational analysis
Let us choose a base e1 , . . . , eN of V with ei = 1, i = 1, . . . , N . Each element v of V ei∗ can be uniquely written as v = xi ei and the mappings v −→xi are linear continuous forms on V , i.e., they are elements of V ∗ . We have (with v0 = xoi ei ) ) ) ) ) ) ) v − v0 = ) (xi − xoi )ei ) ) ) i ≤ |xi − xoi | i
≤
|〈ei∗ , v − vo 〉|.
i
( Therefore, v ∈ B(v0 , ) as soon as v ∈ U := i =1,...,n (ei∗ )−1 (]αi − N , αi + N [), where αi = 〈ei∗ , v0 〉 = xoi . Then notice that U is open for σ(V ,V ∗ ), which concludes the proof of (iv). (v) There are different ways to prove that in infinite dimensional spaces the weak topology is strictly coarser than the norm topology. One of them consists of proving that the unit sphere S = {v ∈ V : v = 1} is never closed in infinite dimensional spaces ∗ for the topology σ(V ,V ∗ ). Indeed S¯σ(V ,V ) = {v ∈ V : v ≤ 1}; see, for instance, [137, Proposition III.6] and related comments for a detailed proof. Remark 2.4.1. It is quite convenient when formulating topological properties to express them with the help of sequences. This can be done without loss of generality when the topology under consideration is metrizable. But the weak topology σ(V ,V ∗ ) when V is an infinite dimensional normed space is a locally convex topology (the basic operations on V , vectorial sum and multiplication by a scalar, are continuous for the topology σ(V ,V ∗ )) which is not metrizable. Therefore, it is important to state the properties of this topology with general topological arguments as we have done up to now. Nevertheless, in most practical situations, one can just use weakly convergent sequences. This will follow from deep results like the Eberlein–Smulian compactness theorem or from simpler observations like the following one: if V ∗ is separable, the weak topology σ(V ,V ∗ ) is metrizable on each bounded set of V . Consequently we now focus on properties of sequences which are σ(V ,V ∗ ) convergent. Let us start with the following elementary results, which are direct consequences of the definition and of Proposition 2.4.3. Proposition 2.4.6. Let V be a normed linear space and σ(V ,V ∗ ) the weak topology on V . For any sequence (vn )n∈N in V the following properties hold: σ(V ,V ∗ )
(i) vn −→ v ⇐⇒ ∀v ∗ ∈ V ∗ 〈v ∗ , vn 〉 −→ 〈v ∗ , v〉; ·
σ(V ,V ∗ )
(ii) vn −→ v =⇒ vn −→ v; σ(V ,V ∗ )
(iii) vn −→ v =⇒ the sequence (vn )n∈N is bounded and v ≤ lim infn vn ; σ(V ,V ∗ )
·∗
(iv) vn −→ v and vn∗ −→ v ∗ =⇒ 〈vn∗ , vn 〉 −→ 〈v ∗ , v〉. PROOF. (iii) The fact that the sequence (vn )n∈N is bounded is a consequence of the Banach–Steinhaus theorem: consider the family of linear operators from the Banach space
i
i i
i
i
i
i
2.4. Weak topologies and weak convergences
V ∗ into R
book 2014/8/6 page 45 i
45
Tn : V ∗ −→ R
n ∈ N,
v ∗ −→ 〈v ∗ , vn 〉.
For each n ∈ N, Tn is a linear continuous operator with norm Tn " (V ∗ ,R) = sup |〈v ∗ , vn 〉| = vn . v ∗ ∗ ≤1
(This last equality is a consequence of the Hahn–Banach theorem.) For each v ∗ ∈ V ∗ , the sequence (Tn (v ∗ ))n ∈ N is bounded in R. This is a direct consequence of the equality Tn (v ∗ ) = 〈v ∗ , vn 〉 and of the weak convergence of the sequence (vn )n∈N . By the Banach– Steinhaus theorem, supn∈N Tn " (V ∗ ,R) < +∞, which is equivalent to supn∈N vn < ∞. Let us now prove the inequality v ≤ lim infn vn . By assumption, for each v ∗ ∈ V ∗ 〈v ∗ , v〉 = lim 〈v ∗ , vn 〉. n→+∞
By using the inequality |〈v ∗ , vn 〉| ≤ v ∗ ∗ vn , we infer ∀v ∗ ∈ V ∗
|〈v ∗ , v〉| ≤ (lim inf vn )v ∗ ∗ . n→+∞
By using the Hahn–Banach theorem, we obtain v = sup |〈v ∗ , v〉| ≤ lim inf vn . v ∗ ∗ ≤1
n
(iv) This is just a triangulation argument. Write 〈vn∗ , vn 〉 − 〈v ∗ , v〉 = 〈vn∗ − v ∗ , vn 〉 + 〈v ∗ , vn − v〉. Hence
|〈vn∗ , vn 〉 − 〈v ∗ , v〉| ≤ vn∗ − v ∗ ∗ vn + |〈v ∗ , vn − v〉|.
The previous result (iii) tells us that there exists some constant C ∈ R+ such that vn ≤ C for all n ∈ N. So, |〈vn∗ , vn 〉 − 〈v ∗ , v〉| ≤ C vn∗ − v ∗ + |〈v ∗ , vn − v〉|, which clearly implies the result. Remark 2.4.2. We will interpret in Section 3.2.3 the property σ(V ,V ∗ )
vn −→ v =⇒ v ≤ lim inf vn n
as a lower semicontinuity property of the norm ·V for the topology σ(V ,V ∗ ). Indeed, more generally, this can be viewed as a consequence of the fact that · V is convex and continuous on V (and hence lower semicontinuous for the topology σ(V ,V ∗ )). Because of its importance, let us say a few words about the weak convergence in Hilbert spaces (which we denote by H ). The Riesz representation theorem tells us that any element of the topological dual space can be represented as H # v −→ 〈 f , v〉,
i
i i
i
i
i
i
46
book 2014/8/6 page 46 i
Chapter 2. Weak solution methods in variational analysis
where f is a given element of H and 〈·, ·〉 is the scalar product in H . Let us complete this observation by a few elementary results. Proposition 2.4.7. Let H be a Hilbert space. A sequence (vn )n∈N is weakly convergent in H iff ∀z ∈ H 〈vn , z〉 −→ 〈v, z〉. n→+∞
Moreover, we have the following implication: σ(H ,H )
vn −→ v
and
·
vn −→ v =⇒ vn −→ v.
PROOF. We just need to prove the last statement. We have vn − v2 = vn 2 + v2 − 2〈vn , v〉. Hence
lim vn − v2 = v2 + v2 − 2〈v, v〉 = 0,
n→+∞ ·
that is, vn −→ v. In the next section we will prove that this last property, which is quite important in the applications, is valid in a much larger class than the Hilbert spaces, namely, the uniformly convex Banach spaces. For the moment, we pause in these theoretical developments to give some examples of sequences which are σ(V ,V ∗ ) convergent but not ·V convergent. 2 Example 2.4.1. Take real numbers, v = V = l 2 . An element v of V is a sequence of (vk )k∈N , such that k∈N |vk | < +∞. The scalar product 〈u, v〉 := k∈N uk vk and the corresponding norm v = ( |vk |2 )1/2 give to V a Hilbert space structure. Consider the sequence e1 , e2 , . . . , en , . . . with
en = (δn,k )k∈N , where δn,k (the Kronecker symbol) takes the value 1 if k = n and 0 elsewhere. The family (en )n∈N is called the canonical basis of l 2 (it is a Hilbertian basis). Let us show that (en )n∈N weakly converges to 0 in V , that is, ∀v ∈ V
〈en , v〉 −→ 0. n→+∞
Observe that 〈en , v〉 = vn , where v = (vn )n∈N . Since Σn∈N |vn |2 < +∞, the general term of this convergent series, that is, vn , tends to zero as n goes to +∞, which proves the result. The sequence (en )n∈N is not norm convergent in V ; otherwise it would necessarily norm converge to zero. (Recall that the norm convergence implies the weak convergence.) This is impossible since for each n ∈ N, en = 1. This can be equivalently obtained when observing that for all n = m $ en − e m = 2, the sequence (en )n∈N is not a Cauchy sequence, and hence it is not norm convergent. Example 2.4.2 (weak convergence in L p (Ω), 1 ≤ p < ∞). Take Ω a bounded open set in RN , 1 ≤ p < ∞, and p p V = L (Ω) = v : Ω −→ R Lebesgue measurable: |v(x)| d x < +∞ . Ω
i
i i
i
i
i
i
2.4. Weak topologies and weak convergences
47
*
V equipped with the norm v =
L p (Ω), ∗
1 p
+ ∞
1 p
book 2014/8/6 page 47 i
|v(x)| p d x Ω
+1/ p
is a Banach space with dual V ∗ =
= 1 (with the convention that the conjugate exponent of 1 is +∞, i.e.,
L (Ω) = L (Ω)). The weak convergence in V = L p (Ω) can be formulated as follows: 1
σ(L p ,L p )
vn −→ v ⇐⇒ ∀z ∈ L p (Ω)
Ω
vn (x)z(x)d x −→
n→+∞
v(x)z(x)d x. Ω
The weak convergence in L p allows us to model two different types of phenomena (they may occur simultaneously): 1. Oscillations. We describe the simplest situation of wild oscillations. Take Ω = (a, b ) a bounded open interval of the real line, d x the Lebesgue measure on Ω, and vn (x) = sin(nx), n ∈ N. Clearly vn oscillates between −1 and +1 with period equal to Tn = 2π/n. When n goes to +∞, Tn −→ 0 and simultaneously its frequency goes to +∞.
σ(L p ,L p )
Let us prove that vn −→ 0 for any 1 ≤ p < ∞. Indeed, we can state a slightly more precise result: for any z ∈ L1 (Ω)
b
z(x) sin nxd x −→ 0
as n → +∞.
a
(Note that for all 1 ≤ p < +∞, L p (a, b ) ⊂ L1 (a, b ).) Indeed, that is exactly the Riemann’s theorem which states that the Fourier coefficients of an integrable function tend to zero as n → +∞. For convenience, we give the proof, which is a nice illustration of a density argument. We also emphasize that this result is a particular case of an ergodic theorem (see Section 12.4.1 or Section 13.2, Proposition 13.2.1, Remark 13.2.5). For an arbitrary z ∈ L1 , it is difficult to compute or get information on the integral b z(x) sin nx d x. So, let us first consider the case where z belongs to some dense subspace a Z of L1 (Ω), Z being chosen to make the computation easier. For example, take Z = C1c (a, b ), the subspace of C1 functions with compact support in (a, b ). By integration by parts, for any z ∈ C1c (a, b ),
b
b
z(x) sin nxd x = a
which implies
z (x)
cos nx
a
n
d x,
1b b z(x) sin nxd x ≤ |z (x)|d x −→ 0. n a n→+∞ a
Another choice would consist of taking for Z the subspace of the step functions on (a, b ). In that case a direct computation yields a similar result. So, for any z belonging to a dense subspace Z of L1 (Ω), we have that
b
z(x) sin nxd x −→ 0. a
n→+∞
We complete the proof by a density argument.
i
i i
i
i
i
i
48
book 2014/8/6 page 48 i
Chapter 2. Weak solution methods in variational analysis
Take z an arbitrary element of L1 (Ω). By the density of Z in L1 (Ω), for any > 0, there exists some element z ∈ Z such that z − z 1 < . Let us write
b
a
b
z(x) sin nx d x = a
z (x) sin nx d x +
b a
(z(x) − z (x)) sin nx d x.
Thus, b b b z(x) sin nxd x ≤ z (x) sin nxd x + |z(x) − z (x)|d x a a a b ≤ z (x) sin nxd x + . a Since z ∈ Z, we obtain b lim sup z(x) sin nxd x ≤ n→+∞ a
∀ > 0,
b which implies limn→+∞ a z(x) sin nxd x = 0. We now observe that the sequence (vn )n∈N , vn (x) = sin nx does not norm converge in V = L p (a, b ). Otherwise, it would be norm convergent to zero, but this is impossible since, for example, with p = 2, vn 2 =
b
2
(sin nx) d x = a
b
1
(1 − cos 2nx)d x 2 b −a 1 = − (sin 2nb − sin 2na) 2 4n b −a −→ , which is different from zero! n→+∞ 2 a
2. Concentration. Take Ω = (0, 1) and (vn )n∈N a sequence of step functions which is described as follows: , $ k k let An = k=1,...,n n+1 − 2n1 2 , n+1 + 2n1 2 and take vn = n on An and vn = 0 elsewhere. Let us examine the mode of convergence of the sequence (vn )n∈N in V = L2 (0, 1). One can 1first observe that (a) 0 vn2 (x) d x = n · n12 · n = 1 for all n ∈ N; (b) the sequence (vn )n∈N converges to zero in measure, that is, ∀δ > 0
meas{x ∈ (0, 1) : |vn (x)| > δ} −→ 0.
In fact, one just needs to observe that {x ∈ (0, 1) : |vn (x)| > δ} = An and meas(An ) = n · n12 = n1 −→ 0. n→+∞
Therefore, the sequence (vn )n∈N does not norm converge in L2 (0, 1); otherwise it would converge to zero (recall that the norm convergence in L2 implies the convergence in measure), which is impossible since vn L2 = 1.
i
i i
i
i
i
i
2.4. Weak topologies and weak convergences
book 2014/8/6 page 49 i
49
Let us now prove that the sequence (vn )n∈N weakly converges to zero in V = L2 (0, 1). Indeed, by using the same density argument as in the previous oscillation example, we just need to prove that for any step function z : (0, 1) −→ R,
1 0
vn (x)z(x)d x −→ 0. n→+∞
By linearity of the integral, we just need to compute for any 0 < a < b < 1 the integral b v (x)d x. Let us now observe that as n goes to +∞, a n
b
vn (x)d x % n(b − a) ·
a
1 $ b −a · n = $ −→ 0 2 n n
(% stands for equivalent). We stress the fact that in the concentration example, the weak convergence occurs simultaneously with the pointwise convergence. What happens in this situation is that the mass of |vn |2 is concentrated in a set of small Lebesgue measure. This is the concentration phenomenon. Let us notice, too, that the sequence (vn )n∈N , in the above example norm converges to zero in any L p (0, 1), 1 ≤ p < 2! To see this, just compute
1 0
|vn (x)| p d x = n ·
1 n2
· n p/2 = n ( p/2)−1 −→ 0
as n → +∞ as soon as ( p/2) − 1 < 0, that is, p < 2. Thus p = 2, in this situation, is a critical exponent, for which we pass from strong convergence of the sequence (vn )n∈N in L p , p < 2, to weak convergence in L2 . As we will see, weak convergences related to concentration effect, often occurs in situations where some critical exponent is involved (like the critical Sobolev exponent). The two previous examples illustrate the utility of the density arguments when proving weak convergence. Let us state it in an abstract setting. Proposition 2.4.8. Let V be a normed linear space and Z a dense subset of V ∗ . For any bounded sequence (vn )n∈N in V , the following assertions are equivalent: σ(V ,V ∗ )
(i) vn −→ v. (ii) For all z ∗ ∈ Z, 〈z ∗ , vn 〉 −→ 〈z ∗ , v〉 as n → +∞. PROOF. Clearly (i) =⇒ (ii). So let us assume (ii) and prove that for any v ∗ ∈ V ∗ , we have lim 〈v ∗ , vn 〉 = 〈v ∗ , v〉.
n→+∞
By the density of Z in V ∗ for any > 0, there exists some element z∗ ∈ Z such that v ∗ − z∗ ∗ < . Let us write 〈v ∗ , vn − v〉 = 〈z∗ , vn − v〉 + 〈v ∗ − z∗ , vn − v〉, which by the triangle inequality and the definition of the dual norm · ∗ yields |〈v ∗ , vn − v〉| ≤ |〈z∗ , vn − v〉 + v ∗ − z∗ ∗ · vn − v.
i
i i
i
i
i
i
50
book 2014/8/6 page 50 i
Chapter 2. Weak solution methods in variational analysis
Using the assumption that the sequence (vn )n∈N is bounded in V and that v ∗ − z∗ ∗ < , we obtain that for some constant C ∈ R+ , |〈v ∗ , vn − v〉| ≤ |〈z∗ , vn − v〉| + C . Now let n tend to +∞, and use assumption (ii) together with z∗ ∈ Z to get lim sup |〈v ∗ , vn − v〉| ≤ C . n→+∞
This inequality being true for any > 0, we finally infer ∀v ∗ ∈ V ∗
lim 〈v ∗ , vn − v〉 = 0,
n→+∞
that is, v = σ(V ,V ∗ ) limn→+∞ vn .
2.4.3 Weak convergence and geometry of uniformly convex spaces In this section we pay attention to a particular class of Banach spaces, namely, the uniformly convex Banach spaces, where we will be able to extend the result of Proposition 2.4.7, that is, weak convergence and convergence of the norms imply the strong convergence. Definition 2.4.2. A Banach space (V , ·) is said to be uniformly convex if for any sequences (un )n∈N , (vn )n∈N in V with un = vn = 1 for all n ∈ N, the following implication holds: ) ) ) un + vn ) ) ) −→ 1 =⇒ u − v −→ 0. n n ) 2 ) n→+∞ This result reflects a geometrical property of the unit ball which has to be well rotund. Note that this definition is not stable when replacing a norm by an equivalent one. As an elementary example, one can observe that V = RN equipped with the norm x2 = 1/2 n x2 is uniformly convex, whereas the norms x1 = N |x | and x∞ = i =1 i i =1 i max1≤i ≤N |xi | are not uniformly convex. The uniform convexity of the norm expresses that if u and v are on the unit sphere, the fact that u+v is close to the sphere forces u and v to be close to each other. 2 Proposition 2.4.9. (a) The Hilbert spaces are uniformly convex. (b) The L p spaces, 1 < p < ∞ are uniformly convex. PROOF. (a) The uniform convexity of Hilbert spaces is a direct consequence of the parallelogram equality: ∀u, v ∈ V
u + v2 + u − v2 = 2(u2 + v2 ).
Notice that this property characterizes Hilbert spaces among general Banach spaces. So, let us take un , vn in V such that un = vn = 1. We have ) ) ) u + v )2 ) n n) 2 un − vn = 4 1 − ) ) . ) 2 ) So
un +vn −→ 1 2
forces un − vn to converge to zero as n → +∞.
i
i i
i
i
i
i
2.4. Weak topologies and weak convergences
book 2014/8/6 page 51 i
51
(b) The same type of argument works in L p spaces, 1 < p < ∞, when replacing the parallelogram identity by the so-called Clarkson’s inequalities; if · is the L p norm, one has to distinguish two cases: ) ) ) ) ) u − v )p ) u + v )p 1 p p ) +) ) ) if 2 ≤ p < ∞, ) 2 ) ≤ 2 (u + v ) ) 2 ) & p−1 ) ) ) % ) ) u − v )p ) u + v )p 1 ) +) ) ≤ (u p + v p ) ) if 1 < p ≤ 2 ) 2 ) ) 2 ) 2 with
1 p
+
1 p
= 1.
The following result justifies the introduction of the notion of uniform convexity in this section devoted to the weak convergence. Proposition 2.4.10. Let V be a uniformly convex Banach space. Then for any sequence (vn )n∈N in V the following implication holds: σ(V ,V ∗ )
·
vn −→ v and vn −→ v =⇒ vn −→ v. PROOF. Let us reduce ourselves to the case vn = v = 1. To that end, let us consider v v wn := vn and w := v . (The case v = 0 is obvious, so we can assume v = 0.) n
σ(V ,V ∗ )
One can notice that wn = 1 and wn −→ w: indeed, for any v ∗ ∈ V ∗ , v vn ∗ ∗ 〈v , wn − w〉= v , − vn v 1 1 1 ∗ = 〈v , vn − v〉 + − 〈v ∗ , v〉, vn vn v σ(V ,V ∗ )
which goes to zero as n → +∞, since vn −→ v and vn −→ v = 0. ) ) ∗ w +w σ(V ,V ) ) w +w ) Let us show that ) n2 ) → 1 as n → +∞. Since n2 −→ w, by the lower semicontinuity of the norm for the weak topology (Proposition 2.4.6(iii)), ) ) ) ) ) wn + w ) ) wn + w ) ) ) ) ) ≤ 1. ≤ lim sup ) 1 = w ≤ lim inf ) n 2 ) 2 ) n The last inequality follows from the triangle inequality and the fact that wn = w = 1. w +w So we have wn = w = 1 and n2 −→ 1. It follows from the uniform convexity ·
property that wn −→ w. Using once more that vn −→ v we derive that vn = vn wn norm converges to v = vw. Remark 2.4.3. It is a quite useful method, when proving that a sequence (un )n∈N is norm converging in a Hilbert space, or more generally in a uniformly convex Banach space, to prove first that the weak convergence holds and then to prove that the norms converge, too. For example, when minimizing the norm over a closed convex bounded subset of a uniformly convex Banach space, one automatically obtains that any minimizing sequence is norm convergent. The property for a Banach space to verify for any sequence (vn )n∈N in V the implicaσ(V ,V ∗ )
·
tion vn −→ v and vn −→ v =⇒ vn −→ v is often called the Kadek property.
i
i i
i
i
i
i
52
book 2014/8/6 page 52 i
Chapter 2. Weak solution methods in variational analysis
Remark 2.4.4. Let us observe that the Kadek property fails to be true in general Banach spaces. For example, it is false in the space L1 (Ω), Ω ⊂ RN equipped with the Lebesgue measure: indeed, take Ω = (0, π),
vn (x) = 1 + sin nx,
n = 1, 2, . . . .
π σ(L1 ,L∞ ) Then vn −→ v ≡ 1, vn 1 = 0 vn (x)d x = π + n1 (1 − cos nπ), so that vn 1 −→ π = n→∞ π v1 . But vn − v1 = 0 | sin nx|d x does not converge to zero as n → +∞. Indeed,
π
π/n
| sin nx|d x = n ·
0
sin nxd x = 2.
0
2.4.4 Weak compactness theorems in reflexive Banach spaces We have already observed that L p spaces enjoy quite different properties with respect to the weak convergence, depending on the two situations 1 < p < ∞ and p = 1, p = ∞. One can distinguish them by introducing the concept of uniform convexity, or local uniform convexity of the space as done in Section 2.4.3. But this is a geometrical concept related to the choice of the norm, and when dealing with topological concepts like compactness, one is naturally led to consider notions which are of topological nature (i.e., invariant by the choice of an equivalent norm). This is where the notion of reflexive Banach space plays a fundamental role. Let us first recall its definition. Let V be a Banach space, V ∗ its topological dual, and V ∗∗ its topological bidual equipped, respectively, with the norms · V = · ,
v ∗ ∗ = sup |〈v ∗ , v〉|, v≤1
v ∗∗ ∗∗ = sup |〈v ∗∗ , v ∗ 〉|. v ∗ ∗ ≤1
There exists a canonical embedding of V into V ∗∗ denoted by J : V −→ V ∗∗ , which is defined as follows: ∀v ∈ V , ∀v ∗ ∈ V ∗
〈J v, v ∗ 〉(V ∗∗ ,V ∗ ) = 〈v ∗ , v〉(V ∗ ,V ) .
Let us comment on this definition. For any v ∈ V , the mapping v ∗ ∈ V ∗ −→ 〈v ∗ , v〉(V ∗ ,V ) ∈ R is linear and continuous on V ∗ , so it defines uniquely an element of V ∗∗ which is denoted by J v. Let us observe that |〈v ∗ , v〉| ≤ v ∗ ∗ · v
∀ v ∈ V,
so that J v∗∗ ≤ v. Indeed, as a consequence of the Hahn–Banach theorem we have J v∗∗ = sup |〈v ∗ , v〉| = v, v ∗ ∗ ≤1
so that J is a linear isometry from V into V ∗∗ . As a consequence, J is an embedding of V into V ∗∗ . Definition 2.4.3. A Banach space V is said to be reflexive if J (V ) = V ∗∗ . When V is reflexive one can identify V and V ∗∗ with the help of J .
i
i i
i
i
i
i
2.4. Weak topologies and weak convergences
book 2014/8/6 page 53 i
53
Remark 2.4.5. J is a linear isometry. Thus it preserves the linear and the normed structures and allows us to identify V and V ∗∗ when J is onto, that is, in the case of reflexive Banach spaces. One has to pay attention to the following fact: the definition of reflexive Banach spaces says that the map J realizes an isometrical isomorphism between V and V ∗∗ . It is essential to use J in the definition since one can exhibit a nonreflexive Banach space V such that there exists an isometry from V onto V ∗∗ ! Proposition 2.4.11. Let V be a uniformly convex Banach space. Then V is reflexive. For a proof of this result, see, for instance, [361], [137]. When considering L p spaces, this result is in accordance with the results concerning the dual of L p spaces. When 1 < p < ∞, L p is uniformly convex, (L p )∗ = L p , where 1p + p1 = 1, so that (L p )∗∗ = (L p )∗ = L p (equalities above mean isometric isomorphisms).
Remark 2.4.6. Note that there exist reflexive Banach spaces which do not admit an equivalent norm which makes the space uniformly convex. However, one can always renorm a reflexive Banach space with a norm (equivalent) which is locally uniformly convex both with its dual norm. With this renorming, it will satisfy the Kadek property (as well as its dual); see Section 2.4.3. The importance of reflexive Banach spaces is justified by the following theorem. Theorem 2.4.2. (a) In a reflexive Banach space (V , · ) the closed unit ball B = {v ∈ V : vV ≤ 1} is compact for the topology σ(V ,V ∗ ). As a consequence, the bounded subsets of V are relatively compact for the topology σ(V ,V ∗ ). (b) The above property characterizes the reflexive Banach spaces: a Banach space is reflexive iff the closed unit ball is compact for the topology σ(V ,V ∗ ). PROOF. The proof is a direct consequence of the Banach–Alaoglu–Bourbaki theorem, Theorem 2.4.8. It makes use of the weak∗ topology on the dual of a Banach space. Let us now state a theorem from Eberlein and Smulian which states that from every bounded sequence in a reflexive Banach space one can extract a sequence which converges for the topology σ(V ,V ∗ ). This is an important and quite surprising result, since the weak topology is not metrizable. One does not expect to have such a sequential compactness result! Theorem 2.4.3. Let V be a reflexive Banach space. Then, from each bounded sequence (un )n∈N in V , one can extract a subsequence (unk )k∈N which converges for the topology σ(V ,V ∗ ). PROOF. (a) Let us first assume that V ∗ is separable, that is, there exists a countable set D = (vk∗ )k∈N which is dense in V ∗ . The proof relies on a diagonalization argument. First, let us notice that for all v ∗ ∈ V ∗ , |〈v ∗ , un 〉| ≤ v ∗ ∗ · un ≤ C v ∗ ∗ , where C = supn∈N un < +∞. So the sequence {〈v ∗ , un 〉 : n ∈ N} is bounded in R. For each v ∗ ∈ V ∗ , one can extract from the sequence {〈v ∗ , un 〉 : n ∈ N} a convergent subsequence. The difficult point is that without any further argument, the so extracted
i
i i
i
i
i
i
54
book 2014/8/6 page 54 i
Chapter 2. Weak solution methods in variational analysis
sequence depends on v ∗ . This is where the separability of V ∗ and the diagonalization argument take place. Let us start with v1∗ ∈ D (D dense countable subset of V ∗ ) and extract a convergent subsequence {〈v1∗ , uσ1 (n) 〉 : n ∈ N}, where σ1 : N → N is a strictly increasing mapping. In a similar way, the sequence {〈v2∗ , uσ1 (n) 〉 : n ∈ N} is bounded in R, so there exists a convergent subsequence {〈v2∗ , uσ1 ◦σ2 (n) 〉 : n ∈ N}. Let us iterate this argument by induction. We can so construct for each n ∈ N an increasing mapping σn : N −→ N such that the sequence {〈vn∗ , uσ1 ◦σ2 ◦···◦σn (k) 〉 : k ∈ N} is convergent in R. Note that it is important to have the composition of the mapping σi in the precise order σ1 ◦ · · · ◦ σn to have this subsequence extracted from all the previous ones, σ1 , σ1 ◦ σ2 , . . . , σ1 ◦ · · · ◦ σn−1 . Now the diagonalization argument consists of taking τ : N −→ N defined by τ(n) = (σ1 ◦ σ2 ◦ · · · ◦ σn )(n). In other words, τ(n) is the element of rank n of the subsequence which has been extracted at step n. Clearly τ is strictly increasing. It is important to notice that for each n ∈ N, the sequence (uτ(k) )k≥n is extracted from the sequence (uσ1 ◦···◦σn (k) )k≥n : indeed, for k ≥ n, uτ(k) = uσ1 ◦σ2 ◦···◦σk (k) = u(σ1 ◦···◦σn )( p) , where p = (σn+1 ◦ · · · ◦ σk )(k). Since k ≥ n and σi are strictly increasing, we also have p ≥ n. It follows that for each k ∈ N, the sequence (〈vk∗ , uτ(n) 〉 : n ∈ N), as a subsequence of a convergent sequence, is convergent. By the density of D in V ∗ , the result is easily extended to an arbitrary element v ∗ of ∗ V : for each > 0 take vk∗ ∈ such that v ∗ − vk∗ < . Then for each n, m ∈ N,
∗
|〈v , uτ(n) 〉 − 〈v
Hence,
∗
, uτ(m) 〉| ≤ |〈vk∗ , u(τn ) − u(τ m )〉| + |〈v ∗ − vk∗ , u(τn ) − u(τ m )〉| ≤ 2C + |〈vk∗ , u(τn ) − u(τ m )〉|.
lim sup |〈v ∗ , uτ(n) − uτ(m) 〉| ≤ 2C . n,m→∞
This being true for any > 0, we infer that the sequence {〈v ∗ , uτ(n) 〉 : n ∈ N} satisfies the Cauchy criteria and is thus convergent in R. Let us denote for all v ∗ ∈ V ∗ , L(v ∗ ) := limn→+∞ 〈v ∗ , uτ(n) 〉. Clearly, L is a linear continuous form on V ∗ ; note that by passing to the limit in the inequality |〈v ∗ , un 〉| ≤ C v ∗ ∗ , we also have |L(v ∗ )| ≤ C v ∗ ∗ . Hence L ∈ V ∗∗ . Let us stress the fact that up to this point, we have not used the reflexivity hypothesis. We now use that V is reflexive to assert that L = J (u) for some u ∈ V , where J is the canonical embedding of V into V ∗∗ . So for all v ∗ ∈ V ∗ we have lim 〈v ∗ , uτ(n) 〉 = 〈v ∗ , u〉,
n→+∞
that is, u = σ(V ,V ∗ ) limn→∞ uτ(n) .
i
i i
i
i
i
i
2.4. Weak topologies and weak convergences
book 2014/8/6 page 55 i
55
(b) Now take V a reflexive Banach space and do not make any separability assump¯ the closure tions. Take E the subspace of V generated by the (un )n∈N and define W = E, of E in (V , · ). It is easy to verify that W is reflexive and separable. This in turn implies that W ∗ is reflexive and separable (see [137, Corollary III.24]). We are now in the situation studied in the first part of the proof. One can extract a subsequence (uτ(n) )n∈N which converges in W for the topology σ(W ,W ∗ ). Since V ∗ ⊂ W ∗ (by restriction of the linear continuous forms on V to W ) we derive that (uτ(n) )n∈N is convergent for the topology σ(V ,V ∗ ).
2.4.5 The Dunford–Pettis weak compactness theorem in L1 (Ω) When 1 < p < ∞, the L p spaces are reflexive Banach spaces, and it follows from Theorem 2.4.3 that the relatively compact subsets of L p for the topology σ(L p , L p ) are exactly the p bounded subsets of L . The situation in the case p = 1 is very different (L1 is not a reflexive Banach space), and the comprehension of the weak convergence properties of bounded sequences in L1 is a subject of great importance and is quite involved. Let us first examine the following example. Take Ω = (−1, 1) equipped with the Lebesgue measure and take 1 1 ≤ x ≤ + 2n , n if − 2n vn (x) = 0 elsewhere. Clearly, the sequence (vn )n∈N satisfies ⎧ ⎪ ⎪ v ≥ 0, vn (x)d x = 1, ⎪ n ⎪ ⎪ Ω ⎨ v (x) −→ 0 for a.e. x ∈ Ω, ⎪ n ⎪ ⎪ ⎪ ⎪ ⎩ vn (x)z(x)d x −→ z(0) for any z ∈ C(Ω). Ω
n→+∞
We can observe that the sequence (vn )n∈N is bounded in L1 (Ω), but one cannot extract a weakly convergent subsequence in the sense σ(L1 , L∞ ). By contrast, we will see in the next section that one can interpret the convergence of the sequence (vn )n∈N with the help of a weak topology of dual σ(V ∗ ,V ), for example, σ( b (Ω), C0 (Ω)). For the moment, we just retain from this example that to obtain σ(L1 , L∞ ) compactness of a sequence of functions, it is not sufficient to assume that the sequence is bounded in L1 . This is where the notion of uniform integrability plays a central role. Definition 2.4.4. Let (Ω, , μ) be a measure space with μ a positive and finite measure (μ(Ω) < +∞). Let K be a subset of L1 (Ω, , μ). We say that K is uniformly integrable if (a) and (b) hold: (a) K is bounded in L1 (Ω, , μ); (b) for every > 0 there exists some δ() > 0 such that A ∈ , μ(A) < δ() =⇒ sup |v(x)|d μ(x) < . v∈K
A
It is sometimes convenient to consider the following criterion of uniform integrability (also called equi-integrability)
i
i i
i
i
i
i
56
book 2014/8/6 page 56 i
Chapter 2. Weak solution methods in variational analysis
Proposition 2.4.12. Let (Ω, , μ) be a measure space satisfying the hypotheses of Definition 2.4.4. Then K is uniformly integrable iff |v(x)| d μ(x) = 0. (2.45) lim sup R→+∞ v∈K
[|v|>R]
PROOF. Assume that (a) and (b) hold and consider R0 = δ1 supv∈K Ω |v(x)| d μ(x), which is finite according to (a). Let R ≥ R0 ; from 1 1 |v(x)| d μ(x) ≤ |v(x)| d μ(x) μ [|v| > R] ≤ R Ω R0 Ω we infer that μ [|v| > R] < δ so that, from (b), [|v|>R] |v(x)| d μ(x) < . Conversely, assume that (2.45) holds. For each A ∈ and each R > 0 one has |v(x)| d μ(x) = |v(x)| d μ(x) + |v(x)| d μ(x) A
A∩[|v|≤R]
A∩[|v|>R]
≤ Rμ(A) + sup v∈K
[|v|>R]
|v(x)| d μ(x).
(2.46)
Taking A = Ω in (2.46) gives (a). On the other hand, for all > 0, choose R large enough . Then so that supv∈K [|v|>R] |v(x)| d μ(x) < 2 , and take A in satisfying μ(A) < 2R (2.46) yields (b). A comprehensive characterization of this property is given by the De La Vallée–Poussin theorem. Theorem 2.4.4. Let (Ω, , μ) be a measure space with μ a positive and finite measure and K a subset of L1 (Ω, , μ). The following properties are equivalent: (i) K is uniformly integrable; (ii) there exists a function θ : [0, +∞[→ [0, +∞[ (θ can be taken convex and increasing) θ(s ) such that lim s →+∞ s = +∞ and sup θ(|v(x)|)d μ(x) < +∞. v∈K
Ω
PROOF. The implication (ii) =⇒ (i) is important for applications. Let us prove it. First, one can observe that since θ has a superlinear growth, for each M ∈ R+ there exists some C (M ) ∈ R+ such that ∀s ∈ R+
0≤s ≤
1 M
θ(s) + C (M ).
Let us fix M0 > 0. We have for each v ∈ K 1 |v(x)|d μ(x) ≤ θ(|v(x)|)d μ(x) + C (M0 )μ(Ω) M0 Ω Ω and hence supv∈K |v|d μ < +∞, which proves (a) of Definition 2.4.4.
i
i i
i
i
i
i
2.4. Weak topologies and weak convergences
book 2014/8/6 page 57 i
57
Let us now prove (b). Fix > 0; for any v ∈ K, A ∈ , and M ∈ R+ , 1 |v(x)|d μ(x) ≤ θ(|v(x)|)d μ(x) + C (M )μ(A) M A A 1 ≤ sup θ(|v|)d μ + C (M )μ(A). M v∈K Ω Take M () :=
2
sup
v∈K
Then if μ(A) ≤ δ() we have
Ω
|v(x)|d μ(x) ≤
sup v∈K
θ(|v|) d μ,
A
δ() :=
2
+
2
2C (M ())
.
= ,
which proves (b). For the proof of the reverse implication (i) =⇒ (ii), see, for instance, [198, Theorem 22] or [153, Theorem 2.12]. We are now ready to state the Dunford–Pettis theorem, which gives a characterization of the weak compactness property in L1 . Theorem 2.4.5 (Dunford–Pettis theorem). Let (Ω, , μ) be a measure space with μ a positive and finite measure. Let K be a subset of L1 (Ω, , μ). The following properties are equivalent: (i) K is relatively compact for the weak topology σ(L1 , L∞ ). (ii) K is uniformly integrable. (iii) From each sequence (vn )n∈N contained in K, one can extract a subsequence converging for the topology σ(L1 , L∞ ). PROOF. See [198, Theorem 25], [203, Theorem IV.8.9, Corollary IV.8.11], and [153]. When Ω is an open subset of RN and μ the Lebesgue measure on Ω, one can find a proof of implication (ii) =⇒ (iii) in Proposition 4.3.7 by using the notion of Young measures. Remark 2.4.7. As an illustration of the Dunford–Pettis theorem, let us consider (vn )n∈N a sequence in L1 (Ω, , μ), μ(Ω) < +∞ such that supn∈N Ω |vn | ln |vn |d μ < +∞. We claim that the sequence (vn ) is σ(L1 , L∞ ) relatively compact. To obtain this result, just use the De La Vallée–Poussin theorem with θ(r ) = r ln r (which is superlinear) and then use the Dunford–Pettis theorem. It is immediate that any dominated sequence (vn )n∈N in L1 (Ω, , μ), i.e., satisfying for a.e. x ∈ Ω, |vn (x)| ≤ g (x), where g is some function in L1 (Ω, , μ), is uniformly integrable. The following theorem extends the Lebesgue dominated convergence theorem for uniformly integrable sequences. Theorem 2.4.6 (Vitali convergence theorem). Let (Ω, , μ) be a measure space with μ a positive and finite measure. Let (vn )n∈N be a uniformly integrable sequence of functions in
i
i i
i
i
i
i
58
book 2014/8/6 page 58 i
Chapter 2. Weak solution methods in variational analysis
L1 (Ω, , μ) which converges a.e. to some function v ∈ L1 (Ω, , μ). Then (vn )n∈N strongly converges to v in L1 (Ω, , μ). PROOF. For each R > 0 one has |vn − v| d μ ≤ Ω
≤
[vn −v|≤R]
[vn −v|≤R]
|vn − v| d μ +
|vn −v|>R
|vn − v| d μ
|vn − v| d μ + sup
n∈N
|vn −v|>R
|vn − v| d μ.
(2.47)
Consider the continuous function φR : [0, +∞) → [0, R] defined by φR (t ) = min(t , R). From (2.47) we infer |vn − v| d μ ≤ φR (|vn − v|) d μ + sup |vn − v| d μ. (2.48) Ω
Ω
n∈N
|vn −v|>R
Since φR (|vn − v|) converges a.e. to 0 and is dominated by R, according to the Lebesgue dominated convergence theorem, we deduce that lim φR (|vn − v|) d μ = 0. n→+∞
Ω
On the other hand the sequence (vn − v)n∈N is clearly uniformly integrable; hence, from Proposition 2.4.12, lim sup
R→+∞ n∈N
|vn −v|>R
|vn − v| d μ = 0.
The conclusion then follows by letting first n → +∞, then R → +∞ in (2.48).
2.4.6 The weak∗ topology σ (V ∗ , V ) Let (V , · ) be a normed linear space with topological dual V ∗ . On V ∗ we have already defined two topologies: • The norm topology associated to the dual norm v ∗ ∗ = sup{|〈v ∗ , v〉| : v ∈ V , v ≤ 1}, which makes V ∗ a Banach space; • The weak topology σ(V ∗ ,V ∗∗ ), where V ∗∗ is the topological bidual of V . But this topology is often difficult to handle because the space V ∗∗ may have a rather involved structure (think, for example, V = L1 , V ∗ = L∞ , and V ∗∗ = (L∞ )∗ ), and moreover it may be too strong to enjoy desirable compactness properties. The idea is to introduce a topology weaker than σ(V ∗ ,V ∗∗ ) by considering a weak topology on V ∗ induced not by all the linear continuous forms on V ∗ but only by a subfamily. At this point, there is a natural candidate which consists in taking J (V ) ⊂ V ∗∗ , where J : V −→ V ∗∗ is the canonical embedding from V into its bidual V ∗∗ . Recall that (see Section 2.4.4) ∀v ∈ V , ∀v ∗ ∈ V ∗
〈J (v), v ∗ 〉(V ∗∗ ,V ∗ ) := 〈v ∗ , v〉(V ∗ ,V ) .
i
i i
i
i
i
i
2.4. Weak topologies and weak convergences
book 2014/8/6 page 59 i
59
Definition 2.4.5. Let V be a normed space with topological dual V ∗ . The weak∗ topology σ(V ∗ ,V ) on V ∗ is the weakest topology on V ∗ making continuous all the mappings (J (v))v∈V , where J is the canonical embedding from V into V ∗∗ : J (v) : V ∗ −→ R, v ∗ −→ 〈J (v), v ∗ 〉(V ∗∗ ,V ∗ ) = 〈v ∗ , v〉. Let us collect in the following proposition some first elementary properties of the topology σ(V ∗ ,V ). Proposition 2.4.13. Let V be a normed space and σ(V ∗ ,V ) the weak∗ topology on the topological dual V ∗ . Then (i) A local base of neighborhoods of v0∗ ∈ V ∗ for the topology σ(V ∗ ,V ) consists of all sets of the form N (v0∗ ) = {v ∗ ∈ V ∗ : |〈v ∗ − v0∗ , vi 〉| < ∀i ∈ I }, where I is a finite index set, vi ∈ V for each i ∈ I , and > 0. (ii) For any sequence (vn∗ )n∈N in V ∗ , we have σ(V ∗ ,V )
(a) vn∗ −→ v ∗ ⇐⇒ ∀ v ∈ V ·
〈vn∗ , v〉 −→ 〈v ∗ , v〉;
σ(V ∗ ,V )
(b) vn∗ −→ v ∗ =⇒ vn∗ −→ v ∗ . Assume now that V is a Banach space. Then σ(V ∗ ,V )
(c) vn∗ −→ v ∗ =⇒ the sequence (vn∗ ) is bounded and v ∗ ∗ ≤ lim inf vn∗ ∗ ; n
σ(V ∗ ,V )
·
(d) vn∗ −→ v ∗ and vn −→ v =⇒ 〈vn∗ , vn 〉 −→ 〈v ∗ , v〉. PROOF. (i) and (ii)(a) are direct consequences of the general properties of topologies induced by functions; see, respectively, Theorem 2.4.1 and Proposition 2.4.3. Then (ii)(b), (ii)(c), and (ii)(d) are obtained in a similar way as in the proof of Proposition 2.4.6. Just notice that to apply the uniform boundedness theorem, one needs to assume that V is a Banach space. (In Proposition 2.4.6 one works on V ∗ , which is always a Banach space!) An important example: V = C0 (Ω), V∗ = b (Ω). Let Ω be a locally compact topological space which is σ-compact (i.e., Ω can be written as Ω = n∈N Kn with Kn compact). For example, we may take Ω = RN , or Ω an open subset of RN , or Ω an arbitrary topological compact set. Take V = C0 (Ω) the linear space of real continuous functions on Ω which tend to zero at infinity; more precisely, we say that a continuous function u is in C0 (Ω) if for every > 0 there exists a compact set K such that |u(x)| < on Ω \ K . Notice that C0 (Ω) reduces to C(Ω) when Ω is compact. We may endow the space C0 (Ω) with the norm vV = sup |v(x)|. x∈Ω
Then V is a Banach space whose topological dual V ∗ can be described thanks to the celebrated result below.
i
i i
i
i
i
i
60
book 2014/8/6 page 60 i
Chapter 2. Weak solution methods in variational analysis
Theorem 2.4.7 (Riesz–Alexandroff representation theorem). The topological dual of C0 (Ω) can be isometrically identified with the space of bounded Borel measures. More precisely, to each bounded linear functional Φ on C0 (Ω) there is a unique Borel measure μ on Ω such that for all f ∈ C0 (Ω), Φ( f ) =
Ω
f (x)d μ(x).
Moreover, φ = |μ|(Ω). For a proof of this theorem, see, for instance, [331, Theorem 6.19] or [143, Theorem 1.4.22]. Notice that this theorem holds true also when Ω is not supposed to be σ-compact, but in that case one has to consider measures μ which are regular. We recall that a Borel measure μ ≥ 0 is said to be regular if ∀B ∈ (Ω) μ(B) = inf{μ(V ) : V ⊃ B,V open}, ∀B ∈ (Ω) μ(B) = sup{μ(K) : K ⊂ B, K compact}, and a signed measure μ is said to be regular if its total variation measure |μ| is regular. When Ω is σ-compact, this property is automatically satisfied by Borel measures which are bounded. Note also that since Cc (Ω) is dense in C0 (Ω), these two spaces have the same topological dual. We prefer to consider C0 (Ω) in this construction because it is a Banach space for the sup norm. Thus a bounded Borel measure μ can be as well considered as a σ-additive set function on the Borel σ-algebra (that’s the probabilistic approach) or as a continuous linear form on C0 (Ω) or Cc (Ω) (that’s the functional analysis approach). Given a sequence (μn )n∈N of bounded Borel measures, we can consider these measures as elements of the topological dual space V ∗ = b (Ω) of V = C0 (Ω). This leads to the following definition. Definition 2.4.6. (i) A sequence (μn )n∈N ⊂ b (Ω) converges weakly to μ ∈ b (Ω), and we write μn −→ μ provided
Ω
in b (Ω)
ϕ d μn −→
Ω
ϕ dμ
as n → ∞
for each ϕ ∈ Cc (Ω). (ii) A sequence (μn ) ⊂ b (Ω) converges σ( b , C0 ) to μ ∈ b (Ω), and we write μn provided
σ( b ,C0 )
−→ μ,
Ω
ϕ d μn −→
Ω
ϕ dμ
for each ϕ ∈ C0 (Ω). The relation between these two close concepts is given by the following result. Proposition 2.4.14. Given (μn )n∈N ⊂ b (Ω), μ ∈ b (Ω) one has the equivalence μn
σ( b ,C0 )
−→ μ ⇐⇒ μn −→ μ
and
sup |μn |(Ω) < +∞.
n∈N
i
i i
i
i
i
i
2.4. Weak topologies and weak convergences
book 2014/8/6 page 61 i
61 σ( b ,C0 )
PROOF. Since Cc (Ω) ⊂ C0 (Ω) the implication μn −→ μ =⇒ μn −→ μ is clear. Moreover, since b (Ω) = V ∗ with V = C0 (Ω) which is a Banach space, the uniform boundedness theorem implies (see Proposition 2.4.13(ii)(c)) μn
σ( b ,C0 )
−→ μ =⇒ sup μn < +∞,
that is, supn∈N |μn |(Ω) < +∞. The converse statement follows from a density argument which is similar to the one developed in Proposition 2.4.8. (Note that Cc (Ω) is dense in C0 (Ω).) Corollary 2.4.1. On any bounded subset of b (Ω) there is the equivalence μn
σ( b ,C0 )
−→ μ ⇐⇒ μn −→ μ.
Let us now return to the general abstract properties of the weak∗ topologies and state the following compactness theorem, which explains the importance of these topologies. Theorem 2.4.8 (Banach–Alaoglu–Bourbaki). Let V be a normed linear space. Then the unit ball BV ∗ = {v ∗ ∈ V ∗ : v ∗ ∗ ≤ 1} of the topological dual V ∗ is compact for the topology σ(V ∗ ,V ). PROOF. An element v ∗ ∈ V ∗ is a function from V into R. Let us write briefly RV for the set of all functions from V into R and denote by i the canonical embedding i : V ∗ −→ RV , v ∗ −→ {〈v ∗ , v〉}v∈V . When v ∗ ∗ ≤ 1, we have indeed |〈v ∗ , v〉| ≤ v, so . [−v, +v] := Y, i : BV ∗ −→ v∈V
i(v ∗ ) = {〈v ∗ , v〉}v∈V . Let us endow Y with the product topology, which is the weakest topology on Y making all the projections continuous. This topology induces on i(BV ∗ ) the weak∗ topology σ(V ∗ ,V ); this is exactly the way it has been defined. The topological space Y , which is a product of compact spaces and which is equipped with the product topology, is compact; this is the compactness Tikhonov theorem. So, we just have to verify that i(BV ∗ ) is closed in Y . This is clear since for all generalized sequence (vν∗ )ν∈I , the convergence of i(vν∗ ) for the product topology 〈vν∗ , v〉 −→ Φ(v) ∈ Y implies that Φ is still linear and |Φ(v)| ≤ v, so that Φ(v) = 〈v ∗ , v〉 for some v ∗ ∈ BV ∗ and i(vν∗ ) −→ i(v ∗ ). To obtain a weak∗ sequential compactness result on BV ∗ we need to assume a separability condition on V .
i
i i
i
i
i
i
62
book 2014/8/6 page 62 i
Chapter 2. Weak solution methods in variational analysis
Let us recall that a topological space V is said to be separable if there exists a dense countable subset of V . Typically that is the case of spaces C0 (Ω), L p (Ω) for 1 ≤ p < +∞ but not L∞ (Ω). Theorem 2.4.9. Let V be a separable normed space. Then the unit ball BV ∗ of V ∗ is metrizable for the topology σ(V ∗ ,V ). Before proving Theorem 2.4.9 let us formulate the following important result, which is a direct consequence of Theorems 2.4.8 and 2.4.9. Corollary 2.4.2. Let V be a separable normed linear space and (vn∗ )n∈N a bounded sequence in V ∗ . Then one can extract a subsequence (vn∗ )k∈N which converges for the topology k σ(V ∗ ,V ). PROOF OF THEOREM 2.4.9. Let (vn )n≥1 be a dense countable subset of the unit ball BV of V (which exists since V is assumed to be separable; note that separability is a hereditary property). Then define on the unit ball BV ∗ of V ∗ the following distance d : ∀ u ∗ , v ∗ ∈ BV ∗ d (u ∗ , v ∗ ) =
∞ 1 n=1
2n
|〈u ∗ − v ∗ , vn 〉|.
Let us verify that the topology associated to the distance d coincides with the weak∗ topology σ(V ∗ ,V ) on BV ∗ . This can be done with the help of neighborhoods or by using generalized sequences (nets): one has to verify that for an arbitrary net (vi∗ )i ∈I contained in BV ∗ , σ(V ∗ ,V ) vi∗ −→ v ∗ ⇐⇒ d (vi∗ , v) −→ 0. I
It is easy to verify that since supi ∈I vi∗ ≤ 1, d (vi∗ , v) −→ 0 ⇐⇒ ∀k ∈ N I
〈vi∗ , vk 〉 −→ 〈v ∗ , vk 〉. I
So we have to verify that ∀v ∈ BV 〈vi∗ , v〉 −→ 〈v ∗ , v〉 ⇐⇒ ∀k ∈ N 〈vi∗ , vk 〉 −→ 〈v ∗ , vk 〉. This is exactly the same argument as in Proposition 2.4.8, where one just uses nets instead of sequences. Back to weak∗ convergence of measures. Let us denote by Ω a locally compact topological metrizable space which is σ-compact. Recallthat V = C0 (Ω) is separable: one can first notice that Cc (Ω) is dense in C0 (Ω), Cc (Ω) = n C(Kn ), where Kn are compact, because of the σ-compactness assumption. Then observe that C(K) is separable; this can be obtained as a consequence of the Stone–Weierstrass theorem. Note that the metrizability of Ω is equivalent to the separability of C(K) (see [143, Theorem 2.3.29]). We can now reformulate Corollary 2.4.2 in the case of sequences of measures. Proposition 2.4.15. Let Ω be a locally compact, metrizable, σ-compact topological space. Then, from any bounded sequence of Borel measures (μn )n∈N on Ω, i.e., verifying sup |μn |(Ω) < +∞,
n∈N
i
i i
i
i
i
i
2.4. Weak topologies and weak convergences
book 2014/8/6 page 63 i
63
one can extract a subsequence (μnk )k∈N which is σ( b , C0 ) convergent to some bounded Borel measure μ: ∀ϕ ∈ C0 (Ω) ϕ d μnk −→ ϕ d μ. k→+∞
Ω
Ω
As a particular important situation where the result above can be directly applied let us mention the following. Corollary 2.4.3. Let Ω be an open subset of RN equipped with the Lebesgue measure d x and ( fn )n∈N a sequence of functions which is bounded in L1 (Ω), i.e., supn∈N Ω | fn (x)|d x < +∞. Then there exists a subsequence ( fnk )k∈N and a bounded Borel measure μ on Ω such that
∀ϕ ∈ '0 (Ω)
Ω
ϕ(x) fnk (x)d x −→
k→+∞
Ω
ϕ(x)d μ(x).
PROOF. Apply Proposition 2.4.15 to the sequence μn = fn d x and note that |μn |(Ω) = | f | d x. Ω n Remark 2.4.8. (a) The above result, which allows us to extract from any bounded sequence ( fn )n∈N in L1 (Ω) a subsequence which is σ( b , C0 ) convergent to a bounded Borel measure μ, relies on the embedding of L1 into b (Ω) which is a dual, namely, of C0 . This is a general method which consists, when one has some estimations on a sequence (vn )n∈N in some space X , to embed X → Y ∗ , where Y ∗ is the topological dual of σ(Y ∗ ,Y )
some (separable) normed space Y . Then one can extract a subsequence vnk −→ y ∗ , the limit y ∗ belonging to Y ∗ . (b) As an example of application of Corollary 2.4.3, take the sequence (vn )n∈N defined at the beginning of Section 2.4.5; one has vn
σ( b ,C0 )
−→ δ0 ,
where δ0 is the Dirac measure at the origin. (c) As a counterpart of its generality, the information given by a weak∗ convergence σ( b ,C0 )
σ(L1 ,L∞ )
fn d x −→ μ or even fn −→ f is often not sufficient to analyze some situations, for example, occurring in the study of some nonlinear PDEs. To treat such situations, we will introduce the concept of Young measures in Chapter 4.
i
i i
i
i
i
i
book 2014/8/6 page 65 i
Chapter 3
Abstract variational principles
The introduction in Chapter 2 of a weak formulation of the model examples (Dirichlet problem, Stokes system) leads to study of the following problem. Given a linear vector space V , a bilinear symmetrical form a : V × V −→ R, and a linear form L : V −→ R, find u ∈ V such that a(u, v) = L(v) ∀ v ∈ V .
(3.1)
When a is positive, this turns out to be equivalent to the following minimization problem: find u ∈ V such that J (u) ≤ J (v),
(3.2)
where J (v) := 12 a(v, v) − L(v). In this chapter, we introduce the topological and geometrical concepts which allow us to solve this kind of problem and much more.
3.1 The Lax–Milgram theorem and the Galerkin method 3.1.1 The Lax–Milgram theorem In this section, V is a Hilbert space equipped with the scalar product 〈·, ·〉 and the associated norm: ∀v ∈ V
v2 = 〈v, v〉.
Let us recall the celebrated Riesz theorem. Theorem 3.1.1 (Riesz). Let V be a Hilbert space and L ∈ V ∗ a linear continuous form on V . Then there exists a unique f ∈ V such that ∀ v ∈V
L(v) = 〈 f , v〉.
Notice that given f ∈ V , the linear form L f defined by L f (v) = 〈 f , v〉 65
i
i i
i
i
i
i
66
book 2014/8/6 page 66 i
Chapter 3. Abstract variational principles
satisfies (by application of the Cauchy–Schwarz inequality) |L f (v)| ≤ f v and hence
L f ∗ ≤ f ,
where L f ∗ is the dual norm of the continuous linear form L f . On the other hand, by taking v = 1f f (if f = 0) we obtain L f ∗ ≥ f . The Riesz theorem tells us that the linear isometrical embedding f → L f from V into V ∗ is onto. So V and V ∗ can be identified both as vector spaces and as Hilbert spaces. Note that it is not completely correct to say that the topological dual of V is V itself! The Riesz theorem tells us that the topological dual of V , that is, V ∗ , is isometric to V and describes how any element of V ∗ can be uniquely represented with the help of an element of V : the mapping f ∈ V → L f ∈ V ∗ is an isometrical isomorphism from V onto V ∗ . So we will often identify V ∗ with V . But one may imagine other representations of V ∗ . We will illustrate this when describing the dual of the Sobolev space H01 (Ω). An important situation where one has to be careful with such identifications is when we have an embedding (i linear continuous, V dense in H ) i
V → H of two Hilbert spaces, (V , (·, ·)) and (H , 〈·, ·〉). Clearly, any linear continuous form L on H when “restricted” to V (indeed, L|V = L ◦ i) defines a linear continuous form on V . So H ∗ ⊂ V ∗ and the mapping L ∈ H ∗ → L|V ∈ V ∗ is one to one because V is dense in H . When identifying H and H ∗ we have the usual “triplet” V → H → V ∗ . But now we cannot also identify V and V ∗ because we should end with the conclusion that H → V ! Thus one has to choose one identification. One cannot identify both (H and H ∗ ) and (V and V ∗ ). A typical situation is V = H01 (Ω) → L2 (Ω) → H −1 (Ω) = V ∗ . This leads to a representation of the dual of the Hilbert space V = H01 (Ω) which is different from the Riesz representation. The Riesz representation theorem will play a key role when establishing the following theorem. Theorem / 3.1.2 (Lax–Milgram). Let V be a Hilbert space with the scalar product 〈·, ·〉 and · = 〈·, ·〉 the associated norm. Let a : V × V −→ R be a bilinear form which satisfies (i) and (ii): (i) a is continuous, that is, there exists a constant M ∈ R+ such that ∀u, v ∈ V
|a(u, v)| ≤ M u · v;
i
i i
i
i
i
i
3.1. The Lax–Milgram theorem and the Galerkin method
book 2014/8/6 page 67 i
67
(ii) a is coercive, that is, there exists a constant α > 0 such that ∀v ∈ V
a(v, v) ≥ αv2 .
Then for any L ∈ V ∗ (L is a linear continuous form on V ) there exists a unique u ∈ V such that a(u, v) = L(v) ∀v ∈ V . Remark 3.1.1. Let us make some complements to the discussions above. (a) Before proving the Lax–Milgram theorem, let us notice that it contains as a particular case the Riesz representation theorem. Take a(u, v) = 〈u, v〉 and verify (i) and (ii). By the Cauchy–Schwarz inequality we have |a(u, v)| ≤ u v. Hence a is continuous (take M = 1). Moreover, a(v, v) = 〈v, v〉 = v2 and a is coercive (take α = 1). So, for any L ∈ V ∗ there exists a unique u ∈ V such that L(v) = 〈u, v〉
∀ v ∈ V,
and this is the Riesz representation theorem. (b) One can easily verify that for a bilinear form, the continuity property is equivalent to the existence of some constant M ≥ 0 such that |a(u, v)| ≤ M u v.
(3.3)
Let us first verify that if (3.3) is satisfied, then a is continuous: take un −→ u and vn −→ v. Then a(un , vn ) − a(u, v) = a(un , vn ) − a(un , v) + a(un , v) − a(u, v) = a(un , vn − v) + a(un − u, v). It follows that |a(un , vn ) − a(u, v)| ≤ M un · vn − v + M un − u · v.
(3.4)
The sequence (un )n∈N being norm convergent is bounded in V and there exists a constant C ≥ 0 such that supn un ≤ C . Returning to (3.4), |a(un , vn ) − a(u, v)| ≤ M C vn − v + v · un − u , which implies that limn a(un , vn ) = a(u, v). Conversely, let us assume that a is a bilinear continuous form on V × V . Since a(0, 0) = 0 and a is continuous at (0, 0), for any > 0 there exists some η() > 0 such that u ≤ η() and v ≤ η() =⇒ a(u, v)| ≤ . Take u, v arbitrary elements of V , u = 0, v = 0. Then ) ) ) ) ) η() ) ) η() ) ) ) ) ) u ) ≤ η() and ) v ) ≤ η(). ) ) u ) ) v ) Hence
η() η() u, v ≤ , a u v
i
i i
i
i
i
i
68
book 2014/8/6 page 68 i
Chapter 3. Abstract variational principles
which implies ∀ u, v = 0
|a(u, v)| ≤
η()2
u · v.
This is still true if u or v is the zero element of V . So, one can take M = /η2 (). PROOF OF THE LAX–MILGRAM THEOREM. When establishing a weak formulation for some partial differential equations or systems Au = f
(3.5)
(for example, Au = −Δu for the Dirichlet problem with prescribed boundary data contained in the domain of A), we have been led to study problems of the form a(u, v) = L(v)
∀ v ∈ V.
(3.6)
Indeed, we are going to reconstruct an abstract equation (3.5) from (3.6). The major interest of this reverse operation is that now we are able to formulate precisely the topological and geometrical properties of the operator A. Let us first apply the Riesz theorem to L: there exists some f ∈ V such that L(v) = 〈 f , v〉
∀v ∈ V .
(3.7)
For any fixed u ∈ V , the mapping v → a(u, v) is a continuous linear form on V ; note that |a(u, v)| ≤ M u v ∀v ∈ V . Applying once more the Riesz theorem, there exists a unique element, which we denote A(u) ∈ V , such that ∀v ∈ V a(u, v) = 〈A(u), v〉. The mapping u → A(u) from V into V is linear: given u1 , u2 belonging to V 〈A(u1 + u2 ), v〉 = a(u1 + u2 , v) = a(u1 , v) + a(u2 , v) = 〈A(u1 ), v〉 + 〈A(u2 ), v〉 = 〈A(u1 ) + A(u2 ), v〉 Hence, Similarly,
∀ v ∈ V.
A(u1 + u2 ) = A(u1 ) + A(u2 ). ∀ λ ∈ R, ∀ u ∈ V
A(λu) = λA(u).
So, our problem can be reformulated as follows: find u ∈ V such that 〈A(u), v〉 = 〈 f , v〉 that is,
∀v ∈ V ,
Au = f .
(3.8)
Let us reformulate in terms of A the properties of the bilinear form a(·, ·): (a) since a is bilinear we have that A : V −→ V is a linear mapping;
(3.9)
i
i i
i
i
i
i
3.1. The Lax–Milgram theorem and the Galerkin method
book 2014/8/6 page 69 i
69
(b) A is continuous: for any u, v ∈ V 〈Au, v〉 = a(u, v) ≤ M u v. Taking v = Au we obtain
Au2 ≤ M u Au,
which implies Au ≤ M u.
(3.10)
This expresses that A is a linear continuous operator from V into V with AL(V ,V ) ≤ M . (c) A is coercive in the following sense: there exists some α > 0 such that ∀v ∈ V
〈Av, v〉 ≥ αv2 .
(3.11)
To solve (3.8) we formulate it as a fixed point problem. Let λ be some strictly positive parameter. Clearly, to solve (3.8) is equivalent to finding u ∈ V such that u − λ(Au − f ) = u.
(3.12)
In other words, we are looking for a fixed point u ∈ V of the mapping gλ : V −→ V given by (3.13) gλ (v) = v − λ(Av − f ). Let us prove that with λ adequately chosen, the mapping gλ satisfies the condition of the Banach fixed point theorem, which we recall now. Theorem 3.1.3 (Banach fixed point theorem—Picard iterative method). Let (X , d ) be a complete metric space and g : X −→ X be a Lipschitz continuous mapping with a Lipschitz constant k strictly less than one, i.e., ∀ x, y ∈ X
d ( g (x), g (y)) ≤ k d (x, y).
Then, there exists a unique x¯ ∈ X such that g (¯ x ) = x¯ . Moreover, for any x0 ∈ X , the sequence (xn ) starting from x0 with xn+1 = g (xn ) for all n ∈ N converges to x¯ as n goes to +∞. PROOF OF LAX–MILGRAM THEOREM CONTINUED. Take v1 , v2 ∈ V . Then gλ (v2 ) − gλ (v1 ) = [v2 − λ(Av2 − f )] − [v1 − λ(Av1 − f )] = (v2 − v1 ) − λA(v2 − v1 ). Let us denote v = v2 − v1 . So gλ (v2 ) − gλ (v1 ) = v − λAv. To majorize this quantity, we consider its square and take advantage of the Hilbertian structure of V : gλ (v2 ) − gλ (v1 )2 = v − λAv2 = v2 − 2λ〈Av, v〉 + λ2 Av2 .
i
i i
i
i
i
i
70
book 2014/8/6 page 70 i
Chapter 3. Abstract variational principles
By using (3.10) and (3.11) (note that we have assumed λ > 0), we obtain gλ (v2 − gλ (v1 )2 ≤ (1 − 2λα + λ2 M 2 ) v2 .
(3.14)
So, the question is to find some λ > 0 such that 1 − 2λα + λ2 M 2 < 1. Take λ¯ for which the quantity 1 − 2αλ + λ2 M 2 is minimal, that is, λ¯ = α/M 2 , in which case 2 ¯ + λ¯2 M 2 = 1 − α < 1. 1 − 2λα M2
Hence
0 1 2 α2 gλ¯(v2 ) − gλ¯ (v1 ) ≤ 1 − 2 v2 − v1 , M
(3.15)
and kλ¯ = 1 − α2 /M 2 is strictly less than one (note that α > 0). So gλ¯ has a unique fixed point u¯ ; equivalently, the equation Au = f has a unique solution u¯. Remark 3.1.2. (a) One of the main advantages of the proof above is that since it relies on the Banach fixed point theorem, it is a constructive proof. (b) A second advantage is that it can be easily extended to nonlinear equations: solve Au = f , where A : V −→ V satisfies ∃M ≥ 0 such that ∀u, v ∈ V ∃α > 0 such that ∀u, v ∈ V
Au − Av ≤ M u − v;
〈Au − Av, u − v〉 ≥ αu − v2 .
(c) Another approach consists of proving that A is onto, that is, R(A) = V . To that end, one first establishes that (i) R(A) is closed: for this one can first notice that ∀v ∈ V
αv2 ≤ 〈Av, v〉 ≤ Av · v
and hence αv ≤ Av. If Avn −→ z, we have αvn − v m ≤ A(vn − v m ) = Avn − Av m and (vn ) is a Cauchy sequence in V . Hence vn −→ v for some v ∈ V and by continuity of A, Avn −→ Av. Consequently z = Av. (ii) R(A) is dense: if z ∈ R(A)⊥ , then 〈Av, z〉 = 0 ∀ v ∈ V . Take v = z to conclude z = 0. Hence R(A)⊥ = {0}, that is, R(A) = V .
i
i i
i
i
i
i
3.1. The Lax–Milgram theorem and the Galerkin method
book 2014/8/6 page 71 i
71
3.1.2 The Galerkin method A quite natural idea when considering an infinite dimensional (variational) problem is to approximate it by finite dimensional problems. This has important consequences both from the theoretical (existence, etc.) and the numerical point of view. In this section, we consider the situation corresponding to the Lax–Milgram theorem, and by using the Galerkin method we will both provide another proof of the existence of a solution and describe a corresponding approximation numerical schemes. We stress the fact that this type of finite dimensional approximation method is very flexible and can be applied (we will illustrate it further in various situations) to a large number of linear or nonlinear problems. Definition 3.1.1. Let V be a Banach space. A Galerkin approximation scheme is a sequence (Vn )n∈N of finite dimensional subspaces of V such that for all v ∈ V , there exists some sequence (vn )n∈N with vn ∈ Vn for all n ∈ N and (vn )n∈N norm converging to v. This approximation property can be reformulated as ∀v ∈ V
lim dist(v,Vn ) = 0,
n→+∞
where dist(v,Vn ) = infw∈Vn v − w. Proposition 3.1.1. Let V be a separable Banach space. Then one can construct a Galerkin approximation scheme (Vn ) by the following method: (i) take (un )n∈N a countable dense subset of V (the separability of V just expresses that such set exists); (ii) let Vn = span {u1 , u2 , . . . , un }. Then (Vn )n∈N is a Galerkin scheme. PROOF. Let v ∈ V . By (i), there exists a mapping k → n(k) from N into N such that un(k) − v ≤ k1 for all k ∈ N∗ . For k fixed, un(k) ∈ Vn(k) and hence dist(v,Vn(k) ) ≤
1 k
.
Since Vn ⊃ Vn(k) for n ≥ n(k), we obtain ∀n ≥ n(k)
dist(v,Vn ) ≤ dist(v,Vn(k) ) ≤
1 k
,
that is, dist(v,Vn ) −→ 0 as n → +∞. Remark 3.1.3. (a) The vectors u1 , . . . , un need not to be linearly independent. By a classical linear algebra argument, one can replace in Proposition 3.1.1 the sequence (un )n∈N by a sequence (wn )n∈N , wn ∈ V made by linearly independent vectors. (b) In Proposition 3.1.1 the sequence of subspaces (Vn )n∈N satisfies • V1 ⊂ V2 ⊂ V3 ⊂ · · · ⊂ Vn ⊂ · · · is an increasing sequence of finite dimensional subspaces; • Un∈NVn = V .
i
i i
i
i
i
i
72
book 2014/8/6 page 72 i
Chapter 3. Abstract variational principles
The Galerkin approach to the Lax–Milgram theorem. We now suppose that V is a separable Hilbert space, a : V × V −→ R is a bilinear, continuous, coercive form, L : V −→ R is a linear, continuous form. We want to study the following problem: find u ∈ V such that a(u, v) = L(v)
∀ v ∈ V.
(3.16)
Since V is separable, by Proposition 3.1.1 there exists a Galerkin scheme (Vn )n∈N with Vn increasing with n ∈ N. Consider the approximated problems find un ∈ Vn such that (3.17) a(un , v) = L(v) ∀ v ∈ Vn . Problem (3.17) can be equivalently reformulated as An un = fn ,
(3.18)
where An is the linear operator from Vn −→ Vn such that a(u, v) = 〈An u, v〉 and
L(v) = 〈 fn , v〉
∀ u, v ∈ Vn ∀ v ∈ Vn .
This is exactly the same argument as in the proof of the Lax–Milgram theorem except that now we work on finite dimensional spaces, which makes the existence of (un )n∈N very easy: since ker An = 0 (which follows from the coercivity of An ), then An is onto. Note the basic difference with the infinite dimensional situation where such an argument is false! The question now is to study the convergence of the sequence (un )n∈N . In (3.17) take v = un so that αun 2 ≤ a(un , un ) = L(un ) ≤ L∗ un and un ≤
L∗
. (3.19) α The sequence (un )n∈N is bounded and hence weakly relatively compact, that is, there exists a subsequence (unk ) and some u ∈ V such that 〈unk , v〉 −→ 〈u, v〉 ∀v ∈ V .
(3.20)
w−V
We write unk −→ u. (See Theorem 2.4.3 with a direct independent proof of this result in separable Hilbert spaces.) Given v ∈ V m , we have Vnk ⊃ V m for k sufficiently large and hence a(unk , v) = L(v) (3.21) for all k sufficiently large. Then notice that w−V
unk −→ u =⇒ a(unk , v) −→ a(u, v) as k −→ +∞. This follows from the representation of the linear continuous form u → a(u, v) on V , where a(u, v) = 〈u, At v〉 (At denotes the adjoint of A). Hence w−V
unk −→ u =⇒ a(unk , v) = 〈unk , At v〉 −→ 〈u, At v〉 = a(u, v).
i
i i
i
i
i
i
3.1. The Lax–Milgram theorem and the Galerkin method
book 2014/8/6 page 73 i
73
So, when passing to the limit in (3.21), we obtain that, given v ∈ V m , a(u, v) = L(v). Hence a(u, v) = L(v) for every v ∈ a(u, ·) and L(·) we finally infer
m∈N
V m . Since
m∈N
V m = V by continuity of
a(u, v) = L(v) ∀ v ∈ V . Since u is the unique solution of this problem, by a classical compactness argument, the whole sequence (un )n∈N weakly converges to u. Indeed, one can prove that the sequence (un )n∈N norm converges to u. This is explained below with an explicit bound on un −u. Proposition 3.1.2 (Cea lemma). Let V be a separable Hilbert space and (Vn )n∈N a Galerkin scheme. Suppose a(un , v) = L(v) ∀ v ∈ Vn , un ∈ Vn , and
a(u, v) = L(v) u ∈ V,
∀ v ∈ V,
where a and L satisfy the assumptions of the Lax–Milgram theorem. Then un − u ≤
M α
dist(u,Vn ).
PROOF. Subtracting (3.16) from (3.17), we obtain a(u − un , v) = 0
∀v ∈ Vn ,
and in particular a(u − un , un ) = 0. It follows, for every v ∈ Vn , a(u − un , u − un ) = a(u − un , u − v) + a(u − un , v − un ) = a(u − un , u − v). Let us now use the continuity and coercivity property of a, α u − un 2 ≤ M u − un · u − v, to obtain αu − un ≤ M u − v Hence u − un ≤
M α
∀ v ∈ Vn .
dist (u,Vn ).
Since (Vn )n∈N is a Galerkin scheme, dist(u,Vn ) −→ 0 as n −→ +∞, and (un )n∈N norm converges to u.
i
i i
i
i
i
i
74
book 2014/8/6 page 74 i
Chapter 3. Abstract variational principles
3.2 Minimization problems: The topological approach As a corollary of the Lax–Milgram theorem, we have obtained that if a : V × V −→ R is bilinear, continuous, coercive, and symmetric, then, for any L ∈ V ∗ , there exists a unique solution to the minimization problem: find u ∈ V such that J (u) ≤ J (v)
∀v ∈ V ,
where J (v) = 12 a(v, v) − L(v). Let us observe that J : V −→ R is convex, continuous, and coercive. By coercive, we mean that limv−→+∞ J (v) = +∞, which follows easily from the inequality J (v) ≥
α 2
v2 − L∗ · v.
Indeed, we will prove in Section 3.3.2 the following general result, which contains as a particular case the above situation (Theorem 3.3.4). Let J : V −→ R ∪ {+∞} be a real-extended valued function on a reflexive Banach space V , which is convex, lower semicontinuous, and coercive. Then there exists at least one u ∈ V such that J (u) ≤ J (v)
∀ v ∈ V.
Before proving this theorem, in this section we will successively examine its basic ingredients. We will first justify the introduction of extended real-valued functions. Then, we will state the Weierstrass minimization theorem, which is purely topological, and in the process we will study the notions of lower semicontinuity, inf-compactness, and the interplay between inf-compactness, coercivity, and the role of the weak topology. So doing, we will be able to explain why convexity plays an important role in such questions.
3.2.1 Extended real-valued functions A main reason for introducing extended real-valued functions is that they provide a natural and flexible modelization of minimization (or maximization) problems with constraints. Since in this chapter we consider minimization problems, we just need to consider functions f : X −→ R ∪ {+∞}. On the other hand, if one considers maximization problems, one needs to introduce functions f : X −→ R∪{−∞}. If maximization and minimization are both involved, just like ¯ in saddle value problems, one needs to work with functions f : X −→ R. The (effective) domain of a function f : X −→ R ∪ {+∞} is the set dom f = {x ∈ X : f (x) < +∞}. The function f is said to be proper if dom f = . Let us briefly justify the introduction of extended real-valued functions. Most minimization problems can be written as min{ f0 (x) : x ∈ C },
(3.22)
where f0 : X −→ R is a real-valued function and C ⊂ X is the set of constraints. In economics, C describes the available resources, the possible productions of a firm, or a set of decisions, and f0 is the corresponding cost or economical criteria. In physics, the
i
i i
i
i
i
i
3.2. Minimization problems: The topological approach
book 2014/8/6 page 75 i
75
configurations x of the system are subject to constraints (unilateral or bilateral) and f0 , for example, is the corresponding energy. A natural way to solve such problems is to approach them by penalization. For example, let us introduce a distance d on X and, for any positive real number k, consider the minimization problem (3.23) min{ f0 (x) + k d (x, C ) : x ∈ X }, where
d (x, C ) = inf{d (x, y) : y ∈ C }
(3.24)
is the distance function from x to C . Note that the penalization term is equal to zero if x ∈ C (that is, if the constraint is fulfilled), and when x ∈ C (that is, if the constraint is violated) it takes larger and larger values which are increasing to +∞ with k. Let us also notice that the approximated problem (3.23) can be written as min{ fk (x) : x ∈ X }, where
fk (x) = f0 (x) + k d (x, C )
is a real-valued function. Then the approximated problems are unconstrained problems, which makes the method interesting. As k → +∞, the sequence of functions { fk : k ∈ N} increases to the function f : X −→ R ∪ {+∞}, which is equal to f0 (x) if x ∈ C , (3.25) f (x) = +∞ otherwise. The function f is an extended real-valued function, so, if we want to treat in a unified way problems (3.22) and (3.23) we are naturally led to introduce extended real-valued functions. The minimization problem (3.22) can be equivalently formulated as min{ f (x) : x ∈ X }, where f is given by (3.25). Note that in this formulation the constraint is equal to the domain of f . A particularly useful function in this unilateral framework is the indicator function δC of the set C : 0 if x ∈ C , (3.26) δC (x) = +∞ otherwise. With this notation we have f = f0 + δC . More generally, in variational analysis and optimization, one is often faced with expressions of the form f = sup fi i ∈I
and one should notice that the class of extended real-valued functions is stable under such supremum operation. As a further illustration of these considerations, the convex duality theory establishes a one-to-one correspondence between a convex C ⊂ X of a normed linear space and its support function σC : X ∗ −→ R ∪ {+∞}, which is defined by
σC (x ∗ ) = sup{x ∗ (x) : x ∈ C }
i
i i
i
i
i
i
76
book 2014/8/6 page 76 i
Chapter 3. Abstract variational principles
with X ∗ the topological dual of X . One cannot avoid the value +∞ for σC as soon as the set C is not bounded (which is often the case!) and general duality statements must consider extended real-valued functions. So, from now on, unless explicitly specified, we will consider functions f : X −→ R∪{+∞} possibly taking the value +∞. We adopt the conventions that λ×(+∞) = +∞ if λ > 0 and 0 × (+∞) = 0.
3.2.2 The interplay between functions and sets: The role of the epigraph The analysis of unilateral problems (like minimization) naturally leads to the introduction of mathematical concepts which have a unilateral character. The classical approach of analysis does not provide the appropriate tools to deal with the mathematical objects and operations that are intrinsically unilateral (constraints, minimization). In the previous section, we justified the introduction of extended real-valued functions f : X −→ R∪ {+∞} which is the appropriate concept for dealing with minimization problems. Although in classical analysis the properties of the graph of a function play a fundamental role, in variational analysis it is the epigraph that will take over this role. The set epi f = {(x, λ) ∈ X × R : λ ≥ f (x)}
(3.27)
is the epigraph of the function f : X −→ R ∪ {+∞}. For any γ ∈ R, the lower γ -level set of f is l e vγ f = {x ∈ X : f (x) ≤ γ }.
(3.28)
When considering the minimization problem of a given function f : X −→ R ∪ {+∞}, the solution set is (3.29) arg min f = {¯ x ∈ X : f (¯ x ) = inf f (x)}. X
Note that arg min f can be possibly empty and ' arg min f = l e vγ f .
(3.30)
γ >inf X f
Thus, to an extended real-valued function, we have associated many different sets—its epigraph, its lower level sets, its minimum set. Conversely, to a set we have associated extended real-valued functions, for example, the indicator function, the support function, the distance function. (This last one is real-valued if the set is nonempty.) In the following section, we describe how some basic topological properties of functions for minimization problems can be naturally formulated with the help of the attached geometrical sets, epigraphs, and lower level sets. Note that the lower level sets can be obtained by cutting operations on the epigraph: l e vγ f × {γ } = epi f ∩ (X × {γ }). Most basic operations in variational analysis can be naturally formulated with the help of the epigraphs. They give rise to the so-called epigraphical calculus; see [57], [68], [107]. The following result is valid for an arbitrary family of functions on an abstract space X . Proposition 3.2.1. Let X be an abstract space and ( fi )i ∈I be a family of extended real-valued functions fi : X −→ R ∪ {+∞} indexed by an arbitrary set I . Then ' epi (sup fi ) = epi fi , i ∈I
epi (inf fi ) = i ∈I
i ∈I
3
epi fi .
i ∈I
i
i i
i
i
i
i
3.2. Minimization problems: The topological approach
book 2014/8/6 page 77 i
77
3.2.3 Lower semicontinuous functions Let (X , τ) be a topological space. For any x ∈ X , we denote by τ (x) the family of the neighborhoods of x for the topology τ. We recall the classical definition of continuity for a function f : (X , τ) −→ R. The function f is said to be continuous at x ∈ X for the topology τ if ∀>0
∃ V ∈ τ (x) such that ∀ y ∈ V
| f (y) − f (x)| < .
This can be viewed as the conjunction of the two following properties: 1. ∀ > 0 ∃ V ∈ τ (x) such that ∀ y ∈ V 2. ∀ > 0 ∃ W ∈ τ (x) such that ∀ y ∈ W
f (y) > f (x) − ; f (y) < f (x) + .
Then, take V ∩W which still belongs to τ (x) to obtain the continuity result. These properties are called, respectively, the lower semicontinuity and the upper semicontinuity of f at x for the topology τ. To deal with possibly extended real-valued functions, definition 1 of lower semicontinuity has to be formulated slightly differently. Definition 3.2.1. Let (X , τ) be a topological space and f : X −→ R ∪ {+∞}. The function f is said to be τ-lower semicontinuous (τ-lsc) at x if ∀ λ < f (x)
∃ Vλ ∈ τ (x)
such that f (y) > λ
∀ y ∈ Vλ .
( We write Vλ to stress the dependence of the set V upon the choice of λ!) If f is τ-lsc at every point of X , then f is said to be τ-lsc on X . Proposition 3.2.2. Let (X , τ) be a topological space and f : X −→ R ∪ {+∞} an extended real-valued function. The following statements are equivalent: (i) f is τ-lsc; (ii) epi f is closed in X × R (where X × R is equipped with the product topology of τ on X and of the usual topology on R); (iii) for all γ ∈ R, l e vγ f is closed in (X , τ); (iv) for all γ ∈ R, {x ∈ X : f (x) > γ } is open in (X , τ); (v) for all x ∈ X ,
f (x) ≤ lim infy→x f (y) := sup infV ∈τ (x) y∈V f (y).
PROOF. We are going to prove (i) =⇒ (ii) =⇒ (iii) =⇒ (iv) =⇒ (v) =⇒ (i). Assume that f is τ-lsc and prove that epi f is closed. Equivalently, let us prove that the complement of epi f in X × R is open. Take (x, ł) ∈ epi f . By definition of epi f , ł < f (x). Take ł < γ < f . Since f is τ-lsc, there exists Vγ ∈ τ (x) such that f (y) > γ ∀y ∈ Vγ . Equivalently, (y, γ ) ∈ epi f for all y ∈ Vγ . It follows that (Vγ ×] − ∞, γ [) ∩ epi f = . Noticing that Vγ ×] − ∞, γ [ is a neighborhood of (x, λ), the conclusion follows.
i
i i
i
i
i
i
78
book 2014/8/6 page 78 i
Chapter 3. Abstract variational principles
(ii) =⇒ (iii). The implication follows directly from the relation l e vγ f × {γ } = epi f ∩ (X × {γ }). Assuming that epi f is closed, we infer that l e vγ f × {γ } is closed, and hence l e vγ f is closed. (Note that x → (x, γ ) from X onto X × {γ } is a homeomorphism.) (iii) =⇒ (iv) is obvious just by taking the complement of l e vγ f . (iv) =⇒ (v). Let γ < f (x). Since, by assumption, {y ∈ X : f (y) > γ } is open, there exists some V ∈ τ (x) such that V ⊂ {y ∈ X : f (y) > γ }. Equivalently, ∀y ∈V which implies
f (y) > γ , f (y) ≥ γ .
inf
y∈V
Hence
inf f (y) ≥ γ ,
sup
V ∈τ (x) y∈V
and this being true for any γ < f (x), it follows that sup
inf f (y) ≥ f (x),
V ∈τ (x) y∈V
that is,
f (x) ≤ lim inf f (y). y−→x
(v) =⇒ (i). Let λ < f (x). By assumption (v) λ < sup
inf f (y),
V ∈τ (x) y∈V
which implies the existence of some Vλ ∈ τ (x) such that inf f (y) > λ,
y∈Vλ
i.e., f (y) > λ for every y ∈ Vλ . This is exactly the lower semicontinuity property. The class of lower semicontinuous functions enjoys remarkable stability properties. Since closedness is preserved under arbitrary intersections and finite union of sets, we derive from Proposition 3.2.1 (epigraphical interpretation of sup and inf) and Proposition 3.2.2 (equivalence between f lsc and epi f closed) the following important result. Proposition 3.2.3. Let (X , τ) be a topological space and ( fi )i ∈I , fi : X −→ R ∪ {+∞} be an arbitrary collection of τ-lsc functions. Then, supi ∈I fi is still τ-lsc. When I is a finite set of indices, infi ∈I fi is still τ-lsc. As a consequence, the supremum of a family of continuous functions is lower semicontinuous. One can prove that if (X , τ) is metrizable, the converse is true: if f is τ-lsc, there exists an increasing sequence ( fn )n∈N of τ-continuous functions which is pointwise convergent to f . We will establish this important approximation result in Theorem 9.2.1 by using the epigraphical regularization. See also [86, Theorem 1.3.7].
i
i i
i
i
i
i
3.2. Minimization problems: The topological approach
book 2014/8/6 page 79 i
79
Proposition 3.2.4. Let f , g : (X , τ) −→ R∪ {+∞} be two lower semicontinuous functions. Then f + g is still lower semicontinuous. PROOF. Take xν −→ x a τ-converging net. Then lim inf( f + g )(xν ) = lim inf[ f (xν ) + g (xν )] ν
ν
≥ lim inf f (xν ) + lim inf g (xν ) ν
ν
≥ f (x) + g (x).
3.2.4 The lower closure of a function and the relaxation problem In some important situations, the function f to minimize fails to be lower semicontinuous for a topology τ which makes a minimizing sequence τ-relatively compact (see Section 3.2.5). In that case, the analysis of the behavior of the minimizing sequences requires the introduction of the lower closure of f and leads to the introduction of the relaxed problem. Definition 3.2.2. Given (X , τ) a topological space and f : X −→ R ∪ {+∞}, the τ-lower envelope of f is defined as c lτ f = sup{g : X → R ∪ {+∞} : g τ-l s c, g ≤ f }. Proposition 3.2.5. Let (X , τ) be a topological space and f : X −→ R ∪ {+∞} an extended real-valued function. Then c lτ f is τ-lsc; it is the greatest τ-lsc function which minorizes f . We have the following properties: (a) epi (c lτ f ) is the closure of epi f in X ×R equipped with the product topology of τ with the usual topology of R: epi (c lτ f ) = c l (epi f ); (b) c lτ f = supV ∈τ (x) infy∈V f (y) = lim infy−→x f (y); (c) f is τ-lsc at x iff f (x) ≤ c lτ f (x); (d) f is τ-lsc at x iff f (x) = c lτ f (x). PROOF. Let us first notice that a set C in X × R is an epigraph iff (i) C recedes in the vertical direction: (x, ł) ∈ C and μ > ł =⇒ (x, μ) ∈ C ; (ii) C is vertically closed: for every x ∈ X the set {ł ∈ R : (x, ł) ∈ C } is closed. (a) This implies that the closure of an epigraph is an epigraph. Set c l (epi f ) = epi g . Since epi g is closed, this implies that g is τ-lsc. Moreover, since epi g ⊃ epi f , we have g ≤ f . Hence g is a τ-lsc minorant of f . We claim that g is the lower envelope of f , that is, g is the greatest of such τ-lsc minorants of f . Take h ≤ f and h τ-lsc. Hence epi h ⊃ epi f ,
i
i i
i
i
i
i
80
book 2014/8/6 page 80 i
Chapter 3. Abstract variational principles
which implies c l (epi h) = epi h ⊃ c l (epi f ) = epi g . Hence, epi h ⊃ epi g and h ≤ g . So g = sup{h : X → R ∪ {+∞} : h τ-l s c, h ≤ f }, that is, g = c lτ f . We have proved that c l (epi f ) = epi g = epi (c lτ f ). (b) Since c lτ f is τ-lsc, it follows from Proposition 3.2.2(v) that (c lτ f )(x) ≤ lim inf(c lτ f )(y), y−→x
and since c lτ f ≤ f ,
(c lτ f )(x) ≤ lim inf f (y) ∀ x ∈ X . y−→x
(3.31)
Then notice that the function h(x) = lim infy−→x f (y) is less than or equal to f and is τ-lsc. To verify this last point, take λ < lim inf f (y) = sup y−→x
inf f (y).
V ∈τ (x) y∈V
Then there exists some open set Vλ ∈ τ (x) such that infy∈Vλ f (y) > λ. It follows that for all ξ ∈ Vλ , Vλ ∈ τ (ξ ) and hence sup
inf f (y) > λ,
V ∈τ (ξ ) y∈V
that is, lim inf f (y) > λ. y−→ξ
Thus, x → lim infy−→x f (x) is τ-lsc and minorizes f . By definition of c lτ f we have lim inf f (y) ≤ (c lτ f )(x). y−→x
(3.32)
Then compare (3.31) and (3.32) to obtain ∀x ∈ X
(c lτ f )(x) = lim inf f (y). y−→x
(c) Proposition 3.2.2 expresses that f is τ-lsc at x iff f (x) ≤ lim inf f (y). y−→x
This is equivalent to saying that f (x) ≤ c lτ f (x), and since c lτ f ≤ f is always true, it is equivalent to f (x) = c lτ f (x).
i
i i
i
i
i
i
3.2. Minimization problems: The topological approach
book 2014/8/6 page 81 i
81
We have the following “sequential” formulation of c lτ f . Proposition 3.2.6. Let (X , τ) be a topological space and f : X −→ R ∪ {+∞}. Then, for any x ∈ X , (c lτ f )(x) = lim inf f (y) y−→x
= min{lim inf f (xν ) : xν is a net , xν −→ x}. ν
When (X , τ) is metrizable (c lτ f )(x) = lim inf f (y) y−→x
= min{lim inf f (xn ) : (xn ) sequence, xn −→ x}. n
Corollary 3.2.1. Let (X , τ) be a metrizable space and let f : X −→ R ∪ {+∞} be an extended real-valued function. Then f is τ-lsc at x ∈ X iff ∀xn −→ x
f (x) ≤ lim inf f (xn ). n
PROOF OF PROPOSITION 3.2.6. We give for simplicity the proof only in the metrizable case. We have lim inf f (y) = sup inf f (y), >0 y∈Bτ (x,)
y−→x
where Bτ (x, ) = {y ∈ X : dτ (y, x) < }, dτ being a distance inducing the topology τ. Then, for any xn −→ x, for any > 0, xn belongs to Bτ (x, ) for n sufficiently large. Hence inf f (y) ≤ f (xn ) ∀n ≥ N (). y∈Bτ (x,)
Passing to the limit as n → +∞ gives inf
y∈Bτ (x,)
f (y) ≤ lim inf f (xn ). n
This being true for any > 0 and any xn −→ x, leads to the inequality lim inf f (y) ≤ inf{lim inf f (xn ) : xn −→ x}. y−→x
n
(3.33)
On the other hand, for each n ∈ N, there exists some xn ∈ Bτ (x, 1/n) such that inf
y∈Bτ (x,1/n)
f (y) ≥ f (xn ) − −n ≥ f (xn )
1 n if
if
inf
y∈Bτ (x,1/n)
inf
y∈Bτ (x,1/n)
f (y) > −∞
f (y) = −∞.
In both cases, lim inf f (y) = lim y−→x
inf
n y∈Bτ (x,1/n)
f (y) ≥ lim sup f (xn ) n
≥ lim inf f (xn ).
(3.34)
n
i
i i
i
i
i
i
82
book 2014/8/6 page 82 i
Chapter 3. Abstract variational principles
Then compare (3.33) and (3.34) to obtain lim inf f (y) = min{lim inf f (xn ) : xn −→ x} y−→x
n
= min{lim sup f (xn ) : xn −→ x}. n
Proposition 3.2.7. Let f : (X , τ) −→ R∪ {+∞} be an extended real-valued function. Then inf f = inf c lτ f . X
X
More generally, for any τ-open subset G of X inf f = inf c lτ f . G
Moreover,
G
arg min f ⊂ arg min c lτ f .
PROOF. (a) Since f ≥ c lτ f , we just need to prove that inf c lτ f ≥ inf f . G
G
For any x ∈ G, since G is τ-open, we have G ∈ τ (x). By Proposition 3.2.5, (c lτ f )(x) = sup
inf f (y).
V ∈τ (x) y∈V
Hence, for every x ∈ G
(c lτ f )(x) ≥ inf f (y). y∈G
This being true for any x ∈ G gives inf(c lτ f ) ≥ inf f . G
G
(b) Let x ∈ arg min f . We have c lτ f (x) ≤ f (x) ≤ inf f = inf c lτ f , X
X
which implies x ∈ arg min c lτ f . Remark 3.2.1. The function c lτ f , which we call the lower envelope of f , is often called the lower semicontinuous regularization of f or the relaxed function or the τ-closure. The problem min{c lτ f (x) : x ∈ X } is called the relaxed problem.
3.2.5 Inf-compactness functions, coercivity Besides lower semicontinuity, the second basic ingredient in minimization problems is the inf-compactness property. Definition 3.2.3. Let (X , τ) be a topological space and let f : X −→ R ∪ {+∞} be an extended real-valued function. The function f is said to be τ-inf-compact if for any γ ∈ R l e vγ f = {x ∈ X : f (x) ≤ γ } is relatively compact in X for the topology τ.
i
i i
i
i
i
i
3.2. Minimization problems: The topological approach
book 2014/8/6 page 83 i
83
When f is τ-lsc, its lower level sets are closed for τ, and τ-compactness is equivalent to saying that the lower level sets of f are τ-compact. Definition 3.2.4. Let X be a normed linear space. A function f : X −→ R ∪ {+∞} is said to be coercive if limx−→+∞ f (x) = +∞. The relation between the two concepts is made clear in the following. Proposition 3.2.8. Let X be a normed space and f : X −→ R ∪ {+∞}. The following conditions are equivalent: (i) f is coercive; (ii) for any γ ∈ R, l e vγ f is bounded. PROOF. (i) =⇒ (ii). Assume f is coercive. If, for some γ0 ∈ R, l e vγ0 f is not bounded, then there exists a sequence (xn )n∈N such that xn −→ +∞ as n goes to +∞ and f (xn ) ≤ γ0 for all n ∈ N. But f coercive implies that f (xn ) −→ +∞, a clear contradiction. (ii) =⇒ (i). Assume that for any γ ∈ R, l e vγ f is bounded. If f is not coercive, we can construct a sequence (xn )n∈N such that xn −→ +∞ as n goes to +∞ and such that f (xn ) ≤ γ0 for some γ0 ∈ R. This contradicts the fact that l e vγ0 f is bounded. Let us recall the Riesz theorem: the bounded sets in a normed space are relatively compact iff the space has a finite dimension. Corollary 3.2.2. Let X = Rn equipped with the usual topology and let f : X −→ R∪{+∞}. The following conditions are equivalent: (i) f is coercive; (ii) f is inf-compact. In infinite dimensional spaces, the topologies which are directly related to coercivity are the weak topologies (see Section 2.4).
3.2.6 Topological minimization theorems In this section, unless otherwise specified (X , τ) is a general topological space. Theorem 3.2.1. Let (X , τ) be a topological space and let f : X −→ R∪{+∞} be an extended real-valued function which is τ-lsc and τ-inf compact. Then infX f > −∞ and there exists some x¯ ∈ X which minimizes f on X : f (¯ x ) ≤ f (x)
∀ x ∈ X.
Because of the importance of this theorem, which is often referred to as the Weierstrass theorem, we give two different proofs of it, each of independent interest. Without any restriction we may assume that f is proper, that is, f ≡ +∞.
. We use formula (3.30), which relates FIRST PROOF. We want to prove that arg min f = arg min f to the lower level sets of f . ' arg min f = l e vγ f =
γ >inf X f '
l e vγ f ,
γ0 >γ >infX f
i
i i
i
i
i
i
84
book 2014/8/6 page 84 i
Chapter 3. Abstract variational principles
where γ0 ∈ R is taken arbitrary with γ0 > infX f . This comes from the fact that the sets l e vγ f are decreasing with γ . The τ-lower semicontinuity of f implies that the sets l e vγ f are closed for the topology τ. Moreover, for γ0 > γ > infX f the sets l e vγ f are nonempty and contained in l e vγ0 f which is compact by the τ-inf compact property of f . Therefore, we have a family {l e vγ f : γ0 < γ < infX f } of nonempty closed subsets, contained in a fixed compact set, and which is decreasing with γ . For any finite subfamily {l e vγi f : i = 1, 2, . . . , m} ' l e vγ i f = l e vγ f i =1,...,m
with γ = inf{γ1 , . . . , γ m } > infX f , and hence ' l e vγi f = . i =1,...,m
From the finite intersection property (which characterizes topological compact sets; it is obtained from the Heine–Borel property just by ( passing to the complement, and so replacing open sets by closed sets), we conclude that γ >γ >inf f l e vγ f = . 0
X
SECOND PROOF. The proof we present now illustrates the direct method in the calculus of variations. It is the proof initiated by Hilbert and further developed by Tonelli, which first introduces a minimizing sequence. Let us observe that given a function f : X −→ R ∪ {+∞}, one can always construct a minimizing sequence, that is, a sequence (xn )n∈N such that f (xn ) −→ infX f as n −→ +∞. To do so, we just rely on the definition of the infimum of a family of real numbers: if infX f > −∞, take infX f ≤ f (xn ) ≤ infX f + 1/n; if infX f = −∞, take f (xn ) ≤ −n. Since f is proper, infX f < +∞ and for n ≥ 1 f (xn ) ≤ max{inf f + 1/n, −n} X
≤ max{inf f + 1, −1} := γ0 . X
Note that γ0 > infX f , which implies l e vγ0 f = and xn ∈ l e vγ0 f
∀ n ≥ 1.
Thus, the sequence (xn )n∈N is trapped in a lower level set of f which is compact for the topology τ ( f is τ-inf compact). We follow the argument and assume that τ is metrizable. In the general topological case, one has to replace sequences by nets. So, we can extract a subsequence τ-converging to some x¯ ∈ X , xnk −→ x¯. We have lim f (xnk ) = lim f (xn ) = inf f . n
k
X
By the τ-lower semicontinuity of f , f (¯ x ) ≤ lim f (xnk ). k
i
i i
i
i
i
i
3.2. Minimization problems: The topological approach
Hence,
book 2014/8/6 page 85 i
85
f (¯ x ) ≤ inf f , X
which says both that infX f > −∞ since f : X −→ R ∪ {+∞} and f (¯ x ) ≤ f (x)
∀ x ∈ X,
and the proof is complete. Remark 3.2.2. Indeed, the inf-compactness assumption can be slightly weakened by noticing that in the proof of Theorem 3.2.1, we just need to know that some lower level set of f is relatively compact. Notice that it is equivalent to assume that l e vγ0 f is relatively compact or to assume that l e vγ f is relatively compact for all γ ≤ γ0 . So, we can formulate the following result. Theorem 3.2.2. Let (X , τ) be a topological space and f : X −→ R ∪ {+∞} an extended real-valued function which is τ-lsc and such that for some γ0 ∈ R, l e vγ0 f is τ-compact. Then infX f > −∞ and there exists some x¯ ∈ X which minimizes f on X : f (¯ x ) ≤ f (x)
∀ x ∈ X.
To illustrate the difference between Theorems 3.2.1 and 3.2.2, take X = R and f (x) = Then l e vγ f is compact for γ < 1, but l e v1 f = R. Thus we can apply Theorem 3.2.2 to conclude the existence of a minimizer (which is zero!), but Theorem 3.2.1 does not apply! x2 . 1+x 2
Corollary 3.2.3. Let (X , τ) be a topological space and assume that f : X −→ R ∪ {+∞} is τ-lsc. Then, for any compact subset K of (X , τ), there exists some x¯ ∈ K such that f (¯ x ) ≤ f (x)
∀x ∈ K.
PROOF. Take g := f + δK . Since K is closed, δK is lower semicontinuous. So, g as a sum of two lower semicontinuous functions is still lower semicontinuous. The sublevel sets of g are contained in K, so g is τ-inf compact. Therefore, by Theorem 3.2.1 there exists some x¯ ∈ X such that g (¯ x ) ≤ g (x) for every x ∈ X , that is, f (¯ x ) ≤ f (x) ∀ x ∈ K, x¯ ∈ K, which completes the proof. The above statement is the unilateral version of the classical theorem which says that a continuous function achieves on any compact its minimum value and maximum value. If one is concerned only with the minimization problem, one just needs to consider lower semicontinuous functions. Corollary 3.2.4. Take X = RN with the usual topology. Take f : RN −→ R ∪ {+∞} which is lower semicontinuous and coercive. Then, there exists some x¯ ∈ RN such that f (¯ x ) ≤ f (x)
∀x ∈ RN .
PROOF. Since f : RN −→ R ∪ {+∞} is coercive, it is inf-compact for the usual topology (Corollary 3.2.2). This combined with the lower semicontinuity of f implies the existence of a minimizer.
i
i i
i
i
i
i
86
book 2014/8/6 page 86 i
Chapter 3. Abstract variational principles
Comments on the direct methods of the calculus of variations. Theorems 3.2.1 and 3.2.2 provide both a general existence result for the global minimization problem of an extended real-valued function and a method for solving such problems, originally introduced by Hilbert and further developed by Tonelli: when considering a minimization problem for a function f : X −→ R ∪ {+∞} min{ f (x) : x ∈ X }, one first constructs a minimizing sequence, which is a sequence (xn )n∈N such that f (xn ) −→ inf f
as n −→ +∞.
X
This is always possible; at this point, we don’t need any structure on X . Then, one has to establish that the sequence (xn )n∈N is relatively compact for some topology τ on X , and this is how the topology τ appears. This usually comes from some estimations on (xn )n∈N which follow from a coercivity property of f with respect to some norm on X . Then, one may use weak topologies or some compact embeddings to find the topology τ. We stress that there is a great flexibility in this method. One may consider special minimizing sequences enjoying compactness properties which are not shared by the whole lower level sets. We will return to this important point. We just say here that it is the skill of the mathematician to find a minimizing sequence which is relatively compact for a topology τ which is as strong as possible. Indeed, and this is the second point of the direct methods, one then has to verify that f is τ-lsc. Clearly, the stronger the topology τ, the easier it is to verify the lower semicontinuity property. As a general rule, inf-compactness and lower semicontinuity are two properties which are antagonist: if τ1 > τ2 , then f τ1
inf-compact =⇒ f τ2
inf-compact,
while f τ2 −lsc =⇒ f τ1 -lsc. Thus, a balance with respect to these two properties determines the choice of a “good” topology τ (if it exists!). As we will see, in some important situations, the function f fails to be τ-lsc and there is no solution to the minimization problem of f . In such situations, it is still interesting to understand the behavior of the minimizing sequences. The following “relaxation result” gives a first general answer to this question. Theorem 3.2.3. Let (X , τ) be a topological space and let f : X −→ R ∪ {+∞} be an extended real-valued function. Let (xn )n∈N be a minimizing sequence for f , and suppose that a subsequence xnk τ-converges to some x¯ ∈ X . Then (c lτ f )(¯ x ) ≤ (c lτ f )(x)
∀ x ∈ X,
that is, x¯ is a minimum point for c lτ f . PROOF. Since (xn )n∈N is a minimizing sequence, lim f (xnk ) = lim f (xn ) = inf f . k
n
X
i
i i
i
i
i
i
3.3. Convex minimization theorems
book 2014/8/6 page 87 i
87
By Proposition 3.2.6, since x¯ = τ − lim xnk , x ) ≤ lim f (xnk ). c lτ f (¯ k
By Proposition 3.2.7, inf f = inf c lτ f . X
X
Hence x ) ≤ lim f (xnk ) = lim f (xn ) = inf f = inf c lτ f , c lτ f (¯ X
X
that is, x ) ≤ (c lτ f )(x) (c lτ f )(¯
∀x ∈ X .
We say that min{(c lτ f )(x) : x ∈ X } is the relaxed problem of the initial minimization problem of f over X .
3.2.7 Weak topologies and minimization of weakly lower semicontinuous functions Until now, the basic ingredients used in the direct approach to minimization problems have been purely topological notions. We now assume that the underlying space (X , τ) is a vector space. To stress that fact, we denote it by V (like vector) and assume that V is a normed space, the norm of v ∈ V being denoted by vV or v when no confusion is possible. The consideration of coercive functions f : V −→ R ∪ {+∞} leads naturally to study of the topological properties of the bounded subsets of V , and this is a basic reason for studying weak topologies on topological vector spaces. We recall from Theorems 2.4.2 and 2.4.3 the following result. Theorem 3.2.4. In a reflexive Banach space V , the bounded sets are weakly relatively compact. Moreover, from any bounded sequence (un )n∈N in V one can extract a weakly convergent subsequence. As a direct application of Proposition 3.2.8 and of the previous compactness result, we obtain that a coercive function on a reflexive Banach space is weakly inf-compact. By using the Weierstrass minimization Theorem 3.2.1 we obtain the following existence result. Theorem 3.2.5. Let V be a reflexive Banach space and f : V −→ R ∪ {+∞} an extended real-valued function which is coercive and weakly lower semicontinuous. Then there exists some u ∈ V such that f (u) ≤ f (v) ∀v ∈ V . The question that now naturally arises is to describe the class of functions which are weakly lower semicontinuous. This is where convexity plays a central role.
3.3 Convex minimization theorems Let us first recall some definitions and elementary properties of extended real-valued convex functions.
i
i i
i
i
i
i
88
book 2014/8/6 page 88 i
Chapter 3. Abstract variational principles
3.3.1 Extended real-valued convex functions and weak lower semicontinuity Definition 3.3.1. Let V be a linear space and let f : V −→ R ∪ {+∞}. Then, f is said to be convex if for each u, v ∈ V and each λ ∈ [0, 1] we have f (λu + (1 − λ)v) ≤ λ f (u) + (1 − λ) f (v). Proposition 3.3.1. Let V be a linear space and f : V −→ R ∪ {+∞}. Then, f is convex iff its epigraph is a convex subset of V × R. PROOF. Let us first assume that f is convex. Fix (u, α) and (v, β) in epi f and λ ∈ [0, 1]. Since α ≥ f (u) and β ≥ f (v), then f (u) and f (v) are finite and we have λα + (1 − λ)β ≥ λ f (u) + (1 − λ) f (v) ≥ f (λu + (1 − λ)v). This is equivalent to saying that (λu + (1 − λ)v, λα + (1 − λ)β) ∈ epi f , i.e., λ(u, α) + (1 − λ)(v, β) ∈ epi f , and so epi f is convex. Conversely, let us assume that epi f is convex. Fix λ ∈ [0, 1]. If either f (u) = +∞ or f (v) = +∞, since 0 × (+∞) = 0, the inequality is clearly valid. So, let us assume that f (u) < +∞ and f (v) < +∞. The two points (u, f (u)) and (v, f (v)) are in epi f and so is the segment joining this two points. In particular, λ(u, f (u)) + (1 − λ)(v, f (v)) ∈ epi f , which is equivalent to saying that f (λu + (1 − ł)v) ≤ λ f (u) + (1 − λ) f (v). So, it is equivalent to study convex sets or convex functions. Let us now recall the geometrical version of the Hahn–Banach theorem, which plays a basic role in convex analysis (see Chapter 9). Theorem 3.3.1 (Hahn–Banach separation theorem). Let (V , · ) be a normed linear space and suppose that C is a nonempty closed convex subset of V . Then, each point u ∈ C can be strongly separated from C by a closed hyperplane, which means ∃u ∗ ∈ V ∗ , ∃α ∈ R such that ∀v ∈ C
u ∗ (v) ≤ α and u ∗ (u) > α.
This is equivalent to saying that C is contained in the closed half-space α,u ∗ = {v ∈ V : u ∗ (v) ≤ α}, whereas u is in its complement. Corollary 3.3.1. Let (V , ·) be a normed linear space and let C be a nonempty closed convex subset of V . Then C is equal to the intersection of the closed half-spaces that contain it. Theorem 3.3.1 and Corollary 3.3.1 have important consequences with respect to topological (closedness) properties of convex sets: let us notice that by definition of the weak topology on V, any linear strongly continuous form is continuous for the weak topology, which implies that any closed half-space α,u ∗ = {v ∈ V : u ∗ (v) ≤ α}
i
i i
i
i
i
i
3.3. Convex minimization theorems
book 2014/8/6 page 89 i
89
is closed for the weak topology. So, any closed convex set, which is equal to an intersection of closed half-spaces, is closed for the weak topology. Since for an arbitrary set, the reverse implication “closed for the weak topology” =⇒ “closed for the strong topology” is always true, we finally obtain the following result. Theorem 3.3.2. Let (V , · ) be a normed linear space and C a nonempty convex subset of V . Then, the following statements are equivalent: (i) C is closed for the norm topology of V; (ii) C is closed for the weak topology of V. When translating the above theorem from sets to functions via the correspondence f −→ epi f and recalling that f convex ⇐⇒ epi f convex in X × R, f τ-l s c ⇐⇒ epi f closed in X × R, we obtain the following result. Theorem 3.3.3. Let V be a normed linear space and f : V −→ R ∪ {+∞} a convex proper function. The following statements are equivalent: (i) f is lower semicontinuous for the norm topology on V; (ii) f is lower semicontinuous for the weak topology on V. Of course it is the implication (i) =⇒ (ii) which is important. It tells us that a convex lower semicontinuous function is automatically weakly lower semicontinuous. As a particular case, a convex continuous function is weakly lower semicontinuous.
3.3.2 Convex minimization in reflexive Banach spaces In this section (V , · ) is a reflexive Banach space. We can now state the following important result. Theorem 3.3.4. Let (V , ·) be a reflexive Banach space and f : V −→ R∪{+∞} a convex, lower semicontinuous, and coercive function. Then there exists u ∈ V which minimizes f on V : f (u) ≤ f (v) ∀ v ∈ V . FIRST PROOF. Since f is coercive, its lower level sets are bounded in V and hence weakly relatively compact. So f is weakly inf-compact. Since f is convex lower semicontinuous, it is weakly lower semicontinuous. Then apply the Weierstrass minimization Theorem 3.2.1 to f with τ equal to the weak topology of V . SECOND PROOF. We use the direct methods of the calculus of variations. Take (un )n∈N , a minimizing sequence for f , that is, limn f (un ) = infV f . For n sufficiently large, (un )n∈N remains in a fixed sublevel set of f , which is bounded by coercivity of f . Because the space V is reflexive, one can extract a weakly convergent subsequence unk u. We have lim f (unk ) = lim f (un ) = inf f . k
n
V
i
i i
i
i
i
i
90
book 2014/8/6 page 90 i
Chapter 3. Abstract variational principles
The function f is convex and lower semicontinuous, so it is weakly lower semicontinuous and f (u) ≤ lim inf f (unk ). k
It follows that f (u) ≤ lim inf f (unk ) = lim f (un ) = inf f , n
k
V
that is, f (u) ≤ f (v) ∀v ∈ V . Remark 3.3.1. Concerning the question of uniqueness, we need to recall the notion of strict convexity: the function f −→ R ∪ {+∞} is said to be strictly convex if ∀u = v,
∀λ ∈ ]0, 1[
f (λu + (1 − λ)v) < λ f (u) + (1 − λ) f (v).
It is easily seen that the conclusion of Proposition 2.3.3 still holds for functions with values in R ∪ {+∞}. Example 3.3.1. Take for V a Hilbert space with norm · 2 = 〈·, ·〉. Then · 2 is strictly convex: indeed, for v1 , v2 ∈ V , λ ∈ ]0, 1[ we have λv1 + (1 − λ)v2 2 − λv1 2 − (1 − λ)v1 2 = −λ(1 − λ) v2 2 + v1 2 − 2〈v1 , v2 〉 = −λ(1 − λ)v1 − v2 2 ≤ 0. The above inequality becomes an equality iff v1 = v2 . As an application of the results above, we consider the problem of the best approximation in Hilbert spaces. / Theorem 3.3.5. Let (V , · ), be a Hilbert space with norm · = 〈·, ·〉. Given C ⊂ V a closed convex nonempty subset of V and u0 ∈ V , there exists a unique element u¯ ∈ C such that u0 − u¯ ≤ inf u0 − v. v∈C
We have u0 − u¯ = d (u0 , C ), that is, u¯ ∈ C realizes the minimum of the distance between u0 and C . We say that u¯ is the projection of u0 on C and we write u¯ = projC u0 . Moreover, u¯ is characterized by the following property: u¯ ∈ C , 〈u0 − u¯, v − u¯〉 ≤ 0 ∀ v ∈ C . Because of the importance of this result, we give two proofs of independent interest. The first relies on Theorem 3.3.4 and is straightforward, but recall that we have used the weak topology to prove Theorem 3.3.4. The second is a direct one and completely elementary (it does not use the weak topology) and can be the starting point for developing a theory of Hilbert spaces at a more elementary level without using the weak topologies.
i
i i
i
i
i
i
3.3. Convex minimization theorems
book 2014/8/6 page 91 i
91
FIRST PROOF. First notice that it is equivalent to have u0 − u¯ ≤ u0 − v or
∀v ∈C
u0 − u¯ 2 ≤ u0 − v2
∀ v ∈ C.
So, we may say that our problem is equivalent to minimizing f (v) = u0 − v2 + δC (v) over V . Clearly f is a convex function, as a sum of convex functions. It is strictly convex, because · 2 is strictly convex and the sum of a convex and of a strictly convex function is still strictly convex. Since C is closed, δC is lower semicontinuous, and since · 2 is continuous it is also lower semicontinuous, and f as a sum of two lower semicontinuous functions is still lower semicontinuous. Finally f (v) ≥ u0 −v2 , which is clearly coercive, and so is f . So, f is strictly convex, lower semicontinuous, and coercive. It achieves its minimum at a unique point u¯ ∈ C . SECOND PROOF. Let (un )n∈N be a minimizing sequence, that is, un ∈ C , u0 − un 2 −→ inf{u0 − v2 : v ∈ C } = d (u0 , C )2 . Then, use the parallelogram equality: given n, m ∈ N, ) )2 ) (un + u m ) ) ) ) 2u0 − un + 2u0 − u m = un − u m + 4) u0 − ). ) ) 2 2
2
2
Since C is convex, (un + u m )/2 belongs to C and ) ) u + um )u − n ) 0 2 Hence It follows that
)2 ) ) ≥ d (u , C )2 . 0 )
un − u m 2 ≤ 2u0 − un 2 + 2u0 − u m 2 − 4d (u0 , C )2 . lim sup un − u m 2 ≤ 0,
n,m−→+∞
that is, the sequence (un )n∈N is a Cauchy sequence. Since V is a Hilbert space, the sequence (un )n∈N norm converges to some element u¯ which still belongs to C , because C is closed. Moreover, u0 − u¯ 2 = lim u0 − un 2 = d (u0 , C )2 . n
Let us now prove the optimality condition for u¯, that is, u¯ ∈ C , 〈u0 − u¯ , v − u¯ 〉 ≤ 0 ∀v ∈ C . This property says that u¯ is characterized by the following geometrical property: for any v ∈ C , the angle between the two vectors u0 − u¯ and v − u¯ is greater than or equal to π/2.
i
i i
i
i
i
i
92
book 2014/8/6 page 92 i
Chapter 3. Abstract variational principles
We will later derive this property from general subdifferential calculus. At the moment, we give a direct elementary proof of it. For any v ∈ C , by convexity of C , the line segment [ u¯, v] still belongs to C and hence, for all t ∈ [0, 1], w t = t v + (1 − t ) u¯ belongs to C . By definition of u¯ u0 − u¯2 ≤ u0 − t v − (1 − t ) u¯ 2 ≤ (u0 − u¯) − t (v − u¯)2 . By developing this last expression, we obtain 2t 〈u0 − u¯ , v − u¯〉 ≤ t 2 v − u¯ 2 . Divide by t > 0, and then let t go to zero to obtain 〈u0 − u¯ , v − u¯〉 ≤ 0. Conversely, let us prove that if u satisfies the optimality condition above, then u0 − u ≤ inf u0 − v. v∈C
First notice that the optimality condition implies ∀v ∈ C
〈u0 − v, v − u¯〉 ≤ 0.
(3.35)
Indeed, 〈u0 − v, v − u¯〉 = 〈u0 − u¯ + u¯ − v, v − u¯〉 = 〈u0 − u¯, v − u¯〉 − u¯ − v2 ≤ 0. We then have u0 − u¯2 = 〈u0 − u¯, u0 − v + v − u¯〉 ≤ 〈u0 − u¯, u0 − v〉 = 〈u0 − v + v − u¯ , u0 − v〉 = u0 − v2 + 〈u0 − v, v − u¯〉 ≤ u0 − v2 , where we have used the optimality condition in the first inequality and relation (3.35) in the last one. Corollary 3.3.2. When C = W is a closed subspace, then u¯ = projW u0 is characterized by u¯ ∈ W , u0 − u¯ ∈ W ⊥ , that is, u0 = (u0 − u¯) + u¯ ∈ W ⊥ + W . This is the orthogonal decomposition of V = W ⊕W ⊥ as the sum of two orthogonal subspaces. Moreover, the projection operator proj : V −→ W is linear.
i
i i
i
i
i
i
3.3. Convex minimization theorems
book 2014/8/6 page 93 i
93
Proposition 3.3.2. When V is a Hilbert space and C is a closed convex nonempty subset of V , the projection operator V −→ C which associates to each u ∈ V its projection projC u on C is a contraction: ∀ u, v ∈ V
projC u − projC v ≤ u − v.
PROOF. We have 〈u − projC u, z − projC u〉 ≤ 0
∀ z ∈ C,
〈v − projC v, z − projC v〉 ≤ 0
∀ z ∈ C.
Take z = projC v in the first inequality and z = projC u in the second one. Summing up, we obtain 〈projC v − projC u, u − projC u − v + projC v〉 ≤ 0, that is, projC v − projC u2 ≤ 〈projC v − projC u, v − u〉. By the Cauchy–Schwarz inequality, it follows that projC v − projC u ≤ v − u. We end this section by remarking that the Hilbertian structure plays a fundamental role in the previous results for the best approximation. The existence of the best approximation u¯ still holds true when (V , · ) is a reflexive Banach space. But when the space (V , ·) is no longer reflexive, even the existence of u¯ may fail to be true, as shown by the following example. Example 3.3.2. Take V = C([0, 1]; R) equipped with the sup norm ∀v ∈ V
v∞ = sup{|v(t )| : t ∈ [0, 1]}.
Then (V , · ∞ ) is a Banach space which is not reflexive. Indeed, this is a consequence of the fact that the existence of a projection may fail to be true in this space: take
1/2
C = v ∈V :
v(t )d t − 0
1
v(t )d t = 1 . 1/2
Clearly C is a closed convex nonempty subset of V (it is in fact a closed hyperplane). One can easily verify that d (0, C ) = 1, but there is no element v ∈ C such that v∞ = 1. Indeed, if v ∈ C , 1=
1/2
1
v(t )d t − 0
v(t )d t ≤ 1/2
1
1/2
|v(t )|d t +
2
0
1 2
1 1/2
|v(t )|d t ≤ v∞ .
Hence d (0, C ) = inf{v∞ : v ∈ C } ≥ 1, and it is not difficult to show that d (0, C ) = 1. Suppose now that for some v ∈ C , v∞ = 1. Since 1/2 1 v(t ) d t ≤ , 0 2 we necessarily have
1/2 0
v(t ) d t =
1 2
and
1 1 v(t ) d t ≤ , 1/2 2
1 1/2
v(t ) d t = − 12 .
i
i i
i
i
i
i
94
book 2014/8/6 page 94 i
Chapter 3. Abstract variational principles
1/2 So 0 (1 − v(t ))d t = 0. Since 1 − v(x) ≥ 0, this implies v(t ) ≡ 1 on [0, 12 ]. Similarly v(t ) ≡ −1 on [ 12 , 1], a clear contradiction with the fact that v has to be continuous.
3.4 Ekeland’s -variational principle The so-called -variational principle was introduced by Ekeland in 1972 [206], [207], [208]. It is a general powerful tool in variational analysis and optimization which can be traced back to a maximality result for a partial ordering introduced by Bishop and Phelps in 1962 [100]. Ekeland’s -variational principle asserts the existence of minimizing sequences of a particular kind. Not only do they approach the infimal value of the minimization problem, but they also simultaneously satisfy the first-order necessary conditions up to any desired approximation. In many instances, this makes this variational principle play a key role in the application of the direct method. Indeed, Ekeland’s -variational principle has known considerable success with applications in a wide variety of topics in nonlinear analysis and optimization (critical point theory, geometry of Banach spaces, etc.). In the last two decades, there has been increasing evidence that Ekeland’s -variational principle has close connections with dissipative dynamical systems (dynamical systems with entropy). Indeed, solutions provided by the -variational principle can be seen as stable equilibria of such dynamics [138], [67]. In particular, the recent model in dynamical decision theory introduced by Attouch and Soubeyran [55] will serve as a guideline in the proof and interpretation of the results.
3.4.1 Ekeland’s -variational principle and the direct method Let us start with the following formulation of the Ekeland’s -variational principle. Theorem 3.4.1 (Ekeland). Let (X , d ) be a complete metric space and f : X −→ R ∪ {+∞} an extended real-valued function which is lower semicontinuous and bounded below ( infX f > −∞). Then, for each > 0, there exists some x ∈ X which satisfies the two following properties: (i) inf f ≤ f (x ) ≤ inf f + , (ii)
X
X
f (x) ≥ f (x ) − d (x, x ) ∀ x ∈ X .
Let us first comment on this result and show some direct consequences of it. (Its proof is postponed to the next section.) Condition (ii) has a clear interpretation when f : X −→ R is Gâteaux differentiable on a Banach space (X , .). It can be seen as a unilateral nonsmooth version of the condition D f (x )∗ ≤ . Let us start with the definition of the Gâteaux differentiability property of f at x . For any ξ ∈ X , with ξ = 1 and any t > 0, f (x + t ξ ) = f (x ) + t 〈D f (x ), ξ 〉 + o(t ). By taking x = x + t ξ in (ii) and using the above equality, we obtain f (x ) + t 〈D f (x ), ξ 〉 + o(t ) ≥ f (x ) − t . Let us simplify, divide by t > 0, and let t → 0+ . We obtain 〈D f (x ), ξ 〉 ≥ −.
i
i i
i
i
i
i
3.4. Ekeland’s -variational principle
book 2014/8/6 page 95 i
95
Changing ξ into −ξ yields
|〈D f (x ), ξ 〉| ≤ .
This being true for any ξ ∈ X with ξ ≤ 1, we finally obtain D f (x )∗ ≤ . We can summarize this result in the following corollary. Corollary 3.4.1. Let (X , .) be a Banach space and f : X −→ R a real-valued function which is lower semicontinuous, Gâteaux differentiable, and bounded below. Then, for each > 0, there exists some x ∈ X such that inf f ≤ f (x ) ≤ inf f + , X
D f (x )∗ ≤ .
X
The above result asserts the existence of minimizing sequences (xn )n∈N of particular type: take xn = xn with n → 0+ ; then
f (xn ) → infX f as n → +∞, D f (xn ) → 0 in X ∗ as n → +∞.
Application of the direct method when dealing with such particular minimizing sequences leads us naturally to introduce the so-called Palais–Smale compactness condition for a functional f . Definition 3.4.1. Let (X , .) be a Banach space. We say that a C1 function f : X −→ R satisfies the Palais–Smale condition if every sequence (xn )n∈N in X which satisfies sup | f (xn )| < +∞
D f (xn ) → 0
and
n
in X ∗ as n → +∞
possesses a convergent subsequence ( for the topology of the norm of X ). As an immediate consequence of Corollary 3.4.1, we obtain the next theorem. Theorem 3.4.2. Let (X , .) be a Banach space and f : X −→ R a C1 function which satisfies the Palais–Smale condition and which is bounded below. Then the infimum of f on X is achieved at some point x¯ ∈ X and x¯ is a critical point of f , i.e., D f (¯ x ) = 0. PROOF. Using Corollary 3.4.1 of the Ekeland’s -variational principle, we have the existence of a sequence (xn )n∈N which satisfies f (xn ) → inf f , D f (xn ) → 0. X
Since infX f ∈R , we have supn | f (xn | < +∞, and the sequence (xn )n∈N satisfies the hypotheses of the Palais–Smale condition. Hence, one can extract a convergent subsequence x) = xnk → x¯. By using the continuity properties of f and D f , one gets at the limit f (¯ infX f and D f (¯ x ) = 0. Judicious applications of this kind of result (based on the Palais–Smale compactness condition) provide existence results for critical points, not only local minima or maxima but also saddle points. One of the most celebrated of these results is the mountain pass theorem of Ambrosetti and Rabinowitz [15]. For further results in this direction, see [67] or [193].
i
i i
i
i
i
i
96
book 2014/8/6 page 96 i
Chapter 3. Abstract variational principles
Indeed, it turns out that when f is not necessarily smooth, condition (ii) is a convenient formulation of an -approximate optimality condition. The key is the following observation: property (ii) of Theorem 3.4.1 just expresses that x is an exact solution of the perturbed minimization problem (+ ): 5 4 (+ ) inf f (x) + d (x, x ) : x ∈ X . In the particular and important case where f is convex and lower semicontinuous on a Banach space, one gets the following result. Corollary 3.4.2. Let (X , .) be a Banach space and f : X −→ R ∪ {+∞} an extended real-valued function which is convex, lower semicontinuous, proper ( f ≡ +∞), and bounded below. Then, for each > 0 there exist x ∈ X and x∗ ∈ X ∗ such that
inf f ≤ f (x ) ≤ inf f + , X
X
x∗ ∈ ∂ f (x ), x ∗ ≤ , where ∂ f (x ) is the subdifferential of f at x . PROOF. We use standard tools from convex subdifferential calculus (see Chapter 9). Since x minimizes the closed convex proper function x → ϕ(x) := f (x) + x − x , we have ∂ ϕ(x ) # 0. The norm being a continuous function in X , the additivity rule for the subdifferential calculus holds (Theorem 9.5.4) and we have ∂ f (x ) + B(0, 1) # 0. Equivalently, there exists some x∗ ∈ ∂ f (x ) with x∗ ∗ ≤ .
3.4.2 A dynamical approach and proof of Ekeland’s -variational principle Ekeland’s -variational principle has a close connection with dissipative dynamical systems. This fact was recognized by Brezis and Browder [138]: “A general ordering principle”; Aubin and Ekeland [67]: “walking in complete metric spaces”; and Zeidler [363]: “The abstract entropy principle.” More recently, the importance of this principle in the modelization of dynamical decision with bounded rationality was put to the fore by Attouch and Soubeyran [55]. This cognitive interpretation will serve as a guideline throughout this section. The central concept in the dynamical approach to Ekeland’s -variational principle is the following partial ordering relation. Definition 3.4.2. Let (X , d ) be a metric space and f : X −→ R ∪ {+∞} an extended realvalued function which is proper ( f ≡ +∞). Let us introduce the following partial ordering on X : y ,S x ⇐⇒ f (y) + d (x, y) ≤ f (x). We call it the marginal satisficing relation. We write 4 5 S(x) = y ∈ X : y ,S x 4 5 = y ∈ X : f (y) + d (x, y) ≤ f (x) the set of elements of X which satisfy this ordering with respect to x.
i
i i
i
i
i
i
3.4. Ekeland’s -variational principle
book 2014/8/6 page 97 i
97
Let us introduce some elements of decision theory that allow us to interpret this relation in a natural and intuitive way. Space X is the decision or performance space (the state space in physics). It is supposed that to each element x ∈ X the agent is able to attribute a value or valence f (x) ∈ R ∪ {+∞} which measures the quality of the decision or performance x. (The value +∞ allows us to take account of the constraints.) For example, when performing x, f (x) measures how far the agent is from a given goal. In our context, f (x) measures the dissatisfaction of the agent who, making x ∈ X , is faced with a problem which is not completely solved. Thus the agent is willing to reduce its dissatisfaction and make f (x) as small as possible. The connection with the traditional formulation in decision sciences is obtained by taking f (x) = g − g (x), where g is a classical utility or gain function and g is a desirable level of resolution of the problem (for example, g = sup x∈X g (x)). We choose this presentation to fit well with the classical formulation of variational principles in mathematics and physics and, in our situation, with the usual formulation of Ekeland’s -variational principle. The classical decision theory deals with perfectly rational agents who have immediate and free access to a global knowledge of their environment, and correspondingly minimize their value function f on X . Modelization of decision processes in a complex real world requires us to introduce some further notions. Following Simon’s [334] pioneering work in decision theory and bounded rationality, one needs to modelize the ability and difficulty of the agent to move and decide in a complex environment. A major difficulty for the agent is that it needs to explore its environment and get enough information to make further decisions. In this context, making decision becomes a dynamical process which at each step k = 1, 2, . . . is based on the following question: Is it worthwhile for the agent to pass from a given state xk ∈ X (performance, decision, allocation) at time tk into a further state xk+1 ∈ X at time tk+1 ? A key ingredient of the modelization of this balance between the advantage for the agent to pass from x to y and the possibility and difficulty of realizing it is the notion of cost to change. Following Attouch and Soubeyran [55], one introduces for any x and y in X , c(x, y) ≥ 0, which is the cost to pass (change, move) from x to y. In our context, we assume that c(x, y) ≥ θd (x, y), where d is a metric on X and θ > 0, is a unitary cost to move. This expresses that the cost to move is high for small displacements (by contrast with c(x, y) = d (x, y)2 , for example!). This metric d modelizes the difficulty for the agent to pass from a state x to a further state y. This is where the metric d in the cognitive interpretation of Ekeland’s -variational principle appears! This is a rich concept which covers several aspects: there are costs to explore and get information (this comes from limited time and energy available for the agent), physical costs to move, and also costs with psychological and cognitive interpretation (dissimilarity costs, cost to quit a routine and enter into an other one, excitation and inhibition costs). From now on, we consider the particular situation c(x, y) = d (x, y), which is enough for our purpose. We now have the two terms of the balance: on one hand, the marginal gain f (x) − f (y), and on the other hand, the cost to change d (x, y). Precisely, the marginal satisficing relation y ,S x says that it is worthwhile for the agent to pass from x to y if the expected marginal gain (“reduction of dissatisfaction”) f (x) − f (y) is greater than or equal to the cost to pass from x to y: f (x) − f (y) ≥ d (x, y). Let us examine some properties of the marginal satisficing relation ,S .
i
i i
i
i
i
i
98
book 2014/8/6 page 98 i
Chapter 3. Abstract variational principles
Proposition 3.4.1. The marginal satisficing relation ,S is a partial ordering relation on dom f . An element x¯ ∈ X is maximal with respect to this order iff for all x ∈ X , x = x¯ one has f (¯ x ) < f (x) + d (¯ x , x). PROOF. Clearly ,S is reflexive; we have x ,S x for all x ∈ X , because d (x, x) = 0. Let us verify that ,S is antisymmetric. Suppose y ,S x and x ,S y. We thus have f (x) ≥ f (y) + d (x, y) and f (y) ≥ f (x) + d (x, y). Let us add these two inequalities and simplify the resulting expression. (At this point, note that it is important to consider x and y in dom f .) One obtains 2d (x, y) ≤ 0, and hence x = y. Let us now verify that ,S is transitive. Suppose that z ,S y and y ,S x. We have f (y) ≥ f (z) + d (y, z), f (x) ≥ f (y) + d (x, y). By adding the two above inequalities, and using again that x, y, z ∈ dom f , we obtain f (x) ≥ f (z) + d (x, y) + d (y, z) ≥ f (z) + d (x, z). In the above inequality we used the triangle inequality property satisfied by the metric d . Hence, z ,S x and ,S is transitive. Let us now express that an element x¯ of X is maximal with respect to the partial ordering relation ,S . This means that for any x ∈ X , the following implication holds: x ,S x¯ =⇒ x = x¯, ∀x ∈ X , x = x¯,
i.e.,
f (¯ x ) < f (x) + d (¯ x , x),
which completes the proof. Cognitive interpretation of maximal elements for S . Let us show that maximal elements of the satisficing relation ,S can be interpreted as stable routines of the corresponding “worthwhile to move” (marginal satisficing) dynamical system. In our cognitive version, x¯ ∈ X is said to be a stable routine if, starting from x¯, the agent prefers to stay at x¯ than to move from x¯ to x for all x different from x¯. Let us make this precise and consider, when starting from x¯, the two following possibilities: (a) if the agent chooses to stay at x¯, his gain (dissatisfaction in our case) will be f (¯ x ) + d (¯ x , x¯) = f (¯ x ); (b) if the agent considers to move from x¯ to x, his gain (dissatisfaction) is after moving (one adds two unsatisfactions, the cost to move d (¯ x , x) and the dissatisfaction attached to x): f (x) + d (¯ x , x). Thus, the agent is willing to stay at x¯ and rejects any move from x¯ to x for all x = x¯ iff x , x), f (¯ x ) < f (x) + d (¯ which, by Proposition 3.4.1, just expresses that x¯ is maximal for ,S .
i
i i
i
i
i
i
3.4. Ekeland’s -variational principle
book 2014/8/6 page 99 i
99
Therefore, Ekeland’s -variational principle can be reformulated as an existence result of a maximal element for the partial ordering ,S . Let us make this precise in the following statement. Theorem 3.4.3. Let us assume that (X , d ) is a complete metric space and f : X −→ R ∪ {+∞} is an extended real-valued function which is lower semicontinuous and bounded below. Then for any x0 ∈ dom f there exists some x¯ ∈ X which satisfies the two following properties: (i) x¯ ,S x0 , (ii) x¯ is maximal with respect to the partial ordering ,S . Before proving Theorem 3.4.3, let us show how Ekeland’s -variational principle can be derived from it: given > 0, take x0 ∈ dom f such that inf f ≤ f (x0 ) ≤ inf f + . X
X
Then, let us apply Theorem 3.4.3 with the metric d and the corresponding satisficing relation y ,S x ⇐⇒ f (y) + d (x, y) ≤ f (x). Theorem 3.4.3 asserts the existence of x¯ such that x¯ ,S x0 and x¯ maximal with respect to ,S . The property x¯ ,S x0 implies x , x0 ) f (¯ x ) ≤ f (x0 ) − d (¯ ≤ f (x0 ) ≤ inf f + . X
On the other hand, by Proposition 3.4.1 and the maximality property of x¯ , we have ∀x = x¯
f (¯ x ) < f (x) + d (¯ x , x).
Thus, x¯ satisfies the two desired properties (i) and (ii) of Theorem 3.4.1. We are going to prove Theorem 3.4.3 (and hence Ekeland’s -variational principle 3.4.1) by using the dynamical system, which is naturally associated to the marginal satisficing relation. Definition 3.4.3. A trajectory (xk )k∈N of the marginal satisficing dynamics is a sequence of elements xk of X such that xk+1 ∈ S(xk )
∀ k = 0, 1, 2, . . . ,
(S)
where S is the marginal satisficing relation. Equivalently, we have x0 -S x1 -S x2 -S · · · -S xk -S xk+1 -S · · · , that is, f (xk+1 ) + d (xk , xk+1 ) ≤ f (xk )
∀ k = 0, 1, 2, . . . .
Let us establish some general properties of the trajectories of the above dynamical system (S). We are mostly concerned with the asymptotic behavior as k → +∞ of these trajectories.
i
i i
i
i
i
i
100
book 2014/8/6 page 100 i
Chapter 3. Abstract variational principles
Proposition 3.4.2. Let (X , d ) be a metric space and f : X −→ R ∪ {+∞} an extended real-valued function which is proper and bounded below. Take any trajectory (xk )k∈N of (S) starting from some x0 ∈ dom f , x0 -S x1 -S x2 -S · · · -S xk -S xk+1 -S · · · . Then, the following properties hold: (i) ( f (xk ))k∈N decreases with k, and f (xk ) → infk f (xk ) ∈ R when k → +∞. (ii) The sequence (xk )k∈N satisfies +∞ d (xk , xk+1 ) < +∞. Hence, it is a Cauchy sequence k=0 in (X , d ). When (X , d ) is a complete metric space, the sequence (xk )k∈N converges in (X , d ) to some x¯ ∈ X . Moreover, when f is lower semicontinuous, we have x¯ ,S xk for all k ∈ N. PROOF. (i) For any k ∈ N, by definition of -S f (xk+1 ) ≤ f (xk+1 ) + d (xk , xk+1 ) ≤ f (xk ). We have used d (xk , xk+1 ) ≥ 0, which expresses that changes are costly. Therefore, the sequence ( f (xk ))k∈N is decreasing. Since −∞ < inf f ≤ f (xk ) ≤ f (x0 ) < +∞, X
we have f (xk ) ↓ infk f (xk ), which is a finite real number. (ii) Let us write the inequality f (xk+1 ) + d (xk , xk+1 ) ≤ f (xk ) for k = 0, 1, . . . , n − 1, f (x1 ) + d (x0 , x1 ) ≤ f (x0 ) . . . f (xn ) + d (xn−1 , xn ) ≤ f (xn−1 ). Then, we sum these inequalities and simplify the resulting expression. f (xk ) ∈ R for all k ∈ N.) We obtain f (xn ) +
n−1
(Note that
d (xk , xk+1 ) ≤ f (x0 ).
k=0
Let us now use the minorization f (xn ) ≥ infX f and the assumption infX f > −∞. We thus have n−1
d (xk , xk+1 ) ≤ f (x0 ) − inf f < +∞. X
k=0
This being true for any n ∈ N, we deduce +∞ k=0
d (xk , xk+1 ) ≤ f (x0 ) − inf f < +∞. X
i
i i
i
i
i
i
3.4. Ekeland’s -variational principle
book 2014/8/6 page 101 i
101
Note that this holds true, just by assuming that (X , d ) is a metric space. Then, by a classical argument, when (X , d ) is a complete metric space, this implies the convergence of the sequence (xk )k∈N in (X , d ). To see this, write the triangle inequality d (xn , xn+ p ) ≤ ≤
n+ p−1
d (xk , xk+1 )
k=n +∞
d (xk , xk+1 ),
k=n
which tends to zero as n → +∞. Hence (xk )k∈N is a Cauchy sequence in (X , d ) which implies its convergence when (X , d ) is a complete metric space. Let xk → x¯ in (X , d ) as k → +∞. Let us prove that x¯ ,S xn for all n ∈ N. We have xk ,S xn for all k ≥ n (by transitivity of ,S ), i.e., f (xk ) + d (xk , xn ) ≤ f (xn ) ∀ k ≥ n. Let us fix n ∈ N and let k → +∞ in this inequality. Since xk → x¯ in (X , d ), by using the lower semicontinuity property of f (up to now we have not used it!), we obtain f (¯ x ) + d (¯ x , xn ) ≤ f (xn ), that is, x¯ ,S xn . Our objective is to prove the existence of a trajectory (xk )k∈N of the marginal satisficing dynamics (S) which converges to a maximal element x¯ for ,S . So doing, we will have x¯ ,S xk for all k ∈ N and hence x¯ ,S x0 , which combined with the maximality of x¯ is precisely the claim of Theorem 3.4.3. In this perspective, to consider an arbitrary trajectory of (S) does not provide enough information: note that xk ≡ x0 for all k ∈ N is a trajectory of (S)! The dynamical system (S) modelizes a general rejection decision mechanism. We are now going to consider some trajectory of (S) which describes the decision process of a motivated agent. This means that at each step, the agent is willing to substantially improve his performance. This is a rich modelization subject involving some optimization aspects. In this perspective, the notion of aspiration index m(x), which is defined in the next statement, plays an important role. Lemma 3.4.1. For any x ∈ X , diam S(x) ≤ 2( f (x) − m(x)), where
5 4 5 4 m(x) = inf f (y) : y ∈ S(x) = inf f (y) : y ,S x
is called the aspiration index of the agent at x. PROOF. The proof is an immediate consequence of the definition of y ∈ S(x): y ∈ S(x) ⇐⇒ f (y) + d (x, y) ≤ f (x).
i
i i
i
i
i
i
102
book 2014/8/6 page 102 i
Chapter 3. Abstract variational principles
Noticing that for y ∈ S(x) we have f (y) ≥ m(x), we deduce ∀y ∈ S(x)
d (x, y) ≤ f (x) − m(x).
As a consequence, for any y, z ∈ S(x), d (y, z) ≤ d (y, x) + d (x, z) ≤ 2( f (x) − m(x)), and diam S(x) ≤ 2( f (x) − m(x)). The aspiration index of the agent at x, say, m(x), measures the gap between its present level of satisfaction at x and the maximum level of satisfaction that it can hope to obtain at a further step. Note that m(x) is, in general, not known by the agent who is not able to explore all of S(x). The cognitive model says that if the agent is motivated and is willing to explore enough at each step (and pay corresponding exploration costs!), then it knows a sufficiently good approximation of m(x) and the process converges to a stable routine. We are going to consider trajectories (xk )k∈N corresponding to a motivated agent who satisfies enough at each step. As an example, we consider that at each step k, the agent satisfies and fills a given fraction λ ∈ ]0, 1[ of the gap between f (xk ) and m(xk ): thus xk+1 ,S xk and f (xk+1 ) ≤ λm(xk+1 ) + (1 − λ) f (xk ) = f (xk ) − λ[ f (xk ) − m(xk )]. We now have all the ingredients to state a dynamical, cognitive version and proof of Ekeland’s -variational principle [55]. Theorem 3.4.4 (Attouch and Soubeyran). Let (X , d ) be a complete metric space and f : X −→ R ∪ {+∞} an extended real-valued function which is lower semicontinuous and bounded below. (a) Then, for any x0 ∈ dom f , there exists a trajectory (xk )k∈N of the marginal satisficing dynamical system (S), x0 -S x1 -S x2 -S · · · -S xk -S xk+1 -S · · · , which converges in (X , d ) to some x¯ ∈ X which is a maximal element for the partial ordering ,S . (b) Such trajectory can be obtained by satisficing enough at each step, for example, given some positive parameter 0 < λ < 1, by taking at each step xk+1 ,S xk and f (xk+1 ) ≤ f (xk ) − λ[ f (xk ) − m(xk )], 5 4 where m(.) is the aspiration index : m(x) = inf f (y) : y ,S x . PROOF. Take a trajectory (xk )k∈N of (S) which satisfies enough at each step. One can always construct such a trajectory just by using the definition of m(x) as an infimum. Then observe that the sequence (S(xk ))k∈N is nested. Since ,S is transitive and xk+1 ,S xk , we have the following implication: y ∈ S(xk+1 ) ⇐⇒ y ,S xk+1 =⇒ y ,S xk ⇐⇒ y ∈ S(xk ), i.e., S(xk+1 ) ⊂ S(xk ) for all k ∈ N.
i
i i
i
i
i
i
3.4. Ekeland’s -variational principle
book 2014/8/6 page 103 i
103
Let us prove that diam S(xk ) → 0 as k → +∞. By using Lemma 3.4.1, it is enough to prove that f (xk ) − m(xk ) → 0 as k → +∞. Since S(xk+1 ) ⊂ S(xk ) we have 4 5 m(xk+1 ) = inf f (y) : y ∈ S(xk+1 ) 4 5 ≥ inf f (y) : y ∈ S(xk ) = m(xk ). We now use that this agent satisfies enough, i.e., f (xk+1 ) ≤ f (xk ) − λ[ f (xk ) − m(xk )], and the inequality m(xk+1 ) ≥ m(xk ) to obtain f (xk+1 ) − m(xk+1 ) ≤ f (xk ) − λ[ f (xk ) − m(xk )] − m(xk ) ≤ (1 − λ)[ f (xk ) − m(xk )]. Hence and
f (xk ) − m(xk ) ≤ (1 − λ)k [ f (x0 ) − m(x0 )] diam S(xk ) ≤ 2(1 − λ)k [ f (x0 ) − m(x0 )].
Since 0 < λ < 1 we have diam S(xk ) → 0 as k → +∞. The sequence (S(xk ))k∈N is a decreasing sequence of closed nonempty sets (closedness follows from the lower semicontinuity of f )( whose diameter tends to zero. Since (X , d ) is complete, we have, by a classical x } is nonvoid and is reduced to a single element x¯ ∈ X . For result, that k∈N S(xk ) = {¯ any k ∈ N, we have xk and x¯, which belong to S(xk ), hence d (xk , x¯) ≤ diam S(xk ) which tends to zero. Thus, xk converges to x¯ in (X , d ) as k → +∞. The maximality of x¯ with respect to ,S follows from the following observation: suppose that y ,S x¯. Since x¯ ∈ S(xk ) for every k ∈ N, we have y ,S xk for all k ∈ N, i.e., ( y ∈ k∈N S(xk ) = {¯ x }. Indeed, when proving Theorem 3.4.3 and its dynamical version (Theorem 3.4.4), we have obtained a stronger version of Ekeland’s variational principle, which is formulated below. Theorem 3.4.5. Let (X , d ) be a complete metric space and f : X −→ R ∪ {+∞} a proper lower semicontinuous function which is bounded below. Let > 0 and x0 ∈ X be given such that f (x0 ) ≤ inf f + , X
and let λ > 0. Then there exists some x¯,λ ∈ X such that f (¯ x,λ ) ≤ f (x0 ) ≤ inf f + ; d (¯ x,λ , x0 ) ≤ λ; f (¯ x,λ ) < f (x) +
X
λ
d (¯ x,λ , x)
∀ x = x¯,λ .
PROOF. Let us apply Theorem 3.4.3 with λ d instead of d . One obtains the existence of some x¯,λ ∈ X which satisfies x¯,λ ,S x0 , i.e., f (¯ x,λ ) +
λ
d (¯ x,λ , x0 ) ≤ f (x0 ),
i
i i
i
i
i
i
104
book 2014/8/6 page 104 i
Chapter 3. Abstract variational principles
and x¯,λ is maximal with respect to ,S . We thus have f (¯ x,λ ) ≤ f (x0 ) and inf f + X
λ
d (¯ x,λ , x0 ) ≤ f (¯ x,λ ) + ≤ f (x0 ) ≤ inf f + ,
λ
d (¯ x,λ , x0 )
X
which implies d (¯ x,λ , x0 ) ≤ λ. The last property expresses that x¯,λ is maximal with respect to ,S .
i
i i
i
i
i
i
book 2014/8/6 page 105 i
Chapter 4
Complements on measure theory
4.1 Hausdorff measures and Hausdorff dimension This section aims to define intrinsically the intuitive notion of length, area, and volume. More precisely, we would like to provide a nonnegative measure for any subset of RN , which agrees with the well-known k-dimensional measure for regular k-dimensional surfaces when k is an integer. Hausdorff’s construction, as shown below, is particularly well suited to the geometry of sets and does not require any local parametrization on these sets; therefore, no assumption of regularity is needed. For instance, the Hausdorff measure offers the possibility of measuring fractal sets, as well as defining a new notion of dimension for any set, thereby extending the classical topological dimension. Note that the process described in the first subsection is the Carathéodory general approach to construct a measure from a σ-subadditive set function (or outer measure).
4.1.1 Outer Hausdorff measures and Hausdorff measures We denote the collection of all subsets of RN by + (RN ) and, for any nonempty set E of + (RN ), we set diam(E) = sup{d (x, y) : (x, y) ∈ E}, the diameter of E, where d is the Euclidean distance in RN . When s is a positive integer we denote the volume of the unit ball of R s by ω s ; in the general case s ≥ 0, we set π s /2
ωs =
Γ (1 + s/2)
,
where Γ is the well-known Euler function Γ (t ) =
+∞
x t −1 e −x d x.
0
We also set c s = 2−s ω s . For instance, we have c0 = 1,
c1 = 1,
c2 = π/4,
c3 = π/6.
Let E be any set of + (RN ) and δ > 0. A finite or countable family (Ai )i ∈N of sets in + (RN ) satisfying 0 < diam(Ai ) ≤ δ and E ⊂ i ∈N Ai will be called a δ-covering of E. 105
i
i i
i
i
i
i
106
book 2014/8/6 page 106 i
Chapter 4. Complements on measure theory
Definition 4.1.1. For each s ≥ 0, δ > 0, and E ⊂ RN , let us set s s δ (E) := inf c s diam(Ai ) : (Ai )i ∈N δ-covering of E . i ∈N
The s-dimensional outer Hausdorff measure is the set mapping s taking its values in [0, +∞] defined by s (E) := sup δs (E) δ>0
= lim δs (E). δ→0
Remark 4.1.1. The constant c s is a normalization parameter so that when s ∈ N and E is an s-dimensional regular hypersurface of RN , s (E) is the s-dimensional area of E (see Proposition 4.1.6 or Theorem 4.1.1). The positive number δ, intended to tend to zero, forces the sets Ai of the δ-covering to follow the geometry of E. Proposition 4.1.1. The set function s : + (RN ) −→ [0, +∞] is an outer measure, i.e., satisfies (i) s () = 0; (ii) (σ-subadditivity) for all sequences (Ei )i ∈N of subsets of RN such that E ⊂ ∪i ∈N Ei , s (E) ≤ s (Ei ); i ∈N
(iii) s is a nondecreasing set function, that is, s (A) ≤ s (B) whenever A ⊂ B. PROOF. For all A in + (RN ) such that diam(A) ≤ δ, one has δs () ≤ diam(A) s ≤ δ s and (i) follows. The monotonicity property (iii) follows straightforwardly from the definition of s . Let now > 0 and (Ai , j ) j ∈N be a δ-covering of Ei satisfying cs
j ∈N
diam(Ai , j ) s ≤
2i
+ δs (Ei ).
Obviously, (Ai , j )(i , j )∈N2 is a δ-covering of i ∈N Ei so that 6 7 3 s δ Ei ≤ δs (Ei ) + 2. i ∈N
i ∈N
Letting δ → 0, conclusion (ii) follows from the fact that s is a nondecreasing set function and is arbitrary. Following the classical construction of a measure from an outer measure, we define the subset s of s -measurable sets in the sense of Carathéodory: A ∈ s ⇐⇒ ∀X ∈ + (RN ), s (X ) = s (X ∩ A) + s (X \ A). Note that and RN belong to s . Proposition 4.1.2. The set s is a σ-algebra and s is σ-additive on s .
i
i i
i
i
i
i
4.1. Hausdorff measures and Hausdorff dimension
book 2014/8/6 page 107 i
107
PROOF. Obviously RN belongs to s . The proof of stability of s with respect to the complementary operation is easily established from a straightforward calculation. The proof of stability for countable union and the σ-additivity of s is divided into three steps. First step. Stability for finite unions and finite intersections. Let A1 , A2 be two sets in s and X some set in + (RN ). Since A1 and A2 are two measurable sets, an elementary calculation on sets gives s (X ) = s (A1 ∩ X ) + s (X \ A1 ) = s (A1 ∩ X ) + s ((X \ A1 ) ∩ A2 ) + s ((X \ A1 ) \ A2 ) = s (A1 ∩ X ) + s (X ∩ A2 \ A1 ) + s (X \ (A1 ∪ A2 )). According to the identities X ∩ A1 = X ∩ (A1 ∪ A2 ) ∩ A1 and X ∩ A2 \ A1 = X ∩ (A1 ∪ A2 ) \ A1 , we derive
s (X ) = s (X ∩ (A1 ∪ A2 )) + s (X \ (A1 ∪ A2 ))
so that A1 ∪ A2 ∈ s . Stability for finite intersection is then obtained thanks to the stability with respect to the complementary operation. Note that substituting X by X ∩ (A1 ∪ A2 ) in s (X ) = s (A1 ∩ X ) + s (X \ A1 ) we obtain the following important equality used in the step below: s (X ∩ (A1 ∪ A2 )) = s (X ∩ A1 ) + s (X ∩ A2 )
(4.1)
whenever A1 and A2 are disjoint sets in s . Second step. Stability for disjoint countable unions and σ-additivity of s . Let (Ai )i ∈N be a family of pairwise disjoint sets in s and A their union. According to the subadditivity of s for all X in + (RN ) we have s (X ) ≤ s (X \ A) + s (X ∩ A) ∞ ≤ s (X \ A) + s (X ∩ Ai ) i =0
6
s
= (X \ A) + lim
s
n→+∞
6
6
≤ lim inf s X \ n→+∞
n 3
7
n 3
7 X ∩ Ai
i =0
6
Ai + s
i =0
n 3
77 X ∩ Ai
i =0
= s (X ), where we have used (4.1) in the first equality, the nondecreasing of s in the inequality, and the stability for finite union in the last equality. This proves that A belongs to s . The σ-additivity of s is obtained by taking X = A. Last step. Stability for countable union. Let (Ai )i ∈N be a family of sets in s . Set B0 = A0 and for n ≥ 1,
Bn = An \
n−1 3
Ai .
i =0
According to the firststep, the family (Bi )i ∈N is made up of pairwise disjoint sets of s so that, from step 2, i ∈N Bi = i ∈N Ai belongs to s .
i
i i
i
i
i
i
108
book 2014/8/6 page 108 i
Chapter 4. Complements on measure theory
Definition 4.1.2. The restriction to s of the set function s is called the s-dimensional Hausdorff measure. The s-dimensional Hausdorff measure s is a [0, +∞]-valued Borel measure in the following sense. Proposition 4.1.3. The σ-algebra s contains the σ-algebra of all the Borel sets of RN . PROOF. The proof is based on the following Carathéodory criterion. An outer measure satisfying this criterion is sometimes called an outer metric measure. Lemma 4.1.1. Let μ be an outer measure on a metric space E equipped with its Borel σalgebra (E) and μ be the σ-algebra of the measurable sets in the Carathéodory sense. Then (E) ⊂ μ iff μ is additive on any pair {A, B} of sets of E satisfying d (A, B) > 0. PROOF OF LEMMA 4.1.1. Let us assume (E) ⊂ μ and let A, B be two subsets of E such that d (A, B) > 0. Since A ∈ μ the two decompositions A = (A ∪ B) ∩ A, B = (A ∪ B) \ A imply μ(A ∪ B) = μ(A) + μ(B). Conversely, let A be an open subset of E and X any fixed subset of E such that μ(X ) < +∞. According to subadditivity, it is enough to establish the inequality μ(X ) ≥ μ(X ∩ A) + μ(X \ A). One may assume μ(X ) < +∞. For all k ∈ N∗ , let us define the sets 1 Ak := x ∈ A : d (x, E \ A) > , k Bk := Ak+1 \ Ak . − k1 > 0, we have for all n ∈ N 7 6 n n 3 μ(X ∩ B2i ) = μ X ∩ B2i ≤ μ(X ),
Noticing that for k − l ≥ 2, d (Bk , B l ) ≥
1 l +1
i =1
hence
i =1
μ(X ∩ Bk ) ≤ μ(X ).
k even
The same calculation gives
μ(X ∩ Bk ) ≤ μ(X )
k odd
so that
μ(X ∩ Bk ) ≤ 2μ(X )
k∈N
and lim
k→+∞
μ(X ∩ Bi ) = 0.
i ≥k
i
i i
i
i
i
i
4.1. Hausdorff measures and Hausdorff dimension
book 2014/8/6 page 109 i
109
We infer lim μ(X ∩ (A \ Ak )) = lim μ X ∩ (∪i ≥k Bi ) k→+∞ k→+∞ ≤ lim μ(X ∩ Bi ) = 0, k→+∞
i ≥k
and, by subadditivity, we deduce μ(X ∩ A) ≤ μ(X ∩ Ak ) + μ(X ∩ (A \ Ak )) ≤ lim inf μ(X ∩ Ak ). k→+∞
Since obviously lim supn→+∞ μ(X ∩ Ak ) ≤ μ(X ∩ A), one has μ(X ∩ A) = lim μ(X ∩ Ak ). k→+∞
From d (Ak , X \ A) >
1 k
> 0 we now obtain
μ(X ) ≥ μ((X ∩ Ak ) ∪ (X \ A)) = μ(X ∩ Ak ) + μ(X \ A), and we complete the proof of Lemma 4.1.1 by letting k → +∞. PROOF OF PROPOSITION 4.1.3 CONTINUED. According to Lemma 4.1.1, it is enough to prove that s is an outer metric measure. Let A and B be two subsets of RN such that d (A, B) > 0 and let ' = (Ci )i ∈N be a covering of A ∪ B with diam(Ci ) ≤ δ < d (A, B). We can write this covering as the union of two disjoint δ-coverings: take = {C ∈ ' : C ∩ A = } and = {C ∈ ' : C ∩ B = }. Therefore ∞ i =1
c s diam(Ci ) s =
C ∈
c s diam(C ) s +
C ∈
c s diam(C ) s
and δs (A ∪ B) ≥ δs (A) + δs (B). We end the proof by letting δ → 0. Remark 4.1.2. The following process is often used to construct an outer measure and thus a measure on Cantor-type sets C in RN . Let /0 = {C } and for n ∈ N∗ let /n be a finite collection of disjoint subsets E of C , such that each E ∈ /n is contained in one of the sets of /n−1 . We assume moreover that limn→+∞ {diam(E) : E ∈ /n } = 0. We set now μ(C ) = a, where a is an arbitrary number satisfying 0 < a < +∞ and, for all the sets Ei , i = 1, . . . , m1 , of /1 , we define the masses μ(Ei ) such that m1 i =1
μ(Ei ) = μ(C ). m
n Ei , Similarly, we assign masses μ(Ei ) to the sets of /n such that if E ∈ /n−1 , E = ∪i =1 Ei ∈ /n , mn μ(Ei ) = μ(E).
i =1
We finally set
6 μ RN \
3
7 E = 0.
E∈/n
i
i i
i
i
i
i
110
book 2014/8/6 page 110 i
Chapter 4. Complements on measure theory
Let us denote the collection of all the complementary sets of /n by / n and set / = ∪n∈N (/n ∪ / n ). For each set A ∈ + (RN ), we now extend μ by setting μ(A) = inf
i ∈N
μ(Ei ) : A ⊂
3
Ei , Ei ∈ /
,
i ∈N
which defines an outer measure. One can prove that μ is a measure whose σ-algebra of measurable sets contains the σ-algebra of Borel sets of RN . Moreover, the support of μ, that is, the smallest closed set X of RN such that μ(RN \ X ) = 0, is contained in ∩n∈N ∪E∈/n E. Remark 4.1.3. The construction of Hausdorff measures s can be made in a general metric space X and indeed most of the properties continue to hold in this larger framework. For a detailed presentation of Hausdorff measures in metric spaces, see Ambrosio and Tilli [29]. Theorem 4.1.1. For all Lebesgue measurable sets E in RN , we have N (E) = " N (E), where " N denotes the Lebesgue measure on RN . Moreover, s (E) = 0 for s > N , whereas 0 is the counting measure. PROOF. For all Lebesgue measurable sets E in RN , let us recall the so-called isodiametric inequality, which asserts that the Lebesgue measure of E is less than the Lebesgue measure of any ball having the same diameter: " N (E) ≤ cN (diam(E))N . For a proof see, for instance, Evans and Gariepy [211]. For all covering (Ai )i ∈N of E we then have cN (diam(Ai ))N ≥ " N (Ai ) ≥ " N (E), i ∈N
i ∈N
hence δN (E) ≥ " N (E). Letting δ → 0 gives N (E) ≥ " N (E). The converse inequality is more involved. We will use the so-called Vitali’s covering lemma. Let us first define the notion of fine covering. Definition 4.1.3. A family 0 of closed balls in RN is said to cover a set E ⊂ RN finely if, for each x ∈ E and each > 0, there exists B r (x) ∈ 0 with r < , where B r (x) denotes the open ball with radius r and centered at x. Lemma 4.1.2 (Vitali’s covering theorem). Let E ⊂ RN with " N (E) < +∞ and finely covered by a family of closed balls 0 . Then there exists a countable subfamily 1 of pairwise disjoint elements of 0 such that 6 7 3 N E\ B = 0. " B∈1
PROOF. For the proof see, for instance, Ziemer [366]. Remark 4.1.4. In the definition of a family finely covering a subset E of RN one may replace the family of closed balls by a regular family of closed subsets of RN (see [366]).
i
i i
i
i
i
i
4.1. Hausdorff measures and Hausdorff dimension
book 2014/8/6 page 111 i
111
Vitali’s covering theorem is also valid for nonnegative Radon measures μ on RN (i.e., locally bounded nonnegative Borel measures; see Section 4.2). For a proof consult [366]. PROOF OF THEOREM 4.1.1 CONTINUED. Let A be a set of finite Lebesgue measure in RN and, for each η > 0, let U be an open subset of RN such that A ⊂ U and " N (U \ A) < η. There exists a family 0 of closed balls with diameter less than δ, included in U , and covering finely U . Indeed, for all x ∈ U , there exists a closed ball B r (x) (x) included in U and 0 = (B r (x)) x∈U , r ≤r (x)∧δ/2 is such a suitable family. According to Lemma 4.1.2 one can extract a subfamily (Bi )i =1,...,∞ of pairwise disjoint elements with diameter less than δ, satisfying 6 7 ∞ ∞ 3 3 N U \ Bi = 0, Bi ⊂ U . " ∗
Set A :=
∞ i =1
i =1
i =1 ∗
(Bi ∩ A). We have " (A \ A ) = 0 and N
δN (A∗ ) ≤ =
∞ i =1
∞
cN (diamBi )N " N (Bi )
i =1 N
= " (U ) ≤ " N (A) + η. Letting δ → 0 and η → 0, we obtain N (A∗ ) ≤ " N (A). It remains to establish " N (A \ A∗ ) = 0 =⇒ N (A\A∗ ) = 0 and more generally that if E is a Borel set satisfying " N (E) = 0, then N (E) = 0. This property is a corollary of the first assertion of Lemma 4.1.3 below, whose proof may be found in [366]. Lemma 4.1.3. Let 0 be a family of closed balls with sup{diamB : B ∈ 0 } < +∞. Then there exists a countable subfamily 1 of pairwise disjoint elements of 0 such that 3 3 B⊂ B ∗, B∈0
B∈1
∗
where B denotes the closed ball concentric with B with radius five times as big as that of B. Let E be a subset of RN , finely covered by 0 ; then, for all finite family 1 ∗ ⊂ 1 , ! 3 "3! 3 " E⊂ B B∗ . B∈1 ∗
B∈1 \1 ∗
Indeed, for η > 0, consider an open subset U of RN such that E ⊂ U and " N (U ) ≤ η. The open set U being a union of closed balls included in U with diameter less than δ, according to Lemma 4.1.3, there exists a family 1 of pairwise disjoint closed balls included in U such that U ⊂ B∈1 B ∗ and δN (E) ≤ HδN (B ∗ ) B∈1
≤
cN 5N (diam(B))N
B∈1
= 5N
6 " N (B) = 5N " N
B∈1
3
7 B < 5N η.
B∈1
The conclusion follows by letting δ → 0 and η → 0.
i
i i
i
i
i
i
112
book 2014/8/6 page 112 i
Chapter 4. Complements on measure theory
We establish now that for all Borel subsets E of RN , s (E) = 0 when s > N . One can write E = ∪n∈N En , where N (En ) < +∞. (Take En = Bn ∩ E, where Bn denotes the ball of RN with radius n, centered at 0.) Let δ > 0 and (Ai )i ∈N a δ-covering of En . We have cs diam(Ai ) s ≤ δ s −N cN diam(Ai )N cs cN i ∈N i ∈N which yields δs (En ) ≤
cs cN
δ s −N N (En ),
and finally, letting δ → 0, one obtains s (En ) = 0. Therefore, by subadditivity, s (E) ≤
∞
s (En ) = 0.
n=0
The fact that 0 is the counting measure is easy to establish and left to the reader.
4.1.2 Hausdorff measures: Scaling properties and Lipschitz transformations The scaling properties of length, area, or volume are well known. The two propositions below summarize and generalize these properties. Proposition 4.1.4. Let A be any subset of RN and λ > 0. Then s (λA) = λ s s (A). PROOF. If (Ai )i is a δ-covering of A, then (λAi )i is a λδ-covering of λA so that s λδ (λA) ≤
∞ i =1
c s (λdiam(Ai )) s = λ s
∞ i =1
c s diam(Ai ) s .
s Therefore λδ (λA) ≤ λ s δs (A). Letting δ → 0, we obtain s (λA) ≤ λ s s (A). Replacing λ with 1/λ and A with λA gives the converse inequality.
Proposition 4.1.5. Let A be any subset of RN and f : A −→ R m satisfying for all x and y in A | f (x) − f (y)| ≤ L|x − y|α , where L > 0 and α > 0 are two given constants. Then s /α ( f (A)) ≤
c s /α cs
L s /α s (A).
Consequently, if f is a Lipschitz function with modulus L, then s ( f (A)) ≤ L s s (A), and if f is an isometry, then s ( f (A)) = s (A).
i
i i
i
i
i
i
4.1. Hausdorff measures and Hausdorff dimension
book 2014/8/6 page 113 i
113
PROOF. If (Ai )i is a δ-covering of A, then ( f (A ∩ Ai ))i is an Lδ α -covering of f (A), and s /α
Lδ α ( f (A)) ≤ s /α
∞ c s /α
cs
i =1
We deduce that Lδ α ( f (A)) ≤ δ → 0.
c s diam( f (A ∩ Ai )) s /α ≤ cs/α cs
c s /α cs
L s /α
∞ i =1
c s diam(Ai ) s .
L s /α δs (A), and the conclusion follows after letting
The Hausdorff measure of an N -dimensional regular hypersurface of R m is nothing but its classical area when the hypersurface is defined by means of a parametrization. Proposition 4.1.6. Let m ∈ N with m ≥ N , f : RN −→ R m be a one-to-one Lipschitz map and E a Borel subset of RN . Then ⎛ N ⎞1/2 Cm |Ji |2 ⎠ d " N , N ( f (E)) = ⎝ E
i =1
where Ji , i = 1, . . . , C mN , are the N × N -minors of the Jacobian matrix of f . PROOF. To illustrate how the definition of the one-dimensional Hausdorff measure is well adapted to the local geometry of arcs, we only give the proof in the case N = 1, where f : [0, 1] −→ R m is denoted by t → (xi (t ))i =1,...,m . For a complete proof, consult Ambrosio [19] or Evans and Gariepy [211]. We must establish
m 1/2 1 2 f ([0, 1]) = |xi (t )| dt =
[0,1]
[0,1]
i =1
| f (t )| d t ,
which is the length of the parametrized arc Γ = f ([0, 1]). We use the well-known classical result straightforwardly obtained from Taylor’s formula and Riemann integration theory: if f belongs to C1 ([α, β], R m ), we have n−1 | f (t )| d t = lim | f (ti +1 ) − f (ti )| ≥ | f (α) − f (β)|, [α,β]
η→0
i =0
where t0 = α < t1 < · · · ti < ti +1 < · · · tn = β is some finite subdivision of [α, β] with η = maxi =0,...,n−1 (ti +1 − ti ). f is assumed to belong to C1 ([0, 1], R m ) and we prove Step 1. The function 1 | f (t )|d t ≥ (Γ ). For all δ > 0, let η > 0 be such that [0,1] |t − t | < η =⇒ | f (t ) − f (t )| < δ. Let us consider a finite subdivision t0 = 0 < t1 < · · · ti < ti +1 < · · · tn = 1 of [0, 1] satisfying η > maxi =0,...,n−1 (ti +1 −ti ) and set Γi = f ([ti , ti +1 )) for i = 0, . . . , n−1. Consider now ai , bi in [ti , ti +1 ] such that diam(Γi ) = | f (ai ) − f (bi )|. We have Γ \ { f (1)} =
n−1 3
Γi
i =0
i
i i
i
i
i
i
114
book 2014/8/6 page 114 i
Chapter 4. Complements on measure theory
with diam(Γi ) < δ and
[0,1]
| f (t )| d t =
n−1 [ti ,ti+1 ]
i =0
≥
n−1
[ai ,bi ]
i =0
≥
n−1 i =0
=
n−1 i =0
| f (t )| d t
| f (t )| d t
| f (ai ) − f (bi )| diam(Γi ) ≥ δ1 (Γ ).
We end this step by letting δ → 0. Step 2. The function f is again assumed to belong to C1 ([0, 1], R m ) and we prove the converse inequality [0,1] | f (t )| d t ≤ 1 (Γ ). Let n ∈ N, h = 1/n, ti = hi, i = 0, . . . , n − 1, and consider the covering Γ=
n−1 3 i =0
f ([ti , ti +1 )) ∪ { f (1)}
by pairwise disjoint Borel subsets f ([ti , ti +1 )) of Γ (recall that f is one-to-one). We then have n−1 1 (Γ ) = 1 ( f ([ti , ti +1 ))). i =0
For each i = 1, . . . , n − 1, consider the orthogonal projection π of f ([ti , ti +1 )) onto the line ( f (ti ), f (ti +1 )). As π does not increases distances, from Proposition 4.1.5 we have 1 f ([ti , ti +1 )) ≥ 1 π( f ([ti , ti +1 ))) , which is greater than 1 ([ f (ti ), f (ti +1 ))). Indeed, a convexity argument yields [ f (ti ), f (ti +1 )) ⊂ π( f ([ti , ti +1 ))). By using the definition, it is easy to prove for the segments [a, b ] in R m that 1 ([a, b ]) = |b − a|. Therefore 1 ([ f (ti ), f (ti +1 )]) = | f (ti ) − f (ti +1 )| and 1 (Γ ) ≥
n−1 i =0
| f (ti ) − f (ti +1 )|.
The conclusion follows after letting n → +∞. Step 3. The function f is assumed to be Lipschitz continuous. The conclusion is a straightforward consequence of the two previous steps and the following approximating result: every Lipschitz map f : E → R m can be approximated in the Lusin sense by functions of class C1 . More precisely, there exists a nondecreasing family (Ki )i ∈N of compact sets included in E such that 6 7 ∞ 3 N E \ Ki = 0 i =0
i
i i
i
i
i
i
4.1. Hausdorff measures and Hausdorff dimension
book 2014/8/6 page 115 i
115
and such that f Ki is the restriction of a function of class C1 . For a proof of this property, consult [211]. CmN 1/2 Remark 4.1.5. The measure |J |2 · " N is sometimes called the element of area i =1 i of the N -dimensional surface f (E) and, for all Borel subset B of f (E), ⎛
⎝
B → N (B) = f
−1
(B)
⎞1/2
N
Cm i =1
|Ji |2 ⎠
d" N
is its corresponding surface measure. When f is not one-to-one, the generalization of the formula in Proposition 4.1.6 is the so-called area formula: 71/2 6 m−N +1 0 (E ∩ f −1 (x)) N (d x) = |Ji |2 d" N, f (E)
i =1
E
where x → 0 (E ∩ f −1 (x)) is nothing but the multiplicity function as illustrated in the example of the parametrization of the unit circle: t → (cos(2nπt ), sin(2nπt )) from E = [0, 1[ into R2 . One can also generalize this formula as follows: let h : E → [0, +∞] be a Borel function; then
f (E) x∈ f −1 (y)
⎛
h(x) ⎝
h(x) N (d y) =
i =1
E
⎞1/2
N
Cm
|Ji |2 ⎠
" N (d x).
For these generalizations, see [19], [211], or Federer [213].
4.1.3 Hausdorff dimension Among the wide variety of definitions of dimension, the Hausdorff dimension introduced in Theorem 4.1.2 has the advantage of being defined for any set. For alternative definitions of dimensions, consult, for instance, Falconer [212]. Lemma 4.1.4. For all fixed set A of RN , the map s → s (A) is nonincreasing. More precisely, for all δ > 0 and for t > s, we have δt (A) ≤ δ t −s
ct cs
δs (A).
(4.2)
PROOF. Let (Ai )i ∈N be a δ-covering of A. One has δt (A) ≤ c t diam(Ai ) t i ∈N
≤
ct cs
i ∈N
c s diam(Ai ) s δ t −s .
The conclusion then follows by taking the infimum on all the δ-covering of A.
i
i i
i
i
i
i
116
book 2014/8/6 page 116 i
Chapter 4. Complements on measure theory
Theorem 4.1.2 (definition of the Hausdorff dimension). Let A be a set of RN and set s0 := inf{t ≥ 0 : t (A) = 0}. Then s0 satisfies s (A) =
s0 .
The real number s0 is called the Hausdorff dimension of the set A and is denoted by dimH (A). At the critical value s0 , s0 (A) may be zero or infinite or may satisfy 0 < s0 (A) < +∞. In this last case, A is called an s0 -set. PROOF. Letting δ → 0 in inequality (4.2), we obtain s (A) < +∞ =⇒ ∀t > s, t (A) = 0. Set s0 := inf{t ≥ 0 : t (A) = 0} and take s < s0 . Assume that s (A) < +∞. For t satisfying s < t < s0 we have t (A) = 0, which contradicts the definition of s0 . Consequently, for s < s0 , s (A) = +∞. Since the map s → s (A) is nonincreasing, for s > s0 , we have s (A) = 0. Obviously dimH (R) = 1 and 1 (R) = +∞. On the other hand, Proposition 4.1.9 provides a nontrivial set A in R with dimH (A) = 1 and satisfying 1 (A) = 0. Remark 4.1.6. Taking, as δ-covering, the class of balls of RN , one may define ∞ ∞ 3 s s N ˜ (E) := inf c , diam(B ) : E ⊂ B , 0 < diam(B ) ≤ δ, B ball of R δ
s
i =1
i
i
i
i
i =1
and the set mapping ˜ s , by ˜ s (E) := sup ˜δs (E) δ>0
= lim ˜δs (E). δ→0
It is easy to establish the following bounds: s (E) ≤ ˜ s (E) ≤ 2 s s (E). Thanks to this estimate, the Hausdorff dimensions defined from the two mappings s and ˜ s are equal. The following are useful for the Hausdorff dimension. Proposition 4.1.7. Let A be any set in RN . (i) The following implications hold true: s (A) < +∞ =⇒ dimH (A) ≤ s, s (A) > 0 =⇒ dimH (A) ≥ s. (ii) Let f : A −→ R m satisfying for all x and y in A, | f (x) − f (y)| ≤ L|x − y|α , where L > 0 and α > 0 are two given positive constants. Then dimH ( f (A)) ≤ α1 dimH (A).
i
i i
i
i
i
i
4.1. Hausdorff measures and Hausdorff dimension
book 2014/8/6 page 117 i
117
PROOF. The proof of assertion (i) is a straightforward consequence of the definition. Let us prove (ii). Indeed s > dimH (A) =⇒ s (A) = 0 and inequality s /α ( f (A)) ≤ cs/α s /α L s (A) established in Proposition 4.1.5 yields s /α ( f (A)) = 0 so that c s
s ≥ dimH ( f (A)). α The conclusion follows by letting s → dimH (A). Example 4.1.1. When U is an open subset of RN , then dimH (U ) = N . Indeed, U contains an open ball B and N (U ) ≥ N (B) > 0, and thus dimH (U ) ≥ N . Example 4.1.2. Every countable subset A of RN is a set of zero Hausdorff dimension. Indeed, by σ-additivity, one has s ({a}). s (A) = a∈A
Since ({a}) = 1, for all s > 0 one has s ({a}) = 0, and thus s (A) = 0. This proves that dimH (A) = 0. 0
Example 4.1.3. Let us consider for N ≤ m a one-to-one map f : RN −→ R m of class C1 . Let E be a compact subset of RN . We have dimH ( f (E)) = N . Indeed, from Proposition 4.1.6, we have 0 < N ( f (E)) ≤ C " N (E) < +∞, where C is a positive constant. Example 4.1.4. Let us consider now the middle third Cantor set C of the interval [0, 1]. 2 . Indeed, We have dimH (C ) = ln ln 3 % & % & 1 2 C = C ∩ 0, ∪ C ∩ ,1 . 3 3 Set C1 := C ∩[0, 13 ] and C2 := C ∩[ 32 , 1]. These two sets are geometrically similar to C by a ratio 13 . According to the properties of the Hausdorff measure established in Proposition 4.1.5, we derive 2 s (C ) = s s (C ). 3 The conclusion follows if we assume that C is an s-set, that is, 0 < s (C ) < +∞. For a proof of this property, consult [212], where various interesting methods are given for the calculation of Hausdorff dimensions of fractal sets. More generally, let S1 , . . . , S m : RN → RN be m similarities, i.e., satisfying |Si (x) − Si (y)| = ci |x − y| for all x, y in RN , where 0 < ci < 1 (the ratio of Si ). We assume that the Si satisfy the open set condition, that is, there exists a nonempty bounded open set U such that m 3
Si (U ) ⊂ U.
i =1
Consider now a so-called self-similar set F satisfying F = where s is the solution of m cis = 1
m
S (F ). i =1 i
Then dimH F = s,
i =1
and 0 < s (F ) < +∞. For a proof, consult [212].
i
i i
i
i
i
i
118
book 2014/8/6 page 118 i
Chapter 4. Complements on measure theory
Note that the middle third Cantor set is such a self-similar set by taking S1 (x) = 13 x and S2 (x) = 13 x + 23 and the open set condition holds for U = ]0, 1[. The Hausdorff dimension of a set gives us some information about its topological structure, as stated in the proposition below. Proposition 4.1.8. A subset E of RN with dimH (E) < 1 is totally disconnected: no two of its points lie in the same connected component. PROOF. Let x and y be distinct elements of E and consider the distance function d x to x in RN , that is, d x (z) = |z − x|. As d x does not increase distances, from Proposition 4.1.7 we deduce that dimH (d x (E)) ≤ dimH (E) < 1. Thus d x (E) is a subset of R of 1 measure (or Lebesgue measure) zero. Consequently, there exists r > 0 such that r < d x (y) and r ∈ d x (E) (otherwise (0, d x (y)) ⊂ d x (E)). The set E is then the union of the two disjoint open sets E = {z ∈ E : d x (z) < r } ∪ {z ∈ E : d x (z) > r } where x is in one set and y is in the other, so that x and y lie in different connected components. We end this section by giving a nontrivial set in R, with null Lebesgue measure but with Hausdorff dimension equal to one. Proposition 4.1.9. There exists a compact set E of [0, 1] such that 1 (E) = 0 and dimH (E) = 1. PROOF. We make use of a Cantor-type set construction but reduce the proportion of intervals removed at each stage. More precisely, we define a family (Kn )n∈N of closed sets as follows: K0 = [0, 1], K1 = K0 \ I11 , where I11 is an open interval centered at 1/2 with length l0 , 0 < l0 < 1; for all n > 1, Kn is the union of 2n closed disjoint intervals Iin with the same length ln and Kn+1 is obtained by removing from each Iin an open interval of l
n length n+1 and centered at the center of Iin . Let us show that the compact set E = ∩n∈N Kn answers the question. A straightforward calculation gives ln = 2−n (1 − l0 )/n so that
1 (E) = lim 1 (Kn ) = 0. n→+∞
Now fix 0 < r < 1 and define the integer (depending on r ), n0 = sup{n ∈ N : ∃i ∈ {1, . . . , 2n }, B r (x) ∩ E ⊂ Iin }. It is easily seen that r≥
ln0 n0 + 1
=
2−n0 (1 − l0 ) n0 (n0 + 1)
.
(4.3)
On the other hand, for each 0 < α < 1, there exists a positive constant (i.e., independent of n0 ) C (α, l0 ) such that 2−αn0 (1 − l0 )α 2−n0 ≤ C (α, l0 ) α ; (4.4) n0 (n0 + 1)α
i
i i
i
i
i
i
4.2. Set functions and duality approach to Borel measures
book 2014/8/6 page 119 i
119
indeed, it is enough to take 4 5 C (α, l0 ) = (1 − l0 )−α max x α (1 + x)α 2−(1−α)x : x ≥ 1 . Consider now the outer measure μ in R satisfying μ(Iin ) = 2−n for all n and all i = 1, . . . , 2n , defined following the general construction of Remark 4.1.2. Collecting (4.3) and (4.4), we obtain for all x ∈ E and all 0 < r < 1, μ(B r (x) ∩ E) ≤ C (α, l0 )r α . This estimate yields dimH (E) ≥ α. Indeed, if (Bi )i ∈N is a δ-covering of E made up of balls Bi centered in E with δ < 1, ωα ωα ωα cα diam(Bi )α ≥ μ(Bi ∩ E) ≥ μ(E) = , C (α, l0 ) i ∈N C (α, l0 ) C (α, l0 ) i ∈N so that ˜ α (E) > 0, then α (E) > 0 and dimH (E) ≥ α. Since α < 1 is arbitrary, we have dimH (E) ≥ 1. The opposite inequality is trivial; then dimH (E) = 1.
4.2 Set functions and duality approach to Borel measures 4.2.1 Borel measures as set functions Let Ω be a separable locally compact topological space (for example, RN or an open subset of RN ) and (Ω) its Borel field. We denote the set of all R m -valued Borel measures by M(Ω, R m ). Let us recall that M(Ω, R m ) is the vectorial space of all the set functions μ : (Ω) → R m satisfying μ() = 0 and the σ-additivity condition: 7 6 3 Bn = μ(Bn ) for all pairwise disjoint families (Bn )n∈N in (Ω). μ n∈N
n∈N
In the case when m = 1, we will use the notation M(Ω) (or M b (Ω)) and the elements of M(Ω) are called signed Borel measures. The subset of its nonnegative elements is denoted by M+ (Ω). If A is a fixed Borel subset of Ω, the restriction to A of a Borel measure μ ∈ M(Ω, R m ) is the Borel measure μA of M(Ω, R m ) defined for all Borel sets E of Ω, by μA(E) = μ(E ∩ A). It is worth noticing that Section 4.1 provides many concrete Borel measures on RN . Indeed, if A is a Borel subset in RN satisfying s (A) < +∞, then μ = s A belongs to M+ (RN ). The support of a measure μ ∈ M(Ω, R m ) is the smallest closed set E of Ω, denoted by spt(μ), such that |μ|(Ω \ E) = 0. As a straightforward consequence of the definition, we also have spt(μ) = {x ∈ Ω : ∀r > 0, |μ|(B r (x)) > 0}. Let us recall that if μ ∈ M+ (Ω), the measure μ(B) of all Borel subsets B of Ω can be approximated by the measures of the open or compact subsets of Ω. More precisely, the following holds. Proposition 4.2.1. The Borel measures μ in M+ (Ω) are regular, i.e., for all Borel sets B of Ω, one has μ(B) = sup{μ(K) : K ⊂ B, K compact set of Ω}, = inf{μ(U ) : B ⊂ U , U open set of Ω}.
i
i i
i
i
i
i
120
book 2014/8/6 page 120 i
Chapter 4. Complements on measure theory
The total variation of a measure μ ∈ M(Ω, R m ) is the real-valued set function |μ|, defined for all Borel sets B of Ω by |μ|(B) = sup
∞ i =0
|μ(Bi )| :
∞ 3
Bi = B ,
i =0
where the supremum is taken over all the partitions of B in (Ω). We point out that for all μ in M(Ω, R m ) we automatically have |μ|(Ω) < +∞ (μ is bounded) and |μ| is σadditive, so that |μ| is a Borel measure in M+ (Ω). Actually, it is easily seen that |μ| is the smallest nonnegative scalar Borel measure ν such that |μ(B)| ≤ ν(B) for all Borel sets B and that the mapping μ → |μ|(Ω) is a norm for which M(Ω, R m ) is a Banach space. In the scalar case, for all μ ∈ M(Ω) we define in M+ (Ω) the positive part μ+ and the negative parts μ− of μ by μ+ =
|μ| + μ 2
,
μ− =
|μ| − μ 2
,
so that μ = μ+ −μ− and |μ| = μ+ +μ− . We define now the nonnegative Radon measures as the locally finite nonnegative Borel measures. Definition 4.2.1. A set function μ : (Ω) → [0, +∞] such that for all Ω ⊂⊂ Ω its restrictions to (Ω ) is a Borel measure on Ω is called a nonnegative Radon measure. Remark 4.2.1. It is easily seen that the nonnegative Radon measures are regular. Given a measure λ in M+ (Ω), we denote the set of all Borel functions f : Ω → R m such that | f | d λ < +∞ Ω
by L1λ (Ω, R m ) or by L1λ (Ω) when m = 1. Given now two measures μ ∈ M(Ω, R m ) and λ ∈ M+ (Ω), we say that the measure μ is absolutely continuous with respect to the measure λ and we write μ 2 λ iff ∀B ∈ (Ω), λ(B) = 0 =⇒ μ(B) = 0. We say that the measure μ is singular with respect to the measure λ and we write μ ⊥ λ iff there exists B in (Ω) such that λ(B) = 0 and μ is concentrated on B, i.e., μ(C ) = 0 for all Borel set C such that B ∩ C = . One may establish that μ 2 λ ⇐⇒ ∃ f ∈ L1λ (Ω, R m ) s.t. μ = f λ. The following theorem extends this result. Theorem 4.2.1 (Radon–Nikodým). Let μ and λ be two Borel measures, respectively, in M(Ω, R m ) and M+ (Ω). Then there exist a function f in L1λ (Ω, R m ) and a measure μ s in M(Ω, R m ) such that μ = f λ + μs ,
μ s ⊥ λ.
i
i i
i
i
i
i
4.2. Set functions and duality approach to Borel measures
book 2014/8/6 page 121 i
121
Moreover, f is given by f (x) = lim
ρ→0
μ(Bρ (x))
for μ a.e. x ∈ Ω,
λ(Bρ (x))
where Bρ (x) denotes the open ball in Ω centered at x ∈ Ω with radius ρ. For details about these elementary notions and various proofs of these properties, we refer the reader to the books by Buchwalter [143], Marle [287], and Rudin [331]. We now give three useful lemmas concerning Borel measures in M+ (Ω). The first states that one can reduce families of pairwise disjoint Borel sets to a countable subfamily as long as one considers their measures. The second is a localization lemma in M+ (Ω). The last lemma is a basic result concerning the s-density of a nonnegative Radon measure and is an essential tool in the study of the structure of sets with finite perimeter (see Chapter 10). Lemma 4.2.1. Let μ be a Borel measure in M+ (Ω) and (Bi )i ∈I a family of pairwise disjoint Borel subsets of Ω. Then the subset of indices i ∈ I such that μ(Bi ) = 0 is at most countable. PROOF. Since {i ∈ I : μ(Bi ) = 0} = n∈N∗ {i ∈ I : μ(Bi ) > n1 }, it is enough to prove that each set In = {i ∈ I : μ(Bi ) > n1 } is finite. Assume that In contains an infinite sequence of indices in I , that is, {i0 , . . . , ik , . . .} ⊂ I . We then have 6 7 3 +∞ > μ Bi k = μ(Bik ) = +∞, k∈N
k∈N
a contradiction. Example 4.2.1. Let μ ∈ M(Ω, R m ) and B r (x0 ) be the open ball in Ω centered at x0 with radius r . Then for all but countably many r in R+ , one has |μ| = 0. ∂ B r (x0 )
Lemma 4.2.2. Let μ ∈ M+ (Ω), ( fi )i ∈N be a family of nonnegative functions in L1μ (Ω) and set f = supi fi . Then f d μ = sup fi d μ , Ω
i ∈I
Ai
where the supremum is taken over all finite families (Ai )i ∈I of pairwise disjoint open subsets of Ω. PROOF. Let n be a fixed element of N and consider the μ-measurable sets Ei := x ∈ Ω : sup fk (x) = fi (x) . 0≤k≤n
We now construct the following family (Ωi )i =0,...,n of pairwise disjoint μ-measurable sets: 7 6 i 3 Ω0 = E0 , Ωi +1 = Ω \ Ek ∩ Ei +1 , i = 0, . . . , n − 1. k=1
i
i i
i
i
i
i
122
book 2014/8/6 page 122 i
Chapter 4. Complements on measure theory
It is easy to check that Ω =
n i =0
Ei =
n i =0
sup fk d μ =
Ω 0≤k≤n
= =
Ωi . Moreover, 1 n
Ω n
i=0
Ωi
sup fk d μ
0≤k≤n
sup fk d μ
i =0 Ωi 0≤k≤n n i =0
Ωi
fi d μ.
Let μi = fi ·μ be the Borel measure in M+ (Ω) whose density with respect to μ is fi . Since each measure μi is regular, one has μi (Ωi ) = sup{μi (K):K compact subset of Ωi }, hence n sup fi d μ = sup fi d μ : (Ki )i =1,...,n pairwise disjoint compact sets . Ω 0≤i ≤n
i =0
Ki
From the regularity property of μi again, and by compactness, for all > 0, there exists a family (3i )i =0,...,n of open sets, that one may assume pairwise disjoint, each 3i containing Ki , such that μi (3i \ Ki ) < /n. Therefore n sup fi d μ = sup fi d μ : (Ai )i =1,...,n pairwise disjoint open sets . Ω 0≤i ≤n
i =0
Ai
Taking the supremum on n and by the monotone convergence theorem, we obtain f d μ ≤ sup fi d μ . Ω
i ∈I
Ai
Since the converse inequality is obvious, the proof is complete. Example 4.2.2. Let Ω be a bounded open set of RN and let us consider the following measure associated with a given function u in the space SBV (Ω) defined in Section 10.5: S u is a hypersurface in Ω, which is the union of a negligible subset of Ω for the N − 1Hausdorff measure and of countable many C 1 -hypersurfaces with Hausdorff dimension N − 1 (S u is the jump set of u). For H N −1 almost every x ∈ S u , ν u (x) is a normal unit vector at x and μ := " N Ω + ν u N −1 S u , where " N denotes the Lebesgue measure in RN . On the other hand, let us consider the functions fν := |∇u · ν| p + |ν u · ν|1Su indexed by a countable dense subset D of ν in S N −1 . We have sup fν d μ = sup fν d μ + sup fν d μ Ω ν∈D
=
Ω\S u ν∈D
Ω
S u ν∈D
|∇u| p d x + N −1 (S u ),
which is the Mumford–Shah energy functional introduced in image segmentation and studied in Sections 12.5 and 14.3. By applying Lemma 4.2.2 now, we then obtain the following formula, which is a central point to extend to arbitrary dimension, the approximation of the Mumford–Shah energy functional established in one dimension:
p N −1 p N −1 . |∇u| d x + (S u ) = sup |∇u · νi | d x + |νν · νi | d S u Ω
i ∈I
Ai
Ai
i
i i
i
i
i
i
4.2. Set functions and duality approach to Borel measures
book 2014/8/6 page 123 i
123
The supremum is taken over all finite families (Ai )i ∈I of pairwise disjoint open subsets of Ω. Note that for p = 1, we obtain the expression of the total variation of the measure D u := ∇u " N Ω + [u]ν u N −1 S u . For more details on the spaces BV (Ω) and SBV (Ω), see Chapter 10, and for the definition and properties concerning the Mumford– Shah energy functional, consult Sections 12.5 and 14.3. For another application of the localization Lemma 4.2.2, see Section 13.3. Lemma 4.2.3. Let Ω be an open bounded subset of RN , μ a nonnegative Radon measure on Ω, and 0 < s ≤ N , t > 0. For each Borel subset E of Ω, the following implication holds: ∀x ∈ E
lim sup
μ(Bρ (x)) ρs
ρ→0
> t =⇒ μ ≥ C t s E,
where C is a positive constant depending only on s. When s is an integer, one may take C = ω s , the volume of the unit ball of R s . PROOF. One may assume μ(E) < +∞. Since μ(E) = inf{μ(U ) : E ⊂ U, U open set of Ω}, one may choose an arbitrary open subset U of Ω such that E ⊂ U and μ(U ) < +∞. Let now δ > 0 and consider the family of closed balls δ μ(Bρ (x)) 0 := B ρ (x) ⊂ U : x ∈ E, ρ ≤ , ≥t . 2 ρs From the hypothesis, it is easily seen that this family finely covers E (Definition 4.1.3). Therefore, according to Lemma 4.1.3, there exists a countable subfamily 1 of pairwise disjoint elements of 0 such that 3 3 B⊂ B ∗, B∈0
B∈1
∗
where B denotes the closed ball concentric with B, with radius five times as big as that of B. Moreover, for each finite family 1 ∗ ⊂ 1 , 6 7 6 7 3 3 ∗ E⊂ B ∪ B . B∈1 ∗
Thus, s (E) ≤ c s 5δ
B∈1
s
≤ 2 cs t
∗
B∈1 \1 ∗
(diam(B)) s + c s 5 s
−1
6
B∈1 ∗
7
B∈1 \1 s
(diam(B)) s ∗
s −1
μ(B) + 2 c s 5 t
6
7 μ(B) .
B∈1 \1 ∗
Since the second sum of the last inequality can be taken less than δ for an appropriate choice of 1 ∗ , we obtain 6 7 s (E) ≤ ω s t −1 μ(B) + δ 5δ B∈1 ∗
≤ ωs t
−1
μ(U ) + δ.
We end the proof by letting δ → 0 and taking the infimum on U .
i
i i
i
i
i
i
124
book 2014/8/6 page 124 i
Chapter 4. Complements on measure theory
4.2.2 Duality approach We recall now some definitions and important results concerning the Riesz functional analysis approach. We set C0 (Ω, R m ) to denote the space of all continuous functions which tend to zero at infinity, i.e., ∀ > 0
there exists a compact set K ⊂ Ω such that sup |ϕ(x)| ≤ . x∈Ω\K
Recall that equipped with the norm ϕ∞ = sup |ϕ(x)|, x∈Ω
C0 (Ω, R m ) is a Banach space. We denote its subspace made up of all continuous functions with compact support in Ω by Cc (Ω, R m ). When m = 1, the two above spaces will be denoted, respectively, by C0 (Ω) and Cc (Ω). According to the vectorial version of the Riesz–Alexandroff representation Theorem 2.4.7, the dual of C0 (Ω, R m ) (and then of Cc (Ω, R m )) can be isometrically identified with M(Ω, R m ). Then, any Borel measure is a continuous linear form on C0 (Ω, R m ) or Cc (Ω, R m ) and the two dual norms · C (Ω,Rm ) and · C (Ω,Rm ) are equal to the total mass c 0 | · |(Ω): |μ|(Ω) = sup{〈μ, ϕ〉 : ϕ ∈ C0 (Ω, R m ), ||ϕ||∞ ≤ 1} = sup{〈μ, ϕ〉 : ϕ ∈ Cc (Ω, R m ), ||ϕ||∞ ≤ 1},
where 〈μ, ϕ〉 = μ(ϕ). In what follows, 〈μ, ϕ〉 will also be denoted by Ω ϕ d μ. Note that M(Ω, R m ) is isomorphic to the product space M(Ω) m and that according to this isomorphism μ ∈ M (Ω, R m ) ⇐⇒ μ = (μ1 , . . . , μ m ) and μi ∈ C0 (Ω) , i = 1, . . . , m. The following proposition is an easy generalization, for vectorial measures, of Proposition 2.4.14 and Corollary 2.4.1. To shorten notation, σ(C0 , C0 ) and σ(Cc , Cc ) denote, respectively, the two weak topologies σ(C0 (Ω, R m ), C0 (Ω, R m )) and σ(Cc (Ω, R m ), Cc (Ω, R m )). Proposition 4.2.2. The weak topologies σ(C0 , C0 ) and σ(Cc , Cc ) induce the same topology on bounded subsets of M(Ω, R m ). Moreover, from any bounded sequence of Borel measures (μn )n∈N in M(Ω, R m ), one can extract a subsequence σ(C0 , C0 )-converging (thus σ(Cc , Cc )converging) to some Borel measure μ in M(Ω, R m ). Note that for unbounded sequences of M(Ω, R m ), the two weak topologies do not agree, as illustrated in the following example: take Ω = (0, +∞) and μn = nδn . Then nδn 0 in the topology σ(Cc (Ω), Cc (Ω)) but 〈nδn , ϕ〉 → 1 as n → +∞ for any function of C0 (Ω) satisfying ϕ ∼ 1x in the neighborhood of +∞. According to Proposition 4.2.2, from now on, for bounded sequences in M(Ω, R m ), we do not distinguish the convergences associated with these two weak topologies and we will refer to these convergences as the weak convergence in M(Ω, R m ). In terms of probabilistic approach, we have the following properties.
i
i i
i
i
i
i
4.2. Set functions and duality approach to Borel measures
book 2014/8/6 page 125 i
125
Proposition 4.2.3 (Alexandrov). Let μ, μn in M+ (Ω) such that μn weakly converges to μ; then for all open subsets U of Ω,
μ(U ) ≤ lim inf μn (U ), n→+∞
for all compact subsets K of Ω,
μ(K) ≥ lim sup μn (K). n→+∞
Consequently, for all relatively compact Borel subset B of Ω such that μ(∂ B) = 0, we have μ(B) = lim μn (B). n→+∞
PROOF. Let U be an open subset of Ω and 1U its characteristic function. Classically, there exists a nondecreasing sequence (ϕ p ) p∈N in Cc (Ω) such that 1U = sup p ϕ p . Therefore μ(U ) = lim ϕp dμ p→+∞ Ω ϕ p d μn = lim lim p→+∞ n→+∞ Ω ≤ lim inf d μn = lim inf μn (U ). n→+∞
n→+∞
U
For the other assertion, it suffices to notice that if K is a compact subset of Ω, there exists a nonincreasing sequence (ϕ p ) p∈N in Cc (Ω) such that 1K = inf p ϕ p and to argue similarly. Let us prove the last assertion. Since μn is a nondecreasing set function, according to the first assertions, we have ◦
◦
μ(B ) ≤ lim inf μn (B ) ≤ lim inf μn (B) n→+∞
n→+∞
≤ lim sup μn (B) ≤ lim sup μn (B) ≤ μ(B). n→+∞
n→+∞
◦
The conclusion follows from μ(B) = μ(B ). The following corollary clarifies the relation between the weak convergence of sequences in M(Ω, R m ) and the convergence of the corresponding measures of suitable Borel sets. Corollary 4.2.1. Let μ, μn in M(Ω, R m ) be such that μn weakly converges to μ and |μn | weakly converges to some σ in M+ (Ω). Then |μ| ≤ σ and, for all relatively compact Borel subsets B of Ω such that σ(∂ B) = 0, we have μ(B) = limn→+∞ μn (B). PROOF. Let U be an open subset of Ω and ϕ any function in Cc (U, R m ) with ϕ∞ ≤ 1. Letting n → +∞ in ϕ d μn ≤ |ϕ| d |μn |, U U we obtain
ϕ d μ ≤ |ϕ| d σ. U U
Taking the supremum on ϕ gives |μ|(U ) ≤ σ(U ). The conclusion |μ| ≤ σ follows from Proposition 4.2.1.
i
i i
i
i
i
i
126
book 2014/8/6 page 126 i
Chapter 4. Complements on measure theory
For proving the last assertion, let us denote the m components of μn and μ by μin and μi , i = 1, . . . , m, respectively, and the weak limits (for a subsequence not relabeled) of the positive and negative parts of μin by ν i ,+ and ν i ,− , respectively. Going to the limit when n → +∞ in μin = μin,+ − μin,− and μin,± ≤ |μn |, we obtain μi = ν i ,+ − ν i ,− and ν i ,± ≤ σ. The conclusion then follows by applying Proposition 4.2.3 to the m components μin± . We now restrict ourselves to the case when Ω = RN and we study the approximation of a Borel measure in M(RN , R m ) by a regular function in the sense of the weak convergence of measures. To this end, we define the regularization (or the mollification) of a measure by means of a regularizing kernel. Let us recall that a regularizer ρ is a function in C∞ (RN ) defined by ρ (x) = −N ρ(x/), where ρ is some nonnegative real-valued c function in C∞ (RN ) satisfying c ρ(x) d x = 1, sptρ ⊂ B 1 (0). RN
Note that the support spt ρ of ρ is included in B (0). For any measure μ in M(RN , R m ), we define the function ρ ∗ μ defined on RN by ρ ∗ μ(x) = ρ (x − y) μ(d y) RN
and we aim to show that ρ ∗ μ is a suitable approximation of μ. Theorem 4.2.2. The functions ρ ∗ μ belong to C∞ (RN , R m ) and for any α ∈ NN , D α (ρ ∗ μ) = D α ρ ∗ μ. Moreover, when goes to zero, (i) ρ ∗ μ μ, weakly in M(RN , R m ); |ρ ∗ μ| ≤ |μ|; (ii) RN
RN
(iii)
RN
|ρ ∗ μ| →
RN
|μ|.
PROOF. The classical derivation theorem under the integral sign yields D α (ρ ∗ μ) = D α ρ ∗ μ. Let us establish (i). Let ϕ ∈ Cc (RN , R m ). According to Fubini’s theorem, ϕ(x) ρ ∗ μ(x) d x − ϕ(y) μ(d y) RN N R = ϕ(x)ρ (x − y) μ(d y) d x − ϕ(y) μ(d y) RN RN N R ϕ(x) ρ (x − y) μ(d y) d x − ρ (x − y) d x ϕ(y) μ(d y) = RN RN N N R R ≤ |ϕ(x) − ϕ(y)|ρ (x − y) d x |μ|(d y) RN RN |ϕ(x) − ϕ(y)| |μ|, ≤ sup {(x,y)∈R2N :|x−y|≤}
RN
which, thanks to the uniform continuity of ϕ, tends to zero when → 0.
i
i i
i
i
i
i
4.2. Set functions and duality approach to Borel measures
book 2014/8/6 page 127 i
127
We establish now (ii). By Fubini’s theorem and a change of scale, we have x −y −N |ρ ∗ μ|(x) d x = ρ μ(d y) d x RN RN RN x −y ≤ −N ρ |μ|(d y) d x RN RN x −y −N ≤ ρ |μ|. d x |μ|(d y) = RN RN RN Assertion (iii) is a straightforward consequence of (i), the weak lower semicontinuity of the map μ → RN |μ| and (ii). Let C b (Ω, R m ) be the set of all bounded continuous functions from Ω into R m . We introduce now a stronger notion of convergence induced by the weak topology σ(Cb (Ω, R m ), C b (Ω, R m )). Definition 4.2.2. A sequence (μn )n in M (Ω, R m ) narrowly converges to μ in M (Ω, R m ) iff
ϕ d μn →
ϕ dμ
for all ϕ in C b (Ω, R m ). This convergence is strictly stronger than the weak convergence of measures. Indeed, let Ω = (0, 1) and μn = δ1/n . Then μn weakly converges to 0 but Ω μn = 1. Moreover, taking ϕ ∼ sin(1/x) at 0+ , ϕμn does not converge for any subsequence. This example shows that the unit ball of M(Ω) is not weakly sequentially compact for this topology. Nevertheless, the Prokhorov theorem below asserts that the bounded sets of M+ (Ω) are sequentially compact for the narrow topology as long as a uniform control is assumed outside a compact set whose measure is close to that of Ω. Theorem 4.2.3 (Prokhorov). Let be a bounded subset of M+ (Ω) satisfying ∀, ∃K compact subset of Ω such that sup{μ(Ω \ K ) : μ ∈ } ≤ . Then is sequentially compact for the narrow topology. For a proof, consult, for instance, Delacherie and Meyer [198]. Any subset of M+ (Ω) satisfying the previous uniform bound is said to be tight. Note that in the previous example, {δ1/n : n ∈ N∗ } is not tight. In the bounded subsets of M+ (Ω), the weak and the narrow topology agree when there is no loss of mass, i.e., when μ(Ω) = limn→+∞ μn (Ω). More precisely, we have the following. Proposition 4.2.4. Let μn , μ in M+ (Ω). Then the following assertions are equivalent: (i) μn μ narrowly; (ii) μn μ weakly and μn (Ω) → μ(Ω).
i
i i
i
i
i
i
128
book 2014/8/6 page 128 i
Chapter 4. Complements on measure theory
PROOF. We prove (ii) =⇒ (i). The converse is obvious. Let f ∈ C b (Ω), > 0 and K a compact subset of Ω such that μ(Ω \ K) ≤ . Let moreover ϕ ∈ Cc (Ω) satisfying 0 ≤ ϕ ≤ 1, ϕ = 1 in K. We have f d μn − f d μ ≤ f d μn − f ϕ d μn + f ϕ d μn − f ϕ d μ + f ϕ d μ − f d μ ≤ f ∞ (1 − ϕ) d μn + f ϕ d μn − f ϕ d μ + f ∞ (1 − ϕ) d μ so that
lim sup f d μn − f d μ ≤ 2 f ∞ n→+∞
and the conclusion follows after letting → 0. In the probabilistic approach we have a similar statement. Proposition 4.2.5. Let μn , μ in M+ (Ω). Then the following assertions are equivalent: (i) μn narrowly converges to μ; (ii) μn (Ω) → μ(Ω) and for all open subset U , μ(U ) ≤ lim inf μn (U ); n→+∞
(iii) μn (Ω) → μ(Ω) and for all closed subset F , μ(F ) ≥ lim sup μn (F ); n→+∞
(iv) for all Borel subset B such that μ(∂ B) = 0, μn (B) → μ(B). PROOF. The only implication we have to establish is (iv) =⇒ (i). Indeed, each of the others is an easy consequence of Propositions 4.2.4 and 4.2.3. According to Proposition 4.2.4, it is enough to establish the weak convergence of μn to μ. For this, let ϕ ∈ Cc (Ω), ϕL∞ (Ω) = M , with compact support K, and consider the subdivision ⎧ ⎨ −M = a0 < a1 · · · < ai < ai +1 < · · · < a m = M , ai +1 − ai ≤ , ⎩ μ([ϕ = ai ]) = 0. Such a subdivision exists. Indeed, the last property is a consequence of Lemma 4.2.1. Consider now the Borel subsets Ui = [ϕ < ai ] ∩ K and the function ϕ =
m i =1
ai χUi \Ui−1 .
Since μ(∂ Ui ) = 0, from assertion (iv) we have μn (Ui ) → μ(Ui ) so that for all > 0 ϕ d μn → ϕ d μ.
i
i i
i
i
i
i
4.2. Set functions and duality approach to Borel measures
book 2014/8/6 page 129 i
129
From the equiboundedness of the measures μn and the estimate ϕ − ϕ∞ ≤ , we easily deduce ϕd μn → ϕd μ, which ends the proof. The following proposition is an extension of property (iv). For a proof, consult Marle [287, Proposition 9.9.4]. Proposition 4.2.6. Let μn , μ be Borel measures in M+ (Ω) such that μn narrowly converges to μ and let f be a μn -measurable (for every n) and bounded function from Ω into R such that the set of its discontinuity points has a null μ-measure. Then f is μ-measurable and
f d μn =
lim
n→+∞
f d μ.
We end this subsection by stating two theorems extending in some sense the classical Fubini’s theorem. We prove only the first one. For the second, see, for instance, [211]. Let μ in M+ (Ω × R m ). We denote the projection of μ on Ω by σ. Let us recall that σ is the measure of M+ (Ω) defined for all Borel set E of Ω by σ(E) = μ(E × R m ). The following slicing decomposition holds. Theorem 4.2.4. There exists a family (μ x ) x∈Ω of probability measures on R m , unique up to equality σ-a.e., such that for all f in C0 (Ω × R m ) (i) x → f (x, y) d μ x (y) is σ-measurable; Rm
(ii) Ω×R m
f (x, y) d μ(x, y) =
! Ω
Rm
" f (x, y) d μ x (y) d σ(x).
We will write μ = (μ x ) x∈Ω ⊗ σ. PROOF. First step. We establish the result for all f of the form f = g ⊗ h, where g (x) = 1B (x), B is any Borel subset of Ω, and h belongs to C0 (R m ). Let us first assume that h belongs to a dense countable subset D of C0 (R m ) and define γ h ∈ M+ (Ω) by γ h (B) = h(y) d μ(x, y) ∀ B ∈ (Ω). B×R m
Since μ(B × R m ) = σ(B) = 0 =⇒ γ h (B) = 0, according to the Radon–Nikodým theorem, Theorem 4.2.1, the measure γ h has a density a h ∈ L1σ (Ω) with respect to σ, i.e., γ h = a h · σ. We then obtain
B×R m
h(y) d μ(x, y) = B
a h (x) d σ(x).
(4.5)
i
i i
i
i
i
i
130
book 2014/8/6 page 130 i
Chapter 4. Complements on measure theory
Moreover, there exists a sequence (N h ) h∈D of σ-null sets such that a h is given for all x0 ∈ Ω := Ω \ (∪ h∈D N h ) by a h (x0 ) = lim
ρ→0
γ h (Bρ (x0 )) σ(Bρ (x0 )) (Bρ (x0 ))×R m
= lim
h(y) d μ(x, y)
μ(Bρ (x0 ) × R m )
ρ→0
.
(4.6)
For all fixed x0 in Ω , let us now consider the linear map Γ x0 : D → R defined by Γ x0 (h) = a h (x0 ). From (4.6), we easily check that |Γ x0 (h)| ≤ h∞ . Therefore Γ x0 may be extended by a measure μ x0 in M+ (R m ). Since the map x → 1B (x)μ x (h) = 1B (x)a h (x) is σ-measurable for all h ∈ D, it is also σ-measurable for all h in C0 (R m ). Now, from (4.5), one can write h(y) d μ(x, y) = h(y) d μ x (y) d σ(x) B×R m
B
Rm
with μ x ≤ 1. Second step. We establish the result for all f of the form f = g ⊗ h, where g ∈ L1σ (Ω) and h ∈ C0 (R m ). For all > 0, let us consider the step function g = i ∈I αi 1Bi , Bi ∈ (Ω), I finite, such that Ω
|g − g | d σ < .
(4.7)
We have g ⊗ h dμ − g (x) h(y) d μ x (y) d σ(x) Ω×Rm m Ω R g ⊗ h dμ − g ⊗ h d μ ≤ Ω×Rm m Ω×R + g ⊗ h d μ − g (x) h(y) d μ x (y) d σ(x) Ω×Rm m Ω R + g (x) h(y) d μ x (y) d σ(x) − g (x) h(y) d μ x (y) d σ(x). Ω m m R Ω R According to the first step, the second term of the right-hand side is equal to zero. According to (4.7), each of the two other terms is less than h∞ . Since is arbitrary, we obtain g ⊗ h dμ = g (x) h(y) d μ x (y) d σ(x). (4.8) Ω×R m
Ω
Rm
We are going to prove that μ x is a probability measure. Let (hn )n∈N be a nondecreasing sequence of functions in C0 (R m ) pointwise converging to 1Rm . From (4.8) we deduce, for all Borel set B in Ω, hn (y) d μ(x, y) = hn (y) d μ x (y) d σ(x), B×R m
B
Rm
i
i i
i
i
i
i
4.2. Set functions and duality approach to Borel measures
and, by letting n → +∞,
book 2014/8/6 page 131 i
131
σ(B) = B
μ x (R m ) d σ(x).
Since for σ a.e. x in Ω, μ x (R m ) = μ x ≤ 1 we infer that μ x (R m ) = 1 for σ a.e. x in Ω. Third step. We assume that f belongs to C0 (Ω × R m ). The result is now an easy consequence of the density of gi ⊗ hi : gi ∈ Cc (Ω), hi ∈ Cc (R m ), I ∈ P F (N) i ∈I
in C0 (Ω × R ) for the uniform norm. (P F (N) denotes the family of all finite subsets of N.) m
Last step. It remains to establish the uniqueness of the family (μ x ) x , up to equality σ-a.e. Take f = 1Bρ (x0 ) × h, where h is any function in D and x0 ∈ Ω is such that the limit
lim
Bρ (x0 )
Rm
h(y) d μ x (y) d σ(x)
σ(Bρ (x0 ))
ρ→0
exists. According to the theory of differentiation of measures (Radon theorem), we know that there exists a Borel set N h with σ(N h ) = 0 such that the above limit exists for x0 ∈ Ω \ N h . Now, this limit exists for x0 ∈ Ω = Ω \ ∪ h∈D N h and for all h ∈ D. From 1Bρ (x0 ) h(y) d μ(x, y) = h(y) d μ x (y) d σ(x) Ω×R m
Bρ (x0 )
we deduce that for x0 ∈ Ω
Rm
Rm
(Bρ (x0 ))×R m
h(y) d μ x0 (y) = lim
h(y) d μ(x, y)
σ(Bρ (x0 ))
ρ→0
= Γ x0 (h). Therefore μ x = Γ x for all x ∈ Ω and all h ∈ D. This gives the required uniqueness. Theorem 4.2.5 (classical coarea formula). For all Lipschitz functions f : RN −→ R and for all functions g : RN −→ R in L1 (RN ) we have +∞ N −1 g (x)|D f | d x = g (x) d (x) d t . RN
[ f =t ]
−∞
As a corollary of Theorem 4.2.5 we obtain the so-called curvilinear Fubini theorem. Corollary 4.2.2. Let Ω be a bounded open subset of RN and Γ t the set {x ∈ Ω : d (x, RN \ Ω) = t }. Then +∞ N −1 g (x) d x = g (x) d (x) d t . Ω
−∞
Γt
PROOF. Take f = d (·, RN \ Ω).
i
i i
i
i
i
i
132
book 2014/8/6 page 132 i
Chapter 4. Complements on measure theory
Taking now g = 1 and f the truncation of d (·, S) between s and s , with s < s , we obtain the next corollary. Corollary 4.2.3. Let S be a subset of RN . Then s N −1 [d (·, S) = t ] d t " N [s < d (·, S) < s ] = and the distributional derivative of " d dt
N
s
[d (·, S) < t ] is given by
" N [d (·, S) < t ] = N −1 [d (·, S) = t ] .
4.3 Introduction to Young measures We deal now with the notion of Young measure, a measure theoretical tool, well suited to the analysis of oscillations of minimizing sequences (see, for instance, [170]). We will give in Chapter 11 an important application in the scope of relaxation in nonlinear elasticity. For an application to phase transitions for crystals, consult [172]. For a general exposition of the theory, see [78], [80], [162], [344], [354], [355], and the references therein.
4.3.1 Definition In this section, Ω is an open bounded subset of RN and E = Rd . In Chapter 11, Section 11.4, we will consider the case when d = mN so that E will be isomorphic to the space M m×N of m × N matrices. To shorten notation, we denote the N -dimensional Lebesgue measure restricted to Ω by " . Definition 4.3.1. We call a Young measure on Ω × E any positive measure μ ∈ M+ (Ω × E) whose image πΩ #μ by the projection πΩ on Ω is the Lebesgue measure " on Ω, i.e., for all Borel subset B of Ω, πΩ #μ(B) := μ(B × E) = " (B). We denote the set of all Young measures on Ω × E by 6 (Ω; E). We now consider the space C b (Ω; E) of Carathéodory integrands, namely, the space of all functions ψ : Ω × E → R, (Ω) ⊗ (E) measurable, and satisfying (i) ψ(x, ·) is bounded continuous on E for all x ∈ Ω; (ii) x → ψ(x, ·) is Lebesgue integrable. We equip 6 (Ω; E) with the narrow topology, i.e., the weakest topology which makes the maps μ →
Ω×E
ψ dμ
continuous, when ψ runs through C b (Ω; E). This topology induces the narrow convergence of Young measures defined as follows: let (μn )n∈N be a sequence of measures in 6 (Ω; E) and μ ∈ 6 (Ω; E); then na r ψ(x, λ)d μn (x, λ) = ψ(x, λ)d μ(x, λ) ∀ψ ∈ C b (Ω; E). μn μ⇐⇒ lim n→+∞
Ω×E
Ω×E
i
i i
i
i
i
i
4.3. Introduction to Young measures
book 2014/8/6 page 133 i
133
Remark 4.3.1. Let (μn )n∈N be a sequence in 6 (Ω; E) and μ ∈ 6 (Ω; E). It is easily seen (see Valadier [354], [355]) that na r 1B (x)ϕ(λ)d μn = 1B (x)ϕ(λ)d μ ∀(B, ϕ) ∈ (Ω) × C b (E). μn μ⇐⇒ lim n→+∞
Ω×E
Ω×E
The space 6 (Ω; E) is closed in M(Ω×E) equipped with the narrow convergence; more precisely, we have the next proposition. Proposition 4.3.1. Let (μn )n∈N be a sequence in 6 (Ω; E) narrowly converging to some μ in M(Ω × E). Then μ belongs to 6 (Ω; E). PROOF. From Remark 4.3.1, taking the test function (x, λ) → ϕ(x, λ) = 1B (x)1E (λ), we obtain " (B) = lim μn (B × E) = μ(B × E) n→+∞
for all Borel subsets B of Ω.
4.3.2 Slicing Young measures According to Theorem 4.2.4, for each Young measure μ corresponds a unique family (μ x ) x∈Ω (up to equality a.e.) of probability measures on E such that μ = (μ x ) x∈Ω ⊗ " . Moreover, the map x → μ x is measurable in the following sense: h d μ x is measurable. ∀h ∈ C0 (E), x → E
A Young measure μ is then also called a parametrized measure and Ω is the set of the parameters. Let Lw (Ω, M(E)) be the space of all families (μ x ) x∈Ω of measures μ x ∈ M(E) (not necessarily probability measures) such that x → μ x is measurable in the previous sense. By identifying μ = (μ x ) x∈Ω ⊗ " with (μ x ) x∈Ω , we have 6 (Ω; E) ⊂ Lw (Ω, M(E)). We equip Lw (Ω, M(E)) with the following weak convergence: Lw n n (μ x ) x∈Ω (μ x ) x∈Ω ⇐⇒ ∀ h ∈ C0 (E), h d μ x h d μ x in L∞ (Ω) weak star, E
i.e.,
Ω
g (x) E
h d μnx
dx →
Ω
E
g (x) E
h d μx
d x ∀ h ∈ C0 (E) and ∀ g ∈ L1 (Ω).
Remark 4.3.2. The set 6 (Ω; E) is not closed in Lw (Ω, M(E)) equipped with this convergence. Indeed, take Ω = (0, 1), E = R, and μn = δn ⊗ " . Then (μnx ) x∈Ω is the constant Lw
family δn and (μnx ) x∈Ω 0 which is not a Young measure. Let us define the tightness notion for Young measures. Definition 4.3.2. A subset of 6 (Ω; E) is said to be tight if ∀ > 0, ∃7 compact subset of E such that sup μ(Ω × (E \ 7 )) < . μ∈
i
i i
i
i
i
i
134
book 2014/8/6 page 134 i
Chapter 4. Complements on measure theory
Tight subsets of 6 (Ω; E) are closed in Lw (Ω, M(E)). More precisely, we have the next proposition. Proposition 4.3.2. Let (μn )n∈N be a tight sequence in 6 (Ω; E) with μn = (μnx ) x∈Ω ⊗" and Lw
assume that (μnx ) x∈Ω (μ x ) x∈Ω in Lw (Ω, M(E)). Then for a.e. x in Ω, μ x is a probability measure on E so that (μ x ) x∈Ω ⊗ " belongs to 6 (Ω; E). PROOF. Since supn∈N μn (Ω × E) = " (Ω) < +∞, there exists a subsequence (not relabeled) and some μ ∈ M+ (Ω × E) such that μn μ weakly in the sense of measures in M(Ω × E). We claim that it suffices to prove that μ ∈ 6 (Ω; E). Indeed, assuming μ ∈ 6 (Ω; E) and denoting by (ν x ) x∈Ω the family of probability measures associated with μ, by using a density argument, we easily obtain Lw
(μnx ) x∈Ω (ν x ) x∈Ω
in Lw (Ω, M(E)).
Therefore, by unicity of the weak limit in Lw (Ω, M(E)), up to a Lebesgue negligible subset of Ω, we will obtain (ν x ) x∈Ω = (μ x ) x∈Ω so that μ = (μ x ) x∈Ω ⊗ " ∈ 6 (Ω; E). We are going to establish that μ ∈ 6 (Ω; E). According to Alexandrov’s theorem (Proposition 4.2.3), for all open subsets U of Ω, one has πΩ #μ(U ) = μ(U × E) ≤ lim inf μn (U × E) = " (U ). n→+∞
Let now K be any compact subset of Ω and for all > 0 let 7 be a compact subset of E given by the tightness hypothesis. According to Alexandrov’s theorem again, one has πΩ #μ(K) = μ(K × E) ≥ μ(K × 7 ) ≥ lim sup μn (K × 7 ) n→+∞
≥ lim sup μn (K × E) − n→+∞
= " (K) − so that, since is arbitrary, πΩ #μ(K) ≥ " (K). Since the measure πΩ #μ is regular, we deduce that πΩ #μ(B) = " (B) for all Borel subsets B of Ω. On 6 (Ω; E) the narrow convergence and the weak convergence of families of corresponding probability measures are equivalent. More precisely, we have the following theorem. Theorem 4.3.1. Let (μn )n∈N be a sequence in 6 (Ω; E) and μ ∈ 6 (Ω; E) with μn = (μnx ) x∈Ω ⊗ " and μ = (μ x ) x∈Ω ⊗ " . Then na r
Lw
μn μ ⇐⇒ (μnx ) x∈Ω (μ x ) x∈Ω in Lw (Ω, M(E)).
i
i i
i
i
i
i
4.3. Introduction to Young measures
book 2014/8/6 page 135 i
135
PROOF. Implication =⇒ is straightforward. We now prove the converse implication. First step. We establish the tightness of (μn )n∈N . Averaging each family, we define the two probability measures on E, νn :=
1 " (Ω)
Ω
μnx d x,
ν :=
1 " (Ω)
Ω
μ x d x,
which act on all ϕ ∈ C0 (E) as follows: 〈νn , ϕ〉 :=
1 " (Ω)
Ω
ϕ(λ) E
d μnx (λ)
d x,
〈ν, ϕ〉 :=
1 " (Ω)
Ω
E
ϕ(λ) d μ x (λ) d x.
Thus, the weak convergence of (μnx ) x∈Ω toward (μ x ) x∈Ω in Lw (Ω, M(E)) yields the weak convergence of νn toward ν in M(E). According to the regularity property satisfied by ν, for arbitrary > 0, there exists a compact subset 7 of E such that ν(E \ 7 ) < . From Lemma 4.2.1, one may assume ν(∂ 7 ) = 0 so that, according to Alexandrov’s theorem, Proposition 4.2.3, νn (7 ) → ν(7 ), and, since νn and ν are probability measures, νn (E \ 7 ) → ν(E \ 7 ). We then deduce supn≥N νn (E \ 7 ) < 2 for a certain N in N. Our claim then follows from μn (Ω × (E \ 7 )) = " (Ω)νn (E \ 7 ). na r
Second step. We establish μn μ. According to Remark 4.3.1, it suffices to prove
lim
n→+∞
Ω×E
1B (x)ϕ(λ) d μn (x, λ) =
Ω×E
1B (x)ϕ(λ) d μ(x, λ)
∀B ∈ (Ω), ∀ϕ ∈ C b (E).
For > 0 let 7 be the compact subset of E given by the tightness of (μn , μ)n∈N and consider φ ∈ Cc (E) satisfying 0 ≤ φ ≤ 1 and φ = 1 on 7 . We now write n 1B (x)ϕ(λ) d μ (x, λ) − 1B (x)ϕ(λ) d μ(x, λ) Ω×E Ω×E n n ≤ 1B (x)ϕ(λ) d μ (x, λ) − 1B (x)φ (λ)ϕ(λ) d μ (x, λ) Ω×E Ω×E n + 1B (x)φ (λ)ϕ(λ) d μ (x, λ) − 1B (x)φ (λ)ϕ(λ) d μ(x, λ) Ω×E Ω×E + 1B (x)φ (λ)ϕ(λ) d μ(x, λ) − 1B (x)ϕ(λ) d μ(x, λ). Ω×E Ω×E
(4.9)
According to the tightness of (μn , μ)n∈N , the first and the last term in the right-hand side of (4.9) are less than ϕ∞ . On the other hand, since φ ϕ ∈ C0 (E), by hypothesis, the second term tends to zero when n goes to +∞. Therefore, since is arbitrary, we end the proof by letting n → +∞ in (4.9).
i
i i
i
i
i
i
136
book 2014/8/6 page 136 i
Chapter 4. Complements on measure theory
4.3.3 Prokhorov’s compactness theorem The theorem below may be considered as a parametrized version of the classical Prokhorov compactness Theorem 4.2.3. Theorem 4.3.2 (Prokhorov’s compactness theorem for Young measures). Let (μn )n∈N be a tight sequence in 6 (Ω; E). Then there exists a subsequence (μnk )k∈N of (μn )n∈N and μ in 6 (Ω; E) such that na r
μnk μ in 6 (Ω; E). PROOF. Since supn∈N μn (Ω × E) = " (Ω) < +∞, there exists a subsequence (not relabeled) and μ ∈ M+ (Ω×E) such that μn μ weakly in the sense of measures in M(Ω×E). Since (μn )n∈N is tight, arguing as in the proof of Proposition 4.3.2, one may assert that μ belongs to 6 (Ω; E). It remains to establish the narrow convergence of μn toward μ, or, equivalently, according to Theorem 4.3.1, the weak convergence of (μnx ) x∈Ω toward (μ x ) x∈Ω in Lw (Ω, M(E)). Let Φ ∈ L1 (Ω), ϕ ∈ C0 (E), and Φ ∈ Cc (Ω) satisfying Ω
|Φ − Φ | d x < .
(4.10)
Since μn weakly converges to μ in M(Ω×E) and (x, λ) → Φ (x)ϕ(λ) belongs to C0 (Ω×E), according to the slicing Theorem 4.2.4, one has n lim Φ (x) (4.11) ϕ(λ) d μ x d x − Φ (x) ϕ(λ) d μ x d x = 0. n→+∞ Ω E Ω E Let us write
n ϕ(λ) d μ x d x − Φ(x) ϕ(λ) d μ x d x Φ(x) Ω E Ω E n n ϕ(λ) d μ x d x − Φ (x) ϕ(λ) d μ x d x ≤ Φ(x) Ω E Ω E n + Φ (x) ϕ(λ) d μ x d x − Φ (x) ϕ(λ) d μ x d x Ω E Ω E + Φ (x) ϕ(λ) d μ x d x − Φ(x) ϕ(λ) d μ x d x . Ω E Ω E
(4.12)
From (4.10), the first and the last term of the right-hand side of (4.12) are less than ϕ∞ . Since is arbitrary, the claim follows from (4.11) by letting n → +∞ in (4.12).
4.3.4 Young measures associated with functions and generated by functions Let u : Ω → E be a given Borel function and consider the image μ = G#" of the measure " by the graph function G : Ω → Ω × E, x → (x, u(x)). Since the image of the measure μ by the projection πΩ on Ω is the Lebesgue measure " , μ belongs to 6 (Ω; E). This measure, concentrated on the graph of u, is called the Young measure associated with the function u. By definition of the image of a measure,
i
i i
i
i
i
i
4.3. Introduction to Young measures
book 2014/8/6 page 137 i
137
μ “acts” on C b (Ω; E) as follows: ∀ϕ ∈ C b (Ω; E),
Ω×E
ϕ(x, λ) d μ(x, λ) =
Ω
ϕ(x, u(x)) d x.
This shows that the probability family (μ x ) x∈Ω associated with μ is (δ u(x) ) x∈Ω . Let (un )n∈N be a sequence of Borel functions un : Ω → E and consider the sequence of na r their associated Young measures (μn )n∈N , μn = (δ un (x) ) x∈Ω ⊗ " . If μn μ in 6 (Ω; E), the Young measure μ is said to be generated by the sequence (un )n∈N . In general (see Examples 4.3.1 and 4.3.2), μ is not associated with a function. Let us rephrase the tightness of a sequence (μn )n∈N in terms of the associated sequence (un )n∈N . We easily obtain the following equivalence: the sequence (μn )n∈N is tight iff 5 4 ∀ > 0, ∃7 , compact subset of E, such that sup " x ∈ Ω : un (x) ∈ E \ 7 < . n∈N
Remark 4.3.3. It is worth noticing that a sequence (μn )n∈N of Young measures associated with a bounded sequence (un )n∈N in L1 (Ω, E) is tight. Indeed according to the Markov inequality, one has 5 4 1 |u | d x " x ∈ Ω : |un (x)| > M ≤ M Ω n 1 ≤ sup |un | d x, M n∈N Ω which tends to zero when M → +∞. Therefore, according to Prokhorov’s theorem, Theorem 4.3.2, for each bounded sequence (un )n∈N in L1 (Ω, E), one can extract a subsequence generating a Young measure μ, i.e., na r
(δ un (x) ) x∈Ω ⊗ " μ.
4.3.5 Semicontinuity and continuity properties Here is a first semicontinuity result related to extended real-valued nonnegative functions. Proposition 4.3.3. Let ϕ : Ω × E → [0, +∞] be a (Ω) ⊗ (E) measurable function such that λ → ϕ(x, λ) is lsc for a.e. x in Ω. Moreover, let (μn )n∈N be a sequence of Young measures in 6 (Ω; E), narrowly converging to some Young measure μ in 6 (Ω; E). Then ϕ(x, λ) d μ(x, λ) ≤ lim inf ϕ(x, λ) d μn (x, λ). n→+∞
Ω×E
Ω×E
PROOF. Let us consider the Lipschitz regularization of ϕ 4 5 ϕ p (x, λ) := inf ϕ(x, ξ ) + p|λ − ξ | ξ ∈E
for p ∈ N intended to go to +∞, and set ψ p = ϕ p ∧ p. It is easily seen that ψ belongs to C b (Ω; E) (see Theorem 9.2.1) and that (ψ p ) p∈N is a nondecreasing sequence which
i
i i
i
i
i
i
138
book 2014/8/6 page 138 i
Chapter 4. Complements on measure theory
pointwise converges to ϕ. Consequently, ψ p (x, λ) d μ(x, λ) = lim n→+∞
Ω×E
ψ p (x, λ) d μn (x, λ)
Ω×E
≤ lim inf n→+∞
Ω×E
ϕ(x, λ) d μn (x, λ),
and we complete the proof thanks to the monotone convergence theorem by letting p → +∞ in the left-hand side. We would like to improve Proposition 4.3.3 for functions ϕ taking negative values or more generally for functions ϕ which are not necessarily bounded from below. We restrict ourselves to sequences of Young measures associated with functions. Let us first recall the notion of uniform integrability: a sequence ( fn )n∈N of functions fn : Ω → R in L1 (Ω) is said to be uniformly integrable if lim sup | fn | = 0. R→+∞ n∈N
[| fn |>R]
From Proposition 2.4.12 this definition is equivalent to Definition 2.4.4 (see also Delacherie and Meyer [198]). Proposition 4.3.4. Let (μn )n∈N be a sequence of Young measures associated with a sequence of functions (un )n∈N , narrowly converging to some Young measure μ. On the other hand, let ϕ : Ω × E → R be a (Ω) ⊗ (E) measurable function such that λ → ϕ(x, λ) is lower semicontinuous for a.e. x in Ω. Assume moreover that the negative part x → ϕ(x, un (x))− is uniformly integrable. Then ϕ(x, λ) d μ(x, λ) ≤ lim inf ϕ(x, un (x)) d x. n→+∞
Ω×E
Ω
PROOF. Let R > 0 intended to tend to +∞ and set ϕR = sup(−R, ϕ)+R. Since ϕR ≥ 0 and λ → ϕR (x, λ) is lower semicontinuous, one may apply Proposition 4.3.3 so that, removing the term R" (Ω), one obtains ϕ(x, λ) d μ(x, λ) ≤ sup(−R, ϕ(x, λ)) d μ(x, λ) Ω×E
Ω×E
≤ lim inf n→+∞
= lim inf n→+∞
On the other hand, sup(−R, ϕ(x, un (x))) d x = Ω
Ω×E
sup(−R, ϕ(x, λ)) d μn (x, λ)
Ω
sup(−R, ϕ(x, un (x))) d x.
(4.13)
[ϕ(.,un (.))≥−R]
ϕ(x, un (x)) d x +
≤ ϕ(x, un (x))d x − Ω
[ϕ(.,un (.)) 0 such that |λ − λ | < η =⇒ |φ(λ) − φ(λ )| < . Let us write ϕ(x, un (x)) d x − ϕ(x, u(x)) d x ≤ |φ(un (x)) − φ(u(x))| d x Ω Ω Ω = |φ(un (x)) − φ(u(x))| d x + |φ(un (x)) − φ(u(x))| d x [|un −u|>η]
[|un −u|≤η]
≤ 2φ∞ " ([|un − u| > η]) + " (Ω).
(4.19)
Now, by hypothesis, limn→+∞ " ([|un − u| > η]) = 0 and, since is arbitrary, the claim follows after letting n → +∞ in (4.19). Second step. We establish the converse implication. Let us consider ϕ ∈ C b (Ω; E) na r defined by ϕ(x, λ) = |λ − u(x)| ∧ C , where C is any positive constant. Since μn μ, one has lim ϕ(x, λ) d μn (x, λ) = ϕ(x, λ) d μ(x, λ), n→+∞
Ω×E
Ω×E
i
i i
i
i
i
i
144
book 2014/8/6 page 144 i
Chapter 4. Complements on measure theory
that is,
lim
n→+∞
Ω
|un (x) − u(x)| ∧ C d x =
Ω
|u(x) − u(x)| ∧ C d x = 0.
(4.20)
On the other hand, for any η > 0, " ([|un − u| > η]) ≤
1 min(η, C )
Ω
|un (x) − u(x)| ∧ C d x.
Consequently, (4.20) yields limn→+∞ " ([|un − u| > η]) = 0. Last step. We establish the second assertion. As previously, according to Theorem 4.3.1, it suffices to establish lim φ(vn ) d x = 1B (x)φ(λ) d μ(x, λ) n→+∞
B
Ω×E
for all Borel subsets B of Ω and all φ ∈ Cc (E). Let > 0. Since φ is uniformly continuous on E, there exists η > 0 such that |φ(vn (x)) − φ(un (x))| < for all x in the set [|vn − un | < η]. On the other hand, since vn − un tends to 0 in measure, limn→+∞ " ([|vn − un | ≥ η]) = 0. Therefore φ(vn ) d x − φ(un ) d x ≤ |φ(vn ) − φ(un )| d x B B Ω = |φ(vn ) − φ(un )| d x [|vn −un |≥η]
+
[|vn −un | Thus, |x|α belongs to H 1 (−1, 1) iff α > 12 . / As a consequence, the function u(x)= |x| does not belong to H 1 (−1, 1)! (c) We are going to prove that any element v ∈ H 1 (a, b ) has a continuous representative. Indeed, this is a very special property of the one-dimensional case. It is no more true as soon as N ≥ 2. We will further carefully study the regularity properties of elements of Sobolev spaces. Before proving this result, let us observe that a function v which has a jump at one point in R does not belong to a Sobolev space W 1, p . For simplicity, take Ω = ] − 1, 1[ and 1 if x ≥ 0, v(x) = 0 if x < 0. 1 . 2
An elementary computation yields that the distributional derivative of v is equal to the Dirac mass at the origin, indeed, v = δ0 , which, as we already observed, is a distribution which is not attached to a function f ∈ L1 (Ω).
i
i i
i
i
i
i
5.1. Sobolev spaces: Definition, density results
book 2014/8/6 page 149 i
149
Theorem 5.1.1. Take 1 ≤ p ≤ +∞. Let Ω = (a, b ) be an open interval of R. (i) Let v ∈ W 1, p (a, b ) and denote by v ∈ L p (a, b ) its first distributional derivative. Then there exists a continuous function v˜ ∈ C([a, b ]) such that ⎧ ˜ fora.e. x ∈ (a, b ), ⎨ v(x) = v(x) x ˜ ˜ v (t )d t ∀ x, y ∈ [a, b ]. v(x) − v(y) = ⎩ y
˜ Indeed v˜ is unique, We say in this case that v admits a continuous representative v. and, when p > 1, it belongs to C0,α ([a, b ]) with α = p1 , 1p + p1 = 1, i.e., ˜ − v(y)| ˜ |v(x) ≤ C |x − y|α
∀x, y ∈ [a, b ].
As a consequence,
x
v(x) − v(y) =
v (t )d t
for a.e. x, y ∈ (a, b ).
y
(ii) Conversely, let us assume that v ∈ L p (Ω) and there exists a function g ∈ L p (Ω) such that x v(x) − v(y) = g (t )d t for a.e. x, y ∈ (a, b ). y
Then, v ∈ W
1, p
(a, b ) and v = g in the distributional sense.
Before proving Theorem 5.1.1, let us recall that the fundamental theorem of calculus states that for any C1 function v we have x v (t ) d t . v(x) = v(a) + a
The above theorem allows us to extend such formula to the class of functions v ∈ W 1, p , the derivative v being taken in the distributional sense and the integral in the Lebesgue sense. PROOF OF THEOREM 5.1.1. Let v ∈ W 1, p (a, b ) and v ∈ L p (a, b ) its distributional derivative. Let us denote by w(·) the function x v (t ) d t . w(x) = a
When 1 < p ≤ +∞, w(·) is Hölder continuous: indeed, for all x, y ∈ (a, b ), y |w(y) − w(x)| = v (t ) d t x y |v (t )| d t ≤ x 1/ p
y
1/ p
|v (t )| d t
≤ |y − x|
p
x
≤ v L p (a,b ) |y − x|1/ p .
i
i i
i
i
i
i
150
book 2014/8/6 page 150 i
Chapter 5. Sobolev spaces
Let us prove that there exists a constant C ∈ R such that v(x) − w(x) = C
for a.e. x ∈ (a, b ).
To that end, let us compute w in a distributional sense and prove that w = v . Taking ϕ ∈ (a, b ) we have b 〈w , ϕ〉(D ,D) = − w(x)ϕ (x) d x a
b
v (t ) d t ϕ (x) d x
x
=− a
a
b
b
1[a,x] (t )v (t ) d t ϕ (x) d x.
=− a
a
The function (t , x) −→ 1[a,x] (t )v (t )ϕ (x) belongs to L1 ((a, b ) × (a, b )), which allows us to apply the Fubini theorem to obtain
b b 1[a,x] (t )ϕ (x) d x d t 〈w , ϕ〉(D ,D) = − v (t )
a
a
b
=−
v (t ) b
=
ϕ (x) d x d t
a
b t
v (t )ϕ(t ) d t
a
= 〈v , ϕ〉( , ) . Hence, w = v in (a, b ), and as a consequence (w − v) = 0. We conclude the proof of assertion (i) by the help of the lemma below; the proof of assertion (ii) is easy and thus is left to the reader. Lemma 5.1.1. Let f ∈ L1l oc (a, b ) such that f = 0 in distributional sense. Then, there exists a constant C ∈ R such that f (x) = C for a.e. x ∈ (a, b ). PROOF. By assumption f = 0 in (a, b ), which is equivalent to saying that b f (x)ϕ (x)d x = 0 ∀ ϕ ∈ (a, b ). a
To exploit this information we need to understand what is the space W = {ϕ : ϕ ∈ (a, b )}. Clearly, for any ϕ ∈ (a, b ) we have that ϕ still belongs to (a, b ) and b ϕ (x)d x = ϕ(b ) − ϕ(a) = 0. a
Conversely, take ψ ∈ (a, b ) such that ϕ ∈ (a, b ). Indeed
b a
ψ(x)d x = 0 and prove that ψ = ϕ for some
x
ϕ(x) =
ψ(t )d t a
i
i i
i
i
i
i
5.1. Sobolev spaces: Definition, density results
book 2014/8/6 page 151 i
151
belongs to C∞ (a, b ) and satisfies ϕ = ψ. Assuming that ψ ≡ 0 outside of [c, d ] with a < c < d < b , we have ϕ(x) = 0 for all x ≤ c and, for all x > d ,
x
b
ψ(t )d t =
ϕ(x) =
ψ(t )d t = 0.
a
a
Hence ϕ ∈ (a, b ). We have obtained that W = {ϕ : ϕ ∈ (a, b )} b ψ(x)d x = 0 . = ψ ∈ (a, b ) : a
b To construct functions ψ ∈ W , we first take some function θ ∈ D(a, b ) such that a θ(x) d x = 1. Then, observe that for any χ ∈ D(a, b ) the function h(·) which is defined by
b h(x) := χ (x) −
χ (t ) d t θ(x) a
satisfies h ∈ (a, b ) and
b
h(x)d x = 0. Hence h ∈ W and
a
b
f (x)h(x) d x = 0. a
Equivalently,
b
f (x)χ (x)d x = Denoting C :=
b a
b
b
χ (t )d t
a
a
f (x)θ(x) d x. a
f (x)θ(x)d x, we obtain
b
( f (x) − C )χ (x) d x = 0.
∀χ ∈ (a, b ) a
We conclude thanks to Theorem 2.2.1. As soon as the dimension N is greater than or equal to 2, the situation is more complex, and, in general, elements of H 1 (Ω) have no continuous representative. Let us illustrate that fact with the help of the following example. 2 Example 5.1.2. Take N = 2 and Ω = B(0, R) with R < 1, that is, Ω = {x = (x1 , x2 ) ∈ R : 2 2 k |x| = x1 + x2 < R}. On Ω, we consider the function v(x) = | ln |x|| , where k is a real parameter. Let us examine, depending on the value of the parameter k, whether the function v belongs to H 1 (Ω). To do so, it is natural to use radial coordinates and take r = |x|. As ˜ ). Here v(r ˜ ) = | ln r |k . It a classical result, we have |∇v|2 = ( ∂∂ vr˜ )2 , where v(x) = v(r follows R
v 2 (x)d x = 2π
B(0,R)
|∇v(x)|2 d x = 2π B(0,R)
0
(ln r )2k r d r,
0 R
k2 r2
(ln r )2k−2 r d r.
i
i i
i
i
i
i
152
book 2014/8/6 page 152 i
Chapter 5. Sobolev spaces
Let us make the change of variable t = − ln r to obtain +∞ 2 2 2k −2t 2 (v + |∇v| )d x = 2π t e d t + 2πk Ω
− ln R
+∞
− ln R
dt t
2−2k
.
It follows that v belongs to H 1 (Ω) iff 2 − 2k > 1, i.e., k < 12 . By taking 0 < k < 12 , we obtain a function belonging to H 1 (Ω) which blows up to +∞ at the origin and which does not have a continuous representative. Remark 5.1.2. It is a central question to know the best regularity results on the elements of the Sobolev spaces. As a general rule, for a function v ∈ L p (Ω), which is a priori defined only almost everywhere on Ω, to know that some of its distribution derivatives belong to L p (Ω) allows us, even if the function v has no continuous representative, to treat it more precisely than almost everywhere with respect to x. We will describe further three distinct approaches to this question: the Sobolev embedding theorem, the trace theory, and the capacity theory. Let us now examine the general properties of the Sobolev spaces. Theorem 5.1.2. Let Ω be an open subset of RN . For any nonnegative integer m and any real number p with 1 ≤ p ≤ +∞, W m, p (Ω) is a Banach space. When p = 2, W m, p (Ω) = H m (Ω) is a Hilbert space. PROOF. Let (vn )n∈N be a Cauchy sequence in W m, p (Ω). For any multi-index α = (α1 , α2 , . . . , αN ) with |α| ≤ m, we have D α vn − D α vn L p ≤ vn − v m W m, p . Hence, the sequence (D α vn )n∈N is a Cauchy sequence in L p (Ω), which is a Banach space. Let in L p (Ω) as n → +∞, vn → v α D vn → gα in L p (Ω) as n → +∞. Convergence in L p (Ω) clearly implies convergence in the distribution sense. By continuity of the derivation operation with respect to the convergence in the distribution sense (see Proposition 2.2.7), we obtain D α vn → gα = D α v
in L p (Ω) as n → +∞.
Hence v ∈ W m, p (Ω) and the sequence (vn )n∈N converges to v in W m, p (Ω). Remark 5.1.3. Notice that the above result relies essentially on the fact that the Sobolev spaces are built by means of the Lebesgue spaces L p (Ω) (which are Banach spaces) and of the generalized notion of derivation, namely, the distributional derivative. The following important result makes the link between the definition of Sobolev spaces relying on the notion of distributional derivative and the completion approach. Theorem 5.1.3. Let Ω = RN . For any 1 ≤ p < +∞ and m ∈ N, (RN ) is a dense subspace of W m, p (RN ). Equivalently,
i
i i
i
i
i
i
5.1. Sobolev spaces: Definition, density results
book 2014/8/6 page 153 i
153
m, p
W0 (RN ) = W m, p (RN ), W m, p (RN ) is the completion of (RN ) with respect to the · W m, p norm. PROOF. For simplicity of notation, let us consider the case m = 1 and prove that (RN ) is dense in W 1, p (RN ), (1 ≤ p < +∞). This is a two-step approximation procedure: (a) Truncation (of the domain). Let M ∈ (RN ) such that M (0) = 1. For any n ∈ N∗ , let us define M n (ξ ) = M ( ξn ). Clearly, M n ∈ (RN ) and, for any ξ ∈ RN , limn→+∞ M n (ξ ) = M (0) = 1. The most commonly used truncation function is a function M : RN → R+ such that 0 ≤ M ≤ 1, M (x) = 1 on B(0, 12 ), and M (x) = 0 for x ≥ 1. Such a function exists by using the Tietze–Urysohn lemma, or it can be explicitly described if needed. But notice that for our purpose, we will only exploit the fact that M ∈ (RN ) and M (0) = 1. For any v ∈ W 1, p (RN ), let us define vn (x) = M n (x)v(x). Clearly, vn has a compact support, since spt vn ⊂ spt M n ⊂ n spt M . Let us verify that vn still belongs to W 1, p (RN ): p |vn | p d x ≤ M ∞ R
N
R
N
|v| p d x < +∞.
On the other hand, the classical differentiation rule is still valid in this context. It is worthwhile to state it as a lemma. Lemma 5.1.2. Let v ∈ W 1, p (Ω) and M ∈ (Ω). Then M v ∈ W 1, p (Ω) and ∂ ∂ xi
(M v) = M
∂v ∂ xi
+
∂M ∂ xi
v.
PROOF. Take ϕ ∈ (Ω) as a test function. By definition, ∂ϕ ∂ (M v), ϕ = − M v, ∂ xi ∂ xi ( , ) ( (Ω), (Ω)) ∂ϕ = − M (x)v(x) d x. ∂ xi Ω Since M ∈ (Ω), we can use the classical differentiation rule ∂ ∂ xi Hence
∂ ∂ xi
(M ϕ) = M
(M v), ϕ
(D ,D)
=−
v. Ω
∂ϕ ∂ xi ∂ ∂ xi
+
∂M ∂ xi
ϕ.
(M ϕ)d x +
v Ω
∂M ∂ xi
ϕ d x.
Noticing that M ϕ ∈ (Ω) we obtain ∂ ∂v ∂M (M v), ϕ = ,Mϕ + v ϕ d x. ∂ xi ∂ xi ∂ xi (D ,D) (D ,D)
i
i i
i
i
i
i
154
book 2014/8/6 page 154 i
Chapter 5. Sobolev spaces
Since v ∈ W 1, p (Ω), ∂v ∂v ∂M ∂M ∂ ϕ dx = (M v), ϕ = M +v M +v ,ϕ . ∂ xi ∂ xi ∂ xi ∂ xi Ω ∂ xi (D ,D) (D ,D) We conclude by noticing that ∂v ∂ xi
M +v
∂M ∂ xi
∈ L p (Ω).
PROOF OF THEOREM 5.1.3 CONTINUED. Applying Lemma 5.1.2 we have ∂ vn ∂ xi Since M n ,
∂ Mn ∂ xi
= Mn
belong to (RN ), we have
∂v ∂ xi ∂ vn ∂ xi
+
∂ Mn ∂ xi
v.
∈ L p (RN ).
Let us prove that the sequence (vn )n∈N norm converges to v in W 1, p (RN ) as n → +∞: |vn − v| p d x = |1 − M n | p |v| p d x. RN
Then notice that
RN
|1 − M n | p |v| p → 0 pointwise as n → +∞, |1 − M n | p |v| p ≤ (1 + M ∞ ) p |v| p
∀n ∈ N,
and apply the Lebesgue dominated convergence theorem to obtain that vn → v in L p (RN ). Similarly, ∂v ∂v → in L p (RN ) as n → +∞. Mn ∂ xi ∂ xi To prove that
∂ vn ∂ xi
→
∂v ∂ xi
in L p (RN ), we just need to prove that ∂ Mn ∂ xi
v →0
in L p (RN ) as n → +∞.
This follows immediately from ∂ M 1 ∂ M x n (x)v(x) = |v(x)| ∂ xi n ∂ xi n ) ) ) 1) )∂ M ) ≤ ) ) |v(x)|, n ) ∂ xi )∞ and hence
) ) ) ) ) )∂ M ) 1) )∂ M ) ) n ) v) ≤ ) ) vL p . ) ) ∂ xi ) p n ) ∂ xi ) L ∞
(b) Regularization. The second step consists in proving that any element u ∈ W 1, p (RN ) with compact support can be approximated in W 1, p (RN ) by a sequence (un )n∈N of elements in (RN ). To that end, we use the regularization by convolution method. With the
i
i i
i
i
i
i
5.1. Sobolev spaces: Definition, density results
book 2014/8/6 page 155 i
155
same notation as in Section 2.2.2, we introduce ρ ∈ (RN ), ρ ≥ 0, with spt ρ ⊂ B(0, 1), and RN ρ(x) d x = 1 and define for each n ∈ N∗ ρn (x) := n N ρ(nx), which satisfies
⎧ ρn ∈ (RN ), ρn ≥ 0, ⎪ ⎪ ⎪ ⎨spt ρ ⊂ B(0, 1/n), n ⎪ ⎪ ⎪ ρn (x) d x = 1. ⎩
Given u ∈ W
(R ), let us define for all n ∈ N∗ , un := u ∗ ρn , that is, u(x − y)ρn (y) d y un (x) = N R = u(y)ρn (x − y) d y.
RN
1, p
N
(5.2) (5.3)
RN
To pass from one equality to the other in the formula above, one just has to make a change of variable and use that the Lebesgue measure on RN is invariant by translation. Depending on the situation, we will use one formula or the other: note that the x variable appears in (5.2) in u, while in (5.3) it appears in ρn . Let us verify that for each n ∈ N, un belongs to (RN ). This means that un has a compact support and that un belongs to C∞ (RN ). Take R > 0 such that u(x) = 0 for |x| > R. By construction, for each n ∈ N∗ , ρn (x) = 0 for |x| > n1 . Let us verify that un (x) = u ∗ ρn (x) is equal to zero when |x| > R + n1 . This follows from the fact that the function y −→ u(x − y)ρn (y) is identically zero when |x| > R+ n1 . Indeed, either |y| > n1 , in which case ρn (y) = 0, or |y| ≤ n1 , in which case |x − y| ≥ |x| − |y| > R +
1 n
−
1 n
=R
and u(x − y) = 0. (One should notice that the above argument can be easily generalized to obtain that for any two functions f and g , spt( f g ) ⊂ spt f + spt g .) Let us now verify that un = u ∗ ρn belongs to C∞ (RN ). At this point, one has to use precisely formula (5.3), where the x variable, with respect to which one wants to derive, appears in ρn . It is the differentiability property of ρn , which makes un = u ∗ ρn differentiable too! One has to derive under the sum sign, which is a direct consequence of the differentiability properties of ρn and of the Lebesgue dominated convergence theorem. As an ∂u illustration, let us prove that for each i = 1, 2, . . . , N , for each x ∈ RN , ∂ xn (x) exists and ∂ un ∂ xi
(x) =
To avoid any confusion, we prefer to write
∂u ∂ xi ∂ un ∂ ei
i
∗ ρn (x). instead of
∂ un , ∂ xi
where ei is the ith vector
of the canonical basis of R . For any t = 0 let us consider the differential quotient ρn (x − y + t ei ) − ρn (x − y) 1 u(y) [un (x + t ei ) − un (x)] = d y. t t RN N
i
i i
i
i
i
i
156
book 2014/8/6 page 156 i
Chapter 5. Sobolev spaces
Clearly u(y)
ρn (x − y + t ei ) − ρn (x − y) t
→ u(y)
∂ ρn
(x − y)
∂ ei
as t → 0,
∂ ρ ρn (x − y + t ei ) − ρn (x − y) n (z) ≤ M |u(y)|, ≤ |u(y)| sup u(y) t z∈RN ∂ ei which belongs to L p (RN ). Note that in the above inequality, we have used that for each n ∈ N, since ρn belongs to (RN ), ρn and its partial derivatives are bounded functions on RN . (M here is a constant which depends on n.) By using the Lebesgue dominated convergence theorem, we obtain ∂ un ∂ ρn (x − y) d y. (x) = u(y) ∂ ei ∂ ei RN ∂ρ
Let us now notice that for x fixed, the function y −→ ( ∂ e n )(x − y) can be written as a i
partial derivative of a function of (RN ). Take ξn (y) = −ρn (x − y). Then ∂ ξn
1 (y) = lim [ξn (y + t ei ) − ξn (y)] t →0 ∂ ei t ρn (x − y − t ei ) − ρn (x − y) = lim t →0 −t ∂ ρn = (x − y). ∂ ei
Thus
∂ un ∂ ei
(x) =
u(y) RN
∂ ξn ∂ ei
(y) d y.
Since u ∈ W 1, p (RN ), and by definition of the distribution derivative ∂ un ∂u (x) = − (y)ξn (y)d y ∂ ei RN ∂ e i ∂u (y) ρn (x − y)d y, = N ∂ e R i that is,
∂ un
=
∂ ei Returning to the usual notation with
∂ ∂ xi
∂ un ∂ xi
∂u ∂ ei
∗ ρn .
instead of =
∂u ∂ xi
∂ ∂ ei
we have
∗ ρn .
Let us apply Proposition 2.2.4(iii) to obtain ∂ un ∂ xi
→
∂u ∂ xi
in L p (RN ) as n → +∞,
that is, un → u in W 1, p (RN ).
i
i i
i
i
i
i
5.1. Sobolev spaces: Definition, density results
book 2014/8/6 page 157 i
157 1, p
In general, when Ω = RN , the two spaces W0 (Ω) and W 1, p (Ω) do not coincide. The 1, p space W0 (Ω) is strictly included in W 1, p (Ω). We will justify this fact a little further by 1, p proving that W0 (Ω) consists of functions whose trace on ∂ Ω is equal to zero. 1, p The strict inclusion W0 (Ω) W 1, p (Ω) for Ω = RN (at least for Ω with a smooth boundary) also can be seen as a consequence of the following result of independent interest. m, p
Proposition 5.1.1. Let Ω be an open set in RN and let u ∈ W0 (Ω). Then the function u˜ which is equal to u on Ω and zero on RN \ Ω belongs to W m, p (RN ). The linear mapping p m, p defined by p(u) = u˜ is an isometry from W0 (Ω) into W m, p (RN ). PROOF. Let us consider the linear mapping p : (Ω) −→ (RN ), u −→ p(u) = u˜ . It is important to notice that since u ∈ (Ω), for each x ∈ ∂ Ω there exists a neighborhood of x on which u is equal to zero. This clearly implies that u˜ ∈ (RN ). Moreover, p is an isometry for the W m, p norms: ∀u ∈ (Ω)
p(u)W m, p (RN ) = uW m, p (Ω) .
Thus, p is a uniformly continuous mapping p : (Ω) → W m, p (RN ), and moreover W m, p (RN ) is a Banach space. It can be continuously extended into a mapping (that we still denote by p) W m, p (Ω)
p : (Ω) m, p
For u ∈ W0 and so
m, p
= W0
(Ω) → W m, p (RN ). m, p
(Ω), there exists a sequence un ∈ (Ω) such that un → u in W0 p(un ) = u˜n → p(u)
(Ω),
in W m, p (RN ).
The convergence in W m, p implies the convergence in L p , which implies the convergence almost everywhere of a subsequence. It follows that p(u) = u˜ , m, p
Moreover, for any u ∈ W0
that is, u˜ ∈ W m, p (RN ).
(Ω), p(u)W m, p (RN ) = uW m, p (Ω) .
Example 5.1.3. Take Ω a smooth, bounded domain in RN , for example, Ω = B(0, 1) the open ball centered at the origin with radius one. The constant function u ≡ 1 on Ω be1, p longs to W 1, p (Ω) but does not belong to W0 (Ω). Otherwise, its extension u˜ by zero outside of Ω would belong to W 1, p (RN ), which is not true since its first partial derivatives ∂∂ xu˜ are measures supported by the sphere ∂ Ω. When N = 1, dd ux = δ{−1} − δ{1} , i a distribution which cannot be represented by an integrable function! When Ω ⊂ RN , one can adapt the previous argument, that is, truncation on the domain and regularization by convolution.
i
i i
i
i
i
i
158
book 2014/8/6 page 158 i
Chapter 5. Sobolev spaces
Theorem 5.1.4 (Meyers–Serrin). If 1 ≤ p < ∞, then C∞ (Ω) ∩ W m, p (Ω) is dense in W m, p (Ω). PROOF. For each integer k ≥ 1 set 1 Ωk = x ∈ Ω : x ≤ k and dist(x, ∂ Ω) > k and Ω0 = the empty set. Define for each k = 1, 2, . . . Gk := Ωk+1 ∩ (Ωk−1 )c . Then (Gk )k≥1 is an open covering of Ω. Let (αk )k≥1 be a C∞ partition of unity for Ω subordinate to the open cover (Gk )k≥1 , that is, ⎧ spt αk ⊂ Gk , αk ∈ (Gk ), αk ≥ 0, ⎪ ⎨ +∞ ⎪ ⎩ αk = 1. k=1
Let u ∈ W (Ω). Given > 0, we are going to construct an element ϕ ∈ C∞ (Ω) such that u − ϕW m, p (Ω) < . Consider the truncated function αk u. The same argument as in Theorem 5.1.3 applies m, p and αk u ∈ W0 (Gk ). It can be extended by zero outside of Gk into a function belonging to W m, p (RN ). We now use a regularization by convolution method with a kernel (ρn )n∈N . By taking n = n(k) sufficiently large, we have that m, p
ρn(k) ∗ (αk u) − αk uW m, p (Ω) ≤ and Define ϕ :=
+∞ k=1
2k
spt[ρn(k) ∗ (αk u)] ⊂ Gk . ρn(k) ∗ (αk u) and note that for x ∈ Gk we have ϕ(x) =
+1 j =−1
[ρn(k+ j ) ∗ (αk+ j u)](x),
i.e., there are at most three terms which are nonzero in this sum. Hence ϕ ∈ C∞ (Ω) and ) ) ∞ ) ) ) ) u − ϕW m, p (Ω) = ) αk u − (αk u) ∗ ρn(k) ) ) ) m, p k=1
≤
∞
W
αk u − (αk u) ∗ ρn(k) W m, p
k=1
≤ and the proof is complete. Remark 5.1.4. The above result shows that, equivalently, the space W m, p (Ω) can be defined as the completion with respect to the · W m, p (Ω) norm of the space = {v ∈ C∞ (Ω) : vW m, p (Ω) < +∞}. This result was obtained in 1964 by Meyers and Serrin.
i
i i
i
i
i
i
5.2. The topological dual of H01 (Ω). The space H −1 (Ω).
book 2014/8/6 page 159 i
159
5.2 The topological dual of H01 (Ω). The space H −1 (Ω). When studying a functional space, a fundamental question which arises is the description of its topological dual space. Let us first consider the Sobolev space H01 (Ω), in which case this question can be easily solved thanks to the Hilbert structure of H01 (Ω) and the density of (Ω). Let us take T ∈ H01 (Ω)∗ an element of the topological dual of H01 (Ω). By using the Riesz representation theorem of the elements of the topological dual of a Hilbert space, we obtain the existence of a unique element g ∈ H01 (Ω) such that 7 6 N ∂ v ∂ g ∀v ∈ H01 (Ω) T (v) = 〈v, g 〉 = vg + d x. (5.4) Ω i =1 ∂ xi ∂ xi Moreover, T H 1 (Ω)∗ 0
⎤1/2 ⎡ 6 7 N ∂g 2 2 g + d x⎦ . = g H 1 (Ω) = ⎣ 0 Ω i =1 ∂ xi
(5.5)
This is a first description of H01 (Ω)∗ . Indeed, one can give another description of the elements of H01 (Ω)∗ as distributions. To that end, let us introduce the identity mapping i : (Ω) → H01 (Ω). To each element T ∈ H01 (Ω)∗ one can associate its restriction T ◦ i to (Ω) i
T
T ◦ i : (Ω) → H01 (Ω) → R. Noticing that (Ω) is dense in H01 (Ω) for the · H 1 norm, it is equivalent to know T 0 or its restriction T ◦ i to (Ω). Moreover, the convergence for the topology of (Ω) (Definition 2.2.1) is stronger than the norm convergence in H01 (Ω). Therefore, T ◦ i is a continuous linear form on (Ω) and T ◦ i is a distribution. One can summarize the above results by saying that the mapping H01 (Ω)∗ → (Ω), T −→ T ◦ i = T | (Ω) is a continuous embedding of H01 (Ω)∗ into the space of distributions (Ω). Let us now give a precise description of the distributions which are so obtained: for any ϕ ∈ (Ω) 7 6 N ∂ϕ ∂ g 〈T ◦ i, ϕ〉 = T (ϕ) = gϕ + d x. Ω i =1 ∂ xi ∂ xi ∂g
Let us write g0 = g and gi = − ∂ x for i = 1, . . . , N . Then, i
〈T ◦ i, ϕ〉 =
Ω
g0 ϕ −
A =
g0 +
N i =1
gi
N ∂ gi i =1
∂ xi
∂ϕ ∂ xi B
dx
,ϕ
. ( (Ω), (Ω))
i
i i
i
i
i
i
160
book 2014/8/6 page 160 i
Chapter 5. Sobolev spaces
Therefore, we can identify T with a distribution of the form g0 +
N
∂ gi i =1 ∂ xi
∈ L (Ω). Moreover, 2
6 T H 1 (Ω)∗ = g H 1 (Ω) = 0
0
71/2
N i =0
with g0 , g1 , . . . , gN
2
Ω
gi (x) d x
.
(5.6)
Conversely, any distribution T ∈ (Ω) which can be written as T = g0 + g0 , g1 . . . gN ∈ L (Ω) clearly satisfies
N
2
∀ϕ ∈ (Ω)
6
〈T , ϕ〉(D ,D) =
6 ≤
g0 ϕ −
Ω
N Ω
i =0
N i =1
gi
∂ϕ
with
7 dx
∂ xi 71/2
gi2 (x) d x
∂ gi i =1 ∂ xi
ϕH 1 (Ω) . 0
As a consequence, T can be uniquely extended by density to a continuous linear form on
(Ω)
H01 (Ω)
= H01 (Ω) which we still denote by T . Moreover, 6 T H 1 (Ω)∗ ≤ 0
71/2
N i =0
2
Ω
gi (x) d x
.
(5.7)
∂g It is important to notice that there is not a unique way to write T as a sum T = g0 + ∂ xi . i ∂g Indeed, take any g such that Δ g = 0, then ∂∂x ( ∂ x ) = 0, and T can, as well, be written i i ∂g as T = g0 + ∂∂x ( gi + ∂ x ). By using (5.6) and (5.7), we obtain that i
i
T H 1 (Ω)∗ = min 0
⎧6 N ⎨ ⎩
i =0
71/2
Ω
gi (x)2 d x
⎫ N ∂ g ⎬ i . : T = g0 + ∂ x i⎭ i =1
(5.8)
Note that the minimum is achieved precisely by taking the ( g0 , . . . , gN ) provided by the Riesz representation (5.4), (5.5). Let us summarize the above results in the following statement. Theorem 5.2.1. Let us define H −1 (Ω) as the topological dual space of H01 (Ω), i.e., H −1 (Ω) = H01 (Ω)∗ . Then H −1 (Ω) is isometrically isomorphic to the Hilbert space of distributions T ∈
(Ω) satisfying T = g0 +
N ∂ g i i =1
with T H −1 (Ω) = inf
∂ xi
⎧6 N ⎨ ⎩
i =0
for some g0 , g1 , . . . , gN ∈ L2 (Ω) 71/2 gi 2L2 (Ω)
: T = g0 +
⎫ N ∂ gi ⎬ i =1
∂ xi ⎭
.
The above results can be easily extended to higher-order Sobolev spaces and to Sobolev spaces built on L p (Ω) spaces instead of L2 (Ω). In that case, one uses the duality between L p (Ω) and L p (Ω) with 1p + p1 = 1 to obtain the following result.
i
i i
i
i
i
i
1, p
5.3. Poincaré inequality and Rellich–Kondrakov theorem in W0 (Ω)
book 2014/8/6 page 161 i
161 m, p
Theorem 5.2.2. Let 1 ≤ p < +∞. The topological dual space of W0 (Ω) is denoted by W −m, p (Ω), where 1p + p1 = 1. It is isometrically isomorphic to the Banach space consisting of those distributions T ∈ (Ω) satisfying D α gα T=
with each gα ∈ L p (Ω)
0≤|α|≤m
with T W −m, p (Ω) = inf
⎧6 ⎨ ⎩
0≤|α|≤m
71/ p
Ω
|gα | p (x) d x
: T=
⎫ ⎬
D α gα . ⎭ 0≤|α|≤m
5.3 Poincaré inequality and Rellich–Kondrakov theorem in 1,p W0 (Ω) The Poincaré inequality is a basic ingredient of the variational approach to the Dirichlet problem. It provides the coercivity of the Dirichlet integral Ω |Dv|2 d x on the space H01 (Ω). Theorem 5.3.1 (Poincaré inequality). Let Ω be an open subset of RN which is bounded in one direction. Then, for each 1 ≤ p < +∞, there exists a constant C p,N (Ω) which depends only on Ω, p, and N such that 6 p 71/ p 1/ p N 1, p ∂v p |v(x)| d x ≤ C p,N (Ω) ∀ v ∈ W0 (Ω). (5.9) dx ∂ x Ω Ω i i =1
PROOF. Since Ω is bounded in a direction, we can find a system of coordinates, which for simplicity we still denote (x1 , x2 , . . . , xN ), such that Ω is contained in the strip a ≤ xN ≤ b . Let us write x = (x , xN ) with x = (x1 , x2 , . . . , xN −1 ). 1, p 1, p Since (Ω) is dense in W0 (Ω) (by definition of W0 (Ω)) let us first take v ∈ (Ω). We will then extend the result by a density and continuity argument. Let us define v on Ω, v˜ = 0 on RN \ Ω. We have, for any x ∈ RN , with a ≤ xN ≤ b , x ∈ RN −1 xN ∂ v˜ ˜ , a) + ˜ = v(x ˜ , xN ) = v(x (x , t ) d t . v(x) ∂ xN a ˜ , a) = 0 Since v(x
˜ , xN ) = v(x
xN a
∂ v˜ ∂ xN
Applying the Hölder inequality we obtain ( 1p +
(x , t ) d t .
1 p
= 1) p xN ∂ v˜ p p/ p ˜ , xN )| ≤ (xN − a) (x , t ) d t |v(x ∂ xN a p ∂ v˜ ≤ (xN − a) p/ p (x , t ) d t . R ∂ xN
i
i i
i
i
i
i
162
book 2014/8/6 page 162 i
Chapter 5. Sobolev spaces
Let us first integrate with respect to x ∈ RN −1 . We obtain p ∂ v˜ p p/ p ˜ , xN )| d x ≤ (xN − a) |v(x (x) d x. N −1 N ∂ x R R N Let us now integrate with respect to xN (a ≤ xN ≤ b ) to obtain p ∂ v˜ (b − a)1+ p/ p p ˜ |v(x)| dx ≤ (x) d x. N N ∂ x 1 + p/ p R R N Since
∂ v˜ ∂ xN
= Dv · n, where n is a unit normal vector to the strip containing Ω, we have p 6 7 p/ p p7 6 N ∂v N ∂v N ∂ v˜ p ˜ ˜ p ni ≤ ni = ∂ xN i =1 ∂ xi i =1 ∂ xi i =1 p N ∂v ˜ p/ p ≤N . ∂ xi i =1
p p
Hence, noticing that 1 + = p p N (b − a) p p/ p ∂ v˜ p ˜ |v(x)| dx ≤ N d x. p RN RN i =1 ∂ xi Since v˜ and all the tively, to v and
∂v ∂ xi
∂ v˜ ∂ xi
(i = 1, . . . , N ) are equal to zero outside of Ω and are equal, respec-
on Ω, we obtain
6
1/ p Ω
|v(x)| p d x
≤ C p,N (Ω)
p 71/ p N ∂v dx Ω i =1 ∂ xi
∀ v ∈ (Ω)
1/ p
with C p,N (Ω) = (b − a) N 1/ p . p
1, p
By the density of (Ω) in W0 (Ω), this inequality can be directly extended to v ∈ 1, p W0 (Ω). Definition 5.3.1. The Poincaré constant is the smallest constant C for which the inequality 1, p (5.9) holds for all v ∈ W0 (Ω). We denote it by C¯ p,N (Ω): ⎧ ⎫ 6 71/ p 1/ p N ∂ v p ⎨ ⎬ 1, p |v| p d x ≤C ∀v ∈ W0 (Ω) . C¯ p,N (Ω) = inf C : dx ⎩ ⎭ Ω Ω i =1 ∂ xi Equivalently, ⎫ ⎧6 p 71/ p N ⎬ ⎨ ∂v 1, p : |v| p d x = 1, v ∈ W0 (Ω) . = inf dx ⎭ ⎩ Ω i =1 ∂ xi C¯ p,N (Ω) Ω 1
In some instances (see the “cloud of ice” in Attouch [37], for example), it is useful to know precisely how this constant depends on the size of Ω.
i
i i
i
i
i
i
1, p
5.3. Poincaré inequality and Rellich–Kondrakov theorem in W0 (Ω)
book 2014/8/6 page 163 i
163
Proposition 5.3.1. For any R > 0, for any Ω in RN , C¯ p,N (RΩ) = RC¯ p,N (Ω). 1, p
1, p
PROOF. Let v ∈ W0 (RΩ). Let us define vR (x) = v(Rx). Clearly vR ∈ W0 (Ω) and p p −N |vR (x)| d x = |v(Rx)| d x = R |v(y)| p d y. Ω
Ω
RΩ
p p N N ∂ vR ∂ p |DvR (x)| d x = (x) d x = v(Rx) d x ∂ x ∂ x Ω Ω i =1 Ω i =1 i i p N ∂v = Rp (Rx) d x Ω i =1 ∂ xi p N ∂v p−N (ξ ) d ξ . =R ∂ x RΩ i i =1
Hence
p
|vR (x)| p d x Ω N ¯ p |DvR | p d x ≤ R C p (Ω) Ω N p−N ¯ p |Dv| p d x C p (Ω) ≤R R RΩ p ¯ |Dv| p d x. = (R C p (Ω))
|v(y)| d y = R RΩ
N
RΩ
1, p
This being true for any v ∈ W0 (RΩ), it follows that C¯ p,N (RΩ) ≤ RC¯ p,N (Ω). Conversely, since Ω = R1 (RΩ), we have C¯ p,N (Ω) ≤ C¯ p,N (RΩ) = RC¯ p,N (Ω) holds.
1 ¯ C (RΩ), R p,N
and thus the equality
We will see later when p = 2 in Chapter 8, Theorem 8.4.1, how the Poincaré constant can be related to the first eigenvalue of the Laplacian operator with Dirichlet boundary condition. Another basic result is the compact embedding theorem of Rellich–Kondrakov. For that purpose, we need to recall the Kolmogorov compactness criteria in L p (RN ). Theorem 5.3.2 (Kolmogorov). Let p ∈ [1, +∞[ and let 0 be a subset of L p (RN ). Then 0 is relatively compact in L p (RN ) iff the following three conditions are satisfied: (i) 0 is bounded in L p (RN ); (ii) limR→+∞ {|x|>R} |v(x)| p d x = 0 uniformly with respect to v ∈ 0 ; (iii) lim h→0 τ h v − vL p (RN ) = 0 uniformly with respect to v ∈ 0 , where τ h v is the translated function (τ h v)(x) := v(x − h).
i
i i
i
i
i
i
164
book 2014/8/6 page 164 i
Chapter 5. Sobolev spaces
PROOF. Let us prove the implication which is useful for applications, that is, (i), (ii), (iii) =⇒ 0 relatively compact in L p (RN ). Equivalently, we have to prove that 0 is precompact, which means that for any > 0, there exists a finite number of balls B(v1 , ), . . . , B(vk , ) which cover 0 . So, let us give > 0. By (ii) there exists some R > 0 such that |v(x)| p d x < . ∀v ∈ 0 |x|>R
Let (ρn )n∈N be a mollifier. It follows from Proposition 2.2.4 that p p ∀n ≥ 1, ∀v ∈ L p (RN ) v − v ∗ ρn L p ≤ ρn (y)v − τy vL p d y. RN
Hence
v − v ∗ ρn L p ≤ sup v − τy vL p . |y|≤1/n
By (iii), there exists some integer N () ∈ N such that ∀v ∈ 0
v − v ∗ ρN () L p < .
On the other hand, for any x, x ∈ RN , for any v ∈ L p (RN ) and n ∈ N, |(v ∗ ρn )(x) − (v ∗ ρn )(x )| ≤ |v(x − y) − v(x − y)|ρn (y)d y ˇ L p ρn L p ≤ τ x vˇ − τ x v ≤ τ x−x v − vL p ρn L p . (Note that this last property follows from the invariance property of the Lebesgue measure.) Moreover, |(v ∗ ρn )(x)| ≤ vL p ρn L p . Let us consider the family = {v ∗ ρN () : B(0, R) → R, v ∈ 0 }. By using (i) and (iii), we have that it satisfies the conditions of the Ascoli theorem. Hence, it is precompact for the topology of the uniform convergence on B(0, R), and we have the existence of a finite set {v1 , . . . , vk } of elements of 0 such that k 3
B(vi ∗ ρN () , R−N / p ) ⊃ .
i =1
So, for all v ∈ 0 , there exists some j ∈ {1, 2, . . . , k} such that ∀x ∈ B(0, R)
|v ∗ ρN () (x) − v j ∗ ρN () (x)| ≤ |B(0, R)|−1/ p .
Hence,
v − v j L p (RN ) ≤
1/ p p
|x|>R
|v| d x
+
1/ p p
|x|>R
|v j | d x
+ v − v ∗ ρN () L p + v j − v j ∗ ρN () L p + v ∗ ρN () − v j ∗ ρN () L p (B(0,R)) .
i
i i
i
i
i
i
1, p
5.3. Poincaré inequality and Rellich–Kondrakov theorem in W0 (Ω)
book 2014/8/6 page 165 i
165
The last term can be majorized as follows:
v ∗ ρN () − v j ∗ ρN () L p (B(0,R)) =
1/ p p
|v ∗ ρN () (x) − v j ∗ ρN () (x)| d x
B(0,R)
≤ |B(0, R)|−1/ p |B(0, R)|1/ p = . Finally, v − v j L p (RN ) ≤ 5, which proves that 0 is precompact in L p (RN ). The rate of convergence in L p (RN ) of τ h v to v as |h| → 0 can be made precise for functions v belonging to W 1, p (RN ). Proposition 5.3.2. For all 1 ≤ p ≤ +∞ and all v ∈ W 1, p (RN ) the following inequality holds: ∀h ∈ RN τ h v − vL p (RN ) ≤ DvL p (RN ) |h|, where DvL p (RN ) = ( RN |Dv(x)| p d x)1/ p and |Dv(x)| is the Euclidean norm of Dv(x). PROOF. Since (RN ) is dense in W 1, p (RN ), we just need to prove this inequality for v ∈ (RN ). We have (τ h v)(x) − v(x) = v(x − h) − v(x) 1 =− Dv(x − t h) h d t . 0
Hence, by the Cauchy–Schwarz inequality |(τ h v)(x) − v(x)| ≤
1
|Dv(x − t h)| |h| d t , 0
and then, by Hölder’s inequality, |(τ h v)(x) − v(x)| p ≤ |h| p
1
|Dv(x − t h)| p d t .
0
Integrating over RN we have
p
RN
|(τ h v)(x) − v(x)| d x ≤ |h|
p
1
p
|Dv(x − t h)| d t d x.
RN
0
Let us use the Fubini–Tonelli theorem and the invariance by translation of the Lebesgue measure in RN to obtain p p |τ h v − v| d x ≤ |h| |Dv(x)| p d x, RN
RN
which ends the proof.
i
i i
i
i
i
i
166
book 2014/8/6 page 166 i
Chapter 5. Sobolev spaces
We can now state the main compactness theorem in Sobolev spaces. Theorem 5.3.3 (Rellich–Kondrakov). Let Ω be a bounded open subset of RN . Then the 1, p canonical embedding W0 (Ω) → L p (Ω) is compact. In other words, every bounded subset of 1, p W0 (Ω) is relatively compact in L p (Ω). PROOF. Let us denote by p the natural extension operator by zero outside of Ω, which is 1, p a continuous operator from W0 (Ω) into W 1, p (RN ). Indeed, it follows from Proposition 5.1.1 that p is a linear isometry. The restriction operator r : L p (RN ) → L p (Ω) defined by r (v) = v|Ω is clearly linear continuous (with norm less than or equal to one). So, the embedding 1, p
i : W0 (Ω) → L p (Ω) can be written as the composition i = r ◦ j ◦ p, 1, p
p
j
r
W0 (Ω) → W 1, p (RN ) → L p (RN ) → L p (Ω), where j is the canonical embedding from W 1, p (RN ) into L p (RN ). 1, p Let B be the unit ball in W0 (Ω). Since r is continuous and the image of a compact set by a continuous mapping is still compact, we need to show that ( j ◦ p)(B) is relatively compact in L p (RN ). To that end, we use the Kolmogorov compactness criteria in L p (RN ); cf. Theorem 5.3.2. (i) Since j and p are linear continuous, ( j ◦ p)(B) is bounded in L p (RN ). (ii) Since Ω is bounded, there exists R > 0 such that Ω ⊂ B(0, R). Hence for all v ∈ B, p(v) = 0 on RN \ B(0, R) and {|x|>R} | p(v)| p d x = 0. (iii) Since p(B) is contained in the unit ball of W 1, p (RN ), it follows from Proposition 5.3.2 that there exists a constant C > 0 such that ∀v ∈ B τ h p(v) − p(v)L p (RN ) ≤ C |h|, which proves that τ h p(v) tends to p(v) in L p (RN ) as h → 0, uniformly with respect to v ∈ B. This completes the proof. By a proof similar to the one above we obtain the following useful result. Corollary 5.3.1. Let 0 be a subset of W 1, p (RN ), 1 ≤ p < +∞, which satisfies the following two conditions: (i) 0 is bounded in W 1, p (RN ), i.e., supv∈0 vW 1, p (RN ) < +∞. (ii) 0 is L p -equi-integrable at infinity, i.e., limR→+∞ {|x|>R} |v(x)| p d x = 0 uniformly with respect to v ∈ 0 . Then 0 is relatively compact in L p (RN ).
i
i i
i
i
i
i
5.4. Extension operators from W 1, p (Ω) into W 1, p (RN ).
book 2014/8/6 page 167 i
167
5.4 Extension operators from W 1,p (Ω) into W 1,p (RN ). Poincaré inequalities and the Rellich–Kondrakov theorem in W 1,p (Ω) 1, p
We have been able to prove some important properties of the space W0 (Ω), without any regularity assumptions on the boundary of the open set Ω, because there always exists a 1, p continuous extension operator from W0 (Ω) into W 1, p (RN ), namely, the extension by zero (Proposition 5.1.1). When working with the space W 1, p (Ω), the situation is more delicate and the regularity of the boundary of Ω will play a crucial role in the proofs and the statements of the results. Once more, a basic ingredient will be the obtainment of an extension operator. So doing, we will be able to use the previous results in W 1, p (RN ), where techniques like convolution and translation naturally apply. Notation. Given x ∈ RN we write x = (x , xN ) with x ∈ RN −1 , x = (x1 , x2 , . . . , xN −1 ). We write = {x = (x , xN ) : xN > 0} the open upper half-space; RN + x 2 )1/2 < 1} the open unit ball in RN ; B = B(0, 1) = {x ∈ RN : |x| = ( N i =1 i ; B+ = B(0, 1) ∩ RN + B0 = B ∩ RN −1 = {x = (x , xN ) ∈ RN : |x | ≤ 1 and xN = 0}. A C1 -diffeomorphism from an open set U ⊂ X into an open set V ⊂ Y , where X and Y are normed linear spaces, is a one-to-one mapping ϕ from U into V which is continuously differentiable and such that its inverse ϕ −1 is continuously differentiable from V onto U . Definition 5.4.1. Let Ω be an open subset of RN . We say that Ω is of class C1 if for all x ∈ Γ = ∂ Ω (the topological boundary of Ω), there exists an open neighborhood G of x and a C1 -diffeomorphism ϕ from B(0, 1) onto G such that ϕ(B+ ) = G ∩ Ω, ϕ(B0 ) = G ∩ Γ . ). Then Theorem 5.4.1. Let Ω be an open subset of class C1 which is bounded (or Ω = RN + 1, p 1, p N there exists an extension operator P : W (Ω) → W (R ) which is linear and continuous. More precisely, for all v ∈ W 1, p (Ω), (i) Pv|Ω = v; (ii) PvL p (RN ) ≤ C vL p (Ω) ; (iii) PvW 1, p (RN ) ≤ C vW 1, p (Ω) . The constant C above depends only on Ω and p. is a halfThe proof of Theorem 5.4.1 first proves the result in the case where Ω = RN + space then passes to the general situation by using partition of unity and local coordinates. ) → W 1, p (RN ) which is Lemma 5.4.1. There exists an extension operator P : W 1, p (RN + obtained by reflection: u(x , xN ) if xN > 0, (Pu)(x , xN ) = u(x , −xN ) if xN < 0.
i
i i
i
i
i
i
168
book 2014/8/6 page 168 i
Chapter 5. Sobolev spaces
Moreover, P is linear and continuous, with PuL p (RN ) ≤ 2uL p (RN ) ; +
PuW 1, p (RN ) ≤ 2uW 1, p (RN ) . +
PROOF. Clearly Pu ∈ L p (RN ) and 1/ p 6 p |Pu(x)| d x = 2 RN
71/ p p
RN +
|u(x)| d x
= 21/ p uL p ≤ 2uL p .
Let us prove that ∂ ∂ xi ∂ ∂ xN
(Pu) = P (Pu) = S
∂u
∂ xi ∂u
where we have set
∂ xN
(Sv)(x , xN ) =
for i = 1, 2, . . . , N − 1, ,
if xN > 0, v(x , xN ) −v(x , −xN ) if xN ≤ 0.
We need to introduce a truncation (on the domain) function. Let η ∈ C∞ (R) be such that 0 if t < 12 , η(t ) = 1 if t > 1 and define ηk (t ) = η(k t ) for k ∈ N∗ . Given ϕ ∈ (RN ) let us compute 〈 ∂∂x Pu, ϕ〉D ,D : i
(a) Take first 1 ≤ i ≤ N − 1. By definition ∂ ∂ϕ (Pu), ϕ =− Pu. d x. ∂ xi ∂ xi RN ( (RN ), (RN )) By definition of P, ∂ϕ ∂ϕ Pu dx = d x u(x , xN ) (x , xN ) d xN ∂ xi ∂ xi RN RN −1 R+ ∂ϕ + dx u(x , −xN ) (x , xN ) d xN N −1 − ∂ xi R R ∂ψ = u(x) d x, ∂ xi RN +
(5.10)
(5.11)
where ψ(x , xN ) = ϕ(x , xN ) + ϕ(x , −xN ). But ψ does not belong to (RN ); this is where we need to use a truncation method. + Take as a test function (ηk ψ)(x , xN ) = ηk (xN )ψ(x , xN ). Note that ηk ψ belongs to
(RN ) and since u ∈ W 1, p (RN ) + + ∂u ∂ u (ηk ψ) d x = − ηk ψ d x. RN ∂ x i RN ∂ x i +
+
i
i i
i
i
i
i
5.4. Extension operators from W 1, p (Ω) into W 1, p (RN ).
Noticing that
∂ ∂ xi
book 2014/8/6 page 169 i
169
∂ψ
(ηk ψ) = ηk ∂ x (since 1 ≤ i ≤ N − 1), we obtain i
RN +
u ηk
∂ψ ∂ xi
dx = −
∂u
RN +
∂ xi
ηk ψ d x.
Then, pass to the limit as k → +∞. By using the Lebesgue dominated convergence theorem, ∂u ∂ψ u dx = − ψ d x. (5.12) RN ∂ x i RN ∂ x i +
+
Then combine (5.10), (5.11), (5.12) to obtain ∂u ∂ (Pu), ϕ = ψdx ∂ xi ∂ xi RN + ∂u ∂u = ϕ(x , xN ) d x + ϕ(x , −xN ) d x N ∂ x N ∂ x R+ R+ i i ∂u = ϕ d x. P ∂ xi RN Hence
∂ ∂ xi
(Pu) = P( ∂∂ xu ) for i = 1, 2, . . . , N − 1. I
(b) Take now i = N and compute ∂ϕ ∂ϕ Pu. dx = d x u(x , xN ) (x , xN ) d xN ∂ xN ∂ xN RN RN −1 R+ ∂ϕ + dx u(x , −xN ) (x , xN ) d xN . N −1 − ∂ xN R R Then note that ∂ϕ ∂ u(x , −xN ) (x , xN ) d xN = − u(x , xN ) (ϕ(x , −xN )) d xN . ∂ xN ∂ xN R− R+ Let us introduce χ (x , xN ) := ϕ(x , xN ) − ϕ(x , −xN ). We have ∂ϕ ∂χ Pu. dx = u d x. ∂ xN RN RN ∂ xN
(5.13)
+
Since χ (x , 0) = 0, there exists some constant M > 0 such that |χ (x , xN )| ≤ M |xN | for |xN | ≤ R,
where spt ϕ ⊂ B(0, R).
The same argument as before yields, with ηk as a truncation function (note that ηk χ ∈
(RN )), + ∂u ∂ u (ηk χ ) d x = − ηk χ d x. N N R+ ∂ xN R+ ∂ xN We have
∂ ∂ xN
(ηk χ ) = ηk
∂χ ∂ xN
+ ηk χ
i
i i
i
i
i
i
170
book 2014/8/6 page 170 i
Chapter 5. Sobolev spaces
and
ηk (xN ) = kη (k xN ).
Hence RN +
∂u ∂ xN
ηk χ d x = −
uηk
RN +
∂χ ∂ xN
dx −
u kη (k xN )χ (x , xN ) d x.
RN +
(5.14)
Let now pass to the limit as k → +∞. By the Lebesgue dominated convergence theorem,
∂u
ηk χ d x →
∂ xN
RN +
uηk
RN +
∂χ ∂ xN
dx →
∂u RN +
∂ xN u
RN +
χ d x,
∂χ ∂ xN
d x,
and the last integral in (5.14) can be majorized as k uη (k xN )χ (x , xN ) d x ≤ k M C |u|xN d x RN 0 0 such that ∀v ∈ W
1, p
(Ω)
) ) ) ) 1 ) ) v(x) d x ) )v − ) ) |Ω| Ω
≤ C p DvL p (Ω) . L p (Ω)
PROOF. Take V = {v ∈ W 1, p (Ω) : Ω v(x) d x = 0}. Clearly, V is a closed linear subspace of W 1, p (Ω) and the only constant function which is in V is the function which 1 vanishes identically. Then the proof is concluded by noticing that v − |Ω| v d x belongs Ω to V for every v ∈ W 1, p (Ω).
5.5 The Fourier approach to Sobolev spaces. The space H s (Ω), s ∈R We give here a description of the space H 1 (RN ) by using the Fourier transform. We recall that for any v ∈ L1 (RN ), its Fourier transform vˆ is defined by 1 ˆ )= v(ξ e −i ξ ·x v(x) d x. (5.15) (2π)N /2 RN It can be shown that vˆ belongs to C0 (RN ). When vˆ ∈ L1 (RN ), one can invert the Fourier transform and obtain v from vˆ by using the following formula: 1 ˆ ) dξ . e i ξ ·x v(ξ v(x) = (5.16) (2π)N /2 RN Indeed, the condition vˆ ∈ L1 (RN ) is not always easy to handle. One often prefers to work with the Fourier–Plancherel transform, which is defined as follows. The basic property is that if v ∈ L1 (RN ) ∩ L2 (RN ), then vˆ ∈ L2 (RN ) and, in fact, ˆ L2 (RN ) = vL2 (RN ) . v
(5.17)
Note that, since the Lebesgue measure of RN is infinite, L2 (RN ) is not a subspace of L1 (RN ) and one cannot define directly vˆ for v ∈ L2 (RN ) by using formula (5.15). However, if v ∈ L1 ∩L2 , formula (5.15) makes sense and (5.17) tells us that v → vˆ is an isometry for the L2 norm. By using the density of L1 ∩ L2 into L2 (note, for example, that L1 ∩ L2 contains the continuous functions with compact support), one can extend the Fourier transform to L2 .
i
i i
i
i
i
i
174
book 2014/8/6 page 174 i
Chapter 5. Sobolev spaces
The so-obtained mapping is called the Fourier–Plancherel transform and we denote it by 0 . The basic properties of 0 are summarized in the following proposition. Proposition 5.5.1. One can associate to each function v ∈ L2 (RN ) a function 0 v ∈ L2 (RN ), called the Fourier–Plancherel transform of v, so that the following properties hold: ˆ (i) For all v ∈ L1 (RN ) ∩ L2 (RN ), 0 v = v. (ii) For all v ∈ L2 (RN ), 0 vL2 (RN ) = vL2 (RN ) . (iii) The mapping 0 is an isometric isomorphism from L2 (RN ) onto L2 (RN ). (iv) For v ∈ L2 (RN ), setting for each n ∈ N wn (ξ ) =
1 (2π)N /2
e −i ξ ·x v(x) d x, |x|≤n
we have wn → 0 v in L2 (RN ) as n → +∞. (v) Conversely, for v ∈ L2 (RN ), setting for each n ∈ N 1 e i ξ ·x 0 (v)(ξ ) d ξ , vn (x) = (2π)N /2 {|ξ |≤n} we have vn → v in L2 (RN ) as n → +∞. We can now give the following characterization of the Sobolev space H 1 (RN ). Theorem 5.5.1. We have H 1 (RN ) = {v ∈ L2 (RN ) : (1 + |ξ |2 )1/2 0 (v) ∈ L2 (RN )} and, for any v ∈ H 1 (RN ), vH 1 (RN ) = (1 + |ξ |2 )1/2 0 (v)L2 (RN ) . PROOF. Take first v ∈ (RN ). By definition, ∂G v ∂ xk
(ξ ) =
1 (2π)
N /2
e −i ξ ·x R
N
∂v ∂ xk
(x) d x.
Let us integrate by parts the above formula. Since v has compact support, we obtain ∂G v ∂ xk
ˆ ). (ξ ) = (i ξk )v(ξ
(5.18)
We now use (see Theorem 5.1.3) the density of (RN ) in H 1 (RN ). For any v ∈ H 1 (RN ) there exists a sequence (vn )n∈N in (RN ) such that vn → v with respect to the norm topology of H 1 (RN ). So for each n ∈ N, ∂H vn ∂ xk
(ξ ) = (i ξk )vˆn (ξ ).
(5.19)
i
i i
i
i
i
i
5.5. The Fourier approach to Sobolev spaces. The space H s (Ω), s ∈ R
book 2014/8/6 page 175 i
175
By definition of the Fourier–Plancherel transform (which coincides with the classical Fourier transform for functions in (RN )) we have ∂ vn 0 (ξ ) = (i ξk )0 (vn )(ξ ). (5.20) ∂ xk Since vn → v for the · H 1 (RN ) norm, we have that vn → v in L2 (RN ) and
∂ vn ∂ xk
−→
∂v ∂ xk
in L2 (RN ). We now use the continuity property of 0 for the L2 (RN ) norm and pass to the limit in (5.20): one can extract a subsequence n( p) such that for a.e. ξ ∈ RN , 0 (vn( p) )(ξ ) −→ 0 (v)(ξ )
! " ∂ vn( p) 0 (ξ ) −→ 0 ∂∂ xv (ξ ) for a.e. ξ ∈ RN k ∂ xk and so obtain
0
∂v
(ξ ) = i ξk 0 (v)(ξ )
∂ xk
for a.e. ξ ∈ RN .
(5.21)
It follows from the above argument and the isometry property of 0 in L2 (RN ) that v ∈ H 1 (RN ) ⇐⇒ 0 (v) ∈ L2 (RN ) and i ξk 0 (v) ∈ L2 (RN ) ∀ k = 1, 2, . . . , N , ⇐⇒ (1 + |ξ |2 )1/2 0 (v) ∈ L2 (RN ). Moreover, v2H 1 (RN ) =
RN
v 2 (x) + 6
N
∂v
2
∂ xk
k=1
(x) d x
2 7 N ∂v |0 (v)(ξ )| + (ξ ) d ξ 0 ∂ xk 2
=
RN
= R
N
k=1
(1 + |ξ |2 ) |0 (v)(ξ )|2 d ξ
= (1 + |ξ |2 )1/2 0 (v)2L2 (RN ) and the proof is complete. The above characterization of the space H 1 (RN ) via the Fourier–Plancherel transform is quite useful. It permits us to obtain in an elegant way a number of important properties of Sobolev spaces. Let us first show how to obtain by this way the Rellich–Kondrakov theorem. Theorem 5.5.2 (Rellich–Kondrakov, second proof). Let Ω be a bounded open subset of RN with a boundary ∂ Ω of class C1 . Then the embedding of H 1 (Ω) into L2 (Ω) is compact. PROOF. Let (vn )n∈N be a sequence in H 1 (Ω) which is bounded for the · H 1 (Ω) norm, i.e., supn∈N vn H 1 (Ω) < +∞. By using the extension operator P defined in Theorem 5.4.1, which is linear and continuous from H 1 (Ω) into H 1 (RN ), and by using a truncation
i
i i
i
i
i
i
176
book 2014/8/6 page 176 i
Chapter 5. Sobolev spaces
argument (Ω is assumed to be bounded), we can assume that the sequence vn is bounded in H 1 (RN ) and vn ≡ 0 outside a ball B(0, R) (with R independent of n ∈ N). Since (vn )n∈N is bounded in L2 (RN ), we can extract a weakly convergent subsequence vn(k) v
in w − L2 (RN ).
We are going to prove that, because of the uniform bound on the H 1 (RN ) norm of the vn(k) , the convergence is actually strong in L2 (RN ). Without loss of generality we can assume v = 0 (replacing vn(k) by vn(k) − v !). Let us simplify the notation and write vk instead of vn(k) . We have in w − L2 (RN ),
(5.22)
outside of B(0, R),
(5.23)
vk → 0 vk ≡ 0
sup vk H 1 (RN ) < +∞,
(5.24)
k∈N
and we want to prove that vk → 0 in s − L2 (RN ). To that end, we use the Fourier– Plancherel transformation and its isometrical property from L2 (RN ) onto L2 (RN ). We need to prove that 0 (vk )L2 (RN ) −→ 0 as k → +∞, and we know (see Theorem 5.5.1) that sup (1 + |ξ |2 )1/2 0 (vk )L2 (RN ) := C < +∞.
k∈N
Let us write
2
RN
2
|0 (vk )(ξ )| d ξ =
|ξ |≤M
|0 (vk )(ξ )| d ξ +
≤
|ξ |≤M
+ ≤
|0 (vk )(ξ )|2 d ξ
|0 (vk )(ξ )|2 d ξ 1
1 + M2
|ξ |≤M
|ξ |≥M
RN
(1 + |ξ |2 )|0 (vk )(ξ )|2 d ξ
|0 (vk )(ξ )|2 d ξ +
C2 1 + M2
.
(5.25)
On the other hand, since vk ≡ 0 outside of B(0, R), we have vk ∈ L2 (RN ) ∩ L1 (RN ) and 0 (vk ) = vˆk , i.e., 1 0 (vk )(ξ ) = e −i ξ ·x vk (x) d x (2π)N /2 RN 1 = 1B(0,R) (x)e −i ξ ·x vk (x) d x. (2π)N /2 Rn For any ξ ∈ RN , the function x → 1B(0,R) (x)e −i ξ ·x belongs to L2 (RN ). By using (5.22), we obtain ∀ξ ∈ RN
0 (vk )(ξ ) → 0
as k → +∞.
(5.26)
i
i i
i
i
i
i
5.5. The Fourier approach to Sobolev spaces. The space H s (Ω), s ∈ R
book 2014/8/6 page 177 i
177
To apply the Lebesgue dominated convergence theorem, we notice that by the Cauchy– Schwarz inequality, |0 (vk )(ξ )|2 ≤
1
vk 2L2 (RN ) |B(0, R)|,
(5.27)
where |B(0, R)| is the Lebesgue measure of the ball B(0, R). By using (5.26) and (5.27), we obtain |0 (vk )(ξ )|2 d ξ → 0 as k → +∞.
(5.28)
(2π)N
|ξ |≤M
Returning to (5.25), by using (5.28), we obtain |0 (vk )(ξ )|2 d ξ ≤ lim sup RN
k→+∞
C2 1 + M2
.
This being true for arbitrarily large M , by letting M → +∞, we finally obtain lim 0 (vk )L2 (RN ) = 0,
k→+∞
which completes the proof. Clearly, the characterization of the space H 1 (RN ) by means of the Fourier–Plancherel transform can be easily extended to higher-order Sobolev spaces H m (Ω), m ∈ N. Theorem 5.5.3. For any m ∈ N H m (RN ) = {v ∈ L2 (RN ) : (1 + |ξ |2 ) m/2 0 (v) ∈ L2 (RN )} and, for any v ∈ H m (RN ), vH m (Rn ) = (1 + |ξ |2 ) m/2 0 (v)L2 (RN ) . A major interest of the above approach is that it suggests a natural definition of the space H s (RN ) for s ∈ R, s being not necessarily an integer (s being possibly a fraction or more generally any real number), and s being not necessarily positive. The central point is that the property for the integral (1 + |ξ |2 ) m |0 (v)(ξ )|2 d ξ RN
to be finite makes sense for any exponent m ∈ R, since the quantity a x makes sense for any x ∈ R as soon as a > 0. (Here a = 1 + |ξ |2 , which is clearly positive.) We are led to the following definition. Definition 5.5.1. Let s ≥ 0 be a nonnegative real number. Let us define H s (RN ) = {v ∈ L2 (RN ) : (1 + |ξ |2 ) s /2 0 (v) ∈ L2 (RN )}, which is equipped with the scalar product, for any u, v ∈ H s (RN ) 〈u, v〉H s (RN ) := (1 + |ξ |2 ) s 0 (u)(ξ )0 (v)(ξ ) d ξ , RN
i
i i
i
i
i
i
178
book 2014/8/6 page 178 i
Chapter 5. Sobolev spaces
and the corresponding norm, vH s (RN ) :=
RN
1/2 (1 + |ξ |2 ) s |0 (v)(ξ )|2 d ξ
.
Let us summarize in the following statement some of the basic properties of the Sobolev spaces H s (RN ). Proposition 5.5.2. For any s ∈ R+ , H s (RN ) is a Hilbert space. When s = m ∈ N we have that H s (RN ) = H m (RN ) = W m,2 (Ω) is the classical Sobolev space. PROOF. Clearly 0 is an isomorphism from H s (RN ) onto the Lebesgue space L2 (a d m), where a d m is the measure with density a(ξ ) = (1 + |ξ |2 ) s with respect to the Lebesgue measure d m on RN . Moreover, 0 is an isometry and the Hilbert structure of the weighted Lebesgue space L2 (a d m) is transferred to H s (RN ) by the Fourier–Plancherel mapping 0 . So H s (RN ) ∼ = L2 ((1 + |ξ |2 ) s d m) (isomorphism) and the proof is complete. In accordance with the definition of H −1 (Ω) as the topological dual space of H01 (Ω), let us give the following definition of the Sobolev space H −s (RN ) when s ≥ 0. Definition 5.5.2. Let s be a nonnegative real number. By definition H −s (RN ) = H s (RN )∗ is the topological dual of H s (RN ).
5.6 Trace theory for W 1,p (Ω) spaces To study the Dirichlet problem by variational techniques one needs to solve the following problem: “For an arbitrary v ∈ H 1 (Ω), is it possible to give a meaning to the boundary condition v = g on ∂ Ω?” Clearly, considering v only as an element of L2 (Ω) is not sufficient information to talk about v on ∂ Ω, because the Lebesgue measure of ∂ Ω is zero (for Ω smooth enough). Therefore, one has to rely on the additional information on v, namely, “ ∂∂ xv belongs to i
L2 (Ω) for any i = 1, . . . , N ,” to give meaning to v on ∂ Ω. In this section and the next two, we give different answers to this question by using very different techniques. Note that except when N = 1, the space H 1 (Ω) is not embed¯ In this section, we use the geometrical ded in the space of continuous functions C(Ω). properties of ∂ Ω (the fact that it is locally an (N −1)-dimensional manifold) to prove that for a general v ∈ W 1, p (Ω), 1 ≤ p ≤ +∞, it is possible to give meaning to v on ∂ Ω, that is, the trace of v on ∂ Ω. Note that for a regular function v, this notion has to reduce to the restriction of v on ∂ Ω. This naturally suggests defining the notion of trace by using a density and extension by continuity argument. Theorem 5.6.1. Let Ω be a bounded open set whose boundary ∂ Ω is of class C1 . Then, for ¯ is dense in W 1, p (Ω) and the restriction mapping any 1 ≤ p < +∞, (Ω) ¯ −→ L p (∂ Ω), γ0 : (Ω) v −→ γ0 (v) = v|∂ Ω ,
i
i i
i
i
i
i
5.6. Trace theory for W 1, p (Ω) spaces
book 2014/8/6 page 179 i
179
¯ associates its restriction to ∂ Ω, can be extended by continuity which to each element v ∈ (Ω) into a linear continuous mapping from W 1, p (Ω) into L p (∂ Ω), which we still denote by γ0 . (Without ambiguity, one can still use the simpler notation v|∂ Ω .) The so-defined mapping γ0 : W 1, p (Ω) −→ L p (∂ Ω) is called the trace operator of order zero. PROOF. By Proposition 5.4.1, and the regularity property of the boundary ∂ Ω, we know ¯ is dense in W 1, p (Ω). For any v ∈ (Ω), ¯ without any ambiguity, we can define that (Ω) the restriction of v to ∂ Ω, setting γ0 (v) := v|∂ Ω . Assume for a moment that we have been able to prove that ¯ · 1, p ) −→ (L p (∂ Ω), · p γ0 : ( (Ω), W (Ω) L (∂ Ω) ) is continuous. Since γ0 is linear, it is uniformly continuous. The space L p (∂ Ω) is a Banach space (it is complete). Therefore, all the conditions of the extension by continuity theorem are fulfilled, which provides the existence of γˆ0 , the unique linear and continuous extension of γ0 γˆ0 : W 1, p (Ω) −→ L p (∂ Ω). For simplicity of notation, we still denote by γ0 the so-defined operator, which is the trace operator. Thus, we just need to prove that γ0 is continuous. To do so, we first consider the case of a half-space Ω = RN and then use local coordinates. + = {x = (x , xN ) ∈ RN −1 × R : xN > 0}. Then, for any Lemma 5.6.1. Take Ω = RN + 1 ≤ p < +∞, the following inequality holds: ¯N) ∀v ∈ (R +
γ0 (v)L p (RN −1 ) ≤ p 1/ p vW 1, p (RN ) . +
¯ N ). For any x ∈ RN −1 we have PROOF. Let v ∈ (R + +∞ ∂ |v(x , 0)| p = − |v(x , xN )| p d xN ∂ x 0 N +∞ ∂v p−1 ≤p |v(x , xN )| (x , xN )d xN . ∂ x 0 N Let us apply the Young convexity inequality ab ≤ with
1 p
+
1 p
1 p
ap +
1 p
bp
= 1 to the following situation: ∂v (x , xN ) a= ∂ xN
We obtain
p
|v(x , 0)| ≤ p 0
+∞
and
b = |v(x , xN )| p−1 .
p
1 1 ∂v ( p−1) p (x , xN ) + |v(x , xN )| d xN . p ∂ xN p
i
i i
i
i
i
i
180
book 2014/8/6 page 180 i
Chapter 5. Sobolev spaces
By using the relation ( p − 1) p = p we obtain
p
|v(x , 0)| ≤ ( p − 1) 0
+∞
p
|v(x , xN )| d xN +
p ∂v (x , xN ) d xN . ∂ xN
+∞ 0
Integrating over RN −1 yields
p ∂v |v(x , 0)| p d x ≤ ( p − 1) |v(x)| p d x + (x) d x N −1 N N ∂ x R R+ R+ N N p ∂v ≤ ( p − 1) |v(x)| p d x + dx ∂ xi RN RN + + i =1 p7 6 N ∂v p |v(x)| + ≤p d x. RN i =1 ∂ xi
+
Hence v(·, 0)L p (RN −1 ) ≤ p 1/ p vW 1, p (RN ) +
and the proof is complete. PROOF OF THEOREM 5.6.1 CONTINUED. We use a system of local coordinates. With the same notation as in Section 5.4, we set ¯⊂ Ω
k 3
Gi
¯ ⊂ Ω, G open ∀i = 0, . . . , k, with G 0 i
i =0
while ϕi : B(0, 1) → Gi , i = 1, 2, . . . , k, are the local coordinates, and {α0 , . . . , αi } is an ¯ associated partition of unity, i.e., αi ∈ (Gi ), αi ≥ 0, ki=0 αi = 1 on Ω. ¯ For any 1 ≤ i ≤ k, let us define Take v ∈ (Ω). (αi v) ◦ ϕi on B+ , wi = \ B+ . 0 on RN + ). By Lemma 5.6.1, we have Clearly wi belongs to (RN + wi (·, 0)L p (RN −1 ) ≤ p 1/ p wi W 1, p (RN ) . +
(5.29)
By using classical differential calculus rules (note that all the functions αi , v, ϕi are continuously differentiable), one obtains the existence, for any i = 1, . . . , k, of a constant Ci such that (5.30) wi W 1, p (RN ) ≤ Ci vW 1, p (Ω) . +
Combining the two inequalities (5.29) and (5.30), we obtain wi (·, 0)L p (RN −1 ) ≤ Ci p 1/ p vW 1, p (Ω) .
(5.31)
We now use the definition of the L p (∂ Ω) norm which is based on the use of local coordinates. One can show that an equivalent norm to the L p (∂ Ω) norm can be obtained
i
i i
i
i
i
i
5.6. Trace theory for W 1, p (Ω) spaces
book 2014/8/6 page 181 i
181
by using local coordinates: denoting by ∼ the extension by zero outside of RN −1 \ {y ∈ RN −1 : |y| < 1}, we have that p N −1 I L p (∂ Ω) = {v : ∂ Ω → R : (α ), 1 ≤ i ≤ k} i v) ◦ ϕi (·, 0) ∈ L (R
and v −→
6 k i =1
71/ p p I (α i v) ◦ ϕi p
L (RN −1 )
(5.32)
is an equivalent norm to the L p (∂ Ω) norm. I This definition of the L p (∂ Ω) norm and the inequality (5.31) (note that wi = (α i v) ◦ ϕi ) yield vL p (∂ Ω) ≤ C ( p, N , Ω)vW 1, p (Ω) for some constant C ( p, N , Ω). Thus, γ0 is continuous, which ends the proof of Theorem 5.6.1. Let us now give some of the most important properties of the trace operator γ0 . Proposition 5.6.1. Let us assume that Ω is an open bounded subset of RN whose boundary 1, p ∂ Ω is C1 . Then, for any 1 ≤ p < ∞, W0 (Ω) is equal to the kernel of γ0 , i.e., 1, p
W0 (Ω) = {v ∈ W 1, p (Ω) : γ0 (v) = 0}. 1, p
PROOF. We first show the inclusion W0 (Ω) ⊂ ker γ0 . 1, p 1, p Take v ∈ W0 (Ω). By definition of W0 (Ω), there exists a sequence of functions 1, p (vn )n∈N , vn ∈ (Ω) such that vn → v in W (Ω). Since γ0 (vn ) = vn |∂ Ω = 0, by continuity of γ0 we obtain that γ0 (v) = 0, i.e., v ∈ ker γ0 . The other inclusion is a bit more involved. We just sketch the main lines of its proof. Using local coordinates, we prove the following result. 1, p Take v ∈ W 1, p (RN ) such that γ0 (v) = 0. Prove that v ∈ W0 (RN ). Let us first extend + + N v by zero outside of R+ . By using the information γ0 (v) = 0 one can verify that the soobtained extension v˜ belongs to W 1, p (RN ). Then let us translate v˜ and consider for any h >0 ˜ , xN − h). ˜ , xN ) = v(x τ h v(x ˜ We have that for sufficiently Finally, one regularizes by convolution the function τ h v. ˜ belongs to (RN ˜ ) and ρ (τ ) as h → 0 and small, ρ (τ h v) v) tends to v in W 1, p (RN h + + 1, p
). → 0. Hence v ∈ W0 (RN +
Proposition 5.6.2 (Green’s formula). Let Ω be an open bounded set in RN whose boundary ∂ Ω is of class C1 . Then, for any u, v ∈ H 1 (Ω) and for any 1 ≤ i ≤ N , we have
∂u Ω
∂ xi
vd x = −
u Ω
∂v ∂ xi
dx +
∂Ω
γ0 (u)γ0 (v)(n.ei )d σ.
i
i i
i
i
i
i
182
book 2014/8/6 page 182 i
Chapter 5. Sobolev spaces
¯ Let PROOF. Let us first establish the Green’s formula for smooth functions u, v ∈ (Ω). 1 us start from the divergence theorem, which states that for all C real-valued function u and vector-valued function V , div(u V )d x = u(V · n) d σ. Ω
∂Ω
Take now V = (0, . . . , v, . . . , 0) = vei , all components being equal to zero, except the component of rank i. We obtain ∂v ∂u u dx = +v uv(n · ei ) d σ. ∂ xi ∂ xi Ω ∂Ω Let us now consider arbitrary elements u, v ∈ H 1 (Ω) and use a density argument. By Proposition 5.4.1, there exist approximating sequences (uk )k∈N (respectively, ¯ such that u converges to u in H 1 (Ω) (respectively, v con(vk )k∈N ) of elements of (Ω) k k 1 verges to v in H (Ω)). For each k ∈ N, we have ∂ vk ∂ uk uk dx = + vk uk vk (n · ei ) d σ. (5.33) ∂ xi ∂ xi Ω ∂Ω Let us now apply Theorem 5.6.1. By definition of γ0 , and the continuity property of γ0 from H 1 (Ω) into L2 (∂ Ω), we have uk |∂ Ω −→ γ0 (u) in L2 (∂ Ω), vk |∂ Ω −→ γ0 (v) in L2 (∂ Ω). Hence
∂Ω
uk vk (n · ei ) d σ −→
∂Ω
γ0 (u)γ0 (v)(n · ei ) d σ.
One can pass to the limit, without any difficulty, on the left-hand side of (5.33). We finally obtain ∂v ∂u u dx = +v γ0 (u)γ0 (v)(n · ei ) d σ, ∂ xi ∂ xi Ω ∂Ω which ends the proof. Remark 5.6.1. (a) One should retain from the above argument the general method of the proof of Green’s formulas: first establish it for smooth functions, then pass to the limit by using the continuity properties of the trace operators. (b) The same formula holds for u ∈ W 1, p (Ω) and v ∈ W 1,q (Ω) with 1p + q1 = 1. Let us now examine the space of traces {γ0 (v) : v ∈ W 1, p (Ω)}. By Theorem 3.6.1 we know that the space trace is included in L p (∂ Ω). Indeed, we are going to show that the range of γ0 is a strict subspace of L p (∂ Ω), namely, γ0 (v) ∈ W 1−1/ p, p (∂ Ω). This result requires the use of fractional Sobolev spaces on a manifold, which is a quite involved subject. When p = 2, one can give a simpler proof of it thanks to the Fourier approach to Sobolev spaces. Proposition 5.6.3 (range of γ0 ). Let Ω be an open bounded set in RN whose boundary ∂ Ω is of class C1 . Then the trace operator γ0 is linear continuous and onto from H 1 (Ω) → H 1/2 (∂ Ω).
i
i i
i
i
i
i
5.6. Trace theory for W 1, p (Ω) spaces
book 2014/8/6 page 183 i
183
PROOF. The definition of H 1/2 (∂ Ω) is obtained by using local coordinates. Thus, we just and ∂ Ω = RN −1 . By using the continuity need to prove Proposition 5.6.3 when Ω = RN + of the reflection operator P : H 1 (RN ) → H 1 (RN ) (see Lemma 5.4.1), we just need to prove + that the trace operator γ0
H 1 (RN ) −→ H 1/2 (RN −1 ) is continuous. By using the density of (RN ) in H 1 (RN ) and the definition of γ0 , we can finally reduce our study to the following situation. Prove the existence of some constant C ≥ 0 such that for any v ∈ (RN ) v(·, 0)H 1/2 (RN −1 ) ≤ C vH 1 (RN ) .
(5.34)
To prove (5.34), we use the Fourier transform as defined in Section 5.5. Denoting mN = (1/(2π)N /2)d x, we have 0N (v)(ξ ) = e −i ξ .x v(x)d mN , RN
where 0N refers to the Fourier transform in RN . Let us use the classical notation x = (x , xN ) ∈ RN −1 × R. For any v ∈ (RN ), by using the Fubini theorem, we obtain −i xN t −i x ·y e v(y, t )d mN −1 (y) d m1 (t ) 0N (v)(x , xN ) = e RN −1
R
= 01 (0N −1 v(x , ·))(xN ) = 0N −1 (01 v(·, xN ))(x ).
(5.35)
Let us compute the H 1/2 (RN −1 ) norm of γ0 (v). We have γ0 (v)(x ) = v(x , 0) = 0¯1 01 (v(x , ·))(0), where we used the Fourier inversion formula (Proposition 5.5.1). Hence +∞ 01 (v(x , ·))(t ) d t . γ0 (v)(x ) =
(5.36)
−∞
Let us apply 0N −1 to the two sides of (5.36) and use (5.35) to obtain
0N −1 (γ0 (v))(x ) = From Theorem 5.6.1 we have vH 1 (RN ) =
+∞ −∞
(0N v)(x , t ) d t .
(5.37)
R
N
R
N
=
(1 + |x|2 )|(0N v)(x)|2 d x (1 + |x |2 + t 2 )|(0N v)(x , t )|2 d x d t .
We may rewrite (5.37) as 0N −1 (γ0 (v))(x ) =
+∞ −∞
(0N v)(x , t ).(1 + |x |2 + t 2 )1/2
dt (1 + |x |2 + t 2 )1/2
i
i i
i
i
i
i
184
book 2014/8/6 page 184 i
Chapter 5. Sobolev spaces
and use the Cauchy–Schwarz inequality to obtain |0N −1 (γ0 (v))(x )|2 ≤
+∞
−∞
|(0N v)(x , t )|2 (1 + |x |2 + t 2 ) d t
+∞ −∞
dt 1 + |x |2 + t 2
. (5.38)
An elementary computation yields (after the change of variable t = (1 + |x |2 )1/2 s)
+∞ −∞
dt 2
1 + |x | + t
2
=
π
.
(5.39)
|(0N v)(x , t )|2 (1 + |x |2 + t 2 )d t .
(5.40)
(1 + |x |2 )1/2
Combining (5.38) and (5.39), we obtain |(1 + |x |2 )1/4 0N −1 (γ0 (v))(x )|2 ≤ π
+∞ −∞
Let us integrate (5.40) over RN −1 to obtain |(1 + |x |2 )1/4 0N −1 (γ0 (v))(x )|2 d x RN −1 ≤π |(0N v)(x , t )|2 (1 + |x |2 + t 2 )d x d t .
(5.41)
RN
By using again Theorem 5.5.1 and Definition 5.5.1 of H 1/2 , we thus get $ γ0 (v)H 1/2 (RN −1 ) ≤ πvH 1 (RN ) , which expresses that γ0 is continuous from H 1 (RN ) into H 1/2 (RN −1 ). Remark 5.6.2. When v belongs to W 2, p (Ω), by a similar argument one can give a meaning to ∂∂ nv . Just notice that ∇v ∈ W 1, p (Ω)N , and hence the trace of ∇v on ∂ Ω belongs to L p (∂ Ω)N . One defines ∂v := γ0 (∇v) · n, ∂n which belongs to L p (∂ Ω). Indeed, one can show that ∂v ∂n
∈ W 1−1/ p, p (∂ Ω).
When p = 2, for v ∈ H 2 (Ω) we have ∂∂ vn ∈ H 1/2 (∂ Ω). One can also show that the operator v → {v|∂ Ω , ∂∂ nv } is linear continuous and onto from W 2, p (Ω) onto W 2−1/ p, p (∂ Ω) × W 1−1/ p, p (∂ Ω).
5.7 Sobolev embedding theorems Let Ω be an open set in RN . We have seen in Section 5.1 (Theorem 5.1.1) that each element of the Sobolev space W 1, p (a, b ), 1 ≤ p ≤ +∞, has a continuous representative. This is no longer true for the elements of the space W 1,2 (Ω) as soon as the dimension of the space N ≥ 2. This raises a natural question: Is there a general relation between the numbers m, ¯ p, N which allows us to conclude that W m, p (Ω) → C(Ω)?
i
i i
i
i
i
i
5.7. Sobolev embedding theorems
book 2014/8/6 page 185 i
185
Indeed, the answer is yes, and the Sobolev embedding Theorem 5.7.2 establishes that this is true as soon as m p > N . Another important aspect of the Sobolev embedding theorem is that, even if v ∈ W m, p (Ω) and m p < N , one can say better than v ∈ L p (Ω): m . indeed v ∈ Lq (Ω) with q1 = 1p − N We stress the fact that the Sobolev embedding theorem plays a crucial role in the variational approach to partial differential equations. It allows us to make the link between the two scales of spaces: W m, p (Ω) and Ck,α (Ω). When p = 2, an incisive approach to this question consists in using the Fourier– Plancherel transformation, as developed in Section 5.5. Theorem 5.7.1. Let s > 0 and assume that 2s > N . Then H s (RN ) is continuously embedded in C(RN ). PROOF. We recall (see Theorem 5.5.1 and Definition 5.5.1) that v ∈ H s (RN ) ⇐⇒ (1 + |ξ |2 ) s /2 0 (v) ∈ L2 (RN ). Let us set g = (1 + |ξ |2 ) s /2 0 (v), which, by assumption, belongs to L2 (RN ). We have 0 (v) = g (1 + |ξ |2 )−s /2 and v = 0¯ (0 v). When 0 (v) = g (1 + |ξ |2 )−s /2 belongs to L1 (RN ), then 0¯ coincides with the classical inverse Fourier transform, and consequently g (1 + |ξ |2 )−s /2 belongs to L1 (RN ) as v ∈ C(RN ). Since g belongs to L2 (RN ), the function 2 −s /2 2 N soon as (1 + |ξ | ) belongs to L (R ), i.e., RN (1 + |ξ |2 )−s d ξ < +∞. We have RN
dξ (1 + |ξ |2 ) s
=c
r N −1
+∞
0
(1 + r 2 ) s
dr
and this last integral is finite iff 2s − (N − 1) > 1, i.e., 2s > N . Let us now examine the general case 1 ≤ p ≤ +∞ and first consider the space W 1, p (Ω). By induction on m, we will then derive the result for W m, p (Ω). Theorem 5.7.2 (Sobolev). Let Ω be an open bounded subset of RN with a C1 boundary ∂ Ω. Let 1 ≤ p ≤ +∞ and consider the Sobolev space W 1, p (Ω). Then, the following continuous embedding results hold: ∗
(i) If 1 ≤ p < N , then W 1, p (Ω) → L p (Ω) with ∗
1 p∗
=
1 p
− N1 . More precisely, any element
v ∈ W 1, p (Ω) belongs to L p (Ω) and there exists a constant C , depending only on p, N , and Ω such that for all v ∈ W 1, p (Ω), vL p ∗ (Ω) ≤ C vW 1, p (Ω) . (ii) If p = N , then W 1, p (Ω) → Lq (Ω) for all 1 ≤ q < +∞. ¯ More precisely, we have W 1, p (Ω) → C0,α (Ω) with (iii) If p > N , then W 1, p (Ω) → C(Ω). N α = 1 − p , i.e., each element v ∈ W 1, p (Ω) is Hölder continuous with exponent α, and
there exists a constant C depending only on p, N , and Ω such that for all v ∈ W 1, p (Ω), |v(x) − v(y)| ≤ C vW 1, p (Ω) |x − y|α
for a.e. x, y ∈ Ω.
i
i i
i
i
i
i
186
book 2014/8/6 page 186 i
Chapter 5. Sobolev spaces
PROOF. The proof method is similar to the one used in Section 5.4. We first prove the Sobolev continuous embeddings (i), (ii), (iii) in the case Ω = RN : indeed, one obtains in this case slightly more precise results as stated in Theorem 5.7.3 (Sobolev–Gagliardo– Nirenberg for the case 1 ≤ p < N ), Theorem 5.7.4 (Morrey for the case p > N ), and Theorem 5.7.5 for the critical case p = N . Let us introduce the linear continuous extension operator P : W 1, p (Ω) → W 1, p (RN ), whose properties are described in Theorem 5.4.1. Let us consider, for example, the case 1 ≤ p < N . The composition of the continuous operators P
W 1, p (Ω) → W 1, p (RN )
T h m. 4.7.3
→
∗
r
∗
L p (RN ) → L p (Ω)
(where r is the restriction operator to Ω) is still continuous, and it is the canonical em∗ bedding from W 1, p (Ω) into L p (Ω), which is the identity. The same argument works for the cases p > N and p = N . Note that to use the extension operator technique, which is developed in Theorem 5.4.1, we need to make some regularity assumptions on Ω, namely, ∂ Ω is assumed to be of class C1 . From now on in this section, we work on the whole space RN .
5.7.1 Case 1 ≤ p < N Theorem 5.7.3 (Sobolev–Gagliardo–Nirenberg). Let 1 ≤ p < N . Then W 1, p (RN ) → ∗ L p (RN ) with p1∗ = 1p − N1 . More precisely, there exists a constant C = C ( p, N ) such that ∀v ∈ W 1, p (RN )
vL p ∗ (RN ) ≤ C ∇vL p (RN ) .
Remark 5.7.1. Before proving Theorem 5.7.3, it is worth pointing out that by an elementary homogeneity argument, one can obtain the precise value of p ∗ = pN /(N − p). Let us assume that there exists a constant C and some 1 ≤ q < +∞ such that for all v ∈ W 1, p (RN ), vLq (RN ) ≤ C ∇vL p (RN ) . Let us prove that, necessarily, q = p ∗ . In the above inequality, instead of v, let us take vλ (x) = v(λx) with λ > 0. We have
1/q q
RN
|v(λx)| d x
≤C
1/ p p
RN
p
λ |∇v(λx)| d x
.
Let us make the change of variable y = λx. We obtain 1 λ
N /q
vLq (RN ) ≤ C λ
1 λ
N/p
∇vL p (RN ) ,
which, after simplification, yields vLq (RN ) ≤ C λ(1−N / p+N /q) ∇vL p (RN ) .
i
i i
i
i
i
i
5.7. Sobolev embedding theorems
book 2014/8/6 page 187 i
187
Clearly, this formula makes sense only when 1 −
N p
N q
+
= 0, i.e., q = p ∗ . Otherwise, if
1 − Np + Nq > 0, let λ → 0 in the above inequality, and if 1 −
N p
+ Nq < 0, let λ → +∞. In
both cases, one obtains that for any v ∈ W 1, p (RN ), v ≡ 0, which is an absurd statement! To prove Theorem 5.7.3 we will use the following lemma, which is due to Gagliardo. Lemma 5.7.1. Let N ≥ 2 and g1 , g2 , . . . , gN ∈ LN −1 (RN −1 ). Then the function g defined by g (x) = g1 (˜ x1 ) g2 (˜ x2 ) . . . gN (˜ xN ) belongs to L1 (RN ) and g L1 (RN ) ≤
N . i =1
gi LN −1 (RN −1 ) .
PROOF. The inequality is obvious when N = 2. So let us argue by induction, assume that the result has been proved until N , and prove it for N + 1. Let us give N + 1 functions g1 , g2 , . . . , gN +1 belonging to LN (RN ) and consider the function g defined for any x = (x1 , x2 , . . . , xN +1 ) ∈ RN +1 by g (x) = g1 (˜ x1 ) g2 (˜ x2 ) . . . gN +1 (˜ xN +1 ) = [g1 (˜ x1 ) . . . gN (˜ xN )]gN +1 (˜ xN +1 ). Let us fix xN +1 and apply the Hölder inequality
RN
|g (x)|d x1 d x2 . . . d xN ≤ gN +1 LN (RN )
RN
1/N
|g1 (˜ x1 ) . . . gN (˜ xN )|N d x1 . . . d xN (5.42)
with N = N /(N − 1). Then, note that the functions |g1 (˜ x1 )|N , . . . , |gN (˜ xN )|N belong to N −1 N −1 (R ). By the induction hypothesis, L
RN
|g1 (˜ x1 ) . . . gN (˜ xN )|N d x1 . . . d xN ≤
N . i =1
gi (., xN +1 )N . LN (RN −1 )
(5.43)
gi (·, xN +1 )LN (RN −1 ) .
(5.44)
Combining (5.42) and (5.43) we obtain RN
|g (x)|d x1 . . . d xN ≤ gN +1 LN (RN )
N . i =1
Let us now make xN +1 vary and integrate (5.44) over R. We have RN +1
|g (x)|d x1 . . . d xN +1 ≤ gN +1 LN (RN )
. N R i =1
gi (·, xN +1 )LN (RN −1 ) d xN +1 .
(5.45)
Let us notice that for all i = 1, . . . , N the function gi (·, xN +1 )LN (RN −1 ) belongs to LN (R). Applying Hölder’s inequality to (5.45) with N1 + · · · + N1 = 1 (N times) we obtain RN +1
|g (x)|d x1 . . . d xN +1 ≤ gN +1 LN (RN )
N . i =1
gi LN (RN ) ,
i
i i
i
i
i
i
188
book 2014/8/6 page 188 i
Chapter 5. Sobolev spaces
i.e., g L1 (RN ) ≤
N +1 . i =1
gi LN (RN ) ,
which completes the induction and the proof. PROOF OF THEOREM 5.7.3. (a) Let us use a density argument and prove that it is equivalent to know that the inequality vL p ∗ (RN ) ≤ C ∇vL p (RN )
(5.46)
holds for any v ∈ (RN ) or for any v ∈ W 1, p (RN ). Given v ∈ W 1, p (RN ), by using Theorem 5.1.3, we can find a sequence (vn )n∈N of elements vn ∈ (RN ) such that vn → v in W 1, p (RN ) and vn (x) → v(x) for almost every x ∈ RN . Let us write
vn L p ∗ (RN ) ≤ C ∇vn L p (RN ) .
Hence, vL p ∗ (RN ) ≤ lim inf vn L p ∗ (RN ) n
≤ C lim ∇vn L p (RN ) = C ∇vL p (RN ) , n
where the first above inequality is obtained by using Fatou’s lemma. (b) Let us now verify that it is enough to prove (5.46) when p = 1. To do so, let us assume that it is true for p = 1, i.e., ∀v ∈ W 1,1 (RN )
vL1∗ (RN ) ≤ C1 (N )∇vL1 (RN ) ,
(5.47)
and prove that it is true for all 1 ≤ p < N . ∗ ∗ Let us observe that if v ∈ (RN ), then |v| p /1 belongs to W 1,1 (RN ). Indeed, since ∗ ∗ p ∗ /1∗ is continuously differentiable with comp > 1 we have p > 1 and the function |v| ∗ ∗ pact support. Hence |v| p /1 ∈ W 1,1 (RN ) and + p∗ * ∗ ∗ ∗ ∗ ∇ |v| p /1 = ∗ sign v|v|( p /1 )−1 ∇v. 1 ∗
(5.48)
∗
Let us replace v by |v| p /1 in (5.47). Applying (5.48) we obtain 1/1∗ p∗ ∗ ∗ p∗ |v| d x ≤ C1 (N ) ∗ |v|( p /1 )−1 |∇v|d x. N N 1 R R Let us apply Hölder’s inequality with
p∗
RN
|v| d x
1/1∗ ≤ C1 (N )
p∗ 1∗
1 p
+
1 p
= 1 to the right-hand side of (5.49): p∗
( 1∗ −1) p
RN
|v|
(5.49)
vL p ∗ (RN ) ≤ C1 (N )
1 1∗
dx
p∗ 1∗
−
1 p∗
1/ p p
RN
An elementary computation yields the equality 1) p = p ∗ , and allows us to simplify (5.50),
1/ p
=
1 , p
|∇v| d x
.
(5.50) p∗
which is equivalent to ( 1∗ −
∇vL p (RN ) ,
(5.51)
i
i i
i
i
i
i
5.7. Sobolev embedding theorems
book 2014/8/6 page 189 i
189
which is precisely (5.46) for an arbitrary 1 ≤ p < N . Note that we have obtained that one can take p∗ (5.52) C ( p, N ) = C1 (N ) ∗ . 1 (c) Thus, we just need to prove that for any v ∈ (RN ), vL1∗ (RN ) ≤ C1 (N )∇vL1 (RN ) .
(5.53)
For any x = (x1 , x2 , . . . , xN ) ∈ RN we have xi ∂ v |v(x)| = (x1 , x2 , . . . , xi −1 , t , xi +1 , . . . , xN ) d t −∞ ∂ xi xi ∂v ≤ (x1 , x2 , . . . , xi −1 , t , xi +1 , . . . , xN ) d t . ∂ x −∞ i Symmetrically,
∂v (x1 , x2 , . . . , xi −1 , t , xi +1 , . . . , xN ) d t . ∂ xi
+∞
|v(x)| ≤
xi
Adding these two inequalities we obtain 1 +∞ ∂ v (x1 , . . . , xi −1 , t , xi +1 , . . . , xN ) d t . |v(x)| ≤ 2 −∞ ∂ xi
(5.54)
Let us adopt the following notation. For any x ∈ RN and i = 1, 2, . . . , N x˜i := (x1 , x2 , . . . , xi −1 , xi +1 , . . . , xN ), fi (˜ xi ) :=
∂v (x1 , x2 , . . . , xi −1 , t , xi +1 , . . . , xN ) d t . ∂ xi
+∞ −∞
We can rewrite (5.54) as |v(x)| ≤ symmetric expression,
1 f (˜ x) 2 i i
for all i = 1, . . . , N , which implies, as a more
|v(x)|N ≤
N 1 .
2N
i =1
fi (˜ xi ).
(5.55)
Indeed, in (5.53) we need to majorize vL1∗ . Noticing that 1∗ = 1
∗
|v(x)|1 ≤
2
N /N −1
N . i =1
1
fi (˜ xi ) N −1 .
N , let N −1
us write (5.55) as (5.56)
1
We note that for each i = 1, . . . , N the function gi (˜ xi ) := fi (˜ xi ) N −1 belongs to LN −1 (RN −1 ) and ) ) 1 ) ∂ v ) N −1 ) ) . (5.57) gi LN −1 (RN −1 ) = ) ∂x ) 1 N i
L (R )
i
i i
i
i
i
i
190
book 2014/8/6 page 190 i
Chapter 5. Sobolev spaces
Applying Lemma 5.7.1 to (5.56) and using (5.57) we obtain ) ) N ) 1 ) ) ). 1∗ |v(x)| d x ≤ N /N −1 ) gi (˜ x i )) ) 1 N ) N 2 R i =1 L (R ) ≤
1
N .
2N /N −1
i =1
gi LN −1 (RN −1 )
) ) 1 N ) ) N −1 . )∂v) ≤ N /N −1 . ) ) ) ) 2 i =1 ∂ xi L1 (RN ) 1
Since 1∗ =
N , N −1
it follows that vL
1∗
) )1/N N ) ) 1. )∂v ) . ) ) N ≤ (R ) ) ∂ xi ) 1 N 2 i =1
JN
≤ N1 N a , we finally obtain i =1 i ) ) N ) ) 1 1 )∂v ) vL1∗ (RN ) ≤ ∇vL1 (RN ) , = ) ) 2N i =1 ) ∂ xi )L1 (RN ) 2N
From the convexity inequality (
a) i =1 i
(5.58)
L (R )
1/N
(5.59)
which ends the proof. Remark 5.7.2. Combining (5.47), (5.52), and (5.59), we obtain that one can take C ( p, N ) =
1 p∗ 2N 1
∗
=
p(N − 1) 2N (N − p)
.
But this is not the best estimate. The best constant is strictly less than this one; it is known and quite involved (cf. Aubin [63], Talenti [345], and Lieb [275]). As a direct consequence of the Sobolev–Gagliardo–Nirenberg theorem, one obtains the following Poincaré–Sobolev inequality. Proposition 5.7.1. There exists a constant C ( p, N ) such that for any open subset Ω in RN and for any 1 ≤ p < N , the following inequality holds: 1, p
∀v ∈ W0 (Ω)
vL p ∗ (Ω) ≤ C ( p, N )∇vL p (Ω) .
1, p
PROOF. Given v ∈ W0 (Ω), let v˜ be the extension of v by zero outside of Ω. By Proposition 5.1.1, v˜ belongs to W 1, p (RN ). Applying Theorem 5.7.3 to v˜ and noticing ˜ L p ∗ (RN ) = vL p ∗ (Ω) and ∇v ˜ L p (RN ) = ∇vL p (Ω) , we obtain the desired concluthat v sion. Remark 5.7.3. (a) A striking feature of the above Poincaré–Sobolev inequality is that it is valid for an arbitrary open set Ω (not necessarily bounded) and the constant C ( p, N ) is p(N −1) independent of Ω. Indeed, one can take as a value of C ( p, N ) the number 2N (N − p) . These properties rely on the fact that one estimates vL p ∗ from above by ∇vL p . This makes a great contrast with the classical Poincaré inequality, where one estimates
i
i i
i
i
i
i
5.7. Sobolev embedding theorems
book 2014/8/6 page 191 i
191
from above vL p by ∇vL p : in the classical Poincaré inequality (Theorem 5.3.1) one has to assume that Ω is bounded (at least in one direction) and the Poincaré constant does depend on Ω. (b) When Ω is bounded, one can recover the classical Poincaré inequality from Proposition 5.7.1 just by using Hölder’s inequality. Indeed, assuming that 1 ≤ p < N , p
Ω
1− p/ p ∗
|v(x)| d x ≤ |Ω|
Ω
p∗
p/ p ∗
|v(x)| d x
.
Hence ∗
vL p (Ω) ≤ |Ω|1/ p−1/ p vL p ∗ (Ω) ∗
≤ |Ω|1/ p−1/ p C ( p, N )∇vL p (Ω) . Using the equality
1 p
−
1 p∗
=
1 , N
we obtain
vL p (Ω) ≤ |Ω|1/N C ( p, N )∇vL p (Ω) . ¯ (Ω) depends on Ω. Observing that This is another way to see how the Poincaré constant C p 1/N = R|Ω|, we find again the conclusion of Proposition 5.3.1, which is C¯ (RΩ) = |RΩ| p
RC¯ p (Ω). Let us summarize the above results in the following statement.
Corollary 5.7.1. Let Ω be a bounded open subset of RN and take 1 ≤ p < N . Then, for any 1, p v ∈ W0 (Ω), the following inequality holds: vL p (Ω) ≤ |Ω|1/N C ( p, N )∇vL p (Ω) . For example, when Ω = B(0, r ), we obtain 2 2 v(x) d x ≤ C r |∇v(x)|2 d x Br
Br
∀v ∈ H01 (B r ).
We also have a Poincaré–Wirtinger–Sobolev inequality. Proposition 5.7.2. Let Ω be a bounded, connected, open set in RN whose boundary ∂ Ω is of class C1 . Then there exists a constant C ( p, N , Ω) such that for any 1 ≤ p < +∞, the following inequality holds: ) ) ) ) 1 ) ) 1, p ∀v ∈ W (Ω) v(x) d x ) ≤ C ( p, N , Ω)∇vL p (Ω) . )v − ) ) p∗ |Ω| Ω L (Ω) 1 PROOF. Let us denote M (v) = |Ω| v(x) d x and apply the Sobolev embedding Theorem Ω 5.7.2(i) to the function v − M (v). We obtain v − M (v)L p ∗ (Ω) ≤ C1 ( p, N , Ω)v − M (v)W 1, p (Ω) ≤ C1 ( p, N , Ω) v − M (v)L p (Ω) + ∇vL p (Ω) .
i
i i
i
i
i
i
192
book 2014/8/6 page 192 i
Chapter 5. Sobolev spaces
Let us now apply the classical Poincaré–Wirtinger inequality (Corollary 5.4.1) to v: v − M (v)L p (Ω) ≤ C2 ( p, N , Ω)∇vL p (Ω) . Combining the two last inequalities, we obtain v − M (v)L p ∗ (Ω) ≤ C ( p, N , Ω)∇vL p (Ω) , which ends the proof.
5.7.2 Case p > N We now consider the space W 1, p (RN ) with p > N . The following theorem is due to Morrey [302]. Theorem 5.7.4 (Morrey). Assume that p > N . Then there exists a continuous embedding W 1, p (RN ) → C0,α (RN ) with α = 1 − Np . More precisely, there exists a constant C ( p, N ) such that for all v ∈ W 1, p (RN ), |v(y) − v(x)| ≤ C ( p, N )∇vL p (RN ) |y − x|α
for a.e. x, y ∈ RN .
PROOF. Let us first take v ∈ (RN ). The proof is then completed by a density argument. Let Q be a cube containing the origin and whose edges are parallel to the coordinate axes in RN and have a common length equal to r > 0. For each x ∈ Q we have
1
v(x) − v(0) =
d dt
0
v(t x)d t .
From this, we infer ∂v |v(x) − v(0)| ≤ |xi | (t x) d t ∂ x 0 i =1 i 1 N ∂v (t x) d t . ≤r i =1 0 ∂ xi 1 N
Let v¯ := obtain
1 |Q|
Q
(5.60)
v(x)d x denote the mean value of v on Q. Integrating (5.60) on Q we N 1 ∂v |v¯ − v(0)| ≤ dx (t x) d t . |Q| Q ∂ x i i =1 0 r
Let us exchange the order of integration (Fubini’s theorem): 1 N ∂v 1 dt (t x) d x. |v¯ − v(0)| ≤ N −1 ∂ xi r Q 0 i =1
Making the change of variable y = t x, we obtain |v¯ − v(0)| ≤
1 r N −1
0
1
N dy ∂v dt (y) N . t ∂ x t Q i =1 i
(5.61)
i
i i
i
i
i
i
5.7. Sobolev embedding theorems
book 2014/8/6 page 193 i
193
We now use Hölder’s inequality and majorize this last integral as follows: p 1/ p ∂ v ∂v (y) d y ≤ |t Q|1/ p (y) d y . t Q ∂ xi t Q ∂ xi
(5.62)
Since 0 ∈ Q, we have t Q ⊂ Q for all 0 ≤ t ≤ 1. From the above inequalities (5.61) and (5.62) it follows that p 1/ p
N r N/p 1 t N/p ∂v |v¯ − v(0)| ≤ N −1 d t (y) dy N ∂ x r t Q 0 i i =1 ≤
r 1−N / p 1 − N/p
∇vL p (Q) .
By translation, this inequality remains true for any cube Q whose edges are parallel to the coordinate axes and have common length equal to r . Hence, for any x ∈ Q, |v¯ − v(x)| ≤
r 1−N / p 1 − N/p
∇vL p (Q) .
We use the triangle inequality ¯ + |v¯ − v(x)| |v(y) − v(x)| ≤ |v(y) − v| to obtain |v(y) − v(x)| ≤
2r 1−N / p 1 − N/p
∇vL p (Q)
∀x, y ∈ Q.
Then we observe that for any two points x, y ∈ RN , there exists an open cube Q which is constructed as above and with r = 2|y − x|. It follows that for any x, y ∈ RN , |v(y) − v(x)| ≤ C ( p, N )∇vL p (Q) |y − x|1−N / p , where C ( p, N ) depends only on p and N . Here we have obtained C ( p, N ) = The proof is then completed by a standard density argument.
22−N / p . 1−N / p
5.7.3 Case p = N Let us first show that for any 1 ≤ q < +∞, W 1,N (RN ) is continuously embedded in q L l oc (RN ). This result follows easily from the Sobolev–Gagliardo–Nirenberg theorem in the case 1 ≤ p < N and the fact that p ∗ = q
pN N−p
tends to +∞ as p goes to N . |v(x)|q d x < +∞. The B(0,R)
We recall that v ∈ L l oc (RN ) means that for any R > 0, q
topology on L l oc (RN ) is generated by the family of seminorms { · k , k = 1, 2, . . .}
1/q vk =
B(0,Rk )
|v(x)|q d x
,
where Rk is an arbitrary sequence tending to +∞ with k. We obtain in this way a Fréchet topology (metrizable and complete), which does not depend on the choice of the sequence q (Rk )k∈N , Rk → +∞. It is equivalent to say that vn → v in L l oc (RN ) and ∀R < +∞ |vn (x) − v(x)|q d x → 0. B(0,R)
We can now state the following result.
i
i i
i
i
i
i
194
book 2014/8/6 page 194 i
Chapter 5. Sobolev spaces
Theorem 5.7.5 (the limiting case p = N ). We have q
W 1, p (RN ) → L l oc (RN )
∀ 1 ≤ q < +∞
with a continuous injection. PROOF. Let v ∈ W 1,N (RN ). Let us use a truncation (on the domain) argument. Lemma 5.1.2 shows that for any M ∈ (RN ) the function M v belongs to W 1,N (RN ) and has a compact support. As a consequence M v ∈ W 1,N − (RN )
∀ > 0 arbitrarily small.
(To obtain this result, we use that for any R > 0, for any > 0, LN (B(0, R)) ⊂ LN − (B(0, R)). Note that this is false on the whole of RN !) Let us apply Theorem 5.7.3 with p = N − < N . We obtain M v ∈ Lq (RN )
with
1 q
=
1 N −
−
1 N
=
N (N − )
.
N (N −)
Hence, for any M ∈ (RN ), for any > 0, M v ∈ L (RN ). N (N −) Noticing that → +∞ as → 0, and taking M ≡ 1 on B(0, R), we obtain that for any 1 ≤ q < +∞, for any R > 0, v ∈ Lq (B(0, R)). One can easily verify that the above operations are continuous and so is the embedding q W 1,N (RN ) → L l oc (RN ) for all 1 ≤ q < +∞. q
Remark 5.7.4. 1. Note that W 1,N (RN ) → L l oc (RN ) for all q finite. In general v ∈ W 1,N (RN ) does not imply that v is bounded on the bounded sets, as the example v(x) = | ln(x)|k described in Section 5.1 shows. 2. One can show (see, for example, [137, Corollary IX.11] that W 1,N (RN ) → Lq (RN ) for all 1 ≤ q < +∞. This result clearly implies the conclusion of Theorem 5.7.5. 3. Indeed, functions v ∈ W 1,N (RN ) are not in L∞ (RN ), but one can say something l oc q better than v ∈ L l oc (RN ) for each 1 ≤ q < +∞. To do so, we need to use a sharper scaling of spaces than the classical Lebesgue {L p ; 1 ≤ p ≤ +∞} scaling. We use Orlicz spaces scaling as shown in the following proposition, which is again an easy consequence of Theorem 5.7.3. Proposition 5.7.3. There exist two constants K > 0 and L > 0 such that for any R > 0, for any function v ∈ W 1,N (RN ) which satisfies spt v ⊂ B(0, R) and ∇vLN (RN ) ≤ 1, we have e K v(x) d x ≤ L|B(0, R)|. B(0,R)
PROOF. Let us return to Theorem 5.7.3. We have seen that for any 1 ≤ p < N , for any v ∈ W 1, p (RN ), (N − 1) p vL p ∗ (RN ) ≤ ∇vL p (RN ) . 2N (N − p) We have
(N − 1) p 2N (N − p)
=
(N − 1) 2N
2
·
Np N−p
.
i
i i
i
i
i
i
5.7. Sobolev embedding theorems
Noticing that p ∗ = Hence
book 2014/8/6 page 195 i
195
Np N−p
and using the inequality
N −1 2N 2
≤ 1, we obtain that
(N −1) p 2N (N − p)
≤ p∗.
vL p ∗ (RN ) ≤ p ∗ ∇vL p (RN ) .
Let us now suppose that spt v ⊂ B(0, R). Using Hölder’s inequality with we obtain
1/ p p ∇vL p (RN ) = |∇v| d x B(0,R) 1/ p ∗
+ N1/ p = 1,
1/N
N
≤ |BR |
1 p∗/ p
|∇v| d x
.
B(0,R)
Combining the two last inequalities, we obtain ∗
vL p ∗ (B ) ≤ p ∗ |BR |1/ p ∇vLN (BR ) . R
Then note that when 1 ≤ p < N , the corresponding p ∗ = ∗
Np N−p
varies from
N N −1
to +∞.
Replacing p by q, we obtain the following intermediate result: for any q such that NN−1 ≤ q < +∞, (5.63) vLq (BR ) ≤ q|BR |1/q ∇vLN (BR ) . We now use the asymptotic development e K|v| = 1 + K|v| + to obtain
K2
|v|2 + · · · +
2!
e
K|v|
BR
d x = |BR | + K
Let us assume that N ≥ 2. This implies ∇vLN (BR ) ≤ 1, we have ∀q ≥ 2
|v| d x + BR N N −1
Kq q!
|v|q + · · ·
+∞ K q q=2
q!
|v|q d x.
(5.64)
BR
≤ 2, and (5.63) is valid for any q ≥ 2. Since
1
|v|q d x ≤ q q .
|BR |
(5.65)
BR
When q = 1, we use the Hölder inequality 1/2
BR
1/2
|v(x)|d x ≤ |BR |
2
|v(x)| d x BR
and the inequality (5.65) with q = 2 to obtain
1/2 1 1 2 |v(x)|d x ≤ |v(x)| d x ≤ 2. |BR | BR |BR | BR Combining (5.64), (5.65), and (5.66), we obtain +∞ 1 Kq qq . e K|v(x)| d x ≤ 1 + 2K + |BR | BR q=2 q!
(5.66)
(5.67)
i
i i
i
i
i
i
196
book 2014/8/6 page 196 i
Chapter 5. Sobolev spaces
Set uq := K q q q /q!. We have lim
uq+1
q→+∞
uq
= K lim
q→+∞
1+
1
q = K e.
q
q q Hence, the series +∞ Kq qq /q! is convergent when K < 1/e. Choosing K < 1/e, we take L = 1 + 2K + q=2 K q /q! and so we obtain for any R > 0 1 e K v(x) d x ≤ L, |BR | BR which ends the proof. One can improve the conclusion of Proposition 5.7.3. Instead of (5.63), one can prove the sharper estimation 1 vLq ≤ CN q 1− N |BR |1/q ∇vLN . At this point, note the importance of getting the best constant C ( p, N ) in the Sobolev embedding theorem. Then, by using the same argument as above, one obtains the following inequality, which is due to Trudinger [352] and Moser [306]. Proposition 5.7.4. Under the assumptions of Proposition 5.7.3, there exist some constants σ > 0 and K(N ) > 0 such that for any v in W 1,N (RN ) with spt v ⊂ BR and ∇vLN (BR ) ≤ 1, we have e σ|v(x)|
BR
Note that the power
N N −1
N /N −1
d x ≤ K(N )|BR |.
is the best possible power.
Let us now return to the general situation and complete the study of the Sobolev embedding theorem by the following results. Repeated applications of Theorem 5.7.2 enable us to obtain the statement below. Theorem 5.7.6. Let Ω be an open bounded subset of RN with a C1 boundary ∂ Ω. Let 1 ≤ p < +∞ and let m ≥ 0 be an integer. (i) If m p < N , then W m, p (Ω) → Lq (Ω) with
1 q
=
1 p
−
m . N
(ii) If m p = N , then W m, p (Ω) → Lq (Ω) for all 1 ≤ q < +∞. (iii) If m p > N , let us set k = [m− Np ] and α = m− Np −k (0 ≤ α < 1). Then W m, p (Ω) → Ck,α (Ω), where v ∈ C k,α (Ω) means that v ∈ Ck (Ω) and D l v ∈ C0,α (Ω) for any l with |l | = k. Remark 5.7.5. As an illustration of the above theorem, we notice that H 2 (Ω), where ¯ as soon as N < 4, i.e., for N = 1, 2, 3. Ω ⊂ RN , is continuously embedded in C(Ω) Let us end this section by the following compactness embedding theorem. Using the Sobolev embedding theorem, one can improve the Rellich–Kondrakov theorem, Theorem 5.4.2. Theorem 5.7.7. Let Ω be an open bounded subset of RN which has a C1 boundary ∂ Ω. Then we have the following compact injections: (i) If p < N , W 1, p (Ω) → Lq (Ω) for any q < p ∗ with
1 p∗
=
1 p
− N1 .
i
i i
i
i
i
i
5.8. Capacity theory and elements of potential theory
book 2014/8/6 page 197 i
197
(ii) If p = N , W 1, p (Ω) → Lq (Ω) for any 1 ≤ q < +∞. ¯ (iii) If p > N , W 1, p (Ω) → C(Ω). PROOF. The case (iii) follows from Theorem 5.7.2(iii) and Ascoli’s theorem: any bounded subset of W 1, p (Ω) is bounded in C0,α (Ω) with α = 1 − Np and hence equicontinuous.
Let us consider the case (i). Take a bounded sequence (vn )n∈N in W 1, p (Ω). By the ∗ Sobolev embedding Theorem 5.7.2(i), the sequence (vn )n∈N is bounded in L p (Ω). By the classical Rellich–Kondrakov theorem, Theorem 5.4.2, we can extract a convergent subsequence vnk → v in L p (Ω). Let us prove that, indeed, the convergence of vnk to v holds in every Lq (Ω), with 1 ≤ q < p ∗ . We use the following generalized version of Hölder’s inequality: If f1 , f2 , . . . , fk satisfy fi ∈ L pi (Ω) for 1 ≤ i ≤ k and 1 p then the product f =
Jk
f i =1 i
=
1 p1
+
1 p2
+ ··· +
1 pk
≤ 1,
belongs to L p (Ω) and f L p ≤ f1 L p1 . . . fk L pk .
In particular, if f ∈ L p (Ω) ∩ Lq (Ω) with 1 ≤ p ≤ q ≤ +∞, then f ∈ L r (Ω) for all p ≤ r ≤ q and the following interpolation formula holds: f L r ≤ f αL p f 1−α Lq
with
1 r
=
α p
+
1−α
(0 ≤ α ≤ 1),
q
(5.68)
where we just applied Hölder’s inequality to | f |α ∈ L p/α and | f |1−α ∈ Lq/1−α noticing that 1 1 1 = + . r p/α q/1 − α So, let us take p ≤ q < p ∗ and apply the above interpolation formula (5.68) to |vnk − v| and α > 0 (because q < p ∗ ). We have with q1 = αp + 1−α p∗ vnk − vLq (Ω) ≤ vnk − vαL p vnk − v1−α p∗ ≤ C vnk − vαL p
L
∗
since (vnk )k∈N is bounded in L p (Ω). Note that by Fatou’s lemma, for example, one has ∗ also that v belongs to L p (Ω). Since α > 0 and vnk → v in L p (Ω) we obtain that vnk → v in Lq (Ω) for any p ≤ q < p ∗ and hence for any 1 ≤ q < p ∗ (Ω is bounded). Remark 5.7.6. Another proof of Theorem 5.7.7 relies on the use of Vitali’s theorem: one can notice that for any 1 < q < p ∗ the sequence {|vnk |q : k ∈ N} is equi-integrable (by a direct application of the classical Hölder’s inequality) and converges in measure (after extraction of a further subsequence).
5.8 Capacity theory and elements of potential theory 1, p
In this section, we consider the Sobolev spaces W 1, p (Ω) and W0 (Ω), with 1 ≤ p < +∞, and pay particular attention to the case p= 2. We introduce the notion of capacity with respect to the energy functional Φ(v) = Ω |∇v(x)| p d x. As a key tool, we use that the contractions operate on the space W 1, p (Ω), as explained below.
i
i i
i
i
i
i
198
book 2014/8/6 page 198 i
Chapter 5. Sobolev spaces
5.8.1 Contractions operate on W 1,p (Ω) The central idea, which goes back to Deny and Beurling [95], is that the contractions 1, p operate on the spaces W 1, p (Ω) and W0 (Ω). Let us make this more precise. We say that T : R → R is a contraction if T (0) = 0 and |T (r ) − T (s)| ≤ |r − s| for all r, s ∈ R. To say that “the contractions operate on W 1, p (Ω)” means that, for any v ∈ W 1, p (Ω), T ◦ v ∈ W 1, p (Ω) and T ◦ vW 1, p (Ω) ≤ vW 1, p (Ω) . We denote by T ◦v the composition of the two functions (T ◦v)(x) = T (v(x)). When p = 2, this property is the basis of the so-called theory of Dirichlet spaces and Dirichlet forms. The most commonly used contraction functions are T (r ) = r + , T (r ) = |r |, and T (r ) = 1 ∧ r + (fundamental contraction). To state the results for a general open set Ω, we need to use the following density result, which is closely related to Theorem 5.1.3. Theorem 5.8.1 (Friedrichs). Let Ω be an arbitrary open set in RN . Then, for any v ∈ W 1, p (Ω), 1 ≤ p < +∞, there exists a sequence vn ∈ (RN ) such that vn |Ω → v in L p (Ω) and a.e., ∂ vn ∂ v in L p (ω) ∀ ω ⊂⊂ Ω, i = 1, . . . , N . → ∂ xi ω ∂ x i ω ¯ PROOF. Let v¯ be the extension by zero of v outside of Ω, i.e., v(x) = v(x) if x ∈ Ω, 1, p N ¯ v(x) = 0 if x ∈ R \ Ω. Note that, except when v ∈ W0 (Ω), v¯ does not belong to W 1, p (RN )! We proceed analogously to the proof of Theorem 5.1.3. We regularize v¯ by taking ¯ vn = M n (v(x) ρ n ), where M n is a truncation function (on the domain) and ρn is a smoothing kernel. Clearly, vn belongs to (RN ). Then note that for any ω ⊂⊂ Ω one can find a function α ∈ (Ω) such that α = 1 on ω and 0 ≤ α ≤ 1. The point is that for n sufficiently large, v¯ ρn = α v¯ ρn on ω. Since α v¯ belongs to W 1, p (RN ) one can apply the same argument as in the proof of Theorem 5.1.3 to obtain the result. After extraction of a subsequence one can also obtain the convergence almost everywhere. Let us first consider the case of smooth truncations. Proposition 5.8.1. Let T ∈ C1 (R) be a smooth truncation, i.e., T : R → R is a C1 function which satisfies T (0) = 0 and |T (r )| ≤ 1 for all r ∈ R. Let Ω be an arbitrary open set in RN , and let 1 ≤ p < +∞. Then the following properties hold: (a) for all v ∈ W 1, p (Ω), T ◦ v ∈ W 1, p (Ω), T ◦ vW 1, p (Ω) ≤ vW 1, p (Ω) ; 1, p
∂ ∂ xi
(T ◦ v) = T (v) ∂∂ xv for all 1 ≤ i ≤ N , and i
1, p
(b) when v ∈ W0 (Ω) we have T ◦ v ∈ W0 (Ω). PROOF. We have that for all r, s ∈ R, |T (r ) − T (s)| ≤ |r − s|. This combined with T (0) = 0 yields |T (r )| ≤ |r | for all r ∈ R. Hence |T ◦ v(x)| ≤ |v(x)|
i
i i
i
i
i
i
5.8. Capacity theory and elements of potential theory
book 2014/8/6 page 199 i
199
and T ◦ v ∈ L p (Ω). In a similar way, ∂ v ∂ v T (v) ≤ ∂ xi ∂ xi and T (v) ∂∂ xv belongs to L p (Ω). Thus, we just need to prove that ∂∂x (T ◦ v) = T (v) ∂∂ xv i i i in the distribution sense, which means that ∂ϕ ∂v ∀ϕ ∈ (Ω) (T ◦ v) d x = − T (v) ϕ d x. (5.69) ∂ xi ∂ xi Ω Ω To prove (5.69) we use Theorem 5.8.1, which provides us a sequence (vn )n∈N in (RN ) ∂v such that vn → v in L p (Ω) and a.e., and ∂ xn → ∂∂ xv in L p (ω) for all ω ⊂⊂ Ω. i
i
Since T ◦ vn belongs to C1 (Ω) and ϕ ∈ (Ω), we have by using classical differential calculus that for all n ∈ N ∂ϕ ∂ vn (T ◦ vn ) d x = − T (vn ) ϕ d x. (5.70) ∂ xi ∂ xi Ω Ω We have
|T ◦ vn − T ◦ v| ≤ |vn − v|,
which implies that T ◦ vn → T ◦ v in L p (Ω). On the other hand, let ϕ ≡ 0 outside of ω ⊂⊂ Ω. We have ∂ vn ∂ v ∂ v ∂ v ∂ v n − T (v) − ≤ T (vn ) T (vn ) + (T (vn ) − T (v)) ∂ xi ∂ xi ∂ xi ∂ xi ∂ xi ∂ v ∂v ∂ v n ≤ − + |T (vn ) − T (v)| . ∂ xi ∂ xi ∂ xi Since vn → v a.e. and T is continuous, we have T (vn ) → T (v) and
a.e.
∂ v p ∂ v p p |T (vn ) − T (v)| ≤2 . ∂ xi ∂ xi
p
We obtain, thanks to the Lebesgue dominated convergence theorem, T (vn )
∂ vn ∂ xi
→ T (v)
∂v ∂ xi
in
L p (ω).
We can now pass to the limit in (5.70) to obtain (5.69). 1, p Now take v ∈ W0 (Ω). The proof is even simpler, since one can take as an approximating sequence vn ∈ (Ω) with vn → v in W 1, p (Ω). We have T ◦ vn → T ◦ v in W 1, p (Ω) 1, p and T ◦ vn |∂ Ω = 0. By continuity of the trace operator, we obtain T ◦ v ∈ W0 (Ω). Let us now consider nonsmooth contractions. We start with the important case T (r ) = r + = r ∨0. The idea is to regularize the function T (r ) to reduce ourselves to the previous situation. This elementary construction is described below.
i
i i
i
i
i
i
200
book 2014/8/6 page 200 i
Chapter 5. Sobolev spaces
Lemma 5.8.1. Let Tn : R → R be defined by ⎧ ⎨r 2 Tn (r ) = r + n r2 ⎩ 1 − 2n
if r ≥ 0, if − n1 ≤ r ≤ 0, if r ≤ − n1 .
Then Tn ∈ C1 (R), Tn (0) = 0, |Tn (r )| ≤ 1 for all r ∈ R and, for all r ∈ R, Tn (r ) → r + as n → +∞. PROOF. The function r → r + is not a C1 function. Its distribution derivative is not continuous. It is indeed the Heaviside function θ(r ) = 1 if r ≥ 0, and θ(r ) = 0 elsewhere. Let us approximate θ by a sequence of continuous functions, taking for n ≥ 1 ⎧ if r ≥ 0, ⎨1 θn (r ) = n r + 1 if − n1 ≤ r ≤ 0, ⎩ 0 if r ≤ − n1 .
Set Tn (r ) = Since 0 ≤ θn ≤ 1 we obtain
r
0
θn (s)d s.
|Tn (r )| = |θn (r )| ≤ 1,
and all the properties of Tn are easily verified. We can now state the following result. Theorem 5.8.2. Let Ω be an arbitrary open set in RN and let 1 ≤ p < +∞. Then, for any v ∈ W 1, p (Ω), v + ∈ W 1, p (Ω) and ∂ ∂ xi
v + = 1{v≥0}
∂v ∂ xi
,
i = 1, . . . , N ,
v + W 1, p (Ω) ≤ vW 1, p (Ω) . 1, p
1, p
Moreover, when v ∈ W0 (Ω), one still has v + ∈ W0 (Ω). PROOF. Let Tn be the approximation of T (r ) = r + which is defined in Lemma 5.8.1. Since Tn belongs to C1 (R), Tn (0) = 0 and |Tn (r )| ≤ 1 for all r ∈ R, we can apply Proposition 5.8.1 to obtain ∂ ∂v Tn ◦ v = Tn (v) . (5.71) ∂ xi ∂ xi Let us pass to the limit on (5.71) as n → +∞. By Lemma 5.8.1, we have Tn (r ) → r + for all r ∈ R. Hence Tn ◦ v → v + a.e. and
|Tn ◦ v| ≤ |v|.
By the Lebesgue dominated convergence theorem, we have Tn ◦ v → v +
in L p (Ω).
(5.72)
i
i i
i
i
i
i
5.8. Capacity theory and elements of potential theory
book 2014/8/6 page 201 i
201
We now examine the right-hand side of (5.71). We have Tn (r ) = 1
∀ r ≥ 0,
Tn (r ) → 0 as n → +∞ Hence
Tn (v) → 1{v≥0}
and Tn (v) Moreover,
∂v ∂ xi
→ 1{v≥0}
∀ r < 0. a.e.
∂v
a.e.
∂ xi
∂ v ∂ v Tn (v) ≤ . ∂ xi ∂ xi
We apply again the Lebesgue dominated convergence theorem and obtain Tn (v)
∂v ∂ xi
→ 1{v≥0}
∂v ∂ xi
in L p (Ω).
(5.73)
Combining (5.71), (5.72), and (5.73) we conclude that ∂ v+ ∂ xi
= 1{v≥0}
∂v ∂ xi
.
(5.74)
In other words, the equality above follows from the continuity of the derivation with respect to the convergence in distribution and from the fact that the L p (Ω)-convergence implies the convergence in distribution. From (5.74) we obtain ∂ v+ ∂ v ≤ . ∂ xi ∂ xi This combined with |v + | ≤ |v| yields v + W 1, p (Ω) ≤ vW 1, p (Ω) . 1, p
Finally, note that (5.72) and (5.73) imply that Tn ◦ v → v + in W 1, p (Ω). If v ∈ W0 (Ω), 1, p 1, p we know by Proposition 5.8.1 that Tn ◦ v ∈ W0 (Ω). Hence v + ∈ W0 (Ω). Let us derive from Theorem 5.8.2 some useful results. Proposition 5.8.2. Let v ∈ W 1, p (Ω), 1 ≤ p < +∞. Then, for each i = 1, . . . , N , 1{v=0}
∂v ∂ xi
= 0.
In other words, ∂v ∂ xi
=0
a.e. in E = {x ∈ Ω : v(x) = 0}, i = 1, . . . , N .
i
i i
i
i
i
i
202
book 2014/8/6 page 202 i
Chapter 5. Sobolev spaces
PROOF. Let us consider the truncation T (r ) = r − = (−r ) ∨ 0. An argument similar to the one of the proof of Theorem 5.8.2 shows that v − ∈ W 1, p (Ω) and ∂ v− ∂ xi
= −1{v≤0}
∂v ∂ xi
.
From v = v + − v − we obtain ∂v
=
∂ xi
∂ v+
−
∂ v−
∂ xi ∂ xi * +∂v = 1{v≥0} + 1{v≤0} ∂ xi * +∂v ∂v = 1{v>0} + 1{v≤0} + 1{v=0} ∂ xi ∂ xi ∂v ∂v = + 1{v=0} . ∂ xi ∂ xi
Hence, 1{v=0}
∂v ∂ xi
= 0,
which completes the proof. Corollary 5.8.1. Let v ∈ W 1, p (Ω), 1 ≤ p < +∞. Then |v| ∈ W 1, p (Ω) and ∂ ∂ xi
|v| = 1{v≥0}
∂v ∂ xi
− 1{v 0, where K = {x ∈ Ω : dist (x, K) < }. To see this, just notice that x → dist(x, Ωc ) is a continuous function on a compact set K which takes only positive values. Hence, its minimal value satisfies > 0. 1, p As a consequence, if v ∈ W0 (Ω) satisfies “v ≥ 1 a.e. on a neighborhood of K,” there exists some open set G ⊃ K such that v ≥ 1 a.e. on G and some > 0 such that G ⊃ K . Hence, v ≥ 1 a.e. on K . Then use standard regularization techniques as developed in Theorem 5.1.3 to approximate v by a sequence vn ∈ (Ω) which satisfies “vn ≥ 1 and
a.e. on a neighborhood of K”
p
Ω
|∇vn | d x −→
Ω
|∇v| p d x.
i
i i
i
i
i
i
5.8. Capacity theory and elements of potential theory
book 2014/8/6 page 207 i
207
To that end one should notice that, given (ρn )n∈N a regularization kernel with ρn ≡ 0 outside of B(0, 1/n), we have v(x − y)ρn (y) d y. (v ρn )(x) = B(0,1/n)
When 1/n < , x ∈ K, we have that for all y ∈ B(0, 1/n), x−y ∈ K and hence v(x−y) ≥ 1. It follows that for 1/n < 1/ and x ∈ K, (v ρn )(x) ≥ ρn (y) d y = 1. B(0,1/n)
Then combine this regularization by the convolution method with a truncation on the domain (to preserve the Dirichlet boundary condition) to obtain vn . As a consequence, |∇v| p d x : v ∈ (Ω), v(x) ≥ 1 Cap p (K) = inf Ω ∀ x in a neighborhood of K ≥ inf |∇v| p d x : v ∈ (Ω), v(x) ≥ 1 ∀ x ∈ K , Ω
the last inequality being a consequence of the fact that one minimizes on a larger set. (b) Since Cap p (·) is a monotone set function, we have Cap p (G) ≥ sup{Cap p (K) : K compact, K ⊂ G}. To prove the opposite inequality, we consider only the nontrivial case when the righthand side is finite. Take a sequence (Kn ) of compact subsets of G such that ∪n Kn = G and for any integer n, by using statement (a) above, let un ∈ (Ω) such that 1 un ≥ 1 on Kn , |∇un | p d x ≤ + Cap p (Kn ). n Ω 1, p
The sequence (un ) is bounded in W0 (Ω) and we may extract a subsequence (that we still 1, p denote by (un )) weakly converging to some u ∈ W0 (Ω). Then u ≥ 1 a.e. on G and Cap p (G) ≤ |∇u| p d x Ω ≤ lim inf |∇un | p d x n→+∞
Ω
≤ lim inf Cap p (Kn ) n→+∞
≤ sup{Cap p (K) : K compact, K ⊂ G}, which concludes the proof. The capacity of a set E ⊂ RN can be defined independently of the domain Ω; to do that it is enough to take Ω = RN , and in this case the L p norm of the competing functions u has to be taken into account: p p cap p (E) = inf (5.75) |∇u| + |u| d x : u ∈ 9 p,E , RN
i
i i
i
i
i
i
208
book 2014/8/6 page 208 i
Chapter 5. Sobolev spaces
where 9 p,E is the set of all functions u ∈ W 1, p (RN ) such that u ≥ 1 a.e. (in the Lebesgue sense) in a neighborhood of E. We stress the fact that for a set, what is important is not the precise value of its capacity but whether its capacity is zero. The term RN |u| p d x in the definition of cap(E) is essential when the relative set Ω is unbounded or is the entire space RN . In particular, if N = 2, without this term, every bounded set would have capacity zero. Indeed, for a disk BR centered at the origin, taking t > R, the function log(|x|/t )
u t (x) = would give
R2
with u(x) = 1 on BR
log(R/t )
|∇u t |2 d x = 2π
t R
1 2
1
log (R/t ) r
d r = 2π log(t /R),
so that cap(BR ) ≤ 2π log(t /R). Letting t → +∞ would give cap(BR ) = 0. Since every bounded set E is contained in a ball BR , this would give cap(E) = 0 for every bounded set E ⊂ R2 . From the definition (5.75) above we obtain immediately cap p (E) ≥ |E| for every E, and thus every set with capacity zero is also Lebesgue negligible. The opposite is not true: for instance a smooth N − 1 dimensional surface in RN has a positive 2-capacity but zero Lebesgue measure. More precisely, the following result holds. Proposition 5.8.5. Let S be a smooth k-dimensional surface in RN and let p > 1. Then (i) cap p (S) = 0 for every 1 < p ≤ N − k; (ii) cap p (S) > 0 for every p > N − k. PROOF. By a smooth change of variables and by using the countable subadditivity of the capacity we may reduce ourselves to the case of S a k-dimensional plane, and in this case assertions (i) and (ii) are equivalent to showing that a point in Rd (with d = N − k) (i ) has zero p-capacity for every 1 < p ≤ d ; (ii ) has a positive p-capacity if p > d . To prove assertion (i ) in the case p = d it is enough to take in polar coordinates u(r ) =
log r
if < r < 1,
log
u(r ) = 1
if r ≤ .
(5.76)
This gives cap p ({0}) ≤ Cd
1
r d −1 (log ) p
1 rd
+ (log r )
p
d r + |B(0, )|,
where Cd is a constant depending only on d , and an easy calculation shows that as → 0 we obtain cap p ({0}) = 0.
i
i i
i
i
i
i
5.8. Capacity theory and elements of potential theory
book 2014/8/6 page 209 i
209
In the case p = d a similar calculation can be done with u(r ) = and we obtain that
1 − r ( p−d )/( p−1)
if < r < 1,
1 − ( p−d )/( p−1) cap p ({0}) = 0
u(r ) = 1
if r ≤ ,
(5.77)
for every 1 < p < d .
To prove (ii ) we take as a lower bound of cap p ({0}) the quantity, in polar coordinates,
lim min Cd
→0
1
r
d −1
p
|u (r )| d r : u() = 1, u(1) = 0 .
The Euler–Lagrange equations of the minimum problems above give as solutions the functions in (5.76) and (5.77) and, by a simple integration, we obtain the conclusion. In the following, unless specified differently, we will use the capacity with p = 2, then omit the index p. If a property P (x) holds for all x ∈ E except for the elements of a set Z ⊂ E with cap(Z) = 0, we say that the property P (x) holds quasi-everywhere (q.e.) on E. The expression almost everywhere (a.e.) refers, as usual, to the Lebesgue measure. We summarize here the main properties of the capacity; the interested reader may find all the details and the related proofs in one of the classical books [198], [200], [222], [366]. • The capacity cap(E) is a monotone set function, that is, cap(E1 ) ≤ cap(E2 )
whenever E1 ⊂ E2 .
• The set function cap(E) is continuous for increasing sequences, that is, cap(En ) ↑ cap(E)
whenever En ↑ E.
• The set function cap(E) is countably subadditive, that is, 3 cap(E) ≤ cap(En ) whenever E = En . n∈N
n∈N
• The set function cap(E) is not additive, that is, for E1 ∩ E2 = the inequality cap(E1 ∪ E2 ) ≤ cap(E1 ) + cap(E2 ) may be, in general, strict.
5.8.3 Quasi-open sets, quasi-continuity A subset A of RN is said to be quasi-open (respectively, quasi-closed) if for every > 0 there exists an open (respectively, closed) subset A of RN , such that cap(A ΔA) < , where Δ denotes the symmetric difference of sets. In the definition of a quasi-open set we can additionally require that A ⊂ A . In a similar way, if Ω is an open domain, a function f : Ω → R is said to be quasicontinuous (respectively, quasi-lower semicontinuous) if for every > 0 there exists a continuous (respectively, lower semicontinuous) function f : Ω → R such that cap({ f = f }) < , where { f = f } = {x ∈ Ω : f (x) = f (x)}. It is well known (see, e.g., Ziemer
i
i i
i
i
i
i
210
book 2014/8/6 page 210 i
Chapter 5. Sobolev spaces
[366]) that every function u of the Sobolev space H 1 (Ω) has a quasi-continuous representative, which is uniquely defined up to a set of capacity zero. It is convenient for our purposes to identify the function u with its quasi-continuous representative, so that a pointwise condition can be imposed on u(x) for quasi-every x ∈ Ω. Notice that with this convention we have for every subset E of Ω cap(E, Ω) = min
Ω
|∇u|2 d x : u ∈ H01 (Ω), u ≥ 1 q.e. on E .
We recall the following theorems from [5]. Theorem 5.8.3. Let u ∈ H 1 (RN ). Then a quasi-continuous representative u˜ of u is given, for q.e. x ∈ RN , by 1 u˜ (x) = lim u(y) d y = u˜ (x). →0 |B | B x, x, Theorem 5.8.4. Every strongly convergent sequence in H 1 (RN ) has a subsequence converging q.e. in RN . It is important to notice that the Sobolev space H01 (A) can be defined for every quasiopen set A as the space of all functions u ∈ H01 (RN ) such that u = 0 q.e. on RN \ A. The Hilbert space structure of H01 (A) is inherited from H01 (RN ). Note that if A ⊂ Ω, H01 (A) is a closed subspace of H01 (Ω) as a consequence of the properties above of quasi-continuous representatives of Sobolev functions. If A is an open set, then the previous definition of H01 (A) is equivalent to the usual one (see Adams and Hedberg [5]). Indeed, we recall the following result. Theorem 5.8.5. Let A ⊂ RN be an open set. A function u ∈ H 1 (RN ) belongs to H01 (A) iff u = 0 q.e. on RN \ A. In the statement above, the assertion u belongs to H01 (A) has to be understood in the sense that u is the strong limit in H 1(RN ) of a sequence of Cc∞ (RN ) functions with support in A. Most of the properties that hold for Sobolev spaces over open sets can be extended to this larger framework; for instance, the following result holds (see [146]). Lemma 5.8.2. Let A1 , A2 be two quasi-open sets that are quasi-disjoint, that is, with cap(A1 ∩ A2 ) = 0. Then H01 (A1 ∪ A2 ) = H01 (A1 ) ∩ H01 (A2 ) in the sense that for every u ∈ H01 (A1 ∪ A2 ) we have u|A1 ∈ H01 (A1 ) and u|A2 ∈ H01 (A2 ). Since the family of quasi-open sets of RN is not a topology (only countable unions of quasi-open sets are quasi-open), when dealing with arbitrary unions of quasi-open sets sometimes it is more interesting to work with the so-called finely open sets, that is, open sets with respect to the fine topology defined below. The fine topology on Ω is the coarsest topology making all super-harmonic functions continuous. The relation between quasi-open sets and the fine topology is studied in [5], [222], [255]. We recall the following theorem from [255].
i
i i
i
i
i
i
5.8. Capacity theory and elements of potential theory
book 2014/8/6 page 211 i
211
Theorem 5.8.6. Suppose A ⊂ RN . Then the following assertions are equivalent: (i) A is quasi-open; (ii) A is the union of a finely open set and a set of zero capacity; (iii) A = {u > 0} for some nonnegative quasi-continuous function u ∈ H 1 (RN ). In addition, if A is a quasi-open subset of RN and u is a function on A, then the following assertions are equivalent: (i) u is quasi-lower semicontinuous; (ii) the sets {u > c} are quasi-open for all c ∈ R; (iii) u is finely lower semicontinuous up to a set of zero capacity. Remark 5.8.2. All the definitions and results presented in this section have natural exten1, p sion to the Sobolev spaces W0 (Ω) with 1 < p < +∞. We refer to [237] for a review of the main definitions and properties of the p-capacity. From the shape optimization point of view, the most interesting case is when 1 < p ≤ N , since for p > N the p-capacity of a point is strictly positive and every W 1, p -function has a continuous representative. For this reason, a property which holds p-quasi-everywhere, with p > N , holds in fact everywhere, and this explains why for shape optimization problems the most interesting case is when p ≤ N .
5.8.4 Capacitary measures The class of quasi-open sets is considerably larger than that of classical domains; nevertheless several shape optimization problems, 4 5 min F (Ω) : Ω ⊂ D, Ω quasi-open , do not admit a solution. In Section 16.3 we show an example in which this nonexistence phenomenon occurs. On the other hand, minimizing sequences (Ωn )n∈N always exist, and it is interesting to study the asymptotic behavior of them as n → ∞. As a general philosophy of relaxation problems (see Section 3.2.4), to do that we have to endow the class 4 5 = Ω ⊂ D, Ω quasi-open (5.78) with a metric convergence γ , to consider the completion with respect to this metric, and to define the relaxed functional F on by setting M N F (Ω) = inf lim inf F (Ωn ) : Ωn →γ Ω n→∞
for every Ω ∈ . In this way, under a coercivity assumption on the functional F (i.e., assuming that the sublevel sets {F ≤ t } are relatively γ -compact), the minimizing sequences (Ωn )n∈N will γ -converge, up to extraction of subsequences, to minimum points of the relaxed problem 5 4 min F (Ω) : Ω ∈ that, by the coercivity assumption above, always admits a solution.
i
i i
i
i
i
i
212
book 2014/8/6 page 212 i
Chapter 5. Sobolev spaces
We refer to [147] for a complete discussion about relaxation theory on general spaces and in particular for the integral functional on function spaces as Sobolev spaces, Lebesgue spaces, BV spaces, and spaces of measures. Here we highlight the case when the space is the class of admissible domains, endowed with a suitable γ -convergence, and the cost functional is a shape functional F (Ω) defined for every Ω ∈ . Definition 5.8.2. Let D be a bounded open set of RN and let be the class of its quasi-open subdomains defined in (5.78). We say that a sequence (Ωn )n∈N in γ -converges to Ω ∈ if for every right-hand side f ∈ L2 (D) the solutions of the elliptic boundary value problems −Δun = f in Ωn ,
u ∈ H01 (Ωn ),
extended by zero to D \ Ωn , converge in L2 ( ) to the solution u of −Δu = f in Ω,
u ∈ H01 (Ω).
Proposition 5.8.6. The convergence un → u of Definition 5.8.2 is actually strong in H01 (D). PROOF. From the equations −Δun = f , multiplying by un and integrating by parts, we obtain D
|∇un |2 d x =
D
f un d x.
Similarly, we obtain for the limit solution u |∇u|2 d x = f u d x. D
D
Since un → u in L (D), we have |∇un |2 d x = lim f un d x = f u dx = |∇u|2 d x, lim 2
n→∞
D
n→∞
D
D
D
which gives strong convergence of un to u in H01 (D). Remark 5.8.3. In [145] it is shown that in Definition 5.8.2 it is equivalent to require the convergence of un to u only for the right-hand side f = 1. Moreover, if for every Ω ∈ we denote by RΩ : L2 (D) → L2 (D) the resolvent operator which associates to every righthand side f ∈ L2 (D) the solution u ∈ H01 (Ω) ⊂ L2 (D) of −Δu = f in Ω, the convergence Ωn →γ Ω can be restated as ∀ f ∈ L2 (D)
RΩ n ( f ) → RΩ ( f )
in L2 (D).
Some conditions equivalent to γ -convergence are listed in the proposition below (see proposition 4.5.3 of [145]). Proposition 5.8.7. Let (Ωn )n∈N and Ω be quasi-open subsets of a given bounded open set D. The following assertions are equivalent. 1. Ωn →γ Ω. 2. For every f ∈ L2 (D) we have RΩn ( f ) → RΩ ( f ) strongly in H01 (D).
i
i i
i
i
i
i
5.8. Capacity theory and elements of potential theory
book 2014/8/6 page 213 i
213
3. We have RΩn (1) → RΩ (1) strongly in H01 (D). 4. The resolvent operators RΩn converge in the operator norm of " (L2 (D)) to RΩ . Remark 5.8.4. Denoting by G(Ω, u) the Dirichlet energy functional ⎧ ⎨ |∇u|2 d x if u ∈ H 1 (Ω), 0 G(Ω, u) = D ⎩+∞ 2 if u ∈ L (D) \ H 1 (Ω), 0
a further equivalence can be stated in terms of Γ -convergence (see Chapter 12): Ωn →γ Ω iff G(Ωn , ·) →Γ G(Ω, ·), where the Γ -convergence is intended with respect to the L2 (D) topology. The γ -convergence is very strong; if Ωn →γ Ω, not only do the solutions RΩn ( f ) of the corresponding boundary value problems converge strongly in H01 (D) for every righthand side f , but, as a consequence of the norm convergence of the resolvent operators (see Proposition 16.5.1), we also have the convergence of the entire spectrum Λ(Ωn ) to Λ(Ω). By the ellipticity of the operator −Δ these spectra consist of discrete eigenvalues (see Chapter 8) which then converge in the sense that λk (Ωn ) → λk (Ω)
for every k ∈ N.
In this way, many shape functionals turn out to be γ -lower semicontinuous or even γ -continuous. Two important classes are the following ones. Integral functionals. Given a right-hand side f ∈ L2 (D) we consider the solution RΩ ( f ) of the PDE −Δu = f in Ω, u ∈ H01 (Ω), that we assume extended by zero outside of Ω. The integral shape cost functionals we may consider are of the form j x, RΩ ( f ), ∇RΩ ( f ) d x, F (Ω) = D
where j is a suitable integrand. Since the γ -convergence implies the strong H01 convergence of the corresponding solutions, as a consequence of Fatou’s lemma and of the Sobolev embedding theorem (see Section 5.7), we obtain that the functional F is γ -lower semicontinuous provided the integrand j satisfies the following properties: (1) j (x, s, z) is measurable in x and lower semicontinuous in (s, z); (2) j (x, s, z) ≥ −a(x) − c |s| p + |z|2 , where a ∈ L1 (D), p = 2N /(N − 2) ( p < +∞ if N = 2), c ∈ R. The shape functional F is γ -continuous if assumptions (1) and (2) above are replaced by (1 ) j (x, s, z) is measurable in x and continuous in (s, z); (2 ) | j (x, s, z)| ≤ a(x) + c |s| p + |z|2 , where a ∈ L1 (D), p = 2N /(N − 2) ( p < +∞ if N = 2), c ∈ R.
i
i i
i
i
i
i
214
book 2014/8/6 page 214 i
Chapter 5. Sobolev spaces
Spectral functionals. For every domain Ω of the class we consider the spectrum Λ(Ω) of the Dirichlet Laplacian −Δ on H01 (Ω). Since our domains Ω are bounded, the Dirichlet Laplacian −Δ has a compact resolvent and so its spectrum Λ(Ω) is discrete: Λ(Ω) = λ1 (Ω), λ2 (Ω), . . . , where λk (Ω) are the eigenvalues counted with their multiplicity. The spectral shape cost functionals we may consider are of the form F (Ω) = Φ Λ(Ω) for a suitable function Φ : RN → R. For instance, taking Φ(Λ) = λk we obtain F (Ω) = λk (Ω). The functional F is then γ -lower semicontinuous provided the function Φ is lower semicontinuous, that is, λnk → λk ∀k
⇒
Φ(Λ) ≤ lim inf Φ(Λn ). n→+∞
Analogously, F is γ -continuous if the function Φ is continuous, that is, λnk → λk ∀k
⇒
Φ(Λn ) → Φ(Λ).
The γ -convergence, defined on the class of (5.78), is actually a metric convergence; from Definition 5.8.2 and from Remark 5.8.3 we have that the distance dγ (Ω1 , Ω2 ) = RΩ1 (1) − RΩ2 (1) generates the γ -convergence. As a consequence, the space ( , dγ ) is a metric space. The following density result was proved in [186]. Proposition 5.8.8. The class of smooth domains A ⊂ D is dγ -dense in . The metric space ( , dγ ) is then separable. However, it is not compact, as the following example shows. Example 5.8.1. This example is due to Cioranescu and Murat (see [175]). Let D = ]0, 1[×]0, 1[ be the unit square of R2 and let f ∈ L2 (D). We construct a sequence (Ωn ) of open subsets of D such that the solutions un = RΩn ( f ) of
−Δu = f in Ωn , u ∈ H01 (Ωn ),
(5.79)
extended by zero on D \ Ωn , converge weakly in H01 (D) to a function u which is not of the form u = RΩ ( f ), then proving the noncompactness of the metric space ( , dγ ). Consider, for n large enough, the sequence of sets Cn =
n 3 i , j =0
B (i /n, j /n),rn ,
Ωn = D \ Cn ,
i
i i
i
i
i
i
5.8. Capacity theory and elements of potential theory
book 2014/8/6 page 215 i
215
where rn = e −cn , being c > 0 a fixed positive constant. It is easy to see that un = RΩn ( f ) are bounded in H01 (D); then we may assume for a subsequence (that we denote by the same indices) that un converges to some function u weakly in H01 (D). It is convenient to introduce the functions zn ∈ H 1 (D) defined by ⎧ 0 on Cn , ⎪ ⎪ ⎪ ⎪ ⎨ ln (x − i/n)2 + (y − j /n)2 + c n 2 zn = on B (i /n, j /n),1/(2n) \ Cn , ⎪ c n 2 − ln(2n) ⎪ ⎪ n ⎪ ⎩1 on D \ i , j =0 B (i /n, j /n),1/(2n) . 2
We notice that 0 ≤ zn ≤ 1 and that ∇zn converges to zero weakly in L2 (D); hence zn converges weakly in H 1 (D) to a constant function. Computing the limit of D zn d x we find that this constant is equal to 1. For every ϕ ∈ C0∞ (D) the function zn ϕ belongs to H01 (Ωn ), and thus we can take zn ϕ as a test function for (5.79) on Ωn : ϕ∇un ∇zn d x + zn ∇un ∇ϕ d x = f ϕzn d x. D
D
D
The second and third terms of this equality converge to D ∇u∇ϕ d x and D f ϕ d x, respectively. For the first term, the Gauss–Green formula gives n ∂ zn ϕ∇un ∇zn d x = un un ∇zn ∇ϕ d x. ϕ dσ − ∂ν D D i , j =0 ∂ B(i/n, j /n),1/(2n) The term with Δzn does not appear since zn is harmonic on B(i /n, j /n),1/(2n) \ Cn ; similarly, the boundary term on ∂ B(i /n, j /n),rn vanishes since un vanishes on it. The last term of the identity above tends to 0 as n → ∞. We now compute the boundary integral. We have n n ∂ zn 2n un un ϕ d σ ϕ dσ = 2 ∂ν i , j =0 ∂ B(i/n, j /n),1/(2n) i , j =0 ∂ B(i/n, j /n),1/(2n) c n − ln(2n) n 1 2n 2 un ϕ d σ. = 2 c n − ln(2n) i , j =0 ∂ B(i/n, j /n),1/(2n) n Let us denote by μn ∈ H −1 (D) the distribution defined by n 1 〈μn , ψ〉H −1 (D)×H 1 (D) = ψ d σ. 0 n i , j =0 ∂ B(i/n, j /n),1/(2n) We prove that μn converges strongly in H −1 (D) to π d x. Indeed, we introduce the functions vn ∈ H 1 (D) defined by Δvn = 4 in i , j B(i /n, j /n),1/(2n) , vn = 0 on D \ i , j B(i /n, j /n),1/(2n) . Therefore
∂ vn ∂ν
=
1 n
on
3
∂ B(i /n, j /n),1/(2n) .
i,j
i
i i
i
i
i
i
216
book 2014/8/6 page 216 i
Chapter 5. Sobolev spaces
We notice that vn converges to zero strongly in H 1 (D), and therefore Δvn converges to zero strongly in H −1 (D). Moreover, n ∇vn ∇ψ d x 〈−Δvn , ψ〉H −1 (D)×H 1 (D) = 0
=
i , j =0 B(i/n, j /n),1/(2n) n i , j =0
∂ B(i/n, j /n),1/(2n)
1 n
ψ dσ −
n i , j =0
Passing to the limit as n → ∞ and using the fact that 1 −1
i, j
4ψ d x. B(i/n, j /n),1/(2n)
B(i/n, j /n),1/(2n)
tends to π/4 weakly
in L (D), we get that μn tends to π d x strongly in H (D). Summarizing, the equation satisfied by u ∈ H01 (D) is 2π ∇u∇ϕ d x + uϕ d x = f ϕdx ∀ϕ ∈ C0∞ (D), c D D D 2
that is, −Δu +
2π c
u=f.
Remark 5.8.5. The two-dimensional example above can be repeated, with similar calcu2 lations, for any dimension N > 2; it is enough to replace the critical radius rn = e −cn by rn = c n −N /(N −2) . An important question is now to characterize the completion of with respect to dγ , that is, all possible γ -limits of sequences of domains of . The conclusion of Example 5.8.1 can be rephrased by saying that all constant nonnegative functions belong to . The characterization of was achieved in [186], where the following result was proved. Theorem 5.8.7. The completion of with respect to dγ is the class M0 (D) of all nonnegative regular Borel measures μ on D (not necessarily finite) such that μ(E) = 0
whenever cap(E) = 0.
The regularity in Theorem 5.8.7 is intended in the usual sense of measures, that is, 4 5 μ(E) = inf μ(A) : A open , E ⊂ A . The measures of M0 (D) will be called capacitary measures. Notice that not all regular Borel measures belong to M0 (D); if N ≥ 2, a point has zero capacity (see Proposition 5.8.5), and hence the Dirac measures δ x0 do not belong to M0 (D). An element of , that is, a quasi-open set Ω ⊂ D, can be identified with the measure 0 if cap(E \ Ω) = 0, ∞D\Ω (E) = +∞ otherwise. For every μ ∈ M0 (D) and f ∈ L2 (D) we may consider the PDE formally written as −Δu + μu = f , (5.80) u ∈ H01 (D),
i
i i
i
i
i
i
5.8. Capacity theory and elements of potential theory
book 2014/8/6 page 217 i
217
whose precise meaning has to be given in the weak form ∇u∇ϕ d x + uϕ d μ = fϕ ∀ϕ ∈ H01 (D) ∩ L2μ (D). D
D
D
It is possible to show (see [149]) that the space X = H01 (D)∩ L2μ (D) is a Hilbert space with the norm u2X = |∇u|2 d x + |u|2 d μ; D
D
then by the Lax–Milgram theorem, Theorem 3.1.2, we obtain that the PDE (5.80) admits a unique solution. Notice that, when μ = ∞D\Ω , (5.80) becomes the PDE −Δu = f in Ω, u ∈ H01 (Ω). For every measure μ ∈ M0 (D) we may define the resolvent operator Rμ : L2 (D) → L2 (D) which associates to every f ∈ L2 (D) the solution u of (5.80). The definition of γ -convergence can be now extended to M0 (D) by setting μn →γ μ ⇐⇒ Rμn ( f ) → Rμ ( f )
for every f ∈ L2 (D).
Again, a result similar to the one of Remark 5.8.3 holds, showing that μn →γ μ iff Rμn (1) → Rμ (1) in L2 (D). In this way, the distance dγ can be extended to M0 (D) by setting dγ (μ1 , μ2 ) = Rμ1 (1) − Rμ2 (1)L2 (D) . As shown by Example 5.8.1, finely perforated domains may tend, in the γ -convergence, to capacitary measures (a multiple of the Lebesgue measure in the example); see Figure 5.1.
Figure 5.1. A finely perforated set as in the Cioranescu–Murat example.
Let us summarize the properties of the γ -convergence on the space M0 (D) (we refer the reader to [145] for the proofs and all the details): 1. The space M0 (D) endowed with the distance dγ is a compact metric space. 2. The class is included in M0 (D) via the identification Ω → ∞D\Ω and is dγ dense in M0 (D). In fact, also the class of all smooth domains Ω is dγ -dense in M0 (D).
i
i i
i
i
i
i
218
book 2014/8/6 page 218 i
Chapter 5. Sobolev spaces
3. The measures of the form a(x) d x with a ∈ L1 (D) belong to M0 (D) and are dγ -dense in M0 (D). In fact, also the class of measures a(x) d x with a smooth is dγ -dense in M0 (D). 4. If μn → μ for the γ -convergence, then the spectrum of the compact resolvent operator Rμn converges to the spectrum of Rμ ; in other words, the eigenvalues of the Schrödinger-like operator −Δ + μn defined on H01 (D) converge to the corresponding eigenvalues of the operator −Δ + μ.
i
i i
i
i
i
i
book 2014/8/6 page 219 i
Chapter 6
Variational problems: Some classical examples
This chapter shows how the direct variational method can be used to solve some classical boundary value problems. The following examples have been selected because of their importance in continuum mechanics, physics, biology, and so forth, and because of their relative simplicity. Indeed, in most of these examples, the unknown function is a scalar valued function and the variational problem can be expressed as a convex minimization problem in a reflexive Banach space V ; typically V is a Sobolev space W m, p (Ω) with 1 < p < ∞. The existence of a (weak) solution can be obtained by application of the convex coercive minimization theorem, Theorem 3.3.4. This contrasts with more involved situations where the unknown function is a vector-valued function, and/or when the functional is no longer convex, and/or the functional space is no longer reflexive, and/or the functional is no longer lower semicontinuous and coercive. Most of these questions will be considered in the next chapters. When the functional which has to be minimized is the sum of a convex quadratic (positive) form and a linear form, an equivalent approach consists in working with the Euler equation (Proposition 2.3.1) and the corresponding existence theorem, namely, the Lax–Milgram theorem, Theorem 3.1.2. In that case, one can treat the problem by either of the two above equivalent methods. We stress that it is part of the skill of the mathematician to find a variational formulation (if it exists) of the studied problem. It is not a priori given! In particular, one has to find a functional setting which is well adapted to the problem under consideration. In the examples which are considered in this chapter we use various Sobolev spaces like H01 (Ω), 2 H 1 (Ω), H 2 (Ω), H pe (Ω), and W 1, p (Ω). r As a general rule, the variational methods provide only weak solutions. It is an important (and often quite involved) question to study the regularity of the variational solution and hence to decide whether it is a classical solution. We will just give some indications on this question in the case of the Dirichlet problem. Notation. Ω is a bounded open set in RN ; ∂ Ω is its topological boundary (also denoted by Γ ). Ω is said to be a regular open set if ∂ Ω is piecewise of class C1 , and n(x) is the ¯ → R we write outward unit normal vector to ∂ Ω at x. Given v : Ω ∂v (x) = ∇v(x) · n(x), ∂n the outward normal derivative of v at x ∈ ∂ Ω. 219
i
i i
i
i
i
i
220
book 2014/8/6 page 220 i
Chapter 6. Variational problems: Some classical examples
6.1 The Dirichlet problem ¯ → R of the following boundary Given f : Ω → R, we are looking for a solution u : Ω value problem: −Δu = f on Ω, (6.1) u =0 on ∂ Ω. The condition “u = 0 on ∂ Ω” is called the homogeneous Dirichlet boundary condition. Problem (6.1) is called the homogeneous Dirichlet boundary value problem for the Laplace operator. We call it the Dirichlet problem. ¯ → R solution of The nonhomogeneous Dirichlet problem consists in finding the u : Ω
−Δu = f u=g
on Ω, on ∂ Ω,
where f : Ω → R and g : ∂ Ω → R are given functions. The word homogeneous refers precisely to the case g = 0. We will see at the end of this section that, in general, the nonhomogeneous problem can be reduced to the homogeneous problem.
6.1.1 The homogeneous Dirichlet problem Let us recall that when N = 2, problem (5.1) can be interpreted, for example, as describing the vertical motion of an elastic membrane under the action of a vertical force of density f . The Dirichlet boundary condition expresses that the membrane is fixed on its boundary. Theorem 6.1.1. The variational approach to the Dirichlet problem is described in the following statements: (a) For every f ∈ L2 (Ω) there exists a unique u ∈ H01 (Ω) which satisfies ⎧ ⎨ ∇u · ∇v d x = f v dx Ω ⎩ uΩ∈ H 1 (Ω).
∀v ∈ H01 (Ω),
(6.2)
0
(b) The solution u of (6.2) satisfies
in (Ω) on ∂ Ω
−Δu = f γ0 (u) = 0
(equality as distributions), (γ0 is the trace operator).
(6.3)
Indeed, for u ∈ H01 (Ω) there is equivalence between (6.2) and (6.3). The solution u of (6.2) is called the weak solution of the Dirichlet problem (6.1). (c) The solution u of (6.2) is the unique solution of the minimization problem min
1 2
Ω
2
|∇v| d x −
Ω
f v dx : v ∈
H01 (Ω)
.
(6.4)
This is the Dirichlet variational principle. We also call u the variational solution of the Dirichlet problem.
i
i i
i
i
i
i
6.1. The Dirichlet problem
book 2014/8/6 page 221 i
221
PROOF. (a) Let us solve (6.2) by using the Lax–Milgram theorem. To that end, take V = H01 (Ω) equipped with the scalar product
〈u, v〉 =
Ω
(uv + ∇u · ∇v) d x,
which makes V a Hilbert space. Then, set for any u, v ∈ V , a(u, v) = ∇u · ∇v d x, Ω
l (v) =
f v d x. Ω
Let us first verify that the bilinear form a : V × V → R is continuous. For arbitrary u, v ∈ V , by using successively the Cauchy–Schwarz inequality in RN and L2 (Ω), we obtain |a(u, v)| ≤ |∇u| |∇v| d x Ω
≤
1/2
Ω
|∇u|2 d x
1/2
Ω
≤ uV vV .
|∇v|2 d x
Let us now verify that the linear form l : V → R is continuous. For arbitrary v ∈ V |l (v)| ≤ | f ||v| d x Ω
≤
1/2 2
Ω
|f | dx
≤ C v
1/2 2
Ω
|v| d x
with C = f L2 .
The only point which remains to verify is that the bilinear form a is coercive. To that end, we use the Poincaré inequality (Theorem 5.3.1). Since Ω has been assumed to be bounded, there exists some positive constant C such that v(x)2 d x ≤ C |∇v(x)|2 d x. ∀v ∈ H01 (Ω) By adding
Ω
Ω
Ω
|∇v|2 d x to each side of the above inequality, we obtain
Ω
(v(x)2 + |∇v(x)|2 ) d x ≤ (1 + C )
Equivalently, ∀v ∈ H01 (Ω)
a(v, v) ≥
Ω
1 1+C
|∇v(x)|2 d x.
vV2 ,
1 and a is α-coercive (or α-elliptic) with α = 1+C > 0. Thus, all the assumptions of the Lax–Milgram theorem are satisfied. This implies existence and uniqueness of the solution u of problem (6.2).
i
i i
i
i
i
i
222
book 2014/8/6 page 222 i
Chapter 6. Variational problems: Some classical examples
(b) Let u be the solution of (6.2). Since (Ω) ⊂ H01 (Ω), we have ∇u · ∇v d x = f v d x, ∀v ∈ (Ω) Ω
(6.5)
Ω
which, by definition of the derivation in the distribution sense, is equivalent to −Δu = f
in (Ω).
(6.6)
Moreover, we know by Proposition 5.6.1 that H01 (Ω) = ker γ0 , where γ0 is the trace operator. Hence γ0 (u) = 0 in trace sense and u satisfies (6.3). Conversely, if u satisfies −Δu = f in the distribution sense, we have (6.5). Then use the density of (Ω) in H01 (Ω) and the fact that u ∈ H01 (Ω) and f ∈ L2 (Ω) to obtain (6.2). (c) The equivalence between (6.5) and (6.3) is an immediate consequence of Proposition 2.3.1. To that end, just note that the bilinear form a(·, ·) is symmetric and positive. The corresponding minimization problem is min{J (v) : v ∈ H01 (Ω)}, where 1 J (v) = a(v, v) − l (v) 2 1 2 |∇v| d x − f v d x. = 2 Ω Ω That’s the Dirichlet variational principle. Note that the functional J is convex, continuous, and coercive on H01 (Ω). Let us now make the link between the notion of a classical solution of the Dirichlet problem and the notion of a weak solution which was introduced in Theorem 6.1.1. Let us first make precise the notion of classical solution. ¯ → R is said to be a classical solution of the Dirichlet Definition 6.1.1. A function u : Ω 2 ¯ problem if u ∈ C (Ω) satisfies −Δu = f in the sense of the classical differential calculus, while the restriction of u to ∂ Ω is equal to zero: −Δu(x) = f (x) ∀x ∈ Ω, u(x) = 0 ∀x ∈ ∂ Ω. ¯ is a classical solution of the Dirichlet problem (6.1), then Proposition 6.1.1. (a) If u ∈ C2 (Ω) it is equal to the weak solution of (6.2). As a consequence, the classical solution, if it exists, is unique. ¯ and Ω is of class C1 , then (b) If the weak solution u of (6.2) is regular, that is, u ∈ C2 (Ω) u is the classical solution of the Dirichlet problem (6.1). ¯ be a classical solution of the Dirichlet problem. Then u and PROOF. (a) Let u ∈ C2 (Ω) ∂u ¯ (recall that Ω is for any 1 ≤ i ≤ N are continuous functions on the compact set Ω ∂x i
i
i i
i
i
i
i
6.1. The Dirichlet problem
book 2014/8/6 page 223 i
223
¯ Since L∞ (Ω) ⊂ L2 (Ω) (we use again that Ω is bounded) bounded) and hence bounded on Ω. 1 we obtain that u ∈ H (Ω). ¯ by using Proposition 5.6.1, we have u = 0 on ∂ Ω so that u ∈ Since u ∈ H 1 (Ω) ∩ C(Ω), 1 H0 (Ω). Note that this implication holds true without assuming any regularity assumption on ∂ Ω. Take an arbitrary v ∈ (Ω). We have f v d x. − vΔu d x = Ω
Ω
Let us integrate by parts on Ω. We obtain ∀v ∈ (Ω) ∇u · ∇v d x = f v d x. Ω
Ω
By density of (Ω) in H01 (Ω), we obtain ∀v ∈
H01 (Ω)
Ω
∇u · ∇v d x =
f v d x. Ω
Hence, u is the weak solution of (6.2). ¯ Using (b) Let us assume that the weak solution u of (6.2) is regular, that is, u ∈ C2 (Ω). 1 again Proposition 5.6.1 and the fact that Ω is now assumed to be of class C , we obtain ¯ ∩ H 1 (Ω) =⇒ u = 0 on ∂ Ω (as a restriction to ∂ Ω). u ∈ C(Ω) 0 On the other hand, we have −Δu = f
in (Ω).
Since u ∈ C2 (Ω), the distributional derivatives of u (up to the second order) coincide with the classical derivatives and −Δu(x) = f (x)
∀x ∈ Ω
in the classical sense. Hence, u is the classical solution of the Dirichlet problem. Remark 6.1.1. By Proposition 6.1.1, the question of the existence of a classical solution has been converted into the problem of the regularity of the weak solution. For this quite involved and important question, see [8], [9], [228], and [310].
6.1.2 The nonhomogeneous Dirichlet problem ¯ → R of the following Given g : ∂ Ω → R and f : Ω → R, we look for a solution u : Ω boundary value problem: −Δu = f on Ω, (6.7) u = g on ∂ Ω. Let us first give a variational formulation of this problem, as a constrained minimization problem on the set C = {v ∈ H 1 (Ω) : v = g on ∂ Ω}. Then, we will see how this problem can be reduced to the homogeneous Dirichlet problem.
i
i i
i
i
i
i
224
book 2014/8/6 page 224 i
Chapter 6. Variational problems: Some classical examples
Theorem 6.1.2. Let Ω be a bounded regular connected open set in RN . Let g : ∂ Ω → R be a given function such that g = γ0 ( g˜ ) for some g˜ ∈ H 1 (Ω), i.e., g belongs to H 1/2 (∂ Ω), the trace space of H 1 (Ω) on ∂ Ω. Let us denote C = {v ∈ H 1 (Ω) : γ0 (v) = g on ∂ Ω}.
(6.8)
(i) The set C is a closed convex nonempty subset of H 1 (Ω). Indeed, C = g˜ + H01 (Ω) is an affine subspace in H 1 (Ω) which is parallel to H01 (Ω). (ii) For any f ∈ L2 (Ω), there exists a unique solution u of the minimization problem 1 2 min |∇v(x)| d x − f v dx : v ∈ C . (6.9) 2 Ω Ω (iii) The solution u of (6.9) is characterized by ⎧ ⎨ ∇u · ∇v d x = f v dx Ω ⎩ uΩ∈ C ,
∀v ∈ H01 (Ω),
and it is a weak solution of the nonhomogeneous Dirichlet problem −Δu = f on Ω, u=g on ∂ Ω,
(6.10)
(6.11)
where −Δu = f is interpreted in the distribution sense and u = g in the trace sense. PROOF. (i) The structure of C and its topological properties follow immediately from the fact that γ0 is a linear continuous map from H 1 (Ω) into L2 (∂ Ω). (ii) The only point which is not immediate is the fact that the functional 1 |∇v|2 d x − f v d x + δC (v) (6.12) J (v) := 2 Ω Ω is coercive on H 1 (Ω). Given λ ∈ R, let us prove that the set 1 2 {J ≤ λ} = v ∈ C : |∇v| d x − f v dx ≤ λ 2 Ω Ω is bounded in H 1 (Ω). Set v = w + g˜ with w ∈ H01 (Ω). Equivalently, we have to prove that 1 K := w ∈ H01 (Ω) : |∇w + ∇ g˜ |2 d x − f (w + g˜ ) d x ≤ λ 2 Ω Ω is bounded in H01 (Ω). Let us observe that 1 2 K ⊂ w ∈ H0 (Ω) : |∇w| d x ≤ 2 |∇w||∇ g˜ | d x + 2 | f ||w| d x + γ Ω
Ω
Ω
with γ = 2λ+2 f L2 g˜ L2 . By using the Poincaré inequality in H01 (Ω) and an elementary computation, one obtains that K is bounded, and hence J is coercive on H 1 (Ω). The functional J being convex lower semicontinuous (we use that C is closed convex) and coercive, the minimization problem of J on H 1 (Ω), that is, (6.9), admits a solution u.
i
i i
i
i
i
i
6.1. The Dirichlet problem
book 2014/8/6 page 225 i
225
The uniqueness of the solution u follows from the strict convexity property of the Dirichlet integral: for any u1 , u2 in C , we have & 2 % 1 u1 + u2 2 2 |∇u1 | d x + |∇u2 | d x dx = ∇ 2 2 Ω Ω Ω 1 − |∇(u1 − u2 )|2 d x. (6.13) 4 Ω If u1 and u2 are two distinct solutions of (6.9), then the last term in (6.13) is strictly less than zero, which leads to a contradiction. We have used that u1 = u2 = g on ∂ Ω implies u1 − u2 = 0 on ∂ Ω and that Ω |∇(u1 − u2 )|2 d x = 0 implies u1 − u2 is constant on Ω and hence, u1 = u2 . (iii) The general optimality condition for a minimization problem of the form 1 min a(v, v) − l (v) : v ∈ C 2 is, according to Theorem 3.3.5, a(u, v − u) − l (v − u) ≥ 0 u ∈ C.
∀v ∈ C ,
(6.14)
Because of the particular structure of the set C = g˜ + H01 (Ω), we have v ∈ C iff v − u ∈ H01 (Ω) which is a subspace of H 1 (Ω). As a consequence, (6.14) is equivalent to a(u, v) − l (v) = 0 ∀v ∈ H01 (Ω), u ∈ C, that is, (6.10). Then use that (Ω) is dense in H01 (Ω) to obtain the equivalent formulation (6.11) in terms of distributions. Let us now make the link with the homogeneous Dirichlet problem. Indeed, we are going to show that just by taking as a new unknown function w := u − ˜g , one can reduce the nonhomogeneous Dirichlet problem to some homogeneous Dirichlet problem. Formally, w satisfies −Δw = f + Δ g˜ on Ω, w =0 on ∂ Ω, which is a homogeneous Dirichlet problem, with a different right-hand side in the partial differential equation on Ω, namely, f +Δ g˜ , which now belongs to the space H −1 (Ω)! Let us make this precise in the following statement. Theorem 6.1.3. Let us make the same assumptions and use the same notations as in Theorem 6.1.2, and fix some g˜ ∈ H 1 (Ω) such that γ0 ( g˜ ) = g on ∂ Ω. (i) There exists a unique solution w ∈ H01 (Ω) of the problem ⎧ ⎨ ∇w · ∇v d x = f v d x − ∇ g˜ · ∇v d x ∀v ∈ H01 (Ω), Ω Ω Ω ⎩ w ∈ H 1 (Ω),
(6.15)
0
i
i i
i
i
i
i
226
book 2014/8/6 page 226 i
Chapter 6. Variational problems: Some classical examples
and u = w + g˜ is the variational solution of the nonhomogeneous Dirichlet problem (6.9). (ii) The variational solution w of (6.15) is a weak solution of the boundary value problem −Δw = f + Δ g˜ on Ω, (6.16) w =0 on ∂ Ω. PROOF. Replacing u by w + g˜ in (6.10) gives exactly (6.15). Hence Theorem 6.1.3 is just an equivalent formulation of Theorem 6.1.2. We could, as well, treat the nonhomogeneous Dirichlet problem by solving, in an independent way, the variational problem (6.15). Note that to apply the Lax–Milgram theorem, one needs to verify that l (v) = f v d x − ∇ g˜ · ∇v d x Ω
Ω
is a linear continuous form on H01 (Ω), which is clear. This corresponds to the fact that in (6.16) w is a solution of the Laplace equation with a right-hand side f + Δ g˜ , which in general is no longer in L2 (Ω) but belongs to H −1 (Ω)! Remark 6.1.2. From a practical and especially numerical point of view, it is much easier to work with w = u − g˜ and to solve the corresponding homogeneous Dirichlet problem by means of variational methods in the Sobolev space H01 (Ω). This justifies our special interest in studying the relations between the two approaches.
6.2 The Neumann problem We are going to study successively the coercive Neumann problem (both in the homogeneous and the nonhomogeneous case) and then the semicoercive Neumann problem. In this section, Ω is a bounded open connected set in RN which is assumed to be regular (piecewise of class C1 ). The outward unit normal vector to ∂ Ω at x ∈ ∂ Ω is denoted by n(x). Generally speaking, the Neumann boundary condition expresses that the normal derivative ∂∂ nu of the unknown function is prescribed on ∂ Ω.
6.2.1 The coercive homogeneous Neumann problem Let us give a function a0 ∈ L∞ (Ω) which satisfies the following assumption: there exists some positive real number α0 > 0 such that a0 (x) ≥ α0
for a.e. x ∈ Ω.
(6.17)
Given f ∈ L2 (Ω), we are looking for a solution u of the following boundary value problem: −Δu + a0 u = f on Ω, (6.18) ∂u =0 on ∂ Ω. ∂n The word homogeneous refers to the fact that ∂∂ nu is prescribed as equal to zero on the boundary. The word coercive is related to the fact that problem (5.18) is a well-posed problem (existence and uniqueness of a solution) whose variational resolution involves a coercive bilinear form (equivalently, a coercive convex functional) on the space H 1 (Ω). Let us make this precise in the statement below.
i
i i
i
i
i
i
6.2. The Neumann problem
book 2014/8/6 page 227 i
227
Theorem 6.2.1. The following facts hold: (i) There exists a unique solution u ∈ H 1 (Ω) of the problem ⎧ ⎨ (∇u · ∇v + a0 uv) d x = f v d x ∀v ∈ H 1 (Ω), Ω Ω ⎩ u ∈ H 1 (Ω). (ii) Equivalently, u is the unique solution of minimization problem 1 2 2 1 min (|∇v| + a0 v ) d x − f v d x : v ∈ H (Ω) . 2 Ω Ω
(6.19)
(6.20)
¯ Then (iii) Let us assume that the solution u of (6.19) (or (6.20)) is regular, i.e., u ∈ C2 (Ω). u is a classical solution of the Neumann problem (6.18). (iv) Conversely, if there exists a classical solution of the Neumann problem (6.18), it is equal to the solution of the variational problem (6.19) (or (6.20)). PROOF. (i) The functional space which is well adapted to the Neumann problem is V = H 1 (Ω). Since Ω has been assumed to be regular, we know (see Proposition 5.4.1) that ¯ = {v : v ∈ (RN )}. We will use ¯ is a dense subspace of H 1 (Ω). Recall that (Ω)
(Ω) |Ω ¯ elements v ∈ (Ω) as smooth test functions. Existence and uniqueness of a solution u of (6.19) is an immediate consequence of the Lax–Milgram theorem. Let us briefly give the proof. The bilinear form a : H 1 (Ω) × H 1 (Ω) → R which is defined by a(u, v) = (∇u · ∇v + a0 uv) d x Ω
verifies |a(u, v)| ≤ max{1, a0 L∞ }
Ω
(|∇u||∇v| + |u||v|)d x
≤ max{1, a0 L∞ }uH 1 vH 1 and a is continuous. On the other hand, for any v ∈ H 1 (Ω) a(v, v) ≥ min{1, α0 } (|∇v|2 + v 2 ) d x = min{1, α0 }v2H 1 Ω
and a is α-coercive on V = H 1 (Ω) with α = min{1, α0 }. The linear form l (v) = Ω f v d x verifies |l (v)| ≤ f L2 vL2 ≤ f L2 vH 1 and l is continuous on H 1 (Ω). Thus, all the conditions of the Lax–Milgram theorem are satisfied and, as a consequence, there exists a solution u, which is unique, of problem (6.19). (ii) The equivalence between (6.19) and the resolution of the minimization problem (6.20) follows immediately from Proposition 2.3.1 and the fact that a is positive and symmetric.
i
i i
i
i
i
i
228
book 2014/8/6 page 228 i
Chapter 6. Variational problems: Some classical examples
(iii) We now come to the most interesting question, which is the study of the relationship between the Neumann boundary value problem (6.18) and the variational problem (6.19) (or (6.20)). The striking feature of the variational approach is that the Neumann boundary condition does not appear explicitly in the variational formulation. Indeed, it is implicitly contained in it; this is what we are going to verify now. Suppose that the ¯ Take as a test function v ∈ (Ω) ¯ ⊂ H 1 (Ω) solution u of (6.19) is regular, i.e., u ∈ C2 (Ω). and make an integration by parts on (6.19). (This is possible since u and v are regular up to the boundary.) We obtain ∂u ¯ (−Δu + a0 u − f )v d x + v d σ = 0. (6.21) ∀v ∈ (Ω) Ω ∂Ω ∂ n Thanks to (6.21), we are going to test u successively on Ω and on ∂ Ω. Let us first test u on Ω by taking v ∈ (Ω) in (6.21). Since v = 0 on ∂ Ω, the integral term on ∂ Ω in (6.21) is equal to zero and we obtain (−Δu + a0 u − f )v d x = 0. ∀v ∈ (Ω) Ω
This implies that (see Theorem 9.3.1) −Δu + a0 u = f
a.e. on Ω.
(6.22)
¯ and Let us now test u on ∂ Ω. To do so, we return to (6.21) with a general v ∈ (Ω) use the previous information (6.22). Formula (6.22) just expresses that −Δu +a0 u − f , which is an L2 (Ω) function, is equal to zero almost everywhere on Ω. As a consequence, the first integral term on Ω in (6.21) is equal to zero and we obtain ∂u ¯ d σ = 0. (6.23) v ∀v ∈ (Ω) ∂Ω ∂ n We conclude thanks to the following lemma of independent interest. Lemma 6.2.1. Let Ω be a regular bounded open set in RN . Then, for any function h ∈ L2 (∂ Ω), we have ¯ =⇒ h = 0 on ∂ Ω. hv d σ = 0 ∀v ∈ (Ω) ∂Ω
PROOF. Take h ∈ L (∂ Ω), which satisfies hv d σ = 0 2
∂Ω
¯ ∀v ∈ (Ω).
(6.24)
¯ in H 1 (Ω) and the continuity of the trace operator from H 1 (Ω) By the density of (Ω) 2 into L (∂ Ω), property (6.24) can be extended by continuity to H 1 (Ω), i.e., hγ0 (v) d σ = 0 ∀v ∈ H 1 (Ω). ∂Ω
We know, by Proposition 5.6.3, that the image of H 1 (Ω) by γ0 is equal to H 1/2 (∂ Ω). Hence hv d σ = 0 ∀v ∈ H 1/2 (∂ Ω). ∂Ω
i
i i
i
i
i
i
6.2. The Neumann problem
book 2014/8/6 page 229 i
229
The conclusion follows now from the density property of H s (∂ Ω) in L2 (∂ Ω) for any s > 0. This last property can be easily deduced (by using local coordinates) from the fact that for any s > 0, (RN ) is included in H s (RN ), and from the density property of (RN ) in L2 (RN ). PROOF OF THEOREM 6.2.1 CONTINUED. Let us complete the proof of Theorem 6.2.1 ¯ is a classical solution of the and prove the last point (iv). Let us assume that u ∈ C2 (Ω) ¯ an arbitrary test function, Neumann boundary value problem (6.18). Take v ∈ (Ω) multiply (6.18) by v, and integrate on Ω. We obtain ¯ (−Δu + a0 u)v d x = f v d x ∀v ∈ (Ω). Ω
Ω
Integrating by parts, the above expression gives ∂u (∇u · ∇v + a0 uv)d x − v f v dx dσ = Ω ∂Ω ∂ n Ω Since
∂u ∂n
= 0 on ∂ Ω, we get (∇u · ∇v + a0 uv) d x = f v dx Ω
Ω
¯ ∀v ∈ (Ω).
¯ ∀v ∈ (Ω).
¯ in H 1 (Ω) to extend this equality to any v ∈ H 1 (Ω). Note We now use the density of (Ω) ¯ clearly implies that u ∈ H 1 (Ω). also that u ∈ C2 (Ω) Remark 6.2.1. The following facts should be pointed out: 1. The solution u of (6.19) (equivalently, of the minimization problem (6.20)) is called the variational solution, or the weak solution of the Neumann boundary value problem (6.18). As in the case of the Dirichlet problem, the term weak solution is justified by the fact that the weak solution always exists, but on the counterpart, the equation on Ω and the Neumann boundary condition are satisfied only in a weak sense (respectively, distribution sense and trace sense). The existence of a classical solution is equivalent to the study of the regularity of the weak solution. 2. According to the variational approach to the Dirichlet problem, the most natural choice for the function space in which to solve the Neumann problem is given by the closure V in H 1 (Ω) of ∂v ¯ = v ∈ (Ω) : = 0 on ∂ Ω . ∂n It is a good exercise to verify that V = H 1 (Ω)! One may be convinced of this fact just by observing the following elementary situation: take Ω = (0, 1) and v(x) = x. For each n ∈ N, take ⎧ 1 if 0 ≤ x ≤ n1 , ⎨n if n1 ≤ x ≤ 1 − n1 , vn (x) = x ⎩ 1 − n1 if 1 − n1 ≤ x ≤ 1. Clearly vn ∈ H 1 (0, 1),
d vn dv (0) = d xn (1) = 0, dx
and vn → v in H 1 (0, 1) as n → +∞: 1/n 1 2 2 2 2 x− vn − vH 1 (0,1) = 2 +1 dx ≤ + 3 . n n n 0
i
i i
i
i
i
i
230
book 2014/8/6 page 230 i
Chapter 6. Variational problems: Some classical examples
Hence the property “ ∂∂ vx = 0 on ∂ Ω” is not stable for the convergence in the H 1 (Ω) topology! As a conclusion, H 1 (Ω) is the right functional space to solve the Neumann problem by variational methods. 3. As a general rule, homogeneous Neumann boundary conditions naturally occur in problems where the boundary is free, i.e., no constraints, and no external action is exerted on ∂ Ω. This explains the importance of Neumann-type boundary value problems in mechanics, physics, and so forth.
6.2.2 The coercive nonhomogeneous Neumann problem We keep the same notation and assumptions as in the previous section. In addition, we suppose that there is some function g ∈ L2 (∂ Ω) which is given. We consider the (nonhomogeneous) Neumann problem
−Δu + a0 u = f ∂u =g ∂n
on Ω, on ∂ Ω.
(6.25)
The variational approach to (6.25) is similar to that in the proof of the homogeneous case (Theorem 6.2.1). One just needs to change the linear form l : H 1 (Ω) → R by introducing the integral term ∂ Ω g v d σ. Let us make this precise. Theorem 6.2.2. (i) Given f ∈ L2 (Ω) and g ∈ L2 (∂ Ω), there exists a unique solution u ∈ H 1 (Ω) of the problem ⎧ ⎨ (∇u · ∇v + a uv) d x = f v d x + g v dσ 0 Ω ∂Ω ⎩ uΩ∈ H 1 (Ω).
∀v ∈ H 1 (Ω),
(6.26)
(ii) Equivalently, u is the unique solution of the minimization problem min
1 2
2
(|∇v| + a0 v ) d x −
2
Ω
f v dx −
1
∂Ω
g v d σ : v ∈ H (Ω) .
(6.27)
(iii) Let us assume that the solution u of (6.26) (equivalently, (6.27)) is regular, i.e., u ∈ ¯ Then u is a classical solution of the nonhomogeneous Neumann problem (6.25). C2 (Ω). (iv) Conversely, if u is a classical solution of (6.25), then it is equal to the solution of the variational problem (6.26) (equivalently, (6.27)). PROOF. (i) We follow the lines of the proof of Theorem 6.2.1. We know, by Theorem 5.6.1, that the trace operator γ0 : H 1 (Ω) → L2 (∂ Ω) is continuous. From this, and by using the fact that g ∈ L2 (∂ Ω), we deduce that the linear form v ∈ H 1 (Ω) → ∂ Ω g v d σ is continuous on H 1 (Ω). This clearly implies that the linear mapping v → Ω f v d x + g v d σ is continuous on H 1 (Ω). The assumptions of the Lax–Milgram theorem are ∂Ω satisfied. (The bilinear form a : H 1 (Ω) × H 1 (Ω) → R is the same as in the homogeneous case.) As a consequence, problem (6.26) admits a unique solution u.
i
i i
i
i
i
i
6.2. The Neumann problem
book 2014/8/6 page 231 i
231
¯ and (iii) Let us assume that u is a regular solution of (6.26). By taking first v ∈ (Ω) by using the above equality, we infer that ∂u ¯ − g d σ = 0. v ∀v ∈ (Ω) ∂n ∂Ω By using Lemma 6.2.1, we conclude that
∂u ∂n
= g on ∂ Ω.
6.2.3 The semicoercive homogeneous Neumann problem Let us now consider the following boundary value problem. Given f ∈ L2 (Ω), find u : ¯ → R, which satisfies Ω −Δu = f on Ω, (6.28) ∂u =0 on ∂ Ω. ∂n Note that in (6.28), the term a0 which was supposed to be positive in the previous section is now equal to zero. This makes a big difference; problem (6.28) is no longer a well-posed problem. To see this, let us assume for a moment that u is a smooth solution of (6.28). Let us integrate (6.28) on Ω and use the divergence theorem. We obtain f (x) d x = − Δu d x Ω
=− = 0.
Ω
∂Ω
∂u ∂n
dσ
Hence, a necessary condition for the existence of a (classical) solution of problem (6.28) is that Ω f (x) d x = 0. Moreover, we can observe also that if u is a solution of (6.28), then for any constant C , u + C is also a solution. Indeed, we are going to prove that the condition Ω f (x) d x = 0 is a necessary and sufficient condition for the existence of a solution of problem (6.28) and that any two solutions of this problem differ by a constant. We are going to present two different variational approaches of independent interest. The first one consists in working with the space 1 v(x) d x = 0 , V = v ∈ H (Ω) : Ω
and we will see that, in this framework, the problem becomes coercive. The second approach consists in regularizing problem (6.28) by considering for each > 0, the now coercive problem u − Δu = f on Ω, ∂ u =0 on ∂ Ω. ∂n Then, by passing to the limit as → 0, one obtains that the sequence (u )→0 converges to a particular solution of the initial problem (6.28). Theorem 6.2.3. Let us give f ∈ L2 (Ω) such that Ω f (x) d x = 0. We introduce the space V = {v ∈ H 1 (Ω) : Ω v(x) d x = 0} which is equipped with the scalar product and norm of H 1 (Ω), i.e., 〈u, v〉 = (uv + ∇u · ∇v) d x, Ω
i
i i
i
i
i
i
232
book 2014/8/6 page 232 i
Chapter 6. Variational problems: Some classical examples
which makes V a closed subspace of H 1 (Ω) and hence a Hilbert space. (i) There exists a unique u ∈ V which satisfies ⎧ ⎨ ∇u · ∇v d x = f v dx Ω Ω ⎩ u ∈ V.
∀v ∈ V ,
Equivalently, u is the unique solution of the minimization problem 1 2 min |∇v| d x − f v dx : v ∈V . 2 Ω Ω
(6.29)
(6.30)
(ii) Let us assume that the solution u of (6.29) is regular. Then u is a classical solution of the problem ⎧ −Δu = f on Ω, ⎪ ⎪ ⎪ ⎨ ∂u =0 on ∂ Ω, (6.31) ∂ n ⎪ ⎪ ⎪ u(x) d x = 0, ⎩ Ω
and all the other classical solutions of (6.28) are obtained by adding a constant to u. 1 (iii) Conversely, if u is a classical solution of (6.28), then u − |Ω| u(x) d x is equal to the Ω solution of problem (6.29). PROOF. (i) The central point is to prove that the bilinear form a : V × V → R, which is defined by a(u, v) =
Ω
∇u · ∇v d x,
is coercive on V . Indeed, ∀v ∈ V
a(v, v) =
Ω
|∇v|2 d x;
(6.32)
so, we need some Poincaré inequality to conclude that a is coercive. In our setting, it is the Poincaré–Wirtinger inequality (cf. Corollary 5.4.1) which is well adapted. Let us recall that there exists some constant C such that ) ) ) ) 1 ) ) ∀v ∈ H 1 (Ω) )v − v(x) d x ) ≤ C ∇vL2 (Ω)N . ) ) 2 |Ω| Ω L (Ω)
When v ∈ V , we have
Ω
v(x) d x = 0, and hence ∀v ∈ V
vL2 (Ω) ≤ C ∇vL2 (Ω)N .
(6.33)
From (6.32) and (6.33) we obtain ∀v ∈ V
a(v, v) ≥
1 1+C2
v2H 1 (Ω) ,
(6.34)
and a is coercive.
i
i i
i
i
i
i
6.2. The Neumann problem
book 2014/8/6 page 233 i
233
(ii) Let us now interpret the variational problem (6.29) in a classical sense when its solution u is regular. To that end, let us notice that 1 ¯ v− ∀v ∈ (Ω) v(x) d x ∈ V . |Ω| Ω By using such test functions in (6.29) and by integration by parts, we obtain ! " 1 ¯ ∇u · ∇v d x = f v− v(x) d x d x. ∀v ∈ (Ω) |Ω| Ω Ω Ω It is only now that we use the information Ω f (x) d x = 0 to obtain ¯ ∀v ∈ (Ω) ∇u · ∇v d x = f v d x. Ω
(6.35)
(6.36)
Ω
The end of the proof runs as before and we obtain that u satisfies (6.31). If u ∗ is another classical solution of (6.29), then −Δ(u − u ∗ ) = 0 on Ω, ∂ (u − u ∗ ) = 0. ∂n Let us multiply the first equation by u − u ∗ and integrate by parts. We obtain |∇(u − u ∗ )|2 d x = 0 Ω
∗
and, hence, u − u ≡ C for some constant C . 1 (iii) Conversely, let u be a classical solution of (6.28). Then, u − |Ω| belongs to V and satisfies −Δu ∗ = f on Ω, ∂ u∗ =0 on ∂ Ω. ∂n
Ω
u(x) d x := u ∗
¯ and integrate by parts Let us multiply the first equation by v ∈ (Ω) ¯ ∀v ∈ (Ω) ∇u ∗ · ∇v d x = f v d x. Ω
Ω
¯ in H 1 (Ω) we obtain By density of (Ω) ∀v ∈ H 1 (Ω) ∇u ∗ · ∇v d x = f v d x, Ω
Ω
∗
and since V ⊂ H (Ω) we obtain (6.29), i.e., u is the unique solution of problem (6.29). 1
Let us now describe the other approach, which is an illustration of the so-called Tikhonov regularization method. Theorem 6.2.4. Let f ∈ L2 (Ω) be given such that Ω f (x) d x = 0. For any > 0, let u be the unique solution of the variational problem ⎧ ⎨ (u v + ∇u · ∇v) d x = f v d x ∀v ∈ H 1 (Ω), (6.37) Ω Ω ⎩ u ∈ H 1 (Ω).
i
i i
i
i
i
i
234
book 2014/8/6 page 234 i
Chapter 6. Variational problems: Some classical examples
Equivalently, u is the weak solution of the homogeneous Neumann problem u − Δu = f on Ω, ∂ u =0 on ∂ Ω. ∂n Then the family (u )→0 norm converges in H 1 (Ω), as → 0, to a function u ∈ H 1 (Ω), which is the solution of problem (6.29): ⎧ ⎨ ∇u · ∇v d x = f v d x ∀v ∈ V , Ω ⎩ uΩ∈ V , where V = {v ∈ H 1 (Ω) :
Ω
v(x) d x = 0}. Equivalently, u is the weak solution of ⎧ =f on Ω, ⎪ ⎨ −Δu ∂u =0 on ∂ Ω, ∂n ⎪ ⎩ u(x) d x = 0. Ω
PROOF. For > 0, take a0 ≡ . We are in the coercive situation described by Theorem 6.2.1. Therefore, problem (6.37) admits a unique solution u . Let us first prove that the family (u )→0 remains bounded in H 1 (Ω). By taking v = u in (6.37), we obtain 2 2 u d x + |∇u | d x = f u d x. (6.38) Ω
Ω
By taking v ≡ 1 in (6.37), and using that
Ω
Ω
Ω
f (x) d x = 0, we obtain
u (x) d x = 0.
(6.39)
Let us invoke again the Poincaré–Wirtinger inequality (Corollary 5.4.1), which, in particular, implies the existence of a constant C > 0 such that ∀v ∈ V
1/2 2
vL2 (Ω) ≤ C
Ω
|∇v| d x
.
(6.40)
Combining (6.38), (6.39), and (6.40) we obtain 2 2 u d x ≤ C |∇u |2 d x Ω
Ω
≤ C2
and hence
1/2 f 2dx
Ω
Ω
1/2 Ω
u2 d x
1/2 u2 d x
≤C
1/2
2
2
f dx
.
Ω
Returning to (6.38), we get
2
Ω
|∇u | d x ≤ C
2
f 2 d x. Ω
i
i i
i
i
i
i
6.2. The Neumann problem
book 2014/8/6 page 235 i
235
Finally, u H 1 (Ω) ≤ C (1 + C ) f L2 (Ω) .
(6.41)
Let us extract a subsequence (which we still denote by (u )) which weakly converges in H 1 (Ω) to some u ∗ ∈ H 1 (Ω). When passing to the limit on (6.37) we immediately obtain ⎧ ⎨ ∇u ∗ · ∇v d x = f v d x ∀v ∈ H 1 (Ω), (6.42) Ω Ω ⎩ u∗ ∈ V . The problem (6.42) clearly admits at most a solution: take two solutions u1∗ and u2∗ , make the difference, and take v = u1∗ − u2∗ ; one obtains u1∗ − u2∗ = constant and u1∗ − u2∗ ∈ V , which implies u1∗ = u2∗ . Hence, the whole sequence (u )→0 weakly converges to the unique solution u ∗ of (6.42), and u ∗ = u, which is the solution that we obtained in Theorem 6.2.3 by a different argument. Let us complete the argument by noticing that by passing to the limit on (6.38) 2 f u. lim |∇u | d x = →0
Ω
Ω
On the other hand, by taking v = u in (6.29) we have f u = |∇u|2 d x. Ω
Ω
Hence u → u in w − H 1 (Ω) and u H 1 → uH 1 . This implies that u → u strongly in H 1 (Ω).
6.2.4 The semicoercive nonhomogeneous Neumann problem ¯ → R of the Given f : Ω → R and g : ∂ Ω → R, we are looking for a solution u : Ω following boundary value problem:
−Δu = f ∂u =g ∂n
on Ω, on ∂ Ω.
Theorem 6.2.5. Let f ∈ L2 (Ω) and g ∈ L2 (∂ Ω) be given functions which satisfy the so-called compatibility condition:
Ω
Set
f (x) d x +
∂Ω
g (x) d σ(x) = 0.
1
V = v ∈ H (Ω) :
(6.43)
Ω
v(x) d x = 0 .
Then there exists a unique solution u of the problem ⎧ ⎨ ∇u · ∇v d x = f v dx + g v dσ Ω ∂Ω ⎩ uΩ∈ V .
∀v ∈ V ,
(6.44)
i
i i
i
i
i
i
236
book 2014/8/6 page 236 i
Chapter 6. Variational problems: Some classical examples
Equivalently, u is the unique solution of the minimization problem 1 min |∇v|2 d x − f v dx − g v dσ : v ∈ V . 2 Ω Ω ∂Ω
(6.45)
The function u is a weak solution of the following nonhomogeneous Neumann boundary value problem: −Δu = f on Ω, (6.46) ∂u =g on ∂ Ω. ∂n PROOF. Most of the ingredients of the proof were introduced in the previous sections. Therefore, we just briefly sketch the main lines of the proof. The only difference with the semicoercive homogeneous case comes from the linear form l : V → R, f v dx + g v d σ. l (v) = Ω
∂Ω
The continuity of l follows from the continuity of the trace operator γ0 from H 1 (Ω) into L2 (∂ Ω). The only point which deserves particular attention is the interpretation of (6.44) as a boundary value problem. ¯ let Let us assume thatthe solution u of (6.44) is regular. Given an arbitrary v ∈ (Ω), 1 us notice that v − |Ω| Ω v(x) d x belongs to V . Taking such a test function in (6.44), we infer that % & ∇u · ∇v d x = f v dx + g v d σ − (v) f (x) d x + g (x) d σ(x) , Ω
Ω
where we set (v) :=
∂Ω
1 |Ω|
Ω
Ω
v(x) d x. By using (6.43) we obtain
¯ ∀v ∈ (Ω)
∂Ω
Ω
∇u · ∇v d x =
Ω
f v dx +
∂Ω
g v dσ
and conclude by using arguments similar to those in the proof of Theorem 6.2.2. Remark 6.2.2. (1) We have described two different methods to reduce the semicoercive Neumann problem to the coercive one. Each time, the idea is to reduce the problem to a situation where one can apply some Poincaré inequality. As another possibility let us mention (it is a good exercise to develop it) the parallel approach which consists in working with the space W (instead of V ) defined by 1 v(x) d σ(x) = 0 . W = v ∈ H (Ω) : ∂Ω
(2) The term semicoercive comes from the fact that the lack of coercivity concerns only the lower-order terms (here only the zero-order term, namely, u!).
6.3 Mixed Dirichlet–Neumann problems We are going to consider boundary value problems whose boundary conditions contain both u and ∂∂ nu .
i
i i
i
i
i
i
6.3. Mixed Dirichlet–Neumann problems
book 2014/8/6 page 237 i
237
6.3.1 The Dirichlet–Neumann problem Let Ω be an open bounded set in RN which is assumed to be connected and regular. (Its boundary Γ = ∂ Ω is piecewise of class C1 .) Let us suppose that Γ = Γ0 ∪ Γ1 , Γ0 ∩ Γ1 = with H N −1 (Γ0 ) > 0, where Γ is the union of the two disjoint sets Γ0 and Γ1 . Given f ∈ L2 (Ω) and g ∈ L2 (Γ1 ) we are looking for a solution u of the following boundary value problem: ⎧ ⎨ −Δu = f on Ω, u =0 on Γ0 , ⎩ ∂u = g on Γ1 . ∂n Note that two different types of boundary conditions are imposed to u: on Γ0 it is a Dirichlet condition, while on the complementary Γ1 = Γ \ Γ0 it is a Neumann condition. This is a simplified model for an important situation in mechanics, where an elastic material is fixed on a part Γ0 of its boundary, while on the complementary a surface density of force g is present. Let us introduce the functional space V := {v ∈ H 1 (Ω) : γ0 (v) = 0 on Γ0 }.
(6.47)
The space V , as a subspace of H 1 (Ω), is equipped with the scalar product of H 1 (Ω) and the corresponding norm 1/2 2 2 (v + |∇v| ) d x . vV = Ω
It follows immediately from the continuity of the trace operator γ0 : H 1 (Ω) → L2 (Γ ) that V is a closed subspace of H 1 (Ω). Note that γ0 (v) = 0 in Γ0 has to be understood in the following sense: γ0 (v)(x) = 0 for a.e. x with respect to the measure H N −1 |Γ0 . Hence V is a Hilbert space. We can now state the variational formulation of the Dirichlet–Neumann problem. Theorem 6.3.1. (i) Let f ∈ L2 (Ω) and g ∈ L2 (Γ1 ) be given functions. Let us assume moreover that H N −1 (Γ0 ) > 0. Then, there exists a unique solution u of the problem ⎧ ⎨ ∇u · ∇v d x = f v dx + g v d σ ∀v ∈ V , (6.48) Ω Ω Γ1 ⎩ u ∈ V. Equivalently, u is the unique solution of the minimization problem 1 2 |∇v| d x − f v dx − g v dσ : v ∈ V . min 2 Ω Ω Γ1 (ii) Assume that Γ0 is sufficiently regular to have the property 4 5 ¯ v = 0 on Γ v|Γ1 : v ∈ (Ω), is dense in L2 (Γ1 ). 0 Then the solution u of the variational problem Dirichlet–Neumann boundary value problem: ⎧ ⎨ −Δu = f u =0 ⎩ ∂u =g ∂n
(6.49)
(6.50)
(6.48) is a weak solution of the following on Ω, on Γ0 , on Γ1 .
(6.51)
i
i i
i
i
i
i
238
book 2014/8/6 page 238 i
Chapter 6. Variational problems: Some classical examples
PROOF. (i) Clearly, the bilinear form a(u, v) =
Ω
∇u · ∇v d x
is continuous on V ×V . The coercivity of a is a consequence of the generalized Poincaré inequality (Theorem 5.4.3): we have seen that V is a closed subspace of H 1 (Ω). On the other hand, the assumption H N −1 (Γ0 ) > 0 implies that the only constant function belonging to V is the zero function. Hence, there exists some C > 0 such that ∀v ∈ V
vL2 (Ω) ≤ C
1/2 Ω
|∇v|2 d x
.
This immediately implies that ∀v ∈ V
a(v, v) ≥
1 1+C2
v2H 1 (Ω) .
Let us now consider the linear form l : V → R defined by l (v) = f v dx + g v d σ. Ω
Γ1
The continuity of the trace operator γ0 : H 1 (Ω) → L2 (Γ ) (Theorem 5.6.1) clearly implies that l is continuous. (ii) Let us now suppose that the solution u of the variational problem (6.48) is smooth, and let us prove that u is a classical solution of (6.51). Let us first take v ∈ (Ω) (note that (Ω) ⊂ V ). We obtain ∇u · ∇v d x = f v d x ∀v ∈ (Ω), Ω
Ω
which classically implies −Δu = f on Ω.
(6.52)
¯ such that v = 0 on Γ . Since v ∈ V , we have, by (6.48), Let us now take v ∈ (Ω) 0 ¯ v = 0 on Γ . ∇u · ∇v d x = f v dx + g v d σ ∀v ∈ (Ω), 0 Ω
Ω
Γ1
After integration by parts, and by using (6.52), we obtain Γ1
∂u ∂n
− g v dσ = 0
¯ v = 0 on Γ . ∀v ∈ (Ω), 0
(6.53)
From the regularity assumption (6.50) on Γ0 and (6.53) we infer that ∂u ∂n
= g on Γ1 ,
which completes the proof.
i
i i
i
i
i
i
6.3. Mixed Dirichlet–Neumann problems
book 2014/8/6 page 239 i
239
Remark 6.3.1. (1) Conversely, when proving that a classical solution of (6.51) is a variational solution of (6.48), one needs to make the following assumption: is dense in V for the H 1 (Ω) norm, where ¯ : v = 0 on Γ }. = {v ∈ (Ω) 0
(6.54)
(2) The relationship between the classical and the variational solution for the Dirichlet–Neumann problem is quite delicate. Beside the classical regularity assumptions on Ω and its boundary Γ = ∂ Ω, it requires extra regularity assumptions on the portion Γ0 of Γ , where the Dirichlet condition is imposed. Otherwise we could take some Γ0 , with H N −1 (Γ0 ) > 0 which is dense in Γ ; then, u regular and u = 0 on Γ0 would force u to be equal to zero everywhere on Γ , which makes the above argumentation no longer valid.
6.3.2 Mixed Dirichlet–Neumann boundary conditions We still consider an open bounded regular set Ω in RN . Let us give some positive measurable function a0 : ∂ Ω → R+ such that there exists a positive real number α > 0 with a0 (x) ≥ α
with respect to H N −1 |∂ Ω.
for a.e. x ∈ ∂ Ω
(6.55)
Theorem 6.3.2. Let f ∈ L2 (Ω) and g ∈ L2 (∂ Ω) be given functions and assume (6.55). (i) Then, there exists a unique solution u of the following system: ⎧ ⎨ ∇u · ∇v d x + a0 uv d σ = f v dx + g v dσ Ω ∂ Ω Ω ∂ Ω ⎩ u ∈ H 1 (Ω).
∀v ∈ H 1 (Ω).
(6.56) Equivalently, u is the unique solution of the minimization problem 1 1 2 2 1 min |∇v| d x + a v dσ − f v dx − g v d σ : v ∈ H (Ω) . 2 Ω 2 ∂Ω 0 Ω ∂Ω (6.57) (ii) The solution u of (6.56) is a weak solution of the following boundary value problem: ⎧ on Ω, ⎨ −Δu = f ∂u (6.58) ⎩ a0 u + = g on ∂ Ω. ∂n PROOF. (i) The only point which is not standard is to verify that the bilinear form a0 uv d σ a(u, v) = ∇u · ∇v d x + Ω
∂Ω
is coercive on H (Ω). By assumption (6.55) we have 1 2 ∀v ∈ H (Ω) a(v, v) ≥ |∇v| d x + α 1
Ω
∂Ω
v 2 d σ.
Let us apply the generalized Poincaré inequality (Theorem 5.4.3) to the space 1 V = v ∈ H (Ω) : v(x) d σ(x) = 0 ,
(6.59)
(6.60)
∂Ω
i
i i
i
i
i
i
240
book 2014/8/6 page 240 i
Chapter 6. Variational problems: Some classical examples
where, as usual, for simplicity of notation, we write v instead of γ0 (v). By using again the continuity of the trace operator γ0 : H 1 (Ω) → L2 (∂ Ω), we have that V is a closed subspace of H 1 (Ω). Moreover, if v is a constant function which belongs to V , say, v ≡ C , we necessarily have C H N −1 (∂ Ω) = 0 which forces C to be equal to zero. Let us finally observe that ∀v ∈ H 1 (Ω)
v − ∂ Ω (v) ∈ V ,
where ∂ Ω (v) := |∂1Ω| ∂ Ω v(x) d σ(x). Thus, by applying the generalized Poincaré inequality to the space V , we obtain the existence of a positive constant C such that ∀v ∈ H 1 (Ω)
v − ∂ Ω (v)L2 (Ω) ≤ C
Ω
1/2 |∇v|2 d x
.
(6.61)
From (6.61) we easily obtain the inequality ∀v ∈ H 1 (Ω) v(x)2 d x ≤ 2C 2 |∇v|2 d x + 2|Ω|∂ Ω (v)2 . Ω
(6.62)
Ω
On the other hand, by using the Cauchy–Schwarz inequality in L2 (∂ Ω), we obtain ∂ Ω (v) =
1 |∂ Ω|
∂Ω
v(x) d σ(x) ≤
1
|∂ Ω|1/2
1/2 ∂Ω
v(x)2 d σ(x)
.
(6.63)
Combining (6.62) and (6.63), we obtain |Ω| ∀v ∈ H 1 (Ω) v(x)2 d x ≤ 2C 2 |∇v|2 d x + 2 v(x)2 d σ(x). |∂ Ω| Ω Ω ∂Ω
(6.64)
It is an elementary computation to obtain, by using (6.59) and (6.64), the inequality ∀v ∈ H 1 (Ω) with γ=
a(v, v) ≥ γ v2H 1 (Ω) min{α, 1} |Ω|
1 + 2 max{C 2 , |∂ Ω| N }
(6.65)
.
N −1
¯ in (ii) Let us assume that the solution u of (6.56) is regular. Then, by taking v ∈ (Ω) (6.56) and integrating by parts, we obtain ∂u ¯ − g v d σ = 0 ∀v ∈ (Ω). a0 u + (−Δu − f )d x + ∂n Ω ∂Ω An analysis similar to the one used in the case of the Neumann problem gives that u satisfies (6.58). Remark 6.3.2. It is interesting to notice that the boundary condition a0 u + ∂∂ nu = g contains, at least formally, all the previous boundary conditions that we have examined. (i) Take a0 = 0; then one obtains the Neumann boundary condition that in this case, the coercivity property is lost.
∂u ∂n
= g . Note
i
i i
i
i
i
i
6.4. Heterogeneous media: Transmission conditions
book 2014/8/6 page 241 i
241
(ii) Take a0 ≡ +∞; then formally one obtains the Dirichlet boundary condition, u = 0 on ∂ Ω. (iii) Take a0 = 0 on Γ1 and a0 = +∞ on Γ0 = Γ \ Γ0 ; then one obtains the Dirichlet– Neumann boundary condition. Indeed, it is a good exercise to justify these formal results. For example, take a0 (x) ≡ n in (6.56) and prove that the corresponding sequence (un ) norm converges to the solution u of the Dirichlet problem.
6.4 Heterogeneous media: Transmission conditions Let us consider the following model situation coming from electrostatics. In the open set Ω of RN we have two distinct materials with respective conductivity coefficients α and β; take 0 < α < β < +∞, for example. The subsets of Ω occupied by the two materials are denoted, respectively, by Ωα and Ωβ . We assume that Ωα and Ωβ are open sets with a common boundary denoted by Σ which is a C1 manifold and Ω = Ωα ∪ Ωβ ∪ Σ. In this model, the conductivity coefficient a : Ω → R takes only two values, a ≡ α on Ωα and a ≡ β on Ωβ : α if x ∈ Ωα , a(x) = β if x ∈ Ωβ . Note that a is not continuous (it is discontinuous through Σ). It belongs to L∞ (Ω). Indeed, in the following developments, a(·) plays a role only as a function defined a.e. with respect to the Lebesgue measure. Since Σ has zero Lebesgue measure, we don’t need to define a on Σ. For each x ∈ Σ, n(x) is the unit normal vector to Σ at x which is directed outward with respect to Ωα (and inward with respect to Ωβ ). For any (electrostatic) potential function v : Ω → R the corresponding stored internal energy is Φ(v) =
a(x)|∇v(x)|2 d x. Ω
Suppose that the conductor is connected to the earth on its boundary. For a given ¯ → R solves the density of charge f : Ω → R the equilibrium potential function u : Ω minimization problem 1 2 a(x)|∇v(x)| d x − f (x)v(x) d x : v = 0 on ∂ Ω . min 2 Ω Ω The variational approach to the above problem, and the corresponding transmission conditions through the interface Σ satisfied by the solution u, are described in the following statement. Theorem 6.4.1. (a) For every f ∈ L2 (Ω) there exists a unique u ∈ H01 (Ω) solution of the following minimization problem: 1 2 1 min a(x)|∇v(x)| d x − f (x)v(x) d x : v ∈ H0 (Ω) . (6.66) 2 Ω Ω
i
i i
i
i
i
i
242
book 2014/8/6 page 242 i
Chapter 6. Variational problems: Some classical examples
(b) Equivalently, the solution u of (6.66) verifies ⎧ ⎨ a(x)∇u(x) · ∇v(x) d x = f (x)v(x) d x Ω ⎩ uΩ∈ H 1 (Ω).
∀v ∈ H01 (Ω),
(6.67)
0
(c) Equivalently, the solution u of (6.66) is a weak solution of the boundary value problem − div(a(x)∇u(x)) = f on Ω, (6.68) u = 0 on ∂ Ω, in the following sense: u ∈ H01 (Ω), the first equation is satisfied in the distribution sense, while u = 0 is satisfied in the trace sense. (d) Let us denote uα = u|Ωα and uβ = u|Ωβ , and let us assume that the variational (weak) ¯ ) and u ∈ C2 (Ω ¯ ). Then u is solution of (6.66) is regular in the following sense: u ∈ C2 (Ω α
α
a classical solution of the following transmission problem: ⎧ −αΔuα = f on Ωα , ⎪ ⎪ ⎪ ⎪ −βΔu = f on Ωβ , ⎪ β ⎨ uα = uβ on Σ, ⎪ ∂ uβ ∂ uα ⎪ ⎪ = β ∂ n on Σ, α ⎪ ⎪ ⎩ ∂n u =0 on ∂ Ω.
β
β
(6.69)
In particular, u is continuous through Σ, but ∂∂ nu is discontinuous through Σ. ¯ ), u ∈ C2 (Ω ¯ ), and u satisfies (e) Conversely, if u is a classical solution, i.e., uα ∈ C2 (Ω α β β (6.69), then it is equal to the variational (weak) solution of (6.66). PROOF. (a) The argument is quite similar to the one developed in the variational approach to the Dirichlet problem. The functional J : H01 (Ω) → R defined by 1 2 J (v) := a(x)|∇v(x)| d x − f v dx 2 Ω Ω is convex and continuous on H01 (Ω). The coercivity of J follows from the inequality J (v) ≥
min(α, β) 2
Ω
|∇v|2 d x − f L2 vL2
and the Poincaré inequality on H01 (Ω). The conditions of the convex minimization theorem, Theorem 3.3.4, are fulfilled. Hence, there exists a solution u to problem (6.66). It is unique because J is strictly convex. Indeed, this follows from the strict convexity of Φ(v) = a(x)|∇v|2 d x, which is a positive definite quadratic form on H01 (Ω) (see Proposition 2.3.4). (b) The equivalence between (6.66) and (6.67) is a direct consequence of Proposition 2.3.1 and the fact that the bilinear form b (u, v) := a(x)∇u(x) · ∇v(x) d x Ω
is symmetric and positive. Indeed, we could as well solve the variational problem by applying the Lax–Milgram theorem, Theorem 3.1.2, to the formulation (6.67).
i
i i
i
i
i
i
6.4. Heterogeneous media: Transmission conditions
book 2014/8/6 page 243 i
243
(c) It follows from the density of (Ω) in H01 (Ω) that it is equivalent in the variational formulation (6.67) to take only v belonging to (Ω). One then obtains the equivalent formulation (6.68) just by using the notion of derivation in the distribution sense. (d) Let us now come to the point which deserves some particular attention, namely, the interpretation of the distribution formula − div(a(x)∇u(x)) = f on Ω. As we just pointed out, all the information we have on u is contained in this distribution formula and the fact that u ∈ H01 (Ω). Equivalently, we have ⎧ ⎨ a(x)∇u(x) · ∇v(x) d x − f (x)v(x) d x = 0 ∀v ∈ (Ω), Ω Ω ⎩ u ∈ H 1 (Ω).
(6.70)
0
Let us particularize the test function v to test successively u on Ωα , Ωβ and then on Σ. ¯ ) and u ∈ C2 (Ω ¯ ). We assume that uα ∈ C2 (Ω α β β (1) Let us first take v ∈ (Ωα ). From (6.70) and the fact that a ≡ α on Ωα , we obtain that uα = u|Ωα satisfies α This yields
Ωα
∇uα · ∇v d x −
Ωα
f v d x = 0 ∀v ∈ (Ωα ).
−αΔuα = f
on Ωα .
(6.71)
(2) Similarly, by taking test functions v ∈ (Ωβ ) one obtains −βΔuβ = f
on Ωβ .
(6.72)
(3) Let us now analyze the transmission conditions through Σ which are satisfied by u. Let us first observe that uα = uβ on Σ. Indeed, since u ∈ H01 (Ω) there exists some approximating sequence (un ) u∈N , un ∈ (Ω) for each n ∈ N such that un → u in H 1 (Ω) as n → +∞. Let us identify un and its extension by zero outside of Ω. By definition of ¯ ) and (Ω ¯ ) we have that for every n ∈ N,
(Ω α β ¯ ) un |Ωα ∈ (Ω α Moreover,
¯ ). un |Ωβ ∈ (Ω β
and
un |Ωα → uα
in H 1 (Ωα ),
un |Ωβ → uβ
in H 1 (Ωβ ).
By using the continuity of the trace operator γ0,α : H 1 (Ωα ) → L2 (Σ) and the fact that for ¯ ) the trace coincides with the restriction, we obtain that functions in (Ω α un |Σ → γ0,α (uα ) in L2 (Σ). Similarly, we have
un |Σ → γ0,β (uβ )
in L2 (Σ).
i
i i
i
i
i
i
244
book 2014/8/6 page 244 i
Chapter 6. Variational problems: Some classical examples
Hence γ0,α (uα ) = γ0,β (uβ ), that is, the traces of u from both sides of Σ are the same. Since ¯ ) and C2 (Ω ¯ ), these traces coincide with the respective u has been assumed to be C2 (Ω α β values of uα and uβ on Σ, that is, uα = uβ
on Σ,
(6.73)
which makes the function u continuous on Ω. Let us now take a general test function v ∈ (Ω) and rewrite (6.70) as ∇uα · ∇v d x + β ∇uβ · ∇v d x − f v dx − f v d x = 0. α Ωα
Ωβ
Ωα
(6.74)
Ωβ
Integrating by parts gives ∂ uα d σ. ∇uα · ∇v d x = − (αΔuα )v d x + α v α ∂n Ωα Ωα Σ
(6.75)
Similarly (note that the outward normal to Ωβ on Σ is now the opposite vector −n), we have ∂u β β d σ. (6.76) ∇uβ · ∇v d x = − (βΔuβ )v d x − β v ∂ n Ωβ Ωβ Σ Combining (6.74), (6.75), and (6.76) we obtain ∂ uβ ∂ uα (αΔuα + f )v d x − (βΔuβ + f )v d x + α − −β v d σ = 0. ∂n ∂n Ωα Ωβ Σ By using (6.71) and (6.72), we finally obtain ∂ uβ ∂ uα α −β v(x) d σ(x) = 0, ∀v ∈ (Ω) ∂n ∂n Σ which implies ∂ uα
∂ uβ
on Σ. (6.78) ∂n ∂n Let us finally notice that the condition u = 0 on ∂ Ω follows from the fact that u ∈ H01 (Ω) and the regularity assumptions on ∂ Ω and u. (e) To pass from the boundary value problem (6.69) to (6.67) we proceed in a similar way, just making the integration by parts in the reverse way. The only point which requires some attention is the proof of the property u ∈ H 1 (Ω). Indeed, this is a consequence of the following lemma of independent interest. α
=β
(6.77)
Lemma 6.4.1. Let uα ∈ H 1 (Ωα ) and uβ ∈ H 1 (Ωβ ), Ωα and Ωβ being two disjoint open sets with a common interface Σ. We define uα on Ωα , u= uβ on Ωβ . Then the following derivation rule holds: ˜ + χ ∇u ˜ + [u] n d σ, ∇u = χΩα ∇u α Ωβ β Σ
i
i i
i
i
i
i
6.4. Heterogeneous media: Transmission conditions
book 2014/8/6 page 245 i
245
˜ , where [u]Σ := γ0,β (uβ ) − γ0,α (uα ) is the jump of u through Σ, d σ = N −1 Σ, and ∇u α ˜ denote the extensions by zero of ∇u and ∇u , respectively, on Ω \ Ω and Ω \ Ω . ∇u β
α
PROOF. Take 1 ≤ i ≤ N and compute
∂u ∂ xi
β
∂u ∂ xi
:= ( (Ω), (Ω))
β
in (Ω):
,ϕ
α
−
Ωα
uα
∂ϕ ∂ xi
dx −
Ωβ
uβ
∂ϕ ∂ xi
d x.
Let us integrate by parts by using the Green’s formula (Proposition 5.6.2): ∂u ∂u ∂ uα β ,ϕ = ϕ d x − γ0,α (uα )ϕni d σ + ϕdx ∂ xi ∂ x ∂ x Ωα Σ Ωβ i i ( , ) + γ0,β (uβ )ϕni d σ Σ ⎛ ⎞ ∂F uβ ∂F uα χ + χ ⎠ϕ d x = ⎝ ∂ x i Ωα ∂ x i Ωβ Ω * + + γ0,β (uβ ) − γ0,α (uα ) ni ϕ d σ. Σ
Together with the equality ∂u ∂ xi
=
∂F uα ∂ xi
χΩα +
∂F uβ ∂ xi
χΩβ + [u]Σ ni H N −1 |Σ,
this ends the proof. Remark 6.4.1. (1) We stress the fact that the transmission law through Σ is contained in the formula − div(a(x)∇u(x)) = f on Ω, which holds in the distribution sense. Let us first notice that this formula makes sense. We have u ∈ H01 (Ω), a ∈ L∞ (Ω). Hence, for each i ∈ N, a(x) ∂∂ xu ∈ L2 (Ω), which defines a distribution on Ω and
∂ ∂ xi
i
(a(x) ∂∂ xu ), is well defined as a distribution. The point is that i
we have to treat a(x) ∂∂ xu as a block; we cannot derive this product by using the classical i
derivation rule, because a(·) ∈ L∞ (Ω) and ∂∂ xa is a measure which is singular with respect i to the Lebesgue measure. This fact reveals some of the strengths and weaknesses of the distribution theory. It is a powerful tool which allows us to formulate in a quite simple and unifying way a lot of phenomena. On the counterpart, the physical laws are often implicit and hidden in the distribution formulation. (2) Taking Ω = (0, 1), Ωα = ]0, c[, Ωβ = ]c, 1[, and Σ = {c}, the profile of the solution of the corresponding transmission problem corresponds to the classical Descartes law in geometrical optics, with the equality β
d+u dx
(c) = α
d−u dx
(c).
i
i i
i
i
i
i
246
book 2014/8/6 page 246 i
Chapter 6. Variational problems: Some classical examples
6.5 Linear elliptic operators Let Ω be an open subset of RN . Let us give a family {ai j (·) : 1 ≤ i, j ≤ n} of functions belonging to L∞ (Ω) and which satisfies the following condition: there exists some positive real number α > 0 such that N
∀ξ ∈ RN
i , j =1
ai j (x)ξi ξ j ≥ α|ξ |2
a.e. on Ω.
(6.79)
Let us introduce the corresponding bilinear form a : H 1 (Ω) × H 1 (Ω) → R, ∀u, v ∈ H 1 (Ω)
a(u, v) :=
N Ω i , j =1
ai j (x)
∂u ∂v ∂ xi ∂ x j
d x.
(6.80)
Property (6.79) implies that the bilinear form enjoys a so-called ellipticity condition: ∀v ∈ H 1 (Ω)
a(v, v) =
N Ω i , j =1
ai j (x)
≥α
∂v ∂v ∂ xi ∂ x j
dx (6.81)
2
Ω
|∇v(x)| d x.
By using (6.81) one can prove existence and uniqueness results for variational problems involving the bilinear form a. The proof is similar to the ones of the previous sections. Note that the case of the Dirichlet integral and of the corresponding Laplace equation corresponds to the particular situation ai j (x) = δi j . Since we are now familiar with a number of boundary value problems (Dirichlet, Neumann, mixed Dirichlet–Neumann), let us give in the present situation a unified approach to these problems by taking as a functional space a closed subspace V of H 1 (Ω) such that H01 (Ω) ⊂ V ⊂ H 1 (Ω). Theorem 6.5.1. Let us assume that Ω is a bounded open set in RN which is regular (piecewise of class C1 ) and connected. Let V be a closed subspace of H 1 (Ω) which satisfies conditions (a) H01 (Ω) ⊂ V ⊂ H 1 (Ω), (b) v ∈ V , v ≡ constant =⇒ v ≡ 0. On the other hand, let us give a family (ai j )i , j =1,...,N of L∞ (Ω) functions which satisfies the ellipticity condition (6.79). (i) Then, for any f ∈ L2 (Ω) and any g ∈ L2 (∂ Ω), there exists a unique solution u ∈ V of the problem ⎧ N ∂u ∂v ⎪ ⎨ ai j (x) dx = f v dx + g v d σ ∀v ∈ V , (6.82) ∂ xi ∂ x j Ω Ω ∂ Ω i , j =1 ⎪ ⎩ u ∈ V. (ii) The solution u of (6.82) satisfies −
N ∂ i , j =1
∂ xj
ai j (x)
∂u ∂ xi
=f
on Ω
in the distribution sense.
i
i i
i
i
i
i
6.5. Linear elliptic operators
book 2014/8/6 page 247 i
247
(iii) When the matrix (ai j (x))1≤i , j ≤N is symmetric, (6.82) is equivalent to saying that u is the unique solution of the minimization problem N 1 ∂v ∂v min a (x) dx − f v dx − g v dσ : v ∈ V . 2 Ω i , j =1 i j ∂ xi ∂ x j Ω ∂Ω PROOF. (i) To apply the Lax–Milgram theorem to problem (6.82), we are going to work in the space V which is considered as a subspace of H 1 (Ω) and is equipped with the scalar product of H 1 (Ω): 〈u, v〉V =
Ω
(uv + ∇u · ∇v) d x.
Since V is a closed subspace of H 1 (Ω), it is a Hilbert space. Problem (6.82) can be written find u ∈ V such that a(u, v) = l (v) ∀v ∈ V , where a is given by (6.80) and l (v) = Ω f v d x + ∂ Ω g v d σ. Let us first verify that the bilinear form a : V × V → R is continuous on V . Set M := sup1≤i , j ≤N ai j L∞ (Ω) . Then, for any u, v ∈ V , N ∂ u ∂ v |a(u, v)| ≤ M dx Ω i , j =1 ∂ xi ∂ x j N |∇u| |∇v| d x ≤M Ω 1/2 1/2 2 2 |∇u| d x |∇v| d x ≤ MN Ω
(6.83)
Ω
≤ M N uV vV , which implies that a is continuous. The continuity of l follows from the inequality |l (v)| ≤ f L2 (Ω) vL2 (Ω) + g L2 (∂ Ω) γ0 (v)L2 (∂ Ω) and the continuity of the trace operator from H 1 (Ω) into L2 (Σ). Let us now verify the crucial point, that is, the coercivity of a : V × V → R. By (6.81), since V ⊂ H 1 (Ω) we have ∀v ∈ V a(v, v) ≥ α∇v2L2 (Ω)N . (6.84) We now use the generalized Poincaré inequality (Theorem 5.4.3): the assumptions (a) and (b) on V imply the existence of a positive constant C such that ∀v ∈ V
vL2 (Ω) ≤ C ∇vL2 (Ω)N .
(6.85)
As a consequence of (6.84) and (6.85) we obtain that ∀v ∈ V
a(v, v) ≥
α 1+C2
vV2
and a is coercive on V .
i
i i
i
i
i
i
248
book 2014/8/6 page 248 i
Chapter 6. Variational problems: Some classical examples
Part (ii) is a direct consequence of (6.82) and of the fact that V contains (Ω). Part (iii) is an equivalent formulation to (6.82) when the matrix (ai j )i , j is symmetric. Indeed, in that case the bilinear form a : V × V → R is symmetric, and the equivalence follows from Proposition 2.3.1. As a particular case of Theorem 6.5.1, let us consider the following situation: take V = {v ∈ H 1 (Ω) : γ0 (v) = 0 on Γ0 }, where Γ0 ⊂ Γ = ∂ Ω is a measurable subset of the boundary Γ with a strictly positive surface measure: H N −1 (Γ0 ) > 0. We set Γ1 = Γ \ Γ0 . We know (see Section 6.3) that V satisfies all the assumptions of Theorem 6.5.1. The question we are going to examine is the interpretation of the boundary conditions satisfied by the solution u of (6.82). Proposition 6.5.1. When V = {v ∈ H 1 (Ω) : γ0 (v) = 0 on Γ0 }, where H N −1 (Γ0 ) > 0, the unique solution u of the problem ⎧ N ∂u ∂v ⎪ ⎨ ai j (x) dx = f vd x + g v d σ ∀v ∈ V , (6.86) ∂ xi ∂ x j Ω i , j =1 Ω ∂Ω ⎪ ⎩ u ∈ V, is a weak solution of the following boundary value problem: ⎧ N ⎪ ∂ ∂u ⎪ ⎪ a = f on Ω, ⎪ ⎨ − ∂ x j i j ∂ xi i , j =1 ⎪ u = 0 on Γ0 , ⎪ ⎪ ⎪ ⎩ ∂ u = g on Γ , 1 ∂ν
(6.87)
A
where
∂u ∂ νA
ator A : v
N
a ∂un i , j =1 i j ∂ xi j → −Σ ∂∂x (ai j ∂∂ xv ). j i :=
is called the conormal derivative of u associated with the oper-
PROOF. To find the boundary condition satisfied by u on Γ1 let us make the following ¯ for any 1 ≤ i, j ≤ N . regularity assumptions: u ∈ H 2 (Ω) and ai j ∈ C1 (Ω) Then, for any i = 1, 2, . . . , N we have ξ j :=
N i =1
ai j
∂u ∂ xi
∈ H 1 (Ω).
Let us write the Green’s formula (Proposition 5.6.2), N Ω j =1
ξj
∂v ∂ xj
dx =−
v Ω
N ∂ξ j j =1
∂ xj
dx +
v Γ1
N j =1
ξ j n j d σ.
(6.88)
From (6.86), (6.87), and (6.88) we infer that ∀v ∈ V
N Γ1
j =1
ξj nj v dσ =
Γ1
g v d σ,
i
i i
i
i
i
i
6.6. The linearized elasticity system
book 2014/8/6 page 249 i
249
which implies N j =1
that is,
∂u ∂ νA
ξj nj = g ,
= g on Γ1 .
Remark 6.5.1. When the above regularity properties on ai j and u are not satisfied, the
formula ∂∂ νu = g on Γ1 is just a formal way to express the boundary conditions implicitly A contained in (6.86).
6.6 The linearized elasticity system In this section, we are concerned with the study of the deformation of an N -dimensional elastic body (N = 2 or 3), occupying a domain Ω of RN , clamped on a part of its boundary, and subjected to a vector field of applied forces. Under the hypothesis that the body undergoes small deformations, we will see that the system of equations that models the equilibrium of the body furnishes a special case of elliptic problem. We first specify the notation. The physical space is identified with RN (N = 2 or 3) equipped with the canonical basis (e1 , . . . , eN ) and the standard Euclidean inner product. We denote by Ω ⊂ RN an open bounded and connected set with smooth boundary Γ in the sense that the theory of traces can apply (for instance, piecewise of class C1 ). In the elasticity framework, Ω is referred to as the interior reference configuration of the body. We denote by Γ0 the measurable subset of Γ where the body is clamped and set Γ1 = Γ \ Γ0 (the free boundary). We denote by MN and MNs the spaces of N × N and N × N symmetric matrices, respectively, equipped with the Hilbert–Schmidt inner product: for A = (ai j ) and B = (bi j ), A : B := N a b . The standard Euclidean scalar product of two vectors u i , j =1 i j i j
and v in RN is denoted by u.v. We use the same notation |.| for the associated norms in MN and RN . The gradient of a vector field v = (vi )i =1,...,N : Ω → RN in Ł1l oc (Ω)N is the distribution matrix field ∇v : Ω → MN whose entries are
∂ vi ∂ xj
, where i is the row index.
The linearized strain tensor field associated with an arbitrary vector field v in L1l oc (Ω)N ∂ v ∂v is the distribution matrix field / (v) : Ω → MNs given by /i j (v) = 12 ∂ xi + ∂ x j . The j
i
divergence of a matrix field M : Ω → MN in L1l oc (Ω, MN ) is the distribution vector field ∂M div(M ) : Ω → R3 given by div(M ) = ( Nj=1 ∂ xi, j )i . j
We are given two vector fields, f : Ω → RN and g : Γ1 → RN , which represent the applied forces densities. For instance, in the case when N = 3, if the body of mass density ρ : Ω → R is subjected to gravity and to a constant pressure on Γ1 , then the densities f and g are given by f (x) = −g r ρ(x)e3 and g (x) = −πν(x), where g r is the constant of gravity, π the constant pressure, and ν the unit outer normal to Γ1 . The configuration occupied by the body when it is subjected to forces with density f ¯ → R3 whose image and g is defined by means of the deformation, i.e., the mapping Φ : Ω is the deformed configuration of the body. It is convenient to describe the deformed configuration in terms of the displacement vector field u = Φ − IR3 . Under the hypothesis that |∇u(.)| = o(1) in Ω, and when the body is made up of an isotropic and homogeneous material, physical and mechanical considerations together with approximating theory lead
i
i i
i
i
i
i
250
book 2014/8/6 page 250 i
Chapter 6. Variational problems: Some classical examples
to the (formal) equations of equilibrium in the reference configuration Ω, ⎧ −div(σ(u)) = f in Ω; ⎪ ⎪ ⎨ u = 0 on Γ0 ; ⎪ σ(u)ν = g on Γ1 ; ⎪ ⎩ σ(u) = λ trace (/ (u))IR3 + 2μ / (u),
(6.89)
or, equivalently, by using the components, ⎧ N ∂ ⎪ ⎪ − σ (u) = fi in Ω for i = 1, . . . , N ; ⎪ ⎪ ⎪ ∂ xj i j ⎪ j =1 ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ ui = 0 on Γ0 for i = 1, . . . , N ; N σi j (u)ν j = gi on Γ1 for i = 1, . . . , N ; ⎪ ⎪ ⎪ ⎪ j =1 ⎪ ⎪ N ⎪ ⎪ ⎪ ⎪ ⎪ σ (u) = λ / (u) δi , j + 2μ/i j (u). ⎩ ij kk k=1
For a complete explanation of how to derive the system (6.89) from the general theory of elasticity, we refer the reader to [174]. The matrix field σ is called the Cauchy stress tensor. The last equation, called Hooke’s law, which approximates the response of the material to external stimuli, is specific to each material. The coefficients λ > 0 and μ > 0, called the Lamé constants, are determined experimentally and are often expressed in terms of the Poisson coefficients νP and Young’s modulus EY through the relations νP =
λ 2(λ + μ)
,
EY =
μ(3λ + 2μ) λ+μ
.
In the general setting of elasticity, the relation between the stress and the strain tensors, of which Hooke’s law is a particular case, is called the stress-strain constitutive equations of the body. The boundary value problem (6.89) is said to be a pure displacement problem when Γ0 = Γ , a pure traction problem when Γ1 = Γ , and a displacement-traction problem when N −1 (Γ0 ) > 0 and N −1 (Γ1 ) > 0. We are going to provide a mathematical setting for problem (6.89). For this we first extend the Green’s formula (Proposition 5.6.2) to the vectorial setting. In what follows, to shorten the notation, we do not indicate the trace operator in the integrals. According to Proposition 5.6.2, and reasoning with the components, for all σ in H 1 (Ω, M sN ), and for all v in H 1 (Ω)N , we have N Ω
i , j =1
σi j
∂ vi ∂ xj
dx = −
N
∂ Ω ∂ xj
i , j =1
σ i j vi d x +
N i,j
Γ
σi j vi ν j d N −1
(6.90)
σi j v j νi d N −1 ,
(6.91)
and N i , j =1
Ω
σi j
∂ vi ∂ xj
dx = −
N i , j =1
∂
Ω ∂ xi
σi j v j d x +
N i,j
Γ
where ν = (νi )i =1,...,N is the outer unit normal to Γ . According to the fact that σ is a symmetric vector field, we have
i
i i
i
i
i
i
6.6. The linearized elasticity system N i,j
Γ
σ i j vi ν j d
N i , j =1
Ω
251 N −1
=
N i,j
∂ ∂ xj
book 2014/8/6 page 251 i
σ i j vi d x =
N
Γ
Ω
i , j =1
σ i j v j νi d ∂ ∂ xi
N −1
=
σi j v j d x = 1 ∂ vi 2 ∂ xj
Hence, summing (6.90) and (6.91), and using /i j (v) =
Ω
Γ
σν.v d N −1 ;
v. div σ d x.
∂v + ∂ x j , we obtain the followi
ing Green’s formula: σ : / (v) d x = − v. div σ d x + σν.v d N −1 . Ω
Ω
(6.92)
Γ
Let us assume that f ∈ L2 (Ω)N , g ∈ Ł2 (Γ1 )N , and that u ∈ H 2 (Ω)N . Let v ∈ H 1 (Ω)N , satisfying γ0 (v) = 0 on Γ0 . Multiplying the first equation in (6.89) by v with respect to the Euclidean scalar product of RN , integrating over Ω, and using Green’s formula (6.92) with σ = σ(u) together with the two boundary conditions in (6.89), we deduce σ(u) : / (v) d x = f .v d x + g .v d N −1 . (6.93) Ω
Ω
Γ1
Then, taking into account the constitutive equation, and noticing that trace(/ (.)) = div(.), we derive the variational formulation of (6.89) λ div(u).div(v) d x + 2μ / (u) : / (v) d x = f .v d x + g .v d N −1 , (6.94) Ω
Ω
Ω
Γ1
which makes sense under the weaker condition that u ∈ H 1 (Ω)N with γ0 (u) on Γ0 , and for all v ∈ H 1 (Ω)N , with γ0 (u) = γ0 (v) = 0 on Γ0 . This leads to the following weak formulation of the displacement-traction problem: assume that N −1 (Γ0 ) > 0; then a vector field u is said to be a weak solution of (6.89) if it belongs to V := {v ∈ H 1 (Ω)N : γ0 (v) = 0 on Γ0 } and satisfies the variational problem f .v d x + g .v d N −1 ∀v ∈ V . λ div(u).div(v) d x + 2μ / (u) : / (v) d x = Ω
Ω
Ω
Γ1
To mimic the proof of Theorem 6.5.1 for establishing the existence of a weak solution, we are led to consider the bilinear form a : V × V → R defined by a(u, v) := σ(u) : / (v) d x = λ div(u).div(v) d x + 2μ / (u) : / (v) d x (6.95) Ω
Ω
Ω
and to apply the Lax–Milgram theorem. The bilinear form a is clearly continuous when V is equipped with the standard norm of H 1 (Ω)N , but, in contrast to the problems studied in the previous sections, the difficulty is to establish the coercivity of a in the space V . Indeed the form a is clearly coercive when V is equipped with the seminorm v → / (v)L2 (Ω,Ms ) (actually we will see that it is a norm), but, according to this seminorm, the boundedness of any sequences (v n )n∈N a priori does not provide information ∂ vn ∂ vn ∂ vn ∂ vn on all the derivatives ∂ xi , but only on ∂ xi and 12 ∂ xi + ∂ xj . Therefore a compactness j
i
j
i
procedure could fail. It is remarkable that v → / (v)L2 (Ω,Ms ) is a norm equivalent to the standard norm in H 1 (Ω)N (then all the derivatives of v n are bounded in L2 (Ω)). This nontrivial result is the consequence of the famous Korn inequalities stated below.
i
i i
i
i
i
i
252
book 2014/8/6 page 252 i
Chapter 6. Variational problems: Some classical examples
In what follows, for any v ∈ H 1 (Ω)N , vH 1 (Ω)N is the Hilbert norm associated with the scalar product 〈u, v〉 := u(x).v(x) d x + ∇u(x) : ∇v(x) d x Ω
Ω
and
/ (v)L2 (Ω,MN ) := s
Ω
1/2 / (v) : / (v) d x
.
From / (v) = 12 (∇v + ∇v T ) we see that / (v)L2 (Ω,MN ) ≤ ∇vL2 (Ω,MN ) . s
Proposition 6.6.1 (Korn’s inequalities). Let Ω be an open bounded and connected set of RN which is piecewise of class C1 . (i) Then there exists a constant C (Ω) > 0 such that for all v ∈ H 1 (Ω)N , ! "1/2 / (v)2L2 (Ω,MN ) + v2L2 (Ω)N ≥ C (Ω)vH 1 (Ω)N .
(6.96)
s
(ii) There exists a constant C (Ω) > 0 such that for all v ∈ V , / (v)L2 (Ω,MN ) ≥ C (Ω)vH 1 (Ω)N . s
(6.97)
In other words, v → / (v)L2 (Ω,MN ) is a norm on V , equivalent to the standard norm s vH 1 (Ω)N . (iii) There exists a constant C (Ω) > 0 such that for all v ∈ H 1 (Ω)N , / (v)L2 (Ω,MN ) ≥ C (Ω) s
inf
w∈ker(/ )
v + wH 1 (Ω)N .
(6.98)
Remark 6.6.1. The infimum in the second member of inequality (6.98) is nothing but the norm of v¯ in the quotient space H 1 (Ω)/ker(/ ). Therefore (6.98) can be rewritten as ¯ H 1 (Ω)/ker(/ ) , where /¯ is well defined by ¯ L2 (Ω,MN )/ker(/ ) ≥ C ”(Ω)v /¯(v) s
¯ := / (v) /¯(v) ¯ for every v ∈ v. For proving Proposition 6.6.1 we need the two lemmas below, each having its own interest. Lemma 6.6.1 (the kernel of / : the rigid displacements). Let Ω be an open bounded and connected set of RN , N ∈ N∗ . Then the kernel of / : (Ω)N → (Ω)N , v → / (v), called the space of infinitesimal rigid displacements of the set Ω, is given by ker(/ ) = {v ∈ (Ω)N : v(x) = a + B x : a ∈ RN , B ∈ MN is antisymmetric}. In the specific cases N = 2, 3, one has case N = 2: ker(/ ) = {v = (v1 , v2 ) : ∃(a1 , a2 , b ) ∈ R3 s.t. v1 (x) = a1 + b x2 , v2 (x) = a2 − b x1 }; case N = 3: ker(/ ) = {v : ∃a ∈ R3 , ∃b ∈ R3 s.t. v(x) = a + b ∧ x}.
i
i i
i
i
i
i
6.6. The linearized elasticity system
book 2014/8/6 page 253 i
253
Consequently, when N = 2, if v ∈ ker(/ ) vanishes at two distinct points, then v = 0, while when N = 3, if v ∈ ker(/ ) vanishes at three distinct and noncollinear points, then v = 0. PROOF. Every function v : x → v(x) = a+B x, where a ∈ RN and B ∈ MN is antisymmetric, clearly belongs to ker(/ ). Conversely, let v ∈ ker(/ ) and consider the antisymmetric part (v) of ∇v defined by i j (v) =
1 ∂ vi 2 ∂ xj
−
∂ vj
∂ xi
.
For all i, j , k in {1, . . . , N } we have ∂ ∂ xk
i j (v) = = =
Hence, since
∂ vi ∂ xj
1
∂ 2 vi
2 ∂ x j ∂ xk ∂ 2 vi 1 2 ∂ x j ∂ xk ∂ ∂ xj
− +
/i k (v) −
∂ 2vj ∂ xi ∂ xk ∂ 2 vk
(6.99)
∂ xi ∂ x j ∂
∂ xi
−
1
∂ 2 vk
2 ∂ xi ∂ x j
+
∂ 2vj
∂ xi ∂ xk
/k j (v) = 0.
∂v
= − ∂ x j , we have i
∂ 2 vi ∂ x j ∂ xk
+
∂ 2vj ∂ xi ∂ xk
= 0.
(6.100)
Comparing (6.99) and (6.100) gives ∂ 2 vi ∂ x j ∂ xk
=
∂ 2vj ∂ xi ∂ xk
= 0.
We infer, since Ω is connected, that for i = 1, . . . , N , vi is an affine function. Hence there exist a ∈ RN and B ∈ MN such that v(x) = a + B x. Since / (v) = 0 is the symmetric part of B, the matrix B is antisymmetric. The end of the proof is standard. Lemma 6.6.2. Let Ω be an open bounded connected set of RN which is piecewise of class C1 , and v ∈ (Ω). Then the following equivalence holds: v ∈ H −1 (Ω) and
∂v ∂ xi
∈ H −1 (Ω) for i = 1, . . . , N ⇐⇒ v ∈ L2 (Ω).
Implication v ∈ L2 (Ω) =⇒ v ∈ H −1 (Ω) and ∂∂ xv ∈ H −1 (Ω) is trivial. Indeed, for all i ϕ ∈ (Ω) we have |〈v, ϕ〉| = vϕ d x ≤ vL2 (Ω) ϕH 1 (Ω) ; 0 Ω ∂v ∂ϕ , ϕ = − v d x ≤ vL2 (Ω) ϕH 1 (Ω) . 0 Ω ∂ xi ∂ xi
i
i i
i
i
i
i
254
book 2014/8/6 page 254 i
Chapter 6. Variational problems: Some classical examples
The converse implication is more involved and was first established by J. L. Lions. The proof can be found in [204, Theorem 3.2]. For an extension relative to H m -Sobolev spaces (m ∈ Z) and to the case when the boundary of Ω is Lipschitz continuous, see [31, Proposition 2.10]. PROOF OF PROPOSITION 6.6.1. The proof proceeds in five steps. Step 1. We claim that the vector space E := {v ∈ L2 (Ω)N : / (v) ∈ L2 (Ω, MNs )} coincides with the vector space H 1 (Ω)N . Let v be any element of E. For each k = 1, . . . , N , and for all j = 1, . . . , N , since vk ∈ L2 (Ω) we have ∂ vk ∈ H −1 (Ω). (6.101) ∂ vj On the other hand, since / (v) belongs to L2 (Ω, MNs ), the elementary identity in (Ω), ∂ ∂ xi yields
∂ ∂ xi
∂ vk
∂ xj ∂v implies that ∂ vk j
∂ vk
∂ xj
=
∂ ∂ xi
/ j k (v) +
∂ ∂ xj
/i k (v) −
∂ ∂ xk
/i j (v),
∈ H −1 (Ω), which, together with (6.101), and according to Lemma 6.6.2, ∈ L2 (Ω). This proves that E ⊂ H 1 (Ω)N and completes the claim since the
converse inclusion is trivial. Step 2. We establish (6.96). The spaces H 1 (Ω)N and E equipped with the norms 1/2 .H 1 (Ω)N and .E := .2L2 (Ω)N + / (.)2 , respectively, are two Hilbert spaces. The
mapping Id : H 1 (Ω)N → E defined by Id (v) = v is then a bijective linear continuous operator between the two Banach spaces H 1 (Ω)N and E. (The fact that Id is surjective comes from Step 1, and the continuity of Id comes from the trivial inequality vE ≤ vH 1 (Ω)N .) Therefore, according to the Banach open mapping theorem (see [137, 361]), Id is open: there exists C > 0 such that for all v ∈ H 1 (Ω)N , vH 1 (Ω)N ≤ C vE . The constant C (Ω) = C −1 is suitable. Step 3. We prove that ker / ∩ V = {0}. Let v ∈ ker / ∩ V . Since 2 (Γ0 ) > 0, Γ0 contains at least two distinct points when N = 2 and three distinct and noncollinear points when N = 3. Since v vanishes on Γ0 , from Lemma 6.6.1 we infer that v = 0. Step 4. We establish (6.97) by proceeding by contradiction. If (6.97) is false, there exists a sequence (vn )n∈N in V such that vn H 1 (Ω)N = 1;
(6.102)
/ (vn )L2 (Ω,MN ) → 0.
(6.103)
s
From the Rellich–Kondrakov compactness theorem, there exists a subsequence of (vn )n∈N (that we do not relabel) which strongly converges to some v in L2 (Ω)N . We deduce, with (6.103), that (vn )n∈N is a Cauchy sequence in E = H 1 (Ω)N equipped with the norm .E ; thus, from (6.96) established in Step 2, (vn )n∈N is also a Cauchy sequence with the norm .H 1 (Ω)N . Since V is a closed subspace of H 1 (Ω)N , vn → v strongly in V , so that, from (6.102), we infer that vH 1 (Ω)N = 1. (6.104)
i
i i
i
i
i
i
6.6. The linearized elasticity system
book 2014/8/6 page 255 i
255
But, from (6.103), / (vn ) → / (v) = 0 in L2 (Ω, MNs ), hence v ∈ ker / ∩ V . From Step 3, we deduce that v = 0, which contradicts (6.104). Step 5. We establish (6.98). We proceed in a similar way by contradiction. If (6.98) is false, there exists a sequence (vn )n∈N in V such that inf
w∈ker(/ )
vn + wH 1 (Ω)N = 1;
/ (vn )L2 (Ω,MN ) → 0.
(6.105) (6.106)
s
Recall that from Lemma 6.6.1, ker(/ ) is a finite dimensional subspace of H 1 (Ω)N and thus possesses a closed orthogonal ker(/ )⊥ . Consider the decomposition vn = un + wn , where un ∈ ker(/ )⊥ and wn ∈ ker(/ ). We have vn − wn ⊥ ker(/ ); wn ∈ ker(/ ), so that wn is the orthogonal projection of vn onto ker(/ ). Hence, from (6.105), un H 1 (Ω)N = vn − wn H 1 (Ω)N =
inf
w∈ker(/ )
vn + wH 1 (Ω)N = 1.
With (6.106), we finally obtain un H 1 (Ω)N = 1;
(6.107)
/ (un )L2 (Ω,MN ) → 0.
(6.108)
s
Reasoning as in Step 4, from (6.107) and (6.108) we deduce that there exists a subsequence of (un )n∈N (that we do not relabel), and u ∈ H 1 (Ω)N such that un → u strongly in H 1 (Ω)N . Thus u ∈ ker(/ )⊥ and uH 1 (Ω)N = 1. On the other hand, from (6.108) we infer that u ∈ ker(/ ). Consequently u = 0, which contradicts uH 1 (Ω)N = 1. Theorem 6.6.1 (displacement-traction problem). Let us assume that Ω is a bounded connected open set in RN which is piecewise of class C1 , and let V := {v ∈ H 1 (Ω)N : γ0 (v) = 0}, where N −1 (Γ0 ) > 0. (i) Then, for any f ∈ L2 (Ω)N and any g ∈ L2 (∂ Ω)N , there exists a unique solution u ∈ V of the problem ⎧ ⎨ σ(u) : / (v) d x = f .v d x + g .v d N −1 ∀v ∈ V ; (6.109) Ω Ω Γ1 ⎩ u ∈ V, where σ(u) = λ trace (/ (u))IR3 + 2μ / (u). (ii) Problem (6.109) is equivalent to saying that u is the unique solution of the minimization problem 1 N −1 min σ(v) : / (v) d x − f .v d x − g .v d : v ∈ V , (6.110) 2 Ω Ω ∂Ω where σ(v) = λ trace (/ (v))IR3 + 2μ / (v).
i
i i
i
i
i
i
256
book 2014/8/6 page 256 i
Chapter 6. Variational problems: Some classical examples
(iii) The solution u of (6.109) satisfies −div(σ(u)) = f
in Ω,
or equivalently −
N ∂ j =1
∂ xj
σi j (u) = fi
in Ω for i = 1 . . . , N
in the distribution sense. PROOF. Problem (6.109) can be written
find u ∈ V such that a(u, v) = l (v) ∀v ∈ V ,
where a is given by (6.95) and l (v) = Ω f .v d x + ∂ Ω g .v d N −1 . By reproducing a calculation similar to that of the proof of Theorem 6.5.1, we see that the bilinear form a : V × V → R as well as the linear form l are continuous on V . Finally, from Korn’s inequality (6.97), a is coercive on V . By applying the Lax–Milgram theorem to problem (6.109) in V , which is a closed subspace of the Hilbert space H 1 (Ω)N equipped with the norm .H 1 (Ω)N , we infer that (6.109) possesses a unique solution. Part (ii) follows from Proposition 2.3.1 because v → 12 Ω σ(v) : / (v) d x is symmetric. Part (iii) is a direct consequence of (6.109) and the fact that V contains (Ω)N . Remark 6.6.2. (a) In the elasticity framework, the space V is called the space of kinematically admissible displacements. The variational formulation (6.109) expresses the principle of virtual work: if v is a (virtual) admissible displacement, λ Ω div(u).div(v) d x + 2μ Ω / (u) : / (v) d x represents the deformation work of the elastic solid corresponding to the virtual displacement v, while Ω f .v d x + Γ g .v d 2 represents the work of the 1
external forces (or loading). (b) The formulation (ii) expresses the principle of least action: among all of the possible displacements (i.e., the virtual displacements) the solution u minimizes the action. The action, i.e., the functional v → 12 Ω σ(v) : / (v) d x − Ω f .v d x − ∂ Ω g .v d N −1 , is called the elastic potential energy, which is the sum of the deformation energyof the body v → 12 Ω σ(v) : / (v) d x and the potential energy of the external forces v → Ω f .v d x + g .v d N −1 . ∂Ω For any displacement field v, let us denote with v¯ its class in the quotient space ¯ and /¯ are clearly well defined by the relations H (Ω)/ker(/ ). The two operators div ¯ ¯ ¯ = div(v) and / (v) ¯ = / (v). In what follows we still denote them by div and / . div(v) The mapping a : H 1 (Ω)/ker(/ ) × H 1 (Ω)/ker(/ ) → R defined by 1
¯ := a( u¯ , v) Ω
σ( u¯) : ∇v¯ d x = λ
Ω
¯ d x + 2μ div( u¯).div(v)
Ω
¯ dx / ( u¯ ) : / (v)
(6.111)
is then clearly a continuous bilinear form. Furthermore, according to Korn inequality (6.98), a is coercive in H 1 (Ω)/ker(/ ). Then, by using arguments similar to those of the
i
i i
i
i
i
i
6.6. The linearized elasticity system
book 2014/8/6 page 257 i
257
previous theorem, we deduce existence and uniqueness for the pure traction problem up to an infinitesimal rigid displacement field. More precisely, we have the next theorem. Theorem 6.6.2 (pure traction problem). Let Ω be a bounded connected open set in RN which is piecewise of class C1 with that f ∈ L2 (Ω)N and g ∈ Γ1 = Γ . Assume furthermore 2 N N −1 L (∂ Ω) satisfy the condition Ω f .v d x + Γ g .v d = 0 for all v ∈ ker(/ ). (i) Then, there exists a unique solution u¯ ∈ H 1 (Ω)/ker(/ ) of the problem ⎧ ⎨ ¯ dx = σ( u¯) : / (v) f .v¯ d x + g .v¯ d N −1 Γ ⎩ u¯Ω∈ H 1 (Ω)/ker(/ ), Ω
∀v ∈ H 1 (Ω)/ker(/ ); (6.112)
where σ( u¯) = λ trace (/ ( u¯ ))IR3 + 2μ / ( u¯).
(ii) Problem (6.112) is equivalent to saying that u¯ is the unique solution of the minimization problem min
1 2
Ω
¯ : / (v) ¯ dx − σ(v)
Ω
f .v¯ d x −
∂Ω
g .v¯ d
N −1
1
: v¯ ∈ H (Ω)/ker(/ ) ,
¯ ¯ = λ trace (/ (v))I ¯ R3 + 2μ / (v). where σ(v) (iii) Let u ∈ u¯, where u¯ is the solution u of (6.112). Then u satisfies −div(σ(u)) = f
in Ω,
or equivalently −
N ∂ j =1
∂ xj
σi j (u) = fi
in Ω for i = 1 . . . , N
in the distribution sense. Remark 6.6.3. Condition Ω f .v d x + Γ g .v d N −1 = 0 for all v ∈ ker(/ ), which is 1 necessary from the mathematical point of view, is natural and says that the external forces do not work on the infinitesimal rigid displacements. Assume now that Ω is a connected open set of class C2 . When the exterior loading is regular, and there is no change of boundary condition along a connected portion of Γ , one can establish that the solution of (6.109) or (6.112) is regular. More precisely, if f ∈ L2 (Ω)N and g ∈ H −1/2 (Γ )N , then u belongs to H 2 (Ω)N in the pure displacement or in the pure traction case (see [173, Theorem 6.3-6]). In each of these two cases we can interpret (6.109) and (6.112) in terms of the boundary value problem (6.89) by completing (iii) with the boundary conditions satisfied by the unique solution u and u¯, respectively. Proposition 6.6.2. Under the conditions of Theorem 6.6.1 in the pure displacement case and Theorem 6.6.2 in the pure traction case, and if furthermore g ∈ H −1/2 (Γ )N , and Ω is a connected open set of class C2 , then the solutions u and u¯ of (6.109) and (6.112) belong to H 2 (Ω)N and H 2 (Ω)N /ker(/ ), respectively, and satisfy the following:
i
i i
i
i
i
i
258
book 2014/8/6 page 258 i
Chapter 6. Variational problems: Some classical examples
(i) Pure displacement case:
−div(σ(u)) = f a.e. in Ω; u = 0 on Γ in the trace sense;
(ii) Pure traction case: for any u ∈ u¯ , −div(σ(u)) = f a.e. in Ω; σ(u)ν = g on Γ in the trace sense. PROOF. Use the Green’s formula (6.92) and proceed as in the proof of Proposition 6.5.1. Remark 6.6.4. Proposition 6.6.2 can be extended to displacement-traction problems if the closures of Γ0 and Γ1 do not intersect, for example, when Ω is the ring Ω = {x ∈ RN : r1 < |x| < r2 }, Γ0 is the sphere of radius r1 > 0, and Γ1 is the sphere of radius r2 > r1 . Regularity is lost at corners along the boundary even if the boundary condition does not change (see [233], [234]). There is a vast literature on the regularity for elliptic systems including the boundary value problems of linearized elasticity as a special case; we mention in particular [215], [226].
6.7 Introduction to the Signorini problem We complete the previous section with a short introduction to the Signorini problem. With the notation of Section 6.6, we assume that the body surface Γ is decomposed into three disjoint measurable parts Γ = Γ0 ∪Γ1 ∪ΓS with strictly positive N −1 -measure, and, as previously, that the body is clamped on Γ0 and subjected to a surface force on the Neumann part Γ1 . The set ΓS denotes the possible contact boundary with a rigid foundation S disjoint from Ω. By comparison with the mechanical system of displacement-traction, the elastic body is impressed on the rigid support S. Let us denote by ν the unit outer normal to ΓS and by uν the normal component u.ν of u, and we introduce the normal and tangential component of σ(u): σν (u) = σ(u)ν.ν, σT (u) = σ(u)ν − σν (u)ν. (Note that the normal component σν (u) of σ(u) is a scalar field.) We assume that there is pure contact between the body and the rigid support contact, i.e., no friction occurs on ΓS . Then physical and mechanical considerations, together with approximation theory, lead to the following (formal) equations of equilibrium in the reference configuration Ω, originally introduced by Signorini (see [333], [214], [216], [357]): ⎧ −div(σ(u)) = f in Ω; ⎪ ⎪ ⎪ ⎪ ⎨ u = 0 on Γ0 ; σ(u)ν = g on Γ1 ; (6.113) ⎪ ⎪ ⎪ uν ≤ 0, σν (u) ≤ 0, uν σν (u) = 0, σT (u) = 0 on ΓS ; ⎪ ⎩ σ(u) = λ trace (/ (u))IR3 + 2μ / (u). The set ΓS is called the Signorini boundary or set of coincidence or also the contact boundary. Compared to the boundary value problem (6.89), the additional four conditions on ΓS describe the nature of the contact between the body and the support S: the first condition states that no penetration in the normal direction occurs, and the second and the third
i
i i
i
i
i
i
6.7. Introduction to the Signorini problem
book 2014/8/6 page 259 i
259
conditions state that only compressive normal stress is allowed and that there must be vanishing contact stress in case of no contact, respectively. The last condition states that the contact is without friction. As for problem (6.89), we are going to provide a weak formulation for problem (6.113). For this, let us introduce the set K := {v ∈ H 1 (Ω)N : γ0 (v) = 0 on Γ0 , γ0 (v).ν ≤ 0 a.e. on ΓS }. (In what follows, to simplify the notation, we do not indicate the trace operator γ0 .) Assume that f ∈ L2 (Ω)N , g ∈ Ł2 (Γ1 )N , and that the solution u of (6.113) belongs to H 2 (Ω)N ∩ K. Multiplying with respect to the Euclidean scalar product of RN each two members of the first equation in (6.113) by v − u, where v ∈ K is arbitrary, integrating over Ω, and using Green’s formula (6.92) with σ = σ(u), we obtain σ(u) : / (v − u) d x = f .(v − u) d x + σ(u)ν.(v − u) d N −1 Ω Ω Γ = f .(v − u) d x + g .(v − u) d N −1 Ω
+
Γ1
ΓS
σ(u)ν.(v − u) d N −1 . (6.114)
Let us decompose the stress σ(u)ν on the contact boundary ΓS with respect to its normal and tangential component. Using the boundary conditions fulfilled by uν and σν (u) on ΓS , and the fact that v.ν ≤ 0 on ΓS , we obtain σ(u)ν.(v − u) = σT (u) + σν (u)ν .(v − u) = σν (u)ν.(v − u) = σν (u)v.ν ≥ 0 N −1 a.e. in ΓS . Consequently, (6.114) yields σ(u) : / (v − u)d x ≥ f .(v − u) d x + g .(v − u)d N −1 , Ω
Ω
Γ1
which makes sense for all v ∈ K under the weaker condition that u ∈ H 1 (Ω)N and u ∈ K. Summing up, this leads to the following weak formulation: assume that N −1 (Γ0 ) > 0, N −1 (Γ1 ) > 0, and N −1 (ΓS ) > 0. Then a vector field u is said to be a weak solution of (6.113) if it belongs to K := {v ∈ H 1 (Ω)N : γ0 (v) = 0 on Γ0 , γ0 (v).ν ≤ 0 a.e. on ΓS } and satisfies the variational inequality σ(u) : / (v − u) d x ≥ f .(v − u) d x + g .(v − u) d N −1 Ω
Ω
Γ1
∀v ∈ K,
where σ(u) = λ trace (/ (u))IR3 + 2μ / (u). It can be seen immediately that the set K of admissible displacements is a nonempty closed convex subset of V := {v ∈ H 1 (Ω)N : γ0 (v) = 0 on Γ0 }. Theorem 6.7.1 (contact without friction problem). Let us assume that Ω is a bounded connected open set in RN which is piecewise of class C1 , and Γ = Γ0 ∪Γ1 ∪ΓS with N −1 (Γ0 ) > 0, N −1 (Γ1 ) > 0, and N −1 (ΓS ) > 0.
i
i i
i
i
i
i
260
book 2014/8/6 page 260 i
Chapter 6. Variational problems: Some classical examples
(i) Then, for any f ∈ L2 (Ω) and any g ∈ L2 (∂ Ω), there exists a unique weak solution u ∈ K of the Signorini problem ⎧ ⎨ σ(u) : / (v − u) d x ≥ f .(v − u) d x + g .(v − u) d 2 ∀v ∈ K; Ω Ω Γ1 ⎩ u ∈ K, (6.115) where σ(u) = λ trace (/ (u))IR3 + 2μ / (u). (ii) Problem (6.115) is equivalent to saying that u is the unique solution of the minimization problem 1 σ(v) : / (v) d x − f .v d x − g .v d N −1 : v ∈ K . (6.116) min 2 Ω Ω ∂Ω (iii) The solution u of (6.115) satisfies −div(σ(u)) = f or equivalently −
N ∂ j =1
∂ xj
σi j (u) = f
in Ω,
in Ω for i = 1 . . . , N
in the distribution sense. PROOF. The minimization problem in (ii) is equivalent to 1 σ(v) : / (v) d x − f .v d x − g .v d N −1 + δK (v) : v ∈ V , min 2 Ω Ω ∂Ω where δK is the indicator function of the set K. Since K is a nonempty closed convex subset of V , the indicator function δV is a lower semicontinuous convex and proper function. On the other hand, from the Korn’s inequality (6.97), the quadratic form v → 1 σ(v) : / (v) d x is coercive in V . Therefore (ii) concerns the minimization of the 2 Ω functional F :V → R ∪ {+∞} 1 σ(v) : / (v) d x − f .v d x + g .v d N −1 + δK , v → 2 Ω Ω ∂Ω which is sum of three lower semicontinuous convex proper functions, one of them being coercive on the Hilbert space V . Thus, it is the minimization of a lower semicontinuous convex coercive function on V . According to Theorem 3.3.4, it admits a solution. The strict convexity of the functional F yields unicity. Passing from formulation (ii) to formulation (i) is obtained by writing the first-order necessary and sufficient condition, as in Theorem 3.3.5, or more generally by applying the subdifferential calculus rules of Chapter 9; see Theorem 9.5.5. Taking v = u ± ϕ, ϕ ∈ (Ω)N as a test function in (6.115), and using Green’s formula (6.92), the proof of of (iii) becomes identical to that of Theorem 6.6.1(iii). Remark 6.7.1. (a) Formulations (i) and (ii) of Theorem 6.7.1 enter the setting of obstacle problems studied in detail in Section 6.12 in a scalar framework.
i
i i
i
i
i
i
6.8. The Stokes system
book 2014/8/6 page 261 i
261
(b) If we assume that the weak solution belongs to H 2 (Ω)N , using Green’s formula (6.92), and choosing suitable test functions, we can interpret the formulations of (i) or (ii) in terms of the boundary value problem (6.113) (see [256]). (c) To prove the existence of a weak solution for linear elasticity contact problems in a domain with inclusions, Korn’s inequalities established in Proposition 6.6.1 are not suitable. This question, very important for its applications, is treated in [187], where unilateral inequalities of the Korn type are established.
6.8 The Stokes system In this section, we are going to make precise the variational approach to the Stokes system, which was introduced in Section 2.3.1. Let us recall that the Stokes system for an incompressible viscous fluid in a domain Ω of RN consists in finding functions u1 , u2 , . . . uN : Ω −→ R and p : Ω −→ R which satisfy ⎧ ∂p ⎪ ⎪ −μΔui + = fi on Ω, i = 1, . . . , N , ⎪ ⎪ ⎪ ∂ xi ⎨ N ∂ ui = 0 on Ω, ⎪ ⎪ ⎪ ⎪ i =1 ∂ xi ⎪ ⎩ ui = 0 on ∂ Ω, i = 1, . . . , N . The given vector f = ( f1 , f2 , . . . , fN ) ∈ L2 (Ω)N represents a volumic density of forces, and μ > 0 is the viscosity coefficient. (It is a positive scalar which is inversely proportional to the Reynolds number.) The vector function u = (u1 , . . . , uN ) : Ω −→ RN is the velocity vector field of the fluid; it assigns to each point x ∈ Ω the velocity vector u(x) = (ui (x))i =1,...,N of the fluid at x. The scalar function p : Ω −→ R is the pressure; for each x ∈ Ω, p(x) is the pressure of the fluid at x. The Stokes system can be written in the following form: ⎧ ⎨ −μΔu + ∇ p = f on Ω, div(u) = 0 on Ω, ⎩ u = 0 on ∂ Ω. The condition div(u) = 0 expresses that the fluid is incompressible. The Stokes system is a linear system of (N + 1) partial differential equations on Ω involving (N + 1) unknown functions (u1 , . . . , uN , p). The variational formulation of the Stokes system was introduced by Leray around 1934. The idea is to work in the functional space 5 4 (6.117) V = v ∈ H01 (Ω)N : div(v) = 0 and make the pressure appear as a Lagrange multiplier of the constraint div(v) = 0. Let us assume that Ω is a bounded connected open subset of RN whose boundary is piecewise C1 . The space V is equipped with the scalar product of H01 (Ω)N 〈u, v〉H 1 (Ω)N = 0
N i =1
〈ui , vi 〉H 1 (Ω) , 0
where 〈ui , vi 〉H 1 (Ω) = 0
Ω
(ui vi + ∇ui · ∇vi ) d x,
i
i i
i
i
i
i
262
book 2014/8/6 page 262 i
Chapter 6. Variational problems: Some classical examples
and the corresponding norm 6 vH 1 (Ω)N = 0
71/2
N i =1
Ω
(ui2 + |∇ui |2 )
dx
.
The space V is equal to the kernel of the divergence operator div, div : v ∈ H01 (Ω)N −→ div(v) ∈ L2 (Ω), which is a linear continuous operator from H01 (Ω)N into L2 (Ω). The continuity of the div operator follows from the following inequality: 2 N ∂ vi 1 N 2 div(v)L2 (Ω) = ∀v ∈ H0 (Ω) dx ∂ xi i =1 Ω N |∇vi |2 d x ≤ i =1
Ω
≤ v2H 1 (Ω)N . 0 Hence, V is a closed subspace of H01 (Ω)N and V is a Hilbert space. We can now state the variational formulation of the Stokes system. Theorem 6.8.1. (a) For every f ∈ L2 (Ω)N there exists a unique u ∈ V which satisfies ⎧ N N ⎪ ⎨ μ ∇u · ∇v d x = f v d x ∀v ∈ V , i i i i (6.118) Ω Ω i =1 i =1 ⎪ ⎩ u ∈ V. (b) Let u be the solution of (6.118). Then the relation (6.118) determines a unique p ∈ L2 (Ω) (up to an additive constant) such that the couple (u, p) ∈ V × L2 (Ω) satisfies μ
N i =1
Ω
∇ui · ∇vi d x −
N i =1
Ω
fi vi d x =
p div(v) d x Ω
∀v ∈ H01 (Ω)N .
(6.119)
(c) The couple (u, p) is a weak solution of the Stokes system: ⎧ N ⎨ −μΔu + ∇ p = f in (Ω) , div(u) = 0 in Ω), ⎩ u = 0 on ∂ Ω in the trace sense. The couple (u, p) is called the variational solution of the Stokes system. PROOF. (a) Let us consider the bilinear form a : V × V −→ R N ∇ui · ∇vi d x a(u, v) = μ i =1
and the linear form l : V −→ R l (v) =
Ω
N i =1
Ω
fi vi d x.
i
i i
i
i
i
i
6.8. The Stokes system
book 2014/8/6 page 263 i
263
The bilinear form a is clearly continuous and its coercivity follows, by standard argument, from the Poincaré inequality in H01 (Ω). The continuity of l is also immediate. Thus, all the assumptions of the Lax–Milgram theorem, Theorem 3.1.2, are satisfied. This implies the existence and uniqueness of the solution u of (6.118). (b) The difficulty comes from the fact that (Ω)N is not contained in the space V , and one cannot interpret directly (6.118) in terms of distributions. Moreover, up to now, the pressure p still has not appeared in the above variational formulation. Let us reformulate (6.118) as an orthogonality relation. Let us consider the linear form L : H01 (Ω)N −→ R, which is defined by L(v) = μ
N i =1
Ω
∇ui · ∇vi d x −
N Ω
i =1
fi vi d x.
(6.120)
The linear form L is clearly continuous on H01 (Ω)N and (6.118) precisely tells us that ∗ L(v) = 0 for all v ∈ V . In other words L ∈ H01 (Ω)N and L(v) = 0 for all v ∈ V , i.e., L ∈ V ⊥ the orthogonal subspace of V (for the pairing between H01 (Ω)N and its topological dual). The precise description of such elements is provided by the following theorem, obtained by de Rham in 1955; see [335]. is piecewise C1 . Let Theorem in RN whose boundary 4 5 Let Ω be a bounded connected set 1 6.8.2. N ∗ 1 N 1 L ∈ H0 (Ω) , a linear continuous form on H0 (Ω) . Set V = v ∈ H0 (Ω)N : div(v) = 0 . Then p div(v) d x ∀v ∈ H01 (Ω)N . L(v) = 0 ∀v ∈ V ⇐⇒ ∃ p ∈ L2 (Ω) such that L(v) = Ω
PROOF OF THEOREM 6.8.1 CONTINUED. Let us admit the de Rham theorem (the implication L ∈ V ⊥ =⇒ ∃ p . . . , which is the interesting part of the theorem, is a nontrivial result) and apply it to the bilinear form L which is defined in (6.120). We thus have the existence of p ∈ L2 (Ω) such that μ
N i =1
Ω
∇ui · ∇vi d x −
N
Ω
i =1
fi vi d x =
p div(v) d x Ω
∀v ∈ H01 (Ω)N .
This is precisely (6.119). (c) Since (Ω)N is dense in H01 (Ω)N , the solution of (6.119) is characterized by (u, p) ∈ V × L2 (Ω) and μ
N
Ω
i =1
∇ui · ∇vi d x −
Ω
p div(v) d x =
N i =1
Ω
fi vi d x
∀v ∈ (Ω)N .
Taking v = (0, . . . , vi , . . . 0), i = 1, . . . , N , yields μ
Ω
∇ui · ∇vi d x −
p Ω
∂ vi ∂ xi
that is, −μΔui +
∂p ∂ xi
dx =
Ω
fi vi d x
∀vi ∈ (Ω),
= fi in (Ω).
i
i i
i
i
i
i
264
book 2014/8/6 page 264 i
Chapter 6. Variational problems: Some classical examples
Moreover, u ∈ V contains the information div(u) = 0 in (Ω), u = 0 on ∂ Ω in the trace sense. Hence, (u, p) is a weak solution of the Stokes system.
6.9 Convection-diffusion equations In the following example, we apply the Lax–Milgram theorem in the nonsymmetric case. Let Ω be a bounded open set in RN . Let us give N functions b1 , b2 , . . . , bN which belong to L∞ (Ω). We set b = (b , b , . . . , b ) ∈ L∞ (Ω)N . Given f ∈ L2 (Ω), we are looking for a 1
N
2
solution of the convection-diffusion boundary value problem −Δu − div(u b ) = f on Ω, u =0 on ∂ Ω. The first-order differential operator div(u b ) =
(6.121)
∂ i ∂ xi
(bi u) describes the convection of a physical quantity which is moving with velocity b (it is also called advection, of drift term). The Laplacian Δu is a second-order differential operator associated with the diffusion. Note that (6.121) can be written as −div(∇u + u b ) = f , which is the divergence form of a conservation law from physics. There are many situations where the phenomena of diffusion and convection occur simultaneously (heat propagation, dynamic of population, reaction-diffusion equations). 1 2 2 is the Euclidian We use the following notation: for any x ∈ Ω, |b (x)|RN = i bi (x) N norm of the vector b (x) ∈ R , and |b | ∞ is the (essential) sup norm of the function L (Ω)
x → |b (x)|RN . Since Ω has been assumed to be bounded, by the Poincaré inequality, ∀v ∈ H01 (Ω)
vL2 (Ω) ≤ CP (Ω) |∇v| L2 (Ω) ,
where the Poincaré constant CP (Ω) is the smallest constant for which the above inequality holds. The variational approach of (6.121) is described in the following statement. Theorem 6.9.1. Let us give f ∈ L2 (Ω), and suppose that |b | L∞ (Ω)
0. Using again the Poincaré inequality, we obtain a(v, v) ≥ αvV2 with α=
1 − CP (Ω) |b | L∞ (Ω) 1 + CP (Ω)2
> 0.
Hence, a is coercive, and all the assumptions of the Lax–Milgram theorem are satisfied. This implies existence and uniqueness of the solution u of problem (6.123). (b) Let u be the solution of (6.123). Since (Ω) ⊂ H01 (Ω), we have for all v ∈ (Ω) ∂ u ∂ v ∂v dx + bi (x)u(x) dx = f v d x. (6.127) ∂ xi Ω ∂ xi ∂ xi Ω Ω i i Let us interpret (6.127) in the distribution sense. Since bi u ∈ L1 (Ω) is a distribution ∂u ∂v ∂v bi u, , + = 〈 f , v〉( (Ω), (Ω)) . (6.128) ∂ xi ∂ xi ( (Ω), (Ω)) ∂ xi ( (Ω), (Ω)) i i By definition of the derivation in the distribution sense, we obtain −Δu − div(u b ) = f in (Ω). Moreover, H01 (Ω) = ker γ0 , where γ0 is the trace operator. Hence γ0 (u) = 0
in the trace sense
and u satisfies (6.124). Conversely, if u satisfies −Δu − div(u b ) = f in the distribution sense, we have (6.128). Then use the density of (Ω) in H01 (Ω) and the fact that u ∈ H01 (Ω), bi u ∈ L2 (Ω), and f ∈ L2 (Ω) to obtain (6.123).
i
i i
i
i
i
i
6.10. Semilinear equations
book 2014/8/6 page 267 i
267
Remark 6.9.1. The condition (6.122) expresses the fact that the velocity vector b governing the convection is not too large. There is another type of condition on b , for which the conclusion of Theorem 6.9.1 is still valid, namely, −div b ≥ 0
in (Ω).
Under this condition, we have for all v ∈ (Ω) i
Ω
bi (x)v(x)
∂v ∂ xi
(x) d x =
1 2
i
Ω
bi (x)
∂ ∂ xi
(v 2 )(x) d x
1
= 〈−div b , v 2 〉( (Ω), (Ω)) 2 ≥ 0. Then, by density, the result can be extended to an arbitrary v ∈ H01 (Ω), which gives the coercivity of a. The rest of the proof is unchanged.
6.10 Semilinear equations Let Ω be a bounded open set in RN . Let us give g : (x, r ) ∈ Ω × R → g (x, r ), which is a Carathéodory function, i.e., g is measurable with respect to x and continuous with respect to r . We are looking for a (variational) solution of the semilinear boundary value problem −Δu = g (x, u) on Ω, (6.129) u =0 on ∂ Ω, where g (x, u) is the function x ∈ Ω → g (x, u(x)) ∈ R (for short we write g (u)). The semilinear terminology comes from the fact that the nonlinear term g (u) depends only on u (and not its partial derivatives). We will reformulate (6.129) as a fixed point problem. Then, depending on the type of (growth) assumption on g , we will apply the fixed point theorem of Banach and Picard or Schauder. As a basic ingredient of the fixed point approach, we use the operator T : L2 (Ω) → 2 L (Ω) which is the inverse of the Laplace–Dirichlet operator. Let us state its precise definition. Definition 6.10.1. T : L2 (Ω) −→ L2 (Ω) is defined for every h ∈ L2 (Ω) by the following: T h ∈ H01 (Ω) ⊂ L2 (Ω) is the unique solution of the variational problem ⎧ ⎪ ⎨ ∇(T h)(x) · ∇v(x) d x = h(x)v(x) d x Ω Ω ⎪ ⎩ T h ∈ H 1 (Ω). 0
∀v ∈ H01 (Ω),
Equivalently, T h is the variational solution of the Dirichlet problem −Δ(T h) = h on Ω, T h = 0 on ∂ Ω. We have (−Δ) ◦ T = i dH ,
H = L2 (Ω),
i
i i
i
i
i
i
268
book 2014/8/6 page 268 i
Chapter 6. Variational problems: Some classical examples
i.e., T is the right inverse of −Δ. Let us state the continuity properties of T (see Chapter 8 for the proof). Proposition 6.10.1. The operator T satisfies the following properties: T : L2 (Ω) −→ L2 (Ω) is a linear continuous operator, and (i) ∀h ∈ L2 (Ω) (ii) ∀h ∈ L2 (Ω)
T hL2 (Ω) ≤ CP (Ω)2 hL2 (Ω) , T hH 1 (Ω) ≤ CP (Ω) 1 + CP (Ω)2 hL2 (Ω) , 0
where CP (Ω) is the Poincaré constant on Ω.
6.10.1 Lipschitz nonlinearity Let us suppose that there exists some constant L g ≥ 0 such that for all r, s ∈ R |g (x, r ) − g (x, s)| ≤ L g |r − s|
for a.e. x ∈ Ω,
2
x → g (x, 0) ∈ L (Ω).
(6.130) (6.131)
Assumption (6.130) expresses that g (x, .) : R → R is L g -Lipschitz continuous. Assumption (6.131) implies that for any u ∈ L2 (Ω), g (u) still belongs to L2 (Ω). Indeed, x → g (x, u(x) is Lebesgue measurable as the composition of the measurable mappings x → (x, u(x)) and (x, r ) → g (x, r ). Moreover |g (x, u(x))| ≤ |g (x, 0)| + L g |u(x)| for a.e. x ∈ Ω. Noticing that |g (., 0)| + L g |u(.)| belongs to L2 (Ω), we deduce that g (u) belongs to L2 (Ω). As a consequence, we can define the operator G: L2 (Ω) −→ L2 (Ω), u −→ G(u)
with G(u)(x) = g (x, u(x)).
As a straight consequence of assumption (6.130), we obtain that G is L g -Lipschitz continuous, i.e., for every u, v ∈ L2 (Ω) G(u) − G(v)L2 (Ω) ≤ L g u − vL2 (Ω) .
(6.132)
Clearly, u is a solution of the semilinear boundary value problem (6.129) if and only if T ◦ G(u) = u, i.e., u is a fixed point of the operator T ◦ G. Theorem 6.10.1. Let us suppose that g : Ω × R → R is a Carathéodory function that satisfies assumptions (6.130), (6.131) with Lg
0 and a subsequence (n l ) such that g (unl ) − g (u)L2 (Ω) ≥ ε0 > 0. Applying the above argument to this subsequence, we obtain a contradiction.
i
i i
i
i
i
i
270
book 2014/8/6 page 270 i
Chapter 6. Variational problems: Some classical examples
In order to develop a fixed point argument using only a continuity property, we use the Leray–Schauder fixed point theorem, which we recall below. It is an extension to infinite dimensional spaces of the Brouwer fixed point theorem. Theorem 6.10.2. Let V be a Banach space and K ⊂ V a convex compact nonempty subset of V . Let S : K → K be a continuous mapping from K into K. Then, there exists at least a fixed point u of S, i.e., u ∈ K satisfies S(u) = u. We have all the ingredients to obtain the next theorem. Theorem 6.10.3. Let us suppose that g : Ω × R → R is a Carathéodory function that satisfies |g (x, r )| ≤ h(x)
∀ r ∈ R and a.e. x ∈ Ω
with h ∈ L2 (Ω). Then, there exists a variational solution of the semilinear boundary value problem −Δu = g (x, u) on Ω, u =0 on ∂ Ω. PROOF. By Proposition 6.10.1, the operator T : L2 (Ω) −→ L2 (Ω) is linear continuous. By (6.136), the operator G : L2 (Ω) → L2 (Ω) is continuous. Hence, the composition T ◦ G : L2 (Ω) → L2 (Ω) of the two operators is continuous. In order to apply the Leray– Schauder fixed point theorem, we just need to find a convex compact nonempty subset K of V = L2 (Ω) such that T ◦ G sends K into K. First, let us notice that (6.135) implies that for any v ∈ L2 (Ω), G(v)L2 (Ω) ≤ hL2 (Ω) . By the continuity property of T , Proposition 6.10.1, we obtain that the range of T ◦ G is contained in the convex set M N K = v ∈ H01 (Ω) : vH 1 (Ω) ≤ CP (Ω) 1 + CP (Ω)2 hL2 (Ω) . 0
The set K is bounded in H01 (Ω) (it is a ball). By the Rellich–Kondrakov compactness embedding of H01 (Ω) into L2 (Ω) (see Theorem 5.3.3), the set K is relatively compact in L2 (Ω). Moreover, it is closed in L2 (Ω). Note that whenever un → u in L2 (Ω), un ∈ K, then un is bounded in H01 (Ω) and hence converges weakly to u in H01 (Ω). Since K is a closed convex set in H01 (Ω), it is closed for the weak topology of H01 (Ω), and hence u ∈ K. Thus, K is a convex compact nonempty set in L2 (Ω), and T ◦ G sends K into K. The conclusion follows from the Leray–Schauder fixed point theorem.
6.10.3 Critical point methods As a model example, let us examine the case g (u) = |u| l −1 u for some l > 1. We are looking for a nontrivial solution (i.e., u = 0) of the semilinear problem −Δu = |u| l −1 u on Ω, (6.139) u =0 on ∂ Ω. The idea is to consider the minimization problem min |∇v(x)|2 d x, v∈Σ
(6.140)
Ω
i
i i
i
i
i
i
6.10. Semilinear equations
book 2014/8/6 page 271 i
271
which consists in the minimization of the Dirichlet integral J (v) = the manifold 1 |v(x)| l +1 d x = 1 . Σ = v ∈ H01 (Ω) : p +1 Ω
Ω
|∇v(x)|2 d x over (6.141)
Let us observe that (6.140) is a nonconvex minimization problem. (The constraint is a sphere, which is not convex.) Condition (6.141) prevents any possible solution from being equal to 0. First, let us examine under which condition on l there exists a solution to (6.140). Following the general topological approach (see Chapter 3), we consider a minimizing sequence (un ) of (6.140). It is bounded in H01 (Ω). Let us examine for which p it is relatively compact in L p+1 . This is a crucial property in order to pass to the limit on the constraint ∗ |u | l +1 d x = p + 1. By the Sobolev embedding theorem, Theorem 5.7.2, H01 (Ω) → L2 Ω n with 21∗ = 12 − N1 . By the Rellich–Kondrakov theorem, Theorem 5.3.3, the embedding of H01 (Ω) → L2 (Ω) is compact. As a consequence, the embedding of H01 (Ω) → L p (Ω) is compact for all p < 2∗ . An elementary computation gives l + 1 < 2∗ =
2N N −2
⇔l
0, 1 t
[J (u + t v) − J (u)] ≥ 0,
and pass to the limit on this inequality as t → 0+ . We have |∇u + t ∇v| p − |∇u| p 1 1, p ∀v ∈ W0 (Ω) f v d x ≥ 0. dx − p Ω t Ω
(6.162)
To pass to the limit in (6.162) we use the Lebesgue dominated convergence theorem: Set h(t ) = |∇u + t ∇v| p . We have h (t ) = p|∇u + t ∇v| p−2 (∇u + t ∇v) · ∇v. Hence 1
1 [|∇u + t ∇v| p − |∇u| p ] = (h(t ) − h(0)) t t 1 1 = p|∇u + s∇v| p−2 (∇u + s∇v) · ∇v d s. t 0
From this, by taking 0 < t ≤ 1, we obtain 1 t
||∇u + t ∇v| p − |∇u| p | ≤
p
t
|∇u + s∇v| p−1 |∇v| d s t 0 ≤ p(|∇u| + |∇v|) p−1 |∇v|.
(6.163)
Let us notice that |∇v| ∈ L p (Ω) and that (|∇u| + |∇v|) p−1 belongs to L p (Ω) (because of the equality ( p − 1) p = p). Hence the right-hand side of (6.163) is a function which belongs to L1 (Ω) and which is independent of t ∈ ]0, 1]. We can now pass to the limit on (6.162) and obtain 1, p p−2 ∀v ∈ W0 (Ω) |∇u| ∇u · ∇v d x − f v d x ≥ 0. Ω
Ω
Let us now replace v by −v. We obtain 1, p ∀v ∈ W0 (Ω) |∇u| p−2 ∇u · ∇v d x = f v d x. Ω
Ω
(iii) By taking v ∈ (Ω), we obtain, by definition of the derivation in distributions, the following equality: − div(|∇u| p−2 ∇u) = f
in (Ω).
i
i i
i
i
i
i
6.12. The obstacle problem
book 2014/8/6 page 279 i
279 1, p
On the other hand, when Ω is regular, from u ∈ W0 (Ω) we infer that u = 0 on ∂ Ω in the trace sense (Proposition 5.6.1). Remark 6.11.1. We could as well consider Neumann-type boundary value problems for ¯ we have the following the Δ p operator. Let us just notice that, assuming u ∈ C2 (Ω), integration by parts formula: ¯ ∀v ∈ (Ω)
Ω
|∇u|
p−2
∇u · ∇v d x = −
Ω
vΔ p u d x +
∂Ω
|∇u| p−2
∂u ∂n
v d σ.
It follows that variational problems in W 1, p (Ω) lead to boundary conditions of the following type: ∂u = g on ∂ Ω. |∇u| p−2 ∂n (Note that when p = 2, one recovers the classical Neumann boundary condition.)
6.12 The obstacle problem As a model example, we consider the variational problem with unilateral constraint 1 |∇v(x)|2 d x − f (x)v(x) d x : v ∈ H01 (Ω), v ≥ g on Ω , (6.164) min 2 Ω Ω where Ω is a bounded open set in RN , g : Ω → R is the obstacle function, and f : Ω → R is an external force. From a mechanical point of view, taking Ω ⊂ R2 , the solution u : Ω → R of (6.164) gives the equilibrium position of an elastic membrane whose boundary is fixed, i.e., u = 0 on ∂ Ω, and which must lie over an obstacle g : Ω → R. The membrane is submitted to the action of a vertical force f : Ω → R. The constraint K of the admissible displacements is given by 5 4 (6.165) K = v ∈ H01 (Ω) : v ≥ g on Ω . To ensure that K is not empty, we assume that g ≤ 0 on ∂ Ω. This problem plays a central role in potential theory. Taking f = 0, and g equal to the characteristic function of an open set A ⊂ Ω (i.e., g = 1 on A, and g = 0 on Ω \ A), the infimal value of (6.164) is half of the (harmonic) capacity of the set A; see Definition 5.8.1. This suggests that in the study of (6.164), where g is a general obstacle (not regular, thin, etc.), a central question is to determine in what sense the inequality v ≥ g is taken. Let us first examine the case where 5 4 K = v ∈ H01 (Ω) : v(x) ≥ g (x) for a.e. x ∈ Ω , i.e., the inequality constraint in (6.165) is intended in the sense almost everywhere. The existence and uniqueness of the solution of (6.164) are described below. This is a direct consequence of the variational principles of Chapter 5. Theorem 6.12.1. Let us suppose that g : Ω → R satisfies K g = , where 4 5 K g = v ∈ H01 (Ω) : v(x) ≥ g (x) for a.e. x ∈ Ω .
i
i i
i
i
i
i
280
book 2014/8/6 page 280 i
Chapter 6. Variational problems: Some classical examples
(i) Then, for any f ∈ L2 (Ω), there exists a unique solution u ∈ H01 (Ω) of the obstacle problem min
1 2
2
Ω
|∇v(x)| d x −
Ω
f (x)v(x) d x : v ∈
H01 (Ω),
v(x) ≥ g (x) for a.e. x ∈ Ω . (6.166)
(ii) The solution u of (6.166) satisfies the variational inequality ⎧ ⎨ ∇u(x) · ∇(v − u)(x) d x − f (x)(v − u)(x) d x ≥ 0 Ω ⎩ uΩ∈ K .
∀v ∈ K g ,
(6.167)
g
PROOF. Take V = H01 (Ω) equipped with its classical Hilbertian structure. To show that K g is closed in V , it suffices to notice that every convergent sequence in H01 (Ω) converges in L2 (Ω) and hence contains a subsequence that converges pointwise almost everywhere. As a consequence, K g is a closed convex nonempty subset of V , and its indicator function δK g is lower semicontinuous, convex, and proper. On the other hand, the Dirichlet integral Φ(v) := 12 Ω |∇v(x)|2 d x is a lower semicontinuous convex coercive function on V . (The coercivity comes from the Poincaré inequality of Theorem 5.3.1 and from the fact that Ω is bounded.) The linear integral functional v → L f (v) := f (x)v(x) d x is continuous on L2 (Ω) and hence on H01 (Ω). Problem (6.166) is then Ω the minimization of the sum of three lower semicontinuous convex proper functions N M min Φ(v) + L f (v) + δK g (v) : v ∈ H01 (Ω) , one of them being coercive. Thus, it is the minimization of a lower semicontinuous convex coercive function on the Hilbert space V = H01 (Ω). By Theorem 3.3.4, it admits a solution. This solution is unique because the Dirichlet integral is strongly convex, and hence strictly convex, which makes problem (6.166) strictly convex. Passing from (6.166) to (6.167) can be readily obtained by writing the first-order necessary and sufficient condition, as in Theorem 3.3.5, or more generally by applying the subdifferential calculus rules of Chapter 9; see Theorem 9.5.5. Let us interpret (6.167) as a free boundary problem. Let us define the coincidence set C ⊂ Ω as the region where u = g . The noncoincidence set N = Ω \ C is the region where u is not equal to g (i.e., u > g ), and the free boundary ∂ C is the interface between the two. Suppose that all these data are regular. On C , the value of u is prescribed u = g . Let us show that, formally, on the complementary set N = Ω\ C , u is solution of the Poisson problem ⎧ ⎨ −Δu = f on N = Ω \ C , u=g on ∂ C , (6.168) ⎩ u =0 on ∂ Ω. Since u and g are supposed regular, the set N where u > g is open. On any compact subset of N , the continuous positive function u − g is minorized by a positive constant. Therefore, for any test function φ ∈ (N ), there exists some positive t such that
i
i i
i
i
i
i
6.12. The obstacle problem
book 2014/8/6 page 281 i
281
u + t φ > g on N . Since u + t φ = u on C , we have u + t φ ∈ K g . Taking v = u + t φ in (6.167), after dividing by t > 0, we obtain ∇u(x) · ∇φ(x) d x − f (x)φ(x) d x ≥ 0 ∀φ ∈ (N ). (6.169) Ω
Ω
Since (6.169) holds for φ and −φ, we can replace the inequality by an equality in (6.169), which gives −Δu = f on N = Ω \ C . Note that the boundary of N is contained in ∂ C ∪ ∂ Ω. On these sets, the value of u is required to be respectively equal to g and zero. Hence (6.168) is a well-posed Poisson problem. It is a free boundary problem, because the set N , equivalently, its boundary ∂ C = ∂ N , is not given. It is part of the problem. As soon as the interface (the free boundary) between the two phases {u = g } and {u > g } is known, the obstacle problem reduces to the classical Poisson problem. Free boundary problems arise in various situations; another well-known example is the Stefan problem, describing the phase transition between ice and water. The above approach is formal. By contrast with the classical Poisson problem, the solution of the obstacle problem is not smooth in general. In the one-dimensional case, with f = 0, the solution is affine in the noncoincidence set. Consequently, whatever the smoothness of the obstacle function g is, the second derivative of u has discontinuities at the points which are at the boundary of the contact set. In order to consider general obstacles (possibly thin obstacles, which are supported by sets of zero Lebesgue measure) and interpret the inequality constraint v ≥ g in the most general sense, we use the elements of potential and capacity theory which were introduced in Section 5.8. Recall that any v ∈ H01 (Ω) has a quasi-continuous representative v˜ (unique up to the quasi-everywhere equality). A positive finite energy measure is a positive Radon measure which belongs to the dual of H01 (Ω), i.e., μ ≥ 0 and μ ∈ H −1 (Ω). A positive finite energy measure does not charge the sets which have zero capacity; hence it is a capacitary measure (see Section 5.8.4). Moreover, for any v ∈ H01 (Ω), the quasicontinuous representative v˜ of v is integrable with respect to μ and (6.170) 〈v, μ〉(H 1 (Ω),H −1 (Ω)) = v˜ d μ. 0
Ω
Let us introduce the general concept of unilateral constraint; see [51, Definition 3.1]. We denote by H01 (Ω)+ the positive cone of all nonnegative functions of H01 (Ω). Definition 6.12.1. A subset K of H01 (Ω) is said to be a unilateral convex set if it satisfies (i) K is a closed convex nonempty subset of H01 (Ω); (ii) K + H01 (Ω)+ ⊂ K; (iii) u ∧ v ∈ K for all u, v ∈ K. Let us give the functional description of a unilateral convex set; see [51, Theorem 3.2]. Theorem 6.12.2. Let K ⊂ H01 (Ω) be a unilateral convex set. Then, there exists a sequence ¯ such that ( gn ) of elements of K, and a mapping g : Ω → R
i
i i
i
i
i
i
282
book 2014/8/6 page 282 i
Chapter 6. Variational problems: Some classical examples
(i) g˜n decreases q.e. to g ; (ii) g is quasi-upper semicontinuous; 5 4 (iii) K = v ∈ H01 (Ω) : v˜ ≥ g q.e. on Ω . In the general case, in order to mathematically justify the free boundary formulation of the obstacle problem, the idea is to rewrite it as a complementary problem: ⎧ on Ω, ⎨ −Δu − f ≥ 0 u− g ≥0 on Ω, (6.171) ⎩ (u − g )(Δu + f ) = 0 on Ω. Then notice that −Δu − f is a positive distribution. As a consequence, it is a positive Radon measure; let us set −Δu− f = μ ≥ 0. The last condition of (6.171) can be rewritten (u − g ) d μ = 0, Ω
which makes sense by considering the quasi-continuous representative of u and the above description of K with g quasi-upper semicontinuous. Let us make this precise in the following theorem. Theorem 6.12.3. Let K ⊂ H01 (Ω) be a unilateral convex set, i.e., 5 4 K = v ∈ H01 (Ω) : v˜ ≥ g q.e. on Ω with g quasi-upper semicontinuous. Then there exists a unique solution u of the obstacle problem 1 min |∇v(x)|2 d x − f (x)v(x) d x : v ∈ H01 (Ω), v˜ ≥ g q.e. on Ω . (6.172) 2 Ω Ω Equivalently, u is the solution of the following complementary problem: (i) u˜(x) − g (x) ≥ 0 q.e. on Ω; (ii) μ := −Δu − f ≥ 0 is a positive finite energy measure; (iii) Ω ( u˜ − g ) d μ = 0. PROOF. The existence and uniqueness of u are obtained in the same manner as in Theorem 6.12.1. Item (i) follows from u ∈ K. Let us prove that u satisfies (ii) and (iii). The first-order optimality condition for (6.172) gives ⎧ ⎨ ∇u(x) · ∇(v − u)(x) d x − f (x)(v − u)(x) d x ≥ 0 ∀v ∈ K, (6.173) Ω Ω ⎩ u ∈ K. Taking v = u + φ, with φ ∈ (Ω), φ ≥ 0, readily implies that μ := −Δu − f ≥ 0 is a positive finite energy measure. Since u˜(x) − g (x) ≥ 0 q.e., we have u˜(x) − g (x) ≥ 0 μ-a.e. Hence Ω
( u˜ − g ) d μ ≥ 0.
(6.174)
i
i i
i
i
i
i
6.12. The obstacle problem
book 2014/8/6 page 283 i
283
On the other hand, by Theorem 6.12.2, there exists a sequence ( gn ) of elements of K such that g˜n decreases q.e. to g . Taking v = gn in (6.173), we obtain
0 ≤ 〈gn − u, −Δu − f 〉(H 1 (Ω),H −1 (Ω)) = 0
Ω
( g˜n − u˜) d μ.
From the monotone convergence theorem, we deduce that ( g − u˜) d μ ≥ 0.
(6.175)
Ω
Combining (6.174) and (6.175) gives Ω
( u˜ − g ) d μ = 0.
Conversely, let us suppose that μ := −Δu − f ≥ 0 is a positive finite energy measure such that Ω ( u˜ − g ) d μ = 0. By (6.170), for any v ∈ K, we have 〈v − u, −Δu − f 〉(H 1 (Ω),H −1 (Ω)) = (v˜ − u˜) d μ 0
= =
Ω
Ω
Ω
(v˜ − g ) d μ +
Ω
( g − u˜) d μ
(v˜ − g ) d μ ≥ 0.
Equivalently ⎧ ⎨ ∇u(x) · ∇(v − u)(x) d x − f (x)(v − u)(x) d x ≥ 0 Ω ⎩ uΩ∈ K,
∀v ∈ K,
i.e., u is the solution of the obstacle problem (6.172). Remark 6.12.1. (a) A similar analysis can be developed for the obstacle problem 1 1, p min |∇v(x)| p d x − f (x)v(x) d x : v ∈ W0 (Ω), v ≥ g on Ω , p Ω Ω where 1 < p < ∞. The crucial property is that for 1 < p < ∞, the Sobolev space 1, p 1, p W0 (Ω) is a reflexive Banach space and that the contractions operate on W0 (Ω); see Theorem 5.8.2 and Corollary 5.8.2. By contrast, the theory cannot be directly extended to the obstacle problem for the bi-Laplacian Δ2 , because the contractions do not operate on the Sobolev space H 2 (Ω). (Truncations induce discontinuities of the first derivatives.) (b) As an important variant of the above study, the unilateral constraint can be imposed on the boundary of Ω. The variational problem 1 2 1 |∇v(x)| d x − f (x)v(x) d x : v ∈ H (Ω), v ≥ 0 a.e. on ∂ Ω min 2 Ω Ω
i
i i
i
i
i
i
284
book 2014/8/6 page 284 i
Chapter 6. Variational problems: Some classical examples
gives rise to the following complementary problem (where the normal derivative on ∂ Ω plays the role of the Laplace operator): ⎧ −Δu − f = 0 on Ω, ⎪ ⎪ ⎪ ⎨ u≥0 on ∂ Ω, ⎪ ⎪ ⎪ ⎩
∂u ≥0 ∂n ∂u u∂n =0
on ∂ Ω, on ∂ Ω.
This is a scalar problem. It can serve as a model for systems in linear elasticity, where unilateral constraint are imposed on the boundary of Ω, like the celebrated Signorini problem introduced in Section 6.7 (see also [204], [216]).
i
i i
i
i
i
i
book 2014/8/6 page 285 i
Chapter 7
The finite element method
One of the major interests of variational methods is to provide both a theory for existence of solutions and numerical methods for computing accurate approximations of these solutions. Certainly, the most celebrated of these variational approximation methods is the finite element method. It is a Galerkin approximation scheme where the elements of the finite dimensional approximating subspaces Vn are piecewise polynomial functions. This method has been proved to be very successful, the main reason being that, because of the local character of several problems, by choosing a basis of the space Vn whose functions have small supports, one obtains approximated problems with sparse matrices, i.e., with most entries equal to zero. This is a decisive property in order to be able to solve numerically the corresponding linear system: one should notice that engineering problems involving systems of PDEs from continuum mechanics usually give rise to large linear systems (100 × 100 or 1000 × 1000 are quite frequent!). Our scope is to introduce the main ideas of the finite element method and then to describe a typical example. For simplicity of the exposition, we restrict ourselves to linear problems whose variational formulation enters into the abstract setting of the Lax–Milgram theorem: find u ∈ V such that a(u, v) = l (v) ∀v ∈ V . (7.1) Let us first recall and make precise some aspects of the Galerkin method (which was introduced in Section 3.1.2).
7.1 The Galerkin method: Further results Let us briefly recall the assumptions on the abstract variational problem (7.1): V is a Hilbert space, and a : V × V → R is a continuous coercive bilinear form, i.e., there exists some constants M ∈ R+ and α > 0 such that ∀u, v ∈ V , ∀v ∈ V ,
|a(u, v)| ≤ M uv, a(v, v) ≥ αv2 .
(7.2) (7.3)
The linear form l : V → R is supposed to be continuous. Then the Lax–Milgram theorem asserts the existence and uniqueness of a solution u ∈ V of problem (7.1). Typically, as in the boundary value problems which were studied in the previous sections, V is a Sobolev space, like H 1 (Ω) or H01 (Ω). It is an infinite dimensional space; this 285
i
i i
i
i
i
i
286
book 2014/8/6 page 286 i
Chapter 7. The finite element method
is a common feature of all methods from functional analysis, and the numerical computation of the solution u requires a further step, which is the reduction to a finite dimensional problem. The Galerkin method is based on the approximation of the infinite dimensional space V by a sequence of finite dimensional subspaces (Vn )n∈N . More precisely, for each n ∈ N, Vn is a finite dimensional subspace of V and one supposes that the following approximation property holds: ∀v ∈ V , ∃(vn )n∈N , vn ∈ Vn ∀ n ∈ N, and vn → v in V . For each n ∈ N the approximated variational problem is find un ∈ Vn such that a(un , v) = l (v) ∀ v ∈ Vn .
(7.4)
(7.5)
One should notice that the approximated problem (7.5) is still a variational problem: it has the same structure as the initial variational problem (7.1), except now it is posed on a finite dimensional space Vn . Indeed, existence and uniqueness of the solution un of (7.5) follows, in a similar way, from the Lax–Milgram theorem. When a is symmetric, problem (7.5) reduces to a minimization problem, namely, un is the solution of 1 min (7.6) a(v, v) − l (v) : v ∈ Vn . 2 This alternate description of the finite dimensional approximation method is often called the Ritz method. The term variational approximation is justified by the fact that the sequence of problems (7.5) does approximate the initial problem (7.1), in the sense that the sequence (un )n∈N norm converges in V to u. More precisely, in Proposition 3.1.2 it was proved that u − un ≤
M α
dist(u,Vn ),
(7.7)
and one can observe that, clearly, the approximation property (7.4) implies that for any u ∈ V , dist(u,Vn ) → 0 as n → +∞. Let us now make precise the structure of the approximated problem (7.5). Let us introduce a basis (ϕ1 , ϕ2 , . . . , ϕI (n) ) of the vector space Vn with I (n) = dimVn . Let us I (n) write un = i =1 λi ϕi . Then (7.5) is equivalent to ⎧ a(u , ϕ ) = l (ϕ j ) ∀ j = 1, 2, . . . , I (n), ⎪ ⎨ n j I (n) ⎪ u = λi ϕ i . ⎩ n i =1
This, in turn, is equivalent to finding λ = (λi )i =1,2,...,I (n) in RI (n) which is a solution of the linear system I (n) λi a(ϕi , ϕ j ) = l (ϕ j ) ∀ j = 1, 2, . . . , I (n). (7.8) i =1
Let us set An = (a(ϕ j , ϕi ))1≤i , j ≤I (n) . It is an I (n) × I (n) square matrix which, by reference to the elasticity problem, is often called the stiffness matrix. Similarly, the vector bn = (l (ϕ j )) in RI (n) is often called the load vector.
i
i i
i
i
i
i
7.2. Description of finite element methods
book 2014/8/6 page 287 i
287
With this notation, one can write (7.8) in the following form: An λ = bn .
(7.9)
Let us now examine the properties of the matrix An : for any vector λ ∈ RI (n) we have (〈·, ·〉 is the Euclidean scalar product in RI (n) and | · | the Euclidean norm) 〈An λ, λ〉 =
I (n) i =1
(An λ)i λi
⎛ ⎞ I (n) I (n) ⎝ = a(ϕ j , ϕi )λ j ⎠ λi i =1
⎛
=a⎝
j =1
I (n) j =1
λj ϕj ,
I (n) i =1
⎞ λi ϕ i ⎠
) I (n) )2 ) ) ) ) ≥ α) λi ϕ i ) . ) ) i =1
I (n) I (n) Since (ϕ)i =1 is a basis of Vn , one can easily verify that λ → i =1 λi ϕi is a norm on RI (n) . All norms being equivalent on the finite dimensional space RI (n) , there exists some constant c > 0 such that ) ) I (n) ) ) ) ) I (n) c|λ| ≤ ) λ i ϕ i ). ∀λ ∈ R ) ) i =1
Hence
∀λ ∈ RI (n)
〈An λ, λ〉 ≥ αc 2 |λ|2 .
(7.10)
From (7.10) it follows that An is one to one (that is, ker(An ) = {0}), which, in this finite dimensional setting, implies that for any load vector bn , problem (7.9) has a unique solution. This is another elementary way (without using the Lax–Milgram theorem) to prove existence and uniqueness of the solution λ of the approximated problem (7.9). Let us also notice that when a is symmetric, so is the matrix An . We now come to the central point of this theory, which is the effective construction of the finite dimensional approximating subspaces Vn and the resolution of the linear system (7.9). As we have already stressed, it is crucial, from a numerical point of view, that the matrix An possesses as many zeroes as possible. At this point, there are different strategies: in the next section, we shall describe an approach which uses spectral analysis in infinite dimensional spaces and a special basis whose elements are eigenfunctions. The finite element method relies on a different strategy that we now describe.
7.2 Description of finite element methods Let us now assume that V is a closed subspace of H 1 (Ω) and the bilinear form a : V ×V → R is of the type a(u, v) = (∇u · ∇v + a0 uv) d x (7.11) Ω
i
i i
i
i
i
i
288
book 2014/8/6 page 288 i
Chapter 7. The finite element method
with a0 ∈ L∞ (Ω), a0 ≥ 0. This allows us to cover various situations like Dirichlet, Neumann, and mixed problems which were studied in the previous sections. A key property of a is that it is a local bilinear form, that is, a(ϕ, ψ) = 0, as soon as ϕ and ψ are two elements of V whose supports do not intersect, or more generally such that the Lebesgue measure of the intersection of their supports is zero. Let us recall that the stiffness matrix An is equal to (a(ϕi , ϕ j )), where (ϕi ), i = 1, . . . , I (n), is a basis of Vn . The strategy is now clear: we have to choose Vn such that one can find a canonical basis in the space Vn whose corresponding functions have supports which are as small as possible. This is made possible thanks to a triangulation of the set Ω. For simplicity of the exposition, we restrict ourselves to problems which are posed over sets Ω ⊂ R2 which are polyhedra; we also say that such set Ω is polygonal. Definition 7.2.1. A triangulation the set Ω of the form
of a polygonal set Ω of R2 is a finite decomposition of Ω=
3
K
K∈
such that (i) each set K ∈
is a triangle,
(ii) whenever K1 and K2 belong to , K1 ∩ K2 is either empty or reduced to a common vertex or to a common face (edge). In particular, for each distinct K1 , K2 ∈ , one has int(K1 ) ∩ int(K2 ) = . Two triangles K1 and K2 which have a common face are said to be adjacent. The triangles K ∈ are called finite elements. We set h( ) = max diamK,
(7.12)
K∈
where diamK = sup{|x − y| : x, y ∈ K} is the diameter of K. By convention, we denote by h a triangulation such that h( ) = h. As we shall see, to each triangulation h will be associated a finite approximating dimensional subspace V h . It is convenient to consider the family of approximating subspaces indexed by the positive parameter h, say, V h with h → 0. Of course, to reduce to an abstract Galerkin scheme as described in Section 7.1, one may take V h = V hn for some hn → 0. An example of triangulation is given in Figure 7.1. One may already observe that one can use a fine triangulation in a subregion where a particular behavior of the solution is expected (for example, in special parts of airplanes). Figure 7.2 shows a forbidden situation where the intersection of two triangles K1 and K2 is not an edge of K2 . Let us now describe the finite dimensional space V h which is associated to a triangulation h . At this point, we need to make precise the boundary condition; take, for example, the Dirichlet boundary condition u = 0 on ∂ Ω and V = H01 (Ω). Then V h = {v ∈ C(Ω) : v is affine on each K ∈
h,
v = 0 on ∂ Ω}.
i
i i
i
i
i
i
7.2. Description of finite element methods
book 2014/8/6 page 289 i
289
Figure 7.1. Example of triangulation.
Figure 7.2. A forbidden situation.
In other words, V h is the linear space of continuous functions on Ω which are piecewise linear with respect to the triangulation h and which vanish on the boundary. One can easily verify that an affine function on a triangle K is uniquely determined by its values at the vertices of K. Hence, any function v ∈ V h is uniquely determined by its values at the vertices (also called nodes) of the triangulation which are in the interior of Ω, i.e., in Ω. (On the vertices which are on the boundary ∂ Ω, v is prescribed to be equal to zero.) For any vertex ai of the triangulation, i = 1, 2, . . . , I (h), which is in the interior set Ω, let us denote by ϕi the element of V h which satisfies ϕi (a j ) = δi j ,
1 ≤ i, j ≤ I (h).
Equivalently, ϕi is the function of V h which is equal to one at the vertex ai and is equal to zero at all other vertices a j with j = i. It is usually called a hat function. Clearly, (ϕ1 , ϕ2 , . . . , ϕI (h) ) is a basis of V h and each element v of V h can be uniquely written in the form v=
I (h) i =1
v(ai )ϕi .
One should notice that each element ϕi of this basis has small support: more precisely, the support of ϕi is the union of all triangles K of h such that ai is a vertex of K. The stiffness matrix Ah = (a(ϕi , ϕ j ))1≤i , j ≤I (h) is a sparse matrix, since a(ϕi , ϕ j ) = 0 except when ai and a j are two vertices of a same triangle K of the triangulation h . Let us now stress a technical but important point: the structure of Ah , i.e., the distribution of zeroes, for a given triangulation h highly depends on the enumeration of the vertices. Clearly, one has to use an enumeration to obtain Ah with a simple structure like, for example, tridiagonal matrices. Let us illustrate this in a concrete situation.
i
i i
i
i
i
i
290
book 2014/8/6 page 290 i
Chapter 7. The finite element method
7.3 An example Take Ω = (0, 1) × (0, 1) the unit square in R2 and, given f ∈ L2 (Ω), let us consider the Dirichlet boundary value problem −Δu = f on Ω, u = 0 on ∂ Ω. Its variational formulation is as follows: find u ∈ H01 (Ω) such that Ω
∇u · ∇v d x =
f v dx Ω
∀ v ∈ H01 (Ω).
Let us consider the triangulation of Ω in Figure 7.3.
Figure 7.3. Triangulation of Ω with indexed nodes.
We denote by a l ,m = (l h, mh), 0 ≤ l , m ≤ N +1, the nodes of the triangulation. There are N 2 nodes which are in Ω, and the dimension of V h is equal to N 2 with h = 1/(N + 1). Let us index the nodes of the triangulation as indicated in Figure 7.3, where N has been taken equal to 4. We draw, for example, the perspective of the hat function ϕ6 which is an element of the finite element basis (Figure 7.4). Recall that a(ϕi , ϕ j ) = 0 iff ai and a j are two vertices of the same triangle. Then, notice that a vertex ai is connected to at most six other vertices a j with j = i, which, a priori, yields a seven-point numerical scheme. The matrix Ah then has the following structure, where each cross represents an element a(ϕi , ϕ j ) which is, a priori, not equal to zero. It is a matrix with a band structure, that is, a(ϕi , ϕ j ) = 0 for |i − j | > d ma x , where d ma x , the width of the band, is small with respect to the size of the matrix.
i
i i
i
i
i
i
7.4. Convergence of the finite element method
book 2014/8/6 page 291 i
291
Figure 7.4. The hat function ϕ6 .
Indeed, for symmetry reasons a(ϕ l ,m , ϕ l +1,m+1 ) = 0 and a(ϕ l ,m , ϕ l −1,m−1 ) = 0 and we have a five-point scheme! Let λ l ,m = u h (ϕ l ,m ) be the component of u h with respect to ϕ l ,m so that u h = λ l ,m ϕ l ,m . An elementary computation yields ⎧ 2 1 ≤ l, m ≤ N, ⎪ ⎨−λ l −1,m − λ l ,m−1 + 4λ l ,m − λ l ,m+1 − λ l +1,m = h f l ,m , 0 ≤ l ≤ N + 1, λ l ,0 = λ l ,N +1 = 0, ⎪ ⎩λ = λ = 0, 0 ≤ m ≤ N + 1. 0,m
N +1,m
This is the classical five-point scheme for the Laplacian. (Here f l ,m = equivalently, an approximation of this integral.)
Ω
f ϕ l ,m d x or,
Remark 7.3.1. It is worth noticing that the tridiagonal block structure of Ah is intimately related to the enumeration of the elements of the basis. This structure may be lost by choosing a different enumeration!
7.4 Convergence of the finite element method The convergence of the finite element method, which is a Galerkin approximation method, relies on Proposition 3.1.2: u − u h H 1 (Ω) ≤ 0
M α
dist(u,V h ).
Therefore, the estimate of the error u − u h (and showing that the error goes to zero as h → 0) can be reduced to a problem in approximation theory: one has to evaluate (majorize) the distance for the H01 (Ω) norm between a function u ∈ H01 (Ω) and the subspace V h of continuous functions which are piecewise affine relative to a given triangulation h . To that end, we need to make a geometrical assumption on the family of triangulations ( h ) h→0 . Definition 7.4.1. A family of triangulations ( h ) h>0 is said to be regular if there exists a constant σ (σ ≥ 0) such that for any h > 0 and any K ∈ h hK ρK
≤ σ,
(7.13)
i
i i
i
i
i
i
292
book 2014/8/6 page 292 i
Chapter 7. The finite element method
where hK is the diameter of K and ρK is the supremum of the diameters of the balls contained in K. It can be easily shown that this condition is equivalent to the following. There exists a constant θ0 > 0 such that for any h > 0 and for any K ∈ h , θK ≥ θ0 , where θK denotes the smallest angle of the triangle K. Thus the regularity of a family of triangulations ( h ) h>0 in the sense of Definition 7.4.1 prevents the triangles from becoming “flat” in the limit when h → 0. As we shall see, this is a key assumption to obtain the convergence of the method. For example, the situation with h → 0 in Figure 7.5 is not allowed in the context of a regular family of triangulation (as h → 0, K h becomes flat).
Figure 7.5. A triangle Kh becoming flat.
We shall return later to this example and show that in such a situation some of the following mathematical developments fail to be true. The main result of this section, which is the convergence of the finite element method under the regularity assumption (7.13), is given below. Theorem 7.4.1. Let Ω be a polygon and let ( h ) h→0 be a regular family of triangulations of Ω. Then, the finite element method converges, i.e., lim u − u h H 1 (Ω) = 0. 0
h→0
Moreover, if u belongs to H 2 (Ω), the following estimate holds: there exists some constant C > 0 such that for all h > 0 u − u h H 1 (Ω) ≤ C huH 2 (Ω) . 0
PROOF. (a) Let us first assume that u ∈ H 2 (Ω). Recalling that N = 2, by Sobolev embedding theorems (see Section 5.7) we have u ∈ C(Ω). Indeed this is true under the assumption N ≤ 3. This allows us to talk about the value of u at any point of Ω, and especially at
i
i i
i
i
i
i
7.4. Convergence of the finite element method
book 2014/8/6 page 293 i
293
the nodes (ai )i =1,...,I (h) of the triangulation h . Let us introduce the function Π h (u) which is the continuous affine interpolant of u at the nodes of h : I (h)
Π h (u) =
i =1
u(ai )ϕi .
Recall that ϕi is the hat function related to the node ai and that the (ϕi )i =1,...,I (h) form a basis of V h . The above formula is just the linear decomposition of Π h (u) in the basis (ϕi )i =1,...,I (h) . Since Π h (u) ∈ V h , by definition of dist(u,V h ), we have dist(u,V h ) ≤ u − Π h (u)H 1 (Ω) . 0
This inequality, when combined with Proposition 3.1.2 (Cea’s lemma) yields u − u h H 1 (Ω) ≤
M α
0
u − Π h (u)H 1 (Ω) .
(7.14)
0
Let us now use the following approximation result that we admit for the moment (we shall return to this crucial result further): there exists a constant C independent of h such that for all u ∈ H 2 (Ω) u − Π h (u)H 1 (Ω) ≤ C huH 2 (Ω) . (7.15) Let us notice that this estimate makes use in an essential way of the regularity assumption (7.13) on the family of triangulations ( h ) h→0 and of the fact that u ∈ H 2 (Ω). Let us now combine (7.14) and (7.15) to obtain u − u h H 1 (Ω) ≤ 0
M α
C huH 2 (Ω) .
(7.16)
Thus, in the case u ∈ H 2 (Ω), we have convergence of the finite element method, that is, norm convergence in H01 (Ω) of the sequence (u h ) h→0 to u as h → 0. More precisely, the estimate (7.16) provides information about the rate of convergence of the method. (b) In the general case, that is, u ∈ H01 (Ω), one completes the proof by a density argu(Ω) such that ment: for any > 0 let us introduce some v ∈ (Ω) = C∞ c u − v H 1 (Ω) < .
(7.17)
0
Since v ∈ (Ω) ⊂ H 2 (Ω), for each > 0 we can use the previous argument and, by (7.15), we have (7.18) v − Π h (v )H 1 (Ω) ≤ C hv H 2 (Ω) . 0
Let us now write the triangle inequality u − Π h (v )H 1 (Ω) ≤ u − v H 1 (Ω) + v − Π h (v )H 1 (Ω) 0
0
0
and use inequalities (7.17) and (7.18) to obtain u − Π h (v )H 1 (Ω) ≤ + C hv H 2 (Ω) . 0
Since Π h (v ) ∈ V h , this implies dist(u,V h ) ≤ + C hv H 2 (Ω) .
i
i i
i
i
i
i
294
book 2014/8/6 page 294 i
Chapter 7. The finite element method
Hence
lim sup dist(u,V h ) ≤ . h→0
This being true for any > 0, we finally obtain lim dist(u,V h ) = 0,
h→0
which, by Proposition 3.1.2 (Cea’s lemma), implies the norm convergence in H01 (Ω) of u h to u as h → 0. Let us now give the proof of the piecewise affine interpolation inequality (7.15) which, together with the abstract Cea’s lemma, is the key ingredient of the proof of Theorem 7.4.1. Because of its importance and its own interest let us state it independently. Theorem 7.4.2. Let Ω be a polygon and ( h ) h→0 a regular family of triangulations of Ω (i.e., (7.13) is supposed to be satisfied). Then, there exists a constant C , which is independent of h, such that for any u ∈ H 2 (Ω) u − Π h (u)H 1 (Ω) ≤ C huH 2 (Ω) . We recall that Π h (u) is the piecewise affine interpolant of u relative to
h.
For pedagogical reasons, it is worthwhile to first prove Theorem 7.4.2 in one dimension, i.e., Ω = (a, b ) is an interval of R. Indeed, the role of the norms H 1 (Ω) and H 2 (Ω) and of the assumption u ∈ H 2 (Ω) already appear quite naturally in this situation, and the proof just requires elementary tools. We shall then consider the two-dimensional case and show how, in that case, one has to do some geometrical assumptions on h . (This is where the regularity assumption (7.13) on h plays a central role.) PROOF OF THEOREM 7.4.2 IN THE ONE-DIMENSIONAL CASE. Let Ω = (a, b ) be an interval of R with −∞ < a < b < +∞. Let a = a0 < a1 < a2 < · · · < an = b be a discretization of Ω, and set h = maxi |ai +1 − ai |. (a) Let us first assume that u is smooth, say, u ∈ C∞ ([a, b ]), and let x ∈ (a j , a j +1 ). The Taylor–Lagrange formula at order one yields !
" u(a j +1 ) − u(a j ) Π h (u) (x) = a j +1 − a j = u (a j + θ j )
for some 0 < θ j < h. Hence, for any x ∈ (a j , a j +1 ) |u (x) − Π h (u) (x)| = |u (x) − u (a j + θ j )| a j +1 ≤ |u (s)| d s. aj
Applying the Cauchy–Schwarz inequality, we obtain a j +1 |u (s)|2 d s. |u (x) − Π h (u) (x)|2 ≤ h aj
i
i i
i
i
i
i
7.4. Convergence of the finite element method
book 2014/8/6 page 295 i
295
The above inequality holds for any x ∈ (a j , a j +1 ). After integration on (a j , a j +1 ) one obtains a j +1 a j +1 2 2 |u (x) − Π h (u) (x)| d x ≤ h |u (s)|2 d s. aj
aj
Summing the above inequality with respect to j = 0, 1, . . . , N − 1 finally yields
b a
|u (x) − Π h (u) (x)|2 d x ≤ h 2
b
|u (s)|2 d s,
a
that is, u − Π h (u) L2 (Ω) ≤ hu L2 (Ω) .
(7.19)
Let us prove that there exists some constant C > 0 such that u − Π h (u)L2 (Ω) ≤ C hu L2 (Ω) .
(7.20)
Using the same argument and notation as above, we can write for x ∈ (a j , a j +1 ) u(x) − Π h (u)(x) = u(x) − u(a j ) +
u(a j +1 ) − u(a j ) a j +1 − a j
(x − a j )
= u(x) − u(a j ) − (x − a j )u (a j + θ j ) = (x − a j )u (a j + θ x, j ) − (x − a j )u (a j + θ j ), where 0 < θ x, j < x − a j . It follows that |u(x) − Π h (u)(x)| ≤ (x − a j )
a j +1
|u (t )| d t ,
aj
which by similar arguments as above yields h2 u − Π h (u)L2 (Ω) ≤ $ u L2 (Ω) . 3 Finally, by combining (7.19) and (7.20), one obtains u − Π h (u)H 1 (Ω) ≤ C huH 2 (Ω) .
(7.21)
(b) Let us now extend the above inequality to an arbitrary u ∈ H 2 (Ω). To that end one uses a density argument: noticing that C∞ ([a, b ]) is dense in H 1 (a, b ) and H 2 (a, b ), we just need to prove that the operator Π h : H 1 (Ω) → H 1 (Ω) is continuous. More precisely, one can state the following result of independent interest, which concludes the proof in the one-dimensional case. Lemma 7.4.1. Suppose Ω = (a, b ) and ( h ) h→0 is a discretization of Ω. Then, there exists a constant C > 0 such that for any h > 0, for any v ∈ H 1 (Ω), Π h (v)H 1 (Ω) ≤ C vH 1 (Ω) .
i
i i
i
i
i
i
296
book 2014/8/6 page 296 i
Chapter 7. The finite element method
PROOF. For any x ∈ (a j , a j +1 ) Π h (v) (x) = =
v(a j + 1) − v(a j ) a j +1 − a j a j +1 1 a j +1 − a j
v (t ) d t
aj
(see Theorem 5.1.1). Hence
Π h (v)
2
(x) ≤
1 (a j +1 − a j )2
which by the Cauchy–Schwarz inequality yields
Π h (v)
2
(x) ≤
1 a j +1 − a j
a j +1
|v (t )| d t
2 ,
aj
a j +1
|v (t )|2 d t .
aj
After integration on (a j , a j +1 ), and summation with respect to j , one obtains Π h (v) L2 (Ω) ≤ v L2 (Ω) . On the other hand, Π h (v)L2 (Ω) ≤ ≤
/ /
(7.22)
b − aΠ h (v)L∞ (Ω) b − avL∞ (Ω) .
(7.23)
In the one-dimensional case (N = 1) it was proved in Theorem 5.1.1 that each element of H 1 (a, b ) has a unique continuous representative. Let us verify by some elementary computation that this canonical embedding H 1 (a, b ) ⊂ C([a, b ]) is continuous, i.e., there exists some constant C > 0 such that ∀ v ∈ H 1 (a, b ),
vL∞ (a,b ) ≤ C vH 1 (a,b ) .
(7.24)
Recall that we still denote v˜ = v the continuous representative of v and that for any x0 , x ∈ [a, b ] x v (t ) d t . v(x0 ) = v(x) + x0
Let us apply the Cauchy–Schwarz inequality to the above formula: x |v(x0 )| ≤ |v(x)| + |v (t )| d t x0
≤ |v(x)| +
/
b −a
b
|v (t )|2 d t
1/2 .
a
Let us now integrate this inequality with respect to x ∈ [a, b ]: b 1/2 b 3/2 2 |v(x)| d x + (b − a) |v (t )| d t (b − a)|v(x0 )| ≤ a
≤ (b − a)
a
b
1/2
2
|v(x)| d x a
1/2 + (b − a)
3/2
b
1/2 2
|v (t )| d t
.
a
i
i i
i
i
i
i
7.4. Convergence of the finite element method
book 2014/8/6 page 297 i
297
The elementary inequality (α + β)2 ≤ 2(α2 + β2 ) now yields % &1/2 b b $ 1 vL∞ (a,b ) ≤ 2 v(x)2 d x + (b − a) v (x)2 d x b −a a a 1/2 $ 1 ,b −a ≤ 2 max vH 1 (a,b ) . b −a Combining (7.23) and (7.24) finally yields Π h (v)L2 (Ω) ≤ C
/
b − avH 1 (Ω) ,
which, together with (7.22), gives Π h (v)H 1 (Ω) ≤ C vH 1 (Ω) , and the proof of Lemma 7.4.1 is complete. Remark 7.4.1. Indeed one can prove the following result: for all v ∈ H 1 (a, b ) lim v − Π h (v)H 1 (a,b ) = 0.
h→0
(7.25)
This is slightly more precise than proving that for every u ∈ H 1 (a, b ) lim dist(u,V h ) = 0, h→0
which we have used in the proof of the convergence of the finite element method. The proof follows the lines of the previous arguments: given v ∈ H 1 (a, b ), for any > 0, let v ∈ H 2 (a, b ) with v − v H 1 (a,b ) < . We have v − Π h (v)H 1 (a,b ) ≤ v − v H 1 (a,b ) + v − Π h (v )H 1 (a,b ) + Π h (v − v )H 1 (a,b ) ≤ C v − v H 1 (a,b ) + C hv H 2 (a,b ) . It follows that
lim sup v − Π h (v)H 1 (a,b ) ≤ C . h→0
This being true for any > 0, the conclusion follows. PROOF OF THEOREM 7.4.2 IN THE TWO-DIMENSIONAL CASE. Suppose now that Ω is a polygon. The proof of the basic estimate for u ∈ H 2 (Ω), u − Π h (u)H 1 (Ω) ≤ C huH 2 (Ω) ,
(7.26)
is much more involved than in the one-dimensional case. To establish this result one needs to make some geometrical assumptions on the triangulation h ; this is where the regularity assumption (7.13) plays a central role. Indeed, to establish (7.26) we first are going to argue with a single triangle (the key step) and show the following result. Theorem 7.4.3. There exists a constant C > 0 such that for any triangle K, for any u ∈ H 2 (K), h u − Π(u)H 2 (K) ≤ C h h + (7.27) uH 2 (K) , ρ
i
i i
i
i
i
i
298
book 2014/8/6 page 298 i
Chapter 7. The finite element method
where Π(u) is the affine interpolant of u at the vertices of K, h is the diameter of K, and ρ is the diameter of the largest ball contained in K. PROOF OF THEOREM 7.4.2 CONTINUED. The basic estimate (7.26) and so Theorem 7.4.2 can be easily deduced from (7.27), as shown in the following. By taking the square of each member of (7.27), writing the corresponding inequalities for all the triangles of the triangulation h , and then summing these inequalities, one obtains h 2 2 2 u − Π h (u)H 1 (Ω) ≤ C h h + u2H 2 (Ω) . ρ Let us now use the regularity assumption (7.13) on the triangulation (h/ρ ≤ σ) and take h ≤ 1 to obtain u − Π h (u)H 1 (Ω) ≤ C h(1 + σ)uH 2 (Ω) , which proves (7.26). Thus our concern to complete the proof of the finite element method in the case N = 2 is to prove Theorem 7.4.3. A key idea in the process of getting the estimate (7.27) consists O which is used as a reference; first in establishing such a formula for a fixed triangle K, O equal to the unit simplex. take, for example, K O be a given triangle. Then there exists a constant C > 0 such that for Lemma 7.4.2. Let K 2 O any v ∈ H (K) 2 v − ΠKO (v)H 1 (K) (7.28) O ≤ C |D v|L2 (K) O , ∂ 2v 2 where |D 2 v|L2 (K) O := i1 i2 (x) d x. O i1 +i2 =2 K ∂ x1 ∂ x2
This is the two-dimensional version of inequalities (7.19) and (7.20). At this stage, we don’t need to know the precise value of the constant C , the point being just to know if such a constant exists. O to a triangle K of Then, we shall pass from the reference triangle K h by using an affine transformation, x = B xˆ + b = F (ˆ x ), where B is an invertible matrix and b ∈ R2 , which satisfies O K = F (K). The geometrical properties of the triangulation will appear through this transformation. PROOF OF LEMMA 7.4.2. To prove (7.28), and since we don’t need to know the precise value of C , we follow an analysis similar to the one in the proof of general Poincaré inequalities (Theorem 5.4.3, with the Poincaré–Wirtinger inequality as an example). The idea is to argue by contradiction and use the Rellich–Kondrakov compact embedding theorem, Theorem 5.4.2. Without ambiguity, for simplicity of the notation let us write K O instead of K. Suppose the assertion (7.28) is false. Then we could find a sequence (vn )n∈N such that ⎧ 2 ⎨ vn ∈ H (K) ∀ n ∈ N, 1 vn − ΠK (vn )H 1 (K) ≥ n. ⎩ 2 |D vn |L2 (K)
i
i i
i
i
i
i
7.4. Convergence of the finite element method
book 2014/8/6 page 299 i
299
Noticing that D 2 (ΠK (vn )) = 0, one can rewrite the above inequality in the following form: 1 2 vn − ΠK (vn ) ≤ . D vn − ΠK (vn ) H 1 (K) L2 (K) n Let us introduce the function un := We have
vn − ΠK (vn ) vn − ΠK (vn ) H 1 (K)
.
⎧ un ∈ H 2 (K), ⎪ ⎪ ⎪ ⎨u 1 = 1, n H (K)
|D 2 un |L2 (K) ≤ 1/n, ⎪ ⎪ ⎪ ⎩ u (a ) = 0, j = 1, 2, 3, where a j are the vertices of K. n j Let us show how to obtain a contradiction from this set of properties. From un H 1 (K) = 1 and |D 2 un |L2 (K) ≤ 1/n we obtain that the sequence (un )n∈N is bounded in H 2 (K). Since K is bounded and piecewise C1 one can apply the Rellich–Kondrakov theorem, which gives that the sequence (un )n∈N is relatively compact in H 1 (K). We can then extract a subsequence, which we still denote un , such that un → u
in H 1 (K).
Hence uH 1 (K) = 1. On the other hand, from |D 2 un |L2 (K) ≤ 1/n, we obtain that D 2 un → 0 = D 2 u
in L2 (K).
Hence u is an affine function on K. The linear map u → u(a j ) from H 2 (K) into R is continuous. Since un → u in H 2 (K) we have u(a j ) = 0, j = 1, 2, 3. The only affine function on K which is zero at the vertices is the function u = 0. This is a contradiction with uH 1 (K) = 1. This establishes (7.28) and concludes the proof of Lemma 7.4.2. Let us now consider an affine invertible transformation F (ˆ x ) = B xˆ + b
(7.29)
O and examine how it affects the formula (7.28). with K = F (K) The following notation and definitions will be helpful. We use the mappings O −→ F (ˆ x) = x ∈ K xˆ ∈ K F
O which is the inverse of F . To each function v defined on K one can and F −1 : K → K, O → R, which is defined by associate the function vˆ : K ˆ x ) = v(F (ˆ v(ˆ x )), which, with the above notation, gives ˆ x ) = v(x). v(ˆ
i
i i
i
i
i
i
300
book 2014/8/6 page 300 i
Chapter 7. The finite element method
Recall that F (ˆ x ) = B xˆ + b is an affine invertible map. The spectral norms of B and B −1 will play a central role in the following. Recall that these norms are defined by 4 5 B = sup |Bξ | : |ξ | = 1 , 5 4 B −1 = sup |B −1 ξ | : |ξ | = 1 , where |ξ | is the Euclidean norm. The geometrical characteristic properties of the triangulation (h and ρ) do appear naturally in the evaluation of these norms. Let us denote by hK and ρK (respectively, hKO O as defined in (7.13). and ρ O ) the geometrical characteristic numbers of K (respectively, K) K
Lemma 7.4.3. The following estimates hold: hK
B ≤
ρKO
B −1 ≤
,
hKO ρK
.
PROOF. We have 4 5 B = sup |Bξ | : |ξ | = 1 4 5 1 = sup |Bξ | : |ξ | = ρKO . ρKO O By definition of ρKO , for any ξ with |ξ | = ρKO one can find two points xˆ1 and xˆ2 in K such that ξ = xˆ2 − xˆ1 . Hence Bξ = B xˆ2 − B xˆ1 = F xˆ2 − F xˆ1 = x2 − x1 , O where x2 and x1 belong to K = F (K). By definition of hK , which is the diameter of K, we have |Bξ | = |x2 − x1 | ≤ hK . This inequality being true for any ξ with |ξ | = ρKO , we deduce B ≤
hK ρKO
.
O The other inequality is obtained in a similar way, by reversing the role of K and K. The other basic ingredient of the proof is the change of variables in the integrals which are equal to Sobolev norms. Let us write, for a given integer m ≥ 0 6 |v| m,K =
71/2
|α|=m
|∂ α v(x)|2 d x
.
(7.30)
K
i
i i
i
i
i
i
7.4. Convergence of the finite element method
book 2014/8/6 page 301 i
301
O be two finite elements which are affine equivalent, that is, K = Lemma 7.4.4. Let K and K O with F (ˆ F (K) x ) = B xˆ + b and B affine invertible. If a function v belongs to the space H m (K) O and there is a constant for some integer m ≥ 0, then the function vˆ = v ◦ F belongs to H m (K) C (m) > 0 such that ∀v ∈ H m (K)
ˆ m,KO ≤ C (m)B m | det B|−1/2 |v| m,K . |v|
Analogously, one has O ∀vˆ ∈ H m (K)
ˆ m,K . |v| m,K ≤ C (m)B −1 m | det B|1/2 |v|
PROOF. By standard density arguments, one just needs to argue with v ∈ C∞ (K). Hence O It is convenient to introduce the first and second derivatives of v: then Dv(x) vˆ ∈ C∞ (K). is a linear form and ∂v (x) = Dv(x) · ei , ∂ xi where (ei ) are the vectors of the canonical basis in RN (here N = 2). Similarly, D 2 v(x) is the bilinear symmetric form associated to the Hessian matrix and ∂ 2v ∂ xi ∂ x j
(x) = D 2 v(x) · (ei , e j ).
One can unify these two situations (and much more) by writing for any multi-index α = (α1 , α2 ) with length |α| = m ∂ α v(x) = D m v(x)(e1 , . . . , e1 , e2 , . . . , e2 ), where e1 is repeated α1 times, and e2 is repeated α2 times. Recall that ∂ α v(x) = (∂ |α| v/ α α ∂ x1 1 ∂ x2 2 )(x). Set 4 5 D m v(x) = sup |D m v(x)(ξ1 , . . . , ξ m )| : |ξi | ≤ 1, 1 ≤ i ≤ m . Then
|∂ α v(x)| ≤ D m v(x)
and |v| m,K =
|∂ α v(x)|2 d x
K |α|=m
≤ C1
1/2
1/2
D m v(x)2 d x
,
(7.31)
K
where C12 is the cardinal of the set of indices α such that |α| = m, i.e., C1 = C1 (m). We can now perform the differentiation rule for composition of functions. Recalling that ˆ x ) = v(F (ˆ v(ˆ x )) = v(B xˆ + b ) we have ˆ x )(ξ1 , . . . , ξ m ) = D m v(x)(Bξ1 , . . . , Bξ m ) D m v(ˆ so that
ˆ x ) ≤ D m v(x)B m . D m v(ˆ
i
i i
i
i
i
i
302
book 2014/8/6 page 302 i
Chapter 7. The finite element method
O one obtains Taking the square and after integration on K, ˆ x )2 d xˆ ≤ B2m D m v(ˆ D m v(F (ˆ x ))2 d xˆ. KO
O K
Using the formula of change of variables in multiple integrals we get m 2 2m −1 ˆ ˆ D v(ˆ x ) d x ≤ B | det(B )| D m v(x)2 d x. O K
(7.32)
K
Combining (7.31) and (7.32) we obtain m
−1/2
ˆ m,KO ≤ C1 (m)B | det B| |v|
!
D m v(x)2 d x
"1/2
.
K
Since conversely there exists a constant C2 (m) such that "1/2 ! D m v(x)2 d x ≤ C2 (m)|v| m,K ,
(7.33)
K
we finally obtain
ˆ m,KO ≤ C (m)B m | det B|−1/2 |v| m,K |v|
with C (m) = C1 (m)C2 (m). O yields the other inequality. Reversing the role of K and K END OF THE PROOF OF THEOREM 7.4.3. We now have all the ingredients of the proof of the basic estimate (7.27) in Theorem 7.4.3: h uH 2 (K) . u − Π(u)H 1 (K) ≤ C h h + ρ Let u ∈ H 2 (K). By Lemma 7.4.4 we have for m = 0, 1 P |u − Π(u)| m,K ≤ C B −1 m | det B|1/2 | uˆ − Π(u)| . m,KO
(7.34)
O yields for m = 0, 1 The estimate (7.28) on K P ˆ |2,KO . | uˆ − Π(u)| O ≤ C |u m,K
(7.35)
Applying again Lemma 7.4.4 we have | uˆ|2,KO ≤ C B2 | det B|−1/2 |u|2,K .
(7.36)
Combining (7.34), (7.35), and (7.36) we obtain |u − Π(u)| m,K ≤ C B −1 m B2 |u|2,K .
(7.37)
Take the square of inequality (7.37) and sum over m = 0, 1 to obtain u − Π(u)H 1 (K) ≤ C B2 1 + B −1 |u|2,K .
i
i i
i
i
i
i
7.5. Complements
book 2014/8/6 page 303 i
303
Using Lemma 7.4.3 we finally get u − Π(u)H 1 (K) ≤ C
hK2
ρ2O
K
1+
≤ C hK2 +
hKO
|u|2,K
ρK 2
hK
|u|2,K ,
ρK
O QK and hQ where 1/ρ K have been included in the constant C . Recall that K is a fixed reference triangle. Noticing that |u2,K | ≤ uH 2 (K) , the proof is complete. Remark 7.4.2. Note that we have obtained a slightly more precise result than (7.27); indeed, we proved that for every u ∈ H 2 (K) h u − Π h (u)H 1 (K) ≤ C h h + |u|2,K , ρ where |u|2,K = |D 2 u|L2 (K) just involves the L2 norm of the second-order partial derivatives of u. Consequently, in Theorem 7.4.2, we have that for any u ∈ H 2 (Ω) u − Π h (u)H 1 (Ω) ≤ C h|u|2,K . Similarly, in Theorem 7.4.1, we have that if u ∈ H 2 (Ω), u − u h H 1 (Ω) ≤ C h|u|2,K .
7.5 Complements 7.5.1 Flat triangles Let us return to the situation described in Figure 7.5, which illustrates a family of triangulations ( h ) h→0 involving triangles K h ∈ h becoming flat as h → 0. Let us show that on such triangles the affine interpolate can lead to significant errors. Take a simple function which is not affine, for example, a quadratic function u(x, y) = y 2 . Let us compute the affine interpolate Π h (u) of u on the triangle K h whose vertices are (0, h2 ), (0, − h2 ), and (h h , 0). We have that Π h (u) vanishes at (h h , 0) and is equal to h 2 /4 at the two other vertices. Hence ∂ ∂x Since
∂u ∂x
Π h (u) = −
= 0 we obtain | ∂∂x (u − Π h (u))| = 6
h 2 /4 h h
h 4 h
=−
h 4 h
.
and
71/2 2 ∂ h2 (u − Π h (u)) (x) d x = $ $ . 4 2 h Kh ∂ x
On the other hand, we have
∂ 2u ∂ x2
=
∂ 2u ∂ x∂ y
= 0 and
|u|2,Kh =
∂ 2u ∂ y2
= 2, which give
1/2 4dx dy Kh
=
$
$ 2h h .
i
i i
i
i
i
i
304
book 2014/8/6 page 304 i
Chapter 7. The finite element method
An inequality of the type u − Π h (u)H 1 (Kh ) ≤ C h|u|2,Kh would then imply
) ) ) ∂ ) ) ) (u − Π h (u))) ) )∂ x )
≤ C h|u|2,K ,
L2 (K h )
that is,
which is equivalent to
$ $ h2 $ $ ≤ C h 2h h , 4 2 h inf h > 0.
h>0
Thus, the convergence analysis developed in this chapter fails to be true without any geometrical assumption on the family of triangulations preventing the triangles from becoming flat.
7.5.2 H 2 (Ω) regularity of the solution of the Dirichlet problem on a convex polygon In the model situation studied in this section, we chose to take as Ω a polygon in R2 , to make as simple as possible the description of the triangulation in the finite element method. (Otherwise, for general Ω one has to approximate it by such polygonal sets Ω h .) Conversely, we have a difficulty, which is to know if the solution u of the Dirichlet problem with f ∈ L2 (Ω) −Δu = f on Ω, u = 0 on ∂ Ω satisfies the property u ∈ H 2 (Ω). Indeed the estimate u − u h H 1 (Ω) ≤ h 0
has been established under the assumption u ∈ H 2 (Ω). We are in a situation where Ω is a polygon, its boundary is not smooth (it is only piecewise C1 or Lipschitz continuous), and the classical Agmon–Douglis–Nirenberg theorem which asserts that u ∈ H 2 (Ω) under the assumption that Ω is of class C2 does not apply. The answer to this question is quite involved. It was studied by Grisvard in [233], [234], who proved in particular that if Ω is a polygon which is supposed to be convex, then u ∈ H 2 (Ω).
7.5.3 Finite element methods of type P2 The method which has been developed in R2 with Ω a polygon and finite elements which are triangles can be naturally extended to R3 when replacing triangles by tetrahedrons. Functions of the approximating subspaces V h are continuous and piecewise affine. This is what we call a finite element method of type P1 (by reference to the degree of the polynomial functions which are used). To improve the quality of the approximation, one may naturally think to enrich the approximating subspaces and make them contain more functions. This can be done, for example, by considering functions which are piecewise
i
i i
i
i
i
i
7.5. Complements
book 2014/8/6 page 305 i
305
polynomial of degree less than or equal to two. Let us briefly describe an example of such a finite element method of type P2 . Take Ω a polygon in R2 and a given triangulation h of Ω. We introduce the space 4 5 V h = v ∈ C(Ω) : vK ∈ P2 for every K ∈ h , where P2 is the family of polynomial functions on R2 of degree less than or equal to two. The general form of an element p ∈ P2 is then p(x, y) = a + b x + c y + d x 2 + e xy + f y 2 , and one can verify that P2 is a vector space of dimension equal to 6. Then, to fix an element p ∈ P2 , it is not sufficient to give its values at the vertices of a triangle (as was the case for p ∈ P1 ): we need to give its values at six points carefully chosen. Take, for example, the case of Figure 7.6, where a1 , a2 , a3 are the vertices of K and ai j = 12 (ai + a j ) are the midpoints of the edges of K.
Figure 7.6. Six points on the triangle K.
This choice leads to triangulations whose nodes are the vertices of the triangles and the midpoints of all the edges. As an illustration consider the case of Figure 7.7.
Figure 7.7. A triangulation for finite element method of type P2 .
Let us denote by (N j ) the nodes of h , j = 1, . . . , I (h) (vertices + midpoints of edges). It is quite elementary to verify that (a) V h is a subspace of dimension I (h) of H 1 (Ω) and any function v of V h is uniquely determined by its values at the nodes of the triangulation, (b) a basis of V h is given by the family of functions ( p j ) j =1,...,I (h) which is defined by p j ∈ Vh , p j (Ni ) = δi j
for i, j = 1, . . . , I (h).
For any element v ∈ V h one has v(x) =
I (h) j =1
v(N j ) p j (x).
i
i i
i
i
i
i
306
book 2014/8/6 page 306 i
Chapter 7. The finite element method
The finite element method can now be developed in a way parallel to what we did before. Indeed, as expected, one can get a better order in the approximation by piecewise P2 functions. Let us denote by I (h) v(N j ) p j Π h (v) = j =1
the element of V h obtained by interpolation of v on the nodes of the triangulation (Π h (v) = v on the nodes). Then, one can show the following result (which we do not prove): if the family of triangulations ( h ) h→0 is regular, then there exists a constant C > 0 such that for any u ∈ H 3 (Ω), u − Π h (u)H 1 (Ω) ≤ C h 2 |u|3,Ω .
i
i i
i
i
i
i
book 2014/8/6 page 307 i
Chapter 8
Spectral analysis of the Laplacian
8.1 Introduction From the very beginning of the 20th century, the study of the eigenvalue problem for the Laplace equation emerged as a fundamental topic in the theory of partial differential equations. In 1922, J. B. J. Fourier was faced with this question to develop the so-called separation of variables method. Let us illustrate this method in the case of the wave equation with Dirichlet boundary data, which, for example, gives a model for the vibrations of an elastic membrane which is clamped on its boundary. Given initial data u0 , u1 : Ω → R, one looks for a solution u : Q = Ω × (0, +∞) → R of the boundary value problem ⎧ 2 ∂ u ⎪ ⎪ ⎪ − Δu = 0 on Q, ⎪ ⎪ ⎪ ∂ t2 ⎪ ⎨ u =0 on Σ = ∂ Ω × (0, +∞), ⎪ (x) on Ω, u(x, 0) = u ⎪ 0 ⎪ ⎪ ⎪ ⎪ ∂u ⎪ ⎩ (x, 0) = u1 (x) on Ω. ∂t The idea is to look for a solution u of the form u(x, t ) = w(x)ϕ(t ),
(8.1)
where the dependence of u with respect to (x, t ) has been separated. The wave equation then becomes w(x)ϕ (t ) − ϕ(t )Δw(x) = 0 or, equivalently,
ϕ (t ) ϕ(t )
=
Δw(x) w(x)
.
(8.2)
Since the left-hand side of (8.2) is a function only of t and the right-hand side only of x, this forces these two expressions to be constant, that is, ϕ (t ) ϕ(t )
=
Δw(x) w(x)
= −λ
for some constant λ. 307
i
i i
i
i
i
i
308
book 2014/8/6 page 308 i
Chapter 8. Spectral analysis of the Laplacian
This method leads to the study of the spectral problem for the so-called Laplace– Dirichlet operator, −Δw = λw on Ω, (8.3) w = 0 on ∂ Ω, and the resolution of the ordinary differential equation ϕ (t ) + λϕ(t ) = 0.
(8.4)
One can easily verify that the eigenvalues of the spectral problem (8.3) are positive. (It is enough to multiply by w and integrate by parts on Ω.) Therefore, the solutions of (8.4) are of the following form: $ $ ϕ(t ) = Acos λt + B sin λt . The question is now, can one, by linear combinations of such separate solutions, obtain a solution ! $ $ " wi (x) Acos λi t + B sin λi t (8.5) u(x, t ) = i
which satisfies the initial data u(x, 0) = u0 (x) and (∂ u/∂ t )(x, t ) = u1 (x)? Indeed, this question is intimately related to the possibility of generating any given function by linear combinations (indeed, series!) of eigenvectors of the Laplace–Dirichlet operator. We shall give a positive answer to this question and prove the following theorem (which is the main result of this chapter). Assume Ω is a bounded open set in RN . Then, there exists a complete orthonormal system of eigenvectors of the Laplace–Dirichlet operator in the space L2 (Ω). A complete orthonormal system is also called a Hilbertian basis. This is a deep result of Rellich which is precisely based on the Rellich–Kondrakov theorem (compact embedding of H01 (Ω) into L2 (Ω); see Theorem 5.3.3) and on the abstract spectral decomposition theorem for compact self-adjoint operators. Indeed, variational methods play a central role in the theory of eigenvalues of elliptic partial differential equations. Another striking result in this direction is the variational characterization of eigenvalues of the Laplace–Dirichlet operator. One of the formulas provided by the Courant–Fisher min-max principle is the following: the first eigenvalue of the Laplace– Dirichlet operator is given by the variational formula 2 1 2 |∇v(x)| d x : v ∈ H0 (Ω), v(x) d x = 1 . λ1 (−Δ) = min Ω
Ω
This formula and its companions provide powerful tools for studying the properties of the eigenvalues and eigenvectors of the Laplace–Dirichlet operator. In 1911, Weyl used this principle to solve the problem on the asymptotic distribution of the eigenvalues of the Laplace–Dirichlet operator. We shall briefly describe such a result in Section 8.5. In the last two decades, spectral methods, just like finite element methods, have proved to be very efficient in the numerical treatment of some partial differential operators. They give raise to approximation methods where the finite dimensional approximating spaces Vn are based on (orthogonal) polynomials of high degree (by contrast with finite element methods, where the degree is fixed, one for the P1 method, two for the P2 method, for example). Here, the degree of polynomials of Vn increases with n. These methods provide accurate approximations of the solution, which are limited only by the regularity of the solution. But on the counterpart, there are some restrictions on the geometry of Ω. For pedagogical reasons and simplicity of exposition, we restrict our attention to the spectral analysis of the Laplace equation with Dirichlet boundary condition. In Section 8.6, we shall briefly survey some straight extensions of these results.
i
i i
i
i
i
i
8.2. The Laplace–Dirichlet operator: Functional setting
book 2014/8/6 page 309 i
309
8.2 The Laplace–Dirichlet operator: Functional setting Our objective is to study the eigenvalue problem for the Laplace equation on Ω with Dirichlet boundary conditions on ∂ Ω. We seek for λ ∈ R and u = 0 such that −Δu = λu on Ω, (8.6) u =0 on ∂ Ω. Throughout this chapter, Ω is assumed to be a bounded open set in RN . Let us give a precise meaning to this definition and write its variational formulation. Definition 8.2.1. We say that λ ∈ R is an eigenvalue of the Laplace–Dirichlet operator if there exists some u ∈ H01 (Ω), u = 0, such that ⎧ ⎨ ∇u(x) · ∇v(x) d x = λ u(x)v(x) d x Ω ⎩ uΩ∈ H 1 (Ω).
∀v ∈ H01 (Ω),
(8.7)
0
When such u exists it is called an eigenvector related to the eigenvalue λ. Remark 8.2.1. Let us make some comments to the definition above: (a) If (8.7) is satisfied, then −Δu = λu in the distribution sense, u = 0 in the trace sense, i.e., (8.6) is satisfied in a weak sense. The next step consists in proving that u is regular and hence it is a classical solution of (8.6). (b) One may wonder whether “λ ∈ R” is not too restrictive, and take instead λ ∈ C. Indeed, by taking v = u in (8.7) one obtains |∇u|2 d x Ω , λ= u 2 (x) d x Ω
which implies that all the eigenvalues of the Laplace–Dirichlet operator are positive real numbers. Let us now come to the central idea which will permit us to formulate the above problem in terms of classical operator theory. The classical theory for spectral analysis of operators in infinite dimensional spaces works with operators T : H −→ H , where H is a Hilbert space and T ∈ L(H ) is a linear, continuous, and compact operator from H into H . One cannot write −Δ in such a setting because, as for any differential operator, there is a loss of regularity when passing from u to −Δu. Indeed, −Δ is a linear continuous operator −Δ : H01 (Ω) −→ H −1 (Ω).
i
i i
i
i
i
i
310
book 2014/8/6 page 310 i
Chapter 8. Spectral analysis of the Laplacian
Another way to treat −Δ is to consider it as an operator from L2 (Ω) into L2 (Ω), but with a domain, i.e., dom(−Δ) = H 2 (Ω) ∩ H01 (Ω). By contrast, and that is the central idea, the inverse operator T = (−Δ)−1 is a nice operator which fits well with the classical theory. The straight relation which connects the spectrum of an operator and the spectrum of its inverse operator will permit us to conclude our analysis. Let us now define the operator T as the inverse of the Laplace–Dirichlet operator. Definition 8.2.2. The inverse of the Laplace–Dirichlet operator is the operator T : L2 (Ω) −→ L2 (Ω) which is defined for every h ∈ L2 (Ω) by the following: T h ∈ H01 (Ω) ⊂ L2 (Ω) is the unique solution of the variational problem ⎧ ⎪ ⎨ ∇(T h)(x) · ∇v(x) d x = h(x)v(x) d x Ω
Ω
⎪ ⎩T h ∈ H 1 (Ω).
∀v ∈ H01 (Ω),
0
Equivalently, T h is the variational solution of the Dirichlet problem (see Theorem 5.1.1) −Δ(T h) = h on Ω, T h = 0 on ∂ Ω. At this point let us notice that T : L2 (Ω) −→ H01 (Ω). To consider T as a linear continuous operator from a space H into itself we have two possibilities: either T : L2 (Ω) −→ L2 (Ω) or T : H01 (Ω) −→ H01 (Ω). These two approaches lead to similar parallel developments. We choose to consider T as acting from L2 (Ω) into L2 (Ω). We then have (−Δ) ◦ T = i dH ,
H = L2 (Ω),
i.e., T is the right inverse of −Δ. The introduction of T = (−Δ)−1 is justified in the context of the spectral analysis of the Laplace–Dirichlet operator by the following result. Lemma 8.2.1. The real number λ is an eigenvalue of the Laplace–Dirichlet operator iff 1/λ is an eigenvalue of T = (−Δ)−1 . PROOF. Let us assume that λ is an eigenvalue of the Laplace–Dirichlet operator, i.e., there exists some u ∈ H01 (Ω), u = 0, such that
Ω
∇u · ∇v d x = λ
uv d x Ω
∀v ∈ H01 (Ω).
i
i i
i
i
i
i
8.2. The Laplace–Dirichlet operator: Functional setting
book 2014/8/6 page 311 i
311
By definition of T = (−Δ)−1 this is equivalent to u = T (λu). By linearity of T (this is proved in the next proposition), and using the fact that λ = 0, we deduce 1 T (u) = u, λ i.e., 1/λ is an eigenvalue of T . The following properties of the operator T will play a central role in its spectral analysis. Proposition 8.2.1. The operator T satisfies the following properties: (i) T : L2 (Ω) −→ L2 (Ω) is a linear continuous operator, (ii) T is self-adjoint in L2 (Ω), (iii) T is compact from L2 (Ω) into L2 (Ω), (iv) T is positive definite. PROOF. (i)1 Take h1 , h2 ∈ L2 (Ω) and α1 , α2 ∈ R. By definition of T ∇(T h1 ) · ∇v d x = h1 v d x ∀v ∈ H01 (Ω),
Ω
Ω
Ω
∇(T h2 ) · ∇v d x =
Ω
h2 v d x
∀v ∈ H01 (Ω).
By taking a linear combination of these two equalities we obtain ⎧ ⎪ ⎨ ∇(α1 T h1 + α2 T h2 ) · ∇v d x = (α1 h1 + α2 h2 )v d x Ω
Ω
⎪ ⎩ α T h + α T h ∈ H 1 (Ω). 1 1 2 2 0
∀v ∈ H01 (Ω),
By uniqueness of the solution of the Dirichlet problem, we get T (α1 h1 + α2 h2 ) = α1 T h1 + α2 T h2 , which is the linearity of T . (i)2 Let us now prove that T : L2 (Ω) −→ L2 (Ω) is continuous. By definition of T , the equality Ω
∇(T h) · ∇v d x =
hv d x Ω
holds true for any v ∈ H01 (Ω). In particular, it is satisfied by v = T h ∈ H01 (Ω), which gives 2 |∇(T h)| d x = hT (h) d x. (8.8) Ω
Ω
Using the Cauchy–Schwarz inequality, we obtain 1/2 1/2 2 2 2 |∇(T h)| d x ≤ h dx (T h) d x . Ω
Ω
(8.9)
Ω
i
i i
i
i
i
i
312
book 2014/8/6 page 312 i
Chapter 8. Spectral analysis of the Laplacian
Let us now use the Poincaré inequality and the fact that Ω is bounded (Theorem 5.3.1). There exists a constant C > 0 which depends only on Ω such that ∀v ∈
H01 (Ω)
1/2 2
≤C
v(x) d x Ω
1/2 2
Ω
|∇v(x)| d x
.
In particular, since for every h ∈ L2 (Ω) we have T h ∈ H01 (Ω), we can write the Poincaré inequality with v = T h, which gives 1/2
Ω
|T h(x)|2 d x
≤C
1/2
Ω
|∇(T h(x))|2 d x
.
(8.10)
Let us combine inequalities (8.9) and (8.10) to obtain 1/2
2
Ω
|∇(T h)| d x ≤ C
Equivalently
2
h dx Ω
Ω
1/2 2
Ω
1/2
2
|∇(T h)| d x
|∇(T h)| d x
1/2 2
≤C
.
h dx
,
(8.11)
Ω
which, together with (8.10), gives
Ω
1/2
|T h(x)|2 d x
≤ C2
1/2 h2 d x
.
(8.12)
Ω
Thus, we have obtained ∀h ∈ L2 (Ω)
T hL2 (Ω) ≤ C 2 hL2 (Ω) ,
(8.13)
i.e., T is a linear continuous operator from L2 (Ω) into L2 (Ω). Indeed, we have obtained a sharper result: from (8.11) and (8.12) we deduce that / T hH 1 (Ω) ≤ C 1 + C 2 hL2 (Ω) , (8.14) ∀h ∈ L2 (Ω) 0
i.e., T : L2 (Ω) −→ H01 (Ω) is a linear continuous operator. This proves that one can also treat T as a linear continuous operator from H01 (Ω) into H01 (Ω). (ii) Let us now prove that T is a self-adjoint operator in L2 (Ω), i.e., ∀ g , h ∈ L2 (Ω)
〈T h, g 〉L2 (Ω) = 〈h, T g 〉L2 (Ω) ,
which means
Ω
(T h)(x) g (x) d x =
Ω
h(x)(T g )(x) d x.
By definition of T h and T g we have ∇(T h) · ∇v d x = hv d x
Ω
Ω
∇(T g ) · ∇v d x =
Ω
gv dx Ω
∀v ∈ H01 (Ω), ∀v ∈ H01 (Ω).
i
i i
i
i
i
i
8.3 Existence of a Hilbertian basis of eigenvectors
book 2014/8/6 page 313 i
313
Take v = T g ∈ H01 (Ω) in the first equality and v = T h ∈ H01 (Ω) in the second equality. We obtain ∇(T h) · ∇(T g ) d x = hT ( g ) d x = 〈h, T g 〉L2 (Ω) ,
Ω
Ω
∇(T g ) · ∇(T h) d x =
Ω
Ω
g T (h) d x = 〈g , T h〉L2 (Ω) .
Hence 〈h, T g 〉L2 (Ω) = 〈g , T h〉L2 (Ω) =
Ω
∇(T h) · ∇(T g ) d x,
which shows that T is self-adjoint in L2 (Ω). (iii) T is compact from L2 (Ω) into L2 (Ω). Take B a bounded set in L2 (Ω). By (8.14), since T is linear and continuous from L2 (Ω) into H01 (Ω), the set T (B) is bounded in H01 (Ω). We now use the fact that Ω is bounded and the Rellich–Kondrakov theorem, Theorem 5.3.3, to conclude that T (B) is relatively compact in L2 (Ω). (iv) T is positive definite. By (8.8) we have 2
∀v ∈ L (Ω)
〈T h, h〉 =
Ω
|∇(T h)|2 d x ≥ 0,
that is, T is positive. Moreover, if 〈T h, h〉 = 0, then ∇(T h) = 0, that is, T h is locally constant. Since T h ∈ H01 (Ω), this forces T h to be equal to zero. Coming back to the definition of T h, T h = 0 means that Ω hv d x = 0 for all v ∈ H01 (Ω). By the density of H01 (Ω) into L2 (Ω), we conclude that h = 0. We can summarize the results of this section and say that the operator T = (−Δ)−1 is a linear continuous, self-adjoint, compact, positive operator from L2 (Ω) into L2 (Ω). This will allow us to obtain in the next section the spectral decomposition of the Laplace– Dirichlet operator.
8.3 Existence of a Hilbertian basis of eigenvectors of the Laplace–Dirichlet operator Let us first recall the well-known (see, for instance, Brezis [137]) abstract “diagonalization” theorem for compact self-adjoint positive definite operators. Theorem 8.3.1. Let us assume that H is a separable Hilbert space with dim H = +∞. Let T : H −→ H be a linear continuous self-adjoint compact and positive definite operator. Then we have the following: (i) T is diagonalizable: there exists a Hilbertian basis of eigenvectors of T . (ii) The set Λ(T ) of eigenvalues of T is countable. It can be written as a sequence (μn )n∈N of positive distinct real numbers that decreases to zero as n → +∞ 0 ← μn < · · · < μ3 < μ2 < μ1 . (iii) For each μn ∈ Λ(T ), Eμn = ker(T − μn I ) is a finite dimensional subspace of H : it is the eigensubspace relative to the eigenvalue μn . Its dimension is called the multiplicity of μn .
i
i i
i
i
i
i
314
book 2014/8/6 page 314 i
Chapter 8. Spectral analysis of the Laplacian
(iv) For all μi = μ j , μi , μ j ∈ Λ(T ), Eμi ⊥ Eμ j (orthogonal subspaces). (v) H = ⊕n∈N Eμn , i.e., ∀x ∈ H
x=
projE (x)
n∈N
and ∀x ∈ H
Tx=
n∈N
μn
μn projEμ (x). n
Remark 8.3.1. The situation described in the above statement is simplified by the fact that here we have assumed T be positive definite, i.e., ker T = 0, which allows us to avoid considering μ = 0 in the spectral decomposition. (The null space ker T may be infinite dimensional.) SKETCH OF THE PROOF OF THEOREM 8.3.1. It is worthwhile to recall some of the basic ingredients of the proof of Theorem 8.3.1. (a) First notice that Λ(T ) is a bounded subset of (0, +∞): if μ ∈ Λ(T ), then there exists some u ∈ H , u = 0 such that T u = μu. Hence
〈T u, u〉 = μ|u|2 .
Since u = 0 we have 〈T u, u〉 > 0 (T is positive definite), which forces μ to be positive. On the other hand, since T is linear continuous, μ|u|2 ≤ |T u|H |u|H ≤ T L(H ,H ) |u|2 , which gives
0 < μ ≤ T L(H ,H ) .
(b) Now take μ = ν with μ, ν ∈ Λ(T ). By definition of Λ(T ) and Eμ , Eν we have ∀h ∈ Eμ T h = μh, ∀k ∈ Eν T k = νk. We deduce
〈T h, k〉 = μ〈h, k〉, 〈T k, h〉 = ν〈k, h〉.
Since T is self-adjoint, we have 〈T h, k〉 = 〈h, T k〉. Hence (μ − ν)〈h, k〉 = 0. We have assumed μ = ν. This forces h and k to satisfy 〈h, k〉 = 0, that is, Eμ ⊥ Eν . (c) It is interesting to see where the compactness on T comes into play. Let us notice that for any μ ∈ Λ(T ) the subspace Eμ is closed and invariant by T ; if h ∈ Eμ , i.e., T h = μh, then T (T h) = μ(T h), i.e., T h ∈ Eμ . Hence T : Eμ −→ Eμ and Eμ is a Hilbert space for the induced structure of H . Moreover, if BEμ denotes the unit ball in Eμ , we have T (BEμ ) = μBEμ .
i
i i
i
i
i
i
8.3 Existence of a Hilbertian basis of eigenvectors
book 2014/8/6 page 315 i
315
Since μ = 0 and T is compact, this forces BEμ to be relatively compact, that is, dim Eμ < +∞. We have obtained the following decomposition of H as a Hilbertian sum of eigenspaces: R Eμ n . (8.15) H= n∈N
To derive from this formula a Hilbertian basis of H we have to pick up in each Eμn an orthonormal basis whose cardinal is equal to the (finite) multiplicity of μn . To keep a quite simple notation we adopt the following convention. Definition 8.3.1. We now decide to count the eigenvalues of T according to their multiplicity, that is, μ1 is repeated k1 times where k1 is the multiplicity of μ1 ... μn is repeated kn times where kn is the multiplicity of μn ... and so on. Clearly, in this way, we obtain a sequence of positive real numbers, which we still denote by (μn )n∈N , such that 0 ← μn ≤ · · · μ3 ≤ μ2 ≤ μ1 . Note that now the μi are not necessarily distinct. The convention above allows us to pick up an orthonormal basis in each finite dimensional eigensubspace to obtain a Hilbertian basis (hn )n∈N in H which satisfies T h n = μn h n for every n ∈ N. Let us now come back to our model example. Clearly, by Proposition 8.2.1, the operator T = (−Δ)−1 which is considered as acting from H = L2 (Ω) into L2 (Ω) satisfies all the conditions of the abstract diagonalization theorem, Theorem 8.3.1. Thus, there exists a Hilbertian basis (en )n∈N of L2 (Ω) such that for each n ∈ N en is an eigenvector of T . More precisely, T e n = μn e n , and (μn )n∈N is a sequence of positive numbers which decreases to zero. By Lemma 8.2.1, we deduce that 1/μn is an eigenvalue of the Laplace–Dirichlet operator and that en is a corresponding eigenvector. This means −Δen = μ1 en on Ω, n
en = 0
on ∂ Ω,
the solution en being taken in the variational sense, i.e., en ∈ H01 (Ω) and 1 ∇en · ∇v d x = e v d x ∀v ∈ H01 (Ω). μn Ω n Ω Indeed, it is immediate to verify that the above equality is equivalent to T (en /μn ) = en , that is, T (en ) = μn en .
i
i i
i
i
i
i
316
book 2014/8/6 page 316 i
Chapter 8. Spectral analysis of the Laplacian
Let us now forget the operator T = (−Δ)−1 which was just a technical ingredient in our study and convert the previous results directly in terms of −Δ, the Laplace–Dirichlet operator. Noticing that the sequence (λn )n∈N with λn = 1/μn is now an increasing sequence of positive numbers which tends to +∞ as n → +∞, we obtain the following theorem. Theorem 8.3.2. The Laplace–Dirichlet operator has a countable family of eigenvalues (λn )n∈N which can be written as an increasing sequence of positive numbers which tends to +∞ as n → +∞: 0 < λ1 ≤ λ2 ≤ · · · ≤ λ n ≤ · · · . Each eigenvalue is repeated a number of times equal to its multiplicity (which is finite). There exists a Hilbertian basis (en )n∈N of L2 (Ω) such that for each n ∈ N, en is an eigenvector of the Laplace–Dirichlet operator relatively to the eigenvalue λn : −Δen = λn en on Ω, en = 0 on ∂ Ω. We already mentioned that the spectral analysis of the Laplace–Dirichlet operator could have been, as well, developed in the space H01 (Ω). Indeed, there is a direct link between these two approaches, which is described below. / Proposition 8.3.1. The family en / λn n∈N is a Hilbertian basis of the space H01 (Ω) equipped with the scalar product 〈u, v〉 = Ω ∇u · ∇v d x. PROOF. (a) Let us start from the definition of en : ∇en · ∇v d x = λn en v d x Ω
Ω
∀v ∈ H01 (Ω).
By taking v = en ∈ H01 (Ω), we obtain 2 |∇en | d x = λn en2 d x = λn . Ω
Hence
Ω
en 2H 1 (Ω) = λn 0
and
) ) ) en ) )/ ) = 1. ) ) λn H01 (Ω) (b) Let us verify the orthogonality property in H01 (Ω): take n = m, A B en em 1 =/ ∇en · ∇e m d x / ,/ λn λ m H 1 (Ω) λn λ m Ω 0 1 λn en e m d x, =/ λn λ m Ω
which is equal to zero because (en )n∈N is an orthogonal system in L2 (Ω).
i
i i
i
i
i
i
8.4. The Courant–Fisher min-max and max-min formulas
/ (c) Let us verify that en / λn
n∈N
book 2014/8/6 page 317 i
317
generates a vector space which is dense in H01 (Ω).
Equivalently, we have to verify that if f ∈ H01 (Ω) is such that for all n ∈ N, B A en f,/ = 0, λn H 1 (Ω) 0
then f = 0. Let us notice that A
en f,/ λn
B
1 =/ λn H 1 (Ω) 0
=
λn
Ω
Ω
∇ f · ∇en d x f en d x.
Since λn = 0, our assumption becomes Ω f en d x = 0 for all n ∈ N, which clearly implies f = 0 because (en )n∈N is an orthonormal basis in L2 (Ω).
8.4 The Courant–Fisher min-max and max-min formulas Let us start with the variational characterization of the first eigenvalue λ1 (−Δ) of the Laplace–Dirichlet operator. Without ambiguity, we write λ1 . To introduce this result, let us notice that if λ is an eigenvalue of −Δ, then there exists some u ∈ H01 (Ω), u = 0, such that ∇u · ∇v d x = λ uv d x ∀v ∈ H01 (Ω). Ω
Ω
By taking v = u and noticing that u = 0 we obtain |∇u(x)|2 d x Ω λ= . u(x)2 d x
(8.16)
Ω
The above expression plays a central role in the variational approach to eigenvalue problems for the Laplace equation. Definition 8.4.1. For any v ∈ H01 (Ω), v = 0, let us write |∇v(x)|2 d x =(v) = Ω . v(x)2 d x Ω
=:
H01 (Ω) −→ R+
is called the Rayleigh quotient.
From (8.16) we immediately obtain that for any eigenvalue λ of the Laplace–Dirichlet operator, 5 4 λ ≥ inf =(v) : v ∈ H01 (Ω), v = 0 , which is equivalent to saying
4 5 λ1 ≥ inf =(v) : v ∈ H01 (Ω), v = 0 .
(8.17)
i
i i
i
i
i
i
318
book 2014/8/6 page 318 i
Chapter 8. Spectral analysis of the Laplacian
Indeed, there is equality between these two expressions. That is the object of the following theorem. Theorem 8.4.1 (Courant–Fisher formula). Assume Ω is a bounded open subset of RN . The first eigenvalue λ1 of the Laplace–Dirichlet operator on Ω is given by the following variational formula: ⎧ ⎫ ⎪ ⎪ 2 ⎪ ⎪ ⎪ ⎪ ⎨ |∇v(x)| d x ⎬ Ω 1 : v ∈ H0 (Ω), v = 0 . λ1 = min ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ v 2 (x) d x ⎩ ⎭ Ω
Moreover, the infimum above is achieved and the solutions of this variational problem are the eigenvectors relative to the first eigenvalue λ1 . PROOF. By (8.17) we need only to prove the inequality 4 5 inf =(v) : v ∈ H01 (Ω), v = 0 ≥ λ1 , that is,
∀v ∈ H01 (Ω), v = 0,
=(v) ≥ λ1 .
(8.18)
The idea is to express, for any v ∈ H01 (Ω), =(v) in a Hilbertian basis of eigenvectors of (−Δ). Indeed, we know by Theorem 8.3.2 and Proposition 8.3.1 that there exists a Hilbertian basis (en )n∈N of L2 (Ω) such that en is an eigenvector of −Δ relatively to the / eigenvalue λn and that en / λn is a corresponding Hilbertian basis in H01 (Ω). By using the Bessel–Parseval inequality respectively in H01 (Ω) and L2 (Ω), we have +∞ en 2 2 2 vH 1 (Ω) = |∇v(x)| d x = v, / , (8.19) 0 λn H01 (Ω) Ω n=1 +∞ v2L2 (Ω) = v(x)2 d x = 〈v, en 〉2L2 (Ω) . (8.20) Ω
n=1
One can easily compare these two quantities by using that en is an eigenvalue of −Δ relatively to λn . Indeed, (8.19) gives "2 +∞ 1 ! 2 vH 1 (Ω) = ∇v · ∇en d x 0 Ω n=1 λn "2 +∞ λ2 ! +∞ n = en v d x = λn 〈v, en 〉2L2 (Ω) Ω n=1 λn n=1 ≥ λ1
+∞ n=1
〈v, en 〉2L2 (Ω) = λ1 v2L2 (Ω) ,
where in the last equality we use (8.20). Hence, =(v) ≥ λ1 for any v ∈ H01 (Ω), v = 0, which concludes the first part of the theorem. Let us now notice that by (8.16), any eigenvector v relative to the eigenvalue λ1 is a solution of the Courant–Fisher minimization problem. Conversely, if v is such a solution, by the above Parseval equalities (8.19), (8.20) we must have =(v) =
v2H 1 (Ω) 0
v2L2 (Ω)
+∞
λ 〈v, en 〉2 n=1 n = = λ1 , +∞ 〈v, en 〉2 n=1
(8.21)
i
i i
i
i
i
i
8.4. The Courant–Fisher min-max and max-min formulas
i.e.,
+∞
book 2014/8/6 page 319 i
319
(λn − λ1 )〈v, en 〉2 = 0.
n=1
This forces 〈v, en 〉 to be equal to zero for all indices n ∈ N∗ such that λn = λ1 . Equivalently, v must belong to the eigenspace relative to λ1 . Another elegant and direct approach to the Courant–Fisher formula relies on the theory of Lagrange multipliers. Let us start with the following equivalent formulation of the Courant–Fisher variational formula for λ1 : |∇v(x)|2 d x : v ∈ H01 (Ω), v(x)2 d x = 1 . (+ ) inf Ω
Ω
Problem (+ ) is now seen as a constrained minimization problem, namely, the minimization of the Dirichlet energy functional on the unit sphere of L2 (Ω). That is where Lagrange multipliers naturally come into play! Let us first notice that by using the direct methods of the calculus of variations and the Rellich–Kondrakov theorem, one can easily prove that problem (+ ) admits a solution u. To write some optimality condition satisfied by such solution u we take V = H01 (Ω) with 〈u, v〉 = Ω ∇u · ∇v d x and v2 = 〈v, v〉, and we consider the functionals F (v) = |∇v(x)|2 d x = v2 , Ω G(v) = v 2 (x) d x. Ω
Then (+ ) is equivalent to
4 5 min F (v) : G(v) = 1, v ∈ V . 4 5 The sphere S = v ∈ V : G(v) = 1 is a submanifold of class C1 and codimension 1 in V . Indeed, for any u, v we have, as t → 0, 1 G(u + t v) − G(u) −→ 2 uv d x. t Ω
Let us interpret this limit as a linear continuous form on H01 (Ω): uv d x = 〈h, v〉H 1 (Ω) = ∇h · ∇v d x ∀v ∈ H01 (Ω) Ω
means
0
Ω
−Δh = u on Ω, h = 0 on ∂ Ω,
i.e., h = T u, where T = (−Δ)−1 . Hence G is Fréchet differentiable on V and ∇G(u) = 2T u. The theory of Lagrange multipliers applies in our situation and we have that there exists μ ∈ R such that ∇F (u) = μ∇G(u), that is,
u = μT u.
i
i i
i
i
i
i
320
book 2014/8/6 page 320 i
Chapter 8. Spectral analysis of the Laplacian
Equivalently,
−Δu = μu on Ω, u = 0 on ∂ Ω,
which implies
|∇u(x)|2 d x μ = =(u) = Ω = λ1 . u(x)2 d x Ω
Let us summarize the previous results in the following statement. Proposition 8.4.1. The first eigenvalue λ1 (−Δ) of the Laplace–Dirichlet operator is a Lagrange multiplier of the constrained minimization problem
2
min Ω
|∇v(x)| d x : v ∈
H01 (Ω),
2
Ω
v (x) d x = 1 .
The above approach provides a direct variational proof of the existence of an eigenvalue (indeed, the first λ1 (−Δ)) of the Laplace–Dirichlet operator. The whole theory can then be developed in this way, by using a recursive argument: in the next step we can apply the same argument in the orthogonal subspace of V1 (which is the eigenspace relative to λ1 ) and so on. Let us make this precise in the following statement. Proposition 8.4.2. Let 0 < λ1 ≤ λ2 ≤ · · · ≤ λn ≤ · · · be the sequence of the eigenvalues of the Laplace–Dirichlet operator (repeated in accordance with their multiplicities) and (en )n∈N a corresponding Hilbertian basis of eigenvectors in L2 (Ω): −Δen = λn en on Ω, en = 0 on ∂ Ω. Let us denote by Vn the subspace of H01 (Ω) generated by the first eigenvectors e1 , . . . , en , Vn = span{e1 , e2 , . . . , en }, and by Vn⊥ the orthogonal of Vn in H01 (Ω) with respect to the scalar product of H01 (Ω): 〈u, v〉 = ∇u · ∇v d x. Ω
Then the following variational formulas hold: 4 5 λ1 = min =(v) : v ∈ H01 (Ω), v = 0 , 4 5 λ2 = min =(v) : v ∈ V1⊥ , v = 0 , ... 4 5 ⊥ λn = min =(v) : v ∈ Vn−1 , v = 0 , .... PROOF. By definition of λn and en ∇en · ∇v d x = λn en v d x Ω
Ω
∀v ∈ H01 (Ω).
i
i i
i
i
i
i
8.4. The Courant–Fisher min-max and max-min formulas
book 2014/8/6 page 321 i
321
By taking v = en we obtain =(en ) = λn ,
n = 1, 2, . . . .
⊥ , we first obtain Hence, by noticing that en ∈ Vn−1
4 5 ⊥ , v = 0 . λn ≥ inf =(v) : v ∈ Vn−1
(8.22)
To obtain the reverse inequality, we use an argument similar to the one used in the proof of Theorem 8.4.1, which relies on the Bessel–Parseval equality. Recall that (see (8.21)) for any v ∈ H01 (Ω) +∞ λ 〈v, ei 〉2L2 (Ω) i =1 i =(v) = +∞ 〈v, ei 〉2L2 (Ω) i =1 and that 〈v, ei 〉H 1 (Ω) = λi 〈v, ei 〉2L2 (Ω) , which allows passing from orthogonality in H01 (Ω) 0
to orthogonality in L2 (Ω) and vice versa. ⊥ Hence, for any v ∈ Vn−1 we have
i ≥n
=(v) = Therefore
λi 〈v, ei 〉2L2 (Ω)
2 i ≥n 〈v, ei 〉L2 (Ω)
≥ λn .
5 4 ⊥ , v = 0 ≥ λn . inf =(v) : v ∈ Vn−1
(8.23)
Combining inequalities (8.22) and (8.23) yields the result. The fact that the infimum is actually attained follows from the direct methods of the calculus of variations, being the functional v → =(v) coercive and weakly lower semicontinuous. We can now state the Courant–Fisher min-max and max-min formulas. Theorem 8.4.2 (Courant–Fisher min-max and max-min formulas). With the same notation as in Proposition 8.4.2, we have λn = min
max =(v)
M ∈"n v∈M , v =0
= max
min
M ∈"n−1 v∈M ⊥ , v =0
=(v),
where "n is the class of all n-dimensional linear subspaces in H01 (Ω) and M ⊥ stands for the orthogonal subspace of M in H01 (Ω). PROOF. (a) Take first M = Vn = span{e1 , . . . , en }. For any v ∈ Vn , we have 〈v, ei 〉H 1 (Ω) = 0 〈v, ei 〉L2 (Ω) for all i > n. By expressing =(v) as (see (8.21)) +∞ i =1
λi 〈v, ei 〉2L2 (Ω)
=(v) = +∞ i =1
we obtain
n
i =1
=(v) = n
〈v, ei 〉2L2 (Ω)
λi 〈v, ei 〉2L2 (Ω)
i =1
〈v, ei 〉2L2 (Ω)
,
≤ λn .
i
i i
i
i
i
i
322
book 2014/8/6 page 322 i
Chapter 8. Spectral analysis of the Laplacian
Hence
5 4 max =(v) : v ∈ Vn , v = 0 ≤ λn
(8.24)
(indeed equality holds taking v = en ) and consequently min max =(v) ≤ λn .
M ∈"n v∈M ,v =0
Let us now prove the reverse inequality. Equivalently, we have to prove that for any n-dimensional subspace M of H01 (Ω), 4 5 (8.25) λn ≤ max =(v) : v ∈ M , v = 0 . Take a subspace M of H01 (Ω) with dim M = n. We claim that ⊥
= {0}. M ∩ Vn−1
This is a consequence of the classical relation linking the dimension of the image and the kernel of a linear mapping: take P : M −→ Vn−1 to be the linear mapping which, to any v ∈ M , associates P (v) = projVn−1 v, the projection of v on Vn−1 . We have dim M = n = dim(ker P ) + dim(P (M )). Since P (M ), the image of M per P , is contained in Vn−1 , we have dim(P (M )) ≤ n − 1. Hence
dim(ker P ) ≥ n − (n − 1) = 1.
⊥ Equivalently, there exists some v ∈ M , v = 0 such that projVn−1 v = 0, that is, v ∈ M ∩Vn−1 , which proves the claim. ⊥ , v = 0. We know by Proposition 8.4.2 that Take now any v ∈ M ∩ Vn−1
4 5 ⊥ λn = min =(v) : v ∈ Vn−1 . Hence λn ≤ =(v) 4 5 ≤ max =(v) : v ∈ M , v = 0 , which proves (8.25) and completes the proof of the min-max formula. (b) The proof of the max-min formula is very similar to the proof of the min-max formula. First note that by Proposition 8.4.2, 4 5 ⊥ λn = min =(v) : v ∈ Vn−1 , v = 0 . (8.26) Hence
λn ≤ sup
inf
⊥ M ∈"n−1 v∈M , v =0
=(v).
(8.27)
To prove the reverse inequality, we need to show that for any M ∈ "n−1 we have λn ≥
inf
v∈M ⊥ , v =0
=(v).
(8.28)
i
i i
i
i
i
i
8.4. The Courant–Fisher min-max and max-min formulas
book 2014/8/6 page 323 i
323
We claim that there exists some v ∈ M ⊥ ∩ Vn with v = 0. This can be obtained by considering the linear mapping Q : Vn −→ M which, to any v ∈ Vn , associates Q(v) = projM v. We have n = dimVn = dim(ker Q) + dim Q(Vn ) . Since dim Q(Vn ) ≤ dim M = n − 1, we have dim(ker Q) ≥ 1. Equivalently, there exists some v ∈ Vn , v = 0 such that projM v = 0, that is v ∈ M ⊥ ∩ Vn . We now use (8.24) to obtain λn ≥ =(v) ≥
inf
v∈M ⊥ , v =0
=(v),
that is, (8.28). Hence λn = supM ∈"n−1 infv∈M ⊥ , v =0 =(v). Moreover, by (8.26) we have that the sup is a max (it is precisely attained by taking M = Vn−1 ) and the inf is a min (take v = en ). Finally, λn = max
min
M ∈"n−1 v∈M ⊥ v =0
=(v),
which ends the proof. Remark 8.4.1. (1) It is worth pointing out that the Courant–Fisher min-max and maxmin principles, which give a variational characterization of the eigenvalues of the Laplace– Dirichlet operator, hold for the sequence (λn )n∈N of eigenvalues which is expressed according to the multiplicity condition. This is another justification of this convention which gives the information on the values of the eigenvalues and on their multiplicities. For these reasons, we call the (λn )n∈N with the multiplicity condition (i.e., λn is repeated a number of times equal to its multiplicity) the sequence of eigenvalues of the Laplace– Dirichlet operator. (2) In Proposition 8.4.2, the (λn )n∈N are obtained by a recursive formula: one has first to know Vn−1 to obtain λn . By contrast, the Courant–Fisher min-max and maxmin principles provide a direct variational formulation of the eigenvalues of the Laplace– Dirichlet operator. (3) The Courant–Fisher min-max principle is the point of departure for the Ljusternik– Schnirelman theory of critical points. Indeed, in 1930, Ljusternik wrote, “The theory of eigenvalues of quadratic form developed by R. Courant enables one to discern their existence and reality without calculations. We shall generalize their theory to arbitrary functions having continuous second partial derivatives.” Typically, the Ljusternik theory deals with the variational approach to nonlinear eigenvalue problems of the type F (u) = λu,
u ∈ X , λ ∈ R, u = 1,
where X is a separable Hilbert space, dim X = +∞, and F : X −→ R is even, of class C1 , with F compact. Let us conclude this section by making a direct connection between the first eigenvalue λ1 (−Δ) of the Laplace–Dirichlet operator and the Poincaré constant (cf. Definition 5.3.1). Let us recall that the Poincaré constant is the smallest constant C such that for any v ∈ H01 (Ω) 1/2 1/2 2 2 v(x) d x ≤C |∇v(x)| d x . Ω
Ω
i
i i
i
i
i
i
324
book 2014/8/6 page 324 i
Chapter 8. Spectral analysis of the Laplacian
This is equivalent to saying that 1 C
2
4 5 = inf =(v) : v ∈ H01 (Ω), v = 0 ,
i.e., 1/C 2 = λ1 . In other words, we have obtained the following result. Proposition 8.4.3. The Poincaré constant C and the first eigenvalue λ1 of the Laplace– Dirichlet operator are related by the following formula: 1 C2
= λ1 .
8.5 Multiplicity and asymptotic properties of the eigenvalues of the Laplace–Dirichlet operator The first eigenvalue λ1 (−Δ) plays a fundamental role, for example, in the analysis of the resonance phenomena for vibrating structures and in some related shape optimization problems. Indeed, the first eigenvalue λ1 (−Δ) enjoys remarkable properties as stated in the following result. Theorem 8.5.1. Let Ω be a bounded connected regular open set in RN . The first eigenvalue λ1 of the Laplace–Dirichlet operator has multiplicity equal to one. Its eigenspace is generated by a vector e1 ∈ H01 (Ω) such that e1 > 0 on Ω. PROOF. Let us denote by E1 the eigenspace relative to the first eigenvalue λ1 . We recall that the Courant–Fisher theorem, Theorem 8.4.1, asserts that the elements v ∈ E1 , v = 0, are the solutions of the minimization problem ⎧ ⎫ ⎪ ⎪ ⎪ |∇v(x)|2 d x ⎪ ⎪ ⎪ ⎨ ⎬ Ω 1 (+ ) : v ∈ H0 (Ω), v = 0 . min ⎪ ⎪ ⎪ ⎪ 2 ⎪ ⎪ v(x) d x ⎩ ⎭ Ω
An important consequence of this formula is that if v ∈ E1 , then automatically |v| ∈ E1 . This follows from the fact that the truncations operate on the space H01 (Ω). In particular, see Corollary 5.8.1, for any v ∈ H01 (Ω), |v| ∈ H01 (Ω), and =(|v|) = =(v). The following argument proceeds by contradiction and makes use of the strong maximum principle. Suppose that one can find two elements v1 and v2 in the eigensubspace E1 which are not proportional. Because of the regularity assumption on Ω, v1 and v2 are smooth functions and it makes sense to consider their values at any point x ∈ Ω. Thus we can find x0 and x1 ∈ Ω such that α1 v1 (x0 ) + α2 v2 (x0 ) = 0, α1 v1 (x1 ) + α2 v2 (x1 ) = 0 for some α1 , α2 ∈ R. Now take w = |α1 v1 + α2 v2 |. Since v1 , v2 ∈ E1 , we have α1 v1 + α2 v2 ∈ E1 and w = |α1 v1 + α2 v2 | still belongs to E1 (as shown just above, as a consequence of the Courant– Rayleigh variational formula for λ1 ). Let us summarize the properties of w:
i
i i
i
i
i
i
8.5 Multiplicity and asymptotic properties of the eigenvalues
book 2014/8/6 page 325 i
325
⎧ w∈E , ⎪ ⎪ ⎨ w ≥ 0,1 w(x0 ) = 0, ⎪ ⎪ ⎩ w(x1 ) = 0. Since −Δw = λ1 w, from λ1 > 0 and w ≥ 0, we deduce that ⎧ ⎨ −Δw ≥ 0, w = 0 on ∂ Ω, ⎩ w(x ) = 0, x ∈ Ω. 0 0 The strong maximum principle property of Hopf now implies that w = 0 on Ω, a clear contradiction to the fact that w(x1 ) = 0. Thus E1 has dimension one. By taking any vector w ∈ E1 \ {0} and e1 = |w| we obtain a vector in E1 which satisfies, by using again the strong maximum principle, e1 > 0 on Ω. Corollary 8.5.1. Let Ω be as in Theorem 8.5.1 and take any eigenvector ei of the Laplace– Dirichlet operator corresponding to an eigenvalue λi > λ1 . Then the sign of ei is not constant on Ω. PROOF. Since λi = λ1 we have E(λi ) ⊥ E(λ1 ) and Ω, this forces ei to change sign on Ω.
e (x)e1 (x) d x Ω i
= 0. Since e1 > 0 on
This means that the status of the first eigenvalue is very particular. It is the only eigenvalue which possesses an eigenvector with constant sign. Indeed, in the analysis of the second eigenvalue problem λ2 (−Δ), the nodal set of a second eigenvector (the set where it is equal to zero) plays a central role. The explicit computation of the spectrum of the Laplace–Dirichlet operator is possible only in very particular situations. Nevertheless, even in situations where such a computation is not possible (or is too complicated), one can get rather precise information on the spectrum by using comparison arguments. To stress the dependence of eigenvalues with respect to Ω, let us denote by (λn (Ω))n∈N the sequence of eigenvalues of the Laplace– Dirichlet operator on Ω (with the multiplicity convention). Then, as a direct consequence of the Courant–Fisher min-max principle, we have the following comparison result. ˜ be two open bounded subsets of RN with Ω ⊂ Ω. ˜ Then, for Proposition 8.5.1. Let Ω and Ω any n ≥ 1, ˜ ≤ λ (Ω), λ (Ω) n
n
i.e., λn (Ω) is a decreasing function of Ω. PROOF. For any v ∈ H01 (Ω), let us denote by v˜ the function which is equal to v on Ω and ˜ \ Ω. By Proposition 5.1.1, we have v˜ ∈ H 1 (Ω). ˜ Moreover, zero on Ω 0
2
˜ 2 d x, |v(x)| 2 ˜ 2 d x. |∇v(x)| d x = |∇v(x)| Ω
Ω
|v(x)| d x =
˜ Ω
˜ Ω
i
i i
i
i
i
i
326
book 2014/8/6 page 326 i
Chapter 8. Spectral analysis of the Laplacian
˜ by the mapping Hence, H01 (Ω) can be isometrically identified with a subspace of H01 (Ω) i˜
˜ v ∈ H01 (Ω)−→v˜ = i˜(v) ∈ H01 (Ω). ˜ ) ∈ " (Ω) ˜ is an If M ∈ "n (Ω) is an n-dimensional subspace of H01 (Ω), then i(M n 1 ˜ n-dimensional subspace of H0 (Ω). We can now apply the Courant–Fisher min-max formula (Theorem 8.4.2) to obtain λn (Ω) = min
max =(v, Ω)
= min
˜ ˜ Ω) max =(v,
M ∈"n (Ω) v∈M , v =0 M ∈"n (Ω) v∈M , v =0
=
˜ max =(w, Ω)
min
W =i˜(M ), M ∈"n (Ω) w∈W \{0}
˜ = λ (Ω), ˜ max =(w, Ω) n
≥ min
˜ w∈W \{0} M ∈"n (Ω)
which ends the proof. To go further and use this comparison result we need to know some particular situations where the spectrum of −Δ can be explicitly computed. Let us start with the simplest situation, that is, N = 1 and Ω = (0, 1). Proposition 8.5.2. Let N = 1 and Ω = (0, 1). Then the eigenvalues (λn )n∈N of the Laplace– Dirichlet operator are given by λn = n 2 π 2 ,
n = 1, 2 . . . ,
and the corresponding orthonormal basis (en )n∈N of eigenvectors in L2 (Ω) is given by en (x) =
$
2 sin(nπx).
PROOF. The proof is elementary. When solving the ordinary differential equation u + λu = 0, one obtains
$ $ u(x) = Asin( λx) + B cos( λx).
The boundary condition u(0) = 0 gives B = 0 and the boundary condition u(1) = 0 $ gives sin λ = 0, that is, λ = n 2 π2 , for some n ≥ 1. The$corresponding solution is u(x) = Asin(nπx). After L2 -normalization one obtains A = 2. In this very simple situation, each eigenvalue has multiplicity one. Let us now study the Laplace equation with the Dirichlet boundary condition on the N -cube Ω = (0, 1)N and the corresponding eigenvalue problem. Proposition 8.5.3. Let Ω = (0, 1)N . For each p = ( p1 , p2 , . . . , pN ) with pi ∈ N \ {0}, i = 1, 2, . . . , N (i.e., p ∈ (N∗ )N ), the positive real number λ p := π2 ( p12 + p22 + · · · + pN2 )
i
i i
i
i
i
i
8.5 Multiplicity and asymptotic properties of the eigenvalues
book 2014/8/6 page 327 i
327
is an eigenvalue of the Laplace–Dirichlet operator on Ω = (0, 1)N and the function u p (x) = 2N /2
N . i =1
sin(π pi xi )
is an eigenfunction corresponding to the eigenvalue λ p . Indeed, 4 5 Λ(−Δ, Ω) = λ p : p ∈ (N∗ )N , i.e., of −Δ on Ω = (0, 1)N can be expressed in this way, and the family 4 all the eigenvalues 5 ∗ N u p : p ∈ (N ) is an orthonormal basis of L2 (Ω). PROOF. Take p = ( p1 , p2 , . . . , pN ), v ∈ (Ω) and compute N ∂ u ∂v p ∇u p (x) · ∇v(x) d x = (x) (x) d x. ∂ xi Ω Ω i =1 ∂ xi Let us notice that u p (x) = Consequently,
N . i =1
e pi (xi )
∂ ∂ xi
u p (x) =
with e pi (xi ) = %. j =i
$
2 sin(π pi xi ).
& e p j (x j ) e p (xi ). i
(e p stands for the derivative of the function of one variable e pi (·).) Hence i
Ω
∇u p (x) · ∇v(x) d x =
N i =1
·
1 0
e p (xi ) i
.
(0,1)N −1 j =i
∂v ∂ xi
(x) d xi
e p j (x j ) d x1 . . . d xi −1 d xi +1 . . . d xN .
An integration by parts yields 1 1 ∂v e p (xi ) (x) d xi = − e p (xi )v(x) d xi i i ∂ xi 0 0 1 = π2 pi2 e pi (xi )v(x) d xi . 0
(The last equality follows from the fact that
e p i
+ π2 pi2 e pi = 0, since e pi is an eigenvector
relative to the eigenvalue π2 pi2 of the one-dimensional Laplace–Dirichlet problem.) Thus 6 7 N 2 2 ∇u p (x) · ∇v(x) d x = π pi u p (x)v(x) d x. Ω
i =1
Ω
By a classical density and extension by continuity argument, this equality can be extended to an arbitrary v ∈ H01 (Ω), and ⎧ ⎨ ∇u p (x) · ∇v(x) d x = λ p u p v d x ∀v ∈ H01 (Ω), Ω ⎩ uΩ ∈ H 1 (Ω), p
0
i
i i
i
i
i
i
328
book 2014/8/6 page 328 i
Chapter 8. Spectral analysis of the Laplacian
which precisely means that λ p = π2 N p 2 is an eigenvalue of the Laplace–Dirichlet i =1 i J operator on (0, 1)N , and u p (x) = N e (x ) is a corresponding eigenvector. i =1 pi i 5 4 Let us now show that the family of eigenfunctions u p : p ∈ (N∗ )N is an orthonormal basis of L2 (Ω). First let us notice that if p = q, then there exists at least one i ∈ {1, 2, . . . , N } such that sin( pi πx) and cos(qi πx), pi = qi . From the orthogonality in L2 (0,4 1) of the two functions 5 we immediately obtain that the family u p : p ∈ (N∗ )N is orthogonal in L2 (Ω). 4 5 The point that is more delicate is to prove that the family u p : p ∈ (N∗ )N generates L2 (Ω) in the topological sense, that is, the vector space generated by this family of vectors is dense in L2 (Ω). Indeed, by a careful application of the Fubini theorem, one can prove the following result (which is quite classical in integration theory and we omit its proof). Lemma 8.5.1. Let (v p ) p∈N∗ and (wq )q∈N∗ be two Hilbertian bases of L2 (0, 1). Then the family of functions (x, y) → v p (x)wq (y) is a Hilbertian basis of L2 ((0, 1)2 ). 4 Thus, by iterating this result a finite number of times we obtain that the family u p : 5 p ∈ (N∗ )N is an orthonormal basis of L2 (Ω). This clearly implies that by taking Λ = 4 5 λ p : p ∈ (N∗ )N we have obtained all the eigenvalues; otherwise, there would exist / Λ. some v ∈ H01 (Ω), v = 0, which is an eigenvector corresponding to some eigenvalue λ ∈ 2 (Ω) to all the By the orthogonality property this would imply that v is orthogonal in L 4 5 u p : p ∈ (N∗ )N which forms a basis, and hence v = 0, a clear contradiction. This completes the proof of the spectral analysis of the Laplace–Dirichlet operator in the case Ω = (0, 1)N . Propositions 8.5.3 and 8.5.1 (comparison principle) permit us to obtain a sharp estimation of the asymptotic behavior of the sequence (λn (Ω))n∈N∗ of the eigenvalues of the Laplace–Dirichlet operator in a bounded open set Ω in RN . Indeed, we can prove the following result. Theorem 8.5.2. Let (λn (Ω))n∈N be the sequence of the eigenvalues of the Laplace–Dirichlet operator in a bounded open set Ω in RN (with the multiplicity convention). Then, there exist two positive constants cΩ and dΩ , which depend only on Ω, such that for all n ≥ 1, cΩ n 2/N ≤ λn (Ω) ≤ dΩ n 2/N . SKETCH OF THE PROOF. The proof is quite technical but the idea is very simple. The idea consists in the comparison of Ω with two N -cubes Qa and Q b such that Qa ⊂ Ω ⊂ Q b , Qa = (−a/2, a/2)N , Q b = (−b /2, b /2)N . Then Proposition 8.5.1 applies and one obtains λn (Qa ) ≤ λn (Ω) ≤ λn (Q b ). Then the problem has been reduced to the evaluation of λn (Qa ) and λn (Q b ). Clearly λn (Qa ) = λn (Q)/a 2 and λn (Q b ) = λn (Q)/b 2 , where Q = (0, 1)N . By Proposition 8.5.3, p 2 with pi ∈ the numbers λn (Q)/π2 are precisely the positive integers of the form N i =1 i 4 N 2 5 p : pi ∈ N∗ as an increasing N \ {0}. Thus, one has to arrange the numbers i =1 i5 4 2 sequence to obtain the sequence λn (Q)/π : n ∈ N∗ : this is just a combinatorial
i
i i
i
i
i
i
8.6. A general abstract theory for spectral analysis of elliptic boundary value problems
book 2014/8/6 page 329 i
329
problem! To that end, it is convenient to introduce for any t > 0 the quantity νN (t ) which p2 ≤ t . is the cardinal of all the elements p ∈ (N∗ )N such that p = ( p1 , . . . , pN ) with N i =1 i Then, the key of the proof consists in showing the following estimate: νN (t ) ∼ CN t N /2 for some constant CN > 0. Remark 8.5.1. From Proposition 8.5.3, we can obtain, as indicated before, after some combinatorial argument, a complete description of the sequence (λn )n∈N of the eigenvalues of the Laplace–Dirichlet operator in Ω = (0, 1)N . For example, (a) N = 2. Then λ1 (Ω) = 2π2 , λ2 (Ω) = 5π2 , λ3 (Ω) = 8π2 , λ4 (Ω) = 10π2 , ....
multiplicity = 1 (no surprise!), multiplicity = 2, multiplicity = 1, multiplicity = 2,
λ1 (Ω) = 3π2 , λ2 (Ω) = 6π2 , λ3 (Ω) = 9π2 , ....
multiplicity = 1 (no surprise!), multiplicity = 3, multiplicity = 3,
(b) N = 3. Then
Indeed, in the various examples we have encountered, the multiplicity of the second eigenvalue of the Laplace–Dirichlet operator does not obey a simple rule: it may be equal to one, two, three, . . . . This makes a sharp contrast to the first eigenvalue λ1 (−Δ), which always has multiplicity one, and makes the second eigenvalue more delicate to work with. Remark 8.5.2. The previous analysis of the spectrum of the Laplace–Dirichlet operator ˜ such on a general bounded open set Ω relies on the comparison of Ω with a reference set Ω ˜ or Ω ˜ ⊂ Ω. There is another way to obtain comparison results, which that either Ω ⊂ Ω consists in the use of rearrangement results (Steiner symmetrization). The idea is that this type of transformation preserves the measure and the L2 norm of the functions and makes smaller the (L2 )N norm of the gradient of a function (Dirichlet integral). In this way it is possible to compare the corresponding Rayleigh quotients. This device is very useful in shape optimization; it permits us, for example, to solve an old problem from Rayleigh which consists in proving that the ball minimizes the first eigenvalue among all open sets of given volume.
8.6 A general abstract theory for spectral analysis of elliptic boundary value problems So far, we have considered the spectral analysis of the Laplacian with Dirichlet boundary conditions. To be able to develop a similar analysis for more general linear elliptic operators and for different types of boundary conditions (like Neumann, mixed, . . .), let us introduce the following abstract setting: (i) Let V and H be two real Hilbert spaces (infinite dimensional spaces) such that i
V −→ H ; we assume that
i
i i
i
i
i
i
330
book 2014/8/6 page 330 i
Chapter 8. Spectral analysis of the Laplacian
• V can be embedded in H by i which is linear continuous and one to one, H
• V is dense in H (i.e., i(V ) = H ), • V is compactly embedded in H (i.e., i is compact) (as a typical example, take V = H01 (Ω) and H = L2 (Ω) with their usual Hilbertian structures). (ii) Let a : V × V −→ R, (u, v) → a(u, v), be a bilinear form on V × V which is symmetric continuous and coercive: ∃α > 0 such that ∀ v ∈ V
a(v, v) ≥ αv2 .
Here · stands for the norm in V . The norm and the scalar product in H are respectively denoted by | · |H (| · | without ambiguity) and 〈·, ·〉H (〈·, ·〉 without ambiguity). Note that we are in the situation of the Lax–Milgram theorem, Theorem 3.1.2, and for any L ∈ V ∗ there exists a unique u ∈ V which satisfies a(u, v) = L(v) ∀v ∈ V . Noticing that for any h ∈ H the linear form v ∈ V → 〈h, v〉H is continuous on V , we deduce from the Lax–Milgram theorem the existence of a unique solution u = T h of the following problem:
a(T h, v) = 〈h, v〉H , T h ∈ V.
By using the same device as in Proposition 8.2.1, we can easily prove that T : H −→ H is a linear continuous, self-adjoint, compact, and positive definite operator. Thus, one can apply to T the abstract diagonalization theorem, Theorem 8.3.1, for compact, self-adjoint, positive definite operators and conclude that there exists a Hilbertian basis (en )n∈N in H of eigenvectors, T e n = μn e n , with (μn )n∈N , the decreasing sequence of positive eigenvalues (with multiplicity condition), which tends to zero as n → +∞. Note that now the family (en )n∈N is a Hilbertian basis of V , when V is equipped with the scalar product 〈〈u, v〉〉 := a(u, v) (which is equivalent to the initial one). We can now give a precise description of the solutions of the abstract spectral problem: find λ ∈ R such that there exists u ∈ V , u = 0, which satisfies a(u, v) = λ〈u, v〉H
∀v ∈ V .
(When such u = 0 exists, it is called an eigenvector relative to the eigenvalue λ.)
i
i i
i
i
i
i
8.6. A general abstract theory for spectral analysis of elliptic boundary value problems
book 2014/8/6 page 331 i
331
Theorem 8.6.1. Assume that the canonical injection of V into H is dense and compact and that the continuous bilinear form a : V × V −→ R is symmetric and coercive on V × V (V -elliptic). Then the eigenvalues λ of the abstract variational problem find λ ∈ R such that there exists u ∈ V , u = 0, a(u, v) = λ〈u, v〉H ∀ v ∈ V , can be written as an increasing sequence of positive numbers (λn )n∈N which tends to +∞ as n → +∞ 0 < λ1 ≤ λ2 ≤ λ3 ≤ · · · ≤ λ n ≤ · · · . (We again adopt the multiplicity convention: each eigenvalue is repeated a number of times equal to its multiplicity, which is finite.) There exists an orthonormal basis (Hilbertian basis) (en )n∈N of H such that for each n ∈ N, en is an eigenvector relative to the eigenvalue λn : a(en , v) = λn 〈en , v〉H ∀v ∈ V , en ∈ V . / Moreover, the sequence en / λn n∈N is a Hilbertian basis of V when this space is equipped with the (equivalent) scalar product a(·, ·). We consider now some applications of the results above. (1) NEUMANN PROBLEM. Take in this case 1 v(x) d x = 0 and V = v ∈ H (Ω) : Ω
a(u, v) =
Ω
∇u(x) · ∇v(x) d x.
The coercivity of a on V × V follows from the Poincaré–Wirtinger inequality (Corollary 5.4.1) and the compact embedding V −→ H = L2 (Ω) from the Rellich–Kondrakov theorem, Theorem 5.4.2 (Ω is assumed smooth and bounded). From Theorem 8.6.1, we deduce the existence of a Hilbertian basis (en )n∈N in L2 (Ω) such that ⎧ ⎨ −Δen = λn en on Ω, ∂ en ⎩ = 0 on ∂ Ω, ∂ν where
∂ ∂ν
stands for the normal outward derivative on the boundary ∂ Ω.
4 (2) MIXED DIRICHLET–NEUMANN PROBLEM (see Section 6.3). By taking V = v ∈ 5 H 1 (Ω) : γ0 (v) = 0 on Γ0 with N −1 (Γ0 ) > 0, one obtains the existence of a Hilbertian basis in L2 (Ω), (en )n∈N such that ⎧ −Δen = λn en on Ω, ⎪ ⎪ ⎨ e = 0 on Γ , n 0 ⎪ ∂ e ⎪ n ⎩ = 0 on Γ1 = Γ \ Γ0 . ∂ν (3) One can obtain similar results by replacing −Δ by an elliptic linear operator A of the form
∂ ∂v Av = − ai , j + a0 u ∂ xj i , j ∂ xi or more generally when considering elliptic systems (elasticity, Stokes, . . . ).
i
i i
i
i
i
i
332
book 2014/8/6 page 332 i
Chapter 8. Spectral analysis of the Laplacian
Remark 8.6.1. The variational approach of Courant–Fisher works without any particular difficulty in such a general setting. One introduces the abstract Rayleigh quotient =(v) = a(v, v)/|v|2H and thus 4 5 λ1 = min =(v) : v ∈ V , v = 0 , λn = min
max =(v)
M ∈"n v∈M , v =0
= max
min
M ∈"n−1 v∈M ⊥ , v =0
=(v).
Let us end this chapter and return to the situation which was our first motivation for this study, the method of separation of variables of Fourier, applied to the wave equation ⎧ 2 ∂ u ⎪ ⎪ ⎪ − Δu = 0 on Q = Ω × (0, +∞), ⎪ ⎪ ⎪ ∂ t2 ⎪ ⎨ u = 0 on Σ = ∂ Ω × (0, +∞), u(x, 0) = u0 (x) on Ω, ⎪ ⎪ ⎪ ⎪ ⎪ ∂u ⎪ ⎪ ⎩ (x, 0) = u1 (x) on Ω. ∂t
Denote by 0 < λ1 < λ2 ≤ λ3 ≤ · · · ≤ λn ≤ · · · the eigenvalues of the Laplace–Dirichlet operator and by (en )n∈N a corresponding Hilbertian basis in L2 (Ω) of eigenvectors. Set $ ωn = λn . Then the unique (variational) solution of the above problem is given by the following formula (for u0 ∈ H01 (Ω) and u1 ∈ L2 (Ω) given): u(t ) =
% & 1 〈u0 , en 〉L2 (Ω) cos(ωn t ) + 〈u1 , en 〉L2 (Ω) sin(ωn t ) en . ωn n=1
+∞
Here the variational solution is taken in the following sense: for any 0 < T < +∞ u ∈ C (0, T ); H01 (Ω) ∩ C1 (0, T ); L2 (Ω) and ⎧ 2 d ⎪ ⎪ ⎪ 〈u(t ), v〉L2 (Ω) + ∇u(t ) · ∇v d x = 0 ⎪ ⎨ dt2 Ω ⎪ ⎪ ⎪ ⎪ ⎩ u(0) = u , 0
du dt
in the distributional sense on (0, T ) ∀v ∈ H01 (Ω),
(0) = u1 .
(See, for example, Raviart–Thomas [323, Section 8.2] for further details.)
i
i i
i
i
i
i
book 2014/8/6 page 333 i
Chapter 9
Convex duality and optimization
In this chapter, unless otherwise specified, (V , · V ) is a general normed linear space with topological dual V ∗ . For any v ∈ V and v ∗ ∈ V ∗ , we write v ∗ (v) = 〈v ∗ , v〉(V ∗ ,V ) . Recall that V ∗ is a Banach space when equipped with the dual norm v ∗ V ∗ = sup{〈v ∗ , v〉 : vV ≤ 1}. Without ambiguity, for simplicity of notation, we write · instead of ·V , ·∗ instead of · V ∗ and 〈v ∗ , v〉 instead of 〈v ∗ , v〉(V ∗ ,V ) .
9.1 Dual representation of convex sets We know that several basic geometrical objects in a normed linear space V can be described by using continuous linear forms, i.e., elements of the topological dual space. For example, a closed hyperplane H can be written H = {v ∈ V : 〈v ∗ , v〉 = α} for some v ∗ ∈ V ∗ , v ∗ = 0, and α ∈ R. Similarly, a closed half-space can be written = {v ∈ V : 〈v ∗ , v〉 ≤ α}. Intersections of finite collections of closed half-spaces yield convex polyhedra. Indeed, we are going to show that arbitrary closed convex sets in V can be described by using only linear continuous forms. This is what we call a dual representation. This theory is based on the Hahn–Banach theorem (which we stated in Theorem 3.3.1), which is formulated below. Theorem 9.1.1. Let C be a nonempty closed convex subset of a normed linear space V . Then, each point u ∈ / C can be strongly separated from C by a closed hyperplane, which means ∃u ∗ ∈ V ∗ , u ∗ = 0, ∃α ∈ R such that 〈u ∗ , u〉 > α and 〈u ∗ , v〉 ≤ α ∀ v ∈ C . From a geometrical point of view, this means that C is contained in the closed halfspace M N {u ∗ ≤α} := v ∈ V : 〈u ∗ , v〉 ≤ α , whereas u is in the complement: {u ∗ >α} := {v ∈ V : 〈u ∗ , v〉 > α}. 333
i
i i
i
i
i
i
334
book 2014/8/6 page 334 i
Chapter 9. Convex duality and optimization
PROOF. Let us give the proof of Theorem 9.1.1 when V is a Hilbert space. In that case, one can give a constructive proof relying on the projection theorem on a closed convex subset. (In the general case of a normed linear space V , one can use the analytic version of the Hahn–Banach theorem, which itself is a consequence of the Zorn lemma.) Let us denote by PC (u) the projection of u on C . It is characterized by the angle condition (optimality condition) ∀ v ∈ C, 〈u − PC (u), v − PC (u)〉 ≤ 0 PC (u) ∈ C . / C , we have z = 0 and we can rewrite the above inequality Set z := u − PC (u). Since u ∈ in the following form: sup 〈z, v〉 ≤ 〈z, PC (u)〉.
(9.1)
v∈C
On the other hand, by definition of z and since z = 0 0 < |z|2 = 〈z, z〉 = 〈z, u〉 − 〈z, PC (u)〉, which implies
〈z, PC (u)〉 < 〈z, u〉.
(9.2)
Take α := 〈z, PC (u)〉. Combining (9.1) and (9.2) we obtain sup 〈z, v〉 ≤ α < 〈z, u〉, v∈C
i.e., C ⊂ {〈z,.〉≤α} and u ∈ {〈z,.〉>α} . As a direct consequence of Theorem 9.1.1 we obtain the following corollary. Corollary 9.1.1. Let C be a nonempty closed convex subset of a normed linear space V . Then C is equal to the intersection of all closed half-spaces that contain it: ' {v ∗ ≤α} . C= C ⊂{v ∗ ≤α}
PROOF. Let us denote by 0 the set M N 0 = (v ∗ , α) ∈ V ∗ × R : C ⊂ {v ∗ ≤α} . ( ( Clearly C ⊂ (v ∗ ,α)∈0 {v ∗ ≤α} . Let us prove the converse inclusion (v ∗ ,α)∈0 {v ∗ ≤α} ⊂ C . By taking the complement, this is equivalent to proving " 3 ! V \C ⊂ V \ {v ∗ ≤α} , (v ∗ ,α)∈0
which is precisely the conclusion of the Hahn–Banach separation theorem, Theorem 9.1.1. Among closed convex sets, an important subclass is obtained by taking the intersection of a finite number of closed half-spaces.
i
i i
i
i
i
i
9.1. Dual representation of convex sets
book 2014/8/6 page 335 i
335
Definition 9.1.1. A closed convex polyhedron P is an intersection of finitely many closed half-spaces: in other words, there exist v1∗ , . . . , vk∗ ∈ V ∗ with vi∗ = 0 and α1 , . . . , αk ∈ R such that 5 4 P = v ∈ V : 〈vi∗ , v〉 ≤ αi for i = 1, . . . , k . In the representation of closed convex sets as the intersection of closed half-spaces, it is natural to look for the simplest representation. To that end, let us observe the following elementary facts: (a) α ≥ α and C ⊂ {v ∗ ≤α} =⇒ C ⊂ {v ∗ ≤α } ; (b) fixing v ∗ = 0 and making α vary provides parallel hyperplanes. From Corollary 9.1.1 and the above observations, we deduce ' ' C= {v ∗ ≤α} . v ∗ ∈V ∗ , v ∗ =0
(9.3)
{∃α∈R : C ⊂{v ∗ ≤α} }
The question we have to examine is to describe, for a given v ∗ ∈ V ∗ , v ∗ = 0, such that there exists some α ∈ R with C ⊂ {v ∗ ≤α} , what is the intersection of all the parallel half-spaces {v ∗ ≤α} which contain C . The answer to this question gives rise to the notion of support function. Proposition 9.1.1. For any v ∗ ∈ V ∗ , v ∗ = 0, such that C ⊂ {v ∗ ≤α} for some α ∈ R we have ' {v ∗ ≤α} = {v ∗ ≤σC (v ∗ )} , {α : C ⊂{v ∗ ≤α} }
4 where σC (v ∗ ) := sup 〈v ∗ , v〉 : v ∈ C }. In other words, for any given v ∗ ∈ V ∗ , v ∗ = 0, such that C ⊂ {v ∗ ≤α} for some α ∈ R, the intersection of all the “parallel” closed half-spaces {v ∗ ≤α} containing C is the closed half-space {v ∗ ≤σC (v ∗ )} , where σC (v ∗ ) is defined as above. It is convenient to extend the definition of σC to an arbitrary v ∗ ∈ V ∗ by allowing it to take the value +∞. Definition 9.1.2. For any subset C of V , the function σC : V ∗ → R ∪ {+∞} defined by 4 5 σC (v ∗ ) = sup 〈v ∗ , v〉 : v ∈ C is called the support function of the set C . PROOF OF PROPOSITION 9.1.1. (a) For any v ∈ C , by definition of σC , we have 〈v ∗ , v〉 ≤ σC (v ∗ ). Hence, C ⊂ {v ∗ ≤σC (v ∗ )} , which clearly implies ' {α : C ⊂{v ∗ ≤α} }
{v ∗ ≤α} ⊂ {v ∗ ≤σC (v ∗ )} .
(b) For any α ∈ R such that C ⊂ {v ∗ ≤α} , we have 4 5 α ≥ sup 〈v ∗ , v〉 : v ∈ C = σC (v ∗ ).
i
i i
i
i
i
i
336
book 2014/8/6 page 336 i
Chapter 9. Convex duality and optimization
Hence, {v ∗ ≤σC (v ∗ )} ⊂ {v ∗ ≤α} and {v ∗ ≤σC (v ∗ )} ⊂
'
{v ∗ ≤α} ,
{α : C ⊂{v ∗ ≤α} }
which completes the proof. As a direct consequence of formula (9.3) and Proposition 9.1.1 we obtain the following important result. Theorem 9.1.2. Let C be a nonempty closed convex subset of a normed linear space V . Then ' {v ∗ ≤σC (v ∗ )} , C= v ∗ ∈V ∗ , v ∗ =0
where σC is the support function of C . Equivalently, 4 5 C = v ∈ V : 〈v ∗ , v〉 ≤ σC (v ∗ ) ∀ v ∗ ∈ V ∗ . Remark 9.1.1. The dual representation of a closed convex set C has been obtained with the help of the support function σC : V ∗ → R ∪ {+∞}. As we will see in this chapter, the mapping C → σC can be viewed as a particular case of the general duality correspondence, namely, the Legendre–Fenchel transform f → f ∗ . More precisely, by taking f = δC the indicator of C , we have f ∗ = σC . We examine below the properties of σC which are direct consequences of its definition. Proposition 9.1.2. The support function σC : V ∗ → R∪{+∞} of a closed convex nonempty subset C is a function which is closed, convex, proper, and positively homogeneous of degree 1. PROOF. For any v ∈ C , the mapping v ∗ ∈ V ∗ → 〈v ∗ , v〉 is a linear continuous form on V ∗ , hence convex and continuous. The function σC as a supremum of convex functions is still convex and, as a supremum of continuous functions, it is closed (lower semicontinuous); see Proposition 3.2.3. Moreover, σC (0) = 0 and σC is proper. Finally, for any v ∗ ∈ V ∗ and t > 0 we have 4 5 σC (t v ∗ ) = sup t 〈v ∗ , v〉 : v ∈ C 4 5 = t sup 〈v ∗ , v〉 : v ∈ C = t σC (v ∗ ), which expresses that σc is positively homogeneous of degree 1. To have a sharper view of the dual generation of closed convex sets, it is interesting to introduce the notion of supporting hyperplane. This notion is closely related to the 4 5 question, In the definition of σC (v ∗ ) = sup 〈v ∗ , v〉 : v ∈ C , is the supremum attained? Definition 9.1.3. An element v ∗ ∈ V ∗ , v ∗ = 0, is said to support C at a point u ∈ C if σC (v ∗ ) = 〈v ∗ , u〉 5 4 = sup 〈v ∗ , v〉 : v ∈ C . An equivalent terminology consists in saying that v ∗ is a supporting functional of C at u ∈ C .
i
i i
i
i
i
i
9.2. Passing from sets to functions: Elements of epigraphical calculus
book 2014/8/6 page 337 i
337
The geometric terminology above comes from the fact that when v ∗ supports C at u ∈ C we have that the closed half-space {v ∗ ≤σC (v ∗ )} contains C and that the corresponding hyperplane 5 4 H = v ∈ V : 〈v ∗ , v〉 = σC (v ∗ ) intersects C at u. (Note that the intersection of H with C may contain some other points.) An interesting question is to know whether it is possible to obtain a dual representation of closed convex sets by supporting functionals. As we will see, this is a quite involved question which is intimately connected with the properties of the subdifferential of a closed convex function and the Bishop–Phelps theorem (density properties of the domain of the subdifferential). Let us end this section with some elementary examples illustrating the concept of support function. Example 9.1.1. (1) Take C = B(0, 1) the unit ball of V . Then, for any v ∗ ∈ V ∗ 4 5 σC (v ∗ ) = sup 〈v ∗ , v〉 : vV ≤ 1 = v ∗ V ∗ , i.e., σC is the dual norm · V ∗ of · V . (2) Take C as a cone, i.e., λv ∈ C for all v ∈ C and λ ≥ 0 (note that necessarily 0 ∈ C ). Let us assume moreover that C is closed and convex. Then 4 5 σC (v ∗ ) = sup 〈v ∗ , v〉 : v ∈ C < ∗ = 0 whenever 〈v , v〉 ≤ 0 ∀v ∈ C , +∞ otherwise. Let us notice that the set 4 C ∗ = v ∗ ∈ V ∗ : 〈v ∗ , v〉 ≤ 0 for all v ∈ C } is a closed convex cone; it is called the polar cone of C . We have that σC is equal to the indicator function of this polar cone σ C = δC ∗ .
9.2 Passing from sets to functions: Elements of epigraphical calculus Our next goal is to apply the dual representation Theorem 9.1.2 to the set C = epi f , where f : V → R ∪ {+∞} is a closed convex proper function. So doing, we will obtain by a pure geometrical approach the Legendre–Fenchel duality theory for closed convex functions. To that end, it will be useful to develop some tools of epigraphical calculus, which consists of viewing functions as sets, via their epigraphs. As stressed in Section 3.2.2, the epigraph of an extended real-valued function is a geometrical object that carries most of
i
i i
i
i
i
i
338
book 2014/8/6 page 338 i
Chapter 9. Convex duality and optimization
the properties of the corresponding variational problems. In our context, given f : V → R ∪ {+∞}, recall that f is closed (lsc) ⇐⇒ epi f is closed, f is convex ⇐⇒ epi f is convex, and that the basic operation in convex analysis and duality which consists in taking the supremum of a family of convex (affine) functions has an immediate epigraphical interpretation
' epi sup fk = epi fk . k∈I
k∈I
Beyond the classical operations on extended real-valued functions (sum and multiplication by a positive scalar) let us introduce the epi-addition, also called inf-convolution. Definition 9.2.1. Let V be a linear space and f , g : V → R ∪ {+∞} two extended realvalued functions. The epi-sum of f and g (also called inf-convolution) is the function f #e g : V → R defined by 4 5 ( f #e g )(v) = inf f (v1 ) + g (v2 ) : v1 + v2 = v, v1 , v2 ∈ V 4 5 = inf f (v − w) + g (w) : w ∈ V 4 5 = inf f (w) + g (v − w) : w ∈ V . We often briefly write f #g . Note that f #e g may take the value −∞ (for example, take g = 0 and f not minorized). The term epi-sum comes from the following geometrical interpretation of this operation. Proposition 9.2.1. For any f , g : V → R ∪ {+∞} epiS ( f #e g ) = epiS f + epiS g , where epiS f stands for the strict epigraph of f , i.e., 4 5 epiS f = (v, λ) ∈ V × R : λ > f (v) , and the sum epiS f + epiS g is the vectorial sum (also called Minkowski sum) of the two sets epiS f and epiS g . PROOF. We have
λ > ( f #e g )(v)
iff there exists v1 , v2 ∈ V , with v = v1 + v2 such that λ > f (v1 ) + g (v2 ). This is clearly equivalent to the existence of v1 , v2 ∈ V and λ1 , λ2 ∈ R such that λ1 > f (v1 ), λ2 > g (v2 ) and v = v1 + v2 , λ = λ1 + λ2 . Equivalently, (λ, v) = (λ1 , v1 ) + (λ2 , v2 ) with (λ1 , v1 ) ∈ epiS f and (λ2 , v2 ) ∈ epiS g .
i
i i
i
i
i
i
9.2. Passing from sets to functions: Elements of epigraphical calculus
book 2014/8/6 page 339 i
339
Remark 9.2.1. The term inf-convolution refers to the (formal) similarities of this operation with the usual convolution of functions on RN ( f ∗ g )(x) = f (x − y) g (y) d y,
RN
where one has to replace RN by inf and product by addition. As we will see, there are many striking similarities between these two operations. Proposition 9.2.2. Let f , g : V → R ∪ {+∞} be two convex functions. Then, their epi-sum f #e g is still a convex function. PROOF. This property is a clear consequence of the geometrical interpretation of the episum via epigraphs (Proposition 9.2.1) epiS ( f #e g ) = epiS f + epiS g and of the fact that the Minkowski (vectorial) sum of two convex sets is still convex: indeed, given C and D two convex subsets of a vector space E, consider two points of C +D, let v1 = c1 + d1 , v2 = c2 + d2 with ci ∈ C and di ∈ D (i = 1, 2). For any 0 ≤ λ ≤ 1, one has λv1 + (1 − λ)v2 = λ(c1 + d1 ) + (1 − λ)(c2 + d2 ) = λc1 + (1 − λ)c2 + λd1 + (1 − λ)d2 , which still belongs to C +D. One can then easily verify that the convexity of an extended real-valued function is equivalent to the convexity of its strict epigraph. The fact that the epi-sum preserves the convexity, as we have just observed, follows clearly from its geometrical interpretation. On the other hand, it is somewhat surprising from the analytical point of view, since ( f #e g )(v) is expressed as an infimum of convex functions, namely, v → f (v − w) + g (w), and the class of convex functions is not stable by infimal operations. This calls for some explanation. Indeed, the convexity of f #e g when f and g are convex functions is a consequence of the observation “the function of two variables (v, w) → h(v, w) := f (v − w) + g (w) is convex with respect to the pair (v, w)” and of the following proposition. Proposition 9.2.3. Let V and W be two linear spaces and h : V × W → R ∪ {+∞} be a convex function. Then, the function p : V → R defined by p(v) = inf h(v, w) w∈W
is still convex. PROOF. Let us prove that for any u, v ∈ V and λ ∈ ]0, 1[ p(λu + (1 − λ)v) ≤ λ p(u) + (1 − λ) p(v). Without any restriction, we can assume p(u) < +∞ and p(v) < +∞; otherwise the inequality is trivially satisfied. Take arbitrary s > p(u) and t > p(v). By definition of p, one can find elements w u,s and wv,t in W such that s > h(u, w u,s ) and t > h(v, wv,t ).
i
i i
i
i
i
i
340
book 2014/8/6 page 340 i
Chapter 9. Convex duality and optimization
By convexity of h(·, ·) with respect to the couple of variables (v, w) h(λu + (1 − λ)v, λw u,s + (1 − λ)wv,t ) ≤ λh(u, w u,s ) + (1 − λ)h(v, wv,t ) ≤ λs + (1 − λ)t . By definition of p p(λu + (1 − λ)v) ≤ h(λu + (1 − λ)v, λw u,s + (1 − λ)wv,t ). We combine the two above inequalities to obtain p(λu + (1 − λ)v) ≤ λs + (1 − λ)t . This being true for any s > p(u) and t > p(v), by letting s tend to p(u) and t tend to p(v), we obtain the required convexity inequality. We will see that the epi-sum (inf-convolution) is the dual operation of the usual sum. The epi-sum plays also an important role in the regularization of lower semicontinuous extended real-valued functions. The following theorem has a long history (it goes back to Hausdorff, Pasch, and Baire and has been revisited by many authors). Theorem 9.2.1 (Lipschitz regularization via epi-sum). Let (V , · ) be a normed space and let f : V → R∪{+∞} be a proper and lower semicontinuous function. Suppose moreover that f is conically minorized, i.e., there exists some k0 ≥ 0 such that for all v ∈ V f (v) ≥ −k0 (1 + v). Let us define for all k ∈ R+ the function fk := f #e k · , i.e., 4 5 fk (v) = inf f (w) + kv − w . w∈V
Then we have (a) for all k ≥ k0 , fk is Lipschitz continuous on V with constant k, that is, for all u, v ∈ V | fk (u) − fk (v)| ≤ ku − v; (b) for all v ∈ V one has
f (v) = lim fk (v). k→+∞
More precisely, the sequence ( fk )k monotonically increases to f as k ↑ ∞. (c) When f is convex, so is fk for all k ≥ k0 . PROOF. (a) For all k ≥ k0 , we have 5 4 fk (v) ≥ inf − k0 − k0 w + kv − w w∈V 5 4 ≥ inf − k0 − k0 w + kw − kv w∈V
≥ −k0 − kv > −∞. On the other hand, taking some w0 ∈ dom f = ( f is proper) fk (v) ≤ f (w0 ) + kv − w0 < +∞.
i
i i
i
i
i
i
9.2. Passing from sets to functions: Elements of epigraphical calculus
book 2014/8/6 page 341 i
341
Hence, for all k ≥ k0 and all v ∈ V , fk (v) is a real number. Take now u, v ∈ V . The triangle inequality yields for any w ∈ V v − w ≤ u − w + v − u. Hence, for all w ∈ V , for all k ∈ R+ f (w) + kv − w ≤ f (w) + ku − w + kv − u. Taking the infimum with respect to w ∈ V yields fk (v) ≤ fk (u) + kv − u. Exchanging the role of v and u and noticing that for k ≥ k0 both fk (v) and fk (u) are finitely valued yields | fk (v) − fk (u)| ≤ kv − u. (b) By taking w = v in the definition of fk (v), one has fk (v) ≤ f (v). Clearly, the sequence ( fk )k is increasing with respect to k. Hence lim fk (v) ≤ f (v).
k→+∞
Let us prove the reverse inequality f (v) ≤ lim fk (v). k→+∞
If limk→+∞ fk (v) = +∞, there is nothing to prove. So let us assume that limk→+∞ fk (v) < +∞. For each k ≥ k0 , let us introduce some wk ∈ V such that fk (v) ≥ f (wk ) + kv − wk − k for some k > 0 with k → 0 as k → +∞. Using the growth condition on f we obtain +∞ > sup fk (v) ≥ −k0 (1 + wk ) + kv − wk − k , k>0
which clearly implies wk → v in (V , · ) as k → +∞. Let us now pass to the limit on the inequality fk (v) ≥ f (wk ) − k and use the lower semicontinuity of f to obtain lim fk (v) ≥ lim inf f (wk )
k→+∞
k
≥ f (v). (c) Noticing that f and k· are both convex functions, the convexity of fk = f #e k· is a straightforward consequence of Proposition 9.2.2. Let us end this section with the following striking property of closed convex functions. We know that a linear operator from a normed space into another normed space
i
i i
i
i
i
i
342
book 2014/8/6 page 342 i
Chapter 9. Convex duality and optimization
is continuous iff it is bounded on bounded sets. This is an important property since it reduces the study of continuity of a linear operator A : E → F to the establishment of majorizations of the following type: there exists some M ∈ R+ such that vE ≤ 1 =⇒ AvF ≤ M . We will prove in Theorem 9.3.1 that any closed convex function is the supremum of all its continuous affine minorants. Thus, it is not surprising that also for closed convex functions local boundedness implies continuity. Let us make this precise in the following statement. Theorem 9.2.2. Let (V , · ) be a normed space and let f : V → R ∪ {+∞} be a convex function which is majorized on a neighborhood of a point v0 ∈ dom f , i.e., ∃r > 0 such that
sup v−v0 0 and observe that for any v ∈ B(0, r ), the following convex inequalities hold: writing v = (1 − )0 + 1 v we have 1 f (v) ≤ (1 − ) f (0) + f v ≤ M ; −1 1 v + 1+ v we have writing 0 = 1+ −1 1 M 1 f (v) + f v ≤ f (v) + , 0 = f (0) ≤ 1+ 1+ 1+ 1+ which yields
f (v) ≥ −M .
Combining the two above inequalities, we obtain | f (v)| ≤ M for v ∈ B(0, r ), which yields the continuity of f at the origin. (b) First observe that in the above argument, when taking = 1 we have the existence of some positive constant, which we still denote by M , such that | f (v + v0 ) − f (v0 )| ≤ M
∀ v ∈ B(0, r ).
Take arbitrary v, w ∈ B(v0 , r ) with v = w. Set = r − r > 0 and u=w+
w − v
(w − v) ,
λ=
w − v + w − v
.
i
i i
i
i
i
i
9.3. Legendre–Fenchel transform
book 2014/8/6 page 343 i
343
We have u ∈ B(v0 , r ) and w − vu = ( + w − v)w − v. Equivalently, w = λu +
+ w − v
v,
w = λu + (1 − λ)v with λ ∈ ]0, 1[. By convexity of f f (w) ≤ λ f (u) + (1 − λ) f (v) = f (v) + λ( f (u) − f (v)), which yields (observe that | f (u) − f (v)| ≤ | f (u) − f (v0 )| + | f (v) − f (v0 )| ≤ 2M ) f (w) − f (v) ≤
w − v + w − v
2M ≤
2M
w − v.
Exchanging the role of w and v, we obtain | f (w) − f (v)| ≤
2M
w − v,
which completes the proof.
9.3 Legendre–Fenchel transform Given f : V → R ∪ {+∞} a closed convex proper function, we are going to introduce f ∗ , the Legendre–Fenchel transform of f , by considering the set C = epi f ⊂ V × R and its dual representation, as given by Theorem 9.1.2. To that end, we need to exploit the particular structure of the set C = epi f in V × R and describe the family of the closed half-spaces in V × R containing it. Let us start with the following elementary result, which describes the closed halfspaces in V × R. Lemma 9.3.1. Let l ∈ (V × R)∗ be a linear continuous form on V × R, l = 0. Then there exist u ∗ ∈ V ∗ and γ ∈ R, (u ∗ , γ ) = 0 such that l (v, t ) = 〈u ∗ , v〉 + γ t
∀ (v, t ) ∈ V × R.
A closed half-space in V × R is of the following form: 4 5 = {(u ∗ ,γ )≤α} := (v, t ) ∈ V × R : 〈u ∗ , v〉 + γ t ≤ α . Depending on the value of γ (γ = 0 or γ 4 = 0), we have two distinct situations: 5 4 5 (a) γ = 0. Then = {(u ∗ ,γ )≤α} = (v, t ) ∈ V × R : 〈u ∗ , v〉 ≤ α = u ∗ ≤ α × R is 4 5 invariant by all the translations parallel to 0 × R. In that case, we say that the half-space is “vertical.” (b) γ = 0. By normalization (divide by −γ ) one can rewrite the closed half-space in the form 5 4 = (v, t ) ∈ V × R : t ≥ 〈u ∗ , v〉 − α or
5 4 = (v, t ) ∈ V × R : t ≤ 〈u ∗ , v〉 − α .
i
i i
i
i
i
i
344
book 2014/8/6 page 344 i
Chapter 9. Convex duality and optimization
It is the epigraph or the hypograph of the affine continuous function v → 〈u ∗ , v〉 − α. Because of its particular structure, the epigraph of a proper function cannot be con5 4 tained in a half-space of the form (v, t ) ∈ V × R : t ≤ 〈u ∗ , v〉 − α . We can summarize the previous results in the following lemma. Lemma 9.3.2. Let f : V −→ R ∪ {+∞} be a proper function. Then, a closed half-space in V × R containing the set C = epi f is either vertical or equal to the epigraph of an affine continuous function, i.e., there exist some u ∗ ∈ V and α ∈ R such that 4 5 = (v, t ) ∈ V × R : t ≥ 〈u ∗ , v〉 − α . Let us keep in mind that we are looking for the simplest dual representation of convex functions f . In this perspective, it is a striking and important property that one can get rid of the vertical half-spaces in their dual representations. Indeed, this is not a surprising result since one can approach them, and get arbitrarily “close” to closed vertical halfspaces, by epigraphs of continuous affine functions. Let us make this precise in the following statement. Theorem 9.3.1. Let (V , · ) be a normed space and let f : V → R ∪ {+∞} be a closed convex proper function. Then f is equal to the supremum of all the continuous affine functions which minorize f . PROOF. (a) Let us first notice that among all the closed half-spaces containing C = epi f , at least one of them is the epigraph of an affine continuous function. Otherwise, there would be only vertical half-spaces in this family, and C , as an intersection of such sets, would be vertically4 invariant. This is impossible, because f is proper. Indeed, for any 5 v0 ∈ dom f , C ∩ ( v0 × R) = {v0 } × [ f (v0 ), +∞[ with [ f (v0 ), +∞[ strictly included in R. Then notice that 5 4 epi f ⊂ epi v → 〈u ∗ , v〉 − α is equivalent to
f (v) ≥ 〈u ∗ , v〉 − α
∀v ∈ V ,
i.e., f admits at least an affine continuous minorant. ( (b) Let us recall the general property: f = sup fi ⇐⇒ epi f = i ∈I epi fi . Thus, to establish the assertion of the theorem, it suffices to show that each point (v0 , t0 ) ∈ epi f is outside the epigraph of an affine continuous function that is majorized by f . We know that C = epi f is equal to the intersection of all the closed half-spaces that contain it (Corollary 9.1.1) and that any such half-space either is vertical or is the epigraph of an affine continuous function (Lemma 9.3.2). To eliminate the vertical halfspaces in their dual representation, we use the Lipschitz regularization theorem, Theorem 9.2.1. Since f admits an affine continuous minorant, it is conically minorized and f = supk fk with fk convex and Lipschitz continuous. Since t0 < f (v0 ), for some k0 sufficiently large t0 < fk0 (v0 ) ≤ f (v0 ) and (v0 , t0 ) ∈ epi fk0 . Let us now use the dual representation of the closed convex set epi fk0 as the intersection of all the closed half-spaces that contain it. Since fk0 is everywhere defined, there is no vertical half-space containing epi fk0 . Thus, fk0 is the supremum of all its affine continuous minorants. As a consequence, one can find an affine continuous function
i
i i
i
i
i
i
9.3. Legendre–Fenchel transform
book 2014/8/6 page 345 i
345
v → l (v) = 〈u ∗ , v〉 − α with (v0 , t0 ) ∈ epi l
and
l ≤ f k0 .
Since fk0 ≤ f we have l ≤ f and (v0 , t0 ) ∈ epi l , that is, l satisfies all required properties. Just like for convex sets, we are going to look for the simplest dual description of closed convex functions, i.e., using the simplest continuous affine minorants. Theorem 9.3.1 tells us that 4 5 f (v) = sup 〈v ∗ , v〉 − α : 〈v ∗ , v〉 − α ≤ f (v) ∀v ∈ V . Let us now observe that for v ∗ ∈ V ∗ being fixed, making α vary provides parallel minorizing affine continuous functions. Clearly, the best α is obtained by taking 5 4 α = sup 〈v ∗ , v〉 − f (v) : v ∈ V , which, for v ∗ ∈ V ∗ being fixed, is a real number iff f admits a continuous affine minorant with slope v ∗ . This is precisely the quantity which is classically denoted by 4 5 f ∗ (v ∗ ) = sup 〈v ∗ , v〉 − f (v) : v ∈ V and which makes sense for an arbitrary v ∗ ∈ V ∗ , with possibly +∞ values. The above geometrical considerations allow us to reformulate Theorem 9.3.1 in the following form: 5 4 (9.4) ∀v ∈ V f (v) = sup 〈v ∗ , v〉 − f ∗ (v ∗ ) : v ∗ ∈ V ∗ . We are now ready to introduce classical notation, terminology, and basic facts concerning the Legendre–Fenchel transform which is defined below for arbitrary proper function f . Definition 9.3.1. Let V be a normed linear space and let f : V → R ∪ {+∞} be a proper function. The Legendre–Fenchel conjugate of f is the function f ∗ : V ∗ → R ∪ {+∞} defined by 4 5 f ∗ (v ∗ ) = sup 〈v ∗ , v〉 − f (v) : v ∈ V . Let us notice that since f is proper, by taking some v0 ∈ dom f f ∗ (v ∗ ) ≥ 〈v ∗ , v0 〉 − f (v0 ), i.e., f ∗ admits an affine continuous minorant and f ∗ : V ∗ → R ∪ {+∞}. Moreover, v ∗ ∈ dom f ∗ iff there exists α ∈ R such that for all v ∈ V one has 〈v ∗ , v〉 − f (v) ≤ α, i.e., f (v) ≥ 〈v ∗ , v〉 − α. Let us now return to the case when f is closed convex and proper. We know that f admits at least one such affine continuous minorant. This implies that f ∗ is proper. Since f ∗ is a
i
i i
i
i
i
i
346
book 2014/8/6 page 346 i
Chapter 9. Convex duality and optimization
supremum of continuous affine functions it is a closed convex proper function from V ∗ into R ∪ {+∞}. Let us examine the two formulas 4 5 (definition of f ∗ ), f ∗ (v ∗ ) = sup 〈v ∗ , v〉 − f (v) : v ∈ V 4 5 f (v) = sup 〈v ∗ , v〉 − f ∗ (v ∗ ) : v ∗ ∈ V ∗ (Theorem 9.3.1). They are essentially the same. Let us make this precise. Since f ∗ is closed convex and proper we can compute its conjugate f ∗∗ : V ∗∗ → R ∪ {+∞}. By using the canonical embedding of V into V ∗∗ , we can restrict f ∗∗ to V to obtain 4 5 ∀v ∈ V f ∗∗ (v) = sup 〈v ∗ , v〉 − f ∗ (v ∗ ) : v ∗ ∈ V ∗ . (Recall that i : V → V ∗∗ is defined by i(v)(v ∗ ) = v ∗ (v).) The dual representation theorem, Theorem 9.3.1, for closed convex functions then can be reformulated in the following form: f = f ∗∗ . This is the Fenchel–Moreau–Rockafellar theorem that we now state. Theorem 9.3.2. Let V be a normed space and let f : V → R ∪ {+∞} be a closed convex proper function. Then f = f ∗∗ , i.e., f is equal to its biconjugate. Equivalently, 4 5 ∀v ∈ V f (v) = sup 〈v ∗ , v〉 − f ∗ (v ∗ ) : v ∗ ∈ V ∗ . It is worth noticing that the dual representation of closed convex proper functions has been given, in the above theorem, a quite simple formulation. On the counterpart, it is a precise analytic formulation which may hide the geometrical features of this duality theory. It is good to keep in mind both aspects. At this point, it is interesting to observe that the duality for functions has been derived from duality for sets (via the representation of f : V → R ∪ {+∞} by its epigraph C = epi f ). Conversely, the duality for sets can be obtained as a particular case of the duality for functions. Let us associate to a set C its indicator function δC and first observe that whenever C is a nonempty closed convex set, then δC is a closed convex proper function. Then notice that 4 5 (δC )∗ (v ∗ ) = sup 〈v ∗ , v〉 − δC (v) : v ∈ V 4 5 = sup 〈v ∗ , v〉 : v ∈ C = σC (v ∗ ), i.e., (δC )∗ is the support function of C . The Fenchel–Moreau–Rockafellar theorem, Theorem 9.3.2, says that (δC )∗∗ = δC , which is equivalent to 4 5 δC (v) = sup 〈v ∗ , v〉 − σC (v ∗ ) : v ∗ ∈ V ∗ . Noticing that v ∈ C iff δC (v) = 0 one gets 4 5 C = v ∈ V : 〈v ∗ , v〉 ≤ σC (v ∗ ) ∀ v ∗ ∈ V ∗ . Let us summarize the previous results.
i
i i
i
i
i
i
9.3. Legendre–Fenchel transform
book 2014/8/6 page 347 i
347
Proposition 9.3.1. Let C be a nonempty closed convex subset of a normed linear space V ; then (δC )∗ = σC and (σC )∗ = δC , that is,
4 5 C = v ∈ V : 〈v ∗ , v〉 ≤ σC (v ∗ ) ∀v ∗ ∈ V ∗ .
It is worth noticing that the biconjugate operation f → f ∗∗ enjoys nice properties for convex functions which are not necessarily closed. We recall (see Section 3.2.4) that given (X , τ) a general topological space and f : X → R ∪ {+∞}, c lτ f is the largest τ-lsc function that minorizes f . We have epi(c lτ f ) = c l (epi f ); c lτ f is called the lower semicontinuous regularization of f . Moreover (see Proposition 3.2.5(d)), f is τ-lsc at x iff f (x) = (c lτ f )(x). For convex functions f : V :→ R ∪ {+∞}, with V a normed space, we have the following elegant characterization of c l f (for the topology of the norm of the space V ) in terms of the biconjugate f ∗∗ . Proposition 9.3.2. Let V be a normed linear space and let f : V → R ∪ {+∞} be a convex proper function. Let us assume that f admits a continuous affine minorant. Then the following equality holds: f ∗∗ = c l f . As a consequence, f is lower semicontinuous at u ∈ V ⇐⇒ f (u) = f ∗∗ (u). PROOF. By definition, for any v ∈ V 4 5 f ∗∗ (v) = sup 〈v ∗ , v〉 − f ∗ (v ∗ ) : v ∗ ∈ V ∗ , which implies that f ∗∗ is the upper envelope of the continuous affine minorants of f . It is a closed (convex) proper function, hence f ∗∗ ≤ c l f ≤ f . Let us now observe that c l f is still convex, because the epigraph of c l f is the closure of epi f which is a convex set. Hence, c l f is a closed convex proper function. Since f ∗∗ and c l f are both closed convex proper functions, we apply Theorem 9.3.2 to obtain ( f ∗∗ )∗∗ = f ∗∗ , (c l f )∗∗ = c l f . The inequality f ∗∗ ≤ c l f ≤ f implies, by taking the biconjugate of each term, that ( f ∗∗ )∗∗ ≤ (c l f )∗∗ ≤ f ∗∗ , i.e.,
f ∗∗ ≤ c l f ≤ f ∗∗ ,
and the equality f ∗∗ = c l f follows.
i
i i
i
i
i
i
348
book 2014/8/6 page 348 i
Chapter 9. Convex duality and optimization
By Proposition 3.2.5(d), for general functions f : V → R ∪ {+∞} we have the equivalence f is lower semicontinuous at u ∈ V ⇐⇒ f (u) = c l f (u). As a consequence, when f is proper and convex, we have f is lower semicontinuous at u ∈ V ⇐⇒ f (u) = f ∗∗ (u), which completes the proof. Example 9.3.1. (1) Take C = B(0, 1). By definition of the dual norm, for any v ∗ ∈ V ∗ σB(0,1) (v ∗ ) = sup 〈v ∗ , v〉 = v ∗ V ∗ . v∈B(0,1)
Thus, σB(0,1) = · V ∗ . Conversely the convex duality theorem yields ( · ∗ )∗ = δB(0,1) .
(9.5)
(2) As suggested by the result above we have ( · )∗ = δB∗ (0,1) .
(9.6)
Let us prove (9.6). Indeed by contrast with (9.5), (9.6) is an elementary result which does not use the Hahn–Banach theorem. Set f (v) = vV . Thus, for any v ∗ ∈ V ∗ 4 5 f ∗ (v ∗ ) = sup 〈v ∗ , v〉 − v : v ∈ V . If v ∗ ∗ ≤ 1, then 〈v ∗ , v〉 − v ≤ 0 for all v ∈ V and 〈v ∗ , v〉 − v = 0 for v = 0. Thus f ∗ (v ∗ ) = 0. If v ∗ ∗ > 1, by definition of · ∗ there exists some v0 ∈ V such that 〈v ∗ , v0 〉 > v0 . It follows that for all t > 0 〈v ∗ , t v0 〉 − t v0 = t (〈v ∗ , v0 〉 − v0 ). Hence lim t →+∞ 〈v ∗ , t v0 〉 − t v0 = +∞, which implies 5 4 f ∗ (v ∗ ) = sup 〈v ∗ , v〉 − f (v) = +∞. v∈V
By taking the conjugate in (9.6) and applying the duality Theorem 9.3.2 we obtain for any v ∈ V 4 5 v = sup 〈v ∗ , v〉 : v ∗ ∗ ≤ 1 ; this is the isometrical embedding theorem from V into its bidual V ∗∗ . We give in the following proposition an important example of a dual convex function which indeed is an extension of Example 9.3.1, case (2).
i
i i
i
i
i
i
9.3. Legendre–Fenchel transform
book 2014/8/6 page 349 i
349
Proposition 9.3.3. Let (V , · ) be a normed space with topological dual space (V ∗ , · ∗ ). Let ϕ : R → R ∪ {+∞} be a closed convex function which is even (i.e., ϕ(−t ) = ϕ(t )). Then the function f : V → R ∪ {+∞}, f (v) = ϕ(v), is a closed convex proper function and f ∗ (v ∗ ) = ϕ ∗ (v ∗ ∗ ). PROOF. The assumptions on ϕ imply that ϕ : R+ → R ∪ {+∞} is increasing. Thus f is still convex and clearly closed. Moreover, 5 4 f ∗ (v ∗ ) = sup 〈v ∗ , v〉 − ϕ(v) v∈V
= sup
sup
t ≥0 v∈V ,v=t
5 4 ∗ 〈v , v〉 − ϕ(v)
5 4 = sup t v ∗ ∗ − ϕ(t ) t ≥0
5 4 = sup t v ∗ ∗ − ϕ(t ) (because ϕ is even) t ∈R
= ϕ ∗ (v ∗ ∗ ), which completes the proof. As a straightforward consequence we obtain the following useful result. Corollary 9.3.1. Set f (v) = 1p v p with 1 < p < +∞. Then f ∗ (v ∗ ) =
1 v ∗ ∗p p
where
p is the Hölder conjugate exponent of p, i.e., 1/ p + 1/ p = 1. In particular, taking V = L p (Ω, , μ), 1 < p < +∞, we have V ∗ = L p (Ω, , μ) and the conjugate function of 1 f (v) = v(x) p d μ(x) p Ω
is equal to ∗
∗
f (v ) =
1 p
Ω
v ∗ (x) p d μ(x).
This makes the transition with the next important example, which is concerned with integral functionals and which will be studied in detail in Chapter 13. Theorem 9.3.3. Let V = L p (Ω, , μ), 1 < p < +∞, and j (x, v(x)) d μ(x) f (v) = Ω
a convex integral functional associated to a convex normal integrand j . Then
f ∗ : V ∗ = L p (Ω, , μ) → R ∪ {+∞} is given by ∗
∗
f (v ) =
Ω
j ∗ (x, v ∗ (x)) d μ(x),
where j ∗ (x, ·) is the convex conjugate of j (x, ·).
i
i i
i
i
i
i
350
book 2014/8/6 page 350 i
Chapter 9. Convex duality and optimization
Remark 9.3.1. When V = H is a Hilbert space the Legendre–Fenchel transform f → f ∗ is an involution from Γ0 (H ) into itself, where Γ0 (H ) is the set of closed convex proper functions on H , ∗ Γ0 (H ) −→ Γ0 (H ), f → f ∗ , i.e., f ∗∗ = f . This transform has some analogy with the Fourier–Plancherel transform, 0 : L2 (RN ) → L2 (RN ), f → 0 ( f ), where 0 0 f = f , which is indeed an isometry. Let us notice that ( · 2 /2)∗ = · 2 /2 2 i.e., · 2 /2 is invariant for the Legendre–Fenchel transform while f (x) = 12 e −x is invariant for the Fourier–Plancherel transform. A basic property of the Fourier–Plancherel transform is 0 ( f ∗ g ) = 0 ( f )0 ( g ), where f ∗ g is the convolution of functions. This property can be seen as the (formal) analogue of the property ( f # e g )∗ = f ∗ + g ∗ , which is studied in Section 9.4. Let us complete this section by studying the natural setting in which the Legendre– Fenchel transform acts as an operator. We will pay particular attention to the description of its range. Our final result (Theorem 9.3.5) shows that the Legendre–Fenchel transform is a one-to-one mapping from Γ0 (V ) onto Γ0 (V ∗ ) (cf. Definition 9.3.2). Given a general normed space (V , .), let us recall that the Legendre–Fenchel transform is the mapping which associates to a closed convex proper function f : V −→ R ∪ {+∞} its conjugate f ∗ : V ∗ −→ R ∪ {+∞} which is defined by 4 5 f ∗ (v ∗ ) = sup 〈v ∗ , v〉 − f (v) : v ∈ V . ∀v ∗ ∈ V ∗ Let us start with some simple observations. (a) The Legendre–Fenchel transform is one-to-one. This results from the implication f ∗ = g ∗ =⇒ f ∗∗ = g ∗∗ and f ∗∗ = f , g ∗∗ = g , which hold true for arbitrary closed convex proper functions f and g (Theorem 9.3.2). (b) The description of the range of the Legendre–Fenchel transform requires some attention. Let us return to the (above) definition of f ∗ . In this supremum, one just needs to consider v ∈ dom f . For such v ∈ dom f , the mapping v ∗ → 〈v ∗ , v〉 − f (v) is continuous on V ∗ for the topology σ(V ∗ ,V ), which is the weak star topology of the dual. This follows directly from the definition of this topology. Hence, f ∗ , as a supremum of such affine, σ(V ∗ ,V ) continuous functions, is a convex, proper function which is σ(V ∗ ,V ) lower semicontinuous. Indeed, when V is not reflexive, for a convex function g : V ∗ −→ R ∪ {+∞}, to be σ(V ∗ ,V ) lower semicontinuous is a strictly stronger property than just to be lower semicontinuous for the topology of the norm. One can exhibit a closed convex function g : V ∗ −→ R which is not σ(V ∗ ,V ) lower semicontinuous. Take any ξ ∈ V ∗∗ \ I (V ), where I is the canonical embedding of V into its bidual V ∗∗ (recall that 〈I (v), v ∗ 〉(V ∗∗ ,V ∗ ) = 〈v ∗ , v〉(V ∗ ,V ) ). Then define g (v ∗ ) := 〈ξ , v ∗ 〉(V ∗∗ ,V ∗ ) . Clearly g is continuous on V ∗ , but it
i
i i
i
i
i
i
9.3. Legendre–Fenchel transform
book 2014/8/6 page 351 i
351
is not σ(V ∗ ,V ) lower semicontinuous. Otherwise, by linearity, it would be continuous for the topology σ(V ∗ ,V ), which would imply ξ ∈ I (V ). It turns out that this σ(V ∗ ,V ) lower semicontinuity property allows us to characterize the range of the Legendre–Fenchel transform. Let us make this precise in the following statement. Theorem 9.3.4. Let (V , .) be a normed space. (a) For any f : V −→ R ∪ {+∞} which is closed, convex, and proper, its Legendre– Fenchel conjugate f ∗ : V ∗ −→ R ∪ {+∞} is a convex proper function which is σ(V ∗ ,V ) lower semicontinuous. (b) Conversely, let g : V ∗ −→ R ∪ {+∞} be a convex, proper, and σ(V ∗ ,V ) lower semicontinuous function. Then, g = g ∗∗ and g belongs to the range of the Legendre–Fenchel transform. More precisely, g = ( g ∗ )∗ is equal to the Legendre–Fenchel transform of the closed convex proper function g ∗ : V −→ R ∪ {+∞} which is defined by 4 5 ∀v ∈ V g ∗ (v) = sup 〈v ∗ , v〉 − g (v ∗ ) : v ∗ ∈ V ∗ . PROOF. Part (a) has already been proved. Proof of part (b) requires some further topological tools. When equipped with the topology σ(V ∗ ,V ), the space V ∗ is a locally convex topological vector space, whose dual can be identified with V . The Hahn–Banach theorem still holds in locally convex topological vector spaces, and from that point, the proof is essentially the same as in Theorem 9.3.2. To give a unified formulation of Theorem 9.3.2 and Theorem 9.3.4 where V and V ∗ , f and f ∗ play symmetrical roles, it is convenient to introduce the following notions and notation. Definition 9.3.2. Let (V , .) be a normed space with topological dual V ∗ . We set 4 Γ0,V ∗ (V ) = f : V −→ R ∪ {+∞}, f is a pointwise supremum of a nonvoid family of 5 affine functions with slopes in V ∗ , f ≡ +∞ ; 4 Γ0,V (V ∗ ) = g : V ∗ −→ R ∪ {+∞}, g is a pointwise supremum of a nonvoid family of 5 affine functions with slopes in V , g ≡ +∞ . These definitions make explicit references to the pairing between the two spaces V and V ∗ , that is, (v, v ∗ ) ∈ V × V ∗ → 〈v ∗ , v〉(V ∗ ,V ) = v ∗ (v). Without ambiguity, one often omits the subscript referring to the coupled space and writes briefly Γ0 (V ) and Γ0 (V ∗ ). To be more precise, one has f ∈ Γ0 (V ) ⇐⇒ f = sup fi i ∈I
with
fi (v) = 〈vi∗ , v〉 − αi
for some index set I ,
vi∗
∈ V ∗ (slope), and αi ∈ R;
g ∈ Γ0 (V ∗ ) ⇐⇒ g = sup g j j ∈J
with g j (v ∗ ) = 〈v ∗ , v j 〉 − β j for some index set J , v j ∈ V (slope), and β j ∈ R.
i
i i
i
i
i
i
352
book 2014/8/6 page 352 i
Chapter 9. Convex duality and optimization
We can now reformulate Theorem 9.3.1 and its corresponding version when considering the locally convex topological vector space (V ∗ , σ(V ∗ ,V )), together with Theorems 9.3.2 and 9.3.4 in the following final statement. Theorem 9.3.5. Let (V , .) be a normed space with topological dual V ∗ . Then, (a) one has 4 5 Γ0 (V ) = f : V −→ R ∪ {+∞}, f closed, convex, proper 4 5 = f : V −→ R ∪ {+∞}, f σ(V ,V ∗ ) closed, convex, proper , while 4 5 Γ0 (V ∗ ) = g : V ∗ −→ R ∪ {+∞}, g σ(V ∗ ,V ) closed, convex, proper . (b) The Legendre–Fenchel transform is a one-to-one mapping from Γ0 (V ) onto Γ0 (V ∗ ): ∗
Γ0 (V ) −→ Γ0 (V ∗ ) f −→ f ∗ . For any f ∈ Γ0 (V ) one has f = f ∗∗ and for any g ∈ Γ0 (V ∗ ) one has g = g ∗∗ . Remark 9.3.2. The preceding theory can be developed in the general setting of two vector spaces V and W in separate duality. Let us denote by 〈v, w〉(V ,W ) a given pairing between elements v ∈ V and w ∈ W . It is a bilinear form with separating properties, namely,
∀v ∈ V , v = 0, ∃w ∈ W with 〈v, w〉 = 0, ∀w ∈ W , w = 0, ∃v ∈ V with 〈v, w〉 = 0.
Then W is the dual of (V , σ(V ,W )) and conversely. The set Γ0 (V ) (respectively, Γ0 (W )) is defined by taking suprema of affine functions with slopes in W (respectively, V ), and the Legendre–Fenchel transform is a one-to-one mapping from Γ0 (V ) onto Γ0 (W ); see Moreau [296] for further details.
9.4 Legendre–Fenchel calculus As we have already stressed, most optimization problems can be written as 4 5 inf f (v) : v ∈ V , where f = f0 + δC is the sum of the objective function f0 and the indicator function of the constraint C . This explains the importance of getting a formula for the Legendre– Fenchel conjugate of a sum of functions. At this point, the epi-sum plays a central role, because of the following general property. Proposition 9.4.1. Let ϕ, ψ : V → R ∪ {+∞} be two proper functions. Then (ϕ#ψ)∗ = ϕ ∗ + ψ∗ .
i
i i
i
i
i
i
9.4. Legendre–Fenchel calculus
book 2014/8/6 page 353 i
353
PROOF. It is enough to take v ∗ ∈ V ∗ and compute 5 4 (ϕ#ψ)∗ (v ∗ ) = sup 〈v ∗ , v〉 − (ϕ#ψ)(v) v∈V M N = sup 〈v ∗ , v〉 − inf ϕ(v1 ) + ψ(v2 ) v∈V
v1 +v2 =v
M
= sup 〈v ∗ , v〉 + sup v∈V
=
sup
4
v∈V , v1 +v2 =v
v1 +v2 =v
− ϕ(v1 ) − ψ(v2 )
N
5 〈v ∗ , v1 〉 − ϕ(v1 ) + 〈v ∗ , v2 〉 − ψ(v2 )
= ϕ ∗ (v ∗ ) + ψ∗ (v ∗ ), which completes the proof. Corollary 9.4.1. Let f , g : V → R ∪ {+∞} be two closed convex proper functions. Then ( f + g )∗ = ( f ∗ #g ∗ )∗∗ . As a consequence, when the convex function f ∗ #g ∗ is a σ(V ∗ ,V ) closed proper function, we have ( f + g )∗ = f ∗ #g ∗ . PROOF. By Proposition 9.4.1 we have ( f ∗ #g ∗ )∗ = f ∗∗ + g ∗∗ . When f and g are assumed to be closed convex and proper, one gets ( f ∗ #g ∗ )∗ = f + g . Taking again the Legendre–Fenchel conjugate, we obtain ( f + g )∗ = ( f ∗ #g ∗ )∗∗ . The function f ∗ #g ∗ , as the epi-sum of two convex functions, is still convex (Proposition 9.2.2). When it is σ(V ∗ ,V ) closed and proper, Theorem 9.3.4 yields ( f + g )∗ = f ∗ #g ∗ , which completes the proof. We can now state the following theorem from Rockafellar [325] and Moreau [296], which, under a so-called qualification assumption on f and g , asserts that f ∗ #g ∗ is σ(V ∗ ,V ) closed and hence ( f + g )∗ = f ∗ #g ∗ . Theorem 9.4.1. Let V be a normed linear space and let f , g : V → R∪ {+∞} be two closed convex and proper functions which satisfy the following qualification assumption: there is a point u0 ∈ dom f ∩ dom g where f is continuous.
(Q)
Then f ∗ #g ∗ is a σ(V ∗ ,V ) closed convex proper function and the following equality holds: ( f + g )∗ = f ∗ #g ∗ . Moreover, for any v ∗ ∈ V ∗ , the infimum in the definition of f ∗ #g ∗ is achieved.
i
i i
i
i
i
i
354
book 2014/8/6 page 354 i
Chapter 9. Convex duality and optimization
PROOF. Corollary 9.4.1 tells us that the only point we need to verify is that f ∗ #g ∗ is σ(V ∗ ,V ) closed. Equivalently, we have to prove that for λ ∈ R, the sublevel set of f ∗ #g ∗ 4 5 C = v ∗ ∈ V ∗ : ( f ∗ #g ∗ )(v ∗ ) ≤ λ is σ(V ∗ ,V ) closed. Indeed, we are going to establish that for each ρ > 0, C ∩ ρBV ∗ is σ(V ∗ ,V ) closed, i.e., the traces of C on all closed balls of V ∗ are σ(V ∗ ,V ) closed. It will follow from the Banach–Dieudonné–Krein–Smulian theorem (see, e.g., [203, Theorem V 5.7]) that C is σ(V ∗ ,V ) closed. Let (vn∗ )n∈N be a bounded sequence of elements of C with vn∗ → v ∗ , σ(V ∗ ,V ). When V is separable, it is not restrictive to consider sequences. For general V the argument can be readily extended by considering generalized sequences. By definition of f ∗ #g ∗ , for each n ∈ N, there exists some wn ∈ V ∗ such that f ∗ (vn∗ − wn∗ ) + g ∗ (wn∗ ) ≤ λ +
1 n
.
(9.7)
The key point of the proof is to prove that the sequence (wn∗ )n∈N is bounded in V ∗ . To that end, we use as an essential fact the qualification assumption (Q): there exist some r > 0 and some M ∈ R such that sup f (u0 + r v) ≤ M .
(9.8)
vV ≤1
For any v ∈ B(0, 1) let us majorize 〈wn∗ , v〉. To that end, let us write r 〈wn∗ , v〉(V ∗ ,V ) = 〈wn∗ , r v〉
= 〈wn∗ , u0 〉 + 〈wn∗ , r v − u0 〉
= 〈wn∗ , u0 〉 + 〈vn∗ − wn∗ , u0 − r v〉 − 〈vn∗ , u0 − r v〉
≤ g (u0 ) + g ∗ (wn∗ ) + f (u0 − r v) + f ∗ (vn∗ − wn∗ ) − 〈vn∗ , u0 − r v〉.
We rewrite the above inequality in the form ! " r 〈wn∗ , v〉(V ∗ ,V ) ≤ f ∗ (vn∗ − wn∗ ) + g ∗ (wn∗ ) + f (u0 − r v) + g (u0 ) + u0 − r vvn∗ ∗ and use (9.7), (9.8) to obtain r 〈wn∗ , v〉(V ,V ∗ ) ≤ λ +
1 n
+ M + g (u0 ) + vn∗ ∗ (u0 + r ).
Using that the sequence (vn∗ ) is bounded, we immediately obtain from the above inequality (which is valid for any v ∈ B(0, 1)) that supn wn∗ ∗ < +∞. We now use the Banach–Alaoglu–Bourbaki theorem, Theorem 1.4.7, and Corollary 1.4.2: when V is separable (for a general V one can use a device of Attouch and Brezis [43]), the unit ball of V ∗ is σ(V ∗ ,V ) sequentially compact. As a consequence, one can find a subsequence (wnk )k∈N and some w ∗ ∈ V ∗ such that wnk w ∗ in σ(V ∗ ,V ). Let us now use the lower semicontinuity of f ∗ and g ∗ for the topology σ(V ∗ ,V ) and pass to the limit in (9.7) to obtain f ∗ (v ∗ − w ∗ ) + g ∗ (w ∗ ) ≤ λ. As a consequence, ( f ∗ #g ∗ )(v ∗ ) ≤ f ∗ (v ∗ − w ∗ ) + g ∗ (w ∗ ) ≤ λ and v ∗ ∈ C .
i
i i
i
i
i
i
9.5. Subdifferential calculus for convex functions
book 2014/8/6 page 355 i
355
The same argument with λ = f ∗ #g ∗ and vn∗ = v ∗ gives that the infimum in the definition of f ∗ #g ∗ is achieved. The qualification assumption (Q), because of its importance, has been intensively studied and many weakened versions of it have been established. Let us quote the following result (see Aubin [64]). Theorem 9.4.2. Let V be a Banach space and let f , g : V → R∪{+∞} be two closed convex proper functions such that dom f − dom g is a neighborhood of the origin. Then, the same conclusions as Theorem 9.4.1 hold and ( f + g )∗ = f ∗ #g ∗ . In the same spirit, the same result was established by Attouch and Brezis in [43] under the even weaker assumption 3 λ(dom f − dom g ) is a closed subspace of V . λ>0
Note that by contrast with the Rockafellar theorem, which holds in general normed spaces, the Aubin and Attouch–Brezis theorems require that the space V is a Banach space. Indeed, an essential ingredient in the proof of these theorems is the Banach–Steinhaus theorem. Otherwise, the proof is essentially the same as in Theorem 9.4.1.
9.5 Subdifferential calculus for convex functions To obtain the simplest possible dual representation of a closed convex set C of a normed linear space (V , · ), we introduce the notion of supporting hyperplane. When taking C = epi f , the epigraph of a closed convex proper function, the corresponding notion is the exact minorization: a continuous affine function l : V → R is an exact minorant of f at u if l ≤ f and l (u) = f (u). Equivalently, when setting l (v) = 〈u ∗ , v〉 + α, this becomes f (v) ≥ 〈u ∗ , v〉 + α ∀ v ∈ V , f (u) = 〈u ∗ , u〉 + α, i.e., α = f (u) − 〈u ∗ , u〉, l (v) = f (u) + 〈u ∗ , v − u〉, which is equivalent to ∀v ∈ V
f (v) ≥ f (u) + 〈u ∗ , v − u〉.
This leads to the following definition. Definition 9.5.1. Let (V , · ) be a normed space and f : V → R ∪ {+∞} be a closed convex proper function. We say that an element u ∗ ∈ V ∗ belongs to the subdifferential of f at u ∈ V if ∀v ∈ V f (v) ≥ f (u) + 〈u ∗ , v − u〉(V ∗ ,V ) . We then write u ∗ ∈ ∂ f (u).
i
i i
i
i
i
i
356
book 2014/8/6 page 356 i
Chapter 9. Convex duality and optimization
The terminology reflects the fact that when f is continuously differentiable and convex, the following inequality holds: ∀v ∈ V
f (v) ≥ f (u) + 〈∇ f (u), v − u〉.
Moreover, this inequality characterizes ∇ f (u). For this reason, when u ∗ ∈ ∂ f (u), we say either that u ∗ belongs to the subdifferential of f at u or that u ∗ is a subgradient of f at u. Note that if u ∗ ∈ ∂ f (u), then necessarily u ∈ dom f (take v0 ∈ dom f = ; we have f (v0 ) − 〈u ∗ , v0 − u〉 ≥ f (u) and f (u) < +∞). It is also important to notice that given u ∈ dom f , the set ∂ f (u) may be empty; see Phelps [320, Example 3.8]. Proposition 9.5.1. Let (V , · ) be a normed space and f : V → R ∪ {+∞} a closed convex proper function. Then the two following conditions are equivalent: (i) u ∗ ∈ ∂ f (u), (ii) f (u) + f ∗ (u ∗ ) − 〈u ∗ , u〉 = 0. PROOF. (a) Let us first give a geometrical proof: to say that u ∗ ∈ ∂ f (u) means that v → 〈u ∗ , v〉 + f (u) − 〈u ∗ , u〉 is an exact minorant of f at u. This implies that it is a maximal minorant with slope u ∗ , i.e., f (u) − 〈u ∗ , u〉 = − f ∗ (u ∗ ). (b) The analytic proof is also immediate: the inequality f (u) + f ∗ (u ∗ ) − 〈u ∗ , u〉 ≥ 0 is always true. Thus the equality f (u)+ f ∗ (u ∗ )−〈u ∗ , u〉 = 0 is equivalent to the inequality f (u) + f ∗ (u ∗ ) − 〈u ∗ , u〉 ≤ 0. By definition of f ∗ this is equivalent to saying 〈u ∗ , u〉 − f (u) ≥ 〈u ∗ , v〉 − f (v) ∀ v ∈ V , i.e., u ∗ ∈ ∂ f (u). Remark 9.5.1. As we have already stressed, for any v ∈ V and v ∗ ∈ V ∗ the inequality f (v) + f ∗ (v ∗ ) − 〈v ∗ , v〉 ≥ 0 is always true. Thus, when writing the characterization of u ∗ ∈ ∂ f (u), f (u) + f ∗ (u ∗ ) − 〈u ∗ , u〉 = 0, we express that for the pair (u, u ∗ ) ∈ V ×V ∗ , the function (v, v ∗ ) → f (v)+ f ∗ (v ∗ )−〈v ∗ , v〉 takes its minimal value. For this reason, relation (ii) in Proposition 9.5.1 is called the Fenchel extremality relation. A major interest of the Fenchel extremality characterization of subdifferentials is that f and f ∗ play a symmetric role in its formulation. This together with the Fenchel– Moreau–Rockafellar duality theorem, Theorem 9.3.2 (which expresses that f = f ∗∗ ), yields the following result. Theorem 9.5.1. Let (V , · ) be a normed space and let f : V → R ∪ {+∞} be a closed convex and proper function. Then, for u ∈ V and u ∗ ∈ V ∗ we have u ∗ ∈ ∂ f (u) ⇐⇒ u ∈ ∂ f ∗ (u ∗ ).
i
i i
i
i
i
i
9.5. Subdifferential calculus for convex functions
book 2014/8/6 page 357 i
357
PROOF. In fact we obtain u ∗ ∈ ∂ f (u) ⇐⇒ f (u) + f ∗ (u ∗ ) − 〈u ∗ , u〉 = 0 ⇐⇒ f ∗∗ (u) + f ∗ (u ∗ ) − 〈u ∗ , u〉 = 0 ⇐⇒ u ∈ ∂ f ∗ (u ∗ ), where we use the Fenchel extremality characterization of the subdifferential and the Fenchel–Moreau–Rockafellar duality theorem, Theorem 9.3.2 ( f = f ∗∗ ). Remark 9.5.2. When using the notation of set-valued analysis we can write (∂ f )−1 = ∂ f ∗ . This is indeed the formulation, in terms of subdifferentials, of the convex duality theory. For theoretical reasons it is important to know if a closed convex proper function can be uniquely determined (up to a constant) by its subdifferential. From a geometrical point of view, this can be formulated as follows: Is a closed convex proper function the upper envelope of its exact continuous affine minorants? Indeed the answer is yes when V is a Banach space. The proof of this result relies on the Ekeland’s -variational principle. (See Section 3.4. We state it without proof, referring, for instance, to Phelps [320, Corollary 3.1.9].) Theorem 9.5.2. Suppose (V , · ) is a Banach space and f : V → R ∪ {+∞} is a closed convex proper function. Then, for any u ∈ dom f 5 4 f (u) = sup f (v) + 〈v ∗ , u − v〉 : v ∈ V , v ∗ ∈ V ∗ with v ∗ ∈ ∂ f (v) 4 5 = sup 〈v ∗ , u〉 − f ∗ (v ∗ ) : ∃v ∈ V such that v ∗ ∈ ∂ f (v) . Note that this theorem, when specialized to convex sets, says that any closed convex nonempty set in a Banach space is the intersection of the closed half-spaces defined by its supporting hyperplanes (Phelps [320, Proposition 3.2.1]). When proving the above theorem via Ekeland’s variational principle one obtains in the process the following density result. Theorem 9.5.3. Suppose (V , ·) is a Banach space and f : V → R∪{+∞} is a closed convex proper function. Then, dom ∂ f is dense in dom f . More precisely, for any v ∈ dom f , there exists a sequence (vn )n∈N with vn ∈ dom ∂ f for all n ∈ N such that vn → v
and
f (vn ) → f (v).
PROOF. For the proof, see Azé [70, Theorem 3.2.4] and Aubin and Ekeland [67, Theorem 3]. For a proof in the case when (V , · ) is a reflexive Banach space, see Proposition 17.4.3. To develop a calculus for subdifferentials it is convenient to consider ∂ f as a multivalued operator, −→ ∂ f : V −→ V ∗ , and to identify ∂ f with its graph 5 4 ∂ f = (v, v ∗ ) ∈ V × V ∗ : v ∗ ∈ ∂ f (v) .
i
i i
i
i
i
i
358
book 2014/8/6 page 358 i
Chapter 9. Convex duality and optimization
−→ We recall the basic definitions for calculus of set-valued mappings: given A, B : V −→ V ∗ we have 4 5 dom A4= v ∈ V : ∃v ∗ ∈ V ∗ with (v, v5∗ ) ∈ A , A−1 = (v ∗ , v) ∈ V ∗ × V : (v, v ∗ ) ∈ A ,
dom(A + B) = dom A ∩ dom B, (A + B)(v) = Av + B v in the sense of vectorial sum.
Moreover, we say that A ⊂ B if graph A ⊂ graph B. As we have already stressed, the convex duality theory can be expressed as ∂ f ∗ = (∂ f )−1 . Theorem 9.5.4. Let (V , · ) be a normed space and let f , g : V → R ∪ {+∞} be two closed convex proper functions. (a) The following inclusion is always true: ∂ f + ∂ g ⊂ ∂ ( f + g ). (b) If moreover the qualification assumption (Q) holds, f is finite and continuous at a point of dom g ,
(Q)
then we have ∂ f + ∂ g = ∂ ( f + g ). PROOF. (a) Take u ∈ dom ∂ f ∩dom ∂ g , u ∗ ∈ ∂ f (u), and w ∗ ∈ ∂ g (u). By the definition of ∂ f and ∂ g , for any v ∈ V , f (v) ≥ f (u) + 〈u ∗ , v − u〉, g (v) ≥ g (u) + 〈w ∗ , v − u〉. By adding these two inequalities, we obtain for any v ∈ V ( f + g )(v) ≥ ( f + g )(u) + 〈u ∗ + w ∗ , v − u〉, i.e., u ∗ + w ∗ ∈ ∂ ( f + g )(u). (b) Take u ∗ ∈ ∂ ( f + g )(u). Equivalently, by using the Fenchel extremality relation, we obtain ( f + g )(u) + ( f + g )∗ (u ∗ ) − 〈u ∗ , u〉 = 0. By Theorem 9.4.1, we have ( f + g )∗ (u ∗ ) = ( f ∗ #g ∗ )(u ∗ ) and the infimum in the definition of ( f ∗ #g ∗ )(u ∗ ) is achieved. Consequently, there exists some w ∗ ∈ V ∗ such that ( f + g )(u) + f ∗ (u ∗ − w ∗ ) + g ∗ (w ∗ ) − 〈u ∗ , u〉 = 0.
i
i i
i
i
i
i
9.5. Subdifferential calculus for convex functions
book 2014/8/6 page 359 i
359
Equivalently, ( f (u) + f ∗ (u ∗ − w ∗ ) − 〈u ∗ − w ∗ , u〉) + ( g (u) + g ∗ (w ∗ ) − 〈w ∗ , u〉) = 0. By the Fenchel inequality, f (u) + f ∗ (u ∗ − w ∗ ) − 〈u ∗ − w ∗ , u〉 ≥ 0, g (u) + g ∗ (w ∗ ) − 〈w ∗ , u〉 ≥ 0. Since the sum of these two quantities is equal to zero, we obtain f (u) + f ∗ (u ∗ − w ∗ ) − 〈u ∗ − w ∗ , u〉 = 0, g (u) + g ∗ (w ∗ ) − 〈w ∗ , u〉 = 0. These are the Fenchel extremality relations (Proposition 9.5.1) and they are equivalent to u ∗ − w ∗ ∈ ∂ f (u) Finally, we obtain
and
w ∗ ∈ ∂ g (u).
u ∗ = (u ∗ − w ∗ ) + w ∗ ∈ ∂ f (u) + ∂ g (u),
i.e., u ∗ ∈ (∂ f + ∂ g )(u). We already stressed the fact that for u ∈ dom f , the set ∂ f (u) may be empty. The following result, which can be viewed as a corollary of Theorem 9.5.4, gives a sufficient condition for the set ∂ f (u) to be nonempty. This result, as we will see in Sections 9.6 and 9.8, is quite useful for applications. Proposition 9.5.2. Let (V , · ) be a normed space and let f : V → R ∪ {+∞} be a closed convex and proper function. Let us assume that f is continuous at u ∈ dom f . Then ∂ f (u) = and ∂ f (u) is a closed convex and bounded subset of V ∗ . PROOF. Let us apply Theorem 9.4.1 to the sum of the two closed convex and proper functions f and g = δ{u} (g is the indicator function of the singleton {u}). By assumption, f is continuous at the point u, and the qualification assumption (Q) of Theorem 9.4.1 is satisfied. Hence, for any v ∗ ∈ V ∗ , the equality ∗ )(v ∗ ) ( f + δ{u} )∗ (v ∗ ) = ( f ∗ #δ{u} ∗ )(v ∗ ) is achieved. An elementary holds, and the infimum in the formulation of ( f ∗ #δ{u} computation yields ( f + δ{u} )∗ (v ∗ ) = 〈v ∗ , u〉 − f (u), ∗ (w ∗ ) = 〈w ∗ , u〉. δ{u}
Hence, for any v ∗ ∈ V ∗ , there exists some w ∗ ∈ V ∗ such that 〈v ∗ , u〉 − f (u) = f ∗ (v ∗ − w ∗ ) + 〈w ∗ , u〉. Equivalently,
f (u) + f ∗ (v ∗ − w ∗ ) − 〈v ∗ − w ∗ , u〉 = 0.
i
i i
i
i
i
i
360
book 2014/8/6 page 360 i
Chapter 9. Convex duality and optimization
This is the Fenchel extremality relation. This is equivalent to v ∗ − w ∗ ∈ ∂ f (u), which expresses that ∂ f (u) = . Note that, as well, we may have applied Theorem 9.5.4 instead of Theorem 9.4.1 to obtain the above result. As a general rule, the set ∂ f (u) is closed and convex. This is an immediate consequence of the definition of ∂ f (u). Let us now verify that under the continuity assumption of f at u, this set is bounded. Since f is continuous at u, it is bounded on a neighborhood of u. Let r > 0 and M ≥ 0 be such that f (u + r v) ≤ M
∀v ∈ B(0, 1).
Take v ∗ ∈ ∂ f (u). By definition of ∂ f , we have for all v ∈ B(0, 1) f (u + r v) ≥ f (u) + r 〈v ∗ , v〉. Hence 〈v ∗ , v〉 ≤
1
v ∗ ∗ ≤
1
(M + | f (u)|). r This being true for any v ∈ B(0, 1), we obtain r
(M + | f (u)|),
and, as a consequence, the set ∂ f (u) is bounded. Let us now come to the central role played by the subdifferential calculus in convex optimization. The following result, despite its elementary proof (it is a straightforward consequence of the definition of ∂ f ), shows the role of the subdifferential optimality rule ∂ f (u) # 0 as a substitute to the classical Fermat rule. Proposition 9.5.3. Let (V , · ) be a normed space and let f : V → R ∪ {+∞} be a closed convex and proper function. Then, for an element u ∈ V the two following statements are equivalent: (i) f (u) ≤ f (v) for all v ∈ V ; (ii) ∂ f (u) # 0. Let us stress that the above proposition gives a necessary and sufficient condition for an element u ∈ V to be a solution of the convex minimization problem 4 5 min f (v) : v ∈ V . This necessary and sufficient condition ∂ f (u) # 0 is an extension to nonsmooth convex functions of the classical first-order necessary and sufficient condition of optimality for convex C1 functions, namely, ∇ f (u) = 0.
i
i i
i
i
i
i
9.5. Subdifferential calculus for convex functions
book 2014/8/6 page 361 i
361
Thus, for a given convex optimization problem, the problem which consists in finding the optimal solutions can be attacked by using the subdifferential calculus and solving the generalized equation ∂ f (u) # 0. As we have stressed, Legendre–Fenchel calculus and subdifferential calculus are intimately connected; playing with both of them when passing from one formulation to the other gives a lot of flexibility and makes a rich calculus. This calculus is made even richer when exploiting some of its geometrical aspects (duality via polar cones, etc.). Let us develop these ideas in the following general approach to optimization (both finite and infinite dimensional) problems. Let f0 : V → R ∪ {+∞} be a closed convex proper function (objective function) on a normed linear space V , and let C ⊂ V be a closed convex nonempty subset of V (set of constraints). Consider the following optimization problem: 5 4 (+ ) inf f0 (v) : v ∈ C . It can be written in the equivalent form 4 5 inf f (v) : v ∈ V , where f := f0 + δC . An element u ∈ V is an optimal solution of (+ ) iff ∂ f (u) # 0. To compute ∂ f we assume that the qualification assumption (Q) is satisfied: f0 is continuous at a point of C or int C ∩ dom f0 = .
(Q)
Then, Theorem 9.5.4 tells us that it is equivalent to look for a solution of the equation ∂ f0 (u) + ∂ δC (u) # 0. To describe the subdifferential of the indicator function of a closed convex set C , we need to introduce the notion of tangent and normal cone to C at a point u ∈ C . Definition 9.5.2. Let C be a closed convex nonempty subset of a normed space V and let u ∈ C. (a) The tangent cone to C at u, denoted by TC (u), is defined by TC (u) =
3
λ(C − u).
λ≥0
It is the closure of the cone spanned by C − u. (b) The normal cone (also called outward normal cone) NC (u) to C at u ∈ C is the polar cone of the tangent cone: 4 5 NC (u) = v ∗ ∈ V ∗ : 〈v ∗ , v〉 ≤ 0 ∀ v ∈ TC (u) 4 5 = v ∗ ∈ V ∗ : 〈v ∗ , v − u〉 ≤ 0 ∀ v ∈ C . Proposition 9.5.4. Let C be a closed convex nonempty subset of a normed space V . Then, for every u ∈ C , ∂ δC (u) = NC (u).
i
i i
i
i
i
i
362
book 2014/8/6 page 362 i
Chapter 9. Convex duality and optimization
PROOF. By definition of the subdifferential u ∗ ∈ ∂ δC (u) ⇐⇒ δC (v) ≥ δC (u) + 〈u ∗ , v − u〉 ∀v ∈ C u ∈ C, ⇐⇒ 〈u ∗ , v − u〉 ≤ 0 ∀ v ∈ C u ∈ C, ⇐⇒ 〈u ∗ , v〉 ≤ 0 ∀ v ∈ TC (u), that is, u ∗ ∈ NC (u). An equivalent and quite useful characterization of NC (u) is given by the Fenchel extremality relation: u ∗ ∈ NC (u) ⇐⇒ u ∗ ∈ ∂ δC (u)
⇐⇒ δC (u) + δC∗ (u ∗ ) = 〈u ∗ , u〉 ⇐⇒ σC (u ∗ ) = 〈u ∗ , u〉,
where we have used that δC∗ = σC (see Proposition 9.3.1). Let us formulate this result precisely. Proposition 9.5.5. Let C be a closed convex nonempty subset of a normed linear space V . For every u ∈ C we have 4 5 NC (u) = u ∗ ∈ V ∗ : 〈u ∗ , u〉 = max{〈u ∗ , v〉 : v ∈ C } . Equivalently, an element u ∗ of NC (u) is characterized by the fact that the linear form v → 〈u ∗ , v〉 attains its maximum on C at the point u. Let us come back to the convex constrained optimization problem (+ ). We can summarize the previous results in the following statement. Theorem 9.5.5. Let (V , · ) be a normed space, let f0 : V → R ∪ {+∞} be a closed convex and proper function, and let C ⊂ V be a closed convex nonempty subset. We assume that one of the two following qualification assumptions (Q1 ) or (Q2 ) is satisfied: f0 is continuous at some point of C ,
(Q1 )
dom f0 ∩ int C = .
(Q2 )
Then the following statements are equivalent: (i) u is an optimal solution of the minimization problem (+ ) 5 4 min f0 (v) : v ∈ C ;
(+ )
(ii) u is a solution of the equation ∂ f0 (u) + NC (u) # 0; (iii) there exists some u ∗ ∈ V ∗ such that ⎧ ⎨ u ∈ C, u ∗ ∈ ∂ f (u), ⎩ 〈u ∗ , v −0u〉 ≥ 0 ∀ v ∈ C .
i
i i
i
i
i
i
9.5. Subdifferential calculus for convex functions
book 2014/8/6 page 363 i
363
To go further we need to enrich the model and give more information on the structure of the set of constraints C . Because of its practical importance, in the next subsection we are going to pay particular attention to the mathematical convex programming theory (and in particular to linear programming) and the theory of multipliers. We will see how the notion of the dual problem naturally occurs. When f0 is a smooth convex function, say, f0 ∈ C1 (V , R), Theorem 9.5.5 takes the following simpler equivalent form: u is an optimal solution of the above minimization problem (+ ) iff (iii)
u ∈ C and 〈∇ f0 (u), v − u〉 ≥ 0 for every v ∈ C .
Problem (iii) is a particular case of the following general variational inequality problem: given an operator A : V → V ∗ and z ∈ V ∗
find u ∈ C such that 〈Au, v − u〉 ≥ 〈z, v − u〉
∀ v ∈ C.
Note that when C = V (i.e., there are no constraints), the above problem reduces to the standard equation Au = z. As an example, let us examine the important case where C is a closed convex cone such that C ∩ (−C ) = {0}. Then, C is equal to the positive cone for the partial ordering v ≥ u ⇐⇒ v − u ∈ C . Then problem (iii) takes the following equivalent form: ⎧ ⎨ ∇ f0 (u) ≥ 0, u ≥ 0, ⎩ 〈∇ f (u), u〉 = 0. 0 (The last equality is obtained by taking successively v = 0 and v = 2u in (iii).) This type of problem is called a complementarity problem. 4 5 Take now a closely related problem where C = v ∈ V : v ≥ g , where g ∈ V is given. One can easily obtain that (iii) becomes ⎧ ⎨ ∇ f0 (u) ≥ 0, u ≥ g, ⎩ 〈∇ f (u), u − g 〉 = 0. 0 When V = H01 (Ω) and f0 (v) =
1 2
Ω
|∇v(x)|2 d x is the Dirichlet integral, we obtain
⎧ ⎨ −Δu ≥ 0, u ≥ g, ⎩ 〈Δu, u − g 〉 = 0. The first condition expresses that −Δu = μ ≥ 0 is a nonnegative Radon measure. The last condition (complementary condition) can be recognized as Ω
( u˜ − g˜ ) d μ = 0,
where u˜ and g˜ are the quasi-continuous representatives of u and g . It expresses that μ = −Δu does not charge the set where u˜ > g˜ .
i
i i
i
i
i
i
364
book 2014/8/6 page 364 i
Chapter 9. Convex duality and optimization
In other words, μ is concentrated on the contact set ω = { u˜ = g˜ } and we have to solve the free boundary value problem: ⎧ ⎨−Δu = 0 on Ω \ ω, u = g on ω, ⎩ u = 0 on ∂ Ω.
9.6 Mathematical programming: Multipliers and duality In this section, (V , .) is a normed space. Mathematical programming is concerned with optimization problems of the form 5 4 (+ ) min f0 (v) : f1 (v) ≤ 0, . . . , fn (v) ≤ 0 , where fi (i = 1, . . . , n) are given functions from V into R. Thus, a mathematical programming problem is an optimization problem where the constraint C has the following specific form: 5 4 C = v ∈ V : fi (v) ≤ 0, i = 1, . . . , n . This problem is of fundamental importance; a large number of problems in decision sciences, engineering, and so forth can be written as mathematical programming problems. The mathematical analysis of this kind of problem depends heavily on the geometrical properties of the functions fi (i = 0, . . . , n). When the functions fi are affine, (+ ) is called a linear programming problem. When fi (i = 1, . . . , n) are affine and f0 is quadratic, (+ ) is called a quadratic programming problem. In this section, we study the situation where f0 , f1 , . . . , fn are supposed to be convex functions. Thus (+ ) is a convex minimization problem ( f0 and C are convex); it is called a convex mathematical programming problem.
9.6.1 Karush–Kuhn–Tucker optimality conditions The following theorem, which is the central result of this section, will be obtained by applying Theorem 9.5.5 to our situation. Because of the specific form of the constraint C , the constraint qualification assumption (Q) takes a quite simple form (this is the Slater qualification assumption). The computation of the normal cone NC (u) provides, as fundamental mathematical objects, the Karush–Kuhn–Tucker optimality conditions and the corresponding Lagrange multipliers. Theorem 9.6.1. Suppose that V is a normed space, f0 : V → R ∪ {+∞} is closed convex proper, and f1 , . . . , fn : V → R are convex and continuous. Suppose moreover that the following Slater qualification assumption is satisfied: There exists some v0 ∈ V such that f0 (v0 ) < +∞ and such that fi (v0 ) < 0 ∀i = 1, . . . , n. Then the following statements are equivalent: (i) u is a solution of problem (+ ) above; (ii) there exist λ1 , λ2 , . . . , λn in R+ such that ⎧ ∂ f0 (u) + λ1 ∂ f1 (u) + · · · + λn ∂ fn (u) # 0, ⎪ ⎪ ⎪ ⎨ λ ≥ 0 ∀i = 1, . . . , n, i ⎪ f ⎪ i (u) ≤ 0 ∀i = 1, . . . , n, ⎪ ⎩ λi fi (u) = 0 ∀i = 1, . . . , n.
i
i i
i
i
i
i
9.6. Mathematical programming: Multipliers and duality
book 2014/8/6 page 365 i
365
The central point of the proof of Theorem 9.6.1 is the computation of the normal to do it first when C is a closed half-space (that is, when cone 4NC (u). We are going 5 C = v ∈ V : f (v) ≤ 0 with f affine continuous) and then in the general case. Lemma 9.6.1. Let (V , .) be a normed space and u ∗ ∈ V ∗ with u ∗ = 0. Let us consider the closed half-space 4 5 = v ∈ V : 〈u ∗ , v − u〉 ≤ 0 . Then, N (u) = R+ u ∗ . In other words, v ∗ ∈ V ∗ belongs to the normal cone to at u iff there exists some λ ≥ 0 such that v ∗ = λu ∗ . PROOF. The inclusion R+ u ∗ ⊂ N (u) is immediate: by the definition of , we have 〈u ∗ , v − u〉 ≤ 0 for all v ∈ . Hence u ∗ ∈ N (u) and R+ u ∗ ⊂ N (u). Conversely, take v ∗ ∈ N (u), v ∗ = 0 (the case v ∗ = 0 is trivial). By definition of N (u), we have (9.9) 〈v ∗ , v − u〉 ≤ 0 ∀ v ∈ . As a particular subset of , let us consider the affine subspace 5 4 W = v ∈ V : 〈u ∗ , v − u〉 = 0 . We have W = u + M , where M = ker u ∗ is the hyperspace 5 4 M = v ∈ V : 〈u ∗ , v〉 = 0 . By taking in (9.9) elements v belonging to W = u + M , we obtain 〈v ∗ , v〉 ≤ 0
∀ v ∈ M.
Then, replace v by −v (M is a subspace) to obtain 〈v ∗ , v〉 = 0
∀ v ∈ M.
We now follow a standard device in linear algebra. Take an arbitrary element w ∈ V , w ∈ M ; noticing that 〈u ∗ , v〉 ∗ w = 0, u ,v − ∗ 〈u , w〉 we deduce that for every v ∈ V , v−
〈u ∗ , v〉 〈u ∗ , w〉
w ∈ M = ker u ∗ .
Since v ∗ = 0 on M , we have ∗
∗
〈v , v〉 = v , that is,
∗
〈v , v〉 =
〈u ∗ , v〉 〈u ∗ , w〉
〈v ∗ , w〉 〈u ∗ , w〉
w ,
∗
u ,v .
i
i i
i
i
i
i
366
book 2014/8/6 page 366 i
Chapter 9. Convex duality and optimization
This being true for all v ∈ V , we finally obtain v∗ =
〈v ∗ , w〉 〈u ∗ , w〉
u ∗,
i.e., v ∗ = t u ∗ for some t ∈ R. Until now, we have exploited only a part of the information given by (9.9). Returning to (9.9), t must satisfy t 〈u ∗ , v − u〉 ≤ 0 ∀ v ∈ . Since for all v ∈ 〈u ∗ , v − u〉 ≤ 0, we necessarily have t ≥ 0. 4 5 Let us now examine the situation where C = v ∈ V : f (v) ≤ 0 and compute the normal cone NC (u) at an arbitrary point u of C . Proposition 9.6.1. Suppose that f : V → R is a convex continuous function on a normed linear space V . Set 4 5 C = v ∈ V : f (v) ≤ 0 and assume that C satisfies the following Slater property: there exists some v0 ∈ C such that f (v0 ) < 0. Then, for every u ∈ C NC (u) =
{0} if f (u) < 0, R+ ∂ f (u) if f (u) = 0.
As a consequence, u ∗ ∈ NC (u) ⇐⇒ ∃λ ≥ 0 such that u ∗ ∈ λ∂ f (u) and λ f (u) = 0. PROOF. Take u ∈ C . If f (u) < 0, because of the continuity of f , we have u ∈ int C , which yields TC (u) = V and hence NC (u) = {0}. If on the contrary f (u) = 0, let us prove that NC (u) = R+ ∂ f (u). The inclusion R+ ∂ f (u) ⊂ NC (u) is quite easy to verify: take u ∗ ∈ ∂ f (u); by definition of the subdifferential ∂ f (u) of f at u ∀v ∈ V
f (v) ≥ f (u) + 〈u ∗ , v − u〉.
Noticing that f (u) = 0 and f (v) ≤ 0 for all v ∈ C , we obtain 〈u ∗ , v − u〉 ≤ 0 ∀ v ∈ C , i.e., u ∗ ∈ NC (u). Since NC (u) is a cone, we obtain R+ ∂ f (u) ⊂ NC (u). Let us now prove the opposite inclusion, which is the delicate part of the proof: NC (u) ⊂ R+ ∂ f (u). Equivalently, we have to prove that if f (u) = 0 and u ∗ ∈ NC (u), then there exists some λ ≥ 0 such that u ∗ ∈ λ∂ f (u). The case u ∗ = 0 is trivial, so we assume in the following that u ∗ = 0. We are going to prove the existence of such λ by using a variational argument. As a direct consequence of the definition of the normal cone, we
i
i i
i
i
i
i
9.6. Mathematical programming: Multipliers and duality
book 2014/8/6 page 367 i
367
have (see Proposition 9.5.5) the equivalence u ∗ ∈ NC (u) > the linear form v → 〈u ∗ , v〉 attains its maximal value on C at u ∈ C . As a general property of a linear form, the maximum of the linear form v → 〈u ∗ , v〉 on C is attained on its boundary and v ∈ int C =⇒ 〈u ∗ , v〉 < 〈u ∗ , u〉. Hence
f (v) < 0 =⇒ 〈u ∗ , v〉 < 〈u ∗ , u〉.
Therefore
〈u ∗ , v − u〉 ≥ 0 =⇒ f (v) ≥ 0, 4 5 that is, on the closed half-space = v ∈ V : 〈u ∗ , v − u〉 ≥ 0 we have f (v) ≥ 0. Noticing that u ∈ and f (u) = 0, we have the following variational property: “ f achieves its minimal value on the half-space at the point u.” Hence, ∂ ( f + δ )(u) # 0. Since f is continuous, we can apply Theorem 9.5.5 to obtain ∂ f (u) + N (u) # 0. We are in the situation described in Lemma 9.6.1. Noticing that 4 5 = v ∈ V : 〈−u ∗ , v − u〉 ≤ 0 , we thus have N (u) = R+ (−u ∗ ) = R− (u ∗ ). As a consequence, there exists some t ≤ 0 such that ∂ f (u) + t u ∗ # 0. Let us finally prove that t < 0. Otherwise, t = 0 and ∂ f (u) # 0, which expresses that f attains its minimal value at u. This is impossible because f (u0 ) < 0 (Slater condition) and f (u) = 0. Thus t < 0, and, dividing by t the above relation, we obtain u ∗ ∈ − 1t ∂ f (u), i.e., u ∗ ∈ R+ ∂ f (u). We have now all the elements to prove Theorem 9.6.1. PROOF OF THEOREM 9.6.1. Let us first verify that all the assumptions of Theorem 9.5.5 are satisfied. Since the functions fi are continuous, the Slater condition implies that v0 ∈ int C . Since f0 (v0 ) < +∞, we have dom f ∩ int C = and the qualification assumption (Q2 ) is satisfied. Thus u is a solution of the convex programming problem (+ ) iff ∂ f0 (u) + NC (u) # 0. (n 4 5 Then notice that C = i =1 Ci , where Ci = v ∈ V : fi (v) ≤ 0 , which is equivalent to saying that δC = δC1 + · · · + δCn . The Slater condition implies that each of the closed convex functions fi = δCi is continuous at the point v0 . Thus, the subdifferential rule for
i
i i
i
i
i
i
368
book 2014/8/6 page 368 i
Chapter 9. Convex duality and optimization
the sum of convex functions (see Theorem 9.5.4) gives ∂ δC = ∂ δC 1 + · · · + ∂ δC n , that is, for any u ∈ C ,
NC (u) = NC1 (u) + · · · + NCn (u).
We now combine these results with Proposition 9.6.1 to obtain the existence of real numbers λ1 ≥ 0, . . . , λn ≥ 0 such that ∂ f0 (u) + λ1 ∂ f1 (u) + · · · + λn ∂ fn (u) # 0, λi = 0 if fi (u) < 0. Thus, in all cases λi fi (u) = 0.
9.6.2 The marginal approach to multipliers Let us first restate Theorem 9.6.1 in a variational way. Proposition 9.6.2. Assume that the hypotheses of Theorem 9.6.1 are satisfied. Let u be an optimal solution of the minimization problem 5 4 i = 1, . . . , n . (+ ) min f0 (v) : fi (v) ≤ 0, n (a) Then, there exists some vector λ ∈ R+ such that u is a solution of the unconstrained minimization problem: n min f0 (v) + λi fi (v) : v ∈ V . (+λ ) i =1
Moreover, the complementarity slackness condition holds: λi fi (u) = 0,
i = 1, . . . , n.
n , u is a solution of the unconstrained minimization (b) Conversely, if for some λ ∈ R+ problem (+λ ) and fi (u) ≤ 0, i = 1, . . . , n,
λi fi (u) = 0,
i = 1, . . . , n,
then u is an optimal solution of the minimization problem (+ ). PROOF. Just notice that, since for all i = 1, . . . , n the functions fi are supposed to be continuous, the additivity rule for subdifferentials holds, ∂ f0 + λ1 ∂ f1 + · · · + λn ∂ fn = ∂ ( f0 + λ1 f1 + · · · + λn fn ), and the Karush–Kuhn–Tucker condition can be written in the form
n λi fi (u) # 0. ∂ f0 + i =1
This expresses that u is a solution of the convex unconstrained minimization problem (+λ ).
i
i i
i
i
i
i
9.6. Mathematical programming: Multipliers and duality
book 2014/8/6 page 369 i
369
Definition 9.6.1. Let u be an optimal solution of the minimization problem (+ ) above. n a Lagrange multiplier vector for u if We call a vector λ ∈ R+ ∂ f0 (u) + and
n i =1
λi ∂ fi (u) # 0
λi fi (u) = 0,
i = 1, . . . , n.
The determination of Lagrange multipliers is a central question since, if we are able to compute a Lagrange multiplier λ(u) of an optimal solution u, then u can be obtained as a solution of the unconstrained minimization problem n λi fi (v) : v ∈ V . (+λ ) min f0 (v) + i =1
Let us first notice that the set of Lagrange multipliers does not depend on the solution u, i.e., if u1 and u2 are two solutions of the minimization problem (+ ), then M (u1 ) = M (u2 ), where M (ui ) is the set of Lagrange multipliers of the solution ui . Indeed, this is a consequence of the following characterization of Lagrange multipliers. Proposition 9.6.3. Let u be an optimal solution of the minimization problem 4 5 min f0 (v) : fi (v) ≤ 0, i = 1, . . . , n . Then, the set of Lagrange multipliers for u is equal to
M=
n λ ∈ R+
: inf f0 = inf f0 + C
V
n i =1
(+ )
λi fi
,
4 5 where C = v ∈ V : fi (v) ≤ 0 for all i = 1, . . . , n is the set of constraints. PROOF. Take a Lagrange multiplier λ for u. Then u is a solution of the unconstrained minimization problem n n λi fi (u) = inf f0 (v) + λi fi (v) : v ∈ V . f0 (u) + i =1
V
i =1
Because of the complementarity slackness property we deduce n λi fi . f0 (u) = inf f0 + V
i =1
On the other hand, since u is an optimal solution of (+ ), we have 4 5 f0 (u) = inf f0 + δC , which proves that λ ∈ M . Conversely, let us suppose that λ ∈ M . Then, f0 (u) ≤ f0 (u) +
n i =1
λi fi (u)
i
i i
i
i
i
i
370
book 2014/8/6 page 370 i
Chapter 9. Convex duality and optimization
and ni=1 λi fi (u) ≥ 0. Since λi ≥ 0 and fi (u) ≤ 0, this implies λi fi (u) = 0 for all i = 1, . . . , n. Hence n n λi fi (u) = inf f0 (v) + λi fi (v) : v ∈ V , f0 (u) + V
i =1
i =1
which expresses that u is a solution of the unconstrained minimization problem n min f0 (v) + λi fi (v) : v ∈ V . i =1
As a consequence, ∂ f0 (u) +
n i =1
λi fi (u) # 0,
which, together with λi ≥ 0, fi (u) ≤ 0, and λi fi (u) = 0, tells us that λ is a Lagrange multiplier vector for u. Clearly, the set M is independent of u solution of (+ ). Thus we can speak of the set of Lagrange multipliers of a convex program. Indeed, the definition of the set M makes sense, and the set M may be nonempty, even when there is no solution of the convex program (+ ). This leads us to give the following definition. Definition 9.6.2. For a given convex program 4 inf f0 (v) : fi (v) ≤ 0,
5 i = 1, . . . , n ,
(+ )
the set M of generalized Lagrange multiplier vectors is defined by
n n λi fi , M = λ ∈ R+ : inf f0 + δC = inf f0 + V
4
V
i =1
5
where C = v ∈ V : fi (v) ≤ 0, i = 1, . . . , n . When the problem (+ ) has a solution, then M is the set of Lagrange multiplier vectors for (+ ). Without ambiguity, in what follows we will omit the word “generalized.” We are going to characterize the set M by using marginal analysis. Definition 9.6.3. The value function attached to a convex program 5 4 inf f0 (v) : fi (v) ≤ 0 ∀ i = 1, . . . , n
(+ )
is the function p : Rn → R which is defined, for every y = (y1 , y2 . . . , yn ) ∈ Rn , by 4 5 p(y) := inf f0 (v) : fi (v) ≤ yi ∀ i = 1, . . . , n . The function p is also called the marginal function. Let us observe that the value function is the optimal value of the perturbed convex program (+y ) 5 4 (+y ) inf f0 (v) : fi (v) ≤ yi ∀ i = 1, . . . , n .
i
i i
i
i
i
i
9.6. Mathematical programming: Multipliers and duality
book 2014/8/6 page 371 i
371
The initial problem, or unperturbed problem, corresponds to the case y = 0, i.e., (+ ) = (+0 ). We also notice that the value function may take the value −∞, which may be a source of difficulties. We are going to show that Lagrange multiplier vectors for problem (+ ) correspond to subgradients of the value function p. Theorem 9.6.2. Consider the convex minimization problem 5 4 inf f0 (v) : fi (v) ≤ 0 ∀ i = 1, . . . , n
(+ )
and its value function p : Rn → R 4 5 p(y) = inf f0 (v) : fi (v) ≤ yi ∀ i = 1, . . . , n . Then the following properties hold: (a) the value function p is convex; (b) if p(0) ∈ R, then M = −∂ p(0), i.e., the set of generalized Lagrange multiplier vectors for (+ ) is equal to the opposite of the subdifferential of p at the origin; (c) if p(0) ∈ R and the Slater qualification assumption is satisfied, then p is continuous at n . the origin and, as a consequence, M is a nonempty, closed, convex, bounded set in R+ PROOF. (a) We notice that
p(y) = inf f (v, y), v∈V 4 5 where f (v, y) = f0 (v) + δC (y) (v) and C (y) = v ∈ V : fi (v) ≤ yi for all i = 1, . . . , n . Let us verify that the mapping (v, y) → δC (y) (v) is convex. We just need to verify that for every (u, z) ∈ V × Rn and (v, y) ∈ V × Rn such that u ∈ C (z) and v ∈ C (y), we still have λu + (1 − λ)v ∈ C (λz + (1 − λ)y) for all λ ∈ [0, 1]. Indeed, this is an immediate consequence of the convexity of functions fi : we have fi (λu + (1 − λ)v) ≤ λ fi (u) + (1 − λ) fi (v) ≤ λzi + (1 − λ)yi = (λz + (1 − λ)y)i . Since f0 is convex, we obtain that f is convex with respect to the pair (v, y). The convexity of the value function p is then a consequence of Proposition 9.2.3. n (b) We first prove that every generalized Lagrange multiplier vector λ ∈ R+ satisfies −λ ∈ ∂ p(0). Equivalently, we need to prove that ∀y ∈ Rn
p(y) ≥ p(0) −
n i =1
λi yi .
By definition of p and by Definition 9.6.2 of generalized Lagrange multiplier vectors, we have 4 5 p(0) = inf f0 + δC n λi fi . = inf f0 + i =1
Take an arbitrary y ∈ Rn and denote by C (y) the set 5 4 i = 1, . . . , n . C (y) = v ∈ V : fi (v) ≤ yi ,
i
i i
i
i
i
i
372
book 2014/8/6 page 372 i
Chapter 9. Convex duality and optimization
For every v ∈ C (y) we have ni=1 λi fi (v) ≤ ni=1 λi yi (recall that λi ≥ 0 for all i = 1, . . . , n). Hence, for all v ∈ C (y), n
p(0) ≤ f0 (v) +
i =1
λi yi .
As a consequence, by taking the infimum with respect to v ∈ C (y), we obtain p(0) ≤ p(y) +
n i =1
λi yi .
Let us now prove that, conversely, if −λ ∈ ∂ p(0), then λ is a generalized Lagrange multiplier vector for (+ ). n n . Indeed, for every y ∈ R+ we have C ⊂ C (y), and as a We first prove that λ ∈ R+ consequence p(y) ≤ p(0). Combining this inequality and the subdifferential inequality p(y) ≥ p(0) −
n i =1
we obtain
n i =1
λi yi ,
λi yi ≥ 0.
n n This being true for all y ∈ R+ , we obtain that λ ∈ R+ . Let us now prove that
n λi fi . inf f0 + δC = inf f0 + V
V
i =1
Equivalently, we need to prove that
p(0) = inf f0 + V
n i =1
λi fi .
n The inequality p(0) ≥ infV f0 + ni=1 λi fi is always true for arbitrary λ ∈ R+ : indeed, for every v ∈ C , we have fi (v) ≤ 0 and hence λi fi (v) ≤ 0. This immediately yields
n n λi fi ≤ inf f0 + λi fi inf f0 + V
C
i =1
i =1
≤ inf f0 = p(0). C
The opposite inequality p(0) ≤ inf f0 + We thus have for each y ∈ Rn p(y) +
n i =1
n
i =1
λi fi relies on the fact that −λ ∈ ∂ p(0).
λi yi ≥ p(0).
i
i i
i
i
i
i
9.6. Mathematical programming: Multipliers and duality
book 2014/8/6 page 373 i
373
Take an arbitrary v ∈ V and choose correspondingly yi = fi (v) for all i = 1, . . . , n. Thus, we have v ∈ C (y) and p(y) ≤ f0 (v). As a consequence, f0 (v) +
n i =1
λi fi (v) ≥ p(0).
This being true for all v ∈ V , by taking the infimum with respect to v, we obtain
n inf f0 + λi fi ≥ p(0). V
i =1
n Finally, we have proved that λ ∈ R+ and
inf( f0 + δC ) = inf f0 + V
V
n i =1
λi fi .
By Definition 9.6.2, λ is a generalized Lagrange multiplier vector. (c) By the Slater qualification assumption, there exists some v0 ∈ dom f0 such that fi (v0 ) < 0 for all i = 1, . . . , n. Thus, we can find a neighborhood of the origin in Rn , say, B(0, r ) with r > 0, such that ∀y ∈ B(0, r ), ∀ i = 1, . . . , n fi (v0 ) < yi . 4 5 (It is enough to take, for example, r = 12 inf | fi (v0 )| : i = 1, . . . , n .) By definition of the value function p, we have ∀y ∈ B(0, r )
p(y) ≤ f0 (v0 ).
Since f0 (v0 ) < +∞, p is bounded from above on the ball B(0, r ). Let us prove that this property, together with p(0) ∈ R, implies ∀y ∈ Rn
p(y) > −∞.
We first formulate the properties above in terms of epigraphs. We have B(0, r ) × [ f0 (v0 ), +∞[ ⊂ epi p. If p(y) = −∞ for some y ∈ Rn , we would have {y} × R ⊂ epi p. Take ξ = −αy with α > 0 to have |ξ | < r , for example, α = r /(2|y|). We can write α = (1 − λ)/λ for some 0 < λ < 1, which gives λξ + (1 − λ)y = 0. Then we observe that (ξ , f0 (v0 )) ∈ B(0, r ) × [ f0 (v0 ), +∞[, (y, t ) ∈ {y} × R for every t ∈ R. By the convexity of epi p we obtain λξ + (1 − λ)y, λ f0 (v0 ) + (1 − λ)t ∈ epi p, i.e.,
p(0) ≤ λ f0 (v0 ) + (1 − λ)t
for every t ∈ R.
Since 0 < λ < 1, this implies p(0) = −∞, a contradiction.
i
i i
i
i
i
i
374
book 2014/8/6 page 374 i
Chapter 9. Convex duality and optimization
We can now apply Theorem 9.2.2: the function p : Rn → R ∪ {+∞} is convex and majorized on a neighborhood of the origin. Hence, p is continuous at the origin. By Proposition 9.5.2, ∂ p(0) = , and the set M = −∂ p(0) is a nonempty closed convex n . bounded set in R+ Let us now introduce a dual minimization problem (+ ∗ ) to the convex program (+ ) and show that the generalized Lagrange multiplier vectors are the solutions of this dual problem (+ ∗ ). Theorem 9.6.3 (dual convex program). Let us consider a convex program 4 5 inf f0 (v) : fi (v) ≤ 0 ∀ i = 1, . . . , n
(+ )
and let p : Rn → R be its value function. We assume that p(0) ∈ R and that the Slater qualification assumption holds. Then the following hold: (a) The generalized Lagrange multiplier vectors of (+ ) are the solutions of the maximization problem 5 4 n , (+ ∗ ) sup − p ∗ (−λ) : λ ∈ R+ which is called the dual problem of (+ ). The set of solutions of (+ ∗ ) is a nonempty closed n convex bounded subset of R+ . n (b) For every λ ∈ R+ , the equality ∗
− p (−λ) = inf
v∈V
f0 (v) +
n i =1
λi fi (v)
holds and, as a consequence, the dual problem (+ ∗ ) can be written in the form n sup inf f0 (v) + λi fi (v) . n v∈V λ∈R+
(+ ∗ )
i =1
PROOF. By Theorem 9.6.2, we have the following equivalence: λ ∈ M ⇐⇒ −λ ∈ ∂ p(0). We know that p is a convex function. Thus, by using the Fenchel extremality relation (Proposition 9.5.1), we obtain λ ∈ M ⇐⇒ p(0) + p ∗ (−λ) = 0. Theorem 9.6.2 also tells us that p is continuous at the origin. By Proposition 9.3.2, we thus have p(0) = p ∗∗ (0). Noticing that p ∗∗ (0) = sup{− p ∗ (μ) : μ ∈ Rn }, we obtain λ ∈ M ⇐⇒ − p ∗ (−λ) = sup − p ∗ (μ) μ∈Rn
= sup − p ∗ (−μ). μ∈Rn
Thus, λ ∈ M iff λ is a solution of the maximization problem sup − p ∗ (−μ).
μ∈Rn
i
i i
i
i
i
i
9.6. Mathematical programming: Multipliers and duality
book 2014/8/6 page 375 i
375
Let us now compute p ∗ : 4 5 p ∗ (μ) = sup 〈μ, y〉 − p(y) y
55 4 4 = sup 〈μ, y〉 − inf f0 (v) : fi (v) ≤ yi ∀ i = 1, . . . , n y
5 4 = sup 〈μ, y〉 − f0 (v) : y ∈ Rn , fi (v) ≤ yi ∀ i = 1, . . . , n . y
n If for some i ∈ {1, . . . , n} we have μi > 0, then p ∗ (μ) = +∞. Otherwise, when −μ ∈ R+ we have n p ∗ (μ) = sup − f0 (v) + sup μi y i v∈V
= sup v∈V
Hence
yi ≥ fi (v) i =1
n i =1
μi fi (v) − f0 (v) .
⎧ n ⎪ ⎨ inf f (v) + μ f (v) if μ ∈ Rn , i i + − p ∗ (−μ) = v∈V 0 i =1 ⎪ ⎩ −∞ otherwise,
and the dual problem can be written sup inf
n v∈V μ∈R+
f0 (v) +
n i =1
μi fi (v) .
9.6.3 The Lagrangian approach to duality In the framework of convex problems, and thanks to the Legendre–Fenchel transform, we have seen that a large number of mathematical objects can be paired with a dual one. Indeed we are going to go further and see how to realize the duality of optimization problems themselves. In the previous section, we introduced a variational problem (+ ∗ ), called the dual problem of (+ ). We are going to justify this terminology and explore how the primal problem (+ ) and its dual (+ ∗ ) are related to each other in remarkable ways. Let us introduce some basic notations and concepts. The convex program 4 5 inf f0 (v) : fi (v) ≤ 0 ∀ i = 1, . . . , n (+ ) is called the primal problem. The key notion in the duality theory for optimization problems is the Lagrangian. Definition 9.6.4. The Lagrangian function attached to the convex program (+ ) is the funcn tion L : V × R+ → R ∪ {+∞} defined by L(v, λ) = f0 (v) +
n i =1
λi fi (v).
We already noticed that this expression plays a central role in the theory of Lagrange multipliers. The new aspect in the definition above is to consider this expression as a
i
i i
i
i
i
i
376
book 2014/8/6 page 376 i
Chapter 9. Convex duality and optimization
bivariate function, i.e., L is a function of the two variables v and λ. This is a big step, since we are no longer concerned only with the primal problem, its solutions, and the characterization of its solutions: we choose to give from the very beginning an equivalent status to the variables v and λ. Let us first observe that the Lagrangian function L encapsulates all the information of the primal problem (+ ). Clearly sup L(v, λ) = f0 (v) + δC (v),
n λ∈R+
4 5 where C = v ∈ V : fi (v) ≤ 0 for all i = 1, . . . , n is the set of constraints. Thus, the primal problem can be equivalently written as an inf-sup problem, namely, inf sup L(v, λ).
v∈V λ∈Rn +
(+ )
Let us denote by α ∈ R the optimal value of (+ ) α := inf sup L(v, λ); v∈V λ∈Rn +
α is called the primal value. This formulation of (+ ) makes it rather natural to consider the associated variational problem (+ ∗ ) sup inf L(v, λ), n v∈V λ∈R+
called the dual problem, which is obtained by interchanging the order of the sup and inf operators. Indeed, this formulation fits perfectly with the conclusion of Theorem 9.6.3, where it is shown that, under some assumptions, Lagrange multipliers are solutions of the dual problem (+ ∗ ). Let us denote by β ∈ R the optimal value of (+ ∗ ) β := sup inf L(v, λ); n v∈V λ∈R+
β is called the dual value. n The dual problem therefore consists in maximizing over vectors λ ∈ R+ the dual function d (λ) := inf L(v, λ). v∈V
∗
Note that the dual problem (+ ) is well defined without any assumptions on the functions fi , i = 1, . . . , n. One has always β ≤ α, because supX infY ≤ infY supX is always true. It can happen that the primal value α is strictly larger than the dual value β. In this case, we say that there is a duality gap. A basic question is to find conditions ensuring that there is no duality gap. When this is the case, the primal and the dual problem are connected through a rich calculus involving value functions, Legendre–Fenchel transform and subdifferentials, and minimax and saddle value problems. The notion of the saddle value and the saddle point of the Lagrangian function is also fundamental: it permits us to treat in a unifying way the primal and the dual aspects of
i
i i
i
i
i
i
9.6. Mathematical programming: Multipliers and duality
book 2014/8/6 page 377 i
377
optimization problems and will allow us to develop all these ideas in a far more general setting in the next section. Definition 9.6.5. Let L : X × Y → R be a bivariate function where X and Y are arbitrary spaces. A point (x, y) ∈ X × Y is called a saddle point of L if max L(x, y) = L(x, y) = min L(x, y). y∈Y
x∈X
Equivalently, (x, y) is a saddle point of L if L(x, y) ≤ L(x, y) ≤ L(x, y)
∀ x ∈ X , y ∈ Y.
Another way to say this is (a) x is a solution of the minimization problem inf x∈X L(x, y), (b) y is a solution of the maximization problem maxy∈Y L(x, y). Note that the existence of a saddle point (x, y) implies that there is no duality gap. This follows from the equalities sup L(x, y) = L(x, y) = inf L(x, y), x∈X
y∈Y
which imply α = inf sup L(x, y) ≤ L(x, y) ≤ sup inf L(x, y) = β. x
y
y
x
Since α ≥ β is always true we obtain L(x, y) = inf sup L(x, y) = sup inf L(x, y). x
y
y
x
The converse is not true in general: it is possible to have no duality gap without the existence of saddle points. We can now reformulate the conclusions of Theorem 9.6.3 in the following form. Theorem 9.6.4. Consider a convex program (+ ) and assume that the Slater condition holds. Then the following facts hold true: (a) There is no duality gap, i.e., the primal and the dual values are equal; let us call it the optimal value. (b) (dual attainment) Assuming moreover that the optimal value is finite, then the set of solutions of the dual problem (+ ∗ ) is nonempty: it is the set of generalized Lagrange multipliers of problem (+ ), and it is convex and bounded. (c) (saddle point formulation of primal solutions) The following assertions are equivalent: (i) u is a solution of the primal problem (+ ); n such that (u, λ) is a saddle point of the Lagrangian (ii) there exists a vector λ ∈ R+ function L.
If (ii) is satisfied, then λ is a Lagrange multiplier of the optimal solution u, and it is a solution of the dual problem (+ ∗ ).
i
i i
i
i
i
i
378
book 2014/8/6 page 378 i
Chapter 9. Convex duality and optimization
PROOF. (a) If α = −∞, there is nothing to prove, since we know that α ≥ β. Otherwise, if α is finite, we are in the situation which was studied in Theorem 9.6.2: the Slater condition implies ∂ p(0) = and the set M of generalized Lagrange multiplier vectors is nonempty. For every λ ∈ M we have α = inf sup L(v, λ) v
λ
= inf L(v, λ) v
≤ sup inf L(v, λ) = β. λ
v
Since α ≥ β is always true, we obtain α = β. (b) It is just a reformulation of Theorem 9.6.3: the set of solutions of the dual problem (+ ∗ ) has been characterized with the help of the value function: λ solution of (+ ∗ ) ⇐⇒ −λ ∈ ∂ p(0) ⇐⇒ λ generalized Lagrange multiplier of (+ ). (c) Let us first prove the implication (i) =⇒ (ii). Let u be a solution of problem (+ ). Then, the Slater condition implies the existence of a Lagrange multiplier λ associated to u (Theorem 9.6.1 and Proposition 9.6.3). Therefore, L(u, λ) = min L(v, λ). v∈V
On the other hand, because of the complementary slackness property (λi fi (u) = 0 for all n i = 1, . . . , n), we have for any λ ∈ R+ L(u, λ) = f0 (u) + = f0 (u) ≥ f0 (u) +
n i =1 n i =1
λi fi (u)
λi fi (u)
(use that λi ≥ 0 and fi (u) ≤ 0). Hence L(u, λ) ≥ sup L(u, λ). n λ∈R+
Finally inf L(v, λ) ≥ L(u, λ) ≥ sup L(u, λ),
v∈V
n λ∈R+
which expresses that (u, λ) is a saddle point of L. n , Let us now prove the implication (ii) =⇒ (i). If (u, λ) is a saddle point of L on V ×R+ we have f0 (u) + δC (u) = sup L(u, λ) ≤ L(u, λ) n λ∈R+
≤ inf L(v, λ) ≤ sup inf L(v, λ) v
n v λ∈R+
≤ inf sup L(v, λ) = inf f0 (v) + δC (v) . v
n λ∈R+
v
i
i i
i
i
i
i
9.6. Mathematical programming: Multipliers and duality
book 2014/8/6 page 379 i
379
Hence, u is a solution of the primal problem (+ ) and
n inf f0 + δC = inf f0 + λ i fi , v
v
i =1
which expresses (see Proposition 9.6.3) that λ is a Lagrange multiplier and, hence, λ is a solution of the dual problem.
9.6.4 Duality for linear programming Take V = Rn . Given vectors a 1 , a 2 , . . . , a m , c in Rn , and a vector b in R m consider the primal linear program 5 4 (+ ) inf 〈c, x〉 : 〈a i , x〉 − bi ≤ 0, i = 1, . . . , m , where 〈·, ·〉 is the usual Euclidean scalar product in Rn . This is clearly a problem of linear programming, with f0 (x) = 〈c, x〉, fi (x) = 〈a i , x〉 − bi , i = 1, . . . , m. 4 5 Indeed, f0 is linear and C = x ∈ Rn : 〈a i , x〉 − bi ≤ 0 is a polyhedral set (finite intersection of closed half-spaces). The Lagrangian function L : Rn × R+m → R is given by L(x, λ) = 〈c, x〉 +
m i =1
λi 〈a i , x〉 − bi .
Therefore, the primal problem can be rewritten as inf sup L(x, λ) x
(+ )
m λ∈R+
and the dual problem (+ ∗ ) is given by (+ ∗ )
sup inf L(x, λ).
m λ∈R+
x
Let us compute the dual function d (λ) = inf L(x, λ) x
S = infn x∈R
We find d (λ) =
x, c +
m i =1
T λi a
i
−
m i =1
λi bi .
− im=1 λi bi if c + im=1 λi a i = 0, −∞ otherwise.
The dual problem (+ ∗ ) is then given by ⎧ ⎪ ⎨ sup −〈b , λ〉 m ⎪ ⎩ subject to
i =1
λi a i = −c,
λ ∈ R+m
(+ ∗ )
,
i
i i
i
i
i
i
380
book 2014/8/6 page 380 i
Chapter 9. Convex duality and optimization
and the Kuhn–Tucker optimality conditions are the following: ⎧ m λ a i = −c, ⎪ ⎪ i =1 i ⎪ ⎪ ⎨ λ ≥ 0, x ∈ Rn , i ⎪ 〈a i , x〉 − bi ≤ 0, ⎪ ⎪ ⎪ ⎩ λ 〈a i , x〉 − b = 0. i
i
9.7 A general approach to duality in convex optimization In Section 9.6, we developed a duality theory for convex programs. This is an important class of convex optimization problems, but it is far from covering the whole field of convex optimization. Thus a number of natural questions arise: Is it possible to develop a duality theory for general convex optimization, and if yes, is there a unique dual minimization problem? What are the relations between primal and dual problems, and what is the interpretation of the solutions of the dual problem? At the center of all these questions is the notion of the Lagrangian function which we now introduce. Let us consider the primal problem 4 5 inf f (v) : v ∈ V , (+ ) where f : V → R∪{+∞} is a general convex, lower semicontinuous, and proper function whose definition usually includes the constraints. The basic idea is to introduce a bivariate function L : V × W → R which satisfies ∀v ∈ V
f (v) = sup L(v, w). w∈W
We say that L is a Lagrangian function associated to problem (+ ). In this way, the primal problem can be written as an inf-sup problem: inf sup L(v, w).
v∈V w∈W
(+ )
As we did in Section 9.6, one can associate to problem (+ ) a dual problem (+ ∗ ) which is obtained by interchanging the order of inf and sup: sup inf L(v, w).
w∈W v∈V
(+ ∗ )
Equivalently, (+ ∗ ) can be written as a maximization problem, in the form sup d (w)
(+ ∗ )
w∈W
with d : W → R being defined by d (w) := inf L(v, w). v∈V
We call d (·) the dual function (attached to the Lagrangian function L). The interesting situation occurs when there is no duality gap, i.e., inf(+ ) = sup(+ ∗ ),
i
i i
i
i
i
i
9.7. A general approach to duality in convex optimization
book 2014/8/6 page 381 i
381
which is equivalent to saying inf sup L(v, w) = sup inf L(v, w).
v∈V w∈W
w∈W v∈V
In this general abstract setting, we have the following result. Proposition 9.7.1. Let L : V ×W → R be a general bivariate function. Then, the following facts are equivalent: (i) (v, w) is a saddle point of L; (ii) v is a solution of the primal problem (+ ), w is a solution of the dual problem (+ ∗ ), and there is no duality gap: inf(+ ) = sup(+ ∗ ). PROOF. (i) =⇒ (ii) By definition of saddle point L(v, w) ≤ L(v, w) ≤ L(v, w)
∀v ∈ V , ∀w ∈ W .
Hence, for every v ∈ V sup L(v, w) ≤ L(v, w) ≤ L(v, w) ≤ sup L(v, w). w∈W
w∈W
This being true for all v ∈ V , we obtain that v is a solution of the minimization problem inf sup L(v, w) , v∈V
w∈W
which is precisely the primal problem (+ ). In a similar way, for every w ∈ W inf L(v, w) ≤ L(v, w) ≤ L(v, w) ≤ inf L(v, w).
v∈V
v∈V
Hence, w is a solution of the maximization problem ! " sup inf L(v, w) , w∈W
v∈V
which is the dual problem (+ ∗ ). (ii) =⇒ (i) Since v is a solution of the primal problem (+ ), we have sup L(v, w) = inf sup L(v, w). w∈W
v∈V w∈W
Similarly, since w is a solution of the dual problem (+ ∗ ), we have inf L(v, w) = sup inf L(v, w).
v∈V
w∈W v∈V
Since there is no duality gap, inf sup = sup inf and we obtain sup L(v, w) = inf L(v, w), w∈W
that is,
∀v ∈ V , ∀w ∈ W
v∈V
L(v, w) ≤ L(v, w).
i
i i
i
i
i
i
382
book 2014/8/6 page 382 i
Chapter 9. Convex duality and optimization
This clearly implies L(v, w) ≤ L(v, w) for all v ∈ V , and L(v, w) ≤ L(v, w) for all w ∈ W , i.e., L(v, w) = min L(v, w) = max L(v, w), v
w
which expresses that (v, w) is a saddle point of L. As we described above, duality theory for minimization problems follows very naturally from the Lagrangian formulation: it just consists in the permutation of the inf and the sup. We stress the fact that the primal and the dual problems are intimately paired as soon as there is no duality gap and there exist saddle points of L. Thus the question is, for which class of bivariate functions L can one expect to have such properties? This is a central question in game theory, fixed point theory, economics, and so forth. Let us quote the celebrated Von Neumann’s minimax theorem. Indeed, we give a slightly more general formulation to recover, as a particular case, the existence theorem for convex minimization problems (see Aubin [64], for example). Theorem 9.7.1 (von Neumann’s minimax theorem). Let V and W be two reflexive Banach spaces and let M ⊂ V and N ⊂ W be two closed convex nonempty sets. Let L : M × N → R be a bivariate function which satisfies the following properties: (ia ) ∀w ∈ N , v → L(v, w) is convex and lower semicontinuous, (ib ) ∀v ∈ M w → L(v, w) is concave and upper semicontinuous. (iia ) M is bounded or there exists some w0 ∈ N such that v → L(v, w0 ) is coercive, (iib ) N is bounded or there exists some v0 ∈ M such that w → −L(v0 , w) is coercive. Then L possesses a saddle point (v, w) ∈ M × N , i.e., min L(v, w) = L(v, w) = max L(v, w). v∈M
w∈N
In particular infv supw L(v, w) = supw infv L(v, w), i.e., there is no duality gap. It follows from the previous results that the key property to developing a duality theory for optimization problems is the possibility to write the function f in the following form: f (v) = sup L(v, w) w∈W
with L : V × W → R a convex-concave bivariate function. We are going to see how the Legendre–Fenchel transform permits us to produce such convex-concave Lagrangian functions in a systematic and elegant way. The idea is first to introduce a perturbation function F : V × Y → R ∪ {+∞} such that F (v, 0) = f (v) for all v ∈ V . The primal problem (+ ) can now be written as 4 5 inf F (v, 0) : v ∈ V . (+ ) The key property which allows us to produce a convex-concave Lagrangian function from F is that F is convex with respect to (v, w) ∈ V × W . For example, in convex programming, the duality scheme that was studied in Section 9.6 is associated with the perturbation function: f0 (v) if fi (v) ≤ yi , i = 1, . . . , n, F (v, y) = +∞ otherwise. One can easily verify that when f0 and fi (i = 1, . . . , n) are convex functions, so is F .
i
i i
i
i
i
i
9.7. A general approach to duality in convex optimization
book 2014/8/6 page 383 i
383
Let us now describe how one can associate a Lagrangian function to F . Proposition 9.7.2 (definition of Lagrangian). Let F : V × Y → R ∪ {+∞} be a convex function. We associate to F a Lagrangian function L : V ×Y ∗ → R by the following formula: 5 4 ∀v ∈ V , ∀y ∗ ∈ Y ∗ − L(v, y ∗ ) = sup 〈y ∗ , y〉 − F (v, y) , y∈Y
i.e., −L(v, ·) is the Legendre–Fenchel conjugate of F (v, ·). We have that L is a convex-concave function. More precisely, (1) for all v ∈ V , y ∗ → L(v, y ∗ ) is concave, upper semicontinuous on Y ∗ ; (2) for all y ∗ ∈ Y ∗ , v → L(v, y ∗ ) is convex. PROOF. The proof is immediate; just note that (2) follows from Proposition 9.2.3 and from the fact that the function (v, y) → −〈y ∗ , y〉 + F (v, y) is convex. As expected, we can reformulate problem (+ ) by using the Lagrangian function L attached to the (convex) perturbation function F . Proposition 9.7.3. Let F : V × Y → R ∪ {+∞} be a convex, lower semicontinuous, proper function and let L : V × Y ∗ → R be the corresponding Lagrangian function, given by 5 4 −L(v, y ∗ ) = sup 〈y ∗ , y〉 − F (v, y) . y∈Y
(a) Primal problem: we have ∀v ∈ V
F (v, 0) = sup L(v, y ∗ ). y ∗ ∈Y ∗
As a consequence, with the notation f (v) = F (v, 0), the primal problem 4 5 inf f (v) : v ∈ V can be written as
inf sup L(v, y ∗ ).
v∈V y ∗ ∈Y ∗
(+ ) (+ )
(b) Dual problem: the dual problem, which by definition is sup inf L(v, y ∗ ),
(+ ∗ )
sup d (y ∗ ),
(+ ∗ )
y ∗ ∈Y ∗ v∈V
can be written as
y ∗ ∈Y ∗
where the dual function d : Y ∗ → R ∪ {+∞} is given by d (y ∗ ) = inf L(v, y ∗ ) v∈V
= −F ∗ (0, y ∗ ), and F ∗ is the Legendre–Fenchel conjugate of F with respect to (v, y).
i
i i
i
i
i
i
384
book 2014/8/6 page 384 i
Chapter 9. Convex duality and optimization
PROOF. (a) Since F is closed convex and proper on V × Y , for every v ∈ V the function ϕv : y → F (v, y) is closed convex on Y . Hence, for all v ∈ V ϕv (y) = ϕv∗∗ (y) 5 4 = sup 〈y ∗ , y〉 − ϕv∗ (y ∗ ) . y ∗ ∈Y ∗
By definition of L we have 5 4 ϕv∗ (y ∗ ) = sup 〈y ∗ , y〉 − ϕv (y) y∈Y
5 4 = sup 〈y ∗ , y〉 − F (v, y) y∈Y
= −L(v, y ∗ ). Hence,
5 4 ϕv (y) = sup 〈y ∗ , y〉 + L(v, y ∗ ) . y ∗ ∈Y ∗
Take now y = 0 to obtain f (v) = F (v, 0) = ϕv (0) = sup L(v, y ∗ ). y ∗ ∈Y ∗
(b) By definition, the dual function d : Y ∗ → R ∪ {+∞} is equal to d (y ∗ ) = inf L(v, y ∗ ). v∈V
By definition of L, 5 4 L(v, y ∗ ) = − sup 〈y ∗ , y〉 − F (v, y) y
5 = inf − 〈y ∗ , y〉 + F (v, y) . 4
y
Hence,
d (y ∗ ) =
4
inf
v∈V ,y∈Y
5 − 〈y ∗ , y〉 + F (v, y) .
Thus, 4
d (y ∗ ) = − sup
〈0, v〉 + 〈y ∗ , y〉 − F (v, y)
5
v∈V ,y∈Y ∗ ∗
= −F (0, y ), where F ∗ is the conjugate of F with respect to (v, y). The other fundamental mathematical object which is attached to the perturbation function F is the value function. Proposition 9.7.4 (definition of the value function). Let F : V × Y → R ∪ {+∞} be a convex function. The value function (also called marginal function) attached to F is the function p : Y → R ∪ {+∞}, which is defined by ∀y ∈ Y
p(y) := inf F (v, y). v∈V
i
i i
i
i
i
i
9.7. A general approach to duality in convex optimization
book 2014/8/6 page 385 i
385
It is a convex function. Moreover, for every y ∗ ∈ Y ∗ p ∗ (y ∗ ) = F ∗ (0, y ∗ ). Thus, the dual problem (+ ∗ ) can be formulated in terms of p as follows: sup (− p ∗ (y ∗ )) = − ∗inf ∗ p ∗ (y ∗ ). y ∈Y
y ∗ ∈Y ∗
(+ ∗ )
PROOF. The convexity of p follows from the convexity of F and by applying Proposition 9.2.3. For y ∗ ∈ Y ∗ we have 5 4 p ∗ (y ∗ ) = sup 〈y ∗ , y〉 − p(y) y∈Y
5 4 = sup 〈y ∗ , y〉 − inf F (v, y) y∈Y
=
sup (v,y)∈V ×Y ∗ ∗
v∈V
5 4 〈0, v〉 + 〈y ∗ , y〉 − F (v, y)
= F (0, y ) = −d (y ∗ ), where d is the dual function introduced in Proposition 9.7.3(b). We have now all the ingredients for developing a general convex duality theory. The following theorem may be proved in the same way as Theorems 9.6.2 and 9.6.3. Let us notice that the qualification assumption “there exists some v0 ∈ V such that F (v0 , .) is finite and continuous at the origin” plays the role of the Slater qualification assumption in convex programming. For this reason, we call it generalized Slater. Theorem 9.7.2. Let f : V → R ∪ {+∞} be a closed convex and proper function such that inf f > −∞. Let F : V × Y → R ∪ {+∞} be a perturbation function which satisfies the following conditions: (i) F (v, 0) = f (v)
∀v ∈ V .
(ii) F is a closed convex proper function. (iii) (Generalized Slater) there exists some v0 ∈ V such that y → F (v0 , y) is finite and continuous at the origin. Then the following properties hold: (a) The value function p is continuous at the origin. As a consequence, the set of solutions of the dual problem (+ ∗ ), which is equal to ∂ p(0), is nonempty. Indeed, it is a nonempty closed convex and bounded subset of Y ∗ . (b) There is no duality gap, i.e., inf(+ ) = max(+ ∗ ). (c) Let u be a solution of the primal problem (+ ). Then, for every element y ∗ which is a solution of the dual problem (by property (a) there exist such elements), (u, y ∗ ) is a saddle point of the Lagrangian function L associated to F . Conversely, if (u, y ∗ ) is a saddle point of L, then u is a solution of (+ ) and y is a solution of (+ ∗ ). (d) (u, y ∗ ) is a saddle point of L iff it satisfies the extremality relation: F (u, 0) + F ∗ (0, y ∗ ) = 0.
i
i i
i
i
i
i
386
book 2014/8/6 page 386 i
Chapter 9. Convex duality and optimization
PROOF. (a) The generalized Slater condition implies the existence of some neighborhood of 0 in Y in which the function y → F (v0 , y) is bounded from above: let r > 0 and M ∈ R be such that F (v0 , y) ≤ M ∀ y ∈ BY (0, r ). As a consequence, the value function p satisfies p(y) = inf F (v, y) ≤ F (v0 , y) ≤ M v∈V
∀ y ∈ BY (0, r ).
The function p also satisfies p(0) = inf(+ ) and p is bounded from above on a neighborhood of the origin. By Theorem 9.3.2 and Proposition 9.5.2 we obtain that p is subdifferentiable at 0, i.e., ∂ p(0) = . We know by Proposition 9.7.4 that the dual problem (+ ∗ ) can be expressed in terms of the value function p. Indeed, the fact that y ∗ is a solution of (+ ∗ ) is equivalent to saying that y ∗ is a solution of the minimization problem inf p ∗ (y ∗ ).
y ∗ ∈Y ∗
Thus, y ∗ is a solution of (+ ∗ ) ⇐⇒ ∂ p ∗ (y ∗ ) # 0 ⇐⇒ y ∗ ∈ ∂ p(0)
(Theorem 9.5.1).
By the argument above, ∂ p(0) is a closed convex bounded nonempty subset of Y ∗ . Thus, there exist solutions of the dual problem (+ ∗ ) and the set of these solutions is a closed convex bounded subset of Y ∗ . (b) Let y ∗ be any solution of the dual problem (+ ∗ ). We know by (a) that there exist such elements and that they are characterized by the relation y ∗ ∈ ∂ p(0) or, equivalently, by the Fenchel extremality relation p(0) + p ∗ (y ∗ ) = 〈y ∗ , 0〉 = 0. Hence inf(+ ) = p(0) = − p ∗ (y ∗ ) = sup − p ∗ (y ∗ ) = sup(+ ∗ ). y ∗ ∈Y ∗
(c) We use the Lagrangian formulation of problem (+ ) and (+ ∗ ) given by Proposition 9.7.3: (+ ) infv∈V supy ∗ ∈Y ∗ L(v, y ∗ ), (+ ∗ ) supy ∗ ∈Y ∗ infv∈V L(v, y ∗ ). The characterization of pairs of optimal solutions (u, y ∗ ) of problems (+ ) and (+ ∗ ), respectively, as saddle points of the Lagrangian L is a direct consequence of Proposition 9.7.1. (d) Let (u, y ∗ ) be a saddle point of L. Let us reformulate the extremality relation p(0) + p ∗ (y ∗ ) = 0
(see (b) above)
in terms of the function F : we have p(0) = inf(+ ) = F (u, 0), p ∗ (y ∗ ) = sup(+ ∗ ) = F ∗ (0, y ∗ )
(Proposition 9.7.4).
i
i i
i
i
i
i
9.8. Duality in the calculus of variations: First examples
Hence
book 2014/8/6 page 387 i
387
F (u, 0) + F ∗ (0, y ∗ ) = 0,
that is, (0, y ∗ ) ∈ ∂ F (u, 0).
9.8 Duality in the calculus of variations: First examples As a model example, let us consider the Dirichlet problem: given h ∈ L2 (Ω), find u ∈ H01 (Ω) such that −Δu = h in Ω, u = 0 on ∂ Ω. The variational formulation of this problem has been extensively studied in Chapter 5: the solution u of the Dirichlet problem is the minimizer, on the Sobolev space H01 (Ω), of the functional 1 |∇v(x)|2 d x − h(x)v(x) d x. f (v) = 2 Ω Ω The primal problem (+ ) can be expressed as 1 2 min |∇v(x)| d x − h(x)v(x) d x . v∈H01 (Ω) 2 Ω Ω
(+ )
We now introduce the perturbation function 1 F (v, y) = |∇v(x) + y(x)|2 d x − h(x)v(x) d x, 2 Ω Ω which we consider as a function F : H01 (Ω) × L2 (Ω)N → R. To compute the Lagrangian function L associated to F and describe the corresponding dual problem, we start to analyze the structure of this problem. The primal function f can be written as f (v) = Φ(Av) + Ψ(v), where Φ : L2 (Ω)N → R is the convex integral functional 1 |w(x)|2 d x Φ(w) = 2 Ω and A is the gradient operator, which can be viewed as a linear continuous operator from H01 (Ω) into L2 (Ω)N . The functional Ψ : H01 (Ω) → R is the linear and continuous mapping Ψ(v) = − h(x)v(x) d x. Ω
The perturbation function F : H01 (Ω) × L2 (Ω)N → R can then be written as F (v, y) = Φ(Av + y) + Ψ(v). Theorem 9.8.1. Let V and Y be two Banach spaces and suppose Φ ∈ Γ0 (Y ), Ψ ∈ Γ0 (V ), and A ∈ L(V , Y ) (A is a linear continuous operator). Consider the primal problem 4 5 (+ ) inf Φ(Av) + Ψ(v) v∈V
i
i i
i
i
i
i
388
book 2014/8/6 page 388 i
Chapter 9. Convex duality and optimization
and the perturbation function F : V × Y → R ∪ {+∞} defined by F (v, y) = Φ(Av + y) + Ψ(v). Then the following facts hold: (a) The Lagrangian function L : V × Y ∗ → R associated to F is given by L(v, y ∗ ) = Ψ(v) + 〈y ∗ , Av〉 − Φ∗ (y ∗ ) and the dual problem (+ ∗ ) is equal to 5 4 sup − Ψ ∗ (−A∗ y ∗ ) − Φ∗ (y ∗ ) , y ∗ ∈Y ∗
(+ ∗ )
where A∗ is the adjoint operator of A. (b) Let us assume that there exists some v0 ∈ V such that Ψ(v0 ) < +∞, Φ(A(v0 )) < +∞ and Φ is continuous at Av0 . Then (+ ∗ ) has at least one solution y ∗ . If u is a solution of (+ ), one has the following extremality relations: −A∗ y ∗ ∈ ∂ Ψ(u), y ∗ ∈ ∂ Φ(Au). PROOF. (a) By the definition of Lagrangian (Proposition 9.7.2), 5 4 L(v, y ∗ ) = inf − 〈y ∗ , y〉 + F (v, y) y∈Y 5 4 = inf − 〈y ∗ , y〉 + Φ(Av + y) + Ψ(v) y∈Y 5 4 = Ψ(v) − sup 〈y ∗ , y〉 − Φ(Av + y) y∈Y
5 4 = Ψ(v) − sup 〈y ∗ , Av + y〉 − Φ(Av + y) − 〈y ∗ , Av〉 y∈Y
= Ψ(v) + 〈y ∗ , Av〉 − Φ∗ (y ∗ ). The perturbation function F is convex and lower semicontinuous: this is an immediate consequence of the facts that Φ ∈ Γ0 (Y ), Ψ ∈ Γ0 (V ), and A ∈ L(V , Y ). Therefore, the primal and the dual problems can be expressed in terms of the Lagrangian function L and we have (Proposition 9.7.3) (+ ) infv∈V supy ∗ ∈Y ∗ L(v, y ∗ ), (+ ∗ ) supy ∗ ∈Y ∗ infv∈V L(v, y ∗ ). Let us compute the dual function d : d (y ∗ ) = inf L(v, y ∗ ) v∈V 5 4 = inf Ψ(v) + 〈y ∗ , Av〉 − Φ∗ (y ∗ ) v∈V 5 4 = −Φ∗ (y ∗ ) − sup 〈−y ∗ , Av〉 − Ψ(v) v∈V
4 5 = −Φ∗ (y ∗ ) − sup 〈−A∗ y ∗ , v〉 − Ψ(v) v∈V
= −Φ∗ (y ∗ ) − Ψ ∗ (−A∗ y ∗ ). Therefore, from the analysis made in Section 9.7, part (a) is proved.
i
i i
i
i
i
i
9.8. Duality in the calculus of variations: First examples
book 2014/8/6 page 389 i
389
(b) The existence of v0 ∈ V such that Ψ(v0 ) < +∞ with Φ continuous at Av0 clearly implies that the generalized Slater condition is satisfied. Hence (+ ∗ ) admits at least a solution (Theorem 9.7.2). Let y ∗ be such a solution. Let u be a solution of the primal problem (+ ). We know that there is no duality gap. Hence inf(+ ) = sup(+ ∗ ), i.e., Φ(Au) + Ψ(u) = −Φ(y ∗ ) − Ψ ∗ (−A∗ y ∗ ). Thus
Φ(Au) + Φ∗ (y ∗ ) + Ψ(u) + Ψ ∗ (−A∗ y ∗ ) = 0.
Equivalently, Φ(Au) + Φ∗ (y ∗ ) − 〈y ∗ , Au〉 + Ψ(u) + Ψ ∗ (−A∗ y ∗ ) + 〈A∗ y ∗ , u〉 = 0. Since by the Fenchel inequality the quantities Φ(Au) + Φ∗ (y ∗ ) − 〈y ∗ , Au〉 and Ψ(u) + Ψ ∗ (−A∗ y ∗ ) + 〈A∗ y ∗ , u〉 are nonnegative, we obtain Φ(Au) + Φ∗ (y ∗ ) − 〈y ∗ , Au〉 = 0, Ψ(u) + Ψ ∗ (−A∗ y ∗ ) + 〈A∗ y ∗ , u〉 = 0. These are the Fenchel extremality relations, which are equivalent to ∗ y ∈ ∂ Φ(Au), −A∗ y ∗ ∈ ∂ Ψ(u). Remark 9.8.1. It is often more convenient to write the dual problem as a minimization problem 5 4 inf ∗ Φ∗ (y ∗ ) + Ψ ∗ (−A∗ y ∗ ) . ∗ y ∈Y
Let us come back to the Dirichlet problem and apply the above results: we recall that V = H01 (Ω) and Y = L2 (Ω)N . (a) A : H01 (Ω) → L2 (Ω)N is the gradient operator. The adjoint operator A∗ : L2 (Ω)N → H −1 (Ω) = H01 (Ω)∗ is defined by 〈A∗ y, v〉(H −1 ,H 1 ) = 〈y, Av〉L2 (Ω)N 0 N ∂v = yi dx ∂ xi i =1 Ω B A N ∂ yi ,v = − ∂ xi i =1
,
( (Ω), (Ω))
D y = − div y in the where in the last equality v varies in (Ω)N , that is, A∗ y = − N i =1 i i distribution sense, i.e., A∗ is the opposite of the divergence operator. (b) We know by Theorem 9.3.3 that the conjugate of the function 1 Φ(y) = |y(x)|2 d x 2 Ω
i
i i
i
i
i
i
390
book 2014/8/6 page 390 i
Chapter 9. Convex duality and optimization
is equal to ∗
∗
Φ (y ) = (c) Let Ψ(v) = −
Ω
1 2
Ω
|y ∗ (x)|2 d x.
h(x)v(x) d x. An easy computation gives
Ψ ∗ (v ∗ ) = sup v∈H01 (Ω)
4
〈v ∗ , v〉(H −1 ,H 1 ) − 〈−h, v〉L2 (Ω)
5
0
= sup 〈v ∗ + h, v〉(H −1 ,H 1 ) v∈H01 (Ω)
=
0
0 if v ∗ + h = 0, +∞ otherwise.
We now collect all these results to obtain, thanks to Theorem 9.8.1, the description of the dual problem (+ ∗ ) of the Dirichlet problem 1 ∗ 2 ∗ 2 N ∗ sup − |y (x)| d x : y ∈ L (Ω) , div y = h . (+ ∗ ) 2 Ω Clearly, the generalized Slater condition is satisfied (F is everywhere continuous!). Thus there exists a solution y ∗of (+ ∗ ) and this solution is unique because of the strict convexity of the mapping y ∗ → Ω |y ∗ (x)|2 d x. On the other hand, we know that (+ ) admits a unique solution u. The extremality relations yield ∗ y = Au = grad u, div y ∗ = −h, i.e., we have − div(grad u) = h which is in accordance with the definition of u.
i
i i
i
i
i
i
book 2014/8/6 page 393 i
Chapter 10
Spaces BV and S BV
The modelization of a large number of problems in physics, mechanics, or image processing requires the introduction of new functional spaces permitting discontinuities of the solution. In phase transitions, image segmentation, plasticity theory, the study of cracks and fissures, the study of the wake in fluid dynamics, and so forth, the solution of the problem presents discontinuities along one-codimensional manifolds. Its first distributional derivatives are now measures which may charge zero Lebesgue measure sets, and the solution of these problems cannot be found in classical Sobolev spaces. Thus, the classical theory of Sobolev spaces must be completed by the new spaces BV (Ω) and SBV (Ω).
10.1 The space BV (Ω): Definition, convergences, and approximation In this section Ω is an open subset of RN . Let us recall (see Chapter 4) that M(Ω, RN ) denotes the space of all RN -valued Borel measures, which is also, according to the Riesz theory, the dual of the space C0 (Ω, RN ) of all continuous functions ϕ vanishing at infinity, N 1/2 sup x∈Ω |ϕi (x)|2 . Note that M(Ω, RN ) equipped with the uniform norm ||ϕ||∞ = i =1 is isomorphic to the product space MN (Ω) and that μ = (μ1 , . . . , μN ) ∈ M(Ω, RN ) ⇐⇒ μi ∈ C0 (Ω),
i = 1, . . . , N .
Definition 10.1.1. We say that a function u : Ω → R is a function of bounded variation iff it belongs to L1 (Ω) and its gradient D u in the distributional sense belongs to M(Ω, RN ). We denote the set of all functions of bounded variation by BV (Ω). The four following assertions are then equivalent: (i) u ∈ BV (Ω); (ii) u ∈ L1 (Ω) and ∀i = 1, . . . , N ,
∂u ∂ xi
∈ M(Ω);
(iii) u ∈ L1 (Ω) and ||D u|| := sup{〈D u, ϕ〉 : ϕ ∈ Cc (Ω, RN ), ||ϕ||∞ ≤ 1} < +∞; (iv) u ∈ L1 (Ω) and ||D u|| = sup{ Ω u div ϕ d x : ϕ ∈ C1c (Ω, RN ), ||ϕ||∞ ≤ 1} < +∞, 393
i
i i
i
i
i
i
book 2014/8/6 page 394 i
Chapter 10. Spaces BV and SBV
394
where the bracket 〈, 〉 in (iii) is defined by 〈D u, ϕ〉 :=
N i =1
Ω
ϕi
∂u ∂ xi
.
Equivalence between (ii) and (iii) is a direct consequence of the density of the space Cc (Ω, RN ) in C0 (Ω, RN ) equipped with the uniform norm. Equivalence between (iii) and (iv) can easily be established by the density of C∞ (Ω, RN ) in Cc (Ω, RN ) and C1c (Ω, RN ). c Remark 10.1.1. According to the vectorial version of the Riesz–Alexandroff representation theorem, Theorem 2.4.7, the dual norm ||D u|| is also the total mass |D u|(Ω) = |D u| of the total variation |D u| of the measure D u. Moreover, from classical integraΩ tion theory, the integral Ω f D u can be defined for all D u-integrable functions f from Ω into RN as, for example, for functions in C b (Ω, RN ). For the same reasons, Ω f |D u| is well defined for all |D u|-integrable real-valued functions f as, for example, for functions in C b (Ω). According to the Radon–Nikodým theorem, Theorem 4.2.1, there exist ∇u ∈ L1 (Ω, R ) and a measure D s u, singular with respect to the N -dimensional Lebesgue measure " N Ω restricted to Ω, such that D u = ∇u " N Ω + D s u. Consequently, W 1,1 (Ω) is a subspace of the vectorial space BV (Ω) and u ∈ W 1,1 (Ω) iff D u = ∇u " N Ω. For functions in W 1,1 (Ω), we will sometimes write ∇u for D u. The space BV (Ω) is equipped with the following norm, which extends the classical norm in W 1,1 (Ω): N
||u||BV (Ω) := |u|L1 (Ω) + ||D u||. We will define two weak convergence processes in BV (Ω). The first is too weak to ensure continuity of the trace operator defined in Section 10.2 but is sufficient to provide compactness of bounded sequences. The second is an intermediate convergence between the weak and the strong convergence associated with the norm. Definition 10.1.2. A sequence (un )n∈N in BV (Ω) weakly converges to some u in BV (Ω), and we write un u iff the two following convergences hold: un → u strongly in L1 (Ω); D un D u weakly in M(Ω, RN ). We will see later that when Ω is regular, the boundedness of a sequence in BV (Ω) is sufficient to ensure the existence of a weak cluster point (Theorem 10.1.4). In the proposition below we establish a compactness result related to this convergence, together with the lower semicontinuity of the total mass. Proposition 10.1.1. Let (un)n∈N be a sequence in BV (Ω) strongly converging to some u in L1 (Ω) and satisfying supn∈N Ω |D un | < +∞. Then (i) u ∈ BV (Ω) and Ω |D u| ≤ lim infn→+∞ Ω |D un |; (ii) un weakly converges to u in BV (Ω). PROOF. For all ϕ in C1c (Ω, RN ) such that ||ϕ||∞ ≤ 1, we have u div ϕ d x = lim un div ϕ d x ≤ lim inf |D un | Ω
n→+∞
Ω
n→+∞
Ω
i
i i
i
i
i
i
10.1. The space BV (Ω): Definition, convergences, and approximation
book 2014/8/6 page 395 i
395
and assertion (i) is proved by taking the supremum in the first member, over all the elements ϕ in C1c (Ω, RN ) satisfying ||ϕ||∞ ≤ 1. We now establish (ii). Since un strongly converges to u in L1 (Ω), for all ϕ ∈ C∞ (Ω, RN ), c one has 〈D un , ϕ〉 = − un div ϕ d x → − u div ϕ d x = 〈D u, ϕ〉. Ω
Ω
(Ω, RN ) C∞ c
in C0 (Ω, R ) for the uniform norm and the boundBy using the density of edness of (D un )n∈N , we easily conclude that the sequence (D un )n∈N weakly converges to D u. N
As a consequence of the semicontinuity property (i), BV (Ω) is a complete normed space. Theorem 10.1.1. Equipped with its norm, BV (Ω) is a Banach space. PROOF. Let (un )n∈N be a Cauchy sequence in BV (Ω). Then (un )n∈N is a Cauchy sequence in L1 (Ω) and for all > 0 there exists N in N such that |D u p − D uq | < . (10.1) ∀ p, q > N Ω
Thus, there exists u ∈ L1 (Ω) such that un → u strongly in L1 (Ω). In particular u p − uq → u − uq strongly in L1 (Ω) when p goes to +∞. According to the lower semicontinuity property (i) of Proposition 10.1.1, (10.1) yields, for q > N , |D(u − uq )| ≤ lim inf |D(u p − uq )| ≤ . p→+∞
Ω
Ω
This estimate yields first u ∈ BV (Ω) then limq→+∞ BV (Ω).
Ω
|D(u − uq )| = 0, so that un → u in
To define the second weak convergence process, let us recall the notion of narrow convergence defined in Section 4.2.2. As said in Remark 10.1.1, the integral Ω f |D u| is well defined for all f in the space C b (Ω) of bounded continuous functions on Ω. Thus |D u| may be considered as an element of Cb (Ω). We now say that a sequence (|D un |)n∈N narrowly converges to μ in M(Ω) iff |D un | μ for the σ(Cb (Ω), C b (Ω)) convergence (see Section 4.2.2). Definition 10.1.3. Let (un )n∈N be a sequence in BV (Ω) and u ∈ BV (Ω). We say that un converges to u in the sense of the intermediate convergence iff u → u strongly in L1 (Ω), n |D un | → |D u|. Ω
Ω
The term intermediate convergence is due to Temam [348] and is also called strict convergence. Let us notice that according to Proposition 10.1.1(i), when un strongly converges to u in L1 (Ω), there is in general loss of the total mass at the limit. The proposition below states that this convergence is stronger than the weak convergence, therefore justifying the terminology.
i
i i
i
i
i
i
book 2014/8/6 page 396 i
Chapter 10. Spaces BV and SBV
396
Proposition 10.1.2. The three following assertions are equivalent: (i) un u in the sense of the intermediate convergence; un u weakly in BV (Ω), (ii) |D un | → Ω |D u|; Ω un → u strongly in L1 (Ω), (iii) |D un | |D u| narrowly in M(Ω). PROOF. (i) =⇒ (ii) This implication is a straightforward consequence of Proposition 10.1.1. We are going to prove (ii) =⇒ (iii). Let us recall (cf. Proposition 4.2.5) that for nonnegative Borel measures μn and μ in M(Ω), there is equivalence between μn μ narrowly and μn (Ω) → μ(Ω), μ(U ) ≤ lim inf μn (U ) n→+∞
∀ open subset U of Ω.
Set μn = |D un | and μ = |D u|. At first we have μn (Ω) = Ω |D un | → μ(Ω) = Ω |D u|. Now let U be any open subset of Ω. Obviously, un and u belong to BV (U ) and un → u strongly in L1 (U ). Applying Proposition 10.1.1 with Ω = U (note that sup n∈N U |D un | ≤ supn∈N Ω |D un | < +∞), we obtain |D u| ≤ lim inf |D un |. n→+∞
U
U
Implication (iii) =⇒ (i) is straightforward. Remark 10.1.2. It results from Propositions 4.2.5 and 10.1.2 that if un u in the sense of the intermediate convergence, for all Borel subset B of Ω such that ∂ B |D u| = 0, one has |D un | → |D u|. B
B
More generally, according to Proposition 4.2.6, for all bounded Borel function f : Ω → R such that the set of its discontinuity points has a null |D u|-measure, one has f |D un | → f |D u|. Ω
Ω
Remark 10.1.3. The intermediate convergence is strictly finer than the weak convergence in BV (Ω). Indeed, the sequence of functions in BV (0, 1) defined by nx if 0 < x ≤ n1 , un (x) = 1 if x > n1 weakly converges to the function 1 in BV (0, 1) and does not converge in the sense of the intermediate convergence. Indeed, the total mass |D un |(0, 1) is the constant 1. The space C∞ (Ω) is not dense in BV (Ω) when BV (Ω) is equipped with its strong convergence. Indeed, its closure is the space W 1,1 (Ω) (see Proposition 5.4.1). Nevertheless,
i
i i
i
i
i
i
10.1. The space BV (Ω): Definition, convergences, and approximation
book 2014/8/6 page 397 i
397
one can approximate every element of BV (Ω) by a function of C∞ (Ω) in the sense of the intermediate convergence. Theorem 10.1.2. The space C∞ (Ω) ∩ BV (Ω) is dense in BV (Ω) equipped with the intermediate convergence. Consequently C∞ (Ω) is also dense in BV (Ω) for the intermediate convergence. PROOF. First notice that C∞ (Ω) ∩ BV (Ω) = C∞ (Ω) ∩ W 1,1 (Ω). The second assertion is then a straightforward consequence of the density of the space C∞ (Ω) in W 1,1 (Ω) equipped with its strong convergence (Proposition 5.4.1) which is finer than the intermediate convergence. Let > 0, intended to go to zero, and u ∈ BV (Ω). We are going to construct u in C∞ (Ω) ∩ W 1,1 (Ω) such that |D u | − |D u| < 4. Ω Ω
Ω
|u − u | d x <
and
(10.2)
The following construction is similar to the proof of the Meyers–Serrin theorem, Theorem 5.1.4. Let us consider a family (Ωi )i ∈N of open subsets of Ω such that Ω\Ω0
|D u| < ;
(10.3)
Ωi ⊂⊂ Ωi +1 ; ∞ 3 Ω = Ωi . i =0
We construct the open covering (Ci )i ∈N∗ of Ω as follows: set C1 = Ω2 and, for i ≥ 2, Ci = Ωi +1 \ Ωi −1 . Let now (ϕi )i ∈N∗ be a partition of unity subordinate to the covering ∞ (Ci )i ∈N∗ . The functions ϕi satisfy ϕi ∈ C∞ (C ), 0 ≤ ϕ ≤ 1, ϕ = 1. Note that i i c i =1 i ϕ1 = 1 on Ω1 . For each i, choose i > 0 such that spt(ρi ∗ ϕi u) ⊂ Ci , |ρ1 ∗ (ϕ1 D u)| d x − |ϕ1 D u| < , Ω Ω |ρi ∗ (uϕi ) − uϕi | d x < 2−i , Ω |ρi ∗ (uDϕi ) − uDϕi | d x < 2−i ,
(10.4) (10.5) (10.6) (10.7)
Ω
where ρi are the regularizers defined in Theorem 4.2.2. Estimate (10.5) is obtained by applying Theorem 4.2.2(iii) to the measure ϕ1 D u. Estimates (10.6) and (10.7) are straightforward consequences of the convergence of ρ ∗ (uϕi ) and ρ ∗ (uDϕi ), respectively, to uϕi and uDϕi in L1 (Ω) (see Proposition 2.2.4). We define u by u =
∞ i =1
ρi ∗ (uϕi ).
i
i i
i
i
i
i
book 2014/8/6 page 398 i
Chapter 10. Spaces BV and SBV
398
Note that each x in Ω belongs to at most two of the sets Ci and that, by (10.4), u is well defined and clearly belongs to C∞ (Ω). From (10.6) we obtain Ω
|u − u | d x ≤
∞ i =1
Ω
|ρi ∗ (uϕi ) − uϕi | d x < .
We are going to establish the last estimate of (10.2). In the distributional sense, we have D(uϕi ) = ϕi D u + uDϕi " Ω so that D u = = =
∞ i =1 ∞ i =1 ∞ i =1
D(ρi ∗ (uϕi )) = ρi ∗ (ϕi D u) + ρi ∗ (ϕi D u) +
∞
i =1 ∞
ρi ∗ (D(uϕi ))
ρi ∗ (uDϕi )
i =1 ∞ ! i =1
" ρi ∗ (uDϕi ) − uDϕi .
Therefore, according to (10.7), Theorem 4.2.2(ii), and (10.3), ∞ |ρi ∗ (ϕi D u)| d x |ρ1 ∗ (ϕ1 D u)| d x − |D u | ≤ Ω Ω i =2 Ω ∞ |ρi ∗ (uDϕi ) − uDϕi | d x + i =1
≤ ≤
i =2 Ω ∞ i =2
≤
Ω
∞
Ω
Ω\Ω0
|ρi ∗ (ϕi D u)| d x + |ϕi D u| +
|D u| + < 2.
On the other hand, from (10.5), (10.3) and because ϕ1 = 1 on Ω1 , |ρ1 ∗ (ϕ1 D u)| d x − |D u| ≤ + (1 − ϕ1 )|D u| Ω Ω Ω ≤+ |D u| ≤ 2.
(10.8)
(10.9)
Ω\Ω0
Collecting (10.8), (10.9) we obtain |D u | − |D u| < 4, Ω Ω which completes the proof of (10.2). Theorem 10.1.2 allows us to extend Sobolev’s inequalities and compactness embedding results on W 1,1 (Ω) (see Section 5.7) to the space BV (Ω).
i
i i
i
i
i
i
10.1. The space BV (Ω): Definition, convergences, and approximation
book 2014/8/6 page 399 i
399
Theorem 10.1.3. Let Ω be a 1-regular open bounded subset of RN . For all p, 1 ≤ p ≤ the embedding BV (Ω) → L p (Ω)
N , N −1
is continuous. More precisely, there exists a constant C which depends only on Ω, p, and N such that for all u in BV (Ω),
1 Ω
|u| p d x
p
≤ C |u|BV (Ω) .
PROOF. Let (un )n∈N be a sequence of functions in C∞ (Ω) ∩ BV (Ω) which converges to some u in BV (Ω) for the intermediate convergence. Since the embedding W 1,1 (Ω) → L p (Ω) is continuous for 1 ≤ p ≤ NN−1 , there exists a constant C , which depends only on Ω, p, and N such that
1 p
Ω
|un | d x
p
≤ C |un |L1 (Ω) +
Ω
|D un | d x < +∞.
We deduce that un u in L p (Ω) and, according to the weak lower semicontinuity of the norm of L p (Ω), Ω
1 p
|u| d x
p
≤ lim inf n→+∞
1 p
|un | d x
p
Ω ≤ lim inf C |un |L1 (Ω) + |D un | d x n→+∞
Ω
= C uBV (Ω) ,
where we have used the intermediate convergence in the last equality. Theorem 10.1.4. Let Ω be a 1-regular open bounded subset of RN . Then for all p, 1 ≤ p < N the embedding N −1 BV (Ω) → L p (Ω) is compact. PROOF. According to Theorem 10.1.3, every element of BV (Ω) belongs to L p (Ω) for 1 ≤ p ≤ NN−1 , so that we can slightly improve the density Theorem 10.1.2 as follows: for all u ∈ BV (Ω), there exists un ∈ C∞ (Ω) ∩ BV (Ω) satisfying ⎧ un → u in L p (Ω), ⎨ ⎩ |D un | → |D u|. Ω
Ω
We conclude thanks to the compactness of the embedding of W 1,1 (Ω) → L p (Ω). Indeed, let un be such that un BV (Ω) ≤ 1 and vn ∈ C∞ (Ω) ∩ BV (Ω) be such that ⎧ |vn − un |L p (Ω) ≤ n1 , ⎨ ⎩
Ω
|Dvn | d x ≤ 2.
i
i i
i
i
i
i
book 2014/8/6 page 400 i
Chapter 10. Spaces BV and SBV
400
Since vn W 1,1 (Ω) ≤ 4 for n large enough there exists a subsequence (vnk )k∈N and u in L p (Ω) such that vnk → u strongly in L p (Ω), and thus
u nk → u
strongly in L p (Ω).
By the lower semicontinuity of the total mass, and since unk strongly converges to u in L1 (Ω), we obtain |u|L1 (Ω) + |D u| ≤ lim |unk |L1 (Ω) + lim inf |D unk | Ω
k→+∞
k→+∞
≤ lim inf unk BV (Ω) ≤ 1
Ω
k→+∞
and the proof is complete.
10.2 The trace operator, the Green’s formula, and its consequences Throughout this section, Ω is a domain of RN with a Lipschitz boundary Γ (i.e., a Lipschitz domain). Under a weaker hypothesis on the regularity of the set Ω, we extend in the space BV (Ω) the notion of trace developed for Sobolev functions in Section 5.6. It is worth noticing that the method used for establishing the trace theorem below also applies to Sobolev functions. Theorem 10.2.1. There exists a linear continuous map γ0 from BV (Ω) onto L1 N −1 (Γ ) satisfying (i) for all u in C(Ω) ∩ BV (Ω), γ0 (u) = uΓ ; (ii) the generalized Green’s formula holds: for all ϕ ∈ C1 (Ω, RN ), ϕD u = − u div ϕ d x + γ0 (u) ϕ.ν d N −1 , Ω
Ω
Γ
where ν(x) is the outer unit normal at N −1 almost all x in Γ . x , xN ), where PROOF. Each generic element x in RN will be denoted by x = (˜ x˜ = (x1 , . . . , xN −1 ) ∈ RN −1 and xN ∈ R. Let us consider a finite cover of Γ by the open cylinders CR (y) = SR (˜ y ) × (yN − R, yN + R), y = (˜ y , yN ) ∈ Γ , where SR (˜ y ) is the open ball of RN −1 with radius R centered at y˜ = (y1 , . . . , yN −1 ). Since Γ is Lipschitz regular, relabeling the coordinate axes if necessary, there exists 0 > 0 and a Lipschitz function f such that Ω ∩ CR (y) contains the open set CR,0 (y) := {x ∈ RN : x˜ ∈ SR (˜ y ), f (˜ x ) − 0 < xN < f (˜ x )}, and
Σ(y) := {(˜ x , xN ) : x˜ ∈ SR (˜ y ), xN = f (˜ x )}
i
i i
i
i
i
i
10.2. The trace operator, the Green’s formula, and its consequences
book 2014/8/6 page 401 i
401
is a neighborhood of y in Γ . Let u be a fixed function in BV (Ω). According to Lemma 4.2.2, since C (y) |D u| < +∞, R and 0 can be chosen so that the measure |D u| does R,0 not charge ∂ CR,0 (y) \ Σ(y), i.e., ∂ C (y)\Σ(y) |D u| = 0. R,0
First step. We fix y in Γ and, to shorten notation, we denote the sets Σ(y), SR (˜ y ), and CR,0 (y) by Σ, SR , and CR,0 , respectively. Since Γ is Lipschitz, according to Rademacher’s theorem (see [211]), the outer unit normal ν(x) exists at N −1 a.e. x on Γ . This step is devoted to the existence of u + in L1 N −1 (Σ) satisfying ⎧ ⎪ + N −1 ⎪ |u | d ≤C |u| d x + |D u| ⎪ ⎪ ⎨ Σ CR, CR, 0 0 ⎪ ⎪ 1 N ⎪ ϕD u = − u div ϕ d x + u + ϕν d N −1 , ⎪ ⎩∀ϕ ∈ Cc (CR,0 ∪ Σ, R ) CR,
CR,
0
Σ
0
(10.10) where C is a positive constant depending only on f . For all regular function v defined on CR,0 and all in ]0, 0 [, we adopt the notation v (˜ x ) := v(˜ x , f (˜ x ) − ). Consider now a sequence (un )n∈N in C∞ (CR,0 ) ∩ BV (CR,0 ) which converges to u in BV (CR,0 ) in the sense of intermediate convergence (see Theorem 10.1.2). We have ⎧ ⎪ ⎪ |un − u| d x → 0, ⎪ ⎨ C R, 0 (10.11) ⎪ ⎪ |D u | d x → |D u|. ⎪ n ⎩ CR,
CR,
0
0
Since the function un is smooth, for > in ]0, 0 [ one has
un (˜ x ) − un (˜ x) =
so that x ) − un (˜ x )| ≤ |un (˜
−
∂ un
−
∂ xN
(˜ x , f (˜ x ) + s) d s,
|D un (˜ x , f (˜ x ) − s)| d s.
Thus, according to Proposition 4.1.6 and Remark 4.1.5 applied to the map SR ⊂ RN −1 → RN , x˜ → (˜ x , f (˜ x )), we deduce Σ
|un (˜ x ) − un (˜ x )| d N −1 (x) ≤
SR
|un (˜ x ) − un (˜ x )| 1 + |D f (˜ x )|2 d x˜
≤C
SR
=C CR,,
|D un (˜ x , f (˜ x ) − s)| d s d x˜
|D un (x)| d x,
where C is a positive constant depending only on f and CR,, = {x ∈ RN : x˜ ∈ SR , f (˜ x ) − < xN < f (˜ x ) − }.
i
i i
i
i
i
i
book 2014/8/6 page 402 i
Chapter 10. Spaces BV and SBV
402
Therefore, with the notation made precise above, N −1 |un − un | d ≤C Σ
CR,,
|D un | d x.
(10.12)
We intend to go to the limit on n in (10.12). From the coarea formula (Theorem 4.2.5), one has 0 |un (˜ x ) − u (˜ x )| d N −1 (x) d ≤ C |un (x) − u(x)| d x, −0
Σ
CR,
0
and the first limit in (10.11) yields the existence of a subsequence on n (not relabeled) such that for a.e. in ]0, 0 [, un → u in L1 N −1 (Σ). On the other hand, according to Proposition 10.1.2, (10.11) ensures the narrow convergenceof the measure |D un | to the measure |D u| in M+ (CR,0 ). Let us consider , such that ∂ C |D u| = 0. This choice R,,
is indeed valid in the complementary I of a countable subset of ]0, 0 [, by using Lemma 4.2.2 and because |D u| does not charge ∂ CR,0 \ Σ. According to the properties of the narrow convergence (cf. Proposition 4.2.5), we have, for and in I , |D un | → |D u|. CR,,
CR,,
Going to the limit on n in (10.12), we finally obtain, for in ]0, 0 [\8 , where 8 is an " 1 -negligible set, N −1 |u − u | d ≤C |D u|. (10.13) Σ
CR,,
From now on, denotes a sequence of numbers in ]0, 0 [\8 , going to zero, and (10.13) must be taken in the sense p q N −1 |u − u | d ≤C |D u|. Σ
CR, p ,q
From (10.13), (u ) is a Cauchy sequence in L1 N −1 (Σ), then strongly converges to some function u + in L1 (Σ). It remains to prove that u + satisfies (10.10). Letting → 0 in (10.13) and integrating over ] − 0 , 0[, we obtain 6 7 Σ
|u + | d N −1 ≤ C
|u| d x + CR,
|D u| . CR,
0
(10.14)
0
On the other hand, since un ∈ C∞ (CR,0 ) and ϕ ∈ C1c (CR,0 ∪ Σ, RN ), going to the limit on n in the classical Green’s formula un div ϕ d x = − D un .ϕ d x + un ϕ .ν d N −1 , CR,
where we claim that
CR,
0 ,
Σ
0 ,
y ), f (˜ x ) − 0 < xN < f (˜ x ) − }, CR,0 , := {x ∈ RN : x˜ ∈ SR (˜
u div ϕ d x = − CR,
0 ,
ϕD u + CR,
0 ,
Σ
u ϕ .ν d N −1 .
(10.15)
i
i i
i
i
i
i
10.2. The trace operator, the Green’s formula, and its consequences
We must justify the convergence CR,
book 2014/8/6 page 403 i
403
D un .ϕ d x →
0 ,
ϕD u. CR,
(10.16)
0 ,
The two others are straightforward. We reason by truncation: let < and consider ϕ˜ = ϕθ in C1c (CR,0 , , RN ), where the scalar function θ belongs to Cc (CR,0 , ) and satisfies θ = 1 in CR,0 , , with ||θ||∞ ≤ 1. We have
CR,
D un ϕ d x = 0 ,
C
From
R,,
CR,
D un ϕ˜ d x − 0
,
˜ ˜ D un ϕ d x ≤ ||ϕ||∞ C
CR,,
D un ϕ˜ d x.
(10.17)
|D un | R,,
and according to the narrow convergence of |D un | to |D u|, and because |D u| does not charge ∂ CR,, , for and in ]0, 0 [\8 , we obtain ˜ ϕ d x lim sup D u lim = 0. n → n→+∞ C R,,
The convergence in (10.16) is obtained by letting n → +∞ and → in (10.17) and the claim is proved. Finally going to the limit on in (10.15) we obtain u div ϕ d x = − ϕD u − u + ϕ.eN N −1 . CR,
CR,
0
SR
0
Second step. According to the first step and from a straightforward argument using a partition of unity subordinate to a finite cover (CR (yi ))i =1,...,r of Γ , there exists γ0 (u) in L1 N −1 (Γ ) satisfying
Γ
|γ0 (u)| d N −1 ≤ C
!
Ω
|u| d x +
Ω
" |D u|
and such that for all ϕ ∈ C(Ω, RN ), ϕD u = − u div ϕ d x + γ0 (u) ϕ.ν d N −1 , Ω
Ω
(10.18)
(10.19)
Γ
where ν(x) is the outer normal unit at N −1 almost all x in Γ and C a positive constant depending only on Ω. The operator γ0 is well defined by γ0 (u) = ui+ , in Σi , i = 1, . . . , r . Indeed, Green’s formula (10.10) established in the first step yields (ui+ − u + )ϕ.ν d N −1 = 0 j Σi ∩Σ j
for all functions ϕ in C1c ((CR,0 (yi )∩CR,0 (y j ))∪(Σi ∩Σ j ), RN ) so that ui+ = u + in Σi ∩Σ j j
up to sets of N −1 measure zero.
i
i i
i
i
i
i
book 2014/8/6 page 404 i
Chapter 10. Spaces BV and SBV
404
The generalized Green’s formula (10.19) yields the linearity of γ0 . The continuity is a consequence of (10.18). The identity γ0 (u) = uSR for all u in C(Ω) ∩ BV (Ω) is a straightforward consequence of the definition of γ0 (u). The operator γ0 then agrees with the trace operator defined in W 1,1 (Ω). Since γ0 (W 1,1 (Ω)) = L1 (Ω), we also have γ0 (BV (Ω)) = L1 (Ω). Remark 10.2.1. Let Ω be a Lipschitz open bounded subset of RN . The density theorem, Theorem 10.1.2, may be slightly improved in the sense that one may further assert that the trace of each regular approximating function of u ∈ BV (Ω), belonging to C∞ (Ω) ∩ BV (Ω), coincides with the trace of u on the boundary of Ω. Indeed, it is easily seen, with the notation of Theorem 10.1.2, that u − u is the strong limit in BV (Ω) of the functions u,n − un :=
n i =0
ρi ∗ (uϕi ) − uϕi
whose traces on Γ := ∂ Ω are the function null. The result then follows from the strong continuity of the trace operator. Remark 10.2.2. One may define the space BV (Ω, R m ) as the space of all functions u : Ω → R m in L1 (Ω, R m ) whose distributional derivative D u belongs to the space M(Ω, M m×N ) of m × N matrix-valued measures. Arguing as in the proof of Theorem 10.2.1 with each component of u, one may prove the existence of the trace operator γ0 from BV (Ω, R m ) onto L1 N −1 (Γ , R m ) satisfying (i) ∀u ∈ C(Ω, R m ) ∩ BV (Ω, R m ), γ0 (u) = uΓ ; (ii) the Green’s formula holds: for all ϕ ∈ C1 (Ω, M m×N ) ϕ : D u = − u. div ϕ d x + γ0 (u) ⊗ ν : ϕ d N −1 , Ω
Ω
Γ
where ν(x) is the outer unit normal at N −1 almost all x in Γ and γ0 (u) ⊗ ν is the M m×N valued function (γ0 (u)i ν j )i =1...m, j =1...N . The integral with respect to the measure D u in the first member is defined by Ω
ϕ : D u :=
and γ0 (u) ⊗ ν : ϕ :=
N m i =1 j =1 m N i =1 j =1
Ω
ϕi , j
∂ uj ∂ xi
γ0 (u)i ν j ϕi , j .
The divergence of ϕ is the vector valued distribution (div ϕ) j := 1, . . . , m.
N
∂ uj i =1 ∂ xi
, j =
The density theorem, Theorem 10.1.2, and Remark 10.2.1 also hold in BV (Ω, R m ). We now give some consequences of Theorem 10.2.1. Example 10.2.1. Consider two disjoint Lipschitz domains Ω1 and Ω2 , included in an open bounded subset Ω of RN , such that Ω = Ω1 ∪ Ω2 , and set Γ1,2 := ∂ Ω1 ∩ ∂ Ω2 which
i
i i
i
i
i
i
10.2. The trace operator, the Green’s formula, and its consequences
book 2014/8/6 page 405 i
405
Figure 10.1. The set Ω.
is assumed to satisfy N −1 (Γ1,2 ) > 0 (see Figure 10.1). We respectively denote the trace operators from BV (Ω1 ) onto L1 (∂ Ω1 ) and BV (Ω2 ) onto L1 (∂ Ω2 ) by γ1 and γ2 . Let u1 and u2 be, respectively, two functions in BV (Ω1 ) and BV (Ω2 ) and define < u1 in Ω1 , u= u2 in Ω2 . Then u belongs to BV (Ω) and D u = D u1 Ω1 + D u2 Ω2 + [u]ν N −1 Γ1,2 , where [u] = γ1 (u1 ) − γ2 (u2 ) and ν(x) is the unit inner normal at x to Γ1,2 considered as a part of the boundary of Ω1 (see Figure 10.1). In particular, if u ∈ W 1,1 (Ω \ Γ1,2 ) D u = ∇u "N + [u]ν N −1 Γ1,2 , where ∇u is the gradient of u in L1 (Ω). PROOF. For all ϕ ∈ C1c (Ω, RN ), 〈D u, ϕ〉 = −
Ω
u div ϕ d x = −
Ω1
u1 div ϕ d x −
Ω2
u2 div ϕ d x.
Since ϕ belongs to C1 (Ω1 , RN ) ∩ C1 (Ω2 , RN ), applying the generalized Green’s formula in BV (Ω1 ) and BV (Ω2 ), we have ⎧ ⎪ ⎪ u div ϕ d x = − ϕ D u1 + γ1 (u1 )ϕ.(−ν) N −1 , ⎪ ⎪ ⎨ Ω 1 Ω Γ 1 1 1,2 ⎪ ⎪ ⎪ u2 div ϕ d x = − ϕ D u2 + γ2 (u2 )ϕ.ν N −1 . ⎪ ⎩ Ω2
Ω2
By summing these two equalities, we obtain ϕ D u1 + ϕ D u2 + 〈D u, ϕ〉 = Ω1
Ω2
Γ1,2
Γ1,2
γ1 (u1 ) − γ2 (u2 ) ϕ.ν N −1 .
(10.20)
Assume now ||ϕ||∞ ≤ 1. According to the continuity of γ1 and γ2 , there exists a positive constant C depending only on Ω1 and Ω2 such that N −1 γ1 (u1 ) − γ2 (u2 ) ϕ.ν ≤ C u1 BV (Ω1 ) + u2 BV (Ω2 ) , Γ1,2
i
i i
i
i
i
i
book 2014/8/6 page 406 i
Chapter 10. Spaces BV and SBV
406
so that u belongs to BV (Ω) and
D u := sup{〈D u, ϕ〉 : ϕ ∈ Cc (Ω, RN ), ||ϕ||∞ ≤ 1} ≤ C u1 BV (Ω1 ) + u2 BV (Ω2 ) .
Finally, (10.20) can be written γ1 (u1 ) − γ2 (u2 ) ϕ.ν N −1 Γ1,2 . 〈D u, ϕ〉 = ϕ D u1 Ω1 + D u2 Ω2 + Ω
Ω
This shows that D u = D u1 Ω1 + D u2 Ω2 + γ1 (u1 ) − γ2 (u2 ) ν N −1 Γ1,2 on C1c (Ω, RN ) and thus are equal by density. Example 10.2.2. Let us slightly modify the previous example by considering the function < u in Ω, v= 0 in RN \ Ω, where Ω is a Lipschitz domain of RN and u ∈ BV (Ω). We see that v belongs to BV (RN ) and Dv = D uΩ + u + ν N −1 Γ , where Γ is the boundary of Ω, ν denotes the inner unit vector normal to Γ , and u + the trace of u on Γ . Thus, taking for instance u equal to the constant 1 in Ω, the characteristic function 1Ω of Ω belongs to BV (RN ) (more precisely in a subspace SBV (RN ) introduced in Section 10.5) and D1Ω = ν N −1 Γ . This formula will be generalized in Section 10.3 when Ω possesses a reduce boundary ∂ r Ω (which in general does not coincides with the topological boundary) and a generalized unit normal ν to ∂ r Ω. More precisely, for 1Ω belonging to BV (RN ), we will obtain D1Ω = ν N −1 ∂ r Ω. Example 10.2.3. Jump through a family of hypersurfaces. Let Ω be a Lipschitz domain of RN , u a function in BV (Ω), and (Σ t ) t ∈I a family of Lipschitz hypersurfaces such that Σ t ⊂ Ω is the boundary of a Lipschitz open bounded subset Ω t of Ω with t < t =⇒ Ω t ⊂⊂ Ω t . As a consequence of Example 10.2.1, one has for all but countably many t in I that the jumps [u] t of u through Σ t are null. PROOF. Indeed, from Example 10.2.1, |D u| = |[u] t | d N −1 , Σt
Σt
and we conclude by Lemma 4.2.1. Example 10.2.4. Let Ω be a Lipschitz domain of RN and u ∈ BV (Ω). For all t > 0, consider the Lipschitz domains Ω t = {x ∈ Ω : d (x, RN \ Ω) > t } with boundary Γ t and denote, respectively, by u t+ and u t− the traces of u when u is considered as an element of BV (Ω t ) or BV (Ω \ Ω t ). We have for all but countably many t in R+ : the traces u t+ (x) and u t− (x) agree with u(x) for N −1 almost x in Γ t . PROOF. By using arguments of the proof of Theorem 10.2, one may assume that Ω is a cylinder. With the notation of this theorem, for all t in a complementary of a countable
i
i i
i
i
i
i
10.2. The trace operator, the Green’s formula, and its consequences
subset of R+ , estimate (10.13) becomes |u t + − u t | d N −1 ≤ C Σ
book 2014/8/6 page 407 i
407
|D u|. CR,t +,t
We deduce, for these t , that u t + tends to u t in L1 (Σ) when goes to 0+ . On the other hand, according to the proof of the trace theorem, u t + tends to u t+ in L1 (Σ t ) when goes to 0+ . Therefore, for N −1 almost all x in Σ t , u t+ = u. With the same arguments, but now reasoning on CR,t ,t + , we obtain that for N −1 almost all x in Γ t , u t− = u. We are going to establish the continuity of the trace operator with respect to the intermediate convergence. Let us recall that the trace operator is continuous from W 1, p (Ω) into L p (Γ ), p ≥ 1 when these two spaces are equipped with their weak topology. Indeed, this is a consequence of the following property (cf. [137, Proposition III 9]): if T is a continuous linear operator from a Banach space E into a Banach space F , then T is continuous from E equipped with the σ(E, E ) topology into F equipped with the σ(F , F ) topology. We cannot apply this general principle to the trace operator defined in BV (Ω). Indeed, the weak topology in BV (Ω) is not the σ(BV (Ω), BV (Ω) ) topology and we must be careful with the terminology “weak” convergence in BV (Ω). The example of Remark 10.1.3 shows that γ0 is not continuous from BV (Ω) into L1 (Γ ) when BV (Ω) and L1 (Γ ) are equipped with their weak convergence: define un ∈ BV (Ω), Ω = (0, 1) by un (x) = nx if x ∈ (0, n1 ] and u(x) = 1 if x ∈ [ n1 , 1). It is easily seen that un weakly converges to the constant 1 in BV (Ω), whereas un (x) = 0 for x ∈ {0}. Note that in this example, un does not converge to u = 1 in the sense of intermediate convergence because there is “loss of total mass.” Indeed Ω |D un | = 1 but Ω |D u| = 0. When the total mass is preserved at the limit, the trace operator is continuous. Theorem 10.2.2. Let Ω be a Lipschitz domain of RN . The trace operator γ0 is continuous from BV (Ω) equipped with the intermediate convergence onto L1 (Γ ) equipped with the strong convergence. PROOF. For each t > 0, consider the Lipschitz domain Ω t = {x ∈ Ω : d (x, RN \ Ω) > t } with Lipschitz boundary Γ t , and (un )n∈N and u in BV (Ω) such that ⎧ ⎪ ⎪ ⎪ ⎨ |un − u | d x → 0, Ω ⎪ ⎪ ⎪ ⎩ |D un | → |D u|. Ω
(10.21)
Ω
Possibly passing to a subsequence on n, and for almost all t ∈ R+ , we have ⎧ ⎪ ⎪ |D u| = 0, ⎪ ⎪ ⎪ ⎨ Γt un − u= (un − u) t for N −1 a.e. x on Γ t , ⎪ ⎪ ⎪ ⎪ ⎪ lim |un − u| d N −1 → 0, ⎩n→+∞
(10.22)
Γt
where (un − u) t denotes the trace on Γ t of the function un − u in BV (Ω t ). Indeed, the two first assertions are satisfied for all but countably many t in R+ , thanks to Lemma 4.2.1
i
i i
i
i
i
i
book 2014/8/6 page 408 i
Chapter 10. Spaces BV and SBV
408
and to Example 10.2.4. The last assertion is a consequence of the curvilinear Fubini theorem (cf. Corollary 4.2.2): 0
+∞ Γt
|un − u| d N −1 d t =
=
+∞ 0
Ω
N
[d (x,R \Ω)=t ]
|un − u| d N −1 d t
|un − u| d x.
For a fixed t , for which these assertions hold, let us define the function un,t in BV (Ω) by un,t = We have
un − u in Ω \ Ω t , 0 in Ω t .
D un,t = D(un − u)(Ω \ Ω t ) + (un − u) t ν t N −1 Γ t ,
where ν t is the outer unit normal vector to Γ t . According to the strong continuity of the trace operator γ0 from BV (Ω) onto L1 (Γ ), we finally deduce |γ0 (un − u)| d N −1 = |γ0 (un,t )| d N −1 Γ Γ ≤C |un − u| d x + |D un − D u| d x Ω\Ω t
+
Γt
|un − u| d N −1 .
Letting n → ∞, (10.21) and (10.22) yield N −1 lim sup |γ0 (un − u)| d ≤ 2C n→+∞
Γ
Ω\Ωt
Ω\Ω t
|D u| d x.
We have used the narrow convergence of |D un | to |D u| and Γ |D u| = 0 to assert that t |D u | tends to |D u| (see Proposition 4.2.5). We complete the proof by letting n Ω\Ω t Ω\Ω t t go to zero.
10.3 The coarea formula and the structure of BV functions It is well known that each real-valued function u of bounded variation on an interval I of R is the difference between two monotonous functions and, consequently, possesses at every point x0 ∈ I the two limits u(x0 −0) and u(x0 +0). One can then define its jump set S u := {x ∈ Ω : u(x − 0) = u(x + 0)}. The main objective of this section is to establish that one can associate a set S u to each u in BV (Ω), which generalize in any dimension the jump set of u in one dimension. In the particular case of a simple function u = χΩ ∈ BV (Ω), we will see that S u is a part ∂M Ω of the topological boundary ∂ Ω, which may differ from ∂ Ω of a set of null N −1 measure. The structure theorem, Theorem 10.3.4, will be a straightforward consequence of this property thanks to the generalized coarea formula in Section 10.3.3. For a first reading, the reader is advised to go directly to the definitions of the approximate limit sup and approximate limit inf (Definition 10.3.4) and to Theorem 10.3.4.
i
i i
i
i
i
i
10.3. The coarea formula and the structure of BV functions
book 2014/8/6 page 409 i
409
10.3.1 Notion of density and regular points In this subsection, we generalize the notions of interior, exterior of subsets of RN as well as the notions of limit, continuity, and jump for measurable functions. Indeed, we intend to extend the expression DχΩ = ν N −1 Γ previously obtained for Lipschitz domains Ω thanks to the theory of traces (see Example 10.2.3) to sets Ω whose characteristic function χΩ belongs to BV (RN ). Actually, for these sets, we will obtain DχΩ = ν N −1 ∂M Ω, where the generalized boundary ∂M Ω (the measure theoretical boundary) is, up to a set of N −1 measure zero, the set of all points that are neither in the generalized interior nor in the generalized exterior of Ω. In what follows, Bρ (x0 ) denotes the open ball of RN with radius ρ > 0 and centered at x0 . Definition 10.3.1. Let E be a Borel subset of RN . A point x0 in RN is a density point of E iff lim
ρ→0
" N (Bρ (x0 ) ∩ E) " N (Bρ (x0 ))
= 1.
A point x0 is a rarefaction point of E iff lim
ρ→0
" N (Bρ (x0 ) ∩ E) " N (Bρ (x0 ))
= 0.
The set of all density points and all rarefaction points of E are respectively called measure theoretical interior and measure theoretical exterior of E and denoted by E∗ and E ∗ . Example 10.3.1. When O is an open subset of RN , it is easily seen that all the points of O are density points and all the points of RN \ O are rarefaction points. Nevertheless, if x0 belongs to the boundary Γ of O, various situations may occur, as shown in Figure 10.2.
Figure 10.2. The point x0 is a density point of the union of the two discs but is a rarefaction point of the complementary of this union.
We generalize these definitions relatively to a fixed Borel subset F of RN as follows. Definition 10.3.2. Let F and E be two Borel subsets of RN and assume that x0 in RN is such that " N (Bρ (x0 ) ∩ F ) > 0 for all ρ > 0 small enough. The point x0 is an F -density point of E iff lim
ρ→0
" N (Bρ (x0 ) ∩ F ∩ E) " N (Bρ (x0 ) ∩ F )
= 1.
i
i i
i
i
i
i
book 2014/8/6 page 410 i
Chapter 10. Spaces BV and SBV
410
The point x0 is an F -rarefaction point of E iff lim
" N (Bρ (x0 ) ∩ F ∩ E)
ρ→0
" N (Bρ (x0 ) ∩ F )
= 0.
These definitions allow us to adopt the following notion of boundary. Definition 10.3.3. Let E be a Borel subset of RN . The measure theoretical boundary of E is the subset of RN denoted by ∂M E, made up of all the elements of RN which are neither density points nor rarefaction points of E. As a straightforward consequence of this definition, one can easily establish that the measure theoretical boundary is a part of the classical topological boundary as stated in the following proposition. The proof is left to the reader. Proposition 10.3.1. The measure theoretical boundary of E is the subset of the topological boundary ∂ E defined by " N (Bρ (x) ∩ E) " N (Bρ (x) \ E) N > 0 and lim sup >0 . ∂M E = x ∈ R : lim sup " N (Bρ (x)) " N (Bρ (x)) ρ→0 ρ→0 Remark 10.3.1. The measure theoretical boundary may differ from the topological boundary of a set of nonnull N −1 measure. Indeed, consider E = B \ [(0, 0), (0, 1)[, where B is the unit open ball of R2 . The measure theoretical boundary is the sphere but the (topological) boundary is the union of the sphere and the interval [(0, 0), (0, 1)]. Let E be an open subset of RN satisfying the following property: for all point x0 of ∂ E, there exists a normal vector ν(x0 ) to ∂ E such that E is included in the half-space πν (x0 ) := {x ∈ RN : 〈x − x0 , ν(x0 )〉 > 0}, where 〈 , 〉 denotes the scalar product in RN . Then one has ∂ E = ∂M E. Indeed, it is easily seen that χ 1 (E−x0 ) strongly converges to the ρ
characteristic function χπν (x0 ) of πν (x0 ) in L1l oc (RN ) so that for x0 in ∂ E, lim
ρ→0
" N (Bρ (x0 ) ∩ E) " N (Bρ (x0 ))
= lim
" N (B1 (x0 ) ∩ ρ1 (E − x0 )) " N (B1 (x0 ))
ρ→0
= 1/2 = lim
ρ→0
" N (Bρ (x0 ) \ (E − x0 )) " N (Bρ (x0 ))
.
We now look into the notion of approximate limit. Definition 10.3.4. Let f : RN −→ R be a measurable function and x0 ∈ RN . A real number α is the approximate limit of f at x0 iff ∀ > 0
x0 is a density point of the set [ | f − α| < ],
or equivalently ∀ > 0
x0 is a rarefaction point of the set [ | f − α| > ].
We then write α = a p limx→x0 f (x).
i
i i
i
i
i
i
10.3. The coarea formula and the structure of BV functions
book 2014/8/6 page 411 i
411
More generally, we define in R the approximate limit sup and approximate limit inf of f at x0 by ap lim sup f (x) = inf t ∈ R : lim
" N (Bρ (x0 ) ∩ [ f > t ]) " N (Bρ (x0 ))
ρ→0
x→x0
ap lim inf f (x) = sup t ∈ R : lim
" N (Bρ (x0 ) ∩ [ f < t ])
ρ→0
x→x0
" N (Bρ (x0 ))
=0
,
=0
.
Let F be a fixed Borel subset of RN . A real number α is called the F -approximate limit of f at x0 iff ∀ > 0 x0 is an F -density point of [ | f − α| < ] and we write α = a p lim x→x0 ,x∈F f (x). Example 10.3.2. If α = lim x→x0 f (x), it is straightforward to show that α = ap lim x→x0 f (x). When f is the characteristic function χD ∪D of the union of the two discs in Example 1
2
10.3.1, [ | f − 1| < ] = D1 ∪ D2 , so that ap limx→x0 f (x) = 1 although the classical limit does not exist. When f is the characteristic function of the complementary of the union of these two discs, the approximate limit of f at x0 is zero. It is easy to establish uniqueness of the approximate limit when it exists. For further details related to these notions, see [245]. We only give, without proof, four elementary useful properties. Proposition 10.3.2. For all Borel subsets A and B of RN , the four following assertions hold: (i) Let C = A ∪ B. If the approximate limits ap
lim
x→x0 , x∈A
f (x)
and
ap
lim
x→x0 , x∈B
f (x)
exist and coincide, then a p lim x→x0 , x∈C f (x) exists. (ii) If x0 is not a rarefaction point of A, A ⊂ B, and if x0 is a B-rarefaction point of B \ A, then the existence of a p lim f (x) x→x0 , x∈A
implies the existence of ap
lim
x→x0 , x∈B
f (x).
(iii) If A ⊂ B and if x0 is not a rarefaction point of A, then the existence of ap
lim
f (x)
lim
f (x).
x→x0 , x∈B
implies the existence of ap
x→x0 , x∈A
i
i i
i
i
i
i
book 2014/8/6 page 412 i
Chapter 10. Spaces BV and SBV
412
(iv) If x0 is not a rarefaction point of A ∩ B, then the existence of ap
lim
x→x0 , x∈A
f (x)
and
ap
lim
x→x0 , x∈B
f (x)
implies equality of these two approximate limits. In the following proposition, we show that the approximate limit at x0 is a classical limit for the restriction of f to a suitable Borel set. Moreover, when the approximate lim inf and approximate lim sup coincide, their common value is the approximate limit. Proposition 10.3.3. A measurable function f : RN −→ R possesses an approximate limit α at x0 iff there exists a Borel subset B of RN such that x0 is a rarefaction point of RN \ B and such that the restriction f B of f to B possesses the classical limit α at x0 . On the other hand, one always has ap lim inf f ≤ ap lim sup f x→x0
x→x0
in R and ap lim inf x→x0 f = ap lim sup x→x0 f := α ∈ R iff ap lim x→x0 f = α. PROOF. Let us prove the first assertion. It is easily seen that the given condition is sufficient. Conversely, assume that f possesses an approximate limit α at x0 . Without loss of generality one may assume α = 0. For all integer i, x0 is then a rarefaction point of Ai := RN \ {x ∈ RN : | f (x)| < 1/i}. Consider a nonincreasing sequence ρ1 > ρ2 > · · · > ρi > · · · in R+ such that " N (Bρ (x0 ) ∩ Ai ) " N (Bρ (x0 ))
≤ 2−i
when 0 ≤ ρ ≤ ρi , and denote the complementary set of ∪i ∈N∗ (Ai ∩ Bρi (x0 )) by B. It is straightforward to show that f B has the limit zero at x0 . We now show that x0 is a rarefaction point of ∪i ∈N∗ (Ai ∩ Bρi (x0 )). Let ρi > ρ > ρi +1 . Then " N (Bρ (x0 ) ∩ (RN \ B)) ≤ " N (Bρ (x0 ) ∩ Ai ) +
∞
" N (Bρi+k (x0 ) ∩ Ai +k )
k=1
≤ C ρN 2−(i −1) . We conclude the proof of the assertion by letting ρ go to zero and then i go to +∞ in inequality " N (Bρ (x0 ) ∩ (RN \ B)) ≤ C 2−(i −1) . " N (Bρ (x0 )) We are going to establish the second assertion. Assume that ap lim infx→x0 f (x) = ap lim sup x→x0 f (x) = α. Let > 0, t , t be such that α ≤ t < α + , α − < t ≤ α and lim
ρ→0
" N (Bρ (x0 ) ∩ [ f > t ]) " N (Bρ (x0 ))
= 0, lim
ρ→0
" N (Bρ (x0 ) ∩ [ f < t ]) " N (Bρ (x0 ))
= 0.
i
i i
i
i
i
i
10.3. The coarea formula and the structure of BV functions
book 2014/8/6 page 413 i
413
Since [| f − α| > ] = [ f > α + ] ∪ [ f < α − ] and [ f > α + ] ⊂ [ f > t ], we infer lim
[ f < α − ] ⊂ [ f < t ],
" N (Bρ (x0 ) ∩ [| f − α| > ]) " N (Bρ (x0 ))
ρ→0
= 0,
and thus α = ap lim x→x0 f (x). Conversely, assume that α = ap lim x→x0 f (x), i.e., for all >0 " N (Bρ (x0 ) ∩ [| f − α| > ]) = 0. lim ρ→0 " N (Bρ (x0 )) We deduce that for all > 0 lim
" N (Bρ (x0 ) ∩ [ f > α + ])
ρ→0
" N (Bρ (x0 ))
=0
and
lim
" N (Bρ (x0 ) ∩ [ f < α − ]) " N (Bρ (x0 ))
ρ→0
= 0.
Therefore α + ≤ ap lim inf f (x) and α − ≥ ap lim sup f (x), x→x0
x→x0
and thus ap lim inf f (x) = ap lim sup f (x), x→x0
x→x0
provided that we have established ap lim inf x→x0 f (x) ≤ ap lim inf x→x0 f (x). Let us show ap lim inf x→x0 f (x) ≤ ap lim sup x→x0 f (x). One may assume ap lim sup x→x0 f (x) < +∞. Let any τ ∈ R be such that lim
ρ→0
" N (Bρ (x0 ) ∩ [ f < τ]) " N (Bρ (x0 ))
= 0.
We claim that τ≤ ap lim sup x→x0 f (x). Assume, on the contrary, that τ>ap lim sup x→x0 f (x) and set = τ − ap lim sup x→x0 f (x). We deduce that there exists t such that τ > t and lim
ρ→0
" N (Bρ (x0 ) ∩ [ f > t ]) " N (Bρ (x0 ))
= 0.
The inclusion [ f ≥ τ] ⊂ [ f > t ] then yields lim
ρ→0
" N (Bρ (x0 ) ∩ [ f ≥ τ]) " N (Bρ (x0 ))
= 0,
which is in contradiction with lim
ρ→0
" N (Bρ (x0 ) ∩ [ f ≥ τ]) " N (Bρ (x0 ))
= 1.
i
i i
i
i
i
i
book 2014/8/6 page 414 i
Chapter 10. Spaces BV and SBV
414
Remark 10.3.2. Every measurable function f : RN −→ R possesses an approximate limit almost everywhere. For a proof, consult Morgan [301]. Let f be a function in L1l oc (RN ). Then each Lebesgue point of f is a point of approximate limit. This property is indeed the straightforward consequence of " N (Bρ (x0 ) ∩ [| f − f (x0 )| > ]) 1 | f (x) − f (x0 )| d x. ≤ " N (Bρ (x0 )) " N (Bρ (x0 )) Bρ (x0 ) Nevertheless, there exist points of approximate limit which are not Lebesgue points. Consult, for instance, Morgan [301, Exercise 2.7]. In the next sections, for all functions f in L1 (Ω), Ω open subset of RN , we will adopt the following convention: we choose a representative of f , still denoted f , such that at every point x0 of approximate limit, f (x0 ) = a p lim x→x0 f (x). Such a representative is said to be approximately continuous at its points of approximate limit. We now generalize the concept of left and right limits u(x0 − 0) and u(x0 + 0) for functions defined on RN . We denote the unit sphere of RN by S N −1 and for all a in S N −1 and all x0 in RN , πa (x0 ) denotes the open half-space πa (x0 ) := {x ∈ RN : 〈x − x0 , a〉 > 0}. We also denote the hemiball πa (x0 ) ∩ Bρ (x0 ) by Hρ,a (x0 ). Definition 10.3.5. A point x0 in RN is called a regular point for the measurable function f : RN → R iff there exists a ∈ S N −1 such that the two following approximate limits exist: fa (x) := a p
lim
x→x0 , x∈πa (x0 )
f (x)
and
f−a (x) := a p
lim
x→x0 , x∈π−a (x0 )
f (x).
Example 10.3.3. Let us consider Example 10.3.1 and take for f the characteristic function of the union of the two discs. The point x0 is a regular point of f and satisfies 1 = fa (x0 ) = f−a (x0 ), where a is one of the two unit vectors orthogonal to the common tangent hyperplane at the two discs at x0 . One could say that x0 is a point of approximate continuity for f . Let us point out that we also have 1 = a p lim x→x0 f (x). Consider now the characteristic function f of one of the two discs. The point x0 is also a regular point of f and satisfies fa (x0 ) = f−a (x0 ). The following theorem asserts that regular points always satisfy the alternative of Example 10.3.3. Theorem 10.3.1 (structure of the set of regular points, jump points, and jump sets). Let x0 be a regular point of a measurable function f : RN → R. We have the following alternative: (i) If fa (x0 ) = f−a (x0 ), then
a p lim x→x0 f (x)
N −1
exists and for all b in S , f b (x0 ) = a p lim x→x0 f (x). The point x0 is called a point of approximate continuity of f . (ii) If fa (x0 ) = f−a (x0 ), then ±a is the unique element of S N −1 such that fa and f−a exist. The point x0 is called a jump point of f . The real number | fa (x0 ) − f−a (x0 )| is the jump of f at x0 and fa (x0 ) − f−a (x0 ) a is the oriented jump. The set of all jump points of f is called the jump set of f and denoted by S f .
i
i i
i
i
i
i
10.3. The coarea formula and the structure of BV functions
book 2014/8/6 page 415 i
415
PROOF. Assume that fa (x0 ) = f−a (x0 ). The existence of a p lim x→x0 f (x) is a consequence of Proposition 10.3.2(i) with A = πa (x0 ) and B = π−a (x0 ). Thus, according to Proposition 10.3.2(iii), with B = RN and A = π b (x0 ), f b (x0 ) exists for all b ∈ S N −1 . According to Proposition 10.3.2(iv), we obtain f b (x0 ) = a p lim x→x0 f (x). We finally establish (ii). Let b be any element of S N −1 . We show that f b (x0 ) does not exist if b = ±a. Otherwise, as x0 is not a rarefaction point of the sets πa (x0 ) ∩ π b (x0 ) and π−a (x0 ) ∩ π b (x0 ), from Proposition 10.3.2(iv), fa (x0 ) = f b (x0 ) and f−a (x0 ) = f b (x0 ), a contradiction. Example 10.3.4. Let us consider the function f defined by < α if |x| ≤ 1, f = β if |x| > 1, where α = β. The jump set is the sphere S N −1 . In Example 10.3.1, if f = χD1 ∪D2 , the jump set is the boundary (topological) except the point x0 . The next proposition characterizes the jump set of simple functions. Proposition 10.3.4 (inner measure theoretic normal). Let E be a Borel subset of RN and χ its characteristic function. The jump set Sχ of χ is the set of all points x0 of RN for which there exists a in S N −1 satisfying ⎧ " N Hρ,a (x0 ) ∩ E ⎪ ⎪ ⎪ ⎪ lim = 1, ⎪ ⎨ρ→0 " N H (x ) ρ,a 0 ⎪ " N Hρ,−a (x0 ) ∩ E ⎪ ⎪ ⎪ lim = 0. ⎪ ⎩ρ→0 " N Hρ,a (x0 ) For such points, the unit vector a is unique and called the inner measure theoretic normal to E at x0 . Moreover, Sχ ⊂ ∂M E. PROOF. From the definition, every point satisfying the two above properties is a jump point of χ , and the uniqueness of a is a consequence of Theorem 10.3.1(ii). Conversely, if x0 is a jump point of χ , there exists a unique ±a in S N −1 such that χa (x0 ) = χ−a (x0 ). It is easily seen that χa (x0 ) and χ−a (x0 ) belong to {0, 1}. Then, exchanging, if necessary, a and −a, we have χa (x0 ) = 1 and χ−a (x0 ) = 0, which gives the two required limits. We must now prove SχE ⊂ ∂M E. Let x0 ∈ SχE . One has " N (Bρ (x0 ) ∩ E) " N (Bρ (x0 ))
= × × ≥
" N (Bρ (x0 ) ∩ E) " N (Bρ (x0 ) ∩ E ∩ πa (x0 )) " N (Bρ (x0 ) ∩ E ∩ πa (x0 )) " N (Bρ (x0 ) ∩ πa (x0 )) " N (Bρ (x0 ) ∩ πa (x0 )) " N (Bρ (x0 )) 1 " N (Bρ (x0 ) ∩ E ∩ πa (x0 )) 2
" N (Bρ (x0 ) ∩ πa (x0 ))
.
i
i i
i
i
i
i
book 2014/8/6 page 416 i
Chapter 10. Spaces BV and SBV
416
Since the second factor tends to 1 when ρ → 0, the inequality above yields lim sup ρ→0
" N (Bρ (x0 ) ∩ E)
≥
" (Bρ (x0 )) N
1 2
> 0.
Exchanging the roles of a and −a and E and RN \ E, we also obtain lim sup ρ→0
" N (Bρ (x0 ) \ E) " (Bρ (x0 )) N
≥
1 2
> 0;
thus, according to Proposition 10.3.1, x0 ∈ ∂M E.
10.3.2 Sets of finite perimeter, structure of simple BV functions To clarify the structure of BV functions, we now establish that up to a set of N −1 measure zero, the jump set of all simple function χE which belongs to BV (RN ) is essentially the measure theoretical boundary of E. Definition 10.3.6. A Borel subset E of RN is called a set of finite perimeter in Ω iff its characteristic function χE belongs to BV (Ω). The total mass |DχE |(Ω) is called the perimeter of E in Ω and is denoted by P (E, Ω) or P (E) when Ω = RN . A Borel subset E of RN is called a set of locally finite perimeter if it is a set of finite perimeter in U for all bounded open subset U of RN . Remark 10.3.3. When E is a Lipschitz open bounded subset of Ω, according to the trace Theorem 10.2.1 and to Example 10.2.2, we have P (E, Ω) = N −1 (Ω ∩ ∂ E). Theorem 10.3.2 (structure of simple functions of BV (Ω)). Let E be a set of finite perimeter in Ω. Then the following hold: (i) Up to a Borel subset of N −1 Ω-zero measure, ∂M E ∩ Ω is the jump set of χE . (ii) The set ∂M E ∩ Ω is countably N − 1-rectifiable, i.e., ∂M E ∩ Ω ⊂ i ∈N An , where N −1 Ω(A0 ) = 0, and for each i = 1, . . . , +∞ there exists a Lipschitz function fi : RN −1 → RN such that Ai = fi (RN −1 ). (iii) The following generalized Gauss–Green formula holds: for N −1 almost all x in Ω, there exists ν(x) ∈ S N −1 , called the generalized inner normal vector to E at x, such that for all ϕ in C1c (Ω, RN ),
Ω
χE div ϕ d x =
∂M E∩Ω
ϕ.(−ν)d N −1 ,
that is, DχE = ν N −1 ∂M E ∩ Ω. PROOF. The proof is divided into the five following steps: Step 1. We define the generalized inner normal vector ν(x) to E at x and the reduced boundary ∂ r E of E.
i
i i
i
i
i
i
10.3. The coarea formula and the structure of BV functions
book 2014/8/6 page 417 i
417
Step 2. This step consists of establishing ∂ r E ⊂ SχE ∩ Ω ⊂ ∂M E ∩ Ω. Step 3. We prove that ∂M E ∩ Ω, SχE ∩ Ω, and ∂ r E are essentially the same sets; more precisely, N −1 (∂M E ∩ Ω \ ∂ r E) = 0. Step 4. We establish that SχE ∩ Ω is countably (N − 1)-rectifiable. Step 5. We prove the generalized Gauss–Green formula. Step 1. Contrary to the previous notions of boundary ∂M E and SχE , the definition below is specifically defined for subsets E of Ω such that DχE belongs to M(Ω, RN ). Definition 10.3.7. Let E be a subset of finite perimeter in Ω. The generalized unit inner normal to E is the Radon–Nikodým derivative of DχE with respect to the measure |DχE |. In other words, DχE Bρ (x) . for |DχE | a.e. x in Ω, ν(x) = lim ρ→0 |DχE | B (x) ρ
The reduced boundary ∂ r E consists of all points x in Ω such that the limit above exists. Let us remark that according to the trace theory, when E is a Lipschitz domain of RN with boundary Γ , we have DχE = ν N −1 Γ , where ν(x) is the inner normal to E at Dχ N −1 a.e. x in Γ . Consequently |DχE | (x) = ν(x) for N −1 a.e. x in Γ and N −1 (Γ \ E ∂ r E) = 0. Step 2. We establish that ∂ r E ⊂ SχE ∩ Ω. The inclusion SχE ∩ Ω ⊂ ∂M E ∩ Ω has been proved in Proposition 10.3.4. The key of the proof is the following blow-up lemma. Lemma 10.3.1. Let x0 ∈ ∂ r E, ν(x0 ) be the generalized unit inner normal to E at x0 , Eρ the homothetic subset {x ∈ RN : ρ(x − x0 ) ∈ E} of E, and πν (x0 ) the half-space {x ∈ RN : 〈x − x0 , ν(x0 )〉 > 0}. Then χEρ weakly converges to χπν (x0 ) in BV (B1 (x0 )). SKETCH OF THE PROOF OF LEMMA 10.3.1. We admit the three following estimates: for each x ∈ ∂ r E there exits a positive constant C such that for all sufficiently small r > 0, C ≤ r N −1 |DχE |(B r (x)) ≤ C −1 , −N
N
C ≤ r " (B r (x) ∩ E), C ≤ r −N " N (B r (x) \ E).
(10.23) (10.24) (10.25)
For a proof, we refer the reader to [366, Lemma 5.5.4]. Without loss of generality, one may assume x0 = 0 and ν(0) = (0, . . . , 0, 1). Moreover, it is enough to establish that for each sequence (ρ h ) h∈N , there exists a subsequence (not relabeled) satisfying χEρ χπν (0) in BV (B1 (x0 )). h
Let us fix r > 0. Reasoning with a smooth approximating sequence in the sense of the intermediate convergence (Theorem 10.1.2), and changing scale, a straightforward calculation gives DχEρ (B r (0)) = ρ1−N DχE (Bρh r (0)), h
(10.26)
|DχE |(Bρh r (0)). |DχEρ |(B r (0)) = ρ1−N h
(10.27)
h
h
i
i i
i
i
i
i
book 2014/8/6 page 418 i
Chapter 10. Spaces BV and SBV
418
Collecting (10.23) and (10.27), we obtain, for ρ h small enough, C r N −1 ≤ |DχEρ |(B r (0)) ≤ C −1 r N −1 .
(10.28)
h
Thanks to (10.28) with r = 1, applying Theorem 10.1.4 in BV (B1 (0)), there exist a subsequence (not relabeled) and a subset F ⊂ RN of finite perimeter in B1 (0) such that strongly in L1 (B1 (0)),
χEρ → χF h
DχEρ DχF h
(10.29)
weakly in M(B1 (0), RN ).
(10.30)
From now on we reason in BV (B1 (0)). It remains to establish that F = πν (0). According to Theorem 4.2.1 and Lemma 4.2.1, for all but countably many 0 < ρ < 1 DχEρ (Bρ (0)) → DχF (Bρ (0)).
(10.31)
h
Thus, from (10.27), (10.26), according to the definition of ν(0) = (0, . . . , 0, 1), and from (10.31) we deduce lim |DχEρ |(Bρ (0)) = lim
h→+∞
|DχEρ |(Bρ (0)) h
h→+∞ ∂ ∂ xN
h
= lim
= lim
h→+∞
χEρ (Bρ (0)) h→+∞ ∂ xN
χEρ (Bρ (0)) h
h
|DχE |(Bρh ρ (0))
h→+∞ ∂ ∂ xN
∂
lim
∂
lim
χE (Bρh ρ (0)) h→+∞ ∂ xN
∂ ∂ xN
χEρ (Bρ (0)) = h
∂ ∂ xN
χEρ (Bρ (0)) h
χF (Bρ (0)).
(10.32)
The lower semicontinuity of the total variation (Proposition 10.1.1) and (10.30), (10.32) finally yield ∂ |DχF |(Bρ (0)) ≤ χ (B (0)), ∂ xN F ρ hence equality. Let us consider the density νN of ∂ ∂x χF with respect to |DχF | (cf. the N Radon–Nikodým theorem, Theorem 4.2.1). From above one has |DχF |(Bρ (0)) =
∂ ∂ xN
χF (Bρ (0)) =
Bρ (0)
νN (x) d |DχF |(x).
∂ χ = |DχF | is a nonnegative ∂ xN F ∂ (10.28), ∂ x χF (Bρ (0)) ≥ C ρN −1 so that N
This shows that νN (x) = 1 for |DχF | a.e. x ∈ B1 (0); hence Radon measure. Note also that from (10.32) and ∂ ∂ xN
χF ≡ 0. On the other hand, from (10.31), (10.26), (10.27), and the definition of ν(0), for i = 1, . . . , N − 1, and for all but countably many 0 < ρ < 1, we have ∂ ∂ xi
χF (Bρ (0))
|DχF |(Bρ (0))
= lim
h→+∞
∂ ∂ xi
χEρ (Bρ (0)) h
|DχEρ |(Bρ (0)) h
= lim
h→+∞
∂ ∂ xi
χE (Bρh ρ (0))
|DχE |(Bρh ρ (0))
= 0.
(10.33)
i
i i
i
i
i
i
10.3. The coarea formula and the structure of BV functions
We deduce from (10.33) that density νi of
∂ ∂ xi
∂ ∂ xi
book 2014/8/6 page 419 i
419
χF (Bρ (0)) = 0 for i = 1, . . . , N − 1. Let us consider the
χF with respect to |DχF |. Since 0=
∂ ∂ xi
χF (Bρ (0)) =
Bρ (0)
νi (x) d |DχF |(x),
νi (x) = 0 for |DχF | a.e. x ∈ B1 (0), and hence ∂∂x χF = 0. i The function χF depends only on the variable xN , is nondecreasing, and, from (10.29), takes only the two values 0 and 1. Now set α = sup{xN : χF (xN ) = 0}. Note that α = +∞; otherwise DχF ≡ 0. For proving F = πν (0), it suffices to establish α = 0. Assuming α > 0 gives Bρ (0) ⊂ RN \ F for ρ < α so that, by (10.29), 0 = " N (Bρ (0) ∩ F ) = lim " N (Bρ (0) ∩ Eρh ) h→+∞
" N (Bρρh (0) ∩ E), = lim ρ−N h h→+∞
which contradicts (10.24). If α < 0, then Bρ (0) ⊂ F for ρ < −α and a similar argument gives " N (Bρρh (0) \ E), 0 = lim ρ−N h h→+∞
which contradicts (10.25). The proof of Lemma 10.3.1 is complete. We are going to establish ∂ r E ⊂ SχE ∩ Ω. Let x0 ∈ ∂ r E. We have " N (E ∩ Hρ,−ν (x0 )) " N (Hρ,−ν (x0 ))
=
" N (Eρ ∩ B1 (x0 ) ∩ π−ν (x0 )) " N (B1 (x0 ) ∩ π−ν (x0 ))
.
The convergence of χEρ to χπν (x0 ) in L1 (B1 (x0 )) yields lim
" N (E ∩ Hρ,−ν (x0 ))
ρ→0
" N (Hρ,−ν (x0 ))
=
" N (πν (x0 ) ∩ B1 (x0 ) ∩ π−ν (x0 )) " N (B1 (x0 ) ∩ π−ν (x0 ))
= 0.
Similarly, lim
ρ→0
" N (E ∩ Hρ,ν (x0 )) " (Hρ,ν (x0 )) N
=
" N (πν (x0 ) ∩ B1 (x0 ) ∩ πν (x0 )) " N (B1 (x0 ) ∩ πν (x0 ))
=1
and, according to Proposition 10.3.4, x0 ∈ SχE . Step 3. The proof of N −1 (∂M E ∩ Ω \ ∂ r E) = 0 consists first in proving N −1 (SχE ∩ Ω \ ∂ r E) = 0; next, N −1 (∂M E ∩ Ω \ SχE ∩ Ω) = 0. Keys of the proof are the relative isoperimetric inequality and Lemma 4.2.3. This is summarized in the next lemma. Lemma 10.3.2. Let E be a set of finite perimeter in Ω. Then the following assertions hold: (i) Relative isoperimetric inequality: there exists a positive constant C such that min{" N (B r ∩ E), " N (B r \ E)}
N −1 N
≤ C |DχE |(B r )
for all open ball B r with radius R included in Ω.
i
i i
i
i
i
i
book 2014/8/6 page 420 i
Chapter 10. Spaces BV and SBV
420
(ii) There exists a positive constant C such that for all x ∈ SχE ∩ Ω, lim inf
|DχE |(Bρ (x)) ρN −1
ρ→0
≥ C.
(iii) For N −1 almost all x ∈ Ω \ SχE ∩ Ω, lim sup
|DχE |(Bρ (x)) ρN −1
ρ→0
= 0.
(iv) N −1 (SχE ∩ Ω \ ∂ r E) = 0. (v) N −1 (∂M E ∩ Ω \ SχE ∩ Ω) = 0. PROOF OF LEMMA 10.3.2. Assertion (i) is a straightforward consequence of the Poincaré– Wirtinger inequality |u − u|
N N −1
N −1 N
≤C
Br
|D u|,
where u =
Br
1 " N (B r )
u(x) d x, Br
applied to u = χE . Indeed, every function u in BV (Ω) satisfies the Poincaré–Wirtinger inequality: consider a smooth approximating sequence in the sense of the intermediate convergence, and apply Corollary 5.4.1 and Theorem 10.1.2 as in the proof of Proposition 10.1.3. Let us prove (ii). Let x ∈ SχE ∩ Ω; according to Theorem 10.3.4, an easy computation yields " N (Bρ (x) ∩ E) 1 " N (Bρ (x) \ E) 1 ≥ ≥ lim inf and lim inf ρ→0 ρ→0 2 2 " N (Bρ (x)) " N (Bρ (x)) so that lim
ρ→0
" N (Bρ (x) ∩ E) " (Bρ (x)) N
= lim
ρ→0
" N (Bρ (x) \ E) " (Bρ (x)) N
1 = . 2
The conclusion of assertion (ii) then follows from (i). For proving (iii), it suffices to establish that for all δ > 0, the set |DχE |(Bρ (x)) Aδ = (Ω \ SχE ) ∩ x ∈ Ω : lim sup > δ ρN −1 ρ→0 satisfies N −1 (Aδ ) = 0. From Lemma 4.2.3 applied to the Borel measure μ = |DχE |, one has |DχE |(Aδ ) ≥ C δ N −1 (Aδ ) and the conclusion follows from |DχE |(Aδ ) = 0 which is a consequence of ∂ r E ⊂ SχE ∩ Ω established in Step 2. We establish assertion (iv). From assertion (ii) and Lemma 4.2.3 applied to the Borel measure μ = |DχE |, there exists a nonnegative constant C depending only on N , such that, for all Borel set B included in SχE , N −1 (B) ≤ C |DχE |(B).
(10.34)
i
i i
i
i
i
i
10.3. The coarea formula and the structure of BV functions
book 2014/8/6 page 421 i
421
Obviously, by definition |DχE |(Ω \ ∂ r E) = 0; thus |DχE |(SχE ∩ Ω \ ∂ r E) = 0, and the conclusion N −1 (SχE ∩ Ω \ ∂ r E) = 0 follows from (10.34). We finally establish N −1 (∂M E ∩ Ω \ SχE ∩ Ω) = 0. According to (iii), we are reduced to proving that for each x in ∂M E ∩ Ω, lim sup
|DχE |(Bρ (x))
ρ→0
ρN −1
> 0.
From the definition of the measure theoretical boundary (Proposition 10.3.1), there exists δ > 0 such that lim sup ρ→0
" N (Bρ (x) ∩ E)
>δ
" N (Bρ (x))
and
lim inf
" N (Bρ (x) ∩ E)
ρ→0
" N (Bρ (x))
< 1 − δ.
Consequently, choosing δ < 1/2, for ρ small enough, δ
t }. According to the first step and to Fatou’s lemma, we have
Ω
|D u| = lim
n→+∞
= lim
n→+∞
≥
+∞
|D un |
Ω +∞
Ω
−∞
|DχEn,t | d t
lim inf n→+∞
−∞
Ω
|DχEn,t | d t .
(10.37)
On the other hand,
Ω
|un − u| d x =
Ω
+∞
−∞
|χEn,t − χEt | d t d x =
+∞ −∞
Ω
|χEn,t − χEt | d x
dt,
which converges to zero. Thus, for a subsequence (not relabeled), and for almost all t in R, χEn,t → χEt strongly in L1 (Ω). (10.38) The lower semicontinuity of the total variation with respect to the strong convergence in L1 (Ω) (Proposition 10.1.1) and (10.37), (10.38) finally yield Ω
|D u| ≥
+∞
−∞
Ω
|DχEt | d t .
It is now easy to adapt the proof above for obtaining the coarea formula with f = χE for any Borel subset E of Ω. The general coarea formula with a Borel function f : Ω → R+ is then obtained by a classical density argument. We are now in a position to establish that N −1 almost all points of Ω are regular for function in BV (Ω) and that their jump set (cf. Theorem 10.3.1) is countably (N − 1)rectifiable. For each function u in BV (Ω) whose representative satisfies the convention of Remark 10.3.2, we set S u = { x ∈ Ω : u − (x) < u + (x) }, where u − (x) = ap lim infy→x u(y) and u + (x) = ap lim supy→x u(y). Theorem 10.3.4. Let u be a given function in BV (Ω). Then, for N −1 -almost all x in Ω, u − (x) and u + (x) are finite and S u is countably (N −1)-rectifiable. Moreover, S u is, up to a set of N −1 Ω measure zero, the jump set of u, and N −1 almost all x in Ω are regular for u. PROOF. We begin by proving that S u is countably (N − 1)-rectifiable. According to the coarea formula (Theorem 10.3.3), for almost all t ∈ R, E t = {x ∈ Ω : u(x) > t } := [u > t ] is a set of finite perimeter in Ω. Now let D be a dense countable subset of {t ∈ R : E t is of finite perimeter} and set S u,t := { x ∈ S u : u − (x) < t < u + (x) }. We have S u = S . On the other hand, from definitions of u − and u + , it is easy to establish that t ∈D u,t
i
i i
i
i
i
i
10.3. The coarea formula and the structure of BV functions
for all x ∈ S u,t ,
book 2014/8/6 page 425 i
425
⎧ " N (Bρ (x) ∩ [u > t ]) ⎪ ⎪ ⎪ ⎪ lim sup > 0, ⎪ ⎨ ρ→0 " N (B (x)) ρ
⎪ " N (Bρ (x) ∩ [u < t ]) ⎪ ⎪ ⎪ lim sup > 0, ⎪ ⎩ ρ→0 " N (B (x))
ρ
so that x ∈ ∂M E t . Thus S u ⊂ t ∈D ∂M E t and the conclusion follows from the structure theorem, Theorem 10.3.2. We admit that −∞ < u − (x) ≤ u + (x) < +∞ for N −1 almost all x in Ω. For a proof, consult Evans and Gariepy [211, Theorem 2]. For establishing that up to an N −1 Ωnegligible set, S u is the jump set of u and that N −1 almost all x in Ω are regular for u, according to Proposition 10.3.3, Definition 10.3.5, and Theorem 10.3.1, it is enough to establish that for N −1 almost all x in S u , there exists ν(x) in S N −1 such that u − (x) = ap limy→x, y∈π−ν(x) (x) u(y) and u + (x) = ap limy→x, y∈πν(x) (x) u(y). Let x ∈ S u such that u − (x) and u + (x) are finite and set t = u + (x) − with small enough so that u − (x) < t < u + (x). Thus x ∈ ∂M E t and, from Theorem 10.3.2, for N −1 -almost all such x in S u , there exists ν(x) in S N −1 such that " N Hρ,ν(x) (x) ∩ [u > u + (x) − ] = 1. (10.39) lim ρ→0 " N Hρ,ν(x) (x) On the other hand, according to the definition of the approximate limsup, " N Bρ (x) ∩ [u > u + (x) + ] = 0, lim ρ→0 " N Bρ (x) hence
" N Hρ,ν(x) (x) ∩ [u > u + (x) + ] lim = 0. ρ→0 " N Hρ,ν(x) (x)
(10.40)
Combining (10.39) and (10.40) we obtain " N Hρ,ν(x) (x) ∩ [|u − u + (x)| < ] lim = 1, ρ→0 " N Hρ,ν(x) (x) which proves that u + (x) = ap limy→x, y∈πν(x) (x) u(y). The proof of u − (x) = ap
lim
y→x, y∈π−ν(x) (x)
u(y)
is similar. Remark 10.3.4. In the proof of Theorem 10.3.4 we have established that S u possesses for N −1 -a.e. x in Ω, a normal unit vector ν u (x) and that N −1 S u \ {x ∈ RN : uνu (x) (x) = u−νu (x) (x)} = 0. Moreover, we have obtained that for almost every t ∈ R and for N −1 -a.e. x in ∂M E t ∩S u , ν u (x) = νEt , where νEt is the inner measure theoretic normal to E t at x.
i
i i
i
i
i
i
book 2014/8/6 page 426 i
Chapter 10. Spaces BV and SBV
426
Remark 10.3.5. According to Theorem 10.3.4 and Proposition 10.3.3, there exist two Borel sets E + and E − such that for H N −1 -almost all x in S u u + (x) =
lim
y→x, y∈E + ∩πν u (x)
u(y) and u − (x) =
lim
y→x, y∈E − ∩π−ν u (x)
u(y).
Remark 10.3.6. According to our convention (Remark 10.3.2) on the representative of L1 -functions, for N −1 a.e. x in Ω \ S u , u + (x) = u − (x) = u(x). In the following proposition, we give some information on u when x belongs to S u . Proposition 10.3.5. Let u ∈ BV (Ω). For N −1 -a.e. x in S u letting E t := [u > t ], one has (u − (x), u + (x)) ⊂ {t ∈ R : x ∈ ∂M E t ∩ Ω} ⊂ [u − (x), u + (x)], and for N −1 -a.e. x in ∂M E t ∩ Ω \ S u , u(x) = t . PROOF. For the first inclusion (u − (x), u + (x)) ⊂ {t ∈ R : x ∈ ∂M E t ∩Ω}, it suffices to note that t ∈ (u − (x), u + (x)) implies x ∈ S u,t ⊂ ∂M E t . We establish now the second inclusion {t ∈ R : x ∈ ∂M E t ∩ Ω} ⊂ [u − (x), u + (x)]. Let t ∈ R be such that x ∈ ∂M E t ∩ Ω, assume that t > u + (x), and take t0 such that t > t0 > u + (x) and lim
ρ→0
" N (Bρ (x) ∩ [u > t0 ]) " N (Bρ (x))
= 0.
Such a t0 exists from the definition of the approximate limsup. We have lim
ρ→0
" N (Bρ (x) ∩ [u > t ]) " N (Bρ (x))
= 0,
which is in contradiction with x ∈ ∂M E t . Assuming t < u − (x) yields the same contradiction. Since, up to a set of N −1 Ω measure zero, S u is the jump set of u, one has, for N −1 -a.e. x in ∂M E t ∩ Ω\ S u , u(x) = u + (x) = u − (x), so that u(x) = ap limy→x u(y). For such x, we establish that u(x) = t . Otherwise, assume that t > u(x) and set = t − u(x). According to the definition of the approximate limit at x, we have lim
" N (Bρ (x) ∩ [|u − u(x)| > ])
which yields lim
= 0,
" N (Bρ (x))
ρ→0
" N (Bρ (x) ∩ [u > + u(x)]) " N (Bρ (x))
ρ→0
that is, lim
ρ→0
" N (Bρ (x) ∩ [u > t ]) " N (Bρ (x))
= 0,
= 0,
which contradicts the hypothesis x ∈ ∂M E t . Using the same arguments, assumption u < t would give the same contradiction.
i
i i
i
i
i
i
10.4. Structure of the gradient of BV functions
book 2014/8/6 page 427 i
427
10.4 Structure of the gradient of BV functions Let u be a given function in BV (Ω) and D u = D a u + D s u the Lebesgue–Nikodým decomposition of the measure D u with respect to the N -dimensional Lebesgue measure " N Ω restricted to Ω. Let us recall that the measure D a u denotes the absolutely continuous part of D u with respect to the measure " N Ω and D s u its singular part. We will denote the density of D a u with respect to " N Ω by ∇u, so that D a u = ∇u " N Ω. The theorem below makes precise the structure of the singular part D s u. Theorem 10.4.1. Let us denote the two measures D s uS u and D s uΩ \ S u by J u and C u, respectively, called the jump part and the Cantor part of D u. Then J u is absolutely continuous with respect to the restriction of the (N − 1)-dimensional Hausdorff measure to S u . More precisely, J u = (u + − u − ) ν u N −1 S u . Moreover, J u and C u are mutually singular: for all Borel sets E of Ω N −1 (E) < +∞ =⇒ |C u|(E) = 0. Consequently, the Hausdorff dimension of the support spt(C u) of the measure C u satisfies N − 1 ≤ dimH (spt(C u)) < N . PROOF. According to the coarea formula, Theorem 10.3.3, and Theorem 10.3.2(iii), for all Borel set E ⊂ S u J u(E) = D u(E) = =
+∞ −∞
= E
+∞
−∞
E∩∂M Et +∞ −∞
DχEt (E) d t ν u (x) d N −1 (x) d t
χ{t ∈R : x∈∂M Et } d t ν u (x) d N −1 (x).
Since E ⊂ S u , according to Proposition 10.3.5, for N −1 a.e. x in E, one has " 1 ({t ∈ R : x ∈ ∂M E t }) = u + (x) − u − (x), thus J u = (u + − u − ) ν u N −1 S u . Now let E be any Borel set included in Ω\ S u , satisfying N −1 (E) < +∞. According to Theorems 10.3.3 and 10.3.2(iii), one has |C u|(E) = |D u|(E) =
+∞
−∞
N −1 (E ∩ ∂M E t ) d t .
(10.41)
With our convention (Remark 10.3.2), the points of Ω \ S u are all points of approximate continuity for u; thus, according to Proposition 10.3.5, one has E ∩ ∂M E t ⊂ {y ∈ E : u(y) = t }. Moreover, N −1 (E) < +∞, so that, from Lemma 4.2.1, the set of all t such that N −1 ({y ∈ E : u(y) = t }) > 0 is at most countable, and (10.41) yields |C u|(E) = 0.
i
i i
i
i
i
i
book 2014/8/6 page 428 i
Chapter 10. Spaces BV and SBV
428
Figure 10.3. Construction of the Cantor–Vitali function.
The proposition below states that all functions u in BV l oc (RN ) possess an approximate derivative for almost all x in RN in the following sense: there exists a linear function L : RN −→ R denoted by ap D u such that ap lim
|u(y) − u(x) − L(y − x)| |y − x|
y→x
= 0.
For a proof, see [213, Theorem 4.5.9] or [211, Theorem 4]. Proposition 10.4.1. Let u be a given function of BV l oc (RN ), i.e., u ∈ BV (U ) for all open bounded subset U of RN . Then for almost all x0 in RN , 1 | u(x) − u(x0 ) − ∇u(x0 ) . (x − x0 ) | lim d x = 0. ρ→0 ρd B (x ) | x − x0 | ρ 0 Consequently for almost all x0 in RN ap limx→x0 D u = ∇u(x0 ). Example 10.4.1. We construct a BV -function whose gradient is reduced to its Cantor part: ( the Cantor–Vitali function. Let Ω = (0, 1) and C be the classical triadic Cantor set C = n∈N Cn , where Cn is the union of 2n intervals of size 3−n . We define −n ⎧ 2 ⎪ ⎪ (x) := χCn , f ⎪ n ⎪ ⎨ 3 ⎪ ⎪ ⎪ ⎪ ⎩ un (x) :=
x 0
fn (t ) d t
(see Figure 10.3).
All the functions un belong to C([0, 1]) and if I is any of the 2n intervals of Cn , ⎧ ⎪ ⎨ f (t ) d t = f (t ) d t = 2−n , n n+1 I I ⎪ ⎩∀x ∈ (0, 1) \ C un (x) = un+1 (x). n Indeed
f (t ) I n
d t = ( 23 )n mes(I ) = 2−n . We deduce that for all x in Cn , |un (x) − un+1 (x)| ≤ 2−(n−1) ,
i
i i
i
i
i
i
10.5. The space SBV (Ω)
book 2014/8/6 page 429 i
429
so that un uniformly converges to a continuous function u. According to the lower semicontinuity of the total variation, we then obtain
(0,1)
|D u| ≤ lim inf n→+∞
(0,1)
|D un | d x = 1,
which proves that u belongs to BV (0, 1) and, since u is continuous, that J u = 0. Finally, since u is locally constant on (0, 1) \ C and " 1 (C ) = 0, one has ∇u = 0 and D u = C u. Moreover, the support of C u is the Cantor set C whose Hausdorff dimension is ln(2)/ln(3) ∼ 0.632 (see Example 4.1.1).
10.5 The space S BV (Ω) In some problems arising in image segmentation, or in mechanics in the study of cracks and fissures (see Chapters 12 and 14), the first distributional derivatives of the competing functions which operate in the models are often measures without singular Cantor part. The solutions of these problems may be found in a special space of functions of bounded variation.
10.5.1 Definition Definition 10.5.1. The special set of functions of bounded variation is the subset SBV (Ω) of BV (Ω) made up of all the functions of BV (Ω) whose gradient measures have no Cantor part in their Lebesgue decomposition, i.e., u ∈ SBV (Ω) ⇐⇒ u ∈ L1 (Ω) and D u = ∇u " N Ω + (u + − u − ) ν u N −1 S u . Remark 10.5.1. Arguing as in Remark 10.2.2, one may define the space SBV (Ω, R m ) as the space of all functions u : Ω → R m which belong to L1 (Ω, R m ) and whose distributional derivative D u is a M m×N -valued measure of the form D u = ∇u " N Ω + (u + − u − ) ⊗ ν u N −1 S u . Example 10.5.1. Let Ω be an open bounded subset of RN , K a closed subset of Ω such that N −1 (K) < +∞, and u ∈ W 1,1 (Ω\K)∩ L∞(Ω). We claim that u belongs to SBV (Ω) and that S u ⊂ K. We first assume that K is regular in the following sense: there exist a C1 hypersurface Σ such that K ⊂ Σ and two disjoint subsets Ω1 and Ω2 of Ω such that ∂ Ω1 ∩ ∂ Ω2 = Σ and " N (Ω \ Ω1 ∪ Ω2 ) = 0. Then the result is a straightforward consequence of the trace theory (see Subsection 10.2 and Example 10.2.1). It is worth pointing out that in this case, the hypothesis u ∈ L∞ (Ω) is unnecessary. We now consider the general case. If N = 1, the result follows from the previous argument. We assume N ≥ 2. Since N −1 (K) < +∞ and K is a compact set, for all n ∈ N∗ there exists a finite family of closed balls (Bρn (xi ))i ∈In , covering K, with ρi ≤ n1 i and such that cN −1 (2ρi )N −1 ≤ N −1 (K) + 1. i ∈In
i
i i
i
i
i
i
book 2014/8/6 page 430 i
Chapter 10. Spaces BV and SBV
430
Therefore
i ∈In
N −1 (∂ Bρn (xi )) ≤ C ( N −1 (K) + 1), i
where C is a positive constant depending only on the dimension N . We now consider the following functions un : ⎧ 3 ⎨ u(x) for x ∈ Ω \ Bρn (xi ), i un (x) = i ∈In ⎩ 0 elsewhere. Since " N (∪i ∈In Bρn (xi )) tends to zero, un → u strongly in L1 (Ω) when n → +∞. Since i
moreover ∂ Bρn (xi )) is a finite union of C1 hypersurfaces, reasoning on a neighborhood i of each ∂ Bρn (xi )), from the trace theory and the estimate |un+ (x)| ≤ ||u||L∞ (Ω) (note that i according to Remark 10.3.4, u + is a classical limit in a Borel set E + of Ω), one has |D un | ≤ |∇u| d x + C ||u||L∞ (Ω) ( N −1 (K) + 1). Ω
Ω\K
The semicontinuity of the total variation (Proposition 10.1.1) yields u ∈ BV (Ω). On the other hand, it is easily seen that S u ⊂ K. Since N −1 (K \ S u ) ≤ N −1 (K) < +∞, according to Theorem 10.4.1, C u(K \ S u ) = 0. Finally C u = 0 because C u(S u ) = 0.
10.5.2 Properties The following chain rule for the derivatives in BV (Ω) was established in Ambrosio [24]. Proposition 10.5.1. Let u be a given function in BV (Ω) and ϕ in C10 (R). Then v := ϕ ◦ u belongs to BV (Ω) and, even if it means changing ν u by −ν u , J v = ϕ(u + ) − ϕ(u − ) ν u N −1 S u , ∇v = ϕ (u)∇u, C v = ϕ (u)C u. PROOF. Consider un ∈ C∞ (Ω) ∩ BV (Ω) converging to u for the intermediate convergence. Then vn := ϕ ◦ un → v := ϕ ◦ u in L1 (Ω). On the other hand, |Dv|(Ω) ≤ lim inf |Dvn |(Ω) n→+∞
≤ ||ϕ ||∞ lim inf |D un |(Ω) n→+∞
= ||ϕ ||∞ |D u|(Ω) < +∞. This proves that v ∈ BV (Ω). We now show that J v = ϕ(u + ) − ϕ(u − ) ν u N −1 S u . Since ϕ ∈ C10 (R), ϕ is the difference of two nondecreasing functions in C1 (R). One may then assume ϕ nondecreasing so that v + = ϕ ◦ u + , v − = ϕ ◦ u − , Sv = S u , and νv = ν u . It remains to establish ∇v = ϕ (u)∇u and C v = ϕ (u)C u or, equivalently, DvΩ\Sv = ϕ (u)D uΩ\Sv . Consider a Borel set E of Ω included in Ω \ Sv . From the coarea formula (Theorem 10.3.3) and the structure of simple functions of BV (Ω) (Theorem 10.3.2(iii)), one has
i
i i
i
i
i
i
10.5. The space SBV (Ω)
431
Dv(E) =
book 2014/8/6 page 431 i
+∞ −∞
Dχ[v>t ] (E) d t = = =
+∞
−∞ +∞
Dχ[u>ϕ−1 (t )] (E) d t Dχ[u>t ] (E) ϕ (t ) d t
−∞ +∞
=
−∞ +∞ −∞
E
Dχ[u>t ] ϕ (t ) d t
N −1 (∂M ([u > t ]) ∩ E) ϕ (t ) d t .
(10.42)
But, according to Proposition 10.3.5, for N −1 a.e. x in Ω, x ∈ ∂M ([u > t ]) ∩ E =⇒ u(x) = t so that (10.42) yields +∞ N −1 ϕ (u(x)) d ∂M ([u > t ])(x) d t Dv(E) = E −∞ +∞
=
−∞
E
ϕ (u)Dχ[u>t ] d t
= ϕ (u) D u(E), which completes the proof. The following criterion for a function u in BV (Ω) to belong to SBV (Ω) was established by Ambrosio in [17]. Theorem 10.5.1. Let u be a given function in BV (Ω). Then u belongs to SBV (Ω) iff there exists a Borel measure μ in M(Ω×R, RN ) and a in L1 (Ω, RN ) such that for all Φ in C1c (Ω, RN ) and all ϕ in C10 (R), ! " ϕ (u) a . Φ(x) + ϕ(u) div Φ(x) d x. ϕ(s)Φ(x) μ(d x, d s) = − (10.43) Ω×R
Ω
Moreover, a = ∇u a.e. and (ν N −1 S u ) − Λ− (ν N −1 S u ), μ = Λ+ # u # u |μ|(Ω × R) = 2 N −1 (S u ),
where Λ+ : Ω −→ Ω × R, x → (x, u + (x)) and Λ− : Ω −→ Ω × R, x → (x, u − (x)). PROOF. Let us assume that u belongs to SBV (Ω). According to Proposition 17.2.5, for all ϕ ∈ C10 (R), ϕ(u) belongs to SBV (Ω) and D(ϕ(u)) = ϕ (u)∇u" Ω + ϕ(u + ) − ϕ(u − ) ν u N −1 S u . Consequently, for all Φ ∈ C1c (Ω, RN ) ϕ(u) div Φ d x = − 〈D(ϕ(u)), Φ〉 Ω = − ϕ (u)∇u.Φ d x − ϕ(u + ) − ϕ(u − ) Φ.ν u d N −1 S u , Ω
Ω
which we write ! " + − N −1 ϕ (u) ∇u . Φ + ϕ(u) div Φ d x. S u = − ϕ(u ) − ϕ(u ) Φ.ν u d Ω
Ω
i
i i
i
i
i
i
book 2014/8/6 page 432 i
Chapter 10. Spaces BV and SBV
432
According to the definition of the image of a measure, the left-hand side above is nothing but ϕ(s)Φ(x) d μ(x, s). Ω×R
Finally, since Λ+ (Ω) and Λ− (Ω) are disjoint sets, and by injectivity of Λ+ and Λ− , one has |μ| = |Λ+ (ν N −1 S u ) − Λ− (ν N −1 S u )| # u # u = |Λ+ (ν N −1 S u )| + |Λ− (ν N −1 S u )| # u # u = Λ+ ( N −1 S u ) + Λ− ( N −1 S u ). # # Hence |μ|(Ω × R) = 2 N −1 (S u ). The proof of the converse condition proceeds in three steps. First step. We establish a(x0 ) = ∇u(x0 ) for a.e. x0 in Ω. Let us fix x0 ∈ Ω such that ⎧ ⎪ aρ (y) := a(x0 + ρy) → a(x0 ) strongly in L1 (B), ⎪ ⎪ ⎨ u → ∇u(x ).y strongly in L1 (B), ρ 0 1 ⎪ ⎪ ⎪ lim N −1 |μ|(Bρ (x0 ) × R) = 0, ⎩ρ−→0 ρ where uρ denotes the rescaled function uρ (y) :=
1 ρ
u(x0 + ρy) − u(x0 ) ,
and Bρ (x0 ), B, respectively, the open ball in RN with radius ρ, centered at x0 , and the unit open ball in RN centered at 0. Let us justify the possible choice of such x0 . Actually x0 is chosen to be a Lebesgue point of y → aρ (y) := a(x0 + ρy). On the other hand, according to Proposition 10.4.1, the second property is satisfied a.e. in Ω. Finally, denoting by π the projection from Ω × R onto Ω and by π#|μ| the image of the measure |μ| by π, the limit lim
ρ−→0
1 ρ
N
|μ|(Bρ (x0 ) × R) = lim
ρ−→0
1 ρN
π# |μ|(Bρ (x0 ))
exists for almost every x0 in Ω and, up to a positive multiplicative constant, is equal to the density of the regular part of the measure π# |μ| in its Lebesgue–Nikodým decomposition. In what follows, x0 is a fixed element in Ω where these three properties are satisfied. ˜ ∈ ˜ x−x0 ), where Φ Applying condition (10.43) to the function Φ defined by Φ(x) := Φ( ρ C1c (B, RN ), one obtains −
1
ρ
Bρ (x0 )
˜ div Φ
x − x0 ρ
ϕ(u) d x −
and the change of scale y = −
! B
x−x0 ρ
Bρ (x0 )
˜ Φ
x − x0 ρ
ϕ (u).a d x =
Ω×R
ϕ(s)Φ(x) d μ,
gives
" ˜ ˜ ϕ(u(x + ρy)) + ρϕ (u(x + ρy))Φ(y).a div Φ 0 0 ρ (y) d y =
1 ρN −1
Ω×R
ϕ(s)Φ(x) d μ.
i
i i
i
i
i
i
10.5. The space SBV (Ω)
433
Testing this equality with the function ϕ defined by ϕ(s) := γ ( −
! B
" ˜ γ (u ) + γ (u )Φ(y).a ˜ div Φ (y) dy = ρ ρ ρ
and letting ρ → 0, B
Since ! B
book 2014/8/6 page 433 i
s −u(x0 ) ), ρ
one obtains
1 ρN −1
Ω×R
ϕ(s)Φ(x) d μ
˜ γ (∇u(x ).y) + γ (∇u(x ).y)Φ(y).a(x ˜ div Φ 0 0 0 ) d y = 0.
" ˜ γ (∇u(x ).y) + γ (∇u(x ).y)Φ(y).∇u(x ˜ ˜ dy div Φ ) d y = div γ (∇u(x0 ).y)Φ 0 0 0 B
= 0,
we deduce
a(x0 ) − ∇u(x0 ) .
B
˜ γ (∇u(x0 ).y)Φ(y) d y = 0.
˜ and γ being arbitrary, we deduce a(x ) = ∇u(x ) and the proof of the The choice of Φ 0 0 first step is complete. Second step. We establish C u = 0. From Proposition 17.2.5, for all ϕ in C10 (R), ϕ ◦ u ∈ BV (Ω) so that for all Φ in C1c (Ω, RN ), one has ! Ω
"
div Φ ϕ(u)+ϕ (u)Φ.∇u d x = − Su
ϕ(u )−ϕ(u ) Φ.ν u d N −1 − +
−
Ω
ϕ (u)Φ d C u,
and condition (10.43) yields ϕ (u)Φ d C u = ϕ(s)Φ(x) μ(d x, d s) − ϕ(u + ) −ϕ(u − ) Φ.ν u d N −1 . (10.44) Ω
Ω×R
Su
We now focus on a careful analysis of the measure ϕ (u) C u. Let us apply the slicing Theorem 4.2.4 for the measure μ. Let τi denote the density of μi with respect to |μi |, where (μi )1≤i ≤N is the family of components of the measure μ. According to Theorem 4.2.4 one has ϕ(s)Φi (x) μi (d x, d s) = Φi (x) ϕ(s)τi (x, s) θ x (d s) σi (d x), Ω×R
Ω
R
where σi is the image of the measure |μi | by the projection of Ω × R on Ω and (θ x ) x∈R is a family of probability measures on R. Then (10.44) yields, for i = 1, . . . , N , ϕ(s)τi (x, s) θ x (d s) σi − ϕ(u + ) − ϕ(u − ) ν u,i N −1 S u , (10.45) ϕ (u)Ci u = R
and finally, since C u and N −1 S u are mutually singular, ϕ (u)Ci u = ϕ(s)τi (., s) θ. (d s) λi
(10.46)
R
i
i i
i
i
i
i
book 2014/8/6 page 434 i
Chapter 10. Spaces BV and SBV
434
with λi := σi Ω\Su . To complete the proof, the idea is to express (10.46) in terms of functional identity. Consider the densities b and c of, respectively, Ci u and λi with respect to the measure α := |Ci u| + λi , equality (10.46) yields, for α a.e. x in Ω, b (x)(ϕ ◦ u)(x) = c(x) ϕ(s)τi (x, s) θ x (d s). R
Indeed, there exists a Borel set Nϕ with " N (Nϕ ) = 0 such that above equality holds true for all x in Ω \ Nϕ . Since C10 (Ω) possesses a dense countable subset D, it also holds for all x in Ω = Ω \ ϕ∈D Nϕ . Let then x in Ω and assume that b (x) = 0. We deduce ϕ ◦ u(x) =
c(x)
b (x)
R
ϕ(s)τi (x, s) θ x (d s)
∀ϕ ∈ C10 (R).
Equality between the two linear forms
ϕ → (ϕ ◦ u)(x) and ϕ →
c(x) b (x)
R
ϕ(s)τi (x, s) θ x (d s)
brings a contradiction. (The first is not continuous in C10 (R).) Consequently, b (x) = 0 and Ci u = b α = 0, which ends the proof of the second step. Last step. It remains to establish μ = Λ+ (ν N −1 S u )−Λ− (ν N −1 S u ). According # u # u to the previous step, equality (10.45) now becomes ϕ(s)τi (x, s) θ x (d s) σi = ϕ(u + ) − ϕ(u − ) ν u,i N −1 S u . R
Let β := σi + N −1 S u and let b , c denote now the densities of N −1 S u and σi with respect to the measure β. We have for all x ∈ Ω \ N satisfying β(N ) = 0, and for all ϕ ∈ C10 (R), c(x) ϕ(s)τi (x, s) θ x (d s) = b (x) ϕ(u + ) − ϕ(u − ) ν u,i . R
Thus, for all x in Ω \ N c(x)τi (x, .) θ x = b (x)ν u,i (x) δ u + (x) − δ u − (x) . For all bounded Borel function f : Ω × R −→ R, we now obtain f (x, s) d μi (x, s) = f (x, s)τi (x, s) θ x (d s) σi (d x) Ω×R
Ω
R
Ω
R
= = Su
f (x, s)τi (x, s) θ x (d s) c(x)β(d x)
f (x, u + (x)) − f (x, u − (x)) ν u,i (x) d N −1 (x)
and the proof is complete. We now state, without proof, another criterion for a function u in L∞ (Ω) to belong to SBV (Ω). This criterion concerns the restrictions of u to the one-dimensional slices of
i
i i
i
i
i
i
10.5. The space SBV (Ω)
book 2014/8/6 page 435 i
435
Ω. Let us define for all ν ∈ S N −1 ⎧ ⎨πν = {x ∈ RN : x.ν = 0}, Ω = {t ∈ R : x + t ν ∈ Ω}, ⎩Ω x = {x ∈ π : Ω = }. ν ν x
x ∈ πν ,
On the other hand, for all Borel functions u : Ω → R and x in Ων , we define the Borel function u x for all t in Ω x by : u x (t ) = u(x + t ν). For a proof of the theorem below, consult Braides [122]. Theorem 10.5.2. Let u be a given function in L∞ (Ω) such that for all ν ∈ S N −1 (i) u x ∈ SBV (Ω x ) for N −1 a.e. x ∈ Ων ; ! " (ii) |∇u x | d t + 0 (S ux ) N −1 (d x) < +∞. Ων
Ωx
Then u belongs to SBV (Ω). Conversely, if u belongs to SBV (Ω) ∩ L∞ (Ω), conditions (i) and (ii) are satisfied for all ν in S N −1 . Moreover, for N −1 a.e. x in Ων , ∇u(x + t ν).ν = ∇u x (t ) and
Ων
H0 (S ux )
N −1
(d x) = Su
|ν u · ν| d N −1 .
i
i i
i
i
i
i
book 2014/8/6 page 437 i
Chapter 11
Relaxation in Sobolev, BV , and Young measures spaces
11.1 Relaxation in abstract metrizable spaces This section is devoted to the description of the relaxation principle in a general metrizable space, or more generally, in a first countable topological space X . Roughly speaking, given an extended real-valued function F : X −→ R ∪ {+∞}, we wish to apply the direct method in the calculus of variations to the lower semicontinuous envelope cl(F ) of the function F so that infX F = minX cl(F ). Such a procedure is very important in various applications and leads to the concept of generalized solutions for the optimization problem infX F . We begin by giving some complements on the sequential version, denoted by F , of the general notion of lower semicontinuous envelope cl(F ) introduced in Chapter 3. Proposition 11.1.1. Let F : X −→ R ∪ {+∞} be a proper extended real-valued function defined on a metrizable space (X , d ) or, more generally, on a first countable topological space, and let us define the extended real-valued function F : X −→ R by 4 5 F (x) := inf lim inf F (xn ) : (xn )n∈N , x = lim xn . n→+∞
n→+∞
(11.1)
Then the function F is characterized for every x in X by the two following assertions: (i) ∀(xn )n∈N such that xn → x, F (x) ≤ lim infn→+∞ F (xn ); (ii) there exists a sequence (yn )n∈N in X such that yn → x and F (x) ≥ lim supn→+∞ F (yn ). PROOF. Note that trivially the system of assertions (i) and (ii) is equivalent to (i) and (ii ): (i) ∀(xn )n∈N such that xn → x, F (x) ≤ lim infn→+∞ F (xn ); (ii ) there exists a sequence (yn )n∈N in X such that yn → x and F (x) = limn→+∞ F (yn ); and each function F satisfying (i) and (ii ) automatically satisfies 4 5 F (x) := inf lim inf F (xn ) : (xn )n∈N , x = lim xn . n→+∞
n→+∞
437
i
i i
i
i
i
i
438
book 2014/8/6 page 438 i
Chapter 11. Relaxation in Sobolev, BV , and Young measures spaces
We are reduced to establishing that the function F defined by formula (11.1) satisfies (i) and (ii ). We only establish the nontrivial assertion (ii ). Its proof is based on the following diagonalization lemma. Lemma 11.1.1. Let (a m,n )(m,n)∈N×N be a sequence in a first countable topological space X such that (i) limn→+∞ a m,n = a m ; (ii) lim m→+∞ a m = a. Then there exists a nondecreasing map n → m(n) from N into N such that lim a n→+∞ m(n),n
= a.
For a proof and other diagonalization results, see [37]. Note that under the same conditions, we have the following more classical diagonalization procedure: there exists an increasing map m → n(m) such that limm→+∞ a m,m(n) = a. This second result could be applied for proving Proposition 11.1.1 but we prefer using Lemma 11.1.1, which will turn out to be the fitting tool for establishing a similar proposition in the context of the Γ -convergence (see the next chapter). Let us go back to the proof of Proposition 11.1.1. By definition of the infima, for all m ∈ N∗ there exists a sequence (x m,n )n∈N in X satisfying limn→+∞ x m,n = x and such that 1 if F (x) = −∞, F (x) ≤ lim inf F (x m,n ) ≤ F (x) + , n→+∞ m if F (x) = −∞, lim inf F (x m,n ) ≤ −m. n→+∞
Therefore lim m→+∞ limn→+∞ F (x m,σm (n) ) = F (x), where σ m : N −→ N is an increasing map, possibly depending on m, such that lim inf F (x m,n ) = lim F (x m,σm (n) ). n→+∞
n→+∞
We end the proof by applying Lemma 11.1.1 to the sequence (x m,σm (n) , F (x m,σm (n) ))(m,n)∈N2 in the metrizable space X × R: there exists n → m(n) mapping N into N such that ⎧ ⎨ lim F (x m(n),σm(n) (n) ) = lim lim F (x m,σm (n) ) = F (x), n→+∞
m→+∞ n→+∞
⎩ lim x m(n),σ m(n) (n) = lim n→+∞
lim x m→+∞ n→+∞ m,σ m (n)
= x.
The sequence (yn )n∈N defined by yn = x m(n),σm(n) (n) fulfills assertion (ii ). Theorem 11.1.1. The function F defined in Proposition 11.1.1 is the lower semicontinuous (lsc) envelope cl(F ) of the function F , i.e., the greatest lsc function less than F . PROOF. We must establish ⎧ ⎨ F ≤ F; F lsc; ⎩ G : X −→ R, G lsc, and G ≤ F =⇒ G ≤ F . For the first assertion, take the constant sequence (xn )n∈N = (x)n∈N in formula (11.1).
i
i i
i
i
i
i
11.1. Relaxation in abstract metrizable spaces
book 2014/8/6 page 439 i
439
Let us prove the second assertion. Let (y m ) m∈N be a sequence in X converging to y ∈ X and consider a subsequence (yσ(m) ) m∈N satisfying lim m→+∞ F (yσ(m) ) = lim infm→+∞ F (y m ). According to Proposition 11.1.1, there exists a sequence (yσ(m),n )n∈N in X satisfying lim y n→+∞ σ(m),n
= yσ(m)
and such that lim inf F (y m ) = lim F (yσ(m) ) m→+∞
m→+∞
= lim
lim F (yσ(m),n ).
m→+∞ n→+∞
On the other hand, we have limm→+∞ limn→+∞ yσ(m),n = y. Applying the diagonalization lemma, Lemma 11.1.1, to the sequence (yσ(m),n , F (yσ(m),n ))(m,n)∈N2 in the metrizable space X × R, there exists n → m(n) mapping N to N such that ⎧ ⎨ lim F (yσ(m(n)),n ) = lim inf F (y m ), n→+∞
m→+∞
⎩ lim yσ(m(n)),n = y. n→+∞
Hence lim inf F (y m ) = lim F (yσ(m(n)),n ) m→+∞
n→+∞
≥ lim inf F (yσ(m(n)),n ) n→+∞ 4 5 ≥ inf lim inf F (xn ) : (xn )n∈N , y = lim xn = F (y). n→+∞
n→+∞
We establish now the third assertion. Let G ≤ F be a lsc function mapping X into R and, for every x in X , consider any sequence (xn )n∈N in X converging to x. We have G(x) ≤ lim inf G(xn ) n→+∞
≤ lim inf F (xn ). n→+∞
Taking the infimum over all the sequences (xn )n∈N converging to x, we finally obtain the inequality G(x) ≤ F (x). From now on, we write indifferently F or cl(F ) when X is metrizable or first countable. But the two notions differ when X is a general topological space. In the following theorem, we state the abstract relaxation principle in countable topological spaces. Theorem 11.1.2. Let F : X −→ R ∪ {+∞} be a proper extended real-valued function defined on a metrizable space (X , d ) or, more generally, on a first countable topological space, and assume that 4 there exists5a minimizing sequence (xn )n∈N (i.e., limn→+∞ F (xn ) = infX F ) such that S = xn : n ∈ N is relatively compact in X . Then (i) infX F = minX cl(F ); (ii) every cluster point x of S is a solution of minX cl(F ), i.e., cl(F )(x) = minX cl(F ).
i
i i
i
i
i
i
440
book 2014/8/6 page 440 i
Chapter 11. Relaxation in Sobolev, BV , and Young measures spaces
PROOF. Let x be any cluster point of S and (xσ(n) )n∈N a subsequence of (xn )n∈N converging to x. According to Proposition 11.1.1(i), we have cl(F )(x) ≤ lim inf F (xσ(n) ) = inf F . n→+∞
X
(11.2)
Let now x be any element of X . From Proposition 11.1.1(ii), there exists a sequence (yn )n∈N converging to x and satisfying cl(F )(x) ≥ lim sup F (yn ).
(11.3)
n→+∞
Combining (11.2) and (11.3), we obtain cl(F )(x) ≤ lim inf F (xσ(n) ) = inf F ≤ lim sup F (yn ) ≤ cl(F )(x). n→+∞
X
(11.4)
n→+∞
This proves that cl(F )(x) = min cl(F ). Taking now x = x in (11.4), we obtain X
inf F = min cl(F ) X
X
and the proof is complete. In the terminology of the relaxation theory, the problem (+ ) :
min cl(F )
is called the relaxed problem of the optimization problem (+ ) :
inf F . X
A solution of (+ ) is sometimes called a generalized solution of the initial problem (+ ). The relaxation procedure consists in making explicit the lsc envelope of the functional F for a suitable topology on the space X , in order to obtain a well-posed problem (+ ) in the sense of Theorem 11.1.2 (i.e., the existence of an optimal solution holds for (+ )).
11.2 Relaxation of integral functionals with domain W 1,p (Ω, Rm ), p>1 One of the fundamentals hypotheses in elasticity theory is that the total free energy F associated with many materials is of local nature. From a mathematical point of view, the functional F can be represented as the integral over the reference configuration Ω ⊂ RN (N = 3), of a density associated with the possible deformation gradients of the body, which account for the local deformations. The other basic principle is that equilibrium configurations correspond to minimizers of F under prescribed conditions in a Sobolev space W 1, p (Ω, R m ) (m = 3). The functional F may fail to be lower semicontinuous. Indeed, in order to model the various solid/solid phase transformations in the microstructure, the density energy possesses in general a multiwell structure, and corresponding optimization problems have no solutions. According to Section 11.1, a classical procedure is to replace F by its lower semicontinuous envelope with respect to the weak topology of W 1, p (Ω, R m ). The relaxed problem possesses now at least a solution giving the same initial energy.
i
i i
i
i
i
i
11.2. Relaxation of integral functionals with domain W 1, p (Ω, R m ), p > 1
book 2014/8/6 page 441 i
441
The case p = 1 will be treated in the next section. We will see that contrary to the case p > 1, the domain of the lower semicontinuous envelope cl(F ) of the functional F strictly contains the domain W 1,1 (Ω, R m ) of F . This domain is indeed the space BV (Ω, R m ) of functions of bounded variation introduced in Chapter 10. This phenomena is due to the lack of reflexivity of the Sobolev space W 1,1 (Ω, R m ). This approach, which consists in relaxing the functional F with respect to the weak topology of W 1, p (Ω, R m ) (or to the strong topology of L p (Ω, R m )), is not the only one. Another idea consists in“enlarging” the space W 1, p (Ω, R m ) of admissible functions and treating the problem in the space 6 (Ω; M m×N ) of Young measures introduced in Section 4.3. Actually this is the same general procedure: the integral functionals are considered as living on the space X = 6 (Ω; M m×N ) equipped with a metrizable topology for which compactness of minimizing sequences also holds. One computes the lsc envelope of F relative to this new space. The two relaxed problems are different but there are some important connections between them. This procedure will be described in detail in Section 11.4. Let us now make precise the structure of the functional F . We consider a bounded open subset Ω of RN sufficiently regular in order that the trace theory, the Rellich– Kondrakov theorem, Theorem 5.4.2, and density arguments apply (take, for example, Ω of class C1 ). We denote the space of m × N matrices with entries in R by M m×N and consider a function f : M m×N −→ R such that there exist three positive constants α, β, L satisfying ∀a ∈ M m×N α|a| p ≤ f (a) ≤ β(1 + |a| p ) ∀a, b ∈ M m×N | f (a) − f (b )| ≤ L|b − a|(1 + |a| p−1 + |b | p−1 ).
(11.5) (11.6)
Let W 1, p (Ω, R m ) be the space (isomorphic to W 1, p (Ω) m ; see Chapter 5), made up of all ∂u functions u : Ω −→ R m whose distributional gradient ∇u = ( ∂ x i )i =1...m, j =1...N belongs j
to L p (Ω, M m×N ). We define the functional F : L p (Ω, R m ) −→ R+ ∪ {+∞} by ⎧ ⎨ f (∇u) d x if u ∈ W 1, p (Ω, R m ), F (u) = Ω ⎩ +∞ otherwise,
(11.7)
and we intend to compute its lsc envelope in the space X = L p (Ω, R m ) equipped with its strong topology. Actually, by classical arguments (the compactness of the embedding of W 1, p (Ω, R m ) into L p (Ω, R m ) and the lower bound in (11.5)), one can easily establish that the lsc envelope of F considered as living on W 1, p (Ω, R m ) equipped with its weak topology coincides with the restriction to W 1, p (Ω, R m ) of the lsc envelope of F considered here. Proposition 11.2.1 below states that the domain of F is not relaxed. Proposition 11.2.1. The domain of the functional cl(F ) is the space W 1, p (Ω, R m ). PROOF. From the inequality cl(F ) ≤ F , we obviously obtain W 1, p (Ω, R m ) ⊂ dom(cl(F )). For the converse inclusion, let u ∈ L p (Ω, R m ) such that cl(F )(u) < +∞ and consider a sequence (un )n∈N strongly converging to u in L p (Ω, R m ) and satisfying cl(F )(u) = limn→+∞ F (un ). Such a sequence exists from Proposition 11.1.1. According to the lower bound (11.5) and to the equality cl(F )(u) = limn→+∞ F (un ) < +∞ one obtains sup |∇un | p d x < +∞, n∈N
Ω
i
i i
i
i
i
i
442
book 2014/8/6 page 442 i
Chapter 11. Relaxation in Sobolev, BV , and Young measures spaces
so that (un )n∈N is bounded in W 1, p (Ω, R m ). According to the Rellich–Kondrakov theorem, Theorem 5.4.2, there exists a subsequence (not relabeled) and v ∈ W 1, p (Ω, R m ) such that un v weakly in W 1, p (Ω, R m ), un → v strongly in L p (Ω, R m ). Consequently u = v ∈ W 1, p (Ω, R m ). In Theorem 11.2.1, we establish that the functional cl(F ) possesses an integral representation. In the following proposition, also valid for p = 1, we characterize the density of this integral functional. For every bounded Borel set A of RN , we will sometimes denote its N -dimensional Lebesgue measure by |A| rather than " N (A). Proposition 11.2.2 (quasi-convex envelope of f ). Let us consider a function f : M m×N −→ R+ satisfying for p ≥ 1 and all a ∈ M m×N the upper growth condition 0 ≤ f (a) ≤ β(1+|a| p ) and the continuity assumption (11.6). Then for each fixed a in M m×N , the map 1 1, p m D → ID := inf f (a + ∇φ(x)) d x : φ ∈ W0 (D, R ) |D| D is constant on the family of all open bounded subsets of RN whose boundary satisfies |∂ D| = 0; we denote it by Q f (a). If f satisfies (11.5), the function Q f : M m×N → R+ , defined for all a ∈ M m×N by 1 1, p Q f (a) = inf f (a + ∇φ(x)) d x : φ ∈ W0 (D, R m ) , |D| D satisfies the same condition (11.5) and (11.6) with a new constant L depending only on α, β, and p. Moreover, Q f is W 1, p -quasi-convex in the sense of Morrey (quasi-convex for short), namely, it satisfies the so-called quasi-convexity inequality: for all open bounded subset D of RN with |∂ D| = 0, 1 1, p m×N m ∀a ∈ M , ∀φ ∈ W0 (D, R ) Q f (a) ≤ Q f (a + ∇φ(x)) d x. (11.8) |D| D Furthermore, the function Q f is the greatest quasi-convex function less than or equal to f , also called the quasi-convexification or quasi-convex envelope of f . PROOF. (a) Let D and D be two open bounded subsets of RN with |∂ D| = |∂ D | = 0. For proving the first assertion, it suffices to establish ID ≤ ID and to invert the roles of D and D . For every > 0 there exists a finite family (xi + i D )i ∈I of pairwise disjoint sets xi + i D ⊂ D, i > 0, satisfying 3 D \ (x + D ) < . (11.9) i i i ∈I First, we claim that the map I verifies the following subadditivity property: if A and B are two disjoint bounded open subsets of RN , then |A ∪ B|IA∪B ≤ |A|IA + |B|IB .
(11.10)
i
i i
i
i
i
i
11.2. Relaxation of integral functionals with domain W 1, p (Ω, R m ), p > 1
book 2014/8/6 page 443 i
443
Indeed, let φA and φB be two η-minimizers of |A|IA and |B|IB in (A, R m ) and (B, R m ), respectively, extended by 0 on RN \ A and RN \ B. We have |A|IA ≥ f (a + ∇φA) d x − η, A
|B|IB ≥
B
f (a + ∇φB ) d x − η,
Such η-minimizers exist thanks to (11.6) and by a density argument. The function φ 1, p which coincides with φA and φB on A and B, respectively, belongs to W0 (A ∪ B, R m ) so that |A ∪ B|IA∪B ≤ f (a + ∇φ) d x A∪B f (a + ∇φA) d x + f (a + ∇φB ) d x = A
B
≤ |A|IA + |B|IB + 2η. The thesis is obtained after making η → 0. Using quite similar arguments, one also obtains |A|IA ≤ |A \ B|IA\B + |B|IB
(11.11)
whenever A and B are two open bounded subsets of RN with B ⊂A and |∂ B| = 0. Applying, respectively, (11.10) and (11.11) to the finite union i ∈I (xi + i D ) and to A = D, B = i ∈I (xi + i D ), according to (11.9) and to the growth condition (11.5), we obtain 3 |i D |I xi +i D , (11.12) (xi + i D )I (xi +i D ) ≤ i∈I i ∈I i ∈I 3 p |D|ID ≤ β(1 + |a| ) + (xi + i D )I . (11.13) i∈I (xi +i D ) i ∈I
A change of scale easily gives I xi +i D = ID . Combining now (11.12) and (11.13), one obtains β(1 + |a| p ) + ID . ID ≤ |D| Since is arbitrary we have indeed established ID ≤ ID . It is worth noticing that the above subadditivity argument is a particular case of a more general result related to subadditive ergodic processes (see Krengel [262], Dal Maso and Modica [185], or Licht and Michaille [274]). Such a general argument will be used in Chapter 12. (b) We assume that f satisfies (11.5) and show that Q f satisfies the same conditions. Taking D = Y = (0, 1)N in the definition of Q f , from (11.5), we have 1, p p m Q f (a) ≥ α inf |a + ∇φ| d x : φ ∈ W0 (Y, R ) Y
p 1, p m ≥ α inf (a + ∇φ) d x : φ ∈ W0 (Y, R ) Y = α|a| p .
i
i i
i
i
i
i
book 2014/8/6 page 444 i
Chapter 11. Relaxation in Sobolev, BV , and Young measures spaces
444
We have used Jensen’s inequality satisfied by the convex function a → |a| p in the second inequality above. The upper bound of (11.5) is trivially obtained by taking φ = 0 as an admissible function in the expression of the infima and by using the upper bound condition satisfied by f . 1, p
Let us now establish (11.6). Given arbitrary η > 0, let φη ∈ W0 (Y, R m ) be such that Q f (b ) ≥ Y
f (b + ∇φη ) d x − η.
From (11.6) and Hölder’s inequality, we obtain Q f (a) − Q f (b ) ≤ f (a + ∇φη ) d x − f (b + ∇φη ) d x + η Y Y | f (a + ∇φη ) − f (b + ∇φη ) | d x + η ≤ Y
≤L|a−b | Y
(1+ | a + ∇φη (x) |
p−1
+ | b + ∇φη (x) |
p−1
Y
p
dx
+η
p−1 p
≤ CL | a − b |
)
p−1
p p−1
p
p
(1+ | a | + | b | + | b + ∇φη (x) | )d x
p
+ η,
where C is a constant depending only on p. On the other hand, from (11.5), 1 | b + ∇φη (x) | p d x ≤ f (b + ∇φη ) d x α Y Y 1 ≤ (Q f (b ) + η) α η β ≤ (1+ | b | p ) + . α α
(11.14)
(11.15)
Combining (11.14) and (11.15) and letting η → 0, we obtain Q f (a) − Q f (b ) ≤ L | b − a | (1+ | a | p−1 + | b | p−1 ), where L is a constant which depends only on p, α β. We end the proof by interchanging the roles of a and b . (c) We establish the quasi-convex inequality. Let us set 4 5 1, p Aff0 (D, R m ) := φ ∈ W0 (D, R m ) : φ piecewise affine . 1, p
Note that according to the density of Aff0 (D, R m ) in W0 (D, R m ) equipped with its strong topology, and to (11.6), we have 1 m×N m , Q f (a) = inf f (a + ∇φ) d x : φ ∈ Aff0 (D, R ) . ∀a ∈ M |D| D It is now easily seen that the quasi-convex inequality is satisfied with every φ ∈ Aff0 (D, R m ). Indeed, since φ ∈ Aff0 (D, R m ), there exist some open bounded and pairwise disjoint sets
i
i i
i
i
i
i
11.2. Relaxation of integral functionals with domain W 1, p (Ω, R m ), p > 1
book 2014/8/6 page 445 i
445
¯ , |∂ D | = 0, and a ∈ M m×N , i = 1, . . . , r , such ¯ = ∪r D Di ⊂ D, i = 1, . . . , r , with D i i i =1 i that ∇φ ≡ ai on Di , which clearly implies Q f (a + ∇φ) d x =
r i =1
D
Q f (a + ai )|Di |.
(11.16)
On the other hand, for η > 0 and i = 1, . . . , r , there exists φi ,η ∈ Aff0 (Di , R m ) such that Q f (a + ai ) ≥
1 |Di |
Di
f (a + ai + ∇φi ,η ) d x − η.
(11.17)
Let us consider the function φ˜ defined on D by ˜ = φ(x) + φ (x) when x ∈ D . φ(x) i ,η i Clearly φ˜ belongs to W0 (D, R m ) (actually to Aff0 (D, R m )). Summing inequalities (11.17) for i = 1, . . . , r , equality (11.16) yields ˜ d x − η|D| Q f (a + ∇φ) d x ≥ f (a + ∇φ) 1, p
D
D
≥ Q f (a)|D| − η|D|. Letting η → 0, we obtain the quasi-convex inequality for every φ ∈ Aff0 (D, R m ). Thanks 1, p to the density of Aff0 (D, R m ) in W0 (D, R m ) and to (11.6) satisfied by Q f , the quasi1, p convex inequality is now satisfied for any φ ∈ W0 (D, R m ). It remains to establish that Q f is the greatest quasi-convex function less than or equal to f . First notice that g ≤ f yields Q g ≤ Q f . On the other hand, if g is quasi-convex, Q g = g . Indeed Q g ≤ g by definition and inequality (11.8) satisfied by g gives the converse inequality. We then obtain from g ≤ f and the quasi-convexity of g that g = Qg ≤Qf . Remark 11.2.1. One can prove that Q f is the quasi-convex envelope of f under less restrictive assumptions on f , as, for example, without continuity condition (see Dacorogna [182, Theorem 1.1, Chapter 5] and [82]). For the relationship between the notions of convexity, polyconvexity, rank-one convexity, and various examples, consult Dacorogna [182] and Sverak [340], [341], [342]. One can establish that Q f is convex on each interval [a, b ] in M m×N satisfying rank(a− b ) = 1, i.e., is rank-one convex. Consequently, when m = 1 or N = 1, Q f is a convex function and actually Q f = f ∗∗ . For a proof, consult Step 3 in the proof of Theorem 1.1, Chapter 5 in Dacorogna [182]. Nevertheless, when m > 1 or N > 1, Q f is not in general the convexification of f . Consequently the optimization problems related to integral functionals F with a convex density f are not well-posed. One need to uses a relaxation procedure in the sense of the relaxation Theorem 11.1.2. Let f : M m×N −→ R satisfying (11.5) and (11.6). In Theorem 13.2.1, we will establish, in a more general situation, that the quasi-convex inequality (11.8) is a necessary and sufficient condition to ensure the lower semicontinuity of the integral functional u → Ω f (∇u) d x defined on W 1, p (Ω, R m ) equipped with its weak topology. We now state the main result of this section.
i
i i
i
i
i
i
book 2014/8/6 page 446 i
Chapter 11. Relaxation in Sobolev, BV , and Young measures spaces
446
Theorem 11.2.1. Let us consider a function f : RN −→ R satisfying (11.5) and (11.6) with p > 1, and F the associated integral functional (11.7) defined in L p (Ω, R m ) equipped with its strong topology. Then the lsc envelope of F is given, for every u in L p (Ω, R m ), by ⎧ ⎨ Q f (∇u) d x if u ∈ W 1, p (Ω, R m ), cl(F )(u) = ⎩ Ω +∞ otherwise. The proof of Theorem 11.2.1 is the straightforward consequence of Propositions 11.2.3 and 11.2.4 below. We denote the functional ⎧ ⎨ Q f (∇u) d x if u ∈ W 1, p (Ω, R m ), QF (u) = ⎩ Ω +∞ otherwise by QF . The proofs given here have the advantage of being easily adapted to the theory of homogenization (see the next chapter). Proposition 11.2.3. For every u in L p (Ω, R m ), p > 1, and every sequence (un )n∈N strongly converging to u in L p (Ω, R m ), one has QF (u) ≤ lim inf F (un ). n→+∞
(11.18)
Assume that f satisfies (11.6) and only the upper growth condition 0 ≤ f (a) ≤ β(1 + |a| p ) for all a ∈ M m×N . Then, for every u in W 1, p (Ω, R m ) and every sequence (un )n∈N weakly converging to u in W 1, p (Ω, R m ), one has QF (u) ≤ lim inf F (un ). n→+∞
PROOF. We assume that f satisfies (11.5) and (11.6) and we establish the first assertion. The second assertion will be obtained at the end of the proof as a straightforward consequence. Obviously, one can assume lim infn→+∞ F (un ) < +∞ so that u belongs to W 1, p (Ω, R m ). For a nonrelabeled subsequence, consider the nonnegative Borel measure μn := f (∇un (.))" N Ω. We have sup μn (Ω) < +∞.
n∈N
Consequently there exists a further subsequence (not relabeled) and a nonnegative Borel measure μ ∈ M(Ω) such that μn μ
weakly in M(Ω).
Let μ = g " Ω + μ s be the Lebesgue–Nikodým decomposition of μ where μ s is a nonnegative Borel measure, singular with respect to the Lebesgue measure " N Ω. For establishing (11.18) it is enough to prove that for a.e. x ∈ Ω, N
g (x) ≥ Q f (∇u(x)). Indeed, according to Alexandrov’s theorem, Theorem 4.2.3, we will obtain lim inf F (un ) = lim inf μn (Ω) ≥ μ(Ω) = g (x) d x + μ s (Ω) n→+∞ n→+∞ Ω ≥ g (x) d x Ω ≥ Q f (∇u(x)) d x. Ω
i
i i
i
i
i
i
11.2. Relaxation of integral functionals with domain W 1, p (Ω, R m ), p > 1
book 2014/8/6 page 447 i
447
Let ρ > 0 intended to tend to 0 and denote the open ball of radius ρ centered at x0 by Bρ (x0 ). According to the theory of differentiation of measures (see Theorem 4.2.1), there exists a negligible set N for the measure " N Ω such that for all x0 ∈ Ω \ N , g (x0 ) = lim
μ(Bρ (x0 ))
ρ→0
|Bρ (x0 )|
.
Applying Lemma 4.2.1, for all but countably many ρ > 0, one may assume μ(∂ Bρ (x0 )) = 0. From Alexandrov’s theorem, Theorem 4.2.3, we then obtain μ(Bρ (x0 )) = limn→+∞ μn (Bρ (x0 )) and we finally are reduced to establishing lim lim
μn (Bρ (x0 ))
ρ→0 n→+∞
|Bρ (x0 )|
≥ Q f (∇u(x0 )) for x0 ∈ Ω \ N .
(11.19)
Let us assume for the moment that the trace of un on ∂ Bρ (x0 ) is equal to the affine function u0 defined by u0 (x) := u(x0 ) + 〈∇u(x0 ), x − x0 〉. It follows that lim
μn (Bρ (x0 )) |Bρ (x0 )|
n→+∞
= lim
1
|Bρ (x0 )| 1
n→+∞
≥ inf
|Bρ (x0 )|
Bρ (x0 )
f (∇u(x0 ) + ∇(un − u0 )) d x 1, p
Bρ (x0 )
f (∇u(x0 ) + ∇φ) d x : φ ∈ W0 (Bρ (x0 ), R m )
= Q f (∇u(x0 )) and the proof would be complete. Note that the small size of the radius ρ of Bρ (x0 ) does not contribute to the estimate above. The idea now consists in modifying un by a function of W 1, p (Bρ (x0 ), R m ) which coincides with u0 on ∂ Bρ (x0 ) in the trace sense, to follow the previous procedure and to control additional terms, when ρ goes to zero, thanks to the following classical estimate. Lemma 11.2.1. For every u in W 1, p (Ω, R m ), p ≥ 1, there exists a negligible set N such that for all x0 in Ω \ N we have %
1 |Bρ (x0 )|
Bρ (x0 )
|u(x) − u(x0 ) + ∇u(x0 )(x − x0 ) | p d x
&1/ p = o(ρ).
(11.20)
PROOF. For the proof we refer to Ziemer [366, Theorem 3.4.2]. From now on, we fix x0 in the set Ω \ N ∪ N . In order to suitably modify un on the boundary of Bρ (x0 ), we use a well-known method due to De Giorgi. A neighborhood of the boundary of Bρ (x0 ) is sliced as follows: let ν ∈ N∗ , ρ0 := λρ, where 0 < λ < 1, and set Bi := Bρ +i ρ−ρ0 (x0 ) 0
ν
for i = 0, . . . , ν.
On the other hand, consider for i = 1, . . . , ν, ϕ i ∈ C∞ (Bi ), 0 ≤ ϕi ≤ 1, ϕi = 1 in Bi −1 , 0
|gradϕi |L∞ (Bi ) ≤
ν ρ − ρ0
=
ν ρ(1 − λ)
,
i
i i
i
i
i
i
448
book 2014/8/6 page 448 i
Chapter 11. Relaxation in Sobolev, BV , and Young measures spaces
and define un,i ∈ W 1, p (Bρ (x0 ), R m ) by un,i := u0 + ϕi (un − u0 ). For i = 1, . . . , ν, we have Q f (∇u(x0 )) 1 1, p m f (∇u(x0 ) + ∇φ) d x : φ ∈ W0 (Bρ (x0 ), R ) = inf |Bρ (x0 )| Bρ (x0 ) 1 f (∇un,i ) d x ≤ |Bρ (x0 )| Bρ (x0 ) 1 1 f (∇un ) d x + f (∇un,i ) d x = |Bρ (x0 )| Bi−1 |Bρ (x0 )| Bi \Bi−1 1 f (∇u(x0 )) d x + |Bρ (x0 )| Bρ (x0 )\Bi 1 1 f (∇un ) d x + f (∇un,i ) d x ≤ |Bρ (x0 )| Bρ (x0 ) |Bρ (x0 )| Bi \Bi−1 + β 1 + |∇u(x0 )| p (1 − λ)N .
(11.21)
Let us estimate the second term of the right-hand side of (11.21). From the growth condition (11.5) we obtain 1 f (∇un,i )d x ≤ C 1 + |∇u(x0 )| p (1 − λ)N |Bρ (x0 )| Bi \Bi−1 νp C C p + |∇(un − u0 )| d x + |u − u0 | p d x, |Bρ (x0 )| Bi \Bi−1 |Bρ (x0 )| ρ p (1 − λ) p Bi \Bi−1 n where C is a positive constant depending only on p and β. Then, averaging inequalities (11.21), we obtain Q f (∇u(x0 )) =
ν 1
ν
i =1
Q f (∇u(x0 ))
1
C ν p−1
|u − u0 | p d x |Bρ (x0 )|ρ p (1 − λ) p Bρ (x0 ) n 1 C p N |∇un | p d x + C 1 + |∇u(x0 )| (1 − λ) + ν |Bρ (x0 )| Bρ (x0 ) C +1 C ν p−1 f (∇un ) d x + |u − u0 | p d x ≤ αν |Bρ (x0 )| Bρ (x0 ) |Bρ (x0 )|ρ p (1 − λ) p Bρ (x0 ) n + C 1 + |∇u(x0 )| p (1 − λ)N , (11.22)
≤
|Bρ (x0 )|
Bρ (x0 )
f (∇un ) d x +
where we have used the lower bound (11.5) in the last inequality. Letting n → +∞ and ρ → 0, from Lemma 11.2.1 we obtain μn (Bρ (x0 )) C Q f (∇u(x0 )) ≤ + 1 lim lim + C 1 + |∇u(x0 )| p (1 − λ)N ρ→0 n→+∞ |B (x )| αν ρ 0
i
i i
i
i
i
i
11.2. Relaxation of integral functionals with domain W 1, p (Ω, R m ), p > 1
book 2014/8/6 page 449 i
449
and (11.19) follows after letting λ → 1 and ν → +∞. It is worth noticing that we have 1 |∇un | p d x by letting used the slicing method in order to control the term |B (x )| B (x ) ρ
the slices become increasingly thin (i.e., ν → +∞).
0
ρ
0
If now f does not verify the lower bound in (11.5), we first note that the weak convergence of un to u in W 1, p (Ω, R m ) yields the boundedness of sup |∇un | p d x < +∞. Bρ (x0 )
n∈N
Moreover, according to the Rellich–Kondrakov compact embedding theorem, Theorem 5.4.2, un → u strongly in L p (Ω, R m ). Then, to conclude, it suffices going to the limit respectively on n and ρ in (11.22) and to let λ → 1, ν → +∞ as above. Proposition 11.2.4. For every u in L p (Ω, R m ), p > 1, there exists a sequence (un )n∈N strongly converging to u in L p (Ω, R m ) such that QF (u) ≥ lim sup F (un ). n→+∞
PROOF. One can assume QF (u) < +∞. Therefore, taking D = Y := (0, 1)N in the definition of Q f , and η = 1/k, k ∈ N∗ , QF (u) = Q f (∇u) d x Ω 1, p m = inf f (∇u(x) + ∇y φ(y)) d y : φ ∈ W0 (Y, R ) d x Ω Y ≥ f (∇u(x) + ∇y φη (x, y)) d xd y − η|Ω|, (11.23) Ω×Y
4
where φη (x, ·) is a η-minimizer of inf
Y
5 1, p f (∇u(x) + ∇y φ(y)) d y : φ ∈ W0 (Y, R m ) . 1, p
We admit that we can select a measurable map x → φη (x, ·) from Ω into W0 (Y, R m ). For a proof, consult Castaing and Valadier [166]. We claim that φη belongs to 1, p
L p (Ω,W0 (Y, R m )). Indeed p ∇y φη (x, ·) p d x = m×N Ω
L (Y,M
)
|∇y φη (x, y)| p d xd y
Ω×Y
≤C ≤C ≤C
Ω×Y
1 α 1 α
|∇u(x) + ∇y φη (x, y)| p d xd y +
Ω×Y
Ω
|∇u| p d x
f (∇u(x) + ∇y φη (x, y)) d xd y +
QF (u) +
η|Ω| α
+
Ω
|∇u| p d x < +∞,
p
Ω
|∇u| d x
where C is a positive constant depending only on p. 1, p Classically Cc (Ω, (Y, R m )) is dense in L p (Ω,W0 (Y, R m )). Consequently, from the continuity assumption (11.6) fulfilled by f , it is easily seen that (11.23) yields QF (u) = Q f (∇u) d x ≥ f (∇u(x) + ∇y φ˜η (x, y)) d xd y − 2η|Ω| (11.24) Ω
Ω×Y
i
i i
i
i
i
i
450
book 2014/8/6 page 450 i
Chapter 11. Relaxation in Sobolev, BV , and Young measures spaces
for some φ˜η in Cc (Ω, (Y, R m )). We have actually established the following interchange result between infimum and integral: inf f (∇u(x) + ∇y φ(y)) d y d x Ω
=
1, p
φ∈W0 (Y,R m )
f (∇u(x) + ∇y Φ(x, y)) d xd y
inf
1, p
Φ∈L p (Ω,W0 (Y,R m ))
Ω×Y
=
Y
f (∇u(x) + ∇y Φ(x, y)) d xd y.
inf
Φ∈Cc (Ω, (Y,R m ))
Indeed from (11.23)
Ω×Y
inf
1, p
φ∈W0 (Y,R m )
Ω
≥
Y
f (∇u(x) + ∇y φ(y)) d y d x
inf
1, p
Φ∈L p (Ω,W0 (Y,R m ))
Ω×Y
f (∇u(x) + ∇y Φ(x, y)) d xd y
and the converse inequality is trivial. Note that this result also holds for p = 1. Because of its importance, we state it in a slightly more general form. Lemma 11.2.2. Let f : M m×N → R+ be any function satisfying (11.5) and (11.6) with p ≥ 1. Let moreover ξ be any element of L p (Ω, M m×N ). Then inf f (ξ (x) + ∇y φ(y)) d y d x Ω
=
1, p
φ∈W0 (Y,R m )
Y
1, p
Φ∈L p (Ω,W0 (Y,R m ))
=
inf
Ω×Y
inf
Φ∈Cc (Ω, (Y,R m ))
Ω×Y
f (ξ (x) + ∇y Φ(x, y)) d xd y
f (ξ (x) + ∇y Φ(x, y)) d xd y.
For more about interchange theorems, consult, for instance, Anza Hafsa and Mandallena [34]. PROOF OF PROPOSITION 11.2.4 CONTINUED. Let us go back to (11.24). To shorten the notation we denote the previous η-minimizer φ˜η in Cc (Ω, (Y, R m )) by φη and extend y → φη (x, y) by Y -periodicity on RN . Consider now the function uη,n defined by uη,n (x) = u(x) +
1 n
φη (x, nx).
Note that φη is a Carathéodory function so that x → φη (x, nx) is measurable. Clearly uη,n belongs to W 1, p (Ω, R m ) and uη,n → u strongly in L p (Ω, R m ) when n → +∞. It’s indeed a straightforward consequence of |φη (x, nx)| p d x ≤ sup |φη (x, y)| p d x Ω
Ω y∈Y
≤ |Ω| sup sup |φη (x, y)| p < +∞. x∈Ω y∈Y
i
i i
i
i
i
i
11.2. Relaxation of integral functionals with domain W 1, p (Ω, R m ), p > 1
book 2014/8/6 page 451 i
451
On the other hand, 1 f (∇uη,n ) d x = lim f ∇u(x) + (∇y φη )(x, nx) + ∇ x φη (x, nx) d x lim n→+∞ n→+∞ n Ω Ω = lim f (∇u(x) + (∇y φη )(x, nx)) d x n→+∞ Ω = f (∇u(x) + ∇y φη (x, y)) d xd y. Ω×Y
For passing from the first to the second equality we have used the continuity assumption (11.6) on f . The last equality is a consequence of Lemma 11.2.3 stated at the end of the proof and applied to g (x, y) = f (∇u(x) + ∇y φη (x, y)). (Note that y → f (∇u(x) + ∇y φη (x, y)) belongs to the set C# (Y ) of all the restrictions to Y of continuous and Y periodic functions on RN and that g ∈ L1 (Ω, C# (Y )).) Consequently, from (11.24) lim F (uη,n ) = lim f (∇u(x) + (∇y φη )(x, nx)) d x n→+∞ n→+∞ Ω = f (∇u(x) + ∇y φη (x, y)) d xd y Ω×Y
≤ QF (u) + 2η. Letting η → 0 (i.e., k → +∞), up to a subsequence, one obtains lim lim F (uη,n ) ≤ QF (u).
η→+0 n→+∞
We apply now the diagonalization Lemma 11.1.1 for the sequence F (uη,n ), uη,n η,n in the metric space R × L p (Ω, R m ): there exists a map n → η(n) :=
1 k(n)
such that
lim F (uη(n),n ) ≤ QF (u),
n→+∞
lim u n→+∞ η(n),n
=u
strongly in L p (Ω, R m ).
The sequence (un )n∈N where un = uη(n),n then verifies the assertion of Proposition 11.2.4. We now state and prove Lemma 11.2.3 invoked above. Let D be any open cube of RN . We recall that C# (D) denotes the set of all the restrictions to D of continuous and Dperiodic functions on RN , equipped with the uniform norm on D, and that L1 (Ω, C# (D)) denotes the space of all measurable functions h from Ω into C# (D) satisfying sup |h(x, y)| d x < +∞. Ω y∈D
Lemma 11.2.3. For every function g in L1 (Ω, C# (D)), 1 g (x, nx) d x = g (x, y) d xd y. lim n→+∞ |D| Ω×D Ω
(11.25)
PROOF. It is well known (see, for instance, Yosida [361]) that if g belongs to L1 (Ω, C# (D)), then g is a Carathéodory function satisfying sup y∈D |g (., y)| ∈ L1 (Ω) so that Ω g (x, nx) d x
i
i i
i
i
i
i
book 2014/8/6 page 452 i
Chapter 11. Relaxation in Sobolev, BV , and Young measures spaces
452
is well defined. For k ∈ N∗ , let us decompose the cube D as follows: D=
kN 3
Di ,
i =1
where Di are small pairwise disjoint open cubes 1/k-homothetic of D. We approximate g in L1 (Ω, C# (D)) by the following step function gk : gk (x, y) =
kN i =1
g (x, yi )1Di (y),
where 1Di is the characteristic function of the set Di extended by D-periodicity on RN and yi is any fixed element of Di . Due to the periodicity of 1Di , classically x → 1Di (nx) σ(L∞ , L1 ) weakly converges to |Di |/|D| so that lim
n→+∞
Ω
g (x, yi )1Di (nx) d x =
|Di | |D|
Ω
g (x, yi ) d x.
For a proof, see Example 2.4.2 or Proposition 13.2.1 and the proof of Theorem 13.2.1. From now on, to shorten notation, we assume |D| = 1. Summing these equalities over i = 1, . . . , k N , one obtains that (11.25) is satisfied for gk . In order to conclude by going to the limit on k, we use the uniform bound with respect to n: |g (x, nx) d x − gk (x, nx)| d x ≤ g − gk L1 (Ω,C# (D)) . (11.26) Ω
Moreover, since sup |gk (x, y) − g (x, y)| ≤ 2 sup |g (x, y)| ∈ L1 (Ω)
y∈D
y∈D
and limk→+∞ supy∈D |gk (x, y) − g (x, y)| = 0, according to the Lebesgue dominated convergence theorem, we obtain lim g − gk L1 (Ω,C# (D)) = 0.
k→+∞
(11.27)
Thus, from (11.26) g (x, y) d xd y ≤ g (x, nx) d x − gk (x, nx) d x g (x, nx) d x − Ω Ω×D Ω Ω + gk (x, nx) d x − gk (x, y) d xd y Ω Ω×D gk (x, y) d xd y − g (x, y) d xd y + Ω×D Ω×D ≤ 2 g − gk L1 (Ω,C# (D)) + gk (x, nx) d x − gk (x, y) d xd y . Ω Ω×Y
i
i i
i
i
i
i
11.2. Relaxation of integral functionals with domain W 1, p (Ω, R m ), p > 1
book 2014/8/6 page 453 i
453
We conclude first by letting n → +∞ and using (11.25) satisfied by gk and then by letting k → +∞ and using (11.27). Now, we would like to compute the lower semicontinuous envelope of the integral functional in (11.7) by taking into account a boundary condition on a part Γ0 of the boundary ∂ Ω of Ω. More precisely, we aim to describe the lower semicontinuous envelope of the integral functional F : L p (Ω, R m ) −→ R+ ∪ {+∞} defined by ⎧ ⎪ ⎨ f (∇u) d x if u ∈ W 1, p (Ω, R m ), Γ0 (11.28) F (u) = Ω ⎪ ⎩ +∞ otherwise, 1, p
where WΓ (Ω, R m ) denotes the subspace of all the functions u in W 1, p (Ω, R m ) such that 0 u = 0 on Γ0 in the sense of traces. In the following corollary, we state that the boundary condition is “not relaxed.” We will see that the preservation of the boundary condition is not satisfied in the case p = 1. Corollary 11.2.1. The lower semicontinuous envelope of the integral functional F defined in (11.28) is given by ⎧ ⎨ Q f (∇u) d x if u ∈ W 1, p (Ω, R m ), Γ0 cl(F )(u) = ⎩ Ω +∞ otherwise. PROOF. We only have to establish the existence of a sequence (un )n∈N strongly converging to u in L p (Ω, R m ) such that cl(F )(u) ≥ lim supn→+∞ F (un ). Assuming cl(F )(u) < 1, p +∞, we must construct a sequence (un )n∈N in WΓ (Ω, R m ), strongly converging to u in 0 L p (Ω, R m ) and satisfying cl(F )(u) ≥ lim supn→+∞ F (un ). According to Theorem 11.2.1, there exists a sequence (vn )n∈N in W 1, p (Ω, R m ) strongly converging to u in L p (Ω, R m ) such that Q f (∇u) d x ≥ lim sup f (∇vn ) d x. (11.29) Ω
n→+∞
Ω
The idea is now to modify vn on a neighborhood of ∂ Ω so that the new function belongs 1, p to WΓ (Ω, R m ) and in such a way to decrease the energy. We use again the slicing method 0 of De Giorgi. Let ν ∈ N∗ and Ω0 ⊂⊂ Ω such that 1 1 + |∇u| p d x ≤ (11.30) ν Ω\Ω0 and (Ωi )i =0,...,ν an increasing sequence of open subsets strictly included in Ω, Ωi ⊂⊂ Ωi +1 ⊂⊂ Ω. Let (ϕi )i =0,...,ν−1 be a sequence of functions in (RN ) satisfying ϕi = 1
on Ωi , ν |∇ϕi | ≤ , d
ϕi = 0
on RN \ Ωi +1 , 0 ≤ ϕi ≤ 1,
where d = dist(Ω0 , RN \ Ω), and define un,i = ϕi (vn − u) + u.
i
i i
i
i
i
i
book 2014/8/6 page 454 i
Chapter 11. Relaxation in Sobolev, BV , and Young measures spaces
454
1, p
Clearly un,i belongs to WΓ (Ω, R m ) and 0
f (∇un,i ) d x =
Ω
≤
f (∇un,i ) d x +
Ω\Ωi+1
Ωi+1 \Ωi
f (∇u) d x +
Ω\Ω0
Ωi+1 \Ωi
f (∇un,i ) d x +
Ωi
f (∇un,i ) d x +
Ω
f (∇un,i ) d x
f (∇vn ) d x.
Then, from (11.30) and the growth condition in (11.5), we obtain
Ω
f (∇un,i ) d x ≤ C
1 ν
+
p ν d
|vn − u| d x + p
Ω
(|∇vn | ) d x + f (∇vn ) d x, p
Ωi+1 \Ωi
Ω
where, from now on, C denotes various positive constants depending only on β, p, and Ω. By averaging these ν inequalities, we obtain ν−1 1
ν
i =0
f (∇un,i ) d x ≤ C
Ω
1 ν
+
p ν d
p
Ω
|vn −u| d x+
1
|∇vn | d x + f (∇vn ) d x. p
ν
Ω
Ω
As already said in the proof of Proposition 11.2.3, we have used a slicing method in order to control the term Ω |∇vn | p d x by taking increasingly thin slices (i.e., ν → +∞). We could not conclude by using a simple truncation. From the coercivity condition (11.5), Ω |∇vn | p d x is bounded, hence ν−1 1
ν
Ω
i =0
f (∇un,i ) d x ≤ C
1 ν
+
p ν d
Ω
Let i(n, ν) be the index i such that f (∇un,i (n,ν) ) d x = Ω
p
|vn − u| d x +
Ω
f (∇vn ) d x.
(11.31)
min
i =0,...,ν−1
Ω
f (∇un,i ) d x.
Inequality (11.31) yields
Ω
f (∇un,i (n,ν) ) d x ≤ C
1 ν
+
p ν d
p
Ω
|vn − u| d x +
Ω
f (∇vn ) d x
so that from (11.29) lim sup lim sup F (un,i (n,ν) ) ≤ lim sup F (vn ) ≤ ν→+∞
n→+∞
n→+∞
Ω
Q f (∇u) d x.
We conclude by a classical diagonalization argument: there exists n → ν(n) mapping N into N such that lim sup F (un,i (n,ν(n)) ) ≤ lim sup lim sup F (un,i (n,ν) ) ≤ lim sup F (vn ) ≤ Q f (∇u) d x. n→+∞
ν→+∞
n→+∞
n→+∞
Ω
Obviously lim un,i (n,ν(n)) = u strongly in L p (Ω, R m ).
n→+∞
i
i i
i
i
i
i
11.2. Relaxation of integral functionals with domain W 1, p (Ω, R m ), p > 1
book 2014/8/6 page 455 i
455
The sequence (un )n∈N defined by un = un,i (n,ν(n)) then tends to u in L p (Ω, R m ) and satisfies Q f (∇u) d x ≥ lim supn→+∞ F (un ). Ω As a consequence we obtain the following relaxation theorem in the case p > 1. Theorem 11.2.2 (relaxation theorem, p > 1). Let us consider a function f : M m×N −→ R satisfying (11.5), (11.6), a function g in Lq (Ω, R m ), where 1p + q1 = 1, and the following problem: 1, p inf f (∇u) d x − g .u d x : u ∈ WΓ (Ω, R m ) . (+ ) Ω
0
Ω
Then, the relaxed problem of (+ ) in the sense of Theorem 11.1.2 is
inf Ω
Q f (∇u) d x −
g .u d x : u Ω
1, p ∈ WΓ (Ω, R m ) 0
PROOF. With the notation of Theorem 11.2.1 one has 1, p m inf f (∇u) d x − g .u d x : u ∈ WΓ (Ω, R ) =
u∈L p (Ω,R m )
and 1, p inf Q f (∇u) d x− g .u d x : u ∈ WΓ (Ω, R m ) =
inf p
Ω
0
Ω
Ω
0
Ω
F (u) −
inf
(+ )
.
g .u d x Ω
u∈L (Ω,R m )
g .u d x .
cl(F )(u)−
Ω
Since G : u → Ω g .u d x is a continuous perturbation of F , one has cl(F +G) = cl(F )+G in L p (Ω, R m ). Then, according to Theorems 11.1.2 and 11.2.1, it suffices to establish the inf-compactness of g .u d x u → F (u) − Ω
p R m ) such that F (u) − in L (Ω, R ) equipped with its strong topology. Let u ∈ L (Ω, 1, p g .u d x ≤ C , where C is any positive constant. Then u ∈ W (Ω, R m ) and Ω p
m
F (u) −
Ω
g .u d x =
Ω
f (∇u) d x −
Ω
g .u d x ≤ C .
From (11.5) and Hölder’s inequality, we obtain α
Ω
|∇u| p d x ≤ g Lq (Ω,Rm ) uL p (Ω,Rm ) + C .
Applying Young’s inequality ab ≤ where λ is chosen so that constant satisfying
λp Cp p
λp a p p
+
1 bq λq q
(11.32)
with a = uL p (Ω,Rm ) , b = g Łq (Ω,Rm ) ,
< α and C p denotes the Poincaré constant, i.e., the best
p
Ω
|u| d x ≤ C p
Ω
|∇u| p d x,
i
i i
i
i
i
i
456
book 2014/8/6 page 456 i
Chapter 11. Relaxation in Sobolev, BV , and Young measures spaces
the estimate (11.32) yields
Ω
|∇u| p d x ≤ C ,
where C is a positive constant depending only on Ω, p, α, and g Łq (Ω,Rm ) . Therefore u 1, p
belongs to the closed ball with radius C of WΓ (Ω, R m ) which, according to the Rellich– 0 Kondrakov theorem, Theorem 5.4.2, is compact in L p (Ω, R m ). Remark 11.2.2. In the case p = 1, Theorem 11.2.1 obviously remains valid as long as one considers the restrictions of F and cl(F ) to W 1,1 (Ω, R m ). More precisely, for all u ∈ W 1,1 (Ω, R m ), cl(F )(u) =
Ω
Q f (∇u) d x.
Nevertheless, we do not have a complete description of cl(F ), which is indeed given in the next section.
11.3 Relaxation of integral functionals with domain W 1,1(Ω, Rm ) We show how the space BV (Ω, R m ) and the notion of trace (see Remark 10.2.2) take place in the relaxation theory. For simplicity of the exposition, in a first approach we limit our study to the case m = 1 and f = |.|. The general case is treated at the end of this section. From now on Ω is a Lipschitz open bounded subset of RN . It is well known that the integral functionals defined on L p (Ω), p > 1, by ⎧ ⎧ ⎨ ⎨ 1, p p 1, p |∇u| d x if u ∈ W (Ω), |∇u| p d x if u ∈ W0 (Ω), G(u) = F (u) = Ω Ω ⎩ ⎩ +∞ otherwise, +∞ otherwise, are lower semicontinuous for the strong topology of LP (Ω) or the weak topology of W 1, p (Ω). For instance, one may argue directly by using the convexity of these two functionals or one may apply the previous section by noticing that Q f = f when f = |.| p . The case p = 1, where L1 (Ω) is equipped with its strong topology and the functionals F , G are given by ⎧ ⎧ ⎨ ⎨ 1,1 |∇u| d x if u ∈ W (Ω), |∇u| d x if u ∈ W01,1 (Ω), G(u) = F (u) = ⎩ Ω ⎩ Ω +∞ otherwise, +∞ otherwise, is more involved. Indeed, we have seen in Section 10.4 that the sequence (un )n∈N which generates the Cantor–Vitali function u satisfies un → u strongly in L1 (0, 1), supn∈N F (un ) < +∞ but u ∈ W 1,1 (Ω). Consequently, the domain of the lower closure of F strictly contains the space W 1,1 (Ω). We see below that this domain is included in the space BV (Ω). Actually, as a consequence of Proposition 11.3.2, the domain is exactly BV (Ω). Concerning the second functional G, we will see that the boundary condition u = 0 is “relaxed” by a surface energy. Proposition 11.3.1. The domain of cl(F ) and cl(G) is included in BV (Ω). PROOF. Let u ∈ L1 (Ω, R m ) be such that cl(F )(u) < +∞ and consider a sequence (un )n∈N strongly converging to u in L1 (Ω, R m ) and satisfying cl(F )(u) = limn→+∞ F (un ). Such a
i
i i
i
i
i
i
11.3. Relaxation of integral functionals with domain W 1,1 (Ω, R m )
book 2014/8/6 page 457 i
457
sequence exists from Proposition 11.1.1. According to cl(F )(u) = limn→+∞ F (un ) < +∞, one obtains, for a not relabeled subsequence of (un )n∈N , sup |∇un | d x < +∞. n∈N
Ω
Thus, from Proposition 10.1.1(i), u ∈ BV (Ω). Proposition 11.3.2. Let Ω be a Lipschitz bounded open subset of RN and Γ its boundary. The lsc envelopes cl(F ) and cl(G) of the functionals F and G defined on L1 (Ω) equipped with its strong topology are given by ⎧ ⎨ |D u| if u ∈ BV (Ω), cl(F )(u) = Ω ⎩ otherwise, ⎧+∞ ⎨ |D u| + |γ0 (u)| d N −1 if u ∈ BV (Ω), cl(G)(u) = Γ ⎩ Ω +∞ otherwise, where γ0 is the trace operator from BV (Ω) into L1 (Γ ), defined in Section 10.2. PROOF. Let us set
⎧ ⎨
|D u| if u ∈ BV (Ω), ⎩ Ω otherwise, ⎧+∞ ⎨ |D u| + |γ0 (u)| d N −1 QG(u) = Γ ⎩ Ω +∞ otherwise.
QF (u) =
if u ∈ BV (Ω),
We first establish that for all u in L1 (Ω), ⎧ 1 ⎪ inf F (un ), ⎨ if un → u in L (Ω), then QF (u) ≤ lim n→+∞ there exists a sequence (un )n∈N converging to u in L1 (Ω) such that ⎪ ⎩ QF (u) ≥ lim sup n→+∞ F (un ). These two assertions are straightforward consequences of Proposition 10.1.1 and Theorem 10.1.2. We now deal with the functional G and establish for every u ∈ L1 (Ω), ⎧ 1 ⎪ inf G(un ), ⎨ if un → u in L (Ω), then QG(u) ≤ lim n→+∞ there exists a sequence (un )n∈N converging to u in L1 (Ω) such that ⎪ ⎩ QG(u) ≥ lim sup n→+∞ G(un ). Proof of the first assertion. Consider a sequence (un )n∈N strongly converging to u in L1 (Ω) such that lim infn→+∞ G(un ) < +∞. For a subsequence (not relabeled), we have ˜ denote a bounded open subset of RN strongly containing Ω and un ∈ W01,1 (Ω). Let Ω ˜ by define, for every function v in L1 (Ω), the function v˜ in L1 (Ω) v(x) if x ∈ Ω, ˜ = v(x) ˜ \ Ω. 0 if x ∈ Ω
i
i i
i
i
i
i
book 2014/8/6 page 458 i
Chapter 11. Relaxation in Sobolev, BV , and Young measures spaces
458
˜ u˜ ∈ W 1,1 (Ω), ˜ and u˜ ∈ BV (Ω) ˜ with D u˜ = D uΩ + It is easily seen that u˜n → u˜ in L1 (Ω), n N −1 N −1 Γ , where ν denotes the inner unit normal at a.e. x in Γ (see Examples γ0 (u) ν 10.2.1 and 10.2.2). Since D uΩ and γ0 (u) ν N −1 Γ are mutually singular, one has |D u˜| = |D u|Ω + |γ0 (u)| N −1 Γ . Thus, according to the previous result for F , one has N −1 |D u| + |γ0 (u)| d = |D u˜| ≤ lim inf |D u˜n | = lim inf |D un |. Ω
n→+∞
˜ Ω
Γ
˜ Ω
n→+∞
Ω
Proof of the second assertion. For t > 0 let us consider the open subset Ω t = {x ∈ Ω : dist(x, RN \ Ω) > t } of Ω and define for every u in BV (Ω) the function u t by u(x) if x ∈ Ω t , u t (x) = 0 if x ∈ Ω \ Ω t . It is easy to see that u t ∈ BV (Ω) and that D u t Ω = D uΩ t + γ t (u) ν t N −1 Γt , where Γ t denotes the boundary of Ω t , ν t the inner unit normal at N −1 a.e. x in Γ t and γ t the trace operator from BV(Ω t ) into L1 (Γ t ) (see Examples 10.2.1 and 10.2.2). According to Theorem 10.1.2 and Remark 10.2.1, there exists u t ,n ∈ C∞ (Ω)∩BV (Ω) satisfying u t ,n = 0 on Γ in the trace sense and converging to u t for the intermediate convergence of BV (Ω). We obtain, for a.e. t , |D u t ,n | = |D u t | lim n→+∞ Ω Ω = |D u| + |u| d N −1 , Ωt
Γt
where we have used that for a.e. t , γ t (u) = u on Γ t (cf. Example 10.2.4). Letting t → 0+ , we claim that |D u t ,n | = |D u| + |γ0 (u)| d N −1 . (11.33) lim+ lim t →0 n→+∞
Ω
Ω
Γ
In order to justify the nontrivial limit N −1 lim+ |u| d = |γ0 (u)| d N −1 , t →0
Γt
Γ
we argue with local coordinates and make use of estimate (10.13) in the proof of Theorem 10.2.1. With the notation of this proof, we have |u(˜ x , t ) − u(˜ x , t )| ≤ |D u|, Γ
and letting t → 0,
CR,t ,t
Γ
|γ0 (u) − u(˜ x , t )| ≤
|D u|. CR,t
The result follows after letting t → 0.
i
i i
i
i
i
i
11.3. Relaxation of integral functionals with domain W 1,1 (Ω, R m )
book 2014/8/6 page 459 i
459
Now t denotes a sequence converging to 0. Going back to (11.33) and using the diagonalization Lemma 11.1.1 for the sequence Ω |D u t ,n |, u t ,n in the metrizable space R × L1 (Ω), we conclude that there exists a sequence n → t (n) such that ⎧ ⎪ ⎨ lim |D u t (n),n | = |D u| + |γ0 (u)| d N −1 , n→+∞
Ω
⎪ ⎩ lim u t (n),n = u n→+∞
Ω
Γ
in L1 (Ω),
and the proof is complete. Let us now consider the general case and state the following theorem, analogous to Theorem 11.2.1. Given a function f : M m×N → R satisfying (11.5) and (11.6) with p = 1, we intend to compute the lsc envelope of the functional F : L1 (Ω, R m ) −→ R+ ∪ {+∞} defined by ⎧ ⎨ f (∇u) d x if u ∈ W 1,1 (Ω, R m ), F (u) = (11.34) ⎩ Ω +∞ otherwise. Let us recall that when u ∈ BV (Ω), according to the Radon–Nikodým theorem, Theorem 4.2.1, one has D u = ∇u" N Ω + D s u, where ∇u" N Ω and D s u are two mutually singular measures in M(Ω, RN ). Theorem 11.3.1. The lsc envelope of the functional F defined in (11.34) is given, for every u ∈ L1 (Ω, R m ), by ⎧ s ⎪ ⎨ Q f (∇u) d x + (Q f )∞ D u |D s u| if u ∈ BV(Ω, R m ), |D s u| cl(F )(u) = Ω Ω ⎪ ⎩ +∞ otherwise, where Q f is the quasi-convex envelope of f defined in Proposition 11.2.2, and (Q f )∞ is the Q f (t a) recession function of Q f defined for every a in M m×N by (Q f )∞ (a) = lim sup t →+∞ t . The proof of Theorem 11.3.1 is a consequence of Propositions 11.3.3 and 11.3.4. We set s ⎧ ⎨ Q f (∇u) d x + (Q f )∞ D u |D s u| if u ∈ BV(Ω, R m ) QF (u) = |D s u| Ω ⎩ Ω +∞ otherwise. Proposition 11.3.3. For every u in L1 (Ω, R m ) and every sequence (un )n∈N strongly converging to u in L1 (Ω, R m ), one has QF (u) ≤ lim inf F (un ). n→+∞
(11.35)
PROOF. Our strategy is exactly the one of Proposition 11.2.3. Obviously, one may assume lim infn→+∞ F (un ) < +∞. For a nonrelabeled subsequence, let us consider the nonnegative Borel measure μn := f (∇un (.))" N Ω. Since sup μn (Ω) < +∞,
n∈N
i
i i
i
i
i
i
460
book 2014/8/6 page 460 i
Chapter 11. Relaxation in Sobolev, BV , and Young measures spaces
there exists a further subsequence (not relabeled) and a nonnegative Borel measure μ ∈ M(Ω) such that μn μ weakly in M(Ω). Let μ = g " N Ω + μ s be the Lebesgue–Nikodým decomposition of μ, where μ s is a nonnegative Borel measure, singular with respect to the N -dimensional Lebesgue measure " N Ω. For establishing (11.35) it suffices to prove that g (x) ≥ Q f (∇u(x)) s for a.e. x ∈ Ω, D u s ∞ |D s u|. μ ≥ (Q f ) |D s u| Indeed, according to Alexandrov’s theorem, Theorem 4.2.3, we will obtain lim inf F (un ) = lim inf μn (Ω) ≥ μ(Ω) = g (x) d x + μ s (Ω) n→+∞ n→+∞ Ω s D u ∞ |D s u|. ≥ Q f (∇u(x)) d x + (Q f ) |D s u| Ω Ω (a) Proof of g (x) ≥ Q f (∇u(x)) for a.e. x in Ω. It suffices to reproduce the proof of Proposition 11.2.3, which obviously holds true for p = 1 (see Remark 11.2.2). s
D u (b) Proof of μ s ≥ (Q f )∞ ( |D ) |D s u|. The proof is based on various arguments of s u|
Ambrosio and Dal Maso [25]. The density Alberti [10]). Lemma 11.3.1. The density x0 ∈ Ω
Du |D u|
Du |D u|
satisfies the following property (see
is for |D s u|-a.e. x0 ∈ Ω a rank-one matrix, i.e., for |D s u|-a.e. Du |D u|
(x0 ) = a(x0 ) ⊗ b (x0 )
with a(x0 ) ∈ R m , b (x0 ) ∈ RN , |a(x0 )| = |b (x0 )| = 1. The rank-one property of the jump part of the singular measure D s u is indeed trivial because of its structure (see Section 10.3). De Giorgi conjectured that the diffuse singular part is also a rank-one matrix valued measure. The proof was later given by Alberti in [10], to which we refer. Let us give now some notation. Let x0 be an element of the complementary of the |D s u|-null set invoked in Lemma 11.3.1 and let Q denote the unit cube of RN centered at the origin, whose sides are either orthogonal or parallel to b (x0 ). We set Qρ (x0 ) := {x0 + ρx : x ∈ Q}, where ρ is a positive parameter intended to tend to zero. According to the theory of differentiation of measures (see Theorem 4.2.1), for establishing μ s ≥ (Q f )∞ (D s u), it is enough to prove lim
ρ→0
μ(Qρ (x0 )) |D u|(Qρ (x0 ))
≥ (Q f )∞ (a(x0 ) ⊗ b (x0 )).
(11.36)
i
i i
i
i
i
i
11.3. Relaxation of integral functionals with domain W 1,1 (Ω, R m )
book 2014/8/6 page 461 i
461
We will make use of the following two estimates: for |D s u|-a.e. x0 ∈ Ω, we have |D u|(Qρ (x0 ))
= +∞, ρN |D u|(Q t ρ (x0 )) lim sup ≥ t N ∀t ∈ ]0, 1[d . |D u|(Qρ (x0 )) ρ→0 lim
(11.37)
ρ→0
(11.38)
Assertion (11.37) easily follows from the theory of differentiation of measures (see Theorem 4.2.1). For a proof of (11.38), consult Ambrosio and Dal Maso [25, Theorem 2.3]. From now on, x0 will be a fixed element of the complementary of the |D s u|-null set invoked in Lemma 11.3.1, for which moreover estimates (11.37) and (11.38) hold true and the two limits μ(Qρ (x0 )) D u(Qρ (x0 )) lim , lim ρ→0 |D u|(Q (x )) ρ→0 |D u|(Q (x )) ρ 0 ρ 0 exist (cf. Theorem 4.2.1). We set tρ :=
|D u|(Qρ (x0 )) ρN
. According to (11.37), tρ will play the role of the parameter t
∞
in the definition of (Q f ) : (Q f )∞ (a) = lim sup
Q f (t a)
t →+∞
t
.
Let us now define the following rescaled functions of BV (Q, R m ): vρ (x) :=
|D u|(Qρ (x0 ))
vρ,n (x) := which satisfy
ρN −1
u(x0 + ρx) −
|Qρ (x0 )|
ρN −1 |D u|(Qρ (x0 ))
un (x0 + ρx) −
1
u(y)d y , Qρ (x0 )
1 |Qρ (x0 )|
Qρ (x0 )
un (y)d y ,
Q
vρ (y)d y = 0, lim |vρ,n − vρ |L1 (Q,Rm ) = 0 n→+∞
and, from Lemma 11.3.1, Dvρ (Q) =
D u(Qρ (x0 )) |D u|(Qρ (x0 ))
→ a(x0 ) ⊗ b (x0 ) when ρ → 0.
The sequence (vρ )ρ>0 fulfills the following properties (for a proof, consults Ambrosio and Dal Maso [25, Theorem 2.3]). Lemma 11.3.2. There exists a subsequence of (vρ )ρ>0 , not relabeled, which weakly converges in BV (Q, R m ) to a function v of the form v(x) = v(〈b (x0 ), x〉)a(x0 ), where v is nondecreasing and belongs to BV (] − 1/2, 1/2[). Moreover, for a.e. δ in (0, 1) one has Dvρ (δQ) → Dv(δQ).
i
i i
i
i
i
i
book 2014/8/6 page 462 i
Chapter 11. Relaxation in Sobolev, BV , and Young measures spaces
462
s
D u We are now in a position to establish μ s ≥ (Q f )∞ ( |D ) |D s u|. The proof will be s u| divided into three steps. We fix δ in (0, 1) outside of a set of null measure so that the last assertion of Lemma 11.3.2 holds. We will make δ tend to 1 at the end of the proof.
First step (truncation). Let uρ,n in W 1,1 (Q, R m ) defined by 1 1 uρ,n := un (y)d y . un (x0 + ρy) − ρ |Qρ (x0 )| Qρ (x0 ) Note that
1 tρ
uρ,n is the function vρ,n previously defined. A change of scale gives
1 |D u|(Qρ (x0 ))
Qδρ (x0 )
f (∇un )d x =
1 tρ
δQ
f (∇uρ,n )d x.
We want to modify the function uρ,n in a neighborhood of δQ so that it coincides with an affine function of gradient tρ Dv(δQ) on ∂ δQ. The basic idea, similar to the one used in the proof of Proposition 11.2.3, consists in slicing a neighborhood of the boundary of 1/2 δQ by thin slices whose size is of order vρ − v 1 m . (Recall that vρ − v L1 (Q,R m ) L (Q,R )
goes to zero when ρ → 0.)
1
More precisely, we set αρ := vρ − v 2 1
L (Q,R m )
and, for ν ∈ N∗ intended to tend to +∞,
Q0 := (1 − αρ )δQ, Qi := 1 − αρ + i
αρ
ν
δQ
for i = 1, . . . , ν.
On the other hand, for i = 1, . . . , ν, we consider (Qi ), 0 ≤ ϕi ≤ 1, ϕi = 1 in Qi −1 , ϕ i ∈ C∞ 0
|∇ϕi | ≤
ν αρ
and define uρ,n,i ∈ tρ Θδ + W01,1 (δQ, R m ) by uρ,n,i := tρ Θδ + ϕi (uρ,n − tρ Θδ ), where Θδ is the affine function: Dv(δQ)
Θδ (x) :=
δN
.x +
v(( δ2 )− ) + v((− δ2 )+ ) 2
a(x0 ) .
v((δ/2)− )+v (−(δ/2)+ )
a(x0 ) has been chosen so that the traces of v and Θ agree The constant 2 on the faces of δQ orthogonal to b (x0 ) and, consequently, so that v − θ fulfills the Poincaré inequality on δQ. From (11.5), an easy computation gives 1 1 f (∇uρ,n )d x ≥ f (∇uρ,n,i )d x − Rρ,n,ν,δ,i , (11.39) tρ δQ tρ δQ where Rρ,n,ν,δ,i := Oρ +
β tρ
Qi \Qi−1
|∇(uρ,n − tρ Θδ )|d x +
βν
αρ
Qi \Qi−1
|vρ,n − Θδ |d x
and Oρ does not depend on n and tends to 0 when ρ → +∞.
i
i i
i
i
i
i
11.3. Relaxation of integral functionals with domain W 1,1 (Ω, R m )
book 2014/8/6 page 463 i
463
Second step (averaging). According to (11.39) and to the definition of Q f , we have tρ 1 δN f (∇un )d x ≥ Qf Dv(δQ) |D u|(Qρ (x0 )) Qδρ (x0 ) tρ δN β − Oρ − |∇(uρ,n − tρ Θδ )|d x tρ Qi \Qi−1 βν |v − Θδ |d x. − αρ Qi \Qi−1 ρ,n Averaging these ν inequalities, we obtain 1
f (∇un )d x ≥
δN
Qf
tρ
Dv(δQ) N
|D u|(Qρ (x0 )) Qδρ (x0 ) tρ δ β β |∇(uρ,n − tρ Θδ )|d x − |v − Θδ |d x − Oρ − ν tρ Q αρ δQ\(1−αρ )δQ ρ,n tρ δN β ≥ Qf Dv(δQ) − Oρ − |D(uρ,n − tρ Θδ )|d x tρ ν tρ Q δN 1 2 β −β |vρ,n − v|d x − |v − Θδ |d x. αρ δQ\(1−αρ )δQ Q Letting successively ν → +∞ and n → +∞, we obtain 1 f (∇un )d x lim sup n→+∞ |D u|(Qρ (x0 )) Qδρ (x0 ) 1 2 tρ δN ≥ Qf Dv(δQ) − Oρ − β |v m − v|d x N tρ δ Q β − |v − Θδ |d x. αρ δQ\(1−αρ )δQ
(11.40)
Last step. From Lipschitz continuity of Q f (cf. Proposition 11.2.2) and according to Lemmas 11.3.1 and 11.3.2 and estimates (11.37) and (11.38), we obtain tρ tρ δN δN Dv(δQ) = lim sup Qf Dv(δQ) lim sup Qf tρ tρ δN δN ρ→0 ρ→0 tρ δN ≥ lim sup Qf a(x0 ) ⊗ b (x0 ) tρ δN ρ→0 |D u|(Cδρ (x0 )) − lim inf L 1 − ρ→0 |D u|(Cρ (x0 )) ≥ (Q f )∞ (a(x0 ) ⊗ b (x0 )) − L (1 − δ N ). (11.41) On the other hand, lim
ρ→0
1 αρ
δQ\(1−αρ )δQ
|v − Θδ |d x = 0.
(11.42)
i
i i
i
i
i
i
464
book 2014/8/6 page 464 i
Chapter 11. Relaxation in Sobolev, BV , and Young measures spaces
Indeed, from Poincaré’s inequality 1 Dv(δQ) |v − Θδ |d x ≤ C Dv − αρ δQ\(1−αρ )δQ δN δQ\(1−αρ )δQ which tends to zero when ρ goes to zero. Combining (11.40), (11.41), and (11.42), we obtain 1 lim sup lim sup f (∇un )d x ≥ (Q f )∞ (a(x0 ) ⊗ b (x0 )) − L (1 − δ N ). |D u|(Q (x )) n→+∞ ρ→0 δQρ (x0 ) ρ 0 The lower bound is then established after letting δ → 1. Proposition 11.3.4. For every u in L p (Ω, R m ), there exists a sequence (un )n∈N strongly converging to u in L1 (Ω, R m ) such that QF (u) ≥ lim sup F (un ). n→+∞
PROOF. The proof proceeds in two steps. First step: computation of cl(F )W 1,1 (Ω, R m ). According to Remark 11.2.2 and Theorem 11.2.1 for every u ∈ W 1,1 (Ω, R m ), (11.43) cl(F )(u) = Q f (∇u) d x. Ω
Second step. Let us consider the functional F˜ : L1 (Ω, R m ) −→ R+ ∪ {+∞} defined by ⎧ ⎨ Q f (∇u) d x if u ∈ W 1,1 (Ω, R m ), F˜(u) = ⎩ Ω +∞ otherwise. We claim that it is enough to establish that for all u ∈ BV(Ω, R m ), s D u cl(F˜)(u) ≤ Q f (∇u) d x + (Q f )∞ |D s u|. s |D u| Ω Ω
(11.44)
Indeed, let us assume (11.44). For every u ∈ BV(Ω, R m ), according to Proposition 11.1.1 and Theorem 11.1.1, we obtain the existence of a sequence (uk )k∈N in W 1,1 (Ω, R m ) strongly converging to u in L1 (Ω, R m ) and satisfying s D u Q f (∇u) d x + (Q f )∞ |D s u| ≥ cl(F˜)(u) s |D u| Ω Ω ≥ lim sup F˜(u ) k→+∞
= lim sup k→+∞
k
Ω
Q f (∇uk ) d x.
On the other hand, from (11.43), there exists a sequence (uk,n )n∈N in W 1,1 (Ω, R m ) strongly converging to uk in L1 (Ω, R m ) and satisfying Q f (∇uk ) d x ≥ lim sup f (∇uk,n ) d x. Ω
n→+∞
Ω
i
i i
i
i
i
i
11.3. Relaxation of integral functionals with domain W 1,1 (Ω, R m )
book 2014/8/6 page 465 i
465
Consequently ⎧ s D u ⎪ ∞ s ⎪ |D u| ≥ lim sup lim sup Q f (∇u) d x + (Q f ) f (∇uk,n ) d x, ⎨ |D s u| k→+∞ n→+∞ Ω Ω Ω ⎪ ⎪ ⎩ lim lim u = u strongly in L1 (Ω, R m ), k→+∞ n→+∞
k,n
and we conclude by a diagonalization argument: there exists n → k(n) mapping N to N such that ⎧ s D u ⎪ ∞ ⎪ |D s u|, ⎨ lim sup f (∇uk(n),n ) d x ≤ Q f (∇u) d x + (Q f ) |D s u| n→+∞ Ω Ω Ω ⎪ ⎪ ⎩ lim u = u strongly in L1 (Ω, R m ). n→+∞
k(n),n
We now establish (11.44). We use some arguments of Ambrosio and Dal Maso [25]. Let u be a fixed element of BV(Ω, R m ) and consider the set function μ : A → cl(F˜ )(u, A) defined for all bounded open subset A of Ω. The notation cl(F˜)(·, A) means that we consider the lower semicontinuous envelope of the functional F˜ localized on A, the space L1 (A, R m ) being equipped with its strong topology. Note that μ satisfies the following estimate for every open bounded subset A of Ω: (11.45) μ(A) ≤ β|A| + β |D u|. A
Indeed, applying the approximation Theorem 10.1.2, there exists a sequence (vn )n∈N in BV(A, R m ) ∩ C∞ (A, R m ) such that vn → u
strongly in L1 (A, R m ),
A
|∇vn | d x →
|D u|. A
Thus, from the growth condition (11.5) μ(A) = cl(F˜)(u, A) ≤ lim inf F (vn , A) n→+∞ ≤ β|A| + lim inf |∇vn | d x n→+∞ A = β|A| + |D u|. A
According to the definition of μ and to (11.45), one can now easily establish that for all bounded open subsets A and A of Ω, μ satisfies A ⊂ A =⇒ μ(A) ≤ μ(A ); A ∩ A = =⇒ μ(A ∪ A ) ≥ μ(A) + μ(A ); μ(A) ≤ sup{μ(A ) : A ⊂⊂ A}; μ(A ∪ A ) ≤ μ(A) + μ(A ).
i
i i
i
i
i
i
466
book 2014/8/6 page 466 i
Chapter 11. Relaxation in Sobolev, BV , and Young measures spaces
For a complete proof, consult Ambrosio and Dal Maso [25]. Consequently (consult, for instance, the book [183]), one can extend μ to a Borel measure on Ω, still denoted by μ, defined for every Borel set B of Ω by μ(B) = inf{μ(A) : A open, B ⊂ A}. By considering the Lebesgue–Nikodým decomposition (cf. Theorem 4.2.1) μ = μa + μ s of μ, it suffices now to establish, for every Borel set B of Ω, (11.46) μa (B) ≤ Q f (∇u) d x, B
s
μ (B) ≤
(Q f ) B
∞
Ds u |D s u|
|D s u|.
(11.47)
In order to estimate the singular part μ s from above, i.e., (11.47), we will need the following continuity result. Lemma 11.3.3. Let h : M m×N → R be a continuous function and (un )n∈N a sequence in BV(Ω, R m ) converging to u for the intermediate convergence. Then D un Du lim |D un | = |D u|. h h n→+∞ |D un | |D u| Ω Ω For a proof, we refer the reader to Luckhaus and Modica [281] or, in the convex case, to Demengel and Temam [199]. Let us set, for every a ∈ M m×N , g (a) = sup t >0
Q f (t a) − Q f (0) t
.
According to the rank-one convexity of Q f (cf. Remark 11.2.1), we note that if rank(a) = 1, one has g (a) = (Q f )∞ (a). For every open subset A of Ω, consider un ∈ BV(A, R m ) ∩ C∞ (A, R m ), strongly converging to u in L1 (A, R m ), and such that lim |D un |(A) = |D u|(A).
n→+∞
Such a sequence exists by Theorem 10.1.2. From f ≤ f (0)+ g and Lemma 11.3.3, one has D un μ(A) ≤ Q f (0)|A| + lim g |D un | n→+∞ |D un | A Du = Q f (0)|A| + g |D u|. |D u| A The same inequality now holds true for any Borel set B of Ω. Taking the singular part of the two members and according to the rank-one property of the singular part of D u, we obtain s D u s ∞ μ (B) ≤ (Q f ) |D s u|. |D s u| B We are going to estimate from above the absolutely continuous part μa , i.e., (11.46). Let ρn be a regularizing kernel (see Theorem 4.2.2) and set un = ρn ∗ u. From ∇un = ρn ∗ D u = ρn ∗ ∇u + ρn ∗ D s u
i
i i
i
i
i
i
11.3. Relaxation of integral functionals with domain W 1,1 (Ω, R m )
book 2014/8/6 page 467 i
467
and the local Lipschitz continuity of Q f , one has, for every open set A ⊂⊂ A, Q f (∇un ) d x ≤ Q f (ρn ∗ ∇u) d x + L |ρn ∗ D s u|(A ) A A Q f (ρn ∗ ∇u) d x + L |D s u|(A + spt(ρn )). ≤ A
Letting n → +∞ yields
μ(A ) ≤
Q f (∇u) d x + L |D s u|(A)
A
and, since A ⊂⊂ A is arbitrary,
Q f (∇u) d x + L |D s u|(A)
μ(A) ≤ A
for every open subset A of Ω. Since the Lebesgue measure and the nonnegative Borel measure |D s u| are regular, the same inequality now holds true for every Borel subset B of Ω. Taking the absolutely continuous part of each member, we finally obtain a μ (B) ≤ Q f (∇u) d x B
and the proof is complete. We now compute the lsc envelope of the integral functional in (11.34) by taking into account a boundary condition on part Γ0 of the boundary ∂ Ω of Ω. More precisely, we aim to describe the lsc envelope of the integral functional F : L1 (Ω, R m ) −→ R+ ∪ {+∞} defined by ⎧ ⎨ f (∇u) d x if u ∈ W 1,1 (Ω, R m ), Γ0 F (u) = ⎩ Ω +∞ otherwise. Corollary 11.3.1. The lsc envelope of the integral functional F is given by s ⎧ D u ⎪ ∞ ⎪ Q f (∇u) d x + (Q f ) |D s u| ⎪ ⎪ s ⎪ |D u| ⎨ Ω Ω cl(F )(u) = + (Q f )∞ (γ0 (u) ⊗ ν) d N −1 if u ∈ BV(Ω, R m ), ⎪ ⎪ ⎪ ⎪ Γ0 ⎪ ⎩ +∞ otherwise, where ν denotes the outer unit normal to Γ0 and γ0 the trace operator. In other words, in this relaxation process, the boundary condition is translated in terms of surface energy Ω (Q f )∞ ([u] ⊗ ν) d N −1 Γ0 . The proof is very similar to that of Proposition 11.3.2. For details, consult Abddaimi, Licht, and Michaille [1] and the references therein. As a consequence we obtain the relaxation theorem in the case p = 1. Theorem 11.3.2 (relaxation theorem, p = 1). Let us consider a function f : M m×N −→ R satisfying (11.5) and (11.6) and g in L∞ (Ω, R m ) satisfying g ∞ < Cα , where C p is the p
i
i i
i
i
i
i
book 2014/8/6 page 468 i
Chapter 11. Relaxation in Sobolev, BV , and Young measures spaces
468
Poincaré constant in Ω. Then the relaxed problem of
inf Ω
f (∇u) d x −
g .u d x : u Ω
∈ WΓ1,1 (Ω, R m ) 0
,
(+ )
in the sense of Theorem 11.1.2, is given by
inf Ω
Q f (∇u) d x −
Ω
(Q f )∞
Ds u |D s u|
|D s u| + −
Γ0
Ω
(Q f )∞ (γ0 (u) ⊗ ν) d N −1
(+ )
g .u d x : u ∈ BV(Ω, R m ) .
PROOF. Arguing as in the proof of Theorem 11.2.2, according to Theorem 11.1.2, it suffices to establish the inf-compactness of u → F (u) −
g .u d x Ω
in L1 (Ω, M m×N ). This property is a straightforward consequence of the estimate g ∞ < α and the compactness of the embedding of WΓ1,1 (Ω, R m ) into L1 (Ω, M m×N ). C 0
p
11.4 Relaxation in the space of Young measures in nonlinear elasticity The strategy described in Sections 11.2 and 11.3 has the disadvantage to quasi-convexify the density function f so that the relaxed functional, with density Q f , does not provide information on the oscillations of the gradient minimizing sequences. In this section, we describe an alternative way for relaxing the free energy by using the notion of the Young measure introduced in Section 4.3, well adapted for capturing oscillations of minimizing sequences (see Subsection 4.3.6). According to the point of view of Ball and James in [81] or Bhattacharya and Kohn in [99], the density of the relaxed free energy obtained this way is the microscopic free energy density corresponding to the macroscopic free energy density Q f .
11.4.1 Young measures generated by gradients To shorten notation, we denote the N -dimensional Lebesgue measure restricted to the open bounded subset Ω ⊂ RN by " . Definition 11.4.1. Let us denote by E := R mN ∼ M m×N the set of m × N matrices. A Young measure μ in 6 (Ω; E) is called a W 1, p -Young measure if there exists a bounded sequence (un )n∈N in W 1, p (Ω, R m ), p ≥ 1, such that μ is generated by the sequence of gradients (∇un )n∈N , i.e., na r
(δ∇un (x) ) x∈Ω ⊗ " μ, or equivalently, from Theorem 4.3.1, Lw
(δ∇un (x) ) x∈Ω (μ x ) x∈Ω .
i
i i
i
i
i
i
11.4. Relaxation in the space of Young measures in nonlinear elasticity
book 2014/8/6 page 469 i
469
We admit the following important technical lemma. For a proof, consult Kinderlehrer and Pedregal [257], Pedregal [316], or Fonseca, Müller, and Pedregal [218]. Lemma 11.4.1. Let (un )n∈N be a sequence in W 1, p (Ω, R m ), weakly converging to some u in W 1, p (Ω, R m ). Then, there exists a sequence (vn )n∈N in W 1, p (Ω, R m ) satisfying 1, p
(i) vn ∈ u + W0 (Ω, R m ); (ii) (|∇vn | p )n∈N is uniformly integrable; (iii) vn − un → 0 and ∇(vn − un ) → 0 in measure. Let us point out that according to item (iii) and to Proposition 4.3.8, the two sequences (∇un )n∈N and (∇vn )n∈N generate the same W 1, p -Young measure. The main result of this section is the following characterization of W 1, p -Young measures, established by Kinderlehrer and Pedregal [257] (see Pedregal [316] and Sychev [344]). Theorem 11.4.1. Let p > 1; then μ ∈ 6 (Ω; E) is a W 1, p -Young measure iff there exists u ∈ W 1, p (Ω, R m ) such that the three following assertions hold: (i) ∇u(x) = E λ d μ x (λ) for a.e. x in Ω; (ii) for all quasi-convex function φ : E → R for which there exist some γ ∈ R and β > 0 such that γ ≤ φ(λ) ≤ β(1 + |λ| p ) for all λ ∈ E, one has φ(∇u(x)) ≤ φ(λ) d μ x (λ) for a.e. x in Ω; E
(iii)
Ω×E
|λ| p d μ(x, λ) < +∞.
The function u will be referred to as the underlying deformation of the Young measure μ. PROOF. We split the proof into several steps. PROOF OF THE NECESSARY CONDITIONS. First step: Necessity of (i) and (iii). By definition, there exists a bounded sequence (un )n∈N in W 1, p (Ω, R m ) weakly converging to some u ∈ W 1, p (Ω, R m ), whose sequence of gradients generates μ. Since ∇un ∇u weakly in L p (Ω, E), one obtains (i) by applying Proposition 4.3.6. Take now ϕ(x, λ) = |λ| p . Since λ → ϕ(x, λ) is lsc (actually continuous), according to Proposition 4.3.3, we deduce |λ| p d μ(x, λ) ≤ lim inf |∇un | p d x < +∞. n→+∞
Ω×E
Ω
Second step: Necessity of (ii). Let φ be a quasi-convex function satisfying the growth condition in (ii) and x0 a fixed element in Ω such that the two following limits exist: 1 φ(∇u(x)) d x, φ(∇u(x0 )) = lim ρ→0 " (B (x )) B (x ) ρ 0 ρ 0 E
φ(λ) d μ x0 (λ) = lim
ρ→0
1 " (Bρ (x0 ))
Bρ (x0 )
! E
" φ(λ) d μ x (λ) d x.
i
i i
i
i
i
i
470
book 2014/8/6 page 470 i
Chapter 11. Relaxation in Sobolev, BV , and Young measures spaces
Such x0 exists outside a negligible set, from the Lebesgue differentiation theorem. Let us set ϕ(x, λ) = " (B1(x )) 1Bρ (x0 ) (x)φ(λ) which defines a (Ω) ⊗ (E)-measurable function, ρ
0
continuous with respect to λ. Consider now the sequence (vn )n∈N given by Lemma 11.4.1 with Ω = Bρ (x0 ). From the growth condition fulfilled by φ, the sequence (ϕ(x, ∇vn (x)))n∈N is uniformly integrable. Therefore, by applying Theorem 4.3.3, we obtain 1 ϕ(x, λ) d μ(x, λ) = lim φ(∇vn (x)) d x n→+∞ " (B (x )) Ω×E Bρ (x0 ) ρ 0 1 φ(∇u(x)) d x. (11.48) ≥ " (Bρ (x0 )) Bρ (x0 ) The last inequality is a consequence of lower semicontinuity of the integral functional v → φ(∇v(x)) d x Bρ (x0 )
when W 1, p (Bρ (x0 ), R m ) is equipped with its weak convergence, due to the quasi-convexity assumption on φ (see Theorem 13.2.1). But, according to the slicing theorem, Theorem 4.2.4, 1 ϕ(x, λ) d μ(x, λ) = φ(λ) d μ x (λ) d x " (Bρ (x0 )) Bρ (x0 ) E Ω×E so that (11.48) yields 1 " (Bρ (x0 ))
Bρ (x0 )
E
φ(λ) d μ x (λ) d x ≥
1 " (Bρ (x0 ))
Bρ (x0 )
φ(∇u(x)) d x.
The conclusion then follows by letting ρ → 0. PROOF OF THE SUFFICIENT CONDITIONS . The proof of the sufficient conditions is more involved. Furthermore, we prove that any Young measure satisfying conditions (i), (ii), and (iii) is generated by the sequence of gradients of a bounded sequence in u + 1, p W0 (Ω, R m ). For the convenience of the reader we divide the proof into several steps. First step: The Young measure μ is assumed to be homogeneous. Let a be a fixed matrix in E and Q a fixed open bounded cube of RN , and let la be the linear function defined for every x ∈ Q by la (x) = a.x. We consider the closed set a (E) of probability measures μ on E satisfying the three following conditions: (i) E λ d μ = a; p (ii) for all quasi-convex function φ satisfying for all λ ∈ E, γ ≤ φ(λ) ≤ β(1 + |λ| ) for some γ ∈ R and β > 0, one has φ(a) ≤ E φ(λ) d μ(λ); (iii) E |λ| p d μ < +∞.
Note that these three conditions are exactly the one stated in Theorem 11.4.1, fulfilled by the homogeneous Young measure μ ⊗ " Ω. In this step we set Ω = Q and we aim to 1, p establish the existence of a bounded sequence (un )n∈N in la + W0 (Q, R m ) such that the sequence (∇un )n∈N generates the Young measure μ ⊗ " Q. In the second step, we will apply this result in the case when a = 0.
i
i i
i
i
i
i
11.4. Relaxation in the space of Young measures in nonlinear elasticity
book 2014/8/6 page 471 i
471
For every v ∈ la + W 1, p (Q, R m ), let us consider the probability measure on E μv :=
1 " (Q)
Q
δ∇v(x) d x,
which acts on every ϕ ∈ C0 (E) as follows: 〈μv , ϕ〉 =
1 " (Q)
ϕ(∇v(x)) d x. Q
Note that E |λ| p d μv is well defined and precisely E |λ| p d μv = We finally consider the following convex subset of a (E): 4 5 'a (E) := μv : v ∈ la + W 1, p (Q, R m ) .
1 " (Q)
Q
|∇v(x)| p d x.
1, p
Let now v be a fixed element of la +W0 (Q, R m ), extended by Q-periodicity on RN , and, for every n ∈ N∗ , let us define the function vn : x → vn (x) := n1 v(nx) in la + W 1, p (Q, R m ). According to a classical result on oscillating functions, rephrased in terms of narrow convergence of Young measures, one has na r
(δ∇vn (x) ) x∈Q ⊗ " Q μv ⊗ " Q
when n → +∞.
(11.49)
Note that the norm of ∇vn in L p (Q, E) is exactly the one of ∇v. To conclude, it is enough to show that for every μ ∈ a (E), there exists a sequence (μwk )k∈N in 'a (E), (wk )k∈N 1, p
bounded in la + W0 (Q, R m ) such that i.e., σ(C0 (E), C0 (E));
μwk μ weakly in M(E),
hence, according to Theorem 4.3.1, and since μwk and μ are homogeneous, na r
μwk ⊗ " Q μ ⊗ " Q
when k → +∞.
(11.50)
Indeed, since the space 6 (Q; E) endowed with the topology of the narrow convergence is metrizable (see [162, Proposition 2.3.1]), combining (11.49), (11.50), and by using a diagonalization argument (cf. Lemma 11.1.1), we will show that there exists a bounded sequence (un )n∈N∗ in la +W 1, p (Q, R m ), whose sequence of gradients generates the Young measure μ ⊗ " Q. We are going to establish (11.50) or, equivalently, the density of 'a (E) in a (E) for the σ(C0 (E), C0 (E)) topology. Let us point out that we want to prove the existence of a sequence (μwm ) m∈N weakly converging to μ, such that moreover (w m ) m∈N is bounded 1, p
in la + W0 (Q, R m ) or, equivalently, such that (∇w m ) m∈N is bounded in L p (Q, E). In
order to take into account this condition, we establish 'a (E) = a (E) for a metric d finer than the classical metric d inducing the σ(C0 (E), C0 (E)) topology in the set P(E) of probability measures on E. Let us recall that given a dense countable family (ϕi )i ∈N∗ in C0 (E), the distance d is given by ∀(μ, ν) ∈ P(E) × P(E)
d (μ, ν) :=
+∞
1
i =1
2 ϕi ∞ i
|〈μ, ϕi 〉 − 〈ν, ϕi 〉|.
i
i i
i
i
i
i
book 2014/8/6 page 472 i
Chapter 11. Relaxation in Sobolev, BV , and Young measures spaces
472
We define now the distance d by setting, for every (μ, ν) ∈ P(E) × P(E), +∞ 1 p p |〈μ, ϕi 〉 − 〈ν, ϕi 〉|, d (μ, ν) := |λ| d μ − |λ| d ν + i E 2 ϕ E
i =1
i ∞
and we argue by contradiction. Let us assume that 'a (E) is not dense in a (E) for the metric associated with the distance d . Then, there exists μ0 ∈ a (E), k ∈ N∗ and η > 0 such that k 1 p p ∀ν ∈ 'a (E) |λ| d μ0 − |λ| d ν + |〈μ0 , ϕ〉 − 〈ν, ϕ〉| > η. (11.51) i E E i =1 2 ϕi ∞ We reason now in the finite dimensional space Rk+1 . From (11.51), the vector E
|λ| p d μ0 ,
1 2i ϕi ∞
〈μ0 , ϕi 〉
i =1,...,k
does not belong to the closure of the following convex set of Rk+1 : |λ| d ν,
Ca :=
E
〈ν, ϕi 〉
1
p
2i ϕi ∞
i =1,...k
: ν ∈ 'a (E) .
Consequently, according to the Hahn–Banach separation theorem, Theorem 9.1.1, there exist (ci )i =0,...k in Rk+1 and η > 0 such that for all ν ∈ 'a (E), c0
|λ| p d ν + E
≥ η + c 0
E
k
1
i =1
2 ϕi ∞ i
|λ| p d μ0 +
ci 〈ν, ϕi 〉
k
1
i =1
2 ϕi ∞ i
ci 〈μ0 , ϕi 〉.
(11.52)
Let us set φ := c0 | · | p +
k
ci
i =1
ϕi ∞
ϕi .
We claim that c0 ≥ 0. Otherwise, taking in (11.52) ν = μvt with v t := t w+la , t > 0, where 1, p
w is a fixed function in W0 (Q, R m ), and letting t → +∞, the right-hand side in (11.52) would be −∞. Replacing if necessary (when c0 = 0), the function φby the function φ+δ|·| p with δ > 0 small enough (choose precisely δ satisfying η −δ E |λ| p d μ0 > 0), one may assume c0 > 0. Thus the function φ still satisfies (11.52) for η > 0 replaced p by η − δ E |λ| d μ0 > 0 if necessary, together with the growth conditions c0 |λ| p + γ ≤ φ(λ) ≤ β(1+|λ| p ) for some constants γ ∈ R and β > 0. Finally, replacing φ by φ−γ , one may assume that φ satisfies (11.52) and the growth conditions c0 |λ| p ≤ φ(λ) ≤ β(1+|λ| p ) of Proposition 11.2.2. Therefore, (11.52) and condition (ii) satisfied by μ0 yield inf
1 " (Q)
Q
1, p φ(∇v(x)) d x : v ∈ la +W0 (Q, R m ) > φ d μ0 ≥ Qφ d μ0 ≥ Qφ(a). E
E
i
i i
i
i
i
i
11.4. Relaxation in the space of Young measures in nonlinear elasticity
book 2014/8/6 page 473 i
473
But, according to the classical variational principle (Proposition 11.2.2), since φ satisfies appropriate growth conditions, 1 1, p m φ(∇v(x)) d x : v ∈ la + W0 (Q, R ) = Qφ(a), inf " (Q) Q a contradiction. Second step: The Young measure μ satisfies conditions (i), (ii), and (iii) of Theorem 11.4.1, and u = 0. We want to construct a bounded sequence (un )n∈N in W 1, p (Ω, R m ) whose gradients generate μ. The idea of the proof consists in “localizing” (μ x ) x∈Ω thanks to Vitali’s covering theorem, to apply the previous step for each localization, then to stick together the sequences of functions whose gradients generate each localized Young measures. According to Vitali’s covering lemma, Lemma 4.1.2, and Remark 4.1.4, for every k ∈ N∗ , there exists a finite family (Qi ,k )i ∈Ik of pairwise disjoint open cubes included in Ω satisfying 3 1 1 " Ω\ (11.53) Qi ,k ≤ , diam(Qi ,k ) < . k k i ∈I k
For every i ∈ Ik , let us define the probability measure μi ,k on E by 1 μi ,k := μ d x, " (Qi ,k ) Qi,k x which acts on every ϕ ∈ C0 (E) as follows: 1 〈μi ,k , ϕ〉 = ϕ(λ) d μ x (λ) d x. " (Qi ,k ) Qi,k E With the notation of the first step, it is easy to show that μi ,k belongs to 0 (E). Consequently, according to the first step, for every i ∈ Ik , there exists a bounded sequence 1, p (vi ,k,n )n∈N in W0 (Qi ,k , R m ) such that na r
(δ∇vi,k ,n (x) ) x∈Qi,k ⊗ " Qi ,k μi ,k ⊗ " Qi ,k
when n → +∞.
(11.54)
By using Lemma 11.4.1, one may furthermore assume the sequence (|∇vi ,k,n | p )n∈N uniformly integrable so that, from Theorem 4.3.3, (11.54) yields p |∇vi ,k,n | d x = |λ| p d μi ,k d x lim n→+∞
Qi,k
Qi,k
=
E
1 "1(0,1) (Qi ,k )
p
Qi,k
E
|λ| d μ x d x.
(11.55)
We now stick together the functions vi ,k,n , i ∈ Ik , by setting ⎧ ⎨ vi ,k,n if x ∈ Qi ,k , vk,n (x) := ⎩ 0 if x ∈ Ω \ Qi ,k . i ∈I k
1, p Clearly (vk,n )n∈N is a bounded sequence in W0 (Ω, R m ). Take now θ in a dense subset of regular functions of L1 (Ω), ϕ ∈ C0 (E) and set Rk := Ω\ Q θ(x)ϕ(0) d x. Note that i,k i∈I k
i
i i
i
i
i
i
book 2014/8/6 page 474 i
Chapter 11. Relaxation in Sobolev, BV , and Young measures spaces
474
from (11.53), limk→+∞ Rk = 0. From (11.54), the definition of μi ,k , and according to the mean value theorem, one has θ(x)ϕ(∇vk,n (x)) d x = lim θ(x)ϕ(∇vk,n (x)) d x + Rk lim n→+∞
n→+∞
Ω
=
i ∈Ik
= =
i ∈Ik
Qi,k
i ∈Ik
Qi,k
i ∈Ik
Qi,k
Qi,k
θ(x)〈μi ,k , ϕ〉 d x + Rk
ϕ d μy d y
1
" (Qi ,k ) Qi,k θ(xi ,k ) ϕ d μy d y + R k E
θ(x) d x + Rk (11.56)
E
for some xi ,k ∈ Qi ,k . We stress the fact that the convergence in (11.56) may be taken in the narrow convergence sense. Indeed, setting μi ,k if x ∈ Qi ,k , ˜x = μ δ0 if x ∈ Ω \ i ∈I Qi ,k , k
˜ x ) x∈Ω ⊗ " and estimate (11.56) yields one defines a Young measures (μ lim (δ∇vk ,n (x) ) x∈Ω ⊗ " = (μ˜ x ) x∈Ω ⊗ "
n→+∞
in the narrow convergence sense in 6 (Ω; E). Letting k → +∞ in (11.56), from (11.53), one obtains lim lim θ(x)ϕ(∇vk,n (x)) d x = lim θ(xi ,k ) ϕ d μy d y k→+∞ n→+∞
k→+∞
Ω
= lim
k→+∞
Ω
Qi,k
i ∈Ik
=
i ∈Ik
θ(y)
Qi,k
θ(y) E
E
E
ϕ d μy d y
ϕ d μy d y.
This shows that limk→+∞ limn→+∞ (δ∇vk ,n (x) ) x∈Ω ⊗ " = (μ x ) x∈Ω ⊗ " , where each limit must be taken in the narrow convergence sense in 6 (Ω; E). According to the diagonalization lemma, Lemma 11.1.1, there exists a map n → k(n) such that na r
(δ∇vk(n),n (x) ) x∈Ω ⊗ " (μ x ) x∈Ω ⊗ " when n → +∞. Furthermore, the sequence (vk(n),n )n∈N is bounded in W 1, p (Ω, R m ). Setting un := vk(n),n shows what required. Last step: The Young measure μ satisfies conditions (i), (ii), and (iii) of Theorem 11.4.1 without condition on u in W 1, p (Ω, R m ). Given μ ∈ 6 (Ω; E) satisfying (i), (ii), and (iii), ˜ = (μ ˜ x ) x∈Ω ⊗ " in 6 (Ω; E) defined by 〈μ˜ x , ϕ〉 = 〈μ x , ϕ(. − ∇u(x))〉 let us consider μ ˜ satisfies for every ϕ ∈ Lμp (E) and for a.e. x in Ω. It is straightforward to check that μ x
1, p
the conditions of the second step. Thus, there exists a sequence (vn )n∈N in W0 (Ω, R m )
i
i i
i
i
i
i
11.4. Relaxation in the space of Young measures in nonlinear elasticity
book 2014/8/6 page 475 i
475
˜ Consider un := vn + u in W 1, p (Ω, R m ). For each ψ ∈ such that (∇vn )n∈N generates μ. ˜ λ) := ψ(x, λ + ∇u(x)). Thus, for C b (Ω; E), let us define ψ˜ in C b (Ω; E) by setting ψ(x, every ψ ∈ C b (Ω; E) we have ˜ ∇v (x)) d x lim ψ(x, ∇un (x)) d x = lim ψ(x, n n→+∞ n→+∞ Ω Ω ˜ λ) d μ(x, ˜ λ) = ψ(x, =
Ω×E
Ω×E
ψ(x, λ)d μ(x, λ),
which proves that (∇un )n∈N generates μ. Remark 11.4.1. When considering functions depending on x and u, the necessary condition (ii) must be replaced by condition (ii) below: for every Φ : Ω × R m × E −→ R such that for a.e. x in Ω, φ(x, u(x), .) is quasi-convex and satisfies γ ≤ φ(x, ξ , λ) ≤ β(1 + |ξ | p + |λ| p ) for some constants γ and β > 0, one has φ(x, u(x), ∇u(x)) ≤ φ(x, u(x), λ) d μ x (λ) for a.e. x ∈ Ω. E
Indeed, it suffices to argue as in the second step of the proof of the necessary conditions by setting 1 ϕ(x, λ) = (x)φ(x, u(x), λ), 1 " (Bρ (x0 )) Bρ (x0 ) where x0 is such that the two following limits exist: 1 φ(x0 , u(x0 ), ∇u(x0 )) = lim φ(x, u(x), ∇u(x)) d x, ρ→0 " (B (x )) B (x ) ρ 0 ρ 0 1 φ(x0 , u(x0 ), λ) d μ x0 (λ) = lim φ(x, u(x), λ) d μ x d x. ρ→0 " (B (x )) B (x ) E E ρ 0 ρ 0
11.4.2 Relaxation of classical integral functionals in (Ω; E ) We intend to apply the relaxation procedure for the integral functionals of Section 11.2, but considered as living in the space X = 6 (Ω; E), equipped with the topology of the narrow convergence. The generalized solutions of the relaxed problem may be interpreted as microstructures: they capture highly oscillatory minimizing sequences on smaller and smaller spatial scales and describe fine phase mixtures in elastic crystals. Fundamental applications to real materials and polycrystals may be found in [96], [97], [98], [99], and [172]. For papers dealing with laminates and multiwell problems, see [81], [307], [343], [317], and [365]. We consider the problem (+ )
5 4 1, p inf F (u) : u ∈ u0 + W0 (Ω, R m )
(:= inf(+ )),
i
i i
i
i
i
i
book 2014/8/6 page 476 i
Chapter 11. Relaxation in Sobolev, BV , and Young measures spaces
476
where u0 is a given function in W 1, p (Ω, R m ), p > 1, and F : W 1, p (Ω, R m ) −→ R is the integral functional defined by f (x, u, ∇u) d x. F (u) = Ω
The density f : Ω × RN × E −→ R is assumed to be (Ω) ⊗ (RN ) ⊗ (E)-measurable, continuous with respect to the third variable, to satisfy the continuity assumption | f (x, ξ , λ) − f (x, ξ , λ)| ≤ L|ξ − ξ |(1 + |ξ | p−1 + |ξ | p−1 )
(11.57)
with respect to the second variable, and the usual bounds α(|λ| p − 1) ≤ f (x, ξ , λ) ≤ β(1 + |ξ | p + |λ| p ), where L > 0, 0 < α < β are three given constants. Take, for example, f (x, ξ , λ) = f0 (x, ξ ) + f1 (x, λ). For less restrictive continuity assumptions on f , see Acerbi and Fusco [3], Dacorogna [182], and Pedregal [316]. In general (+ ) has no solution and the relaxation procedure applied to this problem with X = W 1, p (Ω, R m ) equipped with its weak convergence leads to the classical relaxed problem 4 5 1, p min F (u) : u ∈ u0 + W0 (Ω, R m )
(+ )
(:= min(+ )),
where F is the integral functional defined on W 1, p (Ω, R m ) by F (u) = Q f (x, u, ∇u) d x, Ω
and, for a.e. x in Ω, Q f (x, ξ , .) is the quasi-convex envelope of f (x, ξ , .). According to Theorem 11.1.2 and arguing as in the proof of Theorem 11.2.2, one can establish inf(+ ) = min(+ ). Because of the quasi-convexification, this procedure has the disadvantage of erasing the possible potential wells of f (x, ξ , .) so that the relaxed problem does not provide much information on the behavior of minimizing sequences (see Example 11.4.1 below or examples given in Geymonat [224]). An alternative way, as we will see now, is to relax (+ ) in the space 6 (Ω; E) equipped with the narrow convergence. Let us first give some definitions and notation. For a fixed u0 in W 1, p (Ω, R m ), let G60 (Ω; E) be the set of W 1, p -Young measures such that the underlying deformation u 1, p defined in Theorem 11.4.1 belongs to u0 + W0 (Ω, R m ). We define its subset of elementary W 1, p -Young measures by 4 5 1, p EG60 (Ω; E) := μ ∈ 6 (Ω; E) : ∃u ∈ u0 + W0 (Ω, R m ), μ = (δ∇u(x) ) x∈Ω ⊗ " . 1, p
Since u ∈ u0 + W0 (Ω, R m ) is uniquely defined by its gradient ∇u ∈ L p (Ω, E), the operator 1, p m T : G60 (Ω; E) −→ u0 + W0 (Ω, R ), μ → T μ := u, ∇u(x) = λ d μ x (λ), E
is well defined.
i
i i
i
i
i
i
11.4. Relaxation in the space of Young measures in nonlinear elasticity
book 2014/8/6 page 477 i
477
We now reformulate the problem (+ ) in terms of Young measures by considering the integral functional G : 6 (Ω; E) −→ R ∪ {+∞}, defined by ⎧ ⎨ f (x, T μ(x), λ) d μ(x, λ) if μ ∈ EG60 (Ω; E), G(μ) = Ω×E ⎩ +∞ otherwise. Note that, in its domain, the functional G is nothing but the functional F . More precisely, f (x, u(x), ∇u(x)) d x = F (u) G(μ) = Ω
1, p
when u ∈ u0 + W0 (Ω, R m ) and μ = (δ∇u(x) ) x∈Ω ⊗ " . The problem (+ ) is then equivalent to 4 5 inf G(μ) : μ ∈ 6 (Ω; E) . On the other hand, let us define the integral functional G : 6 (Ω; E) −→ R ∪ {+∞} by setting ⎧ ⎨ f (x, T μ(x), λ) d μ(x, λ) if μ ∈ G60 (Ω; E), G(μ) = Ω×E ⎩ +∞ otherwise. According to Theorem 11.4.1(iii) and to the growth conditions fulfilled by f , the domain of G is G60 (Ω; E). The functional G is nothing but the natural extension of G to G60 (Ω; E). In Theorem 11.4.2, we show that G is the sequential lower semicontinuous envelope of G when 6 (Ω; E) is equipped with the narrow convergence and we make precise the relationship between the two relaxed problems in W 1, p (Ω, R m ) and 6 (Ω; E). Theorem 11.4.2. The sequential lower semicontinuous envelope of G in 6 (Ω; E) equipped with the narrow convergence is the extended functional G. Moreover, we have inf(+ ) = min(+ ) = min(+ where min(+
yo un g
yo un g
),
4 5 ) := min G(μ) : μ ∈ 6 (Ω; E) ,
yo un g
and, if μ is a solution of min(+ ), then u = T μ is a solution of min(+ ). Furthermore, if (un )n∈N is a sequence of 1/n-minimizers of the problem inf(+ ), then every cluster point of yo un g ). (δ∇un (x) ) x∈Ω ⊗ " converges narrowly to a solution of min(+ PROOF. We begin by proving that G is the lsc envelope of G. According to Proposition 11.1.1 and Theorem 11.1.1 we must establish the two following assertions: for every μ ∈ 6 (Ω; E), na r
∀μn μ, G(μ) ≤ lim inf G(μn ),
(11.58)
n→+∞ na r
there exists a sequence (νn )n∈N in 6 (Ω; E), νn μ, such that G(μ) ≥ lim sup G(νn ). n→+∞
(11.59)
i
i i
i
i
i
i
478
book 2014/8/6 page 478 i
Chapter 11. Relaxation in Sobolev, BV , and Young measures spaces
In a second step, in order to apply Theorem 11.1.1, we will establish the compactness of every minimizing sequence of inf(+ ), in terms of Young measures. We will conclude by yo un g giving the relations between a solution of min(+ ) and the corresponding underlying deformation. First step: Proof of (11.58). One may assume lim infn→+∞ G(μn ) < +∞, so that, for a nonrelabeled subsequence, one has G(μn ) < +∞, μn ∈ EG60 (Ω; E), and f (x, un (x), ∇un (x)) d x. G(μn ) = Ω
1, p
According to the coerciveness assumption on f , there exists u ∈ u0 + W0 (Ω, R m ) such that un u weakly in W 1, p (Ω, R m ) and strongly in L p (Ω, R m ). Let us write f (x, un (x), ∇un (x)) d x G(μn ) = Ω f (x, u(x), λ) d μn (x, λ) = Ω×E
+
Ω
f (x, un (x), ∇un (x)) d x −
Ω
f (x, u(x), ∇un (x)) d x .
From continuity assumption (11.57), and the boundedness of the sequence (un )n∈N in W 1, p (Ω, R m ), the second term in the right-hand side tends to zero. Since, for a.e. x ∈ Ω, λ → f (x, u(x), λ) is lsc (actually continuous), the conclusion then follows from Proposition 4.3.3. Second step: Proof of (11.59). One may assume G(μ) < +∞ so that μ ∈ G60 (Ω; E). 1, p According to the definition of G60 (Ω; E), there exists u ∈ u0 +W0 (Ω, R m ) and a bounded sequence (un )n∈N in W 1, p (Ω, R m ) such that μn = (δ∇un (x) ) x∈Ω ⊗ " narrowly converges to μ. But, from Lemma 11.4.1, there exists a sequence (vn )n∈N in W 1, p (Ω, R m ) satisfying 1, p
(i) vn ∈ u + W0 (Ω, R m ); (ii) (|∇vn | p )n∈N is uniformly integrable; (iii) vn − un → 0 and ∇(vn − un ) → 0 in measure. Hence by (iii) and Proposition 4.3.8, νn := (δ∇vn (x) ) x∈Ω ⊗ " narrowly converges to μ. Moreover, it is easily seen that, up to a subsequence, vn strongly converges to u in L p (Ω, R m ). From now on we consider a nonrelabeled subsequence of (vn )n∈N such that vn strongly converges to u in L p (Ω, R m ). Let us write f (x, vn , ∇vn ) d x = f (x, u, ∇vn ) d x Ω Ω + f (x, vn , ∇vn ) d x − f (x, u, ∇vn ) d x . (11.60) Ω
Ω
According to the growth condition satisfied by f , the sequence ( f (x, u, ∇vn ))n∈N is also uniformly integrable, so that, by Theorem 4.3.3, lim f (x, u, ∇vn ) d x = f (x, u, λ) d μ(x, λ). (11.61) n→+∞
Ω
Ω×E
i
i i
i
i
i
i
11.4. Relaxation in the space of Young measures in nonlinear elasticity
book 2014/8/6 page 479 i
479
On the other hand, from (11.57) and the boundedness of the sequence (vn )n∈N in L p (Ω, R m ), the second term in the right-hand side of (11.60) tends to zero. Collecting (11.60) and (11.61), we obtain lim G(νn ) = G(μ),
n→+∞
which completes the proof of the second step. Third step: Compactness of minimizing sequences. Let (μn )n∈N be a minimizing se1, p quence of inf(+ ). Since G(μn ) < +∞, there exists (un )n∈N in u0 + W0 (Ω, R m ) such that μn = (δ∇un (x) ) x∈Ω ⊗ " and G(μn ) = F (un ). According to the coerciveness assumption on f , the sequence (∇un )n∈N is bounded in L p (Ω, E); thus, from Remark 4.3.3, the sequence (μn )n∈N is tight. The conclusion then follows by applying Prokhorov’s theorem, Theorem 4.3.2. yo un g ) and u = T μ. According to Theorem Last step. Let μ be a solution of min(+ 11.4.1(ii) and Remark 11.4.1, we have, for a.e. x ∈ Ω, Q f (x, u(x), ∇u(x)) ≤
E
≤ E
Q f (x, u(x), λ) d μ x (λ) f (x, u(x), λ) d μ x (λ).
Therefore Ω
Q f (x, u(x), ∇u(x)) d x ≤
Ω×E
f (x, T μ(x), λ) d μ(x, λ)
= min(+
yo un g
) = inf(+ ) = min(+ ).
This shows that u = T μ is a solution of min(+ ). The last assertion is a straightforward consequence of Theorem 11.1.2. Example 11.4.1. Let us illustrate Theorem 11.4.2 with the example treated in Geymonat [224]: take Ω = (0, 1), p = 4, u0 = 0, and ϕ(ξ , λ) = (λ2 − 1)2 + ξ 2 ; see Figure 11.1. The map λ → ϕ(ξ , λ) possesses the two potential wells ±1 and inf(+ ) = 0 has no solution.
Figure 11.1. The graph of λ → ϕ(ξ , λ) with ξ = 1.
i
i i
i
i
i
i
480
book 2014/8/6 page 480 i
Chapter 11. Relaxation in Sobolev, BV , and Young measures spaces
For each fixed ξ , the quasi-convex envelope of λ → (λ2 − 1)2 + ξ 2 (equivalently, its convex envelope) is given by < 2 (λ − 1)2 + ξ 2 if |λ| ≥ 1, Qϕ(x, ξ , λ) = ξ 2 otherwise, so that min(+ ) = 0 possesses a unique solution u = 0. Therefore, from Theorem 11.4.2 yo un g we have min(+ ) = 0 and if μ is a solution 2 2 2 ((λ − 1) + (T μ) )d μ x (λ) d x = 0, x a.e. on (0, 1), (0,1) R 0 = λ d μ x (λ). R
From the first equality, we deduce that the support of μ x is include in {−1, 1}. Therefore, the second equality gives μ x = 12 δ−1 + 12 δ1 and μ = ( 12 δ−1 + 12 δ1 ) ⊗ " .
11.5 Mass transportation problems The mass transportation theory goes back to Gaspard Monge: in 1781 he proposed in [295] a model to describe the work necessary to move a mass distribution μ1 = ρ1 d x into a final destination μ2 = ρ2 d x (respectively, déblais and remblais in Monge’s terminology), once the unitary transportation cost function c(x, y), which measures the work to move a unit mass from x to y, is given. Monge’s goal was to find, among all possible transportation maps T which move μ1 into μ2 , i.e., such that μ2 (E) = μ1 T −1 (E) for every measurable set E, a map with minimal total transportation cost, defined as c x, T (x) d μ1 . The measures μ1 and μ2 on RN (more generally on metric spaces) have equal mass (normalized to one for simplicity) and are called marginals; the operator T # μ(E) = μ T −1 (E) is called the push forward operator and is characterized by the fact that φ d T #μ = φ ◦ T d μ for every Borel function φ : RN → R which makes the integrals above meaningful. Note that when the measures μ1 and μ2 are absolutely continuous with densities ρ1 and ρ2 , respectively, and T is smooth and injective, the push forward equality reads ρ2 (T (x))| det ∇T (x)| = ρ1 (x)
for a.e. x ∈ RN .
With the notation above, the optimal mass transportation problems is written as # (11.62) min c x, T (x) d μ1 : T μ1 = μ2 .
i
i i
i
i
i
i
11.5. Mass transportation problems
book 2014/8/6 page 481 i
481
The literature on mass transportation problems is very wide and several excellent publications are available; our goal here is not to be exhaustive but only to give to the reader a short presentation of the field, referring, for instance, to Ambrosio [20], Brenier [133], Evans and Gangbo [210], Villani [358], [359], and the references therein. The natural framework for this kind of problems is the one where X is a metric space and μ1 , μ2 are probabilities on X ; however, the existence of an optimal transport map is a very delicate question, even in the classical Monge case, where X is the Euclidean space RN and c(x, y) = |x − y|. When μ1 , μ2 are singular measures, easy counterexamples show that an optimal transport map may not exist. Example 11.5.1. On the real line R consider the probabilities μ1 = δ0 and μ2 = 12 δ−1 + 1 δ . Then no transport map between μ1 and μ2 exists, and so the Monge problem (11.62) 2 1 is meaningless for any cost function c(x, y). The same situation occurs in general when μ1 has h atoms and μ2 has k atoms with h < k. When the marginal measures are discrete (i.e., finite sums of Dirac masses) and with the same number of atoms, μ1 =
k 1
k
i =1
δ xi ,
μ2 =
k 1
k
i =1
δyi ,
the Monge problem becomes a problem in combinatorial optimization. In fact, a transport map T corresponds to a permutation of the indices and, taking, for instance, c(x, y) = |x − y|, the Monge problem becomes min
k i =1
|xi − y h(i ) | : h permutation of 1, . . . , k .
An example with k = 4 is shown in Figure 11.2.
Figure 11.2. A transport map in a discrete case.
Example 11.5.2. It may also happen that the family of admissible transport maps in nonempty but an optimal transport map does not exist: take, for instance, in R2 the segments A = {(0, y) : y ∈ [0, 1]}, B = {(1, y) : y ∈ [0, 1]}, C = {(−1, y) : y ∈ [0, 1]}, and the probabilities μ1 = 1 A, μ2 = 12 1 B + 12 1 C (see Figure 11.3). In this case the family of transport maps between μ1 and μ2 is nonempty; for instance, the map < (−1, 2y) if y ∈ [0, 1/2], T (0, y) = (1, 2y − 1) if y ∈ ]1/2, 1]
i
i i
i
i
i
i
book 2014/8/6 page 482 i
Chapter 11. Relaxation in Sobolev, BV , and Young measures spaces
482
C
A
B
Figure 11.3. A case in which an optimal transport map does not exist.
moves μ1 into μ2 with cost $
|x − T (x)| d μ1 =
5
4
+ log
$ 1+ 5 2
≈ 1.04.
Taking the maps Tn (0, y) =
(−1, 2y − nk ) (1, 2y − k+1 ) n
if y ∈ [ nk , k+1 ] and k is even, n if y ∈ ] nk , k+1 [ and k is odd, n
an easy computation shows that lim
n→∞
|x − Tn (x)| d μ1 = 1
so that inf
#
|x − T (x)| d μ1 : T μ1 = μ2 ≤ 1.
(11.63)
On the other hand, for every transport map T we have |x − T (x)| ≥ 1
∀x ∈ A,
which implies that the infimum in (11.63) is equal to 1. However, the infimum in (11.63) cannot be attained, because this would imply that for a transport map T we have |x − T (x)| = 1
for a.e. x ∈ A,
which is impossible. Example 11.5.3. In general, we cannot expect the uniqueness of optimal transport maps; for instance, in the one-dimensional case and with c(x, y) = |x −y|, the following example, known as book shifting for the analogy of the movement of volumes between shelves, shows that there are infinitely many optimal transport maps. Let " 1 be the Lebesgue
i
i i
i
i
i
i
11.5. Mass transportation problems
book 2014/8/6 page 483 i
483
measure on R; consider μ1 = " 1 [0, 1] and μ2 = " 1 [1, 2]. It is not difficult to verify that both the transport maps T1 (x) = 1 + x (translation) and T2 (x) = 2 − x (reflection) are optimal with cost 1 1 |x − T1 (x)| d x = |x − T2 (x)| d x = 1. 0
0
In a similar way we may construct infinitely many optimal transport maps, for instance, subdividing the interval [0, 1] into n equal intervals Ik and the interval [1, 2] into n equal intervals Jk , and transporting every Ik into Jk by maps similar to T1 and T2 . Actually, in this case every transport map is optimal. In fact, by the definition of the push forward operator, we have
1
T (x) d x = 0
1
2
3 x dx = ; 2
therefore, since T (x) ∈ [1, 2] and x ∈ [0, 1], we obtain
1
|x − T (x)| d x = 0
1 0
3 1 T (x) − x d x = − = 1. 2 2
The original problem was stated by Monge with the cost function c(x, y) = |x − y|; in this case the existence of an optimal transport map is a very delicate question; the following result was shown by Sudakov [339] in an incomplete form and successively completely proved by several authors (see Ambrosio and Pratelli [28], Evans and Gangbo [210], Caffarelli, Feldman, and McCann [163], and Trudinger and Wang [353]). space RN with a compact Theorem 11.5.1. Let μ1 , μ2 be two probabilities on the Euclidean support or more generally with finite first-order moments |x| d μi . Assume that μ1 is absolutely continuous with respect to the Lebesgue measure. Then the optimal transport problem
min
#
|x − T (x)| d μ1 : T μ1 = μ2
admits a solution. To avoid the difficulties related to the existence of optimal transport maps, in 1942 Kantorovich proposed [252] a relaxed formulation of the Monge transport problem: the goal is now to find a probability on the product space, which minimizes the relaxed transportation cost (11.64) K(μ0 , μ1 ) = c(x, y) γ (d x, d y) over all admissible transport plans γ , that is, probabilities on RN × RN such that the projections π1# γ and π2# γ coincide with the marginals μ1 and μ2 , respectively. Note that the projection condition above on the transport plan γ can be equivalently stated as γ (B × RN ) = μ1 (B)
and
γ (RN × B) = μ2 (B)
for every Borel set B ⊂ RN .
Moreover, every transport map T can be seen as a transport plan γ , given by γ = (I d × T )# μ1 .
i
i i
i
i
i
i
484
book 2014/8/6 page 484 i
Chapter 11. Relaxation in Sobolev, BV , and Young measures spaces
The Kantorovich problem then reads min c(x, y) γ (d x, d y) : π#j γ = μ j for j = 1, 2 .
(11.65)
The cases c(x, y) = |x − y| p with p ≥ 1 have been particularly studied, and the cost K(μ0 , μ1 ) in (11.64) provides, through the relation 1/ p , W p (μ0 , μ1 ) = K(μ0 , μ1 ) the so-called Wasserstein distance. This distance metrizes the weak* convergence on the space P(Ω) of probabilities supported in a compact set Ω. A very wide literature on the subject is available; we mention, for instance, the books [27], [358], [359], where one can find a complete list of references. Theorem 11.5.2. Assume that the cost function c is lower semicontinuous and bounded from below. Then the optimization Kantorovich problem (11.65) admits a solution. PROOF. Let (γn )n∈N be a minimizing sequence for the Kantorovich optimization problem (11.65). If H , K are compact subsets of RN we have ! " γn (RN × RN \ H × K) = γn (RN \ H ) × RN ∪ RN × (RN \ K) ≤ γn (RN \ H ) × RN + γn RN × (RN \ K) = μ1 (RN \ H ) + μ2 (RN \ K), where the last equality follows from the projection property π1# γ = μ1 ,
π2# γ = μ2 .
Therefore, the sequence of measures (γn )n∈N is tight, in the sense of Theorem 4.2.3 (see also Definition 4.3.2), and the Prokhorov compactness, Theorem 4.2.3, allows to extract a subsequence (γnk )k∈N which converges narrowly to a probability γ on RN × RN in the sense of Definition 4.2.2. We show now that γ is a transport plan, that is, it satisfies the projection property: take a Borel set E of RN ; for every compact set K ⊂ E and every open set A ⊃ E, by using the Alexandrov proposition, Proposition 4.2.3, and the projection property of γn , we have μ1 (K) = lim sup γn (K × RN ) ≤ γ (K × RN ) ≤ γ (E × RN ) n→∞
≤ γ (A × RN ) ≤ lim inf γn (A × RN ) = μ1 (A). n→∞
Since K and A were arbitrarily chosen, we have γ (E × RN ) = μ1 (E). In a similar way we obtain the second projection property γ (RN × E) = μ2 (E). To conclude, it remains to show that the transport plan γ is optimal for the Kantorovich problem (11.65). Since the cost function c(x, y) is lower semicontinuous, there exists an
i
i i
i
i
i
i
11.5. Mass transportation problems
book 2014/8/6 page 485 i
485
increasing sequence ck (x, y) k∈N of continuous and bounded functions which converge to c(x, y) pointwise. Then, by the narrow convergence of γn to γ , we have for every k ∈ N ck (x, y) d γ (x, y) = lim ck (x, y) d γn (x, y) n→∞ ≤ lim inf c(x, y) d γn (x, y) = inf(11.65). n→∞
Passing now to the limit as k → ∞ gives the conclusion. Optimal transportation problems have strong links with several research fields and applications; we list some of them, pointing the interested reader to the related references: • shape and density optimization for elastic structures (see, for instance, [113], [115]); • optimization of transportation networks (see, for instance, [130], [156], [161]); • problems in urban planning (see, for instance, [158], [159], [157]); • optimal location and irrigation problems (see, for instance, [116], [117], [132], [154], [160]); • traffic models with congestion effects (see, for instance, [164], [165], [360]); • curves in the space of measures and applications to crowd movements (see, for instance, [131], [244], [289], [290]).
i
i i
i
i
i
i
book 2014/8/6 page 487 i
Chapter 12
Γ -convergence and applications
12.1 Γ -convergence in abstract metrizable spaces Given a metrizable space, or more generally a first countable topological space, we would like to define a convergence notion on the space of extended real-valued functions F : X −→ R ∪ {+∞} so that the maps F → inf F , X
F → arg minX F
are sequentially continuous. More precisely, given Fn , F : X → R ∪ {+∞}, under some compactness hypotheses, we wish that the following implications hold true when n → +∞: Fn → F =⇒ inf Fn → min F ; X
X
Fn → F , un ∈ arg minX Fn =⇒ un → u ∈ arg minX F at least for a subsequence. It is worth noticing that such convergence theory contains in some sense the theory of relaxation of Chapter 11. Indeed, according to the relaxation theorem, Theorem 11.1.2, one has Fn ≡ F → cl(F ) =⇒ inf = min cl(F ) X
X
and every relatively compact minimizing sequence possesses a subsequence converging to u ∈ arg minX F . Therefore, constant sequences Fn ≡ F must converge to cl(F ) in the sense described above (see Remark 12.1.1). Such an issue is of central importance in the calculus of variations. Indeed, many problems arising from physics, mechanics, economics, and approximation methods in numerical analysis are modeled by means of minimization of functionals depending on some parameter, here formally denoted by n. For instance, we write Fn for F , where is a small parameter associated to a thickness, a stiffness in mechanics, or a size of small discontinuities. Then, if the model associated with Fn possesses a variational formulation, the problem of finding a functional F asymptotically equivalent to Fn , formally written F ∼ Fn , must be posed in terms of variational analysis: F ∼ Fn means that when n tends to infinity (or tends to zero), infX F ∼ minX Fn ; x ∈ arg minX F ∼ xn ∈ n -arg minX Fn , n → 0, in the sense of some suitable topology on X . 487
i
i i
i
i
i
i
book 2014/8/6 page 488 i
Chapter 12. Γ -convergence and applications
488
The notion of Γ -convergence, introduced by De Giorgi and Franzoni [197] and studied in this section, corresponds to that issue. Definition 12.1.1. Let (X , d ) be a metrizable space, or more generally a first countable topological space, (Fn )n∈N a sequence of extended real-valued functions Fn : X −→ R∪{+∞}, and F : X −→ R ∪ {+∞}. The sequence (Fn )n∈N (sequentially) Γ -converges to F at x ∈ X iff both the following assertions hold: (i) for all sequences (xn )n∈N converging to x in X , one has F (x) ≤ lim inf Fn (xn ); n→+∞
(ii) there exists a sequence (yn )n∈N converging to x in X such that F (x) ≥ lim sup Fn (yn ). n→+∞
When (i) and (ii) hold for all x in X , we say that (Fn )n∈N Γ -converges to F in (X , d ) and we write F = Γ − limn→+∞ Fn . Note that trivially the system of assertions (i) and (ii) is equivalent to (i) and (ii) : (i) for all sequences (xn )n∈N converging to x in X , one has F (x) ≤ lim inf Fn (xn ); n→+∞
(ii) there exists a sequence (yn )n∈N converging to x in X such that F (x) = lim Fn (yn ). n→+∞
Remark 12.1.1. Let us consider the constant sequence (Fn )n∈N , where Fn = F : X −→ R ∪ {+∞} is a given function. From Chapter 11, this sequence does not Γ -converge to F but converges to the lsc envelope cl(F ) of F . Consequently, the Γ -convergence is not in general associated with a topology on the family of all functions F : X −→ R ∪ {+∞}. For a detailed analysis of subfamilies of functions on which the Γ -convergence is endowed by a topology, we refer the interested reader to [183]. Remark 12.1.2. Let us recall the definition of the set convergence. Let (Cn )n∈N be a sequence of subsets of a metric space (X , d ) or more generally of any topological space. The lower limit of the sequence (Cn )n∈N is the subset of X denoted by lim infn→+∞ Cn and defined by 4 5 lim inf Cn = x ∈ X : ∃xn → x, xn ∈ Cn ∀n ∈ N . n→+∞
The upper limit of the sequence (Cn )n∈N is the subset of X denoted by lim supn→+∞ Cn and defined by 4 5 lim sup Cn = x ∈ X : ∃(nk )k∈N , ∃(xk )k∈N , ∀k, xk ∈ Cnk , xk → x . n→+∞
The sequence (Cn )n∈N is said to be convergent if the following equality holds: lim inf Cn = lim sup Cn . n→+∞
n→+∞
i
i i
i
i
i
i
12.1. Γ -convergence in abstract metrizable spaces
book 2014/8/6 page 489 i
489
The common value is called the limit of (Cn )n∈N in the Painlevé–Kuratowski sense and denoted by limn→+∞ Cn . Therefore, by definition x ∈ C := limn→+∞ Cn iff the two following assertions hold: ∀x ∈ C , ∃(xn )n∈N such that ∀n ∈ N, xn ∈ Cn , and xn → x; ∀(nk )k∈N , ∀(xk )k∈N such that ∀k ∈ N, xk ∈ Cnk , xk → x =⇒ x ∈ C . One can prove that the Γ -convergence of a sequence (Fn )n∈N to a function F is equivalent to the convergence of the sequence of epigraphs of the functions Fn to the epigraph of F when the class of subsets of X ×R is equipped with the set convergence previously defined (see Attouch [37]). This is why Γ -convergence is sometimes called epiconvergence. We define the extended real-valued functions Γ −lim supn→+∞ Fn and Γ −lim infn→+∞ Fn by setting for all x ∈ X 1 Γ − lim sup Fn (x) := sup lim sup inf Fn (y) : d (x, y) < , m n→+∞ m∈N∗ n→+∞ (12.1) 1 Γ − lim inf Fn (x) := sup lim inf inf Fn (y) : d (x, y) < . n→+∞ m m∈N∗ n→+∞ Since clearly lim supn→+∞ inf{Fn (y) : d (x, y) < m1 } and lim infn→+∞ inf{Fn (y) : d (x, y) < 1 } are nondecreasing with respect to m, each supremum above is a limit when m goes m to +∞. The following proposition is a straightforward consequence of the definitions above. For a proof see Attouch [37], Braides [123], and Dal Maso [183]. Proposition 12.1.1. Let (X , d ) be a metrizable space, or more generally a first countable topological space, (Fn )n∈N a sequence of functions Fn : X −→ R ∪ {+∞}, and F : X −→ R ∪ {+∞}. Then (i) for all x ∈ X , Γ − lim sup Fn (x) := min lim sup Fn (xn ) : xn → x , n→+∞
M
n→+∞
N Γ − lim inf Fn (x) := min lim inf Fn (xn ) : xn → x ; n→+∞
n→+∞
(ii) the sequence (Fn )n∈N (sequentially) Γ -converges to F iff Γ − lim sup Fn ≤ F ≤ Γ − lim inf Fn ; n→+∞
n→+∞
(iii) the functions Γ − lim supn→+∞ Fn and Γ − lim infn→+∞ Fn are lsc; (iv) assuming that there exist α > 0 and x0 in X such that for all n ∈ N and for all x in X Fn (x) ≥ −α(1 + d (x, x0 )), then Γ − lim sup Fn (x) := sup lim sup inf {Fn (y) + λd (x, y)} , n→+∞
λ≥0 n→+∞ y∈X
Γ − lim inf Fn (x) := sup lim inf inf {Fn (y) + λd (x, y)} . n→+∞
λ≥0
n→+∞ y∈X
i
i i
i
i
i
i
book 2014/8/6 page 490 i
Chapter 12. Γ -convergence and applications
490
Remark 12.1.3. Note that assertion (iv) expresses the functionals Γ − lim supn→+∞ Fn and Γ − lim infn→+∞ Fn in terms of unconstrained problems by penalizing the distance from the point x in (12.1). Note also that Fnλ (x) := infy∈X {Fn (y) + λd (x, y)} is nothing but the Baire approximation of the functional Fn at the point x introduced in Theorem 9.2.1 in the context of Lipschitz regularization via epi-sum in normed spaces. Recall that Fnλ (x) increases to Fn (x) when λ → +∞ and that Fnλ is Lipschitz continuous with constant 4 λ. We could replace 5 this penalization by the Moreau–Yoshida approximation infy∈X Fn (y) + 2λλd (x, y)2 . The main interest of the concept of Γ -convergence is its variational nature made precise in item (i) below. For more precise details about epiconvergence or Γ -convergence, see Attouch [37], Braides [123], and Dal Maso [183]. Theorem 12.1.1. Let (Fn )n∈N be a sequence of functions Fn : X −→ R ∪ {+∞} which Γ -converges to some function F : X −→ R ∪ {+∞}. Then the following assertions hold: (i) Let xn ∈ X be such that Fn (xn ) ≤ inf{ Fn (x) : x ∈ X } + n , where n > 0, n → 0 when n → +∞. Assume that {xn , n ∈ N} is relatively compact; then every cluster point x of {xn : n ∈ N} is a minimizer of F and lim inf{ Fn (x) : x ∈ X } = F (x).
n→+∞
(ii) If G : X → R is continuous, then (Fn + G)n∈N Γ -converges to F+G. Let (Fn )n∈N be a sequence of functions Fn : X −→ R ∪ {+∞}. If there exists a function F : X −→ R ∪ {+∞} such that each subsequence of (Fn )n∈N possesses a subsequence which Γ -converges to F , then all the sequence Γ -converges to F . PROOF. Assertion (ii) is easy to establish and is left to the reader. For a proof of the last assertion, consult Attouch [37, Proposition 2.72]. The proof of assertion (i) is very close to that of Theorem 11.1.2. Let x be a cluster point of {xn : n ∈ N}, let (xσ(n) )n∈N be a subsequence of {xn : n ∈ N} converging to x, and set < if there exists n such that m = σ(n), x x˜m = σ(n) x¯ otherwise. Then x˜m → x¯ when m → +∞ and, according to (i) of Definition 12.1.1, we have F (x) ≤ lim inf Fn (˜ xn ) ≤ lim inf Fσ(n) (xσ(n) ) = lim inf inf Fσ(n) . n→+∞
n→+∞
n→+∞ X
(12.2)
Let now x be any element of X . According to (ii) of Definition 12.1.1, there exists a sequence (yn )n∈N converging to x and satisfying F (x) ≥ lim sup Fn (yn ) ≥ lim sup Fσ(n) (yσ(n) ). n→+∞
(12.3)
n→+∞
Combining (12.2) and (12.3), we obtain F (x) ≤ lim inf inf Fσ(n) ≤ lim sup inf Fσ(n) ≤ lim sup Fσ(n) (yσ(n) ) ≤ F (x). n→+∞ X
n→+∞ X
(12.4)
n→+∞
This proves that F (x) = minX F .
i
i i
i
i
i
i
12.2. Application to the nonlinear membrane model
book 2014/8/6 page 491 i
491
Taking x = x in (12.4) we also obtain limn→+∞ infX Fσ(n) = minX F . Since all subsequence of infX Fn possesses a subsequence converging to minX F , one has limn→+∞ infX Fn = minX F as required. In the following sections, we give three applications of the Γ -convergence. In Section 14.2, we also show how the Γ -convergence allows us to justify some one-dimensional models in the framework of fracture mechanics.
12.2 Application to the nonlinear membrane model Let ω be an open bounded subset of R2 with boundary γ and consider Ω = ω × (0, ), the reference configuration filled up by some elastic material. This three-dimensional thin structure is clamped on a part Γ0, = γ0 ×(0, ) of the boundary ∂ Ω of Ω (see Figure 12.1).
Figure 12.1. The deformation of a thin layer Ω of size .
To take into account large purely elastic deformation, the constitutive law of the deformable body is associated with a nonconvex elastic density f satisfying a growth condition of order p > 1. The stored strain energy associated with a displacement field u : Ω → R3 is given by the integral functional F : L p (Ω , R3 ) −→ R+ ∪ {+∞} defined by ⎧ ⎪ 1, p ⎨1 f (∇u) d x if u ∈ WΓ (Ω , R3 ), 0, F (u) = Ω ⎪ ⎩ +∞ otherwise, where the density f satisfies conditions (11.5) and (11.6), namely, there exist three positive constants α, β, L such that ∀a ∈ M3×3 α|a| p ≤ f (a) ≤ β(1 + |a| p ), ∀a, b ∈ M3×3 | f (a) − f (b )| ≤ L|b − a|(1 + |a| p−1 + |b | p−1 ).
(12.5) (12.6)
The scaling parameter −1 accounts for the stiffness of the material. In the linearized elasticity framework, it corresponds to Lamé coefficients of order −1 . The structure is subjected to applied body forces g : Ω −→ R3 for which we make the following assumption: there exists a vector valued function g : Ω = ω × (0, 1) −→ R3 , g ∈ Lq (Ω, R3 ) (1/ p + 1/q = 1), such that g (ˆ x , x3 ) = g (x), x = (ˆ x , x3 ). The exterior loading is L (u) =
Ω
g .u d x
and the equilibrium configuration is given by displacement vector fields u , solutions of the problem 5 4 inf F (u) − L (u) : u ∈ L p (Ω , R3 ) .
i
i i
i
i
i
i
book 2014/8/6 page 492 i
Chapter 12. Γ -convergence and applications
492
Due to the very small thickness of the layer Ω , for computing an approximate equilibrium displacement field, it is unrealistic to make a direct use of the finite element method described in Chapter 7. The variational property of Γ -convergence (Theorem 12.1.1) provides a new procedure: by letting go to zero, we aim at finding the elastic energy of a (fictitious) material occupying the two-dimensional membrane ω. We will finally compute an approximate equilibrium displacement field for the corresponding minimization problem by means of a two-dimensional finite element method. x , x3 ) = To work in the fixed space L p (Ω, R3 ), Ω = ω × (0, 1), the change of scale (ˆ (ˆ x , x3 ) transforming (ˆ x , x3 ) ∈ Ω into (ˆ x , x3 ) ∈ Ω leads to the following equivalent optimization problem: find u˜ solution of inf F˜ (u) − g .u d x : u ∈ L p (Ω, R3 ) , Ω
where
⎧ 1 ∂v ⎪ ⎨ U dx f ∇v, F˜ (v) = ∂ x3 Ω ⎪ ⎩ +∞ otherwise,
1, p
if v ∈ WΓ (Ω, R3 ), 0
U denotes the tangential gradient of v (i.e., ∇v U = ( ∂ vi ) Γ0 = γ0 × (0, 1), and ∇v ∂ x i =1,2,3, j
j =1,2 ).
As suggested above, we first establish the existence of the Γ -limit of the sequence (F˜ )>0 when goes to zero. To make precise its domain, we establish the following compactness result. Proposition 12.2.1 (compactness). Let (u )>0 be a sequence in L p (Ω, R3 ) satisfying sup F˜ (u ) < +∞. >0
1, p
Then, there exist a nonrelabeled subsequence and u in V = {v ∈ WΓ (Ω, R m ) : such that u converges to u, weakly in
1, p WΓ (Ω, R3 ) 0
0
∂v ∂ x3
= 0},
and strongly in L (Ω, R ). p
3
PROOF. Since sup>0 F˜ (u ) < +∞, by using the lower bound in (12.5), we obtain 1 ∂ u ˜ U F (u ) = d x; f ∇u , ∂ x3 Ω (∇u )>0 is bounded in L p (Ω, M3×3 ); * ∂u + 1 is bounded in L p (Ω, R3 ). ∂x 3
>0
Consequently, according to the Rellich–Kondrakov theorem, Theorem 5.4.2, and the 1, p Poincaré inequality, Theorem 5.3.1, there exist some u ∈ WΓ (Ω, R3 ) and a nonrelabeled 0 subsequence of (u )>0 such that 1, p
u u
weakly in WΓ (Ω, R3 );
u → u
strongly in L p (Ω, R3 );
∂ u ∂ x3
→0
0
strongly in L p (Ω, R3 ).
Therefore u belongs to V .
i
i i
i
i
i
i
12.2. Application to the nonlinear membrane model
book 2014/8/6 page 493 i
493
Note that V is canonically isomorphic to Wγ1, p (ω, R3 ). In what follows, we will use 0
the same notation for v ∈ V and its canonical representant in Wγ1, p (ω, R3 ). The follow0 ing theorem was established by Le Dret and Raoult [271]. For more general and recent variational models related to thin elastic plates, see [33], [35], [114], [284], and [285]. For a variational model taking into account oscillation-concentration effects, see [272]. Theorem 12.2.1 (Le Dret and Raoult [271]). Let us equip L p (Ω, R3 ) with its strong topology. The sequence of integral functionals (F˜ )>0 Γ -converges to the integral functional F defined in L p (Ω, R3 ) by ⎧ ⎨ Q f (∇u) U d xˆ if u ∈ V , 0 F (u) = ω ⎩ +∞ otherwise. The energy density f0 : M3×2 −→ R is defined for all m in M3×2 by 4 5 f0 (m) = inf f ((m|ξ )) : ξ ∈ R3 and Q f0 denotes the quasi-convex envelope of f0 . We write (m/ξ ) to denote the matrix M3×3 obtained by completing the matrix m with the column ξ . Since the map u → Ω g .u d x is continuous on L p (Ω, R3 ), from Theorem 12.1.1 we deduce the following corollary. Corollary 12.2.1. The sequence of optimization problems p 3 g .u d x : u ∈ L (Ω , R ) inf F˜ (u) −
(+ )
Ω
converges to the limit problem
min F (u) −
ω
g .u d xˆ : u ∈ V ,
(+ )
1 where g is defined for all xˆ ∈ ω, by g (ˆ x ) = 0 g (ˆ x , s) d s. Moreover, if u is a solution or an -minimizer of (+ ), then u˜ defined by u˜ (x) = u (ˆ x , x3 ) strongly converges in L p (Ω, R3 ) to a solution u of the limit problem (+ ). Roughly speaking, for very small thickness of the layer Ω , an equilibrium configura1, p tion u of (+ ), living in WΓ (Ω , R3 ), is close to u, living in Wγ1, p (ω, R3 ), and the layer 0,
0
Ω may be considered as a two-dimensional membrane ω, reference configuration filled up by some elastic material whose strain energy density is the function Q f0 . PROOF OF THEOREM 12.2.1. For all bounded Borel sets A of RN , we will sometimes denote its N -dimensional Lebesgue measure by |A| rather than " N (A). The proof proceeds in two steps, corresponding to each inequality in the definition of the Γ -convergence. Let us first notice that f0 satisfies α|m| p ≤ f0 (m) ≤ β(1 + |m| p ),
∀m ∈ M3×2
3×2
∀m, m ∈ M
(12.7)
| f0 (m) − f0 (m )| ≤ L |m − m |(1 + |m|
p−1
p−1
+ |m |
),
(12.8)
i
i i
i
i
i
i
book 2014/8/6 page 494 i
Chapter 12. Γ -convergence and applications
494
where L is a positive constant depending only on p and L. These estimates are obtained by easy calculations from (12.5) and (12.6) and are left to the reader. Consequently f0 fulfills all conditions of Proposition 11.2.2. First step. Let u strongly converge to u in L p (Ω, R3 ). We are going to establish F (u) ≤ lim inf F˜ (u ). →0
Obviously, one may assume lim inf→0 F˜ (u ) < +∞ so that for a nonrelabeled subsequence, 1 ∂ u U ˜ d x, f ∇u , F (u ) = ∂ x3 Ω and from Proposition 12.2.1, u belongs to V . Trivially we have 1 ∂ u U U ) dx dx ≥ f ∇u , f0 (∇u ∂ x3 Ω Ω U ) d x. ≥ Q f0 (∇u
(12.9)
Ω
Let us now consider the function h : M3×3 −→ R defined by h(a) = Q f0 (a1 |a2 ), where a1 and a2 denote the two first columns of the matrix a = (a1 , a2 , a3 ). We claim that 0 ≤ h(a) ≤ β(1 + |a| p ),
∀a ∈ M3×3
3×3
∀a, b ∈ M
|h(a) − h(b )| ≤ L |b − a|(1 + |a|
(12.10) p−1
+ |b |
p−1
),
(12.11)
where L is some positive constant depending only on p and L . We also claim that h = Q h, where, D denoting any bounded open subset of R3 with |∂ D| = 0, the function Q h is defined by 1 1, p 3×3 3 ∀a ∈ M Q h(a) = inf h(a + ∇φ) d x : φ ∈ W0 (D, R ) |D| D (see Proposition 11.2.2). Inequalities (12.10) and (12.11) are straightforward consequences of inequalities (12.7) and (12.8). Let us show that h = Q h. Indeed, let Y = Yˆ × (0, 1), 1, p x ) = φ(ˆ x , y) for a.e. y in (0, 1). Clearly where Yˆ = (0, 1)2 , φ ∈ W0 (Y, R3 ), and set φy (ˆ 1, p ˆ 3 φ belongs to W (Y , R ) and we have y
0
h(a + ∇φ) d x =
Y
Y
U dx Q f0 ((a1 |a2 ) + ∇φ)
1 =
Yˆ
0
≥
0
Q f0 ((a1 |a2 ) + ∇φy (ˆ x )) d xˆ
dy
1
Q f0 ((a1 |a2 ))d y = h(a),
where we have used the quasi-convexity inequality satisfied by Q f0 in the last inequality (see Proposition 11.2.2). Consequently, 1, p 3 h(a) = inf h(a + ∇φ) d x : φ ∈ W0 (Y, R ) Y
i
i i
i
i
i
i
12.2. Application to the nonlinear membrane model
book 2014/8/6 page 495 i
495
and, according to Proposition 11.2.2, the equality h = Q h follows from 1 1, p inf h(a + ∇φ) d x : φ ∈ W0 (D, R3 ) |D| D 1, p h(a + ∇φ) d x : φ ∈ W0 (Y, R3 ) . = inf Y
Letting → 0 in (12.9) and according to Remark 11.2.1 and Theorem 13.2.1, we obtain 1 ∂ u U U ) dx lim inf f ∇u , d x ≥ lim inf Q f0 (∇u →0 →0 ∂ x3 Ω Ω = lim inf h(∇u ) d x →0 Ω U d x. ≥ h(∇u) d x = Q f0 (∇u) Ω
Ω
Since u belongs to V , with our convention the last integral is also equal to and the proof of the first step is complete.
ω
Q f0 (∇u) d xˆ
Second step. We are going to establish (Γ − lim sup→0 F˜ ) ≤ F . Let us assume F (u) < +∞ so that u ∈ Wγ1, p (ω, R3 ) and 0 F (u) = Q f0 (∇u(ˆ x )) d xˆ. ω
Following the proof of Lemma 11.2.2 about interchange between infimum and integral, one may easily establish f0 (∇u) d xˆ = inf f (∇u(ˆ x ), ξ (ˆ x )) d xˆ. (12.12) ξ ∈ (ω,R3 )
ω
ω
1, p
Let now ξ be some fixed element in (ω, R3 ) and define in WΓ (Ω, R3 ) the following 0 function: w (ˆ x , x3 ) = u(ˆ x ) + x3 ξ (ˆ x ). It is easy to see that w strongly converges to u in L p (Ω, R3 ). On the other hand, from (12.6), an easy computation gives f (∇u(ˆ x ), ξ (ˆ x )) d xˆ. (12.13) lim F˜ (w ) = →0
ω
Now, from (12.13), and taking the infimum over ξ ∈ (ω, R3 ), (12.12), gives f0 (∇u) d xˆ, inf lim sup F˜ (v ) : v → u strongly in L p (Ω, R3 ) ≤ →0
that is to say,
ω
Γ − lim sup F˜ (u) ≤ F˜ (u), →0
(12.14)
where F˜ is the functional defined in L p (Ω, R3 ) by ⎧ ⎨ f0 (∇v(ˆ x )) d xˆ if v ∈ V , F˜(v) = ⎩ ω +∞ otherwise.
i
i i
i
i
i
i
book 2014/8/6 page 496 i
Chapter 12. Γ -convergence and applications
496
Obviously, (12.14) also holds for any function u ∈ L p (Ω, R3 ). Taking the lower semicontinuous envelope of each of the two members when L p (Ω, R3 ) is equipped with its strong topology, and according to Proposition 12.1.1(iii) and to an easy adaptation of Corollary 11.2.1, we obtain ˜ Γ − lim sup F (u) ≤ F (u) →0
for all u ∈ LP (Ω, R3 ). Last step. Collecting the two previous steps gives Γ −lim sup →0 F˜ ≤ F ≤ Γ −lim inf→0 F˜ so that Γ − lim→0 F˜ = F .
12.3 Application to homogenization of composite media 12.3.1 The quadratic case in one dimension Before giving the general result concerning homogenization of composite media in Subsection 12.3.2, we establish a complete description of Γ -limits of integral functionals with quadratic density in the one-dimensional case. More precisely, given a : R → R satisfying that there exist α > 0 and β > 0 such that, for all x ∈ R, α ≤ a ≤ β,
(12.15)
we would like to establish the existence of a Γ -limit for the sequence of integral functionals F : L2 (0, 1) → R+ ∪ {+∞} defined by F (u) =
⎧ ⎨ ⎩
a (x) u 2 (x) d x
(0,1)
+∞
if u ∈ H 1 (0, 1),
otherwise,
when L2 (0, 1) is equipped with its strong topology. Theorem 12.3.1. Assume that a fulfills condition (12.15). Then the following assertions hold: (i) If a1 a1 for the σ(L∞ , L1 ) topology, then (F )>0 Γ -converges to the integral functional
F defined on L2 (0, 1) by
F (u) =
⎧ ⎨ ⎩
(0,1)
+∞
a(x) u 2 (x) d x
if u ∈ H 1 (0, 1),
otherwise.
(ii) Conversely, if (F )>0 Γ -converges to some functional F , then ( a1 )>0 σ(L∞ , L1 )-converges to some b with a =
1 b
satisfying (12.15), and F has the integral representation
F (u) =
⎧ ⎨ ⎩
(0,1)
+∞
a(x) u 2 (x) d x
if u ∈ H 1 (0, 1),
otherwise.
i
i i
i
i
i
i
12.3. Application to homogenization of composite media
book 2014/8/6 page 497 i
497
PROOF OF (i). Let (u )>0 be a sequence strongly converging to some u in L2 (0, 1). We want to establish (12.16) F (u) ≤ lim inf F (u ). →0
Obviously, one may assume lim inf→0 F (u ) < +∞, so that, for a nonrelabeled subsequence, u belongs to H 1 (0, 1). From the equiboundedness of u in L2 (0, 1), we deduce that u is bounded in H 1 (0, 1) so that u weakly converges to u in H 1 (0, 1). We now take into account the quadratic expression of F and write F (u ) as follows: a u2 d x = a (u − au /a )2 d x F (u ) = (0,1)
(0,1)
+2 ≥2
(0,1)
(0,1)
u u a
dx −
u u a d x −
(0,1)
(0,1)
u 2 a 2 /a d x u 2 a 2 /a d x.
Letting → 0 gives (12.16). Given u ∈ L2 (0, 1), we now must construct a sequence (v )>0 strongly converging to u and satisfying F (u) ≥ lim sup F (v ). (12.17) →0
One may assume u ∈ H (0, 1). Let us set x a(t )u (t )/a (t ) d t . v (x) = u(0) + 1
0
We recall that u belongs to C([0, 1]) so that the previous expression is well defined. Then v = au /a weakly converges to u in L2 (0, 1). Since v is bounded in L∞ (0, 1), we deduce that v is bounded in H 1 (0, 1); thus, from the Rellich–Kondrakov theorem, Theorem 5.4.2, it strongly converges to some θ in L2 (0, 1) with θ = u , thus θ = u + c. According to the continuity of the trace, one has θ(0) = u(0), so that θ = u and v → u strongly in L2 (0, 1). On the other hand, F (v ) = a 2 u 2 /a d x, (0,1)
and letting → 0 yields
lim F (v ) = F (u), →0
hence (12.17). PROOF OF (ii). From (12.15), 1/a is bounded in L∞ (0, 1). Therefore, for a nonrelabeled subsequence, it σ(L∞ , L1 )-converges to some b with a = b1 satisfying (12.15). According to (i), the corresponding subsequence of (F )>0 Γ -converges to the functional G defined by ⎧ ⎨ a(x) u 2 (x) d x if u ∈ H 1 (0, 1), G(u) = (0,1) ⎩ +∞ otherwise. Consequently F = G and a is uniquely defined by F . Since all subsequence of (1/a )>0 possesses a subsequence which σ(L∞ , L1 )-converges to the same limit 1/a, all the sequence (1/a )>0 σ(L∞ , L1 )-converges to 1/a.
i
i i
i
i
i
i
book 2014/8/6 page 498 i
Chapter 12. Γ -convergence and applications
498
Example 12.3.1. Let us consider a defined by a (x) = a(x/), where a is a (0, 1)-periodic function taking two positive values α and β on (0, 1/2) and (1/2, 1), respectively. Then 1/a weakly converges for the σ(L∞ , L1 ) topology to (α−1 + β−1 )/2. For a proof, see Example 2.4.2 or Proposition 13.2.1 and the proof of Theorem 13.2.1. Therefore
F (u) =
⎧ ⎪ ⎨
⎪ ⎩ +∞
(0,1)
α−1 + β−1
−1 u 2 (x) d x
2 otherwise.
if u ∈ H 1 (0, 1),
The functional F may be interpreted, for example, as the elastic energy of a system of two kinds of small periodically distributed springs with size . Theorem 12.3.1 shows that the mechanical behavior of such a system is equivalent to a homogeneous string in the sense α−1 +β−1 −1 of Γ -convergence. The equivalent density is associated with the strain tensor 2 strictly smaller than the mean value material.
α+β 2
of the tensors associated with the two kinds of
Remark 12.3.1. The same problem in the two-dimensional case, describing, for example, a system of two kinds of small elastic pieces in Ω = (0, 1)2 in a chessboard structure, may be treated as above by using the concept of Γ -convergence. One / can show that the equivalent density is quadratic and associated with the strain tensor αβ. In the three-dimensional case, there is no explicit formula for the strain tensor limit. More generally, when working with general quadratic densities of the form f (ξ ) = 〈A ξ , ξ 〉, A ∈ 3×3 satisfying, for all ξ ∈ R3 , α|ξ |2 ≤ 〈A ξ , ξ 〉 ≤ β|ξ |2 , one can show that the Γ -limit of the associated integral functional possesses a density of the form 〈Aξ , ξ 〉. The strategy is then to derive optimal bounds for the constant limit matrix A (see Murat and Tartar [309]).
12.3.2 Periodic homogenization in the general case Let Ω be an open bounded subset of R3 which represents the interior of the reference configuration filled up by some elastic ( p > 1) or pseudoplastic ( p = 1) material which is clamped on a part Γ0 of the boundary ∂ Ω of Ω. We assume that this material is heterogeneous with a periodic distribution of small heterogeneities of size of order > 0, so that the stored strain energy density is of the form (x, a) → f
!x
" ,a ,
where f (., a) is Y -periodic, Y = (0, 1)3 . We assume that f satisfies conditions (12.5) and (12.6) of the previous section and, to take into account large purely elastic deformations, f is not assumed to be convex but possibly quasi-convex. With the notation of Sections 11.2 and 11.3, the stored strain energy associated with a displacement field u : Ω → R3 is given by the integral functional F : L p (Ω, R3 ) −→ R+ ∪ {+∞} defined by ⎧ ! " ⎨ f x , ∇u d x F (u) = ⎩ Ω +∞ otherwise.
1, p
if u ∈ WΓ (Ω, R3 ), 0
i
i i
i
i
i
i
12.3. Application to homogenization of composite media
book 2014/8/6 page 499 i
499
The structure is subjected to applied body forces g : Ω −→ R3 , g ∈ Lq (Ω, R3 ), (1/ p + 1/q = 1 if p > 1; q = +∞ if p = 1) and the exterior loading is defined by L(u) =
g .u d x. Ω
The equilibrium configuration is then given by the displacement field u solution of the problem p 3 inf F (u) − L(u) : u ∈ L (Ω, R ) . Ω
Due to the very small size of heterogeneity, for computing an approximate equilibrium displacement field, it is illusory to make direct use of the finite element method. The variational property of Γ -convergence (Theorem 12.1.1) would again provide a new procedure: to find a fictitious material occupying Ω, which appears to be homogeneous when goes to zero, and to compute the approximate equilibrium displacement field by means of a finite element method related to a discretization of the new model. To treat more general situations, we deal with functionals F : L p (Ω, R m ) −→ R+ ∪ {+∞} defined by ⎧ ! " x ⎪ ⎨ f , ∇u d x F (u) = Ω ⎪ ⎩ +∞ otherwise,
1, p
if u ∈ WΓ (Ω, R m ), 0
where Ω is an open bounded subset of RN , m is any positive integer, f satisfies the growth conditions (12.5) and (12.6), and, for all a in the set M m×N of m × N matrices, the Borel function f (., a) is assumed to be Y -periodic, Y = (0, 1)N . Following the strategy of the previous subsection, we are going to establish the Γ -convergence of the sequence (F )>0 when L p (Ω, R m ) is equipped with its strong topology. In Theorem 12.3.2, we will establish that the Γ -limit of F possesses an integral representation. In the following proposition, we characterize its density ( p > 1) or its regular part ( p = 1). It is worth noticing the similarity between this proposition and Proposition 11.2.2, where we defined the relaxed density of the integral functional on Sobolev or BV spaces. Proposition 12.3.1. For all open bounded convex set A in RN the following limit exists: f
ho m
(a) = lim inf →0
1 |A/|
f (x, a + ∇u(x)) d x : u A/
1, p ∈ W0 (A/, R m )
.
Moreover, this limit does not depend on the choice of the open bounded convex set A and is equal to 1 1, p m inf inf f (y, a + ∇u(y)) d y : u ∈ W0 (Y, R ) . n∈N∗ n N nY The proof is based on a convergence result related to subadditive processes. To go further, we first give some notation and definitions. Let us denote the family of all the bounded Borel sets of RN by b (RN ).
i
i i
i
i
i
i
book 2014/8/6 page 500 i
Chapter 12. Γ -convergence and applications
500
A sequence (Bn )n∈N of sets of b (RN ) is said to be regular if there exists an increasing sequence of half intervals In of the type [a, b ) with vertices in ZN and a positive constant C independent of n such that Bn ⊂ In and |In | ≤ C |Bn | for all n ∈ N. A subadditive ZN -invariant set function indexed by b (RN ) is a map @ : b (RN ) −→ R, A → @A, such that (i) for all A, B ∈ b (RN ) with A ∩ B = , @A∪B ≤ @A + @B ; (ii) for all A ∈ b (RN ) and all z ∈ ZN , @ z+A = @A. Finally, for all A in b (RN ), we define the positive number ρ(A) := sup{r ≥ 0 : ∃B¯r (x) ⊂ A}, where B¯r (x) is the closed ball with radius r > 0 centered at x. The following lemma generalizes the classical limit theorem related to subadditive processes indexed by cubes. For a proof, we refer the reader to Ackoglu and Krengel [4] or to Licht and Michaille [274]. Lemma 12.3.1. Let @ be a subadditive ZN -invariant set function such that @I γ (@ ) := inf : I = [a, b [, a = (ai )i =1,...,N , b = (bi )i =1,...,N ∈ ZN , |I | ∀i = 1, . . . , N , ai < bi > −∞ and which satisfies the following domination property: there exists a positive constant C (@ ) < +∞ such that |@A| ≤ C (@ ) for all Borel sets A included in [0, 1[N . Let (An )n∈N be a regular sequence of Borel convex sets of b (RN ) satisfying limn→+∞ ρ(An ) = +∞. Then @[0,m[N @An lim = γ (@ ). = inf ∗ n→+∞ |A | m∈N mN n PROOF OF PROPOSITION 12.3.1. Let us notice that in the definition of subadditivity, assertion (i) may be replaced by the following: for all A, B ∈ b (RN ) with A ∩ B = and |∂ A| = |∂ B| = 0, @A∪B ≤ @A + @B (see [274, Remark of Theorem 2.1] or [185]). Then we claim that 0 1, p m @ : A → inf 0 f (x, a + ∇u(x)) d x : u ∈ W0 (A, R ) A
is a subadditive Z -invariant process. Indeed, Y -periodicity of f (·, a) yields @A+z = @A for all A ∈ b (RN ) and all z ∈ ZN . Let now A, B in b (RN ) such that A ∩ B = and N
0
0
|∂ A| = |∂ B| = 0. For arbitrary fixed η > 0, consider ϕA ∈ (A, R m ) and ϕB ∈ (B, R m ) satisfying 0
f (x, a + ∇ϕA) d x ≤ @A + η,
0
f (x, a + ∇ϕB ) d x ≤ @B + η.
A
B
0
0
Extending ϕA and ϕB by zero, respectively, on RN \ A and RN \ B, the function ϕ which 0
0
1, p
0
coincides with ϕA on A and ϕB in B belongs to W0 (A ∪ B, R m ). Since |∂ A| = |∂ B| = 0,
i
i i
i
i
i
i
12.3. Application to homogenization of composite media 0
0
book 2014/8/6 page 501 i
501
0
| A ∪ B \ A ∪ B | = 0, thus f (x, a + ∇ϕ) d x = 0 f (x, a + ∇ϕA) d x + 0 f (x, a + ∇ϕB ) d x 0 A∪B B A + 0 0 0 f (x, a) d x =
A∪B\A∪B
0
A
f (x, a + ∇ϕA) d x +
≤ @A + @B + 2η.
0
B
f (x, a + ∇ϕB ) d x
Consequently, @A∪B ≤ @A +@B +2η and the subadditivity of @ follows by letting η → 0. We conclude the proof by applying Lemma 12.3.1 to this process. Remark 12.3.2. (1) When a → f (x, a) is a convex function, the limit which defines f ho m in Proposition 12.3.1 can be expressed in a reduced form (see Proposition 12.3.4 below). (2) Lemma 12.3.1 may be generalized for ergodic subadditive processes, i.e., for subadditive processes with value in L1 (Σ, , P) and whose probability law is, roughly speaking, invariant under a group (T z ) z∈ZN of measure-preserving transformations on the probability space (Σ, , P). This probabilistic version allows us to treat stochastic homogenization in Section 12.4 and a variational model in fracture mechanics in Section 14.2. We can now establish the main convergence result, a generalization of Theorems 11.2.1 and 11.3.1. In what follows, actually denotes a sequence (n )n∈N of positive numbers n going to zero when n → +∞ and we adopt the notation of Section 11.2. For more general problems involving multiple small parameters, see [13], [14]. For problems concerned with nonlocal effects, see [90], [91]. Theorem 12.3.2. Let f satisfying (12.5) and (12.6) with p ≥ 1 and assume that the Borel function f (., a) is Y -periodic for all a in M m×N . Let us consider the integral functional F defined in L p (Ω, R m ) by ⎧ ! " ⎨ f x , ∇u d x if u ∈ W 1, p (Ω, R m ), Γ0 F (u) = Ω ⎩ +∞ otherwise, where L p (Ω, R m ) is equipped with its strong topology. Then (F )>0 Γ -converges to the integral functional F ho m defined by (i) case p > 1, ⎧ 1, p ⎨ f ho m (∇u) d x if u ∈ WΓ (Ω, R m ), ho m 0 (u) = F Ω ⎩ +∞ otherwise; (ii) case p = 1, s ⎧ D u ⎪ ho m ho m,∞ ⎪ |D s u| f (∇u) d x + ( f ) ⎪ ⎪ s ⎪ |D u| ⎨ Ω Ω F ho m (u) = + ( f ) ho m,∞ (γ0 (u) ⊗ ν) d N −1 ⎪ ⎪ ⎪ ⎪ Γ0 ⎪ ⎩ +∞ otherwise,
if u ∈ BV(Ω, R m ),
i
i i
i
i
i
i
book 2014/8/6 page 502 i
Chapter 12. Γ -convergence and applications
502
where ν denotes the outer unit normal to Γ0 , γ0 the trace operator, and ( f ) ho m,∞ the recession function of f ho m defined by ( f ) ho m,∞ (a) = lim sup
( f ) ho m (t a)
t →+∞
t
.
The proof of Theorem 12.3.2 is the consequence of Propositions 12.3.2 and 12.3.3. To shorten the proofs we do not take into account the boundary condition, i.e., the domain of F is W 1, p (Ω, R m ). For treating the general case, it suffices to reproduce exactly the proofs of Corollary 11.2.1 when p > 1 and Corollary 11.3.1 when p = 1. Proposition 12.3.2. For all u in L p (Ω, R m ) and all sequence (un )n∈N strongly converging to u in L p (Ω, R m ), one has F ho m (u) ≤ lim inf Fn (un ). n→+∞
(12.18)
PROOF. Our strategy is exactly the one of Proposition 11.2.3 or 11.3.3. Obviously, one may assume lim infn→+∞ Fn (un ) < +∞. For a nonrelabeled subsequence, consider the nonnegative Borel measure μn := f ( . , ∇un (.))" Ω; we have n
sup μn (Ω) < +∞.
n∈N
Consequently, there exists a further subsequence (not relabeled) and a nonnegative Borel measure μ ∈ M(Ω) such that μn μ
weakly in M(Ω).
Let μ = g " N Ω + μ s be the Lebesgue–Nikodým decomposition of μ, where μ s is a nonnegative Borel measure, singular with respect to the N -dimensional Lebesgue measure " Ω restricted to Ω. For establishing (12.18) it is enough to prove that ho m
(∇u(x)) x a.e., s D u s ho m,∞ μ ≥f |D s u| when p = 1. |D s u| g (x) ≥ f
(12.19) (12.20)
(a) Proof of (12.19). Let ρ > 0 intended to tend to 0 and let Bρ (x0 ) be the open ball of radius ρ centered at x0 . According to the theory of differentiation of measures, for a.e. x0 ∈ Ω μ(Bρ (x0 )) g (x0 ) = lim . ρ→0 |B (x )| ρ 0 Applying Lemma 4.2.1, one may assume μ(∂ Bρ (x0 )) = 0 for all but countably many ρ > 0, so that, from Alexandrov’s theorem, Proposition 4.2.3, we have μ(Bρ (x0 )) = limn→+∞ μn (Bρ (x0 )) and we finally must establish lim lim
ρ→0 n→+∞
μn (Bρ (x0 )) |Bρ (x0 )|
≥ f ho m (∇u(x0 )) for a.e. x0 ∈ Ω.
(12.21)
i
i i
i
i
i
i
12.3. Application to homogenization of composite media
book 2014/8/6 page 503 i
503
Let us assume for the moment that the trace of un on ∂ Bρ (x0 ) coincides with the affine function u0 defined by u0 (x) := u(x0 ) + 〈∇u(x0 ), x − x0 〉. It follows from Proposition 12.3.1 that lim
μn (Bρ (x0 )) |Bρ (x0 )| 1
n→+∞
= lim
n→+∞
|Bρ (x0 )|
≥ lim sup inf n→+∞
= lim inf n→+∞
⎧ ⎨
f Bρ (x0 )
1
|Bρ (x0 )|
x n
, ∇u(x0 ) + ∇(un − u0 ) d x f
Bρ (x0 )
x n
, ∇u(x0 ) + ∇φ
dx
1
⎩ | 1 Bρ (x0 )| n
1, p : φ ∈ W0 (Bρ (x0 ), R m )
1
1, p
1 n
Bρ (x0 )
f (x, ∇u(x0 ) + ∇φ) d x : φ ∈ W0
n
Bρ (x0 ), R m
⎫ ⎬ ⎭
= f ho m (∇u(x0 )), and the proof would be complete. The idea now consists in modifying un into a function of W 1, p (Bρ (x0 ), R m ) which coincides with u0 on ∂ Bρ (x0 ) in the trace sense, to follow the previous procedure and to control the additional terms, when ρ goes to zero, thanks to the estimate (see Lemma 11.2.1 and Proposition 10.4.1): for a.e. x ∈ Ω, ⎡ ⎣
1 |Bρ (x0 )|
Bρ (x0 )
⎤1/ p p |u(x) − u(x0 ) + ∇u(x0 )(x − x0 ) | d x ⎦ = o(ρ).
The suitable modification of un is exactly the one of Proposition 11.2.3 because of the conditions (12.5) and (12.6) satisfied by f . The proof of (12.21) is then complete. (b) Proof of (12.20). It suffices to reproduce the proof of inequality s D u s ∞ μ ≥ (Q f ) |D s u| |D s u| obtained in the proof of Proposition 11.3.3 after substituting f by f ( x , ·) and, according n
to Proposition 12.3.1, after substituting Q f by f ho m .
Proposition 12.3.3. For all u in L p (Ω, R m ), p ≥ 1, there exists a sequence (un )n∈N strongly converging to u in L p (Ω, R m ) such that F ho m (u) ≥ lim sup Fn (un ). n→+∞
PROOF. The proof will be obtained in two steps. First step. We assume u ∈ W 1, p (Ω, R m ). We reproduce, with minor modifications, the outline of the proof of Proposition 11.2.4. According to Proposition 12.3.1 and to the Lebesgue dominated convergence theorem, ho m ho m F (u) = f (∇u) d x = lim fkho m (∇u) d x, (12.22) Ω
k→+∞
Ω
i
i i
i
i
i
i
book 2014/8/6 page 504 i
Chapter 12. Γ -convergence and applications
504
where fkho m (a) = inf
1
|kY |
kY
1, p f (y, a + ∇v) d y : v ∈ W0 (kY, R m ) .
∗
Let us fix k ∈ N . Applying the interchange lemma, Lemma 11.2.2, we have for all η > 0 (of the form 1/h with h integer) and for some φk,η in Cc (Ω, (kY, R m )), 1 >
|kY | Ω×kY 1 |kY |
Ω×kY
f (y, ∇u(x) + ∇y φk,η (x, y)) d xd y ≥
Ω
fkho m (∇u) d x
f (y, ∇u(x) + ∇y φk,η (x, y)) d xd y − η.
(12.23)
Let us extend y → φk,η (x, y) by kY -periodicity on RN and consider the function uk,η,n defined by x uk,η,n (x) = u(x) + n φk,η x, . n Note that φk,η is a Carathéodory function so that x → φk,η (x, x ) is measurable. Clearly n
uk,η,n belongs to W 1, p (Ω, R m ) and
uk,η,n → u
strongly in L p (Ω, R m )
when n goes to ∞. On the other hand, according to the continuity assumption (12.6) on f and to Lemma 11.2.3, x f , ∇uk,η,n d x lim n→+∞ n Ω x x x + n ∇φk,η x, dx f , ∇u(x) + ∇y φk,η x, = lim n→+∞ n n n Ω x x = lim dx f , ∇u(x) + (∇y φk,η ) x, n→+∞ n n Ω 1 = f (y, ∇u(x) + ∇y φk,η (x, y)) d xd y. |kY | Ω×kY Consequently, from (12.23) 1 lim F (u )= f (y, ∇u(x)+∇y φk,η (x, y)) d xd y ≤ fkho m (∇u) d x +η. n→+∞ n k,η,n |kY | Ω×kY Ω The inequality above and (12.22), letting η → 0 (i.e., h → +∞) and k → +∞, yield lim lim lim Fn (uk,η,n ) = F ho m (u).
k→+∞ η→+0 n→+∞
Let us now apply the diagonalization Lemma 11.1.1 to the sequence Fn (uk,η,n ), uk,η,n k,η,n
in the metric space R × L p (Ω, R m ): there exists a map n → (k, η)(n) such that )= lim F (u n→+∞ n (k,η)(n),n lim u n→+∞ (k,η)(n),n
=u
F ho m (u),
strongly in L p (Ω, R m ).
i
i i
i
i
i
i
12.3. Application to homogenization of composite media
book 2014/8/6 page 505 i
505
We have proved that F ho m (u) ≥ lim supn→+∞ Fn (un ) for un = u(k,η)(n),n converging to u in W 1, p (Ω, R m ) equipped with the strong convergence of L p (Ω, R m ). If p > 1, the proof is complete because the domain of F ho m is W 1, p (Ω, R m ). Second step ( p = 1). Let us consider the functional G defined on L1 (Ω, R m ) by ⎧ ⎨ f ho m (∇u) d x if u ∈ W 1,1 (Ω, R m ), G(u) = ⎩ Ω +∞ otherwise. According to Theorem 11.3.1, F ho m is nothing but the lower semicontinuous envelope cl(G) of G. Therefore, for all u ∈ L1 (Ω, R m ), there exists a sequence (u l ) l ∈N in L1 (Ω, R m ), strongly converging to u in L1 (Ω, R m ) such that F ho m (u) = lim G(u l ). l →+∞
One may assume F ho m (u) < +∞ so that according to the first step, there exists a sequence (u l ,n )n∈N strongly converging to u l in L1 (Ω, R m ) when n → +∞, such that G(u l ) = lim Fn (u l ,n ). n→+∞
Combining these two equalities, we obtain F ho m (u) = lim sup lim sup Fn (u l ,n ) l →+∞
and
lim lim u l ,n = u
l →+∞ l →+∞
n→+∞
strongly in L1 (Ω, R m ).
We end the proof by applying the diagonalization Lemma 11.1.1. Proposition 12.3.4 (convex case). Assume that the function a → f (x, a) is convex. Then for all a ∈ M m×N , f ho m (a) reduces to 1, p ho m m (a) = inf f (y, a + ∇u(y)) d y : u ∈ W# (Y, R ) , f Y
1, p
where W# (Y, R m ) := {u ∈ W 1, p (Y, R m ) : u is Y -periodic}. 1, p
PROOF. For all v ∈ W# (Y, R m ) and all a ∈ M m×N , set u (x) := a.x + u( x ). Clearly u → la in L p (Ω, R m ) so that, according to Proposition 12.3.2 with Ω = Y , !x ! x "" f ho m (a) ≤ lim inf f , a + ∇u d x. (12.24) →0 Y
But
f
lim →0
Y
!x
, a + ∇u
! x ""
f (y, a + ∇u(y)) d y.
dx =
(12.25)
Y
Combining (12.24) and (12.25), we infer that 1, p ho m m f (a) ≤ inf f (y, a + ∇u(y)) d y : u ∈ W# (Y, R ) . Y
i
i i
i
i
i
i
book 2014/8/6 page 506 i
Chapter 12. Γ -convergence and applications
506
We are going to establish the converse inequality. From the subdifferential inequal1, p 1, p ity, for all u ∈ W0 (nY, R m ) and all w ∈ W# (Y, R m ) extended by periodicity in RN , we have f (y, a + ∇u(y)) ≥ f (y, a + ∇w(y)) + 〈∂ f (y, a + ∇w(y)), ∇u(y) − ∇w(y)〉, where, to shorten the notation, we write ∂ f (y, a + ∇w(y)) for some ξ (y) ∈ ∂ f (y, a + ∇w(y)). Thus 1 f (y, a + ∇u(y)) d y ≥ f (y, a + ∇w(y)) d y n N nY Y 1 + N 〈∂ f (y, a + ∇w(y)), ∇u(y) − ∇w(y)〉 d y. n nY (12.26) 1, p Take now for w the minimizer of inf{ Y f (y, a + ∇u(y)) d y : u ∈ W# (Y, R m )} which then satisfies div ∂ f (y, a + ∇w(y)) = 0 a.e. in nY ; ∂ f (y, a + ∇w(y)).ν
antiperiodic on ∂ nY,
and integrate by parts the last term of (12.26). We obtain 1 1, p m f (y, a + ∇u(y)) d y ≥ inf f (y, a + ∇u(y)) d y : u ∈ W# (Y, R ) n N nY Y 1, p
for all u ∈ W0 (nY, R m ). We conclude the proof by taking the infimum of the left-hand 1, p side with respect to all the functions of W0 (nY, R m ).
12.4 Stochastic homogenization We return to the previous study, but we no longer assume that the density x → f (x, a) is periodic. From the standpoint of the modeling, the heterogeneities of the medium studied are not assumed to be regularly distributed. However, a too general distribution of heterogeneities does not allow us to perform a mathematical analysis. This is why, although the medium is imperfectly known, we assume that it is statistically homogeneous in the sense that the probability distribution of heterogeneities is invariant under the spatial translations. We provide a precise mathematical definition which clearly extends the periodic framework. Finally, we give two standard examples of statistically homogeneous materials (cf. Examples 12.4.1 and 12.4.2). The strategy consisting in performing the statistical averages obtained from a large number of samples of the random functional energy has been implemented in [59] by means of the epigraphical sum of realizations. In this section, we choose to continue the strategy of the previous section, initiated by [185], by showing that the sequence of the functional energies Γ -converges almost surely and by identifying the Γ -limit. For this, we only have to mimic the proof of Theorem 12.3.2 by substituting a subadditive theorem for Lemma 12.3.1. In Chapter 17, we deal with stochastic homogenization in a dynamical case and determine the limit Cauchy problem corresponding to the diffusion in random media. We first establish the mathematical tools coming from ergodic theory and establish the almost sure pointwise convergence of subadditive processes in general. The energy density of the homogenized problem is the almost sure limit of a suitable subadditive process
i
i i
i
i
i
i
12.4. Stochastic homogenization
book 2014/8/6 page 507 i
507
defined from the energy functional of the initial problem. Therefore, it possibly depends on a physical or geometrical parameter (temperature, inclusion shape). This is why it may be interesting to investigate the variational property of the pointwise almost sure convergence through the parameter. Precisely, when the process depends on a parameter in a separable metric space, under a lower semicontinuity dependence, we show that the almost sure convergence of the opposite superadditive process is actually a Γ -convergence. This last result generalizes the epigraphical law of large numbers established in [59].
12.4.1 The subadditive ergodic theorem In what follows, a dynamical system (Σ, , P, (T z ) z∈ZN ) is a probability space (Σ, , P) endowed with a group (T z ) z∈ZN of P-preserving transformations on (Σ, ), i.e., a family of , -measurable maps T z : Σ → Σ satisfying T z ◦ T z = T z+z , T−z = T z−1 T z# P = P ∀z ∈ ZN .
∀(z, z ) ∈ ZN × ZN ;
We use the standard notation T z# P to denote the image measure (or push forward) of P by T z . The term dynamical system refers to the “evolution” of the elements (or alea) of Σ according to the group (T z ) z∈ZN . More specific dynamical systems, namely, differential dynamical systems, are introduced in Chapter 17, where the semigroup (S t ) t ≥0 generated by a gradient or a subdifferential of convex potential plays the role of the discrete group (T z ) z∈ZN . The dynamical system (Σ, , P, (T z ) z∈ZN ) is said to be ergodic if for all E of we have T z E = E ∀z ∈ ZN =⇒ P(E) = 0 or P(E) = 1. A sufficient condition to ensure ergodicity is the so-called mixing condition which expresses an asymptotic independence: for all sets E and F of lim P(T z E ∩ F ) = P(E)P(F ).
|z|→+∞
(12.27)
Ergodicity is obtained from (12.27) by taking E = F . The defect of ergodicity is captured by the σ-algebra 0 of invariant sets of under the group (T z ) z∈ZN , i.e., E ∈ 0 iff T z E = E for all z ∈ ZN . We denote by L1P (Σ) the space of P-integrable numerical functions. For X ∈ L1P (Σ), E0 X denotes the conditional expectation of X given 0 , that is, the unique 0 -measurable function in L1P (Σ) satisfying 0 E X (ω) d P(ω) = X (ω) d P(ω) ∀ E ∈ 0 . E
E 0
It is easy to establish that the function E X is (T z ) z∈ZN -invariant, i.e., E0 X ◦ T z = E0 X
∀z ∈ ZN .
Moreover, if the dynamical system (Σ, , P, (T z ) z∈ZN ) is ergodic, then E 0 X is constant equal to the expectation value EX := Σ X d P of the function X . We denote by b (RN ) the family of bounded Borel subsets of RN . Definition 12.4.1 (additive process). Let (Σ, , P, (T z ) z∈ZN ) be a dynamical system. An additive process indexed by b (RN ), and covariant with respect to the group (T z ) z∈ZN , is a mapping A : b (RN ) −→ L1P (Σ), A → AA, satisfying the three following conditions:
i
i i
i
i
i
i
book 2014/8/6 page 508 i
Chapter 12. Γ -convergence and applications
508
(i) For all (A, B) ∈ b (RN ) × b (RN ) with A ∩ B = , AA∪B = AA + AB . (ii) For all A ∈ b (RN ) and all z ∈ ZN , A z+A = AA ◦ T z . (iii) There exists a nonnegative function h in L1P (Σ) such that |AA| ≤ h for all Borel sets A included in [0, 1[N . Condition (ii) is referred to as the covariance property. It expresses the fact that for the map A → AA, the spatial translations are transferred to the dynamic. Condition (iii) is referred to as the domination property. In what follows, A denotes the family of the half open intervals [a, b [ with a and b in ZN . Let us recall the notion of regularity of sequences in b (RN ), introduced in Lemma 12.3.1: a sequence (Bn )n∈N of sets of b (RN ) is said to be regular if there exists a nondecreasing sequence (In )n∈N of A and a constant C r e g > 0 such that Bn ⊂ In and supn∈N |In |/|Bn | ≤ C r e g . For every A ∈ b (RN ), we set ρ(A) := sup{r ≥ 0 : ∃B¯r (x) ⊂ A}, where B¯r (x) is the closed ball with radius r > 0 centered at x. The following theorem generalizes the Birkhoff ergodic theorem. For a proof see [311, Corollary 4.20]. Theorem 12.4.1. Let A be an additive process covariant with respect to (Tz ) z∈ZN , and let (An )n∈N be a regular sequence of convex sets of b (RN ) satisfying limn→+∞ ρ(An ) = +∞. Then, for P almost every ω ∈ Σ, lim
n→+∞
AAn |An |
(ω) = E0 A[0,1[N (ω).
If moreover the dynamical system (Σ, , P, (T z ) z∈ZN ) is ergodic, then lim
n→+∞
AAn |An |
(ω) = EA[0,1[N .
Remark 12.4.1. If the process is covariant with respect to the group (T z ) z∈mZN , where m is a given integer in N∗ , one can establish the analogous pointwise convergence theorem provided that, in the domination condition (iii), we replace [0, 1[N by [0, m[N . Furthermore, in the definition of regularity, we must replace A by the family A m of all the half open intervals [a, b ) with a and b in mZN . Let us denote by 0 m the σ-algebra of inor, in the case when (Σ, , P, (T z ) z∈mZN ) is ergodic,
AAn
(ω) = E0m m1N A[0,m[N (.), |An | A limn→+∞ |AAn| (ω) = E m1N A[0,m[N . n
variant sets of under the group (T z ) z∈mZN ; then limn→+∞
It is sometimes sufficient to consider the restriction of A to A . More precisely, we have the following. Definition 12.4.2. A discrete additive process covariant with respect to (Tz ) z∈ZN is a set function A : A −→ L1P (Σ) satisfying (i) for every I ∈ A such that there exists a finite family (I j ) j ∈J of disjoint intervals in A satisfying I = j ∈J I j , then AI (·) = AI j (·); j ∈J
(ii) for all I ∈ A and all z ∈ ZN , AI ◦ τ z = z+I .
i
i i
i
i
i
i
12.4. Stochastic homogenization
book 2014/8/6 page 509 i
509
Then the same conclusion holds. More precisely, we have the following. Theorem 12.4.2. Let A : A −→ L1P (Σ) be a discrete additive process, and let (In )n∈N be a regular sequence of A satisfying limn→+∞ ρ(In ) = +∞. Then, for P almost every ω ∈ Σ, AIn
lim
n→+∞
(ω) = E0 A[0,1[N (ω).
|In |
If moreover the dynamical system (Σ, , P, (T z ) z∈ZN ) is ergodic, then lim
AIn
n→+∞
|In |
(ω) = EA[0,1[N .
Given a dynamical system (Σ, , P, (T z ) z∈Z ) (N = 1) and a function Φ in L1P (Σ) for −1 Φ ◦ Ti . It is easy to check that A : A → L1P (Σ) is all a, b in Z, a < b , set A[a,b [ := ib=a a discrete additive process, covariant with respect to (T z ) z∈Z . From Theorem 12.4.2 we deduce the standard discrete time Birkhoff ergodic theorem. Corollary 12.4.1 (Birkhoff’s ergodic theorem). For P-a.e. ω ∈ Σ lim
n→+∞
n 1
n
i =1
Φ ◦ Ti (ω) = E0 Φ(ω).
If the dynamical system (Σ, , P, (T z ) z∈Z ) is ergodic, then for P-a.e. ω ∈ Σ, lim
n→+∞
n 1
n
i =1
Φ ◦ Ti (ω) = EΦ.
Let A be a bounded open subset of RN and AZ the set of sequences in A endowed with the σ algebra AZ which is the infinite product of the Borel σ-algebra on A. Let us equip (AZ , AZ ) with the probability measure μ, infinite product of the normalized Lebesgue 1 " A on A. From the Birkhoff ergodic theorem we deduce the following result, measure |A| which is at the root of the Monte Carlo method for computing the integrals, and which is a useful tool for questions of measurability (see Lemma 12.4.4 below). Corollary 12.4.2. Let g : RN × M m×N → R be a (RN ) ⊗ (M m×N )-measurable function satisfying 0 ≤ g (x, a) ≤ β(1 + |a| p ) for all (x, a) ∈ RN × M m×N , where β is a given nonnegative constant, and let u be a given function in W 1, p (A, R m ). Then 1 |A|
g (x, ∇u(x)) d x = lim A
n→+∞
n 1
n
g (si , ∇u(si ))
k=1
for μ almost every sequence s = (si )i ∈Z in A. PROOF. Consider the projection π : AZ → A defined by π((s)i ∈Z ) = s0 , the shift group (T z ) z∈Z defined by T z ((si )i ∈Z ) = (si +z )i ∈Z , and set Φ(s) = g (π(s), ∇u(π(s))) for every s in AZ . The dynamical system (AZ , AZ , μ, (T z ) z∈Z ) is ergodic. Indeed the measure μ restricted to 0 is uniquely determined by its values on the cylinders of 0 , and μ restricted
i
i i
i
i
i
i
book 2014/8/6 page 510 i
Chapter 12. Γ -convergence and applications
510
to the cylinders satisfies (12.27). Thus, according to the Birkhoff ergodic theorem, for μ almost every s ∈ AZ , we infer 1 g (x, ∇u(x)) d x = Φ(s) d μ(s) |A| A AZ n 1 Φ ◦ Ti (s) = lim n→+∞ n i =1 n 1 g (si , ∇u(si )), = lim n→+∞ n i =1 which completes the proof. Additive processes and the pointwise convergence result stated in Theorem 12.4.1 can be generalized to subadditive processes. We address this notion and give detailed proofs. Definition 12.4.3. Let (Σ, , P, (T z ) z∈ZN ) be a dynamical system. A subadditive process indexed by b (RN ), and covariant with respect to (T z ) z∈ZN , is a mapping S : b (RN ) −→ L1P (Σ), A → SA, satisfying the four following conditions: (i) For all (A, B) ∈ b (RN ) × b (RN ) with A ∩ B = , SA∪B ≤ SA + SB . (ii) For all A ∈ b (RN ) and all z ∈ ZN , S z+A = SA ◦ T z . (iii) There exists a nonnegative function h in L1P (Ω) such that |SA| ≤ h for all Borel sets A included in [0, 1[N . 5 4 S (iv) γ (S) := inf Σ |II| d P : I ∈ A > −∞. The constant γ (S) in (iv) is referred to as the spatial constant of the process. It is sometimes sufficient to consider the restriction of S to A . More precisely, we have the following. Definition 12.4.4. A discrete subadditive process, covariant with respect to (T z ) z∈ZN , is a set function S : A −→ L1P (Σ) satisfying (i) for every I ∈ A such that there exists a finite family (I j ) j ∈J of disjoint intervals in A with I = j ∈J I j , then SI (·) ≤ SI j (·); j ∈J
(ii) for all I ∈ A and all z ∈ ZN , SI ◦ τ z = @ z+I ; 5 4 S (iii) γ (S) := inf Σ |II| d P : I ∈ A > −∞. We are going to establish a pointwise convergence result (Theorem 12.4.3 below) which generalizes Lemma 12.3.1 to a stochastic situation, and Theorem 12.4.1 for subadditive processes. Its proof is based on a so-called maximal inequality (maximal for −S, thus minimal for S), which itself is derived from the Wiener covering lemma below. For any bounded interval I of RN , we denote by I ∗ its 3-dilated associated interval, namely, ∗ I = u∈RN :(u+I )∩I = (u + I ). Note that |I ∗ | = 3N |I |.
i
i i
i
i
i
i
12.4. Stochastic homogenization
book 2014/8/6 page 511 i
511
Lemma 12.4.1. Let I1 ⊂ · · · ⊂ In be finitely many nested half open bounded intervals of RN with |I1 | > 0, and let x0 be a fixed element of I1 . Let A be a finite subset of RN and a map ν : A → {1, . . . , n}. Consider the covering 3 A⊂ x − x0 + Iν(x) ; x∈A
then there exists A ⊂ A such that the family x − x0 + Iν(x) x∈A is made up of pairwise disjoint intervals and 3 ∗ x − x0 + Iν(x) . A⊂ z∈A
PROOF. We adapt the proof of the standard Wiener covering lemma by open Euclidean balls of RN . (For a proof of the standard Wiener covering lemma, see [264, Lemma 3.5.1].) Note that in the standard Wiener covering lemma Ii = B ri (0), where 0 < r1 ≤ · · · ≤ rn , ∗ x0 = 0, x − x0 + Ik = B rk (x), x − x0 + Ik = B3rk (x), and the crucial argument of the proof is to notice that ri ≤ r j and B ri (x) ∩ B r j (y) = =⇒ B ri (x) ⊂ B3r j (y). To shorten the notation, we set Kν(x) := x − x0 + Iν(x) , and we use the following similar ∗ remark throughout the proof: ν(x) ≤ ν(y) and Kν(x) ∩ Kν(y) = =⇒ Kν(x) ⊂ Kν(y) . We construct A following a finite iterative procedure. If A = , then the conclusion of the lemma is trivial. Otherwise choose x1 ∈ A such that ν(x1 ) = max{ν(x) : x ∈ A} ∗ and set A1 = A \ Kν(x . If A1 = , then A = {x1 } is suitable. Otherwise, if x1 , . . . , xk 1) k ∗ are chosen with Kν(xi ) i =1,...,k pairwise disjoint, let Ak = A \ i =1 Kν(x . If Ak = , then i) A = {u1 , . . . , uk } is suitable. Otherwise, choose xk+1 such that ν(xk+1 ) = max{ν(x) : x ∈ k+1 ∗ Ak }, and set Ak+1 = A \ i =1 Kν(x . The set Kν(xk+1 ) does not intercept each set Kν(xi ) i) k ∗ for i = 1, . . . , k; otherwise Kν(xk+1 ) ⊂ i =1 Kν(x , in contradiction with xk+1 ∈ Ak = A \ i) k l ∗ K ∗ . This construction must end after finite many steps, i.e., A ⊂ i =1 Kν(x for i =1 ν(xi ) i) some integer l . The set A = {x1 , . . . , x l } is suitable. Lemma 12.4.2 (minimal inequality). Let S : b (RN ) −→ L1P (Σ) be a discrete nonpositive subadditive process, covariant with respect to (T z ) z∈ZN , and (In )n∈N a regular sequence of intervals of A with a constant of regularity C r e g . Then, for every r > 0, the probability of the set SIn (ω) E r := ω ∈ Σ : inf ≤ −r n |In | satisfies P(E r ) ≤ −
3N C r e g γ (S) r
.
PROOF. Since (In )n∈N is a regular family of A , there exists a nondecreasing family (In )n∈N of A such that for all n ∈ N, In ⊂ In and |In | ≤ C r e g |In |. In what follows, we fix n intended to go to +∞, consider I1 ⊂ · · · ⊂ In , and fix z0 ∈ I1 ∩ ZN . Let k(n) ∈ N be large enough so that (12.28) −z0 + Ii ⊂ [−k(n), k(n)[N ∀i = 1, . . . , n,
i
i i
i
i
i
i
book 2014/8/6 page 512 i
Chapter 12. Γ -convergence and applications
512
and take an integer k ≥ k(n). We set En := ω ∈ Σ : inf
|Ii |
1≤i ≤n
and
SIi
[ω) ≤ −r ,
∀ω ∈ Σ, A(ω) := {z ∈ [−k + k(n), k − k(n)[N : T z−z0 ω ∈ En }.
According to the definition of the set En , and by covariance, for each z ∈ A(ω) there exists an integer ν(z) in {1, . . . , n} (possibly depending on ω) such that S z−z0 +Iν(z) (ω) ≤ −r |Iν(z) |.
(12.29)
We consider now the following covering of A(ω): 3 z − z0 + Iν(z) . A(ω) ⊂ z∈A(ω)
Then, applying Lemma 12.4.1, one can extract a pairwise disjoint family zi −z0 +Iν(z i ) i =1,...l such that l 3 ∗ A(ω) ⊂ zi − z0 + Iν(z . (12.30) ) i
i =1
From (12.30) we deduce that
l i =1
#A(ω)
|Iν(xi | ≥
3N C r e g
,
(12.31)
where #A(ω) denotes the cardinal of the set A(ω). We have used the fact that #(I ∩ ZN ) = |I | for all I ∈ A . On the other hand, from (12.28), we have −z0 + Ii ⊂ [−k(n), k(n)[N for all i = 1, . . . , n. Thus, for i = 1, . . . , l , since zi ∈ A(ω) ⊂ [−k + k(n), k − k(n)[N , we infer that zi − z0 + Iν(zi ) ⊂ [−k, k[N , and thus ∪il =1 zi − z0 + Iν(zi ) ⊂ [−k, k[N .
(12.32)
Note that zi − z0 + Iν(zi ) i =1,...l are pairwise disjoint. Since S is a nonpositive subadditive process, it is subadditive and nonincreasing. Therefore, from (12.32), (12.29), and (12.31) we infer that for all ω ∈ Σ S[−k,k[N (ω) ≤
l i =1
≤ −r
S zi −z0 +Iν(z ) (ω) i
l i =1
≤−
|Iν(zi ) |
r
N
3 Cr e g
#A(ω).
(12.33)
Integrating (12.33) over Σ, we obtain γ (S) ≤ −
r 3N C r e g (2k)N
#A(ω)d P(ω).
(12.34)
Σ
i
i i
i
i
i
i
12.4. Stochastic homogenization
book 2014/8/6 page 513 i
513
Noticing that
#A(.) =
z∈[−k+k(n),k−k(n)[N
1{ω:Tz−z
0
ω∈En } ,
and using the fact that P is invariant under the group (T z ) z∈ZN , we have N #A(ω) d P(ω) = P(En ) = 2(k − k(n)) P(En ) Σ
z∈[−k+k(n),k−k(n)[N
so that (12.34) yields P(En ) ≤ −
3N C r e g γ (S)
kN
r
(k − k(n))N
.
The proof is completed first by letting k → +∞ and then n → +∞. Let (An )n∈N be a regular sequence of convex sets of b (RN ) such that limn→+∞ ρ(An ) = +∞. For every m ∈ N∗ , m < n, we set 3 An,m = z + [0, m[N ; {z∈mZN : z+[0,m[N ⊂An }
An,m =
{z∈mZN : (z+[0,m[N )∩An =}
z + [0, m[N
and denote by 0 m the σ-algebra of invariant sets of under the group (T z ) z∈mZ . We will also need the following technical lemma. Lemma 12.4.3. Let X and h be two functions in L1P (Σ) with h ≥ 0. Then the following assertions hold: (i) limn→+∞
|An,m \An,m | |An |
(ii) limn→+∞
1 |An |
(iii) limn→+∞
1 |An |
= 0; X ◦ Tz =
z∈mZN ∩An,m z∈An,m \An,m
E0 m X mN
almost surely;
h ◦ T z = 0 almost surely.
PROOF. The proof of assertion (i) is standard and is left to the reader. Note that we easily |A
|A
|
|
deduce from (i) that limn→+∞ |An,m| = limn→+∞ |An,m| = 1. For each convex set An and n n δ > 0 set Aδn := {x ∈ RN : d (x, ∂ An ) ≤ δ}. It is worth noticing that (An \Aδn )n∈N and (An ∪Aδn )n∈N are two families of convex regular sets with limn→+∞ ρ(An \ Aδn ) = limn→+∞ ρ(An ∪ Aδn ) = +∞. For fixed m, we can find some δ > 0 such that 1 1 h ◦ Tz ≤ h ◦ Tz |An | |An | z∈A ∩mZN δ N z∈(An \An )∩mZ
n,m
≤ ≤
1 |An | 1 |An |
z∈An,m ∩mZ
h ◦ Tz N
h ◦ Tz .
(12.35)
z∈(An ∪Aδn )∩mZN
i
i i
i
i
i
i
book 2014/8/6 page 514 i
Chapter 12. Γ -convergence and applications
514
Applying Theorem 12.4.1 and Remark 12.4.1 to the additive process A, covariant with respect to (T z ) z∈mZN , defined by h ◦ Tz , AA := z∈mZN ∩A
we infer that the left-hand side and the right-hand side of (12.35) converge to the same limit. We then complete the proof of (iii) by going to the limit on (12.35) as n → +∞. On account of 1 1 1 X ◦ T − X ◦ T |X | ◦ T z , z z ≤ |A | |A | |A | n z∈mZN ∩A n
n z∈mZN ∩(Am \Am ) n n
n z∈mZN ∩Am n
assertion (ii) follows from (iii) by taking h = |X | and from Theorem 12.4.1 applied to the additive process A, covariant with respect to (T z ) z∈mZN , defined by X ◦ Tz . AA := z∈mZN ∩A
The proof of Lemma 12.4.3 is complete. We are in a position to establish the proof of the so-called subadditive ergodic theorem. Theorem 12.4.3. Let S : b (RN ) −→ L1P (Σ) be a subadditive process covariant with respect to (T z ) z∈ZN , and let (An )n∈N be a regular sequence of convex sets of b (RN ) satisfying limn→+∞ ρ(An ) = +∞. Then, for P-a.e. ω ∈ Σ, lim
SAn
n→+∞
|An |
(ω) = inf ∗ E0
S[0,m[N mN
m∈N
(ω).
If moreover the dynamical system (Σ, , P, (T z ) z∈ZN ) is ergodic, then for P-a.e. ω ∈ Σ, lim
SAn
n→+∞
(ω) = inf ∗ E
|An |
m∈N
S[0,m[N mN
= γ (S).
PROOF. Since (An )n∈N is a regular sequence of convex sets of b (RN ), there exists an increasing sequence (In )n∈N of A and a positive constant C r e g , which does not depend on n, such that An ⊂ In and |In | ≤ C r e g |Bn | for all n ∈ N. For m ∈ N∗ we set l m := lim sup n→+∞
l := lim sup n→+∞
SA
n,m
|An,m |
SAn |An |
,
,
l m = lim inf n→+∞
l = lim inf n→+∞
SAn |An |
SA
n,m
|An,m |
;
.
Note that the sets I n,m and I n,m belong to A m . (Recall that A m denotes the family of the half open intervals [a, b [ with a and b in mZN .) Finally, for every discrete subadditive process Ψ defined on A m and covariant with respect to (T z ) z∈mZN , we set ΨI m γ (Ψ) := inf d P : I ∈ Am . Σ |I |
i
i i
i
i
i
i
12.4. Stochastic homogenization
book 2014/8/6 page 515 i
515
First step. We prove that l = l almost surely. We will denote by l this common value. The inclusion An,m ⊂ An , together with the subadditivity and the domination conditions, yields SAn | An |
≤ ≤
SAn,m |An,m | |An,m | |An | SA
n,m
|An,m |
|An,m | |An |
+ +
SAn \An,m |An |
1 |An |
h ◦ Tz
(12.36)
z∈ZN ∩(An,m \An,m )
with h defined in Definition 12.4.3(iii). On account of (iii) of Lemma 12.4.3, we have the following almost sure limit: lim
n→+∞
1 |An |
h ◦ T z = 0.
z∈ZN ∩(An,m \An,m )
Hence, letting n → +∞ in (12.36) we deduce that almost surely l ≤ l m.
(12.37)
Similarly from An ⊂ An,m we infer that almost surely lm ≤ l.
(12.38)
Fix r > 0 and set E m,r := {ω : l m (ω) − l m (ω) ≥ r }. From (12.37), (12.38) we infer that {ω : l (ω) − l (ω) ≥ r } ⊂ E m,r . Thus, to conclude it suffices to show that for every > 0, and for m large enough, the inequality P(E m,r ) ≤
2N C r e g r
holds, provided that we have established that for m large enough, and for P-almost all ω ∈ Σ, −∞ < l m (ω) and l m (ω) < +∞. The end of the step consists in establishing these two claims. According to [274, Theorem 2.1] applied to the ZN -invariant subadditive set function S N d P = γ (S). Hence, given > 0, there exists A → Σ SA d P, we have limm→+∞ Σ [0,m[ mN ∗ m() ∈ N such that, for m ≥ m(), S [0,m[N d P − γ (S) ≤ . (12.39) N Σ m Consider the discrete additive process A m covariant with respect to the group (T z ) z∈mZN , defined for all I ∈ A m by AIm := S[0,m[N ◦ T z . z∈I ∩mZN
Subtracting this process from the restriction of S to A m , we obtain a nonpositive discrete subadditive process S m , indexed by A m , and covariant with respect to (T z ) z∈mZN : S m := S − A m ≤ 0.
(12.40)
i
i i
i
i
i
i
book 2014/8/6 page 516 i
Chapter 12. Γ -convergence and applications
516
On the other hand, by additivity and covariance, for every I of A m
AIm Σ
|I |
dP =
S [0,m[N mN
Σ
d P,
so that (12.39) yields that for m ≥ m(), the spatial constant γ (S m ) of S m satisfies γ m (S m ) ≥ −.
(12.41)
Moreover, according to Lemma 12.4.3(i) and (ii), for P-a.e. ω, we have A m (ω) An,m
L m (ω) := lim
|An,m |
n→+∞
AAm (ω)
= lim
n,m
|An,m |
n→+∞
= E0m By applying @
m
S[0,m[N mN
.
.
to (An,m )n∈N and letting n → +∞, (12.40) yields that for P-a.e. ω l m (ω) − L m (ω) ≤ 0,
(12.42)
from which we deduce l m < +∞. By applying @ m to (An,m )n∈N we infer @
m An,m
|An,m | from which we deduce, since @
m
=
@A
−
n,m
|An,m |
m
An,m
|An,m |
,
is nonpositive, thus nonincreasing,
m |I n,m | @I n,m
|An,m | |I n,m |
≤
@A
n,m
|An,m |
−
m
An,m
|An,m |
.
From (i) of Lemma 12.4.3 it is easy to establish lim supn→+∞
|I n,m | |An,m |
≤ C r e g . Consequently,
by letting n → +∞ in the above inequality, for P-a.e. ω we obtain l m (ω) − L m (ω) ≥ C r e g inf n
S m (ω) I n,m
|I n,m |
.
(12.43)
From (12.43) we infer ⎫ ⎧ S m (ω) ⎨ r ⎬ I n,m . ≤− {ω : l m − L m ≤ −r } ⊂ E r := ω ∈ Σ : inf n ⎩ Cr e g ⎭ |I n,m |
i
i i
i
i
i
i
12.4. Stochastic homogenization
book 2014/8/6 page 517 i
517
For fixed m ≥ m(), from (12.41) and Lemma 12.4.2 applied to the process S m , covariant with respect to the group (T z ) z∈mZN (note that (I n,m )n∈N is nondecreasing and thus is a regular sequence of A m with constant 1), we obtain 3N C r e g
P({ω : l m − L m ≤ −r }) ≤
r
.
(12.44)
The almost sure inequality −∞ < l m follows by letting r → +∞. Combining (12.43) and (12.42) we deduce that S m (ω) I n,m
l m (ω) − l m (ω) ≤ −C r e g inf
|I n,m |
n
,
so that E m,r ⊂ E r . Therefore, by using again Lemma 12.4.2 together with (12.41), for m ≥ m(), P({ω ∈ Σ : l (ω) − l (ω) ≥ r }) ≤ P({ω ∈ Σ : l m (ω) − l m (ω) ≥ r }) ≤
2N C r e g r
.
The step is then completed by letting first → 0 and then r → 0. Since for all m ∈ N∗ , l ≤ l m ≤ l m ≤ l , we have also proved that for all m ∈ N, we have l m = l m = l almost surely. Second step. We prove that l is almost surely invariant by (T z ) z∈Z , i.e., for all z ∈ ZN , l (ω) = l (T z ω) a.s. Consider In = [0, n[N . Fix z = (zi )i =1,...,N and set |z|∞ := maxi =1,...,N |zi |. From In + z ⊂ [−|z|∞ , n + |z|∞ [N , and the fact that @ 1 = @ − 1 is nonpositive (see (12.40)), thus nonincreasing, we infer that @I 1 n
nN
◦ Tz =
@I 1+z n
nN
≥
1 @[−|z|
∞ ,n+|z|∞ [
N
(n + 2|z|∞ )N
(n + 2|z|∞ )N
nN
.
Letting n → +∞, from the first step and the invariance of L1 with respect to (T z ) z∈ZN , we deduce that l (T z ω) ≥ l (ω) for P-a.e. ω. Hence we also infer that l (T−z ω) ≥ l (ω) for P-a.e. ω. Applying T z we finally obtain l (ω) ≥ l (T z ω), thus l (T z ω) = l (ω) for P-a.e. ω. Last step. This step is devoted to the identification of l . Let us set for all m ∈ N∗ , f m (ω) := E0 (S[0,m[N /m N ). We first prove that l ≤ infm∈N∗ f m . Indeed from (12.42) for every m ∈ N∗ , l ≤ L m = E0m S[0,m[N /m N so that, by invariance of l and from the fact that 0 ⊂ 0 m , we infer l = E0 l ≤ E0 L m
0
=E
= E0
0m
E
S[0,m[N
S[0,m[N mN
md ,
which proves the claim.
i
i i
i
i
i
i
book 2014/8/6 page 518 i
Chapter 12. Γ -convergence and applications
518
On the other hand, with the notation of the first step, noticing that (I n,m )n∈N is regular, we have for P-a.e. ω, @m I n,m lim = l − L m ≤ 0. n→+∞ |I n,m | From Fatou’s lemma and (12.41), we deduce that for every E in 0 , and for m ≥ m(),
E
(L m − l ) d P =
lim −
@I m
n,m
dP |In,m | @I m n,m dP ≤ lim inf − n→+∞ |I E n,m | @I m n,m ≤ lim inf − dP n→+∞ |I Σ n,m | @J m − ≤ sup dP |J | J ∈A m Σ E
n→+∞
= −γ (@
m
) ≤ .
According to the definition of the conditional expectation with respect to 0 , for all E ∈ 0 , and for m ≥ m() we infer that l d P ≥ Lm d P − E
E
E0m
=
S[0,m[N
dP − mN S[0,m[N = E0 E0m dP − mN E S[0,m[N = E0 dP − mN E inf∗ fn d P − . ≥ E
E n∈N
Letting → 0, and since l ≤ infn∈N∗ fn a.s., we deduce that ∀E ∈ 0 l dP = inf∗ fn d P. E n∈N
E
(12.45)
Since, by definition, fn is 0 -measurable, so is infn∈N∗ fn . Equality (12.45) being true for every E ∈ 0 , we obtain l = infn∈N∗ fn , which completes the proof. The same conclusion holds for discrete subadditive process. More precisely, we have the following. Theorem 12.4.4. Let S : A −→ L1P (Σ) be a discrete subadditive process, and (In )n∈N a regular sequence of A satisfying limn→+∞ ρ(In ) = +∞. Then, for P-a.e. ω ∈ Σ, lim
n→+∞
SIn |In |
(ω) = inf ∗ E0 m∈N
S[0,m[N mN
(ω).
i
i i
i
i
i
i
12.4. Stochastic homogenization
book 2014/8/6 page 519 i
519
If moreover the dynamical system (Σ, , P, (T z ) z∈ZN ) is ergodic, then lim
SIn
n→+∞
|In |
(ω) = inf ∗ E m∈N
S[0,m[N mN
= γ (S).
For a proof, we refer the reader to [4]. Theorem 12.4.4 will be used in Section 14.2. Remark 12.4.2. As for the deterministic case (see Lemma 12.3.1), condition (i) can be weakened by restricting the subadditivity to the sets of b (RN ) whose boundary is Lebesgue negligible, more precisely: for all A and all B ∈ b (RN ) with A ∩ B = and |∂ A| = |∂ B| = 0, SA∪B ≤ SA +SB . Indeed the boundary of the sets considered in the proof is Lebesgue negligible.
12.4.2 Parametrized subadditive processes In this section, we assume that the dynamical system (Σ, , P, (T z ) z∈ZN ) is ergodic. We are concerned with the variational property of the almost sure convergence stated in Theorem 12.4.3, when the subadditive process depends on a parameter which belongs to a separable metric space. For convenience, in order to use usual concepts of the calculus of variations, the process S will be assumed superadditive, that is, −S subadditive. More precisely, given a separable metric space (X , d ), we consider a mapping S : b (RN ) × X → L1P (Σ), (A, x) → SA(x, .) fulfilling the following conditions: (i) for all x ∈ X , A → −SA(x, .) is a subadditive process, covariant with respect to (T z ) z∈ZN ; (ii) for all A ∈ b (RN ), (x, ω) → SA(x, ω) is (X ) ⊗ -measurable; (iii) for all A ∈ b (RN ) and all ω ∈ Σ, the map x → SA(x, ω) is lower semicontinuous in X ; (iv) ∃α > 0, ∃x0 ∈ X such that for all A ∈ b (RN ) and all x ∈ X , SA(ω, x) ≥ −α 1 + d (x, x0 ) . Such a mapping S will be referred to as a parametrized superadditive process covariant with respect to (T z ) z∈ZN . Under these conditions, Theorem 12.4.5 below generalizes the epigraphical law of large numbers established in [59] (see also [240]). Theorem 12.4.5 (almost sure Γ -convergence). Let (Σ, , P, (T z ) z∈ZN ) be an ergodic dynamical system, S a parametrized superadditive process covariant with respect to (Tz ) z∈ZN , and (An )n∈N a regular sequence of convex sets of b (RN ) satisfying limn→+∞ ρ(An ) = +∞. Then, for P-a.e. ω ∈ Σ, we have Γ − lim
n→+∞
SAn (., ω) |An |
= sup E m∈N∗
S[0,m[N mN
= sup
Σ
SI (., ω) |I |
d P(ω) : I ∈ A .
S PROOF. Since the map x → α 1+d (x, x0 ) is a continuous perturbation of x → |AAn| (x, ω), n according to (ii) of Theorem 12.1.1, it is enough to establish the Γ -convergence for the
i
i i
i
i
i
i
book 2014/8/6 page 520 i
Chapter 12. Γ -convergence and applications
520
nonnegative superadditive process A → SA(ω, .) + α 1 + d (x, x0 ) |A|. We still denote by S this new process. First step. We establish the existence of Σ in satisfying P(Σ ) = 1 and such that for all ω ∈ Σ , SAn S[0,m[N (., ω) d P(ω) . Γ − lim inf (., ω) ≥ sup N n→+∞ |A | m∈N∗ Σ m n The crucial idea is to notice that the process defined for fixed x and fixed k (intended to go to +∞) by A → − infy∈X {@A(y, .) + kd (x, y)|A|} is subadditive and satisfies all of the hypothesis of Theorem 12.4.3. (The measurability comes from the measurability of ω → epi @A(., ω); see [59, 240].) Let D ⊂ X be a dense countable subset of X . From the consideration above, there exists Σ ∈ with P(Σ ) = 1 such that for all (ω, x) ∈ Σ × D ⎧ ⎫
k k ⎨ S[0,m[N ⎬ SAn lim (., ω) (x) d P(ω) (., ω) (x) = sup n→+∞ |A | ⎭ mN m∈N∗ ⎩ Σ n ≥
S [0,m[N Σ
md
k (x) d P(ω) ∀m ∈ N∗ ,
(., ω)
S
where, for all A ∈ b (RN ), ( |A|A (., ω))k is the Baire approximation of x → fined by k SA SA (., ω) (x) = inf (y, ω) + kd (x, y) . y∈X |A| |A|
(12.46)
SA x, ω |A|
de-
Since the Baire approximation is Lipschitz continuous with constant k (see Theorem 9.2.1 and Remark 12.1.3), inequality (12.46) holds for all (ω, x) ∈ Σ × X . Noticing that S
S
N
N
( [0,m[d (., ω))k increases to [0,m[d (., ω) when k → +∞, from Proposition 12.1.1(iv), and m m the monotone convergence theorem, (12.46) yields Γ − lim inf n→+∞
SAn |An |
(., ω) ≥
for all m ∈ N∗ . Hence Γ − lim inf n→+∞
SAn |An |
S [0,m[N mN
Σ
(., ω) ≥ sup
m∈N∗
(., ω) d P(ω)
S[0,m[N Σ
mN
(., ω) d P(ω) .
Second step. We establish the existence of Σ” ∈ satisfying P(Σ”) = 1 and such that for all ω ∈ Σ SAn S[0,m[N Γ − lim sup (., ω) d P(ω) . (., ω) ≤ sup N n→+∞ |An | m∈N∗ Σ m For fixed > 0 and x ∈ X , letting n → +∞ in inf
y∈B(x,)
SAn |An |
(ω, y) ≤
SAn |An |
(ω, x),
i
i i
i
i
i
i
12.4. Stochastic homogenization
book 2014/8/6 page 521 i
521
and according to Theorem 12.4.3, we deduce that there exists Σ x ∈ satisfying P(Σ x ) = 1, and such that for all ω ∈ Σ x SAn S[0,m[N lim sup inf (ω, x) d P(ω) . (ω, y) ≤ sup N n→+∞ y∈B(x,) |An | m∈N∗ Σ m Letting → 0, from (12.1), we deduce that for all x ∈ X and all ω ∈ Σ x SAn S[0,m[N (x, ω) ≤ sup Γ − lim sup (x, ω) d P(ω) . N n→+∞ |An | m∈N∗ Σ m
(12.47)
Let De pi be a dense countable subset of the epigraph of the map Φ : x → sup
m∈N∗
Σ
S[0,m[d md
(x, ω) d P(ω) ,
ΠX De pi its projection on X , and set Σ := ∩ x∈ΠX De pi Σ x . We have P(Σ ) = 1. From (12.47) we infer that for all ω ∈ Σ
{(x, r ) ∈ : Φ(x) ≤ r } ⊂ epigraph Γ − lim sup n→+∞
Noticing that Φ and Γ − lim supn→+∞
SAn (., ω) |An |
SAn |An |
(., ω) .
(12.48)
are lower semicontinuous, taking the clo-
sure of each two sets above, we deduce that for all ω ∈ Σ
SAn epigraph(Φ) ⊂ epigraph Γ − lim sup (., ω) . n→+∞ |An | Hence Γ − lim sup n→+∞
SAn
|An |
(., ω) ≤ Φ for all ω ∈ Σ .
Set Σ = Σ ∩Σ . We have P(Σ ) = 1 and, from the two steps above, the Γ -convergence of the process is obtained for all ω ∈ Σ , which completes the proof.
12.4.3 Random integrands We denote by Bα,β,L the subset of RR ×M made up of functions g : RN × M m×N → R measurable in x and satisfying conditions (12.5), (12.6) for some given α > 0, β > 0, L > 0, and p ∈ [1, +∞[. We equip Bα,β,L with the σ-algebra Bα,β,L , trace on Bα,β,L , of N
m×N
the product σ-algebra of RR ×M , i.e., the smallest σ-algebra on Bα,β,L such that all the evaluation maps e(x,ξ ) : g → g (x, ξ ), (x, ξ ) ∈ RN × M m×N N
m×N
are measurable when R is endowed with its Borel σ-algebra. Let us consider a probability space (Σ, , P) and, for any topological space X , denote by (X ) its Borel σ-algebra. Definition 12.4.5. A function f : Σ × RN× M m×N → R is said to be a random integrand if it is ⊗ (RN ) ⊗ (M m×N ), (R) measurable and if f (ω, ., .) belongs to the class Bα,β,L for every ω ∈ Σ.
i
i i
i
i
i
i
book 2014/8/6 page 522 i
Chapter 12. Γ -convergence and applications
522
In the literature, in the general definition of a random integrand, the class Bα,β,L is replace by the larger class of function g measurable in x and lower semicontinuous in ξ . In what follows, we restrict ourselves to the above definition. Given a random integrand f , the map f˜ : Σ → Bα,β,L defined by f˜(ω) = f (ω, ., .) is clearly , Bα,β,L measurable. Denote by 3 the family of all open bounded subsets of RN and consider the class 0α,β,L of RWl oc (R
N
,R m )×3
defined by
0α,β,L := {G = J ( g ) : g ∈ Bα,β,L },
where J ( g )(u, A) :=
A
1, p
g (x, ∇u) d x, (u, A) ∈ W l oc (RN , R m ) × 3 .
0α,β,L is endowed with the σ-algebra
1, p
W l oc (RN ,R m )×3 , 0α,β,L , trace of the product σ-algebra of R
i.e., the smallest σ-algebra on 0α,β,L , such that all the evaluation maps 1, p
/(u,A) : G → G(u, A), (u, A) ∈ W l oc (RN , R m ) × 3 are measurable. It is worth noting that the map J : g → J (g) from Bα,β,L into 0α,β,L is not Bα,β,L , 0α,β,L measurable in general so that we can not deduce the measurability of ω → J ◦ f˜(ω) from the , Bα,β,L measurability of with the smallest ω → f˜(ω). To overcome this difficulty, from now on, we equip B
σ-algebra ˜ containing
α,β,L
Bα,β,L
such that the map
J : Bα,β,L → 0α,β,L , g → J ( g ) is ˜,
0α,β,L
measurable. The explicit knowledge of ˜ is not really necessary. Indeed
we have the next proposition. Proposition 12.4.1. Let f : Ω × RN × M m×N → R be a random integrand. Then f˜ : Σ → Bα;,β,L is , ˜ measurable and the map ω → J ◦ f˜(ω) is , 0α,β,L measurable. PROOF. As said above, f˜ is , Bα,β,L measurable, then, according to the definition of the σ-algebra ˜, it suffices to establish that the map ω → J ◦ f˜(ω) is , 0α,β,L measurable, that is, from the definition of the σ-algebra 0α,β,L , the maps ω → J ( f˜(ω))(u, A) 1, p are , (R) measurable for all u ∈ W l oc (RN , R m ) and all A ∈ 3 . The thesis follows straightforwardly from the definition of a random integrand and the standard result on the measurability of integrals depending on a parameter.
12.4.4 The dynamical system associated with a random integrand Thanks to f˜, the “phenomenal” probability space (Σ, , P) is transferred into the probability space (Bα,β,L , ˜, f #P), where f #P is the image probability measure of P, defined
i
i i
i
i
i
i
12.4. Stochastic homogenization
book 2014/8/6 page 523 i
523
by f˜#P(E) = P( f˜−1 (E)) for every E in ˜. Since in what follows f is the only random integrand under consideration, in order to shorten the notation we will denote by ˜ ˜ ˜ the transported probability space (B ˜ ˜, P) (Σ, α,β,L , , f #P). We recall below the wellknown change of variable theorem (or transfer theorem). ˜ → R be an ˜, (R) measurable function. Then Theorem 12.4.6 (transfer). Let X : Σ ˜ ⇐⇒ X ◦ f˜ ∈ L (Σ) X ∈ LP˜ (Σ) P and, in this case,
˜ Σ
˜= X dP
Σ
X ◦ f˜ d P.
˜ →Σ ˜ defined by T g (x, .) = Let (T z ) z∈ZN be the group of measurable maps T z : Σ z ˜ and all x ∈ RN . Then for all z ∈ ZN , T is ˜, ˜ measurable. g (x + z, .) for every g ∈ Σ z Indeed the thesis is a straightforward consequence of the definition of ˜, and the relation 1, p J (T z g )(u, A) = J ( g )(u(. − z), A + z) for every (u, A) ∈ W l oc (RN , R m ) × 3 . ˜ ˜, P, ˜ (T ) N ) is referred to as the dynamical Definition 12.4.6. The dynamical system (Σ, z z∈Z ˜ system associated with the random integrand f . Assume that (T z ) z∈ZN is a P-preserving trans# ˜=P ˜ for all z ∈ ZN . Then f is said ˜ ˜), that is, T P formation on the measurable space (Σ, z ˜ (T ) N ) is said to be ˜ ˜, P, to be periodic in law, or equivalently the dynamical system (Σ, z z∈Z stationary. If P˜ (E) ∈ {0, 1} for every subset E ∈ ˜ such that for every z ∈ ZN , T z (E) = E, ˜ ˜, P, ˜ (T ) N ) is said to be ergodic. then f or (Σ, z z∈Z The proposition below states that the explicit knowledge of ˜ is not essential to characterize random integrands periodic in law or ergodic. Proposition 12.4.2. Let f be a random integrand. Then we have (i) f is periodic in law iff the laws of the random vectors
f (., xi , ξi )
i ∈I
,
f (., xi + z, ξi )
i ∈I
are equal for every z ∈ ZN and every finite family (xi , ξi )i ∈I in RN × M m×N ; ˜ (ii) f is ergodic iff P(E) ∈ {0, 1} for every subset E ∈ z ∈Z ;
Bα,β,L
such that T z (E) = E for every
N
(iii) if f satisfies the mixing condition lim P {ω ∈ Σ : f (ω, xi , ξi ) > si ∀i ∈ I , f (ω, y j + z, ζ j ) > t j ∀ j ∈ J } = P {ω ∈ Σ : f (ω, xi , ξi ) > si ∀i ∈ I } P { f (ω, y j , ζ j ) > t j ∀ j ∈ J }
|z|→+∞
for every pair of finite families (xi , ξi , si )i ∈I and (y j , ζ j , t j ) j ∈J in RN × M m×N × R, then f is ergodic.
i
i i
i
i
i
i
book 2014/8/6 page 524 i
Chapter 12. Γ -convergence and applications
524
PROOF. From the definition of the σ-algebra
Bα,β,L ,
the condition in (i) is necessary ˜ restricted ˜ and T # P and sufficient to ensure equality of the two probability measures P z to Bα,β,L . Therefore to conclude it suffices to establish ˜ = T #P ˜ on P z
Bα,β,L
˜ ˜ = T #P ˜ on A, =⇒ P z
f˜#P = T z ◦ f˜#P on
Bα,β,L
˜ =⇒ f˜#P = T z ◦ f˜#P on A.
that is,
This last implication is a straightforward consequence of the lemma below. Lemma 12.4.4. Let ( gi )i ∈N be a countable family of random integrands; then for every E ∈ A˜ there exists E in Bα,β,L such that g˜i #P(EΔE ) = 0 for every i ∈ N. Indeed Lemma 12.4.4 yields g˜i #P(E) = g˜i #P(E ) for every i ∈ N. Assertion (i) then follows by applying this equality to f˜ and T z ◦ f˜. The proof of (ii) is obtained by applying Lemma 12.4.4 to the countable family of random integrands (Tz f ) z∈Z , and (iii) follows from (ii), the mixing condition (12.27), and the definition of the σ-algebra Bα,β,L . We are going to prove Lemma 12.4.4. Let be the subfamily of ˜ made up of the sets ˜ Clearly is a for which the thesis holds. The proof consists in establishing that = A. σ-algebra which contains ; therefore it is enough to prove that the map J is ( , ˜) Bα,β,L
1, p measurable, i.e., from the definition of ˜, that for any A in 3 , any u ∈ W l oc (RN , R m ) and any r ∈ R, the set E := {ϕ ∈ Aα,β,L : J ( g )(u, A) > r }
belongs to . According to Corollary 12.4.2, for every ϕ ∈ Bα,β,L , there exists Sϕ in with μ(Sϕ ) = 1, such that for every sequence s = (sI )i ∈Z of Sϕ , 1 |A|
ϕ(x, ∇u(x)) d x = lim
n→∞
n 1
n
ϕ(sk , D u(sk )).
AZ
(12.49)
k=1
Let us consider the subset U of Σ × AZ made up of all the (ω, s) for which 1 |A|
gi (ω, x, ∇u(x)) d x = lim sup n→∞
n 1
n
gi (ω, sk , D u(sk ))
∀i ∈ N.
k=1
Clearly U belongs to ⊗ AZ and from (12.49), μ {t ∈ AZ : (ω, t ) ∈ U } = 1 for every ω ∈ Σ. By Fubini’s theorem we have Σ
AZ
1{t :(ω,t )∈U } (s) d μ(s)
d P(ω) = 1
=
Z
A =
AZ
Σ
1{θ:(θ,s )∈U } (ω) d P(ω) d μ(s)
P {ω ∈ Σ : (ω, s) ∈ U } d μ(s)
i
i i
i
i
i
i
12.4. Stochastic homogenization
book 2014/8/6 page 525 i
525
so that P {ω ∈ Σ : (ω, s) ∈ U } = 1 for μ for almost every s ∈ AZ . Consider ¯s ∈ AZ such that P {ω ∈ Σ : (ω, s) ∈ U } = 1. We claim that the set n 1 r E := ϕ ∈ Bα,β,L = lim sup ϕ(¯sk , ∇u(¯sk ) ≥ |A| n→+∞ n k=1 satisfies g˜i #P(EΔE ) = 0 for every i ∈ N. Indeed E clearly belongs to g˜i #P(EΔE ) 6 ≤P
ω∈Σ:
1 |A|
A
gi (ω, x, ∇u(x)) d x = lim sup
≤ P {ω ∈ Σ : (ω, s) ∈ U } = 0,
n→∞
n 1
n
Bα,β,L ,
and 7
gi (ω, sk , D u(sk ))
k=1
which completes the proof of Lemma 12.4.4. Example 12.4.1 (random checkerboard-like materials). Let g− and g+ be two homogeneous functions in Bα,β,L , (a, b ) ∈ (0, 1)2 satisfying a + b = 1 and consider the product N
Σ = {g− , g+ }Z equipped with the σ-algebra product of the trivial σ- algebra in {g− , g+ }, and with the product probability measure P = ⊗ z∈ZN μ z , where μ z = aδ g− + b δ g+ for all z ∈ Zn . By construction P is invariant under the shift group (τ z ) z∈ZN defined by τ z (ω t ) t ∈ZN = (ω t +z ) t ∈ZN , i.e., τ z# P = P for all z ∈ ZN . We define f : Σ × RN × M m×N → R as follows: f (ω, x, ξ ) := ω z (ξ ) whenever x ∈ Y + z, where Y is the unit cell (0, 1)N . According to this definition it is straightforward to show that f is a random integrand and that for all ω ∈ Σ, all x ∈ RN , and all ξ ∈ M m×N , T z f (ω, x, ξ ) = f (ω, x + z, .) = f (τ z ω, x, ξ ) so that T z f˜(ω) = f˜(τ z ω). Consequently f is periodic in law. Indeed the thesis follows from the following calculation: for every E ∈ ˜ ˜ T z# P(E) = T z# f˜#P)(E) = P( f˜−1 (T−z E)) ! " = P {ω ∈ Σ : T z f˜(ω) ∈ E} " ! = P {ω ∈ Σ : f˜(τ z ω) ∈ E} = τ z# P( f˜−1 (E)) ˜ = P( f˜−1 (E)) := P(E). Finally it is easily seen that f satisfies the mixing condition (iii) of Proposition 12.4.2. In the case when N = 3 and in the framework of nonlinear elasticity, the random integrand f may be thought of as an elastic density of a random checkerboard-like material, i.e., whose density takes two values g− and g+ at random on the lattice spanned by the unit cell Y = (0, 1)3 . The probability presence of g− is a, and that of g+ is b . Example 12.4.2 (heterogeneities distributed following a Poisson point process). Let Σ be the set of locally finite sequences (ωi )i ∈N in RN and let be the set of countable
i
i i
i
i
i
i
book 2014/8/6 page 526 i
Chapter 12. Γ -convergence and applications
526
and locally finite sums of Dirac measures, equipped with their standard σ-algebra. Given λ > 0, we consider the Poisson point process ω → 8 (ω, .) with intensity λ "N from the N probability space (Σ, , P) into N (R ) equipped with the standard product σ-algebra, which is characterized as follows: (i) for every A ∈ b (RN )
8 (ω, A) =
i ∈N
δωi (A);
(ii) for every finite and pairwise disjoint family (Ai )i ∈I of b (RN ), (8 (., Ai ))i ∈I are independent random variables; (iii) for every bounded Borel set A and every k ∈ N P ([8 (., A) = k]) = λk "N (A)k
exp(−λ"N (A)) k!
.
Note that for all A ∈ b (RN ), 8 (ω, A) = #(A ∩ Ω), and that E(8 (., A)) = λ "N (A). Given g− and g+ two homogeneous functions in Bα,β,L and r > 0, we define a random integrand f by setting f (ω, x, ξ ) := g+ (ξ ) + g− (ξ ) − g+ (ξ ) min(1, 8 (ω, B(x, r ))). More explicitly we clearly have f (ω, x, ξ ) =
⎧ ⎨ g− (ξ ) ⎩
if x ∈
3
B(ωi , r ),
i ∈N
g+ (ξ )
otherwise.
According to Proposition 12.4.2(i) and (ii), it is easy to show that f is periodic in law and ergodic. In the case when N = 3 and in the framework of nonlinear elasticity, the random integrand f may be thought of as an elastic density of a material with spherical heterogeneities of size r whose elastic density is g− and whose centers are randomly distributed with a frequency λ per unit of volume. The elastic density of the matrix is g+ . It should be noted that the spheres can interpenetrate.
12.4.5 The process {Fε , F hom : ε → 0} In what follows, denotes a sequence (n )n∈N of positive numbers n going to zero when n → +∞, and we often briefly write → 0 instead of limn→+∞ n = 0. As in section 12.3.2, Ω is an open bounded subset of R3 which represents the interior of the reference configuration filled up by some elastic ( p > 1) or pseudoplastic ( p = 1) material which is clamped on a part Γ0 of the boundary ∂ Ω of Ω. But to treat more general situations, Ω is actually an open bounded subset of RN with N ∈ N∗ . Given a probability space (Σ, , P), m ∈ N∗ , and a random integrand, periodic in law, f : Σ × RN × M m×N → R (recall that f (ω, ., .) belongs to the class Bα,β,L ), we define the random functional integral F : Ω × L p (Ω, R m ) −→ R+ ∪ {+∞}
i
i i
i
i
i
i
12.4. Stochastic homogenization
by
book 2014/8/6 page 527 i
527
⎧ ! " x ⎨ f ω, , ∇u d x F (ω, u) = ⎩ Ω +∞ otherwise.
1, p
if u ∈ WΓ (Ω, R m ), 0
When N = m = 3, F (ω, .) is the random stored strain energy associated with a displacement field u : Ω → R3 , and, with the notation of Section 12.3.2, the equilibrium configuration is given by the displacement field u solution of the random problem
inf F (ω, u) −
p
Ω
3
L(u) : u ∈ L (Ω, R ) .
The small parameter accounts for the size of the small and randomly distributed heterogeneities. Following the strategy of Section 12.3.2, we are going to establish the almost sure Γ -convergence of the sequence (F )>0 when L p (Ω, R m ) is equipped with its strong topology. In the following proposition we characterize the density of the Γ -limit, or its singular part when p = 1. Proposition 12.4.3. There exists a set Σ in with P(Σ ) = 1 such that for all (ω, a) ∈ Σ × M m×N , and for all open bounded convex set A in RN the following limit exists: f
ho m
(ω, a) = lim inf →0
1
f (ω, x, a + ∇u(x)) d x : u
|A/|
A/
1, p ∈ W0 (A/, R m )
.
This limit does not depend on the choice of the open bounded convex set A and is given by f
ho m
(ω, a) = inf∗ E inf
1
0
nN
n∈N
f (ω, y, a + ∇u(y)) d y : u nY
1, p ∈ W0 (Y, R m )
,
˜ : E˜ ∈ ˜, T E˜ = E˜ for all z ∈ ZN }. If f is ergodic, then where 0 = { f˜−1 (E) z f ho m (a) = inf∗ E inf n∈N
1 nN
nY
1, p f (ω, y, a + ∇u(y)) d y : u ∈ W0 (Y, R m ) .
Moreover for all ω ∈ Σ , f ho m (ω, .) satisfies (12.5) and (12.6) with a constant L depending only on L, p, α, and β. ˜ (T ) N ) associated with ˜ ˜, P, PROOF. We begin by reason in the dynamical system (Σ, z z∈Z m×N the random integrand f . Fix a matrix a in the subset MQ of M m×N made up of the ˜ defined for m × N matrices with rational entries. We claim that S : b (RN ) → LP (Σ) N ˜ ˜ ∈ Σ by every A in (R ) and every ω b
˜ → inf S : A → ω
0
A
˜ a + ∇u(x)) d x : u ω(x,
0 1, p ∈ W0 (A, R m )
˜ is a direct consequence ˜ belongs to LP˜ (Σ) ˜ → SA(ω) is a subadditive process. The fact that ω ˜ and the measurability of the uniform growth condition satisfied by all the elements of Σ,
i
i i
i
i
i
i
book 2014/8/6 page 528 i
Chapter 12. Γ -convergence and applications
528
may be established by standard arguments. Indeed, from (12.5) and taking u = 0 as an admissible function in the definition of SA we deduce that ˜ ≤ β|A|(1 + |a| p ). SA(ω)
(12.50)
Note that from the lower growth inequality in (12.5), and according to Jensen’s inequality, we infer ˜ ≥ β|A||a| p . SA(ω) (12.51) The proof of condition (i) in definition 12.4.3 follows point by point the proof of Proposition 12.3.1. Condition (ii) is a straightforward consequence of the definition of the group (T z ) z∈ZN and a change of variable. Condition (iii) is obvious. Consequently, ˜ Σ ˜ in ˜ with P( ˜ ) = 1 such that for all according to Theorem 12.4.3, there exists a set Σ a a ˜ the limit ˜ ∈Σ ω a 1 1, p ho m m ˜ a + ∇u(x)) d x : u ∈ W0 (A/, R ) ˜ a) := lim inf g ω(x, (ω, →0 |A/| A/ exists and is equal to g
ho m
0˜
˜ a) = inf∗ E inf (ω,
˜ a + ∇u(y)) d y : u ω(y,
nN
n∈N
1
nY
1, p ∈ W0 (Y, R m )
,
where 0˜ is the σ-algebra made up of the invariant sets of ˜ under (T z ) z∈ZN . Let us set ( ˜ ˜ Σ ˜ = ˜ . Clearly P( ˜ ) = 1. From condition (12.6) it is easy to show that SA(ω,.) Σ Σ a∈M m×N a |A| Q
satisfies the local Lipschitz condition S (ω, ˜ b ) A ˜ a) SA(ω, − ≤ L |a − b |(1 + |a| p−1 + |b | p−1 ) |A| |A|
(12.52)
for all (a, b ) ∈ M m×N × M m×N , where L depends only on L, p, α, and β (for a complete ˜ × M m×N . ˜ a) satisfies (12.52) for all (ω, ˜ a) in Σ proof, see [291]). Therefore a → g ho m (ω, Q ˜ × M m×N : still denoting by g ho m this extension, From (12.52) we can extend g ho m on Σ ˜ × M m×N , g ho m (ω, ˜ a) ∈ Σ ˜ a) = lim ˜ a ), where (a ) for every (ω, g ho m (ω, is any n→+∞
n
n n∈N
m×N (note that from (12.52), this limit does not depend on the choice of the sequence in MQ sequence (an )n∈N ). By using the uniform estimate (12.52) with respect to , and letting an → a, then → 0 in the estimate ˜ a) SA/ (ω, ho m ˜ a) − ˜ a) − g ho m (ω, ˜ an )| (ω, ≤ |g ho m (ω, g |A/| ˜ an ) SA/ (ω, ho m ˜ an ) − (ω, + g |A/| SA/ (ω, ˜ an ) SA/ (ω, ˜ a) + − , |A/| |A/|
we obtain g
ho m
˜ a) := lim inf (ω, →0
1 |A/|
˜ a + ∇u(x)) d x : u ω(x, A/
1, p ∈ W0 (A/, R m )
i
i i
i
i
i
i
12.4. Stochastic homogenization
book 2014/8/6 page 529 i
529
˜ ×M m×N . Set Σ = f˜−1 (Σ ˜ ), and, for all (ω, a) ∈ Σ ×M m×N , f ho m (ω, a) ˜ a) in Σ for all (ω, := g ho m ( f˜(ω), a). The conclusion follows from the definition of the conditional expectation, Theorem 12.4.6, and (12.50), (12.51), (12.52). We can now establish the main convergence result, a generalization of Theorem 12.3.2. Theorem 12.4.7. Assume that Ω is piecewise of class C1 . Let f be a random integrand, periodic in law. Then, for P almost all ω in Σ, (F (ω, .))>0 Γ -converges to the random integral functional F ho m defined on Σ × L p (Ω, R m ) by (i) case p > 1, ⎧ 1, p ⎨ f ho m (ω, ∇u) d x if u ∈ WΓ (Ω, R m ), 0 F ho m (ω, u) = Ω ⎩ +∞ otherwise; (ii) case p = 1, ⎧ Ds u ⎪ ho m ho m,∞ ⎪ |D s u| ω, f (ω, ∇u) d x + ( f ) ⎪ ⎪ s ⎪ |D u| ⎨ Ω Ω F ho m (ω, u) = ho m,∞ (ω, γ0 (u) ⊗ ν) d N −1 if u ∈ BV(Ω, R m ), + (f ) ⎪ ⎪ ⎪ ⎪ Γ 0 ⎪ ⎩ +∞ otherwise, where ν denotes the outer unit normal to Γ0 , γ0 the trace operator, and ( f ) ho m,∞ (ω, .) the recession function of f ho m (ω, .) defined for every a ∈ M m×N by ( f ) ho m,∞ (ω, a) = lim sup t →+∞
( f ) ho m (ω, t a) t
.
If, moreover, f is ergodic, then the same result holds with a deterministic limit F ho m defined by replacing f ho m (ω, .) by f ho m (.) in the expression of F ho m above. The proof of Theorem 12.4.7 is the consequence of Propositions 12.4.4 and 12.4.5 below. To shorten the proofs, we do not take into account the boundary condition, i.e., the domain of F is W 1, p (Ω, R m ). For treating the general case, it suffices to reproduce exactly the proofs of Corollary 11.2.1 when p > 1 and Corollary 11.3.1 when p = 1, established in the periodic case. Proposition 12.4.4. For all ω in the subset Σ of Proposition 12.4.3, for all u in L p (Ω, R m ), and for all sequences (un )n∈N strongly converging to u in L p (Ω, R m ), we have F ho m (ω, u) ≤ lim inf Fn (ω, un ). n→+∞
(12.53)
PROOF. In what follows ω is fixed in Σ . Our strategy is exactly that of Proposition 12.3.2. Obviously, one may assume lim infn→+∞ Fn (ω, un ) < +∞. Then, considering the nonnegative Borel measure μn (ω) := f (ω, . , ∇un (.))" Ω, for a nonrelabeled subsequence n we have sup μn (ω)(Ω) < +∞. n∈N
i
i i
i
i
i
i
book 2014/8/6 page 530 i
Chapter 12. Γ -convergence and applications
530
Consequently, there exists a further nonrelabeled subsequence and a nonnegative Borel measure μ(ω) ∈ M(Ω) such that μn (ω) μ(ω)
weakly in M(Ω).
Let μ(ω) = g (ω)" N Ω + μ s (ω) be the Lebesgue–Nikodým decomposition of μ(ω), where μ s (ω) is a nonnegative Borel measure, singular with respect to the N -dimensional Lebesgue measure " Ω restricted to Ω. For establishing (12.53) it is enough to prove that g (ω)(x) ≥ f
ho m
(ω, ∇u(x)) x a.e., Ds u s ho m,∞ |D s u| when p = 1. ω, s μ (ω) ≥ f |D u|
(12.54) (12.55)
(a) Proof of (12.54). Let ρ > 0 intended to tend to 0. With the notation of the proof of Proposition 12.3.2, for a.e. x0 ∈ Ω, we have g (ω, x0 ) = lim
ρ→0
μ(ω)(Bρ (x0 )) |Bρ (x0 )|
.
One may assume μ(ω)(∂ Bρ (x0 )) = 0 for all but countably many ρ > 0, so that, from Alexandrov’s theorem, Proposition 4.2.3, we have μ(ω)(Bρ (x0 )) = limn→+∞ μn (ω)(Bρ (x0 )). We finally are led to establish lim lim
μn (ω)(Bρ (x0 ))
ρ→0 n→+∞
|Bρ (x0 )|
≥f
ho m
(ω, ∇u(x0 )) for a.e. x0 ∈ Ω.
(12.56)
Let us assume for the moment that the trace of un on ∂ Bρ (x0 ) coincides with the affine function u0 defined by u0 (x) := u(x0 ) + 〈∇u(x0 ), x − x0 〉. It follows from Proposition 12.4.3 that lim
n→+∞
μn (ω)(Bρ (x0 )) |Bρ (x0 )| 1
, ∇u(x0 ) + ∇(un − u0 ) d x n 1 x 1, p m f ω, , ∇u(x0 ) + ∇φ d x : φ ∈ W0 (Bρ (x0 ), R ) ≥ lim sup inf |Bρ (x0 )| Bρ (x0 ) n n→+∞ ⎫ ⎧ ⎬ ⎨ 1 1 1, p = lim inf f (ω, x, ∇u(x0 ) + ∇φ) d x : φ ∈ W0 B (x ), R m n→+∞ ⎭ ⎩ | 1 Bρ (x0 )| 1 Bρ (x0 ) n ρ 0
= lim
n→+∞
|Bρ (x0 )|
Bρ (x0 )
n
f ω,
x
n
= f ho m (ω, ∇u(x0 )), and the proof would be complete. In the general case, following point by point the proof of Proposition 12.3.2, we suitably modify un as in Proposition 11.2.3 into a function of W 1, p (Bρ (x0 ), R m ), which coincides with u0 on ∂ Bρ (x0 ) in the trace sense, and follow the previous procedure. Recall that the additional term induced by this modification goes to
i
i i
i
i
i
i
12.4. Stochastic homogenization
book 2014/8/6 page 531 i
531
zero with ρ thanks to the estimate (see Lemma 11.2.1 and Proposition 10.4.1): for a.e. x ∈ Ω, ⎤1/ p ⎡ 1 ⎣ |u(x) − u(x0 ) + ∇u(x0 )(x − x0 ) | p d x ⎦ = o(ρ). |Bρ (x0 )| Bρ (x0 ) The proof of (12.56) is then complete. (b) Proof of (12.55). It suffices to reproduce the proof of inequality s D u |D s u| μ s ≥ (Q f )∞ |D s u| obtained in the proof of Proposition 11.3.3 after substituting f (ω, x , .) for f and, accordn
ing to Proposition 12.4.3, after substituting f ho m (ω, .) for Q f .
Proposition 12.4.5. For all ω ∈ Σ and for all u in L p (Ω, R m ), p ≥ 1, there exists a sequence (un (ω, .))n∈N strongly converging to u in L p (Ω, R m ) such that F ho m (ω, u) ≥ lim sup Fn (ω, un ). n→+∞
PROOF. The proof will be obtained in four steps. First step. We assume that u = la . For η > 0, let (Qi ,η )i ∈Jη be a finite family of open
cubes Qi ,η of the lattice spanned by ]0, η[N and a finite subset Iη of Jη such that 3 3 Qi ,η ⊂ Ω ⊂ Qi ,η , i ∈Iη i ∈Jη 3 Qi ,η < δ(η), lim δ(η) = 0. η→0 i ∈Jη \Iη
1, p
For each i ∈ Jη , consider ui ,η,n (ω, .) ∈ W0 (Qi ,η , R m ) such that
1 1 |Qi ,η | n
1 n
Qi,η
f (ω, x, a + ∇ui ,η,n (ω, x)) d x − η ≤
S
1 n
Qi,η (ω, a)
1 |Qi ,η | n
.
(When p > 1 one can take for ui ,η,n an exact minimizer.) Note that 1 |Qi ,η |
Qi,η
S 1 Qi,η (ω, a) x x d x ≤ n1 f ω, , a + (∇ui ,η,n ) ω, + η. n n |Qi ,η | n
Set uη,n (ω, .) = la + extended by la on Ω \
(12.57)
i ∈Iη
i ∈Iη
n ui ,η,n ω,
.
n
1Qi,η ,
¯ . Clearly u belongs to W 1, p (Ω, R m ). From (12.57), Q i ,η η,n
(12.5), (12.50) and by using Poincaré’s inequality in Qi ,η , one easily obtains |uη,n (ω, .) − la |L p (Ω,Rm ) ≤ C η p ,
(12.58)
i
i i
i
i
i
i
book 2014/8/6 page 532 i
Chapter 12. Γ -convergence and applications
532
where C > 0 depends only on α, β, p, and |Ω|. According to Proposition 12.4.3, from (12.57) and (12.50), we infer F ≥
ho m
i ∈Iη
≥
i ∈Iη
(ω, u) |Qi ,η | f
ho m
(a)
lim |Qi ,η |
S
1 n
Qi,η (ω, a)
1 |Qi ,η | n
n→+∞
x f ω, , ∇u (ω, x) d x − η|Ω| η,n n n→+∞ Q i,η i∈Iη x x f ω, , ∇uη,n (ω, x) d x − f ω, , ∇uη,n (ω, x) d x − η|Ω| = lim sup n n n→∞ Ω Qi,η i∈Jη \Iη x ≥ lim sup f ω, , ∇uη,n (ω, x) d x − δ(η)β(1 + |a| p ) − η|Ω|. n→∞ Ω n
≥ lim sup
Letting η → 0 yields F
ho m
(ω, u) ≥ lim sup lim sup η→0
n→∞
Ω
f ω,
x n
, ∇uη,n (ω, x) d x,
so that, by using a diagonalization argument (see [37, Corollary 1.16]), there exists a map n → η(n) (possibly depending on ω) satisfying limn→+∞ η(n) = 0 and such that F ho m (ω, u) ≥ lim sup Fn (ω, uη(n),n (ω, .)). n→+∞
On the other hand, from (12.58), limn→+∞ uη(n),n (ω, .) = la strongly in L p (Ω, R m ). The function un (ω, .) := uη(n),n (ω, .) satisfies the assertion of Proposition 12.4.5. Second step. We assume that u belongs to the space Aff(Ω, R m ) made up of continuous piecewise affine functions on Ω. Then there exists a finite family (Ωi )i ∈I of pairwise disjoint open subsets of Ω, piecewise of class C1 such that |Ω \ i ∈I Ωi | = 0 and (ai , bi ) ∈ M m×N × R m such that uΩi = lai + bi for all i in I . According to the first step, there exists ui ,n (ω, .) in W 1, p (Ωi , R m ) strongly converging to u in L p (Ωi , R m ) such that for every i ∈ I
f Ωi
ho m
(ω, ∇u) d x(ω, u) ≥ lim sup n→+∞
Ωi
f ω,
x n
, ∇ui ,n (ω, x) d x.
We can modify ui ,n (ω, .) in a neighborhood of the boundary of each Ωi into a function of W 1, p (Ωi , R m ), still denoted by un,i (ω, .), such that u (ω, .) = u on ∂ Ωi , ui ,n (ω, .) → u strongly in L p (Ωi , R m ) i ,n x f ho m (ω, ∇u) d x ≥ lim sup f ω, , ∇ui ,n (ω, x) d x. n n→+∞ Ωi Ωi
i
i i
i
i
i
i
12.5. Application to image segmentation and phase transitions
book 2014/8/6 page 533 i
533
(See the proof of Corollary 11.2.1.) Set un (ω, .) := i ∈I un,i (ω, .)1Ωi . From the above we infer that un (ω, .) ∈ W 1, p (Ω, R m ), un (ω, .) → u strongly in L p (Ω, R m ) and F
ho m
x (ω, u) ≥ lim sup f ω, , ∇ui ,n (ω, x) d x n i ∈I n→+∞ Ωi x ≥ lim sup f ω, , ∇ui ,n (ω, x) d x n n→+∞ i ∈I Ωi x = lim sup f ω, , ∇un (ω, x) d x. n n→+∞ Ω
The function un (ω, .) satisfies the assertion of Proposition 12.4.5. Third step. We assume that u belongs to W 1, p (Ω, R m ). We conclude from the previous step, by the density of Aff(Ω, R m ) in W 1, p (Ω, R m ). Indeed from (12.5) and (12.6), u → F ho m (ω, u) is strongly continuous in W 1, p (Ω, R m ). Therefore there exists a sequence (u m ) m∈N in Aff(Ω, R m ) strongly converging to u in W 1, p (Ω, R m ) such that F ho m (ω, u) = lim F ho m (ω, u m ). m→+∞
(12.59)
On the other hand, from the second step, and Proposition 12.4.4, for each u m there exists a sequence (u m,n (ω, .))n∈N in W 1, p (Ω, R m ) strongly converging to u m in L p (Ω, R m ) such that x F ho m (ω, u m ) = lim f ω, , ∇u m,n (ω, x) d x. (12.60) n→+∞ n Ω We conclude from (12.59) and (12.60) by applying the diagonalization Lemma 11.1.1 to the sequence Fn (ω, u m,n (ω, .)), u m,n (ω, .) m,n in the metric space R × L p (Ω, R m ). If p > 1, the proof of Proposition 12.4.5 is complete. Last step ( p = 1). The conclusion follows from the previous step and the second step of the proof of Proposition 12.3.3.
12.5 Application to image segmentation and phase transitions 12.5.1 The Mumford–Shah model Let Ω be a bounded open subset of RN and g a given function in L∞ (Ω). Denoting by 0 the class of the closed sets of Ω, for all K in 0 and all u in C1 (Ω \ K) we deal with the functional E(u, K) := Ω
|u − g |2 d x +
Ω\K
|∇u|2 d x + N −1 (K)
and the associated optimization problem inf{E(u, K) : (u, K) ∈ C1 (Ω \ K) × 0 }.
(12.61)
When Ω is a rectangle in R2 and g (x) is the light signal striking Ω at a point x, (12.61) is the Mumford–Shah model of image segmentation: K may be considered as the outline of
i
i i
i
i
i
i
book 2014/8/6 page 534 i
Chapter 12. Γ -convergence and applications
534
the given light image in computer vision. If it exists, a solution (u ∗ , K ∗ ) of (12.61) fulfills the three following properties: (i) the first term in E(u, K) asks that u ∗ approximates the light signal g in L2 (Ω); (ii) in Ω \ K ∗ , u ∗ does not vary very much (because of the term Ω\K |∇u ∗ |2 d x); (iii) the third term asks that the boundaries K ∗ be as short as possible. Let us remark that dropping one of the three terms makes the problem trivial, i.e., inf{E(u, K) : (u, K) ∈ C1 (Ω \ K) × 0 } = 0. Indeed, when E(u, K) = Ω\K |∇u|2 d x + N −1 (K), take u ∗ = 0 and K ∗ = . When E(u, K) := Ω |u − g |2 d x + N −1 (K), take u ∗ = g and K ∗ = . When E(u, K) := Ω |u − g |2 d x + Ω\K |∇u|2 d x, let us decompose Ω by a finite union of open cubes Qi ,η with diameter η and boundary Ki ,η , 6 "
N
3
Ω\
7 Qi ,η = 0,
Kη =
i ∈I (η)
and set ui ,η :=
1 |Qi ,η |
3
Ki ,η ,
i ∈I (η)
g (x) d x, Qi,η
uη :=
i ∈I (η)
ui ,η 1Qi,η .
Then E(uη , Kη ) tends to zero when η goes to zero. In this last case (14.14) obviously has no solution if g is not constant. To fit several applications to computer vision problems, one can adjust the functional E by suitable positive constants α, β, and γ and consider |∇u|2 d x + γ N −1 (K). E(u, K) := α |u − g |2 d x + β Ω
Ω\K
In what follows, to shorten notation, we set α = β = γ = 1. The existence of a solution for the optimization problem (12.61) was conjectured in [308] and has been established in [196] by using the semicontinuity and the compactness results of Ambrosio [16] related to functionals defined in SBV spaces (see Chapter 13). They defined a weak formulation of (12.61) as follows. If (12.61) has a solution (u ∗ , K ∗ ), the closed set K ∗ must contain the jump set of u ∗ . Then, it is natural to solve the problem in SBV (Ω) and to consider K ∗ as the closure of the set S u∗ . That leads us to consider the following weak formulation of the problem (12.61): 2 N −1 2 inf |∇u| d x + (S u ) + |u − g | d x : u ∈ SBV (Ω) , (12.62) Ω
Ω
where ∇u denotes the density of the Lebesgue part of D u and S u the jump set of u (see Section 10). The functional E defined on SBV (Ω) by E(u) = Ω |∇u|2 d x + N −2 (S u ) + |u − g |2 d x will be referred to as the Mumford–Shah energy functional. In Section 14.3 Ω we will establish the following existence result. Theorem 12.5.1. There exists at least a solution of the weak problem (12.62).
i
i i
i
i
i
i
12.5. Application to image segmentation and phase transitions
book 2014/8/6 page 535 i
535
12.5.2 Variational approximation of a more elementary problem: A phase transitions model To describe a numerical processing of the weak formulation (problem (12.62)), a natural way consists in approximating in a variational sense the Mumford–Shah energy functional by classical integral functionals. Before treating the Mumford–Shah energy, we begin by showing how the Van Der Waals–Cahn–Hilliard thermodynamical model of phase transitions allows us to define a good approximation of the term N −1 (S u ). For another and more direct method in one or two dimensions (i.e., N = 1, 2), we refer the reader to Chambolle [168]. For nonlocal variational approximations of the Mumford–Shah functional we refer the reader to Braides and Dal Maso [126], Gobbino [231], Cortesani and Toader [179], and references therein. Let Ω be an open bounded subset of RN , m > 0, 0 < α < β be such that α meas(Ω) ≤ m ≤ β meas(Ω), and SBV (Ω : {α, β}) be the subspace of all functions of SBV (Ω) taking only the two values α or β. We consider the following problem: inf
N −1
(S u ) : u ∈ SBV (Ω : α, β),
Ω
u dx = m .
As said above, the thermodynamical model of phase transition provides an analogous estimate of this problem. Indeed, consider the functional F defined in L1 (Ω) by ⎧ 1 $ ⎪ 2 ⎨c |D u| + $ W (u) d x 0 F (u) = Ω ⎪ ⎩ +∞ otherwise, where
c0 = 2
1
if u ∈ H (Ω), u ≥ 0,
β/
W (t ) d t
α
Ω
u d x = m,
−1 ,
is a positive parameter intended to go to zero, and W : [0, +∞[→ R is a nonnegative, continuous function with exactly two zeros α, β (0 < α < β). The integral functional c0−1 F is the rescaled Van Der Waals–Cahn–Hilliard energy functional by the ratio $ 1/ and W is a thermodynamical potential of a liquid with constant mass m confined to a bounded container Ω under isothermal conditions and whose density distribution u presents two phases α and β. More precisely, the Van Der Waals–Cahn–Hilliard energy is the functional u → W (u) d x + Ind{ v(x) d x=m, v≥0} (u) + |D u|2 d x, Ω
Ω
where Ind{
Ω
v(x) d x=m, v≥0} (u) =
Ω
⎧ ⎨ ⎩
0
if
+∞
Ω
u(x) d x = m, u ≥ 0,
otherwise.
The thickness L of the transition between the two phases is given by L=
$
β−α . β / 2 α W (τ) d τ
i
i i
i
i
i
i
book 2014/8/6 page 536 i
Chapter 12. Γ -convergence and applications
536
Since L is very small, so is and the Van Der Waals–Cahn–Hilliard free energy is nothing but a perturbation of the Gibbs free energy functional G : u → W (u) d x + Ind{ v(x) d x=m, v≥0} (u) Ω
Ω
by the functional H : u → Ω |D u|2 d x. The first Gibbs model, which consists in minimizing G, is unsatisfactory. It is indeed easily seen that the set arg min (G) is made by infinitely many piecewise constant functions u taking the value α in an arbitrary subset A of Ω with measure (βmeas(Ω) − m)/(β − α), and the value β in Ω \ A, with no restriction on the shape of the interface between [u = α] and [u = β]. In particular, there is no way to recover the physical criterion: the interface has minimal area. This criterion may be recovered by the new model, consisting in minimizing the functional F . We point out that because of arg min(G) ∩ dom (H ) = , this last model is a (viscosity) singular perturbation of the first one. For a general study of viscosity perturbations consult Attouch [39]. Modica proved in [293] the following result, previously established in the special case N = 1 by Gurtin [235]. Theorem 12.5.2. The sequence (F )>0 Γ -converges to the functional F defined by ⎧ ⎨ N −1 (S u ) if u ∈ SBV (Ω : α, β), and u d x = m, F (u) = Ω ⎩ +∞ otherwise in L1 (Ω) equipped with its strong topology. Assume moreover that W satisfies the following polynomial behavior at infinity: there exist t0 > 0, c1 > 0, c2 > 0, k ≥ 2 such that for all t ≥ t0 c1 t k ≤ W (t ) ≤ c2 t k . Then the set {u : → 0} of minimum points of F has a compact closure in L1 (Ω), and any cluster point u is a minimum point of F . PROOF. We only give the proof of the lower bound in the definition of Γ -convergence and establish the compactness result. For a complete proof, consult [293] or the proof of $ Proposition 12.5.2. We begin by substituting by and we omit the constant c0 in the definition of F . The expected Γ -limit must be ⎧ ⎨ −1 N −1 (S u ) if u ∈ SBV (Ω : α, β), and u d x = m, c0 F (u) = Ω ⎩ +∞ otherwise, which is actually the asymptotic model of Van Der Waals–Cahn–Hilliard. First step. We begin by proving that for all v in L1 (Ω) and all sequence (v )>0 strongly converging to v in L1 (Ω), one has lim inf F (v ) ≥ F (v). →0
The proof given here is based on a general method described in [351]. One may assume, for a subsequence not relabeled, that lim inf→0 F (v ) = lim→0 F (v ) = C < +∞, where
i
i i
i
i
i
i
12.5. Application to image segmentation and phase transitions
book 2014/8/6 page 537 i
537
C is a nonnegative constant which does not depend on . We then deduce ⎧ ⎪ ⎪ v ≥ 0, v d x = m, ⎨ Ω ⎪ ⎪ W (v ) d x ≤ C . ⎩ Ω
According to the continuity of W and Fatou’s lemma, the last inequality yields W (v) d x ≤ lim inf W (v ) d x ≤ 0, →0
Ω
Ω
so that W (v(x)) = 0 a.e. and v takes only the two values α and β. Note that since truncations operate on H 1 (Ω), v is also the strong limit of the truncated functions v˜ = α∨v ∧β. Moreover, from the definition of W which achieves its infimum at α and β, F (v ) ≥ F (v˜ ). According to these remarks, keeping the same notation, we will replace v by v˜ . The elementary Young inequality yields 1/2
F (v ) ≥ 2
Ω
1/2 2
W (v ) d x
Ω
|Dv | d x
.
This estimate is optimal and may be recovered by studying the map → F (u) for a fixed u in H 1 (Ω) whose minimum point is 6 =
Ω
W (u) d x
Ω
|D u|2 d x
71/2
and for which the minimal value is 1/2 1/2 2 2 W (u) d x |D u| d x . Ω
Ω
By the Cauchy–Schwarz inequality we obtain lim inf F (v ) ≥ 2 lim inf W (v )|Dv | d x →0 →0 Ω = 2 lim inf |D ψ(v ) | d x, →0
Ω
t / where ψ(t ) = α W (s) d s. Since v → v strongly in L1 (Ω), α ≤ v ≤ β, and ψ is continuous, we deduce that ψ(v ) strongly converges to ψ(v) in L1 (Ω). According to Proposition 10.1.1, we finally deduce that ψ(v) belong to BV (Ω) and lim inf F (v ) ≥ 2 |Dψ(v)|. (12.63) →0
Ω
Let us compute this last integral. Since ⎧ ⎨ 0 on [v = α], β/ ψ(v) = W (s) d s on [v = β], ⎩ α
i
i i
i
i
i
i
book 2014/8/6 page 538 i
Chapter 12. Γ -convergence and applications
538
the function ψ(v) is a simple function of BV (Ω) and
Ω
|Dψ(v)| =
β/
W (s) d s N −1 (Sv ).
α
Inequality (12.63) finally gives lim inf F (v ) ≥ 2 →0
β/
W (s) d s N −1 (Sv ), = F (v),
α
concluding the first step. Second step. Let us now establish the relative compactness in L1 (Ω) of the set {u : → 0} of minimum points of F . The letter C will denote various positive constants. Consider v = ψ◦ u , where ψ is the primitive of the function W 1/2 defined above. Let us first prove the relative compactness of the set {v : → 0}. From the polynomial behavior of W and the fact that k/2 + 1 ≤ k, we have for all t ≥ t0 , ψ(t ) ≤
t0
W α
1/2
t
(s) d s +
W 1/2 (s) d s
t0
≤ C (1 + W (t )), which yields
Ω
v d x ≤ C (1 +
$
F (u ))
and gives the boundedness in L1 (Ω) of v . On the other hand, from the proof of the lower bound above 1 |Dv | d x ≤ F (v ), 2 Ω which finally gives the boundedness of v in BV (Ω). The relative compactness of {v : → 0} is a consequence of the compactness of the embedding BV (Ω) → L1 (Ω). Let us now go back to the functions u . Let v be a strong limit in L1 (Ω) of a nonrelabeled subsequence of v , consider the inverse function ψ−1 of ψ, and set u = ψ−1 ◦ v. We establish the strong convergence of u to u in L1 (Ω). We proceed as follows: we prove the equi-integrability of u and the convergence in measure of u to u (see, for instance, Marle [287]). From the polynomial behavior of W , we have |u |k d x ≤ t0k meas(Ω) + C W (u ) d x Ω
≤ C (1 +
$
Ω
F (u )) ≤ C
$ k/2 and equi-integrability follows from k ≥ 2. On the other hand, since ψ (t ) ≥ c 1 t0 for all t > t0 , ψ−1 is a Lipschitz function on [ψ(t0 ), +∞) and hence uniformly continuous on R+ . Therefore, u converges in measure to u. For a numerical approach, it suffices now to establish the Γ -convergence of the discretization F,h() of the functional F by finite elements, to the functional F , with a suitable choice of the size h of discretization. For this study, consult Bellettini [89].
i
i i
i
i
i
i
12.5. Application to image segmentation and phase transitions
book 2014/8/6 page 539 i
539
12.5.3 Variational approximation of the Mumford–Shah functional energy
When neglecting the functional u → Ω |u − g |2 d x in the expression of the Mumford– Shah functional, to control the jumps of admissible functions, a natural domain is the space GSBV (Ω) of generalized special functions of bounded variation defined by 4 5 GSBV (Ω) := u : Ω → R : u Borel function , k ∧ u ∨ (−k) ∈ SBV (Ω) ∀k ∈ N .
It can be shown (see Ambrosio and Tortorelli [30]) that to each function u ∈ GSBV (Ω) there corresponds a Borel function ∇u : Ω → RN and S u ⊂ Ω such that ∇u = ∇(k ∧ u ∨ (−k)) a.e. on [|u| ≤ k] for all k ∈ N and N −1 (Sk∧u∨(−k) ) → N −1 (S u ) when k → +∞. Following the strategy of the previous subsection, as in Ambrosio and Tortorelli [30], we establish that the functional F defined in X := L1 (Ω) × L1 (Ω, [0, 1]) equipped with its strong topology by ⎧ ⎨ |∇u|2 d x + N −1 (S ) if u ∈ GSBV (Ω) and s = 1, u F (u, s) = ⎩ Ω +∞ otherwise can be approximated, in the sense of Γ -convergence, by the functionals defined in X by ⎧ ⎨ (s 2 + 2 )|∇u|2 d x + M (s, Ω) if (u, s) ∈ C1 (Ω) × C1 (Ω, [0, 1]) ∩ X , F (u, s) = Ω ⎩ +∞ otherwise. For all open subsets A of Ω and all s in C1 (Ω, [0, 1]), M (., A) denotes the integral functional 1 2 2 |∇s| + (1 − s) d x. M (s, A) := 4 A The second argument s is, as we will see in the proof, a control parameter on the gradient. The approximation of the Mumford–Shah energy will be the functional G defined by G (u, s) = F (u, s) + Ω |u − g |2 d x. Indeed, u → Ω |u − g |2 d x is a continuous perturbation of F and the conclusion will follow from Theorem 12.1.1(ii). We assume that Ω satisfies the following “reflection condition” (=) on ∂ Ω: there exists an open neighborhood U of ∂ Ω in RN and a one-to-one Lipschitz function ϕ : U ∩Ω −→ U \ Ω such that ϕ −1 is Lipschitz. Theorem 12.5.3. Assume that Ω satisfies condition (=). Then the sequence of functionals (F )>0 Γ -converges to the functional F . The proof proceeds with Propositions 12.5.1 and 12.5.2. We denote the strong topology of L1 (Ω) × L1 (Ω, [0, 1]) by τ, and the letter C will denote various positive constants which do not depend on . We point out that condition (=) is not necessary for obtaining the lower bound in Proposition 12.5.1. Proposition 12.5.1. For all (u, s) ∈ X and all sequences ((u , s )) τ-converging to (u, s), we have F (u, s) ≤ lim inf→0 F (u , s ), or equivalently, F ≤ Γ − lim inf→0 F . PROOF. Obviously, one may assume Γ − lim inf→0 F (u, s) < +∞ and s = 1. First step. We assume u ∈ L∞ (Ω) and establish the proposition in the one-dimensional case N = 1 when Ω is a bounded interval I in R. When I is not an interval, it suffices to
i
i i
i
i
i
i
book 2014/8/6 page 540 i
Chapter 12. Γ -convergence and applications
540
argue on each connected component of I and to conclude thanks to the superadditivity of Γ − lim inf→0 F . When working on a bounded open subset A of R, we will denote F and Γ − lim inf→0 F by F (., A) and Γ − lim inf→0 F (., A), respectively. The key point of the proof is the following lemma. Lemma 12.5.1. Assume u in L∞ (I ) and fix x0 ∈ I . (i) If there exists η > 0 such that for all ρ < η, u ∈ W 1,2 (Bρ (x0 )), then for all ρ < η Γ − lim inf F ((u, 1), Bρ (x0 )) ≥ 1. →0
(ii) If there exists ρ > 0 such that u ∈ W 1,2 (Bρ (x0 )), then Γ − lim inf F ((u, 1), Bρ (x0 )) ≥ →0
Bρ (x0 )
|∇u|2 d x.
Assume for the moment that the proof of Lemma 12.5.1 is established. We claim that the set E := {x ∈ I : ∃η > 0 ∀ρ < η, u ∈ W 1,2 (Bρ (x0 ))} is finite. Indeed, otherwise E would contain an infinite countable subset D = {xi , i ∈ N}. For all n in N and ρ small enough such that Bρ (xi ), i = 0, . . . , n, are pairwise disjoint sets, we would have from (i), superadditivity, and nondecreasing properties of A → Γ − lim inf→0 F (., A), +∞ > Γ − lim inf F ((u, 1), I ) ≥ →0
n i =0
Γ − lim inf F ((u, 1), Bρ (xi )) ≥ n. →0
This being true for all n ∈ N, we obtain a contradiction. The set E is then made up of a finite number of points x0 , . . . , xn and it is easily seen that u ∈ W 1,2 (I \ E). From 0 (E) < +∞, we deduce u ∈ SBV (I ) and E = S u . For ρ small enough as previously, according to (ii) of Lemma 12.5.1, we have 6 7 3 Γ − lim inf F ((u, 1), I ) ≥ Γ − lim inf F (u, 1), I \ Bρ (x0 ) →0
→0
6 + Γ − lim inf F (u, 1), →0
≥
I\
x∈S u
Bρ (x0 )
x∈S u
3
7
Bρ (x0 )
x∈S u
|∇u|2 d x + 0 (S u ).
We conclude the step by letting ρ → 0 in the above inequality. It remains to establish assertions (i) and (ii) of Lemma 12.5.1. Let (u , s ) ∈ X ∩C1 (I )× 1 C (I , [0, 1]) τ-converging to (u, 1) and satisfying lim inf F ((u , s ), Bρ (x0 )) < +∞. →0
For proving (i), we establish the existence of x , x , x in Bρ (x0 ) such that x < x < x and satisfying lim→0 s (x ) = 0, lim→0 s (x ) = lim→0 s (x ) = 1, for a nonrelabeled subsequence. Let us assume this result for the moment. We conclude as follows: by convexity inequality (precisely a 2 + b 2 ≥ 2ab ),
i
i i
i
i
i
i
12.5. Application to image segmentation and phase transitions
book 2014/8/6 page 541 i
541
F (u , s ) ≥ M (s , Bρ (x0 )) (1 − s )|∇s | d x ≥ Bρ (x0 )
≥
x x
(1 − s )|∇s | d x +
x x
(1 − s )|∇s | d x
x x ≥ (1 − s )∇s d x + (1 − s )∇s d x x x % % & & (1 − s )2 x (1 − s )2 x − = − + , 2 2 x x
which tends to 1 when → 0. We are going to establish the existence of x , x , and x . In what follows, we argue with various nonrelabeled subsequences and C denotes various positive constants independent of . Let σ < ρ and set m := infBσ (x0 ) s . From F ((u , s ), Bρ (x0 )) ≤ C , we derive 2 m |∇u |2 d x ≤ C . Bσ (x0 )
Up to a subsequence, m converges to some l , 0 ≤ l ≤ 1. We claim that l = 0. Otherwise, C lim |∇u |2 d x ≤ 2 →0 B (x ) l σ 0 and u would weakly converge to u in W 1,2 (Bσ (x0 )), which is in contradiction with u ∈ W 1,2 (Bρ (x0 )) for all ρ < η. Consequently, there exists x ∈ Bσ (x), satisfying lim→0 s (x ) = lim→0 m = 0. On the other hand, estimates x0 −σ x0 +ρ (1 − s )2 (1 − s )2 dx ≤ C, dx ≤ C 4 4 x0 −ρ x0 +σ and the mean value theorem yield, for a subsequence, the existence of x and x satisfying the required assertions. Let us show (ii). According to s2 |∇u |2 d x ≤ F (u , s ), Bρ (x0 )
it is enough to establish the inequality 2 |∇u| d x ≤ lim inf Bρ (x0 )
→0
Bρ (x0 )
s2 |∇u |2 d x
when F (u , s ) is equibounded. Set v = (1 − s )2 . The equiboundedness of ∇v in L1 (I ) will provide the following uniform control on v : for all δ > 0, there exists a finite part Jδ of I such that for all compact subsets K, K ⊂ I \ Jδ , one has lim sup→0 supK v < δ. (12.64) We will then deduce lim inf→0 infK s > 1 − δ 1/2 .
i
i i
i
i
i
i
book 2014/8/6 page 542 i
Chapter 12. Γ -convergence and applications
542
Let us assume for the moment estimate (12.64). We have for all δ > 0 and all compact subset K with K ⊂ I \ Jδ , 2 2 C≥ s |∇u | d x ≥ s2 |∇u |2 d x Bρ (x0 )
Bρ (x0 )∩K
≥ inf(s2 ) K
Therefore
C ≥ lim inf →0
Bρ (x0 )
s2 |∇u |2
1 2
Bρ (x0 )∩K
|∇u |2 d x.
2
d x ≥ (1 − δ ) lim inf →0
Bρ (x0 )∩K
|∇u |2 d x,
and the weak convergence of u to u in W 1,2 (Bρ (x0 ) ∩ K) yields, by lower semicontinuity, 1 lim inf s2 |∇u |2 d x ≥ (1 − δ 2 )2 |∇u|2 d x. →0
Bρ (x0 )
Bρ (x0 )∩K
The conclusion (ii) follows after letting K to I and δ → 0. We are now going to establish (12.64). We claim that sup I |∇v | d x < +∞. Indeed, by convexity +∞ ≥ 2M (s , I ) ≥ 2(1 − s )|∇s | d x I = |∇v | d x. I
Let σ > 0 satisfying δ > σ and consider for all t in R, the sets At := [v ≤ t ]. According to the classical coarea formula, more precisely to Corollary 4.2.2, we have +∞ C ≥ |∇v | d x = 0 ([v = t ]) d t I
≥
−∞ δ σ
0 ([v = t ]) d t .
Therefore, there exists t ∈ ]σ, δ[ such that 0 ([v = t ]) ≤
C . δ−σ
The set At has then at
C most k = [ δ−σ ] connected components with k independent of : more precisely, there k exists a family (Ii )i =1,...,k of intervals (possibly empty) such that At = i =1 Ii . For every ( i = N n≥N Ii . The complementary of the union i = 1, . . . , k, consider the interval I∞ n k 0i of k intervals I∞ := i =1 I∞ is the required finite part Iδ of I . Indeed, since v converges a.e. to zero, 7 6 7 6 k 3 ' 3' 3 Ii = meas Atn meas(I∞ ) = meas i =1 N n≥N
n
6 ≥ meas
N n≥N
3'
n
7 [vn ≤ σ]
N n≥N
= meas(I )
i
i i
i
i
i
i
12.5. Application to image segmentation and phase transitions
book 2014/8/6 page 543 i
543
so that Iδ = I \ I∞ possesses k elements. Finally, if K is a compact set included in Iδ , 0
0
i i i arguing on each interval I∞ , we have K∩ I∞ ⊂⊂ I∞ and, for N large enough 0
i K∩ I∞ ⊂
' n≥N
Ii ⊂ n
'
[vn ≤ δ].
n≥N
Second step. We establish Proposition 12.5.1 in the N -dimensional case, N > 1. We will use the same notation for the functionals considered in the one-dimensional and the N -dimensional case. We begin by assuming u ∈ L∞ (Ω). Let (u , s ) be a sequence in X converging to (u, 1) such that lim inf→0 F ((u , s ), Ω) < +∞ and A any open subset of Ω. With the notation and definitions of Theorem 10.5.2 for all ν ∈ S N −1 and for a subsequence not relabeled, (u,x , s,x ) strongly converges in L1 (Ax )× L1 (Ax , [0, 1]) for N −1 a.e. x in Aν . (It’s an easy consequence of Fubini’s theorem.) On the other hand, N −1 lim inf F ((u,x , s,x ), Ax ) d ≤ lim inf F ((u,x , s,x ), Ax ) d N −1 Aν
→0
→0
Aν
= lim inf F ((u , s ), A) < +∞. →0
N −1
Thus, for a.e. x in Aν , lim inf→0 F ((u,x , s,x ), Ax ) < +∞. One may apply the result of the first step: for N −1 a.e. x in Aν , u x belongs to SBV (Ax ) ∩ L∞ (Ax ) and lim inf F ((u,x , s,x ), Ax ) ≥ |∇u x |2 + 0 (S ux ∩ Ax ). →0
Ax
Integrating this inequality over Aν , according to Theorem 10.5.2, we deduce that for all open subset A of Ω, u ∈ SBV (A) and lim inf F ((u , s ), A) ≥ lim inf F ((u,x , s,x ), Ax ) d N −1 →0
A
ν
→0
≥ Aν
Ax
|∇u x |2 d t d N −1 (x) +
|∇u.ν|2 d x +
= A
A
Aν
0 (S ux ∩ Ax ) d N −1 (x)
|ν u .ν| d N −1 Su .
We conclude thanks to Lemma 4.2.2 and Example 4.2.2. If now u is not assumed to belong to L∞ (Ω), by a truncation argument we have Γ − lim inf F ((u, s), Ω) ≥ Γ − lim inf F ((N ∧ u ∨ (−N ), s), Ω) →0
→0
≥ F ((N ∧ u ∨ (−N ), s), Ω). Letting N → +∞ gives the thesis. Proposition 12.5.2. For all (u, s) ∈ X there exists a subsequence ((u , s )) τ-converging to (u, s) such that F (u, s) ≥ lim sup→0 F (u , s ) or, equivalently, F ≥ Γ − lim sup→0 F . PROOF. One may assume u in SBV (Ω) ∩ L∞ (Ω). Indeed, if u ∈ GSBV (Ω), an easy truncation argument gives the thesis. For a given u ∈ SBV (Ω) ∩ L∞ (Ω), it suffices to construct (u , s ) in H 1 (Ω) × H 1 (Ω, [0, 1]) ∩ X , τ-converging to (u, 1) in X and satisfying F (u, s) ≥ lim sup→0 F (u , s ).
i
i i
i
i
i
i
book 2014/8/6 page 544 i
Chapter 12. Γ -convergence and applications
544
The expression of F has indeed a sense in H 1 (Ω) × H 1 (Ω, [0, 1]) ∩ X . Moreover, C1 (Ω) × C1 (Ω, [0, 1]) is dense in H 1 (Ω) × H 1 (Ω, [0, 1]) equipped with its strong topology and F is continuous for this topology which is stronger than τ. The conclusion will follow by a diagonalization argument. First step. Let a , b , c be three sequences in R+ going to zero which will be adjusted later in a suitable way. The idea consists in modifying (u, 1) in a neighborhood of S u to obtain, from the expression of F (u , s ), an equivalent of N −1 (S u ). We begin by assuming the following regularity condition on S u : lim
mes(Ω ∩ (S u )ρ )
ρ→0
2ρ
= HN −1 (S u ),
(12.65)
where (S u )ρ is the tubular neighborhood {x ∈ RN : d(x, S u ) < ρ} of order ρ of S u . In what follows, for any t ∈ R+ , (S u ) t denotes a tubular neighborhood of order t > 0 of S u . We construct u in H 1 (Ω) satisfying u = u in Ω\(S u )a and such that its gradient satisfies |∇u (x)| ≤
C
(12.66)
a
a.e. in (S u )a . Consider now s in H 1 (Ω, [0, 1]) such that 0 on (S u )a , s = 1 − c on Ω \ (S u )a +b . The positive constant c is introduced for technical reasons and, as said before, will be adjusted later. We have (u , s ) ∈ H 1 (Ω) × H 1 (Ω, [0, 1]) ∩ X and 2 2 2 F ((u , s ), Ω) = (s + )|∇u| d x + 2 |∇u |2 d x Ω\(S u )a
+
c2
meas(Ω \ (S u )a +b ) +
(S u )a
1
meas((S u )a ) 4 4 + M (s , (S u )a +b \ (S u )a ). The first term trivially goes to Ω |∇u|2 d x when goes to zero. We adjust a so that the second and fourth terms go to zero. For this, it suffices, thanks to (12.66), to select an intermediate power of between$ 2 and , for instance, a = 3/2 . To make the third term vanish, it suffices to choose c ≤ , for instance, c = 3/4 . We are reduced to finding s satisfying lim sup M (s , (S u )a +b \ (S u )a ) ≤ HN −1 (S u ). →0
Let us denote the map d (.) = dist(., S u ) by d . We try to find s of the form σ ◦d . Applying (1−σ ◦d )2
and to the the coarea formula Theorem 4.2.5 to the function g = |σ ◦ d |2 + 4 truncated function f = a ∨ d ∧ (a + b ) of d , we obtain a +b (1 − σ (t ))2 2 M (s , (S u )a +b \ (S u )a ) = |σ (t )| + HN −1 ([d = t ]) d t . 4 a
Consider h(t ) = meas([d < t ]). Then, according to Corollary 4.2.3, h (t ) = HN −1 ([d = t ]), and a +b (1 − σ (t ))2 2 |σ (t )| + M (s , (S u )a +b \ (S u )a ) = h (t ) d t . 4 a
i
i i
i
i
i
i
12.5. Application to image segmentation and phase transitions
book 2014/8/6 page 545 i
545
The function σ is chosen as the solution of the ordinary boundary value problem 1−σ σ = 2 , σ (a ) = 0, σ (a + b ) = 1 − c , a −t
that is, σ (t ) = 1 − exp( 2 ) when we choose b = − ln(3/2 ). On the other hand, the regularity assumption on S u yields for all η > 0 the existence of 0 such that for all < 0 and all t < a + b , one has h(t ) ≤ 2t (HN −1 (S u ) + η). Thanks to this estimate, the conclusion then follows by integrating by parts the expression M (s , (S u )a +b \ (S u )a ). (For details see [30, Proposition 5.1].) Second step. We do not assume hypothesis (12.65). To apply the first step, we construct a sequence uη converging to u in L2 (Ω) such that S uη satisfies (12.65) and such that F (u) =
limη→0 F (uη ). Afterwards, it will suffice to apply the procedure of the first step to the function uη and to conclude by a diagonalization argument. For constructing uη , the idea consists in finding uη as a solution of the Mumford–Shah problem 1 2 N −1 2 inf |∇v| + (Sv ) + |v − u| d x : v ∈ SBV (Ω ) , (+η ) η Ω Ω where Ω = Ω ∪ U and u is the extension of u on Ω defined by ⎧ ⎨ u(ϕ −1 (x)) if x ∈ U \ Ω, u(x) = γ0 (u) if x ∈ ∂ Ω, ⎩ u(x) if x ∈ Ω. U and ϕ are given by the regularity condition (=) fulfilled by Ω and γ0 denotes the trace operator. We next use the following regularity property related to Mumford–Shah solutions (see Ambrosio and Tortorelli [30] and De Giorgi, Carriero, and Leaci [196]): N −1 (S uη ∩Ω \ S uη ) = 0 and, for all compact set K included in S uη ∩ Ω , lim
meas((K)ρ )
ρ→0
2ρ
= N −1 (K).
Taking K = Ω ∩ S uη , we obtain the required regularity on uη in Ω.
Obviously, uη converges to u in L2 (Ω) thanks to the penalization parameter 1/η in (+η ). It remains to establish the convergence of F (uη ) to F (u). Consider the two Borel measures μη and μ in M+ (Ω ) defined for all Borel set B in Ω , by ⎧ ⎪ 2 N −1 ⎪ (B ∩ S uη ), ⎪ ⎨ μη (B) := |∇uη | d x + ⎪ ⎪ ⎪ ⎩ μ(B) :=
B
B
|∇u|2 d x + N −1 (B ∩ S u ).
It is worth noticing that u has no jump through ∂ Ω so that μ(∂ Ω) = 0. Taking v = u as a test function in (+η ), we obtain lim sup μη (Ω ) ≤ μ(Ω ). η→0
i
i i
i
i
i
i
546
book 2014/8/6 page 546 i
Chapter 12. Γ -convergence and applications
On the other hand, according to Theorem 13.4.3, which we will establish in the next chapter, we have for all open subset A of Ω μ(A) ≤ lim inf μη (A). η→0
According to Proposition 4.2.5, we deduce that μη narrow converges to μ. Since μ(∂ Ω) = 0, we have μη (Ω) → μ(Ω).
i
i i
i
i
i
i
book 2014/8/6 page 547 i
Chapter 13
Integral functionals of the calculus of variations
This chapter is devoted to the study of the sequential lower semicontinuity of certain types of functionals which occur in many variational problems. As noted in Chapter 11, lower semicontinuity is the key tool to apply the direct methods of the calculus of variations, and we deal in the sections below with some different cases, depending on the spaces the functionals are defined on. We will see that, due to the integral form of the functionals under consideration, the convexity or quasi-convexity conditions play a central role in all the results. We first complement Section 11.2 by establishing necessary and sufficient conditions for more general integral functionals to be lower semicontinuous. Then to complement Section 11.3, we deal with lower semicontinuity of functionals defined on the space of measures, on BV and SBV . We do not pretend to be exhaustive in this very widely studied field; we intend only to give here some principal results. For other cases and details see [26], [147], [153], [182], [225], [302].
13.1 Lower semicontinuity in the scalar case In this section we consider integral functionals of the form F (u) = f (x, u, D u) d x,
(13.1)
Ω
where u varies on a Sobolev space W 1, p (Ω). We stress the fact that in this section we restrict our attention to the case of functions u which take their values in R; some differences with the case of functionals defined on vector valued functions will be discussed in the next section. To study the sufficient conditions for the lower semicontinuity of functionals of the form (13.1), it is convenient to consider first the case of functionals of the form f x, u(x), v(x) d μ(x), (13.2) F (u, v) = Ω
where (Ω, , μ) is a measure space with the measure μ nonnegative and finite (i.e., μ ∈ M+ (Ω)), f : Ω×R m ×Rn → [0, +∞] is an ⊗B m ⊗Bn -measurable function (B m and Bn , respectively, denote the σ-algebras of Borel subsets of R m and Rn ), and u, v, respectively, vary in the spaces L1μ (Ω; R m ), L1μ (Ω; Rn ) of μ integrable R m , Rn valued functions. The theorem below is a lower semicontinuity result for functionals of the form (13.2); the link with the case (13.1) will be discussed later. 547
i
i i
i
i
i
i
548
book 2014/8/6 page 548 i
Chapter 13. Integral functionals of the calculus of variations
Theorem 13.1.1. Assume the function f satisfies the following conditions: (i) for μ-a.e. x ∈ Ω the function f (x, ·, ·) is lower semicontinuous on R m × Rn ; (ii) for μ-a.e. x ∈ Ω and for every s ∈ R m the function f (x, s, ·) is convex on Rn . Then the functional F defined in (13.2) is sequentially lower semicontinuous on the space L1μ (Ω; R m )× L1μ (Ω; Rn ) endowed with the strong topology on L1μ (Ω; R m ) and the weak topol-
ogy on L1μ (Ω; Rn ).
PROOF. Let u h → u strongly in L1μ (Ω; R m ) and v h → v weakly in L1μ (Ω; Rn ); we have to prove that (13.3) F (u, v) ≤ lim inf F (u h , v h ). h→+∞
Possibly passing to subsequences, we may assume without loss of generality that the lim inf in the right-hand side of (13.3) is a finite limit, that is, lim F (u h , v h ) = c ∈ R.
h→+∞
(13.4)
Since the sequence (v h ) is weakly compact in L1μ (Ω; Rn ), we may use the Dunford–Pettis theorem, Theorem 2.4.5, and the De La Vallée–Poussin criterion, Theorem 2.4.4, to conclude that there exists a function ϑ : [0, +∞[→ [0, +∞[, which can be taken convex and strictly increasing, with a superlinear growth, that is, ϑ(t )
lim
t →+∞
= +∞,
such that sup
h∈N Ω
Setting
t
ϑ(|v h |) d μ ≤ 1.
(13.5)
⎧ / ⎨ H (t ) = t ϑ(t ), −1 Φ(t ) = ϑ H (t ) , ⎩ ξ h (x) = H |v h (x)| ,
it is easy to see that (i) H is strictly increasing and H (t )/t → +∞ as t → +∞, (ii) Φ is strictly increasing and Φ(t )/t → +∞ as t → +∞, (iii) ϑ(t )/H (t ) → +∞ as t → +∞, (iv) Φ ξ h (x) = ϑ |v h (x)| . Therefore, by (13.5) we have
sup h∈N
Ω
Φ(ξ h ) d μ ≤ 1.
We can use now the Dunford–Pettis theorem again to deduce that the sequence (ξ h ) is weakly compact in L1μ (Ω), hence (up to extracting a subsequence) we may assume that ξ h → η weakly in L1μ (Ω) for a suitable η. By the Mazur theorem, a suitable sequence of
i
i i
i
i
i
i
13.1. Lower semicontinuity in the scalar case
book 2014/8/6 page 549 i
549
convex combinations of (ξ h , v h ) is strongly convergent in L1μ (Ω) × L1μ (Ω; Rn ) to (η, v). More precisely, there exist N h → +∞ and αi ,h ≥ 0 with N h+1 i =N h +1
αi ,h = 1
such that the sequences ηh =
N h+1 i =N h +1
αi ,h ξi ,
νh =
N h+1 i =N h +1
αi ,h vi
strongly converge to η in L1μ (Ω) and to v L1μ (Ω; Rn ), respectively. Possibly passing to subsequences we may also assume that η h → η, ν h → v, and u h → u pointwise μ-a.e. on Ω. Consider now a point x ∈ Ω where all the convergences above occur, and set 4 5 ⎧ h = max |u(x) − ui (x)| : N h < i ≤ N h+1 , ⎪ ⎪ ⎪ ⎪ N ⎨ h+1 αi ,h f x, ui (x), vi (x) , λh = ⎪ ⎪ i =N h +1 ⎪ ⎪ 4 5 ⎩ h = (ν, η, λ) ∈ Rn+2 : η = H (|ν|), ∃s ∈ R m , |s − u(x)| ≤ h , λ ≥ f (x, s, ν) . We have h → 0 and by definition of ν h , η h , λ h we obtain that ν h (x), η h (x), λ h (x) belongs to the convex hull c o h of h . Since h ⊂ Rn+2 , by the Carathéodory theorem on convex hulls in Euclidean spaces the vector ν h (x), η h (x), λ h (x) can be written as a convex combination of n + 3 elements of h , that is, there exist βi ,h ≥ 0,
νi ,h ∈ Rn ,
ηi ,h ≥ 0,
λi ,h ≥ 0
(i = 1, . . . , n + 3)
such that (νi ,h , ηi ,h , λi ,h ) ∈ h for every index i, and ⎧ n+3 n+3 ⎪ ⎪ ⎪ β = 1, βi ,h νi ,h = ν h (x), ⎪ i ,h ⎨ i =1
i =1
n+3 ⎪ ⎪ ⎪ ⎪ βi ,h ηi ,h = η h (x), ⎩ i =1
n+3 i =1
βi ,h λi ,h = λ h (x).
Therefore, for suitable si ,h ∈ R m with |si ,h − u(x)| ≤ h we have λi ,h ≥ f (x, si ,h , νi ,h ). By extracting subsequences, without loss of generality, we may assume that for every index i the sequence |νi ,h | tends to a limit and, denoting by I the set of indices i such that this limit is finite, again by passing to subsequences, we may also assume that ⎧ ∀i ∈ I , ⎨ νi ,h → νi ∀i ∈ / I, |ν | → +∞ ⎩ βi ,h → β ∀i = 1, . . . , n + 3. i ,h i Since
n+3 i =1
βi ,h H (|νi ,h |) = η h (x) → η(x),
i
i i
i
i
i
i
550
book 2014/8/6 page 550 i
Chapter 13. Integral functionals of the calculus of variations
the set I cannot be empty. From the relation n+3 i =1
βi ,h νi ,h = ν h (x) → v(x),
we obtain that βi = 0 for every i ∈ / I . Moreover, from η h (x) =
n+3 i =1
βi ,h ηi ,h ≥
βi ,h ηi ,h =
i ∈I /
we get i ∈I
H (|νi ,h |) |νi ,h |
∀i ∈ /I
βi = 1,
βi ,h |νi ,h |
i ∈I /
βi ,h |νi ,h | → 0
so that
i ∈I
βi νi = v(x).
We now use the assumptions on the function f to obtain f x, u(x), v(x) ≤ βi f x, u(x), νi i ∈I
≤ lim inf h→+∞
≤ lim inf h→+∞
i ∈I n+3 i =1
βi ,h f (x, si ,h , νi ,h ) βi ,h f (x, si ,h , νi ,h )
≤ lim inf λ h (x) h→+∞
so that by Fatou’s lemma, f (x, u, v) d μ ≤ lim inf λ h (x) d μ h→+∞
Ω
= lim inf h→+∞
Ω N h+1
i =N h +1
(13.6)
αi ,h
Ω
f (x, ui , vi ) d μ.
(13.7)
Fix now > 0; by using (13.4) we obtain, for h large enough, f (x, ui , vi ) d μ ≤ c + ∀i ∈ [N h + 1, N h+1 ] Ω
and so by (13.7) F (u, v) ≤ c + . The proof is then achieved by taking → 0+ . Remark 13.1.1. It is easy to see that the result of Theorem 13.1.1 remains true if the measure μ is only assumed to be σ-finite. The result above, under the slightly stronger assumption that f (x, ·, ·) is continuous for μ-a.e. x ∈ Ω, was first obtained by De Giorgi in 1968 in an unpublished paper. The original proof by De Giorgi is obtained by approximating from below the convex function f (x, s, ·) by finite suprema of affine functions fk (x, s, ·) for which the proof is easier, and then passing to the limit as k → +∞; the interested reader may find further details about this type of proof in the book by Buttazzo [147]. The proof reported above follows on the contrary the scheme of the proof which was given in 1977 by Ioffe [246].
i
i i
i
i
i
i
13.1. Lower semicontinuity in the scalar case
book 2014/8/6 page 551 i
551
By using the result of Theorem 13.1.1 we can easily give some sufficient conditions for the sequential lower semicontinuity of functionals of the form (13.1) on the Sobolev space W 1,1 (Ω). More precisely, the following result holds. Theorem 13.1.2. Let Ω be a Lipschitz domain of Rn and let f : Ω × R m × R mn → [0, +∞] be a function verifying the assumptions of Theorem 13.1.1. Then the functional F defined in (13.1) is sequentially weakly lower semicontinuous on the Sobolev space W 1,1 (Ω; R m ). PROOF. Let (u h ) be a sequence in W 1,1 (Ω; R m ) converging weakly to some function u. Setting v h = D u h we have that v h converges weakly to v = D u in L1 (Ω; R mn ) and, by the Rellich theorem, u h converge strongly to u in L1 (Ω; R m ). By Theorem 13.1.1 we have F (u) = f (x, u, v) d x ≤ lim inf f (x, u h , v h ) d x = lim inf F (u h ), Ω
h→+∞
Ω
h→+∞
which then proves the assertion. In the so-called scalar case (i.e., when m = 1), and in the case of ordinary integrals as well (i.e., when n = 1), we will prove below that the convexity of the integrand f with respect to the gradient is also a necessary condition for the semicontinuity. This is no longer true in the vector valued case m > 1 for multiple integrals, as we will discuss in the next section. For the sake of simplicity we here limit ourselves to consider only the case of functionals of the form f (D u) d x; (13.8) F (u) = Ω
the more general case (13.1) presents only technical differences in the proof, and we refer to one of the books mentioned at the beginning of this chapter for the details. Theorem 13.1.3. Assume that either m = 1 or n = 1 and that the functional F in (13.8) is sequentially weakly* lower semicontinuous in the Sobolev space W 1,∞ , in the sense that F (u) ≤ lim inf F (u h ) h→+∞
(13.9)
for every sequence u h converging to u uniformly in Ω and with D u h uniformly bounded in Ω. Then the function f is convex and lower semicontinuous. PROOF. We give the proof only in the case m = 1, the other one being similar. Let z1 , z2 ∈ Rn , let t ∈ ]0, 1[, and let z = t z1 +(1−t )z2 . Denote the linear function u z (x) = z·x by u z and define ⎧ z2 − z1 ⎪ ⎪ z0 = , ⎪ ⎪ ⎪ |z ⎪ 2 − z1 | N M ⎪ ⎪ ⎪ j −1 j −1+t ⎪ ⎪ , j ∈ Z, h ∈ N, Ω1h j = x ∈ Ω : h < z0 · x < h ⎪ ⎪ ⎪ N M ⎪ ⎪ ⎪ j −1+t j 2 ⎪ ⎪ j ∈ Z, h ∈ N, ⎨ Ω h j = x ∈ Ω : h < z0 · x < h , M N 1 ⎪ Ωh j : j ∈ Z , Ω1h = ⎪ ⎪ ⎪ M N ⎪ ⎪ ⎪ 2 2 ⎪ ⎪ Ω = : j ∈ Z , Ω ⎪ h hj ⎪ ⎪ ⎪ ⎪ ⎪ c h1 j + z1 · x if x ∈ Ω1h j , ⎪ ⎪ ⎪ ⎪ u (x) = ⎩ h c h2 j + z2 · x if x ∈ Ω2h j ,
i
i i
i
i
i
i
552
book 2014/8/6 page 552 i
Chapter 13. Integral functionals of the calculus of variations
where c h1 j =
( j − 1)(1 − t )
h It is easy to verify that, as h → +∞, meas(Ω1h ) meas(Ω)
c h2 j = −
|z2 − z1 |,
meas(Ω2h )
→ t,
meas(Ω)
jt h
|z2 − z1 |.
→ 1 − t.
(13.10)
Moreover, the functions u h are Lipschitz continuous and for every x ∈ Ω1h j we have j −1 + (z1 − z) · x| = (1 − t ) |z2 − z1 | + (z1 − z2 ) · x h j −1 t (1 − t ) − z0 · x ≤ |z2 − z1 |. = (1 − t )|z2 − z1 | h h
|u h (x) − u z (x)| = |c h1 j
Analogously, a similar computation gives for every x ∈ Ω2h j |u h (x) − u z (x)| ≤
t (1 − t ) h
|z2 − z1 |.
Therefore u h → u z uniformly on Ω. Moreover, it is immediate to see that the gradients D u h are uniformly bounded on Ω, so that the sequence (u h ) converges to u z weakly* in W 1,∞ (Ω). By the assumption (13.9), using also (13.10), we then obtain f (z) meas(Ω) = F (u z ) ≤ lim inf F (u h ) h→+∞ " ! = lim inf f (z1 ) meas(Ω1h ) + f (z2 ) meas(Ω2h ) h→+∞
= t f (z1 ) meas(Ω) + (1 − t ) f (z2 ) meas(Ω), which proves the convexity of f . The lower semicontinuity of f on Rn is a straightforward consequence of the sequential lower semicontinuity assumption (13.9) on F .
13.2 Lower semicontinuity in the vectorial case We have seen in the previous section that the convexity assumption on the integrand f (x, s, ·) is necessary and sufficient for the sequential weak lower semicontinuity of the functional F (u) =
Ω
f (x, u, D u) d x
(13.11)
in the case of scalar functions u. On the contrary, in the case of functions u with vector values, the convexity of the integrand with respect to the gradient variable describes only a small class of weakly lower semicontinuous functionals. For this reason, according to Morrey [302] we introduce the notion of quasi-convexity defined in Chapter 11 in the context of relaxation theory. Definition 13.2.1. A Borel function f : R mn → [0, +∞] is said to be quasi-convex if f z + Dφ(x) d x (13.12) f (z) meas(A) ≤ A
i
i i
i
i
i
i
13.2. Lower semicontinuity in the vectorial case
book 2014/8/6 page 553 i
553
for a suitable (hence for all) bounded open subset A of Rn , every m × n matrix z, and every φ ∈ C10 (A; R m ). Remark 13.2.1. It is possible to prove that when either m = 1 or n = 1 quasi-convexity reduces to the usual convexity; this will actually follow from Theorem 13.1.3 once the equivalence between lower semicontinuity and quasi-convexity will be proved. On the other hand, there are many examples of quasi-convex functions which are not convex, as, for instance, the function z → f (z) = | det z|. If f : R mn → [0, +∞] is quasi-convex, then for every z ∈ R mn and s ∈ R m the function φ s ,z : Rn → [0, +∞] defined by φ s ,z (ξ ) = f (z + s ⊗ ξ )
(13.13)
is convex. This fact follows again from the results of the previous section on the scalar case, once we remark that by (13.12) we obtain φ s ,z (ξ ) meas(A) ≤ φ s ,z ξ + Dφ(x) d x A
for every vector ξ ∈ Rn and every scalar function φ ∈ C10 (A). The property given by (13.13) is called rank-one convexity. From the convexity of the functions φ s ,z defined in (13.13), we obtain that every rank-one convex function f of class C2 (R mn ) satisfies the so-called Legendre–Hadamard condition: ∂ 2f ∂ zi j ∂ z hk
(z0 )αi α h β j βk ≥ 0
for all z0 ∈ R mn and for all α ∈ R m , β ∈ Rn . (The summation convention over repeated indices is adopted.) Moreover, we get that every rank-one convex finite-valued function is locally Lipschitz on R mn . This is made precise in Lemma 13.2.1. Remark 13.2.2. A wide class of quasi-convex functions is the class of polyconvex functions, introduced by Ball in [79]. A function f is called polyconvex if it can be written in the form f (z) = g X (z) ∀z ∈ R m×n , where X (z) denotes the vector of all subdeterminants of the matrix z and g is a convex function. For instance, if m = n = 2, every polyconvex function is of the form g (z, det z) with g convex on R4 × R; analogously, if m = n = 3, every polyconvex function is of the form g (z, adj z, det z) with g convex on R9 ×R9 ×R and where adj z denotes the adjugate matrix of z, that is the transpose of the matrix of cofactors of z. Lemma 13.2.1. Let f : R mn → R be a rank-one convex function such that 0 ≤ f (z) ≤ c(1 + |z| p )
∀z ∈ R mn .
Then, for a suitable constant k > 0 we have | f (z) − f (w)| ≤ k|z − w| 1 + |z| p−1 + |w| p−1 ∀z, w ∈ R mn .
i
i i
i
i
i
i
554
book 2014/8/6 page 554 i
Chapter 13. Integral functionals of the calculus of variations
PROOF. With f convex with respect to each column vector, it is enough to prove the inequality in the case f convex. Again, arguing component by component, we may assume that f is a convex function of only one variable t . These functions are differentiable almost everywhere, and we have f (t ) ≤ f (t ) ≥
f (t + h) − f (t ) h f (t + h) − f (t ) h
∀h > 0,
(13.14)
∀h < 0.
(13.15)
Taking h = 1 + |t | in (13.14) and h = −1 − |t | in (13.15) we obtain for a.e. t ∈ R | f (t )| ≤
f (t + h) |h|
≤
c 1 + |t |
1 + |t | p + |h| p ≤ c 1 + |t | p−1 ,
from which the conclusion follows easily. For a systematic study of the properties of quasi-convex functions, see [303], [3], [286], [302], and [182]. The main interest of the notion of quasi-convexity consists in its relation with the lower semicontinuity of integral functionals, in the sense specified by the following theorem. Theorem 13.2.1. Let p ≥ 1 and let f : Ω × R m × R mn → R be a Carathéodory integrand such that 0 ≤ f (x, s, z) ≤ c a(x) + |s| p + |z| p (13.16) for all (x, s, z) ∈ Ω×R m ×R mn , where c ≥ 0 is a constant, and a ∈ L1 (Ω). Then the following conditions are equivalent: (i) for a.e. x ∈ Ω and every s ∈ R m the function f (x, s, ·) is quasi-convex; (ii) the functional F defined by (13.11) is sequentially lower semicontinuous on W 1, p (Ω; R m ) with respect to its weak topology. PROOF. We give here the proof only in the basic case f = f (z), where 0 ≤ f (z) ≤ c 1 + |z| p ; the proof of the general case can be found in the references mentioned above. We start by proving that the quasi-convexity condition (i) implies the lower semicontinuity condition (ii). We have to prove that F (u) ≤ lim inf F (u h ) h→+∞
(13.17)
whenever u h → u weakly in W 1, p (Ω; R m ). This will be done in three steps. First step. Inequality (13.17) holds when u is affine and u h = u on ∂ Ω. In fact, in this 1, p case D u is a constant matrix z and u h − u ∈ W0 (Ω; R m ) so that, by definition of quasiconvexity, we get f z + D(u h − u) d x = F (u h ), F (u) = |Ω| f (z) ≤ Ω
hence (13.17).
i
i i
i
i
i
i
13.2. Lower semicontinuity in the vectorial case
book 2014/8/6 page 555 i
555
Second step. Inequality (13.17) holds when u is affine. We use a slicing method near the boundary of Ω (see also the proof of Proposition 11.2.3). Let Ω0 be a compact subset of Ω, let R = 12 dist(Ω0 , ∂ Ω), and let N ∈ N. For every integer i = 1, . . . , N define
iR
Ωi = x ∈ Ω : dist(x, Ω0 )
0 and let w be a piecewise affine function such that u − wW 1, p < . In particular, there exist open sets Ωi and constant matrices zi such that Dw = zi on Ωi . Setting wi ,h (x) = u h (x) − u(x) + zi x on Ωi ,
i
i i
i
i
i
i
556
book 2014/8/6 page 556 i
Chapter 13. Integral functionals of the calculus of variations
we have wi ,h → zi x weakly in W 1, p (Ωi ; R m ) so that, by the second step, f (zi ) d x ≤ lim inf f (Dwi ,h ) d x. h→+∞
Ωi
Ωi
By Lemma 13.2.1 we get f (Dwi ,h ) d x ≤ | f (D u h ) − f Dwi ,h | d x F (u h ) − Ωi Ωi i i |D u h − Dwi ,h | 1 + |D u h | p−1 + |Dwi ,h | p−1 d x ≤c i
≤c
i
≤c ≤ c. Analogously,
Ωi
Ωi
|D u − Dw| 1 + |D u h | p−1 + |D u − Dw| p−1 d x 1/ p
p
Ω
|D u − Dw| d x
Ω
p
1 + |D u h | + |D u − Dw|
p
1−1/ p dx
f (zi ) d x ≤ c. F (u) − Ω i
Therefore, F (u) ≤ c +
i
≤ c +
i
i
Ωi
f (zi ) d x
lim inf h→+∞
≤ c + lim inf
Ωi
h→+∞
i
Ωi
f (Dwi ,h ) d x f (Dwi ,h ) d x
≤ 2c + lim inf F (u h ), h→+∞
and the conclusion follows by taking the limit as → 0+ . We prove now that lower semicontinuity condition (ii) implies quasi-convexity of f . The proof is based on the following result, which is, as said in Example 2.4.2 about oscillation phenomena, a straightforward consequence of a general ergodic theorem. (See also Lemma 12.3.1.) Lemma 13.2.2. Let Q be any open cube in Rn of size L > 0, v a function in L p (Q, R m ), p and v˜ its Q-periodic extension, i.e., the function of L l oc (Rn , R m ) defined by ˜ = v(x − z) if x ∈ Q + z, z ∈ LZn . v(x) p
˜ x) weakly converges in L l oc (Rn , R m ) Then the sequence (v h ) h∈N defined by v h (x) = v(h 1 (weak* if p = +∞) to its mean value meas(Q) Q v(x) d x. The proof is a straightforward consequence of Proposition 13.2.1 that we establish below because of its own interest.
i
i i
i
i
i
i
13.2. Lower semicontinuity in the vectorial case
book 2014/8/6 page 557 i
557
Let now z be an m × n matrix, l z the linear function defined by l z (x) = z x, u ∈ C10 (Q, R m ), u˜ its Q-periodic extension, and set for every x ∈ Rn , u h (x) = u˜(h x)/h. It is p easily seen that u h strongly converges to 0 in L l oc (Rn , R m ). On the other hand, according to Lemma 13.2.2, D u h weakly converges to 1 D u(x) d x = 0 meas(Q) Q p
in L l oc (R m , R mn ). Therefore u h + l z weakly converges to l z in W 1, p (Q, R m ) and, by hypothesis (ii), lim inf f (D u h + z) d x ≥ f (z) d z = meas(Q) f (z). (13.18) h→+∞
Q
Q
A change of scale and the periodicity assumption on u˜ gives 1 1 f (D u h + z) d x = f (D u˜ + z) d x meas(Q) Q meas(hQ) hQ 1 = f (D u + z) d x meas(Q) Q so that (13.18) yields 1 meas(Q)
f (D u + z) d x ≥ f (z), Q
which completes the proof. Remark 13.2.3. When p = +∞ the result of Theorem 13.2.1 still holds if we substitute the weak topology with the weak* topology of W 1,∞ (Ω; R m ), and condition (13.16) with 0 ≤ f (x, s, z) ≤ α(x, |s|, |z|)
(13.19)
for all (x, s, z) ∈ Ω×R m ×R mn , where α(x, t , τ) is a function which is summable in x and increasing in t and τ. Remark 13.2.4. From what was presented above, it follows that the implications f convex
⇒
f polyconvex
⇒
f quasi-convex
⇒
f rank-one convex
hold true. We stress the fact that, as shown in Theorem 13.2.1, quasi-convexity is the right property to use when dealing with the lower semicontinuity of integral functionals of the calculus of variations. However, due to its intrinsic definition, it is not easy to work with quasi-convex functions, while polyconvexity and rank-one convexity conditions are much more explicit: in many cases, the sufficiency of the first and the necessity of the second are of great help. None of the implications above can be reversed; this can be seen very easily for the first one (indeed |det z| is a polyconvex function which is not convex), whereas more delicate counterexamples are needed for the remaining ones. See the book by Dacorogna [182] and the paper by Sverak [342] for details concerning these topics. The only problem that remains open in the study of the implications above is the equivalence between quasiconvexity and rank-one convexity in the case m = n = 2: neither counterexamples nor proofs of the equivalence are known.
i
i i
i
i
i
i
558
book 2014/8/6 page 558 i
Chapter 13. Integral functionals of the calculus of variations
We state now the ergodic theorem stated in Lemma 13.2.2, which is a generalization of the convergence result described in Example 2.4.2 about oscillations phenomena, and a particular case of Lemma 12.3.1. Proposition 13.2.1. With the notation of Lemma 12.3.1, let Q be a cube in Rn of the form Πni=1 (ai , ai + L) for some L > 0, and : b (Rn ) → R satisfying (i) A∪B = A + B for every disjoint set of b (Rn ), (ii) A+z = A for every set of b (Rn ) and every z in LZn . Then, for any bounded Borel convex subset B of Rn , lim
hB
h→+∞
meas(hB)
=
1 meas(Q)
Q .
Consequently, if v belongs to L p (Q, R m ) and v˜ denotes its Q-periodic extension, then v h p ˜ x), h ∈ N, weakly converges in L l oc (Rn , R m ) (weakly* if p = +∞) defined by v h (x) = v(h 1 v(x) d x. to its mean value meas(Q) Q PROOF. When B = Q, the first assertion is a straightforward consequence of the following decomposition: 3 hQ = (z + Q). z∈LZn ∩hQ
For the proof of the general case, see [167]. We are going to prove the second assertion. Let U be an open ball of Rn ; then v h is bounded in L p (U, R m ). Indeed 1 meas(U )
U
|v h | p d x =
1 meas(hU )
˜ p d x, |v| hU
which converges thanks to the first assertion. Therefore, in the case when p > 1, there exists a subsequence of (v h ) h∈N (not relabeled) and u ∈ L p (U, R m ) such that v h u weakly in L p (U, R m ). To identify the weak limit u, we work in the space of measures M(U, R m ). Obviously v h u " n U weakly in M(U, R m ). For a.e. x0 in U , ρ in (0, 1) \ N where N is a countable subset of R (see Lemma 4.2.2), according to the first assertion, we have 1 u(x) d x u(x0 ) = lim ρ→0 meas(B (x )) B (x ) ρ 0 ρ 0 1 v h (x) d x = lim lim ρ→0 h→+∞ meas(B (x )) B (x ) ρ 0 ρ 0 1 ˜ dx = lim lim v(x) ρ→0 h→+∞ meas(hB (x )) hB (x ) ρ 0 ρ 0 1 v(x) d x. = meas(Q) Q It remains to treat the case p = 1. We establish the uniform integrability of the sequence (v h ) h∈N in L1 (U, R m ); the conclusion will follow by the Dunford–Pettis theorem,
i
i i
i
i
i
i
13.3. Lower semicontinuity for functionals defined on the space of measures
book 2014/8/6 page 559 i
559
Theorem 2.4.5, and by the above procedure for identifying the weak limit. We now use a truncation argument. Let δ > 0 intended to go to +∞ and set ⎧ v = v h ∧ δ ∨ (−δ), ⎪ ⎪ ⎨ h,δ vδ = v ∧ δ ∨ (−δ), = v h − v h,δ , w ⎪ ⎪ ⎩ h,δ w δ = v − vδ . Note that v h,δ = (vδ ) h . For every Borel subset A of U , we have
A
|v h | d x ≤
U
|w h,δ | d x +
A
|v h,δ | d x.
(13.20)
On the other hand, according to the first assertion, 1 lim |w h,δ | d x = |w | d x. h→+∞ U meas(Q) Q δ Then, given > 0, since there exists δ large enough such that 1 |wδ | d x < , meas(Q) Q 4 there exists h() ∈ N such that sup h≥h()
|w h,δ | d x < . 2 U
(13.21)
Collecting (13.20) and (13.21), we deduce sup |v h | d x < + δ meas(A) 2 h≥h() A and the conclusion follows by taking meas(A)
0 such that ∀b ∈ K
f ∗ (b ) ≤ C .
(13.24)
Approximating Theorem 10.1.2 may be generalized as follows. Theorem 13.4.1. Let f be a nonnegative convex function satisfying (13.23) and (13.24); then the space C∞ (Ω) ∩ BV (Ω) is dense in BV (Ω) equipped with the intermediate convergence associated with f . Namely, for all u in BV (Ω), there exists un in C∞ (Ω) ∩ BV (Ω) such that u → u strongly in L1 (Ω); n |D un | d x → |D u|; Ω Ω f (D un ) d x → f (D u). Ω
Ω
PROOF. Proceed exactly as in the proof of Theorem 10.1.2. For a proof explained in detail, consult Temam [348]. With the intention of obtaining a complete description of the two lower semicontinuous envelopes of the functionals F and G, we assume a coerciveness condition on f : there exists a positive constant α such that ∀a ∈ RN
α(|a| − 1) ≤ f (a).
(13.25)
Theorem 13.4.2. Let Ω be a Lipschitz bounded open subset of RN with boundary Γ and f a nonnegative and convex function satisfying (13.23), (13.24), and (13.25). Then the lsc envelopes F and G of the functionals F and G in L1 (Ω) equipped with its strong topology are
i
i i
i
i
i
i
564
book 2014/8/6 page 564 i
Chapter 13. Integral functionals of the calculus of variations
defined by F (u) =
⎧ ⎨ ⎩
Ω
f (∇u) d x +
Ω
f ∞ (D s u)
if u ∈ BV (Ω),
+∞ otherwise; ⎧ ⎨ f (∇u) d x + f ∞ (D s u) + f ∞ (γ0 (u)ν) d N −1 G(u) = Ω Γ ⎩ Ω +∞ otherwise.
if u ∈ BV (Ω),
PROOF. We argue as in the proof of Proposition 11.3.2. For the function F , we must establish that for all u in L1 (Ω), if un → u in L1 (Ω), then F (u) ≤ lim inf F (un ), n→+∞
there exists un converging to u in L1 (Ω), such that F (u) ≥ lim sup F (un ). n→+∞
These two assertions are straightforward consequences of Proposition 13.4.1 and Theorem 13.4.1. Indeed, let un → u strongly in L1 (Ω), such that lim infn→+∞ F (un ) < +∞. From (13.25), for a nonrelabeled subsequence, un u in BV (Ω) and, according to Proposition 13.4.1, F (u) ≤ lim inf F (un ). n→+∞
On the other hand, for u ∈ BV (Ω), Theorem 13.4.1 provides a sequence of functions un in C∞ (Ω) ∩ BV (Ω) such that F (u) ≥ lim supn→+∞ F (un ). For the function G, the proof is exactly that of Proposition 11.3.2. Let us recall that it suffices to apply the previous result related to F after enlarging the set Ω to obtain the first assertion and to approach Ω from below to derive the second.
13.4.2 Compactness and lower semicontinuity in S BV The following result, due to Ambrosio, is a key tool in the so-called direct method in the calculus of variations when working with integral functionals defined in SBV (Ω) equipped with the strong topology of L1 (Ω) (see Chapter 14). In this section, Ω denotes an open bounded subset of RN . Theorem 13.4.3. Let (un )n∈N be a sequence in SBV (Ω) satisfying for p > 1, p N −1 sup un ∞ + |∇un | d x + (S un ) < +∞. Ω
n∈N
Then there exists a subsequence (unk )k∈N and a function u in SBV (Ω), such that u nk → u
strongly in L1l oc (Ω);
∇unk ∇u
weakly in L p (Ω, RN );
N −1 (S u ) ≤ lim inf N −1 (S un ). k−→+∞
k
Moreover, the Lebesgue part and the jump part of the derivatives converge separately. More precisely, ∇unk ∇u weakly in L1 (Ω) and J unk J u weakly in M(Ω, RN ).
i
i i
i
i
i
i
13.4. Functionals with linear growth: Lower semicontinuity in BV and SBV
book 2014/8/6 page 565 i
565
PROOF. In what follows C denotes various constants which do not depend on n. Thanks to inequalities |∇un | p d x ≤ C and |un |∞ ≤ C , Ω
we have un BV (Ω) ≤ C . Indeed, according to Remark 10.3.4, for N −1 a.e. x in S un , |un+ (x)| ≤ un L∞ (Ω) ≤ C and |un− (x)| ≤ un L∞ (Ω) ≤ C . From the compactness of the embedding of BV (U ) into L1 (U ), for each regular open subset of Ω, there exists a subsequence (unk )k∈N and u in BV (Ω) such that u nk u u nk → u
weakly in BV (Ω), strongly in L1l oc (Ω).
Let us show that u ∈ SBV (Ω). Since unk ∈ SBV (Ω), according to Theorem 10.5.1, there exists a Borel measure μnk in M(Ω × R, RN ) such that for all Φ ∈ C1c (Ω, RN ) and all ϕ ∈ C10 (R),
Ω×R
ϕ(s)Φ(x) μnk (d x, d s) = −
|μnk |(Ω × R) = 2 H
N −1
Ω
ϕ (unk ) ∇unk . Φ(x) + ϕ(unk ) div Φ(x) d x,
(S un ). k
The second equality and the hypothesis N −1 (S un ) ≤ C yield the existence of a Borel k
measure μ in M(Ω × R, RN ) and a subsequence (not relabeled) of (μnk )k∈N , weakly converging to μ in M(Ω×R, RN ). On the other hand, for a further nonrelabeled subsequence, there exists a ∈ L1 (Ω, RN ) such that ∇unk weakly converges to a in L1 (Ω, RN ). Going to the limit in the first equality, we obtain ϕ(s)Φ(x) μ(d x, d s) = − ϕ (u) a . Φ(x) + ϕ(u) div Φ(x) d x, Ω×R
Ω
which, from Theorem 10.5.1, yields u ∈ SBV (Ω) and a = ∇u. On the other hand, according to the lower semicontinuity of the total variation, one has 2 N −1 (S u ) = |μ|(Ω × R) ≤ lim inf |μnk |(Ω × R) k−→+∞
= lim inf 2 N −1 (S un ). k−→+∞
k
Finally, since D unk weakly converges to D u in M(Ω, RN ) and ∇unk weakly converges to ∇u in L1 (Ω, RN ), one has J unk = D unk − ∇unk " N weakly converges to J u = D u − ∇u" N in M(Ω, RN ). When (un )n∈N is not bounded in L∞ (Ω), we obtain the same result provided that (un )n∈N be bounded in BV (Ω). Note also that the boundedness of (∇un )n∈N in L p (Ω) with p > 1, implies the equi-integrability of (∇un )n∈N . In this sense, the next theorem generalizes Theorem 13.4.3.
i
i i
i
i
i
i
566
book 2014/8/6 page 566 i
Chapter 13. Integral functionals of the calculus of variations
Theorem 13.4.4. Let (un )n be a sequence in SBV (Ω) satisfying (i) supn∈N {|un |BV (Ω) } < +∞; (ii) the approximate gradients ∇un are equi-integrable (i.e., (∇un )n∈N is relatively compact with respect to the weak topology of L1 (Ω, RN )); (iii) the sequence ( N −1 (S un ))n∈N is bounded. Then there exists a subsequence (unk )k∈N weakly converging to some u ∈ SBV (Ω) such that u nk → u
strongly in L1l oc (Ω);
∇unk ∇u J u nk J u
weakly in L1 (Ω, RN ); weakly in M(Ω, RN );
N −1 (S u ) ≤ lim inf N −1 (S un ). k−→+∞
k
PROOF. According to Theorem 2.4.5, equi-integrability condition (ii) yields the existence of a subsequence of (∇un )n∈N and a in L1 (Ω, RN ) such that ∇un a in L1 (Ω, RN ). Then we argue as in the proof of Theorem 13.4.3 and adopt the same notation. We only have to justify the convergence of ϕ (unk ) ∇unk . Φ(x) d x Ω
to
Ω
ϕ (u) ∇u . Φ(x) d x.
According to Egorov’s theorem, since ϕ (unk ) converges a.e. to ϕ (u), for all > 0 there exists a Borel subset Ω of Ω such that " N (Ω \ Ω ) < and limk→+∞ sup x∈Ω |ϕ(unk ) − ϕ(u)| = 0. Let us write ϕ (unk ) ∇unk . Φ(x) d x = ϕ (unk ) ∇unk . Φ(x) d x + ϕ (unk ) ∇unk . Φ(x) d x. Ω
Since moreover supk∈N right-hand side tends to
Ω
Ω
Ω\Ω
|∇unk | d x < +∞, one easily obtains that the first term in the Ω
ϕ (u) ∇u . Φ(x) d x.
Letting → 0, the conclusion then follows, provided that we establish lim lim ϕ (unk ) ∇unk . Φ(x) d x = 0, →0 k→+∞
Ω\Ω
which is a straightforward consequence of ϕ (unk ) ∇unk . Φ(x) d x ≤ C |∇unk | d x Ω\Ω Ω\Ω
and equi-integrability of (∇un )n∈N .
i
i i
i
i
i
i
13.4. Functionals with linear growth: Lower semicontinuity in BV and SBV
book 2014/8/6 page 567 i
567
Remark 13.4.1. If one of the two conditions (ii) and (iii) is not satisfied, the conclusion may fail. Indeed, in the Cantor–Vitali example of Section 10.4, un belongs to SBV (0, 1), weakly converges to the Cantor–Vitali function u in BV (0, 1) and 0 (S un ) = 0. Nev ertheless (∇un )n∈N is not equi-integrable since (0,1) ∇un d x = 1 does not converge to ∇u d x = 0. With the notation of this example, consider now vn in SBV (0, 1) de(0,1) fined by vn = un 1(0,1)\Cn . It is easily seen that vn weakly converges to u in BV (0, 1) and that ∇vn = 0 is obviously equi-integrable. But 0 (Svn ) = 2(2n − 1) is not uniformly bounded. Condition (iii) of Theorem 13.4.4 can be weakened by the following condition (iii ): there exists a function ψ : [0, +∞) → [0, +∞] such that ψ(t )/t → +∞ as t → 0 and sup ψ |un+ − un− | d N −1 < +∞. n∈N
S un
For a proof, consult Braides [122]. Remark 13.4.2. Theorem 13.4.4 obviously holds in the vectorial case, i.e., when the considered sequences belong to SBV (Ω, R m ). We now deal with the lower semicontinuity property of functionals of the form f (∇u) d x + g (u + , u − )h(ν u ) d N −1 S u , Ω
Ω
where f , g , and h verify suitable conditions. Theorem 13.4.5. Let us consider a function f : RN → R+ satisfying the De La Vallée– Poussin criterion: f is convex and lim
a→+∞
f (a) |a|
= +∞.
Let, moreover, g : R × R → R+ be a lower semicontinuous symmetric and subadditive function, i.e., g (a, b ) = g (b , a) ≤ g (b , c) + g (c, a) ∀ a, b , c ∈ R, and assume that g (a, b ) ≥ max(ψ(|a − b |), δ|a − b |) for all a, b ∈ R where the function ψ : [0, +∞) → [0, +∞] satisfies the condition ψ(t )/t → +∞ as t → 0, and δ is some positive constant. Let finally h : RN → [0, +∞) be a convex, even function, positively homogeneous of degree 1 and satisfying, for all ν ∈ RN , h(ν) ≥ c|ν| for some positive constant c. Then the functional defined in L1 (Ω) by ⎧ ⎨ f (∇u) d x + g (u + , u − )h(ν u ) d N −1 S u if u ∈ SBV (Ω), F (u) = Ω Ω ⎩ +∞ otherwise is lower semicontinuous for the strong topology of L1 (Ω). If the lower semicontinuous, symmetric, and subadditive function g only satisfies the condition g (a, b ) ≥ ψ(|a − b |), then, given a nonempty compact subset K of R, the functional F˜
i
i i
i
i
i
i
568
book 2014/8/6 page 568 i
Chapter 13. Integral functionals of the calculus of variations
defined in L1 (Ω) by ⎧ ⎨ f (∇u) d x + g (u + , u − )h(ν u ) d N −1 S u F˜ (u) = Ω ⎩ Ω +∞ otherwise
if u ∈ SBV (Ω) and u(x) ∈ K,
is lower semicontinuous for the strong topology of L1 (Ω). SKETCH OF THE PROOF. The proof is based on the following lemma. Lemma 13.4.1. Let g : R × R → R+ be a lower semicontinuous, symmetric, and subadditive function and h : RN → [0, +∞) be a convex, even function, positively homogeneous of degree 1. Then g (u + , u − )h(ν) d N −1 S u ≤ lim inf g (un+ , un− )h(ν un ) d N −1 S un n→+∞
Ω
Ω
whenever un , u satisfy the thesis of Theorem 13.4.4 (with condition (iii) or (iii )). For the proof of Lemma 13.4.1, consult Braides [122, Theorem 2.12]. Let (un )n∈N be a sequence strongly converging to some u in L1 (Ω) and such that lim infn→+∞ F (un ) < +∞. From the De La Vallée–Poussin criterion, there exists a subsequence of (un )n∈N (not relabeled) such that ∇unk weakly converges in L1 (Ω, RN ). On the other hand, according to the coercivity assumption on g , sup |un+ − un− | d N −1 S un < +∞. n∈N
Ω
The sequence (un )n∈N is then bounded in BV (Ω). Moreover, from the assumption made on ψ, condition (iii ) of Remark 13.4.1 is satisfied. Thus conditions (i), (ii), and (iii ) of Theorem 13.4.4 are fulfilled. The conclusion then follows from the convexity of f and Lemma 13.4.1. The proof of the lower semicontinuity property of the functional F˜ follows the same scheme. The boundedness sup |un+ − un− | d N −1 S un < +∞ n∈N
Ω
|un+ (x)|
is now satisfied thanks to ≤ un L∞ (Ω) ≤ C and |un− (x)| ≤ un L∞ (Ω) ≤ C for N −1 a.e. x in S un (cf. Remark 10.3.4). Remark 13.4.3. In the vectorial case, Theorem 13.4.5 holds with the same conditions on g and h (now g : R m × R m → R+ ), and when f : R × M m×N → R is quasi-convex and satisfies the growth conditions of order p > 1: for all A ∈ M m×N , |A| p ≤ f (x, A) ≤ β(1 + |A| p ) for some positive constants α and β. For a proof, consult Ambrosio [18]. This result will be essential in Section 14.2 to establish the existence of a solution for the weak Griffith model in the framework of fracture mechanics.
i
i i
i
i
i
i
book 2014/8/6 page 569 i
Chapter 14
Application in mechanics and computer vision
14.1 Problems in pseudoplasticity 14.1.1 Introduction This section is devoted to the study of the equilibrium of a three-dimensional elastoplastic material occupying a bounded domain Ω ⊂ R3 as reference configuration and subjected to body and surface forces. The unknown displacement vector field u solves a minimization problem of the form inf Ω
W ((v)) d x − L(v) : v ∈ , ∂u
∂u
where (v) denotes the linearized strain tensor i , j (v) = 12 ( ∂ x i + ∂ x j ). The linear mapping j
v → L(v) accounts for the exterior loading and is of the form L(v) =
i
Ω
f ·v dx +
Γ1
g · v d 2,
where f denotes the body forces and g the surface forces on a part Γ1 of the boundary. The body is assumed to be clamped on Γ0 = Γ \Γ 1 with 2 (Γ0 ) > 0. We will denote the space of 3×3 symmetric matrices by MS and its subspace of matrices with null trace by MD , so that S D MS = MS ⊕RI . The constitutive equation of the material is such that the restriction W D of the stored energy density W to MD satisfies a linear growth at infinity. The admissible S set of displacement fields is a subset of the space {v ∈ LD(Ω) : v = 0 on Γ0 }, where LD(Ω) = {v ∈ L1 (Ω, R3 ) : (v) ∈ L1 (Ω, MS )}. From a mathematical point of view, due to the linear growth of W D at infinity, two difficulties may appear: • The value of the infimum may be infinite. This problem leads to the theory of yield design (or limit load), which consists in analyzing the set of λ ∈ R+ such that
inf Ω
W ((v)) d x − λL(v) : v ∈
> −∞.
569
i
i i
i
i
i
i
570
book 2014/8/6 page 570 i
Chapter 14. Application in mechanics and computer vision
From the mechanical point of view, this analysis predicts the load capacity of the structure. For more details about limit analysis, see Section 15.4. • Even if the infimum is finite, the problem has generally no solution. This is not surprising from a mechanical point of view, since the observed displacements are sometimes discontinuous on surfaces. A well-adapted space must contain admissible displacements with discontinuity on two-dimensional surfaces. This space, derived from BV (Ω, R3 ) where the measure (u) plays the role of the measure D u, is precisely 5 4 B D(Ω) := v ∈ L1 (Ω, R3 ) : (u) ∈ M(Ω, MS ) . It will be described in detail in Subsection 14.1.3. The integral functional Ω W ((u)) of the measure (u) will be defined in Subsection 14.1.2. Unfortunately, the new problem ˜ inf W ((v)) − L(v) : v ∈ , Ω
where now the set ˜ of admissible displacements is a subset of the space {v ∈ B D(Ω) : v = 0 on Γ0 }, has generally no solution. Indeed, plastification phenomena may appear on the boundary, and discontinuities are sometimes observed on the part Γ0 of the boundary. The boundary condition v = 0 in the trace sense must be replaced by a surface energy of the form W ∞ (γ0 (v)τ ⊗ s ν) d 2 . Γ0
We have denoted by γ0 the trace operator and by ν the exterior unit normal to Γ0 . The symmetric matrix field γ0 (v)τ ⊗ s ν will be defined further. Roughly, the function W ∞ describes the behavior of W at infinity on straight lines generated by γ0 (v)τ ⊗ s ν. The set of admissible displacement fields, still denoted by ˜, is now a subset of B D(Ω) whose elements satisfy a weaker boundary condition and will be described further. The mathematical theory of relaxation introduced in Chapter 11 allows us to sum up this discussion as follows: inf W ((v)) + W ∞ (γ0 (v)τ ⊗ s ν) d 2 − L(v) : v ∈ ˜ (+ ) Ω
Γ0
is the relaxed problem of inf Ω
that is,
W ((v)) d x − L(v) : v ∈ ,
v →
Ω
W ((v)) +
Γ0
(+ )
W ∞ (γ0 (v)τ ⊗ s ν) d 2 + Ind˜
is the lower semicontinuous envelope of v → Ω W ((v)) d x + Ind when B D(Ω) is equipped with its weak convergence. Moreover inf(+ ) = min(+ ). The next sections are devoted to a precise description and to a proof of this relaxation scheme.
i
i i
i
i
i
i
14.1. Problems in pseudoplasticity
book 2014/8/6 page 571 i
571
14.1.2 The Hencky model To illustrate the previous general considerations, we deal with the description of the Hencky model. The reference configuration Ω is assumed to have a boundary of class C1 . Let λ and μ be two given positive constants, namely, the Lamé coefficients of the material, and set k = λ + 2μ/3, the compression stiffness. The constitutive equation of the material is such that there exists a potential W of the form W (E) = W D (E D ) +
2 k tr(E) 2
for all E = E D + (1/3) tr(E)I in MS = MD ⊗ RI . The density W D is more precisely S k D D D + defined by W (E ) = φ(|E |), where Φ : R −→ R has a quadratic growth up to $ μ 2
and a linear growth beyond this threshold. More precisely, ⎧ k ⎪ 2 ⎪ ⎪ μs if s ≤ $ , ⎪ ⎨ μ 2 φ(s) = ⎪ $ k2 k ⎪ ⎪ ⎪s k 2− if s ≥ $ . ⎩ 2μ μ 2 The proof of Lemma 14.1.1 may be easily established and is left to the reader. Lemma 14.1.1. The function W D is convex and fulfills the three conditions (13.23), (13.24), and (13.25). The set of admissible displacement fields is the set of finite energy, that is, = {v ∈ LD(Ω) : div(v) ∈ L2 (Ω), v = 0 on Γ0 }, and problem (+ ) is precisely k D D 2 W ( (v)) d x + (div(v)) d x − L(v) : v ∈ . inf 2 Ω Ω Let us recall that for a function v : Ω → RN , its divergence in the distributional sense ∂ vi . The boundary condition v = 0 on Γ0 must be taken in the is given by div v := N i =1 ∂ x i
trace sense. The trace operator is indeed well defined from LD(Ω) into L1 (Γ0 , R3 ). For a proof, it suffices to adapt the trace Theorem 10.2.1 (see Remark 10.2.2 or see, for instance, Temam [348]). We define now the relaxed problem. For all a in R3 and every unit vector ν in R3 , we denote the tangential and normal components of a relative to ν, by aτ and aν , respectively. In other words aτ = a −(a ·ν) ν, where a ·ν denotes the scalar product of a and ν in R3 . For all a and b in R3 , we define their symmetric tensor product by a⊗ s b := 1/2(ai b j +a j bi )i , j . The relaxed problem is precisely k D D inf W ( (v)) + (div(v))2 d x (+ ) 2 Ω Ω (14.1) + W ∞ (γ0 (v)τ ⊗ s ν) d 2 − L(v) : v ∈ ˜ , Γ0
i
i i
i
i
i
i
572
book 2014/8/6 page 572 i
Chapter 14. Application in mechanics and computer vision
where γ0 is the trace operator from B D(Ω) into L1 (Γ ) and the set of admissible displacements is ˜ = {v ∈ B D(Ω) : div v ∈ L2 (Ω), vν = 0 on Γ0 }. Note that the boundary condition v = 0 (taken in the trace sense) on Γ0 in problem (+ ) has been relaxed, in problem (+ ), by the surface energy W ∞ (γ0 (v)τ ⊗ s ν) d 2 Γ0
and the weaker boundary condition vν = 0 on Γ0 . This last condition must be taken in the trace sense γν (v) = 0, where γν is a linear continuous operator from the space {v ∈ L1 (Ω, R3 ) : div v ∈ L2 (Ω)} into the dual C1 (Γ ) of C1 (Γ ). The existence of this trace operator γν will be established in Theorem 14.1.3. The integral Ω W D (D (v)) must be taken in the sense of measures defined in Sections 13.3 and 13.4.1. Let us recall that the measure W D (D (v)) denotes the Borel measure W D (e D ) " 3 Ω + (W D )∞ (D (v)S ), where e D " 3 Ω + D (v)S is the Lebesgue–Nikodým decomposition of the measure D (v), and that the recession function (W D )∞ of W D is defined for all E in MD by S (W D )∞ (E) = lim
t →+∞
W D (t E) t
.
Consequently, by definition one has W D (D (v)) := W D (e D ) d x + (W D )∞ (D (v)S ). Ω
Ω
Ω
When the singular part D (v)S of D (v) vanishes, we also denote the measure e D (v)" 3 Ω by D (v)" 3 Ω. In the same spirit, if e " 3 Ω + (v)S is the Lebesgue–Nikodým decomposition of the measure (v), we denote the measure e(v)" 3 Ω by (v)" 3 Ω when (v)S = 0.
14.1.3 The spaces BD (Ω), M (div), and U (Ω) Unless differently specified, the set Ω is, for the moment, a bounded open subset of R3 . As said before, a well-adapted space for relaxing the above model is the space defined below. Definition 14.1.1. The subspace B D(Ω) := {v ∈ L1 (Ω, R3 ) : (v) ∈ M(Ω, MS )} of L1 (Ω, R3 ) is called the space of bounded deformations. The measure (v) ∈ C0 (Ω, MS ) is defined by its action on all ϕ in C0 (Ω, MS ): 〈(v)i , j , ϕi , j 〉, 〈(u), ϕ〉 = i,j
where the brackets on the right-hand side denote the action of the signed measure (v)i , j on the scalar function ϕi , j for the duality (C0 (Ω), C0 (Ω)).
i
i i
i
i
i
i
14.1. Problems in pseudoplasticity
book 2014/8/6 page 573 i
573
Remark 14.1.1. The action of (v) on ϕ will also be written as Ω ϕ (v), which is also well defined on bounded D u-integrable functions ϕ. Note that when v is regular (for instance, belongs to W 1,1 (Ω, R3 )), ϕ (v) = ϕ(x) (v)(x) d x Ω Ω := ϕ(x) : (v)(x) d x, Ω
where for two 3 × 3 matrices A and B, A : B denotes their Hilbert–Schmidt scalar product defined by A : B := trace(AT B). This is why we also write Ω ϕ : (v) for the integral ϕ (v) with respect to the measure (v). Ω Under these considerations, we leave the reader to adapt the definitions of the weak and intermediate convergences and the proof of the approximating Theorem 10.1.2 or its generalization, Theorem 13.4.1. It suffices to argue with the components of u and to replace everywhere D u by (u) (see also Remark 10.2.2). Because of its importance in Subsection 14.1.4, we state the approximating theorem. Theorem 14.1.1. Let f : MS → R+ be a convex function satisfying (13.23) and (13.24). The space C∞ (Ω, MS ) ∩ B D(Ω) is dense in B D(Ω) equipped with the intermediate convergence associated with f . More precisely, for all u in B D(Ω), there exists un in C∞ (Ω, MS ) ∩ B D(Ω) such that ⎧ u → u strongly in L1 (Ω, R3 ); ⎪ ⎪ ⎪ n ⎪ ⎪ ⎪ ⎪ ⎪ |(u )| d x → |(u)|; ⎪ n ⎪ ⎪ Ω ⎨ Ω ⎪ f ((un )) d x → f ((u)); ⎪ ⎪ ⎪ ⎪ Ω Ω ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ f (D (u )) d x → f (D (u)). ⎩ n
Ω
Ω
In the same spirit, we state without proof the trace theorem. Theorem 14.1.2. Let Ω be a Lipschitz open bounded subset of R3 . There exists a linear continuous map γ0 from B D(Ω) onto L1 2 (Γ , R3 ) satisfying (i) for all u ∈ C(Ω, R3 ) ∩ B D(Ω), γ0 (u) = uΓ ; (ii) for all ϕ ∈ C(Ω, MS ) ϕ : (u) = − u · div ϕ d x + γ0 (u) ⊗S ν : ϕ d 2 , Ω
Ω
Γ
where ν is the outer unit normal at 2 -almost all x on Γ and div ϕ denotes the vector valued ∂ϕ function defined by (div ϕ)i := 3j =1 ∂ xi, j , i = 1, . . . , 3. j
As a consequence of the Green’s formula (ii) above, one can adapt the two first examples of Section 10.2.
i
i i
i
i
i
i
574
book 2014/8/6 page 574 i
Chapter 14. Application in mechanics and computer vision
Example 14.1.1. Consider two disjoint Lipschitz open bounded subsets Ω1 and Ω2 of an open bounded subset Ω of R3 such that Ω = Ω1 ∪ Ω2 and set Γ1,2 := ∂ Ω1 ∩ ∂ Ω2 , which is assumed to satisfy 2 (Γ1,2 ) > 0. We denote the trace operators from B D(Ω1 ) onto L1 2 (∂ Ω1 , R3 ) and B D(Ω2 ) onto L1 2 (∂ Ω2 , R3 ) by γ1 and γ2 , respectively. Let u1 and u2 be, respectively, two elements of B D(Ω1 ) and B D(Ω2 ) and define < u1 in Ω1 , u= u2 in Ω2 . Then u belongs to B D(Ω) and (u) = (u1 )Ω1 +(u2 )Ω2 +[u] ⊗ s ν N −1 Γ1,2 , where [u] = γ1 (u1 ) − γ2 (u2 ) and ν(x) is the unit inner normal at x to Γ1,2 , considered as a part of the boundary of Ω1 . Example 14.1.2. By slightly modifying the previous example, if we set < u in Ω, v= 0 in R3 \ Ω, where Ω is a Lipschitz bounded open subset of R3 and u ∈ B D(Ω), we see that v belongs to B D(R3 ) and that (v) = (u)Ω +u + ⊗ s ν N −1 Γ , where Γ is the boundary of Ω, ν denotes the inner unit vector normal to Γ , and u + denotes the trace of u on Γ . Remark 14.1.2. Remark 10.2.1 also holds in this situation. More precisely, let Ω be a Lipschitz open bounded subset of R3 . The approximating Theorem 14.1.1 may easily be improved as follows: the regular approximating functions of u ∈ B D(Ω) have all their traces equal to that of u on the boundary of Ω. For a proof, see, for instance, [348]. We define now the space M(div) which is involved in the definition of the admissible set of displacement fields in the Hencky model. Definition 14.1.2. We denote the space of all the functions in L1 (Ω, R3 ) such that their divergence belongs to L2 (Ω), by M(div): M(div) := {v ∈ L1 (Ω, R3 ) : div v ∈ L2 (Ω)}. On M(div), we can define a trace notion. Theorem 14.1.3. Let Ω be a C2 open bounded subset of R3 . There exists a linear continuous map γν from M(div) into C1 (Γ ) satisfying the following: (i) for all u ∈ C(Ω, R3 ) ∩ B D(Ω), γν (u) = u · νΓ ; (ii) for all u in M(div), all ϕ ∈ C1 (Γ ), and all Φ ∈ C1 (Ω) such that ΦΓ = ϕ, 〈γν (u), ϕ〉 = u · DΦ d x + Φ div u d x. Ω
Ω
i
i i
i
i
i
i
14.1. Problems in pseudoplasticity
book 2014/8/6 page 575 i
575
For a proof, consult Temam [348, Proposition 7.2]. Note that this theorem also makes sense when div u is a Borel measure. In this case, the space M(div) is defined by M(div) := {v ∈ L1 (Ω, R3 ) : div v ∈ M(Ω)}. For our application we consider only the case when div v ∈ L2 (Ω). We consider now the space U (Ω) := B D(Ω)∩M(div) equipped with two convergences. Precisely, for all sequence (un )n∈N in U (Ω), one defines • the weak convergence, defined by the weak convergence of un to u in B D(Ω) and the strong convergence of div un to div u in L2 (Ω); • the intermediate convergence associated with a convex function f , defined by the intermediate convergence of un to u in B D(Ω) associated with f together with the strong convergence of div un to div u in L2 (Ω). The following result completes the approximating Theorem 14.1.1 in the spirit of Remark 14.1.2. For a proof, consult Theorems 3.4 and 5.3 in [348]. Theorem 14.1.4. Let f : MS → R be a convex function satisfying α(|A| − 1) ≤ f (A) ≤ β(1 + |A|) for all A in MS . Then for all u in U (Ω), there exists un in C∞ (Ω, MS ) ∩ B D(Ω) such that ⎧ un → u strongly in L1 (Ω, R3 ); ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ |(u )| d x → |(u)|; ⎪ n ⎪ ⎪ ⎪ Ω Ω ⎪ ⎪ ⎪ ⎪ ⎨ f ((un )) d x → f ((u)); Ω Ω ⎪ ⎪ div u → div u strongly in L2 (Ω); ⎪ ⎪ n ⎪ ⎪ ⎪ ⎪ ⎪ D ⎪ ⎪ f ( (un )) d x → f (D (u)); ⎪ ⎪ ⎪ Ω Ω ⎪ ⎩ γ0 (un ) = γ0 (u).
14.1.4 Relaxation of the Hencky model In this subsection we show that the lower semicontinuous envelope of the Hencky functional energy for the weak topology of U (Ω) is the functional of (+ ) in (14.1). We moreover establish the convergence of the corresponding energy to the energy of the relaxed functional. Here, Ω is a bounded open subset of R3 of class C2 . Theorem 14.1.5. The lower semicontinuous envelope for the weak convergence of U (Ω), of the integral functional defined on U (Ω) by
F (v) =
⎧ ⎨ ⎩
D
Ω
D
W ( (v)) d x +
+∞
otherwise,
k 2
Ω
(div(v))2 d x if v ∈ LD(Ω) ∩ M(div),
v = 0 on Γ0 ,
i
i i
i
i
i
i
576
book 2014/8/6 page 576 i
Chapter 14. Application in mechanics and computer vision
is the functional defined on U (Ω) by ⎧ k ⎪ D D 2 ⎪ (div(v)) d x + (W D )∞ (γ0 (v)τ ⊗ s ν) d 2 ⎨ W ( (v)) + 2 Ω Ω Γ0 F (v) = ⎪ if v ∈ U, γν (v) = 0 on Γ0 , ⎪ ⎩ +∞ otherwise. SKETCH OF THE PROOF. Arguing exactly as in the proof of Theorem 13.4.2, where Theorem 13.4.1 is replaced by Theorem 14.1.4, we obtain ⎧ k ⎪ D D 2 ⎪ (div(v)) d x + (W D )∞ (γ0 (v) ⊗ s ν) d 2 ⎨ W ( (v)) + 2 Ω Ω Γ0 F (v) = ⎪ if v ∈ U, vν = 0 on Γ0 , ⎪ ⎩ +∞ otherwise, in the trace sense: for H 2 -a.e. x on Γ0 , where condition vν = 0 on Γ0 is intended γν (v)(x) = 0. Indeed, the map v → Ω (div(v))2 d x is continuous for the weak convergence of U (Ω). Note also that according to Theorem 14.1.3, the trace operator defined from M(div) into C1 (Γ ) is continuous when the two spaces are equipped with their weak convergences. Finally, for 2 -a.e. x on Γ0 , since vν = 0 in the trace sense on Γ0 , we have γ0 (v)τ ⊗ s ν(x) = γ0 (v)⊗ s ν(x). Indeed, it is easily seen that γ0 (v)·ν = γν (v) Γ2 -a.e. 0
Corollary 14.1.1. The energy of the Hencky model D
inf Ω
D
W ( (v)) d x +
k 2
2
Ω
(div(v)) d x − L(v) : v ∈ ,
where = {v ∈ LD(Ω) : div(v) ∈ L2 (Ω), v = 0 on Γ0 }, relaxes to D
min Ω
D
W ( (v)) +
k 2
Ω
(div(v)) d x +
˜ W (γ0 (v)τ ⊗ s ν) d H − L(v) : v ∈ , ∞
2
Γ0
2
where ˜ = {v ∈ B D(Ω) : div(v) ∈ L2 (Ω), vν = 0 on Γ0 }. SKETCH OF THE PROOF. According to the general theory of relaxation (see Theorem 11.1.2), it suffices to prove that any minimizing sequence related to the Hencky energy possesses a subsequence weakly converging in the space U . This assertion may be easily established thanks to coercivity condition (13.25).
14.2 Some variational models in fracture mechanics 14.2.1 A few considerations in fracture mechanics Let us consider an elastic brittle medium whose reference configuration is a bounded domain Ω of R3 . Griffith’s theory of fracture mechanics asserts that the energy necessary to produce a crack K included in Ω is proportional to the crack area 2 (K). Consequently, the elastic deformation energy outside the crack must be completed by an additional energy whose simplest form is λ 2 (K). The constant λ is the Griffith coefficient, introduced for fracture initiation (see [232], [324]). The elastic energy of the deformable body
i
i i
i
i
i
i
14.2. Some variational models in fracture mechanics
book 2014/8/6 page 577 i
577
under consideration then takes the form E(u, K) = f (∇u) d x + λ 2 (K), Ω\K
where u denotes the deformation vector field and ∇u the deformation gradient. Under suitable conditions on f , the functional E makes sense in a classical way if, for instance, K is a closed set, and u belongs to C1 (Ω\K, R3 ). From the inequality 2 (K) ≤ λ−1 E(u, K), we see that when E(u, K) < +∞, the crack surface K is a two-Hausdorff-dimensional closed set of Ω and its Lebesgue measure is zero. Thus, the crack surface K can be seen as the set of discontinuity points for the measurable function u, more precisely, the measurable representative of u defined on Ω, satisfying the convention of Remark 10.3.2. It is worth noticing the analogy between this model and the strong model introduced in image segmentation by Mumford and Shah and discussed in Section 12.5 (see also Section 14.3 for complements). Following the idea developed in Section 12.5, one may define a weak formulation in the setting of SBV functions introduced in Section 10.5 (completed by Remark 10.5.1) by considering the functional f (∇u) d x + λ 2 (S u ), u ∈ SBV (Ω, R3 ), E(u) = Ω
where ∇u is the density of the regular part of the measure D u and S u the jump set of u. Actually, one can deal with functionals of the more general form f (x, ∇u) d x + g (x, u + (x), u − (x), ν u (x)) d 2 S u . E(u) = Ω
Ω
The bulk energy density f accounts for the elastic deformation outside the crack and g for the density energy necessary to produce a crack of surface S u . The meaning of the presence of the terms u + (x), u − (x), and ν u (x) is the following: the fracture energy may depend on the crack opening and, for nonisotropic materials, on the crack surface orientation. According to Theorem 13.4.5 and Remark 13.4.3, suitable conditions on f and g ensure semicontinuity of E so that direct methods of the calculus of variations provide the existence of solutions for optimization problems related to the energy functionals of the type E. More precisely, we consider a lower semicontinuous symmetric and subadditive function g : R3 × R3 → R+ , i.e., g (a, b ) = g (b , a) ≤ g (b , c) + g (c, a)
∀ a, b , c ∈ R3 .
We assume moreover that g (a, b ) ≥ ψ(|a − b |) for all a, b ∈ R3 , where ψ : [0, +∞) → [0, +∞], satisfies ψ(t )/t → +∞ as t → 0. Subadditivity assumption on the function g forces the crack material to possess a minimal number of connected components. We finally consider a convex, even function h : R3 → [0, +∞), positively homogeneous of degree 1 and satisfying for all ν ∈ RN , h(ν) ≥ c|ν|, where c is a given positive constant. Theorem 14.2.1. Let f : R × M 3×3 → R be a quasi-convex function satisfying the following growth conditions of order p > 1: there exist two positive constants α and β such that for all A ∈ M 3×3 , α|A| p ≤ f (x, A) ≤ β(1 + |A| p ). Let g and h be two functions satisfying the conditions introduced above, and let L ∈ L∞ (Ω, R3 ) (the exterior loading). Then, given a nonempty compact subset K of R3 , there exists a solution
i
i i
i
i
i
i
578
book 2014/8/6 page 578 i
Chapter 14. Application in mechanics and computer vision
of the minimum problem + − 2 min f (∇u) d x + g (u , u )h(ν u ) d S u + L · u d x : u ∈ SBV (Ω, R3 ), Ω Ω Ω u(x) ∈ K for a.e. x . PROOF. Following the classical direct methods of the calculus of variation, the conclusion is a straightforward consequence of the lower semicontinuity of the functional energy, which has been established in Theorem 13.4.5 and Remark 13.4.3. The coercivity easily follows from the confinement condition u(x) ∈ K for a.e. x. A similar type of result is described in detail in Section 14.3 for the Mumford–Shah model in image segmentation. In the second assertion of Theorem 14.2.1, the confinement condition u ∈ K does not seem to be natural for all problems in fracture mechanics. To remove such a condition, we have in general to state the problems in the space GSBV of functions whose truncations are in SBV (see [21], [122], and references therein). A similar statement can be given for boundary value problems. In this case we know (see the previous section or Theorem 11.3.1) that the Dirichlet boundary conditions u = u0 on a subset Γ0 of the boundary Γ of Ω, assumed to be Lipschitz, are relaxed into a surface energy at the boundary. Taking, for instance, g = 1, we have to consider minimization problems of the form 2 2 f (∇u) d x + h(ν u ) d S u + h(ν) d Γ0 + L · u d x : u ∈ SBV (Ω, R3 ), min Ω Ω Ω u(x) ∈ K for a.e. x , where ν is the inner unit normal to Γ0 . Existence of a solution is obtained in a similar way. The argument above also works for more general minimum problems of the form (see, for instance, [184]) + − 2 3 min f (∇u) d x + g (u , u )h(ν u ) d S u + V (x, u) d x : u ∈ SBV (Ω, R ) , Ω
Ω
Ω
where f , g , h are as above, and the potential V is integrable in x and satisfies V (x, s) ≥ θ(|s|) − a(x) with a ∈ L1 (Ω) and θ(t )/t → +∞ as t → +∞. The case of Theorem 14.2.1 corresponds to V (x, s) = L(x) · s + δK (s). Functionals of the type E can be approximated by elliptic functionals via a variational procedure due to Ambrosio and Tortorelli and described in Section 12.5.3 for the Mumford–Shah energy. These approximating functionals are more adapted to a numerical treatment (see [89], [120]). In the one-dimensional case, the two models introduced in Subsections 14.2.2 and 14.2.3 provide an alternative and more direct way for defining a discrete variational approximation of E. Other models, proposed by Barenblatt [83], can be similarly weakened in the space SBV (Ω, R3 ) by considering energies of the type f (∇u) d x + g (|u + − u − |) d 2 S u , E(u) = Ω
Ω
i
i i
i
i
i
i
14.2. Some variational models in fracture mechanics
book 2014/8/6 page 579 i
579
where g (t ) → 0 as t → 0. Optimization problems related to such energies functionals do not provide minimizing sequences satisfying Theorem 14.2.1. Indeed, we lose the control of the Hausdorff measure of the jump set. Therefore, minimizing sequences may converge to a function whose jump part is not concentrated on an (N − 1)-Hausdorff dimensional set and, consequently, it does not belong to SBV (Ω, R3 ). To select minimizers, one can follow a singular perturbation approach, involving the notion of viscosity solutions (consult Attouch [39]). This procedure has already been described in detail for the phase transition model in Section 12.5.2. Precisely, the method consists in perturbing the functional E by the fracture initiation energy 2 (S u ) and introducing functionals of Griffith type E (u) = E(u) + 2 (S u ) ( > 0 intended to tend to zero) for which Theorem 14.2.1 applies. The cluster points of minimizers related to E then belong to SBV (Ω, R3 ) and minimize E among minimizers with jump set of minimal N −1 measure (see [21] and [125]). For a specific study of problems involving fracture mechanics in the modern framework of SBV functions, we refer the reader to [21], [26], [219], [220], and references therein. In the sections below, we would like to supply a justification of weak Griffith’s models, in the one-dimensional case, by taking into account the microscopic scale and the statistical local energy distribution. More precisely, we deal with two discrete systems of material points, which, in the reference configuration, occupy the points of the lattice Z, = 1/n included in the interval [0, 1]. For each material point placed at x ∈ Z in the reference configuration, u(x) ∈ R denotes its new position in the deformed state. Each point interacts only with its nearest neighbors. We aim at describing the continuous limit of the discrete energy in the sense of Γ -convergence. We show that the continuous variational limit of the second model takes the form of Griffith’s model discussed above. In the first model, we show that convexity conditions satisfied by local density energies entail the presence of a Cantor-part energy: the Griffith’s initiation energy is completed by an additional term C uc (0, 1), where uc is the Cantor part of u and C a suitable constant. It is worth noticing that the discrete energies considered may also be viewed as discrete variational approximating functionals of Griffith’s type in the one-dimensional case. This approach is very close to that of Chambolle consisting in approximating the Mumford– Shah functionals in the two-dimensional case (see Chambolle [168]) and to some recent works by Braides [124] and Braides and Gelli [129].
14.2.2 A first model in one dimension The interaction between each pair {z, (z + 1)} of contiguous points is described by a random energy u((z + 1)) − u(z) , Wω z where Wω z belongs to a finite set {W j , j ∈ J }. Each density W j , j ∈ J , mapping R into [0, +∞], is assumed to be of Hencky pseudoplastic type (see Section 14.1) and must take the value +∞ if two neighboring points occupy the same location. The total energy of the interaction between points located in [0, 1] ∩ Z is E (ω, u) =
n−1 z=0
Wω z
u((z + 1)) − u(z)
,
i
i i
i
i
i
i
580
book 2014/8/6 page 580 i
Chapter 14. Application in mechanics and computer vision
where ω = (ω z ) z∈Z ∈ J Z . One may think that two neighboring material points are connected by randomly chosen nonlinear springs, and this model also describes a system of n springs which are randomly distributed. We still denote the continuous extension of u, which is affine on (z, (z + 1)) for all z ∈ Z, by u. Thus the discrete energy may be considered as defined on L1 (0, 1). This first model is described by the almost sure Γ convergence of u → E (ω, u) toward a deterministic energy functional living in BV (0, 1). Let us give some specific notation about BV -functions in the one-dimensional case. For each function u in BV (0, 1), one writes u = ua d t + u s the Lebesgue–Nikodým decomposition of its distributional derivative u with respect to the Lebesgue measure d t on (0, 1). The singular part with respect to d t has the following decomposition: u s = + − + − t ∈S u (u − u )δ t + uc , where u and u are, respectively, the approximate upper and lower limits of u, S u is the jump set {t ∈ (0, 1) : u + (t ) = u − (t )} of u, and uc is the singular diffuse part, also called the Cantor part of u . Let BV + (0, 1) be the subset of all the functions u in BV (0, 1) such that ua > 0 a.e. in (0, 1) and u s ≥ 0. We will establish that the (deterministic) limit energy functional is defined on BV + (0, 1) by 1 W ho m (ua ) d t + C u s (0, 1) , E(u) := 0
where C is a positive constant. The density e → W ho m (e) is obtained as the almost sure limit of a suitable subadditive ergodic process. This mathematical result expresses that the mechanical macroscopic behavior of a string can be interpreted as the variational limit of a discrete system at a microscopic scale. Moreover, we will express the duality principle (W ho m )∗ = j ∈J p j W j∗ , where p j is the probability presence of each density W j . For other models in a deterministic setting and some discussions about the possible existence of a fracture site related to these models, consult [127] and [21]. Let us give some more details on the random discrete model. As said above, we consider a finite set Λ of functions W j , j ∈ J , of probability presence p j , satisfying the three following conditions: (i) W j : R → R+ ∪ {+∞} is convex, finite for e > 0, W j (1) = 0, and there exists α > 0 such that α(e − 1) ≤ W j (e) for all e ≥ 0; (ii) there exists β > 0 such that W j (e) ≤ β(1 + e) for all e > 1; (iii) lime→0+ W j (e) = +∞ and W j (e) = +∞ when e ≤ 0. The assumption W j (1) = 0 means that no energy is needed when no deformation occurs. Assumption (iii) means that an infinite amount of energy is needed to squeeze a pair of material points down to a single one and that there is no interpenetrability of the matter. These density functions are of Hencky pseudoplastic type owing to the convexity and the linear growth conditions. The fundamental stochastic setting that we will need for describing the discrete energy and its asymptotic behavior is the discrete dynamical system (Ω, , P, (T z ) z∈Z ): Ω = ΛZ , (Ω, , P) is the product probability space of the Bernoulli probability space on Λ constructed from p j , j ∈ J , the transformation T z is the shift defined for all z ∈ Z by T z (ω s ) s ∈Z = (ω s +z ) s ∈Z . The expectation operator will be denoted by E. To write in a continuous form the random energy functional, we consider the random function defined for all (ω, t , e) ∈ Ω × R × R by W (ω, t , e) = Wω z (e) when t ∈ [z, z + 1).
i
i i
i
i
i
i
14.2. Some variational models in fracture mechanics
book 2014/8/6 page 581 i
581
Let (0, 1) denote the space of continuous functions on (0, 1) which are affine on each interval (z, (z + 1)) of (0, 1). More generally when s = (b − a)/n, s (a, b ) will denote the space of continuous functions on (a, b ) which are affine on each interval (a + s z, a + s(z +1)) of (a, b ). The total energy due to the interactions between the points of [0, 1]∩Z is the functional defined in L1 (0, 1) by ⎧ n−1 ! u((z + 1)) − u(z) " ⎨ W ω, z, if u ∈ (0, 1), E (ω, u) = ⎩ z=0 +∞ otherwise, or, in a continuous form,
⎧ ⎨
! t " W ω, (t ) dt , u E (ω, u) = ⎩ 0 +∞ otherwise, 1
if u ∈ (0, 1),
where t → u(t ) also denotes the piecewise affine extension of u. Note that the domain of E (ω, .) is the subset of all functions u in (0, 1) whose distributional derivative u is positive. By using classical probabilistic arguments, it is easily seen that (Ω, , P, (T z ) z∈Z ), previously defined, is an ergodic dynamical system. To solve the problem, we make use of the concept of the subadditive process developed in Section 12.4. Let us recall the following ergodic theorem taken from Theorem 12.4.4. Theorem 14.2.2. Let (Ω, , P, (T z ) z∈Z ) be an ergodic dynamical system and @ a discrete subadditive process. Suppose that @I (ω) P(d ω) : |I | = 0 > −∞ inf |I | Ω and let (An )n∈N be a regular sequence of B satisfying limn→+∞ ρ(An ) = +∞. Then almost surely @[0,m[d @An (ω) . = inf ∗ E lim n→+∞ |A | m∈N md n The space L1 (0, 1), equipped with its norm, plays the role of the metric space (X , d ) in the Γ -convergence definition of Chapter 12. Theorem 14.2.3. The energy functional E (ω, .) Γ -converges almost surely to the functional E defined in L1 (0, 1) by ⎧ ⎨ 1 ho m W (ua ) d t + W ho m,∞ (1) u s (0, 1) if u ∈ BV + (0, 1), E(u) = ⎩ 0 +∞ otherwise. The density W ho m is defined as follows: W ho m (e) = +∞ if e ≤ 0 and, if e > 0, one has ω-a.s., n 1 1,1 ho m W (e) = lim inf W (ω, t , e + v ) d t : v ∈ W0 (0, n) n→+∞ n 0 n 1 1,1 = inf∗ E inf W (., t , e + v ) d t : v ∈ W0 (0, n) . n∈N n 0
i
i i
i
i
i
i
582
book 2014/8/6 page 582 i
Chapter 14. Application in mechanics and computer vision
Moreover, W ho m verifies properties (i), (ii), and (iii) of the functions W j and its Legendre– Fenchel transform is given by (W ho m )∗ = j ∈J p j W j∗ . The proof is established by means of Propositions 14.2.1 and 14.2.2, each giving, respectively, the lower bound and the upper bound for a subsequence in the definition of Γ -convergence. Before stating Proposition 14.2.1, we introduce a parametrized subadditive process, i.e., a family of subadditive processes which will be used to define the limit problem. For this purpose, for δ ∈ (0, δ0 ], j ∈ J , we consider the truncated functions Tδ W j = W j ∧ Lδ , where Lδ is the affine function defined by Lδ (t ) = W j (δ) + τi (t − δ), τ j ∈ ∂ W j (δ) (the subdifferential of W j at δ) and set Wδ (ω, t , e) = Tδ ω z (e) when t ∈ [z, z + 1), W0 = W . The nonincreasing family (Wδ )δ∈(0,δ0 ] satisfies for 0 < δ ≤ δ0 , ω ∈ Ω, t ∈ R, and e ∈ R: ⎧ Wδ ≤ W , Wδ (e) = W (e) for e ≥ δ, ⎪ ⎪ ⎪ ⎨ lim δ→0 Wδ (ω, t , e) = W (ω, t , e), ⎪ e → Wδ (ω, t , e) is convex, ⎪ ⎪ ⎩ α(|e| − 1) ≤ Wδ (ω, t , e) ≤ βδ (1 + |e|), where βδ is a positive constant depending only on δ. Let 0 (Ω, R+ ∪ {+∞}) be the set of all the measurable functions from Ω into R+ ∪ {+∞} and consider the parametrized subadditive process @ defined by @ : B × [0, δ0 ] × R −→ 0 (Ω, R+ ∪ {+∞}), (A, δ, e) → @A(δ, e, .),
where @A(δ, e, ω) = inf
A
Wδ (ω, t , e + v ) d t : v
◦ ∈ W01,1 (A)
.
It is worth noticing that the domain of e → @A(δ, e, ω) is R when δ ∈ (0, δ0 ] while the one of e → @A(0, e, ω) is ]0, +∞[. Indeed, if @A(0, e, ω) < +∞, for e ≤ 0, there exists ◦
◦
v ∈ W01,1 (A) such that W (ω, t , v (t )) < +∞ t a.e. in A. Therefore v (t ) > −e for t a.e. ◦ in A and A v d t = 0 > −e, a contradiction. It is easily seen that for all fixed (δ, e) in (0, δ0 ] × R ∪ {0} × (0, +∞) , the map A → @A(δ, e, .) is a subadditive process satisfying @A(δ, e, ω) ≤ |A| max{Tδ W j (e) : j ∈ J } and that all conditions of Theorem 14.2.2 are there exists a set Ω fulfilled. Consequently, of full probability such that for all (δ, e) in (0, δ0 ] ∩ Q × R and all ω ∈ Ω , Wδho m (e) := lim
n→+∞
@[0,n) (δ, e, ω) n
= inf ∗ E m∈N
@[0,m) m
(δ, e, .).
(14.2)
On the other hand, for all e > 0, there exists a set Ωe of full probability such that @[0,m) @[0,n) (0, e, ω) ho m = inf ∗ E (0, e, .) (e) := lim W n→+∞ m∈N n m and W ho m (e) = +∞ if e ≤ 0. The independence of the first set Ω with respect to e comes from the equi-Lipschitz property of e → @[0,n) (δ, e, ω) (see [185] or [291]). The following lemma states the continuity at δ = 0+ of the function δ → Wδho m (e) when e > 0.
i
i i
i
i
i
i
14.2. Some variational models in fracture mechanics
book 2014/8/6 page 583 i
583
Lemma 14.2.1. For all fixed e > 0, there exists δ(e) > 0 such that for all δ ∈ (0, δ(e)] ∩ Q, Wδho m (e) = W ho m (e). PROOF. First step. This step is also valid for all e ∈ R and all ω ∈ Ω. Let δ > 0, we show that there exists uδ,n (ω) ∈ 1 (0, n) ∩ W01,1 (0, n) such that @[0,n) (δ, e, ω) n
=
1
n @
n 0
Wδ (ω, t , e + uδ,n (ω)) d t .
(δ,e,ω)
To shorten notation, we set Wn (e) = [0,n) n . From a classical calculation, the Fenchel conjugate of e → Wn (e) is given for all σ in R by 1 n ∗ Wn∗ (σ) = Wδ (ω, t , σ) d t . n 0 Set A j (ω, n) = {t ∈ [0, n] : Wδ (ω, t , .) = Tδ W j } and λ j (ω, n) = meas(A j (ω, n)) for j ∈ J . Thus λ j (ω, n) Wn∗ (σ) = (14.3) (Tδ W j )∗ (σ) n j ∈J and, by classical subdifferential rules (see Chapter 9), ∂ Wn∗ (σ) =
λ j (ω, n) n
j ∈J
∂ (Tδ W j )∗ (σ).
(14.4)
Now let σ ∈ ∂ Wn (e). Thus e ∈ ∂ Wn∗ (σ), and from (14.4) there exists Uj ,δ,n (ω) ∈ ∂ (Tδ W j )∗ (σ) such that λ j (ω, n) e= Uj ,δ,n (ω). n j ∈J The function uδ,n (ω)(x) =
x 0
j ∈J
Uj ,δ,n (ω) − e 1Aj (ω,n) (s) d s
answers the question. Indeed, uδ,n belongs to 1 (0, n) ∩ W01,1 (0, n). On the other hand, according to Uj ,δ,n (ω) ∈ ∂ (Tδ W j )∗ (σ), (14.3), and the fact that e ∈ ∂ Wn∗ (σ), 1 n
n 0
Wδ (ω, t , e + uδ,n (ω)) d t =
=
λ j (ω, n) j ∈J
n
Tδ W j (Uj ,δ,n (ω))
λ j (ω, n) σ Uj ,δ,n (ω) − Tδ W j∗ (σ) n j ∈J
= σ e − Wn∗ (σ) = Wn (e).
Second step. From classical probabilistic arguments, there exists a set Ω of full probability such that for all ω ∈ Ω , lim
n→+∞
λ j (ω, n) n
= pj .
(14.5)
i
i i
i
i
i
i
584
book 2014/8/6 page 584 i
Chapter 14. Application in mechanics and computer vision
We pick up ω0 in Ωe ∩ Ω ∩ Ω and claim that infn Uj ,δ,n (ω0 ) ≥ δ for all δ ∈ Q∗+ small enough, say, δ ∈ (0, δ(e)] ∩ Q. Otherwise there exist two sequences (δk )k and (nk )k converging to 0 and +∞ such that Uj ,δk ,nk (ω0 ) ≤ δk . But, according to the first step @[0,nk ) (0, e, ω0 ) nk
≥ =
@[0,nk ) (δk , e, ω0 ) nk λ j (ω0 , nk )
≥
Tδk W j (Uj ,δk ,nk (ω0 ))
nk
j ∈J
λ j (ω0 , nk )
W j (δk )
nk
j ∈J
and letting k → +∞, we obtain W ho m (e) = +∞, a contradiction. Last step. According to the two previous steps, for δ ∈ (0, δ(e)] ∩ Q one has @[0,n) (0, e, ω0 ) n
≥ =
@[0,n) (δ, e, ω0 ) n λ j (ω0 , n) n
j ∈J
=
λ j (ω, n) n
j ∈J
1
Tδ W j (Uj ,δ,n (ω0 ))
W j (Uj ,δ,n (ω0 ))
n
W (ω0 , t , uδ,n (ω0 )) d t n 0 @[0,n[ (0, e, ω0 ) ≥ . n
=
Therefore
@[0,n) (0,e,ω0 ) n
=
@[0,n) (δ,e,ω0 ) . Letting n n
as soon as δ ∈ (0, δ(e)] ∩ Q.
→ +∞, we obtain that W ho m (e) = Wδho m (e)
Proposition 14.2.1. Let Ω , Ω be the subsets of full probability defined in (14.2) and (14.5) and let u, u in L1 (0, 1) be such that u → u strongly in L1 (0, 1). Then for all ω in Ω ∩ Ω , E(u) ≤ lim inf E (ω, u ). →0
Moreover the domain of Γ − lim inf→0 E (ω, .) is included in BV + (0, 1). PROOF. We fix ω in Ω ∩ Ω . First step. If lim inf→0 E (ω, u ) < +∞, by the coercivity condition on the functions W j (property (i)), u belongs to BV (0, 1) and obviously u ≥ 0. Let us now consider, for δ ∈]0, δ0 ] ∩ Q, the truncated energy E,δ (ω, u) =
⎧ ⎨ ⎩
1
t
Wδ ω, , u (t ) d t +∞ otherwise, 0
if u ∈ (0, 1),
i
i i
i
i
i
i
14.2. Some variational models in fracture mechanics
book 2014/8/6 page 585 i
585
and the corresponding energy with domain W 1,1 (0, 1) ⎧ 1 t ⎨ Wδ ω, , u (t ) d t E˜,δ (ω, u) = ⎩ 0 +∞ otherwise,
if u ∈ W 1,1 (0, 1),
which satisfies all the properties of random integral functionals considered in [1]. From the inequalities E (ω, .) ≥ E,δ (ω, .) ≥ E˜,δ (ω, .) and according to [1] we then deduce lim inf E (ω, u ) ≥ lim inf E,δ (ω, u ) ≥ Eδ (u), →0
→0
(14.6)
where Eδ (u) =
⎧ ⎨ ⎩
1 0
! " Wδho m (ua ) d t + Wδho m,∞ (1)u s (0, 1) if u ∈ BV (0, 1),
+∞
otherwise.
Second step. It is not restrictive to assume lim inf→0 E (ω, u ) < +∞. Obviously ua ≥ 0 and u s ≥ 0. We would like to let δ going to 0 in (14.6) and apply Lemma 14.2.1. It remains to prove that ua > 0 a.e. in (0, 1). Otherwise, there exists a Borel set N of (0, 1) with meas(N ) = 0, such that ua = 0 on N . We have +∞ > lim inf E (ω, u ) ≥ meas(N )Wδho m (0), →0
(14.7)
where, from the first step in the proof of Lemma 14.2.1, ⎧ λ j (ω, n) ⎪ ⎪ ⎪ Tδ W j (Uj ,δ,n (ω)), Wδho m (0) = lim ⎪ ⎨ n→+∞ n j ∈J
λ j (ω, n) ⎪ ⎪ ⎪ Uj ,δ,n (ω) = 0. ⎪ ⎩ n j ∈J
By the coercivity assumption (i), an easy calculation leads to supn |Uj ,δ,n (ω)| ≤ C . Thus there exists Uj ,δ (ω) in R satisfying, up to a subsequence, limn→+∞ Uj ,δ,n (ω) = Uj ,δ (ω) and ⎧ ⎪ p j Tδ W j (Uj ,δ (ω)), Wδho m (0) = ⎪ ⎨ j ∈J ⎪ p j Uj ,δ (ω) = 0. ⎪ ⎩ j ∈J
The second equality yields the existence of an index jδ such that Ujδ ,δ ≤ 0 and the first equality gives Wδho m (0) ≥ p jδ Tδ W jδ (Ujδ ,δ (ω)) ≥ p jδ Tδ W jδ (0) ≥ min p j min Tδ W j (0) j
j
so that limδ→0 Wδho m (0) = +∞. Letting δ → 0 in (14.7) leads to a contradiction.
i
i i
i
i
i
i
586
book 2014/8/6 page 586 i
Chapter 14. Application in mechanics and computer vision
Last step. Letting δ → 0 in (14.6), according to the monotone convergence theorem and Lemma 14.2.1, we finally obtain 1 W ho m (ua ) d t + W ho m,∞ (1) u s (0, 1) . lim inf E (ω, u ) ≥ →0
0
The proof is then achieved. To establish the upper bound, we will apply the following lemma. Lemma 14.2.2. (i) Let e > 0 and let i ∈ N be a fixed integer. There exists ui ,n (ω) in 1 (i n, (i + 1)n) ∩ W01,1 (i n, (i + 1)n) such that @[i n,(i +1)n) (0, e, ω) = (ii) The map W
ho m
(i +1)n
W (ω, t , e + ui,n (ω)) d t .
in
: R → [0, +∞] is convex, continuous, and W ho m (1) = 0.
PROOF. For establishing (i), reproduce the first step of the proof of Lemma 14.2.1 with (0, n) replaced by (i n, (i + 1)n) and Wδ by W . Note also that there exist Uj ,δ > 0, j = 1, 2, 3 (depending on e), satisfying supn Uj ,n < +∞, such that ⎧ @[0,n) (0, e, ω) λ j (ω, n) ⎪ ⎪ ⎪ = W j (Uj ,n (ω)), ⎪ ⎨ n n j ∈J λ j (ω, n) ⎪ ⎪ ⎪ Uj ,n = e, Uj ,n > 0. ⎪ ⎩ n j ∈J
The convexity of the map e → W ho m (e) is a consequence of Jensen’s inequality fulfilled for e > 0 and established by a straightforward calculation. Consequently, this map is continuous on its domain R∗+ and, to prove (ii), we must show that lime→0+ W ho m (e) = +∞. Let ek > 0 tend to 0 and fix ω in Ω ∩k∈N Ωek . Letting n → +∞ (up to a subsequence) in ⎧ @[0,n[ (0, ek , ω) λ j (ω, n) ⎪ ⎪ ⎪ = W j (Uj ,n,k (ω)), ⎪ ⎨ n n j ∈J
λ j (ω, n) ⎪ ⎪ ⎪ Uj ,n,k = ek , ⎪ ⎩ n j ∈J
we obtain the existence of numbers Uj ,k ≥ 0 such that ⎧ ⎪ p j W j (Uj ,k (ω)), W ho m (ek ) = ⎪ ⎨ j ∈J ⎪ p j Uj ,k = ek . ⎪ ⎩ j ∈J
Note that this shows that W ho m (1) = 0. Obviously Uj ,k → 0 when k → +∞. Thus lim W ho m (ek ) = p j lim W j (Uj ,k ) = +∞ k→+∞
j ∈J
k→+∞
and the proof is complete.
i
i i
i
i
i
i
14.2. Some variational models in fracture mechanics
book 2014/8/6 page 587 i
587
We establish now the upper bound. Proposition 14.2.2. There exists a subset Ω of full probability such that for all ω ∈ Ω , there exists a subsequence (nk )k∈N of positive integers satisfying the following: for all u in L1 (0, 1), there exists unk in L1 (0, 1) possibly depending on ω, converging to u, and such that lim supk→+∞ E1/nk (ω, unk ) ≤ E(u). PROOF. First step. We prove Proposition 14.2.2 for all u belonging to SBV# (0, 1)∩BV + (0, 1) where SBV# (0, 1) denotes the subspace of all functions of SBV (0, 1) having a finite jump set. It is not restrictive to assume that S u = {t0 }. For m ∈ N∗ , consider the decomposition (i/m, (i +1)/m). Let us set i0 to denote the integer such that t0 ∈ [i0 /m, (i0 + (0, 1) = ∪im−1 =0 1)/m) and by u m the interpolate function of u with respect to this decomposition. We set ei ,m := u m (i/m, (i + 1)/m) if i = i0 and ei0 ,m = m u((i0 + 1)/m) − u(i0 /m) . For each i ∈ {1, . . . , m −1} we consider a sequence ei ,m,η in Q∗+ such that limη→0 ei ,m,η = ei ,m . By continuity of W ho m and Jensen’s inequality, we have m−1 m−1 1 i+1 m ho m lim (ei ,m,η ) ≤ W ho m (ua ) d t W η→0 i m i =0, i =i0 i =0, i =i0 m 1 W ho m (ua ) d t ≤ 0
and lim lim
m→+∞ η→0
1 m
W
ho m
(ei0 ,m,η ) = lim
1
m→+∞
m
W
ho m
i0 + 1 i0 m u −u m m
= W ho m,∞ ([u](t0 )) so that lim lim
m−1
1
i =0
m
m→+∞ η→0
W ho m (ei ,m,η ) ≤ E(u).
(14.8)
But Lemma 14.2.2 implies the existence of ui ,η,n in 1 (i n, (i + 1)n) ∩ W01,1 (i n, (i + 1)n) (we have dropped the dependence on ω) such that @[i n,(i +1)n) (0, ei ,m,η , ω) 1 (i +1)n W (ω, t , ei ,m,η + ui,η,n ) d t = n n in (i +1)/m =m W (ω, mnt , ei ,m + ui,η,n (mnt )) d t . i /m
Therefore, as the sequence of intervals (i n, (i + 1)n) n∈N∗ is regular, according to Theorem 14.2.2, there exists Ω = ∩e∈Q∗+ Ωe of full probability such that for all ω ∈ Ω (i +1)/m W ho m (ei ,m,η ) = lim m W (ω, mnt , ei ,m,η + ui,η,n (mnt )) d t . (14.9) n→+∞
i /m
Combining (14.8) and (14.9), we obtain 1 W (ω, mnt , v m,η,n ) d t ≤ E(u), lim lim lim m→+∞ η→0 n→+∞
0
i
i i
i
i
i
i
588
book 2014/8/6 page 588 i
Chapter 14. Application in mechanics and computer vision
where v m,η,n (t ) = u m (t ) +
m−1 i =0
1(i /m,(i +1)/m) (t )
1 mn
ui ,η,n (nmt ).
An easy calculation shows that lim m→+∞ limη→0 limn→+∞ v m,η,n = u strongly in L1 (0, 1) and that v m,η,n belongs to 1/n m (0, 1). Therefore, by using a diagonalization argument, there exists a map n → (m(n), η(n)) such that limn→+∞ E 1 (ω, v m(n),η(n),n ) ≤ E(u), m(n)n
limn→+∞ v m(n),η(n),n = u. We complete the proof by denoting k → nk , the subsequence n → m(n)n, and setting un m(n) := v m(n),η(n),n . Second step. We prove Proposition 14.2.2 for u belonging to SBV + (0, 1). Let S u = {t0 , . . . , ti , . . .} be the jump set of u and let u l the function of SBV# (0, 1) with jump set S ul = {t0 , . . . , t l } defined by u l (0+ ) = u(0+ ) and u l = ua + il =0 [u](.)δ ti . According to the first step, there exists u l ,nk strongly converging to u l in L1 (0, 1) such that lim sup E1/nk (ω, u l ,nk ) ≤ E(u l ). k→+∞
Letting l tend to +∞ and by using a diagonalization argument, there exists a map k → l (k) such that the sequence u l (k),nk , still denoted by unk , strongly converges to u in L1 (0, 1) and satisfies lim sup E1/nk (ω, unk ) ≤ E(u). k→+∞
˜ Last step. According to the previous step, we have Γ − lim supk→+∞ E1/nk (ω, .) ≤ E, where ⎧ 1 ⎪ ⎨ W ho m (ua ) d t + W ho m,∞ (1) [u](t ) if u ∈ SBV + (0, 1), ˜ E(u) = 0 t ∈S u ⎪ ⎩ +∞ otherwise. Its lower semicontinuous envelope for the strong topology of L1 (0, 1) is (see [112]) ⎧ 1 ⎨ W ho m (ua ) d t + W ho m,∞ (1)u s ((0, 1)) if u ∈ BV + (0, 1), E(u) = ⎩ 0 +∞ otherwise, which ends the proof. PROOF OF THEOREM 14.2.3. Propositions 14.2.1 and 14.2.2 and the last assertion of Theorem 12.1.1 imply that for all ω in the set Ω ∩ Ω ∩ Ω of full probability, E (ω, .) Γ -converges to E. It remains to show that W ho m satisfies the three properties of the functions W j . Convexity, property (iii), and W ho m (1) = 0 have been proved in Lemma 14.2.2(ii) and the growth conditions are trivially satisfied. Finally, we establish (W ho m )∗ = ∗ j ∈J p j W j . We make precise the notation introduced in the proof of Lemma 14.2.1 by pointing out the dependence of Wn with respect to the parameter δ. We then set Wnδ (e) =
@[0,n) (δ, e, ω) n
.
i
i i
i
i
i
i
14.2. Some variational models in fracture mechanics
book 2014/8/6 page 589 i
589
Its Fenchel conjugate is given for all σ in R by 1 n ∗ (Wnδ )∗ (σ) = Wδ (ω, t , σ) d t . n 0 Equation (14.3) becomes now (Wnδ )∗ (σ) =
λ j (ω, n) n
j ∈J
(Tδ W j )∗ (σ).
(14.10)
According to the strong law of large numbers, the right-hand side of (14.10) tends almost surely to p j (Tδ W j )∗ (σ) j ∈J
when n goes to infinity. Note that this pointwise limit is also the Γ -limit of σ →
λ j (ω, n) n
j ∈J
(Tδ W j )∗ (σ)
defined on R. On the other hand the pointwise limit of Wnδ toward Wδho m obtained by the subadditive ergodic theorem is also the Γ -limit of σ → Wnδ (σ) defined on R. Indeed, the sequence of functions (Wδho m )n∈N∗ is equi-Lipschitz, so that pointwise and Γ -limit agree (see Corollary 2.59 in [37]). According to the continuity of the Fenchel conjugate with respect to the Mosco-convergence, hence here to the Γ -convergence, we deduce from (14.10) p j (Tδ W j )∗ (σ) ∀σ ∈ R. (14.11) (Wδho m )∗ (σ) = j ∈J
We would now like to go to the limit on δ in (14.11). Since the sequence (Wδho m )δ of lower semicontinuous functions defined on R increases to the lower semicontinuous function W ho m when δ tends to 0, we have Wδho m → W ho m in the sense of Γ -convergence for functionals defined on R. The same argument implies that Tδ W j Γ -converges to W j . According to the continuity of the Fenchel conjugate with respect to the Γ -convergence (note that Mosco- and Γ -convergence agree), we finally deduce our result by going to the limit on δ in (14.11). Remark 14.2.1. Concerning the upper bound, for all u in SBV# (0, 1) ∩ BV + (0, 1), we have proved the existence of u strongly converging to u in L1 (0, 1) and having the traces of u at 0+ and 1− . It is possible to generalize the previous study when the density functions W j satisfy a growth condition of order p > 1. In this case W ho m,∞ (1) = +∞ and the limit functional given by Theorem 14.2.3 becomes ⎧ ⎨ 1 ho m W (u ) d t if u ∈ L p (0, 1), u > 0, E(u) = ⎩ 0 +∞ otherwise. Let us consider the functional ⎧ t ⎨ 1 W ω, , u d t E˜ (ω, u) = ⎩ 0 +∞ otherwise.
if u ∈ W 1,1 (0, 1),
i
i i
i
i
i
i
590
book 2014/8/6 page 590 i
Chapter 14. Application in mechanics and computer vision
Then, according to Proposition 14.2.2, we have Γ − lim sup→0 E˜ ≤ E almost surely. On the other hand, the truncation argument of the first step in the proof of Proposition 14.2.2 implies that for u → u in L1 (0, 1), lim inf E˜ (ω, u ) ≥ lim inf E˜,δ (ω, u ) ≥ Eδ (u). →0
→0
Arguing as in the second step of this proof, we also obtain lim inf→0 E˜ (ω, u ) ≥ E(u). ˜ Thus the functional E(ω, .) Γ -converges to the functional E.
14.2.3 A second model in one dimension Keeping the same probabilistic setting, we study a new discrete model for which interaction between each pair of contiguous points is described by a random energy density which is no longer assumed to be convex but which satisfies the same conditions in a neighborhood of 0+ . This energy functional is precisely assumed to be subadditive beyond a random threshold e satisfying lim→0 e = 0. In this case, the total energy is of the form n−1 u z+1 − u z W ω, z, F (ω, u) = z=0 and almost surely Γ -converges to a deterministic energy functional defined on the subset SBV + (0, 1) := SBV (0, 1) ∩ BV + (0, 1) by F (u) = 0
1
f (ua ) d t +
g ([u]).
t ∈S u
More precisely, we consider a finite number W j , j ∈ J of density functions satisfying (i) W j : R → R+ ∪ {+∞} is convex, finite for e > 0, W j (1) = 0, and there exists α > 0 such that α(e − 1)2 ≤ W j (e) for all e in [0, +∞); (ii) there exists β > 0 such that W j (e) ≤ β(1 + e 2 ) for all e > 1; (iii) lime→0+ W j (e) = +∞ and W j (e) = +∞ when e ≤ 0. Note that (i), (ii), and (iii) are the conditions of the first model but with growth conditions of order 2. On the other hand, let g be a subadditive, continuous function with at most linear growth, mapping [0, +∞) into (0, +∞) and satisfying inf[0,+∞) g > 0. We then consider the density functions W j , , j ∈ J , from R into [0, +∞) defined by ⎧ ⎨ W j (e) if e ≤ 1, 1 W j , (e) = ⎩ W j (e) ∧ g ((e − 1))
if e > 1.
According to growth conditions, it is easily seen that there exists e j , > 1 satisfying lim→0 e j , = +∞, lim→0 e j , = 0, and such that ⎧ ⎨ W j (e) if e ≤ e j , , W j , (e) = 1 ⎩ g ((e − 1)) if e > e j , .
i
i i
i
i
i
i
14.2. Some variational models in fracture mechanics
book 2014/8/6 page 591 i
591
An example of such functions is given by W j = (e − 1)2 when e ≥ 1, W j (e) = − ln e when $ $ 0 ≤ e ≤ 1, and g (e) = 1 + e or g (e) = 1 + e or g (e) = 1 + e. As said above, the subadditivity assumption on g forces the crack to possess a minimal number of connected components. Denoting by eW , every threshold e j , , we now can j
define the random threshold ω → e (ω) by e (ω) = (eω z , ) z∈Z and the random function W for all t ∈ [z, z + 1) by ⎧ ⎨ ω z (e) if e ≤ eω z , , W (ω, t , e) = 1 ⎩ g ((e − 1)) if e > eω , . z The total energy modeling interactions between contiguous points of [0, 1] ∩ Z is the functional defined on L1 (0, 1) by ⎧ n−1 u((z + 1)) − u(z) ⎨ if u ∈ (0, 1), W ω, z, F (ω, u) = z=0 ⎩ +∞ otherwise, or, in a continuous form, by
⎧ ⎨
t W dt ω, , u F (ω, u) = ⎩ 0 +∞ otherwise. 1
if u ∈ (0, 1),
We equip L1 (0, 1) with its strong convergence. The main result is given below. Theorem 14.2.4. The functional F Γ -converges almost surely to the functional F defined in L1 (0, 1) by ⎧ 1 ⎪ ⎨ W ho m (ua ) d t + g ([u](t )) if u ∈ SBV + (0, 1), F (u) = 0 t ∈S u ⎪ ⎩ +∞ otherwise, where W ho m is the limit density defined in the first model. The proof follows the lines of the first model proof: first we establish the lower bound, then the upper one for a subsequence. For more general models, but in a deterministic setting, see [128]. We would like to write the functional F so that the contribution of W j and g are separated. To this end, we consider the space SBV (0, 1) of all the functions of SBV (0, 1) whose restriction to each interval (z, (z + 1)) included in (0, 1) is affine, and we associate to each function u of (0, 1) the function u˜ in SBV (0, 1) defined for all t ∈ [z, z + 1) by ⎧ ⎪ ⎨ u(t ) if u((z + 1)) − u(z) ≤ e , ω z , u˜(t ) = ⎪ ⎩ t − z + u(z) otherwise. Note that actually u˜ is a random function, but we have dropped the dependence on ω to shorten notations. Then, for all u ∈ (0, 1), one has
i
i i
i
i
i
i
592
book 2014/8/6 page 592 i
Chapter 14. Application in mechanics and computer vision
u((z + 1)) − u(z) W ω, z, {z : u((z+1))−u(z)≤eω z , } u((z + 1)) − u(z) −1 g + {z : u((z+1))−u(z)>eω z , } 1 t W ω, , u˜a d t + g ([ u˜ ](t )) := F˜ (ω, u˜ ), ≥ 0 t ∈S
F (ω, u) =
u˜
where ω → W (ω, t , e) is the random function defined in the first model. The proof of the lower bound in the definition of Γ -convergence is based on the following lemma. Lemma 14.2.3. Assume that sup F (ω, u ) < +∞ and that u strongly converges to some u in L1 (0, 1). Then u˜ strongly converges to u in L1l oc (0, 1). Moreover u belongs to SBV + (0, 1) and g ([u]) ≤ lim inf g ([ u˜ ]). →0
t ∈S u
t ∈S u˜
PROOF. In the proof of the first statement, the difficulty stems from the lack of coercivity of W with respect to (u − 1)+ . Nevertheless, note that for all i, j in {1, . . . , n − 1},
j i
|u − u˜ | d t ≤
and that
1
sup
0
2
j i
(u − 1)+ d t
(u − 1)− d t < +∞.
Thus for (a, b ) ⊂⊂ (0, 1), where a, b satisfy lim→0 u (a) = u(a) and lim→0 u (b ) = u(b ), one has b (u − 1)+ d t < +∞. sup
a
Let now (a, b ) ⊂⊂ (0, 1) as above. From sup F˜ (ω, u˜ ) < +∞ and using the previous estimate, the coercivity assumption on W and the fact that inf[0,+∞) g > 0, we have ⎧ sup u˜ BV (a,b ) < +∞; ⎪ ⎨ ( u˜,a ) equi-integrable on (a, b ); ⎪ ⎩ sup H 0 (S u˜ (a,b ) ) < +∞. Thus, according to compactness Theorem 13.4.4, u belongs to SBV (a, b ) and there exists a subsequence (not relabeled) satisfying ⎧ u˜ u weakly in BV (a, b ), ⎪ ⎪ ⎪ ⎪ ˜ ua weakly in L1 (a, b ), u ⎪ ⎪ ⎨ ,a [ u˜ (a, b )]δ t [u(a, b )]δ t weakly in M (a, b ), ⎪ ⎪ t ∈S u˜ (a,b ) t ∈S u(a,b ) ⎪ ⎪ ⎪ ⎪ ⎩ H 0 (S u(a,b ) ) ≤ lim inf H 0 (S u˜ (a,b ) ). →0
Note that thanks to the coercivity assumption on W j , ( u˜,a ) is equi-integrable on (0, 1) 1 and thus weakly converges to ua in L (0, 1). Moreover, u is a Borel measure on (0, 1) as a
i
i i
i
i
i
i
14.2. Some variational models in fracture mechanics
book 2014/8/6 page 593 i
593
nonnegative distribution, so that u belongs to BV (0, 1). Therefore u belongs to SBV (0, 1) owing to the characterization of the space SBV (0, 1) (see Theorem 10.5.1). To prove the last assertion, it is enough to establish g ([u]) ≤ lim inf g ([ u˜ ]) →0
t ∈S u(a,b )
t ∈S u˜
and to let a go to 0 and b to 1. Set μ = g ([ u˜ (a, b )]) H 0 S u˜ (a,b ) . From sup F˜ (ω, u˜ ) < +∞, up to a subsequence, μ weakly converges to a Borel measure μ ∈ M (a, b ). Let μ = θ H 0 S u˜(a,b ) + μ s its Lebesgue–Nikodým decomposition with respect to the Borel measure H 0 S u˜(a,b ) . It suffices now to establish θ(t0 ) ≥ g ([u](t0 )) for H 0 S u˜(a,b ) a.e. t0 in (0, 1). For H 0 S u˜(a,b ) a.e. t0 in (0, 1) and for a.e. ρ > 0, we have θ(t0 ) = lim lim ρ→0 →0
μ (Bρ (x0 )) H S u˜(a,b ) (Bρ (x0 )) 0
= lim lim μ (Bρ (x0 )) ρ→0 →0 = lim lim ρ→0 →0
t ∈Bρ (x0 )∩S u˜ (a,b )
≥ lim inf lim inf g ρ→0
≥ lim inf g
!
→0
!
ρ→0
g ([ u˜ ](t ))
t ∈Bρ (x0 )∩S u˜ (a,b )
[ u˜](t )
[ u˜ ](t )
"
"
t ∈Bρ (x0 )∩S u(a,b )
= g ([u](t0 )), where we have used the subadditivity and continuity assumptions on g . We now establish the lower bound. Proposition 14.2.3. There exists a set Ω of full probability such that for all ω in Ω and u, u in L1 (0, 1) satisfying u → u strongly in L1 (0, 1), F (u) ≤ lim inf F (ω, u ). →0
PROOF. We assume lim inf→0 F (ω, u ) < +∞. Define v in W 1,1 (0, 1) by v (t ) = t weakly converges to ua in L2 (0, 1), v strongly converges in L2 (0, 1) u˜ d s. Since u˜a, 0 a, t to the function v of L2 (0, 1) defined by v(t ) = 0 ua d s. Therefore, according to Remark 14.2.1 seen for the first model, there exists a set Ω of full probability such that, for all ω ∈ Ω , 1 1 t lim inf W ω, , v d t ≥ W ho m (v ) d t . →0 0 0 Therefore
1
lim inf →0
and
ua
0
t
W ω, , u˜a, dt ≥
1 0
W ho m (ua ) d t ,
> 0. We end the proof by applying Lemma 14.2.3.
i
i i
i
i
i
i
594
book 2014/8/6 page 594 i
Chapter 14. Application in mechanics and computer vision
To conclude the proof of Theorem 14.2.4, we now establish the upper bound. Proposition 14.2.4. There exists a subset Ω of full probability such that for all ω ∈ Ω , there exists a subsequence (nk )k∈N of positive integers satisfying: for all u in L1 (0, 1), there exists unk in L1 (0, 1) possibly depending on ω, converging to u and such that lim supk→+∞ F1/nk (ω, unk ) ≤ F (u). PROOF. Without loss of generality, one may assume F (u) < +∞ and S u = {t0 }. Let m ∈ N∗ and i0 ∈ N be such that t0 ∈ [i0 /m, (i0 +1)/m) and set I m = (0, i0 /m) ∪ ((i0 +1)/m, 1). According to Remark 14.2.1, where (0, 1) is replaced by (0, i0 /m) or ((i0 + 1)/m, 1), there exists a subset Ω of full probability which can be chosen independent of m, such that for all ω ∈ Ω , there exists wn ∈ 1/mn (0, i0 /m) ∩ 1/mn ((i0 + 1)/m, 1) satisfying ⎧ lim wn = u strongly in L1 (I m ), ⎪ ⎪ ⎪ n→+∞ ⎪ ⎪ ⎪ i0 i0 + 1 i0 i0 + 1 ⎨ wn =u and wn =u , m m m m 1 ⎪ ⎪ ⎪ ⎪ ho m ⎪ W (ω, nmt , wn ) d t ≤ W (ua ) d t ≤ W ho m (ua ) d t . ⎪ ⎩ lim sup n→+∞
Im
Im
0
Noticing that W1/mn ≤ W , we obtain 1 lim sup W1/mn (ω, nmt , wn ) d t ≤ W ho m (ua ) d t . n→+∞
Im
(14.12)
0
Consider now the function wn,m defined as follows: ⎧ wn,m = wn on I m ; ⎪ ⎪ ⎪ ⎪ ⎪ i0 i0 + 1 i0 i0 + 1 ⎪ ⎪ ⎪ = u , w = u ; w ⎪ n,m n,m ⎪ m m m m ⎪ ⎨ 1 i0 i0 + 1 ⎪ wn,m is affine on , − with wn,m = 1; ⎪ ⎪ ⎪ m m nm ⎪ ⎪ ⎪ ⎪ ⎪ i0 + 1 1 i0 + 1 ⎪ ⎪ ⎩ wn,m is affine on − , . m nm m i +1
Clearly wn,m belongs to 1/mn (0, 1) and the slope e of its restriction to ( 0m − satisfies 1 i0 + 1 i0 n −1 i0 + 1 i0 1 e=u −u − >u −u − , mn m m mn m m m where the last term tends to [u](t0 ) > 0. Since
1 e mn ωi0 +1 ,1/mn
1 i0 +1 , m ) nm
tends to 0, for n large enough
we have e > eωi +1 ,1/mn . A straightforward calculation then yields 0 1 i0 i0 + 1 −u − W1/mn (ω, nmt , wn,m ) d t = g u m m m (0,1)\I m so that
lim sup lim sup m→+∞ n→+∞
(0,1)\I m
W1/mn (ω, nmt , wn,m ) d t = g ([u](t0 )).
(14.13)
i
i i
i
i
i
i
14.3. The Mumford–Shah model
book 2014/8/6 page 595 i
595
Finally, combining (14.12) and (14.13), we obtain 1 W1/mn (ω, nmt , wn,m ) d t ≤ F (u). lim sup lim sup m→+∞ n→+∞
0
On the other hand, clearly lim sup lim sup wn,m = u strongly in L1 (0, 1) m→+∞ n→+∞
and we complete the proof by a diagonalization argument as in Proposition 14.2.2. Remark 14.2.2. One can slightly generalize this model by assuming the function g of random type (see [249], [250]). More precisely, let {g j , j ∈ J } be a finite set of Lipschitz functions mapping [0, +∞) into (0, +∞), satisfying inf[0,+∞) g > 0, and not necessarily subadditive. We set Ω = {(w j , g j ), j ∈ J }Z and ⎧ ⎨ W j (e) if e ≤ 1, 1 W j , (e) = ⎩ W j (e) ∧ g j ((e − 1))
if e > 1,
which, as previously, leads to the random function defined by ⎧ ⎨ ω 1z (e) if e ≤ eω z , , W (ω, t , e) = 1 2 ⎩ ω z ((e − 1)) if e > eω , z for all ω = ((ω 1z , ω 2z )) z∈Z in Ω, t in [z, z + 1), and e in R. Then, the corresponding total energy almost surely Γ -converges to the functional ⎧ 1 ⎪ ⎨ W ho m (ua ) d t + g ho m ([u](t )) if u ∈ SBV + (0, 1), F (u) = 0 t ∈S ⎪ u ⎩ +∞ otherwise. Setting g (ω, t , .) = ω 2z for all t in [z, z + 1), the density g ho m (a) at a ∈ R is the almost sure deterministic limit of the process ⎧ ⎪ + ⎨ 1(−T ,T ) (a) = inf g (ω, t , [v](t )) : v ∈ SBV0,a (−T , T ) , t ∈(−T ,T )∩Sv ⎪ ⎩ SBV + (−T , T ) = {v ∈ SBV + (−T , T ) : v = 0, v(−T ) = 0, v(T ) = a} 0,a a
when T goes to +∞. It is easily seen that a → g ho m (a) is subadditive.
14.3 The Mumford–Shah model Let us recall the Mumford–Shah model discussed in Section 12.5. Let Ω be a bounded open subset of RN and g a given function in L∞ (Ω). Denoting by 0 the class of the closed sets of Ω, for all K in 0 and all u in C1 (Ω \ K) we define the functional |∇u|2 d x + N −1 (K) E(u, K) := |u − g |2 d x + Ω
Ω\K
i
i i
i
i
i
i
596
book 2014/8/6 page 596 i
Chapter 14. Application in mechanics and computer vision
and the associated optimization problem which is the strong formulation of the Mumford– Shah model in image segmentation: inf{E(u, K) : (u, K) ∈ C1 (Ω \ K) × 0 }.
(14.14)
When Ω is a rectangle in R and g (x) is the light signal striking Ω at a point x, (14.14) is the Mumford–Shah model of image segmentation: K may be considered as the outline of the given light image in computer vision. In Section 12.5 we introduced the corresponding weak formulation, 2 2 N −1 |u − g | d x + |∇u| d x + (S u ) : u ∈ SBV (Ω) , (14.15) mw := inf 2
Ω
Ω
where ∇u denotes the density of the Lebesgue part of D u. One may now establish the weak existence result. Theorem 14.3.1. There exists at least a solution of the weak problem (14.15). PROOF. We adopt the strategy of the so-called direct methods in the calculus of variations. From the hypothesis g ∈ L∞ (Ω), we may assume that all the admissible functions u in (14.15) are uniformly bounded in L∞ (Ω) by |g |L∞ (Ω) . Indeed, let c = |g |L∞ (Ω) and consider the truncated function uc = c ∧ u ∨ (−c). It is easily seen that uc belongs to SBV (Ω) and satisfies ⎧ S ⊂ Su , ⎪ ⎪ ⎪ uc ⎪ ⎪ ⎨ |∇uc |2 d x ≤ |∇u|2 d x, Ω Ω ⎪ ⎪ ⎪ ⎪ 2 ⎪ |uc − g | d x ≤ |u − g |2 d x. ⎩ Ω
Ω
(Note that only the last inequality requires the explicit value of c.) Let (un )n be a minimizing sequence of (14.15). It obviously satisfies |un |∞ + |∇un |2 d x + H N −1 (S un ) ≤ C , Ω
where C is a constant which does not depend on n. According to Theorem 13.4.3, there exists a subsequence (unk )k and a function u ∗ in SBV (Ω) such that ⎧ ∗ 1 ⎪ ⎪ unk → u in L l oc (Ω), ⎨ ∇unk ∇u ∗ in L2 (Ω, RN ), ⎪ ⎪ ⎩ N −1 (S ∗ ) ≤ lim inf H N −1 (S ). u
k−→+∞
un
k
According to Fatou’s lemma and to the weak lower semicontinuity of the norm of the space L2 (Ω, RN ), we deduce ∗ 2 |u − g | d x + |∇u ∗ |2 d x + N −1 (S u ∗ ) Ω Ω 2 ≤ lim inf |unk − g | d x + lim inf |∇unk |2 d x + lim inf N −1 (S un ) k k→+∞ Ω k→+∞ Ω k→+∞ ≤ lim inf |unk − g |2 d x + |∇unk |2 d x + N −1 (S un ) = mw , k→+∞
Ω
Ω
k
∗
which proves that u is a solution of (14.15).
i
i i
i
i
i
i
14.3. The Mumford–Shah model
book 2014/8/6 page 597 i
597
We establish now the existence of a solution for the strong problem (14.14). Theorem 14.3.2. There exists at least a solution of the strong problem (14.14). PROOF. The proof proceeds in three steps. First step. Let m s be the value of the infimum of problem (14.14). We begin by showing that mw ≤ m s . Indeed, arguing as in the previous proof, one may assume that all the admissible functions of (14.14) are uniformly bounded in L∞ (Ω). Moreover, according to Example 10.5.1, for all K in 0 such that N −1 (K) < +∞, the space W 1,1 (Ω\K)∩ L∞(Ω) is included in SBV (Ω) and all its elements satisfy N −1 (S u \ K) = 0. Second step. We establish that any solution u ∗ of (14.15) satisfies u ∗ ∈ C1 (Ω \ S u ∗ ), N −1 (S u ∗ ∩ Ω \ S u ∗ ) = 0. We only prove the first assertion. The second is more involved and we refer the reader to the paper of De Giorgi, Carriero, and Leaci [196]. Let Bρ (x) be the open ball centered at x, with radius ρ small enough so that Bρ (x) ⊂ Ω \ S u ∗ . Then u ∗ belongs to W 1,2 (Bρ (x)) and minimizes the problem 1,2 2 2 ∗ |∇v| d x + |v − g | d x : v ∈ u + W0 (Bρ (x)) . inf Bρ (x)
Bρ (x)
Thus u ∗ is a solution of the Dirichlet problem −Δv + v = g in Bρ (x), v = u ∗ on ∂ Bρ (x). According to classical results on regularity properties of the solutions of Dirichlet problems (see, for instance, [137], [310]) we have u ∗ ∈ C1 (Bρ (x)). Last step. Collecting the two previous steps we straightforwardly deduce that mw = m s and that (u ∗ , S u ∗ ∩ Ω) is a solution of (14.14). For other variational models in computer vision and image processing, see [300], [60], [61], [62], and references therein.
i
i i
i
i
i
i
book 2014/8/6 page 599 i
Chapter 15
Variational problems with a lack of coercivity
As we have seen in Section 3.2, every minimization problem of a coercive lower semicontinuous function admits a solution. On the other hand, without the coercivity assumption, in general we cannot apply the direct method and the existence of a minimizer may fail. This may occur even if the cost function is convex, as it happens, for instance, in the case min{e x : x ∈ R}. However, some minimum problems, even if not coercive, still admit a solution, as, for instance, the case min{x 2 : (x, y) ∈ R2 } trivially shows. In this chapter we present some methods which allow us to identify the noncoercive minimum problems which admit a solution. The history of these tools goes back to Stampacchia [338] and Fichera [214], who developed them to treat noncoercive cases in the framework of variational inequalities and of unilateral contact problems in elasticity, respectively. On the other hand, at least in the finite dimensional convex situations, the geometrical tool of recession function was introduced by Rockafellar in 1964, and this has been shown to be very useful in a large number of cases. The theory we present in Section 15.1 for the convex cases and in Section 15.2 for the general ones appeared first in the paper by Baiocchi et al. in 1988 [75] and makes it possible to treat in a unified way problems of geometrical type as well as problems coming from continuum mechanics.
15.1 Convex minimization problems and recession functions In this section we will treat convex minimum problems, not necessarily coercive, and we will prove the existence of minimizers provided some compatibility conditions are satisfied. The simplest example of a variational noncoercive minimization problem is the classical Neumann problem presented in Section 6.2, 1 2 1 |D u| d x − 〈L, u〉 : u ∈ H (Ω) , min 2 Ω where Ω is a connected bounded Lipschitz domain of Rn and L belongs to the dual space 1 H (Ω) . The Euler–Lagrange equation of the minimization problem above can be 599
i
i i
i
i
i
i
600
book 2014/8/6 page 600 i
Chapter 15. Variational problems with a lack of coercivity
written in the weak form Ω
∀φ ∈ H 1 (Ω).
D uDφ d x = 〈L, φ〉
(15.1)
Now, if the term L on the right-hand side of (15.1) is of the form f φdx + g φ d n−1 〈L, φ〉 = Ω
∂Ω
with f ∈ L (Ω) and g ∈ L (∂ Ω), which we write as L = f + g , integrating by parts the left-hand side of (15.1) we obtain the PDE problem ⎧ ⎨ −Δu = f in Ω, ∂u ⎩ =g on ∂ Ω. ∂ν 2
2
It is well known that a solution of the problem above exists iff the compatibility condition 〈L, 1〉 = 0
(15.2)
is fulfilled. This can be seen in a simple way by remarking that if a solution of (15.1) exists, condition (15.2) follows straightforwardly by taking as a test function φ = 1; on the other hand, if the compatibility condition (15.2) is fulfilled, then by the Poincaré inequality (Theorem 5.3.1), the minimization problem becomes coercive as soon as it is restricted to the class of functions in H 1 (Ω) with zero average. Therefore we obtain the existence of a solution u for the problem written in weak form as D uDψ d x = 〈L, ψ〉 ∀ψ ∈ H 1 (Ω), ψ d x = 0. Ω
Ω
Now, this implies the existence of a solution for (15.1) by noticing that every function φ in H 1 (Ω) can be written as φ = ψ + c, where ψ is a function with zero average and c ∈ R, so that condition (15.2) yields D uDφ d x = D uDψ d x = 〈L, ψ〉 = 〈L, φ〉. Ω
Ω
In this section, (V , σ) will denote a real locally convex Hausdorff topological vector space, and F : V → ] − ∞, +∞] will be a proper convex and sequentially σ-lsc mapping. The minimization problem we are interested in is 4 5 min F (v) : v ∈ V . From the discussions above we know that the existence or nonexistence of a solution depends on some compatibility conditions that we want to identify. To do this we recall the classical definition of recession function introduced by Rockafellar [325] in the finite dimensional case (see also Sections 11.3 and 13.3). Definition 15.1.1. Given a proper convex and sequentially σ-lower semicontinuous functional F : V → ] − ∞, +∞], the recession functional F ∞ of F is defined, for every v ∈ V , by F ∞ (v) = lim
t →+∞
F (v0 + t v) t
,
(15.3)
where v0 is any element of dom F = {v ∈ V : F (v) < +∞}.
i
i i
i
i
i
i
15.1. Convex minimization problems and recession functions
book 2014/8/6 page 601 i
601
The main properties of the recession functional F ∞ are listed in the following proposition. Proposition 15.1.1. We have the following: (i) The limit in (15.3) exists and is independent of v0 . (ii) The functional F ∞ can equivalently be expressed by 4 5 F ∞ (v) = sup F (u + v) − F (u) : u ∈ dom F M F (v + t v) − F (v ) N 0 0 = sup : t >0 t for any v0 ∈ dom F . (iii) F ∞ is proper, convex, sequentially σ-lsc, and positively 1-homogeneous, that is, F ∞ (t v) = t F ∞ (v)
∀t ≥ 0,
∀v ∈ V .
(iv) For every F1 , . . . , Fn proper, convex, sequentially σ-lsc mappings, with (dom F1 ) ∩ · · · ∩ (dom Fn ) = , it is ∞
n n Fi = Fi∞ . i =1
i =1
(v) F ∞ (v) + F ∞ (−v) ≥ 0 for every v ∈ V . PROOF. Let us prove that the limit in (15.3) exists. This follows from the fact that for every v0 ∈ dom F and v ∈ V the function φ(t ) = F (v0 + t v) is convex on R; hence the mapping t → φ(t ) − φ(0) /t is nondecreasing and so it admits a limit as t → +∞. Moreover, the fact that the definition of F ∞ does not depend on v0 ∈ dom F follows from property (ii). Let us prove now that for every v0 ∈ dom F and v ∈ V it is F (v0 + t v) F (v0 + t v) − F (v0 ) lim = sup : t >0 . t →+∞ t t The inequality ≤ is trivial because lim
t →+∞
F (v0 + t v) t
F (v0 + t v) − F (v0 ) = lim t →+∞ t F (v0 + t v) − F (v0 ) ≤ sup : t >0 . t
To prove the opposite inequality, fix s > 0 and t > s; by the convexity of F we have s s v0 + (v0 + t v) F (v0 + s v) = F 1 − t t s s ≤ 1− F (v0 ) + F (v0 + t v), t t so that
F (v0 + s v) − F (v0 ) s
≤
F (v0 + t v) − F (v0 ) t
.
i
i i
i
i
i
i
602
book 2014/8/6 page 602 i
Chapter 15. Variational problems with a lack of coercivity
By letting first t → +∞, and then taking the supremum for s > 0, we get F (v0 + t v) F (v0 + s v) − F (v0 ) : s > 0 ≤ lim . sup t →+∞ s t To conclude the proof of (ii) it remains to prove that for every v0 ∈ dom F and v ∈ V the equality F (v0 + t v) − F (v0 ) (15.4) sup F (u + v) − F (u) = sup t t >0 u∈dom F holds. Let u, v0 ∈ dom F and v ∈ V ; by using the convexity and lower semicontinuity of F we obtain 1 1 F (u + v) ≤ lim inf F 1 − u + (v0 + t v) t →+∞ t t % & 1 1 ≤ lim inf 1 − F (u) + F (v0 + t v) t →+∞ t t F (v0 + t v) − F (v0 ) = F (u) + lim ; t →+∞ t hence, inequality ≤ in (15.4) is proved. To prove the opposite inequality, we denote by S the left-hand side of (15.4); it is clear that without loss of generality we may assume S < +∞. Then u + v ∈ dom F for every u ∈ dom F and so, from F (u + v) ≤ S + F (u), we deduce for every integer k ≥ 0 F (u + kv) = F (u) +
k
F (u + i v) − F (u + (i − 1)v) ≤ F (u) + kS.
i =1
Take now two nonnegative integers h, k; by using the convexity of F and the inequality above we obtain 1 u + hv h u+ F u + v = F 1− k k k 1 1 ≤ 1− F (u) + F (u + hv) k k 1 h 1 F (u) + (F (u) + hS) = F (u) + S. ≤ 1− k k k Finally, by using the lower semicontinuity of F , we have F (u + t v) ≤ F (u) + t S
∀t ≥ 0.
Thus, taking u = v0 , we obtain F (v0 + t v) − F (v0 ) t
≤S
∀t > 0,
which concludes the proof of (15.4). The fact that F ∞ is proper, convex, and sequentially σ-lsc follows from assertion (ii). Indeed, for every u ∈ dom F the mapping v → F (u + v) − F (u) is clearly convex and
i
i i
i
i
i
i
15.1. Convex minimization problems and recession functions
book 2014/8/6 page 603 i
603
sequentially σ-lsc, and so is F ∞ , thanks to the well-known properties of supremum of convex functions. The fact that F ∞ is positively 1-homogeneous follows easily from the definition. Indeed, given v ∈ V and s > 0, we have F ∞ (s v) = lim
F (v0 + t s v)
t →+∞
t
and, setting τ = t s, F ∞ (s v) = s lim
F (v0 + τv)
τ→+∞
τ
= s F ∞ (v).
Assertion (iv) follows straightforwardly by the definition of recession function. Finally, to prove (v), we may reduce ourselves to the case when both F ∞ (v) and F ∞ (−v) are finite; otherwise the statement is trivial. Therefore, by using (ii), we deduce that u + v ∈ dom F ,
u − v ∈ dom F
∀u ∈ dom F ;
hence, by using (ii) again, taking u − v instead of u in the supremum associated to F ∞ (v), we have F ∞ (v) + F ∞ (−v) ≥ sup F (u) − F (u − v) u∈dom F
+ sup
F (u − v) − F (u) ≥ 0,
u∈dom F
which concludes the proof of Proposition 15.1.1. Here are some simple examples in which the recession functional can be explicitly computed. Example 15.1.1. Let V = R and let F (x) = e x for every x ∈ R. Then an easy calculation shows that in this case < if x ≤ 0, ∞ F (x) = 0 +∞ if x > 0. Analogously, if V = R2 and F (x) = x12 , then < 0 F ∞ (x) = +∞
if x1 = 0, if x1 = 0.
As we will see more precisely later, this example shows how the recession function indicates the directions of coercivity. Example 15.1.2. Let F : V → [0, +∞] be a nonnegative convex σ-lsc functional which is positively homogeneous of degree p > 1, that is, F (t v) = t p F (v) Then we have
< F ∞ (v) = 0 +∞
∀t > 0, ∀v ∈ V . if F (v) = 0, otherwise.
On the other hand, if F is positively homogeneous of degree 1, then it is clear that F∞ = F.
i
i i
i
i
i
i
604
book 2014/8/6 page 604 i
Chapter 15. Variational problems with a lack of coercivity
Example 15.1.3. Let Ω be an open subset of Rn with a Lipschitz boundary, and let V = W 1, p (Ω; R m ) ( p ≥ 1) be the Sobolev space of all R m -valued functions which are in L p (Ω) along with their first derivatives. Consider the functional f (x, D u) d x ∀u ∈ W 1, p (Ω, R m ), F (u) = Ω
where (i) f : Ω × R mn → [0, +∞] is a Borel function, (ii) for a.e. x ∈ Ω the function f (x, ·) is convex and lower semicontinuous on R mn , (iii) there exists u0 ∈ W 1, p (Ω, R m ) such that F (u0 ) < +∞. It is well known (see, for instance, Section 13.1) that under the assumptions above, the functional F turns out to be proper, convex, and sequentially lower semicontinuous with respect to the weak topology of W 1, p (Ω, R m ). Moreover, we have ∞ f ∞ (x, D u) d x ∀u ∈ W 1, p (Ω, R m ), F (u) = Ω
where f ∞ (x, ·) is the recession function of f (x, ·). In fact, because of the convexity of f , for all u ∈ W 1, p (Ω, R m ) the function f x, D u0 (x) + t D u(x) − f x, D u0 (x) g (x, t ) = t is nondecreasing with respect to t for a.e. x ∈ Ω. Therefore the monotone convergence theorem gives F (u0 + t u) − F (u0 )
F ∞ (u) = lim
t →+∞
= lim
t →+∞
t
Ω
g (x, t ) d x =
f ∞ x, D u(x) d x. Ω
As a consequence, if F (u) =
Ω
|D u| p d x
∀u ∈ W 1, p (Ω, R m )
with p > 1, we get F ∞ (u) =
−∞
i
i i
i
i
i
i
15.1. Convex minimization problems and recession functions
book 2014/8/6 page 605 i
605
(which always occurs, for instance, if F admits a minimum point on V ). Then F ∞ (v) ≥ 0
∀v ∈ V .
(15.5)
PROOF. Let m be the infimum of F on V , let v0 be any point in dom F , and let v ∈ V . Since F is proper the infimum m is finite and so, by the definition of F ∞ we obtain F ∞ (v) = lim
F (v0 + t v)
t →+∞
t
≥ lim
t →+∞
m t
= 0.
Example 15.1.4. Let P : V → [0, +∞] be a convex sequentially σ-lower semicontinuous functional which is positively homogeneous of degree p > 1 (for instance, the pth power of a seminorm), let L ∈ V , and let F be the functional defined by F (v) = P (v) − 〈L, v〉
∀v ∈ V .
By Example 15.1.2 we have ∞
F (v) =
0 t F (t w + v0 ) ≤ lim inf F t w h + 1 − v h→+∞ h 0 t t ≤ lim inf F (hw h ) + 1 − F (v0 ) ≤ F (v0 ), h→+∞ h h where the last inequality follows from (15.8). Hence F ∞ (w) = lim
t →+∞
F (t w + v0 ) t
≤ lim
t →+∞
F (v0 ) t
= 0,
which, together with the necessary condition (ii), implies w ∈ ker F ∞ .
i
i i
i
i
i
i
15.1. Convex minimization problems and recession functions
book 2014/8/6 page 607 i
607
Step 4. By the compatibility condition (iii) we obtain F ∞ (−w) = 0, and this implies, by Proposition 15.1.1(ii), that F (v − t w) ≤ F (v)
∀v ∈ V , ∀t ≥ 0.
(15.9)
In particular, by taking v = v h and t = h, from the fact that w h − w → 0 we have v h −hw < h for h large enough, and then from (15.9) it follows that v h −hw is a solution of problem (℘ h ) for h large enough, with v h − hw < h. Hence we found a solution of problem (℘ h ), for h large enough, with norm strictly less than h, and by repeating the argument used in Step 2, this provides a solution of the minimum problem (15.6) Remark 15.1.1. In the case of nonreflexive Banach spaces V the result above still holds; it is enough to assume that V is a Banach space with norm · and to consider a topology σ on V coarser than the norm topology and such that (V , σ) is a Hausdorff vector space with the closed unit ball of (V , · ) sequentially σ-compact. This happens, for instance, when V is the dual W of a separable Banach space W and σ is the weak* topology on V . Then the proof above can be repeated if F : V → ] − ∞, +∞] is assumed to be proper convex and sequentially σ-lsc and such that (ii), (iii) hold together with the compactness condition: if t h → +∞, v h → v with respect to σ, and F (t h v h ) is bounded from above, then v h − v → 0. Remark 15.1.2. Notice that if dom F is bounded in (V , · ), then conditions (i), (ii), and (iii) are automatically fulfilled. In fact, in this case the set ker F ∞ reduces to {0}, as can be seen immediately. Remark 15.1.3. Consider the particular case when the so-called condition of Lions– Stampacchia type (see [277]) F ∞ (v) > 0
∀v = 0
(15.10)
holds. Then we obtain immediately that assumptions (ii) and (iii) are fulfilled. By Theorem 15.1.1 we obtain that, if the compactness assumption (i) also holds, then the set of solutions of problem (15.6) is nonempty. In this case the set of solutions is also bounded. In fact, by contradiction, assume there exist v h solutions of (15.6) with v h → +∞; arguing as in the proof of Theorem 15.1.1, it is possible to prove that the sequence w h = v h /v h converges in norm to some w ∈ ker F ∞ . From assumption (15.10) it follows that w = 0, and this contradicts the fact that w h = 1 for every h. We consider now the particular case of quadratic forms on a Hilbert space. More precisely, let V be a separable Hilbert space, let a : V × V → R be a symmetric bilinear continuous form, and let L ∈ V . Define 1 F (v) = a(v, v) − 〈L, v〉 2
∀v ∈ V
and consider the minimum problem
4 5 min F (v) : v ∈ V ,
(15.11)
which is equivalent to the equation a(v, ·) = L
in V .
i
i i
i
i
i
i
608
book 2014/8/6 page 608 i
Chapter 15. Variational problems with a lack of coercivity
On the bilinear form a we assume that the following conditions are satisfied: a(v, v) ≥ 0 ∀v ∈ V , v h → 0 weakly, and a(v h , v h ) → 0
⇒
v h → 0 strongly.
(15.12) (15.13)
In this case, since ker(F ∞ − L) = (ker F ∞ ) ∩ (ker L), as a corollary of Theorem 15.1.1 we obtain the following result. Proposition 15.1.3. The minimum problem (15.11) admits a solution iff the compatibility condition L is orthogonal to ker a (i.e., 〈L, v〉 = 0 whenever a(v, v) = 0) (15.14) is fulfilled. Note that the condition above reads simply ker a ⊂ ker L. PROOF. An easy computation shows that < −〈L, v〉 ∞ F (v) = +∞
if a(v, v) = 0, if a(v, v) = 0;
then, by Proposition 15.2, if problem (15.11) admits a solution, we must necessarily have F ∞ ≥ 0, that is, (15.14). Conversely, let us assume (15.14) holds. The weak lower semicontinuity of the functional F is then a consequence of the continuity of the bilinear form a, the compactness assumption (i) follows from property (15.13), and finally the necessary condition (ii) and the compatibility condition (iii) follow from property (15.14). Therefore, by Theorem 15.1.1 we obtain that the minimum problem (15.11) admits a solution. Example 15.1.6. Consider the variational formulation of the classical Neumann problem 1 2 1 min |D u| d x − 〈L, u〉 : u ∈ H (Ω) , 2 Ω where Ω is a bounded regular open subset of Rn and L ∈ H 1 (Ω) . If we consider the bilinear form a(u, v) = D u Dv d x ∀u, v ∈ H 1 (Ω) Ω
and apply Proposition 15.1.3, we obtain that a solution exists iff the compatibility condition 〈L, 1〉 = 0 is fulfilled. In terms of partial differential equations, when L = f + g with f ∈ L2 (Ω) and g ∈ L2 (∂ Ω), this means that the problem ⎧ ⎨ −Δu = f in Ω, ∂u ⎩ =g on ∂ Ω ∂ν admits a solution iff Ω
f (x) d x +
∂Ω
g (x) d n−1 (x) = 0.
i
i i
i
i
i
i
15.1. Convex minimization problems and recession functions
book 2014/8/6 page 609 i
609
For every subset K of V we denote by χK the indicator function of K defined by < χK (v) =
0 +∞
if v ∈ K, otherwise.
Notice that if K is a nonempty sequentially σ-closed convex subset of V , then the function χK turns out to be a proper convex sequentially σ-lsc mapping. Definition 15.1.2. For every nonempty sequentially σ-closed convex subset K of V we define the recession cone K ∞ of K by setting K ∞ = dom(χK )∞ .
(15.15)
For every function F : V → ] − ∞, +∞] we denote by epi F the epigraph of F 4 5 epi F = (v, t ) ∈ V × R : t ≥ F (v) .
(15.16)
We remark that epi F is a nonempty sequentially closed convex subset of V ×R whenever F is a proper sequentially σ-lower semicontinuous convex function. Proposition 15.1.4. If F : V → ] − ∞, +∞] is a proper convex sequentially σ-lsc function, we have epi F ∞ = (epi F )∞ . PROOF. By Proposition 15.1.1(ii) it is (v, t ) ∈ epi F ∞ ⇐⇒ t ≥ F (u + v) − F (u) ∀u ∈ dom F . In other words, if (s, u) ∈ epi F , we have F (u + v) ≤ t + F (u) ≤ t + s, that is, (v, t ) + epi F ⊂ epi F , or equivalently,
χepi F (v, t ) + (w, α) = 0
∀(w, α) ∈ epi F .
By Proposition 15.1.1 again, we obtain ∞ χepi F (v, t ) = 0, that is, (v, t ) ∈ (epi F )∞ . The definition given in (15.15) of recession cone is a very useful tool in convex analysis (see [325], [76], [77], and [119], where it is called “cône asymptote”); let us now state its main properties.
i
i i
i
i
i
i
610
book 2014/8/6 page 610 i
Chapter 15. Variational problems with a lack of coercivity
Proposition 15.1.5. For any nonempty sequentially σ-closed convex subset K of V the set K ∞ is a convex sequentially σ-closed cone and the following properties hold true: ' K∞ = t −1 (K − v0 ) ∀v0 ∈ K, (15.17) t >0
K is a cone ⇐⇒ K ∞ = K, 0 ∈ K ⇐⇒ K ∞ ⊂ K,
(15.18) (15.19)
K is bounded ⇒ K ∞ = {0}, dom F ∞ ⊂ (dom F )∞ whenever F : V → ] − ∞, +∞] is convex and lsc.
(15.20) (15.21)
Moreover, a point w ∈ V belongs to K ∞ iff one of the following conditions is fulfilled: k +w ∈K
∀k ∈ K,
(15.22)
k + t w ∈ K ∀k ∈ K ∀t ≥ 0, ∃k ∈ K : k + t w ∈ K ∀t ≥ 0.
(15.23) (15.24)
PROOF. By Proposition 15.1.1(ii) and the definition (15.15) of K ∞ we have that v ∈ K ∞ iff χK (v0 + t v) − χK (v0 ) ≤ 0 which is equivalent to
v + t v0 ∈ K
∀t > 0, ∀v0 ∈ K,
∀t > 0, ∀v0 ∈ K.
Therefore (15.17) follows. The proof of (15.18) follows from Proposition 15.1.1(iii). Assertion (15.19) follows easily from (15.17) by taking v0 = 0. Always from (15.17) we obtain that K ∞ reduces to {0} whenever K is bounded, that is, (15.20). The proof of (15.21) simply follows by remarking that when F ∞ (v) < +∞, then v0 + t v ∈ dom F for every t > 0 and every v0 ∈ dom F . Finally, (15.22), (15.23), (15.24) follow from Proposition 15.1.1(ii). Remark 15.1.4. We remark that the inclusion in (15.21) may be strict: this can be seen by taking, for instance, F (v) = v2 . In this case we have dom F = (dom F )∞ = V
whereas
dom F ∞ = {0}.
In the finite dimensional case, implication (15.20) can be reversed, as the following proposition shows. Proposition 15.1.6. Let K be a convex closed subset of Rn such that K ∞ = {0}. Then K is bounded. PROOF. Assume by contradiction that K is unbounded; then for every h ∈ N there exists x h ∈ K with |x h | ≥ h. The sequence y h = x h /|x h | is bounded, so that we may extract a subsequence (which we still denote for simplicity by y h ) converging to some y with |y| = 1. We claim that y ∈ K ∞ . In fact, fix t > 0 and let x0 be any point in K; since K is convex, for h large enough we have t t x + 1− x ∈ K, |x h | h |x h | 0
i
i i
i
i
i
i
15.1. Convex minimization problems and recession functions
book 2014/8/6 page 611 i
611
and, since K is closed, passing to the limit as h → +∞, t y + x0 ∈ K, which implies that y belongs to K ∞ . This gives a contradiction because by assumption K ∞ = {0} and |y| = 1. In general infinite dimensional topological vector spaces, Proposition 15.1.6 is false, as shown by the following example. Example 15.1.7. Let V be a separable (infinite dimensional) Hilbert space, and let (en )n∈N be a complete orthonormal system in V . Consider the set 5 4 K = v ∈ V : |(v, en )| ≤ n for every n ∈ N , where (·, ·) denotes the scalar product in V . It is easy to see that K is a nonempty convex weakly closed subset of V ; moreover, we have K ∞ = {0}. In fact, if v ∈ K ∞ , by using the fact that 0 ∈ K, we obtain from (15.17) tv ∈ K that is, |(v, en )| ≤
n t
for every t > 0, for every t > 0 and n ∈ N.
Therefore, as t → +∞, we get (v, en ) = 0
∀n ∈ N,
and so v = 0. Nevertheless, K is unbounded, as it contains the points vn = nen for every n ∈ N. We will specialize now our existence results on noncoercive minimum problems to the case when the functional F can be written in the form F (v) = J (v) − 〈L, v〉 + χK (v). This situation arises, for instance, in many problems of mathematical physics, where J represents the stored energy functional depending on the nature of the body, L describes the action of the applied forces, and K is the set of admissible configurations which takes into account the physical constraints of the problems. In this case, the minimization problem we are dealing with takes the form 4 5 min J (v) − 〈L, v〉 : v ∈ K . Theorem 15.1.2. Assume that J : V → [0, +∞] is a proper convex sequentially σ-lsc functional,
(15.25)
L : V → R is a linear σ-continuous functional, K ⊂ V is a nonempty convex sequentially σ-closed set,
(15.26) (15.27)
and consider the minimum problem 4 5 min J (v) − 〈L, v〉 : v ∈ K .
(15.28)
i
i i
i
i
i
i
612
book 2014/8/6 page 612 i
Chapter 15. Variational problems with a lack of coercivity
Then a necessary condition for the existence of at least a minimizer is J ∞ (v) ≥ 〈L, v〉
∀v ∈ K ∞ .
(15.29)
On the other hand, the minimum problem (15.28) admits at least a solution provided the necessary condition (15.29) holds and the following compactness and compatibility conditions are fulfilled: if t h → +∞, v h ∈ K, v h → v weakly, and J (t h v h ) − t h 〈L, v h 〉 is bounded from above, then v h − v → 0,
(15.30)
K ∞ ∩ ker(J ∞ − L) is a linear subspace of V .
(15.31)
PROOF. Noticing that by Proposition 15.1.1(iv) the equality (J − L + χK )∞ = J ∞ − L + χK ∞ holds, and taking Proposition 15.1.2 into account, we obtain that condition (15.29) is necessary for the existence of at least a solution of the minimum problem (15.28). Analogously, compactness and compatibility conditions (i) and (iii) become in this case (15.30) and (15.31), so that the conclusion follows by Theorem 15.1.1. Remark 15.1.5. When J is a quadratic form, or more generally when J ∞ takes only the values 0 and +∞ (which, for instance, occurs if J is positively p-homogeneous with p > 1; see Example 15.1.2, condition (15.31)), it can be written in the simpler form K ∞ ∩ ker J ∞ ∩ ker L is a linear subspace of V ,
(15.32)
which will be used in the following. Let us discuss the structural assumptions of the existence result of Theorem 15.1.1 above. Example 15.1.8. The compactness assumption (i) in the existence theorem, Theorem 15.1.1, cannot be dropped, as the following example shows. Let V be an infinite dimensional separable Hilbert space, and let (en )n∈N be a complete orthonormal system in V . We denote by (·, ·) the scalar product in V , and we define a functional F : V → R by setting 2−n |(v, en ) − 1|2 ∀v ∈ V . F (v) = n∈N
It is easy to see that the functional F is finite-valued, convex, and weakly lower semicontinuous. Moreover, for every v ∈ V F ∞ (v) = lim
t →+∞
= lim
t →+∞
F (t v) t
% 2
−n
n∈N
Hence, F ∞ (v) =
0, F (x) = +∞ if x ≤ 0. The function F is proper, convex, and lower semicontinuous, and the compactness condition (i) is fulfilled since the dimension of V is finite. Moreover, a simple calculation yields < 0 if x ≥ 0, F ∞ (x) = +∞ if x < 0, and so the necessary condition (ii) is fulfilled too, but 4 5 inf F (x) : x ∈ R = −∞. We conclude this section by showing how Theorem 15.1.1 can be used to determine whether the algebraic difference of two closed convex sets in a Banach space is closed. Here we consider a reflexive Banach space V and two nonempty closed convex subsets A, B of V ; the algebraic difference A − B is defined by A − B = {a − b : a ∈ A, b ∈ B}. Example 15.1.10. We emphasize that even when dealing with cones in a finite dimensional space, the convex set A − B may be not closed: take, for instance, V = R3 and 4 5 A = x ∈ R3 : x1 ≥ 0, x2 ≥ 0, x3 ≥ 0, x1 x3 ≥ x22 , 5 4 B = x ∈ R3 : x2 = x3 = 0 . The sets A and B are closed convex cones, but a simple calculation gives 4 5 4 5 A − B = x ∈ R3 : x2 > 0, x3 > 0 ∪ x ∈ R3 : x2 = 0, x3 ≥ 0 , which is not closed. Indeed, for every n ∈ N we have that the point (0, 1, 1/n) belongs to A − B, whereas their limit (0, 1, 0) is not in A − B.
i
i i
i
i
i
i
614
book 2014/8/6 page 614 i
Chapter 15. Variational problems with a lack of coercivity
In the previous example the convex sets A and B are both unbounded. On the other hand, if A and B are two closed convex subsets of a reflexive Banach space and at least one of them is bounded, then A − B is closed (weak and strong closedness coincide, due to convexity). Indeed, if A is bounded and x h = a h − b h tends weakly to x with a h ∈ A and b h ∈ B, then up to subsequences we have a h → a ∈ A weakly, hence b h = x h − a h → x − a ∈ B weakly, so that x ∈ A − B. The following lemma characterizes the closed convex subsets of V . Lemma 15.1.1. Let K be a nonempty convex subset of a reflexive Banach space V . Then the following conditions are equivalent: (i) K is closed. (ii) For every u ∈ V the function v → u − v has a minimum on K. PROOF. Assume K is closed and let u ∈ V ; set 4 5 M = inf u − v : v ∈ K , and for every h ∈ N let v h ∈ K be such that u − v h ≤ M +
1 h
.
The sequence (v h ) is bounded; since V is reflexive, possibly passing to subsequences, we may assume that v h converges weakly to some v which belongs to K, because K is weakly closed (being strongly closed and convex). By the weak lower semicontinuity of the norm, we obtain u − v ≤ lim inf u − v h ≤ M , h→+∞
which proves (ii). Conversely, assume (ii) holds, and let (v h ) be a sequence in K strongly convergent to some v ∈ V . By (ii) there exists w ∈ K such that
In particular,
w − v ≤ w − v
∀w ∈ K.
w − v ≤ v h − v
∀h ∈ N;
hence, as h → +∞, we obtain w = v, and this proves that v ∈ K. We are now in a position to prove a closure result for the difference of two closed sets. Theorem 15.1.3. Let V be a reflexive Banach space, and let A and B be two nonempty closed convex subsets of V . Assume that A is locally compact for the strong topology and that A∞ ∩ B ∞ is a linear subspace.
(15.33)
Then the convex set A − B is closed. PROOF. By Lemma 15.1.1 it is enough to prove that for every u ∈ V the function v → u − v has a minimum on A − B, or equivalently that the problem 4 5 min F (a, b ) : (a, b ) ∈ V × V (15.34)
i
i i
i
i
i
i
15.1. Convex minimization problems and recession functions
book 2014/8/6 page 615 i
615
has at least a solution, where F (a, b ) = u − a + b + χA(a) + χB (b ). We apply to the minimum problem (15.34) the existence theorem, Theorem 15.1.1. Since F is a nonnegative functional, the necessary condition F ∞ ≥ 0 is immediately fulfilled. To prove the compactness condition (i) take a h → a weakly in V , b h → b weakly in V , t h → +∞ such that F (t h a h , t h b h ) ≤ C ; then by the definition of F we have t h a h ∈ A,
t h b h ∈ B,
u − t h a h + t h b h ≤ C .
Since the convex set A is assumed to be locally compact for the strong topology of V and t h a h ∈ A, the convergence a h → a is actually strong; moreover, by u − t h a h + t h b h ≤ C we obtain that b h − a h → 0 and so also b h → a strongly. Let us finally prove the compatibility condition (iii). By the definition of F we find F ∞ (a, b ) = lim
t →+∞
F (a0 + t a, b0 + t b ) t
,
where a0 ∈ A and b0 ∈ B. Therefore F ∞ (a, b ) = lim a − b + χA (a0 + t a) + χB (b0 + t b ) t →+∞
= a − b + χA∞ (a) + χB ∞ (b ) so that 4 5 ker F ∞ = (a, b ) : a ∈ A∞ , b ∈ B ∞ , a = b 4 5 = (a, a) : a ∈ A∞ ∩ B ∞ . Therefore the compatibility condition (iii) follows from assumption (15.33). Remark 15.1.6. The requirement that A is locally compact for the strong topology is clearly fulfilled when A is finite dimensional. On the other hand, there are closed convex subsets A which are locally compact for the strong topology but not finite dimensional. For instance, it is enough to take in a separable Hilbert space V the set 5 4 A = v ∈ V : |(v, en )| ≤ 1/n ∀n ∈ N , where (en )n∈N is a complete orthonormal system in V . The results of this section allow us to study the problem of lower semicontinuity for the inf-convolution of two convex functions. We recover some results of Section 9.2. If V is a reflexive Banach space and f , g : V → ] − ∞, +∞] are two proper convex functions, we recall that the inf-convolution f #e g is defined by ( f #e g )(w) = inf{ f (u) + g (v) : u, v ∈ V , u + v = w}.
(15.35)
For instance, if f = χA and g = χB , with A, B convex subsets of V , we have f #e g = χA+B . Moreover, it is easy to see that in terms of epigraphs the inf-convolution operation turns out to simply reduce to the algebraic sum, that is, epi f # g = epi f + epi g . e
i
i i
i
i
i
i
616
book 2014/8/6 page 616 i
Chapter 15. Variational problems with a lack of coercivity
As we saw in Example 15.1.10, it may happen that f and g are both lower semicontinuous but the lower semicontinuity does not occur for the inf-convolution f #e g . To see when f #e g is lower semicontinuous, we investigate equivalently on the closedness of epi f #e g and we obtain the following result. Proposition 15.1.7. Let f , g : V → ]−∞, +∞] be two proper convex lower semicontinuous functions. Assume the following: (i) compactness: if t h → +∞, u h and v h converge weakly, u h + v h → 0 strongly, and f (t h u h ) + g (t h v h ) is bounded from above, then u h and v h converge strongly; (ii) compatibility: if f ∞ (v) + g ∞ (−v) ≤ 0, then f ∞ (−v) + g ∞ (v) ≤ 0. Then the inf-convolution f #e g defined in (15.35) is lower semicontinuous. PROOF. By Lemma 15.1.1 it is enough to show that for every M ∈ R and every u ∈ V the function (t , v) → |t − M | + v − u admits a minimum on epi f #e g . In other words, we have to show the existence of a solution for the minimum problem 4 5 min |t − M | + v − u : t ≥ ( f #e g )(v) . It is easy to see that for a fixed v, the optimal t in the minimum problem above is given by t = M ∨( f #e g )(v), so that we have to show the existence of a solution for the problem 4 5 min M ∨ ( f #e g )(v) + v − u : v ∈ V . By the definition of inf-convolution this fact turns out to be equivalent to the existence of a solution for 4 5 min M ∨ f (x) + g (y)) + x + y − u : x, y ∈ V . Setting Φ(x, y) = M ∨ f (x) + g (y)) + x + y − u we apply to Φ the existence theorem, Theorem 15.1.1. The convexity and the weak lower semicontinuity of Φ follow straightforwardly. To prove the compactness assumption (i) of Theorem 15.1.1, take x h → x and y h → y weakly in V , and t h → +∞ such that Φ(t h x h , t h y h ) is bounded from above. Dividing by t h we obtain lim sup h→+∞
f (t h x h ) th
+
g (t h y h ) th
+ + x h + y h ≤ 0,
which implies that x h + y h → 0 strongly in V . By the compactness assumption (i), the convergence of x h and of y h is actually strong in V . + Since Φ∞ (x, y) = f ∞ (x) + g ∞ (y) x + y the necessary condition (ii) of Theorem 15.1.1 is fulfilled. It remains to prove the compatibility condition (iii). Since 4 5 ker Φ∞ = (x, y) ∈ V × V : x + y = 0, f ∞ (x) + g ∞ (y) ≤ 0 , the fact that ker Φ∞ is a subspace follows immediately from assumption (ii).
i
i i
i
i
i
i
15.2. Nonconvex minimization problems and topological recession
book 2014/8/6 page 617 i
617
Example 15.1.11. Let Ω be a bounded connected open subset of Rn and let X be the Sobolev space H 1 (Ω). We denote by (x, y) the points of Ω, where x represents some k coordinates and y the remaining n − k. Consider the functionals F (u) = |D x u|2 d x d y − 〈 f , u〉, Ω
G(u) =
Ω
|Dy u|2 d x d y − 〈g , u〉,
where f and g are in the dual space of X . The functionals F and G are convex and lower semicontinuous on H 1 (Ω); we want to see if their inf-convolution F #e G is still lower semicontinuous. By Proposition 15.1.7 above it is enough to verify the compactness assumption (i) and the compatibility assumption (ii). If t h → +∞, u h and v h converge weakly, u h + v h → 0 strongly, and F (t h u h )+G(t h v h ) is bounded from above, then dividing by t h2 we have that D x u h → 0 in L2 (Ω),
Dy v h → 0 in L2 (Ω),
which, together with the fact that u h + v h → 0 strongly in H 1 (Ω), implies that u h and v h converge actually strongly in H 1 (Ω). Finally, an easy calculation gives the expressions of the recession functions of F and G: < +∞ if Dy v = 0, +∞ if D x u = 0, ∞ ∞ F (u) = G (v) = −〈g , v〉 if Dy v ≡ 0. −〈 f , u〉 if D x u ≡ 0, Therefore, to prove the compatibility assumption, and hence the lower semicontinuity of F #e G, it is enough to assume that 〈 f − g , 1〉 = 0.
15.2 Nonconvex minimization problems and topological recession In this section we will consider general minimum problems of the form 4 5 min F (v) : v ∈ V ,
(15.36)
where F is a possibly nonconvex functional noncoercive as well. Problems of this kind arise, for instance, in nonlinear elasticity (see Section 11.2), and due to the lack of convexity the results of previous sections cannot be applied. We will introduce a new kind of recession functional for general nonconvex functions and we will prove an abstract existence result under lower semicontinuity, compactness, and compatibility conditions, which will be expressed by means of this new tool. As in Section 15.1, (V , σ) will denote a real locally convex Hausdorff topological vector space, and F : V → ] − ∞, +∞] will be a proper (not necessarily convex) mapping. Definition 15.2.1. The topological recession functional F∞ of F is defined for every v ∈ V by F∞ (v) = lim inf t →+∞ w→v
F (t w) t
.
(15.37)
The main properties of the functional F∞ are listed in the following proposition.
i
i i
i
i
i
i
618
book 2014/8/6 page 618 i
Chapter 15. Variational problems with a lack of coercivity
Proposition 15.2.1. We have the following: (i) F∞ is σ-lsc and positively homogeneous of degree 1. (ii) F∞ = F whenever F is σ-lsc and positively homogeneous of degree 1. (iii) F∞ = F ∞ whenever F is proper, convex, and σ-lsc. (iv) (F + G)∞ (v) ≥ F∞ (v) + G∞ (v) for every mapping G : V → ] − ∞, +∞] and for every v ∈ V such that the sum at the right-hand side is defined. (v) The equality (F + G)∞ = F∞ + G∞ holds in the following cases: (va ) F and G are proper, convex, σ-lsc, and dom F ∩ dom G = ; (v b ) G is positively homogeneous of degree 1, finite, and σ-continuous; (vc ) F is convex and σ-lower semicontinous, F (0) < +∞, G is σ-lsc and positively homogeneous of degree 1. PROOF. The proof of properties (i), (ii), (iv), (v b ) can be obtained immediately from Definition 15.2.1; property (va ) follows from property (iii) and from Proposition 15.1.1(iv); property (vc ) follows from property (iv) and from the fact that using properties (ii), (iii), and Proposition 15.1.1(i), we get for every v ∈ V F (t v) + G(t v) (F + G)∞ (v) ≤ lim inf t →+∞ t F (t v) = lim inf + G(v) t →+∞ t = F ∞ (v) + G(v) = F∞ (v) + G∞ (v). It remains to prove property (iii). Let v ∈ V and let v0 ∈ dom F be fixed; taking w t = v + v0 /t we get F∞ (v) ≤ lim inf t →+∞
F (t w t )
F (v0 + t v)
= lim inf t →+∞
t
t
= F ∞ (v).
On the other hand, by using the convexity and the lower semicontinuity of F , for every s > 0 we have s s F (v0 + s v) ≤ lim inf F 1 − v0 + t w t →+∞ t t w→v % & s s ≤ lim inf 1 − F (v0 ) + F (t w) t →+∞ t t w→v = F (v0 ) + s lim inf
F (t w)
t →+∞ w→v
Therefore,
F (v0 + s v) − F (v0 ) s
t
= F (v0 ) + s F∞ (v).
≤ F∞ (v)
for every s > 0, and passing to the limit as s → +∞, we obtain F ∞ (v) ≤ F∞ (v).
i
i i
i
i
i
i
15.2. Nonconvex minimization problems and topological recession
book 2014/8/6 page 619 i
619
Analogously to what we made in Section 15.2 in the convex case, for every nonempty subset K of V we may define the topological recession cone K∞ of K. Definition 15.2.2. Let K be a nonempty subset of V . The topological recession cone K∞ of K is defined by K∞ = dom(χK )∞ . (15.38) Remark 15.2.1. By Definition 15.2.1 it follows that a point u belongs to K∞ iff 1 ∀s > 0 ∀U ∈ ℑ(u) ∃t > s : U ∩ K = , t where ℑ(u) denotes the family of all σ-neighborhoods of u. In other words, 6 7 31 ' K∞ = clσ K , t t >s s >0
(15.39)
where clσ denotes the closure with respect to σ. The following proposition contains a list of properties of the topological recession cones. Proposition 15.2.2. We have the following: (i) K∞ is a σ-closed cone (possibly nonconvex); (ii) K∞ = K whenever K is a σ-closed cone; (iii) K∞ = K ∞ whenever K is a nonempty convex σ-closed set; (iv) K∞ = {0} whenever K is bounded (the converse is false even in the convex case, as shown in Example 15.1.7); (v) if V is finite dimensional, then K∞ = {0} implies that K is bounded; (vi) K∞ = (K ∪ H )∞ = (K + H )∞ for every bounded subset H of V ; (vii) (epi F )∞ = epi F∞ for every proper function F : V → ] − ∞, +∞]. PROOF. Properties (i), (ii), (iii) follow from the definition (15.38) of K∞ and from Proposition 15.2.1(i), (ii), (iii), respectively. To prove property (iv), let U be a closed neighborhood of 0; since K is bounded, there exists s > 0 such that K ⊂ t U for every t > s. By the characterization of K∞ given by (15.39) this implies that K∞ ⊂ U , and since U is arbitrary, we get K∞ = {0}. Let us prove property (v). Assume by contradiction that K is unbounded; then for every h ∈ N there exists x h ∈ K with |x h | ≥ h. Since the sequence y h = x h /|x h | is bounded, we may extract a subsequence (still denoted by y h ) converging to some y ∈ V with |y| = 1. Therefore (χK )∞ (y) ≤ lim inf χK (|x h |y h ) h→+∞
= lim inf χK (x h ) = 0, h→+∞
so that y ∈ K∞ , and this is impossible because y = 0.
i
i i
i
i
i
i
620
book 2014/8/6 page 620 i
Chapter 15. Variational problems with a lack of coercivity
Let us prove property (vi). Since the inclusion K∞ ⊂ (K ∪H )∞ is obvious, it is enough to prove the opposite inclusion. Let x ∈ (K ∪ H )∞ ; if x = 0 we have x ∈ K∞ because, by (i), K∞ is a closed cone. If x = 0 let U0 , U be two disjoint neighborhoods of 0 and x, respectively. Since H is bounded, there exists s0 > 0 such that H ⊂ t U0 for every t > s0 ; moreover, since x ∈ (K ∪ H )∞ , by (15.39) we have ∀s > 0
∀W x
1 ∃t > s : W x ∩ (K ∪ H ) = , t
(15.40)
where we denoted by W x a generic neighborhood of x. When s ≥ s0 and W x ⊂ U we have 1 ∀t > s, W x ∩ H ⊂ U ∩ U0 = t so that, by (15.40), 1 W x ∩ K = . t By (15.39) this proves that x ∈ K∞ . The equality K∞ = (K ∪ H )∞ can be proved in a similar way. Let us now prove property (vii). If (u, ξ ) ∈ (epi F )∞ , denoting by U and I generic neighborhoods of u and ξ , respectively, by (15.39) we have ∀s > 0
∀U
∀I
∃t > s
∃v ∈ U
∃η ∈ I : F (t v) ≤ t η.
Therefore F∞ (u) ≤ ξ , so that (u, ξ ) ∈ epi F∞ . On the other hand, if (u, ξ ) ∈ epi F∞ , we have F∞ (u) ≤ ξ ; hence by the definition of F∞ we obtain ∀s > 0 ∀U
∀η > ξ
∃t > s
∃v ∈ U :
F (t v) t
< η.
Therefore (u, ξ ) ∈ epi F∞ . Remark 15.2.2. It can be useful to give a characterization of the topological recession functional F∞ in terms of converging nets: actually, it is easy to show that for every v ∈ V F (tλ vλ) F∞ (v) = inf lim inf : tλ → +∞, vλ → v , (15.41) λ∈Λ tλ where Λ is an arbitrary directed set and (tλ ), (vλ ) are nets indexed by Λ. Recall that a directed set is a set together with a relation ≥ which is both transitive and reflexive such that for any two elements a, b there exists another element c with c ≥ a and c ≥ b . Definition (15.37) or the equivalent one (15.41) is very general and provides a good extension of the notion of convex recession functional introduced in the last section. However, in many situations, it is not easy to deal with neighborhoods or nets; for this reason, we introduce now a new definition of recession functional, in which only the behavior of converging sequences is involved. More precisely, for every v ∈ V we set F (t h v h ) s eq : t h → +∞, v h → v , (15.42) F∞ (v) = inf lim inf h→+∞ th where (t h ) and (v h ) are sequences. It is clear from (15.41) and (15.42) that F∞ ≤ F∞s eq , and that F∞ = F∞s eq whenever the space (V , σ) is metrizable. Moreover, by a proof similar to
i
i i
i
i
i
i
15.2. Nonconvex minimization problems and topological recession
book 2014/8/6 page 621 i
621
the one of Proposition 15.2.1, it is possible to show that the following properties for F∞s eq hold. Proposition 15.2.3. We have the following: (i) F∞s eq is positively homogeneous of degree 1. (ii) F∞s eq = F whenever F is sequentially σ-lsc and positively homogeneous of degree 1. (iii) F∞s eq = F whenever F is proper, convex, and sequentially σ-lsc. s eq s eq (v) ≥ F∞s eq (v) + G∞ (v) for every mapping G : V → ] − ∞, +∞] and for (iv) (F + G)∞ every v ∈ V such that the sum at the right-hand side is defined. s eq s eq (v) = F∞s eq (v) + G∞ (v) holds in the following cases: (v) The equality (F + G)∞
(va ) F and G are proper, convex, sequentially σ-lsc, and dom F ∩ dom G = ; (v b ) G is positively homogeneous of degree 1 and sequentially σ-continuous; (vc ) F is convex and sequentially σ-lsc, F (0) < +∞, G is sequentially σ-lsc and positively homogeneous of degree 1. Remark 15.2.3. We point out that in general the functional F∞s eq is neither σ-lsc nor sequentially σ-lsc. However, this simpler definition will be sufficient to obtain an existence theorem for minimizers (Theorem 15.2.1) which will be used in many applications. We stress the fact that the explicit computation of F∞s eq may be difficult; nevertheless, to apply the existence result of Theorem 15.2.1, in many cases it will be enough to show qualitative properties of F∞s eq which are easy to obtain thanks to Proposition 15.2.3. In an analogous way, for every nonempty subset K of V we may introduce the ses eq by setting quential recession cone K∞ s eq s eq = dom(χK )∞ . K∞
(15.43)
In other words, it is s eq x ∈ K∞
⇐⇒
∃t h → +∞
∃x h → x
∀h ∈ N
t h x h ∈ K.
s eq turns out to be a cone (possibly not sequentially σ-closed) with vertex 0 and, The set K∞ by a proof similar to the one of Proposition 15.2.2, we obtain the following properties.
Proposition 15.2.4. We have the following: s eq = K if K is a sequentially σ-closed cone. (i) K∞ s eq = K ∞ if K is a nonempty sequentially σ-closed set. (ii) K∞ s eq (iii) K∞ = {0} if K is bounded. s eq = {0} implies K bounded if V is finite dimensional. (iv) K∞ s eq s eq s eq = (K ∪ H )∞ = (K + H )∞ if H ⊂ V is bounded. (v) K∞
i
i i
i
i
i
i
622
book 2014/8/6 page 622 i
Chapter 15. Variational problems with a lack of coercivity
The following result provides a general necessary condition for the existence of minimizers. Proposition 15.2.5. Assume that 4 5 inf F (v) : v ∈ V > −∞. Then we have F∞ (v) ≥ 0 ∀v ∈ V (hence F∞s eq ≥ 0, because F∞ ≤ F∞s eq ). PROOF. Let m be the infimum of F on V , and let v ∈ V . By the definition of F∞ we get F∞ (v) = lim inf t →+∞ w→v
F (t w) t
≥ lim inf t →+∞
m t
= 0.
We recall that even in the convex case, the condition F ∞ ≥ 0 is not sufficient for the existence of a solution of problem 4 5 min F (v) : v ∈ V (15.44) (see, for instance, Example 15.1.9). To show an existence result for problem (15.44), analogously to what was done in Section 15.1 we set 4 5 ker F∞s eq = v ∈ V : F∞s eq (v) = 0 . Theorem 15.2.1. Assume V is a Banach space with norm ·, let σ be a topology on V coarser than the norm topology and such that (V , σ) is a Hausdorff vector space with the closed unit ball of (V , ·) sequentially σ-compact, and let F : V → ]−∞, +∞] be a sequentially σ-lower semicontinuous functional. Assume also that the following conditions are satisfied: (i) compactness: if t h → +∞, v h → v with respect to σ, and F (t h v h ) is bounded from above, then v h − v → 0; (ii) necessary condition: F∞s eq (v) ≥ 0 for every v ∈ V ; (iii) compatibility: for every u ∈ ker F∞s eq there exists t > 0 such that F (v − t u) ≤ F (v) for all v ∈ V . Then the minimum problem (15.44) has at least a solution. PROOF. The proof is similar to the one of Theorem 15.1, and we will follow it step by step. Step 1. For every h ∈ N let v h be a solution of the minimum problem 4 5 min F (v) : v ≤ h . (℘ h ) Since F is sequentially σ-lsc and {v ∈ V : v ≤ h} is sequentially σ-compact, by the direct method of the calculus of variations (see Corollary 3.2.3) we obtain that there exists a solution v h of problem (℘ h ). Moreover, by the lower semicontinuity of F and by the assumptions on σ, we may choose v h such that 4 5 (15.45) v h = min w : w solves (℘ h ) .
i
i i
i
i
i
i
15.2. Nonconvex minimization problems and topological recession
book 2014/8/6 page 623 i
623
Step 2. If the sequence (v h ) is bounded in norm, then by the lower semicontinuity of F and by the σ-compactness of bounded sets, the existence of a solution of problem (15.44) follows from the direct method of the calculus of variations. Indeed, if (v hk ) is a subsequence of (v h ) which is σ-convergent to some v ∈ V , we have 4 5 F (v) ≤ lim inf F (v hk ) = inf F (v) : v ∈ V . k→+∞
Step 3. It remains to show that the case (v h ) unbounded cannot occur. By contradiction, assume that a subsequence of v h (which we still index by h) tends to +∞. Since the normalized vectors w h = v h /v h are bounded, there exists a subsequence of (w h ) (which we still index by h) which σ-converges to some w ∈ V . We have F (v h+1 ) ≤ F (v h )
∀h ∈ N,
so that F (v h ) is bounded from above. Hence F∞s eq (w) ≤ lim inf h→+∞
F (v h w h ) v h
= lim inf h→+∞
F (v h ) v h
≤ 0,
which, together with the necessary condition (ii), gives w ∈ ker F∞s eq .
(15.46)
Step 4. We have v h → +∞, w h → w with respect to σ, and F (v h w h ) = F (v h ) bounded from above. By the compactness assumption (i) we obtain w h − w → 0, and this prevents w from being zero, because w h = 1 for all h ∈ N. From (15.46) and from the compatibility condition (iii) we get that there exists t > 0 such that F (v h − t w) ≤ F (v h )
∀h ∈ N.
(15.47)
Finally, ) ) ) ) t ) ) v h − t w = ) 1 − v h + t (w h − w)) ) ) v h t ≤ 1− v h + t w h − w v h = v h + t w h − w − 1 . The right-hand side of the last equality is strictly less than v h for h large enough, and this is in contradiction with (15.47) and (15.45). Remark 15.2.4. Looking at the proof of Theorem 15.2.1 we see that the compactness condition (i) can be imposed only on sequences (v h ) which σ-converge to an element v ∈ ker F∞s eq . Remark 15.2.5. In Theorem 15.2.1 a weaker form of the compatibility condition (iii) can be used, namely, for every u ∈ ker F∞s eq there exists R > 0 such that for every v ∈ V with v ≥ R there exists t > 0 such that F (v − t u) ≤ F (v).
i
i i
i
i
i
i
624
book 2014/8/6 page 624 i
Chapter 15. Variational problems with a lack of coercivity
Moreover, an inspection of the proof of Theorem 15.2.1 shows that in the case of a product space V1 × · · · × Vn the compatibility condition (iii) can be replaced by the following weaker one: for every (u1 , . . . , un ) ∈ ker F∞s eq there exist positive numbers t1 , . . . , tn such that F (v1 − t1 u1 , . . . , vn − tn un ) ≤ F (v1 , . . . , vn )
∀ (v1 , . . . , vn ) ∈ V1 × · · · × Vn .
Remark 15.2.6. Theorem 15.2.1 includes in a certain sense the classical direct method of the calculus of variations, which gives the existence of a solution of problem (15.44) under the following assumptions: (i) F is sequentially σ-lsc. (ii) There exist α > 0 and b ∈ R such that F (v) ≥ αv + b
∀v ∈ V .
Indeed, by (ii) we obtain t h → +∞, v h → v, F (t h v h ) ≤ c
⇒
v h → 0 and v = 0,
so that the compactness hypothesis (i) is satisfied. Moreover, by (iii) again we get F∞s eq (v) ≥ αv
∀v ∈ V ,
so that ker F∞s eq reduces to {0}. Hence, hypotheses (ii) and (iii) of Theorem 15.2.1 are also satisfied, and problem (15.44) admits at least a solution. In particular (ii) holds if dom F is bounded. As in Section 15.1, we now specialize our results to the case when the functional F is of the form F (v) = J (v) − 〈L, v〉 + χK (v), where
J : V → [0, +∞] is a sequentially σ-lsc functional; L : V → R is a linear σ-continuous functional; K ⊂ V is a sequentially σ-closed set.
Taking into account Propositions 15.39 and 15.40, we have that a necessary condition for the existence of a solution of the minimum problem 4 5 min J (v) − 〈L, v〉 : v ∈ K is given by s eq J∞ (v) ≥ 〈L, v〉
∀v ∈ K,
(15.48)
whereas the compatibility condition (iii) can be written as follows: s eq s eq with J∞ (u) = 〈L, u〉, there exists t > 0 such that for every v ∈ K (iii ) for every u ∈ K∞
n/2. Remark 15.3.3. The argument used in the proof of Proposition 15.3.2 applies also to problems of the form 1 2 + 1 (15.57) min |Dv| + B(v ) d x − 〈L, v〉 : v ∈ H (Ω) Ω 2 for any convex and strictly increasing B : R+ → R+ . In this case the associated Euler– Lagrange equation reads as < −Δu + ∂ B(u + ) # h in Ω, (15.58) ∂ u/∂ ν = g on ∂ Ω, where L = h + g and ∂ B is the subdifferential of the convex function B, whereas the associated linearized equation coincides with (15.56). Remark 15.3.4. Consider the obstacle (from below) problem (see also [223]) 1 |Dv|2 d x − 〈L, v〉 : v ∈ H 1 (Ω), v ≤ 0 . min Ω 2
(15.59)
By introducing the function < B(s) =
0 +∞
if t ≤ 0, if t > 0,
problem (15.59) can be written in the form (15.58) and the argument above works in the same way. The Euler–Lagrange equation is usually written in this case as a variational inequality: 1 u ∈ H (Ω), u ≤ 0 D uD(v − u) d x − 〈L, v − u〉 ≥ 0 ∀v ∈ H 1 (Ω), v ≤ 0. Ω
i
i i
i
i
i
i
628
book 2014/8/6 page 628 i
Chapter 15. Variational problems with a lack of coercivity
Consider now, in a Hilbert space V , a linear continuous functional L : V → R, a closed convex subset K ⊂ V , and a symmetric continuous bilinear form a : V × V → R which we assume to be nonnegative, that is, a(v, v) ≥ 0
∀v ∈ V .
If F denotes the functional 1 F (v) = a(v, v) − (L, v) 2
∀v ∈ V ,
we may consider the minimum problem min{F (v) : v ∈ K}.
(15.60)
Proposition 15.3.3. The following conditions are equivalent: (i) There exists u ∈ K such that F (u) ≤ F (v) for every v ∈ K. (ii) There exists u ∈ K such that a(u, u − v) ≤ (L, u − v) for every v ∈ K. PROOF. Assume (i). Then, since K is convex, for every v ∈ V we have t v + (1 − t )u ∈ K and the map t ∈ [0, 1], gv (t ) = F t v + (1 − t )u , achieves its minimum at t = 0, hence 0 ≤ gv (0) = −a(u, u) + a(u, v) − (L, v) + (L, u), that is, (ii). Assume now (ii). We have to prove that the map gv above achieves its minimum at t = 0. Since gv (t ) = a(v, v) + a(u, u) − 2a(u, v) ≥ 0, the function gv (t ) is convex on [0, 1]. Therefore it is enough to show that gv (0) ≥ 0, which holds true because gv (0) = −a(u, u) + a(u, v) − (L, v) + (L, u). Remark 15.3.5. Since the map v → a(v, v) is quadratic, the recession functional associated to F is given by < −(L, v) if a(v, v) = 0, ∞ F (v) = +∞ otherwise; hence (see Theorem 15.1.2), so that a solution of (15.60) exists, a necessary condition is (L, v) ≤ 0
∀v ∈ K ∞ ∩ ker a,
whereas, if a satisfies the compactness condition v h → 0 weakly, a(v h , v h ) → 0
→
v h → 0 strongly,
a sufficient one is K ∞ ∩ ker a ∩ ker L is a subspace.
i
i i
i
i
i
i
15.3. Some examples
book 2014/8/6 page 629 i
629
Problem (15.60) written in the form (ii) above is called a variational inequality. For instance, the obstacle problems are of this type, where the Hilbert space is H 1 (Ω) and the convex set K is 4 5 K = v ∈ H 1 (Ω) : v ≥ ψ q.e. in Ω , where q.e. is intended in the sense of capacity (see Section 5.8). Let us consider more particularly this situation in the case the bilinear form a is given by means of an elliptic operator, n ai j (x)Di uD j u d x, a(u, v) = Ω i , j =1
whose coefficient ai j are symmetric and measurable and satisfy the ellipticity condition c1 |ξ |2 ≤
n i , j =1
ai j (x)ξi ξ j ≤ c2 |ξ |2
∀ξ ∈ Rn .
We denote by A the corresponding elliptic operator Au = −
n i , j =1
Di ai j (x)D j u .
Proposition 15.3.4. The obstacle problem (15.60), equivalent to the variational inequality (ii) of Proposition 15.3.3, is also equivalent to the complementary problem ⎧ ⎨ u − ψ ≥ 0, (15.61) Au − L ≥ 0, ⎩ 〈Au − L, u − ψ〉 = 0. PROOF. Let u be a solution of the obstacle problem (15.60); by formulation (ii) of Proposition 15.3.3, for every nonnegative function φ ∈ H 1 (Ω), taking v = u + φ, we obtain 〈Au, φ〉 ≥ 〈L, φ〉, that is, Au − L ≥ 0. On the other hand, taking v = ψ we have 0 ≤ 〈Au − L, u − ψ〉 = 〈Au, u − ψ〉 − 〈L, u − ψ〉 ≤ 0, that is, 〈Au − L, u − ψ〉 = 0. On the other hand, if u verifies (15.61), writing a generic v ≥ ψ in the form ψ + φ with φ ≥ 0, we have 〈Au − L, u − v〉 = 〈Au − L, u − ψ〉 − 〈Au − L, φ〉 ≤ 0, so that by Proposition 15.3.3, u solves the obstacle problem. Remark 15.3.6. Since Au − L is a nonnegative distribution, by the Riesz–Schwartz theorem there exists a nonnegative Borel measure μ on Ω such that Au − L = μ. Moreover, by (15.61) the measure μ is concentrated on the coincidence set {u = ψ}. For further details on the obstacle problems see, for instance, the book by Kinderlehrer and Stampacchia [258]. Here we want only to remark that the obstacle problem is a mathematical model for determining the shape of a thin elastic membrane subject to a vertical load L and where
i
i i
i
i
i
i
630
book 2014/8/6 page 630 i
Chapter 15. Variational problems with a lack of coercivity
the unknown u represents the vertical displacement of the membrane. The measure μ in this framework has a natural interpretation as the upward force due to the constraint reaction of the rigid obstacle. As a further example we consider now the case of minimum problems of the form
T
%
min 0
1 2
&
2
n
1
|u | + V (t , u) d t : u ∈ H (0, T ; R ), u(0) = u(T ) ,
(15.62)
where V is a Borel function, with V (t , ·) lower semicontinuous on Rn , and such that V (t , s) ≥ −a(t ) − b (t )|s|q
for a.e. t ∈ (0, T ) and for every s ∈ Rn
(15.63)
for suitable q < 2 and a(t ), b (t ) in L1 (0, T ). If V (t , ·) is smooth the minimum problem above has the Euler–Lagrange equation < −u + ∇V (t , u) = 0, (15.64) u(0) = u(T ), u (0) = u (T ). We are therefore looking for solutions of the differential equations −u + ∇V (t , u) = 0 which are periodic on the interval [0, T ]. It is convenient to denote by HT1 the space 5 4 u ∈ H 1 (0, T ; Rn ) : u(0) = u(T ) and by F the functional
T
%
F (u) =
1 2
0
& |u |2 + V (t , u) d t .
Notice that in general the functional F is not convex. Nevertheless the sequential weak lower semicontinuity of F is straightforward. Let us prove now the compactness property (i) of Theorem 15.2.1. If t h → +∞ and u h → u weakly in HT1 , with F (t h u h ) bounded T from above, then we deduce by (15.63) that 0 |u h |2 d t → 0, and this implies immediately the strong convergence of the sequence {u h } to a constant. To prove the necessary condition F∞ (u) ≥ 0 for every u ∈ HT1 , we notice that it is trivially fulfilled when u is a nonconstant function, because in this case we have F∞ (u) = +∞. Indeed, if t h → +∞ and u h → u weakly in HT1 with u nonconstant, we have
T
lim inf h→+∞
0
|u h |2 d t ≥
T
|u |2 d t > 0.
0
Therefore, by (15.63) and using the fact that q < 2, we obtain lim inf h→+∞
F (t h u h ) th
≥ lim inf h→+∞
0
T
%
th 2
& q−1 |u h |2 − b (t )t h |u h |q
d t = +∞.
The necessary condition F∞ (u) ≥ 0 is then reduced to F∞ (c) ≥ 0
for every c ∈ Rn .
(15.65)
Here we assume (15.65) is satisfied and we will particularize some special cases when additional assumptions on the potential V are made. When V (t , ·) is convex on Rn , then the functional F turns out to be convex, and we are in the framework of Section 13.1. Assume there exists a function u0 ∈ HT1 such that
i
i i
i
i
i
i
15.3. Some examples
book 2014/8/6 page 631 i
631
V t , u0 (t ) is integrable and introduce for every c ∈ Rn the convex function Φc : R → ] − ∞, +∞] given by T V t , u0 (t ) + c r d t . Φc (r ) = 0
Then by applying Theorem 15.1.1 it is easy to see that the existence of at least a solution to problem (15.62) occurs provided for every c ∈ Rn the function Φc is either constant or such that lim Φc (r ) = +∞. (15.66) |r |→+∞
For instance, if V (t , s) = V (s) − f (t )s we have that condition (15.66) is fulfilled for every f ∈ L1 (0, T ; Rn ) whenever the function V (s) has a superlinear growth, that is, V (s)
lim
|s |→+∞
|s|
= +∞.
Another case in which the existence of minimizers for problem (15.62) can be easily obtained is when the potential V satisfies a Lipschitz condition of the form for a.e. t ∈ (0, T ) and for every s1 .s2 ∈ Rn . (15.67)
|V (t , s1 ) −V (t , s2 )| ≤ k(t )|s1 − s2 |
Here we assume that k ∈ L1 (0, T ), and that the potential V satisfies the coercivity condition T
V (t , s) d t = +∞.
lim
|s |→+∞
(15.68)
0
In this case problem (15.62) is actually coercive, and the existence result then follows immediately from the direct methods of the calculus of variations of Section 3.2. Indeed, by using the Lipschitz condition (15.67) we obtain for every u ∈ HT1
T
%
F (u) ≥
1 2
0
2
|u | + V t , u(0)
&
T
dt −
k(t )|u(t ) − u(0)| d t . 0
Moreover, since 1/2 T t u (s) d s ≤ T |u |2 d t , |u(t ) − u(0)| = 0 0 we deduce that F (u) ≥ 0
T
%
1 2
& |u |2 + V t , u(0) d t − C
T
|u |2 d t
1/2
0
for a suitable positive constant C . Now, by using the coercivity assumption (15.68), and the fact that on HT1 the norm is equivalent to
T
2
1/2 2
|u | d t + |u(0)|
,
0
i
i i
i
i
i
i
632
book 2014/8/6 page 632 i
Chapter 15. Variational problems with a lack of coercivity
we obtain that uH 1 → +∞ yields F (u) → +∞, hence the coercivity of F . The lower T semicontinuity of F is a straightforward consequence of the results of Section 13.1. Consider now the case when the function V (t , ·) is periodic. More precisely, we assume that the function V is nonnegative, or more generally bounded from below by an L1 function, and that for suitable independent vectors τi ∈ Rn we have for a.e. t ∈ (0, T ) and for every s ∈ Rn V (t , s + τi ) = V (t , s),
i = 1, . . . , n.
Note that in this case the functional F is not convex in general and that, by the nonnegativity of V , the necessary condition (15.65) is clearly fulfilled. Thus, to apply Theorem 15.2.1 and obtain the existence of a minimizer for F , it is enough to show that the compatibility condition (iii ) of Remark 15.2.5 holds. In other words we have to find for every c ∈ Rn a vector μ ∈ Rn whose components μi are positive, such that F (u1 − μ1 c1 , . . . , un − μn cn ) ≤ F (u1 , . . . , un )
∀u ∈ HT1 .
This can be achieved if we choose, for instance, μi = |τi |/|ci | for all i = 1, . . . , n with μi = 1 if ci = 0.
15.4 Limit analysis problems We consider the so-called limit analysis problems which consist in minimizing functionals of the form F (x) − γ L(x), x ∈ X , where X is a normed space, F : X → ] − ∞, +∞] is a (possibly nonconvex) functional, and L ∈ X . We are interested in characterizing the values of γ for which the minimum is attained. As an application we consider nonconvex minimum problems defined on the space of measures and on the BV space. Let X be a normed space; we consider on X the norm topology τ and another (linear Hausdorff) topology σ weaker than τ and such that the unit ball {x ∈ X : x ≤ 1} is σ-compact. Let F : X → ] − ∞, +∞] be a functional proper and σ-lsc (on all τ-bounded sets), and let L : X → R be a linear σ-continuous functional. The limit analysis problem associated to F and L consists in finding the values γ ∈ R for which the minimum problem min{F (x) − γ L(x) : x ∈ X }
(15.69)
admits at least a solution. The problem when F is convex was studied by Bouchitté and Suquet in [118], where the following result is proved. Theorem 15.4.1. Assume that F is proper, convex, σ-lsc, and σ-coercive in the sense that lim F (x) = +∞.
x→+∞
Then, setting γ ∗ = min{F ∞ (x) : x ∈ X , L(x) = 1}, γ∗ = − min{F ∞ (x) : x ∈ X , L(x) = −1},
i
i i
i
i
i
i
15.4. Limit analysis problems
book 2014/8/6 page 633 i
633
the following statements are equivalent: (i) F − γ L is σ-coercive. (ii) γ∗ < γ < γ ∗ . PROOF. Since F is proper, there exists x0 ∈ X such that F (x0 ) < +∞; then, by considering the functional F (x + x0 ) − F (x0 ) it is easy to see that we may reduce ourselves to assume without any loss of generality that F (0) = 0. Assume now (i). By the properties of recession functions we have (F − γ L)∞ (x) ≥
F (t x) − γ L(t x)
∀t > 0, ∀x ∈ X ;
t
if for some x = 0 we had (F − γ L)∞ (x) ≤ 0, this would be in contradiction to the coerciveness of F − γ L. Then, taking as x ∗ the solution of 4 5 γ ∗ = min F ∞ (x) : L(x) = 1 , we obtain
0 < (F − γ L)∞ (x ∗ ) = F ∞ (x ∗ ) − γ L(x ∗ ) = γ ∗ − γ .
Analogously, if x∗ is the solution of
4 5 −γ∗ = min F ∞ (x) : L(x) = −1 ,
we get
0 < (F − γ L)∞ (x∗ ) = F ∞ (x∗ ) − γ L(x∗ ) = −γ ∗ + γ ,
that is, (ii). Assume now (ii) and let 0 < γ < γ ∗ . By contradiction, if F − γ L were not coercive, we could find a sequence (x h ) such that x h → +∞ and (F − γ L)(x h ) ≤ M
(15.70)
for a suitable M ∈ R. Note that L(x h ) → +∞ because otherwise (15.70) and the coerciveness of F would prove that (x h ) is bounded. Setting y h = x h /x h and t h = x h we get that (y h ) is σ-relatively compact, hence (up to subsequences) converging to a suitable y ∈ X . By using the lower semicontinuity and convexity of F , and the fact that F (0) = 0, we have for every t > 0 F (t y) t
≤ lim inf
F (t y h )
h→+∞
≤ lim inf γ h→+∞
so that, as t → +∞,
t L(x h ) x h
≤ lim inf h→+∞
F (t h y h ) th
= γ L(y)
F ∞ (y) ≤ γ L(y).
(15.71) ∞
Being F lower semicontinuous and σ-coercive, by Proposition 15.1.3 we have F ≥ 0, hence L(y) ≥ 0 by (15.71). The case L(y) > 0 must be excluded because, taking z = y/L(y) we get L(z) = 1 and F ∞ (y) γ ∗ ≤ F ∞ (z) = ≤ γ, L(y)
i
i i
i
i
i
i
634
book 2014/8/6 page 634 i
Chapter 15. Variational problems with a lack of coercivity
which contradicts (ii). Therefore L(y) = 0 and, by (15.71), F ∞ (y) = 0. Arguing as in the first part of the proof we get that by the coerciveness of F , this implies y = 0, so that using (15.70) and recalling that L(x h ) → +∞, xh F (x h ) M F ≤ ≤ + γ. L(x h ) L(x h ) L(x h ) Hence x h /L(x h ) is bounded in X (by the coerciveness of F ), and so y h /L(y h ) is bounded too. But this contradicts the fact that y h = 1 whereas L(y h ) → 0. An analogous argument can be used in the case γ∗ < γ < 0. When F is not necessarily convex we have the following result. Theorem 15.4.2. Assume that (i) F is proper and sequentially σ-lsc; (ii) for all z ∈ ker F∞ there exists η = η(z) > 0 such that F (x − ηz) ≤ F (x)
∀x ∈ X ;
(iii) there exist a seminorm P : X → [0, +∞[ satisfying the compactness condition (2.3.11) and a number C ≥ 0 such that F (x) ≥ P (x) − C
∀x ∈ X .
Then, setting γ ∗ = inf{F∞ (x) : L(x) = 1}, γ∗ = − inf{F∞ (x) : L(x) = −1}, we have the following: (a) if the minimum problem (15.69) admits a solution, then γ∗ ≤ γ ≤ γ ∗ ; (b) if γ∗ < γ < γ ∗ , then the minimum problem (15.69) admits a solution. PROOF. To prove statement (a) we apply Proposition 15.2.5. The necessary condition for the existence of a solution to problem (15.69) is (F − γ L)∞ ≥ 0
on X ,
which, thanks to Proposition 15.2.1, becomes F∞ (x) ≥ γ L(x)
∀x ∈ X .
(15.72)
Taking L(x) = −1 in (15.72) gives 5 4 γ ≥ sup − F∞ (x) : L(x) = −1 = γ∗ ; similarly, taking L(x) = 1 in (15.72) gives 4 5 γ ≤ inf F∞ (x) : L(x) = 1 = γ ∗ .
i
i i
i
i
i
i
15.4. Limit analysis problems
book 2014/8/6 page 635 i
635
To prove statement (b) we are going to verify all the hypotheses of Theorem 15.2.1 for the functional G = F − γ L which is clearly sequentially σ-lsc. As seen in the proof of statement (a), the necessary condition (ii) of Theorem 15.2.1 is equivalent to inequalities γ∗ ≤ γ ≤ γ ∗ . To verify the compactness hypothesis (i) of Theorem 15.2.1, let t h → +∞ and let x h → x be a sequence σ-converging in X such that F (t h x h ) − γ L(t h x h ) ≤ C . Dividing by t h we get
F (t h x h ) th
− γ L(x h ) ≤
(15.73)
C th
so that, by definition of the topological recession functional, F∞ (x) ≤ γ L(x). If L(x) > 0 we would obtain F∞
x L(x)
4 5 ≤ γ < γ ∗ = inf F∞ (y) : L(y) = 1 ,
which gives a contradiction. In an analogous way we can exclude the case L(x) < 0. Therefore we have L(x) = 0 and so also F∞ (x) = 0. Now, if L(t h x h ) is bounded from above, by (15.73) we would get P (t h x h ) ≤ C , and so, since P satisfies the compactness assumption (ii) of Theorem 15.2.1, we would obtain x h → x strongly in X . Otherwise, if L(t h x h ) (or a subsequence of it) tends to +∞, dividing by L(t h x h ) in (15.73) we get P
xh L(x h )
≤ C.
Since L(x h ) → L(x) = 0, by the compactness assumption of P we obtain again x h → x. Therefore the compactness hypothesis (2.3.11) is satisfied. Finally we verify the compatibility conditions (iii) of Theorem 15.2.1. Let z ∈ ker(F∞ − γ L), i.e., F∞ (z) = γ L(z). As before, using the strict inequalities γ∗ < γ < γ ∗ we obtain F∞ (z) = L(z) = 0, and so, by assumption (ii) F (x − ηz) − γ L(x − ηz) = F (x − ηz) − γ L(x) ≤ F (x) − γ L(x) for all x ∈ X , that is, the compatibility condition (iii) is satisfied.
i
i i
i
i
i
i
636
book 2014/8/6 page 636 i
Chapter 15. Variational problems with a lack of coercivity
Remark 15.4.1. If the infimum 4 5 inf F∞ (x) : L(x) = 1
4 5 respectively, inf F∞ (x) : L(x) = −1
is not attained, then we can also accept in Theorem 15.4.2 (b) γ = γ ∗ (respectively, γ = γ∗ ). As an application of the limit analysis theorems above we consider the case of functionals defined on measures. We refer to Section 13.3 for the theory of convex functionals on measures and to Bouchitté and Buttazzo [110], [111], [112] for further details on nonconvex functionals defined on measures. Consider a measure space (Ω, , μ), where Ω is a separable locally compact metric space, is the σ-algebra of all Borel subsets of Ω, and μ : → [0, +∞[ is a positive, finite, nonatomic measure. Consider a functional defined on M(Ω; Rn ) of the form first considered by Bouchitté and Buttazzo [110], dλ dμ + F (λ) = f f ∞ (λ s ) + g (λ(x)) d #. (15.74) dμ Ω Ω\Aλ Aλ Here f : Rn → [0, +∞] is a proper, convex, lower semicontinuous function with f (0) = 0; f ∞ is its recession function; g : Rn → [0, +∞] is a lower semicontinuous function with g (0) = 0 satisfying the subadditivity condition g (s1 + s2 ) ≤ g (s1 ) + g (s2 )
∀s1 , s2 ∈ Rn ;
λ = dd μλ μ + λ s is the Lebesgue–Nikodým decomposition of λ into absolutely continuous and singular parts with respect to μ; Aλ is the set of all atoms of λ; λ(x) is the value λ({x}); # is the counting measure. As already recalled, Bouchitté and Buttazzo [110] proved that if the condition f ∞ (s) = lim+ t →0
g (t s) t
∀s ∈ Rn
is fulfilled, then the functional (15.74) is sequentially weakly∗ -lsc on M(Ω; Rn ). For our purposes, it is convenient to introduce for every function g : Rn → [0, +∞] the functions g (t s) g ∞ (s) = lim inf , t →+∞ t g (t s) . g 0 (s) = lim sup + t t →0 The following proposition holds (see [110]). Proposition 15.4.1. Let g : Rn → [0, +∞[ be a lower semicontinuous and subadditive function with g (0) = 0. Then, we have the following:
i
i i
i
i
i
i
15.4. Limit analysis problems
book 2014/8/6 page 637 i
637
(i) the functions g 0 and g ∞ are convex, lower semicontinuous, and positively 1-homogeneous; (ii) g 0 (s) = sup t >0
g (t s ) t
= lim t →0+
(iii) g ∞ (s) = inf t >0
g (t s ) t
= lim t →+∞
g (t s ) t
for every s ∈ Rn ;
g (t s ) t
for every s ∈ Rn .
Remark 15.4.2. From Proposition 15.72 it follows easily that g ∞ (s) ≤ g (s) ≤ g 0 (s)
for every s ∈ Rn .
The following theorem gives sufficient conditions on f and g to apply Theorem 15.4.2 to functionals of the form (15.74). Theorem 15.4.3. Let f : Rn → [0, +∞], and g : Rn → [0, +∞[ be given functions. Assume that (i) f is convex, lower semicontinuous, and proper on Rn , and f (0) = 0; (ii) there exist C1 > 0 and D ∈ R such that ∀x ∈ Rn ;
f (x) ≥ C1 |x| − D
(iii) g is lower semicontinuous and subadditive on Rn , and g (0) = 0; (iv) there exists C2 > 0 such that g (x) ≥ C2 |x|
∀x ∈ Rn ;
(v) g 0 = f ∞ in Rn ; (vi) H ∈ C0 (Ω; Rn ). Let F : M(Ω; Rn ) → [0, +∞] be the functional defined in (15.74). Then, setting γ ∗ = inf{F∞ (λ) : 〈H , λ〉 = 1}, γ∗ = − inf{F∞ (λ) : 〈H , λ〉 = −1}, we have the following: (a) if the functional F − γ 〈H , ·〉 admits a minimum on M(Ω; Rn ), then γ∗ ≤ γ ≤ γ ∗ ; (b) the functional F − γ 〈H , ·〉 admits a minimum on M(Ω; Rn ), for every γ such that γ∗ < γ < γ ∗ . PROOF. By the assumptions made on f and g the functional F is sequentially weakly∗ -lsc on M(Ω; Rn ). Moreover, by assumptions (ii) and (iv), we have dλ |λ s | + C2 |λ s (x)| d # − Dμ(Ω) F (λ) ≥ C1 d μ + C1 d μ Ω Ω\A A λ
λ
so that F (λ) ≥ C λ − b
(15.75)
i
i i
i
i
i
i
638
book 2014/8/6 page 638 i
Chapter 15. Variational problems with a lack of coercivity
for suitable C > 0 and b ∈ R. Finally, from (15.75) we get ker F∞ = {0} so that hypothesis (ii) of Theorem 15.4.2 is satisfied too, and hence the conclusions follow from Theorem 15.4.2. We give now an explicit formula for the bounds γ ∗ and γ∗ . To obtain this result we need first an explicit representation for the topological recession function F∞ . We use a representation theorem for the relaxed functional associated to integrals of the form (15.74). More precisely, given a functional F : M(Ω; Rn ) → [0, +∞] of the form F (λ) =
⎧ ⎨ ⎩
f
! dλ " dμ
Ω
g λ(x) #
dμ +
if λ s = 0 on Ω \ Aλ ,
Aλ
+∞
otherwise,
we consider its relaxed functional F defined by 4 5 F = sup G : G ≤ F , G sequentially weakly∗ -lsc on M(Ω; Rn ) . Bouchitté and Buttazzo [111] proved that if f , g : Rn → [0, +∞] satisfy the assumptions • f is convex and lower semicontinuous on Rn , and f (0) = 0, • there exist α > 0 and β ≥ 0 such that ∀s ∈ Rn ,
f (s) ≥ α|s| − β
• g is subadditive and lower semicontinuous on Rn , and g (0) = 0, • g 0 (s) ≥ α|s| for every s ∈ Rn , then the following integral representation holds for F :
F (λ) =
f Ω
dλ dμ
dμ +
∞
Ω\Aλ
s
( f ) (λ ) +
g (λ(x)) d #,
(15.76)
Aλ
where f = f #e g 0 ,
g = f ∞ #e g .
To characterize the topological recession function for functionals of the form (15.74) we introduce the functional dλ G ∞ (λ) = g ∞ (λ) = g∞ g ∞ (λ s ). dμ + d μ Ω Ω Ω Theorem 15.4.4. Under the assumptions of Theorem 15.4.3, we have F∞ (λ) = G ∞ (λ)
∀λ ∈ M(Ω; Rn ).
i
i i
i
i
i
i
15.4. Limit analysis problems
book 2014/8/6 page 639 i
639
PROOF. We prove first that F∞ (λ) ≤ G ∞ (λ) for every λ ∈ M(Ω; Rn ). Let λ ∈ M(Ω; Rn ) be a measure with a finite number of atoms. Then ⎡ ⎤ 6 7 1 g (t λ(x)) dλ f t dμ + d #⎦ f ∞ (λ s ) + F∞ (λ) ≤ lim inf ⎣ t →+∞ dμ t Ω t Ω\Aλ Aλ 7 6 dλ ∞ ≤ dμ + f f ∞ (λ s ) + g ∞ λ(x) d #. (15.77) dμ Ω Ω\Aλ Aλ 4 5 Now, let λ be any measure in M(Ω; Rn ). Setting Aλh = x ∈ Aλ : |λ|(x) < 1/h and 4 5 λ h = λ · 1Ω\Ah , we get λ h → λ; moreover, since Aλh = x ∈ Aλ : |λ|(x) ≥ 1/h , we have λ that λ h has a finite number of atoms. From the weak∗ -lower semicontinuity of F∞ , taking into account (15.77), we have ⎤ ⎡ 7 6 d λ d μ+ f ∞ (λ s ) + g ∞ λ s (x) d #⎦ F∞ (λ)≤ lim inf F∞ (λ h )≤ lim inf ⎣ f ∞ h h→+∞ h→+∞ d μ Ω Ω\Aλ Aλ \Aλ 7 6 dλ ≤ dμ + f∞ χ{0} (λ s ) + g ∞ λ s (x) d #. (15.78) dμ Ω Ω\Aλ Aλ By computing the relaxation of the first and the last terms of (15.78), we get by (15.76) dλ ∞ ∞ ∞ ∞ s F∞ (λ) ≤ ( f #e g ) ( f #e g )(λ ) + ( f ∞ #e g ∞ ) λ s (x) d # dμ + d μ Ω Ω\Aλ Aλ for every λ ∈ M(Ω; Rn ). From (v) of Theorem 15.4.3 we deduce that f ∞ #e g ∞ = g ∞ so that F∞ (λ) ≤ G ∞ (λ) ∀λ ∈ M(Ω; Rn ). We prove now the opposite inequality. We claim that for every > 0 there exists a k > 0 such that ∀s ∈ Rn . (15.79) g ∞ (s) ≤ f (s) + |s| + k By contradiction, assume there exists an 0 > 0 such that for every k ∈ N there exists a sk ∈ Rn with g ∞ (sk ) > f (sk ) + 0 |sk | + k. Setting vk = sk /|sk |, and tk = |sk | we have g ∞ (vk ) >
f (tk vk ) tk
+ 0 +
k tk
.
(15.80)
Since |vk | = 1, it is not restrictive to assume vk → v for some v ∈ Rn . If (tk ) is bounded we get, by using Proposition 15.4.1(i), g (v) ≥ g ∞ (v) = +∞, which is impossible since g is finite. Therefore, we can assume tk → +∞. Passing to the limit in (15.80) yields, taking into account Proposition 15.4.1(i) again, g ∞ (v) ≥ f ∞ (v) + 0 = g 0 (v) + 0 > g ∞ (v),
i
i i
i
i
i
i
640
book 2014/8/6 page 640 i
Chapter 15. Variational problems with a lack of coercivity
which is a contradiction. Then (15.79) holds, and from the weak∗ -lower semicontinuity and 1-homogeneity of G ∞ we get G ∞ (t h λ h ) G ∞ (λ) = inf lim inf h→+∞ th % d λ 1 d λh h f th ≤ inf lim inf + t h dμ dμ h→+∞ t h dμ Ω & s ∞ s f (t h λ h ) + g t h λ h (x) d # + Ω\Aλ
%
≤ inf lim inf
Aλ
h
F (t h λ h ) th
h→+∞
h
&
+ λ h
,
where the infimum is taken over all t h → +∞ and all λ h → λ. Taking (t h ) and (λ h ) such that F (t h λ h ) F∞ (λ) = lim inf , h→+∞ th we have
G ∞ (λ) ≤ F∞ (λ) + lim sup λ h ≤ F∞ (λ) + C . h→+∞
Letting → 0, we get
G ∞ (λ) ≤ F∞ (λ),
and the proof is achieved. By virtue of Theorem 15.4.4 we can write γ ∗ = inf{G ∞ (λ) : 〈H , λ〉 = 1}, γ∗ = − inf{G ∞ (λ) : 〈H , λ〉 = −1} or equivalently γ∗ = γ∗ =
1 sup {〈H , λ〉 : G ∞ (λ) = 1} 1 inf {〈H , λ〉 : G ∞ (λ) = 1}
,
.
The last expressions allow us to compute explicitly γ ∗ and γ∗ in terms of g ∞ and H only. Indeed, by using the definition of the Fenchel transform, it is easy to see that 1 sup {〈H , λ〉 : G ∞ (λ) = 1} Therefore, since ∞ ∗
0 for every function u. A complete discussion of the problem can be found in [144], where all the relevant references are quoted. Here we simply recall that the problem 1 d x : u concave, 0 ≤ u ≤ M min 2 Ω 1 + |D u| admits a solution uo p t . Some interesting necessary conditions of optimality can be deduced: for instance (see [269]), it can be proved that on every open set ω where uo p t is of class C 2 we obtain det D 2 u(x) = 0 ∀x ∈ ω. In particular, this excludes that in the case Ω = B(0, R) the solution uo p t is radially symmetric. The optimal radially symmetric profile and a nonsymmetric profile which is better that all the radial ones are shown, respectively, in Figures 16.2 and 16.3. It is interesting to notice that with simple calculations, one can write the optimization problem above in the form (16.1) by taking the cost functional as a boundary integral, j x, ν(x) d N −1 , F (A) = ∂A
for a suitable integrand j (x, s), being ν(x) the exterior normal unit vector to ∂ A at x and N −1 the Hausdorff (N − 1)-dimensional measure (see [155]).
i
i i
i
i
i
i
16.2. The Newton problem
book 2014/8/6 page 647 i
647
1
Figure 16.2. The optimal radial shape for M = R.
2
0
Figure 16.3. A nonradial profile better than all radial ones for M = 2R.
i
i i
i
i
i
i
648
book 2014/8/6 page 648 i
Chapter 16. An introduction to shape optimization problems
16.3 Optimal Dirichlet free boundary problems We consider now the model example of a Dirichlet problem over an unknown domain, which has to be optimized according to a given cost functional. More precisely, we consider a given bounded open subset Ω of RN , an admissible class of subsets of Ω, a given function f ∈ L2 (Ω), and a cost functional of the form F (A) =
Ω
j (x, uA) d x.
(16.4)
Here the integrand j : Ω × R → R is given, and we denote by uA the unique solution of the elliptic problem −Δu = f in A, (16.5) u ∈ H01 (A), extended by zero to Ω \ A. It is well known that in general one should not expect the existence of an optimal solution; below we show an example where the existence of an optimal domain does not occur. The problem we consider is of the form (16.1), where the admissible class consists of all subdomains of a given bounded open subset Ω of RN and the cost functional F is of the form (16.4) with j (x, s) = |s − u(x)|2 for a prescribed desired state u. In the thermostatic model the shape optimization problem (16.1) with the choices above consists in finding an optimal distribution, inside Ω, of the Dirichlet region Ω \ A to achieve a temperature which is as close as possible to the desired temperature u, once the heat sources f are prescribed. For simplicity, we consider a uniformly distributed heat source, that is, we take f ≡ 1, and we take the desired temperature u constantly equal to c > 0. Therefore, problem (16.1) becomes 2 1 |uA − c| d x : −ΔuA = 1 in A, uA ∈ H0 (A) . (16.6) min Ω
We will actually show that for small values of the constant c no regular domain A can solve problem (16.6) above; the proof of nonexistence of any domain is slightly more delicate and requires additional tools like the capacitary form of necessary conditions of optimality (see, for instance, [144], [148], [149], [171]). Proposition 16.3.1. If c > 0 is small enough, then problem (16.6) has no smooth solutions. PROOF. The nonexistence proof can be obtained by contradiction. Assume indeed that a regular domain A solves the optimization problem (16.6) and that A does not coincide with the whole set Ω. Take a point x0 ∈ Ω \ A and a small ball B of radius sufficiently small, centered at x0 . If uA denotes the solution of (16.5) corresponding to A, and if is small enough so that B does not intersect A, then the solution uA∪B , corresponding to the admissible choice A ∪ B , is given by ⎧ if x ∈ A, ⎨ uA(x) uA∪B (x) = (2 − |x − x0 |2 )/4 if x ∈ B , ⎩ 0 otherwise.
i
i i
i
i
i
i
16.3. Optimal Dirichlet free boundary problems
649
Therefore, we obtain c2 d x + F (A) = |uA − c|2 d x + A
book 2014/8/6 page 649 i
B
c 2 d x, Ω\(A∪B )
2 2 2 − |x − x0 | 2 c 2 d x. F (A ∪ B ) = |uA − c| d x + − c d x + 4 A B Ω\(A∪B )
By using the minimality of A this then yields 2 2 2 − |x − x0 | 2 c meas(B ) ≤ − c d x 4 B 2 2 2 − r −N = N meas(B ) − c r N −1 d r 4 0 1 = c 2 meas(B ) + (2 − r 2 )(2 − r 2 − 8c)r N −1 d r, 16 0 which, for a fixed c > 0, turns out to be false if is small enough. Thus all regular domains A = Ω are ruled out by the argument above. We can now exclude also the case A = Ω if c is small enough, by comparing the full domain Ω to the empty set . This gives, taking into account that u ≡ 0, F (Ω) = |uΩ − c|2 d x, Ω F () = c 2 d x, Ω
so that we have F () < F (Ω) if c is small enough. Hence all regular subdomains of Ω are excluded, and the nonexistence proof is achieved. By a more refined proof we could exclude all kinds of domains, so we may conclude that the shape optimization problem (16.6) has no solutions at all. In fact, we could consider the relaxed problem |uμ − c|2 d x : −Δuμ + μu = 1 in Ω, uμ ∈ H01 (Ω) , (16.7) min Ω
where now μ varies in the class M0 (Ω) of all capacitary measures introduced in Section 5.8.4. By the compactness properties of M0 (Ω) with respect to the γ -convergence the existence of a relaxed solution μ is straightforward; by an argument developed in [171] it is possible to prove that this solution is also unique. The example above proves that even very simple shape optimization problems do not admit a domain as an optimal solution. Nevertheless, the existence of an optimal domain occurs for problem (16.1) in some particular cases: (i) when severe geometrical constraints on the class of admissible domains are imposed; (ii) when the cost functional fulfills some particular qualitative assumptions; (iii) when the problem is of a very special type, involving, for instance, only the eigenvalues of the Laplace operator, and where neither geometrical constraints nor monotonicity of the cost is required.
i
i i
i
i
i
i
650
book 2014/8/6 page 650 i
Chapter 16. An introduction to shape optimization problems
See the lecture notes of Bucur and Buttazzo [144], [145] for a more complete discussion on this topic; here, for the sake of simplicity, we list only some situations when conditions (i), (ii), and (iii) occur. Concerning case (i), a rather simple situation when the existence of an optimal solution occurs is when the class of admissible domains fulfills the geometrical constraint which is called the exterior cone condition. It consists in requiring that there exists a fixed height h and opening ω such that for every domain A of the admissible class and every point x0 ∈ ∂ A a cone with height h, opening ω, and vertex at x0 is contained in Ω\A. This condition is weaker than an equi-Lipschitz conditions on the class of admissible domains; for instance, a domain with an exterior cusp is not Lipschitz but verifies the exterior cone condition. Conditions weaker than the exterior cone but which still imply the existence of an optimal domain can be given in terms of capacity (see [144], [145]). Concerning case (ii), we consider the problem 4 5 min F (A) : A ∈ , meas(A) ≤ m , (16.8) where the volume constraint meas(A) ≤ m has been added. The cost functional F is assumed to be monotone nonincreasing with respect to the set inclusion, that is, A1 ⊂ A2
⇒
F (A2 ) ≤ F (A1 ).
Moreover, F is assumed to be lower semicontinuous with respect to the γ -convergence on the class of domains, defined by An → A in the γ -convergence
⇐⇒
uAn → uA weakly in H01 (Ω),
where uAn and uA are the solutions of (16.5) in An and A, respectively, with the right-hand side f ≡ 1. Under the assumptions above, Buttazzo and Dal Maso in [150] showed that problem (16.5) admits an optimal solution. It is important to stress that several optimal shape problems can be written in the form (16.8) with a cost functional F which is nonincreasing for the set inclusion and γ -lower semicontinuous. For instance, if L denotes a second-order elliptic operator of the form Lu = − div a(x)D u with the N × N matrix a(x) symmetric, uniformly elliptic, and with bounded and measurable coefficients, we may consider the spectrum Λ(A) of L associated to the Dirichlet boundary conditions on A: Lu = λu,
u ∈ H01 (A).
It is known that Λ(A) is given by a nonnegative sequence λk (A) which tends to +∞, that every λk (A) is a nonincreasing set function with respect to A, and that the mappings λk (A) are γ -continuous. Therefore the existence result above applies and we obtain that the problem 4 5 min Φ Λ(A) : A ∈ , meas(A) ≤ m admits an optimal solution whenever the mapping Φ : RN → [0, +∞] is • nondecreasing in the sense that Λ1k ≤ Λ2k
∀k
⇒
Φ(Λ1 ) ≤ Φ(Λ2 );
i
i i
i
i
i
i
16.4. Optimal distribution of two conductors
book 2014/8/6 page 651 i
651
• lower semicontinuous in the sense that Λnk → Λk
∀k
⇒
Φ(Λ) ≤ lim inf Φ(Λn ). n→+∞
In particular, for a fixed integer k, all cost functionals of the form F (A) = φ λ1 (A), . . . , λk (A) with a mapping φ : Rk → R continuous and nondecreasing in each variable verify the assumptions above. Concerning case (iii), we may still prove the existence of an optimal domain for problem (16.8) for some special form of the cost functional. The case which has been considered in [146] (see also [144]) is when F (A) = φ λ1 (A), λ2 (A) , where λ1 (A) and λ2 (A) are the first two eigenvalues of the Laplace operator −Δ on H01 (A). Due to the special form of the cost functional and to the fact that the operator is −Δ, it is possible to show that an optimal solution exists without any monotonicity assumption on φ by requiring only that φ is a lower semicontinuous function on R2 .
16.4 Optimal distribution of two conductors Another interesting case of a shape optimization problem is the optimal distribution of two given conductors into a given set. If Ω denotes a bounded open subset of RN (the prescribed container), denoting by α and β the conductivities of the two materials, the problem consists in filling Ω with the two materials in the best performing way according to some given cost functional. The volume of each material can also be prescribed. It is convenient to denote by A the domain where the conductivity is α and by aA(x) the conductivity coefficient aA(x) = α1A(x) + β1Ω\A(x). In this way the state equation becomes < − div aA(x)D u = f u = 0 on ∂ Ω,
in Ω,
(16.9)
where f is the (given) source density, and we denote by uA its unique solution. It is well known (see, for instance, [261], [309]) that if we take as a cost functional an integral of the form Ω
j (x, 1A, uA, D uA) d x,
in general an optimal configuration does not exist. However, the addition of a perimeter penalization is enough to imply the existence of classical optimizers. In other words, if we take as a cost the functional J (u, A) = j (x, 1A, u, D u) d x + σ PerΩ (A), Ω
where σ > 0, the problem can be written as an optimal control problem in the form 4 5 min J (u, A) : A ⊂ Ω, u solves (16.9) . (16.10)
i
i i
i
i
i
i
652
book 2014/8/6 page 652 i
Chapter 16. An introduction to shape optimization problems
A volume constraint of the form meas(A) = m could also be present. The main ingredient for the proof of the existence of an optimal classical solution is the following result. Theorem 16.4.1. Let an (x) be a sequence of N × N symmetric matrices with measurable coefficients, such that the uniform ellipticity condition ∀x ∈ Ω, ∀z ∈ RN
c0 |z|2 ≤ an (x)z · z ≤ c1 |z|2
(16.11)
holds with 0 < c0 ≤ c1 . Given f ∈ H −1 (Ω) we denote by un the unique solution of the problem − div an (x)D u = f , un ∈ H01 (Ω). (16.12) If an (x) → a(x) a.e. in Ω, then un → u weakly in H01 (Ω), where u is the solution of (16.12) with an replaced by a. PROOF. By the uniform ellipticity condition (16.11) we have 2 c0 |D un | d x ≤ f un d x, Ω
Ω
and by the Poincaré inequality we have that un are bounded in H01 (Ω) so that a subsequence (still denoted by the same indices) converges weakly in H01 (Ω) to some v. All we have to show is that v = u, or equivalently that − div a(x)Dv = f . (16.13) This means that for every smooth test function φ we have a(x)DvDφ d x = 〈 f , φ〉. Ω
Then it is enough to show that for every smooth test function φ we have lim an (x)D un Dφ d x = a(x)DvDφ d x. n→+∞
Ω
Ω
This is an immediate consequence of the fact that φ is smooth, D un → Dv weakly in L2 (Ω), and an → a a.e. in Ω remaining bounded. Another way to show that (16.13) holds is to verify that v minimizes the functional F (w) = a(x)DwDw d x − 2〈 f , w〉, w ∈ H01 (Ω). (16.14) Ω
Since the function α(s, z) = s z · z, which is defined for all z ∈ RN and for symmetric positive definite N × N matrices s, is convex in z and lower semicontinuous in s, the functional Φ(a, ξ ) =
Ω
a(x)ξ · ξ d x
is sequentially lower semicontinuous with respect to the strong L1 -convergence on a and the weak L1 -convergence on ξ (see, for instance, Theorem 13.1.1 and [147]). Therefore we have F (v) = Φ(a, Dv) − 2〈 f , v〉 ≤ lim inf Φ(an , D un ) − 2〈 f , un 〉 = lim inf F (un ). n→+∞
n→+∞
i
i i
i
i
i
i
16.4. Optimal distribution of two conductors
book 2014/8/6 page 653 i
653
Since un minimizes the functional Fn defined as in (16.14) with a replaced by an , we also have for every w ∈ H01 (Ω) Fn (un ) ≤ Fn (w) =
Ω
an (x)DwDw d x − 2〈 f , w〉,
so that taking the limit as n → +∞ and using the convergence an → a we obtain lim inf Fn (un ) ≤ a(x)DwDw d x − 2〈 f , w〉 = F (w). n→+∞
Ω
Thus F (v) ≤ F (w), which shows what is required. Remark 16.4.1. The result above can be rephrased in terms by say of G-convergence ing that for uniformly elliptic operators of the form − div a(x)D u the G-convergence is weaker than the L1 -convergence of coefficients. Analogously, we can say that the functionals Gn (w) = an (x)DwDw d x Ω
Γ -converge, with respect to the L (Ω)-convergence, to the functional G defined in the same way with a in the place of an . 2
Corollary 16.4.1. If An → A in L1 (Ω), then uAn → uA weakly in H01 (Ω). A more careful inspection of the proof of Theorem 16.4.1 shows that the following stronger result holds. Theorem 16.4.2. Under the same assumptions of Theorem 16.4.1, the convergence of un is actually strong in H01 (Ω). PROOF. We have already seen that un → u weakly in H01 (Ω), which gives D un → D u weakly in L2 (Ω). Denoting by cn (x) and c(x) the square root matrices of an (x) and a(x), respectively, we have that cn → c a.e. in Ω remaining equibounded. Then cn (x)D un converges to c(x)D u weakly in L2 (Ω). Multiplying (16.4) by un and integrating by parts we obtain a(x)D uD u d x = 〈 f , u〉 = lim 〈 f , un 〉 n→+∞ Ω = lim an (x)D un D un d x. n→+∞
Ω
This implies that
strongly in L2 (Ω). cn (x)D un → c(x)D u −1 Multiplying now by cn (x) we finally obtain the strong convergence of D un to D u in L2 (Ω).
We are now in a position to obtain an existence result for the optimization problem (16.2). On the function j we only assume that it is nonnegative, Borel measurable, and such that j (x, s, z, w) is lower semicontinuous in (s, z, w) for a.e. x ∈ Ω.
i
i i
i
i
i
i
654
book 2014/8/6 page 654 i
Chapter 16. An introduction to shape optimization problems
Theorem 16.4.3. Under the assumptions above, the minimum problem (16.2) admits at least a solution. PROOF. Let (An ) be a minimizing sequence; then PerΩ (An ) are bounded so that, up to extracting subsequences, we may assume (An ) is strongly convergent in the L1l oc sense to some set A ⊂ Ω. We claim that A is a solution of problem (16.2). Let us denote by un a solution of problem (16.1) associated to An ; by Theorem 16.4.2, (un ) converges strongly in H01 (Ω) to some u ∈ H01 (Ω). Then by the lower semicontinuity of the perimeter (see Proposition 10.1.1) and by Fatou’s lemma we have J (u, A) ≤ lim inf J (un , An ), n→+∞
which proves the optimality of A. Remark 16.4.2. The same proof works when volume constraints of the form meas(A) = m are present. Indeed this constraint passes to the limit when An → A strongly in L1 (Ω). The existence result above shows the existence of a classical solution for the optimization problem (16.2). This solution is simply a set with finite perimeter and additional assumptions have to be made to prove further regularity. For instance, in [22] Ambrosio and Buttazzo considered the similar problem N M min E(u, A) + c PerΩ (A) : u ∈ H01 (Ω), A ⊂ Ω , where c > 0 and E(u, A) =
Ω
aA(x)|D u|2 + 1A(x) g1 (x, u) + 1Ω\A g2 (x, u) d x.
They showed that every solution A is actually an open set provided g1 and g2 are Borel measurable and satisfy inequalities gi (x, s) ≥ γ (x) − k|s|2 ,
i = 1, 2,
where γ ∈ L1 (Ω) and k < αλ1 , λ1 being the first eigenvalue of −Δ on Ω.
16.5 Optimal potentials for elliptic operators In this section we consider the Schrödinger operator −Δ+V (x) in a bounded domain Ω of RN with homogeneous Dirichlet conditions on ∂ Ω. We consider nonnegative potentials V (x) and our goal is to show the existence of optimal potentials for some suitable cost functionals F and admissible classes . Then our problem will be of the form 4 5 min F (V ) : V ∈ . Problems of this kind have been treated in [152] and in [238], to which we refer the reader for a complete list of references in the field. The following proposition links the weak L1 (Ω)-convergence to the γ -convergence introduced in Section 5.8.4. Proposition 16.5.1. Let Vn ∈ L1 (Ω) be a sequence weakly converging in L1 (Ω) to a function V . Then the capacitary measures Vn d x γ -converge to V d x.
i
i i
i
i
i
i
16.5. Optimal potentials for elliptic operators
book 2014/8/6 page 655 i
655
PROOF. We have to prove that the solutions un = RVn (1) of
−Δu + Vn (x)u = 1, u ∈ H01 (Ω)
weakly converge in H01 (Ω) to the solution u = RV (1) of
−Δu + V (x)u = 1, u ∈ H01 (Ω),
or equivalently that the functionals 2 Jn (u) = |∇u| d x + Vn (x)u 2 d x Ω
Ω
Γ L2 (Ω) -converge to the functional
J (u) =
|∇u| d x + 2
Ω
Ω
V (x)u 2 d x.
The Γ -liminf inequality is immediate since, if un → u in L2 (Ω), we have
Ω
|∇u|2 d x ≤ lim inf n
Ω
|∇un |2 d x
by the lower semicontinuity of the H 1 (Ω) norm with respect to the L2 (Ω)-convergence, and Ω
V (x)u 2 d x ≤ lim inf n
Ω
Vn (x)un2 d x
by the strong-weak lower semicontinuity theorem for integral functionals (see Theorem 13.1.1). Let us now prove the Γ -limsup inequality which consists, given u ∈ H01 (Ω), in constructing a sequence un → u in L2 (Ω) such that
lim sup n
2
Ω
|∇un | d x +
Ω
Vn (x)un2
dx ≤
2
Ω
|∇u| d x +
Ω
V (x)u 2 d x.
(16.15)
For every t > 0 let u t = (u ∧ t ) ∨ (−t ); then for t fixed we have lim Vn (x)|u t |2 d x = V (x)|u t |2 d x n
Ω
and
Ω
t 2
lim
t →+∞
Ω
V (x)|u | d x =
Ω
V (x)|u|2 d x.
Then, by a diagonal argument, we can find a sequence tn → +∞ such that lim n
tn 2
Ω
Vn (x)|u | d x =
Ω
V (x)|u|2 d x.
i
i i
i
i
i
i
656
book 2014/8/6 page 656 i
Chapter 16. An introduction to shape optimization problems
Taking now un = u tn , and noticing that for every t > 0 t 2 |∇u | d x ≤ |∇u|2 d x, Ω
Ω
we obtain (16.15) and so the proof is complete. We are now in a position to prove a rather general existence result for an optimal potential. Theorem 16.5.1. Let be a subset of L1+ (Ω), weakly compact for the L1 (Ω)-convergence, and let F : → R be a functional which is γ -lsc. Then the optimization problem 4 5 min F (V ) : V ∈
(16.16)
admits a solution. PROOF. Let (Vn ) be a minimizing sequence for the problem (16.16); by the weak L1 (Ω) compactness assumption on we may extract a subsequence (still denoted by (Vn )) which converges weakly in L1 (Ω) to some function V ∈ . By Proposition 16.5.1 we obtain that Vn d x →γ V d x as capacitary measures, and so the γ -lower semicontinuity of F allows us to conclude that V is an optimal potential for problem (16.16). Remark 16.5.1. We recall that, by the De La Vallée–Poussin theorem, Theorem 2.4.4, the set is weakly compact in L1 (Ω) if θ(V ) d x ≤ 1 ∀V ∈ , Ω
where θ : R → R is a superlinear function, that is, lim
θ(t )
t →+∞
t
= +∞.
Remark 16.5.2. Since the γ -convergence is rather strong, many cost functionals are γ -lsc, for instance, the integral functionals and the spectral functionals shown in Section 5.8.4 of the following form: • the integral functionals F (V ) =
Ω
j (x, uV , ∇uV ) d x,
where uV = RV ( f ) is the solution of −Δu + V (x)u = f u ∈ H01 (Ω)
in Ω,
(16.17)
and the integrand j (x, s, z) is measurable in x, lower semicontinuous in (s, z), convex in z, and such that j (x, s, z) ≥ −a(x) − c|s| p , where a ∈ L1 (Ω), p < 2N /(N − 2) ( p < +∞ if N = 2), c ∈ R;
i
i i
i
i
i
i
16.5. Optimal potentials for elliptic operators
book 2014/8/6 page 657 i
657
• the spectral functionals
F (V ) = Φ Λ(V ) ,
where Λ(V ) is the spectrum of the Schrödinger operator −Δ +V (x) on H01 (Ω) and Φ : RN → R is lower semicontinuous in the sense that λnk → λk ∀k
Φ(Λ) ≤ lim inf Φ(Λn ).
⇒
n→+∞
Example 16.5.1. Consider the cost functional given by the energy 1 1 2 2 1 / f (V ) = min |∇u| + V (x)u − f (x)u d x : u ∈ H0 (Ω) 2 Ω 2 and the admissible class
V p dx ≤ 1 = V ≥0 : Ω
with p > 1. The functional / f turns out to be γ -continuous; therefore the existence Theorem 16.5.1 applies to F (V ) = −/ f (V ) and we obtain that the problem
max / f (V ) : V ≥ 0,
p
Ω
V dx ≤ 1
(16.18)
admits a solution. Notice that, / f being the infimum of linear functions with respect to V , the cost F (V ) is convex on . The functional / f is an integral functional: in fact, from the PDE (16.17) we find, multiplying by uV and integrating by parts, 2 2 f uV d x. |∇uV | + V uV d x = Ω
Ω
Therefore we obtain / f (V ) = −
1 2
Ω
f uV d x,
which corresponds to the integral functional above with 1 j (x, s, z) = − f (x)s. 2 In this particular case some more explicit computations can be made: indeed, interchanging the max and the min in (16.18) we obtain 1 1 2 2 |∇u| + V (x)u − f (x)u d x max min V ∈ u∈H 1 (Ω) Ω 2 2 0 1 1 |∇u|2 + V (x)u 2 − f (x)u d x. ≤ min max 2 u∈H01 (Ω) V ∈ Ω 2 The max in the last term can be explicitly computed and we obtain easily
1
max V ∈
Ω
2
2
V (x)u d x =
1 2
1−1/ p
2 p/( p−1)
Ω
|u|
dx
,
i
i i
i
i
i
i
658
book 2014/8/6 page 658 i
Chapter 16. An introduction to shape optimization problems
reached at
−1/ p
2/( p−1)
2 p/( p−1)
V = |u|
Ω
|u|
dx
.
This gives max / f (V ) ≤ min V ∈
u∈H01 (Ω)
1 Ω
2
|∇u|2 d x +
1
1−1/ p
2
Ω
|u|2 p/( p−1) d x
−
Ω
f (x)u d x.
Due to the assumption p > 1, the functional appearing in the right-hand side is coercive and strictly convex, so that the minimum in the right-hand side is attained at a unique function u ∈ H01 (Ω) which verifies the PDE −Δu + C u |u|2/( p−1) u = f
with C u =
Now, taking
Ω
|u|2 p/( p−1) d x
V = |u|2/( p−1)
Ω
|u|2 p/( p−1) d x
−1/ p .
(16.19)
−1/ p (16.20)
we have V ∈ , hence max / f (V ) ≥ / f (V ). V ∈
The term / f (V ) can be computed through its Euler–Lagrange equation
−Δu + V (x)u = f u ∈ H01 (Ω),
in Ω,
which is uniquely solved by u. Therefore, all the inequalities above become equalities and we may conclude that problem (16.18) admits a unique solution V given by (16.20), where u is the solution of the minimum problem
1
min
u∈H01 (Ω)
Ω
2
|∇u|2 d x +
1 2
Ω
1−1/ p |u|2 p/( p−1) d x
−
Ω
f (x)u d x,
which corresponds to the PDE (16.19). Similar computations can be made (see [238]) in the case of the cost functional 2 2 1 |∇u| + V (x)u d x : u ∈ H0 (Ω), uL2 (Ω) = 1 . λ1 (V ) = min Ω
Changing the integral constraint which defines the admissible class and considering the new class = V ≥0 : V −p d x ≤ 1 Ω
with p > 0, the meaningful problem for the energy cost / f (V ) (similarly for the first eigenvalue λ1 (V )) becomes the minimization problem −p min / f (V ) : V ≥ 0, V dx ≤1 . (16.21) Ω
i
i i
i
i
i
i
16.5. Optimal potentials for elliptic operators
book 2014/8/6 page 659 i
659
Note that in the present situation the set is unbounded in every L p (Ω); actually, in this case the value +∞ is admissible for a potential V , intending that on the set V = +∞ the Sobolev functions entering in the definition of / f (V ) must vanish q.e. on the set uV = 0. By repeating the calculations made in Example 16.5.1 we end up with the conclusion that for every p > 0 the minimization problem (16.21) admits a solution V , given by
1/ p
−2/( p+1)
V = |u|
2 p/( p+1)
Ω
|u|
dx
.
Here u solves the minimum problem min
u∈H01 (Ω)
Ω
1 2
2
|∇u| d x +
1
1+1/ p
2
2 p/( p+1)
Ω
|u|
−
dx
Ω
f (x)u d x,
which corresponds to the PDE −2/( p+1)
−Δu + C u |u|
u=f
with C u =
1/ p 2 p/( p+1)
Ω
|u|
dx
.
In order to handle more general optimization problems of the form min {F (V ) : V ∈ } ,
(16.22)
we consider a function Ψ : [0, +∞] → [0, +∞] and the admissible class = V : Ω → [0, +∞] : V Lebesgue measurable, Ψ(V ) d x ≤ 1 . Ω
On the function Ψ we make the following assumptions: (i) Ψ is strictly decreasing; (ii) there exists p > 1 such that the function s → Ψ −1 (s p ) is convex. Note that the conditions above are, for instance, satisfied by the following functions: Ψ(s) = s − p Ψ(s) = e −αs
for any p > 0; for any α > 0.
The general existence result we may obtain in this framework is the following. Theorem 16.5.2. Let Ω ⊂ RN be a bounded open set and let Ψ : [0, +∞] → [0, +∞] be a function satisfying the conditions (i) and (ii) above. Then, for any functional F (V ) which is monotone increasing and lower semicontinuous with respect to the γ -convergence, the problem (16.22) admits a solution. 1/ p PROOF. Let Vn ∈ be a minimizing sequence for problem (16.22). Then, vn := Ψ(Vn ) is a bounded sequence in L p (Ω) and so, up to a subsequence, vn converges weakly in L p (Ω) to some function v. We will prove that V := Ψ −1 (v p ) is a solution of (16.22). Clearly V ∈ and so it remains to prove that F (V ) ≤ lim infn F (Vn ). In view of the compactness of the γ -convergence on the class M0 (Ω) of capacitary measures (see Section 5.8.4), we can
i
i i
i
i
i
i
660
book 2014/8/6 page 660 i
Chapter 16. An introduction to shape optimization problems
suppose that, up to a subsequence, Vn γ -converges to a capacitary measure μ ∈ M0 (Ω). We claim that the following inequalities hold true: F (V ) ≤ F (μ) ≤ lim inf F (Vn ).
(16.23)
n→∞
In fact, the second inequality in (16.23) is the lower semicontinuity of F with respect to the γ -convergence, while the first needs a more careful examination. By the definition of γ -convergence, we have that for any u ∈ H01 (Ω), there is a sequence un ∈ H01 (Ω) which converges to u in L2 (Ω) and is such that 2 2 2 |∇u| d x + u d μ = lim |∇un | d x + un2 Vn d x Ω
n→∞
Ω
= lim
n→∞
Ω
Ω
|∇un |2 d x + 2
≥
Ω
2
|∇u| d x +
Ω
2
=
Ω
|∇u| d x +
Ω
Ω
un2 Ψ −1 (vnp ) d x (16.24)
−1
p
u Ψ (v ) d x u 2 V d x,
Ω
where the inequality in (16.24) is due to strong-weak lower semicontinuity of integral functionals (see Theorem 13.1.1). Thus, for any u ∈ H01 (Ω), we have
Ω
u2 d μ ≥
u 2V d x, Ω
which gives V ≤ μ. Since F is assumed to be monotone increasing, we obtain the first inequality in (16.23) and so the conclusion. Taking for instance Ψ(s) = e −αs , we can repeat the explicit computation made for problem (16.21), and we find that the optimization problem e −αV d x ≤ 1 min / f (V ) : V ≥ 0, Ω
admits a solution V given by V=
1 α
2 |u| d x − log |u| . 2
log Ω
Here u solves the minimum problem 1 1 u 2 d x log u 2 d x − u 2 log u 2 d x − f u d x, min |∇u|2 d x + 2α Ω u∈H01 (Ω) Ω 2 Ω Ω Ω which corresponds to the PDE 1 Bu 2 −Δu+ Au u + − u 1 + log(u ) = f α u
with Au =
2
Ω
log(u ) d x, B u =
u 2 d x. Ω
i
i i
i
i
i
i
16.5. Optimal potentials for elliptic operators
book 2014/8/6 page 661 i
661
It is interesting to notice that the case of constraints of the form Ω e −αV d x ≤ 1 can be used to efficiently approximate shape optimization problems, in which the main unknown is a domain A ⊂ Ω, that is, a capacitary measure of the form ∞Ω\A (see Section 5.8.4). Indeed, if V = ∞Ω\A we have that e −αV = 1A and the constraint Ω e −αV d x ≤ 1 becomes the measure constraint |A| ≤ 1, while the Schrödinger operator −Δ + V corresponds to the Dirichlet–Laplacian on the set A. Buttazzo et al. [152] prove the Γ -convergence, as α → 0, of the Schrödinger problems with constraint Ω e −αV d x ≤ 1 to the shape optimization problem with measure constraint |A| ≤ 1.
i
i i
i
i
i
i
book 2014/8/6 page 663 i
Chapter 17
Gradient flows
17.1 The classical continuous steepest descent 17.1.1 Existence and uniqueness of global orbits
/ is a real Hilbert space with scalar product and norm denoted by 〈·, ·〉 and · = 〈·, ·〉, respectively. Let Φ : → R be a real-valued function which is continuously differentiable. We often refer to Φ as the potential function. By the Riesz representation theorem, for any u ∈ there exists a unique vector in , denoted by ∇Φ(u), and called the gradient of Φ at u, such that Φ (u)(v) = 〈∇Φ(u), v〉
∀v ∈ .
The differential equation on which is governed by the gradient vector field v ∈ → −∇Φ(v) ∈ (SD) u˙(t ) = −∇Φ(u(t )) (17.1) is called the (classical) continuous steepest descent. Throughout the chapter we write either dd vt or v˙ to denote the distributional derivative of any function v : (0, +∞) → . We call the trajectory (also the solution curve or orbit) of the differential equation (SD) any C 1 function u : [0, +∞) → satisfying (17.1) for all t ≥ 0. The word “classical” refers to the fact that the potential Φ is continuously differentiable, which allows us to consider classical solutions (u ∈ C 1 ) of (SD). By contrast, for nonsmooth potential functions, we will have to consider weaker notions of solution. Descent property. The following property explains the important role played by the steepest descent dynamic in optimization. Proposition 17.1.1 (descent property). Let u ∈ C 1 ([0, +∞); ) be a trajectory of (SD). Then t → Φ(u(t )) is a decreasing function, and for all t ≥ 0 d
Φ(u(t )) = − u˙ (t )2 . (17.2) dt Thus, as long as the trajectory does not reach a stationary point, the function t → Φ(u(t )) is decreasing. PROOF. Taking the scalar product with u˙(t ) in (17.1) yields u˙(t )2 + 〈∇Φ(u(t )), u˙(t )〉 = 0.
(17.3)
663
i
i i
i
i
i
i
664
book 2014/8/6 page 664 i
Chapter 17. Gradient flows
Using the classical derivation chain rule d dt
Φ(u(t )) = 〈∇Φ(u(t )), u˙ (t )〉 ,
(17.3) gives d Φ(u(t )) = − u˙(t )2 . dt Hence t → Φ(u(t )) is a nonincreasing function. Indeed, as long as the trajectory moves (does not stop), it is a decreasing function. Related notions: We use the term integral curve, especially if we are interested in the image in of a trajectory rather than in the trajectory itself as a function. The terminology dynamical system refers to the evolution of each point of by the semiflow (or semigroup) (S(t )) t ≥0 generated by −∇Φ. For each t ∈ [0, +∞), the operator S(t ) associates to each u0 ∈ the point S(t ; u0 ) = u(t ), where u is the unique orbit starting at u0 , that is, u˙(t ) = −∇Φ(u(t )); u(0) = u0 . The minus sign in (SD) reflects the fact that we are interested in the minimization of Φ. In fact, in view of the maximization of Φ, one would rather consider the reversing flow u˙(t ) = ∇Φ(u(t )) called steepest ascent, which has the same integral curves as (SD) with a different orientation. The descent property of the trajectories is obtained by increasing the time variable. That’s why we consider only the semiflow {S(t ); t ≥ 0}, and from now on, unless specified, we consider only orbits which are defined on some positive time interval. An equilibrium for a semiflow (S(t )) t ≥0 is a point z ∈ such that S(t )z = z for all t ≥ 0. An equivalent terminology is stationary point. For the semiflow (S(t )) t ≥0 generated by −∇Φ the equilibria are the critical points of Φ, i.e., critΦ = {v ∈ : ∇Φ(v) = 0}. The potential function Φ : → R is a strict Lyapunov function for the steepest descent differential system (SD). This means that for each trajectory t → u(t ) of (SD) the real-valued mapping t → Φ(u(t )) is (strictly) decreasing, as long as the trajectory has not reached an equilibrium. This property makes (SD) a dissipative system, which, as we shall see, has a great impact on the asymptotic behavior of the trajectories of (SD). In particular, the only periodic trajectories of (SD) are the trajectories which remain at critical points. This makes a great difference with conservative systems (like Hamiltonian systems) which naturally exhibit many periodic trajectories. Geometrical aspects. The steepest descent direction has a natural geometrical interpretation. Being at u ∈ , and v ∈ being normalized v = 1, the directional derivative of Φ at u in the direction v is equal to d dt
Φ(u + t v)| t =0 = 〈∇Φ(u), v〉 .
When u is noncritical, i.e., ∇Φ(u) = 0, it is minimal for v =−
1 ∇Φ(u)
∇Φ(u).
Being at u, by using only first-order local conditions on Φ, the direction v ∈ that provides the greatest decrease of Φ is −∇Φ(u), whence the steepest descent terminology for differential equation (17.1). We could as well consider the differential equation (SD)
u˙ (t ) = −α(t )∇Φ(u(t )),
(17.4)
i
i i
i
i
i
i
17.1. The classical continuous steepest descent
book 2014/8/6 page 665 i
665
where α : R+ → R+ is a smooth positive function. Differential systems (17.1) and (17.4) have the same integral curves. One can pass from one system to the other by time reparametrization (e.g., length parametrization), which does not change the portrait of the system. The flow is often represented by its phase portrait, that is, the picture of its integral curves in . For gradient flows, the integral curves are in the direction perpendicular to the equipotential surfaces, which helps in visualizing them. This is a consequence of the following result, which plays a key role in differential geometry. Lemma 17.1.1. Let Φ : → R be a C 1 function. Let Σc = {v ∈ : Φ(v) = c} be a level set of Φ for some level c ∈ R. Assume that u ∈ Σc (i.e., Φ(u) = c), and u is not critical (i.e., ∇Φ(u) = 0). Then, in a neighborhood of u, Σc is a C 1 surface, with codimension 1, and ∇Φ(u) is orthogonal to Σc at u. One can also get a mechanical intuition of the gradient flow by considering the graph of Φ, which is a surface in ×R. Given an orbit t → u(t ) of (SD), the point (u(t ), Φ(u(t )) moves along the graph of Φ just like a drop of water: it slides down the surface in the direction of steepest descent until it reaches the bottom of the surface. Then it stops. In accordance with this mechanical interpretation, there is no inertial effect in (SD). This is due to the fact that, in the equation of mechanics, the acceleration term u¨ has been neglected. (SD) is a first-order differential equation with respect to time t . Due to space limitations, the study of dissipative gradient systems involving inertial features, such as the heavy ball with friction dynamical system, is not considered here, the interested reader can consult [12], [50], and references therein. Let us show the existence and uniqueness of a global solution of the Cauchy problem for (SD). Theorem 17.1.1. Let Φ : → R be a real-valued function which satisfies the following: (i) Φ is minorized, i.e., inf Φ > −∞. (ii) Φ is continuously differentiable, and ∇Φ : → is Lipschitz continuous on bounded sets, i.e., for each positive R > 0 there exists some constant LR ≥ 0 such that ∇Φ(v2 ) − ∇Φ(v1 ) ≤ LR v2 − v1 ∀v1 , v2 ∈ B(0, R). Then, for any u0 ∈ , the following properties hold: (a) global existence: There exists a unique classical global solution u ∈ C 1 ([0, +∞); ) of the Cauchy problem u˙(t ) = −∇Φ(u(t )), (17.5) u(0) = u0 . (b) descent property: t → Φ(u(t )) is a decreasing function, and, for all t ≥ 0, d dt
Φ(u(t )) = − u˙ (t )2 .
(c) finite energy property: ∞ u˙(t )2 d t ≤ Φ(u0 ) − inf Φ < +∞. 0
(17.6)
(17.7)
i
i i
i
i
i
i
666
book 2014/8/6 page 666 i
Chapter 17. Gradient flows
PROOF. The existence of solutions to (SD) is based on the Cauchy–Lipschitz theorem that we recall below (in a local form that fits our study, and in the classical global form). Theorem 17.1.2 (Cauchy–Lipschitz). (a) (local version) Let F : → be a vector field which is Lipschitz continuous on bounded sets, i.e., for each positive R > 0 there exists some constant LR ≥ 0 such that F (v2 ) − F (v1 ) ≤ LR v2 − v1 ∀v1 , v2 ∈ B(0, R). Then, for any u0 ∈ , there exists some T > 0, such that there exists a unique classical solution u ∈ C 1 ([−T , +T ]; ) of the Cauchy problem u˙(t ) = F (u(t )) (17.8) u(0) = u0 . (b) (global version) Let F : → be a vector field which is globally Lipschitz continuous, i.e., there exists some constant L ≥ 0 such that F (v2 ) − F (v1 ) ≤ Lv2 − v1
∀v1 , v2 ∈ .
Then, for any u0 ∈ there exists a unique global classical solution u ∈ C 1 (R; ) of Cauchy problem (17.8). PROOF OF THEOREM 17.1.1 CONTINUED. By the Cauchy–Lipschitz theorem (local version), there exists some T > 0 such that Cauchy problem (17.5) admits a unique local solution u ∈ C 1 ([0, T ]; ). Let 4 5 T ma x = sup T > 0 : there exists a solution of (17.5) on [0, T ] . Thus the maximal solution u of (17.5) belongs to C 1 ([0, T ma x [; ). Let us show that T ma x = +∞. We argue by contradiction and assume that T ma x < +∞. Let us show that lim t →Tmax u(t ) exists. For all t ∈ [0, T ma x [ we have u˙(t ) + ∇Φ(u(t )) = 0. By the descent property (17.2) d dt
Φ(u(t )) = − u˙(t )2 .
(17.9)
By integration of (17.9) on [0, T ], with 0 < T < T ma x , we obtain T u˙(t )2 d t + Φ(u(T )) − Φ(u0 ) = 0. 0
By assumption (i), Φ is minorized. Hence T u˙(t )2 d t ≤ Φ(u0 ) − inf Φ < +∞. 0
This majorization being valid for any 0 < T < T ma x , taking the supremum with respect to T yields Tmax u˙(t )2 d t ≤ Φ(u0 ) − inf Φ < +∞. (17.10) 0
i
i i
i
i
i
i
17.1. The classical continuous steepest descent
book 2014/8/6 page 667 i
667
From (17.10) we deduce a property of uniform continuity of u : [0, T ma x [→ . For any 0 ≤ s ≤ t < T ma x , by the Cauchy–Schwarz inequality
t
u(t ) − u(s) ≤
u˙(τ)d τ s
≤
$
1
t
t −s
2
u˙(τ) d τ
2
s
≤
$
≤C
t −s $
Tmax
1 2
2
u˙(τ) d τ
0
t −s
1
with C = (Φ(u0 ) − inf Φ) 2 given by (17.10). Thus, u : [0, T ma x [→ is Hölder continuous and hence uniformly continuous from [0, T ma x [ into the complete metric space . By the classical continuous extension theorem, u admits a unique extension by continuity to [0, T ma x ], that is, lim u(t ) := uTmax exists. t →Tmax
By applying the Cauchy–Lipschitz theorem (local version) with Cauchy data uTmax at initial time t = T ma x , we obtain a solution w ∈ C 1 ([T ma x , T1 ]; ) with T1 > T ma x of the Cauchy problem: ˙ ) = −∇Φ(w(t )), w(t w(T ma x ) = uTmax . The function u˜ : [0, T1 ] → which is equal to u on [0, T ma x ] and equal to w on [T ma x , T1 ] belongs to C 1 ([0, T1 ]; ). Indeed, u˜ and its time derivative are continuous at t = T ma x . This last property follows from the continuity of the vector field ∇Φ: lim
t →Tmax , t Tmax
u˙˜(t ) = −∇Φ(uTmax ).
Thus, u˜ satisfies the steepest descent equation on an interval strictly larger than T ma x , which contradicts the maximality of T ma x .
17.1.2 Asymptotic properties, t → +∞ Let us examine the asymptotic behavior of the trajectories and provide some first general properties. This is a topic of fundamental importance in optimization. Moreover, it models the evolution of the transition between the initial state and the final equilibrium for many systems in physics, biology, economics, and so forth. We will use the following lemma. Lemma 17.1.2. Let g : [0, +∞[→ [0, +∞[ be a continuous function which satisfies (i) and (ii): +∞ (i) 0 g (t )2 d t < +∞. (ii) g is Lipschitz continuous on [0, +∞[. Then g (t ) → 0 as t → +∞.
i
i i
i
i
i
i
668
book 2014/8/6 page 668 i
Chapter 17. Gradient flows
PROOF. Let us argue by contradiction and suppose that the statement ( g (t ) → 0 as t → +∞) is false. Then, there exists some ε0 > 0 and a sequence tn → +∞ such that for each n ∈ N, g (tn ) ≥ ε0 . After extraction of a subsequence, we can assume that, for each n ∈ N, |tn+1 − tn | > 1. Suppose that g is L-Lipschitz continuous. On the interval ε ε [tn − 2L0 , tn + 2L0 ], we have g (t ) ≥ g (tn ) − L|t − tn | ≥ ε0 − L|t − tn | ε0 ≥ . 2
4 ε 5 Set η = inf 12 , 2L0 > 0. The intervals [tn −η, tn +η] do not overlap, and, on each of them, ε g is minorized by 20 . From this we infer +∞ tn +η 2 g (t ) d t ≥ g (t )2 d t 0
tn −η
n
≥
2η
n
a clear contradiction with
+∞ 0
! ε "2 0
2
= +∞,
g (t )2 d t < +∞.
Theorem 17.1.3. Let Φ : → R be a real-valued function which satisfies the following: (i) Φ is minorized, i.e., inf Φ > −∞. (ii) Φ is continuously differentiable, and ∇Φ : → is Lipschitz continuous on bounded sets. Let u ∈ C 1 ([0, +∞); ) be a bounded orbit generated by the gradient flow associated to Φ. Then, lim u˙ (t ) = 0, lim ∇Φ(u(t )) = 0. t →+∞
t →+∞
As a consequence, if u(tn ) → u∞ for some sequence tn → +∞, then ∇Φ(u∞ ) = 0. PROOF. Let us show that the function g : [0, +∞[→ [0, +∞[, which is defined for any t ≥ 0 by g (t ) = u˙(t ), satisfies conditions (i) and (ii) of Lemma 17.1.2. ∞ (i) By the finite energy property (see (17.7) in Theorem 17.1.1), we have 0 u˙(t )2 d t ≤ +∞ Φ(u0 ) − inf Φ < +∞. Equivalently, 0 g (t )2 d t < +∞, which is item (i). (ii) Let us now show that g is Lipschitz continuous on [0, +∞[. Since u has been assumed to be bounded, there exists some R > 0 such that u(t ) ≤ R for all t ≥ 0. Since ∇Φ : → is Lipschitz continuous on bounded sets, there exists some LR > 0 such that for all s, t ≥ 0, ∇Φ(u(t )) − ∇Φ(u(s)) ≤ LR u(t ) − u(s). From this, and the definition of the gradient flow, we infer that for all s, t ≥ 0, |g (t ) − g (s)| = | u˙(t ) − u˙(s) | ≤ u˙(t ) − u˙(s) ≤ ∇Φ(u(t )) − ∇Φ(u(s)) ≤ LR u(t ) − u(s).
(17.11)
i
i i
i
i
i
i
17.1. The classical continuous steepest descent
book 2014/8/6 page 669 i
669
Let us complete the proof by showing that u is Lipschitz continuous on [0, +∞[. Indeed, from t u˙ (τ)d τ
u(t ) − u(s) = s
and the definition of the gradient flow, we obtain t u˙(τ)d τ u(t ) − u(s) ≤
s t
≤
∇Φ(u(τ))d τ. s
Moreover we have, for any τ ≥ 0, ∇Φ(u(τ)) ≤ ∇Φ(u0 ) + LR u(τ) − u0 ≤ ∇Φ(u0 ) + LR (u0 + R). Setting CR := ∇Φ(u0 ) + LR (u0 + R), we deduce from the two above inequalities that u(t ) − u(s) ≤ CR |t − s|. Returning to (17.11), we obtain that for all s, t ≥ 0, |g (t ) − g (s)| ≤ CR LR |t − s|. By Lemma 17.1.2 we infer that g (t ) → 0 as t → +∞, which gives lim t →+∞ u˙ (t ) = 0 and, by the gradient flow definition, limt →+∞ ∇Φ(u(t )) = 0. Since ∇Φ is continuous, this clearly implies that the strong cluster points of the orbits are critical points of Φ. The following classical example from Palis and De Melo [315] shows that without any further geometrical assumption on Φ, bounded orbits of the gradient flow may fail to converge. Let Φ : R2 → R be defined (in polar coordinates) by + * ⎧ 1 if r < 1, ⎪ ⎨exp r 2 −1 Φ(r cos θ, r sin θ) = 0 if * r =+ 1, * + ⎪ ⎩exp 1 sin 1 − θ if r > 1. r −1 r 2 −1 Then Φ is C 1 and there exists an orbit of the gradient flow whose ω-limit set is the unit circle S 1 . A similar example of such a “Mexican hat” function was given in [2]. Thus, in order to obtain the asymptotic convergence property of the orbits of the gradient flow, we need to make some additional assumptions. The above example suggests that we have to make some geometrical assumptions on Φ which prevent it from wild oscillations. Because of their dissipative properties (existence of strict Lyapunov functions), gradient flows enjoy remarkable asymptotic convergence properties (t → +∞). Indeed, we are going to show that, in some important cases, their orbits converge to equilibria which are critical points (global minima in the convex case) of the potential Φ. More precisely, there are two important classes of functions Φ for which it has been established the convergence of the orbits of the gradient flow associated to Φ: (i) the convex case (and related situations like quasi-convex), (ii) the analytic case (and related situations like semialgebraic). In the two next sections we are going to examine successively these situations.
i
i i
i
i
i
i
670
book 2014/8/6 page 670 i
Chapter 17. Gradient flows
17.2 The gradient flow associated to a convex potential In many instances (PDEs, constrained problems, sparse representation in signal/image, optimization problems involving the total variation and BV spaces), one has to consider a potential Φ which is nonsmooth. A natural way to manage this situation is to use a regularization method. We thus reduce to the classical steepest descent. This approach has been particularly successful in the case of convex potentials. This is the situation that we now examine.
17.2.1 Moreau–Yosida approximation of nonsmooth convex functions Let us consider a potential function Φ : → R ∪ {+∞} which is convex, lower semicontinuous, and proper: Φ ∈ Γ0 ( ) for short. For Φ ∈ Γ0 ( ), a natural extension of the notion of gradient is the notion of subdifferential, which we recall below. Given u ∈ dom Φ z ∈ ∂ Φ(u) ⇔ Φ(v) ≥ Φ(u) + 〈z, v − u〉 ∀v ∈ . The classical steepest descent becomes a differential inclusion − u˙ (t ) ∈ ∂ Φ(u(t )). The lack of continuity of the operator ∂ Φ prevents a direct application of a general existence theorem for differential equations. A classical way to overcome this difficulty is to use the Moreau–Yosida regularization of the nonsmooth potential Φ. This technique is widely used in convex variational analysis, which is why we present a detailed study. The following statements respectively give its definition, regularization, and approximation properties. Proposition 17.2.1. Let Φ : → R∪{+∞} be a convex, lower semicontinuous, and proper function. For any λ > 0, the Moreau–Yosida approximation of index λ of Φ is the function Φλ : → R which is defined for all u ∈ by 1 2 Φλ (u) = inf Φ(v) + u − v . (17.12) v∈ 2λ 1. The infimum in (17.12) is attained at a unique point Jλ u ∈ , which satisfies 1
u − Jλ u2 ; 2λ Jλ u + λ∂ Φ(Jλ u) # u.
Φλ (u) = Φ(Jλ u) +
(17.13) (17.14)
The operator Jλ = (I + λ∂ Φ)−1 : → is everywhere defined and nonexpansive. It is called the resolvent of index λ of A = ∂ Φ. 2. Φλ is convex, and continuously differentiable. For each u ∈ , its gradient is given by ∇Φλ (u) =
1 λ
(u − Jλ u).
(17.15)
3. The operator Aλ =
1
(17.16) (I − Jλ ) λ is called the Yosida approximation of index λ of the maximal monotone operator A = ∂ Φ. It is Lipschitz continuous with Lipschitz constant λ1 . Thus, for all u ∈ Aλ u = ∇Φλ (u),
(17.17)
Aλ u ∈ ∂ Φ(Jλ u).
(17.18)
i
i i
i
i
i
i
17.2. The gradient flow associated to a convex potential
book 2014/8/6 page 671 i
671
1 PROOF. For u ∈ fixed, the function v → Φ(v) + 2λ u − v2 is strictly convex, lower semicontinuous, and coercive (indeed it is strongly convex). Therefore it reaches its minimal value at a unique point Jλ u ∈ which satisfies
Φλ (u) = inf
v∈
Φ(v) +
1 u − v2 = Φ(Jλ u) + u − Jλ u2 , 2λ 2λ 1
and Φλ : → R. By writing the first-order optimality condition, and using the subdifferential additivity rule of Moreau and Rockafellar, Theorem 9.5.4, we obtain 1 ∂ Φ(Jλ u) + (Jλ u − u) # 0, λ
(17.19)
which gives (17.14). Then notice that Φλ is the epi-sum of the two convex functions Φ 1 and 2λ · 2 : 1 Φλ = Φ#e 2 . 2λ By Proposition 9.2.2, Φλ is a convex function. Let us now prove that Jλ = (I + λ∂ Φ)−1 : → is nonexpansive. Take u, v ∈ . By (17.19), 1 λ 1 λ
(u − Jλ u) ∈ ∂ Φ(Jλ u), (v − Jλ v) ∈ ∂ Φ(Jλ v),
and the monotonicity property of ∂ Φ (see Proposition 17.2.3), we obtain 1 1 (u − Jλ u) − (v − Jλ v), Jλ u − Jλ v ≥ 0. λ λ
(17.20)
Equivalently 〈Jλ u − Jλ v, u − v〉 ≥ Jλ u − Jλ v2 , which by the Cauchy–Schwarz inequality gives Jλ u − Jλ v ≤ u − v. Set Aλ = λ1 (I − Jλ ). Reformulating (17.20) with Aλ , we obtain 〈Aλ u − Aλ v, (u − λAλ u) − (v − λAλ v)〉 ≥ 0. Equivalently 〈Aλ u − Aλ v, u − v〉 ≥ λAλ u − Aλ v2 , which, by the Cauchy–Schwarz inequality, implies that Aλ is Lipschitz continuous with Lipschitz constant λ1 : 1 Aλ u − Aλ v ≤ u − v. λ
i
i i
i
i
i
i
672
book 2014/8/6 page 672 i
Chapter 17. Gradient flows
Let us now prove that Φλ is differentiable, with ∇Φλ (u) = Aλ u, for any u ∈ . In order to verify the Fréchet differentiability of Φλ at u ∈ , set, for any v ∈ , Pλ (v) := Φλ (v) − Φλ (u) − 〈Aλ u, v − u〉,
(17.21)
and prove that Pλ (v) = O(v − u). According to the convexity of Φλ , let us first prove that Pλ (v) ≥ 0. By definition (17.21) of Pλ , definition (17.16) of Aλ , and (17.13) λ λ Pλ (v) = Φ(Jλ v) + Aλ v2 − Φ(Jλ u) + Aλ u2 − 〈Aλ (u), v − u〉 2 2 λ (17.22) = (Φ(Jλ v) − Φ(Jλ u)) + Aλ v2 − Aλ u2 − 〈Aλ (u), v − u〉. 2 By Aλ u ∈ ∂ Φ(Jλ u) (see (17.18)), we have the convex subdifferential inequality Φ(Jλ v) − Φ(Jλ u) ≥ 〈Aλ u, Jλ v − Jλ u〉.
(17.23)
Let us successively combine (17.22) and (17.23), then use Jλ v = v −λAλ v, Jλ u = u −λAλ u to obtain λ Aλ v2 − Aλ u2 − 〈Aλ u, v − u〉 2 λ ≥ 〈Aλ u, v − u〉 + λ〈Aλ u, Aλ u − Aλ v〉 + Aλ v2 − Aλ u2 − 〈Aλ u, v − u〉. 2
Pλ (v) ≥ 〈Aλ u, Jλ v − Jλ u〉 +
After simplification λ Aλ u2 + Aλ v2 − λ〈Aλ u, Aλ v〉 2 2 λ ≥ Aλ u − Aλ v2 ≥ 0. 2
Pλ (v) ≥
λ
(17.24)
The above argument being valid for any u, v ∈ , by reversing the role of u and v, we obtain Φλ (u) − Φλ (v) − 〈Aλ v, u − v〉 ≥ 0. (17.25) Equivalently
Φλ (v) − Φλ (u) − 〈Aλ v, v − u〉 ≤ 0.
(17.26)
Using successively the definition of Pλ (v), (17.26), the Cauchy–Schwarz inequality, and the Lipschitz continuity with Lipschitz constant λ1 of Aλ Pλ (v) = (Φλ (v) − Φλ (u) − 〈Aλ v, v − u〉) + 〈Aλ v − Aλ u, v − u〉 ≤ 〈Aλ v − Aλ u, v − u〉 ≤ Aλ v − Aλ uv − u 1 ≤ v − u2 . λ
(17.27)
Combining (17.24) and (17.27) gives, for any v ∈ , 0 ≤ Φλ (v) − Φλ (u) − 〈Aλ u, v − u〉 ≤
1 λ
v − u2 ,
i
i i
i
i
i
i
17.2. The gradient flow associated to a convex potential
book 2014/8/6 page 673 i
673
which shows that Φλ is differentiable, with gradient at u being equal to Aλ u. Since Aλ is continuous this proves that Φλ is continuously differentiable. Remark 17.2.1. Just like for classical convolution, it is the regularity of the quadratic kernel · 2 which confers to Φλ its regularity property. But, by contrast with classical convolution, in general we cannot expect more regularity than C 1,1 for Φλ , i.e., C 1 with a Lipschitz continuous gradient. Take, for example, = R and Φ equal to the indicator 1 (r + )2 , which is C 1,1 but not C 2 . function of (−∞, 0]. Then Φλ (r ) = 2λ We complement Proposition 17.2.1 by showing the following approximation results. Proposition 17.2.2. Let Φ : → R ∪ {+∞} be convex, lower semicontinuous, and proper. Then we have the following: (i) Monotone convergence: Φλ (u) ↑ Φ(u)
as λ ↓ 0 ∀u ∈ .
(ii) Convergence of resolvents: Jλ u → u as λ → 0 ∀u ∈ dom Φ; Jλ u → projdom Φ u as λ → 0 ∀u ∈ . (iii) Convergence of Yosida approximation: For all u ∈ dom ∂ Φ Aλ u → ∂ Φ(u)0 as λ → 0; 0 Aλ u ≤ ∂ Φ(u) ∀λ > 0.
(17.28) (17.29)
PROOF. (i) Let us fix u ∈ . By definition (17.12) of Φλ , for any v ∈ , Φλ (u) ≤ Φ(v) +
1 2λ
u − v2 .
(17.30)
Taking v = u in (17.30) we obtain Φλ (u) ≤ Φ(u).
(17.31)
1 u − v2 increases, as well as the As λ decreases, the sequence of functions v → Φ(v) + 2λ sequence of its minimal values. Hence limλ→0 Φλ (u) exists, which by (17.31) gives
lim Φλ (u) ≤ Φ(u).
λ→0
(17.32)
To show the opposite inequality, we successively consider the cases u ∈ dom Φ, u ∈ dom Φ, and u ∈ / dom Φ. (a) u ∈ dom Φ. By considering a continuous affine minorant of Φ, we obtain the existence of some positive constant c such that Φ(v) ≥ −c(1 + v)
∀v ∈ .
(17.33)
By (17.31), the definition of Φλ , and (17.33), 1
u − Jλ u2 2λ 1 ≥ −c(1 + Jλ u) + u − Jλ u2 . 2λ
Φ(u) ≥ Φλ (u) = Φ(Jλ u) +
(17.34)
i
i i
i
i
i
i
674
book 2014/8/6 page 674 i
Chapter 17. Gradient flows
By using the triangle inequality in (17.34), u − Jλ u2 − 2λcu − Jλ u − 2λ(c + cu + Φ(u)) ≤ 0, which after elementary computation gives u − Jλ u ≤ 2λc + 2λ(c + cu + Φ(u)). Since u ∈ dom Φ, it follows that Jλ u → u as λ → 0. By Φλ (u) ≥ Φ(Jλ u), and by lower semicontinuity of Φ we deduce that lim Φλ (u) ≥ lim inf Φ(Jλ u)
λ→0
λ→0
≥ Φ(u).
(17.35)
Combining (17.32) and (17.35) gives limλ→0 Φλ (u) = Φ(u) for u ∈ dom Φ. (b) u ∈ dom Φ. Let us show that Jλ u → u as λ → 0 still holds true. Take v ∈ dom Φ. By the triangle inequality and the nonexpansive property of Jλ , Jλ u − u ≤ Jλ u − Jλ v + Jλ v − v + v − u ≤ 2v − u + Jλ v − v.
(17.36)
Letting λ → 0, and since v ∈ dom Φ lim sup Jλ u − u ≤ 2v − u. λ→0
This being true for any v ∈ dom Φ, we can take v arbitrarily close to u, which gives Jλ u → u
∀u ∈ dom Φ.
(17.37)
As in case (a), we complete the argument by using the lower semicontinuity property of Φ. (c) u ∈ / dom Φ. Since Jλ u ∈ dom Φ, we have u − Jλ u ≥ dist(u, dom Φ) := γ > 0. On the other hand, by the nonexpansive property, Jλ u remains bounded: taking some u0 ∈ dom Φ, we have Jλ u0 → u0 , and Jλ u ≤ Jλ u0 + u − u0 ≤ M < +∞.
(17.38)
By (17.33) and (17.38) 1
Φλ (u) ≥ −c(1 + Jλ u) + ≥ −c(1 + M ) +
γ2 2λ
2λ
u − Jλ u2
.
Thus limλ→0 Φλ (u) = +∞ = Φ(u).
i
i i
i
i
i
i
17.2. The gradient flow associated to a convex potential
book 2014/8/6 page 675 i
675
(ii) In the process, in (17.37), we have obtained that Jλ u → u for any u ∈ dom Φ. Let us complete this result by examining the convergence of (Jλ u) when u ∈ / dom Φ. By (17.38), (Jλ u) remains bounded. Let ξ be a weak cluster point of the generalized sequence (Jλ u). For simplicity we write Jλ u ξ weakly in . Since Jλ u ∈ dom ∂ Φ ⊂ dom Φ, by the weak closedness property of the closed convex set dom Φ, we have ξ ∈ dom Φ. By (17.18), we have λ1 (u − Jλ u) ∈ ∂ Φ(Jλ u). By the subdifferential inequality, for any v ∈ dom Φ, 1 Φ(v) ≥ Φ(Jλ u) + 〈u − Jλ u, v − Jλ u〉. λ Using a continuous affine minorant of Φ (see (17.33)) and developing the above expression we obtain λΦ(v) ≥ −λc(1 + Jλ u) + 〈u − Jλ u, v〉 − 〈Jλ u, u〉 + Jλ u2 .
(17.39)
Passing to the limit as λ → 0, and using the lower semicontinuity for the weak topology of the convex continuous function z → Jλ z2 , we obtain 〈u − ξ , v − ξ 〉 ≤ 0
∀v ∈ dom Φ.
This inequality can be extended by continuity to v ∈ dom Φ, which gives 〈u − ξ , v − ξ 〉 ≤ 0 ∀v ∈ dom Φ, ξ ∈ dom Φ. This is the obtuse angle condition which characterizes ξ = projdom Φ u. By uniqueness of the weak sequential cluster point we deduce that the whole generalized sequence Jλ u converges weakly to projdom Φ u. Strong convergence is obtained by showing the convergence of the norms. To simplify notation set p(u) = projdom Φ u. Returning to (17.39), by taking the lim sup as λ → 0, we obtain 0 ≥ 〈u − p(u), v〉 − 〈 p(u), u〉 + lim sup Jλ u2
∀v ∈ dom Φ.
This inequality can be extended by continuity to v ∈ dom Φ. By taking v = p(u) we obtain 0 ≥ 〈u − p(u), p(u)〉 − 〈 p(u), u〉 + lim sup Jλ u2 , that is, p(u)2 ≥ lim sup Jλ u2 , which completes the proof of (ii). (iii) Fix u ∈ dom ∂ Φ. By (17.18), we have Aλ u ∈ ∂ Φ(Jλ u). By the monotonicity property of ∂ Φ, we deduce that, for any z ∈ ∂ Φ(u), 〈Aλ u − z, Jλ u − u〉 ≥ 0. Since Jλ u − u = −λAλ u, we obtain Aλ u2 ≤ 〈Aλ u, z〉.
i
i i
i
i
i
i
676
book 2014/8/6 page 676 i
Chapter 17. Gradient flows
Hence, by the Cauchy–Schwarz inequality Aλ u ≤ z
∀z ∈ ∂ Φ(u).
(17.40)
Since ∂ Φ(u) is a closed convex nonempty set, it has a unique element of minimal norm ∂ Φ(u)0 , and (17.40) gives Aλ u ≤ ∂ Φ(u)0 .
(17.41)
Hence (Aλ u) remains bounded. Let η be a weak sequential cluster point of the generalized sequence (Aλ u). For simplicity we write Aλ u η
weakly in .
By passing to the limit on the subdifferential inequality, Φ(v) ≥ Φ(Jλ u) + 〈Aλ u, v − Jλ u〉
∀v ∈ dom Φ,
and using that Jλ u → u, we obtain Φ(v) ≥ Φ(u) + 〈η, v − u〉
∀v ∈ dom Φ,
that is, η ∈ ∂ Φ(u). On the other hand, by (17.41) η ≤ lim inf Aλ u ≤ ∂ Φ(u)0 . λ→0
(17.42)
By η ∈ ∂ Φ(u) and (17.42) we conclude that η = ∂ Φ(u)0 . By uniqueness of its weak sequential cluster point, the whole sequence (Aλ u) converges weakly to ∂ Φ(u)0 . Moreover, by (17.41) lim sup Aλ u ≤ ∂ Φ(u)0 . λ→0
Weak convergence and convergence of the norms imply strong convergence, which completes (iii). Remark 17.2.2. Let us mention some related properties of the Moreau–Yosida approximation. (i) By Proposition 17.2.2, Jλ u → u as λ → 0 for all u ∈ dom Φ. Since Jλ u ∈ dom ∂ Φ, this property implies that the domain of ∂ Φ is dense in the domain of Φ, i.e., dom ∂ Φ = dom Φ. 1 (ii) Since Φλ = Φ#e 2λ · 2 , the Legendre–Fenchel calculus gives
λ Φ∗λ = Φ∗ + · 2 . 2 Thus, for any λ > 0, μ > 0, ((Φλ )μ )∗ = (Φλ )∗ +
μ 2
· 2
λ μ = Φ∗ + · 2 + · 2 2 2 λ + μ = Φ∗ + · 2 2 = (Φλ+μ )∗ .
i
i i
i
i
i
i
17.2. The gradient flow associated to a convex potential
Hence
(Φλ )μ = Φλ+μ .
book 2014/8/6 page 677 i
677
(17.43)
This relation defines a semiflow λ → Φλ , which suggests interpreting λ as a time variable. We shall confirm this interpretation when considering the discrete time version of our dynamics. Indeed (see [37, Remark 3.32]), one can prove that d
1 Φλ (u) = − ∇Φλ (u)2 . dλ 2 Thus, (t , u) → w(t , u) = Φ t (u) is a solution of the Hamilton–Jacobi equation w t + 12 ∇ u w2 = 0; w(0, u) = Φ(u).
(17.44)
It is the viscosity solution of this equation; w(t , u) = Φ t (u), with 1 2 Φ t (u) = inf Φ(v) + u − v , v∈ 2t is known as the Lax or Hopf formula for the viscosity solution of (17.44).
17.2.2 Gradient flow for a convex lower semicontinuous potential on a Hilbert space: Existence and uniqueness results Let us consider a potential function Φ ∈ Γ0 ( ), i.e., Φ : → R ∪ {+∞} is convex, lower semicontinuous, and proper. This is an important situation that is preparatory to the study of gradient flows associated with general nonsmooth potentials. For Φ ∈ Γ0 ( ), a natural extension of the notion of gradient is the notion of subdifferential. The operator ∂ Φ : → 2 is multivalued, with domain dom ∂ Φ = {u ∈ dom Φ : ∂ Φ(u) = } which is a dense subset of the domain of Φ. Thus we are led to consider the differential inclusion (GSD)
u˙(t ) + ∂ Φ(u(t )) # 0,
called the generalized steepest descent. When Φ is differentiable, ∂ Φ = ∇Φ, and we recover the classical (SD) equation. As a model situation, take = L2 (Ω), where Ω is a regular bounded open set in Rn , and 1 ∇v(x)2 d x if v ∈ H01 (Ω), Φ(v) = 2 Ω / H01 (Ω). +∞ if v ∈ L2 (Ω), v ∈ Then A = ∂ Φ : L2 (Ω) → L2 (Ω) is the (minus) Laplace operator dom A = H 2 (Ω) ∩ H01 (Ω), A(v) = −Δv for v ∈ dom(A), and (GSD) is the heat equation (with Dirichlet boundary condition) ∂u ∂t
− Δu = 0.
Let us now return to (GSD) for a general potential Φ ∈ Γ0 ( ) and note that (GSD) is a particular instance of evolution equations governed by maximal monotone operators. Let
i
i i
i
i
i
i
678
book 2014/8/6 page 678 i
Chapter 17. Gradient flows
us recall some basic notions concerning maximal monotone operators that will be useful for our study; see [66], [85], [135], [364] for an extended presentation. It is convenient to identify an operator A : → 2 with its graph. We equivalently write z ∈ A(u) or (u, z) ∈ A. Definition 17.2.1. Let A : → 2 be an operator. (i) A is monotone if for all u1 , u2 ∈ dom A, for all z1 ∈ Au1 , z2 ∈ Au2 , 〈z2 − z1 , u2 − u1 〉 ≥ 0.
(17.45)
(ii) A is maximal monotone if it is monotone, and there is no proper monotone extension of A, i.e., 〈¯ z − z, u¯ − u〉 ≥ 0 ∀(u, z) ∈ A ⇒ z¯ ∈ A( u¯). Maximality holds in the class of monotone operators with respect to the inclusion relation on the graphs: it is not possible to extend the graph of a maximal monotone operator A into the graph of a monotone operator which is strictly larger than A. Subdifferentials of closed convex functions provide an important subclass of maximal monotone operators. Proposition 17.2.3. Let Φ : → R∪{+∞} be a convex, lower semicontinuous, and proper function. Then, its subdifferential A = ∂ Φ : → 2 is a maximal monotone operator. PROOF. Let z1 ∈ ∂ Φ(u1 ), z2 ∈ ∂ Φ(u2 ). Adding the subdifferential inequalities Φ(u2 ) ≥ Φ(u1 ) + 〈z1 , u2 − u1 〉 , Φ(u1 ) ≥ Φ(u2 ) + 〈z2 , u1 − u2 〉 , gives (17.45), and the monotonicity of A = ∂ Φ. The maximal monotonicity of A = ∂ Φ is a direct consequence of the fact that the operator I + A is surjective, i.e., R(I + A) = ; see Proposition 17.2.1(1). Suppose that there exists B monotone, B ⊃ A, and z ∈ B u. Since R(I + A) = there exists some u˜ ∈ dom A such that u˜ + A( u˜ ) # u + z. (17.46) Since B ⊃ A, we have B( u˜) ⊃ A( u˜), and (17.46) gives u˜ + B( u˜ ) # u + z. On the other hand, since z ∈ B u u + B(u) # u + z. Comparing these two last relations, by the contraction property of (I + B)−1 (a direct consequence of the monotonicity of B), we obtain u = u˜ . Returning to (17.46), we obtain z ∈ A(u), which proves that A = ∂ Φ is maximal monotone. In the above argument, the maximal monotonicity of A = ∂ Φ has been obtained as a consequence of the monotonicity of ∂ Φ and of the surjectivity of I + ∂ Φ. Indeed, this is a particular instance of the following theorem, due to Minty (see [292]), which provides a very useful characterization of maximal monotone operators.
i
i i
i
i
i
i
17.2. The gradient flow associated to a convex potential
book 2014/8/6 page 679 i
679
Theorem 17.2.1 (Minty). Let A : → 2 be a monotone operator. Then the following equivalence holds: A is maximal monotone ⇐⇒ R(I + A) = . PROOF. A detailed proof of this nontrivial result can be found in [66, Theorem 5, Chapter 6, Section 7], [85, Theorem 21.1], [135]. A maximal monotone operator, and in particular the subdifferential of a function Φ ∈ Γ0 ( ), is demiclosed, a property that is useful, and is described below. Proposition 17.2.4. Let A : → 2 be a maximal monotone operator. Then A is demiclosed, i.e., (un → u strongly, zn → z weakly, zn ∈ A(un ) ∀ n ∈ N) =⇒ (z ∈ A(u)). PROOF. By monotonicity of A, for any w ∈ A(v), for all n ∈ N 〈zn − w, un − v〉 ≥ 0. Thanks to the respective strong and weak convergence properties of un − v and zn − w, passing to the limit gives 〈z − w, u − v〉 ≥ 0. This being true for any (v, w) ∈ A, by maximal monotonicity of A, we conclude that z ∈ Au. Note that in accordance with the theory of evolution equations governed by maximal monotone operators, in general we cannot expect to obtain classical global solutions of (GSD). Take, for example, = R and Φ(v) = v + . Given Cauchy data u0 > 0, it can be easily seen that the unique solution of (GSD) is the function u − t for 0 ≤ t ≤ u0 , u(t ) = 0 0 for t ≥ u0 . Intuitively, the drop of water (u(t ), Φ(u(t ))) ∈ R2 slides along the oblique line y = x at constant speed, and at time t = u0 it reaches the origin. Then, it stops. The resulting global solution u is not differentiable at t = u0 . It is only Lipschitz continuous on R. In order to define the notion of strong solution for (GSD), let us first recall some notions concerning vector-valued functions of real variables. (See [135, Appendix] for more details.) Definition 17.2.2. Given T ∈ R+ , a function f : [0, T ] −→ is said to be absolutely continuous if one of the following equivalent properties holds: (i) There exists an integrable function g : [0, T ] −→ such that t f (t ) = f (0) + g (s) d s ∀t ∈ [0, T ] . 0
(ii) f is continuous and its distributional derivative belongs to the Lebesgue space L1 ([0, T ] ; ) .
i
i i
i
i
i
i
680
book 2014/8/6 page 680 i
Chapter 17. Gradient flows
(iii) For every ε > 0, there exists some η > 0 such that for any finite family of intervals Ik = (ak , bk ) Ik ∩ I j = for i = j and |bk − ak | ≤ η =⇒ f (bk ) − f (ak ) ≤ ε. k
k
Moreover, an absolutely continuous function is differentiable almost everywhere, its pointwise derivative coincides with its distributional derivative (a.e.), and one can recover the function from its derivative f = g using the integration formula (i). Definition 17.2.3. We say that u(·) is a strong global solution of (GSD) if (i) and (ii) hold: (i) u : [0, +∞) → is continuous and is absolutely continuous on each interval [0, T ], T < ∞; (ii) for almost all t > 0, u(t ) ∈ dom ∂ Φ, and − u˙(t ) ∈ ∂ Φ(u(t )). Theorem 17.2.2. Let Φ : → R ∪ {+∞} be a convex, lower semicontinuous, and proper function. Supposed that Φ is minorized, i.e., inf Φ > −∞. Then, for any u0 ∈ dom ∂ Φ, there exists a unique strong global solution u : [0, +∞) → of the Cauchy problem u˙(t ) + ∂ Φ(u(t )) # 0, (17.47) u(0) = u0 . Moreover the following properties hold: (i) u(t ) ∈ dom ∂ Φ for all t ≥ 0. (ii) u˙ ∈ L2 (0, +∞; )∩L∞ (0, +∞; ); in particular u is Lipschitz continuous on [0, +∞). (iii) For each t ≥ 0, u has a right derivative, and d+u dt
(t ) = −∂ Φ(u(t ))0 ,
where ∂ Φ(u(t ))0 is the element of minimal norm of ∂ Φ(u(t )). +
(iv) t → dd tu (t ) is nonincreasing. (v) t → Φ(u(t )) is nonincreasing, absolutely continuous on each bounded interval [0, T ], and d (17.48) Φ(u(t )) = − u˙ (t )2 for almost all t > 0. dt PROOF. A. Uniqueness is a direct consequence of the monotonicity property of ∂ Φ. Suppose u and v are two strong global solutions of (GSD), associated with the respective ˙ ) ∈ ∂ Φ(v(t )), by monotonicity Cauchy data u0 and v0 . Since − u˙ (t ) ∈ ∂ Φ(u(t )) and −v(t of ∂ Φ ˙ ), u(t ) − v(t )〉 ≥ 0. 〈− u˙ (t ) + v(t (17.49) One can easily verify that the function t → u(t ) − v(t )2 is absolutely continuous on bounded intervals. By (17.49), it satisfies for almost every t ≥ 0 d dt
u(t ) − v(t )2 ≤ 0.
i
i i
i
i
i
i
17.2. The gradient flow associated to a convex potential
book 2014/8/6 page 681 i
681
As a consequence, t → u(t ) − v(t ) is a decreasing function, and u(t ) − v(t ) ≤ u(0) − v(0) ∀t ≥ 0.
(17.50)
For a given Cauchy data we have u(0) = v(0), which gives uniqueness of the solution. B. Existence. (a) For each λ > 0, consider the Cauchy problem obtained by replacing A = ∂ Φ by its Yosida approximation Aλ = ∇Φλ . By Proposition 17.2.1, the operator ∇Φλ : → is Lipschitz continuous with Lipschitz constant λ1 . Therefore, according to the Cauchy–Lipschitz theorem 17.1.2 (global version), there is a unique global classical solution uλ : [0, +∞) → of the Cauchy problem
u˙λ (t ) + ∇Φλ (uλ (t )) = 0, uλ (0) = u0 .
(17.51)
(b) Let us establish estimations on the sequence (uλ ). Energy estimate (17.7) gives
∞
u˙λ (t )2 d t ≤ Φλ (u0 ) − inf Φλ .
0
Since Φλ ≤ Φ, and inf Φλ = inf Φ (a direct consequence of the definition of Φλ ), it follows that ∞ u˙λ (t )2 d t ≤ Φ(u0 ) − inf Φ.
0
Due to the autonomous nature of the differential equation in (17.51), for any positive real number h > 0, t → uλ (t ) and t → uλ (t + h) are solutions of this differential equation. Using the argument developed in the proof of uniqueness, we obtain that t → uλ (t + h) − uλ (t ) is a decreasing function. Hence, for any 0 ≤ s ≤ t uλ (t + h) − uλ (t ) ≤ uλ (s + h) − uλ (s). Dividing by h > 0 and letting h → 0 gives u˙λ (t ) ≤ u˙λ (s).
(17.52)
Hence t → u˙λ (t ) is a decreasing function. In particular, for any t ≥ 0 u˙λ (t ) ≤ u˙λ (0) = ∇Φλ (u0 ). By (17.29), and u0 ∈ dom ∂ Φ, we have ∇Φλ (u0 ) ≤ ∂ Φ(u0 )0 . Hence sup u˙λ (t ) ≤ ∂ Φ(u0 )0 .
(17.53)
λ>0, t >0
(c) Let us show that for all T > 0, (uλ ) converges uniformly on [0, T ] as λ → 0. To that end, fix arbitrary T > 0, and prove that (uλ ) is a Cauchy sequence in the Banach space ' ([0, T ]; ), equipped with the supremum norm. Take λ > 0, μ > 0, and consider the corresponding solutions uλ , uμ of the Cauchy problems
u˙λ (t ) + ∇Φλ (uλ (t )) = 0, uλ (0) = u0 , u˙μ (t ) + ∇Φμ (uμ (t )) = 0, uμ (0) = u0 .
i
i i
i
i
i
i
682
book 2014/8/6 page 682 i
Chapter 17. Gradient flows
Set h(t ) := 12 uλ (t ) − uμ (t )2 . We have (recall that Aλ = ∇Φλ , Aμ = ∇Φμ ) ˙h(t ) = 〈u (t ) − u (t ), u˙ (t ) − u˙ (t )〉 λ μ λ μ = 〈uλ (t ) − uμ (t ), −Aλ uλ (t ) + Aμ uμ (t )〉.
(17.54)
From now on, to simplify notation, we omit the variable t and write uλ for uλ (t ). By (17.15) and (17.16) we have uλ = Jλ uλ +λAλ uλ and uμ = Jμ uμ +μAμ uμ . Replacing uλ and uμ by these expressions in (17.54) gives ˙h(t ) + 〈J u − J u , A u − A u 〉 + 〈A u − A u , λA u − μA u 〉 ≤ 0. λ λ μ μ λ λ μ μ λ λ μ μ λ λ μ μ
(17.55)
By (17.18) we have Aλ uλ ∈ ∂ Φ(Jλ uλ ), and Aμ uμ ∈ ∂ Φ(Jμ uμ ). Hence, by monotonicity of ∂ Φ, 〈Jλ uλ − Jμ uμ , Aλ uλ − Aμ uμ 〉 ≥ 0 and (17.55) gives
˙h(t ) + 〈A u − A u , λA u − μA u 〉 ≤ 0. λ λ μ μ λ λ μ μ
(17.56)
Let us consider the last expression in (17.56), develop it, and apply the Cauchy–Schwarz inequality: 〈Aλ uλ − Aμ uμ , λAλ uλ − μAμ uμ 〉 = λAλ uλ 2 + μAμ uμ 2 − (λ + μ)〈Aλ uλ , Aμ uμ 〉 ≥ λAλ uλ 2 + μAμ uμ 2 − (λ + μ)Aλ uλ Aμ uμ . (17.57) By adding the two elementary inequalities λ λAλ uλ Aμ uμ ≤ λAλ uλ 2 + Aμ uμ 2 , 4 μ 2 μAλ uλ Aμ uμ ≤ μAμ uμ + Aλ uλ 2 , 4 we obtain λ μ λAλ uλ 2 + μAμ uμ 2 − (λ + μ)Aλ uλ Aμ uμ ≥ − Aμ uμ 2 − Aλ uλ 2 . (17.58) 4 4 Combining (17.56), (17.57), and (17.58) we obtain ˙h(t ) ≤ λ A u 2 + μ A u 2 . 4 μ μ 4 λ λ From u˙λ = −Aλ uλ , u˙μ = −Aμ uμ , and (17.53) we deduce that ˙h(t ) ≤ λ + μ ∂ Φ(u )0 2 . 0 4 By integration of (17.59), and definition of h, we obtain 0 1 $ 2λ + μ uλ (t ) − uμ (t ) ≤ ∂ Φ(u0 )0 t 2
(17.59)
∀t ≥ 0
(17.60)
i
i i
i
i
i
i
17.2. The gradient flow associated to a convex potential
and uλ − uμ L∞ (0,T ;H ) ≤
book 2014/8/6 page 683 i
683
0 1 2λ+μ 2
$ ∂ Φ(u0 )0 T .
Thus (uλ ) is a Cauchy sequence in the Banach space ' ([0, T ]; ), equipped with the supremum norm, and hence converges uniformly. The argument being valid for all T > 0, set uλ → u L∞ (0, T ; ) ∀T ≥ 0 as λ → 0. Passing to the limit on (17.60) gives uλ (t ) − u(t ) ≤
0 1 2 λt 2
∂ Φ(u0 )0
∀t ≥ 0.
Moreover, by (17.53) Jλ uλ (t ) − uλ (t ) = λAλ uλ = λ u˙λ (t ) ≤ λ∂ Φ(u0 )0 . Hence Jλ u λ → u
L∞ (0, T ; )
∀T ≥ 0 as λ → 0.
(17.61)
Since ( u˙λ ) is bounded in L2 (0, ∞; ), we also deduce that u˙λ u˙
weak-L2 (0, T ; ) ∀T ≥ 0 as λ → 0.
(d) Let us now pass to the limit on the approximate equations and prove that u is a strong solution of the Cauchy problem (17.47). Note first that − u˙λ (t ) = ∇Φλ (uλ (t )) ∈ ∂ Φ(Jλ uλ (t )).
(17.62)
Then reformulate (17.62) in a variational way by using the Legendre–Fenchel conjugate: − u˙λ (t ) ∈ ∂ Φ(Jλ uλ (t )) > ∗ Φ(Jλ uλ (t )) + Φ (− u˙λ (t )) + 〈Jλ uλ (t ), u˙λ (t )〉 = 0. Since this last expression is always nonnegative, there is no loss of information by integrating it: T (Φ(Jλ uλ (t )) + Φ∗ (− u˙λ (t )) + 〈Jλ uλ (t ), u˙λ (t )〉) d t = 0. 0
Equivalently IΦ (Jλ uλ ) + IΦ∗ (− u˙λ ) + 〈Jλ uλ , u˙λ 〉L2 (0,T ;H ) = 0,
(17.63)
where the integral functionals IΦ and IΦ∗ on L2 (0, T ; ) are respectively defined by IΦ (v) = T T Φ(v(t )) d t and IΦ∗ (v) = 0 Φ∗ (v(t )) d t . By passing to the lower limit on (17.63) we 0 obtain lim inf IΦ (Jλ uλ ) + lim inf IΦ∗ (− u˙λ ) + lim inf〈Jλ uλ , u˙λ 〉L2 (0,T ;H ) ≤ 0. λ→0
λ→0
λ→0
i
i i
i
i
i
i
684
book 2014/8/6 page 684 i
Chapter 17. Gradient flows
Functionals IΦ and IΦ∗ are lower semicontinuous on L2 (0, T ; ) (Fatou’s lemma) and convex. Hence they are lower semicontinuous for the weak topology of L2 (0, T ; ). On the other hand, the scalar product 〈Jλ uλ , u˙λ 〉L2 (0,T ;H ) involves two sequences which respectively converge strongly and weakly in L2 (0, T ; ). As a consequence IΦ (u) + IΦ∗ (− u˙) + 〈u, u˙ 〉L2 (0,T ; ) ≤ 0. Equivalently
T
(Φ(u(t )) + Φ∗ (− u˙(t )) + 〈u(t ), u˙ (t )〉) d t ≤ 0.
(17.64)
0
By definition of the Legendre–Fenchel conjugate, in (17.64) the integrand is nonnegative. As a consequence Φ(u(t )) + Φ∗ (− u˙(t )) + 〈u(t ), u˙ (t )〉 = 0 for almost all t > 0. This is the Legendre–Fenchel extremality condition, which gives − u˙(t ) ∈ ∂ Φ(u(t )) for almost all t > 0.
(17.65)
Thus, we have proved the central part of Theorem 17.2.2: for any u0 ∈ dom∂ Φ, there exists a unique strong global solution u : [0, +∞) → of the Cauchy problem u˙(t ) + ∂ Φ(u(t )) # 0, u(0) = u0 . (e) Let us prove some further regularity properties of the solution u. By (17.53), for any λ > 0, for any 0 ≤ s ≤ t < ∞, uλ (t ) − uλ (s) ≤ |t − s| ∂ Φ(u0 )0 . By letting λ → 0, we obtain, for any 0 ≤ s ≤ t < ∞, u(t ) − u(s) ≤ |t − s| ∂ Φ(u0 )0 .
(17.66)
Hence, u is globally Lipschitz continuous on [0, +∞) and u˙(t ) ≤ ∂ Φ(u0 )0 for almost all t > 0. Let’s use again the autonomous nature of the (GSD) dynamic. Take t0 ≥ 0 such that u(t0 ) ∈ dom ∂ Φ, u is differentiable at t0 , and − u˙(t0 ) ∈ ∂ Φ(u(t0 )), which is verified for almost all t0 . The orbit t → u(t0 + t ) is the strong solution of (GSD) with Cauchy data u(t0 ) at time t = 0. By (17.66), for any h > 0 u(t0 + h) − u(t0 ) ≤ |h| ∂ Φ(u(t0 ))0 . Letting h → 0
u˙(t0 ) ≤ ∂ Φ(u(t0 ))0 .
Since − u˙ (t0 ) ∈ ∂ Φ(u(t0 )), we deduce that − u˙(t0 ) = ∂ Φ(u(t0 ))0 . Thus − u˙(t ) = ∂ Φ(u(t ))0 for almost all t > 0.
(17.67)
i
i i
i
i
i
i
17.2. The gradient flow associated to a convex potential
(f) Let us now prove that for all t ≥ 0, u(t ) ∈ dom ∂ Φ, d+u dt
book 2014/8/6 page 685 i
685 d+u (t ) dt
exists, and
(t ) = −∂ Φ(u(t ))0 .
To that end, let us use again (17.53), and − u˙λ (t ) = ∇Φλ (uλ (t )), to obtain ∇Φλ (uλ (t )) ≤ ∂ Φ(u0 )0 .
(17.68)
Let us fix t ≥ 0, and let η ∈ be a weak cluster point of the net (∇Φλ (uλ (t )))λ , say, ∇Φλ (uλ (t )) η. Since ∇Φλ (uλ (t )) ∈ ∂ Φ(Jλ uλ (t )), by passing to the limit on the subdifferential inequality ∀ξ ∈
Φ(ξ ) ≥ Φ(Jλ uλ (t )) + 〈∇Φλ (uλ (t )), ξ − Jλ uλ (t )〉
we obtain ∀ξ ∈
Φ(ξ ) ≥ Φ(u(t )) + 〈η, ξ − u(t )〉.
Hence η ∈ ∂ Φ(u(t )), which proves that for all t ≥ 0, u(t ) ∈ dom ∂ Φ. Moreover, by (17.68) and the lower semicontinuity of the norm for the weak topology, η ≤ ∂ Φ(u0 )0 . Since η ∈ ∂ Φ(u(t )), it follows that ∂ Φ(u(t ))0 ≤ ∂ Φ(u0 )0 . Again using the autonomous nature of (GSD) we infer that t → ∂ Φ(u(t ))0 is a nonincreasing function.
(17.69)
From this we infer that t → ∂ Φ(u(t ))0 is right-continuous. Fix t0 ≥ 0, and let tn → t0 , tn > t0 . By (17.69), ∂ Φ(u(tn ))0 ≤ ∂ Φ(u(t0 ))0 .
(17.70)
Let η be a weak cluster point of (∂ Φ(u(tn ))0 )n . By the strong × weak closedness property of the maximal monotone operator ∂ Φ (see Proposition 17.2.4), we have η ∈ ∂ Φ(u(t0 )). By (17.70) and the lower semicontinuity property of the norm with respect to the weak topology, we have η ≤ ∂ Φ(u(t0 ))0 . Hence η = ∂ Φ(u(t0 ))0 , which by uniqueness of the cluster point gives the weak convergence ∂ Φ(u(tn ))0 ∂ Φ(u(t0 ))0 . On the other hand, by (17.70), we have lim sup ∂ Φ(u(tn ))0 ≤ ∂ Φ(u(t0 ))0 . n
Hence ∂ Φ(u(tn ))0 converges strongly in to ∂ Φ(u(t0 ))0 , which proves the right-continuity of t → ∂ Φ(u(t ))0 . From this, we can easily deduce that for all t ≥ 0, u(t ) ∈ dom ∂ Φ,
i
i i
i
i
i
i
686
book 2014/8/6 page 686 i
Chapter 17. Gradient flows +
and dd tu (t ) exists. Since u is Lipschitz continuous, it is absolutely continuous on each bounded interval, and ∀t ≥ 0, ∀h > 0
t +h
u(t + h) − u(t ) =
u˙ (τ)d τ.
t
By (17.67) we have − u˙ (t ) = ∂ Φ(u(t ))0 for almost all t > 0. Hence ∀t ≥ 0, ∀h > 0
t +h
u(t + h) − u(t ) = −
∂ Φ(u(τ))0 d τ.
t
Dividing by h > 0, letting h → 0, and taking account of the right-continuity of τ → ∂ Φ(u(τ))0 , we obtain 1 h
(u(t + h) − u(t )) = −
1
h
t +h
∂ Φ(u(τ))0 (τ) d τ −→ −∂ Φ(u(t ))0 ,
t
which proves that for all t ≥ 0, u(t ) ∈ dom ∂ Φ, d+u dt
d+u (t ) dt
exists, and
(t ) = −∂ Φ(u(t ))0.
(17.71)
Combining (17.69) with (17.71) we obtain ) ) ) d+u ) ) ) t → ) (t )) is nonincreasing. ) dt ) (g) Let us complete the proof of Theorem 17.2.2 by showing that t → Φ(u(t )) is nonincreasing, absolutely continuous on each bounded interval [0, T ], and d dt
Φ(u(t )) = − u˙(t )2
for almost all t > 0.
(17.72)
Indeed this is a direct consequence of Proposition 17.2.5 below. By taking h = u˙ in (17.73) we obtain (17.72). The following result is due to Brézis [135, Lemma 3.3]. This is a remarkable derivation chain rule in a nonsmooth context. Proposition 17.2.5. Let u ∈ W 1,2 ([0, T ]; ). Suppose that u(t ) ∈ dom ∂ Φ for almost all t ∈ [0, T ] and that there exists some h ∈ L2 ([0, T ]; ) such that h(t ) ∈ ∂ Φ(u(t )) for almost all t ∈ [0, T ]. Then, the function t → Φ(u(t )) is absolutely continuous on [0, T ], and d dt
Φ(u(t )) = 〈h(t ), u˙ (t )〉 for almost all t ∈ [0, T ].
(17.73)
PROOF. By integration between t1 , t2 ∈ [0, T ] of the classical derivation chain rule d dt
Φλ (u) = 〈Aλ u, u˙〉,
i
i i
i
i
i
i
17.2. The gradient flow associated to a convex potential
we obtain
687
Φλ (u(t2 )) − Φλ (u(t1 )) =
book 2014/8/6 page 687 i
t2 t1
〈Aλ u(τ), u˙ (τ)〉d τ.
In order to pass to the limit as λ → 0 we apply the dominated convergence theorem. We have Aλ u(τ) ≤ ∂ Φ(u(τ))0 ≤ h(τ). As a consequence, |〈Aλ u, u˙〉| ≤ h u˙, which belongs to L1 ([0, T ]; ). On the other hand, Aλ u → ∂ Φ(u)0 , and Φλ (u) → Φ(u). Hence t2 Φ(u(t2 )) − Φ(u(t1 )) = 〈∂ Φ(u(τ))0 , u˙(τ)〉d τ. t1
As a consequence, the function t → Φ(u(t )) is absolutely continuous on [0, T ]. By h(t ) ∈ ∂ Φ(u(t )), the convex subdifferential inequality gives, for almost all t ∈ [0, T ], for all v ∈, Φ(v) − Φ(u(t )) ≥ 〈h(t ), v − u(t )〉. Taking successively v = u(t +ε), v = u(t −ε), dividing by ε, and letting ε → 0, for almost all t ∈ [0, T ] we obtain d Φ(u(t )) = 〈h(t ), u˙(t )〉, dt which completes the proof. Let us show some further convergence properties of the net (uλ ). Our approach relies on a variational argument first developed in [38] and [74] and which uses the following elementary result. Lemma 17.2.1. Let (an,1 )n∈N , . . . , (an,l )n∈N be a finite family of real sequences which satisfy l
an,k ≤ 0 for each n ∈ N;
k=1
ak ≤ lim inf an,k n
l
for each k = 1, 2, . . . , l ;
ak = 0.
k=1
Then an,k → ak for each k = 1, 2, . . . , l . Proposition 17.2.6. Let u be the solution of Cauchy problem (17.47) and (uλ ) the sequence of solutions of problems (17.51) obtained by Moreau–Yosida approximation. For any T > 0, as λ → 0, (i) uλ → u uniformly on [0, T ], (ii) u˙λ → u˙ strongly in L2 ([0, T ]; ), (iii) Φλ (uλ ) → Φ(u) uniformly on [0, T ], (iv) Φ∗ (− u˙λ ) → Φ∗ (− u˙) strongly in L1 ([0, T ]).
i
i i
i
i
i
i
688
book 2014/8/6 page 688 i
Chapter 17. Gradient flows
PROOF. In the proof of Theorem 17.2.2 we showed that (uλ ) converges uniformly to u on [0, T ] and that u˙λ u˙ weakly in L2 ([0, T ]; ) for all T > 0. Taking the scalar product by u˙λ (t ) in u˙λ (t ) + ∇Φλ (uλ (t )) = 0 and integrating on [0, T ] gives the energy estimate T u˙λ (t )2 d t + Φλ (uλ (T )) − Φλ (u0 ) = 0.
(17.74)
0
Let us apply Lemma 17.2.1 to (17.74). By lower semicontinuity of v → v2L2 (0,T ; ) for the weak topology of L2 (0, T ; H )
T
u˙ (t )2 d t ≤ lim inf λ
0
T 0
u˙λ (t )2 d t .
(17.75)
From Φλ (uλ ) ≥ Φ(Jλ uλ ), by uniform convergence of Jλ uλ to u on [0, T ] (see (17.61)), and lower semicontinuity of Φ, we infer Φ(u(T )) ≤ lim inf Φλ (uλ (T )).
(17.76)
Φ(u0 ) = lim Φλ (u0 ).
(17.77)
λ
Moreover
λ
On the other hand, by 17.48, t → Φ(u(t )) is absolutely continuous on [0, T ] and satisfies d dt
Φ(u(t )) = − u˙(t )2
for almost all t > 0.
Integration of (17.78) on [0, T ] gives T u˙(t )2 d t + Φ(u(T )) − Φ(u0 ) = 0.
(17.78)
(17.79)
0
Comparing (17.74) with (17.79), taking account of (17.75), (17.76), (17.77), gives, by Lemma 17.2.1, T T u˙λ (t )2 d t → u˙(t )2 d t , 0
0
Φλ (uλ (T )) → Φ(u(T )). Weak convergence and convergence of the norms imply strong convergence. Thus u˙λ → u˙
strongly in L2 ([0, T ]; ).
The convergence property Φλ (uλ (T )) → Φ(u(T )) holds for all T > 0. Moreover, by (17.53) d Φλ (uλ (t )) = u˙λ (t )2 ≤ ∂ Φ(u0 )0 2 for almost all t > 0. dt As a consequence, by Ascoli’s theorem, the sequence of functions (Φλ (uλ )) is relatively compact for the topology of the uniform convergence on [0, T ]. Thus Φλ (uλ ) → Φ(u)
uniformly on [0, T ].
i
i i
i
i
i
i
17.2. The gradient flow associated to a convex potential
book 2014/8/6 page 689 i
689
By the Fenchel extremality relation, for almost all t > 0 Φλ (uλ (t )) + Φλ ∗ (− u˙λ (t )) + 〈uλ (t ), u˙λ (t )〉 = 0. By uniform convergence of uλ to u and of Φλ (uλ ) to Φ(u) and convergence of u˙λ to u˙ in L2 ([0, T ]; ), Φλ ∗ (− u˙λ ) = −Φλ (uλ ) − 〈uλ , u˙λ 〉 → −Φ(u) − 〈u, u˙ 〉 in L1 ([0, T ]).
(17.80)
By the Fenchel extremality relation, for almost all t > 0 Φ(u(t )) + Φ∗ (− u˙(t )) + 〈u(t ), u˙(t )〉 = 0.
(17.81)
Comparing (17.80) and (17.81) gives Φλ ∗ (− u˙λ ) → Φ∗ (− u˙)
in L1 ([0, T ]).
Since Φλ ∗ (− u˙λ ) = Φ∗ (− u˙λ ) + λ2 u˙λ 2 we deduce that Φ∗ (− u˙λ ) → Φ∗ (− u˙) in L1 ([0, T ]), which completes the proof of Proposition 17.2.6.
17.2.3 Regularizing effect In this section, Φ : → R ∪ {+∞} is a convex, lower semicontinuous, and proper function. Following Theorem 17.2.2, for any initial data u0 ∈ dom ∂ Φ, we denote by u(·, u0 ) : [0, +∞) → the solution of the Cauchy problem u˙ (t ) + ∂ Φ(u(t )) # 0, (17.82) u(0) = u0 . By monotonicity of ∂ Φ, (see (17.50), proof of uniqueness), for any u0 , uˆ0 ∈ dom ∂ Φ, u(t , u0 ) − u(t , uˆ0 ) ≤ u0 − uˆ0
∀t ≥ 0.
(17.83)
For any t ≥ 0, let us consider the operator S(t ) : u0 ∈ dom ∂ Φ → S(t )u0 = u(t , u0 ) ∈ dom ∂ Φ. By (17.83), S(t ) is a nonexpansive mapping from dom∂ Φ into the complete metric space dom ∂ Φ = dom Φ. By a classical uniform continuity argument, S(t ) can be uniquely extended into an operator, still denoted by S(t ) : dom Φ → dom Φ, which satisfies S(t )u0 = u(t , u0 )
for u0 ∈ dom ∂ Φ;
S(t )u0 = lim u(t , u0n ) n
(17.84)
for u0 ∈ dom Φ, u0n → u0 , u0n ∈ dom ∂ Φ;
S(t )u0 − S(t ) uˆ0 ) ≤ u0 − uˆ0
for u0 , uˆ0 dom Φ.
Let us verify the following semigroup (semiflow) properties of the family of operators (S(t )) t ≥0 : S(t )0S(s) = S(t + s) lim S(t )u0 = u0
t →0+
∀s, t ≥ 0;
∀u0 ∈ dom Φ.
(17.85) (17.86)
i
i i
i
i
i
i
690
book 2014/8/6 page 690 i
Chapter 17. Gradient flows
Given u0 ∈ dom Φ, let u0n → u0 , u0n ∈ dom ∂ Φ. By uniqueness of the solution of the Cauchy problem (17.82), for each n ∈ N, for all s, t ≥ 0, S(t )0S(s)u0n = S(s + t )u0n .
(17.87)
Passing to the limit on (17.87), as n → +∞, we obtain the semigroup property (17.85). Moreover, by the triangle inequality, and nonexpansiveness of S(t ), we have S(t )u0 − u0 ≤ S(t )u0 − S(t )u0n + S(t )u0n − u0n + u0n − u0 ; ≤ 2u0n − u0 + S(t )u0n − u0n . As a consequence
lim sup S(t )u0 − u0 ≤ 2u0n − u0 . t
This being true for all n ∈ N, we obtain (17.86). We call (S(t )) t ≥0 the semigroup of contractions generated by A = ∂ Φ. Let us prove that for u0 ∈ dom Φ, the mapping t → S(t )u0 can still be interpreted as the strong solution of (17.82). This result, which is due to Brézis, is called the regularization effect. It has important consequences in PDE’s evolution problems. Theorem 17.2.3. Let Φ : → R ∪ {+∞} be convex, lower semicontinuous, and proper. Suppose that Φ is minorized, i.e., inf Φ > −∞. Let u0 ∈ dom Φ. Then, there exists a unique strong global solution u : [0, +∞) → of the Cauchy problem u˙(t ) + ∂ Φ(u(t )) # 0, (17.88) u(0) = u0 , which is given by u(t ) = S(t , u0 ). The Cauchy problem (17.88) is satisfied in the following sense: (i) u ∈ C ([0, +∞); ); (ii) u(t ) ∈ dom ∂ Φ for all t > 0; (iii) u is Lipschitz continuous on [δ, +∞) for any δ > 0; (iv) (17.88) is satisfied for almost all t > 0. Moreover $ (v) t u˙ ∈ L2 (0, T ; ) for all T > 0; (vi) for each t > 0, 1 ∂ Φ(u(t ))0 ≤ ∂ Φ(v)0 + u0 − v t
∀v ∈ dom ∂ Φ.
(vii) For each t > 0, u has a right derivative, and d+u dt
(t ) = −∂ Φ(u(t ))0 ,
(17.89)
where ∂ Φ(u(t ))0 is the element of minimal norm of ∂ Φ(u(t )).
i
i i
i
i
i
i
17.2. The gradient flow associated to a convex potential
book 2014/8/6 page 691 i
691
+
(viii) t → dd tu (t ) is nonincreasing. (ix) t → Φ(u(t )) is nonincreasing, absolutely continuous on each bounded interval [δ, T ], δ > 0, and d Φ(u(t )) = − u˙(t )2 for almost all t > 0. (17.90) dt PROOF. Let u0 ∈ dom Φ. According to the Cauchy–Lipschitz theorem, Theorem 17.1.2, for any λ > 0, there is a unique global classical solution uλ : [0, +∞) → of the Cauchy problem u˙λ (t ) + ∇Φλ (uλ (t )) = 0, (17.91) uλ (0) = u0 . Let us denote respectively by S(t ) and Sλ (t ) the semigroups generated by A = ∂ Φ and Aλ = ∇Φλ . Let us show that uλ (t ) = Sλ (t )u0 → S(t )u0 as λ → 0.
(17.92)
Let u0n → u0 with u0n ∈ dom ∂ Φ for each n ∈ N. By the triangle inequality and nonexpansiveness of S(t ) and Sλ (t ) Sλ (t )u0 − S(t )u0 ≤ Sλ (t )u0 − Sλ (t )u0n + Sλ (t )u0n − S(t )u0n + S(t )u0n − S(t )u0 ≤ 2u0n − u0 + Sλ (t )u0n − S(t )u0n . By Proposition 17.2.6 and u0n ∈ dom ∂ Φ, we have Sλ (t )u0n → S(t )u0n as λ → 0. Hence, lim sup Sλ (t )u0 − S(t )u0 ≤ 2u0n − u0 . λ→0
Letting n → +∞ gives the result. The proof of Theorem 17.2.3 will consist in establishing estimates on the net (uλ ), and then pass to the limit as λ → 0. The key idea is to establish an energy estimate on ( u˙λ ) in a weighted L2 space, with a weight which vanishes at zero. Doing so we can cancel the singularities at the origin (occurring from the nonsmooth data u0 ). Thus, let us multiply (17.91) by t u˙λ (t ) and integrate from 0 to T : 0
T
T
2
t u˙λ (t ) d t +
t 0
d dt
Φλ (uλ (t ))d t = 0.
Integrating by parts
T
2
t u˙λ (t ) d t + T Φλ (uλ (T )) =
0
T 0
Φλ (uλ (t ))d t .
(17.93)
To exploit this estimate, we establish majorization/minimization of each of its constituents. By (17.52), t → u˙λ (t ) is a nonincreasing function. Hence
T 0
t u˙λ (t )2 d t ≥
T2 2
u˙λ (T )2 =
T2 2
Aλ uλ (T )2 .
(17.94)
i
i i
i
i
i
i
692
book 2014/8/6 page 692 i
Chapter 17. Gradient flows
Take v ∈ dom ∂ Φ. By − u˙λ (t ) = ∇Φλ (uλ (t )), the subdifferential inequality for Φλ at uλ (t ) gives Φλ (v) ≥ Φλ (uλ (t )) + 〈− u˙λ (t ), v − uλ (t )〉 1 d ≥ Φλ (uλ (t )) + u (t ) − v2 . 2 dt λ Integrating from from 0 to T T 1 1 T Φλ (v) ≥ Φλ (uλ (t ))d t + uλ (T ) − v2 − u0 − v2 . 2 2 0
(17.95)
The subdifferential inequality for Φλ at v gives Φλ (uλ (T )) ≥ Φλ (v) + 〈Aλ v, uλ (T ) − v〉, which, after multiplication by T , gives T Φλ (uλ (T )) ≥ T Φλ (v) + T 〈Aλ v, uλ (T ) − v〉.
(17.96)
Combining (17.93) with (17.94), (17.95), (17.96) gives T2
1 1 Aλ uλ (T )2 +T Φλ (v)+T 〈Aλ v, uλ (T )− v〉 ≤ T Φλ (v)+ u0 − v2 − uλ (T )− v2 . 2 2 2
After simplification T2
1 1 Aλ uλ (T )2 ≤ u0 − v2 − T 〈Aλ v, uλ (T ) − v〉 − uλ (T ) − v2 . 2 2 2
(17.97)
By using the elementary majorization 1 1 − uλ (T ) − v2 − T 〈Aλ v, uλ (T ) − v〉 ≤ T Aλ v2 2 2 in (17.97) we obtain T2
1 T2 Aλ uλ (T )2 ≤ u0 − v2 + Aλ v2 . 2 2 2
Equivalently Aλ uλ (T )2 ≤
1 T2
u0 − v2 + Aλ v2 ,
which implies Aλ uλ (T ) ≤ Aλ v +
1
T Since v ∈ dom ∂ Φ, we have Aλ v ≤ A0 v. Hence Aλ uλ (T ) ≤ A0 v +
1 T
u0 − v.
u0 − v.
(17.98)
As a consequence, for each t > 0 the net (Aλ uλ (t ))λ→0 is bounded. Let us fix t > 0. After extraction of a subnet (we keep the same notation) we have Aλ uλ (t ) η
weakly in .
i
i i
i
i
i
i
17.2. The gradient flow associated to a convex potential
book 2014/8/6 page 693 i
693
By the subdifferential inequality for Φλ at uλ (t ), for any ξ ∈ Φλ (ξ ) ≥ Φλ (uλ (t )) + 〈Aλ uλ (t ), ξ − uλ (t )〉. By the nonincreasing property of λ → Φλ v we deduce that, for any 0 < λ < λ0 , Φ(ξ ) ≥ Φλ (ξ ) ≥ Φλ0 (uλ (t )) + 〈Aλ uλ (t ), ξ − uλ (t )〉.
(17.99)
By (17.92), uλ (t ) = Sλ (t )u0 → S(t )u0 as λ → 0. By passing to the limit on (17.99), as λ → 0 (the scalar product involves two sequences which converge respectively weakly and strongly), Φ(ξ ) ≥ Φλ0 (S(t )u0 ) + 〈η, ξ − S(t )u0 〉. This being true for any λ0 > 0, letting λ0 → 0 gives, for any ξ ∈ , Φ(ξ ) ≥ Φ(S(t )u0 ) + 〈η, ξ − S(t )u0 〉. Therefore, for all t > 0, S(t )u0 ∈ dom ∂ Φ
and
η ∈ ∂ Φ(S(t )u0 ).
As a consequence, by (17.84), for any δ > 0, t → S(t )u0 is the strong solution on [δ, +∞) of the evolution equation u˙ (t ) + ∂ Φ(u(t )) # 0 with Cauchy data u(δ) = S(δ)u0 . This implies that u(t ) = S(t )u0 satisfies properties (i), (ii), (iii), and (iv) of Theorem 17.2.3. Moreover, passing to the limit on (17.98), by the lower semicontinuity of the norm for the weak topology, 1 η ≤ A0 v + u0 − v. t Since η ∈ ∂ Φ(u(t )) we deduce that 1 ∂ Φ(u(t ))0 ≤ ∂ Φ(v)0 + u0 − v t
∀v ∈ dom ∂ Φ.
Let us now return to (17.93) and use (17.95), (17.96). We obtain T 1 1 t u˙λ (t )2 d t +T Φλ (v)+T 〈Aλ v, uλ (T )−v〉 ≤ T Φλ (v)+ u0 −v2 − uλ (T )−v2 . 2 2 0 After simplification and using the same majorization as above we obtain T 1 1 t u˙λ (t )2 d t ≤ u0 − v2 − uλ (T ) − v2 − T 〈Aλ v, uλ (T ) − v〉 2 2 0 2 1 T ≤ u0 − v2 + Aλ v2 . 2 2 Passing to the limit, as λ → 0, according to an argument of lower semicontinuity for the weak topology T 1 T2 0 2 t u˙(t )2 d t ≤ u0 − v2 + A v ∀v ∈ dom ∂ Φ. 2 2 0 $ Hence t u˙ ∈ L2 (0, T ; ) for all T > 0. Items (vii), (viii), and (ix) are direct consequences of Theorem 17.2.2 and of the fact that u(t ) = S(t )u0 is a strong solution of (GSD) on [δ, +∞) for any δ > 0. This completes the proof of Theorem 17.2.3.
i
i i
i
i
i
i
694
book 2014/8/6 page 694 i
Chapter 17. Gradient flows
17.2.4 The exponential formula As shown in the previous section, the Cauchy problem
u˙(t ) + ∂ Φ(u(t )) # 0, u(0) = u0 , u0 ∈ dom ∂ Φ
(17.100)
possesses a unique strong global solution u in C([0, +∞), ) which may be written in terms of a semigroup: u(t ) = @ (t )u0 . By analogy with the solution u(t ) = e −t A u0 of the linear Cauchy problem u˙ + Au = 0, u(0) = u0 when is a finite dimensional space, it is convenient to introduce the notation u(t ) = e −t ∂ Φ u0 , which is consistent with the semigroup property. According to the definition of the matrix exponential, the solution of −n u˙ +Au = 0, u(0) = u0 can also be expressed as the limit u(t ) = limn→+∞ I + nt A u0 . In this section, we establish an analogous formula for the global strong solution of (17.100). Theorem 17.2.4 (exponential formula). Let Φ : → R ∪ {+∞} be a convex function satisfying the conditions of Theorem 17.2.3 and u0 ∈ dom ∂ Φ. Then for each t ∈ [0, +∞) the limit −n t u0 (17.101) u(t ) := lim I + ∂ Φ n→+∞ n exists and is uniform on bounded intervals of [0, +∞). Furthermore, u is the unique strong global solution of the Cauchy problem (17.100). For proving Theorem (17.2.4), it is convenient to introduce the notion of backward implicit discrete scheme. Given a sequence (λk )k∈N of positive numbers, we call a proximal sequence with step size (λk )k∈N , associated with ∂ Φ (or more generally with a maximal monotone operator A), the sequence (xk )k∈N in defined by ⎧ ⎨ xk − xk−1 ∈ −∂ Φ(x ) for k ≥ 1, k λk ⎩ x given in dom ∂ Φ. 0 Such a sequence (xk )k∈N is well defined and satisfies the formula xk = Jλk xk−1 = (I + λk ∂ Φ)−1 xk−1 for all k ≥ 1. Consequently if the step size λk is constant equal to λ, then ! "−k x0 . xk = Jλk x0 = I + λ∂ Φ It should be noted that xk ∈ dom ∂ Φ for all k ≥ 1. These sequences play an important role in the discrete (or difference) approximation of gradient flow problems in the general context of monotone operators. More precisely let t0 = 0 < t1 < · · · tk−1 < tk < · · · tn = T be a finite partition of [0, T ], set λk = tk − tk−1 > 0 for k = 1, . . . , n, and consider the proximal sequence associated with (λ k )k=1,...,N and ∂ Φ. Then, under some additional conditions, the step function un := nk=1 1]tk−1 ,t k] xk gives rise to the notion of (backward) DS-approximate solution of the corresponding Cauchy problem in [0, T ] (see [259, 260] and Section 17.6). We do not address this issue in this section. Among other things, the following lemma provides an estimation for the distance bex l ) l ∈N with step size λk and λˆl , respectively, tween two proximal sequences (xk )k∈N and (ˆ
i
i i
i
i
i
i
17.2. The gradient flow associated to a convex potential
book 2014/8/6 page 695 i
695
associated with ∂ Φ (or a maximal monotone operator A). We use the notation σk =
k i =1
λi , τk =
k i =1
λ2i
(similarly for σˆk and τˆk )
for k ≥ 1 and set σ0 = τ0 = 0. x l ) l ∈N be two proximal sequences Lemma 17.2.2 (Kobayashi inequality). Let (xk )k∈N and (ˆ as above. Then for every v ∈ dom ∂ Φ and all (k, l ) ∈ N2 , xk − xˆl ≤ x0 − v + ˆ x0 − v + ∂ Φ(v)0 (σk − σˆl )2 + τk + τ l .
(17.102)
x −x PROOF. To shorten notation we set ak,l = (σk − σˆl )2 + τk + τ l , yk = k−1λ k , and we k replace ∂ Φ by a maximal monotone operator A. We reason by induction on (k, l ). We begin by establishing (17.102) for (k, 0) and (0, l ). Since Jλ1 is nonexpansive, and noticing that v = Jλ1 (v + λ1 Av 0 ) and x1 = Jλ1 x0 , we have x1 − v ≤ x0 − v − λ1 Av 0 ≤ x0 − v + λ1 Av 0 . Hence, by induction xk − v ≤ x0 − v + σk Av 0 . Consequently xk − xˆ0 ≤ xk − v + v − xˆ0 ≤ x0 − v + σk Av 0 + v − xˆ0 ≤ x0 − v + v − xˆ0 + ak,0 Av 0 x l ) l ∈N playing a symmetrical role, inequalbecause σk ≤ ak,0 . The sequences (xk )k∈N and (ˆ ity (17.102) also holds for (0, l ). To continue the proof, we need the following technical lemma. Lemma 17.2.3. Let (x, y) and (ˆ x , yˆ) be two elements of A, and λ ≥ 0, μ ≥ 0 in R. Then we have the following inequality: (λ + μ)x − xˆ ≤ λˆ x + μˆ y − x + μx + λy − xˆ.
(17.103)
PROOF OF LEMMA 17.2.3. It suffices to follow the calculation (λ + μ)x − xˆ2 = λ〈x − xˆ, x − xˆ〉 + μ〈x − xˆ, x − xˆ〉 = λ〈x − xˆ − μˆ y , x − xˆ 〉 + μ〈x − xˆ + λy, x − xˆ〉 + λμ〈ˆ y − y, x − xˆ〉 ≤ λ〈x − xˆ − μˆ y , x − xˆ 〉 + μ〈x − xˆ + λy, x − xˆ〉 ! " ≤ λˆ x + μˆ y − x + μx + λy − xˆ x − xˆ, where we have used the monotonicity of A to claim that λμ〈ˆ y − y, x − xˆ〉 ≤ 0.
i
i i
i
i
i
i
696
book 2014/8/6 page 696 i
Chapter 17. Gradient flows
PROOF OF LEMMA 17.2.2 CONTINUED. Assume that (17.102) holds for (k − 1, l ) and (k, l − 1). Applying Lemma 17.2.3 we infer x l + λˆl yˆl − xk + λˆl xk + λk yk − xˆl . (λk + λˆl )xk − xˆl ≤ λk ˆ Hence, setting αk,l :=
λˆl λk +λˆl
and βk,l :=
λk λk +λˆl
(17.104)
, from (17.104) we infer
x l −1 − xk . xk − xˆl ≤ αk,l xk−1 − xˆl + βk,l ˆ Using the induction hypothesis we obtain # $ xk − xˆl ≤ αk,l x0 − v + ˆ x0 − v + ak−1,l Au 0 # $ x0 − v + ak,l −1 Au 0 + βk,l x0 − v + ˆ x0 − v + (αk,l ak−1,l + βk,l ak,l −1 )Au 0 . = x0 − v + ˆ To conclude, it suffices to check that αk,l ak−1,l + βk,l ak,l −1 ≤ ak,l (see [319, Proposition 2.12]). PROOF OF THEOREM 17.2.4. The proof proceeds in three steps. Step 1. Existence and Lipschitz continuity of the limit (17.101). Since J nt is nonexpann
sive, the limit (17.101) exists iff it exists for u0 ∈ dom ∂ Φ. Hence, in what follows, we assume that u0 ∈ dom ∂ Φ. For each fixed T > 0, let us define the function un : [0, T ] → by un (t ) = (I + nt ∂ Φ)−n u0 . Let (m, n) ∈ N∗ × N∗ , s > 0, t > 0 in [0, T ], and consider the two proximal sequences x l ) l ∈N associated with the two step sizes λk = s/m and λˆl = t /n, respec(xk )k∈N and (ˆ tively, and with initial condition x0 = xˆ0 = u0 . Noticing that xˆn = un (t ) and x m = u m (s), from Lemma 17.2.2 (take v = u0 ) we infer 0 1 2 s2 t 2 0 un (t ) − u m (s) ≤ ∂ Φ(u0 ) (t − s)2 + + . (17.105) m n Taking t = s, (17.105) yields that (un )n∈N is a Cauchy sequence in C([0, T ], ). Consequently it uniformly converges to some function u (for t = s = 0, un (t ) = u m (t ) = u0 ). Taking m = n and going to the limit on n in (17.105), we conclude that u is a ∂ Φ(u0 )0 Lipschitz function in C([δ, T ], ) for every δ > 0. Note that u(t ) ∈ dom ∂ Φ, since un (t ) belongs to dom ∂ Φ. Actually we will see in the last step that u(t ) belongs to dom ∂ Φ. Step 2. We prove that u˙(t ) ∈ −∂ Φ(u(t )) for a.e. t in [0, +∞). For any λ > 0, we look at the unique global classical solution uλ : [0, +∞) → of the Cauchy problem (17.91) considered in the previous section. From Theorem 17.2.3, for all t > 0, uλ (t ) converges to u¯(t ) when λ → 0, where u¯ satisfies u˙¯(t ) ∈ −∂ Φ( u¯(t )) for a.e. t in [0, +∞). Therefore, in order to complete the proof, it suffices to establish that for all t > 0, uλ (t ) → u(t ) when λ → 0. For each fixed t > 0 and each λ intended to go to 0, t > λ > 0, choose m the integer part of λt , thus satisfying t = λm + δ with 0 ≤ δ < λ, and write uλ (t )−u(t ) ≤ uλ (t )−uλ (mλ)+uλ (λm)−Jλm u0 +Jλm u0 −u(mλ)+u(mλ)−u(t ). (17.106)
i
i i
i
i
i
i
17.2. The gradient flow associated to a convex potential
book 2014/8/6 page 697 i
697
We are going to estimate each of the four terms of the right-hand side of (17.106). We know that uλ and u are Lipschitz continuous with a constant Lipschitz equal to ∂ Φ(u0 )0 . For the function u this property holds in each interval [δ, T ] with δ > 0 and has been established in Step 1. Thus, since mλ > 0, uλ (t ) − uλ (mλ) ≤ |t − mλ|∂ Φ(u0 )0 ≤ λ∂ Φ(u0 )0 , u(mλ) − u(t ) ≤ |t − mλ|∂ Φ(u0 )0 ≤ λ∂ Φ(u0 )0 .
(17.107)
On the other hand, with the notation of the first step, and from (17.105), Jλm u0 − u(mλ) = J tm−δ u0 − u(mλ) m
= u m (t − δ) − u(t − δ) ≤ u m (t − δ) − u m (t ) + u m (t ) − u(t ) + u(t ) − u(t − δ) 0 1 2 t 2 + (t − δ)2 0 0 ≤ λ∂ Φ(u0 ) + u m (t ) − u(t ) + ∂ Φ(u0 ) δ 2 + . m Noticing that λ → 0 =⇒ m → +∞, from the first step and the estimate above we infer that (17.108) lim Jλm u0 − u(mλ) = 0. λ→0
The second term is more complex to estimate and requires the following lemma. Lemma 17.2.4 (Chernoff). Let J : → be a nonexpansive operator, v0 ∈ , and v the strong global solution of the Cauchy problem in [0, +∞), ˙ ) = − λ1 (I − J )v(t ), v(t (17.109) v(0) = v0 . Then, for all n ∈ N and all t ≥ 0, ˙ λt + (nλ − t )2 . v(t ) − J n v0 ≤ v(0)
(17.110)
PROOF OF LEMMA 17.2.4. According to Theorem 17.1.2, (17.109) is well-posed in the sense that there exists a unique solution v. It is enough to consider the case λ = 1 (otherwise apply the inequality to the function vλ defined by vλ (t ) = v(λt )). We establish ˙ ) is decreasing (a conse(17.110) inductively. For n = 0, due to the fact that t → v(t quence of v → (I − J ) monotone), we infer ) ) t ) t ) / ) ) ˙ ˙ ˙ ˙ d s) ≤ v(s) d s ≤ t v(0) ≤ v(0) t + t 2, v(t ) − v0 = ) v(s) ) 0 ) 0 and (17.110) holds. Assume that (17.110) holds for n − 1. From ˙ ) + e t v(t ) = e t J v(t ) e t v(t we obtain v(t ) = v0 e
−t
t
+
e s −t J v(s) d s.
0
i
i i
i
i
i
i
698
book 2014/8/6 page 698 i
Chapter 17. Gradient flows
Hence
t ) ! " ) ) ) n −t s −t n ) J v(s) − J v0 d s ) e v(t ) − J v0 = )(v0 − J v0 )e + ) 0 t ≤ e −t v0 − J n v0 + e s v(s) − J n−1 v0 d s , n
0
where we have used the fact that J is nonexpansive. Noticing that v0 − J n v0 ≤
n
˙ J k−1 v0 − J k v0 ≤ nv0 − J v0 = nv(0),
k=1
we infer that
n
v(t ) − J v0 ≤ e
−t
t
˙ nv(0) +
s
e v(s) − J
n−1
0
v0 d s .
We complete the proof by using the induction hypothesis and the inequality t V 2 e s s + (n − 1) − s d s ≤ e t t + (n − t )2 n+ 0
obtained in an elementary way (h(t ) := n+ satisfies h(0) = 0 and h (t ) ≤ 0).
t 0
es
s + ((n − 1) − s)2 d s −e t
t + (n − t )2
CONCLUSION OF THE PROOF OF STEP 2. We go back to the estimate of the second term uλ (λm)−Jλm u0 of (17.106). Applying Lemma 17.2.4 with J := Jλ , thus v = uλ , and using the notation of Proposition 17.2.1, we obtain $ $ uλ (λm) − Jλm u0 ≤ Aλ (u0 )λ m ≤ Aλ (u0 ) λt . But from Proposition 17.2.2, Aλ (u0 ) ≤ ∂ Φ(u0 )0 so that $ uλ (λm) − Jλm u0 ≤ ∂ Φ(u0 )0 λt .
(17.111)
Collecting (17.107), (17.108), and (17.111), inequality (17.106) yields limλ→0 uλ (t )−u(t ) = 0, which completes the proof. Remark 17.2.3. When ∂ Φ is single valued and Lipschitz continuous with some constant L > 0, the proof of Step 2 above, i.e., u˙ (t ) = −∇Φ(u(t )) for a.e. t in [0, +∞), can be achieved in a more direct and independent way. For any t ≥ 0 let us define the operator (t ) : u0 ∈ dom ∂ Φ → (t )u0 := u(t ) ∈ dom ∂ Φ, where u is the limit obtained in the first step above. We first prove that satisfies the semigroup property: (s + t ) = (s) (t ). Indeed for m ∈ N we have p m m mp (mt ) = lim J nm t = lim J t = lim J t = (t ) , n→+∞
n
p→+∞
p
p→+∞
p
where, in the third equality, we took into account the continuity of the resolvent operator. (Recall that the resolvent is nonexpansive.) Now, let p, q, p , q be positive integers. Then, from the above,
i
i i
i
i
i
i
17.2. The gradient flow associated to a convex potential
p q
+
p
=
q
pq + p q
=
1
qq pq
book 2014/8/6 page 699 i
699
=
qq
pq + p q
1 qq
p q
1
p
=
qq
q
T
p q
.
Hence (s + t ) = (s) (t ) holds if t and s are rational. The claim follows in view of the continuity in t . Since u is a Lipschitz continuous function on [δ, T ] for all δ > 0 and all T > 0, u is absolutely continuous on each interval [δ, T ] and thus is a.e. differentiable on (0, +∞). In what follows t is fixed in (0, +∞) in such a way that lim
u(t + h) − u(t ) h
h→0
exists. According to the semigroup property we have u(t + h) − u(t ) h
(h) (t )u0 −
=
(t )u0
h
= lim
J nh n
(t )u0 −
(t )u0
h
n→0
.
Consider the proximal sequence associated with ∇Φ, with constant size nh , and initialized in (t )u0 = u(t ): ⎧ ⎨ xk − xk−1 ∈ −∇Φ(x ) for k ≥ 1, k h/n ⎩ x = u(t ). 0
The difference quotient above. Indeed one has
(t )u0 − (t )u0
J nh
n
h
J nh n
(t )u0 −
may be written in terms of the proximal sequence (t )u0
h
= =
1 xn − x0 n h/n n x −x 1 k k−1 n
=−
k=1 n
1
n
h/n yk ,
k=1
where yk = ∇Φ(xk ). Thus u(t + h) − u(t ) h
= lim − n→+∞
n 1
n
yk .
(17.112)
k=1
On the other hand, since ∇Φ is Lipschitz continuous, for k = 1, . . . , n we have yk − ∇Φ(u(t )) ≤ Lxk − u(t ). We claim that
(17.113)
h xk − u(t ) ≤ k ∇Φ(u(t )). n
i
i i
i
i
i
i
700
book 2014/8/6 page 700 i
Chapter 17. Gradient flows
Indeed, using the fact that the resolvent is nonexpansive, we have for i = 1, . . . , k ) ) ) ) h h ) ) xi − u(t ) ≤ ) xi + ∇Φ(xi ) − u(t ) + ∇Φ(u(t )) ) ) ) n n ) ) ) ) h ) ) = ) xi + xi −1 − xi − u(t ) + ∇Φ(u(t )) ) ) ) n ) ) ) ) h ) ) = ) xi −1 − u(t ) − ∇Φ(u(t ))) ) ) n ≤ xi −1 − u(t ) +
h n
∇Φ(u(t )).
The claim follows by summing these inequalities for i = 1, . . . , k. Thus (17.113) yields h yk − ∇Φ(u(t )) ≤ Lk ∇Φ(u(t )). n
(17.114)
Now, according to (17.112) and from (17.114), for all (x, y) ∈ ∇Φ we have u(t + h) − u(t ) − y, u(t ) − x − h A 6 7 B n 1 = lim ∇Φ(u(t )) − y + y − ∇Φ(u(t )) , u(t ) − x n→+∞ n k=1 k A B W X n 1 yk − ∇Φ(u(t )), u(t ) − x ≥ ∇Φ(u(t )) − y, u(t ) − x − sup ∗ n k=1 n∈N W X ≥ ∇Φ(u(t )) − y, u(t ) − x − L|h|∇Φ(u(t )) u(t ) − x ≥ −L|h|∇Φ(u(t )) u(t ) − x. Letting h → 0 we finally infer that for all (x, y) ∈ ∇Φ W X − u˙ (t ) − y, u(t ) − x ≥ 0. From the maximality of ∇Φ we deduce that − u˙(t ) = ∇Φ(u(t )), which completes the proof.
17.2.5 The nonautonomous case: Time-dependent convex potentials In this section, we consider quasi-autonomous evolution equations u˙(t ) + ∂ Φ(u(t )) # f (t ), u(0) = u0 . The function Φ : → R ∪ {+∞} is convex, lower semicontinuous, and proper. It can be interpreted as the internal potential energy of the system, while the second element f ∈ L2 (0, T ; ) represents the external action. This is a particular case (take Φ(t , v) = Φ(v) − 〈 f (t ), v〉) of the general nonautonomous evolution equation u˙ (t ) + ∂ Φ(t , u(t )) # 0, (17.115) u(0) = u0 ,
i
i i
i
i
i
i
17.2. The gradient flow associated to a convex potential
book 2014/8/6 page 701 i
701
where the operator ∂ Φ(t , ·) is the subdifferential of a convex, lower semicontinuous, and proper function Φ(t , ·) : → R ∪ {+∞}, which varies with t . By using techniques similar to those used in the autonomous case ( f = 0), the following results were obtained (see [134], [135] for further details). Theorem 17.2.5. Let Φ : → R ∪ {+∞} be a convex, lower semicontinuous, and proper function. Then, for every f ∈ L2 (0, T ; ), and u0 ∈ domΦ, there is a unique strong solution u ∈ C ([0, T ]; ) of the Cauchy problem u˙(t ) + ∂ Φ(u(t )) # f (t ), u(0) = u0 . Moreover the following properties hold: (i) u(t ) ∈ dom ∂ Φ a.e. t ∈ (0, T ). $ (ii) t u˙ ∈ L2 (0, T ; ). (iii) For a.e. t ∈ (0, T ), u is derivable at t , and u˙(t ) + (∂ Φ(u(t )) − f (t ))0 = 0, where (∂ Φ(u(t )) − f (t ))0 is the element of minimal norm of ∂ Φ(u(t )) − f (t ). (iv) t → Φ(u(t )) is absolutely continuous on each interval [δ, T ], and d dt
Φ(u(t )) + u˙(t )2 = 〈 f (t ), u˙(t )〉
for almost all t > 0.
(v) If u0 ∈ dom Φ, then u˙ ∈ L2 (0, T ; ), and t → Φ(u(t )) is continuous on [0, T ]. When the second member is absolutely continuous, we can obtain additional regularity results that make quasi-autonomous and autonomous cases very similar. Theorem 17.2.6. Let Φ : → R ∪ {+∞} be a convex, lower semicontinuous, and proper function. Let f ∈ W 1,1 (0, T ; ) (i.e., f is absolutely continuous). Then, for any u0 ∈ dom Φ, there exists a unique strong solution u : [0, T ] → of the Cauchy problem u˙(t ) + ∂ Φ(u(t )) # f (t ), u(0) = u0 . Moreover the following properties hold: (i) u(t ) ∈ dom ∂ Φ for all t ∈ ]0, T ]. (ii) t u˙(t ) ∈ L∞ [0, T ]. (iii) For each t ≥ 0, u has a right derivative, and d+u dt
(t ) + (∂ Φ(u(t )) − f (t ))0 = 0,
where ∂ Φ(u(t ))0 is the element of minimal norm of ∂ Φ(u(t )).
i
i i
i
i
i
i
702
book 2014/8/6 page 702 i
Chapter 17. Gradient flows
(iv) For each t ≥ 0, t → Φ(u(t )) has a right derivative, and ) ) ) d + u )2 d+ ) ) Φ(u(t )) + ) (t )) = 〈 f (t ), u˙(t )〉 ) ) dt dt
∀t ∈ ]0, T ].
Remark 17.2.4. Evolution equations governed by time-dependent subdifferential operators are used to model a wide range of situations. A well-known example is the sweeping process introduced by Moreau [298], which plays an important role in unilateral mechanics, economics, and control; see [265] for a recent survey. The equation has the form u˙(t ) + NC (t ) (u(t )) # f (t ), u(0) = u0 , where C (t ) is a time-dependent (moving) convex set in , and NC (t ) (u) is the normal cone to C (t ) at u ∈ C (t ). Since NC (t ) is the subdifferential of the indicator function of C (t ) we are in the framework of (17.115). For further examples and results concerning evolution equations governed by timedependent subdifferential operators, see [46], [47], [48], [253], [254], [263], [313].
17.2.6 Gradient flow for a convex potential: Asymptotic analysis, t → +∞ Let Φ : → R ∪ {+∞} be a convex, lower semicontinuous, and proper function, which is minorized, i.e., inf Φ > −∞. Following Theorem 17.2.3, given u0 ∈ dom Φ, there exists a unique strong global solution u : [0, +∞) → of the Cauchy problem u˙(t ) + ∂ Φ(u(t )) # 0, u(0) = u0 . We are interested in the asymptotic behavior of u(t ) and Φ(u(t )) as t → +∞. The following minimization property holds, even if the set of minimizers of Φ is empty.
Minimizing property.
Proposition 17.2.7 (minimizing property). (i) t → Φ(u(t )) is a decreasing function and lim Φ(u(t )) = inf Φ.
t →+∞
(ii) The following estimate holds: for any any v ∈ dom Φ, for any t > 0, Φ(u(t )) ≤ Φ(v) +
1 2t
u0 − v2 .
As a consequence, if S = arg minΦ = Φ(u(t )) ≤ inf Φ +
1 2t
dist(u0 , S)2 .
PROOF. Because of the effect of regularization, when the asymptotic behavior is studied, it is not restrictive to assume that u0 ∈ dom ∂ Φ. For any v ∈ dom Φ, by − u˙ (t ) ∈ ∂ Φ(u(t )), we have the subdifferential inequality Φ(v) ≥ Φ(u(t )) + 〈− u˙(t ), v − u(t )〉.
i
i i
i
i
i
i
17.2. The gradient flow associated to a convex potential
book 2014/8/6 page 703 i
703
Equivalently, Φ(v) ≥ Φ(u(t )) +
1 d 2 dt
u(t ) − v2 .
After integration from 0 to T T 1 1 (Φ(u(t )) − Φ(v))d t + u(T ) − v2 − u0 − v2 . 0≥ 2 2 0 Since t → Φ(u(t )) is nonincreasing 1 1 0 ≥ T (Φ(u(T )) − Φ(v)) + u(T ) − v2 − u0 − v2 . 2 2 As a consequence, for any t > 0, Φ(u(t )) ≤ Φ(v) +
1 2t
u0 − v2 .
Passing to the limit, as t → +∞, gives lim Φ(u(t )) ≤ Φ(v).
t →+∞
This inequality being valid for any v ∈ dom Φ, we obtain lim Φ(u(t )) ≤ inf Φ.
t →+∞
The opposite inequality is trivially satisfied, which gives the result. Velocity goes to zero.
The following result holds, even if the set of minimizers of Φ
is empty. +
Proposition 17.2.8. (i) t → dd tu (t ) is nonincreasing and ) ) ) d+u ) ) ) lim ) (t )) = 0. ) t →+∞ ) d t (ii) The following estimate holds:
) ) ) d+u ) C ) ) (t )) ≤ $ . ) ) dt ) t
(17.116)
PROOF. Because of the regularization effect, when the asymptotic behavior is studied, it is not restrictive to assume that u0 ∈ dom ∂ Φ. By integration of the energy estimate (17.90), for any t > 0 t
0
u˙(τ)2 d τ ≤ Φ(u0 ) − inf Φ.
+
Since for each t > 0, u has a right derivative, and t → dd tu (t ) is nonincreasing, we deduce that, for each t > 0, ) ) ) d + u )2 ) ) t) (t )) ≤ Φ(u0 ) − inf Φ, ) dt ) which gives the result.
i
i i
i
i
i
i
704
book 2014/8/6 page 704 i
Chapter 17. Gradient flows
In order to prove the weak convergence of the orbits of (GSD) we use the classical Opial’s lemma. We recall it in its continuous form and give a short proof of it.
Weak convergence results.
Lemma 17.2.5. Let S be a nonempty subset of and u : [0, +∞) → H a map. Assume that (i) for every z ∈ S, limt →+∞ u(t ) − z exists; (ii) every sequential weak cluster point of the map u belongs to S. Then
w − lim u(t ) = u∞ exists for some element u∞ ∈ S. t →+∞
PROOF. By (i) and S = , the trajectory u is asymptotically bounded in . In order to obtain its weak convergence, we just need to prove that the trajectory has a unique sequential weak cluster point. Let u(tn1 ) z 1 and u(tn2 ) z 2 , with tn1 → +∞, and tn2 → +∞. By (ii), z 1 ∈ S, and z 2 ∈ S. By (i), it follows that lim t →+∞ u(t ) − z 1 and lim t →+∞ u(t ) − z 2 exist. Hence, lim t →+∞ (u(t ) − z 1 2 − u(t ) − z 2 2 ) exists. Developing and simplifying this last expression, we deduce that Y Z lim u(t ), z 2 − z 1 exists. t →+∞
Hence lim
Y
n→+∞
Z Y Z u(tn1 ), z 2 − z 1 = lim u(tn2 ), z 2 − z 1 , n→+∞
which gives z − z = 0, and hence z = z . 2
1 2
2
1
Theorem 17.2.7 (Bruck [142]). Let Φ : → R ∪ {+∞} be a convex, lower semicontinuous, and proper function, and suppose that S = arg minΦ = . Let u : [0, +∞) → be a strong global trajectory of the generalized steepest descent (GSD). Then u(t ) converges weakly to some u∞ ∈ S, as t → +∞. PROOF. Let us apply Opial’s lemma with S = arg minΦ, which has been supposed to be nonempty. Let z = w − lim u(tn ) be a weak sequential cluster point of u with tn → +∞. By Proposition 17.2.7 and the lower semicontinuity of Φ with respect to the weak topology of we infer inf Φ = lim Φ(u(t ))
t →+∞
= lim Φ(u(tn )) ≥ Φ(z), which yields z ∈ S and proves item (ii) of Opial’s lemma. Let us now show that for any z ∈ S, t → u(t ) − z is a nonincreasing function, which will prove item (i). Set h z (t ) = 12 u(t ) − z2 , which is absolutely continuous on the bounded intervals. For almost all t > 0 ˙h (t ) = 〈u(t ) − z, u˙(t )〉 . (17.117) z Since u is a strong global trajectory of the generalized steepest descent (GSD), we have − u˙ (t ) ∈ ∂ Φ(u(t )). Hence, we have the subdifferential inequality Φ(z) ≥ Φ(u(t )) + 〈− u˙(t ), z − u(t )〉 .
i
i i
i
i
i
i
17.2. The gradient flow associated to a convex potential
book 2014/8/6 page 705 i
705
Since Φ(z) ≤ Φ(u(t )) we infer 〈 u˙(t ), u(t ) − z〉 ≤ 0, which, with (17.117), gives ˙h z (t ) ≤ 0. As a consequence, the function t → u(t ) − z is nonincreasing. Since it is nonnegative, it converges as t → +∞. Corollary 17.2.1. Concerning the asymptotic behavior of the orbits of the generalized steepest descent (GSD), there are two types of situations: (i) S = arg minΦ = , in which case every orbit u(t ) is bounded, and converges weakly to some u∞ ∈ S, as t → +∞. (ii) S = arg minΦ = , in which case every orbit u(t ) verifies lim t →+∞ u(t ) = +∞. PROOF. Item (i) is just the Bruck theorem. To prove (ii), let us prove the reverse implication. Suppose that there exists an orbit u and a sequence tn → +∞ such that supn u(tn ) < +∞. Let us show that this property implies S = arg minΦ = . Indeed, since supn u(tn ) < +∞, the orbit u admits at least a weak asymptotic sequential cluster point u¯. By Proposition 17.2.7, we know that lim t →+∞ Φ(u(t )) = inf Φ. Hence by the lower semicontinuity property of Φ for the weak topology, we infer Φ( u¯) ≤ lim t →+∞ Φ(u(t )) = inf Φ. As a consequence u¯ ∈ S, and S = arg minΦ = . Weak versus strong convergence.
Let us state Baillon’s result [71, Proposition 1].
Theorem 17.2.8. There exists a closed convex proper function Φ : = l 2 (N) → R∪{+∞}, with ∂ Φ−1 (0) = , such that the semigroup S(t ) generated by the maximal monotone operator A = ∂ Φ satisfies the following property: there exists some a ∈ dom Φ such that S(t )a does not converge strongly to an element of ∂ Φ−1 (0). Bruck’s theorem, Theorem 17.2.7, states that for all u0 ∈ dom Φ, S(t )u0 converges weakly to an element of ∂ Φ−1 (0). Thus, Baillon’s counterexample is a constructive example of a closed convex proper function Φ : → R ∪ {+∞} and of a trajectory of the semigroup generated by ∂ Φ which converges weakly and not strongly. Baillon’s thesis [72] contains an extended version of [71] with a counterexample involving a convex function Φ of class ' 1 . Let us state it below in a precise form. Proposition 17.2.9. There exists a closed convex proper function Φ : = l 2 (N) → R ∪ {+∞}, with S = ∂ Φ−1 (0) = , such that the semigroup Sλ (t ) generated by the Yosida approximation Aλ = ∇Φλ of the maximal monotone operator A = ∂ Φ satisfies the following property: there exists some a ∈ dom ∂ Φ, and λ0 > 0, such that for any 0 < λ < λ0 , Sλ (t )a does not converge strongly to an element of ∂ Φ−1 (0). There are some important situations where the the orbits of the gradient flow associated to a convex potential Φ converge strongly: (i) Φ inf-compact, (ii) Φ strongly convex, (iii) Φ is an even function.
i
i i
i
i
i
i
706
book 2014/8/6 page 706 i
Chapter 17. Gradient flows
Item (i) is a clear consequence of the fact that, on the bounded subsets of the lower level sets of Φ, weak and strong convergence coincide. Let us successively examine item (ii) and (iii). The strongly convex case.
Proposition 17.2.10. Suppose that Φ : → R is a strongly convex lower semicontinuous function. Let u be an orbit of the generalized steepest descent associated to Φ. (i) Then, u converges strongly to the unique minimizer of Φ. (ii) Suppose, moreover, that Φ is differentiable with ∇Φ Lipschitz continuous on bounded +∞ sets. Then u has a finite length, i.e., 0 u˙(t )d t < +∞. PROOF. (i) For a strongly convex lower semicontinuous function (cf. (17.119) below), the set of minimizers is nonvoid and reduced to a single element. Moreover, any minimizing sequence converges strongly to the unique minimizer. By the minimization property Proposition 17.2.7, we infer the strong convergence of u. (ii) Let us now suppose that Φ is differentiable with ∇Φ Lipschitz continuous on bounded sets. Let us show that u (a classical orbit) has a finite length. As a direct consequence, this will give us another proof of the strong convergence property. The proof relies on the use of a Łojasiewicz inequality satisfied by a strongly convex function and is preparatory to the next section. Let z be the unique minimizer of Φ (we have ∇Φ(z) = 0). Since u is bounded, there exists some R > 0 such that the whole orbit u is contained in the ball B(z, R). Let LR > 0 be the Lipschitz constant of ∇Φ on B(z, R). The classical derivation chain rule gives, for any v ∈ ,
1
Φ(v) − Φ(z) =
〈∇Φ(z + t (v − z)), v − z〉 d t 0
=
1
〈∇Φ(z + t (v − z)) − ∇Φ(z), v − z〉 d t . 0
By the Cauchy–Schwarz inequality, and integration on [0, 1], we deduce that for any v ∈ B(z, R) LR |Φ(v) − Φ(z)| ≤ v − z2 . (17.118) 2 On the other hand, by the strong convexity assumption on Φ, there exists some positive constant α such that, for any v ∈ , 〈∇Φ(v) − ∇Φ(z), v − z〉 ≥ αv − z2 .
(17.119)
Since ∇Φ(z) = 0, by the Cauchy–Schwarz inequality, we deduce that v − z ≤
1 α
∇Φ(v).
(17.120)
Combining (17.118) and (17.120) we obtain |Φ(v) − Φ(z)| ≤
LR 2α2
∇Φ(v)2 .
i
i i
i
i
i
i
17.2. The gradient flow associated to a convex potential
book 2014/8/6 page 707 i
707
Equivalently, for any v ∈ B(z, R), v = z,
∇Φ(v) (Φ(v) − Φ(z))
1 2
≥α
2
1 2
.
LR
(17.121)
Let us introduce h : [0, +∞[→ [0, +∞[ 1
h(t ) := 2 (Φ(u(t )) − Φ(z)) 2
and show that h is a strict Lyapunov function. Recalling that Φ(u(·)) is a C 1 function, the classical derivation chain rule gives (as long as u(t ) = z) ˙h(t ) := 〈∇Φ(u(t )), u˙ (t )〉 . 1 (Φ(u(t )) − Φ(z)) 2 Using the gradient flow equation, we have ˙h(t ) + Equivalently ˙h(t ) +
∇Φ(u(t ))2 1
(Φ(u(t )) − Φ(z)) 2 ∇Φ(u(t )) 1
(Φ(u(t )) − Φ(z)) 2
≤ 0.
u˙(t ) ≤ 0.
Since u(t ) ∈ B(z, R), by using the Łojasiewicz-type inequality (17.121), we obtain ˙h(t ) + α
2 LR
1 2
u˙(t ) ≤ 0.
+∞ By integration of this inequality, and h ≥ 0, we obtain 0 u˙(t )d t < +∞, which gives the finite length property of the orbits in the strongly convex case. Strong convergence in the case Φ even is a rather unexpected result from Bruck [142]. A detailed proof is given below.
The case Φ even.
Proposition 17.2.11. Let Φ : → R ∪ {+∞} be a convex, lower semicontinuous, and proper function, which is supposed to be even, i.e., Φ(−v) = Φ(v) for all v ∈ . Let u : [0, +∞) → be a strong global trajectory of the generalized steepest descent (GSD). Then u(t ) converges strongly to some u∞ ∈ S = arg minΦ, as t → +∞. PROOF. First note that for an even function, ∂ Φ(−v) = −∂ Φ(v), which implies ∂ Φ(0) # 0. Hence S = arg minΦ is nonempty and contains the origin. By the Bruck theorem, u(t ) converges weakly to some u∞ ∈ S = arg minΦ, as t → +∞. Let us show that there is strong convergence. Let us fix some t0 > 0 and work on the interval [0, t0 ]. For any t ∈ [0, t0 ], set k(t ) = u(t ) − u(t0 )2 − 2u(t )2 . Derivation of k gives ˙ ) = 2 〈u(t ) − u(t ), u˙(t )〉 − 4 〈u(t ), u˙(t )〉 k(t 0 = −2 〈u(t ) + u(t0 ), u˙(t )〉 .
(17.122)
i
i i
i
i
i
i
708
book 2014/8/6 page 708 i
Chapter 17. Gradient flows
On the other hand, using successively the nonincreasing property of t → Φ(u(t )), the fact that Φ is even, and the convex subdifferential inequality associated to − u˙(t ) ∈ ∂ Φ(u(t )), we obtain Φ(u(t )) ≥ Φ(u(t0 )) ≥ Φ(−u(t0 )) ≥ Φ(u(t )) + 〈−u(t0 ) − u(t ), − u˙ (t )〉 . Hence
〈u(t0 ) + u(t ), u˙ (t )〉 ≤ 0.
(17.123)
˙ ) ≥ 0. Hence k is nondecreasing on [0, t ]. As Combining (17.122) and (17.123) gives k(t 0 a consequence, for any t ∈ [0, t0 ], k(t ) ≤ k(t0 ). Equivalently u(t ) − u(t0 )2 − 2u(t )2 ≤ −2u(t0 )2 , that is, for any 0 < t < t0 , u(t ) − u(t0 )2 ≤ 2u(t )2 − 2u(t0 )2 .
(17.124)
We know that for any z ∈ S, lim t →+∞ u(t ) − z exists (see the proof of the Bruck theorem). In particular, since 0 ∈ S, we infer that lim u(t )2 exists.
n→+∞
(17.125)
From (17.124) and (17.125) we deduce that lim
t ,s →+∞, t 0, u has a right derivative at t which satisfies d+u dt
(t ) = projT (u(t )) (−∇Ψ(u(t ))) .
(ii) Ψ(u(t )) decreases to infC Ψ as t increases to +∞. (iii) If S = arg minC Ψ is nonvoid, then u(t ) converges weakly to some u∞ ∈ S, as t → +∞. In the above system, the trajectories satisfy a completely inelastic shock law at the boundary. When reaching the boundary of the constraint C , the normal component of the velocity vector is set to zero, and the trajectory restarts tangentially to the boundary. This is especially interesting for modeling in unilateral mechanics or PDEs. 2. Indeed, from the perspective of optimization, the previous system has a major drawback: the orbits ignore the constraints until they meet the boundary of C . Moreover, the vector field which governs the dynamic is discontinuous (at the boundary of the constraint). The following dynamic, which was first considered by Antipin [32] and Bolte [102], gives a positive answer to these questions. First note that the optimality condition for (17.126) (17.127) ∇Ψ(u) + NC (u) # 0 can be equivalently formulated as u − projC (u − μ∇Ψ(u)) = 0,
(17.128)
i
i i
i
i
i
i
710
book 2014/8/6 page 710 i
Chapter 17. Gradient flows
where μ is a positive parameter (arbitrarily chosen). To obtain it, just write (17.127) as u + NC (u) # u − μ∇Ψ(u) and use that the resolvent (I + NC )−1 of the normal cone mapping is precisely the projection operator on C . This transformation is widely used in convex optimization. It transforms a variational inequality into a fixed point problem governed by an operator which is hopefully nonexpansive. The dynamic whose stationary points are the solutions of (17.128) is given by u˙(t ) + u(t ) − projC (u(t ) − μ∇Ψ(u(t ))) = 0 and is called the relaxed gradient-projection dynamic. The dynamic is now governed by a Lipschitz continuous vector field, and the orbits are classical solutions, i.e., continuously differentiable. Its properties are summarized in the following proposition. Proposition 17.2.13. Let Ψ : → R be a convex differentiable function whose gradient is Lipschitz continuous on bounded sets. Let C be a closed convex set in , and suppose that Ψ is bounded from below on C . Then, for any u0 ∈ , there exists a unique classical global solution u : [0, +∞[→ of the Cauchy problem for the relaxed gradient-projection dynamic u˙ (t ) + u(t ) − projC (u(t ) − μ∇Ψ(u(t ))) = 0; (17.129) u(0) = u0 . The following asymptotic properties are satisfied: (i) If S = arg minC Ψ is nonvoid, then u(t ) converges weakly to some u∞ ∈ S, as t → +∞. (ii) If moreover u0 ∈ C , then u(t ) ∈ C for all t ≥ 0, Ψ(u(t )) decreases to infC Ψ as t increases to +∞, and d μ Ψ(u(t )) + u˙(t )2 ≤ 0. dt PROOF. (i) We just give the main lines of the proof; see [102] for further details. The proof is based on using the following Lyapunov function : given z ∈ S = arg minC Ψ, set 1 E(t , z) = u(t ) − z2 + μ[Ψ(u(t )) − Ψ(z) − 〈∇Ψ(z), u(t ) − z〉]. 2 Time derivation of E(·, z) gives (for short we write E(t ) = E(t , z)) d dt
˙ )〉 + μ 〈∇Ψ(u(t )), u˙(t )〉 − μ 〈∇Ψ(z), u˙(t )〉 . E(t ) = 〈u(t ) − z, u(t
(17.130)
On the one hand, the optimality condition for z ∈ S = arg minC Ψ gives −∇Ψ(z) ∈ NC (z), i.e., 〈∇Ψ(z), z − ξ 〉 ≤ 0 ∀ξ ∈ C . Taking ξ = u˙(t ) + u(t ), which belongs to C (by (17.129)), we obtain 〈∇Ψ(z), z − u˙ (t ) − u(t )〉 ≤ 0. Equivalently − 〈∇Ψ(z), u˙(t )〉 ≤ 〈∇Ψ(z), u(t ) − z〉 .
(17.131)
i
i i
i
i
i
i
17.2. The gradient flow associated to a convex potential
book 2014/8/6 page 711 i
711
On the other hand, the obtuse angle condition for u˙ (t )+u(t ) = projC (u(t ) − μ∇Ψ(u(t ))) gives 〈u(t ) − μ∇Ψ(u(t )) − ( u˙(t ) + u(t )) , ξ − ( u˙(t ) + u(t ))〉 ≤ 0
∀ξ ∈ C .
Equivalently 〈μ∇Ψ(u(t )) + u˙(t ), u˙ (t ) + u(t ) − ξ 〉 ≤ 0 ∀ξ ∈ C . Taking ξ = z gives ˙ )〉 + μ 〈∇Ψ(u(t )), u˙(t )〉 ≤ − u˙ (t )2 − μ 〈∇Ψ(u(t )), u(t ) − z〉 . (17.132) 〈u(t ) − z, u(t Combining (17.130) with (17.131) and (17.132) gives d dt
E(t ) ≤ − u˙(t )2 − μ 〈∇Ψ(u(t )) − ∇Ψ(z), u(t ) − z〉 .
This clearly implies that E(·) is a nonincreasing function, from which one classically infers the global existence of trajectories and the energy estimate u˙ ∈ L2 (0, +∞). By using the Lyapunov function E(·), let us show how one can adapt the Opial argument and show the asymptotic weak convergence property. Let z1 = w − lim u(tn ) and z2 = w − lim u(sn ) be two weak sequential cluster points of u with tn → +∞ and sn → +∞. An elementary algebraic computation gives E(t , z2 ) − E(t , z1 ) = 〈u(t ), z1 − z2 〉 + μ 〈∇Ψ(z1 ) − ∇Ψ(z2 ), u(t )〉 + C (z1 , z2 ), where C (z1 , z2 ) is independent of t . Hence lim [〈u(t ), z1 − z2 〉 + μ 〈∇Ψ(z1 ) − ∇Ψ(z2 ), u(t )〉] exists.
t →∞
Replacing t successively by tn and sn and passing to the limit gives 〈z1 , z1 − z2 〉 + μ 〈∇Ψ(z1 ) − ∇Ψ(z2 ), z1 〉 = 〈z2 , z1 − z2 〉 + μ 〈∇Ψ(z1 ) − ∇Ψ(z2 ), z2 〉 . Equivalently
z1 − z2 2 + μ 〈∇Ψ(z1 ) − ∇Ψ(z2 ), z1 − z2 〉 = 0.
Since 〈∇Ψ(z1 ) − ∇Ψ(z2 ), z1 − z2 〉 ≥ 0 (by monotonicity of ∇Ψ), we obtain z1 = z2 . Then, one completes the proof as in the Bruck theorem. (ii) When u0 ∈ C , the relaxed gradient projection dynamics enjoys some supplementary properties. In that case, by writing the equation under the form u˙ (t ) + u(t ) = f (t ) with f (t ) ∈ C , and by integrating this linear differential equation, we obtain t f (s)e s d s. u(t ) = e −t u0 + e −t 0
Thus, u(t ) is equal to a barycenter of elements of C , which, by convexity of C , implies u(t ) ∈ C . Indeed, the same argument shows that if u0 belongs to the interior of C , then u(t ) also remains in the interior of C . The orbit possibly reaches the boundary of C only at the limit, as t → ∞.
i
i i
i
i
i
i
712
book 2014/8/6 page 712 i
Chapter 17. Gradient flows
17.2.8 First examples Let us first describe some direct applications in the case of a smooth potential. We shall further examine applications to PDEs, which in general require working with a nonsmooth potential. Steepest descent for linear least squares problems. Let and 6 be two Hilbert spaces equipped with the scalar products 〈, 〉 , and 〈, 〉6 , respectively. Let A : → 6 be a linear continuous operator from into 6 . We denote by t A : 6 → the transpose (adjoint) of A: ∀v ∈ , ∀y ∈ 6 〈Av, y〉6 = 〈v, t Ay〉 . Let us give b ∈ 6 . The function ΦA : → R+ 1 ΦA(v) = Av − b 26 2 is convex and differentiable. For any v ∈ , its gradient at v is given by ∇ΦA(v) = t A(Av − b ). The operator v → t A(Av − b ) is affine, continuous, and hence Lipschitz continuous on . The conditions of Theorem 17.1.1 are satisfied. Hence, for any Cauchy data u0 ∈ , there exists a unique global classical solution u ∈ C 1 ([0, +∞); ) of the Cauchy problem u˙ (t ) + t AA(u(t )) − t Ab = 0, (17.133) u(0) = u0 . Let us now discuss the asymptotic behavior of the trajectories of (17.133). Since ΦA : → R+ is a convex function, the critical points of ΦA are the solutions of the convex minimization problem 4 5 (17.134) min Av − b 26 . v∈
The following properties are direct consequences of the general results concerning the asymptotic behavior of trajectories of (SD) associated to a convex potential. Let u ∈ C 1 ([0, +∞); ) be the solution trajectory of (17.133). Then, as t → +∞, 4 5 (i) ΦA(u(t )) = 12 Au(t ) − b 26 decreases to the infimal value infv∈ Av − b 26 ; (ii) assume moreover that the solution set S of (17.134) is nonempty; then u(t ) converges weakly in to some u∞ ∈ S. The condition S = is satisfied in each of the following situations: (a) The range of A (denoted by R(A)) is a closed subspace of 6 . In that case, (17.134) is equivalent to finding the projection of b on the closed subspace R(A). Denoting by z an element such that Az = projR(A) b , the solution set is equal to S = z + ker A. Let us give an example where the condition R(A) is closed and fails to be satisfied: take = l 2 (N), the Hilbert space of sequences which are square summable. Given (αn ) a sequence of real numbers that satisfies 0 < αn ≤ 1 and n∈N (αn )2 < +∞, let A : → be defined by (ξn ) → (αn ξn ). One can easily verify that A is linear continuous, its range is a dense subspace of = l 2 (N) (it contains the sequences with compact support), but its range is not equal to the whole space ((αn ) does not belong to the range). Hence, the range of A is not closed.
i
i i
i
i
i
i
17.2. The gradient flow associated to a convex potential
book 2014/8/6 page 713 i
713
(b) If b ∈ R(A), then clearly S = , the infimal value is zero, and it is attained on the affine subset z + ker A, where z is an element such that Az = b . In that case, let us show that each trajectory of (SD) converges strongly to an element of S. Indeed, with w(t ) = u(t ) − z, (SD) can be reformulated as ˙ ) + t AA(w(t )) = 0. w(t This is the steepest descent equation associated with the convex even potential Ψ(v) = 1 Av26 . By Proposition 17.2.11, w(t ) converges strongly, as t → +∞, and so does 2 u(t ) = w(t ) + z. Steepest descent for coupled systems. With the same notation as in the preceding example, take A : → 6 as a linear continuous operator. Let us give f , g : → R convex, of class C 1 . We suppose that f and g are bounded from below on and that ∇ f and ∇ g are Lipschitz continuous on bounded sets. Set Φ : × → R 1 Φ(v, w) = f (v) + g (w) + A(v − w)26 . 2 Let us equip × with the Hilbertian product structure 〈(v, w), (ξ , η)〉 = 〈v, ξ 〉 + 〈w, η〉. One can easily verify that Φ is convex and continuously differentiable. The operator (v, w) → ∇Φ(v, w) = (∇ f (v) + t AA(v − w), ∇ g (w) + t AA(w − v)) is Lipschitz continuous on bounded subsets of × . The conditions of Theorem 17.1.1 are satisfied. Thus, for any Cauchy data (v0 , w0 ) ∈ × , there exists a unique classical solution (v(·), w(·)) ∈ C 1 ([0, +∞); × ) of ⎧ ˙ ) + ∇ f (v(t )) + t AA(v(t ) − w(t )) = 0, v(t ⎪ ⎪ ⎨ ˙ ) + ∇ g (w(t )) + t AA(w(t ) − v(t )) = 0, w(t ⎪ ⎪ ⎩ v(0) = v0 , w(0) = w0 .
(17.135)
Let us discuss the asymptotic behavior of the trajectories of (17.135). Since Φ : × → R is a convex function, the critical points of Φ are the solutions of the convex minimization problem 1 min f (v) + g (w) + A(v − w)26 . (17.136) (v,w)∈ × 2 The following properties are direct consequences of the general results concerning the asymptotic behavior of trajectories of (SD) associated to a convex potential. Let (v(·), w(·)) ∈ C 1 ([0, +∞); ) be the solution trajectory of (17.135). Then, as t → +∞, (i) Φ(v(t ), w(t )) = f (v(t )) + g (w(t )) + 12 A(v(t ) − w(t ))26 decreases to the infimal 4 5 value inf(v,w)∈ × f (v) + g (w) + 12 A(v − w)26 ; (ii) assume moreover that the solution set S of (17.136) is nonempty; then (v(t ), w(t )) converges weakly in × to some (v∞ , w∞ ) ∈ S.
i
i i
i
i
i
i
714
book 2014/8/6 page 714 i
Chapter 17. Gradient flows
17.2.9 Applications to PDEs Analysis of the generalized steepest descent, which was developed in the previous sections, is valid for arbitrary convex lower semicontinuous potentials. Thus, in order to apply it to a specific potential Φ, it suffices to calculate the corresponding subdifferential operator ∂ Φ. However, when working in functional spaces, solving this problem of convex analysis is not immediate. The full description of the operator ∂ Φ requires using subtle techniques. Just to mention a few of them, we will use the theory of maximal monotone operators, Fenchel duality, Sobolev spaces, and the regularity theory for elliptic PDEs. 1. The linear heat equation. Dirichlet boundary condition. Let Ω be a bounded open set in RN . Take = L2 (Ω), and define Φ : L2 (Ω) → R ∪ {+∞} by 1 2
Φ(v) =
Ω
∇v(x)2 d x
+∞
if v ∈ H01 (Ω), if v ∈ L2 (Ω), v ∈ / H01 (Ω).
Clearly, Φ is a convex and proper function, whose domain is H01 (Ω). Let us verify that Φ is lower semicontinuous for the topology of L2 (Ω). Equivalently, let us show that for any γ ∈ R, the sublevel set l e vγ Φ is closed for the topology of L2 (Ω). Let (vn ) be a sequence which satisfies, vn ∈ H01 (Ω), vn → v in L2 (Ω), and Φ(vn ) ≤ γ . Hence ∇vn (x)2 d x ≤ 2γ . Thus, (vn ) and its first distributional derivatives are bounded in Ω 2 L (Ω). As a consequence, the sequence (vn ) is bounded in H01 (Ω). Hence, vn converges weakly to v in H01 (Ω). Since Φ is convex continuous on H01 (Ω), it is lower semicontinuous for the weak topology of H01 (Ω). Thus, Φ(v) ≤ lim inf Φ(vn ) ≤ γ . Theorem 17.2.10. (a) The subdifferential A = ∂ Φ is equal to
5 4 dom A = v ∈ H01 (Ω) : Δv ∈ L2 (Ω) , A(v) = −Δv
for v ∈ dom(A),
(b) When Ω is regular, dom A = H 2 (Ω) ∩ H01 (Ω). Because of the importance of this result, and to illustrate the different strategies of demonstration, we give two different proofs (which can be extended to more involved situations). FIRST PROOF. We use the characterization of the subdifferential via the Fenchel conjugate and the extremality relation. Let us recall (see Proposition 9.5.1) that f ∈ ∂ Φ(u) ⇔ Φ(u) + Φ∗ ( f ) − 〈 f , u〉 = 0. Thus, the problem has been converted into the computation of Φ∗ ( f ). Indeed, in Chapter 9, Section 9.8, as an illustration of the Fenchel duality calculus, we proved that inf
1 2
v∈H01 (Ω)
= − inf
1 2
Ω
∇v(x) d x − 2
Ω
2
2
Ω
f (x)v(x)d x
N
y(x) d x : y ∈ L (Ω) , divy = f
.
i
i i
i
i
i
i
17.2. The gradient flow associated to a convex potential
book 2014/8/6 page 715 i
715
Equivalently ∗
Φ ( f ) = inf
1 2
Ω
2
2
N
y(x) d x : y ∈ L (Ω) , divy = f
.
Clearly, in the above expression, the infimum is achieved. As a consequence, the extremality relation which characterizes f ∈ ∂ Φ(u) can be equivalently formulated as follows: there exists y ∈ L2 (Ω)N such that divy = f , and 1 1 ∇u(x)2 d x + y(x)2 d x − f (x)u(x)d x = 0. (17.137) 2 Ω 2 Ω Ω By divy = f , and the definition of the derivation in the sense of distributions, we have that for any test function v ∈ (Ω) f (x)v(x)d x = − y(x).∇v(x)d x. Ω
Ω
Since y ∈ L (Ω) , this equality can be extended by continuity to any v ∈ H01 (Ω) and in particular to u. Hence, f (x)u(x)d x = − y(x).∇u(x)d x. (17.138) 2
N
Ω
Ω
Combining (17.137) and (17.138) we obtain 1 1 ∇u(x)2 d x + y(x)2 d x + y(x).∇u(x)d x = 0. 2 Ω 2 Ω Ω Equivalently
1 2
Ω
∇u(x) + y(x)2 d x = 0.
Hence y = −∇u. From divy = f , we finally infer −Δu = f . Note that the above argument only involves equivalent relations, which gives the claim. (b) When Ω is regular (∂ Ω of class C 2 ), we use the Agmon–Douglis–Nirenberg theorem (see [8], [137, Theorem IX.32]) which gives the H 2 (Ω) regularity of the solution of the Poisson equation −Δu = f , u ∈ H01 (Ω). Thus dom A = H 2 (Ω) ∩ H01 (Ω). SECOND PROOF. Let us introduce the operator A acting on = L2 (Ω) which is defined by 5 4 dom A = v ∈ H01 (Ω) : Δv ∈ L2 (Ω) , A(v) = −Δv for v ∈ dom A. We are going to prove that A is maximal monotone, and A ⊂ ∂ Φ. Since ∂ Φ is monotone (indeed, it is maximal monotone), this will imply A = ∂ Φ. Clearly, A is linear, and for any v ∈ dom A Ω
−Δv.(v)d x =
Ω
∇v(x)2 d x.
i
i i
i
i
i
i
716
book 2014/8/6 page 716 i
Chapter 17. Gradient flows
Hence, A is monotone. Similarly, for any u ∈ dom A and v ∈ dom Φ −Δu.(v − u)d x = 〈∇u, ∇v − ∇u〉 d x. Ω
Ω
≤ Φ(v) − Φ(u).
Thus, A ⊂ ∂ Φ. By Minty’s theorem, Theorem 17.2.1, it just remains to prove that R(I + A) = . Equivalently, for any f ∈ L2 (Ω), we have to prove the existence of a solution to the Dirichlet problem u − Δu = f on Ω, u ∈ H01 (Ω). This is a classical result (see Chapter 6, Theorem 6.1.1), which was obtained by applying the Lax–Milgram theorem. Let us now combine the above results and the general properties of the gradient flow associated to a convex lower semicontinuous potential. For simplicity of the statements, we suppose Ω is a bounded regular open set in Rn , whose topological boundary is denoted by ∂ Ω. Theorem 17.2.11. Let h ∈ L2 (Ω). Then, the following properties hold: (a) For any u0 ∈ L2 (Ω), there exists a unique solution u of the heat equation ⎧∂ u a.e. on Ω×]0, +∞[, ⎪ ⎨ ∂ t − Δu = h, u(x, t ) = 0 a.e. on ∂ Ω×]0, +∞[, ⎪ ⎩ u(x, 0) = u0 a.e. on Ω $ satisfying u ∈ C ([0, +∞[; L2 (Ω)), u(·, t ) ∈ H 2 (Ω) for any t > 0, and t ∂∂ ut ∈ L2 ([0, T ]; L2 (Ω)) for any T > 0. (b) As t → +∞, u(t ) converges strongly in H01 (Ω) to the unique solution w of the Dirichlet problem −Δw = h on Ω, w =0
on ∂ Ω.
PROOF. Let us consider the functional Φ : = L2 (Ω) → R ∪ {+∞} which is defined by 1 ∇v(x)2 d x − Ω h(x)v(x)d x if v ∈ H01 (Ω), 2 Ω Φ(v) = / H01 (Ω). +∞ if v ∈ L2 (Ω), v ∈ The functional v → Ω h(x)v(x)d x is linear continuous on L2 (Ω). Its gradient is constant and equal to h. By the additivity rule for the subdifferential of the sum of two convex lower semicontinuous functions (Theorem 9.5.4), and Theorem 17.2.10, the subdifferential A = ∂ Φ is equal to 5 4 dom A = v ∈ H01 (Ω) : Δv ∈ L2 (Ω) , A(v) = −Δv − h
for v ∈ dom A.
The existence of a strong global solution follows from Theorem 17.2.3 (regularizing effect) and the fact that the domain of Φ, which is equal to H01 (Ω), is dense in L2 (Ω). The asymptotic convergence result follows from Bruck’s theorem, Theorem 17.2.7, and the
i
i i
i
i
i
i
17.2. The gradient flow associated to a convex potential
book 2014/8/6 page 717 i
717
fact that the weak convergence in = L2 (Ω) and the convergence of the potential energies imply the strong convergence in H01 (Ω). Neumann boundary condition. Let Ω be a bounded open set in RN with smooth boundary ∂ Ω. Let h ∈ L2 (Ω). Take = L2 (Ω), and define Φ : L2 (Ω) → R ∪ {+∞} by 1 ∇v(x)2 d x − Ω h(x)v(x)d x if v ∈ H 1 (Ω), 2 Ω Φ(v) = +∞ if v ∈ L2 (Ω), v ∈ / H 1 (Ω). Clearly, Φ is a convex and proper function, whose domain is H 1 (Ω). By a similar argument to that used in the Dirichlet case, Φ is lower semicontinuous on = L2 (Ω), and its subdifferential A = ∂ Φ is equal to K L dom A = v ∈ H 2 (Ω) : ∂∂ nu = 0 on Ω , A(v) = −Δv − h
for v ∈ dom A.
Note that the Neumann boundary condition ∂∂ nu = 0 on ∂ Ω naturally occurs, when expressing an extremality relation on the space H 1 (Ω). This phenomena has been studied in detail in Section 6.2. Thus, by applying Theorem 17.2.3 (regularizing effect), and Bruck’s theorem, Theorem 17.2.7 (asymptotic behavior), we obtain the following result. Theorem 17.2.12. (a) For any u0 ∈ L2 (Ω) and h ∈ L2 (Ω), there exists a unique solution u of ⎧∂ u − Δu = h a.e. on Ω×]0, +∞[, ⎪ ⎪ ⎨∂t ∂u
=0 ∂n ⎪ ⎪ ⎩ u(x, 0) = u0
a.e. on ∂ Ω×]0, +∞[,
a.e. on Ω
satisfying u ∈ C ([0, +∞[; L2 (Ω)), u(·, t ) ∈ H 2 (Ω) for any t > 0, and L2 (Ω)) for any T > 0. (b) Suppose that the solution set S of the Neumann problem −Δw = h on Ω, ∂w ∂n
=0
on ∂ Ω.
$
t ∂∂ ut ∈ L2 ([0, T ];
(17.139)
is nonvoid. Then, as t → +∞, u(t ) converges strongly in H 1 (Ω) to a solution w ∈ S. Recall that the above Neumann problem is semicoercive. A necessary and sufficient condition for the existence of solutions to the stationary problem (17.139) is that h(x)d x = 0, in which case all the solutions differ by an additive constant (see SecΩ tion 6.2). Note that if S = , then for any initial data u0 ∈ L2 (Ω), the corresponding orbit u verifies lim t →+∞ u(t )L2 (Ω) = +∞. 2. The Stefan problem. The following problem is named after the physicist J. Stefan, who introduced it around 1890, in relation to problems of ice formation. It describes the temperature distribution in a medium undergoing a phase change, for example, ice passing to water. The Stefan problem is a free boundary problem, and the determination of the (time evolving) interface between the two phases is the central part of the problem. (As soon as it is known, we just need to solve the heat problem in both phases.) The following gradient flow approach to the Stefan problem was introduced by Brezis in [134].
i
i i
i
i
i
i
718
book 2014/8/6 page 718 i
Chapter 17. Gradient flows
The operator which governs the Stefan problem is identified to the subdifferential of an integral functional with respect to the H −1 (Ω) metric. It can be formulated as ∂u ∂t
− Δβ(u) = f
(17.140)
with given boundary conditions and Cauchy data. The operator β is a maximal monotone graph from R onto R, and it carries the physical information on the phase change. Let us fix the functional setting. Let Ω be a bounded open set in RN . Let = H −1 (Ω) be the topological dual of H01 (Ω); see Section 5.2. We know that the (minus) Laplace– Dirichlet operator −Δ is an isomorphism between H01 (Ω) and H −1 (Ω). For our purpose, it is convenient to introduce the scalar product on H −1 (Ω) which is induced by this isomorphism and by the scalar product on H01 (Ω), 〈u, v〉H 1 (Ω) = ∇u(x) · ∇v(x)d x, 0
Ω
which, by the Poincaré inequality, induces a norm which is equivalent to the classical one. For any f , g ∈ H −1 (Ω) we set [ \ 〈 f , g 〉 H −1 (Ω) = (−Δ)−1 f , (−Δ)−1 g 1 =
H0 (Ω)
N i =1
[
∂ * Ω
∂ xi
(−Δ)−1 f
= (−Δ)−1 f , g
\
+ ∂ * ∂ xi
(H01 (Ω),H −1 (Ω))
+ (−Δ)−1 g d x
.
This last expression is the duality bracket between g ∈ H −1 (Ω) = H01 (Ω)∗ and (−Δ)−1 f ∈ H01 (Ω). Equipped with this scalar product, = H −1 (Ω) is a Hilbert space with a norm that is equivalent to the classic. Let us give j : R → R ∪ {+∞}, a convex, lower semicontinuous, proper function which is strongly coercive, i.e., lim
|r |→+∞
j (r ) |r |
= +∞.
(17.141)
Let β = ∂ j be the subdifferential of j . It is a maximal monotone graph from R into R that satisfies R(β) = R. To obtain this last property, note that for any s ∈ R, the convex lower semicontinuous function r → j (r ) − r s is coercive (a consequence of (17.141)) and hence attains its minimum at a point ¯r . Writing the optimality condition gives β(¯r ) # s, whence the result. Let us define Φ : H −1 (Ω) → R ∪ {+∞} by ⎧ if v ∈ H −1 (Ω) ∩ L1 (Ω), j (v) ∈ L1 (Ω), ⎨ Ω j (v(x))d x Φ(v) = (17.142) ⎩ +∞ otherwise. The following result, which describes the subdifferential of Φ on the space H −1 (Ω), makes the link with formulation (17.140) of the Stefan problem. Theorem 17.2.13. (a) The function Φ : H −1 (Ω) → R ∪ {+∞} is convex, lower semicontinuous, and proper on H −1 (Ω).
i
i i
i
i
i
i
17.2. The gradient flow associated to a convex potential
book 2014/8/6 page 719 i
719
(b) The subdifferential A = ∂ Φ is equal to
5 4 −1 1 1 dom A = 5 β(v(x)) a.e. on Ω , 4 v ∈ H (Ω) ∩ L1 (Ω) : ∃w ∈ H0 (Ω) such that w(x) ∈ A(v) = −Δw : w ∈ H0 (Ω) and w(x) ∈ β(v(x)) a.e. on Ω . (17.143)
PROOF. (a) Let (un ) be a sequence such that un ∈ H −1 (Ω) ∩ L1 (Ω), un → u in H −1 (Ω) and Ω
j (un (x))d x ≤ λ.
By the De La Vallée–Poussin theorem, Theorem 2.4.4, and the Dunford–Pettis theorem, Theorem 2.4.5, the sequence (un ) is σ(L1 , L∞ ) sequentially relatively compact. Thus, we can extract a subsequence (unk ) such that unk → u˜ weakly in L1 (Ω). Since un → u in 1 H −1 (Ω), we have u = u˜ and the whole sequence un → u weakly in L (Ω). By1 Fatou’s lemma, the integral functional v → Ω j (v(x))d x is lower semicontinuous on L (Ω). Being convex, it is lower semicontinuous for the weak topology σ(L1 , L∞ ). Hence
Ω
j (u(x))d x ≤ lim inf n
Ω
j (un (x))d x ≤ λ,
which proves that the lower-level sets of Φ are closed. Hence Φ is lower semicontinuous on H −1 (Ω). (b) To show that A = ∂ Φ, where A is the operator described in (17.143), we prove that A ⊂ ∂ Φ, and A is maximal monotone. The following lemma from [134] will play a key role. In particular, it allows between F ∈ H −1 (Ω) and us to interpret the duality pairing 1 1 w ∈ H0 (Ω) as an integral Ω F (x)w(x)d x when F w ∈ L (Ω) . Lemma 17.2.6. Let F ∈ H −1 (Ω) ∩ L1 (Ω), and let w ∈ H01 (Ω). Let g ∈ L1 (Ω), and let h be measurable with F (x)w(x) ≥ h(x) ≥ g (x) a.e. x ∈ Ω. (17.144) Then h ∈ L1 (Ω), and
〈w, F 〉(H 1 (Ω),H −1 (Ω)) ≥ 0
h(x)d x. Ω
PROOF. For each n ∈ N, set ⎧ ⎨n wn = w ⎩−n Set hn = h we obtain
wn , w
and gn = g
wn . w
if w ≥ n, if |w| ≤ n, if w ≤ −n.
Multiplying (17.144) by the nonnegative function
F (x)wn (x) ≥ hn (x) ≥ gn (x)
wn , w
a.e. x ∈ Ω,
and hence 0 ≤ hn (x) − gn (x) ≤ F (x)wn (x) − gn (x)
a.e. x ∈ Ω.
(17.145)
i
i i
i
i
i
i
720
book 2014/8/6 page 720 i
Chapter 17. Gradient flows
Since F ∈ L1 (Ω) and wn ∈ L∞ (Ω), we have F wn ∈ L1 (Ω). After integration of (17.145), we obtain (hn − gn )d x ≤ F wn d x − gn d x. (17.146) Ω
Then note that
Ω
Ω
Ω
F wn d x = 〈wn , F 〉(L∞ (Ω),L1 (Ω)) = 〈wn , F 〉(H 1 (Ω),H −1 (Ω)) . 0
By Fatou’s lemma, hn − gn nonnegative, and hn − gn → h − g a.e. on Ω, we deduce that (h − g )d x ≤ lim inf (hn − gn )d x. n
Ω
Ω
Since contractions operate on H01 (Ω) (see Section 5.8), we have wn → w in H01 (Ω). Moreover, gn → g in L1 (Ω). As a consequence, by passing to the limit on (17.146), we obtain g d x. (17.147) 0 ≤ (h − g )d x ≤ 〈w, F 〉(H 1 (Ω),H −1 (Ω)) − 0
Ω
Ω
Hence, h − g ∈ L (Ω), which implies h ∈ L (Ω), and after simplification of (17.147) h(x)d x ≤ 〈w, F 〉(H 1 (Ω),H −1 (Ω)) , 1
1
Ω
0
which completes the proof of Lemma 17.2.6. PROOF OF THEOREM 17.2.13 CONTINUED. Let us prove that A ⊂ ∂ Φ. Let f ∈ Au, i.e., u ∈ H −1 (Ω) ∩ L1 (Ω), f = −Δw, with w ∈ H01 (Ω), and w(x) ∈ β(u(x)) a.e. on Ω. Let v ∈ dom Φ, i.e., v ∈ H −1 (Ω) ∩ L1 (Ω), j (v) ∈ L1 (Ω). For a.e. x ∈ Ω, by the convex subdifferential inequality, j (v(x)) − j (u(x)) ≥ w(x)(v(x) − u(x)). Equivalently w(x)(u(x) − v(x)) ≥ j (u(x)) − j (v(x)). Let us apply Lemma 17.2.6 with F = u − v ∈ H −1 (Ω) ∩ L1 (Ω), w ∈ H01 (Ω), and h = j (u) − j (v). Noticing that j (r ) ≥ −C (1 + |r |) for some positive constant C , we have j (u(x)) − j (v(x)) ≥ g (x) := −C (1 + |u(x)|) − j (v(x)) with g ∈ L1 (Ω). Thus, the conditions of Lemma 17.2.6 are satisfied. We conclude that j (u) ∈ L1 (Ω), and j (v)d x − j (u)d x ≥ 〈w, v − u〉(H 1 (Ω),H −1 (Ω)) 0 Ω Ω \ [ −1 = (−Δ) f , v − u (H01 (Ω),H −1 (Ω)) = 〈 f , v − u〉 H −1 (Ω) . Hence, f ∈ ∂ Φ(u).
i
i i
i
i
i
i
17.2. The gradient flow associated to a convex potential
book 2014/8/6 page 721 i
721
Let us complete the proof of Theorem 17.2.13 by showing that A is maximal monotone. Since A ⊂ ∂ Φ, we have that A is monotone. Thus, by Minty’s theorem, Theorem 17.2.1, we just need to prove that R(I +A) = . Equivalently, for any given f ∈ H −1 (Ω), we have to find u ∈ H −1 (Ω) ∩ L1 (Ω), and w ∈ H01 (Ω) such that
w(x) ∈ β(u(x)) a.e. on Ω, u − Δw = f .
(17.148)
Let us rewrite (17.148) as a semilinear equation. Set γ = β−1 . Since R(β) = R, we have dom γ = R. Thus (17.148) is equivalent to finding w ∈ H01 (Ω) such that γ (w) − Δw # f
(17.149)
is satisfied in the following sense: there exists some z ∈ H −1 (Ω) ∩ L1 (Ω) such that z(x) ∈ γ (w(x)) a.e. on Ω and z − Δw = f . Solving this nonlinear equation is not immediate. The difficulty is that the second member of the equation belongs to H −1 (Ω). By contrast, for f ∈ L2 (Ω), the methods developed in Section 6.2.4 provide existence and uniqueness of a solution to (17.149) for an arbitrary monotone graph γ . To solve (17.149) with f ∈ H −1 (Ω), we use the fact that dom γ = R. Without loss of generality, we can assume that 0 ∈ γ (0). (This amounts to replacing f by f − γ (0), which still belongs to H −1 (Ω).) Let us approximate (17.149) by using the Yosida approximation γλ of the monotone graph γ . For any λ > 0, there exists a unique solution wλ ∈ H01 (Ω) of γλ (wλ ) − Δwλ = f .
(17.150)
This can be achieved by a variational argument. Let us consider the convex minimization problem 1 ( j ∗ )λ (w(x))d x + |∇w(x)|2 d x − 〈w, f 〉(H 1 (Ω),H −1 (Ω)) : w ∈ H01 (Ω) , min 0 2 Ω Ω (17.151) r where γ = (∂ j )−1 = ∂ j ∗ and ( j ∗ )λ (r ) = 0 γλ (s)d s. Since (17.151) is a strongly convex minimization problem, it has a unique solution wλ . To write the corresponding optimal ity condition, we can notice that the convex integral functional v → Ω ( j ∗ )λ (w(x))d x is continuous on L2 (Ω) (because ( j ∗ )λ (r ) ≤ C (1+|r |2 )) and hence on H01 (Ω). The additivity rule (Theorem 9.5.4) for the subdifferential of a sum of convex functions gives (17.150). Multiplying (17.150) by wλ , and integrating on Ω, we obtain sup wλ H 1 (Ω) < +∞ 0
λ
and
(17.152)
sup λ
Ω
γλ (wλ )wλ d x < +∞.
(17.153)
By (17.152) and the Rellich–Kondrakov theorem, we can find a sequence λn → 0 such that weakly in H01 (Ω);
wλ n w
wλn (x) → w(x) −1
(17.154)
a.e. on Ω;
(I + λn γ ) wλn (x) → w(x)
a.e. on Ω.
(17.155)
i
i i
i
i
i
i
722
book 2014/8/6 page 722 i
Chapter 17. Gradient flows
This last result comes from the contraction property of the resolvents (Proposition 17.2.1) |(I + λn γ )−1 wλn (x) − (I + λn γ )−1 w(x)| ≤ |wλn (x) − w(x)| and the approximation property of the resolvents (Proposition 17.2.2) (I + λn γ )−1 r → r
as n → ∞ ∀r ∈ dom γ = R.
In order to pass to the limit on (17.150) in the distribution sense, let us show that (γλ (wλ )) is σ(L1 , L∞ ) sequentially relatively compact. By definition of the Yosida approximation, we have γλ (wλ )(I + λγ )−1 wλ = γλ (wλ ) (wλ − λγλ (wλ )) = γλ (wλ )wλ − λ|γλ (wλ )|2 ≤ γλ (wλ )wλ . From (17.153) we deduce that sup γλ (wλ )(I + λγ )−1 wλ d x < +∞. λ
(17.156)
Ω
Since γλ (wλ ) ∈ γ (I + λγ )−1 wλ , and γ = (∂ j )−1 = ∂ j ∗ , by the Fenchel extremality relation j ∗ ((I + λγ )−1 wλ ) + j (γλ (wλ )) = γλ (wλ )(I + λγ )−1 wλ . From (17.156), and j ∗ minorized (a consequence of 0 ∈ γ (0)), we obtain sup j (γλ (wλ ))d x < +∞. λ
Ω
Since j is strongly coercive (17.141), using again the De La Vallée–Poussin theorem, Theorem 2.4.4, and the Dunford–Pettis theorem, Theorem 2.4.5, we obtain that the net (γλ (wλ )) is σ(L1 , L∞ ) sequentially relatively compact. Thus, we can extract a sequence (still denoted λn ) such that λn → 0 and find some z ∈ L1 (Ω) such that γλn (wλn ) z
weakly in L1 (Ω).
(17.157)
From (17.154) and (17.157), by passing to the limit on (17.150) in the distribution sense, we obtain z − Δw = f with z ∈ H −1 (Ω) ∩ L1 (Ω). To complete the proof of Theorem 17.2.13, we just need to prove that z(x) ∈ γ (w(x)) a.e. on Ω. It is sufficient to prove that, for every N ∈ N, z(x) ∈ γ (w(x)) a.e. on ΩN := {x ∈ Ω : |w(x)| ≤ N } . By (17.155), vn (x) := (I + λn γ )−1 wλn (x) → w(x) a.e. on Ω. Hence, by Egorov’s theorem, since |Ω| < +∞, for any ε > 0 there is a measurable set E ⊂ ΩN such that |E| < ε, and vn → w uniformly on ΩN \ E. Thus, setting zn := γλn (wλn ), we are reduced to the following situation: zn (x) ∈ γ (vn (x))
a.e. on Ω,
vn → w uniformly on Ω; zn z weakly in L1 (Ω), w is bounded.
i
i i
i
i
i
i
17.2. The gradient flow associated to a convex potential
book 2014/8/6 page 723 i
723
We can now use a monotonicity argument involving spaces in duality (namely, L1 (Ω) and ˜ a.e. on Ω. By the L∞ (Ω)). Let v˜ ∈ L∞ (Ω) and f˜ ∈ L1 (Ω) be such that f˜(x) ∈ γ (v(x)) monotonicity of γ , and after integration on Ω, we obtain ˜ − vn (x))d x ≥ 0. ( f˜(x) − zn (x))(v(x) Ω
By passing to the limit we obtain ˜ − w(x))d x ≥ 0. ( f˜(x) − z(x))(v(x)
(17.158)
Ω
Take v˜ := (I + γ )−1 (w + z). Since (vn ) is uniformly bounded, zn (x) ∈ γ (vn (x)), and dom γ = R, the sequence (zn ) remains uniformly bounded, and hence z ∈ L∞ (Ω). As a ˜ # w+z. consequence w+z ∈ L∞ (Ω), and v˜ := (I +γ )−1 (w+z) ∈ L∞ (Ω). We have v˜+γ (v) 1 ˜ ˜ ˜ ˜ which satisfies f ∈ L (Ω) and f (x) ∈ γ (v(x)) ˜ By taking in (17.158) f = w + z − v, a.e. on Ω, we obtain Ω
˜ − w(x)|2 d x ≤ 0. |v(x)
˜ ˜ Hence v(x) = w, which in turn implies f˜ = w + z − v˜ = z. Since f˜(x) ∈ γ (v(x)) a.e. on Ω, we finally obtain z(x) ∈ γ (w(x)) a.e. on Ω. As a direct consequence of Theorem 17.2.13, and the theory of gradient flow, we obtain the following result. For simplicity of the statement, we suppose that β is a monotone and continuous function from R onto R. Theorem 17.2.14. Let β : R → R be a continuous monotone function such that dom β = R and R(β) = R. Let h ∈ H −1 (Ω). (a) For any u0 ∈ H −1 (Ω), there exists a unique solution u of the Stefan problem ⎧ on Ω×]0, +∞[, ⎨ ∂∂ ut − Δβ(u) = h (17.159) β(u)(x, t ) = 0 on ∂ Ω×]0, +∞[, ⎩ on Ω u(x, 0) = u0 satisfying u ∈ C ([0, +∞[; H −1 (Ω)), β(u(x, t )) ∈ H01 (Ω) for all t > 0, u(x, t ) ∈ L1 (Ω) for $ all t > 0, t ∂∂ ut ∈ L2 ([0, T ]; H −1 (Ω)) for any T > 0. (b) As t → +∞, β(u(x, t )) converges strongly in H01 (Ω) to the solution w∞ ∈ H01 (Ω) of the Dirichlet problem on Ω, −Δw∞ = h (17.160) w∞ = 0 on ∂ Ω. Moreover, if the solution set S = (∂ Φ)−1 (h) = , then u(x, t ) converges weakly in H −1 (Ω) to an element of S. PROOF. Take = H −1 (Ω), and consider the gradient flow associated to the potential v → Ψ(v) = Φ(v) − 〈v, h〉(H 1 (Ω),H −1 (Ω)) , where Φ : H −1 (Ω) → R ∪ {+∞} is the convex 0
lower semicontinuous functional defined in (17.142) (with β = ∂ j ), ⎧ if v ∈ H −1 (Ω) ∩ L1 (Ω), j (v) ∈ L1 (Ω), ⎨ Ω j (v(x))d x Φ(v) = ⎩ +∞ otherwise.
(17.161)
i
i i
i
i
i
i
724
book 2014/8/6 page 724 i
Chapter 17. Gradient flows
Note that L∞ (Ω) ⊂ dom Φ (a consequence of the continuity of β and j ). Thus, the domain of Ψ is dense in H −1 (Ω). By the regularizing effect (cf. Theorem 17.2.3) and Theorem 17.2.13, which describes the subdifferential of Φ on the space H −1 (Ω), we deduce that for any u0 ∈ H −1 (Ω), there exists a unique strong solution u of the Stefan problem (17.159). Moreover, u(t ) ∈ dom Ψ = dom Φ for all t > 0. That’s the existence part (a) of the above theorem. Concerning the asymptotic behavior, note that −Δβ(u) = h −
∂u ∂t
.
Since ∂∂ ut (t ) converges strongly to zero in H −1 (Ω) as t → +∞ (see Proposition 17.2.8) we deduce that β(u(x, t )) converges strongly in H01 (Ω) to the solution of the Dirichlet problem (17.160). The weak convergence of u(t ) in H −1 (Ω) follows from Theorem 17.2.7. Remark 17.2.5. 1. Depending on the choice of the monotone graph β, (17.140) provides different PDEs. After transformation, the free boundary Stefan problem (ice-water) corresponds to ⎧ for r ≤ 0, ⎨αr for 0 < r < k, β(r ) = 0 ⎩β(r − k) for r ≥ k, where α and β are positive diffusion coefficients, and k is related to the latent heat (the amount of heat energy required to change the phase of an unit mass of a substance). The case β(r ) = |r | p−1 r has also received a lot of attention, because of its role in porous media; see [36], [314], [356]. 2. The above approach to the Stefan problem illustrates the flexibility of the gradient flow methods. By playing with the potential and the metric (the scalar product may also vary with time), one can get different types of PDEs; see [49], [134], [169], [188], and references therein. 3. L1 (Ω) plays a central role in the above analysis. Indeed, one can develop a different approach to the Stefan problem, which is based on the maximal accretivity property of the operator −Δβ in L1 (Ω); see [94]. 4. The Stefan problem has a natural link with other phase transition models. The solution of the Cahn–Hilliard phase separation equation for a binary mixture is reasonably comparable with the solution of a Stefan problem; see [318].
17.3 Gradient flow associated to a tame function. Kurdyka–Łojasiewicz theory 17.3.1 The analytic case, Łojasiewicz inequality Let us first recall some classical notation and basic definitions concerning real-valued analytic functions of several variables. an N-dimensional multi-index, K = (k1 , . . . , kN ). Definition 17.3.1. Let K ∈ NN beJ (a) We set |K| = N k , K! = ki !, and for each x ∈ RN , x K = x1 k1 × · · · × xN kN . i =1 i
i
i i
i
i
i
i
17.3. Gradient flow associated to a tame function. Kurdyka–Łojasiewicz theory
book 2014/8/6 page 725 i
725
(b) Given Φ : RN → R, and a ∈ RN , we set ∂ |K| Φ
D K Φ(a) =
k
k
∂ x1 1 · · · ∂ xNN
(a).
Definition 17.3.2. Let U be an open subset of RN . A function Φ : U ⊂ RN → R is said to be analytic if locally it is given by a convergent power series. An analytic functions is infinitely differentiable and is equal to its Taylor series in some neighborhood of each point of its domain. More precisely, for each a ∈ U , there exists an open set 3 , a ∈ 3 ⊂ U such that for all x ∈ 3 D K Φ(a)
Φ(x) =
N
K∈N
K!
(x − a)K .
The following well-known result is due to Łojasiewicz [278]. It is known as the Łojasiewicz inequality. It plays a central role in the proof of the convergence of the orbits of the gradient flow associated to a real-analytic potential. Its further extensions are at the core of the modern semialgebraic and semianalytic geometry. Theorem 17.3.1 (Łojasiewicz inequality). Let U be an open set in RN , Φ : U ⊂ RN → R a real-analytic function, and u¯ ∈ U a critical point of Φ. Then, there exists θ ∈ [ 12 , 1), C > 0, and a neighborhood W of u¯ such that |Φ(v) − Φ( u¯ )|θ ≤ C ∇Φ(v).
∀v ∈ W
Remark 17.3.1. The Łojasiewicz inequality is trivially satisfied at any point u¯ which is not critical. Remark 17.3.2. An elegant proof of the Łojasiewicz inequality can be found in [267]. In dimension N = 1, the proof is elementary, as shown below. By analyticity of Φ, there exists a sequence (ak ) of real numbers, p0 ≥ 2, and a p0 = 0 such that for all v in a neighborhood of u¯ +∞ Φ(v) − Φ( u¯ ) = ak (v − u¯)k . k= p0
Differentiating term by term, we obtain Φ (v) =
+∞
kak (v − u¯)k−1 .
k= p0
, and v = u¯ close to u¯, Taking θ ∈ R+ ∗ |Φ(v) − Φ( u¯ )|θ
|Φ (v)| By taking 1 > θ > 1 −
1 p0
≈
1 p0 |a p0 |1−θ
|v − u¯| p0 (θ−1)+1 .
and v sufficiently close to u¯, we obtain |Φ(v) − Φ( u¯ )|θ ≤ |Φ (v)|.
i
i i
i
i
i
i
726
book 2014/8/6 page 726 i
Chapter 17. Gradient flows
Remark 17.3.3. As a consequence of the Łojasiewicz inequality, we obtain that the critical values of a real-analytic function are isolated. Indeed, if u¯ ∈ U is a critical point of Φ, and v is another critical point in the neighborhood W of u¯ for which the inequality holds, then Φ(v) = Φ( u¯). Let us stress that, in general, the critical points of a real-analytic function are not iso 2 lated. For example, the function Φ : U ⊂ R2 → R which is defined by Φ(x) = x2 − 1 is a polynomial (hence analytic) function whose critical set is S 1 ∪ 0. (The elements of S 1 are critical points which are not isolated.) By contrast the critical values of Φ are 0 and 1, which are isolated in R. Remark 17.3.4. It is convenient to reformulate the Łojasiewicz inequality as follows. Take α(s) = c s 1−θ (with constant c ad hoc). Then for all v in W that satisfies Φ(v) > Φ( u¯), we have (17.162) α (Φ(v) − Φ( u¯))∇Φ(v) ≥ 1. Equivalently ∇(α ◦ (Φ − Φ( u¯ )))(v) ≥ 1. The function α is called a desingularizing function (this terminology is justified by the above property). Note that α is an increasing concave function, and α(0) = 0. This formulation is quite useful for the geometrical understanding of the Łojasiewicz inequality and further generalizations. Theorem 17.3.2 (Łojasiewicz [279]). Let Φ : U ⊂ RN → R be a real-analytic function. Then any bounded trajectory of the steepest descent dynamical system (SD)
u˙(t ) + ∇Φ(u(t )) = 0
has a finite length and converges to a critical point of Φ. PROOF. Φ is a real-analytic function. In particular, it is continuously differentiable, and its gradient is Lipschitz continuous on bounded sets. Hence, we can apply Theorems 17.1.1 and 17.1.3, which concern the classical continuous steepest descent. Note that in these theorems, we make the assumption “Φ is minorized.” Indeed, in the proof, we only use the weaker property “Φ is minorized along the orbit.” In our situation, this last property is satisfied: it is a consequence of the fact that the orbit u is bounded, Φ is continuous, and we are in a finite dimensional setting. As a consequence, any orbit u of (SD) satisfies the following: (i) t → Φ(u(t )) is a decreasing function, and for all t ≥ 0, (ii)
∞ 0
d Φ(u(t )) = − u˙(t )2 . dt
u˙(t )2 d t < +∞.
(iii) lim t →+∞ u˙ (t ) = 0,
lim t →+∞ ∇Φ(u(t )) = 0.
As a key property providing asymptotic convergence, we are going to show that in the analytic case, the orbit has a finite length, i.e.,
+∞
u˙(t )d t < +∞.
(17.163)
0
i
i i
i
i
i
i
17.3. Gradient flow associated to a tame function. Kurdyka–Łojasiewicz theory
book 2014/8/6 page 727 i
727
Indeed, this last property implies that for any 0 ≤ s ≤ t < +∞ t u(t ) − u(s) ≤ u˙(τ)d τ
s
t
≤
s
u˙(τ)d τ − 0
u˙(τ)d τ. 0
By the finite length property (17.163), we immediately infer that lim s ,t →+∞ u(t ) − u(s) = 0. This Cauchy property implies the convergence of u(t ) as t → +∞. By continuity of ∇Φ, and item (iii) above, we will obtain that the limit is a critical point of Φ. Thus the only point we need to prove is (17.163). The orbit has been assumed to be bounded in RN . Let u(tn ) → u∞ for some sequence tn → +∞. Then ∇Φ(u∞ ) = 0, i.e., u∞ is a critical point of Φ. By item (i) above, and continuity of Φ, lim Φ(u(t )) = inf Φ(u(t )) = Φ(u∞ ).
t →+∞
t ≥0
(17.164)
We are going to consider two cases: Case 1: There exists some T > 0 such that Φ(u(T )) = Φ(u∞ ). By the nonincreasing property of t → Φ(u(t )) (item (i)) above, and (17.164) we deduce that for any T ≤ t < +∞ Φ(u∞ ) ≤ Φ(u(t )) ≤ Φ(u(T )) = Φ(u∞ ). Hence Φ(u(·)) is constant for T ≤ t < +∞. From item (i), and ddt Φ(u(t )) = − u˙(t )2 , we deduce that u˙(t ) = 0 for T ≤ t < +∞. As a consequence +∞ T "1/2 $ ! T u˙(t )d t = u˙(t )d t ≤ T u˙(t )2 d t < +∞. 0
0
0
Case 2: For all t > 0, we have Φ(u(t )) > Φ(u∞ ). By the Łojasiewicz inequality (Theorem 17.3.1) there exists a neighborhood W of u∞ and a desingularizing function α (see Remark 17.3.4) such that for all v in W that satisfies Φ(v) > Φ( u¯), we have α (Φ(v) − Φ(u∞ ))∇Φ(v) ≥ 1.
(17.165)
We consider two steps: Step 2.1. Suppose that for t large enough, say, t ≥ T for some T > 0, we have u(t ) ∈ W . In that case, let us show that the function h(t ) := α(Φ(u(t )) − Φ(u∞ )) is a Lyapunov function. Indeed, by using successively the classical derivation chain rule, and the (SD) equation, we have ˙h(t ) = α (Φ(u(t )) − Φ(u )) 〈∇Φ(u(t )), u˙ (t )〉 ∞ = −α (Φ(u(t )) − Φ(u∞ )) ∇Φ(u(t )) 2 . Equivalently ˙h(t ) + α (Φ(u(t )) − Φ(u )) ∇Φ(u(t )) ∇Φ(u(t )) = 0. ∞
(17.166)
Since u(t ) ∈ W for t ≥ T , and Φ(u(t )) > Φ(u∞ ), by the Łojasiewicz inequality (17.165), we have, for all t ≥ T , α (Φ(u(t )) − Φ(u∞ ))∇Φ(u(t )) ≥ 1.
(17.167)
i
i i
i
i
i
i
728
book 2014/8/6 page 728 i
Chapter 17. Gradient flows
From (17.166) and (17.167) we infer ˙h(t )+ ∇Φ(u(t )) ≤ 0.
(17.168)
By (SD) we have ∇Φ(u(t )) = − u˙(t ). Hence, (17.168) gives ˙h(t )+ u˙(t ) ≤ 0. Since h is nonnegative, integration of this inequality from T to t > T gives
t
u˙(s)d s ≤ h(T ) < +∞. T
This majorization being valid for any t > T , we finally obtain (17.163). Step 2.2. Let us show that one can always find some T > 0 such that u(t ) ∈ W for t ≥ T and thus reduce to the situation examined in the previous Step 2.1. Set R > 0 such that the ball B(u∞ , R) ⊂ W . Since u(tn ) → u∞ , and α is continuous at 0 (α(0) = 0), there exists some integer N such that u(tN ) − u∞ < α(Φ(u(tN )) − Φ(u∞ ))
0 for all s ∈ ]0, η[; • ϕ is concave; and such that for all x in U ∩ [ f (¯ x ) < f < f (¯ x ) + η], the (KL) inequality holds: x )) dist(0, ∂ f (x)) ≥ 1. (KL) ϕ ( f (x) − f (¯
(17.171)
(b) Proper lower semicontinuous functions that satisfy the (KL) inequality at each point of dom ∂ f are called (KL) functions. Remark 17.3.6. The (KL) inequality has a rich story. The general concept as defined above has been gradually emerging. The following are some important steps: • Łojasiewicz in [278] (1963) introduced the concept for real analytic functions with ϕ(s) = s 1−θ , θ ∈ [ 12 , 1).
i
i i
i
i
i
i
17.3. Gradient flow associated to a tame function. Kurdyka–Łojasiewicz theory
book 2014/8/6 page 731 i
731
• Kurdyka in [266] (1998) extended the concept to differentiable functions definable in an o-minimal structure (semialgebraic, subanalytic), whence the terminology. • Bolte et al. in [105] (2007) gave the first extension to nonsmooth functions by considering Clarke subgradients of nonsmooth functions definable in an o-minimal structure. • Attouch et al. in [40] (2010) introduced the above (KL) formulation; see also [41]. Semialgebraic sets and functions. Functions which can be defined by a finite number of polynomial equalities or inequalities are called semialgebraic. They provide a rich class of functions satisfying the (KL) inequality. Definition 17.3.6. (a) A subset S of RN is a real semialgebraic set if there exists a finite number of real polynomial functions Pi j , Qi j : RN → R such that S=
p ' q 3
{ x ∈ Rn : Pi j (x) = 0, Qi j (x) < 0}.
j =1 i =1
(b) A function f : RN → R ∪ {+∞} (respectively, a point-to-set mapping F : RN → R m ) is called semialgebraic if its graph {(x, λ) ∈ RN +1 : f (x) = λ} (respectively, {(x, y) ∈ RN +m : y ∈ F (x)}) is a semialgebraic subset of RN +1 (respectively, RN +m ). One easily sees that the class of semialgebraic sets is stable under the operation of finite union, finite intersection, Cartesian product, or complementation and that polynomial functions are, of course, semialgebraic functions. The high flexibility of the concept of semialgebraic sets is captured by the following fundamental theorem, known as the Tarski–Seidenberg principle. Theorem 17.3.4 (Tarski–Seidenberg). Let A be a semialgebraic subset of RN +1 ; then its canonical projection on RN , namely, 5 4 (x1 , . . . , xN ) ∈ RN : ∃z ∈ R, (x1 , . . . , xN , z) ∈ A , is a semialgebraic subset of RN . Let us illustrate the power of this theorem by proving that max functions associated to polynomial functions are semialgebraic. Let S be a nonempty semialgebraic subset of R m and g : RN × R m → R a real polynomial function. Set f (x) = sup{g (x, y) : y ∈ S}. (Note that f can assume infinite values.) Let us prove that f is semialgebraic. Using the definition and the stability with respect to finite intersection, we see that the set {(x, λ, y) ∈ RN × R × S : g (x, y) > λ} = {(x, λ, y) ∈ RN × R × R m : g (x, y) > λ}
' RN × R × S
is semialgebraic. For (x, λ, y) in RN ×R×R m , define the projection Π(x, λ, y) = (x, λ) and use Π to project the above set on RN × R. One obtains the following semialgebraic set: {(x, λ) ∈ RN × R : ∃y ∈ S, g (x, y) > λ}. The complement of this set is {(x, λ) ∈ RN × R : ∀y ∈ S, g (x, y) ≤ λ} = epi f .
i
i i
i
i
i
i
732
book 2014/8/6 page 732 i
Chapter 17. Gradient flows
Hence epi f is semialgebraic. Similarly, hypo f := {(x, μ) ∈ RN × R : f (x) ≥ μ} is semialgebraic. Hence graph f = epi f ∩ hypo f is semialgebraic. Clearly, this result also holds when replacing sup by inf. As a byproduct of these stability results, we recover the following standard result which is useful in optimization when using for example a penalization method. Lemma 17.3.1. Let S be a nonempty semialgebraic subset of R m ; then the function R m # x → dist(x, S)2 is semialgebraic. PROOF. It suffices to consider the polynomial function g (x, y) = x − y2 for x, y in R m and to use the definition of the distance function. Remark 17.3.7. The fact that the composition of semialgebraic mappings gives a semialgebraic mapping or that the image (respectively, the preimage) of a semialgebraic set by a semialgebraic mapping is a semialgebraic set is also a consequence of the Tarski– Seidenberg principle. See [93, 101] for these and many other consequences of this principle. Remark 17.3.8. Numerical analysis provides numerous examples of semialgebraic objects [283]: for example, the cone of the positive semidefinite matrices, Stiefel manifolds (spheres, orthogonal group [205]), and matrices with fixed rank. The following result makes the link between semialgebraic structures and the (KL) inequality. Theorem 17.3.5. Let f : RN → R∪{+∞} be a proper lower semicontinuous function. Then the following implication holds: f semialgebraic ⇒ f satisfies (KL) inequality, with ϕ(s) = c s 1−θ , for the same θ ∈ [0, 1) ∩ Q and c > 0. o-minimal structures and (KL) inequality. o-minimal structures correspond to an axiomatization of the geometrical properties of semialgebraic sets, particularly of the stability under projection (Tarski–Seidenberg). Let us cite the important contributions to this theory of Van den Dries [201], Coste [180], and Shiota [332]. This construction allows us to define new classes of sets and functions (semilinear, semialgebraic, subanalytic), for which the (KL) inequality is still valid. Clearly, this considerably enlarges the range of application of this theory. Let us give its main lines. Definition 17.3.7. Let 3 = {3n }n∈N , where 3n is a collection of subsets of Rn . 3 is an o-minimal structure iff the following hold: (i) Each 3n is a boolean algebra: ∈ 3n , A, B in 3n ⇒ A ∪ B, A ∩ B, Rn \ A ∈ 3n . (ii) For all A in 3n , A × R and R × A belong to 3n+1 . (iii) For all A in 3n+1 , Π(A) := {(x1 , . . . , xn ) ∈ Rn : (x1 , . . . , xn , xn+1 ) ∈ A} ∈ 3n . (iv) For all i = j in {1, . . . , n}, {(x1 , . . . , xn ) ∈ Rn : xi = x j } ∈ 3n .
i
i i
i
i
i
i
17.3. Gradient flow associated to a tame function. Kurdyka–Łojasiewicz theory
book 2014/8/6 page 733 i
733
(v) The set {(x1 , x2 ) ∈ R2 : x1 < x2 } belongs to 32 . (vi) The elements of 31 are exactly finite unions of intervals. Definition 17.3.8. Definable sets and functions are defined as follows: • A is definable in 3 iff A belongs to 3 . • f : RN → R ∪ {+∞} is definable iff its graph is a definable subset of RN × R. The following important result from Bolte et al. [105] makes the link between functions definable in an o-minimal structure and (KL) inequality. Theorem 17.3.6. Let f : RN → R ∪ {+∞} be lower semicontinuous, definable in an o-minimal structure 3 . Then, f has the (KL) property at each point of dom ∂ f . Moreover, the desingularizing function ϕ is definable in 3 . Remark 17.3.9. Let us mention some further examples of functions satisfying (KL): • Uniform convexity: for all x, y ∈ RN , x ∗ ∈ ∂ f (x), f (y) ≥ f (x) + 〈x ∗ , y − x〉 + Ky − x p ,
p ≥ 1,
⇒ f satisfies (KL) with φ(s) = c s 1/ p . • Linearly regular intersection of Fi , transversality, [283]: we have f (x) :=
1 2
dist(x, Fi )2
satisfies (KL).
i
• Metric regularity: F : RN → R m is metrically regular at x¯ ∈ RN if there exists a neighborhood V of x¯ in RN , a neighborhood W of F (¯ x ) in R m , and k > 0, x ∈ V , y ∈ W ⇒ dist(x, F −1 (y)) ≤ k dist(y, F (x)). If F is metrically regular, then [40] f (x) =
1
dist2 (F (x), C )
$ satisfies (KL), C ⊂ R m closed convex, φ(s) = c s .
2 For existence of a smooth convex f : R2 → R which does not satisfy (KL) see Bolte et al. [106] and Daniilidis, Ley, and Sabourau [191]. Asymptotic behavior of gradient flows and (KL) inequality. The study of the gradient flow associated to a general lower semicontinuous function is a broad subject, still in progress. Let us briefly describe the results obtained in [103] by Bolte, Daniilidis, and Lewis. They concern the subgradient dynamical system associated to a nonsmooth subanalytic function and the use of the (KL) inequality in the study of asymptotic behavior. We make the following assumptions: (H1) Φ is either lower semicontinuous convex or lower-C 2 with dom Φ = Rn . (H2) Φ is somewhere finite (dom f = ) and bounded from below. (H3) Φ is a subanalytic function.
i
i i
i
i
i
i
734
book 2014/8/6 page 734 i
Chapter 17. Gradient flows
We recall (see [330, Definition 10.29], for example) that a function f is called lower-C 2 if for every x0 ∈ dom f there exist a neighborhood U of x0 , a compact topological space S, and a jointly continuous function F : U × S → R satisfying f (x) = max s ∈S F (x, s), for all x ∈ U , and such that the (partial) derivatives ∇ x F and ∇ x 2 F exist and are jointly continuous. The main results of [103] are summarized in the following. Theorem 17.3.7. (a) Under the assumptions (H1) and (H2), for every u0 ∈ RN such that ∂ Φ(u0 ) = , there exists a unique trajectory u : [0, +∞[→ RN of u˙(t ) + ∂ Φ(u(t )) # 0, u(0) = u0 . In addition, the function Φ ◦ u is absolutely continuous, and for almost all t > 0 (i)
d (Φ ◦ u)(t ) = 〈 u˙(t ), v〉 dt
(ii) u˙(t ) = mΦ (u(t )),
for all v ∈ ∂ Φ(u(t )),
d (Φ ◦ u)(t ) = −[mΦ (u(t ))]2 , dt
where mΦ (x) = inf {v : v ∈ ∂ Φ(x)} is called the nonsmooth slope of Φ at x (∂ Φ is the limiting subdifferential of Φ). (b) Assume that Φ satisfies (H1)–(H2)–(H3). Then any bounded maximal orbit of the gradient flow associated to Φ has a finite length and converges to some critical point of Φ.
17.4 Sequences of gradient flow problems 17.4.1 Graph-convergence of operators Let us recall the classical notion of set convergence introduced in Remark 12.1.2, namely, the Kuratowski–Painlevé convergence for sequence of sets: let (An )n∈N be a sequence of subsets of a metric space (X , d ) or more generally of a topological space. The lower limit of the sequence (An )n∈N is the subset of X denoted by lim infn→+∞ An and defined by 4 5 lim inf An = x ∈ X : ∃xn → x, xn ∈ An ∀n ∈ N . n→+∞
The upper limit of the sequence (An )n∈N is the subset of X denoted by lim supn→+∞ An and defined by 4 5 lim sup An = x ∈ X : ∃(nk )k∈N , ∃(xk )k∈N ∀k, xk ∈ Ank , xk → x . n→+∞
The sets lim infn→+∞ An and lim supn→+∞ An are clearly two closed subsets of (X , d ) satisfying lim inf An ⊂ lim sup An . n→+∞
n→+∞
The sequence (An )n∈N is said to be convergent if the following equality holds: lim inf An = lim sup An . n→+∞
n→+∞
The common value A is called the limit of (An )n∈N in the Kuratowski–Painlevé sense and denoted by K − limn→+∞ An . Therefore, by definition A := K − limn→+∞ An iff lim sup An ⊂ A ⊂ lim inf An , n→+∞
n→+∞
i
i i
i
i
i
i
17.4. Sequences of gradient flow problems
book 2014/8/6 page 735 i
735
so that x ∈ A = K − limn→+∞ An iff the two following assertions hold: ∀x ∈ A, ∃(xn )n∈N
such that ∀n ∈ N, xn ∈ An and xn → x;
∀(nk )k∈N , ∀(xk )k∈N
such that ∀k ∈ N, xk ∈ Ank , xk → x =⇒ x ∈ A.
In general such a convergence is not topological. One can show that there exists a topology which governs the Kuratowski–Painlevé convergence iff the space (X , d ) is locally compact. For a complete study and comparison of various types of convergence and their associated topologies, namely, Vietoris, Fell, Wijsmann, Attouch–Wets, and Moscoconvergence, see [87, 88]. From now on (V , .) is a Banach space and V ∗ is its topological dual space whose dual norm is denoted by .∗ and we recall that for (u, u ∗ ) ∈ V × V ∗ we write 〈u ∗ , u〉 for ∗ u ∗ (u). Given a multivalued operator A : V → 2V , for any v ∈ V we write Av instead of A(v). Let us recall some basic definitions which were addressed in Section 17.2.2 in the Hilbertian setting: dom A = {v ∈ X : Av = } denotes the domain of A; G(A) := {(v, v ∗ ) ∈ V × V ∗ : v ∗ ∈ Av} denotes the graph of A; R(A) := {v ∗ ∈ V ∗ : ∃v ∈ V s.t. v ∗ ∈ Av} denotes the range of A. We define the inverse operator A−1 : V ∗ → V of A by A−1 (v ∗ ) = {v ∈ V : v ∗ ∈ Av} . ∗
Definition 17.4.1. An operator A : V → 2V is said to be monotone if 〈u ∗ − v ∗ , u − v〉 ≥ 0 whenever (u, u ∗ ) ∈ G(A) and (v, v ∗ ) ∈ G(A). It is maximal monotone if it is monotone and if its graph is maximal among all the monotone operators mapping V to V ∗ when V × V ∗ is ordered by inclusion. An element (u, u ∗ ) of V × V ∗ is said to be monotonically related to a monotone operator A provided 〈u ∗ − v ∗ , u − v〉 ≥ 0
∀(v, v ∗ ) ∈ G(A).
A useful form of the definition of maximality for a monotone operator A is the following condition, whose proof follows straightforwardly from the foregoing definition (see also Definition 17.2.1). ∗
Proposition 17.4.1. Let A : V → 2V be a monotone operator. Then A is maximal monotone iff whenever (u, u ∗ ) is monotonically related to A, then u ∈ dom A and u ∗ ∈ Au. Given a sequence of operators, one can consider the lim inf and lim sup of the sequence of their graphs as subsets of V × V ∗ . This leads to the following definition. Definition 17.4.2. A sequence (An )n∈N of operators mapping V to V ∗ is said to be graph ∗ convergent to an operator A : V → 2V if the sequence (G(An ))n∈N converges to the graph G(A) of A in the sense of Kuratowski and Painlevé when V ×V ∗ is endowed with the product norm. From now on we systematically identify the operators with their graphs so that we G
write A instead of G(A) and A = G−lim An or An → A instead of G(A) = K−limn→+∞ G(An ). When considering sequences of maximal monotone operators, the definition of the graph convergence is reduced to the following.
i
i i
i
i
i
i
736
book 2014/8/6 page 736 i
Chapter 17. Gradient flows
Proposition 17.4.2. Let (An , A)n∈N be a sequence of maximal monotone operators mapping V to V ∗ . Then we have A = G − lim An ⇐⇒ A ⊂ lim inf An . n→+∞
n→+∞
(17.172)
PROOF. The only implication we have to establish is A ⊂ lim inf An =⇒ A = G − lim An , n→+∞
n→+∞
the converse being trivial. Thus, it remains to show that lim supn→+∞ An ⊂ A is automatically satisfied. Let (u, u ∗ ) ∈ lim supn→+∞ An ; then there exists a subsequence (nk )k∈N of integers and (uk , uk∗ ) ∈ Ank such that (uk , uk∗ ) → (u, u ∗ ) in V × V ∗ whenever k → +∞. On the other hand, since A ⊂ lim infn→+∞ An , for all (v, v ∗ ) ∈ A, there exists (vn , vn∗ ) ∈ An such that (vn , vn∗ ) → (v, v ∗ ) in V × V ∗ . Going to the limit on 〈uk∗ − vn∗ , uk − vnk 〉 ≥ 0 k
(recall that Ank is monotone), we infer 〈u ∗ − v ∗ , u − v〉 ≥ 0
∀(v, v ∗ ) ∈ A.
Therefore (u, u ∗ ) is monotonically related to A and, according to Proposition 17.4.1, (u, u ∗ ) ∈ A, which completes the proof. For various examples of maximal monotone operators, see [321]. In the subsection below we consider the most basic class of maximal monotone operators, namely, the class of subdifferentials of convex functions.
17.4.2 Mosco-convergence of convex potentials and graph-convergence of their subdifferential operators (Attouch theorem) Given a convex proper function Φ : V → R ∪ {+∞}, let us recall (see Sections 9.5 and ∗ Section 17.2.2 in the Hilbertian setting) that the subdifferential mapping ∂ Φ : V → 2V is defined for all u in dom Φ by ∂ Φ(u) := {u ∗ ∈ V ∗ : Φ(v) ≥ Φ(u) + 〈u ∗ , v − u〉 ∀v ∈ V }, while ∂ Φ(u) = if u ∈ V \ dom Φ. It may also be empty at points of dom Φ as shown in [321, Example 2.7] or [320, Example 3.8]. It is readily seen that ∂ Φ is monotone but it is not obvious that ∂ Φ is maximal (see Proposition 17.2.3 in the Hilbertian setting and Theorem 17.4.1 below). It is not even obvious that ∂ Φ is not trivial, i.e., that dom ∂ Φ = as stated in Proposition 9.5.2 under a continuity condition. The lemma below is due to Brønsted and Rockafellar [141] and strengthens Proposition 9.5.2 by showing that if Φ is a convex, proper, lower semicontinuous function, then the domain of ∂ Φ is dense in the domain of Φ (see also Remark 17.2.2). This lemma is also a key argument in the proof of Rockafellar and Attouch’s theorems (Theorems 17.4.1 and 17.4.4 below). Before stating this important lemma we have to define the -differential, a useful notion in many parts of convex analysis. Definition 17.4.3. Let Φ : V → R ∪ {+∞} be a convex, proper, lower semicontinuous function. For any > 0, the -subdifferential mapping ∂ Φ : V → V ∗ of Φ is defined for all
i
i i
i
i
i
i
17.4. Sequences of gradient flow problems
book 2014/8/6 page 737 i
737
u in dom Φ by ∂ Φ(u) := {u ∗ ∈ V : Φ(v) ≥ Φ(u) + 〈u ∗ , v − u〉 − ∀v ∈ V } , while ∂ Φ(u) = if u ∈ V \ dom Φ. Remark 17.4.1. It follows from the definition that ∂ Φ(u) is a weak* closed convex set. Moreover, using the convexity of epiΦ and the Hahn–Banach separation theorem (Theorem 9.1.1) in V × R, one can prove that ∂ Φ(u) is nonempty for every u ∈ dom Φ (see [320, Proposition 3.14]). The following equivalence follows straightforwardly from the definition 0 ∈ ∂ Φ(u) ⇐⇒ Φ(u) ≤ Φ(v) +
∀v ∈ V ,
i.e., u is a -minimizer of Φ. Moreover, from the definition of the Legendre–Fenchel conjugate of Φ and by reproducing the proof of Proposition 9.5.1, it readily follows that u ∗ ∈ ∂ Φ(u) ⇐⇒ 0 ≤ Φ∗ (u ∗ ) + Φ(u) − 〈u ∗ , u〉 ≤ .
(17.173)
Lemma 17.4.1 (Brønsted and Rockafellar). Let Φ : V → R ∪ {+∞} be a convex, proper, lower semicontinuous function. Then given any u0 ∈ dom Φ, > 0, λ > 0, and any u0∗ ∈ ∂ Φ(u0 ), there exists u ∈ dom ∂ Φ and u ∗ ∈ V ∗ such that u ∗ ∈ ∂ Φ(u), u − u0 ≤ λ and u ∗ − u0∗ ∗ ≤ . λ $ In particular (take λ = ), the domain of ∂ Φ is dense in dom Φ. PROOF. The proof consists in applying the Ekeland variational principle to a suitable function. Consider the convex, proper, lower semicontinuous function Ψ : V → R ∪ {+∞} defined by Ψ(v) = Φ(v) − 〈u0∗ , v〉. According to Remark 17.4.1, u0 is a -minimizer of Ψ. From Theorem 3.4.5, it follows that there exists u ∈ dom Φ such that u − u0 ≤ λ;
Ψ(u) < Ψ(v) + v − u λ
∀v = u.
The second statement says that u is a minimizer of v → Ψ(v) + λ v − u so that, using the optimality condition Proposition 9.5.3, we have " ! 0 ∈ ∂ Ψ + . − u (u). λ Recalling the definition of Ψ, and from the classical rule about additivity of subdifferentials (cf. Theorem 9.5.4), we infer u0∗ ∈ ∂ Φ(u) + ∂ . − u(u). λ Equivalently, there exists u ∗ ∈ ∂ Φ(u) such that u0∗ − u ∗ ∈ λ ∂ . − u(u). The subdifferential inequality then yields v − u ≥ 〈u0∗ − u ∗ , v − u〉 λ for all v ∈ V , thus u0∗ − u ∗ ∗ ≤ λ . This completes the proof.
i
i i
i
i
i
i
738
book 2014/8/6 page 738 i
Chapter 17. Gradient flows
In the case when (V , .) is a reflexive Banach space, by using the Moreau–Yosida regularization of Φ, the second assertion of Lemma 17.4.1, i.e., the density of dom ∂ Φ in dom Φ, may be specified as stated below in Proposition 17.4.3. (See also Theorem 9.5.3 when V is not assumed to be reflexive, or Section 17.2.1 when V is a Hilbert space.) The formulation of the statement resumes some definitions and results of Section 17.2.1 and needs some preparations. Lemma 17.4.2 (property of the duality map). Denote by H the subdifferential of the ∗ convex continuous function v → 12 v2 , also called the duality map from V into 2V . Then H is characterized by u ∗ ∈ H (u) ⇐⇒ u ∗ ∗ = u and 〈u ∗ , u〉 = u2 .
(17.174)
If the norms . and .∗ are strictly convex, then H is one to one and sequentially continuous from V onto V ∗ , when V and V ∗ are equipped with their strong convergence and weak convergence, respectively. Furthermore, if the dual space (V ∗ , .∗ ) satisfies the property “the weak convergence and the convergence of the norms imply the strong convergence,” then H is strongly continuous from V onto V ∗ . PROOF. From Proposition 9.5.1 v ∗ → 12 v ∗ 2∗ is the Legendre–Fenchel conjugate of v → 1 v2 , so that according to Corollary 9.3.1 2 u ∗ ∈ H (u) ⇐⇒
1
1 u ∗ 2∗ + u2 = 〈u ∗ , u〉. 2 2
(17.175)
Hence, from 〈u ∗ , u〉 ≤ u ∗ ∗ u, (17.175) yields u ∗ 2∗ + u2 ≤ 2u ∗ ∗ u, from which we deduce u ∗ ∗ = u. Finally (17.175) gives (17.174). According to Theorem 9.5.1, H −1 is the subdifferential of v ∗ → 12 v ∗ 2∗ , and thus u ∈ H −1 (u ∗ ) ⇐⇒ u ∗ ∗ = u and 〈u ∗ , u〉 = u ∗ 2∗ . If the norms . and .∗ are strictly convex, from (17.174) we deduce that H is a one-to-one mapping from V onto V ∗ . Indeed, assume that there exist u1∗ and u2∗ , two elements of H (u) with u1∗ = u2∗ , and take λ ∈ ]0, 1[. Since H (u) is a convex subset of V ∗ , λu1∗ + (1 − λ)u2∗ ∈ H (u) and from (17.174) and the strict convexity of .∗ we infer u = λu1∗ + (1 − λ)u2∗ ∗ < λu1∗ ∗ + (1 − λ)u2∗ ∗ = u, a contradiction. The same argument, using the strict convexity of ., shows that H −1 is univalent. We are going to establish the continuity of H . Let (un )n∈N be a sequence converging strongly to u in V and set un∗ := H (un ). Then from (17.174) un∗ ∗ = un ,
(17.176)
〈un∗ , un 〉 = un 2 .
(17.177) supn∈N un∗ ∗
From (17.176) and the fact that un → u in V we infer < +∞, so that there exists a subsequence of (un∗ )n∈N (not relabeled) weakly converging to some v ∗ in V ∗ .
i
i i
i
i
i
i
17.4. Sequences of gradient flow problems
book 2014/8/6 page 739 i
739
We claim that v ∗ = H (u). Indeed, going to the limit in (17.177) and (17.176) and from the lower semicontinuity of the norm we obtain 〈v ∗ , u〉 = u2
(17.178)
and v ∗ ∗ ≤ u. But (17.178) yields v ∗ ≥ u so that v ∗ ∗ = u. Thus from the characterization (17.174) of H (u), we conclude that v ∗ = H (u), and that the whole sequence (H (un ))n∈N converges weakly to H (u). For establishing the strong continuity under the additional assumption on (V ∗ , .∗ ), it is enough to prove H (un )∗ → H (u)∗ which is a straightforward consequence of (17.174), H (un )∗ = un → u = H (u)∗ , which completes the proof. The next proposition resumes some results and definitions established in Propositions 17.2.1 and 17.2.2 previously stated in the Hilbertian setting. Proposition 17.4.3. Let (V , .) be a reflexive Banach space, and assume that the norms . and .∗ are strictly convex. Let Φ : V → R ∪ {+∞} be a convex, proper, lower semicontinuous function, and for very λ > 0 consider its Moreau–Yosida regularization, i.e., the function Φλ defined for all u ∈ V by 1 Φλ (u) = inf Φ(v) + v − u2 , v∈V 2λ and denote by Jλ u the unique point where the infimum is achieved. Then, for every u ∈ dom Φ 1
H (u − Jλ u) ∈ ∂ Φ(Jλ u), in particular Jλ u ∈ dom ∂ Φ; λ Jλ u → u strongly in V when λ goes to zero; Φ(Jλ u) → Φ(u) when λ goes to zero.
(17.179) (17.180) (17.181)
1 PROOF. As noted in Proposition 17.2.1, the function v → Φ(v) + 2λ v − u2 is convex, proper, lower semicontinuous, and coercive. Thus, from Theorem 3.3.4 it reaches its minimum at a unique point denoted by Jλ u. (Uniqueness follows from the strict convexity of the norm ..) According to the optimality condition Proposition 9.5.3, we have
" ! 1 0 ∈ ∂ Φ + . − u2 (Jλ u). 2λ Therefore, from the classical rule about additivity of subdifferentials (cf. Theorem 9.5.4) and since from Lemma 17.4.2 H is a one to one mapping from V onto V ∗ , we infer that Jλ u satisfies the extremality condition 1 λ
H (u − Jλ u) ∈ ∂ Φ(Jλ u);
in particular we obtain (17.179). On the other hand, from Theorem 9.3.1, Φ admits an affine continuous minorant, so that for all v ∈ V , Φ(v) ≥ −α(v + 1) for some constant α ≥ 0. Hence, given u ∈ dom Φ
i
i i
i
i
i
i
740
book 2014/8/6 page 740 i
Chapter 17. Gradient flows
we infer Φ(u) = Φ(Jλ u) +
1 2λ
Jλ u − u2
1 ≥ −αJλ u − u + Jλ u − u2 − α(u + 1) 2λ α2 1 1 ≥ − Jλ u − u2 − − α(u + 1), 2λ 2 2 so that, for λ small enough, Jλ u − u2 ≤
2λ ! 1−λ
Φ(u) +
α2 2
" + α(u + 1) ,
from which we derive (17.180). By using the inequality Φ(u) ≥ Φ(Jλ u), the lower semicontinuity of Φ, and (17.180), we obtain Φ(u) ≥ lim sup Φ(Jλ u) ≥ lim inf Φ(Jλ u) ≥ Φ(u), λ→0
λ→0
which ensures (17.181). Definition 17.4.4. Under the assumptions and notation of Proposition 17.4.3, the operator Jλ : V → V is called the resolvent of index λ of ∂ Φ. We sometimes write JλΦ to highlight the dependence on Φ. Theorem 17.4.1 (Rockafellar). Let (V , .) be a general Banach space and Φ : V → R ∪ {+∞} a convex proper lower semicontinuous function. Then, its subdifferential ∂ Φ : V → ∗ 2V is a maximal monotone operator. PROOF. We follow the ideas of Simon’s proof [334]. It relies on Lemma 17.4.3 below, whose proof is based on the Brønsted–Rockafellar lemma. Lemma 17.4.3. Let Φ : V → R ∪ {+∞} be a convex proper lower semicontinuous function and suppose that u0 ∈ V (not necessarily in dom Φ) satisfies infv∈V Φ(v) < Φ(u0 ). Then there exists u ∈ dom ∂ Φ and u ∗ ∈ ∂ Φ(u) such that Φ(u) < Φ(u0 )
and 〈u ∗ , u0 − u〉 > 0.
PROOF OF LEMMA 17.4.3. By using the subdifferential inequality, the existence of u ∈ dom ∂ Φ and u ∗ ∈ ∂ Φ(u) satisfying 〈u ∗ , u0 − u〉 > 0 readily yields Φ(u) < Φ(u0 ). Fix r ∈ R satisfying infV Φ < r < Φ(u0 ) and set M :=
sup v∈V , v = u0
r − Φ(v) u0 − v
.
First step. We prove that 0 < M < +∞. To see that M > 0, pick any v ∈ V such that r −Φ(v) Φ(v) < r . We have v = u0 and M ≥ u −v > 0. We are going to prove that M < +∞. For 0
r −Φ(v)
all v satisfying Φ(v) > r we have u −v < 0. Consider the closed set E := {v ∈ V : Φ(v) ≤ 0 r }. Since u0 ∈ E, we have dist(u0 , E) > 0. On the other hand, from Theorem 9.3.1,
i
i i
i
i
i
i
17.4. Sequences of gradient flow problems
book 2014/8/6 page 741 i
741
the function Φ admits an affine continuous minorant w ∗ + β, where w ∗ ∈ V ∗ and β ∈ R. Therefore for all v ∈ E r − Φ(v) ≤ r − 〈w ∗ , v〉 − β " ! = r − 〈w ∗ , u0 〉 − β + 〈w ∗ , u0 − v〉 ≤ |r − 〈w ∗ , u0 〉 − β| + w ∗ ∗ u0 − v. Hence r − Φ(v) u0 − v
≤ ≤
|r − 〈w ∗ , u0 〉 − β| u0 − v |r − 〈w ∗ , u0 〉 − β| dist(u0 , E)
In either case, there is an upper bound for
r −Φ(v) , u0 −v
+ w ∗ ∗ + w ∗ ∗ .
so that M < +∞.
Second step. End of the proof. Consider the function g : V → R ∪ {+∞} defined by g (v) = Φ(v) + M u0 − v. By definition of M , for all > 0 (chosen such that 0 < < M ), there exists v ∈ V , v = u0 , such that 0 r , and for v = u0 Φ(v) + M u0 − v ≥ Φ(v) +
r − Φ(v) u0 − v
u0 − v = r.
From (17.182) we deduce that g (v ) ≤ inf g + u0 − v , V
so that (see Remark 17.4.1) 0 ∈ ∂u0 −v g (v ). We are in a position to apply the Brønsted–Rockafellar lemma, Lemma 17.4.1: choosing λ satisfying < λ < M , there exist u ∈ dom g = dom Φ, w ∗ ∈ ∂ g (u) such that u − v ≤
u0 − v λ
;
w ∗ ∗ ≤ λ.
(17.183) (17.184)
From (17.183) we infer that u − u0 > 0. Indeed u − u0 ≥ u0 − v − u − v ! " ≥ u0 − v 1 − >0 λ
i
i i
i
i
i
i
742
book 2014/8/6 page 742 i
Chapter 17. Gradient flows
by the choice of λ. On the other hand, from the classical rule about additivity of subdifferentials (cf. Theorem 9.5.4), ∂ g (u) = ∂ Φ(u) + M ∂ . − u0 (u). Consequently there exists u ∗ ∈ ∂ Φ(u) and z ∗ ∈ M ∂ . − u0 (u) such that w ∗ = u ∗ + z ∗ . The subdifferential inequality M v − u0 ≥ M u − u0 + 〈z ∗ , v − u〉 related to z ∗ ∈ M ∂ . − u0 (u) applied for v = u0 gives 〈z ∗ , u − u0 〉 ≥ M u − u0 . Therefore, from (17.184), by the choice of λ, and since u0 − u > 0, 〈u ∗ , u0 − u〉 = 〈w ∗ − z ∗ , u0 − u〉
= 〈w ∗ , u0 − u〉 + 〈z ∗ , u − u0 〉 ≥ − w ∗ ∗ + M u − u0 ≥ (−λ + M )u − u0 > 0,
which completes the proof of Lemma 17.4.3. PROOF OF THEOREM 17.4.1 CONTINUED. We argue by contraposition, taking into account Proposition 17.4.1. Suppose u ∈ V , u ∗ ∈ V ∗ , and u ∗ ∈ ∂ Φ(u). Thus 0 ∈ ∂ (Φ − u ∗ )(u). Indeed 0 ∈ ∂ (Φ − u ∗ )(u) ⇐⇒ (Φ − u ∗ )(v) ≥ (Φ − u ∗ )(u) ∀v ∈ V ⇐⇒ Φ(v) − Φ(u) ≥ 〈u ∗ , v − u〉 ∀v ∈ V ⇐⇒ u ∗ ∈ ∂ Φ(u). Thus u satisfies infv∈V (Φ − u ∗ ) < (Φ − u ∗ )(u). By Lemma 17.4.3, there exist z ∈ dom(Φ − u ∗ ) = dom Φ and z ∗ ∈ ∂ (Φ − u ∗ )(z) such that 〈z ∗ , z − u〉 < 0. But it is easily seen that z ∗ ∈ ∂ (Φ−u ∗ )(z) is equivalent to z ∗ +u ∗ ∈ ∂ Φ(z). Consequently there exists w ∗ ∈ ∂ Φ(z) such that z ∗ = w ∗ − u ∗ , so that 〈w ∗ − u ∗ , z − u〉 < 0. In order to establish the characterization by Rockafellar of subdifferentials among the maximal monotone operators we introduce the notion of cyclic monotonicity. We call the finite chain of V any finite family u0 , . . . , u l −1 , u l of elements in V with l ≥ 1, and we call the closed chain any finite family u0 , . . . , u l −1 , u l = u0 of elements in V with l ≥ 1. We write u0 u l and u0 u l = u0 , respectively, for such chains. ∗
Definition 17.4.5. An operator A : V → 2V is said to be cyclically monotone if l −1
〈uk∗ , uk − uk+1 〉 ≥ 0
k=0
for every closed chain u0 u l = u0 in dom A and every uk∗ ∈ Auk . Note that taking l = 2, i.e., a closed chain u0 u2 = u0 , we obtain 〈u0∗ − u1∗ , u0 − u1 〉 ≥ 0 for every (u0 , u0∗ ) and (u1 , u1∗ ) in A, which is the definition of a monotone operator. ∗ Therefore this notion generalizes the notion of monotonicity for operators A : V → 2V .
i
i i
i
i
i
i
17.4. Sequences of gradient flow problems
book 2014/8/6 page 743 i
743
Proposition 17.4.4. Let Φ : V → R ∪ {+∞} be a convex, proper, lower semicontinuous function. Then for all finite chain u0 u l = u with uk ∈ dom(∂ Φ) and (uk , uk∗ ) ∈ ∂ Φ for k = 0, . . . , l − 1, we have Φ(u) ≥ Φ(u0 ) +
l −1
〈uk∗ , uk+1 − uk 〉.
(17.185)
k=0
Moreover its subdifferential operator ∂ Φ is cyclically monotone. PROOF. Take a finite chain u0 u l = u with uk ∈ dom ∂ Φ and uk∗ with (uk , uk∗ ) ∈ ∂ Φ for k = 0, . . . , l − 1. According to the definition of ∂ Φ, for k = 0, . . . , l − 1 we have Φ(uk+1 ) ≥ Φ(uk ) + 〈uk∗ , uk+1 − uk 〉. Summing these l inequalities and using the fact that u l = u we obtain Φ(u) ≥ Φ(u0 ) +
l −1
〈uk∗ , uk+1 − uk 〉.
k=0
Taking now u = u0 , i.e., a closed chain u0 u l = u0 , the above inequality yields l −1
〈uk∗ , uk+1 − uk 〉 ≤ 0
k=0
so that ∂ Φ is cyclically monotone. From now on, unless otherwise specified, the Banach space (V , .) is assumed to be reflexive. The following theorem stated without proof is due to Rockafellar [329]. It expresses that among the maximal monotone operators, the subdifferentials are the only operators which are cyclically monotone. ∗
Theorem 17.4.2 (Rockafellar). Let A : V → 2V be a maximal monotone operator with dom A = . Then A is the subdifferential of a convex, proper, lower semicontinuous function Φ : V → R ∪ {+∞} iff A is cyclically monotone. In that case the following “integration” formula holds: A = ∂ Φ with for every u ∈ V Φ(u) = sup C0 +
l −1
〈uk∗ , uk+1 − uk 〉 : u0 u l = u
k=0
∗
with uk ∈ dom A for k = 0, . . . , l − 1, l ∈ N
,
(17.186)
where the primitive Φ is defined up to an arbitrary constant C0 (with C0 = Φ(u0 )). Corollary 17.4.1. The class of subdifferentials is closed in the class of maximal monotone operators. In other words, if A = G − lim ∂ Φn , where A is a maximal monotone operator, then there exists a convex proper lower semicontinuous function Φ such that A = ∂ Φ. PROOF. From Theorem 17.4.2, it suffices to establish that A is cyclically monotone. Let u0 u l = u0 be a closed chain in dom A and consider uk∗ , k = 0, . . . , l , with uk∗ ∈ Auk .
i
i i
i
i
i
i
744
book 2014/8/6 page 744 i
Chapter 17. Gradient flows
According to the fact that A = G − lim ∂ Φn , for each k = 0, . . . , l there exists (ukn , uk∗n ) ∈ ∂ Φn satisfying (ukn , uk∗n ) → (uk , uk∗ ) in V × V ∗ . Since for all n ∈ N, ∂ Φn is cyclically monotone we have l −1 n 〈uk∗n , ukn − uk+1 〉 ≥ 0. k=0
Letting n → +∞ we infer
l −1
〈uk∗ , uk − uk+1 〉 ≥ 0,
k=0
which proves the thesis. A natural question now arises. What type of variational convergence should equip the class of convex functions so that the mapping Φ → ∂ Φ is continuous when the class of maximal monotone operators is equipped with the graph convergence? The appropriate notion is the convergence in the sense of Mosco defined below. The Banach space (V , .) being endowed with two convergences, we have two notions of Γ -convergence. Given a sequence (Φn )n∈N of functionals Φn : V → R ∪ {+∞}, according to Definition 12.1.1, we denote by Γw − lim Φn and Γ s − lim Φn the Γ -limits associated with the weak and the strong convergence in V , respectively, when they exist. Definition 17.4.6 (Mosco-convergence). Let (V , .) be a Banach space, a sequence (Φn )n∈N of extended real-valued functions Φn : V −→ R ∪ {+∞}, and Φ : V −→ R ∪ {+∞}. The M
sequence (Φn )n∈N Mosco-converges to Φ and we write Φn → Φ if Φ = Γ w − Φn = Γ s − Φn . The argument which naturally led us to introduce the Mosco-convergence notion yields the bicontinuity of the Fenchel duality transformation in the context of convex functions. We state this more precisely in the following theorem. Theorem 17.4.3. Let (V , .) be a reflexive Banach space, and (Φn )n∈N , Φ a sequence of convex, proper, lower semicontinuous functions from V into R∪ {+∞}. The following statements are equivalent: M
(i) Φn → Φ on V ; M
(ii) Φ∗n → Φ∗ on V ∗ . PROOF. Since the functions Φn and Φ are convex and lower semicontinuous, (Φ∗n )∗ = Φn and (Φ∗ )∗ = Φ, so that it suffices to establish (i) =⇒ (ii). From hypothesis (i) and according to the notation of Section 12.1 we have Γ s − lim sup Φn ≤ Φ ≤ Γw − lim inf Φn . n→+∞
n→+∞
By Fenchel conjugation, these inequality are reversed, i.e., ∗ ∗ Γw − lim inf Φn ≤ Φ∗ ≤ Γ s − lim sup Φn . n→+∞
(17.187)
n→+∞
Let us recall that a sequence (Φn )n∈N is said to be uniformly proper if there exists a M
bounded sequence (u0n )n∈N in V such that supn∈N Φn (u0n ) < +∞. Since Φn → Φ, the
i
i i
i
i
i
i
17.4. Sequences of gradient flow problems
book 2014/8/6 page 745 i
745
sequence (Φn )n∈N is automatically uniformly proper. Indeed, fix u0 ∈ dom Φ. From the fact that Φ = Γ s − Φn , there exists u0n strongly converging to u0 in V such that Φ(u0 ) = limn→+∞ Φn (u0n ), so that (Φn (u0n ))n∈N is bounded. Note that by using the same arguments, one may prove that (Φ∗n )n∈N is uniformly proper too. To complete the proof, the following result, stated without proof, is essential. (For a proof consult [37, Theorem 3.7].) Lemma 17.4.4. Let (Φn )n∈N , Φn : V → R ∪ {+∞} be a sequence of convex, lower semicontinuous, and uniformly proper functions. Then ∗ Γw − lim inf Φn = Γ s − lim sup Φ∗n . n→+∞
n→+∞
PROOF OF THEOREM 17.4.3 CONTINUED. From Lemma 17.4.4 and noticing that Φn = and that for any function g the inequality g ∗∗ ≤ g holds, (17.187) yields Φ∗∗ n ∗ Γ s − lim sup Φ∗n ≤ Φ∗ ≤ Γ s − lim sup Φn n→+∞ ∗ = Γ s − lim sup Φ∗∗ n n→+∞ ∗∗ = Γw − lim inf Φ∗n ≤ Γw − lim inf Φ∗n n→+∞
n→+∞
The following proposition, whose proof is straightforward, states an equivalent formulation that is interesting from a practical point of view. Proposition 17.4.5. Let (V , .) be a reflexive Banach space and (Φn )n∈N , Φ a sequence of convex, proper, lower semicontinuous functions from V into R ∪ {+∞}. The following statements are equivalent: M
(i) Φn → Φ; (ii) for all v ∈ V , ∃vn → v such that Φn (vn ) → Φ(v); for all v ∈ V , ∀vn v, Φ(v) ≤ lim infn→+∞ Φn (vn ); (iii) for all v ∈ V , ∃vn → v such that Φn (vn ) → Φ(v), for all v ∗ ∈ V ∗ , ∃vn∗ → v ∗ such that Φ∗n (vn ) → Φ∗ (v). We state now the main result of this section. Theorem 17.4.4 (Attouch). Let (V , .) be a reflexive Banach space and (Φn )n∈N , Φ a sequence of convex, proper, lower semicontinuous functions from V into R ∪ {+∞}. The following statements are equivalent: M
(i) Φn → Φ; (ii) ∂ Φ = G − limn→+∞ ∂ Φn and the following normalization condition (N C ) holds: ∃( u¯, u¯∗ ) ∈ ∂ Φ, ∃(u n , u n∗ ) ∈ ∂ Φn s.t. u n → u¯ in V , u n∗ → u¯∗ in V ∗ , Φn (u n ) → Φ( u¯). PROOF. Proof of (i) =⇒ (ii). The proof of this implication relies on the Brønsted– Rockafellar lemma, Lemma 17.4.1. Let (u, u ∗ ) ∈ ∂ Φ. From Proposition 17.4.5 there exist
i
i i
i
i
i
i
746
book 2014/8/6 page 746 i
Chapter 17. Gradient flows
a sequence (un )n∈N in V and a sequence (un∗ )n∈N in V ∗ such that un → u in V , un∗ → u ∗ in V ∗ , and lim Φ (u ) = Φ(u); n→+∞ n n (17.188) lim Φ∗n (un∗ ) = Φ∗ (u ∗ ). n→+∞
Set n := Φn (un )+Φ∗n (un∗ )−〈un∗ , un 〉; then from (17.173) of Remark 17.4.1 we have n ≥ 0 and we infer that un∗ ∈ ∂n Φn (un ). On the other hand, from (17.188), and since (u, u ∗ ) ∈ ∂ Φ, n goes to Φ(u) + Φ∗ (u ∗ ) − 〈u ∗ , u〉 = 0 when n goes to +∞. Hence, according to Lemma 17.4.1, there exists (vn , vn∗ ) ∈ ∂ Φn such that $ vn − un ≤ $n ; vn∗ − un∗ ∗ ≤ n . This proves that vn → u in V and vn∗ → u ∗ in V ∗ . This being true for any (u, u ∗ ) ∈ ∂ Φ, from Proposition 17.4.2, and since from Theorem 17.4.1 ∂ Φn and ∂ Φ are maximal monotone operators, we conclude that ∂ Φ = G − limn→+∞ ∂ Φn . Given any (u, u ∗ ) ∈ ∂ Φ, we claim that the normalization condition (NC) is automatically satisfied by the sequences (vn )n∈N and (vn∗ )n∈N previously considered. To prove this, we only have to establish that Φn (vn ) → Φ(u). Since vn∗ ∈ ∂ Φn (vn ), we have Φn (un ) ≥ Φn (vn ) + 〈vn∗ , un − vn 〉; thus, letting n → +∞, we infer Φ(u) ≥ lim supn→+∞ Φn (vn ). On the other hand, since un∗ ∈ ∂n Φn (un ), we have Φn (vn ) ≥ Φn (un ) + 〈un∗ , vn − un 〉 − n , from which we infer lim infn→+∞ Φn (vn ) ≥ Φ(u), which proves the thesis. Proof of (ii) =⇒ (i). The proof relies on integration formula (17.186) of Theorem 17.4.2. First step. For every v ∈ V and every sequence (vn )n∈N satisfying vn v in V we establish Φ(v) ≤ lim infn→+∞ Φn (vn ). Let us connect u given by the normalization condition (NC) and v by any finite chain u v : u0 := u, . . . , u l −1 , u l := v, where uk in dom(∂ Φ) for k = 0, . . . , l − 1, and consider in V ∗ a chain u0∗ := u ∗ , . . . , u l∗−1 satisfying (uk , uk∗ ) ∈ ∂ Φ for k = 0, . . . , l − 1. Since ∂ Φ = G − limn→+∞ ∂ Φn , for each k = 0, . . . , l − 1 there exists (ukn , ukn∗ ) ∈ ∂ Φn such that ukn → uk in V , ukn∗ → uk∗ in V ∗
(17.189)
(for k = 0 such a condition is given by (NC)). Let us connect u n and vn by the following chain, where (ukn )k=1,...,l −1 is defined above: u n vn : u0n := u n , . . . , u ln−1 , vn .
i
i i
i
i
i
i
17.4. Sequences of gradient flow problems
book 2014/8/6 page 747 i
747
According to (17.185) we have Φn (vn ) ≥ Φn (u n ) +
l −1
n 〈ukn∗ , uk+1 − ukn 〉,
(17.190)
k=0 n − ukn → uk+1 − uk for k = 0, . . . , l − 2, and for k = l − 1, u ln − where from (17.189), uk+1 u ln−1 = vn − u ln−1 v − u l −1 . Going to the limit in (17.190) and using the normalization condition (NC) we obtain
lim inf Φn (vn ) ≥ Φ(u) +
l −1
n→+∞
〈uk∗ , uk+1 − uk 〉
k=0
for every finite chain u v with uk ∈ dom ∂ Φ for k = 0, . . . , l −1. Taking the supremum with respect to all such finite chains u v, from integration formula (17.186) of Theorem 17.4.2, we infer lim inf Φn (vn ) n→+∞ ≥ sup Φ(u) +
l −1
〈uk∗ , uk+1
∗
− uk 〉 : u v, uk ∈ dom ∂ Φ for k = 0, . . . , l − 1, l ∈ N
k=0
= Φ(v), which proves the thesis. Second step. For every v ∈ V we prove the existence of a sequence (vn )n∈N strongly converging to v in V and satisfying Φ(v) = limn→+∞ Φn (vn ). If v ∈ dom Φ there is nothing to prove since for every sequence (vn )n∈N , inequality Φ(v) ≥ lim supn→+∞ Φn (vn ) is automatically satisfied. On the other hand, according to Proposition 17.4.3, choosing a sequence (λk )k∈N of positive numbers going to zero, for every v ∈ dom Φ the sequence (Jλk v)k∈N belongs to dom ∂ Φ, strongly converges to v in V , and satisfies Φ(Jλk v) → Φ(v). Consequently, for proving the statement, by using the diagonalization lemma, Lemma 11.1.1, in V × R, it is enough to establish that for every v ∈ dom ∂ Φ, there exists a sequence (vn )n∈N satisfying vn → v and Φ(v) = limn→+∞ Φn (vn ). Given v ∈ dom ∂ Φ, let us fix w ∗ ∈ ∂ Φ(v). By assumption (ii) there exists an approximating sequence (vn , w n∗ ) ∈ ∂ Φn such that vn → v in V and w n∗ → w ∗ in V ∗ . We claim that the sequence (vn )n∈N satisfies the thesis. For u, u n , and u n∗ given by the normalization condition (NC), consider any finite chain v u : u0 := v, . . . , u l −1 , u l := u, the approximating chain in V vn u n : u0n := vn , . . . , u ln−1 , u ln := u n , and the chain in V ∗
u0n := w n∗ , . . . , u ln∗ , u n∗ := u n∗ −1 l ∗
satisfying (ukn , ukn∗ ) ∈ ∂ Φn and ukn → uk in V , ukn → uk∗ in V ∗ for k = 0, . . . , l . From Proposition 17.4.4 we have Φn (u n ) ≥ Φn (vn ) +
l −1
n 〈ukn∗ , uk+1 − ukn 〉.
(17.191)
k=0
i
i i
i
i
i
i
748
book 2014/8/6 page 748 i
Chapter 17. Gradient flows
Going to the limit as n goes to +∞ (recall that from (NC), Φn (u n ) → Φ(u)), we infer Φ(u) ≥ lim sup Φn (vn ) + n→+∞
l −1
〈uk∗ , uk+1 − uk 〉
k=0
6
= lim sup Φn (vn ) + Φ(v) + n→+∞
l −1
7 〈uk∗ , uk+1
− uk 〉 − Φ(v)
k=0
for any finite chain v u with uk ∈ dom ∂ Φ for k = 0, . . . , l − 1. Taking the supremum with respect to all such finite chains u v, from integration formula (17.186) we deduce Φ(u) ≥ lim sup Φn (vn ) + Φ(u) − Φ(v), n→+∞
which, together with the first step, proves the thesis. We assume now that (V , .) along with its dual (V ∗ , .∗ ) satisfies the following additional property (=) : the norms . and .∗ are strictly convex, and the weak convergence and the convergence of the norms imply the strong convergence. For example, L p -spaces (1 < p < +∞) and W k, p -spaces (k ∈ N, 1 < p < +∞) satisfy (=). Under assumption (=), Theorem 17.4.4 may be completed by the convergence of the resolvents introduced in Definition 17.4.4. Corollary 17.4.2. Let (V , .) be a reflexive Banach space and (Φn )n∈N , Φ a sequence of convex, proper, lower semicontinuous functions from V into R∪{+∞}. Assume that (V , .) along with (V ∗ , .∗ ) satisfies (=). Then the following statements are equivalent: M
(i) Φn → Φ; Φ
(ii) for all λ > 0 and all u ∈ V , Jλ n u → JλΦ u and the normalization condition (NC) holds; Φ
(iii) ∃λ0 > 0, for all u ∈ V , Jλ n u → JλΦ u and the normalization condition (NC) holds; 0
0
(iv) ∂ Φ = G − limn→+∞ ∂ Φn and the normalization condition (NC) holds. PROOF. Since (iv) =⇒ (i) is established in Theorem 17.4.4, it only remains to prove (i) =⇒ (ii) and (iii) =⇒ (iv). Proof of (i) =⇒ (ii). We proceed in four steps. In what follows, u is a fixed element of V , and λ is any positive number in R. Φ
M
Step 1. We claim that (Jλ n u)n∈N is bounded in V . Take v0 ∈ dom Φ. Since Φn → Φ, there exists a sequence (vn )n∈N such that vn → v in V and Φn (vn ) → Φ(v). Assume for the moment that there exists α > 0 such that for all n ∈ N Φn ≥ −α(. + 1).
(17.192)
Φ
From the definition of Jλ n it follows that Φn (vn ) +
1 2λ
Φ
vn − u2 ≥ Φn (Jλ n u) + Φ ≥ −αJλ n u
1 2λ
Φ
vn − Jλ n u2
− vn +
1 2λ
Φ
Jλ n u − vn 2 − α(vn + 1).
i
i i
i
i
i
i
17.4. Sequences of gradient flow problems
book 2014/8/6 page 749 i
749 Φ
From the boundedness of (vn )n∈N and (Φn (vn ))n∈N , we infer that supn∈N Jλ n u − vn < Φ
+∞, from which we derive that (Jλ n u)n∈N is bounded. Lemma 17.4.5 below states that the uniform bound (17.192) is automatically fulfilled by the sequence (Φn )n∈N . For a proof see [37, Lemma 3.8].
Lemma 17.4.5. Let (Φn )n∈N be a sequence of convex, lower semicontinuous, and uniformly proper functions from V into R ∪ {+∞}. Assume that Γw − lim infn→+∞ Φn < +∞. Then there exists α > 0 such that for all n ∈ N, Φn ≥ −α(. + 1). Φ
Step 2. We establish that Jλ n u JλΦ u. From Step 1 and the reflexivity of the space Φ
(V , .), there exist a subsequence of (Jλ n u)n∈N , not relabeled, and w ∈ V such that Φ
Jλ n u w in V . We claim that w = JλΦ u. To see this, it suffices to use the variational Φ
formulation of Jλ n u and the Mosco-convergence of Φn to Φ. The thesis then follows from the variational property of the Γ -convergence, Theorem 12.1.1(i). Φ
Step 3. We establish that Jλ n u → JλΦ u. Since (V , .) fulfills the property (=), it Φ
is enough to show that Jλ n u → JλΦ u. From hypothesis (ii) there exists a sequence (vn )n∈N satisfying vn → JλΦ u and Φn (vn ) → Φ(JλΦ u). From Φ
Φn (Jλ n u) +
1
Φ
2λ
u − Jλ n u2 ≤ Φn (vn ) +
1 2λ
u − vn 2
(17.193)
we infer lim sup n→+∞
1 2λ
Φ
Φ
u − Jλ n u2 ≤ − lim inf Φn (Jλ n u) + Φ(JλΦ u) + n→+∞
≤
1 2λ
1 2λ
u − JλΦ u2
u − JλΦ u2 . Φ
Φ
(We have used Φ(JλΦ u) ≤ lim infn→+∞ Φn (Jλ n u) since from the previous step Jλ n u JλΦ u.) But, by lower semicontinuity 1 2λ
u − JλΦ u2 ≤ lim inf n→+∞
1 2λ
Φ
u − Jλ n u2
so that Φ
lim u − Jλ n u2 = u − JλΦ u2 ,
n→+∞
which proves the thesis. Step 4. We prove the normalization condition (NC). Taking (JλΦ u, λ1 H (u − JλΦ u)) (recall that H is one to one from V onto V ∗ ) which from Proposition 17.4.3 belongs to ∂ Φ, Φ and setting un := Jλ n u and un∗ := λ1 H (u − un ), we have (un , un∗ ) ∈ ∂ Φn and from Step 3 un → JλΦ u. Furthermore, since (V ∗ , .∗ ) fulfills (=), from Lemma 17.4.2, the map H is strongly continuous from V onto V ∗ . Hence un∗ → λ1 H (u − JλΦ u). It remains to establish that Φn (un ) → Φ(JλΦ u) which readily follows from (17.193). Indeed we already have Φ
Φ
Φ(JλΦ u) ≤ lim infn→+∞ Φn (Jλ n u), and (17.193) yields lim supn→+∞ Φn (Jλ n u) ≤ Φ(JλΦ u).
i
i i
i
i
i
i
750
book 2014/8/6 page 750 i
Chapter 17. Gradient flows
Proof of (iii) =⇒ (iv). To simplify the notation, we write λ > 0 instead of λ0 . Consider (u, u ∗ ) ∈ ∂ Φ. With the notation of Lemma 17.4.2, set v := u + λH −1 (u ∗ ), then λ1 H (v − u) = u ∗ ∈ ∂ Φ(u). According to Proposition 17.4.3 we have 1 λ
H (v − JλΦ v) ∈ ∂ Φ(JλΦ v) Φ
Φ
so that by uniqueness JλΦ v = u. Let us set un := Jλ n v and un∗ := λ1 H (v − Jλ n v). From Proposition 17.4.3, (un , un∗ ) ∈ ∂ Φn , and from hypothesis (iii), un → JλΦ v = u when n Φ
goes to +∞. It remains to establish λ1 H (v − Jλ n v) → λ1 H (v − JλΦ v) in V ∗ . Since (V , .) fulfills (=), this assertion follows directly from Lemma 17.4.2, which states the strong continuity of H : V → V ∗ . This completes the proof. By rephrasing Corollary 17.4.2 in the context of Hilbert spaces, we obtain the following corollary. Corollary 17.4.3. Let (Φn )n∈N , Φ be a sequence of convex, proper, lower semicontinuous functions from a Hilbert space ( , .) into R ∪ {+∞}. Then the following statements are equivalent: M
(i) Φn → Φ; (ii) for all λ > 0 and all u ∈ , (I + λ∂ Φn )−1 u − (I + λ∂ Φ)−1 u → 0 and the normalization condition (NC) holds; (iii) ∃λ0 > 0, for all u ∈ , (I + λ0 ∂ Φn )−1 u − (I + λ0 ∂ Φ)−1 u → 0 and the normalization condition (NC) holds; (iv) ∂ Φ = G − limn→+∞ ∂ Φn and the normalization condition (NC) holds. PROOF. In this context, H is the identity map I . Therefore from Proposition 17.4.3, JλΦ u is the unique element of satisfying λ1 (u − JλΦ u) ∈ ∂ Φ(JλΦ u) or equivalently JλΦ u = Φ
(I + λ∂ Φ)−1 u. The same calculation holds for Jλ n u.
17.4.3 The weak version of the approximation theorem As noted in Chapter 12, a large number of problems arising from mechanics, physics, economics, or approximation methods in numerical analysis are modeled by means of minimization of functionals depending on some parameter, here formally denoted by n. For instance, we write these functionals Φn for Φ when is a small parameter associated to a thickness, a stiffness in mechanics, or a size of small discontinuities. Furthermore, in many cases, for instance, in linear elasticity or in thermic or theoretical physics, these functionals are convex. Therefore, for the evolution equations governed by the subdifferentials ∂ Φn , the theory of approximation of semigroups of operators is powerful for questions of convergence in transient boundary value problems. Theorem 17.4.6 below is a straightforward consequence of the previous corollary and the theory of the convergence of semigroups generated by maximal monotones operators initiated by Trotter in the Hilbertian setting and generalized by many authors (Attouch, Barbu, Brézis, Benilan, Crandall, Kato, Liggett, Pazy, and many others). In the next section, we will establish a strong version by a direct method without resorting to the theory of semigroups.
i
i i
i
i
i
i
17.4. Sequences of gradient flow problems
book 2014/8/6 page 751 i
751
Before stating the first approximation theorem, Theorem 17.4.6, we need to complete Definition 17.2.3 and Theorem 17.2.2 of Section 17.2.2 when considering nonhomogeneous Cauchy problems in finite time. Let ( , .) be a Hilbert space, Φ : → R ∪ {+∞} be a convex, proper, lower semicontinuous function, and f in L1 (0, T ; ). In what follows we look at the gradient flow equation (GF)
du dt
+ ∂ Φ(u) # f .
Definition 17.4.7. We say that u is a strong solution of (GF) if (i) and (ii) hold: (i) u ∈ C([0, T ]; ) and is absolutely continuous on [0, T ]; (ii) for almost all t ∈ (0, T ), u(t ) ∈ dom ∂ Φ, and
du (t ) + ∂ dt
Φ(u(t )) # f (t ).
We say that u is a weak solution of (GF) if there exist a sequence ( fn )n∈N in L1 (0, T ; ), du fn → f in L1 (0, T ; ), and un ∈ C([0, T ]; ) such that un is a strong solution of d tn + ∂ Φ(un ) # fn and satisfies un → u in (C([0, T ]; ), .∞ ). Theorem 17.4.5. Let u 0 ∈ dom ∂ Φ; then there exists a unique weak solution of the Cauchy problem ⎧ ⎪ ⎨ d u + ∂ Φ(u) # f , (+ ) dt ⎪ ⎩ u(0) = u 0 . PROOF. We begin by stating the following elementary lemma. For a proof we refer the reader to [135, Lemma 3.1]. Lemma 17.4.6. Let f and g in L1 (0, T ; ) and assume that u and v are weak solutions of du dt
+ ∂ Φ(u) # f and
du dt
respectively. Then for all (s, t ), 0 ≤ s ≤ t ≤ T ,
+ ∂ Φ(u) # g ,
t
u(t ) − v(t ) ≤ u(s) − u(t ) +
f (ξ ) − g (ξ ) d ξ .
(17.194)
s
Uniqueness follows directly from (17.194). For proving existence, we proceed in two steps. Step 1. We assume that f is a step function defined on the subdivision 0 = a0 < a1 ≤ · · · ai −1 < ai · · · an = T by f (.) = fi on [ai −1 , ai [ for i = 1, . . . , n and we prove that there exists a strong solution of (+ ). Consider Φi := Φ − 〈 fi , .〉 and on each interval [ai −1 , ai ] define by induction for i = 1, . . . , n, the Cauchy problems ⎧ i ⎪ ⎨ d u + ∂ Φ (u i ) # 0, i (+i ) dt ⎪ ⎩ i u (ai −1 ) = u i −1 (ai −1 ) with u 0 (0) := u 0 . A careful analysis of the proof of Theorem 17.2.2 shows that the hypothesis inf Φ > −∞ is not necessary to establish existence (and uniqueness) of a strong
i
i i
i
i
i
i
752
book 2014/8/6 page 752 i
Chapter 17. Gradient flows
solution of Problem (17.47) as long as we restrict it to a finite interval. Therefore applying Theorem 17.2.2 to each Cauchy problem (+i ) in [ai −1 , ai ], taking into account the previous remark and reasoning by induction, we infer that there exists aunique strong solution u i of (+i ) in [ai −1 , ai ]. The function u defined in [0, T ] by u := ni=1 1[ai ,ai−1 [ ui is clearly a strong solution of (+ ). Step 2. We consider the general case: f ∈ L1 (0, T ; ). There exists a sequence ( fn )n∈N of step functions in L1 (0, T ; ) such that fn → f in L1 (0, T , ). Denote by un the strong solution of ⎧ ⎪ ⎨ d un + ∂ Φ(u ) # f , n n dt ⎪ ⎩ u (0) = u 0 n obtained in Step 1. From Lemma 17.4.6 we infer un (t ) − u m (t ) ≤
t 0
fn (ξ ) − f m (ξ ) d ξ
so that (un )n∈N is a Cauchy sequence in C([0, T ]; ). Thus (un )n∈N uniformly converges to some function u in C([0, T ]; ), which, by definition, is a weak solution of (+ ). Theorem 17.4.6 (Approximation 1). Let (Φn )n∈N , Φ be a sequence of convex, proper, lower semicontinuous functions from a Hilbert space ( , .) into R ∪ {+∞} such that dom ∂ Φ ⊂ dom ∂ Φn for all n ∈ N. Let ( fn )n∈N , f , be a sequence in L1 (0, T ; ), un0 ∈ dom ∂ Φn , u 0 ∈ dom ∂ Φ, and consider un and u the unique weak solutions of the Cauchy problems
(+n )
⎧ ⎪ ⎨ d un + ∂ Φ (u ) # f , n n n dt ⎪ ⎩ u (0) = u 0 , n n
(+ )
⎧ ⎪ ⎨ d u + ∂ Φ(u) # f , dt respectively. ⎪ ⎩ u(0) = u 0 , M
Assuming that fn → f in L1 (0, T ; ), un0 → u 0 strongly in , and Φn → Φ, then un → u uniformly on [0, T ], i.e., un → u in the normed space (C(0, T ; ), .∞ ). PROOF (sketch). Step 1. We assume fn = f = 0. In the context of subdifferential operators and in the Hilbertian setting, rephrasing the result obtained by Brezis and Pazy (or Trotter for linear operators), we have the following relation between the convergence of the resolvents and the convergence of the corresponding semigroups. Lemma 17.4.7. Let (Φn )n∈N , Φ be a sequence of convex, proper, lower semicontinuous functions from a Hilbert space ( , .) into R ∪ {+∞}, and denote by (S ∂ Φn (t )) t ≥0 and (S ∂ Φ (t )) t ≥0 the contraction semigroups generated by ∂ Φn and ∂ Φ, respectively. Assume that dom ∂ Φ ⊂ dom ∂ Φn for all n ∈ N and that for all λ > 0 and all v ∈ dom ∂ Φ, (I +λ∂ Φn )−1 v −(I +λ∂ Φ)−1 v → 0. Then, for every v ∈ dom ∂ Φ, S ∂ Φn (t )v → S ∂ Φ (t )v uniformly on all compact subsets of [0, +∞).
i
i i
i
i
i
i
17.4. Sequences of gradient flow problems
book 2014/8/6 page 753 i
753
For a proof (under less restrictive conditions) see [140, Theorems 3.1, 4.1, and Corollary 4.2] or [135, Theorem 4.2]. Recall that the contraction semigroup (S A(t )) t ≥0 generated by a maximal operator A is given by the “exponential formula” #!
∀u0 ∈ dom A S A(t )u0 = lim
k→+∞
I+
t "−1 $k u0 . A k
We conclude the step by combining Corollary 17.4.3, Lemma 17.4.7, and the standard formulas un (t ) = S Φn (t )un0 , u(t ) = S Φ (t )u 0 . Step 2: General case. Let g be an arbitrary step function in L1 (0, T ; ) and consider the two evolution equations ⎧ ⎧ ⎪ ⎪ ⎨ d w + ∂ Φ(w) # g , ⎨ d wn + ∂ Φ (w ) # g , n n dt dt ⎪ ⎪ ⎩ w(0) = u 0 . ⎩w (0) = u 0 , n n The result of Step 1 applied for each interval of [0, T ] where g is constant and for the subdifferential operators ∂ Φn − 〈g , .〉 and ∂ Φ − 〈g , .〉 gives wn (t − w(t ) → 0
uniformly on [0, T ].
(17.195)
From (17.194) of Lemma 17.4.6, for all 0 ≤ s ≤ t ≤ T one has un (t ) − wn (t ) ≤ un (s) − wn (s) +
t
f (ξ ) − g (ξ )d ξ , s
t
(17.196)
f (ξ ) − g (ξ )d ξ .
u(t ) − w(t ) ≤ u(s) − w(s) + s
Applying (17.196) with s = 0, we obtain un (t ) − wn (t ) ≤ fn − g L1 (0,T ; ) , u(t ) − w(t ) ≤ f − g L1 (0,T ; ) .
(17.197)
From un (t ) − u(t ) ≤ un (t ) − wn (t ) + wn (t ) − w(t ) + u(t ) − w(t ) and (17.195), (17.197) we deduce lim sup sup un (t ) − u(t ) ≤ 2 f − g L1 (0,T ; ) . n→+∞ t ∈[0,T ]
This completes the proof since f − g L1 (0,T ; ) can be chosen arbitrarily small.
17.4.4 A strong version of the approximation theorem Under slightly less general assumptions, but not too restrictive with regard to the applications, we propose a direct proof of the previous approximation theorem. This approach has the advantage of not requiring the approximation lemma, Lemma 17.4.7; furthermore the conclusion is more accurate. The proof consists in taking into account the fact that the operators are subdifferentials.
i
i i
i
i
i
i
754
book 2014/8/6 page 754 i
Chapter 17. Gradient flows
Theorem 17.4.7 (Approximation 2). Let (Φn )n∈N , Φ be a sequence of convex, proper, lower semicontinuous functions from a Hilbert space ( , .) into R ∪ {+∞}, ( fn )n∈N , f a sequence in L2 (0, T ; ), un0 ∈ dom ∂ Φn , and u 0 ∈ dom ∂ Φ. Let un and u be the solutions of the Cauchy problems ⎧ ⎧ ⎪ ⎪ ⎨ d un + ∂ Φ (u ) # f , ⎨ d u + ∂ Φ(u) # f , n n n dt dt (+n ) (+ ) ⎪ ⎪ ⎩ u (0) = u 0 , ⎩ u(0) = u 0 , n n and assume that (i) supn∈N Φn (un0 ) < +∞; (ii) fn → f strongly in L2 (0, T ; ); (iii) un0 → u 0 strongly in ; M
(iv) Φn → Φ. Then un → u in (C(0, T ; ), .∞ ) and Φ(u 0 ), then
d un dt
→
du dt
d un dt
du dt
in L2 (0, T ; ). If moreover Φn (un0 ) →
in L2 (0, T ; ).
PROOF. For existence and uniqueness of the strong solutions of (+n ) and (+ ) see Theorem 17.2.5. (See also [135, Theorem 3.6], where it is noted that, in this situation, there is no difference between strong and weak solutions.) For the sake of simplicity, we assume that Φn and Φ are Gâteaux differentiable so that ∂ Φn and ∂ Φ are reduced to the singletons {∇Φn } and {∇Φ}. The proof proceeds in four steps and we follow the ideas of Attouch in [38]. Step 1. We establish ) ) )du ) ) n) sup ) < +∞; ) n∈N ) d t )L2 (0,T ; )
(17.198)
sup un C(0,T ; ) < +∞.
(17.199)
n∈N
From (+n ) we deduce that for a.e. t ∈ (0, T ), ) )2 )du ) d un d un ) n ) (t )) + ∇Φn (un (t )), (t ) = fn , (t ) . ) ) dt ) dt dt By integrating this equality on (0, T ), we obtain )2 T) T T ) )du d un d un ) n ) ∇Φn (un (t )), fn (t ), (t )) d t + (t ) d t = (t ) d t ) ) ) dt dt dt 0 0 0 from which we deduce (note that according to Proposition 17.2.5, for a.e. t in (0, T ), du d Φ (u (t )) = 〈∇Φn (un (t )), d tn (t )〉) dt n n 0
T
)2 ) T ) )du d un ) n ) fn (t ), (t )) d t = −Φn (un (T )) + Φn (un0 ) + (t ) d t . ) ) ) dt dt 0
(17.200)
i
i i
i
i
i
i
17.4. Sequences of gradient flow problems
book 2014/8/6 page 755 i
755
M
Since Φn → Φ, from Lemma 17.4.5, there exists α > 0 such that for all n ∈ N, Φn ≥ −α(. + 1). Therefore (17.200) yields 6 ) )2 )2 71/2 T) T ) )du ) ) ) n ) ) d un ) 0 . (t )) d t ≤ Φn (un ) + α(un (T ) + 1) + fn L2 (0,T ; ) (t )) d t ) ) ) dt ) dt ) ) 0 0 (17.201) From
un (T ) = un0 +
we infer
T
d un dt
0
6 un (T ) ≤ un0 + T 1/2
T 0
(t ) d t
)2 71/2 ) ) )du ) n ) (t )) d t . ) ) ) dt
(17.202)
Combining (17.201) and (17.202) we finally obtain 6 )2 )2 71/2 T) " T) ! ) ) )du )du ) ) n ) ) n (t )) d t ≤ Φn (un0 )+α(un0 +1)+ αT 1/2 + fn L2 (0,T ; ) (t )) d t ) ) ) ) ) ) dt dt 0 0 and estimate (17.198) follows from the equiboundedness of (un0 )n∈N in , that of ( fn )n∈N in L2 (0, T ; ), and the equiboundedness of (Φn (un0 ))n∈N . Estimate (17.199) follows from (17.198) and from 6 ) )2 71/2 T ) ) ) d un ) 0 1/2 (t )) d t ∀t ∈ (0, T ). un (t ) ≤ un + T ) ) ) dt 0 Step 2. Compactness. We prove that there exists u ∈ L2 (0, T ; ) and a subsequence of (un )n∈N (not relabeled) such that d un
du
in L2 (0, T ; ); dt dt un u in L2 (0, T ; ).
(17.203) (17.204)
du
Since from (17.198), ( d tn )n∈N is bounded in L2 (0, T ; ), there exist a subsequence (not relabeled) and g ∈ L2 (0, T ; ) such that d un dt
g
in L2 (0, T ; ).
For all t ∈ [0, T ] set
t
0
u(t ) := u +
g (s) d s. 0
Thus
du dt
= g and u(0) = u 0 . On the other hand, vn defined by t d un (s) d s vn (t ) := u 0 + 0 dt
is bounded in L2 (0, T ; ) (this is a consequence of (17.198)). Thus for a further subsequence that we do not relabel, there exists v ∈ L2 (0, T ; ) such that vn v
in L2 (0, T ; ).
i
i i
i
i
i
i
756
book 2014/8/6 page 756 i
Chapter 17. Gradient flows
Consider the map L : L2 (0, T ; ) → L2 (0, T ; ) defined by t L(h)(t ) = u 0 + h(s) d s. 0
Clearly L is a (strongly) continuous affine function so that its graph is strongly closed, thus weakly closed in L2 (0, T ; ) × L2 (0, T ; ). Therefore, from d un d un ,v = L ( g , v) in L2 (0, T ; ) × L2 (0, T ; ) dt n dt we deduce that v = L( g ), i.e., v = u. To sum up we have proved for a subsequence that d un dt
du dt
in L2 (0, T ; );
un = vn − u 0 + un0 u
in L2 (0, T ; ).
Step 3. We prove that u is the solution of (+ ). We will need the following lemma. Lemma 17.4.8. Let (ψn )n∈N , ψ be a sequence of convex, proper, lower semicontinuous funcM
tions from a separable Hilbert space ( , .) into R ∪ {+∞} such that ψn → ψ and consider (Ψn )n∈N , Ψ from L2 (0, T ; ) → R ∪ {+∞} defined by T T ψn (v(t )) d t ; Ψ(v) := ψ(v(t )) d t . Ψn (v) := 0
0
M
Then Ψn → Ψ. For a proof, see [38, Corollary 1.17]. Note that since from Lemma 17.4.5, ψn + α(. + 1) and ψ + α(. + 1) are nonnegative, the integrals entering the definition of Ψn and Ψ are well defined. According to the Fenchel extremality condition (+n ) is equivalent to d un d un ∗ Φn (un (t )) + Φn fn (t ) − (t ) + (t ) − fn (t ), un (t ) = 0 dt dt for a.e. t ∈ (0, T ), which is also equivalent to & T% d un d un (t ) + (t ) − fn (t ), un (t ) Φn (un (t )) + Φ∗n fn (t ) − d t = 0. dt dt 0 The above equivalence is due to the Fenchel inequality which asserts that the inequality du du Φn (un (t )) + Φ∗n fn (t ) − d tn (t ) + 〈 d tn (t ) − fn (t ), un (t )〉 ≥ 0 for a.e. t ∈ (0, T ) is always true. Therefore, (+n ) is equivalent to & T% d 1 d un ∗ 2 (t ) + u (t ) − 〈 fn (t ), un (t )〉 d t = 0, Φn (un (t )) + Φ fn (t ) − dt dt 2 n 0 or, equivalently,
0
T
%
& d un Φn (un (t )) + Φ∗n fn (t ) − (t ) d t dt T 1 2 0 2 〈 fn (t ), un (t )〉 d t = 0. + un (T ) − un − 2 0
(17.205)
i
i i
i
i
i
i
17.4. Sequences of gradient flow problems
book 2014/8/6 page 757 i
757
From
un (T ) =
un0
T
+
d un dt
0
(t ) d t
and (17.203), we infer that un (T ) u(T ) in . Going to the limit in (17.205), from (17.203), (17.204), the strong convergence of fn to f in L2 (0, T ; ) and Lemma 17.4.8, the lower semicontinuity of v → v in , and the strong convergence un0 to u0 in , we obtain & T% T du 1 1 Φ(u(t )) + Φ∗ f (t ) − 〈 f (t ), u(t )〉 d t ≤ 0. (t ) d t + u(T )2 − u 0 2 − dt 2 2 0 0 Equivalently
T
%
Φ(u(t )) + Φ
0
∗
f (t ) −
du dt
(t ) +
du dt
& (t ) − f (t ), u(t )
d t ≤ 0.
Noticing that, according to the Fenchel inequality, Φ(u(t )) + Φ∗ f (t ) − dd ut (t ) + 〈 dd ut (t ) − f (t ), u(t )〉 ≥ 0, we deduce that for a.e. t ∈ (0, T ), Φ(u(t )) + Φ∗ f (t ) − dd ut (t ) + 〈 dd ut (t ) − f (t ), u(t )〉 = 0. Thus ⎧ ⎪ ⎨ d u + ∇Φ(u) = f , dt ⎪ ⎩ u(0) = u 0 . Note that according to the uniqueness of the solution of (+ ), the whole sequence (un )n∈N satisfies (17.203) and (17.204). Step 4. We establish (17.206) un → u in C(0, T ; ); d un du → in L2 (0, T ; ) under the hypothesis Φn (un0 ) → Φ(u 0 ). (17.207) dt dt To prove (17.206), we will apply the Ascoli compactness theorem in (C(0, T ; ), .∞ ) to the sequence (un )n∈N . From (17.199) we already have supn∈N un C(0,T ; ) < +∞. The uniform equicontinuity of the sequence (un )n∈N is a straightforward consequence of (17.198) and ) t) ) )du ) ) n un (t ) − un (s) ≤ (ξ )) d ξ ) ) ) dt s ) ) )du ) n) 1/2 ) . ≤ |t − s| ) ) ) dt ) 2 L (0,T ; ) It remains to establish that un (t ) → u(t ) in for every t ∈ [0, T ]. Noticing that (17.203) holds in L2 (0, T ; ) for all 0 ≤ T ≤ T , from un (T ) = un0 +
T 0
d un dt
(t ) d t
we infer that un (t ) u(t ) in for all t ∈ [0, T ]. In order to prove the strong convergence un (t ) → u(t ), we are going to establish that un (t ) → u(t ). To see that, the idea
i
i i
i
i
i
i
758
book 2014/8/6 page 758 i
Chapter 17. Gradient flows
consists in extracting the maximum information from (17.205). For each term of (17.205) we have indeed obtained T T a := Φ(u(t )) d t ≤ lim inf Φ(un (t )) d t ; n→+∞
0
T
b := 0
0
T du d un Φ∗ f (t ) − Φ∗ fn (t ) − (t ) d t ≤ lim inf (t ) d t ; n→+∞ dt dt 0
1 1 1 1 c := u(T )2 − u 0 2 ≤ lim inf un (T )2 − u 0 2 ; n→+∞ 2 2 2 2 T T d := − 〈 f (t ), u(t )〉 = lim − 〈 fn (t ), un (t )〉 n→+∞
0
0
with a + b + c + d = 0. Therefore, denoting by an , bn , cn , and dn each of the four terms of (17.205) we have obtained a ≤ lim infn→+∞ an ; b ≤ lim infn→+∞ bn ; c ≤ lim infn→+∞ cn ; d = limn→+∞ dn ; a + b + c + d = an + bn + cn + dn = 0, from which we easily infer, using Lemma 17.2.1, that a = limn→+∞ an , b = limn→+∞ bn , and c = limn→+∞ c b . In particular un (T ) → u(T ). This being true for each 0 ≤ T ≤ T (by reasoning on on L2 (0, T ; )), we infer that un (t ) → u(t ) for all t ∈ [0, T ]. To prove (17.207), it suffices to establish )2 ) T) T) ) )du ) d u )2 ) ) n ) ) (t )) d t → (t )) d t , ) ) ) ) ) ) d t d t 0 0 which follows directly from (17.200). Indeed going to the limit on )2 T) T )du ) d un ) n ) 0 f d t = −Φ (u (T )) + Φ (u ) + (t ), (t ) (t ) dt ) ) n n n n n ) dt ) dt 0 0 M
we deduce, since Φn → Φ, )2 T) T ) )du du ) n ) 0 (t )) d t = − lim inf Φn (un (T )) + Φ(u ) + (t ) d t f (t ), lim sup ) ) n→+∞ dt n→+∞ 0 ) d t 0 T du 0 ≤ −Φ(u(T )) + Φ(u ) + (t ) d t f (t ), dt 0 ) T) ) d u )2 ) ) = (t )) d t ) ) dt ) 0 and then the conclusion follows from the lower semicontinuity of the norm in L2 (0, T , ).
i
i i
i
i
i
i
17.4. Sequences of gradient flow problems
book 2014/8/6 page 759 i
759
17.4.5 Application to diffusion in random media We go back to the notation and definitions of Section 12.4 in the specific case m = 1, p = 2. We write to denote a sequence (n )n∈N of positive numbers n going to zero when n → +∞, and we briefly write → 0 instead of limn→+∞ n = 0. Given α > 0 and β > 0, we denote by Convα,β the class of functions g : RN ×RN → R, (x, ξ ) → g (x, ξ ), measurable in x, convex with respect to ξ , and satisfying the growth condition (12.5), i.e., α|ξ |2 ≤ g (x, ξ ) ≤ β(1 + |ξ |2 ) for all (x, ξ ) ∈ RN × RN . Using the subdifferential inequality together with (12.5), it is easy to prove that the functions g automatically satisfy the local Lipschitz condition (12.6) for some L ≥ 0 depending only on α and β. Therefore, with the notation of Section 12.4.3, Convα,β ⊂ Bα,β,L . In what follows, Convα,β is endowed with the trace σ-algebra which equips Bα,β,L . Let (Σ, , P) be a probability space. In this section, we are given a random convex integrand f : Σ × RN × RN → R, i.e., a ⊗ (RN ) ⊗ (RN ), (R) measurable function such that for every ω ∈ Σ, the function f (ω, ., .) belongs to the class Convα,β . It is noted that since Convα,β is equipped with the trace σ-algebra of the one that equips Bα,β,L , all the results of Section 12.4 remain valid when replacing the class Bα,β,L by the class Convα,β . Given the group (T z ) z∈ZN , defined for all g in Convα,β by T z g (x, .) = g (x + z, .), in the context of the discrete dynamical system (Σ, , P, (T z ) z∈ZN ), we assume that f is periodic in law, i.e., that the law f #P of f is invariant with respect to the group (Tz ) z∈ZN . (See Section 12.4 for precise definitions and examples.) By combining the abstract results of the previous section with Section 12.4 we intend to analyze the asymptotic behavior in C (0, T ; H01 (Ω)) of the solution u (ω) of the random Cauchy problem when → 0: ⎧ ⎪ ⎨ d u (ω) + A (ω)(u (ω)) # g (ω), dt (17.208) ⎪ ⎩ u (ω, 0) = u 0 (ω), where the random operator A (ω) : L2 (Ω) → 2L (Ω) is defined for every ω ∈ Σ by M ! . N " dom A (ω) = v ∈ H01 (Ω) : ∃σ ∈ ∂ξ f ω, , ∇v , div σ ∈ L2 (Ω) 2
and, for all v ∈ dom A (ω),
! . " A (ω)v = − div ∂ξ f ω, , ∇v . We assume that ω → u0 (ω) and ω → g (ω) are two , (L2 (Ω)) and , (L2 (0, T ; L2 (Ω)))
measurable functions respectively, and that u0 (ω) ∈ dom A (ω). To shorten the notation we will write the evolution problem (17.208) as follows: ⎧ " ! ⎪ ⎨ d u (ω) − div ∂ f ω, . , ∇u # g (ω), ξ (+ (ω)) dt ⎪ ⎩ u (ω, 0) = u 0 (ω).
i
i i
i
i
i
i
760
book 2014/8/6 page 760 i
Chapter 17. Gradient flows
It is easy to check that A (ω) is nothing but the subdifferential of the random functional F (ω, .) considered in Section 12.4.5 and defined by F : Ω × L2 (Ω) −→ R+ ∪ {+∞}, ⎧ ! " ⎨ f ω, x , ∇u d x if u ∈ W 1,2 (Ω), 0 F (ω, u) = ⎩ Ω +∞ otherwise. The functional F (ω, .) models a random energy concerning various steady state situations and, with the notation of Section 12.3.2, the equilibrium configuration is given by the field u solution of the random problem 2 inf F (ω, u) − L(u) : u ∈ L (Ω) , Ω
where the small parameter accounts for the size of small and randomly distributed heterogeneities. The Cauchy problem (+ (ω)) then models the corresponding diffusion. From Theorem 12.4.7 for P-almost all ω in Σ the sequence of functional (F (ω, .))>0 Γ -converges to the random integral functional F ho m (ω, ) defined in L2 (Ω) by ⎧ ⎨ f ho m (ω, ∇u) d x if u ∈ W01,2 (Ω), F ho m (ω, u) = Ω ⎩ +∞ otherwise, when L2 (Ω) is equipped with the norm topology. Recall that, from Proposition 12.4.3, the density f ho m is given, for P-a.e. ω ∈ Σ, by 1 1,2 f ho m (ω, a) = lim inf f (ω, y, a + ∇u(y)) d y : u ∈ W (Y ) 0 n→+∞ n N nY 1 1,2 0 = inf∗ E inf f (ω, y, a + ∇u(y)) d y : u ∈ W0 (Y ) , n∈N n N nY where E0 denotes the conditional expectation with respect to the σ-algebra of invariant sets of by the group (T z ) z∈ZN . If f is ergodic, then f ho m is deterministic and given for P-a.e. ω ∈ Σ by 1 1,2 ho m f (a) = lim inf f (ω, y, a + ∇u(y)) d y : u ∈ W0 (Y ) n→+∞ n N nY 1 1,2 = inf∗ E inf f (ω, y, a + ∇u(y)) d y : u ∈ W (Y ) . 0 n∈N n N nY Given ω → u 0 (ω) and ω → g (ω) two , (L2 (Ω)) and , (L2 (0, T ; L2 (Ω))) measurable functions, respectively, and assuming that u 0 (ω) ∈ dom A(ω), we consider the Cauchy problem ⎧ ⎪ ⎨ d u(ω) + A(ω)(u) # g (ω), (+ ho m (ω)) (17.209) dt ⎪ ⎩ u(ω, 0) = u 0 (ω),
i
i i
i
i
i
i
17.4. Sequences of gradient flow problems
book 2014/8/6 page 761 i
761
where the random operator A(ω) : L2 (Ω) → 2L (Ω) is defined for every ω ∈ Σ by 2
dom A(ω) = {v ∈ H01 (Ω) : ∃σ ∈ ∂ f
ho m
(ω, ∇v), div σ ∈ L2 (Ω)}
and, for all v ∈ dom A(ω), A(ω)v = − div ∂ f ho m (ω, ∇v). It is easily seen that A(ω) is the subdifferential of the random functional F ho m (ω, .). To shorten the notation we write the evolution problem (17.208) as follows: ⎧ ⎪ ⎨ d u(ω) − div ∂ f ho m (ω, ∇u ) # g (ω), ho m (ω)) dt (+ ⎪ ⎩ u(ω, 0) = u 0 (ω). The next theorem states that for P-a.e. ω ∈ Σ, (+ ho m (ω)) is the limit evolution problem of (+ (ω)) in the sense of Theorem 17.4.7. It is referred to as the homogenized evolution problem. Proposition 17.4.6 expresses the limit operator A(ω) = − div ∂ f ho m (ω, ∇.) in terms of almost sure graph limit. Theorem 17.4.8. Assume that for P-a.e. ω ∈ Σ, u0 (ω) strongly converges to u 0 (ω) in L2 (Ω) and that g (ω) strongly converges to g (ω) in L2 ([0, T ]; L2 (Ω)). Suppose further that the unique solution u (ω) of (+ (ω)) satisfies sup>0 F (u0 (ω)) < +∞ for P-a.e. ω ∈ Σ. Then for P-a.e. ω ∈ Σ, u (ω) uniformly converges in C(0, T ; L2 (Ω)) to the unique solution u(ω) of the evolution problem ⎧ ⎪ ⎨ d u(ω) − div ∂ f ho m (ω, ∇u) # g (ω), ho m (+ (17.210) (ω)) dt ⎪ ⎩ u(ω, 0) = u 0 (ω). Furthermore, for P-a.e. ω ∈ Σ, d u (ω) dt
d u(ω) dt
in L2 (0, T ; L2 (Ω))
and, if F (ω, u0 ) → F ho m (ω, u 0 ), then d u (ω) dt
→
d u(ω) dt
in L2 (0, T ; L2 (Ω)).
If f is ergodic, then f ho m is deterministic. If further u 0 and g are deterministic, then the homogenized evolution problem is deterministic and given by ⎧ ⎪ ⎨ d u − div ∂ f ho m (∇u) # g , (+ ho m ) (17.211) dt ⎪ ⎩ u(0) = u 0 . PROOF. We claim that condition (iv) of Theorem 17.4.7 is satisfied. Indeed, from Theorem 12.4.7, for P-almost all ω in Σ, the sequence of functional (F (ω, .))>0 Γ -converges to
i
i i
i
i
i
i
762
book 2014/8/6 page 762 i
Chapter 17. Gradient flows
the random integral functional F ho m (ω, .) when L2 (Ω) is equipped with the norm topology. From the lower bound condition f (ω, x, ξ ) ≥ α|ξ |2 we deduce that every sequence (u )>0 of bounded energy, i.e., satisfying sup>0 F (ω, u ) < +∞, which weakly converges to some u in L2 (Ω), weakly converges to u in H01 (Ω), then strongly converges to u in L2 (Ω). Therefore the sequence (F (ω, .))>0 Mosco-converges to the random integral functional F ho m (ω, .) and the claim then follows. Since the three other conditions are fulfilled, the conclusion follows from Theorem 17.4.7. For every a ∈ RN , every n ∈ N∗ , and all ω ∈ Σ, let us set 1 1,2 f (ω, y, a + ∇u(y)) d y : u ∈ W0 (Y ) fn (ω, a) := inf n N nY
(17.212)
and consider the function fn (ω, .) : RN → R, a → fn (ω, a). From Proposition 12.4.3, for all fixed a ∈ RN , limn→+∞ fn (ω, a) = f ho m (ω, a) for P-a.e. ω ∈ Σ. In the proposition below, we establish that f ho m (ω, .) is a Mosco limit so that we can express ∂ f ho m as a graph limit. Proposition 17.4.6. Under the assumptions of Theorem 17.4.8, the following assertions hold for P-a.e. ω ∈ Σ: M
(i) fn (ω, .) → f ho m (ω, .). G
(ii) ∂ fn (ω, .) → ∂ f
(ω, .), where ∂ fn (ω, .) is characterized by 1 ∂ fn (ω, a) = σ d y : div σ = 0, σ(y) ∈ ∂ξ f (ω, y, q + a) n N nY ho m
a.e. in nY, q
∈ ∇H01 (nY )
.
(iii) Assume that for a.e. x ∈ RN , f (ω, x, .) is strictly convex and Gâteaux differentiable and that its Fenchel conjugate is such that 〈ξ1∗ − ξ2∗ , ξ 1 − ξ 2 〉 ≥ γ |ξ1 − ξ2 |2 for some γ > 0 and for all (ξ1 , ξ2 ) ∈ RN × RN and all (ξ1∗ , ξ2∗ ) ∈ ∂ξ f ∗ (ω, x, ξ1 ) × ∂ξ f ∗ (ω, x, ξ2 ). Then fn (ω, .) and f ho m (ω, .) are Gâteaux differentiable and for all a ∈ RN , ∇ f ho m (ω, a) = lim ∇ fn (ω, a) n→+∞
where ∇ fn (ω, a) =
1
for P-a.e. ω ∈ Σ,
∇ξ f (ω, y, ∇ua,n (ω)(y) + a) d y, n N nY and ua,n (ω) is the unique solution of the random Dirichlet problem div ∇ξ f (ω, ., a + ∇v(.)) = 0 a.e. in nY ; v = 0 on ∂ nY. PROOF. In the proof, we fix ω in a set of full probability, for which the conclusions of Theorem 17.4.8 hold. It is easy to prove that fn (ω, .), f ho m (ω, .) : RN → R are convex and satisfy the growth conditions fulfilled by f , i.e., α|.|2 ≤ f n (ω, .),
f ho m (ω, .) ≤ β(1 + |.|2 ).
i
i i
i
i
i
i
17.4. Sequences of gradient flow problems
book 2014/8/6 page 763 i
763
Consequently they satisfy the local equi-Lipschitz condition (12.6). This implies that fn (ω, .) almost surely Gamma-converges to f ho m (ω, .). (This is the general feature of sequences of equilower semicontinuous functionals.) Indeed let (an ) be a sequence in RN converging to a. From (12.6) we infer fn (ω, an ) ≥ fn (ω, a) − L|an − a|(1 + |an | + |a|), from which we deduce, since fn (ω, .) almost surely converges to f ho m (ω, .), that lim inf fn (ω, an ) ≥ f n→+∞
ho m
(ω, a).
The upper bound in the definition of the Gamma-convergence is trivially satisfied by taking the constant sequence (a)n∈N as a recovery sequence. Assertion (i) follows since the Gamma-convergence and the Mosco-convergence coincide in finite dimensional spaces. Assertion (ii) follows from Theorem 17.4.4. To express the subdifferential ∂ fn (ω, .), it suffices to remark that fn (ω, .) is the epigraphical sum defined in RN by fn (ω, a) = G#δK ( j (a)), where G(v) =
1 nN
f (ω, y, v) d y
∀v ∈ L2 (nY, RN ),
nY
δK is the indicator function of K = {v ∈ L2 (nY, RN ) : ∃u ∈ H01 (nY ), v = ∇u}, and j is the canonical embedding from RN to L2 (nY, RN ), then to use standard subdifferential calculus rules in convex analysis. More precisely ! " 3 ∂ G(v + j (a)) ∩ ∂ δK (−v) , ∂ G#δK ( j (a)) = j T ◦ v∈L2 (nY,RN )
where j T is the transposed operator of the embedding j given by T j (v) = v d y. nY
Furthermore v ∗ ∈ ∂ δK (v) ⇐⇒ 〈v ∗ , ∇ϕ〉L2 (nY,RN ) = 0 for all ϕ ∈ H01 (nY ), i.e., div v ∗ = 0 a.e. in nY . Let us prove (iii). The fact that fn (ω, .) is Gâteaux differentiable comes from the formula of its subdifferential operator expressed in (ii). Indeed, under the hypotheses of (iii), ∂ fn (ω, a) is reduced to 1 ∇ξ f (ω, y, ∇ua (y) + a) d y, n N nY where ua is the unique minimizer in H01 (nY ) of (17.212), then satisfies the random Dirichlet problem div ∇ξ f (ω, ., a + ∇ua,n (.)) = 0 a.e. in nY ; ua,n = 0 on ∂ nY. In order to simplify the notation, we do not indicate the dependence on ω and n for the minimizer ua .
i
i i
i
i
i
i
764
book 2014/8/6 page 764 i
Chapter 17. Gradient flows G
Let (a, a ∗ ) ∈ ∂ f ho m (ω, .). Since ∇ fn (ω, .) → ∂ f ho m (ω, .), there exists an ∈ RN such that an → a and ∇ fn (ω, an ) → a ∗ . We first claim that |∇ fn (ω, an ) − ∇ fn (ω, a)| ≤
1 γ
|an − a|.
(17.213)
Indeed from Jensen’s inequality we have 1 2 |∇ fn (ω, an )−∇ fn (ω, a)| ≤ N |∇ξ f (ω, y, ∇uan (y)+an )−∇ξ f (ω, y, ∇ua (y)+a)|2 d y. n nY (17.214) On the other hand, from the hypothesis of (ii) γ |∇ξ f (ω, y, ∇uan (y) + an ) − ∇ξ f (ω, y, ∇ua (y) + a)|2 d y nY Z Y ≤ ∇ξ f (ω, y, ∇uan (y) + an ) − ∇ξ f (ω, y, ∇ua (y) + a), ∇uan (y) + an − ∇ua (y) − a RN d y nY Z Y = ∇ξ f (ω, y, ∇uan (y) + an ) − ∇ξ f (ω, y, ∇ua (y) + a), an − a RN d y nY
from which we deduce 1 1 |∇ξ f (ω, y, ∇uan (y) + an ) − ∇ξ f (ω, y, ∇ua (y) + a)|2 d y ≤ 2 |an − a|2 . (17.215) N γ n nY Combining (17.214) and (17.215) yields (17.213). From (17.213), we infer that a ∗ = limn→+∞ ∇ fn (ω, a). Thus ∂ f ho m (ω, .) is made up of a single point, which concludes the proof. Combining Proposition 12.3.4, Theorem 17.4.8, and Proposition 17.4.6, we deduce the following convergence result of the evolution problem: ⎧ " ! ⎪ ⎨ d u − div ∂ f . , ∇u # g , ξ dt (+ ) ⎪ ⎩ u (0) = u 0 , when x → f (x, ξ ) is Y -periodic. Corollary 17.4.4 (periodic case). Assume that u0 strongly converges to u 0 in L2 (Ω) and that g strongly converges to g in L2 ([0, T ]; L2 (Ω)). Suppose further that the unique solution u of (+ ) satisfies sup>0 F (u0 ) < +∞. Then u uniformly converges in C(0, T ; L2 (Ω)) to the unique solution u of the evolution problem ⎧ ⎪ ⎨ d u − div ∂ f ho m (∇u) # g , ho m (+ (ω)) dt (17.216) ⎪ ⎩ u(0) = u 0 , where f ho m is given by f
ho m
(a) = inf
f (y, a + ∇u(y)) d y : u Y
1, p ∈ W# (Y )
.
i
i i
i
i
i
i
17.4. Sequences of gradient flow problems
Furthermore,
d u dt
book 2014/8/6 page 765 i
765
du dt
in L2 (0, T ; L2 (Ω))
and, if F (u0 ) → F ho m (u 0 ), d u dt
→
du dt
in L2 (0, T ; L2 (Ω)).
Assume that for a.e. x ∈ RN , f (x, .) is strictly convex and Gâteaux-differentiable and that its Fenchel conjugate is such that 〈ξ1∗ − ξ2∗ , ξ 1 − ξ 2 〉 ≥ γ |ξ1 − ξ2 |2 for some γ > 0 and for all (ξ1 , ξ2 ) ∈ RN × RN and all (ξ1∗ , ξ2∗ ) ∈ ∂ξ f ∗ (x, ξ1 ) × ∂ξ f ∗ (x, ξ2 ). Then f ho m is Gâteaux differentiable and for all a ∈ RN , ho m ∇f (a) = ∇ξ f (y, ∇ua (ω)(y) + a) d y, Y
where ua is the solution of the Dirichlet problem div ∇ξ f (., a + ∇v(.)) = 0 a.e. in Y ; v ∈ W#1,2 (Y )/R. Example 17.4.1. We choose to illustrate Theorem 17.4.8 by considering a diffusion through a composite made up of small balls distributed at random following a Poisson point process with intensity λ "N , λ > 0, included in a homogeneous material. We carry on with Example 12.4.2, where the convex density g− represents, for instance, a thermal or electrical conductivity of the balls whose centers are randomly distributed with a frequency λ per unit of volume, while the convex density g+ represents a conductivity outside the balls. The random integrand is then given by 3 ⎧ B(ωi , r ), ⎨ g− (ξ ) if x ∈ i ∈N f (ω, x, ξ ) = ⎩ g+ (ξ ) otherwise, or, in an equivalent way, by
f (ω, x, ξ ) := g+ (ξ ) + g− (ξ ) − g+ (ξ ) min(1, 8 (ω, B(x, r ))),
(17.217)
where 8 is the Poisson point process satisfying for all A ∈ b (R3 ), 8 (ω, A) = #(A∩ Ω) and E(8 (., A)) = λ "N (A). Starting from formula (17.217), one can easily see that f is β ergodic. We consider the standard quadratic case, i.e., g− (ξ ) = α2 |ξ |2 and g+ (ξ ) = 2 |ξ |2 . 1 1 |ξ ∗ |2 and g+∗ (ξ ) = 2β |ξ ∗ |2 and that condition (iii) of It is easy to show that g−∗ (ξ ) = 2α
Proposition 17.4.6 is fulfilled with γ = min{ α1 , β1 }.
Finally we assume that for P-a.e. ω ∈ Σ, u0 (ω) strongly converges in L2 (Ω) to some u 0 (ω) and that the source g (ω) strongly converges to a function g (ω) in L2 ([0, T ]; L2 (Ω)). Then applying Theorem 17.4.8 together with Proposition 17.4.6, we infer that the unique solution u (ω) of the random evolution problem ⎧ " ! ⎪ ⎨ d u (ω) − div ∇ f ω, . , ∇u = g (ω), ξ (+ (ω)) dt ⎪ ⎩ u (ω, 0) = u 0 (ω),
i
i i
i
i
i
i
766
book 2014/8/6 page 766 i
Chapter 17. Gradient flows
P-almost surely uniformly converges in C(0, T ; L2 (Ω)) to the unique solution u(ω) of the evolution problem ⎧ ⎪ ⎨ d u(ω) − div ∇ f ho m (∇u(ω)) = g (ω), ho m (17.218) ) (+ dt ⎪ ⎩ u(ω, 0) = u 0 (ω). d u (ω)
d u(ω)
Furthermore, for P-a.e. ω ∈ Σ, d t d t in L2 (0, T ; L2 (Ω)). The limit deterministic operator div ∇ f ho m (∇u(ω)) can be calculated following the process below: • Solve the random Dirichlet problem div ∇ξ f (ω, ., a + ∇v(.)) = 0 v = 0 on ∂ nY,
a.e. in nY ;
whose ua,n (ω) is the unique solution. • Compute ∇ fn (ω, a) = n1N nY ∇ξ f (ω, y, ∇ua,n (ω)(y) + a) d y; then, for P-a.e. ω in Σ, ∇ f ho m (ω, a) = lim ∇ fn (ω, a). n→+∞
17.5 Steepest descent and gradient flow on general metric spaces Evolution equations in general describe the changing of a physical (or economic, or social) system with respect to the time; in many situations the state of the system is the main variable entering in the evaluation of a cost functional Φ whose values tend to become as low as possible in a unit of time. Then we say that the system evolves through the maximal slope or the steepest descent of the cost functional Φ and that the evolution occurs through the gradient flow of Φ. The theory of gradient flows has received great attention from the mathematical community in the recent years, mainly because of several links with the mass transportation theory presented in Section 11.5. This made it possible to write several partial differential equations of evolution type as gradient flows of functionals defined in some spaces of measures. We recall here some notions and results in the direction of variationally driven evolutions in metric spaces. In particular, we start by presenting the general theory of steepest descent and gradient flow on general metric spaces, introduced by De Giorgi in [194] in order to study evolution problems with an underlying variational structure. The theory was later developed in the monograph [27], to which we refer for further details. The framework of the theory is very general and applies both to quasi-static evolutions as well as to gradient flows, under rather mild assumptions. When the state of the system under consideration is a vector u(t ) of the Euclidean space RN or more generally of a Hilbert space , and the cost functional Φ is smooth, the evolution by maximal slope is described by the differential equation u˙ (t ) = −∇Φ u(t ) . (17.219) In fact, multiplying by u˙ the equation above, we obtain d dt
Φ u(t ) = −| u˙ (t )|2 ,
i
i i
i
i
i
i
17.5. Steepest descent and gradient flow on general metric spaces
book 2014/8/6 page 767 i
767
which shows that Φ u(t ) decreases. The two scalar equalities ⎧ d ⎪ ⎨ Φ u(t ) = −|∇Φ u(t ) || u˙(t )|, dt ⎪ ⎩ | u˙(t )| = |∇Φ u(t ) | then show that the decreasing rate of Φ u(t ) is maximal. It is interesting to note that we can equivalently write the differential equation (17.219) by the inequality 1 1 Φ u(t ) ≤ − | u˙(t )|2 − |∇Φ u(t ) |2 , dt 2 2 d
(17.220)
where only the quantities |∇Φ| and | u˙| appear. When the state variable of the system belongs to a metric space, as, for instance, occurs in the case of shape optimization problems, the concept of differentiability and smoothness are no longer available, and the description of evolution by maximal slope and the related gradient flow have to be defined in a more general way. Our general framework deals with a complete metric space (X , d ), an initial condition u0 ∈ X , and a functional Φ : X →] − ∞, +∞]. In the following, we set 4 5 dom Φ = u ∈ X : Φ(u) < +∞ and we always assume that Φ is proper, that is, dom Φ = , which means that Φ is not constantly equal to +∞. An important concept in this framework is the one of metric derivative. For a function u : [0, T ] → X we call metric derivative at the point t0 the quantity d u(t ), u(t0 ) | u˙|(t0 ) = lim sup . |t − t0 | t →t0 The function u is said to belong to the class AC p (0, T ; X ), with p ∈ [1, ∞], if the metric derivative | u˙| belongs to L p (0, T ). In this case it can be shown that the lim sup above is actually a limit for a.e. t0 ∈ [0, T ]. Definition 17.5.1. A function G : X → [0, +∞] is called an upper gradient of Φ if for every curve u ∈ AC 1 (0, T ; X ) with (G ◦ u)| u˙| ∈ L1 (0, T ) the function Φ ◦ u belongs to W 1,1 (0, T ) and d for a.e. t ∈ [0, T ]. Φ u(t ) ≤ G u(t ) | u˙|(t ) dt Remark 17.5.1. In [27] a weaker definition of upper gradient is also considered, which requires only that the function Φ ◦ u belongs to BV (0, T ). Definition 17.5.2. A locally absolutely continuous curve u : [0, T ] → X is said a curve of maximal slope (or of steepest descent) for the functional Φ with respect to a functional G if (i) G is an upper gradient of Φ; (ii) Φ u(t ) is decreasing; (iii)
1 1 Φ u(t ) ≤ − | u˙|2 (t ) − G 2 u(t ) for a.e. t ∈ [0, T ]. dt 2 2 d
i
i i
i
i
i
i
768
book 2014/8/6 page 768 i
Chapter 17. Gradient flows
In view of Definitions 17.5.1 and 17.5.2 above it is crucial to identify some canonical functional that is a good candidate to be an upper gradient of a given cost Φ. Definition 17.5.3. The local slope |∂ Φ| of Φ at a point u0 ∈ dom Φ is defined by |∂ Φ|(u0 ) = lim sup u→u0
+ Φ(u0 ) − Φ(u) d (u, u0 )
,
where (·)+ denotes the positive part function. If τ is another topology on X we also define the τ-relaxed slope |∂τ− Φ| of Φ as |∂τ− Φ|(u0 ) = inf lim inf |∂ Φ|(un ) : un →τ u, un bounded in X , sup Φ(un ) < +∞ . n
n
We are now in a position to consider the general problem of existence of steepest descent curves in a complete metric space X ; in Section 17.6 we see that under quite mild assumptions on Φ these curves exist and they can be considered as the natural generalizations of the evolution PDEs for smooth cost functionals in Hilbert spaces.
17.6 Minimizing movements and the implicit Euler scheme The framework we consider in this section is the same as that of Section 17.5, that is, (X , d ) a complete metric space, an initial condition u0 ∈ X , and a functional Φ : X →]−∞, +∞] which we assume proper, that is, not constantly equal to +∞. For every fixed > 0 the implicit Euler scheme of time step and initial condition u0 consists in constructing a function u (t ) = w([t /]), where [·] stands for the integer part function, in the following recursive way: d 2 v, w(n) . (17.221) w(n + 1) ∈ arg min Φ(v) + w(0) = u0 , 2 Definition 17.6.1. We say that u : [0, T ] → X is a minimizing movement associated to the functional Φ, to the topology τ, and to the initial condition u0 , and we write u ∈ M M (Φ, τ, u0 ) if u (t ) →τ u(t ) ∀t ∈ [0, T ]. If the limit above occurs only for a sequence (n )n∈N (independent of t ), we say that u : [0, T ] → X is a generalized minimizing movement and we write u ∈ GM M (Φ, τ, u0 ). Let Φ : X →] − ∞, +∞] be a proper functional and let τ be a topology on X . We assume the following: (i) τ is weaker that d and un →τ u, vn →τ v ⇒ d (u, v) ≤ lim inf d (un , vn ); n
(ii) Φ is sequentially τ-lower semicontinuous, that is, un →τ u ⇒ Φ(u) ≤ lim inf Φ(un ); n
i
i i
i
i
i
i
17.6. Minimizing movements and the implicit Euler scheme
book 2014/8/6 page 769 i
769
(iii) Φ is τ-coercive, that is, for every c ∈ R the sublevel {Φ ≤ c} is sequentially τcompact. A crucial existence theorem for minimizing movements is the following. (We refer to Proposition 2.2.3 of [27] for the proof.) Theorem 17.6.1. Under the assumptions above, for every initial condition u0 ∈ dom Φ there exists a generalized minimizing movement u ∈ GM M (Φ, τ, u0 ). Moreover, we have that GM M (Φ, τ, u0 ) ⊂ AC 2 (0, T ; X ). Remark 17.6.1. For a fixed p > 1, a recursive construction similar to (17.221) can be done by setting d p v, w(n) . w(n + 1) ∈ arg min Φ(v) + w(0) = u0 , p p−1 Then a result similar to the existence theorem, Theorem 17.6.1, holds, in the sense that GM M (Φ, τ, u0 ) is nonempty and GM M (Φ, τ, u0 ) ⊂ AC p (0, T ; X ). The case p = 1 corresponds to the recursive definition 4 5 w(0) = u0 , w(n + 1) ∈ arg min Φ(v) + d v, w(n) and is used to describe the quasi-static evolution problems. We refer to [282] for a general presentation of quasi-static evolution problems and rate-independent processes. The link between minimizing movements and curves of maximal slope is given by the following result (see Theorem 2.3.3 of [27]). Theorem 17.6.2. Let us assume the conditions (i), (ii), (iii) above on Φ and τ; assume in addition that (17.222) the mapping |∂τ− Φ| is an upper gradient of Φ. Then, for every initial condition u0 ∈ dom Φ, every curve u ∈ GM M (Φ, τ, u0 ) is a curve of maximal slope for Φ with respect to |∂τ− Φ|. Moreover, the energy identity 1 2
s
2
| u˙| d t + 0
1 2
s 0
|∂τ− Φ|2 u(t ) d t + Φ u(s) = Φ(u0 )
holds for every s > 0. Notice that from the energy identity above we obtain an equality in condition (iii) of Definition 17.5.2 of steepest descent curves: 1 1 Φ u(t ) + | u˙ |2 (t ) + |∂τ− Φ|2 u(t ) = 0 dt 2 2 d
for a.e. t ∈ [0, T ].
The problem is now reduced to proving condition (17.222). The simplest situation in which this can be done is when X is a Hilbert space, τ its weak topology, and Φ : X → ] − ∞, +∞] a proper, convex, lower semicontinuous cost functional. In this case we have |∂τ− v| = |∂ Φ| and both coincide with the element of minimal norm of the subdifferential of Φ, which is an upper gradient of Φ. This case is analyzed in detail in Section 17.2.
i
i i
i
i
i
i
770
book 2014/8/6 page 770 i
Chapter 17. Gradient flows
An interesting generalization of the concept of convexity in metric spaces is the notion of geodesic convexity. Definition 17.6.2. Let λ ∈ R be fixed. A functional Φ : X →] − ∞, +∞] is called λ-geodesically convex if for every u0 , u1 ∈ X , there exists a curve γ : [0, 1] → X such that (i) γ (0) = u0 and γ (1) = u1 ; (ii) γ is a constant speed geodesic, that is, d γ (s), γ (t ) = (t − s)d (u0 , u1 )
for every 0 ≤ s ≤ t ≤ 1;
(iii) Φ is λ-convex along γ , that is, t (1 − t ) 2 Φ γ (t ) ≤ (1 − t )Φ(u0 ) + t Φ(u1 ) − λ d (u0 , u1 ) 2
for every t ∈ [0, 1].
Note that in a Banach space, since the geodesics are the line segments, when λ = 0 we recover the usual convexity. For λ-geodesically convex functionals the following result holds (see Corollary 2.4.12 of [27]). Theorem 17.6.3. Let us assume the conditions (i), (ii), (iii) on Φ and τ. Assume in addition that (a) Φ is λ-geodesically convex for some λ ∈ R; (b) |∂ Φ| = |∂τ− |, that is, the map u → |∂ Φ|(u) is sequentially τ-lsc on d -bounded subsets of sublevels of Φ. Then the map |∂ Φ| is an upper gradient of Φ and so Theorem 17.6.2 applies. The class of λ-geodesically convex cost functionals includes some very interesting cases coming from mass transportation theory, in which the metric space X is the space P(Ω) of probabilities on Ω, metrized by the Wasserstein distance introduced in Section 11.5. We refer the interested reader to the book [27] and the references therein.
i
i i
i
i book
i
2014/8/6 page 771
i
i
Bibliography [1] Y. Abddaimi, C. Licht, G. Michaille. Stochastic homogenization of an integral functional of quasiconvex function with linear growth. Asymptot. Anal. 15 (1997), 183–202. (Cited on pp. 467, 559, 585) [2] P.-A. Absil, R. Mahony, B. Andrews. Convergence of the iterates of descent methods for analytic cost functions. SIAM J. Optim. 16 (2005), 531–547. (Cited on p. 669) [3] N. Acerbi, N. Fusco. Semicontinuity problems in the calculus of variations. Mech. Anal. 86 (1984), 125–145. (Cited on pp. 476, 554) [4] M. A. Ackoglu, U. Krengel. Ergodic theorem for superadditive processes. J. Reine Angew. Math. 323 (1981), 53–67. (Cited on pp. 500, 519) [5] D. R. Adams, L. I. Hedberg. Function Spaces and Potential Theory. Springer-Verlag, Berlin, 1996. (Cited on p. 210) [6] S. Adly, M. Thera, E. Ernst. Stability of the solution set of non-coercive variational inequalities. Comm. Contemp. Math. 4 (2002), no. 1, 145–160. (Not cited) [7] S. Adly, E. Ernst, M. Thera. A characterization of convex and semicoercive functionals. J. Convex Anal. 8 (2001), no. 1, 127–148. (Not cited) [8] S. Agmon, A. Douglis, L. Nirenberg. Estimates near the boundary for solutions of elliptic partial differential equations satisfying general boundary conditions I. Comm. Pure Appl. Math. 12 (1959), 623–727. (Cited on pp. 223, 715) [9] S. Agmon, A. Douglis, L. Nirenberg. Estimates near the boundary for solutions of elliptic partial differential equations satisfying general boundary conditions II. Comm. Pure Appl. Math. 17 (1964), 35–92. (Cited on p. 223) [10] G. Alberti. Rank one properties for derivatives of functions with bounded variation. Proc. Roy. Soc. Edinburgh 123A (1993), 239–274. (Cited on p. 460) [11] G. Allaire. Shape Optimization by the Homogenization Method. Appl. Math. Sci. 146, Springer-Verlag, New York, 2002. (Cited on pp. 643, 644) [12] F. Alvarez, H. Attouch, J. Bolte, P. Redont. A second-order gradient-like dissipative dynamical system with Hessian-driven damping. Application to optimization and mechanics. J. Math. Pures Appl., 81 (2002), no. 8, 747–779. (Cited on p. 665) [13] F. Alvarez, J.-P. Mandallena. Multiparameter homogenization by localization and blow-up. Proc. Roy. Soc. Edinburgh 134A (2004), 801–814. (Cited on p. 501) [14] F. Alvarez, J.-P. Mandallena. Homogenization of multiparameter integrals. Nonlinear Anal. 50 (2002), 839–870. (Cited on p. 501)
771
i
i i
i
i book
i
2014/8/6 page 772
i
i
772
Bibliography [15] A. Ambrosetti, P. Rabinowitz. Dual variational methods in critical point theory and applications. J. Funct. Anal. 14 (1973), 349–381. (Cited on p. 95) [16] L. Ambrosio. A compactness theorem for a special class of functions of bounded variation. Boll. Un. Mat. It. 3B (1989), 857–881. (Cited on p. 534) [17] L. Ambrosio. A new proof of the SBV compactness theorem. Calc. Var. Partial Differential Equations 3 (1995), 127–137. (Cited on p. 431) [18] L. Ambrosio. On the lower semicontinuity of quasi-convex integrals in SBV (Ω; Rk ). Nonlinear Anal. 23 (1994), 405–425. (Cited on p. 568) [19] L. Ambrosio. Corso introduttivo alla Theoria Geometrica della misura ed alle Superfici Minime. Appunti dei corsi tenuti da documenti della Scuola, Scuola Normale Superiore, Pisa, 1997. (Cited on pp. 113, 115) [20] L. Ambrosio. Lecture notes on optimal transport problems. In Mathematical Aspects of Evolving Interfaces, Funchal 2000, Lecture Notes in Math. 1812, Springer-Verlag, Berlin, 2003, 1–52. (Cited on p. 481) [21] L. Ambrosio, A. Braides. Energies in SBV and variational models in fracture mechanics. Homogenization and Applications to Material Sciences 9. D. Cioranescu, A. Damlamian, P. Donato eds., Gakuto, Gakkotosho, Tokyo, Japan, 1997, 1–22. (Cited on pp. 578, 579, 580) [22] L. Ambrosio, G. Buttazzo. An optimal design problem with perimeter penalization. Calc. Var. Partial Differential Equations 1 (1993), 55–69. (Cited on p. 654) [23] L. Ambrosio, G. Buttazzo. Weak lower semicontinuous envelope of functionals defined on a space of measures. Ann. Mat. Pura Appl. 150 (1988), 311–340. (Cited on p. 562) [24] L. Ambrosio, G. Dal Maso. A general chain rule for distributional derivatives. Proc. Amer. Math. Soc. 108 (1990), 691–702. (Cited on p. 430) [25] L. Ambrosio, G. Dal Maso. On the relaxation in BV (Ω; Rm ) of quasi-convex integrals. J. Funct. Anal. 109 (1992), 76–97. (Cited on pp. 460, 461, 465, 466) [26] L. Ambrosio, N. Fusco, D. Pallara. Functions of Bounded Variation and Free Discontinuity Problems. Oxford Mathematical Monographs, Oxford University Press, New York, 2000. (Cited on pp. 547, 579) [27] L. Ambrosio, N. Gigli, G. Savaré. Gradient Flows in Metric Spaces and in the Space of Probability Measures. Lectures in Mathematics ETH Zürich, Birkhäuser Verlag, Basel, 2005. (Cited on pp. 484, 766, 767, 769, 770) [28] L. Ambrosio, A. Pratelli. Existence and stability results in the L1 theory of optimal transportation. In Optimal Transportation and Applications, Martina Franca 2001, Lecture Notes in Math. 1813, Springer, Berlin, 2003, 123–160. (Cited on p. 483) [29] L. Ambrosio, P. Tilli. Topics on Analysis in Metric Spaces. Oxford Lecture Series in Math. Appl. 25, Oxford University Press, Oxford, UK, 2004. (Cited on p. 110) [30] L. Ambrosio, V. M. Tortorelli. Approximation of functionals depending on jumps by elliptic functionals via Γ -convergence. Comm. Pure Appl. Math. 18 (1990), 999–1036. (Cited on pp. 539, 545) [31] C. Amrouche, V. Girault. Decomposition of vector spaces and application to the Stokes problem in arbitrary dimension. Czech. Math. J. 44 (1994), 109–140. (Cited on p. 254) [32] A. S. Antipin. Minimization of convex functions on convex sets by means of differential equations. Differential Equations 30, (1994), 1365–1375. (Cited on p. 709)
i
i i
i
i book
i
2014/8/6 page 773
i
i
Bibliography
773 [33] O. Anza Hafsa. Variational formulations on thin elastic plates with constraints. J. Convex Anal. 12 (2005), 365–382. (Cited on p. 493) [34] O. Anza Hafsa, J.-P. Mandallena. Interchange of infimum and integral. Calc. Var. Partial Differential Equations 18 (2003), 433–449. (Cited on p. 450) [35] O. Anza Hafsa, J.-P. Mandallena. Relaxation of second order geometric integrals and non-local effects. J. Nonlinear Convex Anal. 5 (2004), 295–306. (Cited on p. 493) [36] D. G. Aronson. The porous medium equation. In Nonlinear Diffusion Problems, Lecture Notes in Math. 1224, Springer-Verlag, Berlin, 1986, 1–46. (Cited on p. 724) [37] H. Attouch. Variational convergence for functions and operators. Applicable Mathematics Series, Pitman Advanced Publishing Program, Boston, 1985. (Cited on pp. 162, 438, 489, 490, 532, 589, 677, 745, 749) [38] H. Attouch. Convergences de fonctionnelles convexes Journées d’Analyse Non Linéaire, Besançon. Lecture Notes in Math. 665, Springer-Verlag, Berlin, 1977, pp. 1–40. (Cited on pp. 687, 754, 756) [39] H. Attouch. Viscosity solutions of minimization problems. SIAM J. Optim. 6 (1996), 769–806. (Cited on pp. 536, 579) [40] H. Attouch, J. Bolte, P. Redont, A. Soubeyran. Proximal alternating minimization and projection methods for nonconvex problems. An approach based on the Kurdyka-Lojasiewicz inequality. Math. Oper. Res. 35 (2010), no. 2, 438–457. (Cited on pp. 729, 731, 733) [41] H. Attouch, J. Bolte, B. F. Svaiter. Convergence of descent methods for semi-algebraic and tame problems: Proximal algorithms, forward-backward splitting, and regularized Gauss-Seidel methods. Math. Program. 137 (2013), no. 1, 91–129. (Cited on pp. 729, 731) [42] H. Attouch, G. Bouchitté, M. Mabrouk. Variational formulations for semilinear elliptic equations involving measures. In Nonlinear Variational Problems, Vol. II, A. Marino, A. Murthy, eds., Pitman Res. Notes in Math. 193, 1989, 1–56. (Not cited) [43] H. Attouch, H. Brezis. Duality for the sum of convex functions in general Banach spaces. In Aspects of Mathematics and Its Applications, J. Barroso, ed., North-Holland, Amsterdam, 1986, 125–133. (Cited on pp. 354, 355) [44] H. Attouch, T. Champion. L p regularization of the non-parametric minimal surface problem. Ill-Posed Variational Problems and Regularization Techniques, Lecture Notes in Econom. and Math. Systems 477, 1999, 25–34. (Not cited) [45] H. Attouch, R. Cominetti. L p approximation of variational problems in L1 and L∞ . Nonlinear Anal. 36 (1999), no. 3, 373–399. (Not cited) [46] H. Attouch, A. Damlamian. Strong solutions for parabolic variational inequalities. Nonlinear Anal. 2 (1978), no. 3, 329–353. (Cited on p. 702) [47] H. Attouch, A. Damlamian. Applications des méthodes de convexité et monotonie à l’étude de certaines équations quasi-linéaires. Proc. Roy. Soc. Edinburgh 79A (1977), 107–129. (Cited on p. 702) [48] H. Attouch, A. Damlamian. Problèmes d’évolution dans les espaces de Hilbert et applications. J. Math. Pures Appl. 54 (1975), no. 1, 53–74. (Cited on p. 702) [49] H. Attouch, A. Damlamian. Application of methods of convexity and monotonicity to the study of quasilinear equations. Proc. Roy. Soc. Edinburgh 79A (1977), no. 1–2, 107–129. (Cited on p. 724)
i
i i
i
i book
i
2014/8/6 page 774
i
i
774
Bibliography [50] H. Attouch, X. Goudou, P. Redont. The heavy ball with friction method. The continuous dynamical system, global exploration of the local minima of a real-valued function by asymptotical analysis of a dissipative dynamical system. Commun. Contemp. Math. 2 (2000), no. 1, 1–34. (Cited on p. 665) [51] H. Attouch, C. Picard. Problèmes variationnels et théorie du potentiel non linéaire. Ann. Fac. Sci. Toulouse 1 (1979), 89–136. (Cited on p. 281) [52] H. Attouch, C. Picard. Variational inequalities with obstacles and functional spaces in potential theory. Appl. Anal. 12 (1981), no. 4, 287–306. (Not cited) [53] H. Attouch, C. Picard. Variational inequalities with varying obstacles: The general form of the limit problem. J. Funct. Anal. 50 (1983), no. 3, 329–386. (Not cited) [54] H. Attouch, H. Riahi. Stability results for Ekeland’s -variational principle and cone extremal solutions. Math. Oper. Res. 18 (1993), no. 1, 173–201. (Not cited) [55] H. Attouch, A. Soubeyran. Towards stable routines: Improving and satisficing enough by exploration-exploitation on an unknown landscape. Working paper GREQAM, University of Montpellier, 2004. (Cited on pp. 94, 96, 97, 102) [56] H. Attouch, M. Thera. A general duality principle for the sum of two operators. J. Convex Anal. 3 (1996), no. 1, 1–24. (Not cited) [57] H. Attouch, R. J.-B. Wets. Epigraphical analysis. Ann. Inst. H. Poincaré Anal. Non Linéaire 6 (1989), 73–100. (Cited on p. 76) [58] H. Attouch, R. J.-B. Wets. Quantitative stability of variational systems: I. The epigraphical distance. Trans. Amer. Math. Soc. 328 (1991), no. 2, 695–729. (Not cited) [59] H. Attouch, R.J.-B. Wets. Epigraphical processes: Laws of large numbers for random lsc functions. Sém. Anal. Convexe 13 (1990). (Cited on pp. 506, 507, 519, 520) [60] G. Aubert, R. Deriche, P. Kornprobst. Computing optical flow via variational techniques. SIAM J. Appl. Math. 60 (1999), 156–182. (Cited on p. 597) [61] G. Aubert, P. Kornprobst. A mathematical study of the relaxed optical flow problem in the space BV (Ω). SIAM J. Math. Anal. 30 (1999), 1282–1308. (Cited on p. 597) [62] G. Aubert, P. Kornprobst. Mathematical problems in image processing: Partial differential equations and the calculus of variations. Appl. Math. Sci. 147, Springer-Verlag, New York, 2002. (Cited on p. 597) [63] Th. Aubin. Problèmes isopérimétriques et espaces de Sobolev. J. Diff. Geom., 11 (1976), 573– 598. (Cited on p. 190) [64] J. P. Aubin. Mathematical Methods of Game and Economic Theory. North-Holland, Amsterdam, 1979. (Cited on pp. 355, 382) [65] J. P. Aubin, A. Cellina. Differential Inclusions. Springer, Berlin, 1984. (Not cited) [66] J. P. Aubin, I. Ekeland. Applied Nonlinear Analysis. John Wiley, New York, 1984. (Cited on pp. 678, 679) [67] J. P. Aubin, I. Ekeland. Applied Nonlinear Analysis. John Wiley, New York, 1984. (Cited on pp. 94, 95, 96, 357) [68] J. P. Aubin, H. Frankowska. Set-Valued Analysis. Birkhäuser-Verlag, Boston, 1990. (Cited on p. 76)
Systems Control Found. Appl. 2,
i
i i
i
i book
i
2014/8/6 page 775
i
i
Bibliography
775 [69] A. Auslender. Noncoercive optimization problems. Math. Oper. Res. 21 (1996), 769–782. (Not cited) [70] D. Azé. Eléments d’ Analyse Convexe et Variationnelle. Ellipse, Paris, 1997. (Cited on p. 357) [71] J.-B. Baillon. Un exemple concernant le comportement asymptotique de la solution du problème du + ∂ φ(u) # 0. J. Funct. Anal. 28 (1978), 369–376. (Cited on p. 705) dt [72] J.-B. Baillon. Thèse. Université Paris VI, 1978. (Cited on p. 705) [73] J.-B. Baillon, H. Brézis. Une remarque sur le comportement asymptotique des semi-groupes non linéaires. Houston, J. Math. 2 (1976), 5–7. (Not cited) [74] J.-B. Baillon, A. Haraux. Comportement à l’infini pour les équations d’évolution avec forcing périodique. Arch. Ration. Mech. Anal. 67 (1977), 101–109. (Cited on p. 687) [75] C. Baiocchi, G. Buttazzo, F. Gastaldi, F. Tomarelli. General existence theorems for unilateral problems in continuum mechanics. Arch. Ration. Mech. Anal. 100 (1988), no. 2, 149–189. (Cited on p. 599) [76] C. Baiocchi, F. Gastaldi, F. Tomarelli. Inéquations variationnelles non coercives. C. R. Acad. Sci. Paris 299 (1984), 647–650. (Cited on p. 609) [77] C. Baiocchi, F. Gastaldi, F. Tomarelli. Some existence results on noncoercive variational inequalities. Ann. Scuola Norm. Sup. Pisa (4) 13 (1986), 617–659. (Cited on p. 609) [78] E. J. Balder. Lectures on Young measures theory and its applications in economics. Workshop di Teoria della Misura e Analisi Reale, Grado, 1997, Rend. Instit. Univ. Trieste 31, 2000, 1–69. (Cited on p. 132) [79] J. M. Ball Convexity conditions and existence theorems in nonlinear elasticity. Arch. Ration. Mech. Anal. 63 (1977), 13–23. (Cited on p. 553) [80] J. M. Ball. A version of the fundamental theorem for Young measures. In PDE’s and Continuum Models of Phase Transitions, M. Rascle, D. Serre, M. Slemrod, eds., Lecture Notes in Phys. 344, Springer-Verlag, New York, 1989, 207–215. (Cited on p. 132) [81] J. M. Ball, R. D. James. Fine phase mixtures as minimizers of energy. Arch. Ration. Mech. Anal. 100 (1987), 13–52. (Cited on pp. 468, 475) [82] J. M. Ball, F. Murat. W 1, p -quasiconvexity and variational problems for multiple integrals. J. Funct. Anal. 58 (1984), 225–253. (Cited on p. 445) [83] G. I. Barenblatt. The mathematical theory of equilibrium cracks in brittle fracture. Adv. Appl. Mech. 7 (1962), 55. (Cited on p. 578) [84] E. Barozzi, E. H. A. Gonzalez. Least area problems with a volume constraint. Variational Methods for Equilibrium Problems of Fluids, Astérisque 118 (1984), 33–53. (Cited on p. 644) [85] H. Bauschke, P. Combettes. Convex Analysis and Monotone Operator Theory. CMS Books in Mathematics, Springer, New York, 2011. (Cited on pp. 678, 679) [86] G. Beer. Lipschitz regularization and the convergence of convex functions. Numer. Funct. Anal. Optim. 15 (1994), no. 1–2, 31–46. (Cited on p. 78) [87] G. Beer. Topologies on Closed and Closed Convex Sets. Math. Appl. 263 (1993). (Cited on p. 735) [88] G. Beer. Wijsman convergence: A survey. Set-Valued Anal. 2 (1994), no. 1–2, 77–94. (Cited on p. 735)
i
i i
i
i book
i
2014/8/6 page 776
i
i
776
Bibliography [89] G. Bellettini, A. Coscia. Discrete approximation of a free discontinuity problem. Numer. Funct. Anal. Optim. 15 (1994), no. 3–4, 201–224. (Cited on pp. 538, 578) [90] M. Bellieud, G. Bouchitté. Homogenization of elliptic problems in a fiber reinforced structure. Nonlocal effect. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 26 (1998), no. 3, 407–436. (Cited on p. 501) [91] M. Bellieud, G. Bouchitté. Homogenization of a soft elastic material reinforced by fibers. Asymptot. Anal. 32 (2002), no. 2, 153–183. (Cited on p. 501) [92] M. Belloni, G. Buttazzo, L. Freddi. Completion by Gamma-convergence for optimal control problems. Ann. Fac. Sci. Toulouse Math. 2 (1993), 149–162. (Cited on p. 644) [93] R. Benedetti, J.-J. Risler. Real Algebraic and Semialgebraic Sets. Hermann, Éditeur des Sciences et des Arts, Paris, 1990. (Cited on p. 732) [94] Ph. Benilan, M. G. Crandall. The continuous dependence on φ of solutions of u t − Δφ(u) = 0. Indiana Univ. Math. J. 30 (1981), 61–77. (Cited on p. 724) [95] A. Beurling, J. Deny. Dirichlet spaces. Proc. Nat. Acad. Sci. USA 45 (1959), 208–215. (Cited on p. 198) [96] K. Bhattacharya. Wedge-like microstructure in martensite. Acta Metal. 39 (1991), 2431–2444. (Cited on p. 475) [97] K. Bhattacharya. Self accommodation in martensite. Arch. Ration. Mech. Anal. 120 (1992), 201–244. (Cited on p. 475) [98] K. Bhattacharya, N. Firoozye, R. D. James, R. V. Kohn. Restrictions on microstructures. Proc. Roy. Soc. Edinburgh 124A (1994), 843–878. (Cited on p. 475) [99] K. Bhattacharya, R. V. Kohn. Elastic energy minimization and the recoverable strains of polycrystalline shape-memory materials. Arch. Ration. Mech. Anal. 139 (1997), 99–180. (Cited on pp. 468, 475) [100] E. Bishop, R. Phelps. The support functional of a convex set. Proc. Symp. Pure Math. Amer. Math. Soc. 7 (1962), 27–35. (Cited on p. 94) [101] J. Bochnak, M. Coste, M.-F. Roy. Real Algebraic Geometry. Springer, New York, 1998. (Cited on p. 732) [102] J. Bolte. Continuous gradient projection method in Hilbert spaces. JOTA, 119 (2003), 235–259. (Cited on pp. 709, 710) [103] J. Bolte, A. Daniilidis, A. Lewis. The Łojasiewicz inequality for nonsmooth subanalytic functions with applications to subgradient dynamical systems. SIAM J. Optim., 17 (2007), no. 4, 1205–1223. (Cited on pp. 733, 734) [104] J. Bolte, A. Daniilidis, A. Lewis. A nonsmooth Morse-Sard theorem for subanalytic functions. J. Math. Anal. Appl., 321 (2006), no. 2, 729–740. (Not cited) [105] J. Bolte, A. Daniilidis, A. Lewis, M. Shiota. Clarke subgradients of stratifiable functions. SIAM J. Optim. 18 (2007), no. 2, 556–572. (Cited on pp. 731, 733) [106] J. Bolte, A. Daniilidis, O. Ley, L. Mazet. Characterizations of Łojasiewicz inequalities: subgradient flows, talweg, convexity. Trans. Amer. Math. Soc. 362 (2010), 3319–3363. (Cited on p. 733) [107] J. M. Borwein, A. Lewis. Convex Analysis and Nonlinear Optimization. Canadian Mathematical Society Books in Mathematics, Springer-Verlag, New York, 2000. (Cited on p. 76)
i
i i
i
i book
i
2014/8/6 page 777
i
i
Bibliography
777 [108] J. M. Borwein, A. Lewis, D. Noll. Maximum entropy spectral analysis using first order information. Part I: Fisher information and convex duality. Math. Oper. Res. 21 (1996), 442–468. (Not cited) [109] J. M. Borwein, A. Lewis, R. Nussbaum. Entropy minimization, DAD problems and doubly stochastic kernels. J. Funct. Anal. 123 (1994), 264–307. (Not cited) [110] G. Bouchitté, G. Buttazzo. New lower semicontinuity results for non convex functionals defined on measures. Nonlinear Anal. 15 (1990), 679–692. (Cited on pp. 561, 636) [111] G. Bouchitté, G. Buttazzo. Integral representation of nonconvex functionals defined on measures. Ann. Inst. H. Poincaré Anal. Non Linéaire 9 (1992), 101–117. (Cited on pp. 561, 636, 638) [112] G. Bouchitté, G. Buttazzo. Relaxation for a class of nonconvex functionals defined on measures. Ann. Inst. H. Poincaré Anal. Non Linéaire 10 (1993), 345–361. (Cited on pp. 562, 588, 636) [113] G. Bouchitté, G. Buttazzo. Characterization of optimal shapes and masses through MongeKantorovich equation. J. Eur. Math. Soc., 3 (2001), 139–168. (Cited on p. 485) [114] G. Bouchitté, G. Buttazzo, P. Seppecher. Energies with respect to a measure and applications to low dimensional structures. Calc. Var. Partial Differential Equations 5 (1997), no. 1, 37–54. (Cited on p. 493) [115] G. Bouchitté, G. Buttazzo, P. Seppecher. Shape optimization solutions via Monge-Kantorovich equation. C. R. Acad. Sci. Paris, 324-I (1997), 1185–1191. (Cited on p. 485) [116] G. Bouchitté, C. Jimenez, M. Rajesh. Asymptotique d’un problème de positionnement optimal. C. R. Acad. Sci. Paris, 335 (2002), 835–858. (Cited on p. 485) [117] G. Bouchitté, C. Jimenez, M. Rajesh. Asymptotic analysis of a class of optimal location problems. J. Math. Pures Appl., 95 (2011), 382–419. (Cited on p. 485) [118] G. Bouchitté, P. Suquet. Equi-coercivity of variational problems: The role of recession functions. Proceedings of the Séminaire du Collège de France, Vol. XII (Paris, 1991–1993), Pitman Res. Notes Math. Ser. 302, 31–54. (Cited on p. 632) [119] N. Bourbaki. Elements de mathématiques–Espaces vectoriels topologiques, Act. Sci. Ind. 1189, Hermann, Paris, 1966. (Cited on pp. 41, 609) [120] B. Bourdin, G. A. Francfort, J.-J. Marigo. Numerical experiments in revisited brittle fracture. J. Mech. Phys. Solids 48 (2000), no. 4, 797–826. (Cited on p. 578) [121] A. Bourgeat, A. Mikelic, S. Wright. Stochastic two-scale convergence in the mean and applications. J. Reine Angew. Math. 456 (1994), 19–51. (Cited on p. 559) [122] A. Braides. Approximation of Free-Discontinuity Problems. Lecture Notes in Math. 1694, Springer-Verlag, New York, 1998. (Cited on pp. 435, 567, 568, 578) [123] A. Braides. Γ -Convergence for Beginners. Oxford Lecture Ser. Math. Appl. 22, Oxford University Press, Oxford, UK, 2002. (Cited on pp. 489, 490) [124] A. Braides. Discrete approximation of functionals with jumps and creases. Homogenization, 2001 (Naples), Gakuto Internat. Ser. Math. Sci. Appl. 18, Gakk¯ otosho, Tokyo, 2003, 147– 153. (Cited on p. 579) [125] A. Braides, A. Coscia. A singular perturbation approach to problems in fracture mechanics. Math. Mod. Meth. Appl. Sci. 3 (1993), 302–340. (Cited on p. 579)
i
i i
i
i book
i
2014/8/6 page 778
i
i
778
Bibliography [126] A. Braides, G. Dal Maso. Non-local approximation of the Mumford-Shah functional. Calc. Var. Partial Differential Equations 5 (1997), 293–322. (Cited on p. 535) [127] A. Braides, G. Dal Maso, A. Garroni. Variational formulation of softening phenomena in fracture mechanics: The one dimensional case. Arch. Ration. Mech. Anal. 146 (1999), 23–58. (Cited on p. 580) [128] A. Braides, M. S. Gelli. Continuum limits of discrete systems without convexity hypotheses. Math. Mech. Solids 7 (2002), 41–66. (Cited on p. 591) [129] A. Braides, M. S. Gelli. Limits of discrete systems with long-range interactions. J. Convex Anal. 9 (2002), no. 2, 363–399. (Cited on p. 579) [130] A. Brancolini, G. Buttazzo. Optimal networks for mass transportation problems. ESAIM Control Optim. Calc. Var. 11 (2005), 88–101. (Cited on p. 485) [131] A. Brancolini, G. Buttazzo, F. Santambrogio. Path functionals over Wasserstein spaces. J. Eur. Math. Soc. 8 (2006), 415–434. (Cited on p. 485) [132] A. Brancolini, G. Buttazzo, F. Santambrogio, E. Stepanov. Long-term planning versus shortterm planning in the asymptotical location problem. ESAIM Control Optim. Calc. Var. 15 (2009), 509–524. (Cited on p. 485) [133] Y. Brenier. Extended Monge-Kantorovich theory. In Optimal Transportation and Applications, Martina Franca 2001, Lecture Notes in Math. 1813, Springer-Verlag, Berlin, 2003, 92– 121. (Cited on p. 481) [134] H. Brézis. Monotonicity methods in Hilbert spaces and some applications to nonlinear partial differential equations. Contributions to Nonlinear Funct. Analysis, Madison, 1971, Academic Press, 1971, 101–156. (Cited on pp. 275, 701, 717, 719, 724) [135] H. Brézis. Opérateurs maximaux monotones et semi-groupes de contractions dans les espaces de Hilbert. North-Holland Math. Stud. 5, 1973. (Cited on pp. 678, 679, 686, 701, 751, 753, 754) [136] H. Brézis. Intégrales convexes dans les espaces de Sobolev. Israel J. Math. 13 (1972), 9–23. (Not cited) [137] H. Brézis. Analyse Fonctionnelle, Théorie et Applications. Masson, Paris, 1983. (Cited on pp. 13, 44, 53, 55, 172, 194, 254, 313, 407, 597, 715) [138] H. Brézis, F. E. Browder. A general principle on ordered sets in nonlinear functional analysis. Adv. Math. 21 (1976), no. 3, 353–364. (Cited on pp. 94, 96) [139] H. Brézis, L. Nirenberg. Characterizations of the ranges of some nonlinear operators and applications to boundary value problems. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 5 (1978), 225–326. (Cited on p. 626) [140] H. Brézis, A. Pazy. Convergence and approximation of semigroups of nonlinear operators in Banach spaces. J. Funct. Anal. 9 (1972), 63–74. (Cited on p. 753) [141] A. Brønsted, R. T. Rockafellar. On the subdifferentiability of convex functions. Proc. AMS 16 (1965), no. 4, 605–611. (Cited on p. 736) [142] R. E. Bruck. Asymptotic convergence of nonlinear contraction semigroups in Hilbert spaces. J. Funct. Anal., 18 (1975), 15–26. (Cited on pp. 704, 707) [143] H. Buchwalter. Variations sur l’Analyse. Ellipse, Paris, 1992. (Cited on pp. 60, 62, 121)
i
i i
i
i book
i
2014/8/6 page 779
i
i
Bibliography
779 [144] D. Bucur, G. Buttazzo. Variational Methods in Some Shape Optimization Problems. Lecture notes, Dipartimento di Matematica Università di Pisa and Scuola Normale Superiore di Pisa, Series Appunti di Corsi della Scuola Normale Superiore, 2002. (Cited on pp. 643, 644, 645, 646, 648, 650, 651) [145] D. Bucur, G. Buttazzo. Variational Methods in Shape Optimization Problems. Progress in Nonlinear Differential Equations 65, Birkhäuser Verlag, Basel, 2005. (Cited on pp. 212, 217, 643, 650) [146] D. Bucur, G. Buttazzo, I. Figueiredo. On the attainable eigenvalues of the Laplace operator. SIAM J. Math. Anal. 30 (1999), 527–536. (Cited on pp. 210, 651) [147] G. Buttazzo. Semicontinuity, Relaxation and Integral Representation in the Calculus of Variations. Pitman Res. Notes Math. Ser. 207, Longman, Harlow, UK, 1989. (Cited on pp. 212, 547, 550, 652) [148] G. Buttazzo, G. Dal Maso. Shape optimization for Dirichlet problems: Relaxed solutions and optimality conditions. Bull. Amer. Math. Soc. 23 (1990), 531–535. (Cited on p. 648) [149] G. Buttazzo, G. Dal Maso. Shape optimization for Dirichlet problems: Relaxed formulation and optimality conditions. Appl. Math. Optim. 23 (1991), 17–49. (Cited on pp. 217, 648) [150] G. Buttazzo, G. Dal Maso. An existence result for a class of shape optimization problems. Arch. Ration. Mech. Anal. 122 (1993), 183–195. (Cited on p. 650) [151] G. Buttazzo, L. Freddi. Relaxed optimal control problems and applications to shape optimization. Lecture Notes, NATO-ASI Summer School Nonlinear Analysis, Differential Equations and Control, Montreal, Kluwer, Dordrecht, The Netherlands, 1999, 159–206. (Cited on p. 644) [152] G. Buttazzo, A. Gerolin, B. Ruffini, B. Velichkov. Optimal potentials for Schrödinger operators. Preprint, Dipartimento di Matematica, Università di Pisa, (2013). Available at http://cvgmt.sns.it and at http://arxiv.org. (Cited on pp. 654, 661) [153] G. Buttazzo, M. Giaquinta, S. Hildebrandt. One-Dimensional Variational Problems. An Introduction. Oxford Lecture Ser. Math. Appl. 15, Oxford University Press, New York, 1998. (Cited on pp. 57, 547) [154] G. Buttazzo, S. Guarino, F. Oliviero. Optimal location problems with routing cost. Discrete Contin. Dyn. Syst. 34 (2014), 1301–1317. (Cited on p. 485) [155] G. Buttazzo, P. Guasoni. Shape optimization problems over classes of convex domains. J. Convex Anal. 4 (1997), 343–351. (Cited on p. 646) [156] G. Buttazzo, E. Oudet, E. Stepanov. Optimal transportation problems with free Dirichlet regions. In Variational Methods for Discontinuous Structures, Cernobbio 2001, Progress in Nonlinear Differential Equations 51, Birkhäuser Verlag, Basel, 2002, 41–65. (Cited on p. 485) [157] G. Buttazzo, A. Pratelli, S. Solimini, E. Stepanov. Optimal Urban Networks via Mass Transportation. Lecture Notes in Mathematics 1961, Springer-Verlag, Berlin, 2009. (Cited on p. 485) [158] G. Buttazzo, F. Santambrogio. A model for the optimal planning of an urban area. SIAM J. Math. Anal., 37 (2005), 514–530. (Cited on p. 485) [159] G. Buttazzo, F. Santambrogio. A mass transportation model for the optimal planning of an urban region. SIAM Rev. 51 (2009), 593–610. (Cited on p. 485)
i
i i
i
i book
i
2014/8/6 page 780
i
i
780
Bibliography [160] G. Buttazzo, F. Santambrogio, E. Stepanov. Asymptotic optimal location of facilities in a competition between population and industries. Ann. Sc. Norm. Super. Pisa Cl. Sci. 12 (2013), 239–273. (Cited on p. 485) [161] G. Buttazzo, E. Stepanov. Optimal transportation networks as free Dirichlet regions for the Monge-Kantorovich problem. Ann. Scuola Norm. Sup. Pisa Cl. Sci. Ser. V 2 (2003), 631–678. (Cited on p. 485) [162] C. Castaing, P. Raynaud de Fitte, M. Valadier. Young Measure on Topological Spaces. With Applications in Control Theory and Probability Theory. Math. Appl. 571, Kluwer Academic Publishers, Dordrecht, The Netherlands, 2004. (Cited on pp. 132, 471) [163] L. Caffarelli, M. Feldman, R. J. McCann. Constructing optimal maps for Monge’s transport problem as a limit of strictly convex costs. J. Amer. Math. Soc. 15 (2002), 1–26. (Cited on p. 483) [164] G. Carlier, C. Jimenez, F. Santambrogio. Optimal transportation with traffic congestion and Wardrop equilibria. SIAM J. Control Optim. 47 (2008), 1330–1350. (Cited on p. 485) [165] G. Carlier, F. Santambrogio. A continuous theory of traffic congestion and Wardrop equilibria. In Optimization and Stochastic Methods for Spatially Distributed Information, St. Petersburg 2010, J. Math. Sci. 181, 2012, 792–804. (Cited on p. 485) [166] C. Castaing, M. Valadier. Convex Analysis and Measurable Multifunctions. Lecture Notes in Math. 590, Springer-Verlag, Berlin, 1977. (Cited on p. 449) [167] E. Chabi, G. Michaille. Ergodic theory and application to nonconvex homogenization. Set Valued Anal. 2 (1994), 117–134. (Cited on pp. 558, 559) [168] A. Chambolle. Image segmentation by variational methods: Mumford and Shah functional and the discrete approximations. SIAM J. Appl. Math. 55 (1995), 827–863. (Cited on pp. 535, 579) [169] R. Chill, E. Fašangová. Gradient Systems. Proceedings of the 13th International Internet Seminar, 2010. (Cited on p. 724) [170] M. Chipot, C. Collins, D. Kinderlehrer. Numerical analysis of oscillations in multiple well problems. Numer. Math. 70 (1995), 259–282. (Cited on p. 132) [171] M. Chipot, G. Dal Maso. Relaxed shape optimization: The case of nonnegative data for the Dirichlet problem. Adv. Math. Sci. Appl. 1 (1992), 47–81. (Cited on pp. 648, 649) [172] M. Chipot, D. Kinderlehrer. Equilibrium configurations of crystals. Arch. Ration. Mech. Anal. 103 (1988), 237–277. (Cited on pp. 132, 475) [173] P. G. Ciarlet. The Finite Element Method for Elliptic Problems. Classics Appl. Math. 40, SIAM, Philadelphia, 2002. (Cited on p. 257) [174] P. G. Ciarlet. Mathematical Elasticity, Volume I: Three-Dimensional Elasticity. Holland, Amsterdam, 1988. (Cited on p. 250)
North-
[175] D. Cioranescu, F. Murat. Un terme étrange venu d’ailleurs. In Nonlinear Partial Differential Equations and Their Applications, Collège de France Seminar, Vol. II (Paris, 1979/1980), Res. Notes in Math. 60, Pitman, Boston, 1982, pp. 98–138, 389–390. (Cited on p. 214) [176] F. H. Clarke. Optimization and Nonsmooth Analysis. Classics Appl. Math. 5, SIAM, Philadelphia, 1990. (Not cited) [177] F. H. Clarke. Functional Analysis, Calculus of Variations, and Optimal Control. Graduate Texts in Math., Springer, New York, 2013. (Cited on p. 729)
i
i i
i
i book
i
2014/8/6 page 781
i
i
Bibliography
781 [178] C. Combari, L. Thibault. On the graph convergence of subdifferentials of convex functions. Proc. Amer. Math. Soc. 126 (1998), 2231–2240. (Not cited) [179] G. Cortesani, R. Toader. Nonlocal approximation of nonisotropic free-discontinuity problems. SIAM J. Appl. Math. 59 (1999), 1507–1519. (Cited on p. 535) [180] M. Coste. An Introduction to O-Minimal Geometry. RAAG Notes, Institut de Recherche Mathématiques de Rennes, November 1999. Available at http://perso.univ-rennes1.fr/ michel.coste/. (Cited on p. 732) [181] R. Courant, D. Hilbert. Methoden der Mathematischen Physik. Berlin, Vol. I (1931), Vol. II (1937). English ed.: Interscience, New York, Vol. I (1953), Vol. II (1962). (Cited on p. 13) [182] B. Dacorogna. Direct Methods in the Calculus of Variations. Appl. Math. Sci. 78, SpringerVerlag, Berlin, 1989. (Cited on pp. 445, 476, 547, 554, 557) [183] G. Dal Maso. An Introduction to Γ -Convergence. Birkhäuser, Boston, 1993. (Cited on pp. 466, 488, 489, 490) [184] G. Dal Maso, G. Franfort, R. Toader. Quasi-static evolution in brittle fracture: The case of bounded solutions. In Calculus of Variations: Topics from the Mathematical Heritage of E. De Giorgi, Quad. Mat. 14, Dept. Math., Seconda Univ. Napoli, Caserts, 2004, 245–266. Available at http://cvgmt.sns.it/cgi/get.cgi/papers/dalfratoa04a/. (Cited on p. 578) [185] G. Dal Maso, L. Modica. Nonlinear stochastic homogenization and ergodic theory. J. Reine Angew. Math. 363 (1986), 27–43. (Cited on pp. 443, 500, 506, 559, 582) [186] G. Dal Maso, U. Mosco. Wiener’s criterion and Γ -convergence. Appl. Math. Optim. 15 (1987), 15–63. (Cited on pp. 214, 216) [187] A. Damlamian. Some unilateral Korn inequalities with application to a contact problem with inclusions. C. R. Acad. Sci. Paris Sér. I 350 (2012), 861–865. (Cited on p. 261) [188] A. Damlamian, N. Kenmochi. Uniqueness of the solution of a Stefan problem with variable lateral boundary conditions. Adv. Math. Sci. Appl. 1 (1992), 175–194. (Cited on p. 724) [189] A. Damlamian, N. Kenmochi, N. Sato. The subdifferential operator approach to a class of nonlinear systems for Stefan problems with phase relaxation. Nonlinear Analysis 23 (1994), 115–142. (Not cited) [190] A. Daniilidis. Gradient dynamical systems, tame optimization and applications. Lecture Notes, Spring School on Variational Analysis Paseky nad Jizerou, Czech Republic (2009). (Cited on p. 729) [191] A. Daniilidis, O. Ley, S. Sabourau. Asymptotic behaviour of self-contracted planar curves and gradient orbits of convex functions. J. Math. Pures Appl. 94 (2010), 183–199. (Cited on pp. 729, 733) [192] R. Dautray, J. L. Lions. Analyse mathématique et calcul numérique pour les sciences et les techniques. Masson, Paris, 1984. (Not cited) [193] D. G. De Figueiredo. Lectures at the Tata Institute of Fundamental Research. Springer-Verlag, New York, 1989. (Cited on p. 95) [194] E. De Giorgi. New problems on minimizing movements. In Boundary Value Problems for Partial Differential Equations and Applications, Res. Notes Appl. Math. 29, Masson, Paris, 1993, 81–98. (Cited on p. 766) [195] E. De Giorgi, L. Ambrosio. Un nuovo tipo di funzionale del calcolo delle variazioni. Atti Accad. Naz. Lincei Rend. Cl. Sci. Fis. Mat. Natur. 82 (1988), 199–210. (Cited on p. 562)
i
i i
i
i book
i
2014/8/6 page 782
i
i
782
Bibliography [196] E. De Giorgi, M. Carriero, A. Leaci. Existence theorem for a minimum problem with free discontinuity set. Arch. Ration. Mech. Anal. 108 (1989), 195–218. (Cited on pp. 534, 545, 597) [197] E. De Giorgi, T. Franzoni. Su un tipo di convergenza variazionale. Atti Accad. Naz. Lincei Rend. Cl. Sci. Fis. Mat. Natur. (8) 58 (1975), 842–850. (Cited on p. 488) [198] C. Dellacherie, P. A. Meyer. Probabilité et Potentiel. Hermann, Paris, 1975. (Cited on pp. 57, 127, 138, 209) [199] F. Demengel, R. Temam. Convex functions of measures and applications. Indiana Univ. Math. J. 33 (1984), 673–709. (Cited on p. 466) [200] J. L. Doob. Classical Potential Theory and Its Probabilistic Counterpart. Springer-Verlag, Berlin, 2001. (Cited on p. 209) [201] L. van den Dries. Tame Topology and O-Minimal Structures. London Mathematical Society Lecture Note Series 248, Cambridge University Press, Cambridge, 1998. (Cited on p. 732) [202] L. van den Dries, C. Miller. Geometric categories and o-minimal structures, Duke Math. J. 84 (1996), 497–540. (Not cited) [203] N. Dunford, J. T. Schwartz. Linear Operators. Interscience, New York, 1958. (Cited on pp. 57, 354) [204] G. Duvaut, J. L. Lions. Les inéquations en mécanique et physique. Dunod, Paris, 1972. (Cited on pp. 254, 284) [205] A. Edelman, T. A. Arias, S. T. Smith. The geometry of algorithms with orthogonality constraints. SIAM J. Matrix Anal. Appl. 20 (1998), 303–353. (Cited on p. 732) [206] I. Ekeland. Sur les problèmes variationnels. C. R. Acad. Sci. Paris Sér. A-B 275 (1972), 1057– 1059. (Cited on p. 94) [207] I. Ekeland. Remarques sur les problèmes variationnels. C. R. Acad. Sci. Paris Sér. A-B 276 (1973), 1347–1349. (Cited on p. 94) [208] I. Ekeland. On the variational principle. J. Math. Anal. Appl. 47 (1974), 324–353. (Cited on p. 94) [209] I. Ekeland, R. Temam. Convex Analysis and Variational Problems. Classics Appl. Math. 28, SIAM, Philadelphia, 1999. (Not cited) [210] L. C. Evans, W. Gangbo. Differential equation methods for the Monge-Kantorovich mass transfer problem. Memo. Amer. Math. Soc. 653 (1999). (Cited on pp. 481, 483) [211] L. C. Evans, R. F. Gariepy. Measure Theory and Fine Properties of Functions. Stud. Adv. Math., CRC Press, Boca Raton, FL, 1992. (Cited on pp. 110, 113, 115, 129, 203, 401, 425, 428) [212] K. F. Falconer. The Geometry of Fractal Sets. Cambridge University Press, Cambridge, UK, 1986. (Cited on pp. 115, 117) [213] H. Federer. Geometric Measure Theory. Springer-Verlag, Berlin, 1969. (Cited on pp. 115, 428) [214] G. Fichera. Problemi elastostatici con vincoli unilaterali: Il problema di Signorini con ambigue condizioni al contorno. Atti Accad. Naz. Lincei Mem. Cl. Sci. Fis. Mat. Natur. Sez. I (8) 7 (1963/1964), 91–140. (Cited on pp. 258, 599) [215] G. Fichera. Existence Theorems in Elasticity. Handbuch der Physik, VIa/2, Springer-Verlag, Berlin, 1972. (Cited on p. 258)
i
i i
i
i book
i
2014/8/6 page 783
i
i
Bibliography
783 [216] G. Fichera. Boundary value problems of elasticity with unilateral constraints. Encyclopedia of Physics, Vol.VIa/2 (S. Flügge, ed.), Springer-Verlag, Berlin, 1972, 347–424. (Cited on pp. 258, 284) [217] W. Fleming, R. Rishel. An integral formula for total gradient variation. Arch. Math. 11 (1960), 218–222. (Cited on p. 422) [218] I. Fonseca, S. Müller, P. Pedregal. Analysis of concentration and oscillation effects generated by gradients. SIAM J. Math. Anal. 29 (1998), no. 3, 736–756. (Cited on pp. 143, 469) [219] G. A. Francfort, J.-J. Marigo. Une approche variationnelle de la mécanique du défaut. Actes du 30ème Congrès d’Analyse Numérique: CANum’98 (Arles 1998), ESAIM Proc. 6, 1999, 57–74. (Cited on p. 579) [220] G. A. Francfort, J.-J. Marigo. Cracks in fracture mechanics: A time indexed family of energy minimizers. In Variations of Domain and Free-Boundary Problems in Solid Mechanics (Paris, 1997), Solid Mech. Appl. 66, Kluwer, Dordrecht, The Netherlands, 1999, 197–202. (Cited on p. 579) [221] K. Friedrichs. Spektraltheorie halbbeschränkter Operatorem, I, II. Math. Ann. 109 (1934), 465–487; 685–713. (Cited on p. 13) [222] B. Fuglede. Finely Harmonic Functions. Lecture Notes in Math. 289, Springer-Verlag, Berlin, 1972. (Cited on pp. 209, 210) [223] F. Gastaldi, F. Tomarelli. Some remarks on nonlinear and non-coercive variational inequalities. Boll. U.M.I. (7) 1-B (1987) 143–165. (Cited on p. 627) [224] G. Geymonat. Introduction à la localisation. Cours LMGC, Université Montpellier II, 2003. (Cited on pp. 476, 479) [225] M. Giaquinta, S. Hildebrandt. Calculus of Variations, I, II. Springer-Verlag, Berlin, 1996. (Cited on p. 547) [226] M. Giaquinta, G. Modica. Non linear systems of the type of the stationary Navier-Stokes system. J. Reine Angew. Math. 330 (1982), 173–214. (Cited on p. 258) [227] E. Giusti. Minimal Surfaces and Functions of Bounded Variation. Birkhäuser, Boston, 1984. (Not cited) [228] D. Gilbarg, N. Trudinger. Elliptic Partial Differential Equations of Second Order. SpringerVerlag, Berlin, 1977. (Cited on p. 223) [229] R. Glowinski. Numerical Methods for Nonlinear Variational Problems. Springer-Verlag, New York, 1984. (Not cited) [230] R. Glowinski, J. L. Lions, R. Trémolière. Numerical Analysis of Variational Inequalities. North-Holland, Amsterdam, 1981. (Not cited) [231] M. Gobbino. Finite difference approximation of the Mumford-Shah functional. Comm. Pure Appl. Math. 51 (1998), 197–228. (Cited on p. 535) [232] A. A. Griffith. The phenomenon of rupture and flow in solids. Phil Trans. Roy. Soc. London A 221 (1920), 163–198. (Cited on p. 576) [233] P. Grisvard. Problèmes aux limites dans des domaines avec points de rebroussement. Ann. Fac. Sci. Toulouse Math. 6 (1995), no. 3, 561–578. (Cited on pp. 258, 304) [234] P. Grisvard. Singularities in boundary value problems. Rech. Math. Appl. 22, Masson, Paris, Springer-Verlag, Berlin, 1992. (Cited on pp. 258, 304)
i
i i
i
i book
i
2014/8/6 page 784
i
i
784
Bibliography [235] M. E. Gurtin. On phase transitions with bulk, interfacial, and boundary energy. Arch. Ration. Mech. Anal. 96 (1986), 243–264. (Cited on p. 536) [236] A. Haraux, M. A. Jendoubi. The Łojasiewicz gradient inequality in the infinite-dimensional Hilbert space framework. J. Funct. Anal. 260 (2011), no. 9, 2826–2842. (Cited on p. 729) [237] J. Heinonen, T. Kilpelainen, O. Martio. Nonlinear Potential Theory of Degenerate Elliptic Equations. Clarendon Press, Oxford, UK, 1993. (Cited on p. 211) [238] A. Henrot. Extremum Problems for Eigenvalues of Elliptic Operators. Frontiers in Math., Birkhäuser Verlag, Basel, 2006. (Cited on pp. 654, 658) [239] A. Henrot, M. Pierre. Variation et optimisation de formes. Une analyse géométrique. Math. Appl. 48, Springer, New York, 2005. (Cited on p. 643) [240] C. Hess. Epi-convergence of sequences of normal integrands and strong consistency of the maximum likelihood estimator. Ann. Statist. 24 (1996), 1298–1315. (Cited on pp. 519, 520) [241] D. Hilbert. Mathematical problems. Bull. Amer. Math. Soc. 8 (1900), 437–479. (Cited on p. 1) [242] J. B. Hiriart-Urruty, C. Lemaréchal. Convex Analysis and Minimization Algorithms I, II. Springer-Verlag, New York, 1993. (Not cited) [243] F. Hirsch, G. Lacombe. Eléments d’analyse fonctionnelle. Masson, Paris, 1997. (Cited on pp. 30, 274) [244] R. L. Hughes. The flow of human crowds. Ann. Rev. Fluid Mech. 35 (2003), 169–183. (Cited on p. 485) [245] S. I. Huhjaev, A. I. Vol’pert. Analysis in Classes of Discontinuous Functions and Equations of Mathematical Physics. Martinus Nijhoff, Dordrecht, The Netherlands, 1985. (Cited on p. 411) [246] A. D. Ioffe. On lower semicontinuity of integral functionals I, II. SIAM J. Control Optim. 15 (1977), 521–538 and 991–1000. (Cited on p. 550) [247] A. D. Ioffe, An invitation to tame optimization. SIAM J. Optim. 19 (2009), 1894–1917. (Cited on p. 729) [248] A. D. Ioffe, V. M. Tihomirov. Theory of Extremal Problems. North-Holland, Amsterdam, 1979. (Not cited) [249] O. Iosifescu, C. Licht, G. Michaille. Variational limit of a one dimensional discrete and statistically homogeneous system of material points. Asymptot. Anal. 28 (2001), 309–329. (Cited on p. 595) [250] O. Iosifescu, C. Licht, G. Michaille. Variational limit of a one dimensional discrete and statistically homogeneous system of material points. C. R. Acad. Sci. Paris Sér. I Math. 322 (2001), 575–580. (Cited on p. 595) [251] J. L. Joly. Une famille de topologies sur l’ensemble des fonctions convexes pour lesquelles la polarité est bicontinue. J. Math. Pures Appl. 52 (1973), 421–441. (Cited on p. 3) [252] L. V. Kantorovich. On the transfer of masses. J. Math. Sci. 133 (2006), 1381–1382. (Cited on p. 483) [253] N. Kenmochi. Some nonlinear parabolic variational inequalities. Israel J. Math. 22 (1975), 304–331. (Cited on p. 702)
i
i i
i
i book
i
2014/8/6 page 785
i
i
Bibliography
785 [254] N. Kenmochi. Solvability of nonlinear evolution equations with time-dependent constraints and applications. Bull. Fac. Education Chiba Univ. 39 (1981), 1–87. (Cited on p. 702) [255] T. Kilpelainen, J. Maly. Supersolutions to degenerate elliptic equations on quasi open sets. Comm. Partial Differential Equations 17 (1992), 371–405. (Cited on p. 210) [256] D. Kinderlehrer. Remarks about Signorini’s problem in linear elasticity. Ann. Scuola Norm. Sup. Pisa (8) 4 (1981), 605–645. (Cited on p. 261) [257] D. Kinderlehrer, P. Pedregal. Characterization of Young measures generated by gradients. Arch. Ration. Mech. Anal. 119 (1991), 329–365. (Cited on p. 469) [258] D. Kinderlherer, G. Stampacchia. An Introduction to Variational Inequalities and Their Applications. Classics Appl. Math. 31, SIAM, Philadelphia, 2000. (Cited on p. 629) [259] Y. Kobayashi. Difference approximation of Cauchy problems for quasi-dissipative operators and generation of nonlinear semigroups. J. Math. Soc., Japan 27 (1975), 640–665. (Cited on p. 694) [260] K. Kobayashi, Y. Kobayashi, S. Oharu. Nonlinear evolution operators in Banach spaces. Osaka J. Math. 21 (1984), 281–310. (Cited on p. 694) [261] R. V. Kohn, G. Strang. Optimal design and relaxation of variational problems, I, II, III. Comm. Pure Appl. Math. 39 (1986), 113–137, 139–182, 353–377. (Cited on p. 651) [262] U. Krengel. Ergodic Theorems. Studies in Mathematics, Walter de Gruyter, Berlin, 1985. (Cited on pp. 443, 559) [263] M. Kubo. Characterization of a class of evolution operators generated by time-dependent subdifferentials. Funk. Ekvac. 32 (1989), 301–321. (Cited on p. 702) [264] S. G. Krantz, H. R. Parks. The Geometry of Domains in Space. Birkhäuser Advanced Texts, Boston, 1999. (Cited on p. 511) [265] M. Kunze, M. D. P. Monteiro Marques. An introduction to Moreau’s sweeping process. In Impacts in Mechanical Systems, Lecture Notes in Phys. 551, Springer-Verlag, Berlin, 2000, 1–60. (Cited on p. 702) [266] K. Kurdyka. On gradients of functions definable in o-minimal structures. Ann. Inst. Fourier 48 (1998), 769–783. (Cited on p. 731) [267] K. Kurdyka, A. Parusinski. w f -stratification of subanalytic functions and the Łojasiewicz inequality. C. R. Acad. Sci. Paris Sér. I 318 (1994). (Cited on p. 725) [268] K. Kurdyka, T. Mostowski, A. Parusinski. Proof of the gradient conjecture of R. Thom. Ann. Math. 152 (2000), 763–792. (Cited on p. 729) [269] T. Lachand-Robert, M. A. Peletier. An example of non-convex minimization and an application to Newton’s problem of the body of least resistance. Ann. Inst. H. Poincaré Anal. Non Linéaire 18 (2001), no. 2, 179–198. (Cited on p. 646) [270] B. Larrouturou, P. L. Lions. Méthodes Mathématiques pour les Sciences de l’ingénieur, Optimisation et Analyse Numérique. Ecole Polytechnique, Palaiseau, 1996. (Not cited) [271] H. Le Dret, A. Raoult. The nonlinear membrane model as variational limit in nonlinear threedimensional elasticity. J. Math. Pures Appl. (9) 74 (1995), no. 6, 549–578. (Cited on p. 493) [272] L. Leghmizi, C. Licht, G. Michaille. The nonlinear membrane model: A Young measure and varifold formulation. ESAIM: COCV 11 (2005), 449–472. (Cited on p. 493)
i
i i
i
i book
i
2014/8/6 page 786
i
i
786
Bibliography [273] A. S. Lewis, J. Malick. Alternating projection on manifolds. Math. Oper. Res. 33 (2008), no. 1, 216–234. (Cited on pp. 732, 733) [274] C. Licht, G. Michaille. Global-local subadditive ergodic theorems and application to homogenization in elasticity. Ann. Math. Blaise Pascal 9 (2002), 21–62. (Cited on pp. 443, 500, 515) [275] E. H. Lieb. Sharp constants in the Hardy-Littlewood-Sobolev and related inequalities. Ann. Math. 118 (1983), no. 2, 349–374. (Cited on p. 190) [276] J. L. Lions. Quelques méthode de résolution des problèmes aux limites non linéaires. Dunod, Paris, 1969. (Not cited) [277] J. L. Lions, G. Stampacchia. Variational inequalities. Comm. Pure Appl. Math. 20 (1967), 493–519. (Cited on p. 607) [278] S. Łojasiewicz. Une propriété topologique des sous-ensembles analytiques réels. Les Équations aux Dérivées Partielles 87–89, Éditions du centre National de la Recherche Scientifique, Paris, 1963. (Cited on pp. 725, 730) [279] S. Łojasiewicz. Sur les trajectoires du gradient d’une fonction analytique. Seminari di Geometria, Bologna (1982/83), Università degli Studi di Bologna, Bologna, 1984, 115–117. (Cited on p. 726) [280] S. Łojasiewicz. Sur la géométrie semi- et sous-analytique. Ann. Inst. Fourier 43 (1993), 1575– 1595. (Not cited) [281] S. Luckhaus, L. Modica. The Gibbs-Thompson relation within the gradient theory of phase transitions. Arch. Ration. Mech. Anal. 107 (1989), 71–83. (Cited on p. 466) [282] A. Mainik, A. Mielke. Existence results for energetic models for rate-independent systems. Calc. Var. Partial Differential Equations 22 (2005), 73–99. (Cited on p. 769) [283] A. S. Lewis, J. Malick. Alternating projection on manifolds. Math. Oper. Res. 33 (2008), no. 1, 216–234. (Cited on pp. 732, 733) [284] J.-P. Mandallena. On the relaxation of nonconvex superficial integral functionals. J. Math. Pures Appl. 79 (2000), 1011–1028. (Cited on p. 493) [285] J.-P. Mandallena. Quasiconvexification of geometric integrals. Ann. Mat. Pura Appl. 184 (2005), 473–493. (Cited on p. 493) [286] P. Marcellini. Approximation of quasiconvex functions, and lower semicontinuity of multiple integrals. Manuscripta Math. 51 (1985), 1–28. (Cited on p. 554) [287] C. M. Marle. Mesure et probabilités. Hermann, Paris, 1974. (Cited on pp. 121, 129, 538) [288] R. H. Martin. Nonlinear Operators and Differential Equations in Banach Spaces. Willey, New York, 1976. (Not cited) [289] B. Maury, A. Roudneff-Chupin, F. Santambrogio. A macroscopic crowd motion model of gradient flow type. Math. Models Methods Appl. Sci. 20 (2010), 1787–1821. (Cited on p. 485) [290] B. Maury, A. Roudneff-Chupin, F. Santambrogio, J. Venel. Handling congestion in crowd motion modeling. Netw. Heterog. Media 6 (2011), 485–519. (Cited on p. 485) [291] K. Messaoudi, G. Michaille. Stochastic homogenization of nonconvex integral functionals. Math. Modelling Numer. Anal. 28 (1994), no. 3, 329–356. (Cited on pp. 528, 559, 582)
i
i i
i
i book
i
2014/8/6 page 787
i
i
Bibliography
787 [292] G. Minty. A theorem on maximal monotone sets in Hilbert space. J. Math. Anal. Appl. 11 (1965), 434–439. (Cited on p. 678) [293] L. Modica. The gradient theory of phase transitions and the minimal interface criterion. Arch. Ration. Mech. Anal. 98 (1987), 123–142. (Cited on p. 536) [294] B. Mohammadi, J.-H. Saïac. Pratique de la Simulation Numérique. Dunod, Paris, 2003. (Not cited) [295] G. Monge. Mémoire sur la théorie des déblais et des remblais. Histoire Acad. Sciences Paris, 1781, 666–704. (Cited on p. 480) [296] J. J. Moreau. Fonctionnelles convexes. Cours Collège de France 1967, CNR, Facoltà di Ingegneria di Roma, Roma, 2003. (Cited on pp. 352, 353) [297] J. J. Moreau. Décomposition orthogonale d’un espace hilbertien selon deux cônes mutuellement polaires. C. R. Acad. Sci. Paris Sér. A 225 (1962), 238–240. (Cited on p. 708) [298] J. J. Moreau. Numerical aspects of the sweeping process. Computer Methods Appl. Mech. Eng. 177 (1999), 329–349. (Cited on p. 702) [299] B. Mordukhovich. Variational Analysis and Generalized Differentiation. I. Basic Theory. Grundlehren Math. Wiss. 330, Springer-Verlag, Berlin, 2006. (Cited on pp. 729, 730) [300] J. M. Morel, S. Solimini. Variational Models in Image Segmentation. Birkhäuser-Boston, Boston, 1995. (Cited on p. 597) [301] F. Morgan. Geometric Measure Theory: A Beginners Guide. Academic Press, New York, 1988. (Cited on p. 414) [302] C. B. Morrey. Multiple Integrals in the Calculus of Variations. Springer-Verlag, Berlin, 1966. (Cited on pp. 192, 547, 552, 554) [303] C. B. Morrey. Quasiconvexity and the semicontinuity of multiple integrals. Pacific J. Math. 2 (1952), 25–53. (Cited on p. 554) [304] U. Mosco. Convergence of convex sets and of solutions of variational inequalities. Adv. Math. 3 (1969), 510–585. (Cited on p. 3) [305] U. Mosco. On the continuity of the Young-Fenchel transformation. J. Math. Anal. Appl. 35 (1971), 518–535. (Cited on p. 3) [306] J. Moser. A Harnack inequality for paraboloc differential equations. Comm. Pure Appl. Math. 17 (1964), 101–134. (Cited on p. 196) [307] S. Müller, V. Sverak. Attainment results for the two-well problem by convex integration. In Geometric Analysis and the Calculus of Variations, International Press, Cambridge, MA, 1996, 239–251. (Cited on p. 475) [308] D. Mumford, J. Shah. Optimal approximation by piecewise smooth functions and associated variational problem. Comm. Pure Appl. Math. 17 (1989), 577–685. (Cited on p. 534) [309] F. Murat, L. Tartar. Optimality conditions and homogenization. Proceedings of Nonlinear Variational Problems, Isola d’Elba 1983, Res. Notes Math. 127, Pitman, London, 1985, 1–8. (Cited on pp. 498, 651) [310] J. Neˇcas. Les méthodes directes en théorie des équations elliptiques. Masson, Paris, 1967. (Cited on pp. 223, 597)
i
i i
i
i book
i
2014/8/6 page 788
i
i
788
Bibliography [311] X. X. Nguyen, H. Zessin. Ergodic theorems for spatial processes. Z. Wah. Vew. Gebiette 48 (1979), 133–158. (Cited on pp. 508, 559) [312] Z. Opial. Weak convergence of the sequence of successive approximations for nonexpansive mappings. Bull. Amer. Math. Soc. 73 (1967), 591–597. (Not cited) [313] M. Otani. Nonlinear evolution equations with time-dependent constraints. Adv. Math. Sci. Appl. 3 (1993/1994), special issue, 383–399. (Cited on p. 702) [314] F. Otto. The geometry of dissipative evolution equations: The porous medium equation. Comm. Partial Differential Equations 26 (2001), 101–174. (Cited on p. 724) [315] J. Palis, W. De Melo. Geometric Theory of Dynamical Systems. An Introduction. SpringerVerlag, New York, 1982. (Cited on p. 669) [316] P. Pedregal. Parametrized Measures and Variational Principle. Birkhäuser-Boston, Boston, 1997. (Cited on pp. 469, 476) [317] P. Pedregal. Laminates and microstructure. European J. Appl. Math. 4 (1993), 121–149. (Cited on p. 475) [318] R. L. Pego. Front migration in the nonlinear Cahn-Hilliard equation. Proc. Roy. Soc. Lond. Ser. A 422 (1989), 261–278. (Cited on p. 724) [319] J. Peypouquet, S. Sorin. Evolution equations for maximal monotone operators: Asymptotic analysis in continuous and discrete time. J. Convex Anal. 17 (2010), no. 3–4, 1113–1163. (Cited on p. 696) [320] R. R. Phelps. Convex Functions, Monotone Operators and Differentiability. Lecture Notes in Math. 1364, Springer-Verlag, New York, 1989. (Cited on pp. 356, 357, 736, 737) [321] R. R. Phelps. Lectures on maximal monotone operators. Extracta Math. 12 (1997), no. 3, 193–230. (Cited on p. 736) [322] O. Pironneau. Optimal Shape Design for Elliptic Systems. Springer Ser. Comput. Phys., Springer-Verlag, New York, 1984. (Cited on pp. 643, 644) [323] P. A. Raviart, J. M. Thomas. Introduction à l’analyse numérique des équations aux dérivées partielles. Masson, Paris, 1983. (Cited on p. 332) [324] J. R. Rice. Mathematical analysis in the mechanics of fracture. In Fracture: An Advanced Treatise, H. Liebowitz, ed., Academic Press, New York, 1969, 191–311. (Cited on p. 576) [325] R. T. Rockafellar. Convex Analysis. Princeton University Press, Princeton, NJ, 1970. (Cited on pp. 353, 600, 609) [326] R. T. Rockafellar. Integrals which are convex functionals. Pacific J. Math. 24 (1968), no. 3, 525–539. (Not cited) [327] R. T. Rockafellar. Convex integral functionals, II. Pacific J. Math. 39 (1971), 439–469. (Not cited) [328] R. T. Rockafellar. Conjugate convex functions in optimal control and the calculus of variations. J. Math. Appl. 32 (1970), 174–222. (Not cited) [329] R. T. Rockafellar. On the maximal monotonicity of subdifferential mappings. Pacific J. Math., 44 (1970), 209–216. (Cited on p. 743) [330] R. T. Rockafellar, R. J. B. Wets. Variational analysis. Grundlehren Math. Wiss. 317, SpringerVerlag, Berlin, 1998. (Cited on pp. 1, 729, 734)
i
i i
i
i book
i
2014/8/6 page 789
i
i
Bibliography
789 [331] W. Rudin. Analyse réelle et complexe. Masson, Paris, 1975. (Cited on pp. 60, 121) [332] M. Shiota. Geometry of Subanalytic and Semialgebraic Sets. Prog. Math. 150, BirkhäuserBoston, Boston, 1997. (Cited on p. 732) [333] A. Signorini. Questioni di elastostatica linearizzata e semilinearizzata. Rend. Mat. Appl. 5 (1959), 95–139. (Cited on p. 258) [334] H. A. Simon. Bounded rationality in social science: Today and tomorrow. Mind Society 1 (2000), 25–39. (Cited on pp. 97, 740) [335] J. Simon. Démonstration constructive d’un théorème de G. de Rham. C. R. Acad. Sci. Paris Sér. I 316 (1993), 1167–1172. (Cited on p. 263) [336] S. Sobolev. Methode nouvelle a résoudre le probléme de Cauchy pour les équations linéaires hyperboliques normales. Mat. Sbornik 1 (1936), 39–72. (Cited on p. 13) [337] J. Sokolowski, J.-P. Zolesio. Introduction to Shape Optimization. Shape Sensitivity Analysis. Springer Ser. in Comput. Math. 16, Springer-Verlag, Berlin, 1992. (Cited on pp. 643, 644) [338] G. Stampacchia. Formes bilinéaires coercitives sur les ensembles convexes. C. R. Acad. Sci. Paris 258 (1964), 4413–4416. (Cited on p. 599) [339] V. N. Sudakov. Geometric problems in the theory of infinite-dimensional probability distributions. Proc. Steklov Inst. Math. (1979), 1–178. (Cited on p. 483) [340] V. Sverak. New examples of quasiconvex functions. Arch. Ration. Mech. Anal. 119 (1992), 293–300. (Cited on p. 445) [341] V. Sverak. Quasiconvex functions with subquadratic growth. Proc. Roy. Soc. London Ser. A 433 (1991), 723–725. (Cited on p. 445) [342] V. Sverak. Rank-one convexity does not imply quasiconvexity. Proc. Roy. Soc. Edinburgh 120A (1992), 185–189. (Cited on pp. 445, 557) [343] V. Sverak. On the problem of two well. In Microstructure and Phase Transition, D. Kinderlehrer et al. eds., IMA Vol. Math. Appl. 54, Springer-Verlag, New York, 1993, 183–190. (Cited on p. 475) [344] M. A. Sychev. A new approach to Young measure theory, relaxation and convergence in energy. Ann. Inst. Henri Poincaré 16 (1999), no. 6, 773–812. (Cited on pp. 132, 469) [345] G. Talenti. Best constants in Sobolev inequality. Ann. Mat. Pura Appl. 110 (1976), 353–372. (Cited on p. 190) [346] L. Tartar. H -measures, a new approach for studying homogenization, oscillations and concentration effects in partial differential equations. Proc. Roy. Soc. Edinburgh 115A (1990), 193–230. (Cited on p. 143) [347] L. Tartar. An introduction to the homogenization method in optimal design. In Optimal Shape Design, Lecture Notes in Math. 1740, Springer-Verlag, Berlin, 2000, 47–156. (Cited on p. 643) [348] R. Temam. Problèmes Mathématiques en Plasticité. Gauthier-Villars, Paris, 1983. (Cited on pp. 395, 563, 571, 574, 575) [349] L. Thibault. Sequential convex subdifferential calculus and sequential Lagrange multipliers. SIAM J. Control Optim. 35 (1997), 1434–1444. (Not cited) [350] R. Thom. Problèmes rencontrés dans mon parcours mathématique: un bilan. IHES Publ. Math. 70 (1989), 199–214. (Cited on p. 729)
i
i i
i
i
i
i
790
book 2014/8/6 page 790 i
Bibliography [351] D. Torralba. Applications aux transitions de phases et à la méthode barrière logarithmique. Thèse de l’Université Montpellier II, 1996. (Cited on p. 536) [352] N. S. Trudinger. On Harnack type inequalities and their applications to quasilinear elliptic equations. Comm. Pure Appl. Math. 20 (1967), 721–747. (Cited on p. 196) [353] N. S. Trudinger, X. J. Wang. On the Monge mass transfer problem. Calc. Var. Partial Differential Equations 13 (2001), 19–31. (Cited on p. 483) [354] M. Valadier. Young measures. In Methods of Nonconvex Analysis, A. Cellina ed., Lecture Notes in Math. 1446, Springer-Verlag, Berlin, 1990, 152–188. (Cited on pp. 132, 133) [355] M. Valadier. A course on Young measures. Rend. Instit. Mat. Univ. Trieste 26 suppl. (1994), 349–394. (Cited on pp. 132, 133) [356] J. L. Vàzquez. The Porous Medium Equation. Mathematical Theory. Oxford Math. Monogr., Oxford University Press, Oxford, UK, 2007. (Cited on p. 724) [357] P. Villagio A unilateral contact problem in linear elasticity. J. Elasticity 10 (1980), 113–119. (Cited on p. 258) [358] C. Villani. Topics in Optimal Transportation. Grad. Stud. Math. 58, American Mathematical Society, Providence, RI, 2003. (Cited on pp. 481, 484) [359] C. Villani. Optimal Transport, Old and New. Grundlehren Math. Wiss. 338, Springer-Verlag, Berlin, 2009. (Cited on pp. 481, 484) [360] J. G. Wardrop. Some theoretical aspects of road traffic research. Proc. Inst. Civ. Eng. 2 (1952), 325–378. (Cited on p. 485) [361] K. Yosida. Functional Analysis. Springer-Verlag, New York, 1971. (Cited on pp. 53, 254, 451) [362] L. C. Young. Lectures on Calculus of Variations and Optimal Control Theory. W. B. Saunders, Philadelphia, 1969. (Not cited) [363] E. Zeidler. Nonlinear functional analysis and its applications III: Variational method. In Variational Methods and Optimization, Springer-Verlag, Berlin, 1985. (Cited on p. 96) [364] E. Zeidler. Nonlinear Functional Analysis and Its Applications, Part II: Monotone Operators. Springer-Verlag, Berlin, 1989. (Cited on p. 678) [365] K. Zhang. Rank-one connections at infinity and quasiconvex hulls. J. Convex Anal. 7 (2000), no. 1, 19–45. (Cited on p. 475) [366] W. P. Ziemer. Weakly Differentiable Functions. Springer-Verlag, Berlin, 1989. (Cited on pp. 110, 111, 203, 209, 210, 417, 421, 424, 447)
i
i i
i
i book
i
2014/8/6 page 791
i
i
Index absolutely continuous, 120 Aff0 (D, Rm ), 444 Agmon–Douglis–Nirenberg theorem, 715 α = ap limx→x0 f (x), 410 ap lim infx→x0 f (x), 411 ap lim supx→x0 f (x), 411 approximate derivative, 428 approximate limit, 410 approximate limit inf, 411 approximate limit sup, 411 approximation theorem strong version, 753 weak version, 750 asymptotic analysis, t → +∞, 702 Attouch theorem, 745 (Ω), 119 backward implicit discrete scheme, 694 Baillon counterexample, 705 book shifting, 482 Brønsted–Rockafellar theorem, 737 Bruck theorem, 704 BV (Ω), 393 BV (Ω, Rm ), 404 C# (Y ), 451 C0 (Ω), 59 C0 (Ω, Rm ), 124 C1 -diffeomorphism, 167 Cantor part, 427 Cantor set, 117 Cantor–Vitali function, 428, 567 capacitary measures, 216, 217 capacity, 197, 203 Cap p (·), 203 Carathéodory criterion, 108 Cauchy–Lipschitz theorem, 666
Cauchy–Riemann, 9 Cb (Ω, Rm ), 127 Cb (Ω; E), 132 Cc (Ω), 15 C1c (Ω), 16 Cc (Ω, Rm ), 124 Chernoff lemma, 697 Cioranescu–Murat example, 217 Cm (Ω), 24 coercive, 74 σ -coercive, 632 coercivity, 82 coincidence set, 280 complementary problem, 363 complementary slackness condition, 368, 378 concentration, 48 congestion effect, 485 convolution, 18 Courant–Fisher, 318, 319, 321 covering fine, 110 crowd movements, 485 cyclically monotone operator, 742
(Ω), 16
(Ω), 16 Δ, 7 De La Vallée–Poussin criterion, 567, 568 definable sets and functions, 733 demiclosed operator, 679 density point, 409 descent property, 663 desingularizing function, 726 diam(E), 105 diffusion in random media, 759 dimH , 116 Dirac mass, 15, 26 Dirichlet, 642
Dirichlet Laplacian, 214 Dirichlet problem, 30 distribution, 16 derivation, 23 div, 9 domain, 74 dom f , 74 dual function, 376, 380, 383 dual problem, 374, 376 dual value, 376 duality gap, 376, 377, 380 duality map, 738 Dunford–Pettis theorem, 57, 139 Eberlein–Smulian theorem, 53 eigenvalue, 307–309 eigenvector, 308, 309, 313, 316, 318 Ekeland’s -variational principle, 94 epi-sum, 338, 352 epiconvergence, 490 epi f , 76 -subdifferential, 737 ergodic dynamical system, 581 ergodic theorem, 47 exact minorant, 355, 357 exponential formula, 694 extension operator, 167 extension theorem, 171 exterior cone condition, 650 0 , 174 f #e g , f #g , 338 f ∗ , 345 Fenchel extremality relation, 356, 359, 386 Fenchel–Moreau theorem, 346
791
i
i i
i
i book
i
2014/8/6 page 792
i
i
792 Legendre–Fenchel conjugate, 345 transform, 640 fine topology, 210 finely open, 211 finite perimeter (set of), 416 Fréchet subdifferential, 730 free boundary problem, 280, 281, 648 Γ − lim infn→+∞ Fn , 489 Γ − lim supn→+∞ Fn , 489 Γ -convergence, 488 γ -convergence, 212 Galerkin approximating method, 285 Galerkin method, 71 Gâteaux differentiability, 94 Gauss, 8 Gauss–Green formula, 416, 421 generalized minimizing movement, 768 generalized solution, 440 geodesically convex, 770 gradient flows, 663 gradient-projection dynamics, 708 graph-convergence of operators, 734 H (Ω), 146 H01 (Ω), 147 H −1 (Ω), 160 s , 108 H s (RN ), 177 Hahn–Banach separation theorem, 88 Hahn–Banach theorem, 333, 472 Hamilton–Jacobi equation, 677 harmonic function, 7 hat function, 289, 290 Hausdorff dimension, 115 Hausdorff measure, 105, 108 Hausdorff outer measure, 105 Hilbert, 12 1
implicit Euler scheme, 768 infcompact function, 82 inner measure-theoretic normal, 415 interpolant, 293 irrigation problems, 485 isoperimetric problem, 644 J u, C u, 427 jump part, 427
Index jump point, 414 jump set, 414, 415, 424 K ∞ , 609 Kantorovich problem, 484 Karush–Kuhn–Tucker optimality conditions, 364, 368 kernel, 10 Kobayashi inequality, 695 Kurdyka–Łojasiewicz inequality, 730 " N , 110 L1λ (Ω, Rm ), 120 Lw (Ω, M(E)), 133 Lagrange multiplier vector, 369 characterization of, 369 Lagrangian, 375 Laplace, 8 Lax–Milgram theorem, 65 Lebesgue–Nikodým, 636 Legendre–Fenchel conjugate, 383 Liapunov function, 669 limit analysis, 641 limiting-subdifferential, 730 linear heat equation, 714 linear least squares problems, 712 linearly regular intersection, 733 local slope, 768 Łojasiewicz convergence theorem, 726 Łojasiewicz inequality, 706, 726 lower semicontinuous regularization, 82 Lyapunov function, 664, 707, 710, 711, 727 M(Ω, Rm ), 119 M(Ω), 119 Mb (Ω), 119 Mm×N , 441 μA, 119 μ+ , μ− , 120 μ = (μ x )x∈Ω ⊗ σ , 129 marginal function, 370, 384 marginal measures, 481 Markov inequality, 137 mass transportation problems, 480 maximal monotone operator, 670 maximal slope, 766 measure Borel, 119
bounded, 120 counting, 636 Radon, 111, 120 regular, 60, 119 signed, 119 support, 119 theoretical boundary, 410 exterior, 409 interior, 409 total variation, 120 metric derivative, 767 metric regularity, 733 minimizing movements, 768 minimizing property, 702 Minty theorem, 678 mollifier, 18 Monge problem, 481 monotone operator, 670 Moreau decomposition theorem, 708 Moreau–Yosida approximation, 670 Moreau–Yosida regularization, 738, 739 Mosco-convergence, 736, 744 mountain pass theoerm, 95 na r
, 132 narrow, 395 narrow convergence of Young measures, 132 Neumann, 641 Neumann boundary condition, 32 Neumann problem, 32 Newton problem, 645 Newtonian potential, 7, 27 nodes, 289, 290 nonautonomous, 700 normal cone, 361 normalization condition, 745 o-minimal structures, 732 obstacle problem, 260, 279 Opial lemma, 704 optimal distribution of conductors, 651 optimal potentials, 654 optimal value, 377 oscillations, 47 ∂ f , 355 ∂M E, 410 ∂ r E, 417
i
i i
i
i
i
i
Index Palais–Smale compactness condition, 95 Palis–De Melo counterexample, 669 periodic in law, 523, 759 perturbation function, 382, 383, 385, 387 Picard iterative method, 69 Poincaré inequality, 161 Poincaré–Wirtinger inequality, 173, 420 Poisson equation, 8 primal problem, 375 primal value, 376 proper, 74 proximal sequence, 694 push forward operator, 480 Q f , 442 quasi-autonomous, 700 quasi-closed, 209 quasi-continuous, 203, 209 representant, 363 quasi-convex envelope, 442, 476 quasi-everywhere, 209 quasi-lower semicontinuous, 209 quasi-open, 209 quasi-static evolution, 769 ρ ∗ μ, 126 Rademacher, 401 Radon measure, 23 Radon–Nikodým theorem, 120 raffic models, 485 random convex integrand, 759 rarefaction point, 409 Rayleigh Courant–Rayleigh formula, 324 quotient, 317 real-valued analytic functions, 724 recession cone, 609 recession function, 459, 502, 529, 560, 572, 636 recession functional, 600 reduce boundary, 416 reflexive, 52 regular point, 414 regular sequence of sets, 500 regular triangulation, 291 regularizing effect, 689 relaxation, 456 relaxation scheme, 475 relaxed gradient-projection dynamic, 710
book 2014/8/6 page 793 i
793 relaxed problem, 82, 440 Rellich–Kondrakov compact embedding theorem, 172 Rellich–Kondrakov theorem, 166 resolvent, 670, 740 resolvent operator, 212 Riemann, 9 Riesz representation theorem, 20, 45, 60, 65, 124 Riesz theorem, 39, 83 Rockafellar theorem, 353 Rockafellar theorem (maximal monotonicity of the subdifferential), 740 S f , 414 σC , 335 saddle point, 376, 377 saddle value problems, 376 SBV (Ω), 429 SBV (Ω, Rm ), 429 Schrödinger operator, 654 self-similar set, 117 semialgebraic function, 731 semialgebraic set, 731 semiflow, 664 semigroup, 664 separation of variable method, 307, 332 set convergence, 488 set of class C1 , 167 shape optimization, 643 Signorini problem, 258 singular, 120 Slater generalized, 385 Slater qualification assumption, 364, 366, 367, 371, 373, 374, 377, 385 slicing decomposition, 129 Sobolev spaces, 23 Spectral functionals, 214 spt, 16, 119 steepest descent, 766 classical continuous, 663 for coupled systems, 713 generalized, 677 Stefan problem, 717 Stokes problem, 33 subadditive, 636 subadditivity, 106, 636 subdifferential, 355 superharmonic function, 210 support function, 335
sweeping process, 702 tangent cone, 361 Tarski–Seidenberg principle, 731 test function, 15, 30 Thom conjecture, 729 tightness for nonnegative Borel measures, 127 for Young measure, 133 transport plans, 483 transportation maps, 480 transportation networks, 485 u − , u + , 424 uniformly convex, 50, 53 uniformly integrable, 55, 138, 139, 142, 469, 470, 478 uniformly proper, 744 unilateral convex set, 281 upper gradient, 767 urban planning, 485 ˆ 173 v, value function, 384 variational inequality, 259, 280 Vitali convergence theorem, 57 Vitali’s covering theorem, 110 Von Neumann’s minimax theorem, 382
W −m, p (Ω), 161 W 1, p (Ω), 146 1, p W0 (Ω), 147 W 1, p (Ω, Rm ), 441 Wasserstein distance, 484 weak asymptotic convergence, 704 weak convergence, 39 weak solution, 30 weak topology, 39, 42 weak∗ topology, 59 Weierstrass example, 12 theorem, 83 6 (Ω; E), 132 y ,S x, 96 Yosida approximation, 670 Young measures, 132 W 1, p -Young measures, 468 associated with functions, 136 generated by functions, 137
i
i i
i