English Pages 384 [378] Year 2024
Ammar Khanfer
Applied Functional Analysis
Ammar Khanfer Department of Mathematics and Sciences Prince Sultan University Riyadh, Saudi Arabia
ISBN 978-981-99-3787-5 ISBN 978-981-99-3788-2 (eBook) https://doi.org/10.1007/978-981-99-3788-2 Mathematics Subject Classification: 46B70, 46B50, 46A22, 47B07, 47B38, 47B99, 46B25, 46A30, 54E52, 46C05, 35D30, 35J20, 35A15 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Preface
The present book is the third volume of our series in advanced analysis, whose first two volumes are
• Volume 1: “Measure Theory and Integration”.
• Volume 2: “Fundamentals of Functional Analysis”.

The field of applied functional analysis is concerned with the applications of functional analysis to different areas of applied mathematics and includes various subfields and research directions. Historically, functional analysis emerged as a consequence of investigations into the minimization problems of the calculus of variations (COV), but it soon thrived when connected to the theory of partial differential equations. The theories of Sobolev spaces and distributions were established in the first half of the twentieth century to offer genuine and brilliant answers to the big question of the existence of solutions of PDEs. This direction (Sobolev spaces, minimization problems in COV, existence and uniqueness theorems for solutions of PDEs, regularity theory) is one of the most remarkable mathematical achievements of the twentieth century, and should be regarded as one of the great success stories in the history of mathematics.

The present volume highlights this direction and introduces it to the reader by studying its fundamentals and main theories, providing a careful treatment of the subject with clear exposition in a student-friendly manner. The book is intended to help students and junior researchers focusing on the theory of PDEs and the calculus of variations. It can serve as the text for a one-semester, or alternatively two-semester, graduate course for mathematics students concentrating on analysis. Essential prerequisites for the book include real and functional analysis in addition to linear algebra. A course on PDEs is helpful but not necessary. The book consists of five chapters, with eleven sections in each chapter.
Chapter 1 discusses bounded linear operators: compact operators, Hilbert–Schmidt operators, self-adjoint operators and their spectral properties, and the Fredholm alternative theorem. After that, unbounded operators are discussed in detail, with a special focus on differential and integral operators.
Chapter 2 introduces distribution theory, motivated by a discussion of Green’s function. It introduces the Dirac delta, then regular and singular distributions and their derivatives. The theory is then connected with the Fourier transform, Schwartz spaces, and tempered distributions.

Chapter 3 gives a comprehensive discussion of Sobolev spaces and their properties. Approximations of Sobolev spaces are studied in detail, then inequalities and embedding results are treated carefully.

Chapter 4 discusses elliptic theory as an application of the theory of Sobolev spaces. This topic connects applied functional analysis with the theory of partial differential equations. The focus of the chapter is mainly on the applications of Sobolev spaces to PDEs and the dominant role they play in establishing solutions of the equations. The chapter ends with some elliptic regularity results, which concern the regularity and smoothness of weak solutions.

Chapter 5 introduces the calculus of variations as an important application of the theory of Sobolev spaces, which plays a central role in establishing the existence of minimizers of integral functionals using the direct method. The Gateaux derivative is introduced as a generalization of the notion of derivative to infinite-dimensional normed spaces.

Each chapter ends with a collection of problems. These problems aim to test whether the reader has absorbed the material and gained a comprehensive understanding. Unlike in the first two volumes, I did not wish to provide any hints or solutions to the problems in this volume, because I expect that the reader at this stage of study has acquired the knowledge and skills needed to handle these problems independently and should be able to tackle them correctly after a careful study of the material. Doing so will significantly improve the reader’s analytical skills and provide a more in-depth understanding of the topics.

Riyadh, Saudi Arabia 2023
Ammar Khanfer
Acknowledgments
I am forever grateful and thankful to God for giving me the strength, health, knowledge, and patience to endure and complete this work successfully. I would like to express my sincere thanks to Prince Sultan University for its continuing support. I also wish to express my deep thanks and gratitude to Prof. Mahmoud Al Mahmoud, the dean of our college (CHS), and Prof. Wasfi Shatanawi, the chair of our department (MSD), for their support and recognition of my work. My sincerest thanks to my colleagues in our department for their warm encouragement.
Contents
1 Operator Theory
1.1 Quick Review of Hilbert Space
1.1.1 Lebesgue Spaces
1.1.2 Convergence Theorems
1.1.3 Complete Space
1.1.4 Hilbert Space
1.1.5 Fundamental Mapping Theorems on Banach Spaces
1.2 The Adjoint of Operator
1.2.1 Bounded Linear Operators
1.2.2 Definition of Adjoint
1.2.3 Adjoint Operator on Hilbert Spaces
1.2.4 Self-adjoint Operators
1.3 Compact Operators
1.3.1 Definition and Properties of Compact Operators
1.3.2 The Integral Operator
1.3.3 Finite-Rank Operators
1.4 Hilbert–Schmidt Operator
1.4.1 Definition of Hilbert–Schmidt Operator
1.4.2 Basic Properties of HS Operators
1.4.3 Relations with Compact and Finite-Rank Operators
1.4.4 The Fredholm Operator
1.4.5 Characterization of HS Operators
1.5 Eigenvalues of Operators
1.5.1 Spectral Analysis
1.5.2 Definition of Eigenvalues and Eigenfunctions
1.5.3 Eigenvalues of Self-adjoint Operators
1.5.4 Eigenvalues of Compact Operators
1.6 Spectral Analysis of Operators
1.6.1 Resolvent and Regular Values
1.6.2 Bounded Below Mapping
1.6.3 Spectrum of Bounded Operator
1.6.4 Spectral Mapping Theorem
1.6.5 Spectrum of Compact Operators
1.7 Spectral Theory of Self-adjoint Compact Operators
1.7.1 Eigenvalues of Compact Self-adjoint Operators
1.7.2 Invariant Subspaces
1.7.3 Hilbert–Schmidt Theorem
1.7.4 Spectral Theorem For Self-adjoint Compact Operators
1.8 Fredholm Alternative
1.8.1 Resolvent of Compact Operators
1.8.2 Fundamental Principle
1.8.3 Fredholm Equations
1.8.4 Volterra Equations
1.9 Unbounded Operators
1.9.1 Introduction
1.9.2 Closed Operator
1.9.3 Basics Properties of Unbounded Operators
1.9.4 Toeplitz Theorem
1.9.5 Adjoint of Unbounded Operators
1.9.6 Deficiency Spaces of Unbounded Operators
1.9.7 Symmetry of Unbounded Operators
1.9.8 Spectral Properties of Unbounded Operators
1.10 Differential Operators
1.10.1 Green’s Function and Dirac Delta
1.10.2 Laplacian Operator
1.10.3 Sturm–Liouville Operator
1.10.4 Momentum Operator
1.11 Problems

2 Distribution Theory
2.1 The Notion of Distribution
2.1.1 Motivation For Distributions
2.1.2 Test Functions
2.1.3 Definition of Distribution
2.2 Regular Distribution
2.2.1 Locally Integrable Functions
2.2.2 Notion of Regular Distribution
2.2.3 The Dual Space $\mathcal{D}'$
2.2.4 Basic Properties of Regular Distributions
2.3 Singular Distributions
2.3.1 Notion of Singular Distribution
2.3.2 Dirac Delta Distribution
2.3.3 Delta Sequence
2.3.4 Gaussian Delta Sequence
2.4 Differentiation of Distributions
2.4.1 Notion of Distributional Derivative
2.4.2 Calculus Rules
2.4.3 Examples of Distributional Derivatives
2.4.4 Properties of δ
2.5 The Fourier Transform Problem
2.5.1 Introduction
2.5.2 Fourier Transform on $\mathbb{R}^n$
2.5.3 Existence of Fourier Transform
2.5.4 Plancherel Theorem
2.6 Schwartz Space
2.6.1 Rapidly Decreasing Functions
2.6.2 Definition of Schwartz Space
2.6.3 Derivatives of Schwartz Functions
2.6.4 Isomorphism of Fourier Transform on Schwartz Spaces
2.7 Tempered Distributions
2.7.1 Definition of Tempered Distribution
2.7.2 Functions of Slow Growth
2.7.3 Examples of Tempered Distributions
2.8 Fourier Transform of Tempered Distribution
2.8.1 Motivation
2.8.2 Definition
2.8.3 Derivative of F.T. of Tempered Distribution
2.9 Inversion Formula of The Fourier Transform
2.9.1 Fourier Transform of Gaussian Function
2.9.2 Fourier Transform of Delta Distribution
2.9.3 Fourier Transform of Sign Function
2.10 Convolution of Distribution
2.10.1 Derivatives of Convolutions
2.10.2 Convolution in Schwartz Space
2.10.3 Definition of Convolution of Distributions
2.10.4 Fundamental Property of Convolutions
2.10.5 Fourier Transform of Convolution
2.11 Problems

3 Theory of Sobolev Spaces
3.1 Weak Derivative
3.1.1 Notion of Weak Derivative
3.1.2 Basic Properties of Weak Derivatives
3.1.3 Pointwise Versus Weak Derivatives
3.1.4 Weak Derivatives and Fourier Transform
3.2 Regularization and Smoothening
3.2.1 The Concept of Mollification
3.2.2 Mollifiers
3.2.3 Cut-Off Function
3.2.4 Partition of Unity
3.2.5 Fundamental Lemma of Calculus of Variations
3.3 Density of Schwartz Space
3.3.1 Convergence of Approximating Sequence
3.3.2 Approximations of $\mathcal{S}$ and $L^p$
3.3.3 Generalized Plancherel Theorem
3.4 Construction of Sobolev Spaces
3.4.1 Completion of Schwartz Spaces
3.4.2 Definition of Sobolev Space
3.4.3 Fractional Sobolev Space
3.5 Basic Properties of Sobolev Spaces
3.5.1 Convergence in Sobolev Spaces
3.5.2 Completeness and Reflexivity of Sobolev Spaces
3.5.3 Local Sobolev Spaces
3.5.4 Leibnitz Rule
3.5.5 Mollification with Sobolev Function
3.5.6 $W_0^{k,p}(\Omega)$
3.6 $W^{1,p}(\Omega)$
3.6.1 Absolute Continuity Characterization
3.6.2 Inclusions
3.6.3 Chain Rule
3.6.4 Dual Space of $W^{1,p}(\Omega)$
3.7 Approximation of Sobolev Spaces
3.7.1 Local Approximation
3.7.2 Global Approximation
3.7.3 Consequences of Meyers–Serrin Theorem
3.8 Extensions
3.8.1 Motivation
3.8.2 The Zero Extension
3.8.3 Coordinate Transformations
3.8.4 Extension Operator
3.9 Sobolev Inequalities
3.9.1 Sobolev Exponent
3.9.2 Fundamental Inequalities
3.9.3 Gagliardo–Nirenberg–Sobolev Inequality
3.9.4 Poincare Inequality
3.9.5 Estimate for $W^{1,p}$
3.9.6 The Case $p = n$
3.9.7 Holder Spaces
3.9.8 The Case $p > n$
3.9.9 General Sobolev Inequalities
3.10 Embedding Theorems
3.10.1 Compact Embedding
3.10.2 Rellich–Kondrachov Theorem
3.10.3 High Order Sobolev Estimates
3.10.4 Sobolev Embedding Theorem
3.10.5 Embedding of Fractional Sobolev Spaces
3.11 Problems

4 Elliptic Theory
4.1 Elliptic Partial Differential Equations
4.1.1 Elliptic Operator
4.1.2 Uniformly Elliptic Operator
4.1.3 Elliptic PDEs
4.2 Weak Solution
4.2.1 Motivation for Weak Solutions
4.2.2 Weak Formulation of Elliptic BVP
4.2.3 Classical Versus Strong Versus Weak Solutions
4.3 Poincare Equivalent Norm
4.3.1 Poincare Inequality on $H_0^1$
4.3.2 Equivalent Norm on $H_0^1$
4.3.3 Poincare–Wirtinger Inequality
4.3.4 Quotient Sobolev Space
4.4 Elliptic Estimates
4.4.1 Bilinear Forms
4.4.2 Elliptic Bilinear Mapping
4.4.3 Garding’s Inequality
4.5 Symmetric Elliptic Operators
4.5.1 Riesz Representation Theorem for Hilbert Spaces
4.5.2 Existence and Uniqueness Theorem: Poisson’s Equation
4.5.3 Existence and Uniqueness Theorem: Helmholtz Equation
4.5.4 Ellipticity and Coercivity
4.5.5 Existence and Uniqueness Theorem: Symmetric Uniformly Operator
4.6 General Elliptic Operators
4.6.1 Lax–Milgram Theorem
4.6.2 Dirichlet Problems
4.6.3 Neumann Problems
4.7 Spectral Properties of Elliptic Operators
4.7.1 Resolvent of Elliptic Operators
4.7.2 Fredholm Alternative for Elliptic Operators
4.7.3 Spectral Theorem for Elliptic Operators
4.8 Self-adjoint Elliptic Operators
4.8.1 The Adjoint of Elliptic Bilinear
4.8.2 Eigenvalue Problem of Elliptic Operators
4.8.3 Spectral Theorem of Elliptic Operator
4.9 Regularity for the Poisson Equation
4.9.1 Weyl’s Lemma
4.9.2 Difference Quotients
4.9.3 Caccioppoli’s Inequality
4.9.4 Interior Regularity for Poisson Equation
4.10 Regularity for General Elliptic Equations
4.10.1 Interior Regularity
4.10.2 Higher Order Interior Regularity
4.10.3 Interior Smoothness
4.10.4 Boundary Regularity
4.11 Problems

5 Calculus of Variations
5.1 Minimization Problem
5.1.1 Definition of Minimization Problem
5.1.2 Lower Semicontinuity
5.1.3 Minimization Problems in Finite-Dimensional Spaces
5.1.4 Convexity
5.1.5 Minimization in Infinite-Dimensional Space
5.2 Weak Topology
5.2.1 Notion of Weak Topology
5.2.2 Weak Convergence
5.2.3 Weakly Closed Sets
5.2.4 Reflexive Spaces
5.2.5 Weakly Lower Semicontinuity
5.3 Direct Method
5.3.1 Direct Verses Indirect Methods
5.3.2 Minimizing Sequence
5.3.3 Procedure of Direct Method
5.3.4 Coercivity
5.3.5 The Main Theorem on the Existence of Minimizers
5.4 The Dirichlet Problem
5.4.1 Variational Integral
5.4.2 Dirichlet Principle
5.4.3 Weierstrass Counterexample
5.5 Dirichlet Principle in Sobolev Spaces
5.5.1 Minimizer of the Dirichlet Integral in $H_0^1$
5.5.2 Minimizer of the Dirichlet Integral in $H^1$
5.5.3 Dirichlet Principle
5.5.4 Dirichlet Principle with Neumann Condition
5.5.5 Dirichlet Principle with Neumann B.C. in Sobolev Spaces
5.6 Gateaux Derivatives of Functionals
5.6.1 Introduction
5.6.2 Historical Remark
5.6.3 Gateaux Derivative
5.6.4 Basic Properties of G-Derivative
5.6.5 G-Differentiability and Continuity
5.6.6 Frechet Derivative
5.6.7 G-Differentiability and Convexity
5.6.8 Higher Gateaux Derivative
5.6.9 Minimality Condition
5.7 Poisson Variational Integral
5.7.1 Gateaux Derivative of Poisson Integral
5.7.2 Symmetric Elliptic PDEs
5.7.3 Dirichlet Principle of Symmetric Elliptic PDEs
5.8 Euler–Lagrange Equation
5.8.1 Lagrangian Integral
5.8.2 First Variation
5.8.3 Necessary Condition for Minimiality I
5.8.4 Euler–Lagrange Equation
5.8.5 Second Variation
5.8.6 Legendre Condition
5.9 Dirichlet Principle for Euler–Lagrange Equation
5.9.1 The Lagrangian Functional
5.9.2 Gateaux Derivative of the Lagrangian Integral
5.9.3 Dirichlet Principle for Euler–Lagrange Equation
5.10 Variational Problem of Euler–Lagrange Equation
5.10.1 $p$-Convex Lagrangian Functional
5.10.2 Existence of Minimizer
5.11 Problems

References
Index
About the Author
Ammar Khanfer earned his Ph.D. from Wichita State University, USA. His area of interest is analysis and partial differential equations (PDEs), focusing on the interface and links between elliptic PDEs and hypergeometry. He has notably contributed to the field by providing prototypes studying the behavior of generalized solutions of elliptic PDEs in higher dimensions in connection to the behavior of hypersurfaces near nonsmooth boundaries. He also works on the qualitative theory of differential equations, and in the area of inverse problems of mathematical physics. He has published articles of high quality in reputable journals. Ammar taught at several universities in the USA: Western Michigan University, Wichita State University, and Southwestern College in Winfield. He was a member of the Academy of Inquiry Based Learning (AIBL) in the USA. During the period 2008–2014, he participated in AIBL workshops and conferences on effective teaching methodologies and strategies of creative thinking. He then moved to Saudi Arabia to teach at Imam Mohammad Ibn Saud Islamic University, where he taught and supervised undergraduate and graduate students of mathematics. Furthermore, he was appointed as coordinator of the Ph.D. program establishment committee in the department of mathematics. In 2020, he moved to Prince Sultan University in Riyadh, and has been teaching there since then.
Chapter 1
Operator Theory
1.1 Quick Review of Hilbert Space
This section gives a brief review of the basics of Hilbert space theory and functional analysis that are needed for this text. We list some of the most important notions and results that will be used throughout this book. The objective of this section is merely to refresh the memory rather than to explain these concepts, as they have already been explained in detail in volume 2 of this series [58]. The reader who did not study this material should consult [58] or, alternatively, any introductory book on functional analysis.
1.1.1 Lebesgue Spaces
Definition 1.1.1 (Normed Spaces) Let $X$ be a vector space. If $X$ is endowed with a norm $\|\cdot\|$, then the space $(X, \|\cdot\|)$ is called a “normed space”.
Definition 1.1.2 ($L^p$ Spaces) The space $L[a, b]$ is the space consisting of all Lebesgue-integrable functions on $[a, b]$, that is, those functions $f : [a, b] \to \mathbb{R}$ such that
\[ \|f\| = \int_a^b |f(x)|\,dx < \infty. \]
The space $L[a, b]$ can also be generalized to $L^p[a, b]$, the space of all functions $f$ such that $|f|^p$ is Lebesgue-integrable on $[a, b]$, where $1 \le p < \infty$, endowed with the norm
\[ \|f\|_p = \left( \int_a^b |f(x)|^p\,dx \right)^{1/p}. \]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 A. Khanfer, Applied Functional Analysis, https://doi.org/10.1007/978-981-99-3788-2_1
The convergence in $L^p$ is defined as follows: Let $(f_n) \subset L^p(\Omega)$. Then $f_n$ converges to $f$ in the $p$-norm (or in the mean of order $p$) if $\|f_n - f\|_p \to 0$, or, equivalently,
\[ \int_\Omega |f_n - f|^p \to 0. \]
Theorem 1.1.3 (Hölder’s Inequality) Let $f \in L^p(\Omega)$ and $g \in L^q(\Omega)$, for $1 \le p, q \le \infty$, where $p$ and $q$ are conjugates. Then $fg \in L^1(\Omega)$, i.e., $fg$ is Lebesgue-integrable, and
\[ \int_\Omega |fg| \le \|f\|_p \, \|g\|_q. \]
Theorem 1.1.4 (Minkowski’s Inequality) Let $f, g \in L^p$ for $1 \le p \le \infty$. Then
\[ \|f + g\|_p \le \|f\|_p + \|g\|_p. \]
Theorem 1.1.5 (Fatou’s Lemma) Let $\{f_n\}$ be a sequence of measurable functions in a measure space $(X, \Sigma, \mu)$, $f_n \ge 0$, and $f_n \to f$ a.e. on a set $E \in \Sigma$. Then
\[ \int_E f\,d\mu \le \liminf \int_E f_n\,d\mu. \]
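The two inequalities above can be checked numerically in the simplest setting, the counting measure on a finite index set, where integrals become finite sums. The following is a minimal sketch; the vectors `f`, `g` and the exponents are arbitrary illustrative choices, not taken from the text.

```python
# Discrete analogue (counting measure on a finite index set) of the
# Hölder and Minkowski inequalities. The data below are illustrative
# assumptions, not values from the text.

def p_norm(v, p):
    """Return (sum |v_i|^p)^(1/p) for 1 <= p < infinity."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

def holder_gap(f, g, p):
    """Return ||f||_p * ||g||_q - sum |f_i g_i|, where 1/p + 1/q = 1 and p > 1."""
    q = p / (p - 1.0)  # conjugate exponent of p
    lhs = sum(abs(a * b) for a, b in zip(f, g))
    return p_norm(f, p) * p_norm(g, q) - lhs

def minkowski_gap(f, g, p):
    """Return ||f||_p + ||g||_p - ||f + g||_p."""
    s = [a + b for a, b in zip(f, g)]
    return p_norm(f, p) + p_norm(g, p) - p_norm(s, p)

f = [1.0, -2.0, 0.5, 3.0]
g = [0.3, 1.5, -4.0, 2.0]
for p in (1.5, 2.0, 3.0):
    # Both gaps should be nonnegative for every admissible p.
    print(p, holder_gap(f, g, p) >= 0.0, minkowski_gap(f, g, p) >= 0.0)
```

A nonnegative gap in each case is exactly the statement of the corresponding inequality for these particular vectors; it is a sanity check, not a proof.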
1.1.2 Convergence Theorems
Theorem 1.1.6 (Monotone Convergence Theorem) Let $\{f_n\}$ be a sequence of nonnegative and increasing measurable functions on $E \in \Sigma$ in a measure space $(X, \Sigma, \mu)$. If $\lim f_n = f$ a.e., then
\[ \lim \int_E f_n\,d\mu = \int_E f\,d\mu. \]
Theorem 1.1.7 (Dominated Convergence Theorem) (DCT) Let $\{f_n\}$ be a sequence of measurable functions on $E \in \Sigma$ in a measure space $(X, \Sigma, \mu)$. If $f_n \to f$ pointwise a.e., and $\{f_n\}$ is dominated by some $\mu$-integrable function $g$ over $E$, that is, $|f_n| \le g$ for all $n$ and for almost all $x$ in $E$, then $f$ is $\mu$-integrable and
\[ \lim \int_E f_n\,d\mu = \int_E f\,d\mu. \]
Theorem 1.1.8 (Riesz’s Lemma) Let $Y$ be a proper closed subspace of a normed space $X$. Then, for every $\delta$, $0 < \delta < 1$, there exists $x_0 \in X$ such that $\|x_0\| = 1$ and $\operatorname{dist}(x_0, Y) \ge \delta$.
Theorem 1.1.9 (Riesz–Fischer Theorem) $(L^p)^* \cong L^q$ for $1 \le p < \infty$, where $q$ is the conjugate of $p$.
Theorem 1.1.10 Let $X$ be a normed space. Then, $X$ is finite-dimensional if and only if its closed unit ball is compact.
1.1.3 Complete Space
Definition 1.1.11 (Banach Space) A space $X$ is called complete if every Cauchy sequence in $X$ converges to an element in $X$. A complete normed vector space is called a “Banach Space”.
Theorem 1.1.12 Every finite-dimensional normed space is complete.
Theorem 1.1.13 Let $X$ be a complete space and $Y \subset X$. Then, $Y$ is closed if and only if $Y$ is complete.
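1.1.4 Hilbert Space
Definition 1.1.14 (Inner Product) The inner product is a map $\langle \cdot, \cdot \rangle : V \times V \to F$, where $V$ is a vector space over the field $F$, which could be $\mathbb{R}$ or $\mathbb{C}$, such that the following hold:
(1) For any $x \in V$, $\langle x, x \rangle \ge 0$, and $\langle x, x \rangle = 0$ iff $x = 0$.
(2) For $x, y \in V$, $\alpha \in F$, we have $\langle \alpha x, \cdot \rangle = \alpha \langle x, \cdot \rangle$ and $\langle x + y, \cdot \rangle = \langle x, \cdot \rangle + \langle y, \cdot \rangle$. We also have the conjugate linearity property: $\langle \cdot, \alpha x \rangle = \bar{\alpha} \langle \cdot, x \rangle$ and $\langle \cdot, x + y \rangle = \langle \cdot, x \rangle + \langle \cdot, y \rangle$.
(3) $\langle x, y \rangle = \overline{\langle y, x \rangle}$. If $F = \mathbb{R}$, then we have $\langle x, y \rangle = \langle y, x \rangle$.
Theorem 1.1.15 (Cauchy–Schwarz Inequality) Let $x, y \in V$ for some inner product space $V$. Then
\[ |\langle x, y \rangle| \le \|x\| \, \|y\|. \]
Definition 1.1.16 (Hilbert Space) Let $V$ be a vector space. The space $X = (V, \langle \cdot, \cdot \rangle)$ is said to be an inner product space. A complete inner product space is called a Hilbert space.
Theorem 1.1.17 (Decomposition Theorem) Let $Y$ be a closed subspace of a Hilbert space $H$. Then, any $x \in H$ can be written as $x = y + z$, where $y \in Y$ and $z \in Y^\perp$, and such that $y$ and $z$ are uniquely determined by $x$.
Theorem 1.1.18 (Orthonormality Theorem) Let $H$ be an infinite-dimensional Hilbert space and $M = \{e_n\}_{n \in \mathbb{N}}$ be an orthonormal sequence in $H$. Then
(1) $\sum_{j=1}^{\infty} \alpha_j e_j$ converges if and only if $\sum_{j=1}^{\infty} |\alpha_j|^2$ converges. In this case,
\[ \Bigl\| \sum_{j=1}^{\infty} \alpha_j e_j \Bigr\|^2 = \sum_{j=1}^{\infty} |\alpha_j|^2. \]
(2) Bessel’s Inequality. For every $x \in H$, we have
\[ \sum_{j=1}^{\infty} |\langle x, e_j \rangle|^2 \le \|x\|^2. \]
(3) Parseval’s Identity. If $M$ is an orthonormal basis for $H$, then for every $x \in H$, $x = \sum_{j=1}^{\infty} \langle x, e_j \rangle e_j$, and we have
\[ \sum_{j=1}^{\infty} |\langle x, e_j \rangle|^2 = \|x\|^2. \]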
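Bessel’s inequality and Parseval’s identity can be illustrated in a finite-dimensional stand-in for $H$. The sketch below works in $\mathbb{R}^3$ with the standard dot product; the orthonormal vectors `e1`, `e2`, `e3` and the vector `x` are assumptions chosen for illustration only.

```python
import math

def inner(x, y):
    """Real inner product on R^n (standard dot product)."""
    return sum(a * b for a, b in zip(x, y))

s = 1.0 / math.sqrt(2.0)
e1 = [s, s, 0.0]
e2 = [s, -s, 0.0]
e3 = [0.0, 0.0, 1.0]   # {e1, e2, e3} is an orthonormal basis of R^3
x = [2.0, -1.0, 3.0]

norm_sq = inner(x, x)
# Orthonormal set that is not a basis: only Bessel's inequality applies.
bessel_sum = sum(inner(x, e) ** 2 for e in (e1, e2))
# Full orthonormal basis: Parseval's identity gives equality.
parseval_sum = sum(inner(x, e) ** 2 for e in (e1, e2, e3))

print(bessel_sum <= norm_sq, abs(parseval_sum - norm_sq) < 1e-12)
```

Dropping `e3` loses exactly the component of `x` along it, which is why the inequality becomes strict for the incomplete set.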
1.1.5 Fundamental Mapping Theorems on Banach Spaces
Theorem 1.1.19 (Open Mapping Theorem) (OMT) Let $T \in B(X, Y)$, where $X$ and $Y$ are Banach spaces. If $T$ is surjective, then it is open.
Theorem 1.1.20 (Bounded Inverse Theorem) Let $T \in B(X, Y)$, where $X$ and $Y$ are Banach spaces. If $T$ is bijective, then $T^{-1} \in B(Y, X)$.
Theorem 1.1.21 (Closed Graph Theorem) Let $T : X \to Y$ be a linear operator, where $X$ and $Y$ are Banach spaces. Then $T$ is bounded if and only if its graph is closed in $X \times Y$.
Theorem 1.1.22 (Uniform Boundedness Principle) Consider the sequence $T_n \in B(X, Y)$ for some Banach space $X$ and normed space $Y$. If $\sup_n \|T_n(x)\| < \infty$ for every $x \in X$, then $\sup_n \|T_n\| < \infty$.
1.2 The Adjoint of Operator
1.2.1 Bounded Linear Operators
Recall from a basic functional analysis course that a linear operator is a mapping $T$ from a normed space $X$ to another normed space $Y$ such that
\[ T(cx + y) = cT(x) + T(y) \]
for all $x, y \in X$ and for any scalar $c$ in the field underlying the spaces $X, Y$, which is usually taken as $\mathbb{R}$. Let $T$ be a linear operator, and $X$ and $Y$ be two normed spaces with norms $\|\cdot\|_X$ and $\|\cdot\|_Y$, respectively. Then $T$ is called a bounded linear operator if there exists $M \in \mathbb{R}$ such that for all $x \in X$,
\[ \|Tx\|_Y \le M \|x\|_X. \]
If there is no such $M$, then $T$ is said to be unbounded. If $Y = \mathbb{R}$ then $T$ is called a functional, and if $X$ is finite-dimensional then $T$ is called a transformation. In general, $T$ is a mapping that maps an element $x \in X$ to a unique element in $Y$. The norm of $T$ can be written as
\[ \|T\| = \sup_{\|x\| = 1} \|T(x)\|. \]
The fundamental theorem of bounded linear operators states that $T$ is bounded iff $T$ is continuous at all $x \in X$ iff $T$ is continuous at $0$. So, for linear operators, boundedness and continuity are equivalent, and this feature is not available for nonlinear operators. In fact, it can be easily shown that every linear operator defined on a finite-dimensional space is bounded (i.e., continuous). An operator $T$ is said to be injective (or one-to-one) if for every $x, y \in X$ such that $T(x) = T(y)$, we have $x = y$. An operator $T$ is said to be surjective (or onto) if for every $y \in Y$, there exists at least one $x \in X$ such that $T(x) = y$. An operator $T$ is said to be bijective if it is injective and surjective. If $\dim X = \dim Y = n < \infty$, then $T$ is injective if and only if $T$ is surjective. If $T$ is bijective, and $T, T^{-1}$ are continuous, then $T$ is an isomorphism between $X$ and $Y$. Moreover, $T$ is called an isometry if $\|x\|_X = \|T(x)\|_Y$ for all $x \in X$.
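To see concretely what the failure of boundedness looks like, the following worked example may help. It is a standard illustration and is not taken from the text: the space of polynomials on $[0, 1]$ with the supremum norm and the differentiation operator $D$ are assumptions introduced here.

```latex
% Assumed setting: X = polynomials on [0,1] with \|u\|_\infty = \max_{[0,1]} |u(x)|,
% and D : X \to X the differentiation operator, Du = u'.
\[
  u_n(x) = x^n, \qquad \|u_n\|_\infty = 1, \qquad
  \|D u_n\|_\infty = \max_{0 \le x \le 1} n x^{n-1} = n .
\]
% If D were bounded, some M would satisfy \|Du\|_\infty \le M \|u\|_\infty for all u,
% which forces n \le M for every n; no such M exists, so D is unbounded.
```

Note that $X$ here is infinite-dimensional, in line with the remark that every linear operator on a finite-dimensional space is bounded.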
1.2.2 Definition of Adjoint
An important operator that is defined in association with the operator $T$ is the adjoint operator.
Definition 1.2.1 (Adjoint Operator) Let $X$ and $Y$ be two Banach spaces, and let $T : X \to Y$ be a linear operator. Then, the adjoint operator of $T$, denoted by $T^*$, is the operator $T^* : Y^* \to X^*$ defined as
\[ T^*(f) = f \circ T, \quad \text{for } f \in Y^*. \]
A basic property that can be easily established from the definition is that $T^*$ is linear for linear operators, and, moreover, if $S : X \to Y$ is another linear operator, then
\[ (T + S)^* = T^* + S^* \quad \text{and} \quad (TS)^* = S^* T^* \]
(the latter whenever the composition $TS$ is defined; verify). The following proposition provides two important fundamental properties for the adjoint operator.
Proposition 1.2.2 Let $X, Y$ be Banach spaces. Let $T : X \to Y$ be a linear operator, and let $T^*$ be its adjoint operator. Then
(1) If $T$ is bounded, then $T^*$ is bounded and $\|T^*\| = \|T\|$.
(2) $T$ is bounded and invertible with bounded inverse if and only if $T^*$ has bounded inverse, and $(T^*)^{-1} = (T^{-1})^*$.
Proof For (1), let $f \in Y^*$ and $x \in B_X$. Then
\[ |T^*(f)(x)| = |f(T(x))| \le \|T\| \, \|f\|, \]
and so
\[ \|T^*\| \le \|T\| < \infty. \tag{1.2.1} \]
Let $x \in B_X$. Then $T(x) = y \in Y$. By the Hahn–Banach theorem, there exists $g_x \in Y^*$ with $\|g_x\| = 1$ such that $|g_x(y)| = \|y\| = \|T(x)\|$. Since $T^* : Y^* \to X^*$,
\[ \|T^*\| \ge \|T^*(g_x)\| \ge |g_x(T(x))| = \|T(x)\|. \]
Taking the supremum over all $x \in B_X$ gives the reverse direction of (1.2.1), and this proves (1). For (2), let $f \in X^*$ and $y = T(x)$. Then we have
\[ f(x) = f(T^{-1}(y)) = ((T^{-1})^* f)(y) = ((T^{-1})^* f)(T(x)) = (T^* (T^{-1})^* f)(x), \]
so $T^* (T^{-1})^*$ is the identity on $X^*$; a symmetric computation shows that $(T^{-1})^* T^*$ is the identity on $Y^*$. Conversely, suppose $T^*$ has bounded inverse. Then $T^{**}$ has bounded inverse. Then
\[ T^{**}|_X = T, \]
so $T$ is one-to-one. Moreover, since $T^{**}$ is onto, by the open mapping theorem it is open, and being a bijection it maps closed sets to closed sets; hence $T(X)$ is closed in $Y^{**}$ and consequently in $Y$. Now, suppose $T$ is not onto, i.e., there exists $y \in Y \setminus T(X)$. By the Banach separation theorem, there exists $f \in Y^*$ such that $f(y) = 1$ and $f(T(x)) = 0$ for all $x \in X$. It follows that
\[ T^*(f)(x) = f(T(x)) = 0 \quad \text{for all } x \in X, \]
that is, $T^*(f) = 0$, which implies that $f = 0$ since $T^*$ is injective. This contradicts $f(y) = 1$, so $T$ is onto, and hence a bijection.
1.2.3 Adjoint Operator on Hilbert Spaces
In Hilbert space, the definition of adjoint is given in terms of an inner product. Let $T : H_1 \to H_2$ be a bounded linear operator for some Hilbert spaces $H_1$ and $H_2$. Let $y \in H_2$ and define the functional $f(x) = \langle Tx, y \rangle$; then $f$ is clearly linear and bounded on $H_1$, and by the Riesz representation theorem, there exists a unique $z \in H_1$ such that $\langle Tx, y \rangle = \langle x, z \rangle$. We define $z$ as $T^*(y)$.
Definition 1.2.3 (Adjoint Operator on Hilbert Spaces) Let $T : H_1 \to H_2$ be a bounded linear operator between two Hilbert spaces $H_1$ and $H_2$. The adjoint operator of $T$, denoted by $T^*$, is defined to be $T^* : H_2 \to H_1$ given by
\[ \langle Tx, y \rangle = \langle x, T^*(y) \rangle. \]
Recall that the null space of an operator $T$ is defined as $\ker(T) = N(T) = \{x \in \operatorname{Dom}(T) : T(x) = 0\}$ and the range of $T$ is $\operatorname{Im}(T) = R(T) = \{Tx : x \in \operatorname{Dom}(T)\}$.
Proposition 1.2.4 Let $T : H_1 \to H_2$ be a bounded linear operator between two Hilbert spaces $H_1$ and $H_2$. Then
(1) $T^{**} = T$.
(2) $\|T^* T\| = \|T\|^2$.
(3) $N(T) = R(T^*)^\perp$ and $N(T)^\perp = \overline{R(T^*)}$.
(4) $N(T^*) = R(T)^\perp$ and $N(T^*)^\perp = \overline{R(T)}$.
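In finite dimensions the Hilbert space adjoint is the conjugate transpose, which makes the defining identity $\langle Tx, y \rangle = \langle x, T^* y \rangle$ and property (1) easy to check by hand or by machine. The sketch below uses plain Python lists; the matrix `A` and the vectors `x`, `y` are illustrative assumptions, and the inner product is taken linear in the first argument, as in Definition 1.1.14.

```python
def inner(x, y):
    """Inner product on C^n, linear in the first slot, conjugate-linear in the second."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(A, x):
    """Matrix-vector product A x, with A given as a list of rows."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]

def adjoint(A):
    """Conjugate transpose of A."""
    rows, cols = len(A), len(A[0])
    return [[A[i][j].conjugate() for i in range(rows)] for j in range(cols)]

# Illustrative operator from C^3 to C^2 and test vectors (assumptions).
A = [[1 + 2j, 0.5j, -1.0 + 0j],
     [3 + 0j, 2 - 1j, 4j]]
x = [1 - 1j, 2 + 0j, 0.5j]
y = [2j, 1 + 1j]

lhs = inner(apply(A, x), y)             # <Ax, y> computed in C^2
rhs = inner(x, apply(adjoint(A), y))    # <x, A*y> computed in C^3
print(abs(lhs - rhs) < 1e-12, adjoint(adjoint(A)) == A)
```

The second printed value checks $T^{**} = T$ for this matrix; the identities in (3) and (4) reduce here to the familiar relations between null spaces and column spaces of a matrix and its conjugate transpose.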
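1.2.4 Self-adjoint Operators
Recall from a linear algebra course that a self-adjoint matrix is a matrix that is equal to its own adjoint. This extends to operators on infinite-dimensional spaces.
Definition 1.2.5 (Self-Adjoint Operator) A bounded linear operator $T \in B(H)$ on a Hilbert space $H$ is called self-adjoint if $T = T^*$.
Proposition 1.2.6 Let $T \in B(H)$, where $H$ is a complex Hilbert space. Then $T$ is self-adjoint if and only if $\langle Tx, x \rangle \in \mathbb{R}$ for all $x \in H$.
Proof If $T = T^*$ then $\langle Tx, x \rangle = \langle x, Tx \rangle = \overline{\langle Tx, x \rangle}$, so $\langle Tx, x \rangle$ is real. Conversely, let $\langle Tx, x \rangle \in \mathbb{R}$ for all $x \in H$. Then
\[ \langle Tx, x \rangle = \overline{\langle Tx, x \rangle} = \overline{\langle x, T^* x \rangle} = \langle T^* x, x \rangle, \]
so $\langle (T - T^*)x, x \rangle = 0$ for all $x \in H$, hence $T = T^*$.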
1.3 Compact Operators
1.3.1 Definition and Properties of Compact Operators
We now introduce compact operators, an extremely important class of operators which is a cornerstone in the study of several topics of analysis, and which finds numerous applications to differential equations and integral equations and their spectral theories.
Definition 1.3.1 (Compact Operator) Let $X$ and $Y$ be normed spaces and $T : X \to Y$ be a linear operator. Then, $T$ is called a compact operator if for every bounded sequence $\{x_n\} \subset X$, the sequence $\{T(x_n)\}$ has a convergent subsequence in $Y$. The set of all compact operators from $X$ to $Y$ is denoted by $\mathcal{K}(X, Y)$, or $\mathcal{K}(X)$ if $X = Y$.
In simple words, a linear operator $T : X \to Y$ is compact if $\overline{T(B)}$ is compact in $Y$ whenever the set $B$ is bounded in $X$. The definition implies immediately that every
compact linear operator is bounded, thus continuous. One of the basic properties of compact operators is that composing them with bounded linear operators retains compactness.
Proposition 1.3.2 Let $T$ be compact, and $S$ be bounded linear on a normed space $X$. Then $ST$ and $TS$ are compact operators.
Proof If $\{x_n\}$ is bounded, then $S(x_n)$ is bounded, hence $T(S(x_n))$ has a convergent subsequence, so $TS$ is compact. Also, since $\{x_n\}$ is bounded, $T(x_n)$ has a convergent subsequence $T(x_{n_j})$, and since $S$ is bounded, it is continuous in norm, so $S(T(x_{n_j}))$ also converges; hence $ST$ is compact.
Theorem 1.3.3 Let $T \in \mathcal{K}(X)$ for a normed space $X$. If $T$ is invertible with bounded inverse, then $\dim(X) < \infty$.
Proof If $T^{-1}$ exists and is bounded, then by Proposition 1.3.2, $I = T^{-1} T$ is also compact, but the identity cannot be compact in an infinite-dimensional space since the closed unit ball would be compact, which contradicts Theorem 1.1.10.
One of the most fundamental and important results in analysis which provides a compactness criterion is the Arzelà–Ascoli theorem.
Theorem 1.3.4 (Arzelà–Ascoli Theorem) Let $f_n \in C(K)$, for some compact set $K$. If the sequence $\{f_n\}$ is bounded and equicontinuous, then it has a uniformly convergent subsequence.
Proof By compactness, $K$ is separable, so let $D = \{x_n\}$ be a countable dense subset of $K$ and consider the sequence $\{f_n(x_1)\}$, which, being bounded, has a convergent subsequence. Using Cantor’s diagonalization argument, we obtain a diagonal sequence, say $\{h_n\} = \{f_{n_n}\}$, which is a subsequence of $\{f_n\}$ that converges at each point of $D$. Since $\{f_n\}$ is equicontinuous, for every $x \in K$ and $\varepsilon > 0$, we can find $\delta > 0$ such that $|h_n(x) - h_n(y)| < \varepsilon/3$ for all $n$ and all $y \in D$ within distance $\delta$ of $x$. But $\{h_n(y)\}$ is Cauchy, so we can find $N$ such that $|h_n(y) - h_m(y)| \le \varepsilon/3$ for all $n, m \ge N$. Hence
\[ |h_n(x) - h_m(x)| \le |h_n(x) - h_n(y)| + |h_n(y) - h_m(y)| + |h_m(y) - h_m(x)| < \varepsilon. \]
This implies that $\{h_n(x)\}$ is Cauchy and thus it converges to, say, $h(x)$, which is continuous by the equicontinuity of $\{f_n\}$. Now, using a similar argument we conclude that $\{h_n\}$ converges uniformly to $h$.
The Arzelà–Ascoli theorem will be used in demonstrating the following important property of compact operators linked to adjoint operators.
Theorem 1.3.5 Let $X$ and $Y$ be normed spaces and $T \in B(X, Y)$. Then, $T$ is compact if and only if $T^*$ is compact.
Proof Let $T : X \to Y$ be compact. Let $S = \{f_n\}$ be a bounded set in $Y^*$, say $\|f_n\| \le M$ for all $n$. Denote $K = \overline{T(B_X)}$. Then $|f_n(y)| \le M \|T\|$ for all $n$ and all $y \in K$, and $S \subset C(K)$. Since $(f_n)$ is uniformly bounded, for all $y_1, y_2 \in K$ and all $f_j \in S$, we have
\[ |f_j(y_1) - f_j(y_2)| \le M \|y_1 - y_2\|. \]
So $\{f_n\}$ is equicontinuous, hence by the Arzelà–Ascoli theorem $\{f_n\}$ has a subsequence that converges uniformly on $K$, i.e., $f_{n_j} \to f \in C(K)$. Note that by Proposition 1.2.2(1) $T^*$ is bounded, and since $S$ is bounded in $Y^*$, we have
\[ \|T^* f_{n_j} - T^* f\| \le \|T^*\| \, \|f_{n_j} - f\|_\infty \to 0. \]
Hence $(T^* f_{n_j})$ converges. For the converse, if $T^*$ is compact then so is $T^{**}$. Let $J : X \to X^{**}$ be the canonical mapping given by
\[ (J(x))(f) = f(x) \quad \forall x \in X,\ f \in X^*. \tag{1.3.1} \]
If $\{x_n\}$ is bounded in $X$, then $J(x_n)$ is bounded in $X^{**}$ because $J$ is an isometry. But $(T^{**}(J(x_n)))$ is bounded and, due to the compactness of $T^{**}$, has a convergent subsequence, say $(T^{**}(J(x_{n_j}))) \subset Y^{**}$. So
\[ (T^{**}(J(x_{n_j})))(h) = (J(x_{n_j}))(T^*(h)), \tag{1.3.2} \]
for $h \in Y^*$. But from (1.3.1)
\[ (J(x_{n_j}))(T^*(h)) = T^*(h)(x_{n_j}) = h(T(x_{n_j})). \tag{1.3.3} \]
By (1.3.2) and (1.3.3), we obtain $(T^{**}(J(x_{n_j})))(h) = h(T(x_{n_j}))$, and consequently
\[ \|T^{**}(J(x_{n_j}))\| = \|T(x_{n_j})\|. \]
$T(x_{n_j})$ converges due to the convergence of $T^{**}(J(x_{n_j}))$, and this completes the proof of the other direction.
1.3.2 The Integral Operator
Example 1.3.6 (Integral Operator) Let $X = L^p[a, b]$, $1 < p < \infty$, and let $k \in C([a, b] \times [a, b])$ be a mapping from $[a, b] \times [a, b]$ to $\mathbb{R}$. Consider the Fredholm integral operator $K : X \to X$ defined by
\[ (Ku)(x) = \int_a^b k(x, y) u(y)\,dy. \]
We will show that $K$ is a compact operator for all $p > 1$. By Hölder’s inequality, it is easy to see that
\[ |Ku| \le \|k\|_\infty \, (b - a)^{1/q} \, \|u\|_p, \]
where $q$ is the conjugate of $p$, from which we conclude that $K$ is bounded, and so for a bounded set $B \subset X$, $K(B)$ is bounded. Now we show that $K(B)$ is equicontinuous. Since $u \in L^p[a, b]$, let
\[ \int_a^b |u(x)|^p\,dx = \alpha. \]
Moreover, since $k$ is uniformly continuous on $[a, b] \times [a, b]$, for every $\varepsilon > 0$ there exists $\delta > 0$ such that for all $x_1, x_2 \in [a, b]$ with $|x_1 - x_2| < \delta$ we have $|k(x_2, y) - k(x_1, y)|$
0 for all n. Since $T_1$ is compact, $\{x_n\}$ has a subsequence $\{x_n^1\}$ such that $T_1(x_n^1)$ converges in $Y$. But $\{x_n^1\}$ must be bounded, and since $T_2$ is compact, $\{x_n^1\}$ has a subsequence $\{x_n^2\}$ such that $T_2(x_n^2)$ converges in $Y$; keep in mind that $T_1(x_n^2)$ converges as well, since $\{x_n^2\}$ is a subsequence of $\{x_n^1\}$. Proceeding inductively, we obtain a subsequence $\{x_n^k\}$ of $\{x_n^{k-1}\}$ such that $T_k(x_n^k)$ converges. Choose the diagonal sequence $\{x_n^n\} = x_1^1, x_2^2, x_3^3, \ldots$, that is, the first term of the first sequence $\{x_n^1\}$, the second term of the second sequence $\{x_n^2\}$, etc. The proof is done if we can prove that $T(x_n^n)$ converges. Clearly, $T_k(x_n^n)$ converges for each $k$, so it is a Cauchy sequence and
\[ \|T_N(x_n^n) - T_N(x_m^m)\| < \frac{\varepsilon}{3} \tag{1.3.4} \]
for large $n, m > N$. On the other hand, since $T_n \to T$, for every $\varepsilon > 0$ there exists $N \in \mathbb{N}$ such that for all $n \ge N$, we have
\[ \|T - T_n\| < \frac{\varepsilon}{3M}. \tag{1.3.5} \]
From (1.3.4) and (1.3.5), we obtain
\[
\begin{aligned}
\|T(x_n^n) - T(x_m^m)\| &\le \|T(x_n^n) - T_N(x_n^n)\| + \|T_N(x_n^n) - T_N(x_m^m)\| + \|T_N(x_m^m) - T(x_m^m)\| \\
&\le \|T - T_N\| \, \|x_n^n\| + \frac{\varepsilon}{3} + \|T - T_N\| \, \|x_m^m\| \\
&< \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon.
\end{aligned}
\]
It follows that $T(x_n^n)$ is a Cauchy sequence, so it converges due to the completeness of $Y$.
1.3.3 Finite-Rank Operators
Recall from a linear algebra course that the rank of a linear transformation is defined as the dimension of its range, and represents the maximum number of linearly independent vectors in the range space of the operator. The definition extends to operators on infinite-dimensional spaces as we shall see next.
Definition 1.3.8 (Finite-Rank Operator) Let $X$ and $Y$ be normed spaces. A bounded linear operator $T : X \to Y$ is said to be of finite rank if its range is a finite-dimensional subspace, i.e., $r = \dim(T(X)) < \infty$. An operator having a finite rank is called a finite-rank operator, or f.r. operator for short. The rank of an operator $T$ is denoted by $r(T)$. The class of all bounded linear operators of finite rank is denoted by $\mathcal{K}_0(X, Y)$.
Note that if at least one of the spaces $X$ or $Y$ is finite-dimensional, then $T$ is of finite rank. Note also that if the range is finite-dimensional then every closed bounded subset of it is compact. Choosing any bounded set in the domain of a f.r. operator, this set will be mapped to a bounded set in the finite-dimensional range, and so its closure is compact. Thus:
Proposition 1.3.9 Finite-rank operators are compact operators, i.e., $\mathcal{K}_0(X, Y) \subset \mathcal{K}(X, Y)$.
Note that the inclusion is proper, i.e., there exist compact operators that are not f.r. operators. A simple example, which is left to the reader to verify, is to consider the sequence space $\ell^2$ and define the operator $T : \ell^2 \to \ell^2$ as
\[ T(x_1, x_2, \ldots) = \Bigl(x_1, \frac{x_2}{2}, \frac{x_3}{3}, \ldots, \frac{x_n}{n}, \ldots\Bigr). \]
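The verification left to the reader can be sketched as follows. The truncations $T_N$ and the unit vectors $e_n$ are notation introduced here for illustration, and the sketch relies on the fact, used later in the text via Theorem 1.3.7, that a norm limit of compact operators into a Banach space is compact.

```latex
% T_N keeps the first N coordinates of T(x); e_n is the n-th standard unit vector of \ell^2.
\[
  T_N(x) = \Bigl(x_1, \tfrac{x_2}{2}, \ldots, \tfrac{x_N}{N}, 0, 0, \ldots\Bigr),
  \qquad r(T_N) = N .
\]
\[
  \|(T - T_N)x\|^2 = \sum_{n > N} \frac{|x_n|^2}{n^2}
  \le \frac{1}{(N+1)^2}\,\|x\|^2
  \quad\Longrightarrow\quad
  \|T - T_N\| \le \frac{1}{N+1} \to 0 .
\]
% So T is a norm limit of finite-rank operators and hence compact.
% T is not of finite rank: T(e_n) = e_n / n for every n, and these vectors are linearly independent.
```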
The source of the problem here is that the range of $T$ is not closed. If it were closed, $T$ would be a f.r. operator, as the next proposition shows.
Proposition 1.3.10 If $T : X \to Y$ is compact and $R(T)$ is closed in the Banach space $Y$, then $T$ is of finite rank.
Proof Since $Y$ is complete, so is $R(T)$, so $T : X \to R(T)$ is a surjective operator between Banach spaces; hence, $T$ is open (by the Open Mapping Theorem), and the open unit ball $B_X$ is mapped to the open set $K = T(B_X)$, which is relatively compact since $T$ is compact. Then we can find an open ball of radius $r > 0$ such that $B_r \subset K$, so $\overline{B_r} \subset \overline{K}$; hence $\overline{B_r}$ is compact, and therefore $R(T)$ is necessarily finite-dimensional by Theorem 1.1.10.
If a f.r. operator is defined on a Hilbert space, then we have more to say. In this case, the subspace $T(X)$ is of dimension, say, $r(T) = n$, which gives rise to a finite orthonormal basis $\{e_1, e_2, \ldots, e_n\}$ for $T(X)$, and every $y \in T(X)$ can be written as
\[ y = T(x) = \sum_{j=1}^{n} \langle y, e_j \rangle e_j = \sum_{j=1}^{n} \langle T(x), e_j \rangle e_j. \tag{1.3.6} \]
This suggests the following result.
Proposition 1.3.11 If $T \in \mathcal{K}_0(H)$ then $r(T) = r(T^*)$.
Proof If $H$ is finite-dimensional, the conclusion follows directly from Proposition 1.2.4 and the rank-nullity theorem. Suppose otherwise that $H$ is infinite-dimensional, and let $r(T) = n$. From (1.3.6), we have
\[ T(x) = \sum_{j=1}^{n} \langle T(x), e_j \rangle e_j = \sum_{j=1}^{n} \langle x, T^*(e_j) \rangle e_j. \]
Let $T^*(e_j) = \theta_j$. Since $\langle Tx, y \rangle = \langle x, T^* y \rangle$, we have
\[ \Bigl\langle \sum_{j=1}^{n} \langle x, \theta_j \rangle e_j, \, y \Bigr\rangle = \Bigl\langle x, \, \sum_{j=1}^{n} \langle y, e_j \rangle \theta_j \Bigr\rangle. \]
Thus
\[ T^*(\cdot) = \sum_{j=1}^{n} \langle \cdot, e_j \rangle \theta_j. \]
Clearly $r(T^*) \le n = r(T)$. The opposite inequality follows from the fact that $T^{**} = T$.
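The simplest instance of this proposition is a rank-one operator, worked out below. The vectors $a, b$ and the operator $T(x) = \langle x, a \rangle b$ are assumptions chosen for illustration; the inner product is linear in the first argument, as in Definition 1.1.14.

```latex
% Assumed example: fixed nonzero a, b \in H and T(x) = \langle x, a \rangle b, so r(T) = 1.
\[
  \langle T x, y \rangle
  = \langle x, a \rangle \langle b, y \rangle
  = \bigl\langle x, \overline{\langle b, y \rangle}\, a \bigr\rangle
  = \bigl\langle x, \langle y, b \rangle\, a \bigr\rangle .
\]
% Hence T^*(y) = \langle y, b \rangle a, whose range is span{a}: r(T^*) = 1 = r(T).
```

The roles of $a$ and $b$ are swapped in passing from $T$ to $T^*$, which mirrors the swap of $e_j$ and $\theta_j$ in the proof.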
1.4 Hilbert–Schmidt Operator
1.4.1 Definition of Hilbert–Schmidt Operator
Now we discuss another important class of compact operators, which was investigated by Hilbert and Schmidt in 1907.
Definition 1.4.1 (Hilbert–Schmidt Operator) Let $T \in B(H)$, and let $\{\varphi_n\}$ be an orthonormal basis for $H$. Then $T$ is said to be a Hilbert–Schmidt operator if
\[ \sum_i \|T\varphi_i\|^2 < \infty. \]
Note that the definition requires a countable orthonormal basis; hence the space is necessarily separable. Another thing to observe is that separable Hilbert spaces have more than one basis, so this raises the question of whether the condition holds for any orthonormal basis or only for a particular one. The answer is that the condition does not depend on the basis. First, let us find a convenient form for the norm. Let $x_k \in H$. Then, $T(x_k)$ can be written as
\[ T(x_k) = \sum_j \langle T(x_k), \varphi_j \rangle \varphi_j. \]
Letting $x_k = \varphi_k$ and substituting back,
\[ T(\varphi_k) = \sum_j \langle T(\varphi_k), \varphi_j \rangle \varphi_j. \]
So
\[ \|T(\varphi_k)\|^2 = \sum_j |\langle T(\varphi_k), \varphi_j \rangle|^2. \]
Take the summation on $k$,
\[ \sum_k \|T(\varphi_k)\|^2 = \sum_k \sum_j |\langle T(\varphi_k), \varphi_j \rangle|^2. \]
Hence we can define the Hilbert–Schmidt norm to be
\[ \|T\|_2 = \Bigl( \sum_k \|T(\varphi_k)\|^2 \Bigr)^{1/2} = \Bigl( \sum_{j,k} |\langle T(\varphi_k), \varphi_j \rangle|^2 \Bigr)^{1/2}, \]
and therefore the condition in the definition is equivalent to saying that
\[ \sum_{j,k} |\langle T(\varphi_k), \varphi_j \rangle|^2 < \infty. \]
Now, we show that the condition is independent of the choice of the basis. Let $\{u_k\}$ be another orthonormal basis for $H$. By representing $T(\varphi_k)$ and $T(u_k)$ in the basis $\{u_j\}$, it is easy to see that
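\[ \sum_k \|T\varphi_k\|^2 = \sum_{j,k} |\langle T(\varphi_k), u_j \rangle|^2 = \sum_{j,k} |\langle T^*(u_j), \varphi_k \rangle|^2 = \sum_j \|T^*(u_j)\|^2. \tag{1.4.1} \]
Similarly,
\[ \sum_k \|Tu_k\|^2 = \sum_j \|T^*(u_j)\|^2. \tag{1.4.2} \]
The combination of (1.4.1) and (1.4.2) gives
\[ \sum_k \|T\varphi_k\|^2 = \sum_k \|Tu_k\|^2. \]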
We denote the Hilbert–Schmidt operators by HS operators, and the Hilbert–Schmidt norm by $\|\cdot\|_2$. The set of all Hilbert–Schmidt operators on a separable Hilbert space is denoted by $\mathcal{K}_2(H)$.
Remark It is important to emphasize here that if $\{\varphi_n\}$ is an orthonormal basis for a space $X$ then $\{\overline{\varphi_n}\}$ is an orthonormal basis for $X$.
1.4.2 Basic Properties of HS Operators
The following basic properties of HS operators follow easily from the above discussion.
Proposition 1.4.2 Let $T$ be a HS operator (i.e., $T \in \mathcal{K}_2(H)$). Then
(1) $\|T\|_2 = \|T^*\|_2$.
(2) $\|T\| \le \|T\|_2$.
Proof The first assertion follows immediately from (1.4.2). For the second assertion, let $x \in H$ and $\{\varphi_n\}$ be an orthonormal basis for $H$. Then
\[ \|Tx\|^2 = \sum_k |\langle Tx, \varphi_k \rangle|^2 = \sum_k |\langle x, T^*\varphi_k \rangle|^2 \le \|x\|^2 \, \|T\|_2^2. \]
This implies that $\|Tx\| \le \|x\| \, \|T\|_2$ for all $x \in H$, whence the result.
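In finite dimensions the Hilbert–Schmidt norm is the Frobenius norm of the matrix, and assertion (2) becomes the familiar comparison between the spectral and Frobenius norms. The sketch below checks this for a real $2 \times 2$ matrix using a closed-form largest eigenvalue of $A^{\mathsf T} A$; the matrix `A` and the helper names are illustrative assumptions.

```python
import math

def hs_norm(A):
    """Hilbert–Schmidt (Frobenius) norm: square root of the sum of squared entries."""
    return math.sqrt(sum(a * a for row in A for a in row))

def operator_norm_2x2(A):
    """Operator norm of a real 2x2 matrix: square root of the largest eigenvalue of A^T A."""
    (a, b), (c, d) = A
    # Entries of the symmetric matrix A^T A = [[p, q], [q, r]].
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d
    # Largest eigenvalue of a symmetric 2x2 matrix in closed form.
    lam_max = 0.5 * (p + r + math.sqrt((p - r) ** 2 + 4.0 * q * q))
    return math.sqrt(lam_max)

A = [[1.0, 2.0], [3.0, 4.0]]
print(operator_norm_2x2(A) <= hs_norm(A))
```

Equality holds exactly when $A^{\mathsf T} A$ has at most one nonzero eigenvalue, that is, when $A$ has rank at most one.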
It is clear that the set of all Hilbert–Schmidt operators forms a linear subspace.
Proposition 1.4.3 Let $T_1, T_2 \in \mathcal{K}_2(H)$ and $\alpha \in F$. Then $\alpha T_1, T_1 + T_2 \in \mathcal{K}_2(H)$.
Proof Use the basic properties of the norm and the triangle inequality. Details are left to the reader.
1.4.3 Relations with Compact and Finite-Rank Operators
The next result demonstrates two important fundamental properties of HS operators: every finite-rank operator is a HS operator, and every HS operator is compact.
Theorem 1.4.4 For a Hilbert space $H$, $\mathcal{K}_0(H) \subset \mathcal{K}_2(H) \subset \mathcal{K}(H)$.
Proof The first inclusion asserts that every f.r. operator is HS. To prove this, let $T \in \mathcal{K}_0(H)$ with $r(T) = \dim(\operatorname{Im}(T)) = m$, and choose an orthonormal basis $\{\varphi_n\}$ for $H$ such that $\varphi_n \in \operatorname{Im}(T)$ for $n \le m$ and $\varphi_n \in (\operatorname{Im}(T))^\perp$ for $n > m$. By Proposition 1.2.4(4), $(\operatorname{Im}(T))^\perp = \ker(T^*)$. So $T^*(\varphi_n) = 0$ for all $n > m$, and consequently
\[ \sum_{n=1}^{\infty} \|T^*\varphi_n\|^2 = \sum_{n=1}^{m} \|T^*\varphi_n\|^2 < \infty. \]
Hence, $T^*$ is a HS operator, and by Proposition 1.4.2, $T$ is HS and $T \in \mathcal{K}_2(H)$. For the second inclusion, let $T \in \mathcal{K}_2(H)$ and $\{\varphi_n\}$ be an orthonormal basis for $H$. Then
\[ \|T\|_2^2 = \sum_j \|T\varphi_j\|^2 < \infty. \]
Note that for $x \in H$, we have
\[ x = \sum_{j=1}^{\infty} \langle x, \varphi_j \rangle \varphi_j \ \Rightarrow\ T(x) = \sum_{j=1}^{\infty} \langle x, \varphi_j \rangle T(\varphi_j). \]
Define the sequence
\[ T_n(x) = \sum_{j=1}^{n} \langle x, \varphi_j \rangle T(\varphi_j). \]
Then each $T_n$ is a finite-rank operator, since $R(T_n)$ is contained in the span of $\{T(\varphi_1), T(\varphi_2), \ldots, T(\varphi_n)\}$. By Proposition 1.4.2(2),
\[ \|T_n - T\|^2 \le \|T_n - T\|_2^2 = \sum_{j=n+1}^{\infty} \|T\varphi_j\|^2 \le \sum_{j=1}^{\infty} \|T\varphi_j\|^2 < \infty. \]
Since the tail of a convergent series tends to zero, taking $n \to \infty$ gives $\|T_n - T\| \to 0$. Note that $\{T_n\}$ is a sequence of f.r. operators, which are compact operators, and the result follows from Theorem 1.3.7.
The preceding result simply states that every f.r. operator is a HS operator, and every HS operator is compact. The proper inclusions imply the existence of compact operators that are not HS, and the existence of HS operators that are not of finite rank. A useful conclusion drawn from the preceding theorem is
Corollary 1.4.5 $\overline{\mathcal{K}_0(H)} = \mathcal{K}_2(H)$ in the HS norm $\|\cdot\|_2$, that is, for every $T \in \mathcal{K}_2(H)$ there exists a sequence $T_n \in \mathcal{K}_0(H)$ such that $\|T_n - T\|_2 \to 0$.
Proof Details are left to the reader as an exercise.
The following theorem gives a description of each class.
Theorem 1.4.6 Let $T \in B(H)$ be given by
\[ T(x) = \sum_n \alpha_n \langle x, \varphi_n \rangle u_n, \]
where $(\varphi_n)$ and $(u_n)$ are two orthonormal bases in $H$ and $(\alpha_n)$ is a bounded sequence in the underlying field $F$, say $\mathbb{R}$. Then
(1) $T$ is a compact operator if and only if $\lim \alpha_n = 0$.
(2) $T$ is a HS operator if and only if $\sum |\alpha_n|^2 < \infty$.
(3) $T$ is of finite rank if and only if there exists $N \in \mathbb{N}$ such that $\alpha_n = 0$ for all $n \ge N$.
Proof For (1), let $T$ be compact. If $\alpha_n \not\to 0$, then there exist $\varepsilon > 0$ and a subsequence $(\alpha_{n_j})$ such that $|\alpha_{n_j}| \ge \varepsilon > 0$ for all $j$. It is easy to see that, for $i \ne j$,
\[ \|T\varphi_{n_j} - T\varphi_{n_i}\|^2 = |\alpha_{n_j}|^2 + |\alpha_{n_i}|^2 \ge 2\varepsilon^2. \]
Hence, the sequence $\{T(\varphi_{n_j})\}$ has no convergent subsequence, and this implies that $T$ cannot be compact, a contradiction. Suppose now that $\lim \alpha_n = 0$. Define the sequence
\[ T_n(x) = \sum_{k=1}^{n} \alpha_k \langle x, \varphi_k \rangle u_k. \]
Then each $T_n$ is of finite rank, and using the same argument of the proof of the previous theorem we see that $\|T_n - T\| \to 0$, which implies that $T$ is compact. For (2), note that
\[ \sum_n \|T\varphi_n\|^2 = \sum_n \sum_k |\langle T\varphi_n, u_k \rangle|^2 = \sum_n \sum_k |\alpha_n \langle u_n, u_k \rangle|^2 = \sum_n |\alpha_n|^2. \]
For (3), let $T$ be a finite-rank operator. Note that $T(\varphi_k) = \alpha_k u_k$. So we have $R(T) \subseteq \operatorname{span}\{u_1, u_2, \ldots, u_m\}$ for some $m$. This means that $\alpha_k = 0$ for all $k \ge m + 1$. On the other hand, if $\alpha_k \ne 0$ for infinitely many $k \in \mathbb{N}$, then $u_k \in R(T)$ for all such $k$; hence $\dim(R(T)) = \infty$, and, consequently, $T$ is not of finite rank.
In light of the preceding theorem, if $T$ is a f.r. operator, then $\alpha_n = 0$ for all but a finite number of terms, and this implies
\[ \sum_n |\alpha_n|^2 < \infty, \tag{1.4.3} \]
which also implies that $\alpha_n \to 0$. Thus, $T$ is compact. This leads to Theorem 1.4.4. Moreover, if $T$ is compact, then $\alpha_n \to 0$, but this doesn’t necessarily imply (1.4.3), and so $T$ may not be HS. If (1.4.3) holds, then $T$ is HS, but this doesn’t necessarily mean that $\alpha_n = 0$ for all but a finite number of terms, hence $T$ may not be of finite rank. It turns out that the results of the last two theorems are fully consistent with each other, and the last theorem is very helpful in constructing examples of compact operators that are not HS, and of HS operators that are not of finite rank. The following example is a good application of the preceding theorem.
Example 1.4.7 Let $T : \ell^2 \to \ell^2$ be given by
\[ T(x) = \sum_n \alpha_n x_n \varphi_n, \]
for some orthonormal basis $\{\varphi_n\}$ of $\ell^2$. The operator
\[ T(x) = \sum_n \frac{x_n \varphi_n}{\sqrt{n}} \]
is a compact operator but not HS. This is because $\alpha_n = \frac{1}{\sqrt{n}} \to 0$ and $\sum \frac{1}{n} = \infty$. Furthermore, the operator
\[ T(x) = \sum_n \frac{x_n \varphi_n}{n} \]
is HS but not f.r. since $\alpha_n = \frac{1}{n} \ne 0$ for all $n$, and $\sum \frac{1}{n^2} < \infty$.
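The dichotomy in this example can be made tangible by tabulating partial sums of $\sum |\alpha_n|^2$ for the two coefficient sequences. This is only a numerical illustration of divergence versus convergence, not a proof; the helper name `partial_sum` and the cut-offs are assumptions introduced here.

```python
def partial_sum(alpha, N):
    """Return sum_{n=1}^{N} |alpha(n)|^2 for a coefficient sequence alpha."""
    return sum(abs(alpha(n)) ** 2 for n in range(1, N + 1))

# alpha_n = 1/sqrt(n): alpha_n -> 0 but sum |alpha_n|^2 is the harmonic series.
compact_not_hs = lambda n: 1.0 / n ** 0.5
# alpha_n = 1/n: sum |alpha_n|^2 converges, yet no alpha_n vanishes.
hs_not_finite_rank = lambda n: 1.0 / n

for N in (10, 1000, 100000):
    # First column keeps growing (roughly like log N); second stays bounded.
    print(N, partial_sum(compact_not_hs, N), partial_sum(hs_not_finite_rank, N))
```

The first column grows without bound while the second stays below $\pi^2/6$, matching the classification in Theorem 1.4.6.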
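1.4.4 The Fredholm Operator
Example 1.3.6 demonstrated the fact that Fredholm integral operators defined on $L^p$ are compact. In the particular case $p = 2$, we have the advantage of dealing with an orthonormal basis for the space, which will allow us to work with HS norms. In particular, we have the following result.
Theorem 1.4.8 The Fredholm integral operator $K : L^2([a, b]) \to L^2([a, b])$ defined by
\[ (Ku)(x) = \int_a^b k(x, y) u(y)\,dy \]
with $k \in L^2([a, b] \times [a, b])$ is a Hilbert–Schmidt operator.
Proof Since $K$ is defined on a Hilbert space, let $\{\varphi_n\}$ be an orthonormal basis for $L^2([a, b])$. Since $\{\overline{\varphi_n}\}$ is also an orthonormal basis, using Parseval’s identity, the Dominated Convergence Theorem (DCT), and the fact that $k \in L^2$, we have that
\[
\begin{aligned}
\sum_{n=1}^{\infty} \|K\varphi_n\|^2 &= \sum_{n=1}^{\infty} \int_a^b \Bigl| \int_a^b k(x, y) \varphi_n(y)\,dy \Bigr|^2 dx \\
&= \sum_{n=1}^{\infty} \int_a^b |\langle k(x, \cdot), \overline{\varphi_n} \rangle|^2\,dx \\
&= \int_a^b \sum_{n=1}^{\infty} |\langle k(x, \cdot), \overline{\varphi_n} \rangle|^2\,dx \quad \text{(DCT)} \\
&= \int_a^b \|k(x, \cdot)\|^2\,dx \\
&= \int_a^b \int_a^b |k(x, y)|^2\,dy\,dx \\
&= \|k\|_2^2 < \infty.
\end{aligned}
\]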
The preceding theorem demonstrates that the Fredholm integral operator defined on $L^2[a, b]$ is HS. In fact, the converse is also true. Namely, every HS operator defined on $L^2[a, b]$ is an integral operator. This striking result justifies the particular importance of HS operators as compact operators that behave as integral operators and could also be self-adjoint if their kernels are symmetric. The next theorem demonstrates that every HS operator from $L^2$ to $L^2$ is identified with an integral operator with kernel $k \in L^2$.
Theorem 1.4.9 Every Hilbert–Schmidt operator defined on $X = L^2([a, b])$ is an integral operator with square-integrable kernel.
Proof Consider a Hilbert–Schmidt operator $K : X \to X$ and let $\{\varphi_n\}$ be an orthonormal basis for $X$. So
\[ \sum_{n=1}^{\infty} \|K\varphi_n\|^2 < \infty, \]
hence $\sum_{n=1}^{\infty} (K\varphi_n)$ converges in $X$. Now for $u \in X$ we have
\[ u = \sum_{n=1}^{\infty} \langle u, \varphi_n \rangle \varphi_n. \]
It follows that
\[
\begin{aligned}
(Ku)(x) &= K\Bigl( \sum_{n=1}^{\infty} \langle u, \varphi_n \rangle \varphi_n \Bigr)(x) \\
&= \sum_{n=1}^{\infty} \langle u, \varphi_n \rangle (K\varphi_n)(x) \\
&= \sum_{n=1}^{\infty} \Bigl( \int_a^b u(y) \overline{\varphi_n(y)}\,dy \Bigr) (K\varphi_n)(x) \\
&= \int_a^b u(y) \Bigl( \sum_{n=1}^{\infty} \overline{\varphi_n(y)} (K\varphi_n)(x) \Bigr) dy,
\end{aligned}
\]
where we used the Dominated Convergence Theorem (DCT) in the last step. Now, define
\[ k(x, y) = \sum_{n=1}^{\infty} \overline{\varphi_n(y)} (K\varphi_n)(x). \]
Then $k$ is clearly a mapping on $[a, b] \times [a, b]$ (real-valued when the $\varphi_n$ and $K\varphi_n$ are real-valued) and $k \in L^2([a, b] \times [a, b])$. Therefore, the HS operator $K$ can be written as
\[ (Ku)(x) = \int_a^b k(x, y) u(y)\,dy. \]
1.4.5 Characterization of HS Operators
We end the section with the following observation: It was shown in Theorem 1.4.6(2) that the operator
\[ T(x) = \sum_n \alpha_n \langle x, \varphi_n \rangle u_n \]
is HS if and only if
\[ \sum_n |\alpha_n|^2 < \infty. \]
Note here that if $u_k = \varphi_k$ then we obtain
\[ T(\varphi_k) = \alpha_k \varphi_k. \]
It turns out that the terms of the sequence $(\alpha_k)$ are nothing but the eigenvalues of $T$. These form a countable (possibly finite) set of eigenvalues. This motivates us to investigate the spectral properties of the operators to elaborate more on the eigenvalues and eigenvectors of HS and compact operators. Before we start this investigation, we would like to obtain one final result in this section. The preceding theorem asserts that every HS operator on $L^2$ is an integral operator with a square-integrable kernel. We will combine this with the results above to show that the scalar sum $\sum |\alpha_n|^2$ is, in fact, the square of the HS norm of the operator, so the result that $T$ is HS iff this sum is finite comes as no surprise.
Theorem 1.4.10 Let $T \in \mathcal{K}_2$ be a HS operator $T : L^2([a, b]) \to L^2([a, b])$. Let $(\lambda_n)$ be the eigenvalues of $T$. If the kernel $k \in L^2([a, b] \times [a, b])$ of $T$ is symmetric, then
\[ \|k\|_2^2 = \sum_{n=1}^{\infty} |\lambda_n|^2. \]
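Proof Since $L^2$ is a Hilbert space, let $(\varphi_n)$ be an orthonormal basis of $L^2([a,b])$ consisting of eigenvectors of $T$ corresponding to $(\lambda_n)$. Define the set
$$\psi_{nm}(x,y) = \varphi_n(x)\varphi_m(y). \tag{1.4.4}$$
Then it can be shown that $(\psi_{nm})$ is an orthonormal basis of $L^2([a,b] \times [a,b])$ (see Problem 1.11.32). Since $k \in L^2([a,b] \times [a,b])$,
$$k(x,y) = \sum_n \sum_m \langle k, \psi_{nm} \rangle \psi_{nm},$$
and by Parseval's identity, we have
$$\|k\|_2^2 = \sum_n \sum_m |\langle k, \psi_{nm} \rangle|^2. \tag{1.4.5}$$
On the other hand, using the symmetry of $k$, we have
$$\begin{aligned}
\langle k, \psi_{nm} \rangle &= \int_a^b \Big(\int_a^b k(x,y)\varphi_n(x)\,dx\Big) \varphi_m(y)\,dy \\
&= \int_a^b T(\varphi_n)(y)\,\varphi_m(y)\,dy \\
&= \lambda_n \langle \varphi_n, \varphi_m \rangle \quad \text{since } T(\varphi_n) = \lambda_n \varphi_n \\
&= \begin{cases} \lambda_n & n = m \\ 0 & n \neq m. \end{cases}
\end{aligned}$$
Substituting this into (1.4.5) gives
$$\|k\|_2^2 = \sum_n \sum_m |\langle k, \psi_{nm} \rangle|^2 = \sum_{n=1}^{\infty} |\lambda_n|^2.$$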
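As a finite-dimensional sanity check of Theorem 1.4.10 (an illustrative sketch, not part of the text's argument), a real symmetric matrix plays the role of a symmetric kernel, and its squared Frobenius norm plays the role of $\|k\|_2^2$. The matrix entries and the helper name `sym2_eigenvalues` below are assumptions chosen for illustration.

```python
import math

def sym2_eigenvalues(a, b, d):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]] in closed form."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean + radius, mean - radius

# Hypothetical 2x2 symmetric "kernel" used only for illustration.
a, b, d = 2.0, 1.0, 2.0
lam1, lam2 = sym2_eigenvalues(a, b, d)

# Squared Frobenius norm: the discrete analogue of the squared L^2 norm of k.
frobenius_sq = a * a + 2 * b * b + d * d
# Sum of squared eigenvalues: the discrete analogue of sum |lambda_n|^2.
eigen_sq_sum = lam1 ** 2 + lam2 ** 2

print(frobenius_sq, eigen_sq_sum)
```

The two printed numbers agree up to rounding, which is the matrix counterpart of $\|k\|_2^2 = \sum_n |\lambda_n|^2$.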
1.5 Eigenvalues of Operators
1.5.1 Spectral Analysis
The study of eigenvalues in functional analysis was begun by Hilbert in 1904 during his investigations of quadratic forms in infinitely many variables. In that work, Hilbert used the terms “eigenfunction” and “eigenvalue” for the first time, and he called this new direction of research “Spectral Analysis”. Hilbert’s early theory led to the study of infinite systems of linear equations, and mathematicians like Schmidt and Riesz, and then John von Neumann, were among the first to pursue this direction. Finite systems of linear equations were investigated during the eighteenth and nineteenth centuries, based centrally on the notions of matrices and determinants. Then the problem of solving integral equations emerged at the beginning of the twentieth century. It turned out that the problem of solving an integral or differential equation could be reduced to solving linear systems in infinitely many unknowns. Spectral theory deals with eigenfunctions and eigenvalues of operators on infinite-dimensional spaces and the conditions under which the operators can be expressed in terms of their eigenvalues and eigenfunctions, which helps in solving integral and differential equations by expanding the solution as a series of eigenfunctions. Extending matrices to infinite dimensions leads to the notion of an operator, but many fundamental and crucial properties of matrices are lost in that extension.
One of the remarkable results of the study of operators (or infinite-dimensional matrices) is the fact that compact operators can be represented in a diagonal form in which their eigenvalues are the entries of the “infinite diagonal”. This class of operators was constructed to retain many characteristic properties of matrices, and this is the source of its importance. The present section and the next one will establish two remarkable results that demonstrate that compact operators share some nice properties with matrices, and that they can be viewed as “infinite-dimensional matrices”. Let $T : V \to V$ be a Hermitian (or self-adjoint) map on a finite-dimensional space $V$. It is well-known from linear algebra that $T$ is identified with a Hermitian matrix. The main result is that all eigenvalues of $T$ are real numbers, and its eigenvectors form an orthonormal basis for the space, and this basis can be used to diagonalize the matrix. This is the main spectral theorem for Hermitian maps on finite-dimensional spaces. In infinite-dimensional spaces, the situation can be quite different and more complicated. There might not be “enough” eigenvectors, or eigenfunctions, to form a basis for the space. We will, nevertheless, show that compact self-adjoint operators on Hilbert spaces retain this property.
1.5.2 Definition of Eigenvalues and Eigenfunctions
We begin our discussion with the following definition, which is analogous to the finite-dimensional case.

Definition 1.5.1 (Eigenvalue, Eigenfunction) Let $X$ be a normed space and $T \in B(X)$. Then, a constant $\lambda \in \mathbb{C}$ is called an eigenvalue of $T$ if there exists a nonzero vector $x \in X$ such that $Tx = \lambda x$. The element $x$ is called an eigenvector, or eigenfunction, of $T$ corresponding to $\lambda$.

Notice the following:
(1) The concept of eigenvalue and eigenfunction has been defined for bounded linear operators, but it can also be defined for unbounded operators.
(2) We always exclude the case $x = 0$, since $T0 = \lambda 0$ holds for every scalar $\lambda$ and therefore carries no information.
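1.5.3 Eigenvalues of Self-adjoint Operators
The next proposition gives two basic properties of self-adjoint operators, related to the eigenvalues and the norm of the operator.

Proposition 1.5.2 Let $T \in B(H)$ be a self-adjoint operator. Then
(1) All eigenvalues $\{\lambda_i\}$ of $T$ are real numbers, and eigenvectors corresponding to distinct eigenvalues are orthogonal.
(2) $\|T\| = \sup\{|\langle Tx, x \rangle| : \|x\| \le 1\}$.

Proof Let $Tu = \lambda u$ with $u \neq 0$. Then
$$\lambda \|u\|^2 = \langle Tu, u \rangle = \langle u, Tu \rangle = \bar{\lambda} \|u\|^2,$$
so $\lambda = \bar{\lambda}$, which implies $\lambda \in \mathbb{R}$. Let $Tv = \mu v$ for another eigenvalue $\mu \neq \lambda$. Since $T$ is self-adjoint, we have $\langle Tu, v \rangle = \langle u, Tv \rangle$. Then, since $\mu$ is real,
$$0 = \langle Tu, v \rangle - \langle u, Tv \rangle = \lambda \langle u, v \rangle - \mu \langle u, v \rangle = (\lambda - \mu) \langle u, v \rangle.$$
Hence $u$ and $v$ are orthogonal. This proves (1).

To prove (2), let $M = \sup\{|\langle Tx, x \rangle| : \|x\| \le 1\}$. Then, clearly $|\langle Tx, x \rangle| \le \|T\| \|x\|^2$. Taking the supremum over all $x$ in the closed unit ball $B_H$ gives $M \le \|T\|$. On the other hand, choosing $x, y \in B_H$, and using the Polarization Identity and then the Parallelogram Law for the inner product,
$$\begin{aligned}
\operatorname{Re} \langle Tx, y \rangle &\le \frac{1}{4} \big[ |\langle T(x+y), x+y \rangle| + |\langle T(x-y), x-y \rangle| \big] \\
&\le \frac{1}{4} M \big( \|x+y\|^2 + \|x-y\|^2 \big) \\
&= \frac{1}{2} M \big( \|x\|^2 + \|y\|^2 \big) \\
&\le M.
\end{aligned}$$
Choose $c \in \mathbb{C}$ with $|c| = 1$ such that $c \langle Tx, y \rangle = |\langle Tx, y \rangle|$. Then $\bar{c} y \in B_H$, and
$$|\langle Tx, y \rangle| = \langle Tx, \bar{c} y \rangle \le M.$$
Taking the supremum over all $y \in B_H$, and then the supremum over all $x \in B_H$, gives $\|T\| \le M$. This proves (2).

An immediate corollary is

Corollary 1.5.3 If $T \in B(H)$ is a self-adjoint operator and $\langle Tx, x \rangle = 0$ for all $x \in B_H$, then $T = 0$.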
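Proposition 1.5.2(2) can be observed numerically in the simplest setting, a real symmetric $2 \times 2$ matrix acting on $\mathbb{R}^2$. The sketch below samples unit vectors on the circle and compares the largest value of $|\langle Tx, x \rangle|$ with the operator norm; the matrix and the sampling resolution are assumptions made for illustration only.

```python
import math

# Hypothetical symmetric matrix; its eigenvalues are 3 and 1, so its norm is 3.
T = [[2.0, 1.0], [1.0, 2.0]]

def quadratic_form(theta):
    """Return <Tx, x> for the unit vector x = (cos(theta), sin(theta))."""
    x = (math.cos(theta), math.sin(theta))
    Tx = (T[0][0] * x[0] + T[0][1] * x[1], T[1][0] * x[0] + T[1][1] * x[1])
    return Tx[0] * x[0] + Tx[1] * x[1]

samples = 3600
# Approximate sup{|<Tx, x>| : ||x|| = 1} by sampling the unit circle.
M = max(abs(quadratic_form(2 * math.pi * k / samples)) for k in range(samples))
operator_norm = 3.0  # largest absolute eigenvalue of this symmetric matrix

print(M, operator_norm)
```

The sampled supremum matches the norm because the maximizing direction $(1,1)/\sqrt{2}$ lies on the sampling grid; for a general matrix the sampled value would only approximate the norm from below.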
1.5.4 Eigenvalues of Compact Operators
Proposition 1.5.2 is of extreme importance in determining spectral properties of self-adjoint operators. We managed to find an orthogonal, hence (after normalization) orthonormal, set of eigenvectors of the operator. We will use this property by combining compactness with self-adjointness. This will provide an interesting result. Before stating the result, we recall the notation $T^* = T \in K(H)$, which means that $T$ is a self-adjoint compact operator on a Hilbert space.

Proposition 1.5.4 Let $T^* = T \in K(H)$. Then, $\|T\|$ or $-\|T\|$ is an eigenvalue of $T$.

Proof If $T = 0$ the claim is trivial, so assume $T \neq 0$. From Proposition 1.5.2, we have
$$\|T\| = \sup\{|\langle Tx, x \rangle| : \|x\| = 1\}.$$
Hence, there exists a sequence $x_n \in S_H$ such that $|\langle Tx_n, x_n \rangle| \to \|T\|$, keeping in mind that $\langle Tx_n, x_n \rangle \in \mathbb{R}$ since $T$ is self-adjoint. Passing to a subsequence if necessary, assume $\langle Tx_n, x_n \rangle \to \lambda \in \mathbb{R}$. Then, either $\lambda = \|T\|$ or $\lambda = -\|T\|$. Therefore, using the expansion of the inner product,
$$\|Tx_n - \lambda x_n\|^2 = \|Tx_n\|^2 + \lambda^2 - 2\lambda \langle Tx_n, x_n \rangle \le 2\lambda^2 - 2\lambda \langle Tx_n, x_n \rangle \to 0.$$
Since $T$ is compact, the sequence $\{Tx_n\}$ has a convergent subsequence such that
$$Tx_{n_k} \to z \tag{1.5.1}$$
for some $z \in H$. So $\lambda x_{n_k} \to z$, and by continuity of $T$ this implies that $\lambda T(x_{n_k}) \to Tz$, or
$$Tx_{n_k} \to \frac{1}{\lambda} Tz. \tag{1.5.2}$$
Then from (1.5.1) and (1.5.2), we obtain $Tz = \lambda z$. It remains to show that $z \neq 0$. Note that $\|T\| > 0$. Then, we have
$$\|Tx_n\| \ge \|\lambda x_n\| - \|Tx_n - \lambda x_n\| \to |\lambda| = \|T\| > 0.$$
Taking the limit of both sides along the subsequence, and using continuity of the norm, gives $\|z\| > 0$. Hence $\lambda$ is an eigenvalue of $T$ with a corresponding eigenvector $z$.
An immediate consequence is

Corollary 1.5.5 If $T^* = T \in K(H)$ and $T \neq 0$, then $T$ has at least one nonzero eigenvalue.

This mirrors the finite-dimensional case, where a real symmetric matrix always has real eigenvalues, whereas a general real matrix may have no eigenvalues over $\mathbb{R}$ since its characteristic polynomial may have no real roots. It is well-known from linear algebra that the set of eigenvalues of a matrix is finite. Since they are simply the roots of the characteristic polynomial, there can be at most $n$ eigenvalues for an $n \times n$ matrix. As mentioned at the beginning of this section, things change when we turn to operators on infinite-dimensional spaces. The next example illustrates the idea.

Example 1.5.6 Consider the left-shift operator on $\ell^p$, $1 \le p < \infty$,
$$T(x_1, x_2, x_3, \ldots) = (x_2, x_3, \ldots).$$
To find the eigenvalues of $T$, we write $Tx = \lambda x$, that is, $x_{n+1} = \lambda x_n$ for all $n$. Then $x_{n+1} = \lambda^n x_1$. If $\lambda \neq 0$, the corresponding eigenvector will be of the form
$$x = (x_1, \lambda x_1, \lambda^2 x_1, \lambda^3 x_1, \ldots),$$
and this element does not belong to $\ell^p$ unless $|\lambda| < 1$. For $\lambda = 0$, the vector $(x_1, 0, 0, \ldots)$ is an eigenvector. Hence, the set of eigenvalues of $T$ is $\{\lambda \in \mathbb{C} : |\lambda| < 1\}$.
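The eigenvector in Example 1.5.6 can be explored on a finite truncation. The following sketch (with an assumed truncation length and assumed values of $\lambda$, $x_1$, and $p$) checks that shifting the geometric vector multiplies it by $\lambda$, and that the partial sums of $\sum_k |x_k|^p$ approach the closed form $|x_1|^p / (1 - |\lambda|^p)$ when $|\lambda| < 1$.

```python
def left_shift(x):
    """Left shift on a finite list: drop the first coordinate."""
    return x[1:]

def geometric_vector(lam, x1, n):
    """First n coordinates of (x1, lam*x1, lam^2*x1, ...)."""
    return [x1 * lam ** k for k in range(n)]

# Illustrative values (assumptions): |lam| < 1, truncation to 60 coordinates.
lam, x1, n, p = 0.5, 1.0, 60, 2
x = geometric_vector(lam, x1, n)
shifted = left_shift(x)

# On the truncation, (Tx)_k = x_{k+1} should equal lam * x_k.
residual = max(abs(s - lam * xk) for s, xk in zip(shifted, x))

# Partial sum of |x_k|^p versus the geometric-series limit.
partial_norm_p = sum(abs(v) ** p for v in x)
closed_form = abs(x1) ** p / (1 - abs(lam) ** p)

print(residual, partial_norm_p, closed_form)
```

For $|\lambda| \ge 1$ the same partial sums grow without bound as the truncation length increases, which is why such $\lambda$ are not eigenvalues on $\ell^p$.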
1.6 Spectral Analysis of Operators
1.6.1 Resolvent and Regular Values
We have seen one similarity that compact operators share with matrices: the discrete set of eigenvalues (finite or countable). Now we exhibit another property regarding the eigenvalues. Recall from linear algebra that a scalar $\lambda \in \mathbb{C}$ is called an eigenvalue of a linear mapping $T$ if $T - \lambda I$ is singular, i.e., if $\det(T - \lambda I) = 0$. If $\lambda$ is not an eigenvalue, then $T - \lambda I$ is regular, and so $(T - \lambda I)^{-1}$ exists. Recall that a linear map of a finite-dimensional space into itself is surjective if and only if it is injective. The situation in infinite-dimensional spaces is much more complicated. Recall also that an operator $T$ is invertible if $T^{-1}$ exists and is bounded, so by saying invertible, we mean that the inverse exists in $B(X)$. A sufficient condition for this is provided by the Open Mapping Theorem (OMT), which states that a bounded linear operator of a Banach space onto a Banach space is open; in particular, if the operator is also injective, its inverse is continuous, which by linearity means it is bounded. A value $\lambda \in \mathbb{C}$ for which the mapping $T - \lambda I$ has a bounded inverse is called a regular value. This suggests the following definition:

Definition 1.6.1 (Regular Value) Let $X$ be a Banach space and $T \in B(X)$. A scalar $\lambda \in \mathbb{C}$ is called a regular value of $T$ if $T_\lambda^{-1} = (T - \lambda I)^{-1} \in B(X)$ (i.e., $T_\lambda$ is a bijection with a bounded inverse operator). The set of all regular values of $T$ is called the resolvent set, and is denoted by $\rho(T)$.

The definition is stated for bounded linear operators on Banach spaces, but it can be extended to unbounded operators on normed spaces. Furthermore, every bounded bijective linear operator between Banach spaces has a bounded inverse, thanks to OMT. So the condition of the bounded inverse is automatically satisfied in the case of Banach spaces.
If a scalar $\lambda \notin \rho(T)$, then one of the following holds:
(1) The scalar $\lambda$ is an eigenvalue, so that $T_\lambda$ has no inverse, i.e., $\ker(T_\lambda) \neq \{0\}$, or
(2) $\lambda$ is not an eigenvalue, i.e., $\ker(T_\lambda) = \{0\}$ and $T_\lambda$ has an inverse on its range, but $\lambda$ is not a regular value. By OMT, $T_\lambda$ must be nonsurjective. This is due to one of two reasons:
(a) $\overline{R(T_\lambda)} = X$ and $T_\lambda^{-1}$ is unbounded.
(b) $\overline{R(T_\lambda)} \subsetneq X$.

Hence, there can be more than one reason for the scalar not to be regular. All these values are called spectral values, and the set containing all of them is called the spectrum.

Definition 1.6.2 (Spectrum) Let $T$ be an operator. The set consisting of all numbers that are not in the resolvent set $\rho(T)$ is called the spectrum of $T$, and is denoted by $\sigma(T)$. It consists of three sets: the point spectrum $\sigma_p(T)$ consisting of all eigenvalues of $T$, the continuous spectrum $\sigma_c(T)$ consisting of all scalars $\lambda$ for which $\overline{R(T_\lambda)} = X$ and $T_\lambda^{-1}$ is unbounded, and the residual spectrum $\sigma_r(T)$ consisting of all scalars $\lambda$ for which $\overline{R(T_\lambda)} \subsetneq X$.

As an immediate consequence of the two preceding definitions, we have the following formula:
$$\sigma_p(T) \cup \sigma_c(T) \cup \sigma_r(T) = \sigma(T) = \mathbb{C} \setminus \rho(T).$$
1.6.2 Bounded Below Mapping
It turns out that the spectrum of an operator defined on an infinite-dimensional space contains, but is not limited to, its eigenvalues. This contrasts with the finite-dimensional case, where any scalar that is not an eigenvalue is a regular value for the linear map, and the map is invertible. In fact, the idea of invertibility can be identified with another notion. If $T$ is invertible, then for all $x \in X$,
$$\|x\| = \|T^{-1} T x\| \le \|T^{-1}\| \|Tx\|. \tag{1.6.1}$$
Since $T$ is invertible, $\|T^{-1}\| = M < \infty$ for some $M > 0$. This gives
$$c \|x\| \le \|Tx\|$$
for $c = \frac{1}{M}$. Notice how the operator $T$ is bounded from below. This suggests the following definition:

Definition 1.6.3 (Bounded Below) An operator $T$ is said to be bounded below if for some $c > 0$, we have $c \|x\| \le \|Tx\|$ for all $x \in X$.
The previous argument shows that if an operator is invertible, i.e., a bijection with bounded inverse, then it is bounded below. For the converse, we have the following proposition.

Proposition 1.6.4 Let $T \in B(X)$ for a Banach space $X$. Then, $T$ is bounded below if and only if $T$ is injective and $R(T)$ is closed.

Proof If $T$ is bounded below, then clearly $\ker(T) = \{0\}$, and so $T$ is injective. If $x_n \in X$ and $T(x_n) \to y \in X$, then $\{Tx_n\}$ is Cauchy. It follows from
$$c \|x_n - x_m\| \le \|Tx_n - Tx_m\|$$
that $\{x_n\}$ is Cauchy. Since $X$ is Banach, $x_n \to x$ for some $x \in X$, and by continuity, $Tx_n \to Tx$. Thus, we have $y = Tx$, and $R(T)$ is closed. Conversely, suppose $T$ is injective and $R(T)$ is closed in $X$. Then $R(T)$ is Banach, so the mapping $\hat{T} : X \to R(T)$, $\hat{T}x = Tx$, is bijective, and by OMT $\hat{T}$ has a bounded inverse. By (1.6.1) it is bounded below, taking into account that in $R(T)$ we have $\|Tx\|_{R(T)} = \|Tx\|_X$.
The following result follows from the preceding one.

Proposition 1.6.5 Let $T \in B(X)$ for a Banach space $X$. Then, $T^{-1} \in B(X)$ if and only if $T$ is bounded below and $R(T)$ is dense in $X$.

Proof If $T$ is invertible, then it is bounded below and bijective, so it is surjective and, in particular, has dense range. Conversely, if $T$ is bounded below, then by the previous proposition it is injective and $R(T)$ is closed in $X$. Since $R(T)$ is also dense, we have $R(T) = \overline{R(T)} = X$. So $T$ is surjective. It follows from OMT that $T$ is invertible.
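1.6.3 Spectrum of Bounded Operator
We can define the resolvent set to be
$$\rho(T) = \{\lambda \in \mathbb{C} : T - \lambda I \text{ is bounded below and has dense range}\}.$$
On the other hand,
$$\sigma_c(T) = \{\lambda \in \mathbb{C} : T - \lambda I \text{ is injective with dense range but is not bounded below}\}.$$

Example 1.6.6 In Example 1.5.6, the point spectrum of the left-shift operator on $\ell^p$, $1 \le p < \infty$,
$$T(x_1, x_2, x_3, \ldots) = (x_2, x_3, \ldots)$$
was found to be
$$\sigma_p(T) = \{\lambda \in \mathbb{C} : |\lambda| < 1\}.$$
Now it is clear that $\|Tx\|_p \le \|x\|_p$, i.e., $\|T\| \le 1$, so
$$\|Tx - \lambda x\|_p \ge \|\lambda x\|_p - \|Tx\|_p \ge (|\lambda| - 1) \|x\|_p.$$
Consequently, $T - \lambda I$ is bounded below whenever $|\lambda| > \|T\| = 1$, and every such $\lambda$ belongs to $\rho(T)$ (see Proposition 1.6.7(2) below). Therefore
$$\sigma(T) \subseteq \{\lambda \in \mathbb{C} : |\lambda| \le 1\},$$
and since $\sigma(T)$ is closed and contains $\sigma_p(T)$,
$$\sigma(T) = \{\lambda \in \mathbb{C} : |\lambda| \le 1\}.$$

The next proposition provides information about the structure of the spectrum.

Proposition 1.6.7 Let $T \in B(X)$ for a Banach space $X$.
(1) If $\|T\| < 1$, then $I - T$ (equivalently, $T - I$) is invertible.
(2) If $\|T\| < |\lambda|$, then $\lambda \in \rho(T)$.

Proof For (1), let $\|T\| < 1$. Then $\|T^k\| \le \|T\|^k$ and
$$\sum_{k=0}^{\infty} \|T\|^k = \frac{1}{1 - \|T\|}.$$
Hence, $\sum_k T^k$ converges in $B(X)$, and
$$\begin{aligned}
\lim_n \Big(\sum_{k=0}^{n} T^k\Big)(I - T) &= \lim_n (I - T) \sum_{k=0}^{n} T^k \\
&= \lim_n (I - T^{n+1}) \\
&= I.
\end{aligned}$$
So
$$(I - T)^{-1} = \sum_{k=0}^{\infty} T^k. \tag{1.6.2}$$
For (2), we set $S = \frac{T}{\lambda}$. Then, by assumption, $\|S\| < 1$, and so by (1) $\frac{T}{\lambda} - I$ is invertible, and so is $T - \lambda I$.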
The series in (1.6.2) is called the Neumann series.

Proposition 1.6.8 For any operator $T \in B(X)$, the spectrum $\sigma(T)$ is compact.

Proof Let $\mu \in \rho(T)$, so $T - \mu I$ is invertible. Then for $z \in \mathbb{C}$, we can write
$$T - zI = (T - \mu I) - (z - \mu) I = (T - \mu I)\big(I - (z - \mu) T_\mu^{-1}\big). \tag{1.6.3}$$
Since $\mu \in \rho(T)$, there is $M > 0$ such that $\|T_\mu^{-1}\| \le M$, so we can find $\delta > 0$ such that $\delta < \|T_\mu^{-1}\|^{-1}$. Choose $z$ such that $|z - \mu| \le \delta$. Then
$$\|(z - \mu) T_\mu^{-1}\| = |z - \mu| \, \|T_\mu^{-1}\| < 1.$$
Hence, by Proposition 1.6.7(1) and the Neumann series, $I - (z - \mu) T_\mu^{-1}$ is invertible with a bounded inverse. From (1.6.3) we conclude that $T - zI$ is invertible and $T_z^{-1} \in B(X)$, and so $z \in \rho(T)$. This gives a disk around $\mu$ inside $\rho(T)$, thus $\rho(T)$ is open. Therefore, the spectrum is closed in $\mathbb{C}$. If $\lambda \in \sigma(T)$, then by Proposition 1.6.7(2) $|\lambda| \le \|T\| < \infty$, so the spectrum is bounded in $\mathbb{C}$, and therefore it is compact.
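The Neumann series (1.6.2) can be tried out on a small matrix. The sketch below uses a hypothetical $2 \times 2$ matrix with norm less than 1 (in the max-row-sum norm) and checks that a partial sum $\sum_{k=0}^{n} T^k$ acts as an approximate inverse of $I - T$; the helper names and the number of terms are assumptions for illustration.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matadd(A, B):
    """Entrywise sum of two 2x2 matrices."""
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

I2 = [[1.0, 0.0], [0.0, 1.0]]
# Hypothetical matrix with max row sum 0.3 < 1, so the series converges.
T = [[0.2, 0.1], [0.0, 0.3]]

def neumann_partial_sum(T, n_terms):
    """Return I + T + T^2 + ... + T^(n_terms - 1)."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    power = I2  # current power T^k, starting with T^0 = I
    for _ in range(n_terms):
        total = matadd(total, power)
        power = matmul(power, T)
    return total

S = neumann_partial_sum(T, 60)
I_minus_T = [[I2[i][j] - T[i][j] for j in range(2)] for i in range(2)]
product = matmul(I_minus_T, S)  # equals I - T^60, which is very close to I

print(product)
```

The product differs from the identity only by $T^{60}$, mirroring the identity $(I - T)\sum_{k=0}^{n} T^k = I - T^{n+1}$ used in the proof of Proposition 1.6.7.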
Proposition 1.6.9 Let $T \in B(X)$ for some Banach space $X$. Then
(1) $\lambda \in \sigma(T)$ if and only if $\bar{\lambda} \in \sigma(T^*)$.
(2) If $T$ is invertible, then $\sigma(T^{-1}) = \{\lambda^{-1} : \lambda \in \sigma(T)\}$.
(3) $T$ is invertible if and only if $0 \notin \sigma(T)$.

Proof For (1), note that $T - \lambda I$ is invertible iff $(T - \lambda I)^* = T^* - \bar{\lambda} I$ is invertible. For (2), we have
$$\lambda^{-1} I - T^{-1} = -T^{-1} \lambda^{-1} (\lambda I - T),$$
knowing that $T^{-1} \lambda^{-1} = \lambda^{-1} T^{-1}$ is invertible. Hence, $\lambda I - T$ is invertible iff $\lambda^{-1} I - T^{-1}$ is invertible. (3) follows directly from the definition of the spectrum, since $T = T - 0I$. The details are left to the reader as an exercise (see Problem 1.11.41).
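1.6.4 Spectral Mapping Theorem
For polynomials in particular, we have the following result, known as the “Spectral Mapping Theorem”.

Proposition 1.6.10 (Spectral Mapping Theorem) Let $T \in B(X)$. If $p(z)$ is a polynomial with complex coefficients, then
$$p(\sigma(T)) = \sigma(p(T)).$$

Proof Note that $p(\sigma(T)) = \{p(\lambda) : \lambda \in \sigma(T)\}$. Let $\mu \in \mathbb{C}$. Since $p(z) - \mu$ is a polynomial over $\mathbb{C}$, say of degree $n \ge 1$ with leading coefficient $a \neq 0$, it can be factorized as
$$p(z) - \mu = a (z - \lambda_1) \cdots (z - \lambda_n),$$
so that $p(T) - \mu I = a (T - \lambda_1 I) \cdots (T - \lambda_n I)$, where the factors commute. Then, $\mu \notin \sigma(p(T))$ iff $p(T) - \mu I$ is invertible iff $T - \lambda_j I$ is invertible for all $j = 1, \ldots, n$ iff $\lambda_j \notin \sigma(T)$ for all $j = 1, \ldots, n$. Since $\lambda_1, \ldots, \lambda_n$ are precisely the solutions of $p(z) = \mu$, the last condition says that $\mu \notin p(\sigma(T))$. The details are left to the reader as an exercise (see Problem 1.11.42).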
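In finite dimensions the spectrum is the set of eigenvalues, and for a triangular matrix these are the diagonal entries, so the Spectral Mapping Theorem can be checked by hand or by a few lines of code. The sketch below uses a hypothetical upper triangular matrix and the polynomial $p(z) = z^2 - 4z + 1$, both chosen only for illustration, and evaluates $p(T)$ by Horner's rule.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_of_matrix(coeffs, T):
    """Evaluate p(T) by Horner's rule; coeffs are listed from highest degree."""
    n = len(T)
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        result = matmul(result, T)
        for i in range(n):
            result[i][i] += c  # add c * I
    return result

def poly_of_scalar(coeffs, z):
    """Evaluate p(z) by Horner's rule."""
    value = 0
    for c in coeffs:
        value = value * z + c
    return value

# Hypothetical upper triangular matrix: its eigenvalues are the diagonal entries.
T = [[2, 1], [0, 3]]
coeffs = [1, -4, 1]  # p(z) = z^2 - 4z + 1

pT = poly_of_matrix(coeffs, T)
spectrum_T = sorted(T[i][i] for i in range(2))
spectrum_pT = sorted(pT[i][i] for i in range(2))  # p(T) is again upper triangular
mapped = sorted(poly_of_scalar(coeffs, lam) for lam in spectrum_T)

print(spectrum_pT, mapped)
```

Both lists coincide, illustrating $\sigma(p(T)) = p(\sigma(T))$ in this small case.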
1.6.5 Spectrum of Compact Operators
In the case of compact operators, things change. Again, compact operators prove to be the natural extension of matrices, since they retain spectral properties of linear maps on finite-dimensional spaces.

Proposition 1.6.11 Let $T \in K(X)$ for some Banach space $X$. If $\dim(X) = \infty$, then $0 \in \sigma(T)$.

Proof If $0 \notin \sigma(T)$, then $T$ is invertible. If $T$ is invertible and compact, then $I = T^{-1}T$ is compact by Proposition 1.3.2, and therefore by Theorem 1.3.3 $X$ must be finite-dimensional.
1.7 Spectral Theory of Self-adjoint Compact Operators 1.7.1 Eigenvalues of Compact Self-adjoint Operators Having an uncountable number of eigenvalues is a phenomenon that many operators on infinite-dimensional spaces possess. We will see, however, that compact operators can have a countable set of eigenvalues at most.
Theorem 1.7.1 Let $T^* = T \in K(H)$. Then the set of all eigenvalues $\{\lambda_n\}$ of $T$ is at most countable, and $\lambda_n \to 0$.

Proof If the set of all eigenvalues is finite, we are done. Suppose it is infinite and let $\varepsilon > 0$. We claim that the set $S_\varepsilon$ of eigenvalues $\lambda$ with $|\lambda| \ge \varepsilon$ is finite. If not, then we can construct a sequence $\{\lambda_n\}$ of distinct eigenvalues in $S_\varepsilon$ with corresponding (orthonormal) eigenvectors $\{\varphi_n\}$ such that
$$\langle \varphi_i, \varphi_j \rangle = \delta_{ij}$$
by Proposition 1.5.2(1). Then, for $i \neq j$,
$$\|T(\varphi_j) - T(\varphi_i)\|^2 = \|\lambda_j \varphi_j - \lambda_i \varphi_i\|^2 = |\lambda_j|^2 + |\lambda_i|^2 \ge 2\varepsilon^2 > 0. \tag{1.7.1}$$
But $T$ is compact, so $\{T\varphi_n\}$ must have a convergent subsequence. This contradicts (1.7.1), which implies that $S_\varepsilon$ is finite for every $\varepsilon > 0$, i.e., the complement of $S_\varepsilon$ contains all but finitely many of the $\lambda_n$. Taking $\varepsilon = 1/m$, $m \in \mathbb{N}$, shows that the set of eigenvalues is countable, and the only possibility is that $\lambda_n \to 0$.

The result reveals one of the most important spectral properties that characterize compact operators. The situation in the case of compact operators is very similar to that of linear maps on finite-dimensional spaces, in the sense that both have a discrete (finite or countable) set of eigenvalues. What about eigenvectors? So far, we have no information on the behavior of the corresponding eigenvectors. We will show that they also retain some features from the finite-dimensional case. Before that, we need to introduce the concept of invariant subspaces.
1.7.2 Invariant Subspaces
Definition 1.7.2 (Invariant Subspace) Let $X$ be a normed space and $Y \subset X$ be a subspace. Let $T \in L(X)$. Then, $Y$ is said to be an invariant subspace if $T(Y) \subseteq Y$. The subspace $Y$ is called $T$-invariant.

Invariant subspaces allow us to restrict the mapping to the invariant subspace to obtain a new mapping, denoted by $T_Y$, which is a restriction of the original mapping:
$$T_Y = T|_Y : Y \to Y, \qquad T_Y(x) = T(x) \quad \forall x \in Y.$$
The new mapping is well-defined, but note that it is not necessarily surjective. A trivial restriction can be made by choosing $Y = \{0\}$; then $T_Y = 0$. Invariant subspaces are helpful in reducing operators to simpler operators acting on invariant subspaces. The following result will be very helpful in proving our main theorem of this section.
Proposition 1.7.3 Let $T \in B(H)$, and let $Y$ be a closed subspace of $H$ that is $T$-invariant. Then
(1) If $T$ is self-adjoint, then $T_Y$ is self-adjoint.
(2) If $T$ is compact, then $T_Y$ is compact.

Proof Let $T_Y : Y \to Y$, $T_Y(x) = T(x)$. Note that since $Y$ is closed in $H$, $Y$ is a Hilbert space. Suppose $T = T^*$ and let $x, z \in Y$. Then
$$\langle T_Y x, z \rangle_Y = \langle T_Y x, z \rangle = \langle Tx, z \rangle = \langle x, Tz \rangle = \langle x, T_Y z \rangle = \langle x, T_Y z \rangle_Y.$$
This proves (1). To prove (2), let $\{y_n\}$ be a bounded sequence in $Y$. Then, it is bounded in $H$ with $\|y_n\| = \|y_n\|_Y$, and since $T$ is compact, $\{Ty_n\}$ has a subsequence $\{Ty_{n_k}\}$ that converges in $H$, so it is Cauchy in $H$ with
$$\|T_Y(y_j) - T_Y(y_i)\|_Y = \|T(y_j) - T(y_i)\|.$$
Hence, it is Cauchy in $Y$, which is complete, and consequently it converges in $Y$. This proves (2).
1.7.3 Hilbert–Schmidt Theorem
Now, we come to a major result in the spectral theory of operators. The following theorem provides the missing pieces of information about eigenfunctions.

Theorem 1.7.4 (Hilbert–Schmidt Theorem) If $T^* = T \in K(H)$ (i.e., compact and self-adjoint), then its eigenfunctions $\{\varphi_n\}$ form an orthonormal basis for $\overline{R(T)}$, and their corresponding eigenvalues satisfy
$$|\lambda_1| \ge |\lambda_2| \ge |\lambda_3| \ge \ldots.$$
Moreover, for $x \in H$,
$$T(x) = \sum_{j=1}^{\infty} \lambda_j \langle x, \varphi_j \rangle \varphi_j.$$

Proof Proposition 1.5.4 implies the existence of an eigenvalue $\lambda_1$ of $T$ such that
$$|\lambda_1| = \|T\| = \max\{|\langle Tx, x \rangle| : x \in H, \|x\| = 1\}. \tag{1.7.2}$$
Let $u_1$ be the corresponding eigenvector, and normalize it to get
$$\varphi_1 = \frac{u_1}{\|u_1\|}.$$
Define the following subspace:
$$H_1 = (\operatorname{span}\{\varphi_1\})^{\perp}.$$
Then $H_1$ is a closed subspace of $H$, so it is a Hilbert space. Let $x \in H_1$. Then, $\langle x, \varphi_1 \rangle = 0$, which implies that
$$\langle Tx, \varphi_1 \rangle = \langle x, T\varphi_1 \rangle = \lambda_1 \langle x, \varphi_1 \rangle = 0.$$
Therefore, $Tx \in H_1$, and this shows that $H_1$ is a $T$-invariant subspace of $H$. Now consider the restriction
$$T_{H_1} = T_1 : H_1 \to H_1, \qquad T_1(z) = T(z) \quad \forall z \in H_1.$$
Then by the previous proposition, $T_1$ is self-adjoint and compact on $H_1$. Again, there exists an eigenvalue $\lambda_2$ of $T_1$ such that
$$|\lambda_2| = \|T_1\| = \max\{|\langle T_1 x, x \rangle| : x \in H_1, \|x\| = 1\}. \tag{1.7.3}$$
In view of the definition of $T_1$ and (1.7.2)–(1.7.3), the maximum in (1.7.3) is taken over a subset of the set used in (1.7.2), so $|\lambda_1| \ge |\lambda_2|$. Let $\varphi_2$ be a normalized eigenvector corresponding to $\lambda_2$. Clearly the set $\{\varphi_1, \varphi_2\}$ is orthonormal. Define
$$H_2 = (\operatorname{span}\{\varphi_1, \varphi_2\})^{\perp}.$$
It is clear that $H_2 \subset H_1$ is closed in $H_1$, hence a Hilbert space. Moreover, it is $T$-invariant, so $T_2 = T_{H_2} : H_2 \to H_2$ is (again by the previous proposition) self-adjoint and compact. Then, there exists an eigenvalue $\lambda_3$ of $T_2$ such that
$$|\lambda_3| = \|T_2\| = \max\{|\langle T_2 x, x \rangle| : x \in H_2, \|x\| = 1\}.$$
It is clear that
$$|\lambda_1| \ge |\lambda_2| \ge |\lambda_3|.$$
Proceeding in this way, we obtain a collection of eigenvalues $\{\lambda_n\}$ of $T$, with
$$\|T_n\| = |\lambda_{n+1}| \le |\lambda_n| \tag{1.7.4}$$
for all $n \ge 1$. If the process stops at $n = N < \infty$, in the sense that $T_{H_N} = 0$, then for every $x \in H$
$$T(x) = \sum_{j=1}^{N} \lambda_j \langle x, \varphi_j \rangle \varphi_j,$$
and we have a finite set of eigenvectors $\{\varphi_j : j = 1, \ldots, N\}$ together with a finite set of eigenvalues
$$|\lambda_1| \ge |\lambda_2| \ge \ldots \ge |\lambda_N|.$$
If the process does not stop at any $N < \infty$, we continue the process and we get a sequence
$$|\lambda_1| \ge |\lambda_2| \ge |\lambda_3| \ge \ldots$$
with corresponding countable orthonormal eigenvectors $\{\varphi_1, \varphi_2, \varphi_3, \ldots\}$. Note that every $x \in H$ can be written uniquely as
$$x = \Big(x - \sum_{j=1}^{n} \langle x, \varphi_j \rangle \varphi_j\Big) + \sum_{j=1}^{n} \langle x, \varphi_j \rangle \varphi_j,$$
and since
$$\sum_{j=1}^{n} \langle x, \varphi_j \rangle \varphi_j \in \operatorname{span}\{\varphi_1, \varphi_2, \ldots, \varphi_n\},$$
we have
$$\Big(x - \sum_{j=1}^{n} \langle x, \varphi_j \rangle \varphi_j\Big) \in H_n.$$
Now, let
$$z_n = x - \sum_{j=1}^{n} \langle x, \varphi_j \rangle \varphi_j.$$
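Using (1.7.4) and Bessel’s inequality (which gives $\|z_n\| \le \|x\|$), we obtain
$$\|Tz_n\| = \|T_n z_n\| \le \|T_n\| \|z_n\| \le |\lambda_n| \|x\|. \tag{1.7.5}$$
But
$$T\Big(x - \sum_{j=1}^{n} \langle x, \varphi_j \rangle \varphi_j\Big) = T(x) - \sum_{j=1}^{n} \langle x, \varphi_j \rangle T(\varphi_j) = T(x) - \sum_{j=1}^{n} \lambda_j \langle x, \varphi_j \rangle \varphi_j.$$
Therefore, using (1.7.5),
$$\Big\|T(x) - \sum_{j=1}^{n} \lambda_j \langle x, \varphi_j \rangle \varphi_j\Big\| = \|Tz_n\| \le |\lambda_n| \|x\|.$$
Take $n \to \infty$ and use Theorem 1.7.1 to obtain
$$T(x) = \sum_{j=1}^{\infty} \lambda_j \langle x, \varphi_j \rangle \varphi_j. \tag{1.7.6}$$
This shows that the range space of $T$ lies in the closed span of the eigenvectors $\{\varphi_n\}$. That is, letting $M = \operatorname{span}\{\varphi_n\}_{n \in \mathbb{N}}$, (1.7.6) makes it clear that $R(T) \subseteq \overline{M}$, hence $\overline{R(T)} \subseteq \overline{M}$. On the other hand, let
$$y = \sum_j \alpha_j \varphi_j \in M$$
be a finite linear combination. Since each $\lambda_j \neq 0$, we may set
$$x = \sum_j \frac{\alpha_j}{\lambda_j} \varphi_j \in H,$$
and
$$T(x) = T\Big(\sum_j \frac{\alpha_j}{\lambda_j} \varphi_j\Big) = \sum_j \frac{\alpha_j}{\lambda_j} T(\varphi_j) = \sum_j \alpha_j \varphi_j = y.$$
Therefore, $y \in R(T)$, and so $M \subseteq R(T)$ and $\overline{M} \subseteq \overline{R(T)}$. This completes the proof.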
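The expansion (1.7.6) can be seen at work in the smallest nontrivial case. The following sketch takes a hypothetical symmetric $2 \times 2$ matrix whose orthonormal eigenpairs are known in closed form and compares the direct product $Tx$ with the sum $\sum_j \lambda_j \langle x, \varphi_j \rangle \varphi_j$; the matrix, the test vector, and the function names are assumptions made for illustration.

```python
import math

s = 1.0 / math.sqrt(2.0)
# Orthonormal eigenpairs of the hypothetical matrix [[2, 1], [1, 2]]:
# eigenvalue 3 with (1, 1)/sqrt(2), eigenvalue 1 with (1, -1)/sqrt(2).
eigenpairs = [(3.0, (s, s)), (1.0, (s, -s))]

def apply_T(x):
    """Direct matrix-vector product with [[2, 1], [1, 2]]."""
    return (2.0 * x[0] + x[1], x[0] + 2.0 * x[1])

def inner(x, y):
    """Real inner product on R^2."""
    return x[0] * y[0] + x[1] * y[1]

def spectral_sum(x):
    """Sum of lambda_j <x, phi_j> phi_j over all eigenpairs."""
    out = [0.0, 0.0]
    for lam, phi in eigenpairs:
        coeff = lam * inner(x, phi)
        out[0] += coeff * phi[0]
        out[1] += coeff * phi[1]
    return tuple(out)

x = (0.7, -1.2)
direct = apply_T(x)
expanded = spectral_sum(x)

print(direct, expanded)
```

Both computations give the same vector up to rounding, which is the finite-dimensional form of the diagonal representation obtained in the proof.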
Problem 1.11.34 gives another approach to show that $M$ is an orthonormal basis for the range space. The preceding theorem allows us to construct an orthonormal basis for $\overline{R(T)}$ from the eigenvectors of $T$, whether $H$ is separable or not. The form
$$T(x) = \sum_{j=1}^{\infty} \lambda_j \langle x, e_j \rangle e_j,$$
which was obtained at the end of the proof (with $e_j = \varphi_j$), is called the diagonal operator with entries $\{\lambda_n\}$ on the diagonal:
$$\begin{pmatrix} \lambda_1 & & \\ & \lambda_2 & \\ & & \ddots \end{pmatrix}.$$
It turns out that compact self-adjoint operators defined on Hilbert spaces can be unitarily diagonalized, a property similar to that of Hermitian linear mappings on finite-dimensional spaces.
1.7.4 Spectral Theorem For Self-adjoint Compact Operators
The next theorem is a continuation of the previous theorem. It extends the set of eigenvectors to cover the whole space, so that a complete representation is obtained.

Theorem 1.7.5 (Spectral Theorem For Self-adjoint Compact Operators) Let $T^* = T \in K(H)$ (i.e., compact and self-adjoint) on a Hilbert space $H$. Then, its eigenfunctions form an orthonormal basis for the space $H$. This orthonormal basis is countable if $H$ is separable.

Proof Let
$$H = \overline{R(T)} \oplus N(T).$$
So it suffices to find a basis for the null space of $T$. If $H$ is separable, then so is the null space $N(T)$. Let $\{\varphi_n\}$ be a countable orthonormal basis for $N(T)$. Since $T\varphi_n = 0$ for all $n$, we consider them as eigenvectors corresponding to $\lambda = 0$. Let
$$\{\phi_n\} = \{\varphi_n\} \cup \{e_n\},$$
where $\{e_n\}$ is the orthonormal basis for $\overline{R(T)}$ that was constructed in the proof of the Hilbert–Schmidt theorem. Note that Proposition 1.5.2(1) implies that
$$\langle e_n, \varphi_m \rangle = 0$$
for all $n, m$. Indeed, any $x \in H$ can be written uniquely as
$$x = \sum_n \big[ \langle x, e_n \rangle e_n + \langle x, \varphi_n \rangle \varphi_n \big].$$
Therefore, $\{\phi_n\}$ is a countable orthonormal basis for $H$. If $H$ is nonseparable, then we choose an orthonormal set $\{\varphi_\alpha : \alpha \in \Lambda\}$ to be a basis for $N(T)$, and if $\alpha \in \Lambda$, then $\lambda_\alpha = 0$. Thus
$$\{\phi_\alpha\} = \{\varphi_\alpha : \alpha \in \Lambda\} \cup \{e_n : n \in \mathbb{N}\}.$$
This set is not necessarily countable. We proceed with the same argument as in the separable case.

The spectral theorem in its two parts shows that a compact self-adjoint operator $T \in K(H)$ can be written as
$$T(x) = \sum_j \lambda_j \langle x, \varphi_j \rangle \varphi_j,$$
where $\{\varphi_n\}$ is an orthonormal basis for $H$ and $\{\lambda_n\}$ is the set of eigenvalues, which is either finite or countable and decreasing in absolute value with $\lambda_n \to 0$. The next theorem shows that the converse of the spectral theorem holds as well.

Theorem 1.7.6 Let $T \in B(H)$ be a bounded linear operator on a Hilbert space $H$ such that for every $x \in H$
$$T(x) = \sum_j \lambda_j \langle x, \varphi_j \rangle \varphi_j,$$
where $\{\varphi_n\}$ is an orthonormal basis for $H$ and $\{\lambda_n\}$ is the set of eigenvalues, which are real and either finite in number or countable and decreasing in absolute value with $\lambda_n \to 0$. Then $T$ is a compact self-adjoint operator.

Proof If the system $\{\lambda_n, \varphi_n\}$ is finite, then $T$ is of finite rank and thus it is compact. If not, then we define the following operators:
$$T_n(x) = \sum_{j=1}^{n} \lambda_j \langle x, \varphi_j \rangle \varphi_j.$$
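Then for each $n$, $T_n$ is of finite rank and so it is compact. Since $|\lambda_n|$ is decreasing, for every $x \in H$ we have
$$\|(T_n - T)x\|^2 = \Big\|\sum_{j=n+1}^{\infty} \lambda_j \langle x, \varphi_j \rangle \varphi_j\Big\|^2 = \sum_{j=n+1}^{\infty} \lambda_j^2 |\langle x, \varphi_j \rangle|^2.$$
So by Bessel’s inequality this yields
$$\|(T_n - T)x\|^2 \le \lambda_{n+1}^2 \|x\|^2,$$
that is, $\|T_n - T\| \le |\lambda_{n+1}|$. As $n \to \infty$, $\lambda_n \to 0$, and this gives the uniform convergence of $T_n$ to $T$, which implies that $T$ is compact. Furthermore, since the $\lambda_j$ are real, we have
$$\begin{aligned}
\langle Tx, y \rangle &= \Big\langle \sum_j \lambda_j \langle x, \varphi_j \rangle \varphi_j, y \Big\rangle \\
&= \sum_j \lambda_j \langle x, \varphi_j \rangle \langle \varphi_j, y \rangle \\
&= \Big\langle x, \sum_j \lambda_j \langle y, \varphi_j \rangle \varphi_j \Big\rangle \\
&= \langle x, Ty \rangle.
\end{aligned}$$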
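The estimate $\|(T_n - T)x\| \le |\lambda_{n+1}| \|x\|$ used in the proof can be checked on a finite truncation of a diagonal operator. In the sketch below, the choice $\lambda_j = 1/j$, the truncation to 50 coordinates, and the test vector are all assumptions for illustration; the standard unit vectors play the role of $\{\varphi_j\}$.

```python
import math

N = 50  # truncation dimension (an assumption; the text works in infinite dimensions)
lam = [1.0 / j for j in range(1, N + 1)]     # decreasing eigenvalues tending to 0
x = [math.sin(j) for j in range(1, N + 1)]   # arbitrary test vector
norm_x = math.sqrt(sum(v * v for v in x))

def tail_error(n):
    """Norm of (T - T_n)x, where T_n keeps only the first n diagonal entries."""
    return math.sqrt(sum((lam[j] * x[j]) ** 2 for j in range(n, N)))

# lam[n] is lambda_{n+1} in the 1-based indexing of the text.
bounds_hold = all(tail_error(n) <= lam[n] * norm_x + 1e-15 for n in range(N))

print(bounds_hold, tail_error(10), lam[10] * norm_x)
```

The bound holds for every truncation level, and the right-hand side shrinks with $\lambda_{n+1}$, which is the mechanism behind the uniform convergence $T_n \to T$.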
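1.8 Fredholm Alternative
1.8.1 Resolvent of Compact Operators
According to Proposition 1.6.4, a bounded linear operator $T$ on a Banach space is bounded below if and only if $T$ is injective and $R(T)$ is closed. The following proposition treats this result differently.

Proposition 1.8.1 Let $T \in K(X)$, and let $0 \neq \lambda \in \mathbb{C}$ be a complex number. If $T_\lambda$ is injective, then $T_\lambda$ is bounded below.

Proof If not, then for each $n$, there exists $x_n \in S_X$ such that $T_\lambda(x_n) \to 0$. Since $T$ is compact, a subsequence satisfies $Tx_{n_k} \to z$ for some $z \in X$, so $\lambda x_{n_k} = Tx_{n_k} - T_\lambda(x_{n_k}) \to z$, and hence $\|z\| = |\lambda| > 0$. By continuity of $T$, $\lambda T x_{n_k} \to Tz$, so $Tz = \lambda z$ with $z \neq 0$. Hence, $\lambda$ is an eigenvalue of $T$, so $\lambda \in \sigma(T)$ and therefore $T_\lambda$ is not injective, a contradiction.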
As a consequence, we have the following.

Corollary 1.8.2 Let $T \in K(X)$, and let $0 \neq \lambda \in \mathbb{C}$ be a complex number. If $T_\lambda = T - \lambda I$ is injective, then $R(T - \lambda I)$ is closed, and if $Y$ is a closed subspace of $X$, then $T_\lambda(Y)$ is closed in $X$.

Proof The first part of the conclusion follows from Propositions 1.8.1 and 1.6.4. For the second part, it can be readily seen from the previous proposition that $T_\lambda|_Y$ is also bounded below, hence the claim follows by applying Proposition 1.6.4 again.

Theorem 1.8.3 Let $T \in K(X)$, and let $0 \neq \lambda \in \mathbb{C}$ be a complex number. If $T_\lambda$ is injective, then $T_\lambda$ is surjective.

Proof Suppose not. Let $Y = T_\lambda(X)$. Then, by Corollary 1.8.2, $Y = \overline{Y} \subsetneq X$. Consider $Y_n = T_\lambda^n(X)$. Since $T_\lambda$ is injective, $X \supsetneq Y_1 \supsetneq Y_2 \supsetneq \ldots$, with each $Y_n$ being closed. Let $0 < \gamma < 1$. By the Riesz lemma, there exists $y_n \in Y_n \setminus Y_{n+1}$ with $\|y_n\| = 1$ such that $d(y_n, Y_{n+1}) \ge \gamma$. Note that if $m > n$, then $Y_{m+1} \subsetneq Y_m \subseteq Y_{n+1}$. It follows that
$$(T - \lambda I) y_m \in Y_{m+1} \quad \text{and} \quad (T - \lambda I) y_n \in Y_{n+1}. \tag{1.8.1}$$
This gives
$$\begin{aligned}
Ty_n - Ty_m &= \lambda y_n - \lambda y_m + T_\lambda(y_n) - T_\lambda(y_m) \\
&= \lambda y_n - \big(\lambda y_m + T_\lambda(y_m) - T_\lambda(y_n)\big) \\
&= \lambda y_n - z,
\end{aligned} \tag{1.8.2}$$
where $z = \lambda y_m + T_\lambda(y_m) - T_\lambda(y_n) \in Y_{n+1}$. From (1.8.1), (1.8.2), and the choice of $y_n$, we have
$$\|Ty_n - Ty_m\| = |\lambda| \, \|y_n - \lambda^{-1} z\| \ge |\lambda| \gamma > 0.$$
Hence, the sequence $\{Ty_n\}$ cannot have a convergent subsequence, which contradicts the fact that $T$ is compact.

Now, the combination of Propositions 1.6.4 and 1.8.1, and Theorem 1.8.3 implies the following.

Corollary 1.8.4 Let $T \in K(X)$ for a Banach space $X$, and let $0 \neq \lambda \in \mathbb{C}$ be a complex number. Then, $T_\lambda = T - \lambda I$ is injective iff $T_\lambda$ is invertible.
1.8.2 Fundamental Principle
Using the notion of a compact operator, we have been able to find an analog of the result in the finite-dimensional case, which states that invertibility and injectivity are equivalent for linear maps. The previous corollary implies that if $0 \neq \lambda \notin \sigma_p(T)$ for some compact $T$, then $\lambda \in \rho(T)$. This also means that for a compact operator
$$\sigma_p(T) \setminus \{0\} = \sigma(T) \setminus \{0\},$$
or in other words, when $\dim(X) = \infty$,
$$\sigma(T) = \sigma_p(T) \cup \{0\}.$$
This leads to the remarkable result commonly called the Fredholm Alternative, which states the following.

Theorem 1.8.5 (Fredholm Alternative) Let $T \in K(X)$ for a Banach space $X$, and let $0 \neq \lambda \in \mathbb{C}$ be a complex number. Then, we have one of the following:
(1) $T_\lambda$ is noninjective, i.e., $N(T - \lambda I) \neq \{0\}$, or
(2) $T_\lambda$ is invertible, i.e., $T - \lambda I$ has a bounded inverse.
The operator $T_\lambda$ satisfying this Fredholm Alternative principle shall be called the Fredholm Operator. The two statements mean precisely the following: either the equation
$$T(u) - \lambda u = 0 \tag{1.8.3}$$
has a nontrivial solution, or the equation
$$T(u) - \lambda u = v \tag{1.8.4}$$
has a unique solution for every $v \in X$.
1.8.3 Fredholm Equations
In the language of integral equations, we can also say: either the equation
$$\lambda f(x) = \int_a^b k(x,t) f(t)\,dt$$
has a nontrivial solution $f$, or the equation
$$\lambda f(x) - \int_a^b k(x,t) f(t)\,dt = g(x)$$
has a unique solution for every function $g$, keeping in mind that the integral operator is a Hilbert–Schmidt operator, which is a compact operator. We state the Fredholm Alternative theorem for Fredholm integral equations.

Theorem 1.8.6 (Fredholm Alternative for Fredholm Equations) Let $\lambda \neq 0$. Either the equation
$$\int_a^b k(x,y) u(y)\,dy - \lambda u(x) = f(x)$$
has a unique solution for all $f \in L^2[a,b]$, or the equation
$$\int_a^b k(x,y) u(y)\,dy - \lambda u(x) = 0$$
has a nontrivial solution $u \in L^2[a,b]$.

Example 1.8.7 The Fredholm equation of the first kind takes the form
$$\int_a^b k(x,y) f(y)\,dy = g(x). \tag{1.8.5}$$
This can be written as
$$Kf = g,$$
where $K$ is a compact integral operator defined as
$$K(\cdot) = \int_a^b k(x,y)(\cdot)\,dy.$$
If $0 \in \sigma_p(K)$, then $K_0 = K$ is noninjective, i.e., $K(f) = 0$ has a nontrivial solution. As in the first case of the Fredholm Alternative (Theorem 1.8.5), $K_0$ is then not invertible; we cannot find a unique solution of (1.8.5) for all $g$.
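To make the alternative in Theorem 1.8.6 concrete, consider the hypothetical rank-one kernel $k(x,y) = xy$ on $[0,1]$, chosen here only for illustration. Then $(Ku)(x) = c\,x$ with $c = \int_0^1 y\,u(y)\,dy$, the only nonzero eigenvalue is $\lambda = 1/3$ (with eigenfunction $u(x) = x$), and for $\lambda \notin \{0, 1/3\}$ the equation $Ku - \lambda u = f$ gives $u = (c\,x - f)/\lambda$ with $c = F/(1/3 - \lambda)$, where $F = \int_0^1 y f(y)\,dy$. The sketch below verifies this numerically with a midpoint rule; the function names and the quadrature resolution are assumptions.

```python
def integrate(g, n=2000):
    """Midpoint rule for the integral of g over [0, 1]."""
    h = 1.0 / n
    return h * sum(g((i + 0.5) * h) for i in range(n))

def solve_rank_one(lam, f):
    """Solve K u - lam*u = f for k(x, y) = x*y on [0, 1], lam not in {0, 1/3}."""
    F = integrate(lambda y: y * f(y))
    c = F / (1.0 / 3.0 - lam)  # c = integral of y*u(y), from the derivation above
    return lambda x: (c * x - f(x)) / lam

lam = 1.0
f = lambda x: x
u = solve_rank_one(lam, f)

def residual(x):
    """Evaluate (K u)(x) - lam*u(x) - f(x), which should be close to zero."""
    Ku = x * integrate(lambda y: y * u(y))
    return Ku - lam * u(x) - f(x)

max_residual = max(abs(residual(i / 10.0)) for i in range(11))
print(u(1.0), max_residual)
```

With $f(x) = x$ and $\lambda = 1$ the exact solution is $u(x) = -\tfrac{3}{2}x$. At $\lambda = 1/3$ the formula for $c$ breaks down, which corresponds to the case where the homogeneous equation has the nontrivial solution $u(x) = x$.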
1.8.4 Volterra Equations

We state a variant of the Fredholm Alternative theorem for Volterra integral equations.

Theorem 1.8.8 (Existence and Uniqueness of Solutions of Volterra Integral Equation) Consider the Volterra integral operator
$$(Vu)(x) = \int_0^x k(x,y)u(y)\,dy,$$
where $V$ is defined on $L^p([0,1])$, $1 < p < \infty$, and $k \in C([0,1] \times [0,1])$. Then for all $\lambda \neq 0$, there exists a unique solution of the equation $(\lambda I - V)u = f$ for all $f \in L^2([0,1])$.

Proof The kernel can be extended to the whole square as
$$\tilde{k}(x,y) = \begin{cases} k(x,y), & 0 \le y \le x, \\ 0, & x < y \le 1. \end{cases}$$
Then the operator takes the Fredholm form
$$Vu(x) = \int_0^1 \tilde{k}(x,y)u(y)\,dy,$$
which is compact. Now we discuss the spectrum of $V$. Set $f = 0$ to obtain the following eigenvalue equation:
$$Vu = \lambda u = \int_0^x k(x,y)u(y)\,dy \tag{1.8.6}$$
for $0 < x \le 1$. Let $M > 0$ be such that $|k| < M$ on $[0,1] \times [0,1]$, and let $\|u\|_1 = \alpha$. Then we obtain for a.e. $x \in [0,1]$
$$|\lambda|\,|u(x)| \le \left|\int_0^x k(x,y)u(y)\,dy\right| \le \alpha M.$$
Repeat the process on $u$ using the fact that $\lambda^2 u = V^2 u = V(\lambda u)$ to obtain
$$|\lambda|^2 |u(x)| \le \left|\int_0^x k(x,y)\lambda u(y)\,dy\right| \le \int_0^x |k(x,y)|\,\alpha M\,dy \le \alpha M^2 x \le \alpha M^2.$$
Iterating $n$ times gives
$$|\lambda|^n |u(x)| \le \alpha M^n \frac{1}{(n-1)!},$$
and taking $n \to \infty$ gives either $\lambda = 0$ (i.e., $\sigma(V) = \{0\}$) or $u = 0$ on $[0,1]$. So we conclude that for $\lambda \neq 0$ the Volterra equation (1.8.6) has only the trivial solution, i.e., (1.8.3) doesn't hold and $V_\lambda = V - \lambda I$ is injective. Since $V$ is compact, the result follows by the Fredholm Alternative: (1.8.4) has a unique solution.
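The factorial decay that drives the proof above can be observed numerically. The following is a small sketch, assuming the simplest kernel $k \equiv 1$ (so $M$ can be taken just above 1) and the starting function $u \equiv 1$, for which $V^n u(x) = x^n/n!$ exactly. The function name `volterra_apply` and the grid size are illustrative choices.

```python
import math

def volterra_apply(values, h):
    """Apply (V u)(x) = integral of u from 0 to x (kernel k = 1) on a uniform grid.

    Uses the cumulative trapezoid rule; values[i] approximates u(i*h).
    """
    out = [0.0]
    for i in range(1, len(values)):
        out.append(out[-1] + 0.5 * h * (values[i - 1] + values[i]))
    return out

n_grid = 2001
h = 1.0 / (n_grid - 1)
u = [1.0] * n_grid                    # start from the constant function u = 1
sup_norms = []
for n in range(1, 9):
    u = volterra_apply(u, h)          # u now approximates V^n 1 = x^n / n!
    sup_norms.append(max(abs(v) for v in u))

# sup_norms[n-1] * n! should be close to 1 for every n.
ratios = [sup_norms[n - 1] * math.factorial(n) for n in range(1, 9)]
```

Since the sup norms of the iterates shrink like $1/n!$, faster than any geometric sequence $|\lambda|^n$ with $\lambda \neq 0$, no nonzero $\lambda$ can be an eigenvalue in this example.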
1.9 Unbounded Operators

1.9.1 Introduction

All the operators studied so far fall under the class of bounded linear operators. In this section, we will investigate operators that are unbounded. The theory of unbounded operators is an important aspect of applied functional analysis, since some important operators encountered in applied mathematics and physics, such as the differential operators, are unbounded, so developing a theory that treats these operators is of utmost importance. The following theorems are central in the treatment of bounded linear operators:

(1) The Fundamental Theorem of Linear Operators: If $T : X \to Y$ is a linear operator between two normed spaces, then $T$ is bounded if and only if $T$ is continuous.
(2) Bounded Inverse Theorem: If $T : X \to Y$ is a linear invertible operator between two Banach spaces $X, Y$, then $T$ is bounded if and only if $T^{-1}$ is bounded.
(3) Closed Graph Theorem: If $T : X \to Y$ is a linear operator between two Banach spaces $X, Y$ and $D(T) = X$, then $T$ is bounded if and only if its graph is a closed set in $X \times Y$.

These three theorems describe the general framework of the theory of bounded linear operators. When dealing with unbounded operators, it is understood in view of these theorems that the operators are no longer continuous, and consequently we will have to seek other notions that can redeem some of the properties that hold due to continuity. The best choice is the notion of closedness, as it generalizes the notion of continuity.
1.9.2 Closed Operator

Recall that a mapping $T : X \to Y$ is continuous on its domain if whenever $x_n \to x$ we have $T(x_n) \to T(x)$. On the other hand, we have the following definition for closed operators.

Definition 1.9.1 (Closed Operator) An operator $T : X \to Y$ is closed if its graph
$$G(T) = \{(x, T(x)) : x \in D(T)\}$$
is a closed subspace of $X \times Y$.

Remark It is important to note the following:
(1) The above definition is equivalent to saying that $T : X \to Y$ is closed if whenever $x_n \in D(T)$, $x_n \to x$ and $T(x_n) \to y$, we have $x \in D(T)$ and $y = T(x)$.
(2) This definition is different from the definition used in point-set topology, which states that a mapping is closed if the image of every closed set is closed. The two definitions are not equivalent. Mappings can have closed graphs but not closed domains (simply think of $f(x) = e^{x} - 1$ for example).

If $D(T)$ is closed, then we will certainly have $x \in D(T)$, and we are only concerned about the convergence of $(T(x_n))$. Here, when $x_n \to x$ in $D(T)$, the continuity property ensures the convergence of the sequence $(T(x_n))$ to $T(x)$. On the other hand, the closedness property won't guarantee the convergence of $(T(x_n))$, but if it converges, the limit is guaranteed to be $T(x)$. The two properties are similar in that both guarantee that $(T(x_n))$ cannot converge to any element other than $T(x)$, but closedness doesn't guarantee the convergence of this sequence, as it may diverge, whereas continuity (which is equivalent to boundedness for linear operators) guarantees its convergence. It is evident that a continuous operator is closed, but the converse is not necessarily true. The only type of discontinuity that may occur for closed linear operators is the (infinite) essential discontinuity. Loosely speaking, if the domain of the operator is complete then $x \in D(T)$, i.e., $T(x) = y \in \operatorname{Im} T$, and this forces the operator to be bounded, since otherwise there would be a convergent sequence $x_n \to x \in D(T)$ such that $\|T(x_n)\| \to \infty$, and this would break the closedness of the graph of $T$. It turns out that closed operators can redeem some of the properties of continuous operators.

Theorem 1.9.2 (Closed Range Theorem) Let $T : X \to Y$ be a closed linear operator between Banach spaces $X$ and $Y$. If $T$ is bounded below, then $R(T)$ is closed in $Y$.

Proof The proof is similar to Prop 1.6.4. Let $y_n \in R(T)$ and $y_n \to y \in Y$. Let $x_n \in D(T)$ be such that $Tx_n = y_n$; then $\{Tx_n\}$ is Cauchy. Since $T$ is bounded below, there is $c > 0$ such that
$$c\,\|x_n - x_m\| \le \|Tx_n - Tx_m\|,$$
hence $\{x_n\}$ is Cauchy and by completeness it converges to, say, $x \in X$. Since $T$ is closed, $x \in D(T)$ and $y = T(x) \in R(T)$, and therefore $R(T)$ is closed.
In a sense analogous to the Bounded Inverse Theorem, we have the following.

Proposition 1.9.3 Let $T : X \to Y$ be a linear operator between Banach spaces $X$ and $Y$ for which $T^{-1}$ exists. Then $T$ is closed iff $T^{-1}$ is closed. Moreover, if $T$ is closed and bijective, then $T^{-1}$ is bounded.

Proof Note that the graph of $T^{-1}$ is
$$G(T^{-1}) = \{(T(x), x) : x \in D(T)\},$$
so if $G(T)$ is a closed subspace, then so is $G(T^{-1})$, since the same argument for $G(T)$ can be made for $G(T^{-1})$ with $T(x_n) = y_n$ where $y_n \to y$, and assuming $T^{-1}(y_n) = x_n \to x$. If, in addition, $T$ is onto, then
$$D(T^{-1}) = Y,$$
which is Banach, so by the Closed Graph Theorem $T^{-1}$ is bounded.

This is a very interesting property of closed operators: a bijective closed linear operator between Banach spaces has a bounded and closed inverse even if the operator itself is unbounded. It also shows that the solution $u$ of the equation $Lu = f$ for a closed bijective operator $L$ is controlled by $f$. Indeed, if $Lu = f$ for some $f \in R(L)$ then
$$\|u\| = \|L^{-1} f\| \le \|L^{-1}\|\,\|f\|.$$
This result is useful in applications to differential equations when seeking well-posed problems whose solutions depend continuously on the given data.
1.9.3 Basic Properties of Unbounded Operators

The sum and product of operators can be defined the same way as for the bounded case.

Definition 1.9.4 Let $T$ and $S$ be two operators on $X$. Then
(1) $D(T + S) = D(T) \cap D(S)$.
(2) $D(ST) = \{x \in D(T) : T(x) \in D(S)\}$.
(3) $T = S$ if $D(T) = D(S)$ and $Tx = Sx$ for all $x \in D(T)$.
(4) $S \subset T$ if $D(S) \subset D(T)$ and $T|_{D(S)} = S$. In this case, $T$ is said to be an extension of $S$.

Note that from (2) and (3) above, we have in general $TS \neq ST$, and they can be equal only if $D(TS) = D(ST)$. Furthermore, if $L : D(L) \subset X \to X$ and $L^{-1}$ exists, then $D(L^{-1}L) = D(L)$ whereas $D(LL^{-1}) = X$ when $L$ is onto, so $L^{-1}L \subset I$ and equality holds only if $D(L) = X$. The next result shows that the closedness of a linear operator $T$ extends to its Fredholm form $T_\lambda$.

Proposition 1.9.5 If $T : X \to Y$ is a closed linear operator then so is $T_\lambda = T - \lambda I$.

Proof Let $x_n \in D(T)$ with $x_n \to x$ and assume $T_\lambda(x_n) \to y$. Then $\|x_n - x\| \to 0$ and $\|Tx_n - (\lambda x_n + y)\| \to 0$. Then
$$\|Tx_n - (\lambda x + y)\| = \|Tx_n - \lambda x_n + \lambda x_n - \lambda x - y\| \le \|Tx_n - \lambda x_n - y\| + |\lambda|\,\|x_n - x\| \to 0.$$
Thus, $Tx_n \to \lambda x + y$. But, since $T$ is closed, $x \in D(T)$ and $Tx_n \to Tx$, so $Tx = \lambda x + y$, or $Tx - \lambda x = T_\lambda(x) = y$.
A classic example of a closed unbounded operator is the differential operator.

Example 1.9.6 Let $D : C^1[0,1] \to C[0,1]$, $D(u) = u'$, where both spaces are endowed with the supremum norm $\|\cdot\|_\infty$. It is clear that $D$ is linear. Let $u_n \to u$ and $D(u_n) = u_n' \to v$ in the supremum norm, i.e., the convergence is uniform. Then
$$\int_0^x v(\tau)\,d\tau = \lim \int_0^x u_n'(\tau)\,d\tau = \lim\,[u_n(x) - u_n(0)] = u(x) - u(0).$$
That is, we have
$$u(x) = u(0) + \int_0^x v(\tau)\,d\tau.$$
Hence $u' = D(u) = v$ and $u \in C^1[0,1]$. Thus $D$ is closed. If we let $u_n(x) = x^n$, then $\|u_n\|_\infty = 1$ and $\|D(u_n)\|_\infty = \|n x^{n-1}\|_\infty = n$. So $\|D\| \ge \|D(u_n)\|_\infty = n$ for every $n$. Hence $D$ is unbounded.
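The unboundedness argument in Example 1.9.6 is easy to reproduce. The sketch below, with the hypothetical helper `sup_norm` approximating $\|\cdot\|_\infty$ on a uniform grid, evaluates the ratio $\|D u_n\|_\infty / \|u_n\|_\infty$ for $u_n(x) = x^n$; the derivatives are supplied by hand rather than computed numerically.

```python
def sup_norm(func, n_pts=10001):
    """Approximate the supremum norm on [0, 1] by sampling a uniform grid."""
    return max(abs(func(i / (n_pts - 1))) for i in range(n_pts))

ratios = []
for n in range(1, 8):
    u_n = lambda x, n=n: x ** n                # u_n(x) = x^n
    du_n = lambda x, n=n: n * x ** (n - 1)     # D u_n = n x^(n-1), computed by hand
    ratios.append(sup_norm(du_n) / sup_norm(u_n))

# The ratios grow like n, so no constant C satisfies ||D u|| <= C ||u|| for all u.
```

Both suprema are attained at $x = 1$, which is a grid point, so the computed ratios coincide with $n$.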
So it became apparent in view of the preceding example that the class of closed unbounded operators represents some of the most important linear operators, and a comprehensive theory for this class of operators is certainly needed to elaborate more on their properties. It is well-known that the inverse of the differential operator is the integral operator, which was found to be compact and self-adjoint. So our next step will be to investigate ways to define the adjoint of these unbounded operators.
1.9.4 Toeplitz Theorem

Recall that the adjoint of a bounded linear operator $T : X \to Y$ is defined as $T^* : Y^* \to X^*$ given by
$$T^* y = y \circ T$$
for $y \in Y^*$. If $X$ and $Y$ are Hilbert spaces, then the adjoint of the operator $T : H_1 \to H_2$ is defined as $T^* : H_2 \to H_1$,
$$\langle Tx, y \rangle = \langle x, T^* y \rangle$$
for all $x \in H_1$ and $y \in H_2$. If $T$ is bounded then $D(T^*) = H_2$, and so the adjoint is defined for all $y \in H_2$ for which there is $y^* = T^* y$ such that
$$\langle Tx, y \rangle = \langle x, T^* y \rangle$$
for all $x \in H_1$. In the unbounded case, this construction might cause trouble and won't give rise to a well-defined adjoint mapping, since $(T(x_n))_n$ will diverge for some sequence $(x_n)$; therefore we need to restrict the domain of $T^*$ to consist only of the elements $y$ that make $x \mapsto \langle Tx, y \rangle$ a bounded functional on $D(T)$. The following theorem illustrates this.

Theorem 1.9.7 (Toeplitz Theorem) Let $T : H \to H$ be a linear operator with $D(T) = H$. If $\langle Tx, y \rangle = \langle x, Ty \rangle$ for all $x, y \in H$, then $T$ is bounded.

Proof If $T$ is unbounded then there is a sequence $z_n \in H$ such that $\|z_n\| = 1$ and
$$\|T(z_n)\| \to \infty. \tag{1.9.1}$$
Define the sequence of functionals $f_n(x) = \langle Tx, z_n \rangle$. Note that for every $x \in H$
$$|f_n(x)| = |\langle Tx, z_n \rangle| \le \|Tx\|,$$
and for each $n$
$$|f_n(x)| = |\langle x, Tz_n \rangle| \le \|Tz_n\|\,\|x\|.$$
By the uniform boundedness principle, $\|f_n\| \le M$ for some $M > 0$, i.e., $|f_n(x)| \le M \|x\|$ for every $x \in H$. Then
$$\|Tz_n\|^2 = \langle Tz_n, Tz_n \rangle = |f_n(Tz_n)| \le M \|Tz_n\|,$$
so $\|Tz_n\| \le M$, which contradicts (1.9.1).
1.9.5 Adjoint of Unbounded Operators

The Toeplitz theorem indicates that for symmetric unbounded operators, we must have $D(T) \subsetneq H$. Another problem that arises in obtaining a well-defined adjoint is that $T^* y$ must be uniquely determined for each $y \in D(T^*)$. This can be guaranteed if $D(T)$ is made as large as possible, but since $D(T) \subsetneq H$, we will try something like $\overline{D(T)} = H$. Indeed, if $\overline{D(T)} \subsetneq H$ then by the orthogonal decomposition of Hilbert spaces
$$H = \overline{D(T)} \oplus D(T)^{\perp},$$
we can find $0 \neq y_0 \in D(T)^{\perp}$ such that $\langle x, y_0 \rangle = 0$ for all $x \in D(T)$, but this implies that
$$\langle Tx, y \rangle = \langle x, y^* \rangle = \langle x, y^* \rangle + \langle x, y_0 \rangle = \langle x, y^* + y_0 \rangle.$$
It follows that both $y^*$ and $y^* + y_0$ satisfy the defining relation $\langle Tx, y \rangle = \langle x, T^* y \rangle$ for every $x \in D(T)$, so $T^* y$ is not uniquely determined. Hence by making the domain of $T$ dense, we obtain uniqueness of $T^* y$, since $D(T)^{\perp} = \{0\}$ in this case. An operator $T : X \to Y$ where $\overline{D(T)} = X$ is called densely defined. The above argument proposes the following definition for general (possibly unbounded) operators.

Definition 1.9.8 (Adjoint of General Operator) Let $T : D(T) \subset H \to H$ be a linear operator on a Hilbert space $H$ that is densely defined, i.e., $\overline{D(T)} = H$. Then the adjoint of $T$ is defined as $T^* : D(T^*) \to H$,
$$\langle Tx, y \rangle = \langle x, T^* y \rangle$$
for all $x \in D(T)$ and $y \in D(T^*)$, where
$$D(T^*) = \{y \in H : \text{there exists } y^* \in H \text{ such that } \langle Tx, y \rangle = \langle x, y^* \rangle \text{ for all } x \in D(T)\}$$
and $T^* y = y^*$.

The collection of all linear (possibly unbounded) operators on a space $X$ is denoted by $L(X)$. In the bounded case, it is well-known that for $T, S \in B(H)$,
$$T^* + S^* = (T + S)^* \quad \text{and} \quad T^* S^* = (ST)^*.$$
The next proposition shows that this is not the case for unbounded operators.

Proposition 1.9.9 Let $T, S \in L(H)$ be two densely defined operators. Then
(1) $(\alpha T)^* = \bar{\alpha} T^*$.
(2) If $S \subset T$ then $T^* \subset S^*$.
(3) $T^* + S^* \subset (T + S)^*$.
(4) If $ST$ is densely defined, then $T^* S^* \subset (ST)^*$.
Proof For (1), we have
$$\langle \alpha Tx, y \rangle = \langle x, (\alpha T)^* y \rangle \tag{1.9.2}$$
for all $x \in D(T)$. On the other hand,
$$\langle \alpha Tx, y \rangle = \alpha \langle Tx, y \rangle = \alpha \langle x, T^* y \rangle = \langle x, \bar{\alpha} T^* y \rangle. \tag{1.9.3}$$
Then (1) follows from (1.9.2) and (1.9.3).

For (2), let $y \in D(T^*)$. Then
$$\langle x, T^* y \rangle = \langle Tx, y \rangle$$
for all $x \in D(T)$, hence for all $x \in D(S)$. But from Definition 1.9.4(4), $T(x) = S(x)$ for all $x \in D(S)$. So
$$\langle x, T^* y \rangle = \langle Sx, y \rangle$$
for all $x \in D(S)$, which implies that $y \in D(S^*)$ and $S^* = T^*$ on $D(T^*)$. This gives (2).

For (3), by Definition 1.9.4(1) let $y \in D(T^* + S^*) = D(T^*) \cap D(S^*)$. Then $y \in D(T^*)$ and $y \in D(S^*)$. Hence
$$\langle x, T^* y \rangle = \langle Tx, y \rangle$$
for all $x \in D(T)$, and
$$\langle x, S^* y \rangle = \langle Sx, y \rangle$$
for all $x \in D(S)$. It follows that
$$\langle x, T^* y \rangle + \langle x, S^* y \rangle = \langle Tx, y \rangle + \langle Sx, y \rangle \tag{1.9.4}$$
for all $x \in D(T) \cap D(S) = D(T + S)$. But (1.9.4) can be written as
$$\langle x, (T^* + S^*) y \rangle = \langle (T + S)x, y \rangle = \langle x, (T + S)^* y \rangle.$$
Therefore $y \in D((T + S)^*)$, and $(T^* + S^*)y = (T + S)^* y$ for all $y \in D(T^* + S^*)$. This proves (3).

To prove (4), let $y \in D(T^* S^*)$. Then
$$\langle x, T^* S^* y \rangle = \langle Tx, S^* y \rangle = \langle STx, y \rangle = \langle x, (ST)^* y \rangle$$
for all $x \in D(ST)$. Hence $y \in D((ST)^*)$ and (4) is proved.
1.9.6 Deficiency Spaces of Unbounded Operators

Proposition 1.2.4 gives the relations between the null space and the range of an operator and its adjoint. The result extends to the unbounded case as well.

Proposition 1.9.10 Let $T \in L(H)$ be densely defined. Then
(1) $N(T^*) = R(T)^{\perp}$ and $N(T^*)^{\perp} = \overline{R(T)}$. $T$ and $T^*$ in the identities can be interchanged to give $N(T) = R(T^*)^{\perp}$ and $N(T)^{\perp} = \overline{R(T^*)}$.
(2) For $\lambda \in \mathbb{C}$, we have $N(T_\lambda^*) = R(T_\lambda)^{\perp}$ and $N(T_\lambda^*)^{\perp} = \overline{R(T_\lambda)}$. $T$ and $T^*$ in the identities can also be interchanged.

Proof Note that $y \in R(T)^{\perp}$ iff $\langle Tx, y \rangle = 0$ for all $x \in D(T)$ iff $\langle x, T^* y \rangle = 0$ for all $x \in D(T)$ iff $T^* y = 0$ iff $y \in N(T^*)$, and this gives $N(T^*) = R(T)^{\perp}$. All the other identities can be proved similarly and are left to the reader to verify.

The decomposition of a Hilbert space is therefore possible for linear operators that are possibly unbounded. Namely, letting $T \in L(H)$,
$$\overline{R(T)} \oplus N(T^*) = H.$$
The spaces $N(T_\lambda^*)$ and $R(T_\lambda)$ are called the deficiency spaces, and the numbers $\dim N(T_\lambda^*)$ and $\dim R(T_\lambda)$ are called the deficiency indices.
1.9.7 Symmetry of Unbounded Operators

Due to the densely defined domains, we need the following definition for symmetric operators.

Definition 1.9.11 (Symmetric Operator) Let $T \in L(H)$. Then $T$ is symmetric if
$$\langle Tx, y \rangle = \langle x, Ty \rangle$$
for all $x, y \in D(T)$. The operator $T$ is self-adjoint if $T = T^*$.

Proposition 1.9.12 Let $T \in L(H)$ and $\overline{D(T)} = H$. Then
(1) $T$ is symmetric if and only if $T \subset T^*$.
(2) If $T$ is symmetric and $\overline{R(T)} = H$, then $T$ is injective. Conversely, if $T$ is symmetric and injective then $\overline{R(T)} = H$.
(3) If $T$ is symmetric and $R(T) = H$, then $T$ is self-adjoint.

Proof For (1), let $y \in D(T)$. If $\langle Tx, y \rangle = \langle x, Ty \rangle$ for all $x \in D(T)$ then clearly $y \in D(T^*)$ with $T^* y = Ty$, and so $D(T) \subset D(T^*)$. This gives the first direction. Conversely, let $T \subset T^*$. Then $T^* = T$ on $D(T)$, so for all $x, y \in D(T)$
$$\langle Tx, y \rangle = \langle x, T^* y \rangle = \langle x, Ty \rangle,$$
so $T$ is symmetric.

For (2), let $x \in D(T)$ and $Tx = 0$. Then $\langle Tx, y \rangle = \langle x, Ty \rangle = 0$ for all $y \in D(T)$. So $x \perp R(T)$, and since $\overline{R(T)} = H$, we have $x = 0$. Conversely, assume $\overline{R(T)} \subsetneq H$. Then there exists $z \in R(T)^{\perp}$, $z \neq 0$, such that $\langle z, y \rangle = 0$ for all $y \in R(T)$. But there is $x \in D(T)$ such that $Tx = y$, and so $\langle z, Tx \rangle = 0 = \langle Tz, x \rangle$ for all $x \in D(T)$, and this implies that $Tz = 0$; but because $T$ is injective, we must have $z = 0$, a contradiction, hence $\overline{R(T)} = H$.

For (3), let $T$ be symmetric. By (1), $T \subset T^*$, so it suffices to show that $D(T^*) \subset D(T)$, which will give the other direction. Let $y \in D(T^*)$. Then
$$\langle Tx, y \rangle = \langle x, T^* y \rangle$$
for all $x \in D(T)$. But note that $T^* y \in H$, so by surjectivity of $T$, there exists $z \in D(T)$ such that $Tz = T^* y$. Consequently,
$$\langle Tx, y \rangle = \langle x, T^* y \rangle = \langle x, Tz \rangle = \langle Tx, z \rangle$$
for all $x \in D(T)$, and since $R(T) = H$ this gives $y = z \in D(T)$, and so
$$D(T^*) \subset D(T).$$
Note that using the adjoint operator, the Toeplitz theorem becomes more accessible to us and follows easily from the preceding proposition, since we can simply argue as follows: If $T$ is symmetric then by the preceding proposition $T \subset T^*$, which implies $D(T) \subset D(T^*)$, but since $D(T) = H$, we also have $D(T^*) \subseteq D(T)$, which implies
$$D(T) = D(T^*)$$
and therefore $T = T^*$.

The next theorem discusses the connection between adjoints and inverses. In the bounded case, it is known that $T$ is invertible if and only if $T^*$ is invertible, and
$$(T^*)^{-1} = (T^{-1})^*.$$
This identity extends to general linear invertible densely defined operators that are not necessarily bounded. A more interesting result is to assume symmetry rather than injectivity.

Theorem 1.9.13 Let $T \in L(H)$ be symmetric. If $\overline{D(T)} = H$ and $\overline{R(T)} = H$ then $(T^{-1})^*$ exists, $(T^*)^{-1}$ exists, and $(T^{-1})^* = (T^*)^{-1}$.

Proof By Proposition 1.9.12(2), $T$ is injective, hence $T : D(T) \to R(T)$ is invertible. A similar argument shows that for $y \in D(T^*)$, $T^* y = 0$ implies $y = 0$. So $T^{-1}$ and $(T^*)^{-1}$ exist. Also, $\overline{D(T^{-1})} = \overline{R(T)} = H$, so $(T^{-1})^*$ exists. It follows that for $x \in R(T)$
$$\langle x, y \rangle = \langle T T^{-1} x, y \rangle = \langle T^{-1} x, T^* y \rangle = \langle x, (T^{-1})^* T^* y \rangle$$
for all $y \in D(T^*)$. So we obtain $(T^{-1})^* T^* y = y$ on $D(T^*)$ and $R(T^*) \subset D((T^{-1})^*)$. On the other hand, the inverse of $T^*$ is $(T^*)^{-1} : R(T^*) \to D(T^*)$. Consequently, $D((T^*)^{-1}) = R(T^*) \subset D((T^{-1})^*)$, therefore
$$(T^*)^{-1} \subset (T^{-1})^*. \tag{1.9.5}$$
Similarly, for $x \in D(T)$,
$$\langle x, y \rangle = \langle T^{-1} T x, y \rangle = \langle Tx, (T^{-1})^* y \rangle = \langle x, T^* (T^{-1})^* y \rangle$$
for all $y \in D((T^{-1})^*)$. Then $T^* (T^{-1})^* y = y$ on $D((T^{-1})^*)$, so for all $y \in D((T^{-1})^*)$ we have $y \in R(T^*) = D((T^*)^{-1})$, whence
$$(T^{-1})^* \subset (T^*)^{-1}. \tag{1.9.6}$$
From (1.9.5) and (1.9.6), we conclude that $(T^{-1})^* = (T^*)^{-1}$.
An important corollary is the following.

Corollary 1.9.14 Let $T \in L(H)$ be densely defined and injective. If $T$ is self-adjoint then $T^{-1}$ is self-adjoint.

The next result asserts that the adjoint of any densely defined operator is closed.

Theorem 1.9.15 If $T \in L(H)$ is such that $\overline{D(T)} = H$, then $T^*$ is closed.

Proof Let $y_n \in D(T^*)$ be such that $y_n \to y$ and $T^*(y_n) \to z$. Then for every $x \in D(T)$
$$\langle Tx, y_n \rangle = \langle x, T^* y_n \rangle \to \langle x, z \rangle$$
and $\langle Tx, y_n \rangle \to \langle Tx, y \rangle$. Hence
$$\langle x, z \rangle = \langle Tx, y \rangle = \langle x, T^* y \rangle,$$
and this implies that $y \in D(T^*)$ and $T^* y = z$.
1.9.8 Spectral Properties of Unbounded Operators

The spectral properties of unbounded linear operators retain much of those of bounded operators. The definitions are the same.

Definition 1.9.16 (Resolvent and Spectrum) Let $X$ be a Banach space and $T \in L(X)$. A scalar $\lambda \in \mathbb{C}$ is called a regular value of $T$ if the resolvent $R_\lambda = T_\lambda^{-1} = (T - \lambda I)^{-1} \in B(X)$, that is, $T_\lambda$ is boundedly invertible (i.e., a bijection with a bounded inverse operator). The set of all regular values of $T$ is called the resolvent set, and is denoted by $\rho(T)$. The set $\mathbb{C} \setminus \rho(T) = \sigma(T)$ is the spectrum of $T$.

Note that obtaining a bounded inverse for an unbounded operator is more challenging. The notion of closedness will be invoked here, as it plays an important role in establishing some interesting properties of unbounded operators.

Theorem 1.9.17 Let $T \in L(X)$ be a densely defined closed operator on a Banach space $X$. Then $\lambda \in \rho(T)$ iff $T_\lambda$ is injective.

Proof If $\lambda \in \rho(T)$ then $T_\lambda$ is in fact bijective. Conversely, if $T_\lambda$ is injective then $T_\lambda^{-1} : R(T_\lambda) \to D(T_\lambda)$ exists, and by Proposition 1.9.3 it is closed since $T$ is closed. Moreover, $T_\lambda$ is closed by Proposition 1.9.5, and so is $D(T_\lambda)$, which implies that it is complete, and the same holds for $R(T_\lambda)$. Therefore $T_\lambda^{-1}$ is bounded by the Closed Graph Theorem.

The preceding theorem gives a characterization of the resolvent set of a densely defined closed operator $T \in L(X)$ by saying that
$$\rho(T) = \{\lambda \in \mathbb{C} : T - \lambda I \text{ is injective}\}.$$
The next result characterizes closed operators in terms of their resolvents. It basically says that if you can find at least one element in the resolvent set, then the operator is necessarily closed.

Proposition 1.9.18 Let $T \in L(H)$ be a densely defined operator on a Hilbert space $H$. If $\rho(T) \neq \emptyset$ then $T$ is closed.

Proof If not, then for any $\lambda \in \mathbb{C}$ neither $T_\lambda$ nor $T_\lambda^{-1}$ is closed, so $T_\lambda^{-1}$ is unbounded, and therefore $\lambda \notin \rho(T)$ and consequently $\rho(T)$ is empty.

The preceding result indicates why dealing with closed operators is efficient. This will simplify the work on self-adjoint operators, knowing that every self-adjoint operator is closed.
Theorem 1.9.19 Let $T \in L(H)$ be a densely defined operator on a Hilbert space $H$. Then $T$ is self-adjoint iff $T$ is symmetric and $\sigma(T) \subset \mathbb{R}$.

Proof If $T$ is self-adjoint then it is closed by Theorem 1.9.15. Let $\lambda \in \mathbb{C}$, $\lambda = a + bi$ with $b \neq 0$. Then
$$\|(T - \lambda)x\|^2 = \|(T - a)x\|^2 + b^2 \|x\|^2 \ge b^2 \|x\|^2, \tag{1.9.7}$$
so $T - \lambda I$ is bounded below and hence injective. By Theorem 1.9.17, $\lambda \in \rho(T)$, and since $\lambda$ is arbitrary, $\sigma(T) \subset \mathbb{R}$.

Conversely, let $\sigma(T) \subset \mathbb{R}$. Then $\lambda = i = \sqrt{-1}$ is not an eigenvalue of $T$, and so by Theorem 1.9.17 the operator $T_i = T - iI$ is boundedly invertible and $R(T_i) = H$. By Proposition 1.9.10(2),
$$N(T_i^*) = (R(T_i))^{\perp} = H^{\perp} = \{0\}.$$
Now, let $y \in D(T^*)$. Since $R(T_i) = H$, there exists $x \in D(T)$ such that $T_i^* y = T_i x$, and by symmetry we have $D(T) \subset D(T^*)$ and $T^*$ is an extension of $T$. It follows that $x - y \in D(T^*)$ and $T_i^*(x - y) = 0$, hence $x - y \in N(T_i^*)$, and therefore $x = y$ and consequently $D(T) = D(T^*)$ and $T = T^*$.

We end the section with the following important criterion for self-adjointness of linear operators.

Theorem 1.9.20 Let $T \in L(H)$ be a densely defined and symmetric operator on a Hilbert space $H$. Then the following are equivalent:
(1) $T$ is self-adjoint.
(2) $T$ is closed and $N(T^* \pm iI) = \{0\}$.
(3) $R(T \pm iI) = H$.

Proof Clearly (1) gives (2) by Proposition 1.9.18 and Theorem 1.9.19. Suppose (2) holds. By Proposition 1.9.10, $\overline{R(T \pm iI)} = H$. Since $T$ is symmetric, we use (1.9.7) to conclude that $T \pm iI$ is bounded below, hence by Theorem 1.9.2 $R(T \pm iI)$ is closed and therefore
$$R(T \pm iI) = \overline{R(T \pm iI)} = H,$$
and (3) is proved. Now, suppose (3) holds. Then we have $D(T) \subset D(T^*)$. Let $y \in D(T^*)$. Then
$$(T^* + iI)y = z \in R(T^* + iI) \subseteq H.$$
By (3), there exists $x \in D(T)$ such that
$$(T + iI)x = z = (T^* + iI)y.$$
Now, (1) can be obtained using the same argument as in the proof of the preceding theorem.
1.10 Differential Operators

This section is a continuation of the preceding section in discussing unbounded operators. We will investigate differential operators as the most important class of unbounded operators. Example 1.9.6 demonstrates the fact that the derivative operator is a closed but not bounded linear operator. This is a prototype of the class of closed unbounded operators. In this section, we will discuss cases in which differential operators can be self-adjoint, which enables us to use the results of the preceding section and conclude that the inverse of the differential operator is also closed and self-adjoint. This inverse operator is nothing but the integral operator, which has already been proved to be compact, and whose spectral properties were discussed in Sect. 1.8. This will help us in answering questions about the existence of solutions of differential equations, and about the eigenvalues and corresponding eigenvectors of eigenvalue boundary value problems.
1.10.1 Green’s Function and Dirac Delta

Consider the differential equation
$$Lu(x) = f(x) \tag{1.10.1}$$
for some differential operator $L$. If $L$ is invertible then the solution of the equation above is given by $u = L^{-1} f$, and so the equation is written as $L(L^{-1} f) = f$. Note that since $L^{-1}$ is an integral operator, it has a kernel, say, $k(x,t)$, namely
$$L^{-1} f = \int k(x,t) f(t)\,dt,$$
hence
$$L(L^{-1} f) = \int (L k(x,t)) f(t)\,dt = f. \tag{1.10.2}$$
Hence we obtain that
$$u = L^{-1} f = \int k(x,t) f(t)\,dt$$
is the solution to Eq. (1.10.1). The situation of the function $L k(x,t)$ in (1.10.2) is rather unusual, since the integral of its product with $f$ gives $f$ again, and this has no explanation in the classical theory of derivatives. This problem was investigated by Dirac in the 1920s, and he extended the concept of the Kronecker delta function
$$\delta_{ij} = \begin{cases} 1, & i = j, \\ 0, & i \neq j, \end{cases}$$
which helps select an element, say $a_k$, from a set $S = \{a_1, a_2, \ldots\}$ by means of the operation
$$a_k = \sum_j \delta_{jk} a_j, \tag{1.10.3}$$
and it was required by normalization that
$$\sum_j \delta_{jk} = 1.$$
The notion was extended to the so-called Dirac delta function
$$\delta(x) = \begin{cases} 0, & x \neq 0, \\ \infty, & x = 0, \end{cases} \tag{1.10.4}$$
and the process in (1.10.3) becomes
$$\int_{-\infty}^{\infty} \delta(x) f(x)\,dx = f(0),$$
and in general
$$\int_{-\infty}^{\infty} \delta(x - t) f(x)\,dx = f(t). \tag{1.10.5}$$
Moreover, the normalization condition took the form
$$\int_{-\infty}^{\infty} \delta(x)\,dx = 1.$$
Consequently, comparing (1.10.2) with (1.10.5), we obtain
$$L k(x,t) = \delta(x - t). \tag{1.10.6}$$
Of course, the way the Dirac delta was created doesn’t make it well-defined. Moreover, the treatment above doesn’t stand on a firm mathematical foundation, and so a rigorous analysis was needed to validate the construction of the Dirac delta. Some great mathematicians, such as Sobolev, Schwartz, and others, were among the first to carry out the mathematical analysis, which led to the creation of distribution theory and Sobolev spaces. In fact, the observation and the debate about the Dirac delta was a stepping stone that led to the creation of this important area of functional analysis.

The kernel $k(x,t)$ is called Green’s function, and it is the solution of the equation
$$L_x k(x,t) = \delta(x - t),$$
where $L_x$ is the differential operator that is the inverse of the integral operator of which $k(x,t)$ is the kernel, and the subscript $x$ of $L_x$ denotes the variable under differentiation. The strategy is to define $L_x$ such that it is densely defined on a separable Hilbert space, say $L^2$, injective, and symmetric. In order to prove it is self-adjoint, we can use the definition to show that $D(L^*) = D(L)$, but this might be a challenging task in many cases, so using the theorems and results of the preceding section can be more helpful. One can show that $L$ is surjective, then use Proposition 1.9.12(3) to conclude that $L$ is self-adjoint. Alternatively, one can show that $\sigma(L) \subseteq \mathbb{R}$, then use Theorem 1.9.19. After we prove that $L$ is self-adjoint, we apply Corollary 1.9.14 to conclude that $L^{-1}$ (which is an integral operator) is self-adjoint. We also know that the kernel of the integral operator must be symmetric in order for the integral operator to be self-adjoint, although Corollary 1.9.14 ensures that the inverse operator $L^{-1}$ is self-adjoint if $L$ is densely defined, injective, and self-adjoint. It turns out that whenever the differential operator $L$ is self-adjoint, the kernel $k$ of the integral operator $L^{-1}$ is guaranteed to be symmetric, so that $L^{-1}$ is self-adjoint.
Therefore, we can apply Theorems 1.7.4 and 1.7.5 to conclude the existence of a decreasing countable set of eigenvalues $(\lambda_n)$ for the integral operator $L^{-1}$ with a countable set of eigenfunctions $(\varphi_n)$ that form an orthonormal basis for the space, and such that
$$L^{-1} \varphi_n = \lambda_n \varphi_n.$$
But the eigenfunctions of the operators $L$ and $L^{-1}$ are the same, and the eigenvalues are the reciprocals of each other. Thus we have
$$L \varphi_n = \mu_n \varphi_n,$$
where
$$\mu_n = \frac{1}{\lambda_n} \to \infty$$
are the eigenvalues of the differential operator $L$. This can be simply seen from (1.10.1). Indeed, if Eq. (1.10.1) is of the form $Lu = \lambda u$, then
$$u = \int k(x,t)\,\lambda u(t)\,dt,$$
or
$$\frac{1}{\lambda} u(x) = L^{-1} u = \int k(x,t) u(t)\,dt,$$
so each eigenvalue of the integral operator $L^{-1}$ is the reciprocal of an eigenvalue of the differential operator $L$. We will take the Laplacian operator as an example.
1.10.2 Laplacian Operator

The Laplacian operator is the differential operator involved in the Laplace equation,
$$Lu = -\nabla^2 u,$$
where the minus sign is adopted for convenience. Consider the following one-dimensional equation:
$$Lu = -u'' = f$$
defined on $L^2([a,b])$. The operator $L$ maps into $L^2([a,b])$, and it is well-known that not all functions in $L^2([a,b])$ are twice differentiable, so $D(L) \subsetneq L^2([a,b])$ and $L$ cannot be defined on the whole space. However, $L$ is densely defined, since the space $C^2[a,b]$ consisting of all functions that are twice continuously differentiable on $[a,b]$ is dense in $L^2([a,b])$, as we will see in Chap. 3, so let us take it for granted now. So $L^*$ is well-defined.

To prove symmetry, we proceed as follows:
$$\langle Lu, v \rangle - \langle u, Lv \rangle = \int_a^b \left(-u'' v + u v''\right) dt = \left[-u' v + u v'\right]_{x=a}^{x=b}.$$
Hence, it is required to assume the homogeneous conditions $u(a) = u(b) = 0$, from which we obtain that $L$ is symmetric. In fact, the homogeneous conditions, besides being helpful in establishing self-adjointness, are also helpful in proving the injectivity of the operator. Indeed, if $Lu = 0$ with $u(a) = u(b) = 0$, then the only solution to the problem is $u = 0$, so $L$ is injective. Furthermore, it is easy to show that the spectrum consists of real numbers. Indeed, using integration by parts yields
$$\langle u, -u'' \rangle = -\int_a^b u u''\,dx = -\int_a^b (u u')'\,dx + \int_a^b (u')^2\,dx = -\left[u u'\right]_a^b + \int_a^b (u')^2\,dx = \int_a^b (u')^2\,dx \ge 0.$$
Therefore, the Laplacian operator $-u''$ is a positive operator, and thus all its eigenvalues are nonnegative (see Problem 1.11.58). Therefore, by Theorem 1.9.19 we see that $L$ is self-adjoint, and so by Corollary 1.9.14, the inverse operator $L^{-1}$ exists and it is also self-adjoint. The operator $L^{-1}$ is given by
$$L^{-1} f = Kf = \int_a^b G(x,t) f(t)\,dt. \tag{1.10.7}$$
Here, $G(x,t)$ is Green’s function and it is the kernel of the operator, which is necessarily symmetric since $L^{-1}$ is self-adjoint. If $G$ is continuous on $[a,b] \times [a,b]$ then $L^{-1}$ is compact, and consequently we can apply the Hilbert–Schmidt theorem and the spectral theory of compact self-adjoint operators. From (1.10.6) and since $\delta(x - t) = 0$ for $x \neq t$, Green’s function necessarily satisfies the boundary conditions
$$G(a,t) = G(b,t) = 0.$$
Moreover,
$$LKf = -\frac{d^2}{dx^2} Kf = \int_a^b -G_{xx}(x,t) f(t)\,dt = f(x),$$
and from (1.10.5) we obtain
$$-G_{xx}(x,t) = \delta(x - t).$$
If $x \neq t$ we have $-G_{xx}(x,t) = 0$, from which we get
$$G(x,t) = c_1(t) + c_2(t)\,x$$
on each side of $x = t$. Across $x = t$ we have
$$\int_{t^-}^{t^+} G_{xx}\,dx = G_x(t^+; t) - G_x(t^-; t) = -1.$$
Using the boundary conditions, the continuity of $G$, and the jump discontinuity of $G_x$ gives
$$G(x,t) = \frac{1}{b-a} \begin{cases} (x-a)(b-t), & x \le t, \\ (t-a)(b-x), & t \le x, \end{cases}$$
which is clearly continuous on $[a,b] \times [a,b]$ and symmetric as expected. Therefore we have the following theorem.

Theorem 1.10.1 The solution to the problem $-u'' = f$ defined on $L^2([a,b])$ with the conditions $u(a) = u(b) = 0$ is given by
$$u(x) = \sum_{n=1}^{\infty} \frac{1}{\lambda_n} \langle f, \varphi_n \rangle \varphi_n(x),$$
where $\{\lambda_n\}$ are the eigenvalues of $L$ such that $\lambda_1 < \lambda_2 < \ldots$ with $\lambda_n \to \infty$, and $\{\varphi_n\}$ are the corresponding eigenfunctions, which form an orthonormal basis for $L^2([a,b])$.

Example 1.10.2 Consider the problem
$$-u'' = f,$$
subject to the homogeneous conditions $u(0) = u(1) = 0$. Then it can be shown using classical techniques of ODEs that the eigenvalues of the problem are $\lambda_n = n^2 \pi^2$ and the corresponding orthonormal eigenfunctions are
$$\varphi_n(x) = \sqrt{2} \sin n\pi x.$$
Moreover, these eigenfunctions form an orthonormal basis of $L^2[0,1]$. Replacing 1 by $\pi$ gives rise to the space $L^2[0,\pi]$, which has the set $\{\sqrt{2/\pi}\, \sin nx\}$ as an orthonormal basis.
1.10.3 Sturm–Liouville Operator

In general, the differential operator
$$L = a_2(x) \frac{\partial^2}{\partial x^2} + a_1(x) \frac{\partial}{\partial x} + a_0(x)$$
is not self-adjoint. To convert it to self-adjoint form, we first multiply $L$ by the factor
$$\frac{1}{a_2(x)} \exp\left(\int^x \frac{a_1(t)}{a_2(t)}\,dt\right).$$
If we let
$$p(x) = \exp\left(\int^x \frac{a_1(t)}{a_2(t)}\,dt\right), \qquad q(x) = \frac{a_0(x)}{a_2(x)} \exp\left(\int^x \frac{a_1(t)}{a_2(t)}\,dt\right)$$
such that $p \in C^1[a,b]$ and $q \in C[a,b]$, we obtain the so-called Sturm–Liouville operator:
$$L = \frac{d}{dx}\left(p(x) \frac{d}{dx}\right) + q(x). \tag{1.10.8}$$
It remains to find the appropriate boundary conditions that yield symmetry. For this, we consider the Sturm–Liouville equation $Lu = f$ defined on an interval $[a,b]$. Then
$$\langle Lu, v \rangle - \langle u, Lv \rangle = \int_a^b \left(v Lu - u Lv\right) dx = \left[p\,(u' v - u v')\right]_{x=a}^{x=b}.$$
So for the operator $L$ to be symmetric, it is required that
$$\left[p(x)\big(u'(x) v(x) - u(x) v'(x)\big)\right]_{x=a}^{x=b} = 0.$$
One way to achieve this is to assume the following boundary conditions:
$$\alpha_1 u(a) + \alpha_2 u'(a) = 0, \qquad \beta_1 u(b) + \beta_2 u'(b) = 0. \tag{1.10.9}$$
According to the results of the preceding section, we have the following. Theorem 1.10.3 The S-L problem Lu = f defined by (1.10.8–1.10.9) for f ∈ L 2 [a, b] has a countable set of increasing eigenvalues |λ1 | < |λ2 | < . . . < |λn | < . . . where λn −→ ∞, and their corresponding eigenvectors {ϕn } form an orthonormal basis for L 2 [a, b]. Moreover, the solution to the system is given by u(x) =
∞ 1 f, ϕn ϕn (x). λ n=1 n
For the eigenvalues λn of the S-L system, consider the eigenvalue problem Lu + λu = 0. We simply multiply the equation by u and integrate it to obtain b b u( pu ) + qu 2 d x + λ u 2 d x, a
a
from which we get the so-called Rayleigh quotient b b − puu a + a p(u )2 − qu 2 d x λ= . b 2 a u dx It is clear now that if
\[ \big[puu'\big]_a^b \le 0 \]
and $q \le 0$ on $[a,b]$ then $\lambda \ge 0$ (recall that $p > 0$), and the absolute values in the above theorem can be removed. It is important to observe that the more restrictive the boundary conditions are, the greater the chance that the operator fails to be self-adjoint (even if it is symmetric), since in general we only have $D(T) \subset D(T^*)$. The momentum operator of the next subsection illustrates this observation.
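The Rayleigh quotient is easy to evaluate numerically. The following sketch, written under the assumption $p \equiv 1$, $q \equiv 0$ on $[0,1]$ with Dirichlet conditions, evaluates it for the exact eigenfunction $\sin 3\pi x$ and for the trial function $x(1-x)$. The function `rayleigh` and its argument order are hypothetical names chosen for this example.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def rayleigh(p, q, u, du, a, b):
    """Rayleigh quotient for (p u')' + q u + lambda u = 0 on [a, b]."""
    boundary = p(b) * u(b) * du(b) - p(a) * u(a) * du(a)
    num = -boundary + simpson(lambda x: p(x) * du(x) ** 2 - q(x) * u(x) ** 2, a, b)
    den = simpson(lambda x: u(x) ** 2, a, b)
    return num / den

p = lambda x: 1.0
q = lambda x: 0.0

# Exact eigenfunction sin(3*pi*x): the quotient reproduces 9*pi^2.
lam3 = rayleigh(p, q,
                lambda x: math.sin(3 * math.pi * x),
                lambda x: 3 * math.pi * math.cos(3 * math.pi * x),
                0.0, 1.0)

# Trial function x(1 - x): the quotient is 10, slightly above pi^2.
trial = rayleigh(p, q, lambda x: x * (1 - x), lambda x: 1 - 2 * x, 0.0, 1.0)
print(lam3, 9 * math.pi ** 2, trial, math.pi ** 2)
```

Both values are nonnegative, as expected when the boundary term vanishes and $q \le 0$. The trial value $10$ lies just above the smallest eigenvalue $\pi^2$ of this model problem.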
1.10.4 Momentum Operator
The differential operator
\[ P = -i\frac{\partial}{\partial x} \]
is called the momentum operator, and has important applications in quantum mechanics. Let $P \in L(L^2[a,b])$. Remember that the domain of $P$ must be dense in $L^2(a,b)$ in order to define the adjoint. It is easy to see that this operator is symmetric. Indeed, writing the inner product as $\langle u, v\rangle = \int_a^b \bar{u}\,v\,dx$, where the bar denotes complex conjugation,
\[
\begin{aligned}
\langle u, Pv\rangle &= \int_a^b (Pv)\,\bar{u}\,dx \\
&= -i\int_a^b v'\,\bar{u}\,dx \\
&= \big[-i\,v\,\bar{u}\big]_a^b + i\int_a^b \bar{u}'\,v\,dx \\
&= \frac{1}{i}\big[\bar{u}(b)v(b) - \bar{u}(a)v(a)\big] + \langle P^*u, v\rangle .
\end{aligned}
\]
So by imposing the condition $u(a) = u(b)$, the boundary term vanishes, and we obtain symmetry and $P^*u = Pu$ on $D(P)$. Here, $D(P^*)$ consists of all functions in $C^1[a,b]$ such that $u(a) = u(b)$. If we adopt the same space $C^1[a,b]$ subject to the conditions
\[ u(a) = u(b) = 0 \tag{1.10.10} \]
for the domain of $P$, then
\[ D(P) \subsetneq D(P^*) \]
and $P$ is not self-adjoint. Hence the homogeneous conditions (1.10.10) only establish symmetry and do not lead to self-adjointness. Therefore, we always need to choose suitable boundary conditions to ensure not only symmetry, but also $D(P) = D(P^*)$.
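The role of the boundary term can be checked numerically. The sketch below assumes the inner product $\langle u, v\rangle = \int_a^b \bar{u}\,v\,dx$ (conjugation in the first slot, which is an assumption of this example) and computes the defect $\langle u, Pv\rangle - \langle Pu, v\rangle$ for one non-periodic and one periodic pair of functions on $[0,1]$. All function names are illustrative.

```python
import cmath

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; works for complex-valued f."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def ip(u, v, a, b):
    """Inner product <u, v> = integral of conj(u) * v (assumed convention)."""
    return simpson(lambda x: u(x).conjugate() * v(x), a, b)

def symmetry_defect(u, du, v, dv, a, b):
    """Return <u, Pv> - <Pu, v> for P = -i d/dx, given u, v and their derivatives."""
    Pu = lambda x: -1j * du(x)
    Pv = lambda x: -1j * dv(x)
    return ip(u, Pv, a, b) - ip(Pu, v, a, b)

a, b = 0.0, 1.0

# Non-periodic pair: the defect equals the boundary term (1/i)[conj(u) v]_a^b.
u = lambda x: complex(x, x * x)            # u(x) = x + i x^2
du = lambda x: complex(1.0, 2.0 * x)
v = lambda x: cmath.exp(1j * x)
dv = lambda x: 1j * cmath.exp(1j * x)
defect = symmetry_defect(u, du, v, dv, a, b)
boundary = (u(b).conjugate() * v(b) - u(a).conjugate() * v(a)) / 1j

# Periodic pair with w(a) = w(b) and z(a) = z(b): the defect vanishes.
w = lambda x: cmath.exp(2j * cmath.pi * x)
dw = lambda x: 2j * cmath.pi * cmath.exp(2j * cmath.pi * x)
z = lambda x: 1 + cmath.exp(4j * cmath.pi * x)
dz = lambda x: 4j * cmath.pi * cmath.exp(4j * cmath.pi * x)
defect_periodic = symmetry_defect(w, dw, z, dz, a, b)

print(abs(defect - boundary), abs(defect_periodic))
```

The first number shows that the failure of symmetry is exactly the boundary term, and the second shows that the periodic condition removes it.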
1.11 Problems
(1) Let $T \in B(H)$ and $T > 0$. Prove that $|\langle Tx, y\rangle|^2 \le \langle Tx, x\rangle\,\langle Ty, y\rangle$.
(2) Let $H$ be a complex Hilbert space.
(a) Let $\lambda \in \mathbb{C}$ and $T \in B(H)$ be a normal operator. Show that $T - \lambda I$ is normal.
(b) Let $\alpha > 0$. If $\|Tx\| \ge \alpha\|x\|$ for all $x$, then $T^*$ is one-to-one.
(3) If $T$ and $S$ are two positive bounded linear operators, and $TS = ST$, show that $TS$ is positive.
(4) Show that if $T$ is positive and $S$ is bounded then $S^*TS$ is positive.
(5) Let $T \in B(H)$ and suppose there exists $c > 0$ such that $c\|x\|^2 \le \langle Tx, x\rangle$ for all $x \in H$.
(a) Prove that $T^{-1}$ exists.
(b) Prove that $T^{-1}$ is bounded, with $\|T^{-1}\| \le \frac{1}{c}$.
(6) If $T \in B(H)$ is a normal operator, show that $T$ is invertible iff $T^*T$ is invertible.
(7) If $T_n \in B(H)$ is a sequence of self-adjoint operators and $T_n \to T$, show that $T$ is self-adjoint.
(8) Let $T \in B(H)$ such that $\|T\| \le 1$. Show that $Tx = x$ if and only if $T^*x = x$.
(9) Consider the integral operator
\[ (Tu)(x) = \int_{-\pi}^{\pi} K(x - t)\,u(t)\,dt. \]
Determine whether $T$ is self-adjoint in the following cases:
(a) $K(x) = |x|$.
(b) $K(x) = \sin x$.
(c) $K(x) = e^{ix}$.
(d) $K(x) = e^{-x^2}$.
(10) (a) Give an example of $T \in B(H)$ such that $T^2$ is compact, but $T$ is not.
(b) Show that if $T \in B(H)$ is self-adjoint, and $T^2$ is compact, then $T$ is compact.
(11) Let $T \in B(X, \ell^1)$. If $X$ is reflexive, show that $T$ is compact.
(12) Let $T : \ell^p \to \ell^p$, $1 < p < \infty$, be defined as
\[ T(x_1, x_2, \ldots) = (\alpha_1 x_1, \alpha_2 x_2, \ldots, \alpha_j x_j, \ldots), \]
where $|\alpha_n| < 1$ for all $n$. Show that $T$ is compact iff $\lim \alpha_n = 0$.
(13) Determine if the operator $T : \ell^\infty \to \ell^\infty$, defined as
\[ T(x_1, x_2, \ldots) = \left(x_1, \frac{x_2}{2}, \ldots, \frac{x_k}{k}, \ldots\right), \]
is compact.
(14) Consider $X = (C^1[0,1], \|\cdot\|_{1,\infty})$ and $Y = (C[0,1], \|\cdot\|_\infty)$ where
\[ \|f\|_{1,\infty} = \max\{|f(t)|, |f'(t)| : t \in [0,1]\}. \]
Let $i : X \to C[0,1]$ be the inclusion mapping. Show that $i$ is compact.
(15) If $T \in K(H)$ and $\{\phi_n\}$ is an orthonormal basis for $H$, show that $T(\phi_n) \to 0$.
(16) Show that if $T \in K(H)$ and $\dim(H) = \infty$ then $T^{-1}$ is unbounded.
(17) Let $H$ be a Hilbert space. Show that $K_2(H)$ is closed if and only if $H$ is finite-dimensional.
(18) Let $T \in K_2(H)$ and $S \in B(H)$. Show that $TS, ST \in K_2(H)$.
(19) Let $T \in B(H)$ with an orthonormal basis $\{e_n\}$ for $H$. Show that if $T$ is compact, then $\lim \|Te_n\| = 0$.
(20) Show that if $T \in K_2(H)$, then
\[ \|T^n\|_2 \le \|T\|_2^n \]
for all $n \in \mathbb{N}$.
(21) Consider Example 1.3.6.
(a) Prove that the integral operator $K$ in the example is compact if $X = L^p([a,b])$, $1 < p < \infty$, and $k$ is piecewise continuous on $[a,b] \times [a,b]$.
(b) Prove that $K$ is compact if $X = L^p([a,b])$, $1 < p < \infty$, and $k \in L^\infty([a,b] \times [a,b])$.
(c) Prove that $K$ is compact if $X = L^p(\Omega)$ for any bounded measurable $\Omega$ in $\mathbb{R}^n$ and $k \in L^q(\Omega \times \Omega)$, where $q$ is the Hölder conjugate of $p$ (i.e., $p^{-1} + q^{-1} = 1$).
(d) Prove that $K$ is compact if $X = L^p(\Omega)$ for any measurable $\Omega$ in $\mathbb{R}^n$.
(e) Give an example to show that $K$ is not compact if $X = L^1[a,b]$.
(22) Let $T : C[0,1] \to C[0,1]$ be defined by
\[ (Tf)(x) = \int_0^x \frac{f(\eta)}{\sqrt{x - \eta}}\,d\eta. \]
Show that $T$ is compact.
(23) Let $T : L^2[0,\infty) \to L^2[0,\infty)$,
\[ (Tf)(x) = \frac{1}{x}\int_0^x f(\xi)\,d\xi. \]
Show that $T$ is not compact. What if $T : L^2[0,1] \to L^2[0,1]$?
(24) Consider the Volterra operator $V : L^p[0,1] \to C[0,1]$, defined by
\[ (Vu)(x) = \int_0^x u(t)\,dt. \]
(a) Show that $V$ is linear and bounded with $\|V\| \le 1$.
(b) Show that $V$ is compact for all $1 < p \le \infty$.
(c) Show that $V$ is a Hilbert–Schmidt operator when $p = 2$.
(d) Give an example of a function $u \in L^1[a,b]$ to show that $V$ is not compact when $p = 1$.
(25) In Theorem 1.4.6, show that $T$ is compact if and only if the sequence $\{\alpha_n\}$ is bounded.
(26) Show that the subspace of Hilbert–Schmidt operators $K_2(H)$, endowed with the inner product defined by
\[ \langle T, S\rangle = \operatorname{tr}(S^*T) \]
for all $S, T \in K_2(H)$, forms a Hilbert space.
(27) If $T \in L(X)$, for some normed space $X$, show that the null space $N(T)$ and range space $R(T)$ are $T$-invariant subspaces of $X$.
(28) Let $T^* = T \in K(H)$ for a Hilbert space $H$. Prove the following.
(a) If $T$ has a finite number of eigenvalues, then $T$ is of finite rank.
(b) $R(T)$ is separable.
(29) Show that $T : L^2[0,1] \to L^2[0,1]$ is a Hilbert–Schmidt operator for the following:
(a) $Tf(x) = \int_0^x f(t)\,dt$.
(b) $Tf(x) = \int_0^1 (x - t)f(t)\,dt$.
(c) $Tf(x) = \int_0^x \frac{f(t)}{\sqrt{x - t}}\,dt$.
(30) Consider the Laplace transform $L : L^2(\mathbb{R}^+) \to L^2(\mathbb{R}^+)$ defined by
\[ Lf(x) = \int_0^\infty e^{-xs} f(s)\,ds. \]
(a) Show that $L$ is a bounded linear integral operator with
\[ \|Lf\|_2 \le \sqrt{\pi}\,\|f\|_2. \]
(b) Determine if $L$ is a Hilbert–Schmidt operator.
(c) Determine if $L$ is a compact operator.
(31) Determine whether the multiplication mapping
\[ Tu(x) = xu(x) \]
is compact if
(a) $D(T) = C[0,1]$.
(b) $D(T) = L^2[0,1]$.
(32) Show that the system
\[ \psi_{nm}(x, y) = \phi_n(x)\phi_m(y) \]
in (1.4.4) is an orthonormal basis for $L^2([a,b] \times [a,b])$ given that $(\phi_n)$ is an orthonormal basis for $L^2[a,b]$.
(33) Let $T \in B(X)$ for some Banach space $X$. Show that for all $n \in \mathbb{N}$,
\[ \{\lambda^n : \lambda \in \sigma(T)\} \subseteq \sigma(T^n). \]
(34) The spectral radius is defined as
\[ R(T) = \sup\{|\lambda| : \lambda \in \sigma(T) \subseteq \mathbb{C}\}. \]
(a) Show that $R(T) \le \|T\|$.
(b) Show that
\[ R(T) \le \inf_n \{\|T^n\|^{1/n}\}. \]
(c) If $T$ is normal, show that
\[ R(T) = \lim_n \|T^n\|^{1/n} = \|T\|. \]
(35) Find $\sigma_p(T)$ and $\sigma(T)$ for
(1) $R : \ell^p \to \ell^p$, $1 \le p \le \infty$, defined by
\[ R(x_1, x_2, \ldots) = (0, x_1, x_2, \ldots). \]
(2) $T : \ell^p \to \ell^p$, $1 < p < \infty$, defined by
\[ T(x_1, x_2, \ldots) = \left(x_1, \frac{x_2}{2}, \frac{x_3}{3}, \ldots, \frac{x_n}{n}, \ldots\right). \]
(3) $T : \ell^2 \to \ell^2$ defined by
\[ T(x_1, x_2, \ldots) = \left(\frac{x_2}{1}, \frac{x_3}{2}, \ldots, \frac{x_n}{n-1}, \ldots\right). \]
(4) $T : L^2(0,1) \to L^2(0,1)$, defined by $(Tu)(x) = xu(x)$.
(5) $T : D(T) \to L^2(0,1)$, $D(T) = \{u \in C^1(0,1) : u(0) = 0\}$, defined by $(Tu)(x) = u'$.
(6) $T : L^2[0,1] \to L^2[0,1]$, defined by
\[ Tf(x) = \int_0^x f(t)\,dt. \]
(36) Find the point spectrum of the left-shift operator
\[ (x_1, x_2, \ldots) \mapsto (x_2, x_3, x_4, \ldots) \]
(a) on $c$.
(b) on $\ell^\infty$.
(37) Let $T \in B(H)$ and $(T - \lambda_0 I)^{-1}$ be compact for some $\lambda_0 \in \rho(T)$. Show that
(a) $(T - \lambda I)^{-1}$ is compact for all $\lambda \in \rho(T)$.
(b) $\dim N(T - \lambda I) < \infty$ for all $\lambda \in \rho(T)$.
(38) Let $T \in B(X)$ be a normal operator and $\mu \in \rho(T)$. Show that $(T - \mu I)^{-1}$ is normal.
(39) Let $T \in B(H)$ and $\lambda \in \rho(T)$. Show that $T$ is symmetric if and only if $(T - \lambda I)^{-1}$ is symmetric.
(40) Let $T \in K_2(H)$. Show that if $T$ is a finite rank operator, then $\sigma(T)$ is finite.
(41) Write the details of the proof of Proposition 1.6.9.
(42) Write the details of the proof of the Spectral Mapping Theorem.
(43) Let $T \in K(X)$ for some Banach space $X$. Show that for $\lambda \neq 0$,
\[ \ker(T - \lambda I) = N(T_\lambda) \]
is finite-dimensional.
(44) Let $T \in B(X)$ for some Banach space $X$. Show that $T$ is invertible if and only if $T$ and $T^*$ are bounded below.
(45) Consider the differential operator $D : C^1[0,1] \to C[0,1]$, $D(f) = f'$. Show that $\rho(D) = \emptyset$.
(46) Let $T \in B(H)$. Show that $T^*$ is bounded below if and only if $T$ is onto.
(47) Let $T \in B(X)$ and $A \in K(X)$ for some Banach space $X$. Show that if $A + T$ is injective then it is invertible.
(48) In the proof of the Hilbert–Schmidt theorem, let $M = \operatorname{Span}\{\phi_n\}_{n \in \mathbb{N}}$.
(a) Show that $T|_{M^\perp} : M^\perp \to M^\perp$ is self-adjoint and compact.
(b) Show that $T|_{M^\perp}$ is the zero operator.
(c) Use (b) to show that $\overline{M} = \overline{R(T)}$.
(49) Let $A \in K(H)$ with eigenvalues $\{\lambda_n\}$, and let $\lambda \neq 0$. Show that the equation
\[ (\lambda I - A)u = f \]
has a solution if and only if $\langle f, v\rangle = 0$ for all $v \in \ker(\lambda_n I - A)$.
(50) Let $T \in K(H)$ be a compact self-adjoint operator and let $\lambda \in \rho(T)$. Let $(\lambda_n)$ be the eigenvalues of $T$ with the corresponding eigenvectors $(\phi_n)$. If $f \in H$, show that the solution to the equation $(\lambda I - T)u = f$ is
\[ u(x) = \sum_{n} (\lambda - \lambda_n)^{-1}\langle f, \phi_n\rangle\,\phi_n. \]
(51) Consider the Fourier transform $F : L^2(\mathbb{R}) \to L^2(\mathbb{R})$ defined by
\[ (Fu)(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{ikx}u(x)\,dx. \]
(a) Show that $F$ is a bounded operator defined on $L^2(\mathbb{R})$.
(b) Find its adjoint operator.
(c) Determine if it is compact.
(d) Find its eigenvalues.
(52) Consider the Volterra equation of the second kind, $V : C[0,1] \to C[0,1]$,
\[ u(x) - \int_0^x k(x, y)u(y)\,dy = f. \]
(a) Use the Fredholm Alternative to prove the existence of a unique solution $u \in C[0,1]$ of the equation for $f \in C[0,1]$.
(b) Use the method of successive iterations that was implemented in the proof of Theorem 1.8.8 to find the solution.
(53) Let $u \in L^2[a,b]$, $k \in C[a,b]$ such that $|\lambda|\,\|k\|_2 < 1$. Show that the equation
\[ u(x) = \lambda\int_a^b k(x, y)u(y)\,dy + f(x) \]
has a unique solution.
(54) Consider the Fredholm integral operator $V$:
\[ Vu(x) = \int_\Omega k(x, y)u(y)\,dy, \]
for some $\Omega \subset \mathbb{R}^n$. If $1 \notin \sigma(V)$, show that there exists a unique solution of the equation
\[ u(x) - \int_\Omega k(x, y)u(y)\,dy = f \]
for all $f \in C(\Omega)$.
(55) Let $K : L^2[a,b] \to L^2[a,b]$ be a self-adjoint integral operator with a symmetric continuous kernel. Let $\phi_n$ be the eigenfunctions of the operator $K$. If there exists $g \in L^2[a,b]$ such that $Kg = f$, show that
\[ f = \sum_{n=1}^{\infty} \langle f, \phi_n\rangle\,\phi_n \]
converges uniformly.
(56) Give an example, other than the examples mentioned in the text, of a linear operator defined on a Banach space that is
(a) bounded but a non-closed operator.
(b) bounded with a non-closed range.
(c) closed but unbounded.
(57) Let $T \in L(H)$ be a closed and densely defined operator.
(a) Show that $\sigma(T^*) = \overline{\sigma(T)}$.
(b) Show that $T$ is closed if and only if $R(T)$ is closed.
(58) Recall that an operator $T$ is called positive if $\langle Tx, x\rangle \ge 0$ for all $x \in H$.
(a) Show that every positive operator is symmetric.
(b) Show that the eigenvalues of a positive operator are nonnegative.
(59) Let $T \in L(H)$ be a closed and densely defined operator. Show that $T^*T$ is positive and self-adjoint on $H$.
(60) If $T$ is an operator and $T^{-1}$ is closed and bounded, show that $T$ is closed.
(61) If $T \in L(H)$ is closed and $S \in B(H)$,
(a) Show that $T + S$ is closed.
(b) Show that $(T + S)^* = T^* + S^*$.
(62) Let $S, T \in L(H)$ be two densely defined unbounded operators. If $D(S) \subset D(T)$ and $T^{-1} = S^{-1}$, show that $T = S$.
(63) Let $S, T \in L(H)$ be two densely defined unbounded operators such that $\overline{D(ST)} = H$.
(a) Show that
\[ T^*S^* \subset (ST)^*. \]
(b) If $S$ is bounded and $D(S) = H$, show that $T^*S^* = (ST)^*$.
(64) Let $T \in L(H)$ be a densely defined operator.
(a) If $T$ is symmetric, show that $T \subset T^{**} \subset T^*$.
(b) Moreover, if $T$ is closed then $T = T^{**} \subset T^*$.
(65) Let $T \in L(H)$ be a densely defined operator on a Hilbert space $H$. Use the preceding problem to show that $T$ is bounded if and only if $T^*$ is bounded.
(66) Let $T \in L(H)$ be an unbounded closed densely defined operator defined on a Hilbert space $H$.
(a) Show that $\sigma(T)$ is closed.
(b) Show that $\lambda \in \sigma(T)$ iff $\bar{\lambda} \in \sigma(T^*)$.
(c) If $i \in \rho(T)$, show that $(T^* - i)^{-1}$ is the adjoint of $(T + i)^{-1}$.
(67) Let $T \in L(H)$ be a densely defined operator on a Hilbert space $H$. Show that if $\lambda \in \rho(T)$ then $T_\lambda$ is bounded below.
(68) Let $T \in L(H)$ be a densely defined operator on a Hilbert space $H$. Show that if there exists a real number $\lambda \in \rho(T)$ then $T$ is symmetric if and only if $T$ is self-adjoint.
(69) Let $T \in L(H)$ be a densely defined and symmetric operator on $H$. Show that if there exists $\lambda \in \mathbb{C}$ such that
\[ R(T - \lambda I) = H \quad \text{and} \quad R(T - \bar{\lambda} I) = H, \]
then $T$ is self-adjoint.
(70) Find the integral operator and find the eigenvalues and the corresponding eigenvectors for the problem $Lu = -u''$ provided that
(a) $u(0) = u'(1) = 0$.
(b) $u'(0) = u(1) = 0$.
(c) $u'(0) = u'(1) = 0$.
(71) Let $T : L^2[0,1] \to L^2[0,1]$, $Tf = f'$. Find the spectrum of $T$ if
(a) $D(T) = \{f \in C^1[0,1] : f(0) = 0\}$.
(b) $D(T) = \{f \in AC[0,1] : f' \in L^2[0,1] \text{ and } f(0) = f(1)\}$.
(72) Consider the differential operator
\[ L = e^x D^2 + e^x D \]
defined on $[0,1]$ such that $u'(0) = u(1) = 0$. Determine whether or not the operator is self-adjoint (where $D$ is the first derivative).
(73) Show that the following operators are of the Sturm–Liouville type.
(a) Legendre: $(1 - x^2)D^2 - 2xD + \lambda$ on $[-1, 1]$.
(b) Bessel: $x^2D^2 + xD + (x^2 - n^2)$.
(c) Laguerre: $xD^2 + (1 - x)D + \lambda$ on $0 < x < \infty$.
(d) Chebyshev: $\sqrt{1 - x^2}\,D\big[\sqrt{1 - x^2}\,D\big] + \lambda$ on $[-1, 1]$.
(74) Convert the equation
\[ y'' - 2xy' + 2ny = 0 \]
on $(-\infty, \infty)$ into a Sturm–Liouville equation.
(75) Determine whether or not the Sturm–Liouville operator
\[ L = D^2 + 1 \]
on $[0, \pi]$ is self-adjoint under the conditions
(a) $u(0) = u(\pi) = 0$.
(b) $u(0) = u'(0) = 0$.
(c) $u(0) = u'(0)$ and $u(\pi) = u'(\pi)$.
(76) Consider the equation
\[ Lu = u'' = f \]
where $f \in L^2[0,1]$. Find $L^*$ if the equation is subject to the boundary conditions
(a) $u(0) = u'(0) = 0$.
(b) $u'(0) = u'(1) = 0$.
(c) $u(0) = u(1)$.
(77) Consider the problem
\[ Lu = u'' = f \]
where $f \in L^2[0, \pi]$ under the conditions $u(0) = u(\pi) = 0$.
(a) Show that $L$ is injective.
(b) Show that $L$ is self-adjoint.
(c) Find an orthonormal basis for $L^2[0, \pi]$.
(78) Consider the problem
\[ Lu = -u'' = f \]
where $f \in L^2[-\pi, \pi]$ under the conditions $u(-\pi) = u(\pi)$ and $u'(-\pi) = u'(\pi)$.
(a) Show that $L$ is injective.
(b) Show that $L$ is self-adjoint.
(c) Find an orthonormal basis for $L^2[-\pi, \pi]$.
(79) Consider the problem
\[ Lu = u'' + \lambda u = 0, \]
where $0 < x < 1$, under the conditions $u(0) = u(1)$ and $u'(0) = u'(1)$.
(a) Show that $L$ is injective.
(b) Show that $L$ is self-adjoint.
(c) Find an orthonormal basis for $L^2[0,1]$.
(80) Consider the operator $Lu = iu'$, where $i = \sqrt{-1}$, and such that
\[ D(L) = \{u \in C^1[0,1] : u(0) = u(1) = 0\}. \]
Show that $L$ is symmetric but not self-adjoint.
(81) Consider the problem
\[ u'' + qu = f(x) \]
for $q \in C[a,b]$ and $f \in L^2[a,b]$ subject to the conditions $u(a) = u(b) = 0$. Show that the solution to the problem exists and is given by
\[ u(x) = \sum_{n=1}^{\infty} \frac{1}{\lambda_n}\langle f, \phi_n\rangle\,\phi_n(x), \]
where $\{\lambda_n\}$ is the set of eigenvalues such that $|\lambda_n| \to \infty$ and $\{\phi_n\}$ are the corresponding eigenfunctions that form an orthonormal basis for $L^2[a,b]$.
(82) Let $L$ be a self-adjoint differential operator and let $f \in L^2[0,1]$. Use the Fredholm Alternative to discuss the solvability of the two boundary value problems
(1) $Lu = f$ defined on $[0,1]$ subject to the conditions $u(0) = \alpha$ and $u(1) = \beta$.
(2) $Lu = 0$ defined on $[0,1]$ subject to the conditions $u(0) = u(1) = 0$.
(83) Determine the value of $\lambda \in \mathbb{R}$ for which the operator $T : C[0,1] \to C[0,1]$ defined by
\[ (Tu)(x) = u(0) + \int_0^x u(t)\,dt \]
is a contraction.
Chapter 2
Distribution Theory
2.1 The Notion of Distribution
2.1.1 Motivation For Distributions
Recall that in Sect. 1.10 the Dirac delta was introduced with no mathematical foundation, and we mentioned that a rigorous analysis is needed to validate its construction. This is one of the main motivations for developing the theory of distributions, and the purpose of this chapter is to introduce the theory to the reader and discuss its most important basics. As explained earlier, the Dirac delta cannot be considered a function. We shall call such mathematical objects distributions. Distributions are not functions in the classical sense because they exhibit features that go beyond the definition of a function. We can, however, view them as “generalized” functions, provided that the definition of function is extended to include them. This “generalized” feature gives distributions more power and flexibility, enabling them to represent more complicated behavior that cannot be represented by functions. For this reason, distributions are very useful in applications to physics and engineering, such as quantum mechanics, electromagnetic theory, aerodynamics, and many other fields. This chapter defines the notion of distribution and discusses some fundamental properties. Then, we will carry out some operational calculus in a rigorous mathematical setting, such as derivatives, convolutions, and Fourier transforms.
2.1.2 Test Functions
In principle, a distribution should act on other functions rather than be evaluated at particular points: its action on a given function determines its value, and this action is defined through an integral over a domain.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 A. Khanfer, Applied Functional Analysis, https://doi.org/10.1007/978-981-99-3788-2_2
So a distribution, denoted by $T$, is an operator, in fact a functional, and should take the following form:
\[ T(\phi) = \int_{-\infty}^{\infty} T(x)\phi(x)\,dx. \tag{2.1.1} \]
It is simply an integral operator acting on functions through an integral over $\mathbb{R}$, or more generally $\mathbb{R}^n$. Observe that not every function $\phi$ can be admitted in (2.1.1), so we certainly need to impose some conditions on the admissible functions $\phi$ to ensure that the operator $T$ is well-defined. To handle this issue, note that by the linearity of integrals
\[ \int_{-\infty}^{\infty} T(x)\big[c_1\phi(x) + c_2\psi(x)\big]\,dx = c_1\int_{-\infty}^{\infty} T(x)\phi(x)\,dx + c_2\int_{-\infty}^{\infty} T(x)\psi(x)\,dx, \]
which implies that
\[ T(c_1\phi + c_2\psi) = c_1T(\phi) + c_2T(\psi). \]
So the operator $T$ is clearly linear. If we assume that both $\phi$ and $\psi$ are functions in the domain of $T$, then $c_1\phi + c_2\psi$ must be in the domain as well. This implies that the space of all admissible functions $\phi$ of a distribution $T$ must be a vector space. The admissible functions must satisfy two essential conditions.
(1) They are infinitely differentiable. That is, $\phi \in C^\infty$. Functions in the space $C^\infty(\Omega)$ are called smooth functions on $\Omega$.
(2) They must be of compact support. Recall that the support of a function $f$, denoted by $\operatorname{supp}(f)$, is defined as
\[ \operatorname{supp}(f) = \overline{\{x \in \operatorname{Dom}(f) : f(x) \neq 0\}}. \]
According to the Heine–Borel theorem, a compact set in $\mathbb{R}^n$ is closed and bounded, so saying that a function $f$ has compact support $K$ means that $K$ is closed and bounded and
\[ f(x) = 0 \quad \text{for all } x \notin K, \]
so that $f$ can be nonzero only at points of $K$.
The space of continuous functions of compact support is denoted by $C_c(\Omega)$. Similarly, the space of smooth functions of compact support is denoted by $C_c^\infty(\Omega)$. There are two reasons to impose the first condition: one is related to differentiation of distributions, and the other to their Fourier transforms. We will clarify these two points when we come to them. The second condition is a strong condition that converts the integral from improper to proper. Indeed, if $\phi$ is of compact support, then we can find two real numbers $a$ and $b$ such that
\[ \int_{-\infty}^{\infty} T(x)\phi(x)\,dx = \int_a^b T(x)\phi(x)\,dx. \]
Now, we are ready to provide our first definition.
Definition 2.1.1 (Test Function) A function is called a test function on $\Omega \subseteq \mathbb{R}^n$ if it is smooth on $\Omega$ and is of compact support. The space of test functions on $\Omega$ is denoted by $\mathcal{D}(\Omega) = C_c^\infty(\Omega)$.
It is readily seen that the space $\mathcal{D}$ is a linear space. The dual space of $\mathcal{D}$ is denoted by $\mathcal{D}'$. We need to impose the notion of a distance in $\mathcal{D}$ in order to define convergence in this space. The nature of the members of the space suggests uniform convergence.
Definition 2.1.2 (Convergence in $\mathcal{D}$) Let $\phi_n, \phi \in \mathcal{D}(\Omega)$. Then, we say that $\phi_n \to \phi$ in $\mathcal{D}(\Omega)$ if there exists a compact set $K$ such that $\phi_n, \phi = 0$ outside $K$, and $\phi_n - \phi \to 0$ uniformly, i.e.,
\[ \max_{x \in K} |\phi_n(x) - \phi(x)| \to 0. \]
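A standard example of a test function on $\mathbb{R}$ is the bump function $\exp(-1/(1 - x^2))$ for $|x| < 1$, extended by zero. The sketch below evaluates it and shows a sequence $\phi_n = \phi/n$ that converges to zero in the sense of Definition 2.1.2 as stated here: all members vanish outside the common compact set $K = [-1, 1]$ and the maximum tends to zero. The name `bump` and the sampling grid are choices made for this illustration.

```python
import math

def bump(x):
    """Smooth bump: exp(-1/(1 - x^2)) for |x| < 1 and 0 otherwise."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

# Sample K = [-1, 1]; every phi_n = bump / n vanishes outside K.
grid = [-1.0 + k / 500 for k in range(1001)]
sup_norms = [max(abs(bump(x) / n) for x in grid) for n in (1, 10, 100)]
print(sup_norms)
```

The printed maxima shrink like $1/n$, starting from $e^{-1}$ at $n = 1$.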
2.1.3 Definition of Distribution
We are now ready to give a definition of a distribution.
Definition 2.1.3 (Distribution) A distribution $T$ is a continuous linear functional $T : \mathcal{D}(\mathbb{R}^n) \to \mathbb{R}$, given by
\[ T(\phi) = \int_{\mathbb{R}^n} T(x)\phi(x)\,dx \]
for every test function $\phi$ in $\mathcal{D}$.
A distribution $T$ acting on a function $\phi$ can also be denoted by $\langle T, \phi\rangle$. The notation $\langle\cdot,\cdot\rangle$ here is not an inner product, but it was adopted because $T$ behaves similarly when acting on functions. Note that in case $n = 1$, the definition reduces to (2.1.1).
Remark All results for $n = 1$ apply to $n > 1$, so for the sake of simplicity, we may establish some of the upcoming results only for $n = 1$, and the reader can extend them to $n > 1$ either by induction or by routine calculations.
The definition implies that a functional on $\mathcal{D}$ is a distribution in the dual space $\mathcal{D}'(\mathbb{R})$ if it is linear, continuous, and satisfies (2.1.1). Recall that by linearity we mean:
\[ \langle T, \alpha\phi + \beta\psi\rangle = \alpha\langle T, \phi\rangle + \beta\langle T, \psi\rangle. \]
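By continuity, we mean: if $\phi_n \to \phi$ in $\mathcal{D}$ then $\langle T, \phi_n\rangle \to \langle T, \phi\rangle$. Since for linear functionals continuity at a point implies continuity at all points, it is enough to study convergence at zero, i.e.,
\[ \langle T, \phi_n\rangle \to \langle T, 0\rangle = 0 \]
whenever $\phi_n \to 0$. The dual space of $\mathcal{D}$, denoted by $\mathcal{D}'(\mathbb{R})$, is endowed with the weak-star topology, in which convergence means pointwise convergence on test functions.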
2.2 Regular Distribution
2.2.1 Locally Integrable Functions
Definition 2.1.3 offers a rigorous mathematical definition of a distribution by means of integration. However, the nature of this definition still raises an issue: the functional $T$ is defined in terms of itself, because $T$ is inserted inside the integral and acts pointwise on $x \in \mathbb{R}^n$ as the test functions do, and then we compute the integral over $\mathbb{R}^n$ to obtain $T$ again. This approach can be awkward and unclear in some situations. If we can associate the functional $T$ with another function, say $f$, that can be used to define the functional appropriately, then $T$ would be characterized by $f$. But how do we choose such an $f$? The following definition gives some help in this regard.
Definition 2.2.1 (Locally Integrable Function) Let $f : \Omega \to \mathbb{R}$ be a measurable function. Then, $f$ is called a locally integrable function if it is Lebesgue-integrable on every compact subset of $\Omega$. The space of all locally integrable functions on $\Omega$ is denoted by $L^1_{\mathrm{loc}}(\Omega)$.
The definition implies that every continuous function on $\mathbb{R}^n$ is locally integrable. The constant function $f(x) = c \neq 0$ is locally integrable on $\mathbb{R}$ but not integrable on it. If a function is continuous on $\mathbb{R}$ (hence locally integrable) but not integrable, this means it does not decay fast enough at infinity.
2.2.2 Notion of Regular Distribution
The notion of locally integrable functions suggests the following idea: since $\phi$ is continuous on a compact set $K$ containing its support, it has a maximum value on that set, and if we multiply a locally integrable function $f$ by a test function $\phi$ and integrate over $\mathbb{R}^n$, this gives
\[ \left|\int_{\mathbb{R}^n} f(x)\phi(x)\,dx\right| = \left|\int_K f(x)\phi(x)\,dx\right| \le \int_K |f(x)\phi(x)|\,dx \le \Big(\max_{x \in K}|\phi|\Big)\int_K |f|\,dx < \infty. \]
Therefore, one way to characterize a distribution $T$ and give it an appropriate integral representation is to use a locally integrable function to define it, and this will yield a meaningful integral. Since the integral exists and is finite, the functional is well-defined. To show it is a distribution, we need to prove linearity and continuity. Linearity is clear:
\[
\begin{aligned}
T(\phi_1 + \phi_2) &= \int f(x)(\phi_1 + \phi_2)\,dx \\
&= \int f(x)\phi_1(x)\,dx + \int f(x)\phi_2(x)\,dx \\
&= \langle T, \phi_1\rangle + \langle T, \phi_2\rangle = T(\phi_1) + T(\phi_2).
\end{aligned}
\]
To prove continuity, we assume $\phi_n \to 0$, then
\[ \max_{x \in K}|\phi_n| \to 0 \]
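for some compact $K \supset \operatorname{supp}(\phi_n)$. Then,
\[ |\langle f, \phi_n\rangle| = \left|\int_{\mathbb{R}^n} f(x)\phi_n(x)\,dx\right| \le \Big(\max_{x \in K}|\phi_n|\Big)\int_K |f|\,dx \to 0. \]
Therefore, $f$ defines a distribution on $\mathbb{R}^n$, i.e., $f \in \mathcal{D}'(\mathbb{R}^n)$.
Definition 2.2.2 (Regular Distribution) Let $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$. Then, the distribution $T_f$ given by
\[ \langle T_f, \phi\rangle = \int_{\mathbb{R}^n} f(x)\phi(x)\,dx \]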
is called a regular distribution.
According to Definition 2.2.2, any regular distribution is characterized by a locally integrable function. Indeed, if $f = g$, then $T_f = T_g$. On the other hand, if $f$ and $g$ are locally integrable functions and $T_f = T_g$, then $f = g$ except on a set of measure zero. So we can say that a regular distribution is uniquely determined, up to a set of measure zero, by a locally integrable function. We could go the other way around and say that every locally integrable function can be used to define a regular distribution. If no such $f$ exists to define a distribution, it is a singular distribution. This leads to the fact that some functions defined in the classical sense, such as the class of locally integrable functions, can be considered regular distributions. This also shows that the value of a distribution at a point can have a meaning only if the distribution is regular, because it can then be identified with a function in $L^1_{\mathrm{loc}}(\Omega)$. This fact is rather interesting and shows how distributions generalize the notion of classical functions, which implies that calculus operations (such as limits and differentiation) that were classically developed for functions can also be implemented in a distributional sense.
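The estimate $|\langle T_f, \phi\rangle| \le (\max_K |\phi|)\int_K |f|\,dx$ and the linearity of $T_f$ can be observed numerically. The sketch below takes $f(x) = |x|$, which is locally integrable but not integrable on $\mathbb{R}$, and the bump function on $K = [-1, 1]$ as the test function. The helper `regular_distribution` is a hypothetical name introduced for this example, and Simpson's rule stands in for the Lebesgue integral.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def bump(x):
    """Smooth bump supported in [-1, 1]."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def regular_distribution(f, a, b):
    """Return phi -> integral of f * phi over [a, b]; [a, b] must contain supp(phi)."""
    return lambda phi: simpson(lambda x: f(x) * phi(x), a, b)

T_f = regular_distribution(abs, -1.0, 1.0)   # f(x) = |x|

value = T_f(bump)
bound = math.exp(-1.0) * 1.0                  # max|bump| times integral of |x| over K

# Linearity on a second test function psi(x) = x * bump(x).
psi = lambda x: x * bump(x)
lhs = T_f(lambda x: 2.0 * bump(x) + 3.0 * psi(x))
rhs = 2.0 * T_f(bump) + 3.0 * T_f(psi)
print(value, bound, abs(lhs - rhs))
```

The pairing is finite and stays below the bound, even though $f$ itself is not integrable over $\mathbb{R}$.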
2.2.3 The Dual Space $\mathcal{D}'$
The dual space of $\mathcal{D}$ is the space of all distributions on $\mathcal{D}$, and is denoted by $\mathcal{D}'$. So if $T \in \mathcal{D}'$, then $T : \mathcal{D} \to \mathbb{R}$. The space $\mathcal{D}'$ is linear (check!), which means that for any two $T_1, T_2 \in \mathcal{D}'$, we have $\alpha T_1 + \beta T_2 \in \mathcal{D}'$. The convergence in $\mathcal{D}$ can take a weak form. Recall that a sequence $\phi_n \rightharpoonup \phi$ in $\mathcal{D}$ weakly if $T(\phi_n) \to T(\phi)$ for every $T \in \mathcal{D}'$. The convergence in $\mathcal{D}'$ takes the weak-star form. Recall that a sequence $T_n \to T$ in $\mathcal{D}'$ in the weak-star topology if $T_n(\phi) \to T(\phi)$ for every $\phi \in \mathcal{D}$. Another characterization is equality. Recall that for functions $f$ and $g$, we say that $f = g$ on $\Omega$ if $f(x) = g(x)$ for every $x \in \Omega$. That is, the equality of functions is contingent on their values. To extend this to distributions, we say that two distributions $T$ and $S$ are equal if and only if $T(\phi) = S(\phi)$ for all $\phi \in \mathcal{D}$. That is,
\[ T = S \iff \langle T, \phi\rangle = \langle S, \phi\rangle \quad \forall \phi \in \mathcal{D}. \tag{2.2.1} \]
By continuity,
\[ T_n \to T \iff \langle T_n, \phi\rangle \to \langle T, \phi\rangle \quad \forall \phi \in \mathcal{D}. \tag{2.2.2} \]
Note that when we say $T = S$, it means they are equal in the distributional sense.
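2.2.4 Basic Properties of Regular Distributions
Here are some elementary properties of distributions.
Proposition 2.2.3 Let $T$ and $S$ be two distributions, and $\phi \in \mathcal{D}(\Omega)$, $\Omega \subseteq \mathbb{R}^n$. Then, the following properties hold.
(1) $T(0) = 0$.
(2) $\langle T, c\phi\rangle = c\langle T, \phi\rangle$ for any $c \in \mathbb{R}$.
(3) $\langle T + S, \phi\rangle = \langle T, \phi\rangle + \langle S, \phi\rangle$.
(4) $\langle T(x - c), \phi(x)\rangle = \langle T, \phi(x + c)\rangle$. If $\langle T(x - c), \phi\rangle = \langle T, \phi(x)\rangle$ then $T$ is said to be invariant with respect to translations.
(5) $\langle T(-x), \phi(x)\rangle = \langle T(x), \phi(-x)\rangle$.
(6) $\langle T(cx), \phi(x)\rangle = \frac{1}{|c|}\left\langle T(x), \phi\left(\frac{x}{c}\right)\right\rangle$ for any nonzero number $c$ (stated for $n = 1$).
(7) $\langle T \cdot S, \phi\rangle = \langle S, T\phi\rangle$, provided that $T\phi \in \mathcal{D}(\Omega)$.
(8) Let $x = g(y) \in C^\infty$ be an injection. Then,
\[ \langle T \circ g, \phi\rangle = \left\langle T, (\phi \circ g^{-1})\,\big|(g^{-1})'(x)\big|\right\rangle. \]
Proof Properties (1) to (7) follow directly from the definition using some simple substitutions, so the reader is asked to present proofs for them. For (8), we let $y = g^{-1}(x)$ in the integral
\[ \int T(g(y))\phi(y)\,dy. \]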
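For a regular distribution, the translation and scaling properties reduce to changes of variables, so they can be checked numerically. The sketch below does this on $\mathbb{R}$ for $f(x) = 1 + x + x^2$, a shifted bump as the test function, and the negative value $c = -2$; with a negative $c$, the normalizing factor is taken here as $1/|c|$, which is what the one-dimensional change of variables gives. The names `pairing`, `f`, and `phi` are assumptions of this example.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def bump(x):
    """Smooth bump supported in [-1, 1]."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def pairing(f, phi, a=-6.0, b=6.0, n=6000):
    """<T_f, phi> for a regular distribution; [a, b] must contain supp(phi)."""
    return simpson(lambda x: f(x) * phi(x), a, b, n)

f = lambda x: 1.0 + x + x * x      # locally integrable, not even
phi = lambda x: bump(x - 0.3)      # test function supported in [-0.7, 1.3]
c = -2.0

# Translation: <T(x - c), phi(x)> = <T, phi(x + c)>.
shift_lhs = pairing(lambda x: f(x - c), phi)
shift_rhs = pairing(f, lambda x: phi(x + c))

# Scaling: <T(cx), phi(x)> = (1/|c|) <T, phi(x/c)>.
scale_lhs = pairing(lambda x: f(c * x), phi)
scale_rhs = pairing(f, lambda x: phi(x / c)) / abs(c)

print(shift_lhs - shift_rhs, scale_lhs - scale_rhs)
```

Both differences are at the level of rounding error. Dividing by $c$ instead of $|c|$ in the scaling identity would flip the sign of the right-hand side for this negative $c$.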
2.3 Singular Distributions
2.3.1 Notion of Singular Distribution
The terms “regular” and “singular”, when applied to functions, indicate whether a function is finite or infinite on some particular domain. The points at which the function blows up are called singularities. If we extend the notion to distributions, we shall say that a distribution $T$ is regular when the value $\int_{-\infty}^{\infty} f(x)\phi(x)\,dx$ is finite and well-defined. This can be achieved if there exists a function $f$ that is integrable on every compact set. This ensures that $f$ is integrable on $\operatorname{supp}(\phi)$; hence the integral is well defined. If no such $f$ exists, then the distribution is called singular. One way to construct a singular distribution is through the Cauchy principal value of a nonintegrable function having a singularity at a point $a$. Such a function cannot define a regular distribution, but we can define the principal value of it, denoted by $\operatorname{p.v.} f(x)$, as follows
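\[ \langle \operatorname{p.v.} f(x), \phi(x)\rangle = \lim_{\varepsilon \downarrow 0^+}\int_{|x - a| > \varepsilon} f(x)\phi(x)\,dx, \]
where $f$ has a singularity at $x = a$. The following example illustrates the idea.
Example 2.3.1 Consider $f(x) = \frac{1}{x}$ on $\mathbb{R}$. Obviously, $f \notin L^1_{\mathrm{loc}}(\mathbb{R})$ due to the singularity at $x = 0$. Hence, we apply its principal value,
\[ \left\langle \operatorname{p.v.} \frac{1}{x}, \phi\right\rangle = \lim_{\varepsilon \downarrow 0^+}\int_{|x| > \varepsilon} \frac{\phi(x)}{x}\,dx. \tag{2.3.1} \]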
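Since $\phi$ has compact support, we can find a compact interval $K = [-r, r]$, with $r > \varepsilon$, such that $\phi$ vanishes outside $K$. So (2.3.1) can be written as
\[ \lim_{\varepsilon \downarrow 0^+} \cdots \]
[…]
\[ \int_{|x| \ge r > 0} \varphi_m(x)\big[\phi(x) - \phi(0)\big]\,dx + \int_{-r}^{r} \varphi_m(x)\big[\phi(x) - \phi(0)\big]\,dx. \]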
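For the first integral, we pass the limit inside the integral using (2.3.9) due to uniform convergence on $|x| \ge r > 0$, and this gives $0$. For the second integral, note that $\phi$ is continuous at $x = 0$. Let $\varepsilon > 0$. Then, there exists $r_1 > 0$ such that $|\phi(x) - \phi(0)| < \varepsilon$ for $|x| < r_1$. Moreover, we can find $r_2$ sufficiently small such that
\[ \int_{-r_2}^{r_2} \varphi_m(x)\,dx < 1. \]
Let $r = \min\{r_1, r_2\}$. Then
\[ \left|\int_{-r}^{r} \varphi_m(x)\big[\phi(x) - \phi(0)\big]\,dx\right| \le \varepsilon\int_{-r}^{r} \varphi_m(x)\,dx < \varepsilon. \]
Note that $\varepsilon$ is arbitrary, hence we obtain
\[ \lim_{m \to \infty}\int_{-\infty}^{\infty} \varphi_m(x)\phi(x)\,dx = 0 + \phi(0)\int_{-\infty}^{\infty} \varphi_m(x)\,dx = \phi(0) \cdot 1 = \phi(0). \tag{2.3.11} \]
In view of Definition 2.3.3, we write (2.3.11) as
\[ \lim_{m \to \infty}\langle \varphi_m(x), \phi(x)\rangle = \langle \delta(x), \phi(x)\rangle. \]
It follows from (2.2.1) that
\[ \lim_{m \to \infty} \varphi_m(x) = \delta(x). \]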
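Since $\{\varphi_n\} \subset L^1_{\mathrm{loc}}(\Omega)$, each $\varphi_n$ can be considered a regular distribution, which implies that, though $\delta$ is not regular, it is the limit of a sequence of regular distributions. This argument raises the following question: does such a sequence exist? The answer is yes, and one way to construct it is to consider a function $\varphi \in \mathcal{D}(\Omega)$ with the properties $\varphi \ge 0$ on $K = \operatorname{supp}(\varphi)$ and $\int_K \varphi = 1$. Since $\varphi$ is continuous on the compact set $K$, it is uniformly continuous on $K$, so for every $\eta > 0$ we can find $n$ sufficiently large such that
\[ \left|\varphi\left(\frac{x}{n}\right) - \varphi(0)\right| < \eta \]
for all $x \in K$. This shows that
\[ \varphi\left(\frac{x}{n}\right) \to \varphi(0) \]
uniformly as $n \to \infty$, and this can be written as $\varphi(\varepsilon y) \to \varphi(0)$ uniformly as $\varepsilon \to 0$. Now, for the function $\varphi \in \mathcal{D}(\Omega)$ and $n > 0$, we define the sequence
\[ \varphi_n(x) = n\varphi(nx) \tag{2.3.12} \]
for $\Omega \subseteq \mathbb{R}$. Then, we observe the following:
(1) $0 \le \varphi_n(x) \in \mathcal{D}(\Omega)$ for all $n$.
(2) $\varphi_n(x) \to 0$ for $x \neq 0$, uniformly on $|x| \ge r$ for every $r > 0$, and $\varphi_n(0) \to \infty$ as $n \to \infty$ (provided $\varphi(0) > 0$).
(3) $\int_{-\infty}^{\infty} \varphi_n(x)\,dx = 1$. This can be seen from the substitution $nx = y$.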
If we represent $\delta$ by (2.3.6), which physicists use, then observation (2) leads to the result directly, but we prefer to maintain a rigorous treatment. Given the observations above, we see that $\{\varphi_n\}$ in (2.3.12) is a delta sequence, hence by Theorem 2.3.5, we obtain
\[ \lim_{n \to \infty} \varphi_n(x) = \delta(x). \]
Indeed, for $\phi \in \mathcal{D}(\Omega)$ we have
\[ \int_{-\infty}^{\infty} \varphi_n(x)\phi(x)\,dx = \int_{-\infty}^{\infty} n\varphi(nx)\phi(x)\,dx. \]
Using the substitution $y = nx$, we obtain
\[ \int_{-\infty}^{\infty} \varphi_n(x)\phi(x)\,dx = \int_{-\infty}^{\infty} \varphi(y)\,\phi\left(\frac{y}{n}\right)dy. \]
Using the same argument as above on $\phi$, we have
\[ \phi\left(\frac{y}{n}\right) \to \phi(0) \]
uniformly. Hence, we can pass the limit inside the previous integral and obtain
\[ \lim_{n \to \infty}\int_{-\infty}^{\infty} \varphi(y)\,\phi\left(\frac{y}{n}\right)dy = \phi(0)\int_{-\infty}^{\infty} \varphi(y)\,dy = \phi(0). \]
Therefore, as $n \to \infty$ we have $\varphi_n(x) \to \delta(x)$.
Remark The following should be noted:
1. The condition $\varphi \in \mathcal{D}(\Omega)$ was used so that we work with continuous functions on compact sets, but this condition can be relaxed and we still obtain a delta sequence $\{\varphi_n\}$ without its members necessarily being in $\mathcal{D}(\Omega)$.
2. The above discussion can be extended analogously to $\mathbb{R}^n$.
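The convergence $\langle \varphi_n, \phi\rangle \to \phi(0)$ can be observed numerically. The sketch below builds the delta sequence (2.3.12) from a normalized bump function and pairs it with one particular test function. The names `delta_seq` and `probe`, the choice of test function, and the values of $n$ are assumptions made for this illustration.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def bump(x):
    """Smooth bump supported in [-1, 1]."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

Z = simpson(bump, -1.0, 1.0)        # normalizing constant so that bump / Z has integral 1

def delta_seq(n):
    """n * varphi(n x) with varphi = bump / Z, supported in [-1/n, 1/n]."""
    return lambda x: n * bump(n * x) / Z

def probe(x):
    """A test function supported in [-2, 2]."""
    return bump(x / 2.0) * math.cos(x)

errors = []
for n in (1, 4, 16, 64):
    d = delta_seq(n)
    # The integrand vanishes outside [-1/n, 1/n], so integrate only there.
    value = simpson(lambda x: d(x) * probe(x), -1.0 / n, 1.0 / n)
    errors.append(abs(value - probe(0.0)))

mass = simpson(delta_seq(8), -1.0 / 8, 1.0 / 8)
print(errors, mass)
```

The errors shrink as $n$ grows while the total mass stays equal to one, which is the behavior the three observations above describe.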
2.3.4 Gaussian Delta Sequence
Here is a famous example of a delta sequence, known as the Gaussian delta sequence, which is derived from the Gaussian function.
Example 2.3.6 Consider the Gaussian function
\[ \varphi(x) = \frac{1}{\sqrt{\pi}}e^{-x^2}. \]
Clearly, $\varphi \notin \mathcal{D}(\mathbb{R})$ because it is not of compact support. It is well-known that
\[ \int_{-\infty}^{\infty} e^{-t^2}\,dt = \sqrt{\pi}. \]
Define
\[ \varphi_n(x) = n\varphi(nx) = \frac{n}{\sqrt{\pi}}e^{-n^2x^2}. \]
Then, $\varphi_n(x) \ge 0$. Integrating over $\mathbb{R}$, and using the substitution $y = nx$,
\[ \int_{-\infty}^{\infty} \frac{n}{\sqrt{\pi}}e^{-n^2x^2}\,dx = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty} e^{-y^2}\,dy = 1. \]
Moreover, for $|x| \ge r > 0$, we have
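\[ \sup_{|x| \ge r > 0}\left|\frac{n}{\sqrt{\pi}}e^{-n^2x^2} - 0\right| \le \frac{n}{n^2r^2} = \frac{1}{nr^2} \to 0 \]
as $n \to \infty$ for all $|x| \ge r$. So
\[ \varphi_n(x) \to 0 \]
uniformly on $|x| \ge r$. In view of Definition 2.3.4 we conclude that $\{\varphi_n\}$ is a delta sequence, and hence by Theorem 2.3.5, $\varphi_n \to \delta$. Indeed, letting $y = nx$, then
\[ \lim_{n \to \infty}\int_{-\infty}^{\infty} \frac{n}{\sqrt{\pi}}e^{-n^2x^2}\phi(x)\,dx = \lim_{n \to \infty}\int_{-\infty}^{\infty} \frac{1}{\sqrt{\pi}}e^{-y^2}\phi\left(\frac{y}{n}\right)dy. \]
Since $\phi \in \mathcal{D}$, by a previous argument, we have the uniform convergence (on bounded sets of $y$)
\[ \phi\left(\frac{y}{n}\right) \xrightarrow{\;u\;} \phi(0). \]
Passing the limit inside the integral above gives
\[ \lim_{n \to \infty}\int_{-\infty}^{\infty} \frac{1}{\sqrt{\pi}}e^{-y^2}\phi\left(\frac{y}{n}\right)dy = \phi(0)\int_{-\infty}^{\infty} \frac{1}{\sqrt{\pi}}e^{-y^2}\,dy = \phi(0). \tag{2.3.13} \]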
We have the following important remarks about the previous example.

(1) We can verify that $\phi_n \to \delta$ by a simple, though not rigorous, argument. Let $x = 0$, then

\[ \phi_n(0) = \frac{n}{\sqrt{\pi}} \to \infty \quad \text{as } n \to \infty. \]

Let $x \ne 0$, then

\[ \lim_{n\to\infty} \frac{n}{\sqrt{\pi}}\, e^{-n^2 x^2} = 0. \]

So, in general, we have

\[ \lim_{n\to\infty} \frac{n}{\sqrt{\pi}}\, e^{-n^2 x^2} = \delta(x) = \begin{cases} 0 & : x \ne 0 \\ \infty & : x = 0. \end{cases} \]

As illustrated before, approaching the δ distribution through the representation (2.3.6) does not yield a rigorous treatment.

(2) We can prove the result in (2.3.13) using the Dominated Convergence Theorem rather than the uniform convergence argument. This can be achieved using the fact that $\varphi$ is a continuous function with compact support, and the fact that $e^{-y^2}$ is Lebesgue-integrable on $\mathbb{R}$. We leave the details for the reader as an exercise.

(3) We can define the function $\phi$ over $\mathbb{R}^n$ and we obtain the same results. Moreover, we can rewrite the sequence in (2.3.12) as

\[ \phi_\epsilon(x) = \frac{1}{\epsilon}\, \phi\!\left(\frac{x}{\epsilon}\right) \tag{2.3.14} \]

and take $\epsilon \to 0^+$. Then, letting $n^2 = \dfrac{1}{4\epsilon}$ in

\[ \phi_n(x) = \frac{n}{\sqrt{\pi}}\, e^{-n^2 x^2}, \]

the sequence can be written in the following form

\[ \phi_\epsilon(x) = \frac{1}{\sqrt{4\pi\epsilon}}\, e^{-x^2/4\epsilon}, \]

with $\phi_\epsilon \to \delta$ as $\epsilon \to 0^+$. This representation of the delta sequence is important and useful in constructing solutions of PDEs.
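The convergence of the Gaussian delta sequence can be checked numerically. The following is a minimal sketch, assuming a midpoint-rule quadrature and a standard bump function as the test function; the names `bump`, `gaussian_delta`, and `pairing` are hypothetical and chosen only for this illustration.

```python
import math


def bump(x):
    """Smooth test function supported on (-1, 1) (an illustrative choice)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0


def gaussian_delta(n, x):
    """Gaussian delta sequence: (n / sqrt(pi)) * exp(-n^2 x^2)."""
    return n / math.sqrt(math.pi) * math.exp(-((n * x) ** 2))


def pairing(n, test, a=-1.0, b=1.0, steps=200_000):
    """Midpoint-rule approximation of the integral of gaussian_delta(n, .) * test over [a, b]."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h
        total += gaussian_delta(n, x) * test(x)
    return total * h
```

Calling `pairing(n, bump)` for increasing `n` should give values approaching `bump(0.0)`, in line with the limit in (2.3.13).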
2.4 Differentiation of Distributions

2.4.1 Notion of Distributional Derivative

One of the most fundamental and important properties of distributions is that they ignore values at points and act on functions through an integration process. This seems interesting because it enables us to differentiate discontinuous functions. The generalized definition of functions is the main tool to allow this process to occur. Assume a distribution is $T$ and its derivative is $T'$. Then

\[ \langle T', \varphi \rangle = \int_{-\infty}^{\infty} T' \varphi\,dx. \]

If we perform integration by parts, and make use of the fact that $\varphi$ is differentiable and of compact support, then the above integral will be of the form

\[ \langle T', \varphi \rangle = 0 - \int_{-\infty}^{\infty} T \varphi'\,dx. \]

This suggests the following definition.

Definition 2.4.1 (Distributional Derivative) Let $T$ be a distribution. Then, the distributional derivative of $T$, denoted by $T'$, is given by

\[ \langle T', \varphi \rangle = -\langle T, \varphi' \rangle. \]

The derivative of a distribution is always a distribution, and we can continue differentiating or use induction to get

\[ \langle T^{(m)}, \varphi \rangle = (-1)^m \langle T, \varphi^{(m)} \rangle. \]
We have no problem with that as long as $\varphi \in \mathcal{D}$, which is in fact one of the main reasons why test functions are restricted to that condition. It should be noted that, for a nonzero $\varphi \in \mathcal{D}$, the derivative $\varphi^{(m)}$ is never the zero function, whereas $T^{(m)}$ may well be the zero distribution (for example, when $T$ is a constant function). We pointed out previously that some ordinary functions, such as the locally integrable functions, can be considered distributions. This implies the following:
(1) Derivatives of distributions should extend the notion of derivative for functions. Otherwise, we may get two different derivatives for the same function if it is treated as an ordinary function and as a distribution.
(2) The rules governing distributional derivatives should be the same as in the classical case.
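2.4.2 Calculus Rules

Theorem 2.4.2 The following rules hold for distributional derivatives:
(1) Summation Rule: $(T + S)' = T' + S'$.
(2) Scalar Multiplication Rule: $(cT)' = cT'$ for every $c \in \mathbb{R}$.
(3) Product Rule: If $g \in C^\infty$, then $(gT)' = gT' + g'T$.
(4) Chain Rule: If $g \in C^\infty$, then

\[ \frac{d}{dx}\big(T(g(x))\big) = T'(g(x)) \cdot g'(x). \]

Proof (1) and (2) are immediate using the definitions:

\[ \langle (T + S)', \varphi \rangle = \langle T', \varphi \rangle + \langle S', \varphi \rangle, \]

and

\[ \langle (cT)', \varphi \rangle = c\,\langle T', \varphi \rangle. \]

For (3), we have

\[ \begin{aligned} \langle (gT)', \varphi \rangle &= -\langle gT, \varphi' \rangle = -\langle T, g\varphi' \rangle \\ &= -\langle T, (g\varphi)' \rangle + \langle T, g'\varphi \rangle \\ &= \langle T', g\varphi \rangle + \langle T, g'\varphi \rangle \\ &= \langle gT', \varphi \rangle + \langle g'T, \varphi \rangle. \end{aligned} \]

For (4), we apply the definition to get

\[ \begin{aligned} \langle [T(g(x))]', \varphi \rangle &= -\langle T(g(x)), \varphi' \rangle \\ &= -\big[T(g(x))\varphi\big]_{-\infty}^{\infty} + \int_{-\infty}^{\infty} T'(g(x)) \cdot g'(x)\varphi(x)\,dx \\ &= \langle T'(g(x)) \cdot g'(x), \varphi \rangle. \end{aligned} \]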
2.4.3 Examples of Distributional Derivatives

Example 2.4.3 To find the distributional derivative of $\delta(x)$, we apply Definition 2.4.1 to get

\[ \langle \delta', \varphi \rangle = -\langle \delta, \varphi' \rangle = -\varphi'(0). \]

It is important to know that this doesn't mean $\delta' = -\varphi'(0)$, but it says that when $\delta'$ acts on $\varphi$, the outcome is $-\varphi'(0)$, i.e., $\delta'[\varphi] = -\varphi'(0)$. Further, using induction one can continue to find derivatives to get the general formula

\[ \langle \delta^{(n)}, \varphi \rangle = (-1)^n \varphi^{(n)}(0). \]

Example 2.4.4 Consider the Heaviside function

\[ H(x) = \begin{cases} 1 & : x > 0 \\ 0 & : x < 0. \end{cases} \]

Let $\varphi \in \mathcal{D}$. Then

\[ \begin{aligned} \langle H', \varphi \rangle &= -\langle H, \varphi' \rangle \\ &= -\int_{-\infty}^{\infty} H \varphi'\,dx \\ &= -\int_{0}^{\infty} \varphi'\,dx = \varphi(0). \end{aligned} \]

But $\varphi(0)$ is nothing but $\langle \delta, \varphi \rangle$, therefore we get

\[ \langle H', \varphi \rangle = \langle \delta, \varphi \rangle \]

for any test function $\varphi$. So we conclude that $H' = \delta$.

Example 2.4.5 Consider the sign function

\[ f(x) = \operatorname{sgn}(x) = \begin{cases} 1 & : x > 0 \\ -1 & : x < 0. \end{cases} \]

Then we clearly have

\[ \operatorname{sgn}(x) = H(x) - H(-x). \]

Using properties of derivatives (Theorem 2.4.2(1)) and the previous example,

\[ (\operatorname{sgn}(x))' = H'(x)(1) - H'(-x)(-1) = \delta(x) + \delta(-x) = 2\delta(x). \]

Another interesting function that we would like to differentiate in the distributional sense is the following.

Example 2.4.6 Let $f(x) = \ln|x|$. We apply the definition to obtain
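\[ \langle (\ln|x|)', \varphi \rangle = -\int_{-\infty}^{\infty} \ln|x|\,\varphi'(x)\,dx. \]

Since we have a singularity at 0, we use the principal value again to get

\[ \langle (\ln|x|)', \varphi \rangle = -\lim_{\epsilon \to 0^+} \int_{|x| > \epsilon} \ln|x|\,\varphi'(x)\,dx. \tag{2.4.1} \]

By integration by parts, and the fact that $\varphi$ is of compact support, the RHS of (2.4.1) equals

\[ \lim_{\epsilon \to 0^+} \left[ \int_{|x| > \epsilon} \frac{\varphi(x)}{x}\,dx + \big(\varphi(\epsilon) - \varphi(-\epsilon)\big) \ln \epsilon \right]. \tag{2.4.2} \]

The reader should be able to verify that

\[ \lim_{\epsilon \to 0^+} \big(\varphi(\epsilon) - \varphi(-\epsilon)\big) \ln \epsilon = 0. \]

Substituting (2.4.2) in (2.4.1) gives

\[ \langle (\ln|x|)', \varphi \rangle = \lim_{\epsilon \to 0^+} \int_{|x| > \epsilon} \frac{\varphi(x)}{x}\,dx = \left\langle \text{p.v.}\,\frac{1}{x}, \varphi \right\rangle, \]

that is, $(\ln|x|)' = \text{p.v.}\,\dfrac{1}{x}$.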
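The two identities $H' = \delta$ and $(\ln|x|)' = \text{p.v.}\,\frac{1}{x}$ can be sanity-checked numerically by evaluating both sides of the defining pairing. The sketch below assumes a shifted bump function as the test function and a midpoint-rule quadrature; `SHIFT`, `phi`, `dphi`, and the three pairing functions are hypothetical names introduced only for this illustration. The principal value is computed through the equivalent regular integral $\int_0^\infty \frac{\varphi(x) - \varphi(-x)}{x}\,dx$.

```python
import math

SHIFT = 0.3  # hypothetical shift, chosen so that the test function is not even


def bump(x):
    """Smooth function supported on (-1, 1)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0


def bump_prime(x):
    """Classical derivative of bump."""
    if abs(x) >= 1.0:
        return 0.0
    return bump(x) * (-2.0 * x) / (1.0 - x * x) ** 2


def phi(x):
    """Test function supported on (-1 + SHIFT, 1 + SHIFT)."""
    return bump(x - SHIFT)


def dphi(x):
    return bump_prime(x - SHIFT)


def midpoint(func, a, b, steps=200_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))


def heaviside_derivative_pairing():
    """<H', phi> = -<H, phi'> = -(integral of phi' over (0, infinity))."""
    return -midpoint(dphi, 0.0, 1.0 + SHIFT)


def log_derivative_pairing():
    """<(ln|x|)', phi> = -(integral of ln|x| * phi'(x) over the support of phi)."""
    return -midpoint(lambda x: math.log(abs(x)) * dphi(x), -1.0 + SHIFT, 1.0 + SHIFT)


def principal_value_pairing():
    """<p.v. 1/x, phi>, written as the integral of (phi(x) - phi(-x)) / x over (0, infinity)."""
    return midpoint(lambda x: (phi(x) - phi(-x)) / x, 0.0, 1.0 + SHIFT)
```

Under these assumptions, `heaviside_derivative_pairing()` should agree with `phi(0.0)`, and `log_derivative_pairing()` should agree with `principal_value_pairing()` up to quadrature error.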
2.4.4 Properties of δ

The following theorem provides some interesting properties of the delta distribution that can be very useful in computations.

Theorem 2.4.7 The following properties hold for $\delta(x)$.
(1) $x \cdot \delta'(x) = -\delta(x)$.
(2) If $c \in \mathrm{Dom}(f)$, then $f(x)\delta(x - c) = f(c)\delta(x - c)$.
(3) For any $c \ne 0$, we have

\[ \delta(x^2 - c^2) = \frac{1}{2|c|}\,[\delta(x - c) + \delta(x + c)]. \]

(4) Let $g(x) = (x - x_1)(x - x_2) \cdots (x - x_n)$ with distinct $x_1, \dots, x_n$. Then,

\[ \delta(g(x)) = \sum_{k=1}^{n} \frac{1}{|g'(x_k)|}\, \delta(x - x_k). \]

Proof For (1), we make use of Proposition 2.2.3(3) and Theorem 2.4.2(3). Since $x \cdot \delta(x) = 0$, we have

\[ \langle x\delta', \varphi \rangle = \langle (x\delta)', \varphi \rangle - \langle \delta, \varphi \rangle = -\langle \delta, \varphi \rangle. \]

For (2) we apply the definition,

\[ \begin{aligned} \langle f(x)\delta(x - c), \varphi(x) \rangle &= \langle \delta(x - c), f(x)\varphi(x) \rangle \\ &= f(c)\varphi(c) = f(c)\,\langle \delta(x - c), \varphi(x) \rangle \\ &= \langle f(c)\delta(x - c), \varphi(x) \rangle. \end{aligned} \]

For (3), taking $c > 0$ without loss of generality, consider

\[ H(x^2 - c^2) = \begin{cases} 1 & : x < -c \\ 0 & : -c < x < c \\ 1 & : x > c. \end{cases} \]

Then, we can write

\[ H(x^2 - c^2) = 1 - [H(x + c) - H(x - c)]. \]

Taking the derivative of both sides of the equation using the chain rule gives

\[ 2x\,\delta(x^2 - c^2) = \delta(x - c) - \delta(x + c). \]

Now (3) follows from (2). For (4), we extend (3). Notice that

\[ H(g(x)) = 1 - [H(x - x_1) - H(x - x_2) + H(x - x_3) \cdots -(-1)^n H(x - x_n)]. \]

Differentiating both sides and using (2), the result follows.
2.5 The Fourier Transform Problem

2.5.1 Introduction

The Fourier transform is one of the main tools used in the theory of distributions and its applications to partial differential equations. In fact, a comprehensive study of the theory of Fourier transforms and its techniques requires a whole separate book. We will, however, confine ourselves to the material that suffices for our needs and meets the aims of the present book. Our main goal is to enlarge the domain of the Fourier transform so that it applies to a wide variety of functions. If we confine distribution theory to test functions, we cannot do much work on transformations. It is well-known that some functions, such as the Heaviside function, constant functions, polynomials, periodic sine and cosine, and other functions, are good examples of external sources imposed on systems, so they appear in the PDEs representing the systems. Unfortunately, these functions do not possess Fourier transforms in the classical sense. The duality of the Fourier transform is not consistent with test functions because the Fourier transform of a test function need not be a test function. The key is to ensure the following two points:
1. To find a property that keeps a function vanishing at infinity.
2. If multiplied by other smooth and nice functions, the integrand is integrable over $\mathbb{R}$, or $\mathbb{R}^n$.
2.5.2 Fourier Transform on $\mathbb{R}^n$

Recall that the one-dimensional Fourier transform of a function $f : \mathbb{R} \to \mathbb{R}$ is given by

\[ \mathcal{F}\{f(x)\} = \int_{-\infty}^{\infty} f(x)\, e^{-i\omega x}\,dx, \tag{2.5.1} \]

where $\omega \in \mathbb{R}$. The Fourier transform of a function is denoted by $\mathcal{F}\{f(x)\}(\omega) = \hat{f}(\omega)$. In $n$ dimensions, this is extended to the multidimensional Fourier transform

\[ \mathcal{F}\{f(x)\}(\omega) = \hat{f}(\omega) = \int_{\mathbb{R}^n} f(x)\, e^{-i(\omega \cdot x)}\,dx. \tag{2.5.2} \]

Here, $x = (x_1, x_2, \dots, x_n)$ and $\omega = (\omega_1, \omega_2, \dots, \omega_n)$ are $n$-dimensional variables, and

\[ \omega \cdot x = \omega_1 x_1 + \cdots + \omega_n x_n. \]

We can recover the function from the Fourier transform through the inverse Fourier transform

\[ f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \hat{f}(\omega)\, e^{i\omega x}\,d\omega \]

on $\mathbb{R}$, and

\[ \mathcal{F}^{-1}\{\hat{f}(\omega)\} = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} \hat{f}(\omega)\, e^{i(\omega \cdot x)}\,d\omega \]

for $f : \mathbb{R}^n \to \mathbb{R}$. In the present section, we discuss the problem of existence of Fourier and inverse Fourier transforms. In particular, we pose the following questions:
(1) Under what conditions does the Fourier transform of a function $f$ exist?
(2) Can we find the inverse Fourier transform of $\hat{f}(\omega)$ for a function $f(x)$?
(3) If the answer to (2) is yes, does this recover $f$ again in the sense that

\[ \mathcal{F}^{-1}\mathcal{F}\{f(x)\} = \mathcal{F}\mathcal{F}^{-1}\{f(\omega)\} = f(x)? \]
2.5.3 Existence of Fourier Transform

We begin with the following result.

Theorem 2.5.1 If $f \in L^1(\mathbb{R})$, then $\hat{f}(\omega)$ exists.

Proof Note that

\[ |\mathcal{F}\{f\}| \le \int_{-\infty}^{\infty} \left| f(x)\, e^{-i\omega x} \right| dx = \int_{-\infty}^{\infty} |f(x)|\,dx < \infty. \]

The result establishes the fact that the Fourier transform $\mathcal{F}$ is, in fact, a bounded (hence continuous) linear mapping from $L^1$ to $L^\infty$. Now, suppose $f \in L^2(\mathbb{R})$, where $f$ need not belong to $L^1(\mathbb{R})$. How can we establish a Fourier transform for $f$? The idea is to define the transform $\mathcal{F}$ on a dense subspace of $L^2(\mathbb{R})$, then extend the domain of definition to $L^2(\mathbb{R})$ using the closure obtained by continuity. The typical example of such a subspace is the space of simple functions because this space is dense in $L^2$. Consider the truncated sequence

\[ f_n = f \cdot \chi_{[-n,n]}, \]

where $\chi$ is the characteristic function

\[ \chi_{[-n,n]}(x) = \begin{cases} 1 & -n \le x \le n \\ 0 & \text{otherwise}. \end{cases} \]

Then $f_n \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, and it is known that

\[ \overline{L^1(\mathbb{R}) \cap L^2(\mathbb{R})} = L^2(\mathbb{R}). \]

It is easy to see that $\|f_n - f\|_2 \to 0$ (and also $\|f_n - f\|_1 \to 0$ whenever $f \in L^1(\mathbb{R})$), which implies that $f_n \to f$ in $L^2(\mathbb{R})$, and the sequence $\{f_n\}$ is Cauchy in $L^2(\mathbb{R})$. On the other hand, since $f_n \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, $\hat{f}_n(\omega) \in L^2(\mathbb{R})$ for every $n$, where

\[ \hat{f}_n(\omega) = \int_{-n}^{n} f(x)\, e^{-i\omega x}\,dx. \]

Now, for $0 < n < m$, with the aid of the Plancherel Theorem, which will be discussed next, we have, as $n, m \to \infty$,

\[ \|\mathcal{F}\{f_n\} - \mathcal{F}\{f_m\}\|_2 = \|\mathcal{F}\{f_n - f_m\}\|_2 = \sqrt{2\pi}\, \|f_n - f_m\|_2 \to 0. \tag{2.5.3} \]

So $\hat{f}_n(\omega)$ is Cauchy in the complete space $L^2(\mathbb{R})$, hence $\hat{f}_n(\omega)$ converges in the $L^2$-norm to a function in $L^2(\mathbb{R})$, call it $h(\omega)$. Note that $h$ was found by means of the sequence $\{f_n\}$. Let us assume there exists another Cauchy sequence, say $g_n \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, such that $\|g_n - f\|_2 \to 0$, which implies that $\|g_n - f_n\|_2 \to 0$. By the same argument as above we conclude that $\hat{g}_n \to g$ in the $L^2(\mathbb{R})$ norm. Using (2.5.3) again, it is easy to show that

\[ \|h - g\|_2 \le \|h - \hat{f}_n\|_2 + \sqrt{2\pi}\, \|f_n - g_n\|_2 + \|\hat{g}_n - g\|_2 \to 0. \]

This means that $h = g$ a.e., i.e., $h$ does not depend on the choice of the approximating sequence $\{f_n\}$, and therefore we can now define the Fourier transform of $f$ on $L^2(\mathbb{R})$ to be

\[ \hat{f}(\omega) = \mathcal{F}\{f\} = h, \]

as an equivalence class of functions in $L^2$, where

\[ \mathcal{F}\{f\} = \operatorname*{l.i.m}_{n\to\infty} \mathcal{F}\{f_n\} = \operatorname*{l.i.m}_{n\to\infty} \int_{-n}^{n} f(t)\, e^{-i\omega t}\,dt. \]

The notation l.i.m is the limit in mean and refers to the limit in the $L^2$-norm. For convenience, we will, however, write it simply as lim, keeping in mind that it is not a pointwise convergence, but a convergence in the $L^2$-norm. It remains to prove (2.5.3).
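2.5.4 Plancherel Theorem

The following theorem is one of the central theorems in the theory of Fourier analysis. It is called the Plancherel theorem, and it is sometimes called Parseval's identity. It demonstrates the fact that the Fourier transform $\mathcal{F}$ on $L^2(\mathbb{R})$ is a bijective linear operator which maps $f$ to $\hat{f}$, and is an isometry up to a constant, so it is an isomorphism of $L^2$ onto itself.

Theorem 2.5.2 (Plancherel Theorem) Let $f \in L^2(\mathbb{R})$, and let its Fourier transform be $\hat{f}$. Then,

\[ \|f\|_2 = \frac{1}{\sqrt{2\pi}}\, \|\hat{f}\|_2. \]

Proof We have

\[ \begin{aligned} \int_{-\infty}^{\infty} |f(t)|^2\,dt &= \int_{-\infty}^{\infty} f(t)\, \overline{f(t)}\,dt \\ &= \frac{1}{2\pi} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(t)\, \overline{\hat{f}(\omega)}\, e^{-i\omega t}\,d\omega\,dt \\ &= \frac{1}{2\pi} \int_{-\infty}^{\infty} \overline{\hat{f}(\omega)} \left( \int_{-\infty}^{\infty} f(t)\, e^{-i\omega t}\,dt \right) d\omega \\ &= \frac{1}{2\pi} \int_{-\infty}^{\infty} \overline{\hat{f}(\omega)}\, \hat{f}(\omega)\,d\omega \\ &= \frac{1}{2\pi} \int_{-\infty}^{\infty} |\hat{f}(\omega)|^2\,d\omega. \end{aligned} \]

This result shows that the space $L^2(\mathbb{R})$ is a perfect environment for the Fourier transform to work in.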
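The Plancherel identity can be checked numerically for a concrete function. The sketch below assumes the sample function $f(x) = (1 + x)e^{-x^2/2}$ and a midpoint-rule approximation of the Fourier integral (2.5.1) on a truncated interval; the names `f`, `fourier`, and `l2_norm_squared` are hypothetical.

```python
import cmath
import math


def f(x):
    """Hypothetical sample function belonging to both L^1 and L^2."""
    return (1.0 + x) * math.exp(-0.5 * x * x)


def fourier(func, omega, a=-10.0, b=10.0, steps=1000):
    """Midpoint-rule approximation of the integral of func(x) * exp(-i omega x) over [a, b]."""
    h = (b - a) / steps
    total = 0j
    for i in range(steps):
        x = a + (i + 0.5) * h
        total += func(x) * cmath.exp(-1j * omega * x)
    return total * h


def l2_norm_squared(func, a=-10.0, b=10.0, steps=400):
    """Midpoint-rule approximation of the integral of |func|^2 over [a, b]."""
    h = (b - a) / steps
    return h * sum(abs(func(a + (i + 0.5) * h)) ** 2 for i in range(steps))


norm_f_sq = l2_norm_squared(f)
norm_fhat_sq = l2_norm_squared(lambda w: fourier(f, w))
```

With this convention for the transform, `norm_f_sq` should agree with `norm_fhat_sq / (2 * math.pi)`, which is the squared form of Theorem 2.5.2.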
2.6 Schwartz Space

2.6.1 Rapidly Decreasing Functions

One of the central problems of Fourier analysis is how to apply the Fourier transform to a broader class of functions. The main obstacle is that we cannot guarantee $\mathcal{F}\{f\} \in L^1$, even if $\mathcal{F}\{f\}$ exists for some $f \in L^1$, which creates a problem in satisfying the essential identity

\[ \mathcal{F}^{-1}\{\mathcal{F}\{f\}\} = f. \]

One reason for this is that $\mathcal{F}\{f\}$ may not decay fast enough. This tells us that the rate of decay of the Fourier transform plays a significant role. To see this, let $f, f' \in L^1(\mathbb{R})$, with

\[ \lim_{t \to \pm\infty} f(t) = 0. \]

Using the definition and basic properties of the Fourier transform, it can be easily shown that

\[ \mathcal{F}\{f'(t)\} = i\omega \hat{f}(\omega), \]

which gives

\[ |\hat{f}(\omega)| \le \frac{M}{|\omega|} \quad \text{for some } M > 0. \]

This implies that $\hat{f}(\omega)$ converges to 0 like $\dfrac{1}{\omega}$. If we proceed further, assuming that $f''$ is absolutely integrable over $\mathbb{R}$, and

\[ \lim_{t \to \pm\infty} f'(t) = 0, \]

then we obtain

\[ |\hat{f}(\omega)| \le \frac{M}{\omega^2}, \]

i.e., $\hat{f}(\omega)$ converges to 0 like $\dfrac{1}{\omega^2}$. This shows that the smoother the function and the more integrable its derivatives, the faster the decay of its Fourier transform will be. If $f$ and all its derivatives are absolutely integrable over $\mathbb{R}$ and vanish at $\infty$, then, continuing the process, we find that $\hat{f}(\omega)$ converges to 0 faster than the inverse of any polynomial, i.e., $\hat{f}(\omega)$ is a rapidly decreasing function. On the other hand, it is well-known that

\[ \mathcal{F}\{t f(t)\} = i\, \frac{d}{d\omega}\big(\hat{f}(\omega)\big). \]
Repeating these processes infinitely many times can only work if we are dealing with infinitely differentiable functions of rapid decay. We conclude that if $f$ has a high rate of decay, then $\hat{f}$ is smooth, and if $f$ is smooth, then $\hat{f}$ has a high rate of decay. Due to the duality between the smoothness of a function and the rate of decay of its Fourier transform, and between the rate of decay of a function and the smoothness of its Fourier transform, the Fourier transform of a function can be used to measure how smooth that function is, and the faster $f$ decays the smoother its Fourier transform will be. This idea motivated Laurent Schwartz in the late 1940s to introduce the class of rapidly decreasing functions, which provides the basis for Schwartz spaces.

Definition 2.6.1 (Rapidly Decreasing Function) Let $\varphi \in C^\infty(\mathbb{R})$. Then $\varphi$ is said to be a rapidly decreasing function if

\[ \|\varphi\|_{k,m} = \sup_{x \in \mathbb{R}} \left| x^k \varphi^{(m)}(x) \right| < \infty \quad \text{for all } k, m \ge 0. \tag{2.6.1} \]

In other words, a rapidly decreasing function is simply a smooth function that decays to zero as $x \to \pm\infty$ faster than the inverse of any polynomial. According to the definition, all the derivatives of such functions have the same property. The following can be considered equivalent to (2.6.1) (see Problem 2.11.22):

\[ \lim_{|x| \to \infty} \left| x^k \varphi^{(m)}(x) \right| = 0, \quad \text{for all } k, m \ge 0. \tag{2.6.2} \]

\[ \sup_{x \in \mathbb{R}} (1 + |x|^2)^k \left| \varphi^{(m)}(x) \right| < \infty, \quad \text{for all } k, m \ge 0. \tag{2.6.3} \]

\[ \sup_{x \in \mathbb{R}} (1 + |x|)^k \left| \varphi^{(m)}(x) \right| < \infty, \quad \text{for all } k, m \ge 0. \tag{2.6.4} \]

The definition can be easily extended to $\mathbb{R}^n$. In this case, we need partial differentiation. We define a multi-index $\alpha = (\alpha_1, \dots, \alpha_n)$ to be an $n$-tuple of nonnegative integers $\alpha_i \ge 0$, such that

\[ \alpha_i \in \mathbb{N}_0 = \mathbb{N} \cup \{0\}, \]

and we denote $|\alpha| = \alpha_1 + \cdots + \alpha_n$, so that for $x = (x_1, x_2, \dots, x_n) \in \mathbb{R}^n$ we have

\[ x^\alpha = x_1^{\alpha_1} \cdots x_n^{\alpha_n}. \]

The norm (2.6.1) becomes

\[ \|\varphi\|_{\alpha,\beta} = \sup_{x \in \mathbb{R}^n} \left| x^\alpha \partial^\beta \varphi \right|, \]

and definition (2.6.2) reads

\[ \lim_{|x| \to \infty} \left| x^\alpha \partial^\beta \varphi(x) \right| = 0, \quad \text{for all } \alpha, \beta \in \mathbb{N}_0^n \]

for $|\alpha| = k$, and $|\beta| = m$.
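To get a feel for condition (2.6.1), one can compare a Gaussian with a smooth function that decays only polynomially. The sketch below approximates the seminorm with $m = 0$ on a finite window by a grid search; the function names and the choice of the comparison function $1/(1 + x^2)$ are assumptions made for this illustration only.

```python
import math


def seminorm_on_window(func, k, radius, step=1e-3):
    """Grid approximation of the supremum of |x|^k |func(x)| over |x| <= radius (case m = 0)."""
    count = int(round(radius / step))
    best = 0.0
    for i in range(count + 1):
        x = i * step
        # Evaluate at both x and -x so that non-even functions are handled as well.
        best = max(best, x ** k * abs(func(x)), x ** k * abs(func(-x)))
    return best


def gaussian(x):
    """Rapidly decreasing: exp(-x^2)."""
    return math.exp(-x * x)


def lorentzian(x):
    """Smooth and decaying, but only like 1 / x^2, so not rapidly decreasing."""
    return 1.0 / (1.0 + x * x)
```

For the Gaussian, the value of `seminorm_on_window(gaussian, 4, radius)` stabilises as `radius` grows, whereas for `lorentzian` it keeps growing with the window, which indicates that the supremum over $\mathbb{R}$ is infinite.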
2.6.2 Definition of Schwartz Space

It is a good exercise to show that if $\varphi$ and $\phi$ are two rapidly decreasing functions, then $a\varphi + b\phi$ is also a rapidly decreasing function (verify), so the collection of all rapidly decreasing functions on $\mathbb{R}$ forms a linear space. This space is called the Schwartz space.

Definition 2.6.2 (Schwartz Space) A linear space is called the Schwartz space, denoted by $\mathcal{S}$, if it consists of all rapidly decreasing functions, which are also known as Schwartz functions.

It is clear from the definition that every test function is a Schwartz function. That is,

\[ \mathcal{D}(\mathbb{R}^n) \subset \mathcal{S}(\mathbb{R}^n). \]

On the other hand, $\varphi(x) = e^{-x^2}$ is clearly a Schwartz function but is not a test function. Indeed,

\[ f(x) = e^{x^2} \in L^1_{\mathrm{loc}}, \]

so $f$ defines a regular distribution in $\mathcal{D}'(\mathbb{R}^n)$. But

\[ \langle f, \varphi \rangle = \int_{\mathbb{R}} e^{x^2} e^{-x^2}\,dx = \infty. \]

Hence, $\varphi$ is not a test function. Thus we have the following important proper inclusion

\[ \mathcal{D}(\mathbb{R}^n) \subsetneq \mathcal{S}(\mathbb{R}^n). \]

For the convergence in $\mathcal{S}(\mathbb{R}^n)$, let $\{\varphi_j\}$ be a sequence such that $\varphi_j, \varphi \in \mathcal{S}(\mathbb{R}^n)$. Then, we say that $\varphi_j \to \varphi$ in $\mathcal{S}(\mathbb{R}^n)$ iff

\[ \|\varphi_j - \varphi\|_{\alpha,\beta} \to 0 \]

for every $\alpha, \beta \in \mathbb{N}_0^n$. For $n = 1$, we have

\[ \|\varphi_j - \varphi\|_{k,m} \to 0 \]

for every $k, m \in \mathbb{N}_0$. Under this new class of functions, if $\varphi \in \mathcal{S}$ then $e^{-i\omega t}\varphi \in \mathcal{S}$, and so $\mathcal{F}\{\varphi\} \in \mathcal{S}$. Indeed, the equivalent definition (2.6.3) with $m = 0$ gives

\[ \left| \int \varphi\, e^{-i\omega x}\,dx \right| \le \int |\varphi(x)|\,dx \le \sup_{x \in \mathbb{R}} (1 + |x|^2)^k |\varphi(x)| \int \frac{dx}{(1 + |x|^2)^k} < \infty. \]

So the Fourier transform of a function in $\mathcal{S}$ exists, and using the properties of the Fourier transform and the fact that

\[ \mathcal{F}\{t f(t)\} = \widehat{t f(t)} = i\, \frac{d}{d\omega}\big(\hat{f}(\omega)\big), \]

we can claim the same result for all derivatives of $\mathcal{F}\{\varphi\}$.
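2.6.3 Derivatives of Schwartz Functions

The following result is the key to achieving our goal.

Proposition 2.6.3 Let $\varphi \in \mathcal{S}(\mathbb{R})$ be a Schwartz function, and $\mathcal{F}\{\varphi\}(\omega) = \hat{\varphi}(\omega)$. Let $D_x^k \varphi$ denote $\dfrac{d^k \varphi}{dx^k}$, for $k \in \mathbb{Z}^+$. Then
(1) $(-i)^k \mathcal{F}\{D_x^k \varphi(x)\} = \omega^k \hat{\varphi}(\omega)$.
(2) $(-i)^k \mathcal{F}\{x^k \varphi(x)\} = D_\omega^k \hat{\varphi}(\omega)$.

Proof To prove (1), we perform integration by parts $k$ times, taking into account that

\[ D_x^k e^{-i\omega x} = (-i\omega)^k e^{-i\omega x}, \]

and using the fact that $\varphi$ vanishes at $\infty$, being Schwartz, we get

\[ \mathcal{F}\{D_x^k \varphi\} = (-1)^k \int_{\mathbb{R}} \varphi(x)(-i\omega)^k e^{-i\omega x}\,dx = (i\omega)^k \hat{\varphi}(\omega). \tag{2.6.5} \]

The second assertion follows immediately given the fact that

\[ D_\omega^k e^{-i\omega x} = (-ix)^k e^{-i\omega x}, \]

where the differential operator $D_\omega^k$ is taken with respect to $\omega$. We get

\[ D_\omega^k \hat{\varphi}(\omega) = \int_{\mathbb{R}} \varphi(x)\, D_\omega^k e^{-i\omega x}\,dx. \]

Now, performing the differentiation $k$ times with respect to $\omega$ gives

\[ \begin{aligned} D_\omega^k \hat{\varphi}(\omega) &= \int_{\mathbb{R}} \varphi(x)(-ix)^k e^{-i\omega x}\,dx \\ &= (-i)^k \int_{\mathbb{R}} x^k \varphi(x)\, e^{-i\omega x}\,dx \\ &= (-i)^k \mathcal{F}\{x^k \varphi(x)\}. \end{aligned} \]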
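Both rules of Proposition 2.6.3 can be checked numerically for $k = 1$. The sketch below assumes the Schwartz function $\varphi(x) = e^{-x^2/2}$, a midpoint-rule approximation of the Fourier integral, and a central difference in $\omega$; all function names are hypothetical.

```python
import cmath
import math


def phi(x):
    """Sample Schwartz function exp(-x^2 / 2)."""
    return math.exp(-0.5 * x * x)


def dphi(x):
    """Classical derivative of phi."""
    return -x * math.exp(-0.5 * x * x)


def fourier(func, omega, a=-12.0, b=12.0, steps=4000):
    """Midpoint-rule approximation of the integral of func(x) * exp(-i omega x) over [a, b]."""
    h = (b - a) / steps
    total = 0j
    for i in range(steps):
        x = a + (i + 0.5) * h
        total += func(x) * cmath.exp(-1j * omega * x)
    return total * h


def check_rule_1(omega):
    """Size of (-i) F{phi'}(omega) - omega * F{phi}(omega), expected to be near zero."""
    return abs(-1j * fourier(dphi, omega) - omega * fourier(phi, omega))


def check_rule_2(omega, eps=1e-4):
    """Size of (-i) F{x phi}(omega) minus a central-difference derivative of F{phi} in omega."""
    lhs = -1j * fourier(lambda x: x * phi(x), omega)
    rhs = (fourier(phi, omega + eps) - fourier(phi, omega - eps)) / (2.0 * eps)
    return abs(lhs - rhs)
```

Both `check_rule_1(omega)` and `check_rule_2(omega)` should return values close to zero, limited only by quadrature and finite-difference error.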
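The previous result can be easily extended to $\mathcal{S}(\mathbb{R}^n)$. Recall that the $n$-dimensional Fourier transform of $f : \mathbb{R}^n \to \mathbb{R}$, denoted by $\mathcal{F}\{f(x)\}(\omega)$, is given by

\[ \mathcal{F}\{f(x)\}(\omega) = \hat{f}(\omega) = \int_{\mathbb{R}^n} f(x)\, e^{-i(\omega \cdot x)}\,dx, \]

where $\omega = (\omega_1, \omega_2, \dots, \omega_n)$ and $\omega \cdot x = \omega_1 x_1 + \cdots + \omega_n x_n$. Then, Proposition 2.6.3 takes the following form:

Proposition 2.6.4 Let $\varphi \in \mathcal{S}(\mathbb{R}^n)$ be a Schwartz function, and $\mathcal{F}\{\varphi\} = \hat{\varphi}$. Let $D^\alpha$ denote $\partial^\alpha$ for $\alpha = (\alpha_1, \dots, \alpha_n)$, $|\alpha| = \alpha_1 + \cdots + \alpha_n$. Then:
(1) $(-i)^{|\alpha|} \mathcal{F}\{D_x^\alpha \varphi\} = \omega^\alpha \hat{\varphi}(\omega)$.
(2) $(-i)^{|\alpha|} \mathcal{F}\{x^\alpha \varphi(x)\} = D_\omega^\alpha \hat{\varphi}(\omega)$.

Proof The proof is the same as for the previous proposition. To prove (1), note that the integral in (2.6.5) is now taken over $\mathbb{R}^n$. Then

\[ \mathcal{F}\{D_{x_j} \varphi\} = \int_{\mathbb{R}^n} \big( D_{x_j} \varphi(x) \big)\, e^{-i\omega \cdot x}\,dx. \]

But since $\varphi$ is Schwartz, we must have

\[ \big( D_x^\alpha \varphi(x) \big)\, e^{-i\omega \cdot x} \in L^1, \]

and using the Fubini Theorem,

\[ \mathcal{F}\{D_{x_j} \varphi\} = \int_{\mathbb{R}^{n-1}} \int_{\mathbb{R}} \big( D_{x_j} \varphi(x) \big)\, e^{-i\omega \cdot x}\,dx. \]

Then, we proceed the same as in the proof of Proposition 2.6.3, and repeating $|\alpha|$ times, we obtain (1). To prove (2) we write

\[ D_{\omega_j} \hat{\varphi}(\omega) = D_{\omega_j} \int_{\mathbb{R}^n} \varphi(x)\big(e^{-i\omega \cdot x}\big)\,dx. \]

Again,

\[ \left| x_j \varphi(x)\, e^{-i\omega \cdot x} \right| = \left| x_j \varphi(x) \right| \]

and $x_j \varphi(x) \in L^1$. So

\[ \begin{aligned} D_{\omega_j} \hat{\varphi}(\omega) &= \int_{\mathbb{R}^n} \varphi(x)\, D_{\omega_j} e^{-i\omega \cdot x}\,dx \\ &= \int_{\mathbb{R}^n} (-i x_j) \varphi(x)\, e^{-i\omega \cdot x}\,dx \\ &= -i\, \mathcal{F}\{x_j \varphi(x)\}. \end{aligned} \]

Repeating the process $|\alpha|$ times gives (2).

Notice the correspondence between the two processes: differentiation and multiplication by polynomials. One advantage of the Schwartz space is that it deals well with this correspondence because it is closed under the two operations. If we add to this the advantage of being closed under the Fourier transform, we realize why such a space is the ideal space to utilize. As a consequence of the previous result, if $\varphi \in \mathcal{S}(\mathbb{R}^n)$, then $\mathcal{F}\{D_x^\alpha \varphi\}$ and $\mathcal{F}\{x^\alpha \varphi(x)\}$ exist, hence by Proposition 2.6.4 $\omega^\alpha \hat{\varphi}(\omega)$ and $D_\omega^\alpha \hat{\varphi}(\omega)$ exist, and we have

\[ (-i)^{|\alpha|} \mathcal{F}\{D_x^\alpha \varphi\} = \omega^\alpha \hat{\varphi}(\omega) \]

and

\[ (-i)^{|\alpha|} \mathcal{F}\{x^\alpha \varphi(x)\} = D_\omega^\alpha \hat{\varphi}(\omega). \]

It turns out that $\mathcal{F}\{\varphi\}$ is a Schwartz function, as claimed in the discussion at the beginning of the section. We have the following important result.

Corollary 2.6.5 If $\varphi \in \mathcal{S}(\mathbb{R}^n)$, then $\mathcal{F}\{\varphi\} \in \mathcal{S}(\mathbb{R}^n)$, that is, the Schwartz space is closed under the Fourier transform.

The result is not valid for $\mathcal{D}(\mathbb{R}^n)$ because the Fourier transform of a test function is not necessarily a test function.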
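2.6.4 Isomorphism of Fourier Transform on Schwartz Spaces

Consider the mapping

\[ \mathcal{F} : \mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n). \tag{2.6.6} \]

Let $f \in \mathcal{S}(\mathbb{R}^n)$. Using Proposition 2.6.4,

\[ \omega^\alpha D^\beta \mathcal{F}\{f\} = (-i)^{|\beta|} \omega^\alpha \mathcal{F}\{x^\beta f\} = (-i)^{|\beta| + |\alpha|} \mathcal{F}\{D^\alpha x^\beta f\}. \]

But

\[ D^\alpha x^\beta f \in \mathcal{S}(\mathbb{R}^n) \subset L^1(\mathbb{R}^n). \tag{2.6.7} \]

Hence, $\mathcal{F}\{D^\alpha x^\beta f\}(\omega)$ exists, and

\[ \sup \left| \omega^\alpha D^\beta \mathcal{F}\{f\} \right| < \infty. \]

Therefore, $\mathcal{F}\{f\} \in \mathcal{S}(\mathbb{R}^n)$. Note that if the sequence $f_j \in \mathcal{S}(\mathbb{R}^n)$ and $f_j \to f$ in $\mathcal{S}(\mathbb{R}^n)$, i.e., $\|f_j - f\|_{\alpha,\beta} \to 0$, then

\[ D^\alpha x^\beta f_j \to D^\alpha x^\beta f \]

in $\mathcal{S}(\mathbb{R}^n)$. Furthermore, the inclusion in (2.6.7) implies that $f_j \to f$ in $L^1$, but since $\mathcal{F}$ is continuous on $L^1$, this implies

\[ \mathcal{F}\{D^\alpha x^\beta f_j\} \to \mathcal{F}\{D^\alpha x^\beta f\} \]

and consequently,

\[ \omega^\alpha D^\beta \hat{f}_j \to \omega^\alpha D^\beta \hat{f}. \]

Hence, $\hat{f}_j \to \hat{f}$ and $\mathcal{F}$ is thus continuous on $\mathcal{S}$. If $f(x) \in \mathcal{S}(\mathbb{R}^n)$, then $\mathcal{F}^2\{f\} = \mathcal{F}\{\mathcal{F}\{f\}\}$. Since

\[ (2\pi)^n f(-x) = \int_{\mathbb{R}^n} \hat{f}(\omega)\, e^{-i\omega \cdot x}\,d\omega = \mathcal{F}\{\mathcal{F}\{f\}\}, \]

we have

\[ \mathcal{F}^2\{f(x)\} = (2\pi)^n f(-x), \tag{2.6.8} \]

or, normalizing $\mathcal{F}$ by setting

\[ T = \frac{1}{(2\pi)^{n/2}}\, \mathcal{F}, \]

we get

\[ T^2(f) = f(-x), \]

from which we get $T^4(f) = f$. This implies that

\[ T^4 = \mathrm{Id}_{\mathcal{S}(\mathbb{R}^n)}. \]

It follows that $T(T^3(f)) = f = T^3(T(f))$, and hence

\[ T^3 = T^{-1}. \]

Since $T$ is continuous, $T^{-1}$ and $\mathcal{F}^{-1}$ are continuous. Moreover, for every $f \in \mathcal{S}(\mathbb{R}^n)$, there exists $g = T^3\{f\}$ such that $T\{g\} = f$, thus we conclude that $T$, hence $\mathcal{F}$, maps $\mathcal{S}(\mathbb{R}^n)$ onto itself. This demonstrates the following important property:

Theorem 2.6.6 The mapping $\mathcal{F} : \mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$ is an isomorphism.

The isomorphism of the Fourier transform between Schwartz spaces means that we can find the Fourier transform and the inverse Fourier transform of any function in that space. This makes $\mathcal{S}(\mathbb{R}^n)$ the ideal space in which to use the Fourier transform.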
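Identity (2.6.8) can be checked numerically in one dimension by applying a quadrature version of the transform twice. The sketch assumes the non-even Schwartz function $f(x) = (1 + x)e^{-x^2/2}$ and a truncated midpoint rule; the names `f`, `fourier`, `f_hat`, and `fourier_twice` are hypothetical.

```python
import cmath
import math


def f(x):
    """Hypothetical non-even Schwartz function."""
    return (1.0 + x) * math.exp(-0.5 * x * x)


def fourier(func, omega, a=-9.0, b=9.0, steps=600):
    """Midpoint-rule approximation of the integral of func(x) * exp(-i omega x) over [a, b]."""
    h = (b - a) / steps
    total = 0j
    for i in range(steps):
        x = a + (i + 0.5) * h
        total += func(x) * cmath.exp(-1j * omega * x)
    return total * h


def f_hat(omega):
    """Numerical Fourier transform of f."""
    return fourier(f, omega)


def fourier_twice(x):
    """Numerical value of F{F{f}} at the point x."""
    return fourier(f_hat, x)
```

For $n = 1$, the value `fourier_twice(x)` should be close to `2 * math.pi * f(-x)`, which is (2.6.8) in one dimension.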
2.7 Tempered Distributions

2.7.1 Definition of Tempered Distribution

It was illustrated in the previous section that the existence of Fourier transforms is one of the central problems in Fourier analysis, and it motivated Schwartz to introduce a new class of distributions. The idea of Schwartz was to extend the space of test functions to include more functions in addition to the smooth functions of compact support. Since the space of functions associated with distributions is getting larger, we expect the new space of distributions to be smaller, hoping that this new class of distributions will have all the properties we need to define a Fourier transform.

Let $\mathcal{S}'(\mathbb{R}^n)$ be the dual space of $\mathcal{S}(\mathbb{R}^n)$, and

\[ \langle T, \varphi \rangle = \int_{-\infty}^{\infty} T(x)\varphi(x)\,dx. \]

It is clear that if $T_1, T_2 \in \mathcal{S}'(\mathbb{R}^n)$, then $aT_1 + bT_2 \in \mathcal{S}'(\mathbb{R}^n)$ for every $a, b \in \mathbb{R}$. Moreover, if

\[ \varphi_j \to \varphi, \]

then

\[ \langle T, \varphi_j \rangle \to \langle T, \varphi \rangle. \]

This suggests the following class of distributions.

Definition 2.7.1 (Tempered Distribution) A distribution $T : \mathcal{S}(\mathbb{R}^n) \to \mathbb{R}$ is said to be a tempered distribution if it is a linear continuous functional defined as

\[ T(\varphi) = \langle T, \varphi \rangle = \int_{-\infty}^{\infty} T(x)\varphi(x)\,dx. \]

The space of all linear continuous functionals on $\mathcal{S}(\mathbb{R}^n)$ is called the space of tempered distributions, and is denoted by $\mathcal{S}'(\mathbb{R}^n)$.
2.7.2 Functions of Slow Growth

A tempered distribution can be defined through a function $f$ with the property that $f\varphi$ is rapidly decreasing. This can be achieved by what is known as "functions of slow growth".

Definition 2.7.2 (Function of Slow Growth) A function $f \in C^\infty(\mathbb{R}^n)$ is said to be of slow growth if for every $m$ there exists $c_m \ge 0$ such that

\[ \left| f^{(m)}(x) \right| \le c_m (1 + |x|^2)^k \]

for some $k \in \mathbb{N}$.

The definition implies that functions of slow growth grow at infinity but no faster than polynomials, i.e., for some $k$, we have

\[ \frac{f(x)}{x^k} \to 0 \quad \text{as } |x| \to \infty. \]

The reader should be able to prove that if $f$ is a function of slow growth and $\varphi$ is a Schwartz function, then $f\varphi$ is Schwartz (see Problem 2.11.35), hence integrable. Therefore, this class of functions can be used to define a tempered distribution. Let $f$ be of slow growth, and $\varphi_n \in \mathcal{S}$, then

\[ |\langle f, \varphi_n \rangle - \langle f, \varphi \rangle| = |\langle f, \varphi_n - \varphi \rangle| \le \|f\|\, \|\varphi_n - \varphi\|. \tag{2.7.1} \]

If $\varphi_n \to \varphi$ in $\mathcal{S}$, then (2.7.1) implies that $f$ is continuous. Linearity of $f$ is obvious, hence $f$ is a linear continuous functional on $\mathcal{S}$, i.e., it defines a tempered distribution. Thus, let $f$ be a function of slow growth. Define $T_f$ as

\[ T_f(\varphi) = \langle T_f, \varphi \rangle = \int_{-\infty}^{\infty} f(x)\varphi(x)\,dx. \]

Linearity is clear. To prove continuity, consider the sequence $\varphi_n \in \mathcal{S}$. Then

\[ \begin{aligned} |T_f(\varphi_n)| = |\langle f, \varphi_n \rangle| &= \left| \int_{-\infty}^{\infty} f(x)\varphi_n(x)\,dx \right| \\ &= \left| \int_{-\infty}^{\infty} \frac{f(x)}{(1 + |x|^2)^k} \cdot (1 + |x|^2)^k \varphi_n(x)\,dx \right| \\ &\le \sup \left| (1 + |x|^2)^k \varphi_n \right| \int_{-\infty}^{\infty} \frac{|f(x)|}{(1 + |x|^2)^k}\,dx. \end{aligned} \]

Since $f$ is of slow growth, the integral exists for some large $k$. If $\varphi_n \to 0$, then $\|\varphi_n\|_{k,m} \to 0$. Let $m = 0$, then

\[ \sup \left| (1 + |x|^2)^k \varphi_n \right| \to 0 \]

because $\{\varphi_n\}$ is rapidly decreasing, hence $\langle f, \varphi_n \rangle \to 0$. Therefore, $f$ is continuous. We proved $f$ to be linear and continuous, so $f$ is a tempered distribution. As a result, all polynomials, constant functions, and trigonometric sine and cosine functions generate tempered distributions.
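The continuity estimate above can be illustrated numerically. The sketch assumes the slow-growth function $f(x) = x^2$, the Schwartz function $\varphi(x) = e^{-x^2}$, the weight exponent $k = 2$, and a truncation of all integrals to a finite window; `K`, `RADIUS`, and the function names are hypothetical choices for this illustration.

```python
import math

K = 2          # hypothetical exponent k in the weight (1 + x^2)^k
RADIUS = 20.0  # hypothetical truncation radius for the integrals


def midpoint(func, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))


def f(x):
    """Function of slow growth (a polynomial)."""
    return x * x


def phi(x):
    """Schwartz function exp(-x^2)."""
    return math.exp(-x * x)


def weight(x):
    return (1.0 + x * x) ** K


# Pairing <f, phi>, truncated to [-RADIUS, RADIUS].
pairing = midpoint(lambda x: f(x) * phi(x), -RADIUS, RADIUS)

# Grid approximation of sup (1 + x^2)^K |phi(x)|; phi is even, so x >= 0 suffices.
sup_weighted_phi = max(weight(i * 1e-3) * phi(i * 1e-3) for i in range(int(RADIUS * 1000) + 1))

# Integral of |f| / (1 + x^2)^K over the same window.
integral_weighted_f = midpoint(lambda x: abs(f(x)) / weight(x), -RADIUS, RADIUS)

bound = sup_weighted_phi * integral_weighted_f
```

Here `pairing` should be finite (close to $\sqrt{\pi}/2$) and no larger than `bound`, mirroring the inequality used in the continuity proof.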
2.7.3 Examples of Tempered Distributions

It should be noted that every tempered distribution is a distribution in $\mathcal{D}'$, but not conversely. Again, the function

\[ f(x) = e^{x^2} \]

defines a regular distribution, but it is not a tempered distribution because

\[ \varphi(x) = e^{-x^2} \in \mathcal{S}, \]

and the integral $\int_{\mathbb{R}} f\varphi$ diverges. On the other hand, if we assume $f$ to be of slow growth, i.e., $|f| \le p(x)$ for some polynomial $p$, then

\[ \int_{\mathbb{R}} f\varphi < \infty. \]

This tells us that every tempered distribution is a distribution, i.e.,

\[ \mathcal{S}'(\mathbb{R}) \subsetneq \mathcal{D}'(\mathbb{R}), \]

and there are regular distributions that are not tempered.

Example 2.7.3 Consider the Heaviside function $H(x)$. Using the same technique as above,

\[ \begin{aligned} |\langle H, \varphi_n \rangle| &= \left| \int_{-\infty}^{\infty} H(x)\varphi_n(x)\,dx \right| \\ &= \left| \int_{0}^{\infty} \varphi_n(x)\,dx \right| \le \int_{0}^{\infty} |\varphi_n(x)|\,dx \\ &\le \sup \left| (1 + |x|^2)\varphi_n \right| \int_{-\infty}^{\infty} \frac{1}{1 + |x|^2}\,dx. \end{aligned} \]

The integral exists. Let $\varphi_n \to 0$. Then, $\|\varphi_n\|_{1,0} \to 0$, so

\[ \sup \left| (1 + |x|^2)\varphi_n \right| \to 0, \]

which implies that $|\langle H, \varphi_n \rangle| \to 0$. So $H(x)$ is a continuous functional. Linearity is clear. Hence $H(x) \in \mathcal{S}'(\mathbb{R})$.

Example 2.7.4 Consider the delta distribution. We have

\[ |\langle \delta, \varphi_n \rangle| = |\varphi_n(0)|. \]

If $\varphi_n \to 0$, then

\[ \|\varphi_n\|_{k,m} \to 0. \]

Let $k = m = 0$, then $\sup |\varphi_n| \to 0$, which implies $\langle \delta, \varphi_n \rangle \to 0$. Therefore, δ is a continuous functional. Linearity is clear, hence $\delta \in \mathcal{S}'(\mathbb{R})$.
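2.8 Fourier Transform of Tempered Distribution

2.8.1 Motivation

Suppose we need to take the Fourier transform of a tempered distribution $T$, and for simplicity, let $n = 1$. Then

\[ \begin{aligned} \langle \hat{T}, \varphi \rangle &= \int_{\mathbb{R}} \hat{T}\, \varphi(\omega)\,d\omega \\ &= \int_{\mathbb{R}} \int_{\mathbb{R}} T(x)\, e^{-i\omega x} \varphi(\omega)\,dx\,d\omega. \end{aligned} \]

The RHS of the equation can be written by means of the Fubini Theorem as

\[ \int_{\mathbb{R}} \left( \int_{\mathbb{R}} \varphi(\omega)\, e^{-i\omega x}\,d\omega \right) T(x)\,dx = \int_{\mathbb{R}} T(x)\hat{\varphi}\,dx = \langle T, \hat{\varphi} \rangle. \]

Thus, we have

\[ \langle \hat{T}, \varphi \rangle = \langle T, \hat{\varphi} \rangle. \]

In order for $\langle T, \hat{\varphi} \rangle$ to make sense, it is required that $\hat{\varphi} \in \mathcal{S}$ for every $\varphi \in \mathcal{S}$. So now we understand the purpose of introducing a new class of distributions which is the dual of rapidly decreasing (Schwartz) functions. The tempered distribution seems to behave nicely with Fourier transforms. Now, we state the definition of the Fourier transform of distributions.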
2.8.2 Definition

Definition 2.8.1 (Fourier Transform of Tempered Distribution) Let $T$ be a tempered distribution and $\varphi \in \mathcal{S}$. Then, the Fourier transform of $T$, denoted by $\hat{T}$ (or $\mathcal{F}\{T\}$), is given by

\[ \langle \hat{T}, \varphi \rangle = \langle T, \hat{\varphi} \rangle. \]

Remark The notation $\mathcal{F}\{T\}$ is commonly used to denote classical Fourier transforms of functions, but the notation $\hat{T}$ is more commonly used for distributions.

The definition says that

\[ \hat{T}(\varphi) = T(\hat{\varphi}) \]

for every $\varphi \in \mathcal{S}$, and that makes sense because $\hat{\varphi} \in \mathcal{S}$ and $T \in \mathcal{S}'$, and this means that every tempered distribution has a Fourier transform. Now the question arises: how do we find the transform of a distribution? We need to manipulate the Fourier integral in $\langle T, \hat{\varphi} \rangle$ and rewrite it as $\int g(x)\varphi(x)\,dx$ for some $g(x)$. Then, we obtain

\[ \langle \hat{T}, \varphi \rangle = \langle g, \varphi \rangle, \]

which implies

\[ \hat{T} = g. \]

The Fourier transform, in this case, is defined in a distributional sense, but it is a natural extension of the classical Fourier transform. If $T$ is a function of slow growth and its classical Fourier transform $\mathcal{F}\{T\}$ exists, then $\mathcal{F}\{T\} = g$.
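2.8.3 Derivative of F.T. of Tempered Distribution

The next proposition shows that the result of Proposition 2.6.4 for Schwartz functions is valid for distributions.

Proposition 2.8.2 Let $T$ be a tempered distribution. Then
(1) $D_\omega^\alpha(\hat{T}(\omega)) = (-i)^{|\alpha|}\, \widehat{x^\alpha T}$,
(2) $\omega^\alpha \hat{T} = (-i)^{|\alpha|}\, \widehat{D_x^\alpha T}$.

Proof To prove the first assertion, we have

\[ \langle D_\omega^\alpha(\hat{T}(\omega)), \varphi(\omega) \rangle = (-1)^{|\alpha|} \langle \hat{T}, D_\omega^\alpha \varphi(\omega) \rangle = (-1)^{|\alpha|} \langle T, \mathcal{F}\{D_\omega^\alpha \varphi(\omega)\} \rangle. \]

By Proposition 2.6.4(1),

\[ \mathcal{F}\{D_\omega^\alpha \varphi\} = (i)^{|\alpha|} x^\alpha \hat{\varphi}. \]

This gives

\[ (-1)^{|\alpha|} \langle T, \mathcal{F}\{D_\omega^\alpha \varphi(\omega)\} \rangle = (-i)^{|\alpha|} \langle T, x^\alpha \hat{\varphi} \rangle = (-i)^{|\alpha|} \langle \widehat{x^\alpha T}, \varphi(\omega) \rangle. \]

Hence,

\[ \langle D_\omega^\alpha(\hat{T}(\omega)), \varphi(\omega) \rangle = (-i)^{|\alpha|} \langle \widehat{x^\alpha T}, \varphi(\omega) \rangle. \]

This proves (1). For (2), note that

\[ \langle \widehat{D_x^\alpha T}, \varphi(\omega) \rangle = \langle D_x^\alpha T, \hat{\varphi} \rangle = (-1)^{|\alpha|} \langle T, D_x^\alpha(\hat{\varphi}) \rangle. \]

Again, by Proposition 2.6.4(2),

\[ (-1)^{|\alpha|} \langle T, D_x^\alpha(\hat{\varphi}) \rangle = \langle T, \widehat{(i\omega)^\alpha \varphi} \rangle = \langle T, (i)^{|\alpha|}\, \widehat{\omega^\alpha \varphi} \rangle = (i)^{|\alpha|} \langle \omega^\alpha \hat{T}, \varphi \rangle. \]

Hence,

\[ \widehat{D_x^\alpha T} = (i)^{|\alpha|} \omega^\alpha \hat{T}. \]

Dividing by $(i)^{|\alpha|}$ proves (2).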
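2.9 Inversion Formula of The Fourier Transform

The inverse Fourier transform of a distribution, denoted by $\mathcal{F}^{-1}\{T\}$, or $\check{T}$, is given by

\[ \langle \mathcal{F}^{-1}\{\mathcal{F}\{T\}\}, \varphi \rangle = \langle \mathcal{F}\{T\}, \mathcal{F}^{-1}\{\varphi\} \rangle = \langle T, \mathcal{F}\{\mathcal{F}^{-1}\{\varphi\}\} \rangle = \langle T, \varphi \rangle, \]

i.e.,

\[ \mathcal{F}\{\mathcal{F}^{-1}\{T\}\} = \mathcal{F}^{-1}\{\mathcal{F}\{T\}\} = T. \tag{2.9.1} \]

How can we construct a formula for $\check{T}$? The next example is helpful in establishing some subsequent results.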
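2.9.1 Fourier Transform of Gaussian Function

Example 2.9.1 (Gaussian Function) Consider

\[ f(x) = e^{-x^2/2}. \]

By definition, we have

\[ \mathcal{F}\{f(t)\} = \int_{-\infty}^{\infty} e^{-\left(\frac{1}{2}t^2 + i\omega t\right)}\,dt. \]

Write

\[ \frac{1}{2}t^2 + i\omega t = \left( \frac{t}{\sqrt{2}} + \frac{i\omega}{\sqrt{2}} \right)^2 + \frac{\omega^2}{2}. \]

Using the substitution

\[ u = \frac{t + i\omega}{\sqrt{2}}, \]

we get

\[ \mathcal{F}\{f(t)\} = \sqrt{2}\, e^{-\omega^2/2} \int_{-\infty}^{\infty} e^{-u^2}\,du = \sqrt{2\pi}\, e^{-\omega^2/2}. \]

If $x \in \mathbb{R}^n$, then $f$ is written as

\[ f(x) = e^{-|x|^2/2}. \]

So

\[ \mathcal{F}\{f\} = \int_{\mathbb{R}^n} e^{-|x|^2/2 - i(\omega \cdot x)}\,dx. \]

By the previous argument for the case $n = 1$, taking into account that the integral is taken over $\mathbb{R}^n$, we have

\[ \begin{aligned} \mathcal{F}\{f\} &= e^{-\frac{1}{2}|\omega|^2} \prod_{k=1}^{n} \int_{-\infty}^{\infty} e^{-\frac{1}{2}[x + i\omega_k]^2}\,dx \\ &= e^{-\frac{1}{2}|\omega|^2} \prod_{k=1}^{n} \sqrt{2\pi} \\ &= (2\pi)^{\frac{n}{2}}\, e^{-|\omega|^2/2}. \end{aligned} \]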
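The one-dimensional result of Example 2.9.1 can be confirmed numerically. The sketch below assumes a truncated midpoint-rule approximation of the Fourier integral; the names `gaussian`, `fourier`, and `closed_form` are hypothetical.

```python
import cmath
import math


def gaussian(x):
    """The function f(x) = exp(-x^2 / 2) of Example 2.9.1."""
    return math.exp(-0.5 * x * x)


def fourier(func, omega, a=-12.0, b=12.0, steps=4000):
    """Midpoint-rule approximation of the integral of func(x) * exp(-i omega x) over [a, b]."""
    h = (b - a) / steps
    total = 0j
    for i in range(steps):
        x = a + (i + 0.5) * h
        total += func(x) * cmath.exp(-1j * omega * x)
    return total * h


def closed_form(omega):
    """Closed form sqrt(2 pi) * exp(-omega^2 / 2) derived in the example."""
    return math.sqrt(2.0 * math.pi) * math.exp(-0.5 * omega * omega)
```

For any moderate `omega`, `fourier(gaussian, omega)` should match `closed_form(omega)` up to quadrature error, with a negligible imaginary part.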
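The Gaussian function shall be used to obtain the inversion formula of the Fourier transform. Indeed, let $f \in C_c^\infty(\mathbb{R}^n)$, and consider the sequence

\[ g_\epsilon(x) = g(\epsilon x) \]

for some $g \in C_0^\infty(\mathbb{R}^n)$ and $\epsilon > 0$. Then

\[ \mathcal{F}\{g_\epsilon\} = \frac{1}{\epsilon^n}\, \hat{g}\!\left(\frac{\omega}{\epsilon}\right). \tag{2.9.2} \]

It follows that

\[ \langle f, \mathcal{F}\{g_\epsilon\} \rangle = \int f\, \mathcal{F}\{g_\epsilon\}. \]

Using (2.9.2) and an appropriate substitution we get

\[ \langle f, \mathcal{F}\{g_\epsilon\} \rangle = \int f(\epsilon y)\, \hat{g}(y)\,dy. \]

Then, we either use the Dominated Convergence Theorem (verify), or the fact that $f(\epsilon y) \to f(0)$ uniformly as $\epsilon \to 0$ (justify) to pass the limit inside the integral, and this gives

\[ \langle f, \mathcal{F}\{g_\epsilon\} \rangle \to f(0) \int \hat{g}(y)\,dy. \tag{2.9.3} \]

On the other hand,

\[ \langle f, \mathcal{F}\{g_\epsilon\} \rangle = \langle \mathcal{F}\{f\}, g_\epsilon \rangle = \int \hat{f}(y)\, g_\epsilon(y)\,dy = \int \hat{f}(y)\, g(\epsilon y)\,dy. \]

Passing to the limit $\epsilon \to 0$,

\[ \int \hat{f}(y)\, g(\epsilon y)\,dy \to g(0) \int \hat{f}(y)\,dy. \]

Hence,

\[ f(0) \int \hat{g}(y)\,dy = g(0) \int \hat{f}(y)\,dy. \tag{2.9.4} \]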
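This holds for all possible $g$. So let $g(x)$ be the Gaussian function discussed in Example 2.9.1. Then, $g(0) = 1$, and

\[ \mathcal{F}\{g\} = \hat{g} = (2\pi)^{\frac{n}{2}}\, e^{-\frac{1}{2}|\omega|^2}. \]

Integrating $\hat{g}$ over $\mathbb{R}^n$ gives

\[ \begin{aligned} \int_{\mathbb{R}^n} \hat{g}(y)\,dy &= (2\pi)^{\frac{n}{2}} \int_{\mathbb{R}^n} e^{-\frac{1}{2}|y|^2}\,dy \\ &= (2\pi)^{\frac{n}{2}} \prod_{k=1}^{n} \int_{-\infty}^{\infty} e^{-\frac{1}{2}y_k^2}\,dy_k \\ &= (2\pi)^{\frac{n}{2}} \prod_{k=1}^{n} \sqrt{2\pi} = (2\pi)^n. \end{aligned} \]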
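Hence, from (2.9.4) we obtain

\[ f(0) = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} \hat{f}(y)\,dy = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} \hat{f}(y)\, e^{-iy \cdot x} e^{iy \cdot x}\,dy. \tag{2.9.5} \]

Then, for any $x$, $f(x)$ can be obtained by using the shifting property on (2.9.5),

\[ f(x - x_0) \longleftrightarrow e^{-i\omega x_0} \hat{f}(\omega), \]

and we obtain

\[ f(x) = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} \hat{f}(y)\, e^{iy \cdot x}\,dy. \]

This suggests (in fact establishes) the inversion formula for the Fourier transform. The inverse Fourier transform, denoted by $\check{f}(x)$, is defined as

\[ \mathcal{F}^{-1}\{\hat{f}(\omega)\} = \check{f}(x) = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} \hat{f}(\omega)\, e^{i(\omega \cdot x)}\,d\omega. \tag{2.9.6} \]

We have

\[ \mathcal{F}\mathcal{F}^{-1}(f) = \mathcal{F}^{-1}\mathcal{F}(f) = f. \]

Hence, the inverse Fourier transform of a distribution $T$ can be defined similarly to Definition 2.8.1 as

\[ \langle \check{T}, \varphi \rangle = \langle T, \check{\varphi} \rangle, \quad \forall \varphi \in \mathcal{S}. \]

The analog of (2.9.1) can be achieved as follows: for every $\varphi \in \mathcal{S}$, we have

\[ \check{\hat{T}}(\varphi) = \hat{T}(\check{\varphi}) = T(\hat{\check{\varphi}}) = T(\varphi) = T(\check{\hat{\varphi}}) = \check{T}(\hat{\varphi}) = \hat{\check{T}}(\varphi). \]

Hence,

\[ \check{\hat{T}} = \hat{\check{T}} = T. \]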
122
2 Distribution Theory
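2.9.2 Fourier Transform of Delta Distribution

Example 2.9.2 To find $\mathcal{F}\{\delta\}$, we have

$$\langle \mathcal{F}\{\delta\}, \varphi\rangle = \langle \delta, \mathcal{F}\{\varphi\}\rangle.$$

But this is equal to

$$\mathcal{F}\{\varphi\}(0) = \int_{-\infty}^{\infty} e^{-i\,0\,x}\,\varphi(x)\,dx = \int_{-\infty}^{\infty} 1\cdot\varphi(x)\,dx = \langle 1, \varphi\rangle.$$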
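Hence $\mathcal{F}\{\delta\} = 1$. The result of the example seems plausible. Let us see why. Let

$$\phi_n(x) = \frac{1}{\pi x}\sin(nx).$$

It is well-known that

$$\int_{-\infty}^{\infty} \frac{\sin x}{x}\,dx = \pi.$$

Using the substitution $u = nx$, and a continuous $f$,

$$\lim_{n\to\infty}\int_{-\infty}^{\infty} \frac{\sin nx}{\pi x}\,f(x)\,dx = \lim_{n\to\infty}\int_{-\infty}^{\infty} \frac{\sin u}{\pi u}\,f\!\left(\frac{u}{n}\right)du = \frac{1}{\pi}\int_{-\infty}^{\infty} \frac{\sin u}{u}\,\lim_{n\to\infty} f\!\left(\frac{u}{n}\right)du = f(0)\,\frac{1}{\pi}\int_{-\infty}^{\infty} \frac{\sin u}{u}\,du = f(0).$$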
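Therefore, $\phi_n(x)$ is a delta sequence. Now let us find the inverse Fourier transform of 1:

$$\mathcal{F}^{-1}\{1\} = \frac{1}{2\pi}\int_{-\infty}^{\infty} 1\cdot e^{i\omega x}\,d\omega = \frac{1}{2\pi}\lim_{L\to\infty}\int_{-L}^{L} 1\cdot e^{i\omega x}\,d\omega = \frac{1}{2\pi}\lim_{L\to\infty}\frac{e^{ixL} - e^{-ixL}}{ix} = \lim_{L\to\infty}\frac{\sin Lx}{\pi x} = \delta(x).$$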
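The delta-sequence behaviour of $\sin(nx)/(\pi x)$ can be observed numerically. The sketch below is an illustration under stated assumptions: the test function $f(x) = e^{-x^2}$, the truncation of the integral to $[-8, 8]$, and the helper names are choices made here, not part of the text. For this particular $f$ the pairing has the closed form $\operatorname{erf}(n/2)$, which tends to $f(0) = 1$.

```python
import math

def dirichlet_kernel(n, x):
    """phi_n(x) = sin(n x) / (pi x), extended continuously by n / pi at x = 0."""
    if abs(x) < 1e-12:
        return n / math.pi
    return math.sin(n * x) / (math.pi * x)

def f(x):
    """Sample smooth, rapidly decaying function with f(0) = 1."""
    return math.exp(-x * x)

def pair_with(n, func, half_width=8.0, steps=40000):
    """Trapezoid approximation of the integral of phi_n * func over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * dirichlet_kernel(n, x) * func(x)
    return total * h

for n in (1, 2, 5, 10):
    # Second column: numerical pairing; third column: erf(n / 2).
    print(n, pair_with(n, f), math.erf(n / 2.0))
```

As $n$ grows, both columns approach $f(0) = 1$, in line with the limit computed above.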
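Example 2.9.3 Consider $\delta_c$. This is the shift $\delta(x - c)$ of $\delta$. We have

$$\langle \mathcal{F}\{\delta_c\}, \varphi\rangle = \langle \delta_c, \mathcal{F}\{\varphi\}\rangle = \mathcal{F}\{\varphi\}(c) = \int_{\mathbb{R}} e^{-icx}\varphi(x)\,dx = \langle e^{-icx}, \varphi\rangle.$$

Hence

$$\mathcal{F}\{\delta_c\}(\omega) = e^{-ic\omega}.$$

Similarly, one can show that $\mathcal{F}\{\delta_{-c}\} = e^{ic\omega}$. This implies that $\mathcal{F}\{e^{icx}\} = 2\pi\delta(\omega - c)$. Moreover,

$$\mathcal{F}\left\{\frac{\delta_c + \delta_{-c}}{2}\right\} = \cos c\omega,$$

and

$$\mathcal{F}\{\cos cx\} = \pi\big(\delta(\omega - c) + \delta(\omega + c)\big).$$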
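2.9.3 Fourier Transform of Sign Function

Example 2.9.4 Since $(\operatorname{sgn})' = 2\delta$, we have $\mathcal{F}\{\operatorname{sgn}'\} = 2$. According to Proposition 2.8.2, $\mathcal{F}\{\operatorname{sgn}'\} = i\omega\,\mathcal{F}\{\operatorname{sgn}\}$. So

$$\mathcal{F}\{\operatorname{sgn}\} = \frac{2}{i\omega}.$$

The unit step function can be written as

$$H(x) = \frac{1}{2}\big(1 + \operatorname{sgn}(x)\big),$$

so we obtain

$$\mathcal{F}\{H\} = \frac{1}{2}\left(2\pi\delta + \frac{2}{i\omega}\right) = \pi\delta + \frac{1}{i\omega}.$$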
2.10 Convolution of Distribution

The convolution of two functions is a special type of product that satisfies elementary algebraic properties, such as the commutative, associative, and distributive properties. To define a convolution for distributions, we restrict ourselves to the case of a distribution and a test function.
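2.10.1 Derivatives of Convolutions

First, we need the following result, which discusses the derivative of convolutions.

Lemma 2.10.1 $(\varphi * \psi)^{(k)} = \varphi^{(k)} * \psi = \varphi * \psi^{(k)}$ for $k = 0, 1, 2, \ldots$

Proof We differentiate $\varphi * \psi$ to obtain

$$\frac{d}{dx}(\varphi * \psi)(x) = \int_{\mathbb{R}} \varphi'(x - y)\psi(y)\,dy = \int_{\mathbb{R}} (-1)\frac{d}{dy}\varphi(x - y)\,\psi(y)\,dy = \int_{\mathbb{R}} (-1)^2\varphi(x - y)\psi'(y)\,dy = (\varphi * \psi')(x),$$

where the first integral is $(\varphi' * \psi)(x)$ and the third equality is an integration by parts whose boundary terms vanish. Continuing the process $k$ times gives the result.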
2.10.2 Convolution in Schwartz Space

The previous result indicates the smoothness property of convolution. If $\psi \in \mathcal{S}(\mathbb{R}^n)$, then the convolution of $\psi$ with another function $\varphi$ is smooth and can be differentiated infinitely many times. As a consequence, we have

Theorem 2.10.2 If $\varphi, \psi \in \mathcal{S}(\mathbb{R}^n)$, then $\varphi * \psi \in \mathcal{S}(\mathbb{R}^n)$.

Proof The previous lemma implies that $\varphi * \psi \in C^\infty(\mathbb{R}^n)$. To prove that $\varphi * \psi$ is rapidly decreasing, we use Definition 2.6.1. Notice that

$$|x|^k\,|\varphi(x - y)|\,\big|\psi^{(m)}(y)\big| \le 2^k\Big(|x - y|^k\,|\varphi(x - y)|\,\big|\psi^{(m)}(y)\big| + |y|^k\,|\varphi(x - y)|\,\big|\psi^{(m)}(y)\big|\Big).$$

Then,

$$|x|^k\,\big|(\varphi * \psi)^{(m)}\big| \le 2^k\int_{\mathbb{R}}\Big(|x - y|^k\,|\varphi(x - y)|\,\big|\psi^{(m)}(y)\big| + |y|^k\,|\varphi(x - y)|\,\big|\psi^{(m)}(y)\big|\Big)\,dy. \tag{2.10.1}$$

Since $\varphi, \psi \in \mathcal{S}(\mathbb{R}^n)$, the integral on the RHS of (2.10.1) exists and is finite (why?). Hence

$$|x|^k\,\big|(\varphi * \psi)^{(m)}\big| < \infty.$$

Taking the supremum over all $x \in \mathbb{R}$, the result follows. For $\mathcal{S}(\mathbb{R}^n)$, the proof is the same as above, with $k$ replaced by $|\alpha|$ and $m$ by $\beta$ for some $\alpha, \beta \in \mathbb{N}_0^n$, and taking the integral over $\mathbb{R}^n$.
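2.10.3 Definition of Convolution of Distributions

Now, let $T \in \mathcal{S}'(\mathbb{R}^n)$ and $\psi \in \mathcal{S}(\mathbb{R}^n)$. The convolution of $T$ with $\psi$ is given by

$$\langle \psi * T, \varphi\rangle = \int_{\mathbb{R}}\int_{\mathbb{R}} \psi(x - y)\,T(y)\,dy\,\varphi(x)\,dx = \int_{\mathbb{R}} T(y)\int_{\mathbb{R}} \psi(x - y)\,\varphi(x)\,dx\,dy,$$

where the second equality uses the Fubini Theorem. This gives

$$\langle \psi * T, \varphi\rangle = \langle T, \psi^- * \varphi\rangle,$$

where

$$\psi^-(x) = \psi(-x).$$

This definition won't make sense unless $\psi^- * \varphi \in \mathcal{S}$, which holds by Theorem 2.10.2. Now we are ready to state the following definition.

Definition 2.10.3 (Convolution of Distribution) Let $T \in \mathcal{S}'(\mathbb{R}^n)$ and $\psi \in \mathcal{S}(\mathbb{R}^n)$. Then the convolution of $T$ and $\psi$ is given by

$$\langle \psi * T, \varphi\rangle = \langle T, \psi^- * \varphi\rangle.$$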
2.10.4 Fundamental Property of Convolutions

We apply the definition to establish the following fundamental property of convolutions.

Theorem 2.10.4 If $f^{(n)}$ exists, then for all $n = 0, 1, 2, \ldots$, we have

$$f * \delta^{(n)} = f^{(n)}.$$

Proof Let $n = 0$. Applying Definition 2.10.3, we get

$$\langle f * \delta, \varphi\rangle = \langle \delta, f^- * \varphi\rangle = (f^- * \varphi)(0) = \int_{\mathbb{R}} f(y - 0)\varphi(y)\,dy = \langle f, \varphi\rangle.$$

Hence, we obtain $f * \delta = f$. For $n = 1$, we have

$$\langle f * \delta', \varphi\rangle = \langle \delta', f^- * \varphi\rangle = -\left\langle \delta, \frac{d}{dx}(f^- * \varphi)\right\rangle = -(f^- * \varphi)'(0) = (-1)(-1)\int_{\mathbb{R}} f'(y - x)\varphi(y)\,dy\,\Big|_{x=0} = \langle f', \varphi\rangle.$$

Using induction, one can easily prove that the result is valid for all $n$.
The result shows that the delta distribution plays the role of the identity of the convolution process over all distributions. The advantage of this property is that the delta function and its derivatives can be used in computing the derivatives of functions.
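2.10.5 Fourier Transform of Convolution

Now, we state the Fourier transform of convolution.

Theorem 2.10.5 Let $T \in \mathcal{S}'(\mathbb{R})$ and $\psi \in \mathcal{S}(\mathbb{R})$. Then,

$$\mathcal{F}\{\psi * T\} = \mathcal{F}\{\psi\}\cdot\mathcal{F}\{T\}.$$

Proof We have

$$\langle \mathcal{F}\{\psi * T\}, \varphi\rangle = \langle \psi * T, \mathcal{F}\{\varphi\}\rangle = \langle T, \psi^- * \mathcal{F}\{\varphi\}\rangle,$$

which, using the fact that

$$\mathcal{F}\{\mathcal{F}\{\psi\}\} = \psi^-$$

(verify), can be written as

$$\langle T, \mathcal{F}\{\mathcal{F}\{\psi\}\} * \mathcal{F}\{\varphi\}\rangle = \langle T, \mathcal{F}\{\mathcal{F}\{\psi\}\cdot\varphi\}\rangle = \langle \mathcal{F}\{T\}, \mathcal{F}\{\psi\}\cdot\varphi\rangle = \langle \mathcal{F}\{\psi\}\cdot\mathcal{F}\{T\}, \varphi\rangle.$$

Here constants are suppressed: with the normalization of (2.9.6), $\mathcal{F}\{\mathcal{F}\{\psi\}\} = 2\pi\,\psi^-$ by the duality property of Problem (20), and this factor cancels against the factor $1/(2\pi)$ in the product formula of Problem (55). Therefore, we obtain

$$\mathcal{F}\{\psi * T\} = \mathcal{F}\{\psi\}\cdot\mathcal{F}\{T\}.$$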
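A finite-dimensional analogue of the convolution theorem can be verified directly. The sketch below swaps the Fourier transform on $\mathcal{S}'(\mathbb{R})$ for the discrete Fourier transform on sequences of length $N$, with circular convolution in place of convolution on $\mathbb{R}$; the sample sequences `psi` and `t` and the helper names are illustrative assumptions.

```python
import cmath

def dft(values):
    """Naive discrete Fourier transform with kernel exp(-2*pi*i*j*k/N)."""
    size = len(values)
    return [
        sum(values[j] * cmath.exp(-2j * cmath.pi * j * k / size) for j in range(size))
        for k in range(size)
    ]

def circular_convolution(a, b):
    """(a * b)[m] = sum over j of a[j] * b[(m - j) mod N]."""
    size = len(a)
    return [sum(a[j] * b[(m - j) % size] for j in range(size)) for m in range(size)]

# Arbitrary sample data standing in for psi and T.
psi = [1.0, 2.0, 0.5, 0.0, -1.0, 0.0, 0.0, 0.25]
t = [0.0, 1.0, 0.0, 3.0, 0.0, -2.0, 1.5, 0.0]

lhs = dft(circular_convolution(psi, t))             # transform of the convolution
rhs = [p * q for p, q in zip(dft(psi), dft(t))]     # product of the transforms
max_gap = max(abs(x - y) for x, y in zip(lhs, rhs))
print(max_gap)  # difference at the level of rounding error
```

In this discrete setting no normalizing constant appears in the identity, which mirrors the cancellation of constants in the continuous proof.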
2.11 Problems

(1) Show that $C_c^\infty(\mathbb{R})$ is an infinite-dimensional linear space.
(2) Give an example of a function that is not locally integrable on $\mathbb{R}^2$.
(3) Give an example of a locally integrable function $f$ that defines a regular distribution, but $f^2$ does not.
(4) Give an example of a function $f \in L^1_{loc}(I)$ but $f \notin L^1(I)$ for some interval $I \subset \mathbb{R}$.
(5) Consider the following function
$$f(t) = \begin{cases} e^{-1/t} & t > 0 \\ 0 & t \le 0. \end{cases}$$
(a) Show that $f \in C^\infty(\mathbb{R})$.
(b) Define $g(x) = f(1 - |x|^2)$. Show that $\operatorname{supp}(g) = (0, \infty)$.
(6) Determine when $f(x) = x^\alpha$ defines a regular distribution.
(7) Prove Proposition 2.2.3.
(8) Let $\varphi \in \mathcal{D}(\mathbb{R})$.
(a) Show that $\varphi(at) \in \mathcal{D}(\mathbb{R})$ for all $a \in \mathbb{R} \setminus \{0\}$.
(b) If $g \in C^\infty(\mathbb{R})$, show that $g(t)\varphi(t) \in \mathcal{D}(\mathbb{R})$.
(9) Let $T \in \mathcal{D}'(\mathbb{R})$. Show that $\langle f(t - \tau), \varphi(t)\rangle = \langle f(t), \varphi(t + \tau)\rangle$.
(10) Show that if $T \in \mathcal{D}'(\Omega)$ for $\Omega \subseteq \mathbb{R}$ and $f \in C_0^\infty(\mathbb{R})$ then $D(fT) = f(DT) + Tf'$.
(11) Show that for $s < r$ we have $L^r_{loc} \subset L^s_{loc}$.
(12) Show that the following sequences are delta-sequences
(1) $\varphi_n(x) = \dfrac{n}{\pi(1 + n^2x^2)}$, for $n \to \infty$.
(2) $\varphi_n(x) = \dfrac{\sin^2 nx}{n\pi x^2}$, for $n \to \infty$.
(13) Evaluate the integral
$$\int_{-\infty}^{\infty} \delta(x - y - u)\,\delta(u - t)\,du.$$
(14) Show that
$$\int_{-\infty}^{\infty} \delta(x - a)\,\delta(x - b)\,dx = \delta(a - b).$$
(15) Find the first distributional derivative of each of the following functions. 1 (1) f (x) = |x| . (5) f (x) = √ . x+ x . (6) f (x) = cos x for x irrational. (2) f (x) = |x| 1 (7) f (x) = ln |x| sgn(x). (3) f (x) = |x|2 . 2 1 (4) f (x) = √ . (8) f (x) = H (x) sin x. x (16) Determine whether each of the following functions is a Schwartz function or not √ 2 (5) f (x) = e− x +1 . (1) f (x) = e−a|x| for all a > 0. (2) f (x) = e−a|x| for all a > 0.
(6) f (x) = e−x cos e x .
(3) f (x) = xe−x .
(7) f (x) = e−x cos e x .
(4) f (x) = e−|x| .
(8) f (x) = e−x cos e x .
2
2
2
2
2
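(17) Let $T_f$ be a regular distribution. Show that $\mathcal{F}\{T_f(\varphi)\} = T_f(\mathcal{F}(\varphi))$.
(18) Prove the scaling property:
$$\mathcal{F}\{\varphi(ax)\} = \frac{1}{|a|}\,\mathcal{F}\{\varphi\}\!\left(\frac{\omega}{a}\right).$$
(19) Show the following shift properties:
(a) $\mathcal{F}\{f(x - a)\} = e^{-ia\omega}\mathcal{F}\{f\}$.
(b) $\mathcal{F}\{e^{ix\cdot a}f(x)\} = \hat f(\omega - a)$.
(20) Prove the duality property of the operator $\mathcal{F}$: If $\mathcal{F}\{f(x)\} = F(\omega)$, then $\mathcal{F}\{F(x)\} = 2\pi f(-\omega)$.
(21) Let $f \in \mathcal{S}(\mathbb{R}^n)$. Show that
(a) $\mathcal{F}\{f(x + y)\} = e^{iy\omega}\mathcal{F}\{f\}$.
(b) $\mathcal{F}\{f(\lambda x)\} = \dfrac{1}{|\lambda|^n}\,\mathcal{F}\{f\}\!\left(\dfrac{\omega}{\lambda}\right)$, $\lambda \in \mathbb{R}$.
(22) Show that definitions from (2.6.2) to (2.6.4) are all equivalent.
(23) Show that if $f_n$ converges in $\mathcal{D}(\mathbb{R})$, then $f_n$ converges in $\mathcal{S}(\mathbb{R})$.
(24) Let $\varphi \in \mathcal{S}(\mathbb{R})$ and let $P(x)$ be a polynomial on $\mathbb{R}$. Prove that $(P(x))^c\varphi(x) \in \mathcal{S}(\mathbb{R})$ for any $c \ge 0$.
(25) Show that if $\varphi, \psi$ are rapidly decreasing functions, then $\varphi\cdot\psi \in \mathcal{S}(\mathbb{R})$. Conclude that $\varphi * \psi \in \mathcal{S}(\mathbb{R})$.
(26) Prove that if $\mathcal{F}\{f\}$ is a rapidly decreasing function, so is $f$.
(27) Show that the convergence in $\mathcal{S}(\mathbb{R}^n)$ is uniform.
(28) Show that $f(x) = e^{-|x|^m} \in \mathcal{S}(\mathbb{R})$ if and only if $m = 2n$ for $n \in \mathbb{N}$.
(29) Show that $\mathcal{F} : \mathcal{S}(\mathbb{R}^n) \longrightarrow \mathcal{S}(\mathbb{R}^n)$ is absolutely convergent.
(30) Show that $|\cdot|_{\alpha,\beta}$ is a seminorm but not a norm.
(31) Show that if $f \in \mathcal{S}(\mathbb{R}^n)$ then $|f| \le c_m(1 + |x|)^{-m}$ for all $m \in \mathbb{N}$, but the converse is not true.
(32) Show that if $f(x) \in \mathcal{S}(\mathbb{R}^n)$ then $f(-x), f(ax) \in \mathcal{S}(\mathbb{R}^n)$.
(33) Determine whether each of the following functions defines a tempered distribution.
(1) $f(x) = e^{x^4}$.
(2) $f(x) = e^{x}$.
(3) $f(x) = x^3$.
(34) Show that if $f(x) \in \mathcal{S}'(\mathbb{R})$ then $f(-x), f(at) \in \mathcal{S}'(\mathbb{R})$ for $a \in \mathbb{R}$.
(35) Show that a product of a function of slow growth with a Schwartz function is again a Schwartz function.
(36) If $f_n \to f$ in $\mathcal{S}$, show that $f_n \to f$ in $\mathcal{S}'$.
(37) Show that if $T$ is a tempered distribution then so is $T'$.
(38) Determine whether $e^x\cos(e^x)$ belongs to $\mathcal{S}'(\mathbb{R})$.
(39) Show that $\langle (T_f)^-, \varphi\rangle = \langle T_f, \varphi^-\rangle$.
(40) Show that
$$\mathcal{D} \subset \mathcal{S} \subset L^2 \subset \mathcal{S}' \subset \mathcal{D}'.$$
(41) Use the duality property ($\mathcal{F}\{F(x)\} = 2\pi f(-\omega)$) to find the Fourier transform of $f(x) = \dfrac{1}{x}$.
(42) Show that
$$\delta(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{i\omega x}\,d\omega.$$
(43) Find the Fourier transform of
(1) $e^{-ax^2 + bx + c}$.
(2) $e^{-b|x|}$.
(3) $\dfrac{1}{x + ia}$.
(4) $\dfrac{1}{(x + ia)^2}$.
(5) $|x|^\alpha\operatorname{sgn}(x)$.
(6) $|x|^\alpha\ln|x|$.
(7) $\text{p.v.}\,\dfrac{1}{x}$.
(8) $\text{p.v.}\,\dfrac{1}{x^2}$.
(44) Find $\mathcal{F}\{(-x)^\alpha\}$ and $\mathcal{F}\{D^\alpha\delta\}$.
(45) Show that if $f \in \mathcal{S}(\mathbb{R}^n)$ then $\mathcal{F}\{f\} = (2\pi)^n\mathcal{F}^{-1}\{f\}$.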
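(46) Let $\mathcal{F} : \mathcal{S}'(\mathbb{R}^n) \longrightarrow \mathcal{S}'(\mathbb{R}^n)$. Show that $\mathcal{F}$ is continuous and $\mathcal{F}^{-1}$ exists.
(47) Show that
$$\omega^\alpha D^\beta(\mathcal{F}\{f\}) = (-i)^{|\beta| + |\alpha|}D^\alpha(\mathcal{F}\{x^\beta f\}).$$
(48) Let $T$ be the distribution defined by the function $\mathcal{F}\{f\}$, for some $f \in \mathcal{S}(\mathbb{R}^n)$. Show that $T_{\mathcal{F}\{f\}} = \mathcal{F}\{T_f\}$.
(49) Fourier Transform of Polynomials: Show that the following is true.
(a) $\mathcal{F}\{e^{-icx}\} = 2\pi\delta(\omega + c)$ for $x \in \mathbb{R}$ and $\mathcal{F}\{e^{-ic\cdot x}\} = (2\pi)^n\delta(\omega + c)$ for $x \in \mathbb{R}^n$.
(b) $\mathcal{F}\{i^kx^ke^{-icx}\} = (-1)^k(2\pi)D^k(\delta(\omega + c))$ for $x \in \mathbb{R}$.
(c) $\mathcal{F}\{i^{|\alpha|}x^\alpha e^{-ic\cdot x}\} = (-1)^{|\alpha|}(2\pi)^nD^\alpha(\delta(\omega + c))$ for $x \in \mathbb{R}^n$.
(d) $\mathcal{F}\{x^k\} = 2\pi i^kD^k\delta(\omega)$.
(e) $\mathcal{F}\{P(x)\} = 2\pi\displaystyle\sum_{j=0}^{n} a_j(i)^jD^j(\delta(\omega))$ for the polynomial $P(x) = \displaystyle\sum_{j=0}^{n} a_jx^j$.
(50) Show that the following inclusions are proper
(a) $C_c^\infty(\mathbb{R}^n) \subset \mathcal{S}(\mathbb{R}^n)$.
(b) $\mathcal{S}'(\mathbb{R}) \subset \mathcal{D}'(\mathbb{R})$.
(51) If $T \in \mathcal{S}'$ and $\varphi, \psi \in \mathcal{S}$, show that
$$(T * \varphi) * \psi = T * (\varphi * \psi).$$
(52) Show that $\mathcal{F}\{fg\} = \hat f * \hat g$ and $\mathcal{F}^{-1}\{\mathcal{F}\{f\} * \mathcal{F}\{g\}\} = fg$.
(53) Let $f, g \in C_0^\infty(\mathbb{R})$.
(a) Show that $f * g \in C_0^\infty(\mathbb{R})$.
(b) Show that $\operatorname{supp}(f * g) \subset \operatorname{supp}(f) + \operatorname{supp}(g)$.
(54) Let $T \in \mathcal{S}'(\mathbb{R})$ and $\varphi \in \mathcal{S}(\mathbb{R})$. Show that $T * \varphi \in \mathcal{S}'(\mathbb{R})$.
(55) Show that
$$\mathcal{F}\{\varphi\cdot\psi\} = \frac{1}{(2\pi)^n}\,\mathcal{F}\{\varphi\} * \mathcal{F}\{\psi\}$$
for $x \in \mathbb{R}^n$.
(56) Let $f \in L^1_{loc}(\mathbb{R})$.
(a) Show that if $g \in C_c(\mathbb{R})$ then $f * g \in C(\mathbb{R})$.
(b) Show that if $g \in C_c^1(\mathbb{R})$ then $f * g \in C^1(\mathbb{R})$.
(c) Show that if $g \in C_c^\infty(\mathbb{R})$ then $f * g \in C^\infty(\mathbb{R})$.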
Chapter 3
Theory of Sobolev Spaces
3.1 Weak Derivative

3.1.1 Notion of Weak Derivative

Recall that Definition 1.4.1 for distributional derivatives was given in the form

$$\langle T^{(k)}, \varphi\rangle = (-1)^k\langle T, \varphi^{(k)}\rangle.$$

Under this type of derivative, distributions have derivatives of all orders. Another generalization of differentiation is proposed for locally integrable functions that are not necessarily differentiable in the usual sense. Such a derivative has two advantages: it provides derivatives for nondifferentiable functions, and it generalizes the notion of partial derivative. Recall the multi-index $\alpha = (\alpha_1, \ldots, \alpha_n)$ defined in Sect. 2.6 as the $n$-tuple of nonnegative integers $\alpha_i \ge 0$, where $\alpha_i \in \mathbb{N}_0 = \mathbb{N} \cup \{0\}$, and we denote $|\alpha| = \alpha_1 + \cdots + \alpha_n$. Then, for $x = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$, we have

$$x^\alpha = x_1^{\alpha_1}\cdots x_n^{\alpha_n}.$$

For differentiation, we have

$$\partial_x^\alpha = \partial_1^{\alpha_1}\cdots\partial_n^{\alpha_n}, \qquad \partial^\alpha u = \frac{\partial^{|\alpha|}u}{\partial x_1^{\alpha_1}\cdots\partial x_n^{\alpha_n}}.$$

For example, letting $\alpha = (2, 1, 0)$ and $u = u(x, y, z)$, we get

$$D^\alpha u = \partial^\alpha u = \frac{\partial^3 u}{\partial x^2\,\partial y}.$$
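In general, for a function $u \in L^p(\mathbb{R}^n)$, we can write

$$Du(x) = \begin{pmatrix} D_1u(x) \\ D_2u(x) \\ \vdots \\ D_nu(x) \end{pmatrix},$$

with norm

$$\|Du\|_p^p = \|D_1u\|_p^p + \|D_2u\|_p^p + \cdots + \|D_nu\|_p^p.$$

Motivated by the distributional derivative, the differentiation takes the form

$$\langle \partial^\alpha T, \varphi\rangle = (-1)^{|\alpha|}\langle T, \partial^\alpha\varphi\rangle.$$

We give the definition of a weak derivative.

Definition 3.1.1 (Weak Derivative) Let $u \in L^1_{loc}(\Omega)$, $\Omega \subseteq \mathbb{R}^n$. If there exists a function $v \in L^1_{loc}(\Omega)$ such that

$$\int_\Omega u\,D^\alpha\varphi\,dx = (-1)^{|\alpha|}\int_\Omega v\,\varphi\,dx$$

for every $\varphi \in C_c^\infty(\Omega)$, then $D^\alpha u = v$, and we say that $u$ is weakly differentiable of order $\alpha$, and its $\alpha$th weak derivative is $v$. We also say that the $\alpha$th partial derivative of the distribution $T$ is given by

$$D^\alpha T(\varphi) = (-1)^{|\alpha|}T(D^\alpha\varphi).$$

Remark Observe the following:

(1) It is easy to see from the definition that

$$D^\alpha u(c_1\varphi_1 + c_2\varphi_2) = c_1D^\alpha u(\varphi_1) + c_2D^\alpha u(\varphi_2)$$

and, whenever $\varphi_m \to \varphi$ in $\mathcal{D}(\Omega)$,

$$\langle D^\alpha u, \varphi_m\rangle = (-1)^{|\alpha|}\langle u, D^\alpha\varphi_m\rangle \longrightarrow (-1)^{|\alpha|}\langle u, D^\alpha\varphi\rangle = \langle D^\alpha u, \varphi\rangle.$$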
Consequently, $D^\alpha u$ is a bounded linear functional.

(2) Classical and weak derivatives coincide if $u$ is continuously differentiable on $\Omega$. In this case, if $|\alpha| = k$, then

$$D^\alpha u = u^{(k)} = \frac{d^k}{dx^k}u.$$

(3) If $T$ is a regular distribution (i.e., $T = T_u$ for some locally integrable $u$), then the distributional derivative and the weak derivative coincide, that is, if $|\alpha| = k$, then $D^\alpha u = T^{(k)}$.

(4) If $u \in L^1_{loc}(\Omega)$, but there is no $v \in L^1_{loc}(\Omega)$ such that

$$\int_\Omega u\,D^\alpha\varphi\,dx = (-1)^{|\alpha|}\int_\Omega v\,\varphi\,dx$$

for every $\varphi \in C_c^\infty(\Omega)$, then we say that $u$ has no weak $\alpha$th partial derivative. A distribution may have a distributional derivative without having a weak derivative. For example, consider the Heaviside function $H(x)$. Then, $H' = \delta$ is the distributional derivative, but it is not locally integrable; hence $H$ has no weak derivative. On the other hand, let $f(x) = |x|$; then $f'(x) = \operatorname{sgn}(x) \in L^1_{loc}(\mathbb{R})$, so it is both the weak and the distributional derivative of $f$.
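The defining identity of the weak derivative can be tested numerically for $f(x) = |x|$ and $v = \operatorname{sgn}$. The sketch below is illustrative only: the test function (a standard bump shifted by $0.3$ so that the pairing is not zero by symmetry), the midpoint rule, and all helper names are assumptions made here.

```python
import math

def bump(x):
    """Smooth function supported in [-1, 1]."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def bump_prime(x):
    """Classical derivative of bump."""
    if abs(x) >= 1.0:
        return 0.0
    return bump(x) * (-2.0 * x) / (1.0 - x * x) ** 2

def phi(x):
    """Test function with support [-0.7, 1.3]."""
    return bump(x - 0.3)

def phi_prime(x):
    return bump_prime(x - 0.3)

def sgn(x):
    return (x > 0) - (x < 0)

def midpoint(func, a, b, steps=20000):
    """Composite midpoint rule for func on [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))

# Split the integrals at x = 0, where |x| has a kink and sgn jumps.
lhs = (midpoint(lambda x: abs(x) * phi_prime(x), -0.7, 0.0)
       + midpoint(lambda x: abs(x) * phi_prime(x), 0.0, 1.3))
rhs = -(midpoint(lambda x: sgn(x) * phi(x), -0.7, 0.0)
        + midpoint(lambda x: sgn(x) * phi(x), 0.0, 1.3))

print(lhs, rhs)  # integral of |x| phi' versus minus the integral of sgn phi
```

The two printed numbers should agree up to quadrature error, which is the statement $\int |x|\,\varphi'\,dx = -\int \operatorname{sgn}(x)\,\varphi\,dx$ for this particular $\varphi$.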
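3.1.2 Basic Properties of Weak Derivatives

The next theorem discusses some basic properties of the weak derivative.

Theorem 3.1.2 Let $u, D^\alpha u \in L^1_{loc}(\Omega)$, then
(1) $D^\alpha u$ is unique.
(2) If $D^\beta u, D^{\alpha+\beta}u \in L^1_{loc}(\Omega)$ exist, then $D^\beta(D^\alpha u) = D^\alpha(D^\beta u) = D^{\alpha+\beta}u$.

Proof For (1), let $v_1$ and $v_2$ be two weak derivatives of $u$, then

$$\int_\Omega v_1\varphi\,dx = -\int_\Omega u\varphi'\,dx \quad\text{and}\quad \int_\Omega v_2\varphi\,dx = -\int_\Omega u\varphi'\,dx,$$

hence we have

$$\int_\Omega (v_2 - v_1)\varphi\,dx = 0.$$

In Sect. 3.2, we will see that this implies that $v_2 = v_1$ a.e.

For (2), note here that $D^\alpha(D^\beta\varphi) = D^{\alpha+\beta}\varphi$ since $\varphi \in C_c^\infty(\Omega)$. Now, for every $\varphi \in C_c^\infty(\Omega)$ we have

$$\begin{aligned}
\int_\Omega D^\beta(D^\alpha u)\varphi\,dx &= (-1)^{|\beta|}\int_\Omega D^\alpha u\,D^\beta\varphi\,dx \\
&= (-1)^{|\beta|+|\alpha|}\int_\Omega u\,D^\alpha D^\beta\varphi\,dx \\
&= (-1)^{|\beta|+|\alpha|}\int_\Omega u\,D^{\alpha+\beta}\varphi\,dx \\
&= (-1)^{|\beta|+|\alpha|+|\alpha+\beta|}\int_\Omega (D^{\alpha+\beta}u)\varphi\,dx \\
&= \int_\Omega (D^{\alpha+\beta}u)\varphi\,dx.
\end{aligned}$$

This yields

$$D^\beta(D^\alpha u) = D^{\alpha+\beta}u,$$

and exchanging $\alpha$ and $\beta$ gives

$$D^\alpha(D^\beta u) = D^{\alpha+\beta}u.$$

For (3), note that the convergence in $L^1_{loc}$ implies convergence in the sense of distributions, which allows us (verify) to write

$$\int_\Omega v\varphi = \int_\Omega \lim(D^\alpha u_n)\varphi = (-1)^{|\alpha|}\int_\Omega (\lim u_n)D^\alpha\varphi = (-1)^{|\alpha|}\int_\Omega u\,D^\alpha\varphi,$$

and the result follows.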
3.1.3 Pointwise Versus Weak Derivatives

A natural question that arises is whether there is a connection between the pointwise derivative and the weak derivative. The answer can be illustrated through the example of the Heaviside function

$$H(x) = \begin{cases} 1 & x \ge 0 \\ 0 & x < 0. \end{cases} \tag{3.1.1}$$

It was demonstrated in Example 2.4.4 that the distributional derivative of $H$ is $\delta$, which is a singular distribution, and so $\delta \notin L^1_{loc}(-1, 1)$. This shows that $H$ is not weakly differentiable. But it can be immediately seen that the usual derivative $H' = 0$ for all $x \ne 0$. We have seen that the weak derivative is identified through integration over the set rather than being calculated pointwise at every point in the set. The previous example shows that discontinuous or piecewise continuous functions can be pointwise differentiable a.e. without being weakly differentiable. In fact, there are examples of continuous functions that are pointwise differentiable almost everywhere but not weakly differentiable (Cantor function). However, if the function is continuously differentiable and its classical derivative exists for all $x$, then the weak derivative exists and they coincide. The next proposition proves this assertion.

Proposition 3.1.3 Let $u \in C^k(\Omega)$, $\Omega \subseteq \mathbb{R}^n$. Then the weak derivatives $D^\alpha u$, $|\alpha| \le k$, exist and are equal to the classical derivatives up to order $k$.

Proof If $u \in C^k(\Omega)$ then $D^\alpha u \in C(\Omega) \subseteq L^1_{loc}(\Omega)$. So, $D^\alpha u$ as a weak derivative exists. Then, for $\varphi \in C_c^\infty(\Omega)$ and $|\alpha| = 1$ we have

$$\int_\Omega \nabla u\,\varphi\,dx = (-1)^{|\alpha|}\int_\Omega u\,\nabla\varphi\,dx = (-1)^{|\alpha|+|\alpha|}\int_\Omega D^\alpha u\,\varphi\,dx.$$

Hence $\nabla u = D^\alpha u$. Similar arguments apply for $k > 1$.
The above proposition tells us that the weak derivative exists for continuously differentiable functions, but the jump discontinuity in the step function (3.1.1) causes serious problems for the existence of weak derivatives. The next result gives a characterization to determine whether a function in $L^1_{loc}$ has its weak derivative also in $L^1_{loc}$.

Proposition 3.1.4 Let $u \in L^1_{loc}(\Omega)$. If there exists $u_j \in C^\infty(\Omega)$ such that $u_j \longrightarrow u$ and $D^\alpha u_j \longrightarrow v_\alpha$ in $L^1_{loc}(\Omega)$, then $D^\alpha u$ exists and $D^\alpha u = v_\alpha \in L^1_{loc}(\Omega)$.

Proof Let $\varphi \in C_c^\infty(\Omega)$. Then we have

$$\left|\int_\Omega (u_j\varphi - u\varphi)\,dx\right| \le \sup|\varphi|\int_{\operatorname{supp}\varphi} |u_j - u|\,dx \to 0.$$

Therefore,

$$\int_\Omega u\,D^\alpha\varphi\,dx = \lim\int_\Omega u_j\,D^\alpha\varphi\,dx = (-1)^{|\alpha|}\lim\int_\Omega D^\alpha u_j\,\varphi\,dx = (-1)^{|\alpha|}\int_\Omega v_\alpha\varphi\,dx,$$

and the result follows.

The next result deals with the general case when $u, u_j \in L^p(\Omega)$.
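Proposition 3.1.5 Let $v \in L^p(\Omega)$ for open $\Omega \subseteq \mathbb{R}^n$ and $p \ge 1$. Consider a sequence $v_j \in L^p(\Omega)$ such that $D^\alpha v_j \in L^p(\Omega)$. If $v_j \longrightarrow v$ in $L^p(\Omega)$ and $D^\alpha v_j \longrightarrow w_\alpha$ in $L^p(\Omega)$, then $D^\alpha v$ exists and $D^\alpha v = w_\alpha$.

Proof The $L^p$ convergence implies that (verify), for all $\varphi \in C_c^\infty(\Omega)$,

$$\lim\int_\Omega D^\alpha v_j\,\varphi\,dx = \int_\Omega w_\alpha\varphi\,dx.$$

On the other hand, the definition gives

$$\lim\int_\Omega D^\alpha v_j\,\varphi\,dx = (-1)^{|\alpha|}\lim\int_\Omega v_j\,D^\alpha\varphi\,dx = (-1)^{|\alpha|}\int_\Omega v\,D^\alpha\varphi\,dx.$$

Comparing the two limits shows that $w_\alpha$ is the $\alpha$th weak derivative of $v$.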
3.1.4 Weak Derivatives and Fourier Transform

We conclude the section with the following transformation property between weak derivatives and powers of the independent variables.

Proposition 3.1.6 Let $u \in L^1_{loc}(\mathbb{R}^n)$, and $\mathcal{F}\{u\}(\omega) = \hat u(\omega)$. Then
(1) $\mathcal{F}\{D_x^\alpha u\} = (i)^{|\alpha|}\omega^\alpha\hat u(\omega)$.
(2) $D_\omega^\alpha(\hat u(\omega)) = (-i)^{|\alpha|}\mathcal{F}\{x^\alpha u(x)\}$.

Proof The proof is similar to Proposition 2.8.2.

This demonstrates that the Fourier transform behaves well with weak derivatives in a similar manner to the usual (and distributional) derivatives, and it preserves the correspondence between the smoothness of the function and the rate of decay of its Fourier transform.
3.2 Regularization and Smoothening

3.2.1 The Concept of Mollification

In light of the discussion in the preceding sections, it turns out that Schwartz spaces play a dominant role in the theory of distributions. We will further obtain interesting results that will lead to essential consequences on distribution theory. The main tool for this purpose is "mollifiers". A well-known result in measure theory is that any function $f$ in $L^p$ can be approximated by a continuous function $g$. If $g$ is smooth, this gives an extra advantage, since $g$ in this case can serve as a solution to some differential equation. Since the space of smooth functions $C^\infty$ is a subspace of $C$, we hope that any function $f \in L^p$ can be approximated by a function $g \in C^\infty$, which will play the role of "mollifier". There are two remarks to consider about $g$:

(1) Since our goal is to approximate any $f \in L^p$ by a function $g \in C^\infty$, this function is characterized by $f$, so we can view it as $f_\epsilon$ such that $f_\epsilon \to f$ as $\epsilon \to 0$.

(2) The smoothening process of getting a smooth function out of a continuous (not necessarily smooth) one reminds us of convolutions, in which the performed integration has the effect of smoothening the curve of the function and thus eliminating sharp points on the graph, which justifies the name "mollifier". As the word implies, to mollify an edge means to smoothen it. In general, if $f$ is continuous and $g$ is not, then $f * g$ is continuous. If $f$ is differentiable but $g$ is not, then $f * g$ is differentiable, i.e., $f * g$ will take the better regularity conditions of the two functions.

Mollifiers are functions that can be linked to other functions by convolution to smoothen the resulting function and give it more regularity. Therefore, we may write

$$f_\epsilon = \phi_\epsilon * f \tag{3.2.1}$$

for some particular function $\phi_\epsilon \in C^\infty(\mathbb{R}^n)$ independent of the choice of $f$. To get an idea of $\phi_\epsilon$, note that if we take the Fourier transform of (3.2.1), then we obtain

$$\mathcal{F}\{f_\epsilon\} = \mathcal{F}\{\phi_\epsilon\}\,\mathcal{F}\{f\}. \tag{3.2.2}$$

On the other hand, the continuity of the Fourier transform and the fact that $f_\epsilon \to f$ imply

$$\mathcal{F}\{f_\epsilon\} \to \mathcal{F}\{f\}. \tag{3.2.3}$$
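Combining (3.2.2) and (3.2.3) gives

$$\mathcal{F}\{\phi_\epsilon\} \to 1.$$

But $1 = \mathcal{F}\{\delta\}$, so as $\epsilon \to 0$ we obtain

$$\phi_\epsilon \to \delta.$$

This means that the family of functions $\phi_\epsilon$ plays the role of a delta sequence. So our task is to seek a function $\phi \in C_c^\infty(\mathbb{R}^n)$ such that the family $\phi_\epsilon$ has the following properties:

1. $0 \le \phi_\epsilon \in C_c^\infty(\mathbb{R}^n)$,
2. $\displaystyle\int_{\mathbb{R}^n} \phi_\epsilon = 1$, and
3. $\phi_\epsilon \to \delta$.

As in (2.3.14), we suggest the form

$$\phi_\epsilon(x) = \frac{1}{\epsilon^n}\,\phi\!\left(\frac{x}{\epsilon}\right).$$

For simplicity, let $n = 1$. Indeed, we have

$$\langle \phi_\epsilon, f\rangle = \int \frac{1}{\epsilon}\,\phi\!\left(\frac{x}{\epsilon}\right)f(x)\,dx.$$

Let $x = \epsilon y$. Then,

$$\langle \phi_\epsilon, f\rangle = \int \phi(y)f(\epsilon y)\,dy. \tag{3.2.4}$$

Now, let $\epsilon \to 0$. We have

$$\langle \phi_\epsilon, f\rangle \to \int \phi(y)f(0)\,dy = f(0)\int \phi(y)\,dy.$$

If we set $\int \phi(y)\,dy = 1$ as expected, then

$$\langle \phi_\epsilon, f\rangle \to \int \phi(y)f(0)\,dy = f(0) = \langle \delta, f\rangle.$$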
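Thus, we have $\phi_\epsilon \to \delta$ as desired. To find a smooth function $\phi$, the exponential function serves as a good example, but it needs to have a compact support, i.e., to be zero outside of its compact support. The following example works properly. Let $\varphi : \mathbb{R}^n \to \mathbb{R}$, defined as:

$$\varphi(x) = \begin{cases} c\,\exp\!\left(-\dfrac{1}{1 - \|x\|^2}\right) & \|x\| < 1 \\[2mm] 0 & \|x\| \ge 1, \end{cases} \tag{3.2.5}$$

where $c$ is a number chosen so that

$$\int_{\mathbb{R}^n} \varphi = 1.$$

The function $\varphi$ is smooth and has the closed ball $\overline{B_1(0)}$ as its support. For $n = 1$, we have $\operatorname{supp}(\varphi) = [-1, 1]$. If we define $\varphi_\epsilon$ as in (2.3.14), then $\varphi_\epsilon$ satisfies all properties of $\phi_\epsilon$ (verify), and

$$\operatorname{supp}(\varphi_\epsilon) = \overline{B_\epsilon(0)},$$

i.e., for $n = 1$,

$$\operatorname{supp}(\varphi_\epsilon) = [-\epsilon, \epsilon].$$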
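For $n = 1$ the constant $c$ and the delta-sequence behaviour of $\varphi_\epsilon$ can be explored numerically. The sketch below is an illustration under stated assumptions: the midpoint rule, the sample function $f(x) = \cos x$, and the helper names are choices made here and do not come from the text.

```python
import math

def profile(x):
    """Unnormalized profile exp(-1/(1 - x^2)) for |x| < 1, zero otherwise."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def midpoint(func, a, b, steps=20000):
    """Composite midpoint rule for func on [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))

# Normalizing constant c for n = 1, so that c * profile integrates to one.
c = 1.0 / midpoint(profile, -1.0, 1.0)

def mollifier(x, eps):
    """phi_eps(x) = (1/eps) * phi(x/eps) with phi = c * profile."""
    return c * profile(x / eps) / eps

def mass(eps):
    """Integral of phi_eps over its support [-eps, eps]."""
    return midpoint(lambda x: mollifier(x, eps), -eps, eps)

def pairing_with_cos(eps):
    """Pairing of phi_eps with the sample function f(x) = cos(x), where f(0) = 1."""
    return midpoint(lambda x: mollifier(x, eps) * math.cos(x), -eps, eps)

print("c =", c)
for eps in (1.0, 0.5, 0.1, 0.01):
    print(eps, mass(eps), pairing_with_cos(eps))
```

The mass stays equal to one for every $\epsilon$, while the pairing with $\cos$ moves toward $\cos 0 = 1$ as $\epsilon$ shrinks, in agreement with (3.2.4).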
We are ready to define the mollifier.
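3.2.2 Mollifiers

Definition 3.2.1 (Mollifier) Let $\varphi \in C_c^\infty(\mathbb{R}^n)$ with $\varphi \ge 0$, $\operatorname{supp}(\varphi) = \overline{B_1(0)}$, and $\int_{\mathbb{R}^n}\varphi = 1$. Then, the family of functions $\varphi_\epsilon$ given by

$$\varphi_\epsilon(x) = \frac{1}{\epsilon^n}\,\varphi\!\left(\frac{x}{\epsilon}\right)$$

for all $\epsilon > 0$ is called a mollifier.

Many functions can play the role of $\varphi$, but the function given in (3.2.5) is the standard one, so the family $\varphi_\epsilon$ is called the standard mollifier if $\varphi$ is as given in (3.2.5). It is easy to check that $\varphi_\epsilon$ has the following standard properties:

(1) $\varphi_\epsilon(x) \ge 0$ for all $x \in \mathbb{R}^n$.
(2) $\varphi_\epsilon \in C_c^\infty(\mathbb{R}^n)$, with $\operatorname{supp}(\varphi_\epsilon) = \overline{B_\epsilon(0)}$.
(3) $\displaystyle\int_{\mathbb{R}^n}\varphi_\epsilon = 1$.
(4) $\varphi_\epsilon(x) \to \delta(x)$ as $\epsilon \to 0$.

Mollifiers are defined as $C^\infty$ approximations to the delta distribution. Now, we can mollify a function $f \in L^p(\mathbb{R}^n)$ by convolving it with any mollifier, say the standard $\varphi_\epsilon$, giving rise to another family of functions

$$f_\epsilon = f * \varphi_\epsilon.$$

The family $f_\epsilon$ is called the $f$-mollification. For $\epsilon > 0$, define the following set:

$$\Omega_\epsilon = \{x \in \Omega : d(x, \partial\Omega) > \epsilon\}.$$

Here, $\Omega_\epsilon \subset \Omega$, and it can also be described as

$$\Omega_\epsilon = \{x \in \Omega : \overline{B_\epsilon(x)} \subseteq \Omega\},$$

so it is clear that $\Omega_\epsilon \to \Omega$ as $\epsilon \to 0$. Note that if $f : \Omega \subseteq \mathbb{R}^n \longrightarrow \mathbb{R}$, then

$$f_\epsilon(x) = \int_{\mathbb{R}^n} \varphi_\epsilon(x - y)f(y)\,dy = \int_{B_\epsilon(x)} \varphi_\epsilon(x - y)f(y)\,dy$$

for all $x \in \Omega_\epsilon$. This is the domain of $f_\epsilon$, and so $f_\epsilon = f * \varphi_\epsilon$ in $\Omega_\epsilon$, and

$$\operatorname{supp}(f_\epsilon) \subset \operatorname{supp}(f) + \operatorname{supp}(\varphi_\epsilon) = \operatorname{supp}(f) + \overline{B_\epsilon(0)}.$$

The following theorem discusses the properties of $f_\epsilon$ and the importance of its formulation.

Theorem 3.2.2 Let $f \in L^p$, $1 \le p < \infty$, and let $f_\epsilon = f * \varphi_\epsilon$ for some mollifier $\varphi_\epsilon$, $\epsilon > 0$. Then, we have the following:
(1) If $f \in L^p(\mathbb{R}^n)$ then $f_\epsilon \in C^\infty(\mathbb{R}^n)$, and $D^\alpha f_\epsilon = \varphi_\epsilon * D^\alpha f$ for all $x \in \mathbb{R}^n$.
(2) If $f \in L^p(\Omega)$ then $f_\epsilon \in C^\infty(\Omega_\epsilon)$, and $D^\alpha f_\epsilon = \varphi_\epsilon * D^\alpha f$ for all $x \in \Omega_\epsilon$.
(3) $\|f_\epsilon\|_p \le \|f\|_p$.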
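Proof For (1), we have

$$f_\epsilon = f * \varphi_\epsilon = \int_{\mathbb{R}^n} \varphi_\epsilon(x - y)f(y)\,dy.$$

Observe that

$$D_{x_i}^\alpha\varphi_\epsilon(x - y) = (-1)^{|\alpha|}D_{y_i}^\alpha\varphi_\epsilon(x - y).$$

We obtain the following for all $x \in \mathbb{R}^n$:

$$\begin{aligned}
(D^\alpha f_\epsilon)(x) &= \int_{\mathbb{R}^n} D_x^\alpha\varphi_\epsilon(x - y)f(y)\,dy \\
&= (-1)^{|\alpha|}\int_{\mathbb{R}^n} D_y^\alpha\varphi_\epsilon(x - y)f(y)\,dy \\
&= (-1)^{|\alpha|}(-1)^{|\alpha|}\int_{\mathbb{R}^n} \varphi_\epsilon(x - y)D_y^\alpha f(y)\,dy \\
&= \varphi_\epsilon * D_y^\alpha f \\
&= (D^\alpha f)_\epsilon(x).
\end{aligned}$$

That is, $(D^\alpha f_\epsilon)(x) = (D^\alpha f)_\epsilon(x)$ for all $x \in \mathbb{R}^n$, and therefore $f_\epsilon \in C^\infty(\mathbb{R}^n)$.

For (2), we assume $|\alpha| = 1$; then we can use induction. Choose $h$ small enough so that $x + he_i \in \Omega_\epsilon$. Then

$$\frac{f_\epsilon(x + he_i) - f_\epsilon(x)}{h} = \frac{1}{\epsilon^n}\int_\Omega \frac{1}{h}\left[\varphi\!\left(\frac{x + he_i - y}{\epsilon}\right) - \varphi\!\left(\frac{x - y}{\epsilon}\right)\right]f(y)\,dy = \frac{1}{\epsilon^n}\int_K \frac{1}{h}\left[\varphi\!\left(\frac{x + he_i - y}{\epsilon}\right) - \varphi\!\left(\frac{x - y}{\epsilon}\right)\right]f(y)\,dy$$

for some compact $K \subset \Omega$. As $h \to 0^+$,

$$\frac{1}{h}\left[\varphi\!\left(\frac{x + he_i - y}{\epsilon}\right) - \varphi\!\left(\frac{x - y}{\epsilon}\right)\right] \longrightarrow \frac{1}{\epsilon}\,D_{x_i}\varphi\!\left(\frac{x - y}{\epsilon}\right)$$

uniformly on $K$. Hence, the derivative $D_{x_i}f_\epsilon$ exists, and a similar argument to the above can be made for $D^\alpha f_\epsilon$ to obtain $(D^\alpha f_\epsilon)(x) = (D^\alpha f)_\epsilon(x)$ for all $x \in \Omega_\epsilon$, and therefore $f_\epsilon \in C^\infty(\Omega_\epsilon)$.

For (3), let $f \in L^p(\Omega)$ where $\Omega \subseteq \mathbb{R}^n$. Let $q$ be the conjugate of $p$. Then,

$$\varphi_\epsilon = (\varphi_\epsilon)^{\frac{1}{p} + \frac{1}{q}}.$$

Using Hölder's inequality, we have

$$\begin{aligned}
|f_\epsilon(x)| &= \left|\int \varphi_\epsilon(x - y)f(y)\,dy\right| \\
&\le \int \varphi_\epsilon^{1/q}(x - y)\cdot\varphi_\epsilon^{1/p}(x - y)\,|f(y)|\,dy \\
&\le \left(\int_{B_\epsilon(x)} \varphi_\epsilon(x - y)\,dy\right)^{1/q}\cdot\left(\int_{B_\epsilon(x)} \varphi_\epsilon(x - y)\,|f(y)|^p\,dy\right)^{1/p}.
\end{aligned}$$

But the first integral of the mollifier is equal to 1, and since $\operatorname{supp}(\varphi_\epsilon)$ is compact, we conclude that the second integral exists and is finite. So

$$|f_\epsilon(x)|^p \le \int_{B_\epsilon(x)} \varphi_\epsilon(x - y)\,|f(y)|^p\,dy. \tag{3.2.6}$$

Taking the integral of both sides of (3.2.6) over $\Omega_\epsilon$, we get

$$\|f_\epsilon\|_{L^p(\Omega_\epsilon)}^p = \int_{\Omega_\epsilon} |f_\epsilon(x)|^p\,dx \le \int_{\Omega_\epsilon}\int_{B_\epsilon(x)} \varphi_\epsilon(x - y)\,|f(y)|^p\,dy\,dx,$$

and, by the Fubini Theorem, the above integral can be written as

$$\begin{aligned}
\|f_\epsilon\|_{L^p(\Omega_\epsilon)}^p &\le \int_\Omega |f(y)|^p\left(\int_{B_\epsilon(y)} \varphi_\epsilon(x - y)\,dx\right)dy \\
&\le \int_\Omega |f(y)|^p\,dy\int_{\mathbb{R}^n} \varphi_\epsilon(x - y)\,dx \\
&= \int_\Omega |f(y)|^p\,dy = \|f\|_{L^p(\Omega)}^p,
\end{aligned}$$

and this proves (3). The result can also be proved similarly for the case $\Omega = \mathbb{R}^n$.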
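The inequality $\|f_\epsilon\|_p \le \|f\|_p$ has an exact discrete counterpart that can be run directly. The following sketch replaces the integrals by Riemann sums on a uniform grid; the grid spacing, the radius $\epsilon = 0.2$, the sample function $f = \chi_{[0,1]}$, and the helper names are assumptions made for illustration.

```python
import math

def profile(x):
    """Unnormalized mollifier profile supported in [-1, 1]."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

h = 0.005                      # grid spacing
eps = 0.2                      # mollification radius
radius = int(round(eps / h))   # kernel half-width in grid points

# Discrete mollifier weights, normalized so that they sum to one.
raw = [profile(j * h / eps) for j in range(-radius, radius + 1)]
total = sum(raw)
weights = [w / total for w in raw]

# Samples of f = indicator of [0, 1] on a grid covering [-1, 2].
grid = [-1.0 + k * h for k in range(601)]
f = [1.0 if 0.0 <= x <= 1.0 else 0.0 for x in grid]

def mollify(samples):
    """Discrete analogue of f_eps = f * phi_eps (samples vanish outside the grid)."""
    out = []
    for k in range(len(samples)):
        acc = 0.0
        for j in range(-radius, radius + 1):
            idx = k - j
            if 0 <= idx < len(samples):
                acc += weights[j + radius] * samples[idx]
        out.append(acc)
    return out

def lp_norm(samples, p):
    """Riemann-sum approximation of the L^p norm."""
    return (h * sum(abs(s) ** p for s in samples)) ** (1.0 / p)

f_eps = mollify(f)
for p in (1, 2, 4):
    print(p, lp_norm(f_eps, p), lp_norm(f, p))
```

For $p = 1$ the two norms coincide up to rounding because $f \ge 0$ and the weights sum to one; for $p > 1$ the mollified norm is strictly smaller, since averaging flattens the jump of $f$.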
3.2.3 Cut-Off Function

Another type of mollification that is important in applications is when $f$ is the characteristic function $\chi_A$. This can be achieved using cut-off functions.

Theorem 3.2.3 Let $U$ be an open set in $\mathbb{R}^n$. Then for every compact set $K \subset U$, there exists a function $\xi(x) \in \mathcal{D}(U)$ such that $0 \le \xi \le 1$ with $\xi(x) \equiv 1$ in a neighborhood of $K$ and $\operatorname{supp}(\xi) \subset U$.

Proof For $\epsilon > 0$, consider the following set:

$$K_\epsilon = \{x : \operatorname{dist}(x, K) \le \epsilon\}.$$

Obviously, the set $K_\epsilon$ contains $K$, and

$$K_\epsilon = \bigcup_{x \in K} \overline{B_\epsilon(x)},$$

and let $\epsilon$ be small enough that $K_{2\epsilon} \subset U$. Consider the characteristic function

$$\chi_{K_\epsilon}(x) = \begin{cases} 1 & x \in K_\epsilon \\ 0 & x \notin K_\epsilon. \end{cases}$$

Now, define the function

$$\xi(x) = (\varphi_\epsilon * \chi_{K_\epsilon})(x),$$

where $\varphi_\epsilon(x)$ is the standard mollifier. Then, by Theorem 3.2.2, $\xi \in C^\infty$ is smooth, and

$$\operatorname{supp}(\xi) \subset K_\epsilon + \overline{B_\epsilon(0)} = K_{2\epsilon} \subset U.$$

Therefore, we have $\xi(x) \in \mathcal{D}(\mathbb{R}^n)$. Moreover, $0 \le \xi \le 1$ and, since $x - y \in K_\epsilon$ whenever $x \in K$ and $|y| \le \epsilon$,

$$\xi(x) = \int \varphi_\epsilon(y)\chi_{K_\epsilon}(x - y)\,dy = 1 \quad \forall x \in K.$$

The previous theorem guarantees the existence of a function $\xi \in \mathcal{D}(\mathbb{R}^n)$ such that $\xi \equiv 1$ on $B_1(0)$. The function $\xi$ is known as a "cut-off function". One can simply formulate it as a smooth function with $0 \le \xi \le 1$ and

$$\xi(x) = \begin{cases} 1 & |x| \le 1 \\ 0 & |x| \ge 2. \end{cases} \tag{3.2.7}$$

Now, if we consider the sequence

$$\xi_m(x) = \xi\!\left(\frac{x}{m}\right),$$

then we see that $\xi_m(x) = 1$ on $B_m(0)$, and

$$\xi_m(x) = \begin{cases} 1 & |x| \le m \\ 0 & |x| \ge 2m. \end{cases}$$

Now, one can use the Dominated Convergence Theorem to show that if $f \in L^p(\mathbb{R}^n)$, then $\|\xi_mf - f\|_p \longrightarrow 0$ in $L^p(\mathbb{R}^n)$. One important advantage of using cut-off functions over characteristic functions is that truncating smooth functions with a cut-off function preserves the smoothness, so if $f \in C^\infty(\Omega)$ then $f\xi \in C^\infty(\Omega)$, while truncating them with a characteristic (indicator) function of the form $\chi_{K_m}(x)$ may produce jump discontinuities at the boundary of $K_m$, so $f\chi_{K_m}$ won't be smooth on $\Omega$.
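The construction $\xi = \varphi_\epsilon * \chi_{K_\epsilon}$ can be carried out numerically in one dimension. The sketch below takes $K = [-1, 1]$ and $\epsilon = 0.25$ as sample data and approximates the convolution by a midpoint quadrature whose weights are normalized to sum to one; these choices and all names are assumptions for illustration.

```python
import math

def profile(x):
    """Unnormalized mollifier profile supported in [-1, 1]."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

EPS = 0.25
STEPS = 2000
H = 2.0 * EPS / STEPS
NODES = [-EPS + (k + 0.5) * H for k in range(STEPS)]   # midpoints in (-EPS, EPS)
RAW = [profile(y / EPS) for y in NODES]
TOTAL = sum(RAW)
WEIGHTS = [w / TOTAL for w in RAW]   # quadrature weights of phi_eps, summing to one

def indicator_k_eps(x):
    """Characteristic function of K_eps = [-1 - EPS, 1 + EPS] for K = [-1, 1]."""
    return 1.0 if abs(x) <= 1.0 + EPS else 0.0

def cutoff(x):
    """xi(x) = (phi_eps * chi_{K_eps})(x), evaluated by midpoint quadrature."""
    return sum(w * indicator_k_eps(x - y) for w, y in zip(WEIGHTS, NODES))

for x in (0.0, 1.0, 1.2, 1.3, 1.4, 1.5, 2.0):
    print(x, cutoff(x))
```

The printed values equal one on $K$, decrease smoothly across the transition layer $1 < |x| < 1 + 2\epsilon$, and vanish outside $K_{2\epsilon} = [-1.5, 1.5]$.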
3.2.4 Partition of Unity

The third regularizing tool that will be studied is the partition of unity, which can be obtained by use of cut-off functions. This is an important tool to globalize local approximations.

Definition 3.2.4 (Locally Finite Cover) Let $M$ be a manifold in $\mathbb{R}^n$, and let $\{U_i\}$ be a collection of subsets of $\mathbb{R}^n$. Then $\{U_i\}$ is said to be a locally finite cover of $M$ if for every $x \in M$, there exists a neighborhood $N(x)$ of $x$ such that $N(x) \cap U_i = \emptyset$ for all but a finite number of indices $i$.

In other words, a locally finite cover means that for every point we can find a neighborhood intersecting at most finitely many of the sets in the cover. A topological space is called paracompact if every open cover admits a locally finite refinement. All second-countable locally compact Hausdorff spaces are paracompact, hence $\mathbb{R}^n$ is paracompact, i.e., every open cover $\{U_\alpha\}$ of a manifold in $\mathbb{R}^n$ has a locally finite refinement $\{V_i\}$. If we apply this result again on $\{V_i\}$, we obtain another cover $\{W_i\}$, and it can be shown that $W_i \subset \overline{W_i} \subset V_i$.

Definition 3.2.5 (Subordinate) Let $F_1 = \{V_i, i \in \Lambda\}$ and $F_2 = \{U_i, i \in \Lambda\}$ be two families of sets. Then we say that $F_1$ is subordinate to $F_2$ if $V_i \subset U_i$ for all $i \in \Lambda$.

Note here that if the two families are covers for some set $M$, then we say that $F_1$ is a refinement of $F_2$ and a subcover of it.

Definition 3.2.6 (Partition of Unity) Let $M$ be a manifold in $\mathbb{R}^n$, and $\{\phi_i : M \to \mathbb{R} : i \in I\}$ be a collection of nonnegative smooth functions. Then $\{\phi_i\}$ is a partition of unity on $M$ if
(1) $0 \le \phi_i \le 1$ for all $i$,
(2) $\displaystyle\sum_i \phi_i(x) = 1$ for all $x \in M$.

In light of the definition, one can also describe the functions as $\phi_i \in C^\infty(M)$, $\phi_i : M \to [0, 1]$. The partition of unity can be used to "glue" local smooth paths to obtain a global smooth one, and this will allow us to approximate nonsmooth functions by smooth functions. It is worth noting that the set $I$ is not necessarily countable, but condition (2) implies that the summation is only over countably many indices $i \in I' \subset I$, where $I'$ is a countable set, so we can WLOG assume that $I$ is countable. Consequently, we have $\phi_i(x) = 0$ for all but countably many $i \in I$. If we want the summation to be over a finite number of indices, then we need extra conditions.

Proposition 3.2.7 Let $M$ be a manifold in $\mathbb{R}^n$, $\{U_i\}$ be a locally finite cover of $M$, and let $\{\xi_i\}$ be a sequence of cut-off functions defined on $M$. If the subcover $\{\operatorname{supp}(\xi_i)\}$ is subordinate to $\{U_i\}$, then for each $x \in M$, we have $\xi_i(x) = 0$ for all but finitely many $i \in I$.

Proof By the local finiteness of $\{U_i\}$, for every $x \in M$ we have $x \notin \operatorname{supp}(\xi_i)$ for all but finitely many $\xi_i$.

Another way of saying that the subcover $\{\operatorname{supp}(\phi_i)\}$ is subordinate to the open cover $\{U_i\}$ is to say that $\{\phi_i\}$ is subordinate to $\{U_i\}$. Note that if $\{\operatorname{supp}(\phi_i)\}$ is subordinate to a locally finite cover $\{U_i\}$, then $\{\operatorname{supp}(\phi_i)\} = \{\overline{\phi_i^{-1}(0, 1]}\}$ is also locally finite. A partition of unity is said to be locally finite on a set $M$ if $\{\operatorname{supp}(\phi_i)\}$ is a locally finite cover of $M$. Now, we discuss the existence of this partition.

Theorem 3.2.8 Let $\Omega$ be an open set in $\mathbb{R}^n$. Then there exists a locally finite smooth partition of unity $\{\phi_i\}$ on $\Omega$.

Proof Consider the collection $\{\Omega_i\}_{i \in \mathbb{N}}$ of subsets of $\Omega$ defined by $\Omega_0 = \emptyset$, and for $i \ge 1$,

$$\Omega_i = B_i(0) \cap \left\{x \in \Omega : d(x, \partial\Omega) > \frac{1}{i}\right\}.$$

Then for each $i$, we have: $\Omega_i$ is open, $\overline{\Omega_i}$ is compact, $\overline{\Omega_i} \subset \Omega_{i+1}$, and

$$\bigcup_i \Omega_i = \Omega.$$

Furthermore, for every $x \in \Omega$, there exists $N \in \mathbb{N}$ such that $x \in \overline{\Omega_{N+2}} \setminus \Omega_{N+1}$, which is clearly a compact subset of the open set $\Omega_{N+3} \setminus \overline{\Omega_N}$. Thus, let

$$K_i = \overline{\Omega_{i+2}} \setminus \Omega_{i+1} \quad\text{and}\quad U_i = \Omega_{i+3} \setminus \overline{\Omega_i}.$$

Then clearly, $\{K_i\}$ and $\{U_i\}$ are collections of compact sets and open sets, respectively, and for each $i$, $K_i \subset U_i$. It is easy to see from the construction of $\{U_i\}$ that it is a locally finite cover of $\Omega$ (for example, if $i < r < i + 1$ then $B_r(0)$ will intersect $U_i$, $U_{i-1}$, and $U_{i-2}$ at most). By Theorem 3.2.3, there exists a sequence of cut-off functions $\{\xi_i\}_{i \in \mathbb{N}}$ such that for each $i$, we have $\xi_i(x) \in \mathcal{D}(\Omega)$, $0 \le \xi_i \le 1$ with $\xi_i(x) \equiv 1$ on $K_i$ and $\operatorname{supp}(\xi_i) \subset U_i$. This implies by Definition 3.2.5 that the sequence $\{\xi_i\}$ is subordinate to $\{U_i\}$, hence it is locally finite. By Proposition 3.2.7, the summation $\sum_i \xi_i(x)$ is finite for every $x \in \Omega$ (only three nonzero terms), so we set

$$\xi(x) = \sum_i \xi_i(x).$$

Then the function $\xi$ is well-defined and smooth. Now, we define the following sequence $\phi_i \in \mathcal{D}(\Omega)$ given by

$$\phi_i(x) = \frac{\xi_i(x)}{\xi(x)}.$$

It is clear that $0 \le \phi_i \le 1$ and $\sum_i \phi_i(x) = 1$ for all $x \in \Omega$. So this is the desired smooth partition of unity.
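The normalization step $\phi_i = \xi_i/\xi$ can be tried out in a simplified one-dimensional setting. The sketch below is not the construction used in the proof: it assumes the hypothetical cover $U_i = (i - 1, i + 1)$ of the real line by intervals centered at the integers and uses plain bump functions in place of the cut-off functions $\xi_i$.

```python
import math

def bump(x):
    """Smooth function, positive on (-1, 1) and zero elsewhere."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def xi(i, x):
    """Bump supported in [i - 1, i + 1], subordinate to the assumed U_i = (i - 1, i + 1)."""
    return bump(x - i)

def partition(x, indices):
    """Values phi_i(x) = xi_i(x) / sum_j xi_j(x) for i in indices."""
    values = [xi(i, x) for i in indices]
    total = sum(values)   # positive: every x lies within 1/2 of some integer
    return [v / total for v in values]

indices = range(-5, 6)
for x in (-3.7, -0.5, 0.0, 1.25, 4.0):
    phis = partition(x, indices)
    active = sum(1 for p in phis if p > 0.0)
    # Columns: point, sum of the phi_i, number of nonzero terms.
    print(x, sum(phis), active)
```

At each sample point the values sum to one and only a few terms are nonzero, which is the local finiteness that makes the sum $\xi$ well-defined.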
The three tools that we studied: mollifiers, cut-off functions, and partition of unity, are among the most powerful tools that can be used to establish density and approximation results. Mollifiers provide a mollification (smoothening the edges), and cut-off functions provide compact support, and finally, the partition of unity enables us to pass from local property to global.
3.2.5 Fundamental Lemma of Calculus of Variations

We conclude the section with a proof of the well-known Fundamental Lemma of Calculus of Variations.

Lemma 3.2.9 (Fundamental Lemma of Calculus of Variations) Let $u \in C^2(\Omega)$ satisfy

$$\int_\Omega uv\,dx = 0$$

for all $v \in C^2(\Omega)$ with $v(\partial\Omega) = 0$. Then $u \equiv 0$ on $\Omega$.

Proof If not, then (replacing $u$ by $-u$ if necessary) there exists a neighborhood $N \subset \Omega$ such that $u > 0$ on $N$. Let $K$ be a compact subset of $N$ and define a cut-off function $v = \xi(x)$ with $\xi = 1$ on $K$, $0 \le \xi \le 1$, and $\operatorname{supp}(\xi) \subseteq N$. This gives $u\xi \ge 0$ on $N$ and $u\xi > 0$ on $K$. Hence

$$\int_\Omega uv \ge \int_N u\xi > 0,$$

which contradicts the assumption.
3.3 Density of Schwartz Space

3.3.1 Convergence of Approximating Sequence

One useful application of mollifiers is to approximate a function by another function of the same regularity but with compact support. For example, let $u \in C(\mathbb{R}^n)$, then the sequence

$$u_j(x) = u(x)\xi_j(x) \in C_c(\mathbb{R}^n)$$

and $u_j \longrightarrow u$ in $C$. Hence, we begin our density results with the following well-known result in real analysis, which implies that a continuous function can always be approximated by another continuous function of compact support.

Theorem 3.3.1 $C_c^\infty(\mathbb{R}^n)$ is dense in $C(\mathbb{R}^n)$.

Recall that $\varphi \in C^\infty(\mathbb{R}^n)$ is said to be rapidly decreasing if

$$|\varphi|_{\alpha,\beta} = \sup_{x \in \mathbb{R}^n}\left|x^\alpha D^\beta\varphi(x)\right| < \infty \quad \forall \alpha, \beta \in \mathbb{N}_0^n, \tag{3.3.1}$$

which is equivalent to

$$\sup_{x \in \mathbb{R}^n}\,(1 + |x|)^k\left|D^\beta\varphi(x)\right| < \infty \tag{3.3.2}$$

for all $k \in \mathbb{N}$ and $\beta \in \mathbb{N}_0^n$, $|\beta| \le k$. The Schwartz space $\mathcal{S}(\mathbb{R}^n)$ is the space

$$\mathcal{S}(\mathbb{R}^n) = \{u \in C^\infty(\mathbb{R}^n) \text{ such that } |u|_{\alpha,\beta} < \infty \text{ for all } \alpha, \beta \in \mathbb{N}_0^n\}.$$

This section discusses some interesting properties of this space and how it can be used to construct other function spaces. The Schwartz space $\mathcal{S}(\mathbb{R}^n)$ has three significant properties that make it very rich in construction and important in applications. These properties are:

(1) $\mathcal{S}(\mathbb{R}^n)$ is closed under differentiation and multiplication by polynomials.
(2) $\mathcal{S}(\mathbb{R}^n)$ is dense in $L^p(\mathbb{R}^n)$.
(3) The Fourier transform preserves $\mathcal{S}(\mathbb{R}^n)$, i.e., it carries $\mathcal{S}(\mathbb{R}^n)$ onto itself.
3 Theory of Sobolev Spaces
The first property is clear from the previous chapter. To demonstrate the other properties, we need the following theorem.
Theorem 3.3.2 If $u \in L^p(\Omega)$ for some $\Omega \subseteq \mathbb{R}^n$ and $1 \le p < \infty$, then $u_\varepsilon \to u$ in $L^p(\Omega)$.
Proof We will prove the case when $\Omega = \mathbb{R}^n$. By Theorem 3.3.1, or alternatively Lusin's Theorem, the function $u$ can be approximated by a function in $C_c(\mathbb{R}^n)$, i.e., for every $\sigma > 0$, there exists $v \in C_c(\mathbb{R}^n)$ such that $\|u - v\|_p$
0. Hence, v ∈ Cc∞ (Rn ). Moreover, since g is continuous on a compact support, it is uniformly continuous on K . So there exists ρ > 0 such that if x − y < ρ, then |v(x − y) − v(y)|
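0. Then, we have the following:
(1) If $u \in W^{k,p}(\mathbb{R}^n)$ then $u_\varepsilon \in C^\infty(\mathbb{R}^n)$, and $D^\alpha u_\varepsilon = \varphi_\varepsilon * D^\alpha u = (D^\alpha u)_\varepsilon$ for all $x \in \mathbb{R}^n$.
(2) If $u \in W^{k,p}(\Omega)$ then $u_\varepsilon \in C^\infty(\Omega_\varepsilon)$, and $D^\alpha u_\varepsilon = \varphi_\varepsilon * D^\alpha u = (D^\alpha u)_\varepsilon$ for all $x \in \Omega_\varepsilon$.
(3) $\|u_\varepsilon\|_{k,p} \le \|u\|_{k,p}$.
Proof (1) and (2) follow immediately from Theorem 3.2.2, since every $u$ in $W^{k,p}(\mathbb{R}^n)$ is in $L^p(\mathbb{R}^n)$. For (3), Theorem 3.2.2 proved that
$$\|u_\varepsilon\|_p \le \|u\|_p. \tag{3.5.3}$$
On the other hand, (2) demonstrated that $(D^\alpha u_\varepsilon)(x) = (D^\alpha u)_\varepsilon(x)$. Since $u \in W^{k,p}$, we have $D^\alpha u \in L^p$. Then by Theorem 3.2.2
$$\|(D^\alpha u)_\varepsilon\|_p \le \|D^\alpha u\|_p. \tag{3.5.4}$$
Now (3) follows from (3.5.3) and (3.5.4). The result can also be proved similarly for the case $\Omega = \mathbb{R}^n$.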
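The contraction estimate $\|u_\varepsilon\|_p \le \|u\|_p$ can be checked on a discretized example. The sketch below is only an illustration under assumptions of our own: the step function `u`, the grid size `h`, and the discrete normalization of the mollifier weights are hypothetical choices, and the discrete convolution stands in for $\varphi_\varepsilon * u$.

```python
import math

def mollifier_weights(eps, h):
    """Samples of the standard mollifier profile on (-eps, eps), normalized to sum to 1."""
    r = int(eps / h)
    raw = []
    for j in range(-r, r + 1):
        t = j * h / eps
        d = 1.0 - t * t
        raw.append(math.exp(-1.0 / d) if d > 0.0 else 0.0)
    total = sum(raw)
    return [w / total for w in raw]

def mollify(values, weights):
    """Full discrete convolution of the samples (extended by zero) with the weights."""
    r = len(weights) // 2
    out = [0.0] * (len(values) + 2 * r)
    for i, v in enumerate(values):
        if v != 0.0:
            for j, w in enumerate(weights):
                out[i + j] += w * v
    return out

def lp_norm(values, h, p):
    """Riemann-sum approximation of the L^p norm."""
    return (h * sum(abs(v) ** p for v in values)) ** (1.0 / p)

h = 0.01
grid = [-2.0 + k * h for k in range(400)]
# Hypothetical discontinuous function: -0.5 on [-1, 0), 1 on [0, 1), 0 elsewhere.
u = [1.0 if 0.0 <= x < 1.0 else (-0.5 if -1.0 <= x < 0.0 else 0.0) for x in grid]

u_eps = mollify(u, mollifier_weights(0.2, h))
norms = {p: (lp_norm(u_eps, h, p), lp_norm(u, h, p)) for p in (1, 2, 4)}
largest_step = max(abs(b - a) for a, b in zip(u_eps, u_eps[1:]))
```

Each entry of `norms` pairs the norm of the mollified samples with the norm of the original samples, and `largest_step` shows that the jumps of `u` have been smoothed out.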
3.5.6 $W_0^{k,p}(\Omega)$
One of the important Sobolev spaces is the so-called "zero-boundary Sobolev space". This is defined in most textbooks as the closure (i.e., completion) of the space $C_c^\infty$. However, since we haven't yet discussed approximation results, we shall adopt for the time being an equivalent definition which may seem to be a bit more natural.
Definition 3.5.8 (Zero-Boundary Sobolev Space) Let $\Omega \subseteq \mathbb{R}^n$. Then, the zero-boundary Sobolev space, denoted by $W_0^{k,p}(\Omega)$, is defined by
$$W_0^{k,p}(\Omega) = \{u \in W^{k,p}(\Omega) \text{ such that } D^\alpha u|_{\partial\Omega} = 0 \text{ for all } 0 \le |\alpha| \le k - 1\}.$$
In words, the Sobolev functions in $W_0^{k,p}(\Omega)$ together with all their weak derivatives up to order $k - 1$ vanish on the boundary. In particular,
$$W_0^{1,p}(\Omega) = \{u \in W^{1,p}(\Omega) \text{ such that } u|_{\partial\Omega} = 0\}.$$
Two advantages of this property, as we shall see later, are that:
1. the regularity of the boundary is not necessary, and
2. extensions can be easily constructed outside $\Omega$.
Functions in $W_0^{k,p}(\Omega)$ are thus important in the theory of PDEs since they naturally satisfy the Dirichlet condition on the boundary $\partial\Omega$.
Proposition 3.5.9 For any $\Omega \subseteq \mathbb{R}^n$, the space $W_0^{k,p}(\Omega)$ is Banach, separable, and reflexive, for every $k \ge 0$, $1 \le p \le \infty$.
Proof It suffices to prove that $W_0^{1,p}(\Omega)$ is closed in $W^{1,p}(\Omega)$. The proofs for separability and reflexivity are very similar to those for $W^{1,p}(\Omega)$. The details are left to the reader as an exercise.
3.6 $W^{1,p}(\Omega)$
The space $W^{1,p}(\Omega)$ is particularly important since it provides the basis for several properties of Sobolev spaces. Sometimes it is simpler to prove certain properties in $W^{1,p}(\Omega)$ because the techniques involved become more complicated when dealing with $k > 1$, so we establish the result for $W^{1,p}(\Omega)$ knowing that it can be extended to general $k$ by induction. The next result demonstrates one of the features of this space that distinguish it from other Sobolev spaces. We consider the simplest type of this space, which is $W^{1,1}(I)$, where $I$ is an interval in $\mathbb{R}$. The Sobolev norm in this case takes the form
$$\|u\|_{1,1} = \int_I |u| \, dx + \int_I |u'| \, dx.$$
The following theorem gives a relation between classical derivatives and weak derivatives.
3.6.1 Absolute Continuity Characterization
Theorem 3.6.1 Let $I$ be an interval in $\mathbb{R}$, and let $u \in L^1(I)$ have weak derivative $u'$. Then $u \in W^{1,1}(I)$ if and only if there exists an absolutely continuous representative $\tilde{u} \in C(I)$ in $W^{1,1}(I)$ such that
$$\tilde{u}(x) = c + \int_a^x u'(t) \, dt$$
for every $x \in I$ and for some constant $c$.
Proof Choose $a \in I$ and define
$$\tilde{u}(x) = \int_a^x u'(t) \, dt.$$
Since $u \in W^{1,1}(I)$, $u' \in L^1(I)$, so $\tilde{u} \in L^1(I)$, and hence $\tilde{u} \in C(I)$ is absolutely continuous and $(\tilde{u})' = u'$, and consequently $\tilde{u} \in W^{1,1}(I)$. We have $(u - \tilde{u})' = 0$, and so $u - \tilde{u} = c$. Conversely, let $\tilde{u} \in C(I)$ be an absolutely continuous version of $u$ such that $(\tilde{u})' = u'$. Let $\varphi \in C_c^\infty(I)$ with $\mathrm{supp}(\varphi) \subseteq I$. Performing integration by parts gives
$$\int_a^b u' \varphi = -\int_I u \varphi'.$$
This implies that $Du$ exists, and since $u$ is absolutely continuous, we have by the Fundamental Theorem of Calculus $Du = u' \in L^1(I)$.
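As a concrete check of the theorem, one may take $u(x) = |x|$ on $I = (-1, 1)$, whose weak derivative is the sign function. The sketch below is an illustration only: the test function `phi`, the midpoint quadrature, and the grid sizes are hypothetical choices, not part of the text.

```python
import math

def u(x):
    return abs(x)

def du(x):
    """Candidate weak derivative of |x| (its value at 0 is irrelevant)."""
    return 1.0 if x > 0 else -1.0

def phi(x):
    """Hypothetical test function with compact support in (-1, 1)."""
    d = 1.0 - x * x
    return (1.0 + x) * math.exp(-1.0 / d) if d > 0.0 else 0.0

def dphi(x):
    """Classical derivative of phi."""
    d = 1.0 - x * x
    if d <= 0.0:
        return 0.0
    return math.exp(-1.0 / d) * (1.0 - (1.0 + x) * 2.0 * x / (d * d))

def midpoint(f, a, b, n):
    """Composite midpoint rule on [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

# Integration by parts: the integral of u' phi should equal minus the integral of u phi'.
lhs = midpoint(lambda x: du(x) * phi(x), -1.0, 1.0, 20000)
rhs = -midpoint(lambda x: u(x) * dphi(x), -1.0, 1.0, 20000)

def representative(x, a=-1.0, n=3000):
    """u(a) plus the integral of du from a to x, i.e. the formula of the theorem with c = u(a)."""
    return u(a) + midpoint(du, a, x, n)
```

The two quadratures `lhs` and `rhs` agree up to discretization error, and `representative(x)` recovers $|x|$ from its weak derivative.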
According to the previous theorem, absolute continuity provides a necessary and sufficient condition for the weak and classical derivatives of functions in $L^1(I)$ to coincide. The importance of the previous result lies in the fact that it allows the replacement of any function $u \in W^{1,1}(I)$ by its continuous representative. Loosely speaking, we can view functions in $W^{1,1}(I)$ as absolutely continuous functions. The fact that $(\tilde{u})' = u'$ is expected, since any absolutely continuous function is differentiable a.e. by the Fundamental Theorem of Calculus, and if its weak derivative is in $L^1$ then it equals the classical derivative. It is worth noting that the result does not hold in higher dimensions $\mathbb{R}^n$, $n > 1$, i.e., functions in $W^{1,p}(\Omega)$ for $n \ge 2$ need not be continuous and may be unbounded. We shall see when studying embedding theorems that, under certain conditions, they may coincide with continuous functions almost everywhere.
The next result gives a connection between $W^{1,p}(\Omega)$ and $W_0^{1,p}(\Omega)$.
Proposition 3.6.2 Let $u \in W^{1,p}(\Omega)$, $1 \le p < \infty$. If there exists a compact set $K \subset \Omega$ such that $\mathrm{supp}(u) = K$, then $u \in W_0^{1,p}(\Omega)$.
Proof Let $\Omega'$ be open such that $K \subset \Omega' \subset\subset \Omega$. Define $\xi \in D(\Omega)$ such that $\xi = 1$ on $K$ and $\mathrm{supp}(\xi) \subseteq \Omega' \subset \Omega$. By Theorem 3.3.3, there exists $u_m \in D(\Omega)$ such that $u_m \to u$ in $L^p(\Omega)$, and
$$\frac{\partial u_m}{\partial x_i} \to \frac{\partial u}{\partial x_i}$$
on $\Omega'$ (why?). Then $\xi u_m \in D(\Omega)$, and
$$\|\xi u_m - \xi u\|_{L^p(\Omega)} \le \|\xi\|_\infty \|u_m - u\|_{L^p(\Omega)} \to 0.$$
Further, since $\xi$ vanishes outside $\Omega'$, we have
$$\left\| \frac{\partial(\xi u_m)}{\partial x_i} - \frac{\partial(\xi u)}{\partial x_i} \right\|_{L^p(\Omega)} = \left\| \frac{\partial \xi}{\partial x_i}(u_m - u) + \xi \left( \frac{\partial u_m}{\partial x_i} - \frac{\partial u}{\partial x_i} \right) \right\|_{L^p(\Omega)} \le \left\| \frac{\partial \xi}{\partial x_i} \right\|_\infty \|u_m - u\|_{L^p(\Omega)} + \|\xi\|_\infty \left\| \frac{\partial u_m}{\partial x_i} - \frac{\partial u}{\partial x_i} \right\|_{L^p(\Omega')} \to 0.$$
Therefore, $\xi u_m \to \xi u = u$ in $W^{1,p}(\Omega)$. Since $\mathrm{supp}(\xi u_m) \subset \Omega'$, we have $\xi u_m \in W_0^{1,p}(\Omega)$. By completeness, $u \in W_0^{1,p}(\Omega)$.
Remark The result of the previous proposition can be easily extended to $W^{k,p}(\Omega)$.
Similar to Lemma 2.10.1, we have the following result.
Proposition 3.6.3 Let $u \in W^{1,p}(\mathbb{R}^n)$ for $1 \le p < \infty$, and $f \in L^1(\mathbb{R}^n)$. Then $u * f \in W^{1,p}(\mathbb{R}^n)$ and $D_{x_i}(f * u) = f * D_{x_i} u$.
Proof It is clear that $u * f \in L^p(\mathbb{R}^n)$. Let $\varphi \in D(\mathbb{R}^n)$. Then
$$\int_{\mathbb{R}^n} D_{x_i}(f * u)\varphi = -\int_{\mathbb{R}^n} (f * u) D_{x_i}\varphi = -\int_{\mathbb{R}^n} u \, (f * D_{x_i}\varphi) = -\int_{\mathbb{R}^n} u \, D_{x_i}(f * \varphi) = \int_{\mathbb{R}^n} (D_{x_i} u)(f * \varphi) = \int_{\mathbb{R}^n} (f * D_{x_i} u)\varphi.$$
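3.6.2 Inclusions
Proposition 3.6.4 (Inclusion Results) Let $\Omega \subset \mathbb{R}^n$. Then the following inclusions hold:
1. If $k_1, k_2 \in \mathbb{N}$ are such that $0 \le k_1 < k_2$, then $W^{k_2,p}(\Omega) \subset W^{k_1,p}(\Omega)$.
2. If $\Omega' \subset \Omega$ then $W^{k,p}(\Omega) \subset W^{k,p}(\Omega')$.
3. If $\Omega$ is bounded and $q \ge p$ then $W^{k,q}(\Omega) \subset W^{k,p}(\Omega)$.
4. If $\Omega$ is bounded then $W_0^{k,p}(\Omega) \subset W^{k,p}(\Omega) \subset W_{loc}^{k,p}(\Omega) \subset L_{loc}^p(\Omega)$.
5. For all $k \in \mathbb{N}$, we have $C^\infty(\Omega) \subset C^k(\Omega) \subset W^{k,p}(\Omega)$.
6. For all $k \in \mathbb{N}$, we have $C_c^\infty(\Omega) \subset C_c^k(\Omega) \subset W_0^{k,p}(\Omega)$.
Proof The proofs follow directly from the definitions of the spaces.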
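3.6.3 Chain Rule
The next result is a continuation of the calculus of weak derivatives. We have established various types of derivative formulas, and it remains to establish the chain rule, which plays an important role when we discuss the extension of Sobolev functions.
Theorem 3.6.5 Let $u \in W^{1,p}(\Omega)$, $1 \le p \le \infty$, and $F \in C^1(\mathbb{R})$ such that $|F'| \le M$. Then
$$\frac{\partial}{\partial x_j}(F \circ u) = F'(u) \cdot \frac{\partial u}{\partial x_j}.$$
Moreover, if $\Omega$ is bounded or $F(0) = 0$, then $F \circ u \in W^{1,p}(\Omega)$.
Proof Let $1 \le p < \infty$. Since $\frac{\partial u}{\partial x_i} \in L^p(\Omega)$ for $1 \le i \le n$, and $F' \in C(\mathbb{R})$,
$$\int_\Omega \left| F'(u) \frac{\partial u}{\partial x_i} \right|^p dx \le M^p \int_\Omega \left| \frac{\partial u}{\partial x_i} \right|^p dx < \infty.$$
Hence,
$$F'(u) \frac{\partial u}{\partial x_i} \in L^p(\Omega). \tag{3.6.1}$$
By Theorem 3.3.3, consider the convolution approximating sequence $u_\varepsilon \in C_c^\infty(\Omega)$ converging to $u$ in $L^p(\Omega)$. Similar to the above argument, we also see that
$$\|F(u_\varepsilon) - F(u)\|_{L^p(\Omega)} \le M \|u_\varepsilon - u\|_{L^p(\Omega)} \to 0.$$
Hence, $F(u_\varepsilon) \to F(u)$ in $L^p(\Omega)$. Furthermore, $u_\varepsilon \to u$ a.e. on $\Omega$, and since $F'$ is continuous, $F'(u_\varepsilon) \to F'(u)$ a.e. on $\Omega$. We also have $|F'| \le M$ on $\Omega$, so we apply the Dominated Convergence Theorem to obtain $F'(u_\varepsilon) \to F'(u)$ in $L^p(\Omega)$. Note that
$$\frac{\partial u_\varepsilon}{\partial x_i} = \left( \frac{\partial u}{\partial x_i} \right)_\varepsilon,$$
and since $\frac{\partial u}{\partial x_i} \in L^p(\Omega)$, we have by Theorem 3.3.2
$$\frac{\partial u_\varepsilon}{\partial x_i} \to \frac{\partial u}{\partial x_i} \text{ in } L^p(\Omega).$$
Then
$$\left\| F'(u_\varepsilon) \frac{\partial u_\varepsilon}{\partial x_i} - F'(u) \frac{\partial u}{\partial x_i} \right\|_{L^p(\Omega)} \le M \left\| \frac{\partial u_\varepsilon}{\partial x_i} - \frac{\partial u}{\partial x_i} \right\|_{L^p(\Omega)} + \left\| \left( F'(u_\varepsilon) - F'(u) \right) \frac{\partial u}{\partial x_i} \right\|_{L^p(\Omega)}.$$
Hence,
$$F'(u_\varepsilon) \frac{\partial u_\varepsilon}{\partial x_i} \to F'(u) \frac{\partial u}{\partial x_i}$$
in $L^p(\Omega)$. Now, let $\varphi \in C_c^\infty(\Omega)$, and write $\varepsilon = \varepsilon_n \to 0$ as $n \to \infty$, with $u_n = u_{\varepsilon_n}$. We use the classical chain rule on $F(u_n)$:
$$\int_\Omega F(u) \frac{\partial \varphi}{\partial x_i} \, dx = \lim \int_\Omega F(u_n) \frac{\partial \varphi}{\partial x_i} \, dx = -\lim \int_\Omega \frac{\partial}{\partial x_i}(F(u_n)) \varphi \, dx = -\lim \int_\Omega F'(u_n) \frac{\partial u_n}{\partial x_i} \varphi(x) \, dx = -\int_\Omega F'(u) \frac{\partial u}{\partial x_i} \varphi(x) \, dx.$$
Consequently,
$$\frac{\partial}{\partial x_i} F(u(x)) = F'(u) \frac{\partial u}{\partial x_i},$$
and the result follows by (3.6.1). If $F(0) = 0$, then, using the Mean Value Theorem, we have
$$|F(u)| = |F(u) - F(0)| \le \left| \int_0^u F'(t) \, dt \right| \le M |u|.$$
Integrating over $\Omega$,
$$\|F \circ u\|_{L^p(\Omega)} \le M \|u\|_{L^p(\Omega)} < \infty.$$
So $F \circ u \in L^p(\Omega)$, and so (3.6.1) implies that $F \circ u \in W^{1,p}(\Omega)$. If $\Omega$ is bounded, then
$$\int_\Omega |F(0)|^p \, dx < \infty,$$
so $F(0) \in L^p(\Omega)$. Also,
$$|F(u)| \le |F(u) - F(0)| + |F(0)| \le M|u| + |F(0)|,$$
and since $u \in L^p(\Omega)$, we have $M|u| + |F(0)| \in L^p(\Omega)$, so
$$\int_\Omega \left| M|u| + |F(0)| \right|^p < \infty,$$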
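thus $F \circ u \in L^p(\Omega)$. For $p = \infty$, note that if $u \in W^{1,\infty}(\Omega)$, then $u \in W^{1,p}(\Omega)$ and the chain rule holds for all $p < \infty$.
Corollary 3.6.6 Let $u \in W^{1,p}(\Omega)$ for $\Omega \subset \mathbb{R}^n$, $1 \le p \le \infty$. Then $|u| \in W^{1,p}(\Omega)$ and for all $|\alpha| = 1$
$$D^\alpha(|u|) = \begin{cases} D^\alpha u & u > 0 \\ 0 & u = 0 \\ -D^\alpha u & u < 0. \end{cases}$$
Proof Since $|u| = u^+ + u^-$, where $u^+ = \max(u, 0)$ and $u^- = (-u)^+ = -\min(u, 0)$, it suffices to prove the result for $u^+$. For every $t \in \mathbb{R}$, define
$$F_\varepsilon(t) = \begin{cases} \sqrt{t^2 + \varepsilon^2} - \varepsilon & t > 0 \\ 0 & t \le 0. \end{cases}$$
It is clear that $F_\varepsilon \in C^1(\mathbb{R})$. By the chain rule, $F_\varepsilon(u) \in W^{1,p}(\Omega)$ and
$$D(F_\varepsilon(u)) = \begin{cases} \dfrac{u}{\sqrt{u^2 + \varepsilon^2}} \dfrac{\partial u}{\partial x_i} & u > 0 \\ 0 & u \le 0. \end{cases}$$
It follows that
$$\int_\Omega F_\varepsilon(u) \frac{\partial \varphi}{\partial x_i} \, dx = -\int_{\Omega^+} \varphi \frac{u}{\sqrt{u^2 + \varepsilon^2}} \frac{\partial u}{\partial x_i} \, dx$$
where $\Omega^+ = \{x \in \Omega : u > 0\}$. But in $\Omega^+$ we have $|F_\varepsilon'| < 1$, and we also have
$$F_\varepsilon(u) \to u^+ \quad \text{and} \quad \frac{u}{\sqrt{u^2 + \varepsilon^2}} \to 1 \quad \text{a.e. in } \Omega^+.$$
Applying the Dominated Convergence Theorem,
$$\int_\Omega u^+ \frac{\partial \varphi}{\partial x_i} \, dx = -\int_{\Omega^+} \frac{\partial u}{\partial x_i} \varphi \, dx.$$
Therefore, $u^+ \in W^{1,p}(\Omega)$, and
$$D^\alpha(u^+) = \begin{cases} \dfrac{\partial u}{\partial x_i} & u > 0 \\ 0 & u \le 0. \end{cases} \tag{3.6.2}$$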
A similar argument shows that
$$D^\alpha(u^-) = \begin{cases} 0 & u \ge 0 \\ -\dfrac{\partial u}{\partial x_i} & u < 0. \end{cases}$$
$\} = \{x \in \Omega : B_\varepsilon(x) \subseteq \Omega\}$. Theorem 3.3.2 also established the fact that $u_\varepsilon \to u$ in $L^p(\Omega)$. This gives an approximating sequence in the form of a convolution which approaches $u$ from the interior of the domain. With every fixed $\varepsilon > 0$, we can have some $x$ in the neighborhood $\Omega_\varepsilon \subset \Omega$ such that this interior neighborhood approaches $\Omega$ from inside. This is a "local" type of approximation. This localness necessarily requires a bounded subset $\Omega$. As soon as we consider the whole space $\mathbb{R}^n$, the localness property should disappear.
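3.7 Approximation of Sobolev Spaces
3.7.1 Local Approximation
Proposition 3.7.1 Let $u \in W^{k,p}(\Omega)$ for $1 \le p < \infty$. Then
$$u_\varepsilon \to u \text{ in } W_{loc}^{k,p}(\Omega).$$
If $\Omega = \mathbb{R}^n$ then
$$u_\varepsilon \to u \text{ in } W^{k,p}(\mathbb{R}^n).$$
Proof By definition of Sobolev spaces, $u \in L^p(\Omega)$, so by Theorem 3.3.2, $u_\varepsilon \to u$ in $L^p(\Omega)$. Let $\Omega_\varepsilon = \{x \in \Omega : d(x, \partial\Omega) > \varepsilon\}$ for $\varepsilon > 0$. Then, by Theorem 3.5.7(2), we have $u_\varepsilon \in C^\infty(\Omega_\varepsilon)$, and
$$D^\alpha u_\varepsilon = \varphi_\varepsilon * D^\alpha u \tag{3.7.1}$$
for all $x \in \Omega_\varepsilon$. Therefore $u_\varepsilon \in W^{k,p}(\Omega_\varepsilon)$. Choosing any compact set $\Omega' \subset\subset \Omega$ such that $\Omega' \subset \Omega_\varepsilon$ for some small $\varepsilon > 0$, then for all $|\alpha| \le k$ and letting $\varepsilon \to 0^+$, we have
$$\|D^\alpha u_\varepsilon - D^\alpha u\|_{L^p(\Omega')} = \|(D^\alpha u)_\varepsilon - D^\alpha u\|_{L^p(\Omega')} \to 0.$$
This proves the first part. If $\Omega = \mathbb{R}^n$, then $\Omega_\varepsilon = \mathbb{R}^n$ and (3.7.1) holds for all $x \in \mathbb{R}^n$, so for all $|\alpha| \le k$, we have
$$\|D^\alpha(u_\varepsilon) - D^\alpha u\|_{L^p(\mathbb{R}^n)} = \|(D^\alpha u)_\varepsilon - D^\alpha u\|_{L^p(\mathbb{R}^n)} \to 0,$$
i.e., $u_\varepsilon \to u$ in $W^{k,p}(\mathbb{R}^n)$.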
In this sense, every Sobolev function can be locally approximated by a smooth function in the Sobolev norm whenever the values of $x$ are away from the boundary of the original domain. If we make the distance smaller (i.e., $\varepsilon \to 0$), the interior neighborhood $\Omega_\varepsilon$ gets larger and absorbs new values of $x$, until we eventually evaluate the function at the boundary (since $\Omega_\varepsilon \to \Omega$). This is the essence of the localness property. To get a stronger result, we need to globalize the approximation regardless of the neighborhood taken for this approximating process. As pointed out in Sect. 3.2, the partition of unity shall be invoked here.
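3.7.2 Global Approximation
As discussed above, in order to remove the localness property, one needs to work on the whole space. However, one can still obtain global results when considering bounded sets. The following theorem extends our previous result from local to global.
Theorem 3.7.2 (Meyers–Serrin Theorem) For every open set $\Omega \subseteq \mathbb{R}^n$ and $1 \le p < \infty$, we have
$$\overline{C^\infty(\Omega) \cap W^{k,p}(\Omega)} = W^{k,p}(\Omega),$$
where the closure is taken in the Sobolev norm.
Proof We first consider the case $\Omega = \mathbb{R}^n$. Let $u \in W^{k,p}(\mathbb{R}^n)$. So $D^\alpha u \in L^p(\mathbb{R}^n)$ for all $|\alpha| \le k$. Let $\varphi \in C_c^\infty(\mathbb{R}^n)$, and consider the mollification $u_\varepsilon = \varphi_\varepsilon * u$, where $\varphi_\varepsilon$ is the standard mollifier. Then, by Theorem 3.5.7, $u_\varepsilon \in C^\infty(\mathbb{R}^n)$ and $\|u_\varepsilon\|_{k,p} \le \|u\|_{k,p}$, from which we conclude that $D^\alpha u_\varepsilon \in L^p(\mathbb{R}^n)$ and $\|D^\alpha u_\varepsilon - D^\alpha u\|_p \to 0$. Therefore, as $\varepsilon \to 0$,
$$\|u_\varepsilon - u\|_{k,p}^p = \sum_{|\alpha| \le k} \|D^\alpha u_\varepsilon - D^\alpha u\|_p^p \to 0,$$
$u_\varepsilon \in C^\infty(\mathbb{R}^n) \cap W^{k,p}(\mathbb{R}^n)$, and $u_\varepsilon \to u$ in $W^{k,p}(\mathbb{R}^n)$.
Now, let $\Omega$ be open in $\mathbb{R}^n$. Then there exists a smooth locally finite partition of unity $\tilde{\xi}_i \in C_c^\infty(\Omega)$ subordinate to a cover $\{U_i\}$. Let $\varepsilon > 0$, and $u \in W^{k,p}(\Omega)$. Define
$$u_i(x) = \tilde{\xi}_i(x) u(x). \tag{3.7.2}$$
Then by Theorem 3.5.6 $u_i \in W^{k,p}(\Omega)$, so $(u_i)_{\varepsilon_i} \in C^\infty(U_i)$, $\mathrm{supp}((u_i)_{\varepsilon_i}) \subset U_i$, and for small $\varepsilon_i$
$$\|(u_i)_{\varepsilon_i} - u_i\|_{W^{k,p}(\Omega)} \le \frac{\varepsilon}{2^i}.$$
Now define
$$v(x) = \sum_i (u_i)_{\varepsilon_i} = \sum_i (\varphi_{\varepsilon_i} * u_i)(x). \tag{3.7.3}$$
Since for every $x \in \Omega$, $\tilde{\xi}_i(x) = 0$ for all but a finite number of indices $i$, we have the same conclusion for $\sum_i (u_i)_{\varepsilon_i}$, and hence $v \in C^\infty(\Omega)$. Also, note that
$$\sum_i u_i(x) = \sum_i \tilde{\xi}_i(x) u(x) = u(x) \sum_i \tilde{\xi}_i(x) = u(x).$$
Given $\delta > 0$, there exist small $\varepsilon_i$ such that
$$\|v - u\|_{W^{k,p}(\Omega)} = \left\| \sum_i (\varphi_{\varepsilon_i} * u_i)(x) - \sum_i u_i(x) \right\|_{W^{k,p}(\Omega)} \le \sum_i \|(u_i)_{\varepsilon_i}(x) - u_i(x)\|_{W^{k,p}(\Omega)} \le \sum_i \frac{\delta}{2^i} = \delta.$$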
3.7.3 Consequences of Meyers–Serrin Theorem
Theorem 3.7.2 implies that
$$\overline{C^\infty(\Omega)} = W^{k,p}(\Omega) \tag{3.7.4}$$
in the Sobolev norm $\|\cdot\|_{k,p}$, that is, $\|D^\alpha u_n - D^\alpha u\|_{L^p(\Omega)} \to 0$ for all $|\alpha| \le k$. This is a significant advantage over the local approximation. The theorem has several important consequences. One consequence is the following corollary, which is analogous to (3.7.4).
Corollary 3.7.3 $\overline{C_c^\infty(\Omega)} = W_0^{k,p}(\Omega)$ in the Sobolev norm $\|\cdot\|_{k,p}$.
Proof In the proof of the previous theorem, let $u \in W_0^{k,p}(\Omega)$; then the sequence $u_i$ in (3.7.2) belongs to $W_0^{k,p}(\Omega)$, hence by (3.7.3) and the argument thereafter, we have $v \in C_c^\infty(\Omega)$. This gives
$$\overline{C_c^\infty(\Omega) \cap W^{k,p}(\Omega)} = W_0^{k,p}(\Omega).$$
This result serves as an alternative definition of $W_0^{k,p}(\Omega)$.
Definition 3.7.4 (Zero-boundary Sobolev Space) Let $\Omega$ be open in $\mathbb{R}^n$. The space $W_0^{k,p}(\Omega)$ is defined as the closure of $C_c^\infty(\Omega)$ in the Sobolev norm $\|\cdot\|_{k,p}$.
The proof of the previous corollary clearly shows that Definition 3.5.8 implies Definition 3.7.4, whereas Definition 3.7.4 trivially implies Definition 3.5.8; thus the two definitions are equivalent. Another important consequence of Meyers–Serrin is that any Sobolev function on the whole space $\mathbb{R}^n$ can be approximated by a smooth function in the $\|\cdot\|_{k,p}$ norm, i.e. $\overline{C^\infty(\mathbb{R}^n)} = W^{k,p}(\mathbb{R}^n)$ in the Sobolev norm $\|\cdot\|_{k,p}$. The next result has even more to say.
Proposition 3.7.5 $\overline{C_c^\infty(\mathbb{R}^n)} = W^{k,p}(\mathbb{R}^n)$ in the Sobolev norm $\|\cdot\|_{k,p}$ for $1 \le p < \infty$.
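Proof Let $u \in W^{k,p}(\mathbb{R}^n)$. By the Meyers–Serrin Theorem, there exists $u_j \in C^\infty(\mathbb{R}^n)$ such that $u_j \to u$ in the norm $\|\cdot\|_{k,p}$. Consider the cut-off function $\xi(x) \in C_c^\infty(\mathbb{R}^n)$, and define the sequence
$$\xi_j(x) = \xi\left( \frac{x}{j} \right) \in C_c^\infty(\mathbb{R}^n),$$
and set $v_j(x) = u_j(x)\xi_j(x)$. Then $v_j \in C_c^\infty(\mathbb{R}^n)$, and clearly $v_j(x) \to u$ a.e. Differentiating for $|\alpha| = 1$ and $i = 1, 2, \ldots, n$, we obtain
$$\frac{\partial v_j}{\partial x_i} = \xi_j \frac{\partial u_j}{\partial x_i} + \frac{1}{j} \frac{\partial \xi}{\partial x_i}\left( \frac{x}{j} \right) u_j \to \frac{\partial u}{\partial x_i} \quad \text{a.e.}$$
as $j \to \infty$. By the Dominated Convergence Theorem, we can show that
$$\left\| \frac{\partial v_j}{\partial x_i} - \frac{\partial u}{\partial x_i} \right\|_{L^p(\mathbb{R}^n)} \to 0,$$
i.e.,
$$\frac{\partial v_j}{\partial x_i} \to \frac{\partial u}{\partial x_i} \text{ in } L^p(\mathbb{R}^n).$$
A similar argument can be made for higher derivatives up to order $k$ using the Leibniz rule to obtain
$$\|D^\alpha(v_j) - D^\alpha u\|_p \to 0.$$
Thus, $D^\alpha(v_j) \in L^p(\mathbb{R}^n)$ for all $|\alpha| \le k$, and so $v_j \in W^{k,p}(\mathbb{R}^n)$, and $\|v_j - u\|_{k,p} \to 0$.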
Two important results can be inferred from the previous result. It was indicated earlier that the Sobolev space is supposed to be the completion of the Schwartz space. Since
$$D(\mathbb{R}^n) \subseteq S(\mathbb{R}^n) \subseteq W^{k,p}(\mathbb{R}^n),$$
we infer
Corollary 3.7.6 $\overline{S(\mathbb{R}^n)} = W^{k,p}(\mathbb{R}^n)$.
The second result that can be inferred from Corollary 3.7.3 is about the connection between the two Sobolev spaces $W$ and $W_0$. We have the inclusion
$$W_0^{k,p}(\Omega) \subset W^{k,p}(\Omega)$$
as in Proposition 3.6.4(4), but the situation is different when $\Omega = \mathbb{R}^n$.
Corollary 3.7.7 $W^{k,p}(\mathbb{R}^n) = W_0^{k,p}(\mathbb{R}^n)$.
Proof Note that from the proof of Proposition 3.7.5, for every $u \in W^{k,p}(\mathbb{R}^n)$, we can find an approximating sequence
$$v_j \in C_c^\infty(\mathbb{R}^n) \subset W^{k,p}(\mathbb{R}^n).$$
So $v_j \in W_0^{k,p}(\mathbb{R}^n)$. Since $v_j \to u$ in the $\|\cdot\|_{k,p}$ norm, by the completeness of the space we obtain $u \in W_0^{k,p}(\mathbb{R}^n)$.
3.8 Extensions
3.8.1 Motivation
In the previous section, we have obtained our results on bounded sets and in $\mathbb{R}^n$. In general, the behavior of Sobolev functions on the boundary of the domain has always been a critical issue that could significantly affect the properties of the (weak) solutions of a partial differential equation. In this regard, it might be useful sometimes to extend from $W^{k,p}(\Omega)$ to $W^{k,p}(\Omega')$ for some $\Omega \subset \Omega'$, in particular from $W^{k,p}(\Omega)$ to $W^{k,p}(\mathbb{R}^n)$, because functions in $W^{k,p}(\Omega)$ then inherit some important properties from those in $W^{k,p}(\mathbb{R}^n)$. This boils down to extending Sobolev functions defined on a bounded set $\Omega$ to be defined on $\mathbb{R}^n$. However, we need to make certain that our new functions preserve the weak derivative and other geometric properties across the boundary. One of the many important goals is to use the extension to obtain embedding results for $W^{k,p}(\Omega)$ from $W^{k,p}(\mathbb{R}^n)$. It should be noted that we have already used the zero extension in (3.3.7) in the proof of Theorem 3.3.2 in the case $\Omega \subset \mathbb{R}^n$. The treatment there wasn't really problematic because we dealt merely with functions. In Sobolev spaces, the issue becomes more delicate due to the involvement of weak derivatives.
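3.8.2 The Zero Extension
The first type of extension is the zero extension. For a function $f$ defined on $\Omega$, the zero extension can be simply defined by
$$\bar{f}(x) = f(x) \cdot \chi_\Omega(x) = \begin{cases} f(x) & x \in \Omega \\ 0 & x \in \mathbb{R}^n \setminus \Omega. \end{cases} \tag{3.8.1}$$
So $\bar{f}$ is defined on $\mathbb{R}^n$. Dealing with $L^p$ spaces makes this extension possible since functions in $L^p$ are considered the same if they differ only on a set of measure zero, so even if $f|_{\partial\Omega} \ne 0$, the zero extension (3.8.1) is still in $L^p$.
Proposition 3.8.1 Let $\Omega$ be open in $\mathbb{R}^n$, and let $u \in L^p(\Omega)$ for some $1 \le p < \infty$. Then there exists a sequence $u_n \in L^p(\mathbb{R}^n)$ such that
$$\|u_n\|_{L^p(\mathbb{R}^n)} \le \|u\|_{L^p(\Omega)}.$$
Proof Let $\bar{u} \in L^p(\mathbb{R}^n)$ be the zero extension of $u$. For $\varepsilon > 0$, consider the convolution approximating sequence
$$\bar{u}_\varepsilon(x) = \bar{u}(x) * \varphi_\varepsilon(x)$$
defined on $\Omega_\varepsilon$. This gives
$$\bar{u}(x) * \varphi_\varepsilon(x) = \int_{\mathbb{R}^n} \bar{u}(y)\varphi_\varepsilon(x - y) \, dy = \int_{\Omega \cap B_\varepsilon(x)} u(y)\varphi_\varepsilon(x - y) \, dy = u(x) * \varphi_\varepsilon(x)$$
on $\Omega_\varepsilon$. Hence, by Theorem 3.2.2(3), and writing $u_n = u_{\varepsilon_n}$ for $\varepsilon_n \to 0^+$, we obtain
$$\|u_n\|_{L^p(\mathbb{R}^n)} = \|u_n\|_{L^p(\Omega)} \le \|u\|_{L^p(\Omega)}.$$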
Recall that in the proof of Theorem 3.3.2, we first proved that $u_\varepsilon \to u$ in $L^p(\mathbb{R}^n)$ for $u \in L^p(\mathbb{R}^n)$, from which we concluded that $C_c^\infty(\mathbb{R}^n)$ is dense in $L^p(\mathbb{R}^n)$. Then, we assumed that $\Omega \subset \mathbb{R}^n$, and we used the zero extension (3.3.7) to obtain the sequence $\bar{u}_\varepsilon \in C^\infty(\mathbb{R}^n)$ which converges to $u$ in $L^p(\Omega)$. Now, we will assume that Theorem 3.3.3 holds only for $\Omega = \mathbb{R}^n$, and we will prove the general case using the zero extension.
Proposition 3.8.2 For any open set $\Omega$ in $\mathbb{R}^n$, the space $C_c^\infty(\Omega)$ is dense in $L^p(\Omega)$ for $1 \le p < \infty$.
Proof Let $u \in L^p(\Omega)$. Then $\bar{u} \in L^p(\mathbb{R}^n)$. Hence, by Theorems 3.3.2 and 3.3.3, the mollification $\bar{u}_{\varepsilon_m} \in C^\infty(\mathbb{R}^n)$ and $\bar{u}_{\varepsilon_m} \to \bar{u}$ in $L^p(\mathbb{R}^n)$. Hence,
$$\|u_{\varepsilon_m} - u\|_{L^p(\Omega)} = \|\bar{u}_{\varepsilon_m} - \bar{u}\|_{L^p(\mathbb{R}^n)} \to 0.$$
For convenience, set $u_m = u_{\varepsilon_m}$, consider a partition of unity $\xi_m \in C_c^\infty(\Omega)$, and define the sequence $w_m = u_m \xi_m$. Clearly, $w_m \in C_c^\infty(\Omega)$. Then
$$\|w_m - u\|_{L^p(\Omega)} \le \|w_m - \xi_m u\|_{L^p(\Omega)} + \|\xi_m u - u\|_{L^p(\Omega)}.$$
But,
$$\|w_m - \xi_m u\|_{L^p(\Omega)} = \|\xi_m(u_m - u)\|_{L^p(\Omega)} \le \|u_m - u\|_{L^p(\Omega)} \to 0.$$
For the second term, note that $|\xi_m - 1|^p |u|^p \le |u|^p$. So the Dominated Convergence Theorem gives
$$\|\xi_m u - u\|_{L^p(\Omega)} \to 0.$$
Hence, $w_m \to u$ in $L^p(\Omega)$.
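The situation in Sobolev spaces is more delicate since they involve weak derivatives. The zero extension breaks the graph of the function across the boundary and a jump discontinuity may occur, and consequently, the weak derivatives could fail to exist. The space $W_0^{k,p}(\Omega)$ will play an important role here since functions in this space are already assumed to vanish at the boundary, so the zero extension won't break the graph.
Proposition 3.8.3 Let $\Omega$ be open in $\mathbb{R}^n$, and let $u \in W_0^{k,p}(\Omega)$ for some $1 < p < \infty$. Then $\bar{u} \in W^{k,p}(\mathbb{R}^n)$ and
$$\|\bar{u}\|_{W^{k,p}(\mathbb{R}^n)} = \|u\|_{W^{k,p}(\Omega)}.$$
Proof If $u \in W_0^{k,p}(\Omega)$, then $u \in L^p(\Omega)$, and so $\bar{u} \in L^p(\mathbb{R}^n)$. Moreover, by the Meyers–Serrin Theorem, there exists a sequence $u_m \in C_c^\infty(\Omega)$ such that $u_m \to u$ in $W^{k,p}(\Omega)$, so $u_m \to u$ in $L^p(\Omega)$. Also, there exists $\bar{u}_m \in C_c^\infty(\mathbb{R}^n)$ such that
$$\|\bar{u}_m - \bar{u}\|_{L^p(\mathbb{R}^n)} = \|u_m - u\|_{L^p(\Omega)} \to 0,$$
so $\bar{u}_m \to \bar{u}$ in $L^p(\mathbb{R}^n)$. Note that $u_m$ is Cauchy in $W^{k,p}(\Omega)$, and $\bar{u}_m$ is Cauchy in $L^p(\mathbb{R}^n)$, so we have for all $|\alpha| \le k$,
$$\|D^\alpha \bar{u}_j - D^\alpha \bar{u}_m\|_{L^p(\mathbb{R}^n)} = \|D^\alpha u_j - D^\alpha u_m\|_{L^p(\Omega)} \to 0,$$
so $\bar{u}_m$ is Cauchy in $W^{k,p}(\mathbb{R}^n)$, and thus by completeness, $\bar{u}_m \to v$ in $W^{k,p}(\mathbb{R}^n)$, hence $v = \bar{u} \in W^{k,p}(\mathbb{R}^n)$. Let $\varphi \in C_c^\infty(\mathbb{R}^n)$, and note that $u_m \in C_c^\infty(\Omega)$. Then
$$\int_{\mathbb{R}^n} \bar{u} D^\alpha \varphi \, dx = \int_\Omega u D^\alpha \varphi \, dx = \int_\Omega (\lim u_m) D^\alpha \varphi \, dx = \lim \int_\Omega u_m D^\alpha \varphi \, dx = (-1)^{|\alpha|} \lim \int_\Omega D^\alpha u_m \, \varphi \, dx = (-1)^{|\alpha|} \int_\Omega D^\alpha u \, \varphi \, dx = (-1)^{|\alpha|} \int_{\mathbb{R}^n} \overline{D^\alpha u} \, \varphi \, dx.$$
So $D^\alpha \bar{u} = \overline{D^\alpha u}$ on $\mathbb{R}^n$, and clearly $\|\bar{u}\|_{L^p(\mathbb{R}^n)} = \|u\|_{L^p(\Omega)}$. Consequently,
$$\|\bar{u}\|_{W^{k,p}(\mathbb{R}^n)} = \|u\|_{W^{k,p}(\Omega)}.$$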
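It should be noted that this result doesn't hold in general for $W^{k,p}(\Omega)$ because functions in this space don't necessarily vanish at the boundary. Instead, we will find a sequence in $C_c^\infty(\mathbb{R}^n)$ approaching the function on $\Omega$, but in $L^p(\Omega)$ only, and since we can't guarantee the existence of the weak derivatives across the boundary $\partial\Omega$, we always need to investigate the convergence of weak derivatives inside $\Omega$, staying away from $\partial\Omega$. Hence, our best tool here is the compact inclusion.
Proposition 3.8.4 Let $\Omega$ be open in $\mathbb{R}^n$, and let $u \in W^{k,p}(\Omega)$ for some $1 \le p \le \infty$. Then, there exists $u_m \in C_c^\infty(\mathbb{R}^n)$ such that $u_m \to u$ in $L^p(\Omega)$ and
$$D^\alpha u_m \to D^\alpha u \text{ in } L^p(\Omega')$$
for every $\Omega' \subset\subset \Omega$.
Proof Consider the zero extension $\bar{u} \in L^p(\mathbb{R}^n)$. Define $v_m = \varphi_{\varepsilon_m} * \bar{u} \in C^\infty(\mathbb{R}^n)$. Then $v_m \to \bar{u}$ in $L^p(\mathbb{R}^n)$, and moreover,
$$\frac{\partial v_m}{\partial x_i} \to \frac{\partial u}{\partial x_i} \text{ in } L^p(\Omega')$$
for every $\Omega' \subset\subset \Omega$. Define the sequence of cut-off functions
$$\xi_m(x) = \xi\left( \frac{x}{m} \right)$$
on $\Omega$, and set $u_m(x) = u_{\varepsilon_m}(x) = \xi_m v_m(x)$. Then $u_m \in C_c^\infty(\mathbb{R}^n)$ and $u_m \to \bar{u} = u$ a.e. on $\Omega$, and it can be shown using the Dominated Convergence Theorem that
$$\|u_m - u\|_{L^p(\Omega)} \to 0.$$
Moreover, for every $\Omega' \subset\subset \Omega$,
$$\int_{\Omega'} \bar{u}_m(x) \frac{\partial \varphi}{\partial x_i} \, dx = \int_{\Omega'} \left( \varphi_m(x) * \bar{u}(x)\xi_m(x) \right) \frac{\partial \varphi(x)}{\partial x_i} \, dx$$
$$= \int_{\Omega'} \left[ \int_{\mathbb{R}^n} \bar{u}(x - y)\xi_m(x - y)\varphi_m(y) \, dy \right] \frac{\partial \varphi(x)}{\partial x_i} \, dx$$
$$= \int_{\mathbb{R}^n} \left[ \int_{\Omega'} \bar{u}(x - y)\xi_m(x - y) \frac{\partial \varphi(x)}{\partial x_i} \, dx \right] \varphi_m(y) \, dy$$
$$= -\int_{\mathbb{R}^n} \left[ \int_{\Omega'} \varphi(x) \frac{\partial}{\partial x_i}\left( \bar{u}(x - y)\xi_m(x - y) \right) dx \right] \varphi_m(y) \, dy$$
$$= -\int_{\mathbb{R}^n} \left[ \int_{\Omega'} \varphi(x) \frac{\partial \bar{u}(x - y)}{\partial x_i} \xi_m(x - y) \, dx \right] \varphi_m(y) \, dy$$
$$= -\int_{\Omega'} \left( \frac{\partial \bar{u}}{\partial x_i} \right)_m \varphi(x) \, dx.$$
That is,
$$\left\| \frac{\partial \bar{u}_m}{\partial x_i} - \frac{\partial u}{\partial x_i} \right\|_{L^p(\Omega')} \to 0$$
for every $\Omega' \subset\subset \Omega$.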
The previous result enables us to construct a sequence in $C_c^\infty(\mathbb{R}^n)$ that converges to $u \in W^{k,p}(\Omega)$ in $W_{loc}^{k,p}(\Omega)$. One may start to wonder when we can get the convergence in $W^{k,p}(\Omega)$. We will either let $\Omega = \mathbb{R}^n$, so no extension is needed to pass across the boundary, or we need to impose extra conditions on the boundary $\partial\Omega$ to guarantee nice behavior of the weak derivatives. The next result deals with the first option, i.e., extending the domain to the whole space.
Proposition 3.8.5 Let $u \in W^{k,p}(\mathbb{R}^n)$ for some $1 \le p \le \infty$. Then there exists $u_m \in C_c^\infty(\mathbb{R}^n)$ such that $u_m \to u$ in $W^{k,p}(\mathbb{R}^n)$.
Proof Consider a partition of unity
$$\xi_m(x) = \xi\left( \frac{x}{m} \right) \in C_c^\infty(\mathbb{R}^n),$$
which is defined by
$$\xi_m = \begin{cases} 1 & |x| \le m \\ 0 & |x| > m \end{cases}$$
and define the sequence $u_m(x) = u(x)\xi_m(x)$. Then clearly $u_m \in C_c^\infty(\mathbb{R}^n)$ and so by Theorem 3.5.6
$$u_m \in C_c^\infty(\mathbb{R}^n) \cap W^{k,p}(\mathbb{R}^n)$$
and
$$\|u_m - u\|_{L^p(\mathbb{R}^n)}^p = \int_{\mathbb{R}^n} |\xi_m - 1|^p |u|^p \, dx \le \int_{|x| > m} |u|^p \, dx \to 0$$
as $m \to \infty$. So, $u_m \to u$ in $L^p(\mathbb{R}^n)$. Taking the derivative for $|\alpha| = 1$,
$$\frac{\partial u_m}{\partial x_i} = \xi_m \frac{\partial u}{\partial x_i} + u \frac{\partial \xi_m}{\partial x_i} \to \frac{\partial u}{\partial x_i}.$$
Hence, $u_m \to u$ in $W^{k,p}(\mathbb{R}^n)$.
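The tail estimate used in the proof can be observed numerically in one dimension. The sketch below rests on assumptions that are not in the text: the function $u(x) = e^{-|x|}$, the exponent $p = 2$, and in particular the cut-off `xi`, which here equals 1 on $|t| \le 1$ and decays smoothly to 0 on $1 < |t| < 2$ (a hypothetical smooth profile chosen for illustration).

```python
import math

def smooth_step(s):
    """exp(-1/s) for s > 0 and 0 otherwise; a standard smooth building block."""
    return math.exp(-1.0 / s) if s > 0.0 else 0.0

def xi(t):
    """Hypothetical cut-off: 1 for |t| <= 1, 0 for |t| >= 2, smooth in between."""
    a = abs(t)
    if a <= 1.0:
        return 1.0
    if a >= 2.0:
        return 0.0
    up, down = smooth_step(2.0 - a), smooth_step(a - 1.0)
    return up / (up + down)

def u(x):
    return math.exp(-abs(x))

def truncation_error(m, p=2.0, L=40.0, n=80000):
    """Midpoint approximation of the integral of |xi(x/m) - 1|^p |u|^p over (-L, L)."""
    h = 2.0 * L / n
    total = 0.0
    for k in range(n):
        x = -L + (k + 0.5) * h
        total += abs(xi(x / m) - 1.0) ** p * abs(u(x)) ** p
    return h * total

errors = {m: truncation_error(m) for m in (1, 2, 4)}
# Exact value of the integral of |u|^2 over |x| > m for this particular u.
tails = {m: math.exp(-2.0 * m) for m in (1, 2, 4)}
```

For each `m`, `errors[m]` stays below the tail integral `tails[m]`, and both shrink as `m` grows, which mirrors the convergence $u_m \to u$ in $L^p$.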
Proposition 3.7.5 can now be immediately concluded. Now, if we stick with a bounded open $\Omega$ in $\mathbb{R}^n$, then, as suggested above, we need to impose further conditions on the boundary $\partial\Omega$. Many results on Sobolev spaces don't require a smooth boundary but may require a boundary with a "nice" structure. The word "nice" here shall be formulated mathematically in the following definition.
Definition 3.8.6 Let $\Omega \subset \mathbb{R}^n$ be bounded and connected.
(1) The set $\Omega$ is said to be Lipschitz (denoted by Lip) if for every $x \in \partial\Omega$, there exists a neighborhood $N(x)$ such that $\Gamma(x) = N(x) \cap \partial\Omega$ is the graph of a Lipschitz continuous function, and $\partial\Omega$ can be written as a finite union of these graphs, i.e.,
$$\partial\Omega = \bigcup_{i=1}^m \Gamma_i.$$
(2) The set $\Omega$ is said to be of class $C^k$ if for every $x \in \partial\Omega$, there exists a neighborhood $N(x)$ such that $\Gamma(x) = N(x) \cap \partial\Omega$ is the graph of a $C^k$ function, and $\partial\Omega$ can be written as a finite union of these graphs, i.e.,
$$\partial\Omega = \bigcup_{i=1}^m \Gamma_i.$$
(3) The set $\Omega$ is said to be a smooth domain if $k = \infty$ in (2).
In words, a Lipschitz domain is one whose boundary locally coincides with the graph of a Lipschitz function. A $C^k$-class domain is one whose boundary locally coincides with a $C^k$-surface. A bounded Lip domain has the extension property for all $k$. Roughly speaking, a bounded domain is Lip if its boundary behaves as a Lipschitz function. So, every convex domain is Lip, and all smooth domains are Lip. On the other hand, a polyhedron in $\mathbb{R}^3$ is an example of a Lip domain that is not smooth.
Remark We need to note the following:
(1) A Lip domain is by definition bounded, so when we refer to a domain as Lip, it is presumed that it is bounded. The same holds for $C^k$-class domains.
(2) It was assumed in the definition that the domain is connected. If it is disconnected, then we will add to (2) the condition that $N(x) \cap \Omega$ is on one side of $\Gamma(x)$.
(3) For $\partial\Omega = \bigcup_{i=1}^m \Gamma_i$, it is required to have a system of local coordinates such that if $\Gamma$ is the graph of a Lip function ($C^k$ function) $\psi$, then it is represented by
$$x_n = \psi(x_1, \ldots, x_{n-1}).$$
3.8.3 Coordinate Transformations
The main tool to establish our extension result is the following:
Definition 3.8.7 (Diffeomorphism) Let $U, V$ be two open and bounded sets in $\mathbb{R}^n$. A mapping $\phi : U \to V$ is called a $C^k$-diffeomorphism if $\phi$ is a bijection, $\phi \in C^k(U)$, and $\phi^{-1} = \psi \in C^k(V)$.
In words, a $C^k$ diffeomorphism is a $C^k$ mapping whose inverse is also $C^k$. For a $C^k$ diffeomorphism $\phi : U \to V$, we have
$$\phi = (\phi_1, \phi_2, \ldots, \phi_n), \qquad \psi = (\psi_1, \psi_2, \ldots, \psi_n).$$
Here, the mapping $\psi$ is the coordinate transformation on $V$ because it makes $\partial\Omega$ a coordinate surface, and the functions $\psi_i$ are called the coordinate functions. In this case, we say that $U$ and $V$ are $C^k$ diffeomorphic to each other. Roughly speaking, they look like the same set, and the elements of the sets are relabeled due to the reorientation of the coordinates. We write
$$y_1 = \phi_1(x_1, \ldots, x_n), \quad y_2 = \phi_2(x_1, \ldots, x_n), \quad \ldots, \quad y_n = \phi_n(x_1, \ldots, x_n),$$
$$x_1 = \psi_1(y_1, \ldots, y_n), \quad x_2 = \psi_2(y_1, \ldots, y_n), \quad \ldots, \quad x_n = \psi_n(y_1, \ldots, y_n).$$
The Jacobian $J(\phi)$ is defined as
$$J(\phi) = \frac{\partial(y_1, \ldots, y_n)}{\partial(x_1, \ldots, x_n)},$$
which is the $n \times n$ matrix with entries $\frac{\partial \phi_j}{\partial x_i}$, $1 \le i, j \le n$. The determinant of $J$ is known as the Jacobian determinant of $\phi$, and is denoted by
$$|J(\phi)| = \det(D\phi(x)).$$
A well-known result in the calculus of manifolds is the chain rule: if $f, g \in C^1(\mathbb{R}^n)$ and $h = f \circ g$, then $Dh(x) = Df(g(x)) \cdot Dg(x)$. Applying this to $\phi$ and $\psi = \phi^{-1}$, which satisfy
$$(\psi \circ \phi)(x) = Id(x) = x,$$
taking the derivatives of both sides of the equation, and then taking the determinant of each side, given the fact that $\det(AB) = \det(A) \cdot \det(B)$, gives
$$1 = \det(D\phi(x)) \det(D\psi(y)),$$
hence
$$(\det(D\phi(x)))^{-1} = \det(D\phi^{-1}(y)),$$
in other words,
$$\left| \frac{\partial(y_1, \ldots, y_n)}{\partial(x_1, \ldots, x_n)} \right| = \frac{1}{\left| \frac{\partial(x_1, \ldots, x_n)}{\partial(y_1, \ldots, y_n)} \right|}.$$
This implies that
$$0 < m < |J(\phi)|, \, |J(\psi)| < M < \infty$$
for some $M > 0$, so the Jacobian determinant doesn't vanish for $C^k$-diffeomorphisms. This coordinate system helps us to change variables when evaluating multiple integrals. Namely, let $f \in L^1(V)$; then substituting $y = \phi(x)$ gives
$$\int_V f \, dy = \int_U (f \circ \phi) |J(\phi)| \, dx.$$
Similarly, if $f \in L^1(U)$, then
$$\int_U f \, dx = \int_V (f \circ \psi) |J(\psi)| \, dy.$$
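The change-of-variables formula and the reciprocal relation between the Jacobians can be checked in the simplest one-dimensional case. The map `phi`, the integrand `f`, and the midpoint quadrature below are hypothetical choices made for illustration; they are not taken from the text.

```python
import math

def phi(x):
    """Hypothetical diffeomorphism from U = (1, 2) onto V = (1, 4)."""
    return x * x

def dphi(x):
    """Derivative of phi; in one dimension this is the Jacobian determinant."""
    return 2.0 * x

def psi(y):
    """Inverse map from V back to U."""
    return math.sqrt(y)

def dpsi(y):
    return 0.5 / math.sqrt(y)

def f(y):
    return math.cos(y)

def midpoint(g, a, b, n=20000):
    """Composite midpoint rule on [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

# Integral of f over V, computed directly and after pulling back to U.
integral_over_V = midpoint(f, 1.0, 4.0)
pulled_back = midpoint(lambda x: f(phi(x)) * abs(dphi(x)), 1.0, 2.0)

# det(D phi(x)) * det(D psi(y)) with y = phi(x) should equal 1.
jacobian_products = [dphi(x) * dpsi(phi(x)) for x in (1.1, 1.5, 1.9)]
```

Both quadratures approximate $\sin 4 - \sin 1$, and each entry of `jacobian_products` is 1 up to rounding, as the identity $1 = \det(D\phi)\det(D\psi)$ predicts.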
It should be noted that if $\phi \in C^1(U)$ is a diffeomorphism, then $D\phi$ is continuous but need not be bounded on $U$. To ensure that the derivative is bounded, we strengthen the definition as follows:
Definition 3.8.8 (Strongly Diffeomorphism) Let $U, V$ be two open and bounded sets in $\mathbb{R}^n$. A mapping $\phi : U \to V$ is called a $C^k$-strongly diffeomorphism if $\phi$ is a bijection and bounded in $C^k(\overline{U})$, and $\phi^{-1} = \psi \in C^k(\overline{V})$.
This strong version of the diffeomorphism guarantees that the mappings $\phi$ and $\psi$, in addition to all their derivatives up to $k$th order, are bounded since they are continuously defined on closed sets, i.e.
$$\max_{1 \le i, j \le n} \left\{ \left\| \frac{\partial \phi_j}{\partial x_i} \right\|_\infty, \left\| \frac{\partial \psi_j}{\partial y_i} \right\|_\infty \right\} < \infty.$$
Of course, derivatives cannot be defined on the boundaries, so we can get around this by defining $\phi : \Omega \to \Omega^*$ such that $\Omega, \Omega^*$ are open sets in $\mathbb{R}^n$, and $\overline{U} \subset \Omega$ and $\overline{V} \subset \Omega^*$. This guarantees that all first derivatives of $\phi, \psi$ are bounded on $\overline{U}$ and $\overline{V}$, respectively. Another advantage of the definition is that it defines $\phi$ on a compact set $\overline{\Omega}$, which allows us to define new Sobolev spaces on compact manifolds in $\mathbb{R}^n$. Indeed, $\partial\Omega$ can be covered by a finite number of open sets. In particular, each point, say $x$, in the set $\partial\Omega$ is contained in some neighborhood $N(x)$ that can be represented by the graph of $\phi$, and so $\partial\Omega$ is covered by a finite number of these neighborhoods, say $\{N_i\}$. In other words, $\partial\Omega$ is covered by a finite number of subgraphs of mappings $\phi_i \in C^k(N_i)$, and thus a system of local coordinates is constructed via the mappings $\{\psi_i\}$ for $\Omega$. The following result, which is helpful in proving the next theorem, provides a sufficient condition for a composition with a diffeomorphism to be a Sobolev function.
Lemma 3.8.9 (Change of Coordinates) Let $U, V$ be open bounded sets in $\mathbb{R}^n$, let $u \in W^{1,p}(U)$, let $\phi : U \to V$ be a $C^1$-strongly diffeomorphism, and let $\phi^{-1} = \psi = (\psi_1, \ldots, \psi_n)$. If $v(y) = (u \circ \psi)(y)$, then $v \in W^{1,p}(V)$ and
$$\frac{\partial v}{\partial y_i} = \sum_{k=1}^n \frac{\partial u(\psi)}{\partial x_k} \cdot \frac{\partial \psi_k}{\partial y_i}.$$
Moreover, $\|v\|_{W^{k,p}(V)} \le C \|u\|_{W^{k,p}(U)}$ for all $k \in \mathbb{N}$ and for some $C > 0$.
Proof Note that
|v(y)| dy =
|u ◦ ψ(y)| P dy.
P
V
V
Choosing the substitution x = ψ(y), with |J (ψ)| ≤ M yield
3.8 Extensions
191
|v(y)| dy ≤ M
|u(x)| P d x < ∞,
P
V
U
that is; v ∈ L p (V ) and v L p (V ) ≤ C u L p (U ) .
(3.8.2)
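Also, note that $\frac{\partial u}{\partial x_i} \in L^p(U)$ and $\nabla\psi$ is continuous, and consequently $\frac{\partial v}{\partial y_i}$ exists. The next step is to evaluate $\frac{\partial v}{\partial y_i}$, then to show that $\frac{\partial v}{\partial y_i} \in L^p(V)$, which implies that $v \in W^{1,p}(V)$. By the Meyers–Serrin Theorem, there exists a sequence $u_m \in W^{1,p}(U) \cap C^\infty(U)$ such that $u_m \to u$ in $W^{1,p}(U)$ and
\[ \frac{\partial u_m}{\partial x_i} \to \frac{\partial u}{\partial x_i} \quad \text{in } L^p(U). \]
Define the following sequence: $v_m = u_m \circ \psi$. It is clear that $v_m \in C^1(V)$, so we apply the chain rule for classical derivatives
\[ \frac{\partial v_m}{\partial y_i} = \sum_{k=1}^{n} \frac{\partial u_m(\psi)}{\partial x_k} \cdot \frac{\partial \psi_k}{\partial y_i}. \]
Since $\frac{\partial u_m}{\partial x_i} \to \frac{\partial u}{\partial x_i}$ in $L^p(U)$, this gives
\[ \frac{\partial v_m}{\partial y_i} \to \sum_{k=1}^{n} \frac{\partial u(\psi)}{\partial x_k} \cdot \frac{\partial \psi_k}{\partial y_i} = w \in L^p(V) \tag{3.8.3} \]
in $L^p(V)$. A similar argument to the first part gives
\[ \int_V |v_m(y) - v(y)|^p \, dy \le M \int_U |u_m(x) - u(x)|^p \, dx \to 0, \]
where $|J(\psi)| \le M$. This implies that $v_m \to v$ in $L^p(V)$ and $\frac{\partial v_m}{\partial y_i} \to w$ in $L^p(V)$, but since $v \in L^p(V)$ and $\frac{\partial v}{\partial y_i}$ exists, we conclude from (3.8.3) and uniqueness of weak derivatives that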
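\[ \frac{\partial v_m}{\partial y_i} \to \frac{\partial v}{\partial y_i} = w. \]
Next, we establish the estimate
\[ \left\| \frac{\partial v}{\partial y_i} \right\|_{L^p(V)}^p = \int_V \left| \sum_{k=1}^{n} \frac{\partial u(\psi)}{\partial x_k} \cdot \frac{\partial \psi_k}{\partial y_i} \right|^p dy \le \int_V \left( \sum_{k=1}^{n} \left| \frac{\partial u(\psi)}{\partial x_k} \right| \left| \frac{\partial \psi_k}{\partial y_i} \right| \right)^p dy. \]
Using the change of variable $x = \psi(y)$ from $V$ to $U$, the elementary bound $\left(\sum_{k=1}^n a_k\right)^p \le n^{p-1} \sum_{k=1}^n a_k^p$ for $a_k \ge 0$, and denoting
\[ C_1 = n^{p-1} \max_{i,k} \left\| \frac{\partial \psi_k}{\partial y_i} \right\|_\infty^p, \]
then substituting the above gives
\[ \left\| \frac{\partial v}{\partial y_i} \right\|_{L^p(V)}^p \le M C_1 \sum_{k=1}^{n} \int_U \left| \frac{\partial u(x)}{\partial x_k} \right|^p dx = M C_1 \sum_{k=1}^{n} \left\| \frac{\partial u}{\partial x_k} \right\|_{L^p(U)}^p < \infty. \]
This implies that $\frac{\partial v}{\partial y_i} \in L^p(V)$, and hence $v \in W^{1,p}(V)$. Letting $C^p = M C_1$, this gives
\[ \left\| \frac{\partial v}{\partial y_i} \right\|_{L^p(V)} \le C \left( \sum_{k=1}^{n} \left\| \frac{\partial u}{\partial x_k} \right\|_{L^p(U)}^p \right)^{1/p}. \]
This estimate, in addition to (3.8.2), implies that $\|v\|_{W^{1,p}(V)} \le C \|u\|_{W^{1,p}(U)}$. A similar argument can be performed for $k \ge 2$, with $u_m \in W^{k,p}(U) \cap C^\infty(U)$ converging to $u \in W^{k,p}(U)$: the chain and Leibniz rules are applied, and then the limits are taken. We leave the details for the reader.

Remark The result also holds for Lip domains.

Now, we are ready to investigate the problem of extending Sobolev functions defined on open bounded sets to the whole space.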
3.8.4 Extension Operator

Theorem 3.8.10 (Existence of Extension Operator) Let $\Omega \subset \mathbb{R}^n$ be open, bounded and of $C^k$-class, and let $\Omega \subset\subset \Omega'$. Then there exists a linear bounded operator $E$, called the "extension operator", such that the following hold:
(1) $E : W^{k,p}(\Omega) \to W^{k,p}(\mathbb{R}^n)$.
(2) $Eu|_{\Omega} = u$ for every $u \in W^{k,p}(\Omega)$.
(3) $\mathrm{supp}(Eu) \subseteq \Omega'$.
(4) The estimate $\|Eu\|_{W^{k,p}(\mathbb{R}^n)} \le c \|u\|_{W^{k,p}(\Omega)}$ holds for some $c = c(n, k, p, \Omega, \Omega')$, i.e. $c$ does not depend on $u$. For $p = \infty$, $c = c(\Omega, \Omega')$.
(5) The estimate $\|Eu\|_{W^{k,p}(\Omega')} \le c \|u\|_{W^{k,p}(\Omega)}$ holds for some $c = c(n, k, p, \Omega, \Omega')$.

Proof The idea is to extend the function $u$ to a larger set in $\mathbb{R}^n_+$ that contains its support, then extend it by zero to $\mathbb{R}^n_+$, then extend it to $\mathbb{R}^n$ by a higher order reflection. We will only prove the case $k = 1$. Let $u \in W^{k,p}(\Omega)$ and $x_0 \in \partial\Omega$. If $\partial\Omega$ is flat near $x_0$ and lies in the hyperplane $H = \{x = (x_1, \dots, x_{n-1}, 0) \in \mathbb{R}^n\}$, then there exists a neighborhood $N(x_0) \subset \Omega^*$ such that $\Gamma(x_0) = N(x_0) \cap \partial\Omega$ is "flat" and lies in the hyperplane $H = \{x_n = 0\}$. For a small $\delta > 0$, let $B^+ = B \cap \{x_n > 0\}$, $B^- = B \cap \{x_n < 0\}$, where $B = B_\delta(x_0)$. Then clearly $u \in W^{k,p}(B^+)$. Suppose that $u \in C^1(\overline{\Omega})$. Then we define the following extension as a reflection of $u$ from $B^+$ to $B$:
\[ \bar{u}(x) = \begin{cases} u(x), & x \in B^+, \\ 3u(x_1, \dots, x_{n-1}, -x_n) - 2u(x_1, \dots, x_{n-1}, -2x_n), & x \in B^-. \end{cases} \tag{3.8.4} \]
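Letting $u^+ = \bar{u}|_{B^+}$ and the reflection $u^- = \bar{u}|_{B^-}$, and writing $x' = (x_1, \dots, x_{n-1})$, it is easy to see that
\[ \lim_{x_n \to 0^+} u^+(x', x_n) = \lim_{x_n \to 0^-} u^-(x', x_n), \]
and for $1 \le i \le n - 1$,
\[ \frac{\partial u^-}{\partial x_i} = 3 \frac{\partial u}{\partial x_i}(x', -x_n) - 2 \frac{\partial u}{\partial x_i}(x', -2x_n), \]
hence
\[ \lim_{x_n \to 0^-} \frac{\partial u^-}{\partial x_i} = \frac{\partial u}{\partial x_i}(x', 0) = \lim_{x_n \to 0^+} \frac{\partial u^+}{\partial x_i}. \]
For $i = n$, we have
\[ \frac{\partial u^-}{\partial x_n} = -3 \frac{\partial u}{\partial x_n}(x', -x_n) + 4 \frac{\partial u}{\partial x_n}(x', -2x_n), \]
so
\[ \lim_{x_n \to 0^-} \frac{\partial u^-}{\partial x_n} = \frac{\partial u}{\partial x_n}(x', 0) = \lim_{x_n \to 0^+} \frac{\partial u^+}{\partial x_n}. \]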
Therefore, $\bar{u} \in C^1(B) \subset W^{1,p}(B)$ (Proposition 3.6.4(5)), and by a simple calculation one can easily see that
\[ \|\bar{u}\|_{W^{1,p}(B)} \le C \|u\|_{W^{1,p}(B^+)}, \]
where $C$ is a constant that does not depend on $u$. Now, suppose that $u \notin C^1(\overline{\Omega})$ and $\partial\Omega$ is not flat near $x_0$. We flatten out the boundary $\Gamma = \partial\Omega \cap N$ through a change in the coordinate system that will map it to a subset of the hyperplane $x_n = 0$. We will use a $C^1$-strongly diffeomorphism
\[ \phi : N \to B, \quad \phi \in C^1(\overline{N}), \quad \text{and} \quad \psi = \phi^{-1} \in C^1(\overline{B}), \]
where $B$ is an open set centered at $y_0 = \phi(x_0)$ so that $y_0 \in \partial B^+$, and where $\phi$ and $\psi$ are given by
\[ \phi(x', x_n) = (x', x_n - \gamma(x')), \qquad \psi(y', y_n) = (y', y_n + \gamma(y')). \]
Here, $\phi(\Gamma) = B \cap \{x_n = 0\}$, and $\phi(U) = V = B^+$, where $U = N \cap \Omega$, and $\psi(V) = U$. Under this new coordinate system, consider the restriction of $u$ to $U$, and define
\[ v(y) = u \circ \phi^{-1}(y) = u(\phi^{-1}(y)). \]
The idea here is to write $u = (u \circ \phi^{-1}) \circ \phi = v \circ \phi$, which makes $V = B^+$ the domain of $v$, and $u(U) = v(V)$. By Lemma 3.8.9 and the same procedure as above, it can be shown that $v \in C^1(\overline{B^+}) \cap W^{1,p}(B^+)$, and
\[ \|v\|_{W^{1,p}(V)} \le C \|u\|_{W^{1,p}(U)} \le C \|u\|_{W^{1,p}(\Omega)}. \tag{3.8.5} \]
Similar to (3.8.4), we extend $v$ from $B^+$ to $B$ through the reflection $\bar{v}$. Again, it can be shown that $\bar{v} \in W^{1,p}(B)$ and
\[ \|\bar{v}\|_{W^{1,p}(B)} \le C \|v\|_{W^{1,p}(V)}. \tag{3.8.6} \]
Then, we pull the function back to the original system by composing $\bar{v}$ with $\phi$ to produce a new extension $\bar{u} = \bar{v}(\phi(x))$, which extends $u$ from $U$ to $N$ and such that $\bar{u} = u$ on $U$. Then by continuity and Lemma 3.8.9, $\bar{u} \in C^1(N) \cap W^{1,p}(N)$ and
\[ \|\bar{u}\|_{W^{1,p}(N)} \le C \|\bar{v}\|_{W^{1,p}(B)}. \tag{3.8.7} \]
From (3.8.5)–(3.8.7), we have
\[ \|\bar{u}\|_{W^{1,p}(N)} \le C \|u\|_{W^{1,p}(U)} \le C \|u\|_{W^{1,p}(\Omega)}. \tag{3.8.8} \]
Note that we have two issues with the treatment above. First, the extensions are still not of compact support, so they are not extended to the whole space. Second, the construction can be implemented only locally because the coordinate system provides a local representation. So we need to make use of the powerful tool of partition of unity to globalize our results and give the extensions compact support, so that we can extend the functions by zero to $\mathbb{R}^n$. Since $\partial\Omega$ is compact, there exists a finite cover $\{N_i\}_{i=1}^m$ of $\partial\Omega$, with $\bigcup_{i=1}^{m} N_i = N$, and let $N_0 \subset\subset \Omega$ be such that
\[ \overline{\Omega} \subseteq \bigcup_{i=0}^{m} N_i = N \cup \Omega = A \subseteq \Omega^*. \]
Let $U_i = N_i \cap \Omega$, so $\{U_i, i = 0, \dots, m\}$ is a cover of $\Omega$, and $u_i = u|_{U_i} \in W^{1,p}(U_i)$. By the previous argument, $\bar{u}_i \in W^{1,p}(N_i)$. Choose a partition of unity $\{\xi_i, i = 0, \dots, m\}$, $\xi_i \in C_c^\infty(N_i)$, subordinate to $\{N_i, i = 0, \dots, m\}$, where $\mathrm{supp}(\xi_i) \subseteq N_i$ (in particular $\mathrm{supp}(\xi_0) \subseteq N_0$) and $\sum_{i=0}^{m} \xi_i = 1$ on $\Omega$, and let
\[ \tilde{u}_0 = u \xi_0, \qquad \tilde{u}_i = \xi_i(x) \bar{u}_i(x). \]
Since $u \in C^1(\overline{\Omega})$, we have $\tilde{u}_i \in C^1(\overline{\Omega})$ and $\mathrm{supp}(\tilde{u}_i) \subseteq N_i$ for $i = 1, \dots, m$, and consequently, $\tilde{u}_i \in W^{1,p}(N_i)$ for each $i = 0, \dots, m$. In view of (3.8.8), we have
\[ \|\tilde{u}_i\|_{W^{1,p}(N_i)} \le C \|u_i\|_{W^{1,p}(U_i)}. \tag{3.8.9} \]
The last step is to define the linear operator
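\[ \bar{u} = \sum_{i=0}^{m} \tilde{u}_i \quad \text{for } x \in \bigcup_{i=0}^{m} N_i. \]
Then $\bar{u} \in W^{1,p}(A)$, with $\bar{u} = u$ on $\Omega$ and $\mathrm{supp}(\bar{u}) \subseteq A \subseteq \Omega^*$. Now that we have obtained an extension of $u$ with compact support, we define $Eu$ using the zero extension
\[ Eu = \begin{cases} \bar{u}, & x \in A, \\ 0, & x \in \mathbb{R}^n \setminus A. \end{cases} \]
The following estimate is established:
\[ \begin{aligned} \|Eu\|_{W^{1,p}(\mathbb{R}^n)} = \|\bar{u}\|_{W^{1,p}(\Omega')} = \|\bar{u}\|_{W^{1,p}(A)} &\le \sum_{i=0}^{m} \|\tilde{u}_i\|_{W^{1,p}(N_i)} \\ &\le \sum_{i=0}^{m} \|\xi_i\|_{W^{1,\infty}(N_i)} \cdot \sum_{i=0}^{m} \|\bar{u}_i\|_{W^{1,p}(N_i)} \\ &\le K M \sum_{i=0}^{m} \|u_i\|_{W^{1,p}(U_i)} \quad \text{(by (3.8.9))} \\ &\le C \|u\|_{W^{1,p}(\Omega)}, \end{aligned} \]
where
\[ M = \max\{C_i, \ i = 0, \dots, m\}, \qquad K = \sum_{i=0}^{m} \|\xi_i\|_{W^{1,\infty}(N_i)}, \qquad C = K M (m + 1). \]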
So $C$ clearly depends on $n$, $k$, $\Omega$, $\Omega^*$. The result easily holds for $p = \infty$. We leave the details for the reader.

Remark We should note the following:
(1) The result holds for all $k \in \mathbb{N}$, and in this case the proof becomes harder since the diffeomorphism will be $C^k$ instead of $C^1$, which requires a more complicated higher-order reflection. For example, $u^-$ of (3.8.4) will be of the form
\[ \sum_{i=1}^{k+1} c_i \, u\!\left(x', -\frac{x_n}{i}\right) \]
for some coefficients $c_i$ such that
\[ \sum_{i=1}^{k+1} c_i \left( \frac{-1}{i} \right)^{j} = 1, \qquad j = 0, 1, \dots, k. \]
This is represented by a system $VC = \mathbf{1}$ of $k + 1$ linear equations. The coefficient matrix $V$ is known as the Vandermonde matrix, and this matrix is invertible since otherwise a polynomial of degree $k$ would have $k + 1$ roots. This implies that the system is uniquely solvable. So one can uniquely determine the values of $c_i$ and proceed to establish $\bar{u} \in C^k(B)$ by showing that $\bar{u}$ and all its derivatives up to order $k$ agree across the hyperplane $\{x \in \mathbb{R}^n : x_n = 0\}$.
(2) The condition $C^1$ for the boundary can be weakened. In fact, the result also holds for the case when $\Omega$ is a Lip domain, and in this case, $\phi, \psi$ are assumed to be Lipschitz continuous instead of $C^1$ diffeomorphisms. The part of the boundary $\Gamma$ is the graph of a Lipschitz function $\gamma$ and $\Omega$ lies above the graph of $\gamma$. We define the following Lipschitz maps
\[ \phi(x) = y = (x', x_n - \gamma(x')), \qquad \psi(y) = x = (y', y_n + \gamma(y')), \]
where $x = (x', x_n)$ and $y = (y', y_n)$.
Then both $\phi$ and $\psi$ are Lipschitz with constant 1, with $\|\bar{u}\|_{W^{1,p}(B)} \le C \|u\|_{W^{1,p}(B^+)}$ and $|J(\phi)| = |J(\psi)| = 1$. This gives $\|v\|_{W^{1,p}(V)} = \|u\|_{W^{1,p}(U)}$. Further, (3.8.6) stays the same and (3.8.7) becomes $\|\bar{u}\|_{W^{1,p}(A)} = \|\bar{v}\|_{W^{1,p}(B)}$. For $p = \infty$, the extension defined in (3.8.4) becomes
\[ \bar{u}(x) = \begin{cases} u(x), & x \in B^+, \\ u(x_1, \dots, x_{n-1}, -x_n), & x \in B^-, \end{cases} \]
and $\bar{u}$ is Lipschitz. The corresponding estimate is $\|\bar{u}\|_{W^{1,\infty}(B)} = \|u\|_{W^{1,\infty}(B^+)}$, (3.8.6) becomes $\|\bar{v}\|_{W^{1,\infty}(B)} = \|v\|_{W^{1,\infty}(V)}$, and (3.8.7) becomes $\|\bar{u}\|_{W^{1,\infty}(A)} = \|\bar{v}\|_{W^{1,\infty}(B)}$.

Now, with the help of the Extension Theorem, we can strengthen the result of Proposition 3.8.4.

Proposition 3.8.11 Let $\Omega$ be open bounded in $\mathbb{R}^n$ and of $C^k$ class, and let $u \in W^{k,p}(\Omega)$ for some $1 \le p < \infty$. Then there exists $u_m \in C_c^\infty(\mathbb{R}^n)$ such that $u_m \to u$ in $W^{k,p}(\Omega)$.

Proof Let $\partial\Omega$ be bounded. Then by the Extension Theorem, there exists an extension operator $E : W^{k,p}(\Omega) \to W^{k,p}(\mathbb{R}^n)$. Define the sequence $u_m = \xi_m (Eu)_\epsilon$, where $(Eu)_\epsilon = Eu * \phi_\epsilon$ is the mollification of $Eu$ (with $\epsilon = \epsilon_m \to 0$) and $\xi_m$ is a sequence of cut-off functions. Then $u_m \in C_c^\infty(\mathbb{R}^n)$, so it is the desired sequence. The case when $\partial\Omega$ is unbounded is left to the reader as an exercise (see Problem 3.11.31).

One of the advantages of imposing a nice structure on the boundary of the domain is that it allows us to construct our approximating function to be defined in $\overline{\Omega}$ rather than in $\Omega$, i.e., our approximating function will belong to $C^\infty(\overline{\Omega})$. This will provide a global approximation up to the boundary. Functions in $C^\infty(\overline{\Omega})$ are functions which are smooth up to the boundary. We next prove another variant of the Meyers–Serrin Theorem which establishes a global approximation by smooth functions up to the boundary. The idea is to extend functions in $W^{k,p}(\Omega)$ to functions in $W^{k,p}(\mathbb{R}^n)$ in order to apply Proposition 3.7.5.

Theorem 3.8.12 Let $\Omega$ be open bounded in $\mathbb{R}^n$ and of $C^k$ class. Then $C^k(\overline{\Omega})$ is dense in $W^{k,p}(\Omega)$ in the Sobolev norm $\|\cdot\|_{k,p}$ for all $1 \le p < \infty$.

Proof We will use Proposition 3.6.4(5):
\[ C^\infty(\overline{\Omega}) \subset C^k(\overline{\Omega}) \subset W^{k,p}(\Omega). \tag{3.8.10} \]
Let $u \in W^{k,p}(\Omega)$ $(1 \le p < \infty)$. We want to show that there exists a sequence $u_j \in W^{k,p}(\Omega) \cap C^\infty(\overline{\Omega})$ such that $u_j \to u$ in $W^{k,p}(\Omega)$. That is,
\[ \overline{W^{k,p}(\Omega) \cap C^\infty(\overline{\Omega})} = W^{k,p}(\Omega). \]
Since $u \in W^{k,p}(\Omega)$, there exists an extension $E(u) \in W^{k,p}(\mathbb{R}^n)$. By Proposition 3.7.5, there exists $u_j \in C_c^\infty(\mathbb{R}^n)$ such that $u_j \to E(u)$ in $W^{k,p}(\mathbb{R}^n)$. Take the restriction to $\Omega$. Consequently, $(u_j)|_{\Omega} \in C^\infty(\overline{\Omega})$, and $(u_j)|_{\Omega}$ converges to $E(u)|_{\Omega} = u$ in $W^{k,p}(\Omega)$. This proves that $C^\infty(\overline{\Omega})$ is dense in $W^{k,p}(\Omega)$. Now the result follows from (3.8.10).
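As a small companion to Remark (1) following Theorem 3.8.10, the reflection coefficients can be computed by solving the Vandermonde system exactly. The sketch below assumes the reflection has the form $\sum_{i=1}^{k+1} c_i u(x', -x_n/i)$, so that the nodes are $-1/i$; the function name `reflection_coefficients` is ours, not the text's. Note that (3.8.4) itself uses the scalings 1 and 2 instead, which gives different coefficients.

```python
from fractions import Fraction

def reflection_coefficients(k):
    # Solve sum_{i=1}^{k+1} c_i * (-1/i)**j = 1 for j = 0..k in exact arithmetic.
    n = k + 1
    nodes = [Fraction(-1, i) for i in range(1, n + 1)]
    # Augmented matrix: row j holds the j-th powers of the nodes and right-hand side 1.
    rows = [[t ** j for t in nodes] + [Fraction(1)] for j in range(n)]
    for col in range(n):
        # A nonzero pivot exists because the Vandermonde matrix is invertible.
        piv = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[piv] = rows[piv], rows[col]
        rows[col] = [a / rows[col][col] for a in rows[col]]
        for r in range(n):
            if r != col:
                f = rows[r][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]

coeffs_k1 = reflection_coefficients(1)
coeffs_k2 = reflection_coefficients(2)
print(coeffs_k1, coeffs_k2)
```

For $k = 1$ this yields the pair $(-3, 4)$, i.e. the reflection $-3u(x', -x_n) + 4u(x', -x_n/2)$, and for $k = 2$ the triple $(6, -32, 27)$.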
3.9 Sobolev Inequalities

3.9.1 Sobolev Exponent

This section establishes some inequalities that play an important role in embedding theorems and other results related to the elliptic theory and partial differential equations. There are many of these inequalities, but we will discuss some of the important ones that will be used later and may provide the foundations for other inequalities. In particular, we will study estimate inequalities in Sobolev or Hölder spaces in the following forms:
(1) $\|u\|_L \le C \|Du\|_L$.
(2) $\|u\|_L \le C \|u\|_W$.
(3) $\|u\|_C \le C \|u\|_W$.
Here, $L$ refers to an arbitrary Lebesgue space $L^p$, $W$ refers to a Sobolev space, and $C$ refers to a Hölder space. A main requirement of all these inequalities is that the constant $C$ of the estimate must be kept independent of the function $u$, otherwise the estimate will lose its power and efficiency in applications and in producing further inequalities and other embedding results. Another concern is the conjugate we need for a number $p$. Recall that in measure theory, the conjugate of a number $p$ is the number $q$ such that $p^{-1} + q^{-1} = 1$. This parameter was required to guarantee the validity of Hölder's inequality, which is a fundamental inequality in the theory of Lebesgue spaces, and this is the reason why it is sometimes known as the "Hölder conjugate". Likewise, the conjugate needed for the number $p$ to obtain Sobolev inequalities shall be called the Sobolev conjugate, and will be denoted by $p^*$. Inequality (1) above is fundamental in this subject and plays the same role as Hölder's inequality in Lebesgue spaces; therefore, it is important to establish this inequality. The most basic form of inequality (1) takes the following form: If $u \in C^1[a, b]$ then
\[ \|u\|_{L^1[a,b]} \le C \|u'\|_\infty. \]
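Some normalization is needed here, since nonzero constants violate the estimate; assume, say, that $u(a) = 0$. Then the estimate can be easily seen, since $u$ is absolutely continuous on $[a, b]$ and
\[ u(x) = \int_a^x u'(t) \, dt, \]
so
\[ |u(x)| \le \int_a^b |u'(t)| \, dt \le (b - a) \max_{[a,b]} |u'|, \]
and integrating over $[a, b]$ gives $\|u\|_{L^1[a,b]} \le (b - a)^2 \|u'\|_\infty$, i.e., the constant $C$ depends only on the domain $[a, b]$. How can we extend this estimate to more general Lebesgue spaces? What if the estimate is taken over $\mathbb{R}^n$ instead of $\mathbb{R}$? Let us assume that for some $p, q \ge 1$,
\[ \|u\|_{L^q(\mathbb{R}^n)} \le C \|Du\|_{L^p(\mathbb{R}^n)}. \tag{3.9.1} \]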
We observe two points:
(1) The estimate is taken over $\mathbb{R}^n$ since it would be easy to extend the results to any open bounded set of $\mathbb{R}^n$ by the extension techniques studied in the previous section, so obtaining the estimate over $\mathbb{R}^n$ is the key to other general results.
(2) The estimate is not valid for arbitrary values of $p, q$ (one can find several counterexamples). So with a given $p$, the value of $q$ required to validate the inequality shall be defined as the Sobolev conjugate of $p$.

It turns out that a suitable function should be invoked in order to test the inequality and help us determine appropriate values of $q$. The best function to deal with is a function in $C_c^1$ since it will be contained in all such spaces. We will use the standard mollifier of (2.3.14)
\[ \phi_\epsilon(x) = \frac{1}{\epsilon^n} \phi\!\left( \frac{x}{\epsilon} \right) \in C_c^1(\mathbb{R}^n), \]
where $\phi$ is the function defined in (3.2.5). Then integrating $|\phi_\epsilon|^q$ over $\mathbb{R}^n$ using the change of variable $x = \epsilon y$ gives
\[ \int_{\mathbb{R}^n} |\phi_\epsilon(x)|^q \, dx = \epsilon^{n - nq} \int_{\mathbb{R}^n} |\phi(y)|^q \, dy, \]
so
\[ \|\phi_\epsilon\|_{L^q(\mathbb{R}^n)} = \epsilon^{\frac{n}{q} - n} \|\phi\|_{L^q(\mathbb{R}^n)}. \tag{3.9.2} \]
A similar argument for $D(\phi_\epsilon)$ gives
\[ \|D\phi_\epsilon\|_{L^p(\mathbb{R}^n)} = \epsilon^{\frac{n}{p} - n - 1} \|D\phi\|_{L^p(\mathbb{R}^n)}. \tag{3.9.3} \]
Applying (3.9.1) to $\phi_\epsilon$ and combining (3.9.2) and (3.9.3), we obtain
\[ \|\phi\|_{L^q(\mathbb{R}^n)} \le C \epsilon^{\alpha} \|D\phi\|_{L^p(\mathbb{R}^n)}, \qquad \text{where } \alpha = \frac{n}{p} - \frac{n}{q} - 1. \]
Letting $\epsilon \to 0$ (if $\alpha > 0$) or $\epsilon \to \infty$ (if $\alpha < 0$) leads to a contradiction, so the exponent must be zero, that is,
\[ \frac{n}{p} - \frac{n}{q} - 1 = 0, \]
which implies that
\[ q = \frac{np}{n - p}. \]
Writing $p^*$ for this value of $q$, a simple calculation gives
\[ \frac{1}{p} - \frac{1}{p^*} = \frac{1}{n}. \]
Thus, this will be defined as the Sobolev conjugate (or Sobolev exponent) of $p$ for all $p \in [1, n)$.

Definition 3.9.1 (Sobolev Exponent) Let $p \in [1, n)$. Then the Sobolev exponent of $p$ is
\[ p^* = \frac{np}{n - p}. \]

Remark
(1) For the definition to make sense, we should have $1 \le p < n$. For $p = n$, we agree to have $p^* = \infty$.
(2) The new conjugate definition takes into account the space $\mathbb{R}^n$, but it cannot be reduced to the classical Hölder conjugate $q^{-1} + p^{-1} = 1$, as is apparent from the definition of $p^*$, so it cannot be regarded as a generalization of the Hölder conjugate.
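The scaling argument is easy to replay with exact rational arithmetic. The helper names `sobolev_exponent` and `scaling_exponent` below are ours and serve only as an illustration: the first evaluates $p^*$, the second the exponent $\alpha = n/p - n/q - 1$, which vanishes exactly at $q = p^*$.

```python
from fractions import Fraction

def sobolev_exponent(p, n):
    # p* = n p / (n - p), defined only for 1 <= p < n.
    p = Fraction(p)
    if not (1 <= p < n):
        raise ValueError("the Sobolev exponent requires 1 <= p < n")
    return n * p / (n - p)

def scaling_exponent(p, q, n):
    # alpha = n/p - n/q - 1 from the scaling argument;
    # an estimate of the form (3.9.1) can only hold when alpha = 0.
    return Fraction(n) / Fraction(p) - Fraction(n) / Fraction(q) - 1

p_star = sobolev_exponent(2, 3)                     # n = 3, p = 2
alpha_at_p_star = scaling_exponent(2, p_star, 3)
alpha_elsewhere = scaling_exponent(2, 4, 3)
print(p_star, alpha_at_p_star, alpha_elsewhere)
```

For $n = 3$ and $p = 2$ one gets $p^* = 6$ with $\alpha = 0$, whereas the choice $q = 4$ leaves $\alpha = -1/4 \neq 0$ and is therefore ruled out on $\mathbb{R}^n$.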
3.9.2 Fundamental Inequalities

Now, we come to the next stage: proving the inequality, with the assumption that $q = p^*$. Before we prove the inequality, we recall some basic inequalities from the theory of Lebesgue spaces.

Theorem 3.9.2 (Extended Hölder's Inequality) Let $p_1, \dots, p_n$ be positive numbers such that
\[ \sum_{i=1}^{n} \frac{1}{p_i} = 1. \]
Let $u_i \in L^{p_i}(\Omega)$ for $i = 1, \dots, n$. Then
\[ \prod_{i=1}^{n} u_i \in L^1(\Omega), \qquad \text{and} \qquad \left\| \prod_{i=1}^{n} u_i \right\|_1 \le \prod_{i=1}^{n} \|u_i\|_{p_i}. \]

Proof Use induction on Hölder's inequality.
Theorem 3.9.3 (Nested Inequality) Let $u \in L^q(\Omega)$ for some bounded measurable set $\Omega$ of measure $\mu(\Omega) = M$. If $1 \le p < q$, then $u \in L^p(\Omega)$ and
\[ \|u\|_p \le C \|u\|_q, \qquad \text{where } C = M^{\frac{1}{p} - \frac{1}{q}} = C(p, q, \Omega). \]

Proof Let $v = |u|^p$; then $v \in L^{q/p}(\Omega)$. Let $r = \frac{q}{p}$ and let $s$ be the conjugate of $r$. Using Hölder's inequality on $v \in L^r(\Omega)$ and $1 \in L^s(\Omega)$ gives
\[ \int_\Omega |u|^p = \int_\Omega |v| \le \left( \int_\Omega |u|^{pr} \right)^{1/r} (\mu(\Omega))^{1/s}. \]
Taking the power $\frac{1}{p}$ of both sides of the inequality, given that $pr = q$ and
\[ \frac{1}{sp} = \frac{q - p}{qp}, \]
the result follows.
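The previous result holds for bounded domains $\Omega$. If $\Omega$ is unbounded, then we use the following inequality.

Theorem 3.9.4 (Interpolation Inequality) Let $1 \le p < r < \infty$ and let $\Omega \subseteq \mathbb{R}^n$ be an arbitrary measurable set. If $u \in L^p(\Omega) \cap L^r(\Omega)$, then $u \in L^q(\Omega)$ for any $q \in [p, r]$ and
\[ \|u\|_q \le \|u\|_p^{\theta} \cdot \|u\|_r^{1 - \theta} \]
for some $\theta \in [0, 1]$ such that
\[ \frac{1}{q} = \frac{\theta}{p} + \frac{1 - \theta}{r}. \]

Proof Note that
\[ 1 = \frac{\theta q}{p} + \frac{(1 - \theta) q}{r}. \]
Since $u \in L^p(\Omega)$, we have $|u|^{\theta q} \in L^{\frac{p}{\theta q}}(\Omega)$, and also since $u \in L^r(\Omega)$, we have $|u|^{(1 - \theta) q} \in L^{\frac{r}{(1 - \theta) q}}(\Omega)$. So we write
\[ \int_\Omega |u|^q \, dx = \int_\Omega |u|^{\theta q} |u|^{(1 - \theta) q} \, dx, \]
then using Hölder's inequality on $|u|^{\theta q}$ and $|u|^{(1 - \theta) q}$ gives
\[ \|u\|_q^q \le \left\| |u|^{\theta q} \right\|_{\frac{p}{\theta q}} \cdot \left\| |u|^{(1 - \theta) q} \right\|_{\frac{r}{(1 - \theta) q}} = \|u\|_p^{\theta q} \cdot \|u\|_r^{(1 - \theta) q}. \]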
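Both the nested inequality (Theorem 3.9.3) and the interpolation inequality (Theorem 3.9.4) hold on any measure space, so they can be sanity-checked on a finite set with point masses. The data below (`values`, `weights`, and the exponents) are hypothetical samples chosen for illustration only.

```python
def lp_norm(values, weights, p):
    # Discrete L^p norm (sum_i w_i |u_i|^p)^(1/p) for a finite measure with masses w_i.
    return sum(w * abs(v) ** p for v, w in zip(values, weights)) ** (1.0 / p)

values = [0.3, -1.7, 2.5, 0.0, 4.1]    # hypothetical sample of u
weights = [0.5, 0.2, 1.0, 0.3, 0.4]    # hypothetical point masses
M = sum(weights)                        # total measure of the space

# Nested inequality: ||u||_p <= M^(1/p - 1/q) ||u||_q for p < q.
p, q = 1.5, 4.0
nested_lhs = lp_norm(values, weights, p)
nested_rhs = M ** (1.0 / p - 1.0 / q) * lp_norm(values, weights, q)

# Interpolation: ||u||_q <= ||u||_p^theta ||u||_r^(1 - theta), 1/q = theta/p + (1 - theta)/r.
p2, r2, theta = 1.0, 4.0, 0.25
q2 = 1.0 / (theta / p2 + (1.0 - theta) / r2)
interp_lhs = lp_norm(values, weights, q2)
interp_rhs = lp_norm(values, weights, p2) ** theta * lp_norm(values, weights, r2) ** (1.0 - theta)
print(nested_lhs <= nested_rhs, interp_lhs <= interp_rhs)
```

Only the nested inequality uses the total mass $M$; the interpolation bound involves no measure of the domain, which is why it remains available on unbounded sets.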
3.9.3 Gagliardo–Nirenberg–Sobolev Inequality

Now we come to our Sobolev inequalities. Our first inequality is fundamental, and is known as the Gagliardo–Nirenberg–Sobolev inequality. Gagliardo and Nirenberg proved the inequality for the case $p = 1$, and Sobolev for $1 < p < n$ in the space $\mathbb{R}^n$. The first three inequalities are known as "Gagliardo–Nirenberg–Sobolev inequalities", although the first of them (Theorem 3.9.5) is the most well known.

Theorem 3.9.5 (Gagliardo–Nirenberg–Sobolev Inequality I) Let $1 \le p < n$ and $u \in C_c^1(\mathbb{R}^n)$. Then
\[ \|u\|_{L^{p^*}(\mathbb{R}^n)} \le C \|Du\|_{L^p(\mathbb{R}^n)}. \]

Proof We first prove the case $p = 1$. By the Fundamental Theorem of Calculus, we write
\[ u(x) = \int_{-\infty}^{x_i} \frac{\partial u}{\partial x_i}(x_1, \dots, x_{i-1}, t_i, x_{i+1}, \dots, x_n) \, dt_i. \]
Then
\[ |u(x)| \le \int_{-\infty}^{\infty} |D_{x_i} u| \, dt_i, \qquad i = 1, \dots, n. \]
Multiply the $n$ inequalities and take the $(n - 1)$th root:
\[ |u(x)|^{\frac{n}{n-1}} \le \prod_{i=1}^{n} \left( \int_{-\infty}^{\infty} |D_{x_i} u| \, dt_i \right)^{\frac{1}{n-1}}. \]
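Now, we integrate with respect to $x_1$ over $\mathbb{R}$, then we use the extended Hölder's inequality:
\[ \begin{aligned} \int_{-\infty}^{\infty} |u(x)|^{\frac{n}{n-1}} \, dx_1 &\le \left( \int_{-\infty}^{\infty} |D_{x_1} u| \, dt_1 \right)^{\frac{1}{n-1}} \cdot \int_{-\infty}^{\infty} \prod_{i=2}^{n} \left( \int_{-\infty}^{\infty} |D_{x_i} u| \, dt_i \right)^{\frac{1}{n-1}} dx_1 \\ &\le \left( \int_{-\infty}^{\infty} |D_{x_1} u| \, dt_1 \right)^{\frac{1}{n-1}} \cdot \prod_{i=2}^{n} \left( \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} |D_{x_i} u| \, dx_1 \, dt_i \right)^{\frac{1}{n-1}}. \end{aligned} \]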
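We integrate with respect to $x_2$ over $\mathbb{R}$, and we repeat this argument successively until $x_n$ to obtain
\[ \int_{\mathbb{R}^n} |u(x)|^{\frac{n}{n-1}} \, dx \le \prod_{i=1}^{n} \left( \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} |D_{x_i} u| \, dx_1 \dots dt_i \dots dx_n \right)^{\frac{1}{n-1}} \le \left( \int_{\mathbb{R}^n} |Du| \, dx \right)^{\frac{n}{n-1}}. \tag{3.9.4} \]
This establishes the inequality for $p = 1$. For $1 < p < n$, let $v = |u|^{\alpha}$, where $\alpha > 1$ is to be determined later, and substitute $v$ for $u$ in (3.9.4). Then
\[ \left( \int_{\mathbb{R}^n} |u(x)|^{\frac{\alpha n}{n-1}} \, dx \right)^{\frac{n-1}{n}} \le \int_{\mathbb{R}^n} \left| D |u|^{\alpha} \right| dx = \alpha \int_{\mathbb{R}^n} |u|^{\alpha - 1} |Du| \, dx. \]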
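We apply Hölder's inequality to the last term, taking into account that $|Du| \in L^p$ and $|u|^{\alpha - 1} \in L^q$, where $q$ is the Hölder conjugate of $p$ such that $p^{-1} + q^{-1} = 1$. Consequently
\[ q = \frac{p}{p - 1}. \]
This gives
\[ \left( \int_{\mathbb{R}^n} |u(x)|^{\frac{\alpha n}{n-1}} \, dx \right)^{\frac{n-1}{n}} \le \alpha \left( \int_{\mathbb{R}^n} |u|^{\frac{(\alpha - 1) p}{p - 1}} \, dx \right)^{\frac{p-1}{p}} \left( \int_{\mathbb{R}^n} |Du|^p \, dx \right)^{1/p}. \tag{3.9.5} \]
Now, we choose $\alpha$ such that the powers of $u$ on both sides of the above inequality are equal to $p^*$, i.e.
\[ \frac{\alpha n}{n - 1} = \frac{p(\alpha - 1)}{p - 1} = \frac{np}{n - p} = p^*. \]
This gives
\[ \alpha = \frac{p(n - 1)}{n - p}. \]
Substitute in (3.9.5), and divide both sides of the inequality by the first factor on the RHS, noting that
\[ \frac{n - 1}{n} - \frac{p - 1}{p} = \frac{n - p}{np} = \frac{1}{p^*}. \]
We thus obtain
\[ \left( \int_{\mathbb{R}^n} |u(x)|^{p^*} \, dx \right)^{1/p^*} \le \frac{p(n - 1)}{n - p} \left( \int_{\mathbb{R}^n} |Du|^p \, dx \right)^{1/p}, \]
or
\[ \|u\|_{L^{p^*}(\mathbb{R}^n)} \le C \|Du\|_{L^p(\mathbb{R}^n)}, \quad \text{for } C = C(n, p). \]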
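The bookkeeping of exponents in the last step can be verified mechanically. The sketch below uses exact fractions; the function name `gns_exponents` and the sample values of $p$ and $n$ are illustrative assumptions.

```python
from fractions import Fraction

def gns_exponents(p, n):
    # Exponents appearing in (3.9.5) for the choice alpha = p(n - 1)/(n - p), 1 < p < n.
    p = Fraction(p)
    alpha = p * (n - 1) / (n - p)
    left_power = alpha * n / (n - 1)           # power of |u| on the left of (3.9.5)
    right_power = (alpha - 1) * p / (p - 1)    # power of |u| on the right of (3.9.5)
    p_star = n * p / (n - p)
    # Exponent left on the norm of u after dividing: (n - 1)/n - (p - 1)/p.
    leftover = Fraction(n - 1, n) - (p - 1) / p
    return alpha, left_power, right_power, p_star, leftover

alpha, left_power, right_power, p_star, leftover = gns_exponents(2, 5)
print(alpha, left_power, right_power, p_star, leftover)
```

With $p = 2$ and $n = 5$ this gives $\alpha = 8/3$, both powers equal to $p^* = 10/3$, and a leftover exponent $3/10 = 1/p^*$, in agreement with the computation above.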
Remark Note that the proof of the case $1 < p < n$ cannot be carried over to the case $p \ge n$, since the choice of $\alpha$ would then be invalid.

Having an $L^p$ norm on the RHS of an inequality is always advantageous because we can always engage $W^{k,p}$ in the inequality due to the fact that $\|u\|_{L^p} \le \|u\|_{W^{k,p}}$ whenever $u \in W^{k,p}$. This idea will be applied next.

Corollary 3.9.6 Let $1 \le p < n$ and $u \in C_c^1(\mathbb{R}^n)$. Then
\[ \|u\|_{L^{p^*}(\mathbb{R}^n)} \le C \|u\|_{W^{1,p}(\mathbb{R}^n)}. \]

Proof This follows from the fact that $u \in C_c^1(\mathbb{R}^n) \subset W^{1,p}(\mathbb{R}^n)$, hence $\|Du\|_{L^p} \le \|u\|_{W^{1,p}}$.

The next result extends the corollary to include not only $L^{p^*}(\mathbb{R}^n)$, but also $L^q(\mathbb{R}^n)$ for all $q \in [p, p^*]$.

Theorem 3.9.7 Let $1 \le p < n$ and $u \in C_c^1(\mathbb{R}^n)$. Then
\[ \|u\|_{L^q(\mathbb{R}^n)} \le C \|u\|_{W^{1,p}(\mathbb{R}^n)} \quad \text{for all } q \in [p, p^*]. \]

Proof We use the interpolation inequality with $r = p^*$ to obtain
\[ \|u\|_{L^q(\mathbb{R}^n)} \le \|u\|_{L^p(\mathbb{R}^n)}^{\theta} \cdot \|u\|_{L^r(\mathbb{R}^n)}^{1 - \theta}. \]
Note that
\[ 1 = \frac{1}{p'} + \frac{1}{q'}, \qquad \text{where } p' = \frac{1}{\theta} > 1, \quad q' = \frac{1}{1 - \theta} > 1. \]
So Young's inequality can be used to rewrite the above estimate as
\[ \begin{aligned} \|u\|_{L^q(\mathbb{R}^n)} &\le \theta \|u\|_{L^p(\mathbb{R}^n)} + (1 - \theta) \|u\|_{L^r(\mathbb{R}^n)} \\ &\le \|u\|_{L^p(\mathbb{R}^n)} + \|u\|_{L^r(\mathbb{R}^n)} \\ &\le \|u\|_{L^p(\mathbb{R}^n)} + C \|Du\|_{L^p(\mathbb{R}^n)} \quad \text{(by the GNS inequality)} \\ &\le C \|u\|_{W^{1,p}(\mathbb{R}^n)}. \end{aligned} \]
3.9.4 Poincaré Inequality

The next step is to generalize the above results in two ways. In particular, we will establish the inequalities for any Sobolev function in $W^{1,p}$ rather than $C_c^1$, and on any bounded open set rather than the whole space. Of course, we need the Meyers–Serrin Theorem for the former idea, and the extension operator for the latter. The second inequality (Theorem 3.9.7) shall also be generalized to hold for any Sobolev function in $W^{1,p}$ and on any bounded open set. The first inequality is the famous Poincaré inequality, which is one of the most useful and important inequalities in the theory of PDEs. Note that $q$ here is just a parameter that does not play the role of the Hölder conjugate.

Theorem 3.9.8 (Poincaré Inequality) Let $\Omega$ be open, $C^1$ and bounded in $\mathbb{R}^n$, and let $1 \le p < n$. If $u \in W_0^{1,p}(\Omega)$ then $u \in L^q(\Omega)$ for all $1 \le q \le p^*$, and
\[ \|u\|_{L^q(\Omega)} \le C \|Du\|_{L^p(\Omega)}, \]
where $C = C(p, q, n, \Omega)$. Consequently, we have
\[ \|u\|_{L^q(\Omega)} \le C \|u\|_{W^{1,p}(\Omega)}. \]

Proof By the Meyers–Serrin Theorem, there exists $u_j \in C_c^\infty(\Omega)$ such that $u_j \to u$ in $W_0^{1,p}(\Omega)$. Extend $u_j$ by zero to $\mathbb{R}^n$ and still call them $u_j$, so $u_j \in C_c^\infty(\mathbb{R}^n) \subset C_c^1(\mathbb{R}^n)$. By the GNS inequality
\[ \|u_j\|_{L^{p^*}(\Omega)} = \|u_j\|_{L^{p^*}(\mathbb{R}^n)} \le C \|Du_j\|_{L^p(\mathbb{R}^n)} = C \|Du_j\|_{L^p(\Omega)}, \tag{3.9.6} \]
and
\[ \|u_j - u_i\|_{L^{p^*}(\mathbb{R}^n)} \le C \|Du_j - Du_i\|_{L^p(\mathbb{R}^n)} \to 0. \]
Hence, $\{u_j\}$ is Cauchy in $L^{p^*}(\mathbb{R}^n)$, which is Banach, so $u_j \to v \in L^{p^*}(\mathbb{R}^n)$, hence $v = u$. Take the limit of both sides of (3.9.6) using Fatou's Lemma (Theorem 1.1.5):
\[ \|u\|_{L^{p^*}(\Omega)} \le C \|Du\|_{L^p(\Omega)}. \]
Since $q \le p^*$, the result now follows from the nested inequality (Theorem 3.9.3), and the second inequality follows from the fact that $\|Du\|_{L^p} \le \|u\|_{W^{1,p}}$.

Remark The particular case when $q = p$ in the first estimate is the classical Poincaré inequality. In this case, the inequality holds for $1 \le p < \infty$ since we always have $p < p^*$ for all $p, n$.

We infer from the above inequality that if we measure the size of a function in $W_0^{1,p}$ by the $p$-norm, then its size will be bounded above by the size of its weak derivative.
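A one-dimensional illustration of the classical case $q = p = 2$ on $\Omega = (0, 1)$ may help fix ideas. This setting lies outside the range $p < n$ of Theorem 3.9.8 and is used here only as a sketch; the test function $u(x) = x(1 - x)$ and the comparison with the sharp constant $1/\pi$ (a standard fact about functions vanishing at both endpoints, not taken from the text) are our assumptions.

```python
import math

def l2_norm_on_unit_interval(f, cells=20000):
    # Midpoint rule for (int_0^1 f(x)^2 dx)^(1/2).
    h = 1.0 / cells
    return math.sqrt(sum(f((i + 0.5) * h) ** 2 for i in range(cells)) * h)

def u(x):
    # Test function vanishing at both endpoints of (0, 1).
    return x * (1.0 - x)

def du(x):
    # Classical derivative of u.
    return 1.0 - 2.0 * x

ratio = l2_norm_on_unit_interval(u) / l2_norm_on_unit_interval(du)
print(ratio, 1.0 / math.pi)
```

The ratio $\|u\|_{L^2} / \|u'\|_{L^2}$ equals $\sqrt{1/10} \approx 0.316$ for this function, just below $1/\pi \approx 0.318$, so the size of $u$ is indeed controlled by the size of its derivative with a constant depending only on the interval.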
3.9.5 Estimate for $W^{1,p}$

The next theorem is concerned with the generalization of Theorem 3.9.7.

Theorem 3.9.9 (Sobolev's Inequality) Let $\Omega$ be open, $C^1$ and bounded in $\mathbb{R}^n$, and let $1 \le p < n$. If $u \in W^{1,p}(\Omega)$, then $u \in L^q(\Omega)$ for all $1 \le q \le p^*$ and
\[ \|u\|_{L^q(\Omega)} \le C \|u\|_{W^{1,p}(\Omega)}, \]
where $C = C(p, q, n, \Omega)$.

Proof We proceed the same as we did in the previous inequality. The details are left to the reader as an exercise.
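3.9.6 The Case $p = n$

All the above results hold for $p < n$. For the borderline case $p = n$, we see that $p^* = \infty$. So in fact we have the following:

Theorem 3.9.10 (Sobolev's Inequality) If $u \in W^{1,n}(\mathbb{R}^n)$, $n \ge 2$, then $u \in L^q(\mathbb{R}^n)$ for all $n \le q < \infty$ and
\[ \|u\|_{L^q(\mathbb{R}^n)} \le C \|u\|_{W^{1,n}(\mathbb{R}^n)}. \]

Proof Substituting $p = \alpha = n > 1$ in (3.9.5), it becomes
\[ \left( \int_{\mathbb{R}^n} |u(x)|^{\frac{n^2}{n-1}} \, dx \right)^{\frac{n-1}{n}} \le n \left( \int_{\mathbb{R}^n} |u|^n \, dx \right)^{\frac{n-1}{n}} \left( \int_{\mathbb{R}^n} |Du|^n \, dx \right)^{1/n}, \]
which implies
\[ \|u\|_{L^{r_1}(\mathbb{R}^n)}^{n} \le n \|u\|_{L^n(\mathbb{R}^n)}^{n-1} \cdot \|Du\|_{L^n(\mathbb{R}^n)}, \qquad \text{where } r_1 = \frac{n^2}{n - 1}. \]
We apply Young's inequality with $p = \frac{n}{n-1}$ and $q = n$. This gives
\[ \|u\|_{L^{r_1}(\mathbb{R}^n)}^{n} \le n \left[ \frac{n - 1}{n} \|u\|_{L^n(\mathbb{R}^n)}^{n} + \frac{1}{n} \|Du\|_{L^n(\mathbb{R}^n)}^{n} \right] \le n \left[ \|u\|_{L^n(\mathbb{R}^n)}^{n} + \|Du\|_{L^n(\mathbb{R}^n)}^{n} \right]. \]
Now, take the power $\frac{1}{n}$ of both sides of the inequality and make use of the equivalence of norms:
\[ \|u\|_{L^{r_1}(\mathbb{R}^n)} \le n^{1/n} \left[ \|u\|_{L^n(\mathbb{R}^n)}^{n} + \|Du\|_{L^n(\mathbb{R}^n)}^{n} \right]^{1/n} \le C \|u\|_{W^{1,n}(\mathbb{R}^n)}. \tag{3.9.7} \]
Now, we have $1 < n < r_1$, so we apply the interpolation inequality for all $q \in [n, r_1]$ and make use of (3.9.7):
\[ \|u\|_{L^q(\mathbb{R}^n)} \le \|u\|_{L^{r_1}(\mathbb{R}^n)}^{\theta} \cdot \|u\|_{L^n(\mathbb{R}^n)}^{1 - \theta} \le C \|u\|_{W^{1,n}(\mathbb{R}^n)}^{\theta} \cdot \|u\|_{W^{1,n}(\mathbb{R}^n)}^{1 - \theta} = C \|u\|_{W^{1,n}(\mathbb{R}^n)}. \]
We can repeat the same argument for $p = n$ and $\alpha = n + 1$. This will also give us the same estimate $\|u\|_{L^q(\mathbb{R}^n)} \le C \|u\|_{W^{1,n}(\mathbb{R}^n)}$ for all $q \in [n + 1, r_2]$, where
\[ r_2 = \frac{n(n + 1)}{n - 1}. \]
Repeating the argument successively for $\alpha = n + 2, n + 3, \dots$, by induction, we obtain the same estimate for all finite $q \ge n$, since $r_k \to \infty$ as $k \to \infty$. This completes the proof.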
3.9.7 Hölder Spaces

We will discuss some inequalities that connect Sobolev spaces to Hölder spaces. Thus, we need to review some facts about Hölder spaces.

Definition 3.9.11 (Hölder-Continuous Function) A function $u : \Omega \to \mathbb{R}$ is called Hölder continuous with exponent $\beta \in (0, 1]$ if there exists a constant $C > 0$ such that for all $x, y \in \Omega$
\[ |u(x) - u(y)| \le C |x - y|^{\beta}. \]
It is obvious that a Hölder continuous function is Lipschitz continuous for $\beta = 1$, and a $\beta$th Hölder continuous function is uniformly continuous for any $\beta > 0$. It can be easily seen that for $\beta > 1$ the function is constant. One interesting property of Hölder functions which can be concluded from the definition is that
\[ [u]_{0,\beta} = \sup_{x \ne y} \frac{|u(x) - u(y)|}{|x - y|^{\beta}} < \infty. \]
In general, we write
\[ \|u\|_{k,\beta} = \sum_{|\alpha| \le k} \|D^{\alpha} u\|_\infty + \sum_{|\alpha| = k} [D^{\alpha} u]_{0,\beta}. \]
This will be used to characterize Hölder functions in their spaces.

Definition 3.9.12 (Hölder Space) Let $0 < \beta \le 1$. Then, the Hölder space is the normed space
\[ C^{k,\beta}(\overline{\Omega}) = \{ u : \Omega \to \mathbb{R} \text{ such that } u \in C^k(\overline{\Omega}) \text{ and } \|u\|_{k,\beta} < \infty \}. \]
Here, it is important to note that $u \in C^{k,\beta}(\overline{\Omega})$ is a statement about the derivatives of $u$: it is the $k$th order partial derivatives of $u$ that are required to be $\beta$th Hölder continuous. It can be shown that the space $C^{k,\beta}(\overline{\Omega})$ endowed with the norm $\|u\|_{k,\beta}$ is a Banach space. The reason for studying this type of space in this section is that for relatively high values of $p$ ($p > n$), Sobolev functions embed in a Hölder space.
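The seminorm $[u]_{0,\beta}$ can be estimated from below on a finite grid. The sketch below takes $u(x) = \sqrt{x}$ on $[0, 1]$, which is our illustrative choice: it is Hölder continuous with exponent $1/2$ but not Lipschitz, so the sampled quotient stays bounded for $\beta = 1/2$ and grows with the grid resolution for $\beta = 1$.

```python
import math

def holder_seminorm(f, points, beta):
    # Largest quotient |f(x) - f(y)| / |x - y|**beta over distinct sample pairs
    # (a lower bound for the true supremum).
    best = 0.0
    for i, x in enumerate(points):
        for y in points[i + 1:]:
            best = max(best, abs(f(x) - f(y)) / abs(x - y) ** beta)
    return best

grid = [i / 200.0 for i in range(201)]
half = holder_seminorm(math.sqrt, grid, 0.5)   # stays at about 1
lip = holder_seminorm(math.sqrt, grid, 1.0)    # grows as the grid is refined
print(half, lip)
```

The value for $\beta = 1/2$ reflects the elementary bound $|\sqrt{x} - \sqrt{y}| \le |x - y|^{1/2}$, while the value for $\beta = 1$ is attained near the origin, where the derivative of $\sqrt{x}$ blows up.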
3.9.8 The Case $p > n$

This case will be illustrated through what is known as Morrey's inequality. The next lemma is useful in proving the inequality.

Lemma 3.9.13 Let $u \in C_c^1(\mathbb{R}^n)$ and $p > n$. Then
(1) For all $r > 0$, we have for some constant $C_1 = C(n) > 0$
\[ \frac{1}{|B_r(x)|} \int_{B_r(x)} |u(y) - u(x)| \, dy \le C_1 \int_{B_r(x)} \frac{|Du(y)|}{|x - y|^{n-1}} \, dy. \]
(2) If $q$ is the Hölder conjugate of $p$ (i.e., $q = \frac{p}{p-1}$), then for some constant $C_2 = C(n, p)$ we have
\[ \int_{B_r(x)} \frac{1}{|x - y|^{(n-1)q}} \, dy = C_2 \cdot r^{(p - n)/(p - 1)}. \]
Proof Let $y = x + tv$ for $x, y \in \mathbb{R}^n$, where $0 \le t \le r$ and $v$ is a unit vector, so $v \in S_1^{n-1}(0) = S$. Then
\[ |u(x + tv) - u(x)| = \left| \int_0^t Du(x + \tau v) \cdot v \, d\tau \right| \le \int_0^t |Du(x + \tau v)| \, d\tau. \]
Integrate over the unit sphere $S$:
\[ \begin{aligned} \int_S |u(x + tv) - u(x)| \, dS &\le \int_0^t \int_S |Du(x + \tau v)| \, dS \, d\tau \\ &= \int_0^t \int_S \frac{|Du(x + \tau v)|}{\tau^{n-1}} \, \tau^{n-1} \, dS \, d\tau \\ &= \int_0^t \int_S \frac{|Du(x + \tau v)|}{|x + \tau v - x|^{n-1}} \, \tau^{n-1} \, dS \, d\tau. \end{aligned} \]
Now, substitute $y = x + \tau v$, and note that the inner integral is over the $(n - 1)$-dimensional sphere $S_\tau^{n-1}(x)$ of radius $\tau \le t$. Converting to polar coordinates, given that
\[ \left| S_\tau^{n-1}(x) \right| = \tau^{n-1} \left| S_1^{n-1}(0) \right|, \]
where $|S|$ denotes the surface area, the above integral becomes
\[ \int_S |u(x + tv) - u(x)| \, dS \le \int_0^t \int_{S_\tau(x)} \frac{|Du(y)|}{|y - x|^{n-1}} \, dS \, d\tau. \]
This is an $(n - 1)$-dimensional integration over the sphere of radius $\tau$ followed by an integration over the radius $0 \le \tau \le t$, and this gives an $n$-dimensional integration over the whole ball $B_t(x) \subseteq B_r(x)$. So we have
\[ \int_S |u(x + tv) - u(x)| \, dS \le \int_{B_r(x)} \frac{|Du(y)|}{|y - x|^{n-1}} \, dy. \]
Multiplying both sides by $t^{n-1}$, and then integrating both sides with respect to $t$ from $0$ to $r$ yields
\[ \int_0^r \int_S |u(x + tv) - u(x)| \, t^{n-1} \, dS \, dt \le \int_0^r t^{n-1} \int_{B_r(x)} \frac{|Du(y)|}{|y - x|^{n-1}} \, dy \, dt. \]
Again, the integration on the LHS is an integration over the ball $B_r(x)$, and $t^{n-1} \, dt$ can be integrated using the usual calculus rules. This gives
\[ \int_{B_r(x)} |u(y) - u(x)| \, dy \le \frac{r^n}{n} \int_{B_r(x)} \frac{|Du(y)|}{|y - x|^{n-1}} \, dy. \]
Note that the coefficient
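\[ \frac{r^n}{n} = C(r, n) \]
depends also on $r$. To eliminate $r$, we divide both sides of the inequality by the volume of the $n$-dimensional ball
\[ |B_r(x)| = \frac{\pi^{n/2} r^n}{\Gamma\!\left( \frac{n}{2} + 1 \right)}, \]
where $\Gamma$ is the gamma function. This gives the inequality in (1) with
\[ C_1 = \frac{\Gamma\!\left( \frac{n}{2} + 1 \right)}{n \pi^{n/2}} = C(n). \]
To prove (2), we use polar coordinates, letting $\rho = |x - y|$ and $dy = \rho^{n-1} \, d\rho$ (the angular integration contributes only the factor $|S_1^{n-1}(0)|$, which depends only on $n$). We obtain
\[ \int_{B_r(x)} \frac{1}{|y - x|^{(n-1)p/(p-1)}} \, dy = \int_0^r \rho^{(1 - n)p/(p - 1)} \rho^{n-1} \, d\rho = \int_0^r \rho^{(1 - n)/(p - 1)} \, d\rho = \frac{p - 1}{p - n} \cdot r^{(p - n)/(p - 1)}. \]
Inequality (2) is thus established with
\[ C_2 = \frac{p - 1}{p - n} = C(n, p), \]
up to the angular factor mentioned above.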
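The two constants of Lemma 3.9.13 are easy to evaluate. The sketch below (function names are ours) computes $C_1$ from the gamma function and compares the radial integral $\int_0^r \rho^{(1-n)/(p-1)} \, d\rho$, approximated by a midpoint rule, with the closed form $\frac{p-1}{p-n} r^{(p-n)/(p-1)}$. The integrand is singular at $\rho = 0$, so the quadrature converges slowly and only a loose tolerance should be expected.

```python
import math

def c1(n):
    # C_1 = Gamma(n/2 + 1) / (n * pi^(n/2)), i.e. 1 / (n |B_1|).
    return math.gamma(n / 2.0 + 1.0) / (n * math.pi ** (n / 2.0))

def radial_integral(n, p, r, cells=100000):
    # Midpoint rule for int_0^r rho^((1 - n)/(p - 1)) d rho; integrable at 0 since p > n.
    h = r / cells
    e = (1.0 - n) / (p - 1.0)
    return sum(((i + 0.5) * h) ** e for i in range(cells)) * h

def closed_form(n, p, r):
    # (p - 1)/(p - n) * r^((p - n)/(p - 1))
    return (p - 1.0) / (p - n) * r ** ((p - n) / (p - 1.0))

numeric = radial_integral(2, 4.0, 1.0)
exact = closed_form(2, 4.0, 1.0)
print(c1(2), c1(3), numeric, exact)
```

For $n = 2$, $p = 4$, $r = 1$ the closed form is $3/2$; the values $C_1 = 1/(2\pi)$ for $n = 2$ and $C_1 = 1/(4\pi)$ for $n = 3$ follow from $|B_1| = \pi$ and $|B_1| = 4\pi/3$.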
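Theorem 3.9.14 (Morrey's Inequality) Let $p \in (n, \infty]$. If $u \in C_c^1(\mathbb{R}^n)$, then
\[ \|u\|_{C^{0,\beta}(\mathbb{R}^n)} \le C \|u\|_{W^{1,p}(\mathbb{R}^n)}, \qquad \text{where } \beta = 1 - \frac{n}{p}. \]

Proof We will only prove the case $n < p < \infty$, and the case $p = \infty$ is left to the reader as an exercise (see Problem 3.11.44). The inclusion $u \in C_c^1(\mathbb{R}^n) \subset W^{1,p}(\mathbb{R}^n)$ is clear, so we just need to prove the estimate. Let $x \in \mathbb{R}^n$, and let $|B_1(x)|$ denote the volume of the unit ball centered at $x$. Then we have
\[ |u(x)| = \frac{|B_1(x)|}{|B_1(x)|} |u(x)| = \frac{|u(x)|}{|B_1(x)|} \int_{B_1(x)} 1 \, dy = \frac{1}{|B_1(x)|} \int_{B_1(x)} |u(x)| \, dy, \]
and using the triangle inequality and Lemma 3.9.13(1) with $r = 1$ gives
\[ \begin{aligned} |u(x)| &\le \frac{1}{|B_1(x)|} \left[ \int_{B_1(x)} |u(x) - u(y)| \, dy + \int_{B_1(x)} |u(y)| \, dy \right] \\ &\le C_1 \int_{B_1(x)} \frac{|Du(y)|}{|x - y|^{n-1}} \, dy + \frac{1}{|B_1(x)|} \int_{B_1(x)} |u(y)| \, dy \\ &\le C_1 \left[ \int_{B_1(x)} \frac{|Du(y)|}{|x - y|^{n-1}} \, dy + \int_{B_1(x)} |u(y)| \, dy \right], \end{aligned} \]
where in the last step $C_1$ is enlarged, if necessary, so that $C_1 \ge 1/|B_1(x)|$.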
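Note that $u \in C_c^1(\mathbb{R}^n)$, so $Du \in L^p(\mathbb{R}^n)$. By the nested inequality,
\[ \int_{B_1} |u(y)| \, dy \le C \|u\|_{L^p(\mathbb{R}^n)}, \]
with a constant depending only on $n$ and $p$, which we absorb into $C_1$. In view of Lemma 3.9.13, $|x - y|^{1-n} \in L^q(B_1(x))$, where $q = p' = \frac{p}{p-1}$. Hence, we apply Hölder's inequality to obtain
\[ \begin{aligned} |u(x)| &\le C_1 \int_{B_1(x)} |Du(y)| \, |x - y|^{1-n} \, dy + C_1 \|u\|_{L^p(\mathbb{R}^n)} \\ &\le C_1 \left( \int_{B_1} |Du(y)|^p \, dy \right)^{1/p} \cdot \left( \int_{B_1} |x - y|^{(1-n)p/(p-1)} \, dy \right)^{(p-1)/p} + C_1 \|u\|_{L^p(\mathbb{R}^n)}. \end{aligned} \]
From Lemma 3.9.13(2), with $r = 1$,
\[ \left( \int_{B_1} |x - y|^{(1-n)p/(p-1)} \, dy \right)^{(p-1)/p} = (C_2)^{(p-1)/p}. \]
Substitute this value in the RHS of the above inequality:
\[ |u(x)| \le C_1 C_2^{(p-1)/p} \|Du\|_{L^p(\mathbb{R}^n)} + C_1 \|u\|_{L^p(\mathbb{R}^n)} \le C_3 \|u\|_{W^{1,p}(\mathbb{R}^n)}, \]
where $C_3 = C_1 C_2^{(p-1)/p}$. Taking the supremum over all $x \in \mathbb{R}^n$,
\[ \|u\|_{L^\infty(\mathbb{R}^n)} \le C_3 \|u\|_{W^{1,p}(\mathbb{R}^n)}. \tag{3.9.8} \]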
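Now, let $x, y \in \mathbb{R}^n$ and $r = |x - y| > 0$. Note that
\[ |B_r(x)| = r^n |B_1(x)| = 2^n \left( \frac{r}{2} \right)^n |B_1(x)| = 2^n \left| B_{r/2}(x) \right|. \]
Define the open set $N = B_r(x) \cap B_r(y)$. Letting $z_0 = \frac{x + y}{2}$, it is clear that
\[ \left| B_{r/2}(x) \right| = \left| B_{r/2}(z_0) \right| \le |N| < |B_r(x)| = |B_r(y)|. \]
For every $z \in N$ we can write
\[ |u(x) - u(y)| \le |u(x) - u(z)| + |u(z) - u(y)|. \]
Averaging over $z \in N$ and then enlarging the domains of integration, we get
\[ \begin{aligned} |u(x) - u(y)| &\le \frac{1}{|N|} \left[ \int_{B_r(x)} |u(x) - u(z)| \, dz + \int_{B_r(y)} |u(z) - u(y)| \, dz \right] \\ &\le \frac{1}{\left| B_{r/2}(z_0) \right|} \left[ \int_{B_r(x)} |u(x) - u(z)| \, dz + \int_{B_r(y)} |u(z) - u(y)| \, dz \right] \\ &= \frac{2^n}{|B_r(x)|} \left[ \int_{B_r(x)} |u(x) - u(z)| \, dz + \int_{B_r(y)} |u(z) - u(y)| \, dz \right]. \end{aligned} \]
The two terms are estimated in the same way, so it suffices to bound $2^{n+1} \frac{1}{|B_r(x)|} \int_{B_r(x)} |u(x) - u(z)| \, dz$. Again, using Lemma 3.9.13, then Hölder's inequality,
\[ \begin{aligned} |u(x) - u(y)| &\le 2^{n+1} C_1 \int_{B_r(x)} \frac{|Du(z)|}{|x - z|^{n-1}} \, dz \\ &\le 2^{n+1} C_1 \|Du\|_{L^p(\mathbb{R}^n)} \left( \int_{B_r(x)} |x - z|^{(1-n)p/(p-1)} \, dz \right)^{(p-1)/p} \\ &= 2^{n+1} C_1 C_2^{(p-1)/p} \cdot \|Du\|_{L^p(\mathbb{R}^n)} \left[ r^{(p-n)/(p-1)} \right]^{(p-1)/p}. \end{aligned} \]
Since $\frac{p - n}{p - 1} \cdot \frac{p - 1}{p} = 1 - \frac{n}{p} = \beta$, we obtain
\[ |u(x) - u(y)| \le C \|Du\|_{L^p(\mathbb{R}^n)} r^{\beta}, \tag{3.9.9} \]
where $C = 2^{n+1} C_3 = 2^{n+1} C_1 C_2^{(p-1)/p}$. Now, dividing both sides of (3.9.9) by $r^{\beta} = |x - y|^{\beta}$ gives
\[ \frac{|u(x) - u(y)|}{|x - y|^{\beta}} \le C \|Du\|_{L^p(\mathbb{R}^n)} < \infty. \]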
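Since $x$ and $y$ are arbitrary, we take the supremum over all $x, y \in \mathbb{R}^n$, $x \ne y$:
\[ [u]_{0,\beta} \le C \|Du\|_{L^p(\mathbb{R}^n)}, \]
and using (3.9.8), we obtain
\[ \|u\|_{0,\beta} = \|u\|_\infty + [u]_{0,\beta} \le C_3 \|u\|_{W^{1,p}(\mathbb{R}^n)} + C \|Du\|_{L^p(\mathbb{R}^n)} \le C \|u\|_{W^{1,p}(\mathbb{R}^n)}, \]
and therefore, $u \in C^{0,\beta}(\mathbb{R}^n)$ and
\[ \|u\|_{C^{0,\beta}(\mathbb{R}^n)} \le C \|u\|_{W^{1,p}(\mathbb{R}^n)}, \qquad \text{where } C = 2^{n+1} \, \frac{\Gamma\!\left( \frac{n}{2} + 1 \right)}{n \pi^{n/2}} \left( \frac{p - 1}{p - n} \right)^{\frac{p-1}{p}}. \]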
Morrey's inequality holds for $\mathbb{R}^n$. We can, however, generalize it to hold for any subset of $\mathbb{R}^n$ that satisfies the hypotheses of Theorem 3.8.10, thanks to the extension operator.

Theorem 3.9.15 If $\Omega \subset \mathbb{R}^n$ is bounded, open, and of $C^1$-class, then $W^{1,p}(\Omega) \subset C^{0,\beta}(\overline{\Omega})$ and
\[ \|u\|_{C^{0,\beta}(\overline{\Omega})} \le C \|u\|_{W^{1,p}(\Omega)}, \qquad \text{for } p \in (n, \infty] \text{ and } \beta = 1 - \frac{n}{p}. \]

Proof This follows from the Extension Theorem, the Meyers–Serrin Theorem, and Morrey's inequality. The details are left to the reader (see Problem 3.11.45).
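3.9.9 General Sobolev Inequalities

We establish some inequalities and estimates for general Sobolev spaces $W^{k,p}$.

Theorem 3.9.16 Let $\Omega$ be open, $C^1$ and bounded in $\mathbb{R}^n$ and $u \in W^{k,p}(\Omega)$.
(1) If $k < \frac{n}{p}$ then $u \in L^q(\Omega)$, where $\frac{1}{q} = \frac{1}{p} - \frac{k}{n}$, and $\|u\|_{L^q(\Omega)} \le C \|u\|_{W^{k,p}(\Omega)}$.
(2) If $k = \frac{n}{p}$ then $u \in L^q(\Omega)$ for all $1 \le q < \infty$.
(3) If $k > \frac{n}{p}$ then $u \in C^{k-m-1,\beta}(\overline{\Omega})$, where
\[ m = \left\lfloor \frac{n}{p} \right\rfloor, \qquad \beta = \begin{cases} m + 1 - \dfrac{n}{p}, & \dfrac{n}{p} \notin \mathbb{N}, \\[2mm] \theta, & \dfrac{n}{p} \in \mathbb{N}, \end{cases} \]
for any $\theta$ with $0 < \theta < 1$, and
\[ \|u\|_{C^{k-m-1,\beta}(\overline{\Omega})} \le C \|u\|_{W^{k,p}(\Omega)}. \]

Proof To prove (1), note that $D^{\alpha} u \in W^{1,p}(\Omega)$ for all $|\alpha| \le k - 1$, and so by GNS
\[ \|D^{\alpha} u\|_{L^{p^*}(\Omega)} \le C \|D^{\alpha} u\|_{W^{1,p}(\Omega)} \le C \|u\|_{W^{k,p}(\Omega)}, \qquad \text{where } \frac{1}{p^*} = \frac{1}{p} - \frac{1}{n}. \]
So $u \in W^{k-1,p^*}(\Omega)$ and
\[ \|u\|_{W^{k-1,p^*}(\Omega)} \le C_1 \|u\|_{W^{k,p}(\Omega)}. \]
We repeat the same argument for $u \in W^{k-1,p^*}(\Omega)$ so that we get $u \in W^{k-2,p^{**}}(\Omega)$ and
\[ \|u\|_{W^{k-2,p^{**}}(\Omega)} \le C_2 \|u\|_{W^{k-1,p^*}(\Omega)}, \]
where
\[ \frac{1}{p^{**}} = \frac{1}{p^*} - \frac{1}{n} = \frac{1}{p} - \frac{1}{n} - \frac{1}{n} = \frac{1}{p} - \frac{2}{n}. \]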
Continue repeating this process k − 1 times until we obtain u ∈ W 0,q () = L q () where 1 k 1 = − q p n and for some C = C1 C2 . . . Ck , we have u L q () ≤ C uW k, p () . n n To prove (2), let k = , then u ∈ W k, p (). But for p = , there is p < p such that p k n k, p k, p
k < , so W () ⊂ W (), and we we apply (1) for this chosen p to obtain p
218
3 Theory of Sobolev Spaces
q > q such that W k, p () ⊂ L q (), and the result follows from the combination of the above two inclusions in addition to the nested inequality (Theorem 3.9.3). n n / N. For (3), we only prove the first case. Let u ∈ W k, p () where k > and ∈ p p n n Let m = . Then we clearly have m < < m + 1. p p n Then m < . Applying the same argument of case (1), we obtain u ∈ W k−m,r () p and uW k−m,r () ≤ C uW k, p () , 1 m 1 = − . r p n But this implies that D α u ∈ W 1,r () for all α ≤ k − m − 1. Moreover, note that n < m + 1, so we have r > n, and using Morrey’s inequality, we conclude that p
where
D α u ∈ C 0,β () and D α uC 0,β () ≤ C D α uW 1,r () ≤ C uW m, p () , where β =1−
n n = 1 − + m. r p
Since D α u ∈ C 0,β () for all α ≤ k − m − 1, we must have u ∈ C k−m−1,β () and uC k−m−1,β () ≤ C uW k−m,r () , n n ∈ N, then letting m = − 1, by similar argument we can show that u ∈ p p W k−m,r () , then we proceed as above.
If
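The three cases of Theorem 3.9.16 amount to simple exponent bookkeeping, which the following sketch spells out. The function name `sobolev_target` and the tuple format of its return value are illustrative choices, not notation from the text; exact rational arithmetic is used so that the test $n/p \in \mathbb{N}$ is not affected by rounding.

```python
from fractions import Fraction
from math import floor

def sobolev_target(k, p, n):
    """Target space suggested by Theorem 3.9.16 for W^{k,p} on a bounded C^1
    domain in R^n (k, n integers; p an int or Fraction with 1 <= p < infinity)."""
    ratio = Fraction(n) / Fraction(p)          # the critical quantity n/p
    if k < ratio:
        # Case (1): u in L^q with 1/q = 1/p - k/n
        inv_q = 1 / Fraction(p) - Fraction(k, n)
        return ("L^q", 1 / inv_q)
    if k == ratio:
        # Case (2): u in L^q for every finite q
        return ("L^q for all finite q", None)
    # Case (3): Hoelder space C^{k-m-1, beta} with m = floor(n/p)
    m = floor(ratio)
    if ratio.denominator != 1:
        beta = m + 1 - ratio                   # n/p is not an integer
    else:
        beta = None                            # some theta in (0, 1); not determined here
    return ("C^{j,beta}", k - m - 1, beta)

print(sobolev_target(1, 2, 3))   # H^1 in R^3
print(sobolev_target(2, 2, 3))   # H^2 in R^3
```

For example, $H^1$ on a bounded domain in $\mathbb{R}^3$ falls under case (1) with $q = 6$, while $H^2$ in $\mathbb{R}^3$ falls under case (3) with $m = 1$ and $\beta = \frac{1}{2}$.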
3.10 Embedding Theorems

3.10.1 Compact Embedding

Recall that in basic functional analysis the concept of "embedding" was introduced to facilitate the study of dual spaces and reflexivity. Embeddings connect Sobolev spaces with the theory of PDEs because they provide information on the relation between weak differentiability and integrability and, ultimately, on the regularity of the solutions, which plays a critical role in the theory of PDEs. They also demonstrate that Sobolev spaces are, in many cases, the natural spaces in which to search for solutions of PDEs, due to their nice integrability properties. Here, we recall the definition again.

Definition 3.10.1 (Embedding) Let $X$ and $Y$ be two Banach spaces with norms $\|\cdot\|_X$ and $\|\cdot\|_Y$ respectively, and let $\varphi : X \to Y$ be a mapping. If $\varphi$ is an injective isometry, then $\varphi$ is said to be an "embedding", and is written as $\varphi : X \hookrightarrow Y$ (and sometimes $X \subset\subset Y$).

If $X \subset Y$ and we consider the map $\iota : X \to Y$, $\iota(x) = x$, then $\iota$ is called the inclusion map. In general, the map $\iota$ is called an embedding in the sense that it embeds (or sticks) $X$ inside $Y$, so that we can think of the elements of $X$ as if they were in $Y$, or say that $Y$ contains an isomorphic copy of $X$. If this map is bounded, then we have more to say about this type of embedding. Recall that a linear operator is bounded if and only if it is continuous, and so the inclusion map $\iota : X \to Y$ is continuous if there exists a constant $C$ such that
$$\|\iota(x)\|_Y = \|x\|_Y \le C\|x\|_X$$
for every $x \in X$. The equality above holds because $\iota(x) = x$. In other words, if $\|x\|_X < \infty$ (i.e. $x \in (X, \|\cdot\|_X)$) then $\|x\|_Y < \infty$ (i.e. $x \in (Y, \|\cdot\|_Y)$). In this case we say that $X$ is continuously embedded into $Y$. It is important to note that when we say that $X$ is embedded in $Y$ and $x \in X$, we don't necessarily mean that $x \in Y$, but rather that there is a representative element $y \in Y$ such that $x = y$ a.e. and $\|x\| = \|y\|$.

In the previous section we established some important estimates connecting Sobolev spaces to other Banach spaces (Lebesgue or Hölder). These give rise to inclusion and embedding results. In view of the preceding estimates, we have the following continuous embeddings.

Theorem 3.10.2 All the following inclusions are continuous:
(1) If $1 \le p < n$ then
$$W^{1,p}(\mathbb{R}^n) \subset L^{p^*}(\mathbb{R}^n).$$
Moreover, if $p \le q \le p^*$ then $W^{1,p}(\mathbb{R}^n) \subset L^q(\mathbb{R}^n)$.
(2) Let $\Omega$ be open, $C^1$ and bounded in $\mathbb{R}^n$. If $1 \le p < n$ then
$$W^{1,p}(\Omega) \subset L^q(\Omega)$$
for all $1 \le q \le p^*$.
(3) If $n < p \le \infty$, then
$$W^{1,p}(\mathbb{R}^n) \subset L^\infty(\mathbb{R}^n).$$
(4) If $n < p \le \infty$, then
$$W^{1,p}(\mathbb{R}^n) \subset C^{0,\beta}(\mathbb{R}^n),$$
where $\beta = 1 - \frac{n}{p}$.
(5) Let $\Omega$ be open, $C^1$ and bounded in $\mathbb{R}^n$. If $n < p \le \infty$, then
$$W^{1,p}(\Omega) \subset L^\infty(\Omega) \quad \text{and} \quad W^{1,p}(\Omega) \subset C^{0,\beta}(\overline{\Omega}),$$
where $\beta = 1 - \frac{n}{p}$.
The theorem is an immediate conclusion of the estimates established in the previous section. Note that all the above inclusions are continuous, i.e. for all $1 \le p < n$ the space $W^{1,p}(\mathbb{R}^n)$ is continuously embedded in $L^{p^*}(\mathbb{R}^n)$ and in $L^q(\mathbb{R}^n)$ for all $q \in [p, p^*]$, and for all $n < p \le \infty$ it is continuously embedded in $C^{0,\beta}(\mathbb{R}^n)$, which in turn is embedded in $C_b(\mathbb{R}^n)$. The condition $n < p$ in (3) and (4) is sharp (see Problem 3.11.50). One of the interesting properties of these continuous embeddings is that any Cauchy sequence in $X$ is Cauchy in $Y$, and any convergent sequence in $X$ is convergent in $Y$.

A more interesting type of embedding is what is known as compact embedding, where the inclusion operator is not only bounded, but also compact. Here is the definition.

Definition 3.10.3 (Compact Embedding) Let $X$ and $Y$ be two Banach spaces with norms $\|\cdot\|_X$ and $\|\cdot\|_Y$ respectively. Then an inclusion mapping $\iota : X \to Y$ that is a compact operator is called a compact embedding. This is denoted by $X \overset{c}{\hookrightarrow} Y$.

The property of being sequentially compact means that for every bounded sequence $\{x_k\}$ in $X$, the sequence $\{\varphi(x_k)\}$ has a convergent subsequence in $Y$. So one simple argument to show that an embedding $X \hookrightarrow Y$ is not compact is to find an example of a bounded sequence in $X$ with no convergent subsequence in $Y$ (see Problem 3.11.57). The next theorem, due to Rellich and Kondrachov, is a powerful tool in establishing the compactness property. Rellich proved the result in 1930 for the case $p = q = 2$, and Kondrachov generalized it in 1945 to $p, q \ge 1$.
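The non-compactness argument described above can be illustrated numerically on the whole line, where translation destroys compactness (compare Problem 3.11.55). The sketch below is only an illustration under stated assumptions: the hat function, the shift length 3, and the helper names `hat`, `translate` and `l2_distance` are choices made here, and the integrals are approximated by a midpoint rule.

```python
def hat(x):
    """Piecewise-linear bump supported in [-1, 1]; it belongs to W^{1,p}(R)."""
    return max(0.0, 1.0 - abs(x))

def translate(k):
    """k-th element of the sequence u_k(x) = hat(x - 3k): same norms, disjoint supports."""
    return lambda x: hat(x - 3.0 * k)

def l2_distance(f, g, a, b, steps=20000):
    """Midpoint-rule approximation of the L^2(a, b) distance between f and g."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h
        total += (f(x) - g(x)) ** 2
    return (total * h) ** 0.5

# All pairwise distances equal sqrt(2) * ||hat||_{L^2} = sqrt(4/3), so the bounded
# sequence (u_k) has no Cauchy subsequence in L^2(R).
distances = [l2_distance(translate(j), translate(k), -2.0, 14.0)
             for j in range(4) for k in range(j + 1, 4)]
print(distances)
```

Every translate has the same $W^{1,p}$ norm, so the sequence is bounded, yet the mutual $L^2$ distances never shrink; this is the mechanism that a bounded domain rules out.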
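3.10.2 Rellich–Kondrachov Theorem

An important example to which the theorem can be applied is the convolution approximating sequence $u_\epsilon = \varphi_\epsilon * u$.

Lemma 3.10.4 If $(u_m)$ is a bounded sequence in $W^{1,p}(\mathbb{R}^n)$ with compact support $K$, then
$$\lim_{\epsilon \to 0} \|(u_m)_\epsilon - u_m\|_{L^1(K)} = 0$$
uniformly in $m$, and for each fixed $\epsilon$, the sequence $(u_m)_\epsilon$ is uniformly bounded and equicontinuous in $C(K)$, and consequently in $L^1(K)$.

Proof We have
$$(u_m)_\epsilon(x) - u_m(x) = \int_{\mathbb{R}^n} \varphi_\epsilon(x-y)\,(u_m(y) - u_m(x))\,dy = \frac{1}{\epsilon^n}\int_{B_\epsilon(x)} \varphi\!\left(\frac{x-y}{\epsilon}\right)(u_m(y) - u_m(x))\,dy.$$
Using the substitution $z = \frac{x-y}{\epsilon}$ in the above integral, and also using the fundamental theorem of calculus on $u_m$,
$$(u_m)_\epsilon(x) - u_m(x) = \int_{B_1(0)} \varphi(z)\,(u_m(x - \epsilon z) - u_m(x))\,dz = -\epsilon \int_{B_1(0)} \varphi(z) \int_0^1 Du_m(x - \epsilon t z)\cdot z\,dt\,dz.$$
It follows that
$$\int_K |(u_m)_\epsilon(x) - u_m(x)|\,dx \le \epsilon \int_{B_1(0)} \varphi(z) \int_0^1 \int_K |Du_m(x - \epsilon t z)|\,dx\,dt\,dz \le \epsilon \int_K |Du_m(w)|\,dw < \infty.$$
Thus
$$\|(u_m)_\epsilon - u_m\|_{L^1(K)} \le \epsilon\,\|Du_m\|_{L^1(K)} \le \epsilon M,$$
where $M$ bounds $\|Du_m\|_{L^1(K)}$ uniformly in $m$. Now, take the supremum over all $m$,
$$\lim_{\epsilon \to 0} \sup_m \|(u_m)_\epsilon - u_m\|_{L^1(K)} = 0, \tag{3.10.1}$$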
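which proves the first part. Moreover, fix $\epsilon > 0$. Then
$$|(u_m)_\epsilon(x)| \le \int_{\mathbb{R}^n} \varphi_\epsilon(x-y)\,|u_m(y)|\,dy \le \|\varphi_\epsilon\|_\infty \|u_m\|_{L^1(K)} \le \frac{C}{\epsilon^n}. \tag{3.10.2}$$
Take the supremum over all $m$,
$$\sup_m \|(u_m)_\epsilon\|_\infty \le \frac{C}{\epsilon^n},$$
and hence the sequence $(u_m)_\epsilon$ is uniformly bounded. Similarly for $D(u_m)_\epsilon$,
$$|D(u_m)_\epsilon(x)| \le \int_{\mathbb{R}^n} |D\varphi_\epsilon(x-y)|\,|u_m(y)|\,dy \le \|D\varphi_\epsilon\|_\infty \|u_m\|_{L^1(K)} \le \frac{C}{\epsilon^{n+1}}. \tag{3.10.3}$$
It follows that
$$\sup_m \|D(u_m)_\epsilon\|_\infty \le \frac{C}{\epsilon^{n+1}}.$$
So let $\varepsilon > 0$, and $|x - y| < \delta$ for some $\delta > 0$. Then
$$|(u_m)_\epsilon(y) - (u_m)_\epsilon(x)| \le |y - x| \int_0^1 |D(u_m)_\epsilon(x + t(y-x))|\,dt \le \frac{C}{\epsilon^{n+1}}\,\delta = \varepsilon$$
for $\delta = \dfrac{\varepsilon\,\epsilon^{n+1}}{C}$. Therefore, $(u_m)_\epsilon$ is equicontinuous in $C(K)$ which, in turn, is continuously embedded in $L^1(K)$.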
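The first estimate of the lemma, $\|u_\epsilon - u\|_{L^1} \le \epsilon\,\|Du\|_{L^1}$, can be checked numerically in one dimension. The following is a minimal sketch under assumptions made here: the test function is a hat function, the mollifier is the standard bump $e^{-1/(1-z^2)}$ normalized discretely, and the helper names `bump`, `mollify` and `l1_error` are hypothetical.

```python
import math

def bump(z):
    """Unnormalized mollifier profile exp(-1/(1 - z^2)), supported in |z| < 1."""
    t = 1.0 - z * z
    return math.exp(-1.0 / t) if t > 0.0 else 0.0

def mollify(values, h, eps):
    """Discrete convolution of grid samples with the rescaled bump of width eps."""
    r = round(eps / h)
    weights = [bump(j * h / eps) for j in range(-r, r + 1)]
    total = sum(weights)
    weights = [w / total for w in weights]      # discrete analogue of unit mass
    n = len(values)
    out = []
    for i in range(n):
        s = 0.0
        for j, w in zip(range(-r, r + 1), weights):
            if 0 <= i - j < n:
                s += w * values[i - j]
        out.append(s)
    return out

h = 0.005
grid = [-2.0 + i * h for i in range(801)]
u = [max(0.0, 1.0 - abs(x)) for x in grid]                 # hat function, support [-1, 1]
grad_l1 = sum(abs(u[i + 1] - u[i]) for i in range(800))    # approximates ||Du||_{L^1} = 2

def l1_error(eps):
    """Riemann-sum approximation of ||u_eps - u||_{L^1}."""
    u_eps = mollify(u, h, eps)
    return h * sum(abs(a - b) for a, b in zip(u_eps, u))

errors = {eps: l1_error(eps) for eps in (0.4, 0.2, 0.1)}
for eps, err in errors.items():
    print(eps, err, eps * grad_l1)
```

In this example the error sits well below the bound $\epsilon\,\|Du\|_{L^1}$, since the hat function is linear away from its three kinks and the symmetric mollifier reproduces linear functions there.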
The previous lemma will be used next to prove a fundamental compact embedding result: the Rellich–Kondrachov Theorem, which states that the inclusion in Theorem 3.10.2(2) is not only continuous, but also compact for $q < p^*$.

Theorem 3.10.5 (Rellich–Kondrachov Theorem) Let $\Omega$ be open, $C^1$ and bounded in $\mathbb{R}^n$. If $1 \le p < n$ then
$$W^{1,p}(\Omega) \overset{c}{\hookrightarrow} L^q(\Omega)$$
for all $1 \le q < p^*$.

Proof Theorem 3.10.2(2) already established the fact that the inclusion is continuous, so we only need to prove compactness. Consider a bounded sequence $u_m \in W^{1,p}(\Omega)$, for $1 \le p < n$. By Theorem 3.8.10, there exists an extension operator $E$, with $Eu_m$ still denoted by $u_m$, such that $u_m \in W^{1,p}(\mathbb{R}^n)$ and $\operatorname{supp}(u_m) \subseteq \Omega^*$. Consider the convolution approximating sequence $(u_m)_\epsilon$. By (3.10.1) we have
$$\lim_{\epsilon \downarrow 0} \sup_{m \in \mathbb{N}} \|(u_m)_\epsilon - u_m\|_{L^1(\Omega^*)} = 0.$$
Since $1 \le q < p^*$, choose $\theta \in [0, 1)$ such that
$$\frac{1}{q} = \frac{\theta}{p^*} + \frac{1-\theta}{1}.$$
Then by the interpolation inequality (Theorem 3.9.4) we have
$$\|(u_m)_\epsilon - u_m\|_{L^q(\Omega^*)} \le \|(u_m)_\epsilon - u_m\|^{\theta}_{L^{p^*}(\Omega^*)} \cdot \|(u_m)_\epsilon - u_m\|^{1-\theta}_{L^1(\Omega^*)},$$
where the first factor on the right is bounded uniformly in $m$ and $\epsilon$ by the GNS inequality. Hence
$$\lim_{\epsilon \downarrow 0} \sup_{m \in \mathbb{N}} \|(u_m)_\epsilon - u_m\|_{L^q(\Omega)} = 0,$$
and so $(u_m)_\epsilon$ converges to $u_m$ in $L^q(\Omega)$ as $\epsilon \to 0$, and similarly (3.10.2) and (3.10.3) still hold. By Lemma 3.10.4, for fixed $\epsilon > 0$, $(u_m)_\epsilon$ is uniformly bounded and equicontinuous in $C(\overline{\Omega^*})$, and consequently in $L^q(\Omega^*)$. Hence, the sequence satisfies the hypotheses of the Arzelà–Ascoli Theorem (Theorem 1.3.4), which implies that there exists a uniformly convergent subsequence $\{(u_{m_k})_\epsilon\}$; in particular, $\{(u_{m_k})_\epsilon\}$ is Cauchy in $L^q(\Omega^*)$. Therefore, the sequence has the following two properties: for any fixed $k \in \mathbb{N}$ there exists $\epsilon = \epsilon_k$ such that
$$\|(u_m)_{\epsilon_k} - u_m\|_{L^q(\Omega^*)} \le \frac{1}{k} \quad \text{for all } m.$$
Also, for this $\epsilon_k$ we can find $N_k$ such that for all $i, j \ge N_k$ we have
$$\|(u_{m_i})_{\epsilon_k} - (u_{m_j})_{\epsilon_k}\|_{L^q(\Omega^*)} \le \frac{1}{k}.$$
It follows that
$$\|u_{m_i} - u_{m_j}\|_{L^q(\Omega^*)} \le \|u_{m_i} - (u_{m_i})_{\epsilon_k}\|_{L^q(\Omega^*)} + \|(u_{m_i})_{\epsilon_k} - (u_{m_j})_{\epsilon_k}\|_{L^q(\Omega^*)} + \|(u_{m_j})_{\epsilon_k} - u_{m_j}\|_{L^q(\Omega^*)} \le \frac{3}{k}.$$
Note that since $k$ is fixed, we cannot yet conclude that $\{u_{m_i}\}$ is Cauchy in $L^q(\Omega)$. But we can repeat the same argument above for $k+1, k+2, \ldots$, and for each choice we obtain $i, j \ge N_{k+1} > N_k$. In order to construct the corresponding Cauchy sequence we must continue the process as $i, j \to \infty$, and in this case we need to perform Cantor's diagonalization argument to obtain a Cauchy sequence $\{u_{m_i}\}$ in $L^q(\Omega^*)$, which, due to completeness, converges to some $u \in L^q(\Omega^*)$.

The significance of this result stems from the fact that from every bounded sequence of functions in $W^{1,p}$ we can always extract a subsequence that converges in $L^q$ for suitable $q$, which turns out to be extremely useful in applications to PDEs. Note that we required the domain to be bounded, open and $C^1$ in order to apply the extension operator. If $u \in W_0^{1,p}(\Omega)$ then by Proposition 3.8.3 we don't need this condition on $\partial\Omega$.

Corollary 3.10.6 Let $\Omega$ be open and bounded in $\mathbb{R}^n$. If $1 \le p < n$ then
$$W_0^{1,p}(\Omega) \overset{c}{\hookrightarrow} L^q(\Omega)$$
for all $1 \le q < p^*$.

Proof Use Proposition 3.8.3 to obtain a zero extension, then proceed as in the proof of Theorem 3.10.5.

The Rellich–Kondrachov Theorem investigated the compact embedding for the case $1 \le p < n$. We will use it to investigate the compact embedding for the cases $p = n$ and $p > n$.

Theorem 3.10.7 Let $\Omega$ be open, $C^1$ and bounded in $\mathbb{R}^n$.
(1) If $p = n$ then
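$$W^{1,n}(\Omega) \overset{c}{\hookrightarrow} L^q(\Omega)$$
for all $1 \le q < \infty$.
(2) If $n < p \le \infty$ then
$$W^{1,p}(\Omega) \overset{c}{\hookrightarrow} C(\overline{\Omega}),$$
and consequently
$$W^{1,p}(\Omega) \overset{c}{\hookrightarrow} L^q(\Omega)$$
for $1 \le q < \infty$.

Proof (1) Let $u_m \in W^{1,n}(\Omega)$ be a bounded sequence. Since $\Omega$ is bounded, by the nested inequality, $u_m$ is bounded in $W^{1,p}(\Omega)$ for all $1 \le p < n$. Given $q \in [1, \infty)$, choose $p < n$ such that $q < p^*$; this is possible because $p^* \to \infty$ as $p \uparrow n$. Then $u_m \in W^{1,p}(\Omega)$, and by the Rellich–Kondrachov Theorem there exist a subsequence $u_{m_j}$ and $u \in L^q(\Omega)$ such that
$$u_{m_j} \to u \quad \text{in } L^q(\Omega),$$
and this proves (1).

(2) Let $u_m \in W^{1,p}(\Omega)$ be a bounded sequence. By Morrey's inequality we have the continuous embedding $W^{1,p}(\Omega) \hookrightarrow C^{0,\beta}(\overline{\Omega})$, but it is easy to show that bounded sets in $C^{0,\beta}(\overline{\Omega})$ are uniformly bounded and equicontinuous, so by the Arzelà–Ascoli Theorem,
$$C^{0,\beta}(\overline{\Omega}) \overset{c}{\hookrightarrow} C(\overline{\Omega}) \hookrightarrow L^q(\Omega),$$
and the composition of a continuous inclusion with a compact one is compact.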
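A one-dimensional example shows what compactness into $C(\overline{\Omega})$ does and does not give. The sketch below assumes $n = 1$, $p = 2$, $\Omega = (0,1)$ and the sequence $u_k(x) = \sin(k\pi x)/(k\pi)$, which is bounded in $W^{1,2}(0,1)$; the function names are illustrative and the norms are approximated on a grid.

```python
import math

def sup_norm(k, samples=2000):
    """Grid approximation of the sup norm of u_k(x) = sin(k*pi*x)/(k*pi) on [0, 1]."""
    return max(abs(math.sin(k * math.pi * i / samples)) / (k * math.pi)
               for i in range(samples + 1))

def derivative_l2_distance(j, k, samples=20000):
    """Midpoint-rule L^2(0,1) distance between u_j' = cos(j*pi*x) and u_k' = cos(k*pi*x)."""
    h = 1.0 / samples
    total = sum((math.cos(j * math.pi * (i + 0.5) * h)
                 - math.cos(k * math.pi * (i + 0.5) * h)) ** 2
                for i in range(samples))
    return math.sqrt(total * h)

sup_norms = [sup_norm(k) for k in (1, 10, 100)]   # tends to 0: uniform convergence
gap = derivative_l2_distance(5, 9)                # stays near 1: no convergence of u_k'
print(sup_norms, gap)
```

The sequence converges uniformly to zero, as the compact embedding predicts, while the derivatives remain at mutual $L^2$ distance about $1$, so there is no convergence in $W^{1,2}(0,1)$ itself.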
3.10.3 High Order Sobolev Estimates

Another important and useful consequence of the Rellich–Kondrachov Theorem is the following.

Theorem 3.10.8 Let $\Omega$ be open, $C^1$ and bounded in $\mathbb{R}^n$. If $1 \le p < n$ then
1. For all $1 \le q < p^*$, $k \ge 1$, we have
$$W^{k,p}(\Omega) \overset{c}{\hookrightarrow} W^{k-1,q}(\Omega).$$
2. For all $q \ge 1$ with $\frac{1}{q} > \frac{1}{p} - \frac{m}{n}$, $k \ge m \ge 1$, we have
$$W^{k,p}(\Omega) \overset{c}{\hookrightarrow} W^{k-m,q}(\Omega).$$

Proof (1) Let $u_j \in W^{k,p}(\Omega)$ be a bounded sequence. Then $D^\alpha u_j \in W^{1,p}(\Omega)$ for all $|\alpha| \le k-1$. By the Rellich–Kondrachov Theorem, $D^\alpha u_j \in L^q(\Omega)$ for all $1 \le q < p^*$ and $D^\alpha u_j$ has a convergent subsequence in $L^q(\Omega)$, and
$$\|u_j\|_{W^{k-1,q}} \le C \sum_{|\alpha| \le k-1} \|D^\alpha u_j\|_{L^q} \le C\|u_j\|_{W^{k,p}},$$
so $u_j$ has a convergent subsequence in $W^{k-1,q}(\Omega)$. For (2), note that since $m \ge 1$, we have
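$$W^{k,p}(\Omega) \subseteq W^{k-m+1,p^*}(\Omega), \tag{3.10.4}$$
where in this step $p^*$ is to be read as the exponent reached after $m-1$ applications of the Sobolev conjugate, that is, $\frac{1}{p^*} = \frac{1}{p} - \frac{m-1}{n}$. But from (1) we also have
$$W^{k-m+1,p^*}(\Omega) \overset{c}{\hookrightarrow} W^{k-m,q}(\Omega). \tag{3.10.5}$$
The result now follows from (3.10.4)–(3.10.5).

An immediate corollary is the following:

Corollary 3.10.9 Let $\Omega$ be open, $C^1$ and bounded in $\mathbb{R}^n$. If $1 \le p < n$ then
(1) $W^{k,p}(\Omega) \overset{c}{\hookrightarrow} W^{k-1,p}(\Omega)$.
(2) $W^{1,p}(\Omega) \overset{c}{\hookrightarrow} L^p(\Omega)$.

Proof Note that for $p < n$, we always have $p < p^*$.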
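3.10.4 Sobolev Embedding Theorem

Theorem 3.10.10 Let $\Omega$ be open, $C^1$ and bounded in $\mathbb{R}^n$.
(1) If $k < \frac{n}{p}$ then
$$W^{k,p}(\Omega) \overset{c}{\hookrightarrow} L^q(\Omega)$$
for all $q \ge 1$ such that $\frac{1}{q} > \frac{1}{p} - \frac{k}{n}$.
(2) If $k = \frac{n}{p}$ then
$$W^{k,p}(\Omega) \overset{c}{\hookrightarrow} L^q(\Omega)$$
for $1 \le q < \infty$.
(3) If $k > \frac{n}{p}$ then
$$W^{k,p}(\Omega) \overset{c}{\hookrightarrow} C^{0,\beta}(\overline{\Omega})$$
for $0 < \beta < \gamma$, where $\gamma = \min\{1, k - \frac{n}{p}\}$.

Proof (1) Let $u_i \in W^{k,p}(\Omega)$ be a bounded sequence. By Theorem 3.10.8(1), we have
$$W^{k,p}(\Omega) \overset{c}{\hookrightarrow} W^{k-1,q}(\Omega)$$
for all $1 \le q < p^* = p_1$, $k \ge 1$. Iterating the process gives
$$W^{k,p}(\Omega) \overset{c}{\hookrightarrow} W^{k-1,q}(\Omega) \overset{c}{\hookrightarrow} W^{k-2,q}(\Omega) \overset{c}{\hookrightarrow} \cdots,$$
where at the $j$-th step $q < p_j$ with
$$\frac{1}{p_j} = \frac{1}{p} - \frac{j}{n}, \quad 1 \le j \le k.$$
After $k$ iterations we obtain
$$W^{1,p_{k-1}}(\Omega) \overset{c}{\hookrightarrow} W^{0,q}(\Omega) = L^q(\Omega),$$
where $q < p_k^*$ and
$$\frac{1}{p_k^*} = \frac{1}{p} - \frac{k}{n}.$$
For (2), repeat the argument above for $k-1$ iterations. For (3), we use Morrey's inequality to show that every $u \in W^{k,p}(\Omega)$ is Hölder continuous. We leave the details for the reader.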
3.10.5 Embedding of Fractional Sobolev Spaces

Recall that in Section 3.5 the fractional Sobolev space was defined as
$$H^s(\mathbb{R}^n) = \{u \in L^2(\mathbb{R}^n) : (1 + |w|^2)^{s/2}\,\hat{u}(w) \in L^2(\mathbb{R}^n)\}.$$
We will provide two compact embedding theorems for this space.

Theorem 3.10.11 Let $1 < p \le \infty$, and let $r, t > 0$ be any two positive real numbers. If $r > t$, we have the continuous inclusion
$$H^r(\mathbb{R}^n) \hookrightarrow H^t(\mathbb{R}^n).$$
Moreover, if $\Omega$ is Lipschitz and bounded in $\mathbb{R}^n$ then the above inclusion is compact:
$$H^r(\Omega) \overset{c}{\hookrightarrow} H^t(\Omega).$$

Proof Let $u \in H^r(\mathbb{R}^n)$. Then
$$\mathcal{F}^{-1}\{(1+|w|^2)^{t/2}\hat{u}(w)\} = \mathcal{F}^{-1}\Big\{(1+|w|^2)^{-\frac{r-t}{2}} \cdot (1+|w|^2)^{r/2}\hat{u}(w)\Big\} = \mathcal{F}^{-1}\{(1+|w|^2)^{-\frac{r-t}{2}}\} * \mathcal{F}^{-1}\{(1+|w|^2)^{r/2}\hat{u}(w)\}.$$
From the hypothesis, the exponent $-\frac{r-t}{2} < 0$, hence
$$\mathcal{F}^{-1}\{(1+|w|^2)^{-\frac{r-t}{2}}\} \in L^1,$$
and since $u \in H^r(\mathbb{R}^n)$, we have
$$(1+|w|^2)^{r/2}\hat{u}(w) \in L^2,$$
which implies
$$\mathcal{F}^{-1}\{(1+|w|^2)^{r/2}\hat{u}(w)\} \in L^2,$$
and therefore $u \in H^t(\mathbb{R}^n)$.

If $\Omega$ is open, $C^1$ and bounded in $\mathbb{R}^n$, then by the extension theorem we can show that $H^r(\Omega) \hookrightarrow H^t(\Omega)$. Let $u_n \in H^r(\Omega)$ be a bounded sequence, which implies that $E(u_n) \in H^r(\mathbb{R}^n)$, and define the cut-off function $\xi \in C_0^\infty(\mathbb{R}^n)$ such that $\xi = 1$ on $\Omega$. Define the sequence $v_n = \xi E(u_n)$. Then $v_n \in H^r(\mathbb{R}^n)$ is such that $v_n|_\Omega = u_n$ and $\operatorname{supp}(v_n) \subseteq \operatorname{supp}(\xi) \subseteq K$ for some compact set $K \supset\supset \Omega$. We want to extract a subsequence $v_{n_j}$ of $v_n$ which converges in $H^t(\mathbb{R}^n)$. Consider $\mathcal{F}\{v_{n_j}\}$. It is left to the reader to show that $\mathcal{F}\{v_{n_j}\}$ is uniformly bounded and equicontinuous. After that, the result follows from the Arzelà–Ascoli Theorem.

The theorem implies that in a fractional Sobolev space $H^r(\Omega)$ with a bounded domain of nice regularity, any bounded sequence has a subsequence that converges in another fractional Sobolev space $H^t(\Omega)$ for any $t < r$. Another type of compact embedding for fractional Sobolev spaces is the following:

Theorem 3.10.12 Let $\Omega$ be bounded and $C^k$ (or Lipschitz) in $\mathbb{R}^n$. Then
(1) If $k > \frac{n}{2}$ then
$$H^k(\mathbb{R}^n) \hookrightarrow C_b(\mathbb{R}^n).$$
(2) If $k > \frac{n}{2}$ then
$$H^k(\Omega) \overset{c}{\hookrightarrow} C(\overline{\Omega}).$$
(3) If $k > m + \frac{n}{2}$ then
$$H^k(\mathbb{R}^n) \hookrightarrow C_b^m(\mathbb{R}^n).$$
(4) If $k > m + \frac{n}{2}$ then
$$H^k(\Omega) \overset{c}{\hookrightarrow} C^m(\overline{\Omega}).$$

Proof We will only prove the continuous inclusion (1). By performing $m$ successive iterations we can prove (3), and by the extension theorem and the Arzelà–Ascoli theorem we can prove (2) and (4). Since $\mathcal{S}(\mathbb{R}^n) \cap H^k(\mathbb{R}^n)$ is dense in $H^k(\mathbb{R}^n)$, it suffices to prove the result for $u \in \mathcal{S}(\mathbb{R}^n) \cap H^k(\mathbb{R}^n)$. But this implies that
$$\|u\|_\infty \le C\int_{\mathbb{R}^n} |\hat{u}(w)|\,dw = C \int_{\mathbb{R}^n} (1+|w|^2)^{k/2}\,|\hat{u}(w)|\,\frac{1}{(1+|w|^2)^{k/2}}\,dw < \infty.$$
So we use the Cauchy–Schwarz inequality to obtain
$$\|u\|_\infty \le C \left(\int_{\mathbb{R}^n} \frac{1}{(1+|w|^2)^{k}}\,dw\right)^{1/2} \left(\int_{\mathbb{R}^n} (1+|w|^2)^{k}\,|\hat{u}(w)|^2\,dw\right)^{1/2}. \tag{3.10.6}$$
Since $k > \frac{n}{2}$, we obtain
$$\int_{\mathbb{R}^n} \frac{1}{(1+|w|^2)^{k}}\,dw < \infty,$$
so (3.10.6) becomes
$$\|u\|_\infty \le C\|u\|_{H^k(\mathbb{R}^n)}.$$

In Theorem 3.6.1, it was shown that any function in $W^{1,1}(I)$ has an absolutely continuous representative $\tilde{u} \in C(\overline{I})$. Theorem 3.10.12 shows that functions in $H^k(\mathbb{R})$ are always continuous and bounded for all $k \ge 1$, whereas functions in $H^1(\mathbb{R}^2)$ need not have continuous representatives. In order to get bounded continuous Sobolev functions on $\mathbb{R}^3$, we need at least $H^2$.
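The threshold $k > \frac{n}{2}$ used in the proof comes from the convergence of $\int_{\mathbb{R}^n}(1+|w|^2)^{-k}\,dw$, which in polar coordinates reduces to a radial integral. The sketch below is a rough numerical illustration; the helper names `radial_integral` and `min_integer_order` are hypothetical, and a midpoint rule on a truncated interval stands in for the improper integral.

```python
import math

def radial_integral(n, k, R, steps=100000):
    """Midpoint approximation of the integral of r^(n-1) / (1 + r^2)^k over 0 < r < R.
    Up to the surface area of the unit sphere, this is the integral of
    (1 + |w|^2)^(-k) over the ball |w| < R in R^n."""
    h = R / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += r ** (n - 1) / (1.0 + r * r) ** k
    return total * h

def min_integer_order(n):
    """Smallest integer k with k > n/2."""
    return n // 2 + 1

# n = 2: k = 1 sits exactly at the threshold (logarithmic growth in R),
# while k = 2 gives a bounded integral with limit 1/2.
borderline = [radial_integral(2, 1, R) for R in (10.0, 100.0)]
convergent = [radial_integral(2, 2, R) for R in (10.0, 100.0)]
print(borderline, convergent, [min_integer_order(n) for n in (1, 2, 3)])
```

The integer orders returned for $n = 1, 2, 3$ are $1, 2, 2$, in line with the remark above that $H^1$ suffices on $\mathbb{R}$ while $H^2$ is needed on $\mathbb{R}^3$.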
3.11 Problems

(1) Show that for any weakly differentiable $f \in L^1_{loc}(\mathbb{R}^n)$, if $D^\alpha f = 0$ for $|\alpha| \le n$, then $f$ is constant almost everywhere.
(2) Show that the Heaviside function $H(x)$ is the weak derivative of
$$f(x) = \begin{cases} x, & x > 0, \\ 0, & x \le 0. \end{cases}$$
(3) Consider the function $u : \mathbb{R} \to \mathbb{R}$, $u(x) = |x|$.
(a) Find the weak derivative of $u$.
(b) Is $Du \in L^1(\mathbb{R})$?
(c) Does $u$ have a second weak derivative $D^2u$?
(4) Find the weak derivative of $u(x) = \chi_{\mathbb{Q}}(x)$.
(5) Let $f \in C_c^\infty(\mathbb{R}^n)$. Prove that $\operatorname{supp}\{D^k f\} \subseteq \operatorname{supp}\{f\}$.
(6) Show that the function introduced in (3.2.5) is in $C_c^\infty(\mathbb{R}^n)$.
(7) Show that the formula $\overline{C_c^\infty(\mathbb{R}^n)} = L^p(\mathbb{R}^n)$ doesn't hold for $p = \infty$.
(8) Let $f \in L^p(\mathbb{R}^n)$ and $\varphi_\epsilon \in C_c^\infty(\mathbb{R}^n)$ be a mollifier, and consider $f_\epsilon = f * \varphi_\epsilon$. Show that $f_\epsilon$ exists for $p = 1$ and for $p = \infty$.
(9) Show that $C_c^k(\mathbb{R}^n)$ is dense in $C^k(\mathbb{R}^n)$.
(10) Let $f \in L^1(\Omega)$ for some open set $\Omega \subset \mathbb{R}$. Find a sequence in $C^\infty(\mathbb{R})$ that converges to $f$.
(11) Let $f \in C(\mathbb{R})$. Show that $f_\epsilon \to f$ uniformly on compact subsets of $\mathbb{R}^n$.
(12) Consider the following mollification for $x \in \mathbb{R}^n$:
$$f_\epsilon = e^{-(\epsilon + bi)|x|^2}.$$
(a) Find $\mathcal{F}\{f_\epsilon\}$.
(b) Find $\mathcal{F}\{e^{-bi|x|^2}\}$.
(13) Let $f, h \in \mathcal{S}(\mathbb{R}^n)$ and define
$$h_\epsilon(x) = \frac{1}{(2\pi\epsilon)^{n/2}}\,e^{-|x|^2/2\epsilon},$$
and let $f_\epsilon = h_\epsilon * f$.
(a) Show that $f_\epsilon \to f$.
(b) Show that $\mathcal{F}\{h_\epsilon\} = e^{-\epsilon|x|^2/2}$.
(c) Conclude that $\mathcal{F}^{-1}\{\mathcal{F}\{f\}\} = f(x)$.
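(14) Let $f \in L^\infty(\mathbb{R}^n)$. Show that $f_\epsilon \to f$ uniformly. Is $f_\epsilon \in C^\infty(\mathbb{R}^n)$?
(15) Show that $\mathcal{S}(\mathbb{R}^n)$ is dense in $\mathcal{S}'(\mathbb{R}^n)$.
(16) (a) Show that
$$\rho_k(\varphi) = \sum_{|\beta| \le k} \|x^k \partial^\beta \varphi\|_\infty$$
is a seminorm on $\mathcal{S}(\mathbb{R}^n)$.
(b) Use the Fréchet metric
$$d(f, g) = \sum_{k=0}^{\infty} \frac{\rho_k(f-g)}{1 + \rho_k(f-g)}$$
to show that $(\mathcal{S}(\mathbb{R}^n), d)$ is a complete metric space.
(c) Do the same for
$$\rho_{m,k}(\varphi) = \sup_{|\beta| \le m}\,\sup_{x \in \mathbb{R}^n} (1+|x|)^k\,|D^\beta \varphi(x)|.$$
(17) Show that $\langle \cdot, \cdot \rangle$ in (3.4.3) defines an inner product.
(18) Prove that the product of functions in $W^{1,p}(\mathbb{R})$ is again in $W^{1,p}(\mathbb{R})$.
(19) Give an example of a function $f \in H^s(\mathbb{R}^n)$ for all $s \in \mathbb{R}$ such that $f$ is not a Schwartz function. Deduce that
$$\mathcal{S}(\mathbb{R}^n) \subsetneq \bigcap_{s} H^s(\mathbb{R}^n).$$
(20) Let $u \in L^2(\mathbb{R}^n)$. Show that
$$\|u\|_{H^1(\mathbb{R}^n)} \le \|(1+|w|)\,\hat{u}(w)\|_{L^2(\mathbb{R}^n)} \le \sqrt{2}\,\|u\|_{H^1(\mathbb{R}^n)}$$
for all $u \in H^1(\mathbb{R}^n)$.
(21) Prove the following:
(a) $\delta \in H^s(\mathbb{R})$ iff $s < -\frac{1}{2}$.
(b) $\delta \in H^s(\mathbb{R}^2)$ iff $s < -1$.
(c) $\delta \in H^s(\mathbb{R}^n)$ iff $s < -\frac{n}{2}$.
(22) Show that $e^{-|x|} \in H^s(\mathbb{R})$ iff $0 \le s < \frac{3}{2}$.
(23) Determine the value of $n$ for which the following functions belong to $W^{1,n}(B)$ if 1. $B$ is the unit ball in $\mathbb{R}^n$, and 2. $B$ is the ball of radius $\frac{1}{2}$.
(a) $u = \log(|\log|x||)$.
(b) $u = \log\log\frac{1}{|x|}$.
… 0 we have $|u(x)| \le c\,\|u\|_{W^{1,1}(I)}$.
(28) Show that the Heaviside function belongs to $H^{-1}(\Omega)$ for $\Omega \subset \mathbb{R}^n$.
(29) Let $u \in L^p(\Omega)$ for some open $\Omega \subset \mathbb{R}^n$.
(a) Show that as $\epsilon \to 0$ we have
$$\int |u(x + \epsilon z) - u(x)|^p\,dz \to 0.$$
(b) Conclude from (a) that $u_\epsilon(x) = \varphi_\epsilon * u$ converges to $u$ in $L^p(\Omega)$.
(30) If $u \in L^1(\mathbb{R}^n)$, show that $u_\epsilon(x) \in L^\infty(\mathbb{R}^n)$.
(31) Prove the case when $\partial\Omega$ is unbounded in Proposition 3.8.11.
(32) Show that if $\bar{u} \in W^{1,p}(\mathbb{R}^n)$ and $\Omega$ is $C^1$-class then $u \in W_0^{1,p}(\Omega)$.
(33) If $u \in W^{1,p}((0,\infty))$, show that
$$\lim_{x \to \infty} u(x) = 0.$$
(34) (a) Let $u \in W^{1,1}(I)$ for some interval $I \subset \mathbb{R}$. If $u$ is weakly differentiable and $Du = 0$ then $u = c$ a.e. for some constant $c \in \mathbb{R}$.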
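(b) Let $\Omega$ be open and connected in $\mathbb{R}^n$ and $u \in W^{1,p}(\Omega)$. If $u$ is weakly differentiable and $Du = 0$ then $u = c$ a.e.
(35) Let $u \in W^{1,p}(\Omega)$ and $\xi \in C_c^\infty(\Omega)$ be a cut-off function, $\Omega \subset \mathbb{R}^n$. Let $w = \overline{\xi u}$ be the zero extension of $\xi u$. Show that for $1 \le i \le n$,
$$\frac{\partial w}{\partial x_i} = \xi\frac{\partial u}{\partial x_i} + u\frac{\partial \xi}{\partial x_i},$$
where the right-hand side is extended by zero outside $\Omega$.
(36) Use approximation results to show that if $u \in W^{1,p}(\mathbb{R}^n)$ and $D_{x_i}u = 0$ for all $i = 1, 2, \ldots, n$, then $u$ is constant a.e.
(37) (a) Show that if $\varphi_n \in C_c^\infty(\Omega)$ and $u \in W^{k,p}(\Omega)$ then $u\varphi_n \in W_0^{k,p}(\Omega)$.
(b) Show that if $v \in C^k(\Omega) \cap W^{k,\infty}(\Omega)$ and $u \in W_0^{k,p}(\Omega)$ then $uv \in W_0^{k,p}(\Omega)$.
(38) Show that for every $u \in W^{k,p}(\Omega)$, there exists $w \in W^{k,p}(\Omega')$ such that $w = u$ on $\Omega$ and
$$\|w\|_{W^{k,p}(\Omega')} \le c\,\|u\|_{W^{k,p}(\Omega)}.$$
(39) Let $u \in W^{k,p}(\mathbb{R}^n_+)$. Define the sequence
$$w_\epsilon(x) = u_\epsilon(x + 2\epsilon e_n),$$
where $e_n$ is the unit vector in the $n$th coordinate. Show that $w_\epsilon(x) \to u(x)$ in $W^{k,p}(\mathbb{R}^n_+)$.
(40) Let $u \in W^{1,p}(\mathbb{R}^n_+)$. Define
$$\bar{u}(x) = \begin{cases} u(x), & x_n > 0, \\ u(x', -x_n), & x_n < 0. \end{cases}$$
(a) Show that $\bar{u} \in W^{1,p}(\mathbb{R}^n)$.
(b) Find a general form for $\bar{u}$ if $u \in W^{k,p}(\mathbb{R}^n_+)$ for $k \in \mathbb{N}$.
(41) Prove the inequality in Theorem 3.9.9.
(42) (a) Show that any function in $W^{1,\infty}(\mathbb{R})$ coincides a.e. with a Lipschitz continuous function.
(b) Deduce from (a) that $W^{1,\infty}(\Omega) = C^{0,1}(\overline{\Omega})$ for any $C^1$ bounded set $\Omega \subset \mathbb{R}^n$.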
(43) Let $\Omega$ be open, $C^1$ and bounded in $\mathbb{R}^n$. Show that if $u \in W^{1,n}(\Omega)$, $n \ge 2$, then $u \in L^q(\Omega)$ for all $n \le q \le \infty$ and
$$\|u\|_{L^q(\Omega)} \le C\|u\|_{W^{1,p}(\Omega)}.$$
(44) Prove Morrey's inequality for $p = \infty$ as follows:
(a) Use a cut-off function $\xi$ and Theorem 3.2.3 to show that for every $u \in W^{1,\infty}(\mathbb{R}^n)$, $\xi u \in W^{1,p}(\mathbb{R}^n)$ for every $p > n$.
(b) Apply Morrey's inequality on $\xi u$.
(45) Prove Theorem 3.9.15 as follows:
(a) Show that for every $u \in W^{1,p}(\Omega)$, $1 < p < \infty$, there exists a sequence $u_j$ in $C_c^\infty(\mathbb{R}^n) \cap W^{1,p}(\mathbb{R}^n)$ that converges to $\bar{u}$ in $W^{1,p}(\mathbb{R}^n)$.
(b) Apply Morrey's inequality to show that $u_j$ converges to $u$ a.e. in $C^{0,\beta}(\mathbb{R})$.
(c) Use Morrey's inequality and Theorem 3.8.10(4) to prove Theorem 3.9.15 for all $1 < p < \infty$.
(d) Use the same argument of the previous problem to prove Theorem 3.9.15 for $p = \infty$.
(46) In Theorem 3.9.16, write the details of the proof of the case when $k > \frac{n}{p}$ and $\frac{n}{p} \in \mathbb{N}$.
(47) If $0 \le \alpha < \beta < 1$, show that for a bounded set $\Omega \subset \mathbb{R}^n$,
(a) $C^{0,\beta}(\overline{\Omega}) \overset{c}{\hookrightarrow} C^{0,\alpha}(\overline{\Omega})$.
(b) $C^{0,\beta}(\overline{\Omega}) \overset{c}{\hookrightarrow} C(\overline{\Omega})$.
(48) Show that bounded sets in $C^{0,\beta}(\overline{\Omega})$ are uniformly bounded and equicontinuous.
(49) Let $\Omega \subset \mathbb{R}^n$ be open, bounded and $C^1$. Show that
$$W^{1,\infty}(\Omega) \overset{c}{\hookrightarrow} W^{1,p}(\Omega)$$
for any $p > n$. Deduce that
$$W^{1,\infty}(\Omega) \overset{c}{\hookrightarrow} C(\overline{\Omega}).$$
(50) Let $u = \ln|\ln|x||$.
(a) Show that $u \in W_0^{1,n}(B)$ but $u \notin L^\infty(B)$, for $B = B_{1/e}(0) \subset \mathbb{R}^n$ (the open ball of radius $\frac{1}{e}$).
(b) Show that $u \in H^1(\Omega)$ but $u \notin C(\Omega)$, for $\Omega = B_{1/2}(0) \subset \mathbb{R}^2$.
(c) Deduce that the condition $n < p$ in Theorem 3.10.2 (3, 4) is sharp.
(51) Let $u \in W^{1,p}(\Omega)$ for some open $\Omega \subset \mathbb{R}^n$. If $p > n$ show that $u$ is pointwise differentiable and $\nabla u = Du$ a.e.
(52) Use only the Arzelà–Ascoli Theorem together with Theorem 3.6.1 to show that for all $1 < p < \infty$, we have the compact embedding
$$W^{1,p}(I) \overset{c}{\hookrightarrow} C(\overline{I}),$$
where $I \subset \mathbb{R}$ is an open bounded interval.
(53) Show that the Rellich–Kondrachov Theorem doesn't hold for $q = p^*$, that is,
$$W^{1,p}(\Omega) \hookrightarrow L^{p^*}(\Omega)$$
is a continuous inclusion but not compact.
(54) (a) Show that the Rellich–Kondrachov Theorem still holds for Lipschitz domains that are not necessarily $C^1$.
(b) Show by a counterexample that the condition that the domain be bounded and Lipschitz is essential.
(55) Give an example to show that the continuous inclusion $W^{1,p}(\mathbb{R}^n) \hookrightarrow L^p(\mathbb{R}^n)$ is not compact.
(56) Let $k \le \frac{n}{2}$.
(a) Show that
$$H^k(\mathbb{R}^n) \hookrightarrow L^p(\mathbb{R}^n)$$
for all $2 \le p < \frac{2n}{n-2k}$.
(b) If $\Omega \subset \mathbb{R}^n$ is open, bounded and $C^1$, show that … the inclusion for all $2 \le p$ …
… $\frac{n}{p} - \frac{n}{q} > 0$.
(a) Show that
$$W^{k,p}(\Omega) \overset{c}{\hookrightarrow} W^{m,q}(\Omega).$$
(b) Discuss the case when
$$k - m = \frac{n}{p} - \frac{n}{q}.$$
(61) (a) If $p = n$, show that for all $p \le q < \infty$,
$$W^{1,n}(\mathbb{R}^n) \overset{c}{\hookrightarrow} L^q(\mathbb{R}^n).$$
(b) If $n < p$, show that for $\beta = 1 - \frac{n}{p}$,
$$W^{1,p}(\mathbb{R}^n) \overset{c}{\hookrightarrow} C^{0,\beta}(\mathbb{R}^n).$$
(62) Prove or disprove that the following inclusions are continuous:
(a) $W^{1,n}(\mathbb{R}^n) \subset C(\mathbb{R}^n) \cap L^\infty(\mathbb{R}^n)$.
(b) $W^{1,p}(\mathbb{R}^n) \subset C(\mathbb{R}^n) \cap L^\infty(\mathbb{R}^n)$ for $p > n$.
(63) Let $\Omega \subset \mathbb{R}^n$ be open, bounded and $C^k$. Show that every weakly convergent sequence in $W^{k,p}(\Omega)$ converges strongly in $W^{k-1,p}(\Omega)$.
(64) Give an example of an unbounded function in $H^{1/2}(\mathbb{R})$.
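(65) Verify the following inclusions:
(a) $H^{1/3}(\mathbb{R}^2) \hookrightarrow L^3(\mathbb{R}^2)$.
(b) $H^{1/2}(\mathbb{R}^2) \hookrightarrow L^4(\mathbb{R}^2)$.
(c) $H^{3/4}(\mathbb{R}^3) \hookrightarrow L^4(\mathbb{R}^3)$.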
Chapter 4
Elliptic Theory
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024
A. Khanfer, Applied Functional Analysis, https://doi.org/10.1007/978-981-99-3788-2_4

4.1 Elliptic Partial Differential Equations

4.1.1 Elliptic Operator

The general form of a second-order partial differential equation in $\mathbb{R}^2$ is
$$Au_{xx} + 2Bu_{xy} + Cu_{yy} + Du_x + Eu_y + F = 0.$$
This equation is called elliptic if $B^2 - AC < 0$, or equivalently, if the matrix
$$M = \begin{pmatrix} A & B \\ B & C \end{pmatrix}$$
is positive definite (after multiplying the equation by $-1$ if necessary). For $x \in \Omega \subseteq \mathbb{R}^n$, the standard form of an elliptic PDE operator is
$$Lu(x) = -\sum_{i,j=1}^{n} a_{ij}(x)\frac{\partial^2 u}{\partial x_i \partial x_j}(x) + \sum_{i=1}^{n} b_i(x)\frac{\partial u}{\partial x_i}(x) + c(x)u(x), \tag{4.1.1}$$
where $A(x) = (a_{ij}(x))$ is a matrix-valued function defined on $\Omega$ as
$$A = \begin{pmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \ldots & a_{nn} \end{pmatrix}$$
and is positive definite for all $x \in \Omega$, i.e., $\xi^T A \xi > 0$ for every nonzero $\xi \in \mathbb{R}^n$. A more convenient way of writing it is in the divergence form
$$Lu(x) = -\operatorname{div}(A(x)\nabla u) + b(x)\cdot\nabla u + c(x)u,$$
i.e.
$$Lu(x) = -\sum_{i,j=1}^{n} \frac{\partial}{\partial x_i}\left(a_{ij}(x)\frac{\partial u}{\partial x_j}\right) + \sum_{i=1}^{n} b_i(x)\frac{\partial u}{\partial x_i}(x) + c(x)u(x). \tag{4.1.2}$$
The equation models many steady-state natural and physical systems (e.g., heat conduction, diffusion, heat and mass transfer, flow of fluids, and electric potential). The divergence term
$$\sum_{i,j=1}^{n} \frac{\partial}{\partial x_i}\left(a_{ij}(x)\frac{\partial u}{\partial x_j}\right)$$
refers to the diffusion process, the second term
$$\sum_{i=1}^{n} b_i(x)\frac{\partial u}{\partial x_i}(x)$$
is the advection term, and the zeroth-order term $c(x)u(x)$ is the decay term. The function $A(x)$ is called symmetric if $a_{ij} = a_{ji}$ for all $1 \le i, j \le n$. For a symmetric matrix $A$, being positive definite means that all its eigenvalues are positive. In particular, for every $x \in \Omega$,
$$\frac{\sum_{i,j=1}^{n} a_{ij}(x)\xi_i\xi_j}{|\xi|^2} \ge \lambda(x), \tag{4.1.3}$$
where $\lambda(x)$ is the smallest eigenvalue of $A(x)$ and $\xi \in \mathbb{R}^n$ is nonzero. Taking the minimum of the left-hand side over all such vectors $\xi$ gives the smallest eigenvalue $\lambda_{\min}(x) > 0$. So (4.1.3) characterizes the ellipticity of PDEs, and the smallest eigenvalue $\lambda_{\min}(x)$ depends on the chosen value of $x$ and serves as the minimum of the LHS of (4.1.3).
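For a symmetric $2 \times 2$ coefficient matrix, both the discriminant test $B^2 - AC < 0$ and the eigenvalue bound (4.1.3) can be checked directly. The sketch below uses the sample matrix with $A = 2$, $B = 1$, $C = 3$, chosen here purely for illustration; the function names are hypothetical.

```python
import math

def eigenvalues_2x2(a, b, c):
    """Eigenvalues (smallest first) of the symmetric matrix [[a, b], [b, c]]."""
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    return mean - radius, mean + radius

def is_elliptic(a, b, c):
    """Discriminant test B^2 - AC < 0 for A u_xx + 2B u_xy + C u_yy + ... = 0."""
    return b * b - a * c < 0

def rayleigh_quotient(a, b, c, xi1, xi2):
    """Left-hand side of (4.1.3) for the matrix [[a, b], [b, c]] and xi = (xi1, xi2)."""
    return (a * xi1 * xi1 + 2 * b * xi1 * xi2 + c * xi2 * xi2) / (xi1 * xi1 + xi2 * xi2)

lam_min, lam_max = eigenvalues_2x2(2.0, 1.0, 3.0)
directions = [(math.cos(t * math.pi / 18), math.sin(t * math.pi / 18)) for t in range(36)]
quotients = [rayleigh_quotient(2.0, 1.0, 3.0, x, y) for x, y in directions]
print(lam_min, lam_max, min(quotients), max(quotients))
print(is_elliptic(1.0, 0.0, 1.0), is_elliptic(1.0, 0.0, -1.0))  # Laplace vs. wave type
```

Every sampled quotient lies between the two eigenvalues, and the discriminant test separates the Laplace-type symbol from the wave-type one.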
4.1.2 Uniformly Elliptic Operator

To make this lower bound uniform, we need to make it independent of $x$. This gives rise to the following definition.

Definition 4.1.1 (Uniformly Elliptic Operator) A second-order partial differential operator of the form
$$Lu(x) = -\sum_{i,j=1}^{n} a_{ij}(x)\frac{\partial^2 u}{\partial x_i \partial x_j}(x) + \sum_{i=1}^{n} b_i(x)\frac{\partial u}{\partial x_i}(x) + c(x)u(x), \quad x \in \Omega \subseteq \mathbb{R}^n$$
is called uniformly elliptic if there exists a positive number $\lambda_0 > 0$ such that
$$\sum_{i,j=1}^{n} a_{ij}(x)\xi_i\xi_j \ge \lambda_0|\xi|^2$$
for all $x \in \Omega$ and $\xi = (\xi_1, \xi_2, \ldots, \xi_n) \in \mathbb{R}^n$. The number $\lambda_0$ is called the uniform ellipticity constant.

There are some important particular cases. Due to their physical interpretation, uniformly elliptic PDEs usually carry an extra condition on the ellipticity: there exists $0 < \Lambda < \infty$ such that
$$\sum_{i,j=1}^{n} a_{ij}\xi_i\xi_j \le \Lambda|\xi|^2$$
for all $\xi \in \mathbb{R}^n$. This is a stronger version of uniform ellipticity in which the diffusion term can be controlled from above and below. It is easy to see that choosing suitable values of $\xi_i, \xi_j$ yields $a_{ij} \in L^\infty(\Omega)$. This is interpreted by the fact that there is no blow-up in the diffusion process, which, in many situations, arises naturally, so we will adopt this assumption throughout this chapter. One advantage of this two-sided control of the diffusion process is that all eigenvalues of the matrix $A$ lie between these two bounds. Namely, for any eigenvalue $\lambda(x)$ of $A$ and for all $x \in \Omega$ we have
$$\lambda_0 \le \lambda \le \Lambda. \tag{4.1.4}$$
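The two constants in (4.1.4) can be estimated for a concrete coefficient field by sampling. In the sketch below the matrix $A(x,y)$ with entries $a_{11} = 1 + x^2$, $a_{12} = a_{21} = \frac{xy}{2}$, $a_{22} = 1 + y^2$ on the unit square is an assumed example, not one from the text, and `ellipticity_bounds` is a hypothetical helper that takes the extreme eigenvalues over a grid.

```python
import math

def coefficient_matrix(x, y):
    """Assumed sample field: entries (a11, a12, a22) of the symmetric matrix A(x, y)."""
    return 1.0 + x * x, 0.5 * x * y, 1.0 + y * y

def eigenvalues_2x2(a, b, c):
    """Eigenvalues (smallest first) of the symmetric matrix [[a, b], [b, c]]."""
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    return mean - radius, mean + radius

def ellipticity_bounds(samples=50):
    """Grid estimates of lambda_0 and Lambda over the closed unit square."""
    lam0, Lam = float("inf"), 0.0
    for i in range(samples + 1):
        for j in range(samples + 1):
            lo, hi = eigenvalues_2x2(*coefficient_matrix(i / samples, j / samples))
            lam0 = min(lam0, lo)
            Lam = max(Lam, hi)
    return lam0, Lam

lam0, Lam = ellipticity_bounds()
print(lam0, Lam)
```

For this field the smallest eigenvalue never drops below $1$ and the largest is attained at the corner $(1,1)$, so the sampled values give $\lambda_0 = 1$ and $\Lambda = 2.5$; a grid search of this kind only bounds the true constants when the extremes happen to fall on grid points, as they do here.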
Another particular case is when $A(x)$ is the identity matrix (i.e., $a_{ij}(x) = \delta_{ij}$), and $b_i = c = 0$. In this case, the operator (4.1.2) reduces to the Laplace operator
$$Lu(x) = -\nabla^2 u(x) = -\sum_{i=1}^{n} \frac{\partial^2 u}{\partial x_i^2}(x).$$
4.1.3 Elliptic PDEs

Elliptic equations are extremely important and can be used in a wide range of applications in applied mathematics and mathematical physics. The most basic examples of elliptic equations are
(1) Laplace equation: $\nabla^2 u = 0$.
(2) Poisson equation: $\nabla^2 u = f$.
(3) Helmholtz equation: $(\nabla^2 + \mu)u = f$.

The most common types of boundary conditions are:
1. The Dirichlet condition: $u = g$ on $\partial\Omega$.
2. The Neumann condition: $\dfrac{\partial u}{\partial n} = \nabla u \cdot n = g$ on $\partial\Omega$,
where $n$ is the outward unit normal vector.

Laplace's equation is the cornerstone of potential theory and describes the electric potential in a region. It is one of the most important partial differential equations because it has numerous applications in applied mathematics, physics, and engineering. Poisson's equation is the nonhomogeneous version of Laplace's equation and describes the electric potential in the presence of a charge. It plays a dominant role in electrostatic theory and gravity. The Helmholtz equation is another fundamental equation which has important applications in many areas of physics, such as electromagnetic theory, acoustics, classical and quantum mechanics, thermodynamics, and geophysics. It is therefore no wonder that the theory of elliptic partial differential equations stands as one of the most active areas in applied mathematics and attracts increasing interest from researchers. Studying solutions of elliptic PDEs provides a comprehensive overview of the equations of mathematical physics and a wide view of the theory of PDEs, so it suffices for our needs in this text. Moreover, elliptic equations have no real characteristic curves, and consequently solutions to elliptic equations don't possess discontinuous derivatives, because such discontinuities could occur only along characteristic curves. This makes elliptic equations the perfect tool for investigating equilibrium (time-independent) steady-state processes, in which time has no effect and there are no singularities in the solution to be transported.
4.2 Weak Solution

4.2.1 Motivation for Weak Solutions

A classical solution of a boundary value problem is a solution that satisfies the problem pointwise everywhere. It should be differentiable as many times as needed to fulfill the PDE. If the equation is of order $m$ and defined in a domain $\Omega$, then the classical solution must be in $C^m(\Omega)$, must satisfy the equation at every $x \in \Omega$, and must also satisfy the boundary conditions of the problem at every $x \in \partial\Omega$. The problem of finding a solution to an equation in a domain that also satisfies the conditions on the boundary of this domain is called a "Boundary Value Problem".
Our elliptic BVP takes the following form: Let $\Omega$ be a bounded and open set in $\mathbb{R}^n$. Find $u \in C^2(\Omega) \cap C(\overline{\Omega})$ such that
$$\begin{cases} Lu = f, & x \in \Omega, \\ u = g, & x \in \partial\Omega, \end{cases} \tag{4.2.1}$$
for some linear elliptic operator $L$ as in (4.1.2). Because of the boundary condition, this is called the Dirichlet problem. Notice that we can write the solution of this problem as $w + g$, where $w$ is the solution to the problem
$$\begin{cases} L(w + g) = f, & x \in \Omega, \\ w = 0, & x \in \partial\Omega, \end{cases}$$
so for simplicity we can just assume $g = 0$ in (4.2.1). If we can find such $u$, then it will be a classical solution of problem (4.2.1). As said earlier, these equations model natural and physical phenomena occurring in the real world. Unfortunately, these models may not admit classical solutions. In fact, many equations in various areas of applied mathematics may have solutions that are not continuously differentiable, or not even continuous (e.g., shock waves), and so requiring classical solutions to these equations may in general be too restrictive. Consider for example the case when $f$ is a regular distribution that is not continuous; then the equation
$$\nabla^2 u = f$$
cannot have a twice continuously differentiable solution, since otherwise $f$ would also be continuous. In this case, it would be helpful to be less demanding by seeking solutions with less regularity, in the sense that they need not satisfy the problem everywhere in the domain and the conditions on its boundary.
4.2.2 Weak Formulation of Elliptic BVP

How can we find such solutions? Sobolev spaces come to the rescue as they provide all the essentials to obtain these solutions. Sobolev spaces are the completion of $C^\infty$ spaces, and they include weakly differentiable functions that are not necessarily continuous or differentiable in the usual sense, but that can be approximated by smooth functions in $C_c^\infty(\Omega)$. These proposed solutions are supposed to satisfy the equations in a distributional sense, not in a pointwise sense. Recall from (2.2.1) that two distributions $T$ and $S$ are equal if
$$\langle T, \varphi \rangle = \langle S, \varphi \rangle$$
for all $\varphi \in C_c^\infty(\Omega)$. We will use the same formulation here. Namely, we will multiply both sides of the equation $Lu = f$ in (4.2.1) by a test function $\varphi \in C_c^\infty(\Omega)$ and then integrate over $\Omega$ to obtain
244
4 Elliptic Theory
(Lu, ϕ) d x =
f ϕd x.
(4.2.2)
By density, this extends to all functions v ∈ H01 (). Indeed, for every v ∈ H01 () we can choose a sequence ϕn ∈ Cc∞ (∞) such that ϕn −→ v. Since the above equation is written in terms of ϕn , we pass the limit using Dominated Convergence Theorem to obtain
f vd x. (Lu, v) d x =
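If we let $f \in L^2(\Omega)$ and $u \in H^1(\Omega)$, then (4.2.2) is well-defined and can be written as
\[
\int_\Omega \left( -\sum_{i,j=1}^{n} \frac{\partial}{\partial x_i}\left( a_{ij} \frac{\partial u}{\partial x_j} \right) + \sum_{i=1}^{n} b_i \frac{\partial u}{\partial x_i} + cu \right) v \, dx = \langle f, v \rangle_{L^2(\Omega)},
\]
and performing integration by parts in the divergence term (the first term), making use of the fact that $v = 0$ on $\partial\Omega$, yields
\[
\int_\Omega \left( \sum_{i,j=1}^{n} a_{ij} \frac{\partial u}{\partial x_i} \frac{\partial v}{\partial x_j} + \sum_{i=1}^{n} b_i \frac{\partial u}{\partial x_i} v + cuv \right) dx = \langle f, v \rangle_{L^2(\Omega)}.
\tag{4.2.3}
\]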
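Since we usually impose the homogeneous Dirichlet condition $u = 0$ on the boundary, our best choice of solution space is $H_0^1(\Omega)$. If we can find a function $u \in H_0^1(\Omega)$ such that (4.2.3) holds for all $v \in H_0^1(\Omega)$, then $u$ satisfies the Dirichlet problem in a distributional sense, and since this function satisfies the problem in the weak sense, this type of solution is known as a weak solution.

Definition 4.2.1 (Weak Formulation, Weak Solution) Consider the Dirichlet problem
\[
\begin{cases}
Lu = f(x), & x \in \Omega, \\
u = 0, & x \in \partial\Omega,
\end{cases}
\tag{4.2.4}
\]
for some elliptic operator $L$ of the form
\[
Lu(x) = -\sum_{i,j=1}^{n} \frac{\partial}{\partial x_i}\left( a_{ij}(x) \frac{\partial u}{\partial x_j} \right) + \sum_{i=1}^{n} b_i(x) \frac{\partial u}{\partial x_i}(x) + c(x)u(x).
\]
Moreover, assume that $a_{ij}, b_i, c \in L^\infty(\Omega)$ and $f \in L^2(\Omega)$, with $\Omega \subset \mathbb{R}^n$. Then, the problem
\[
\begin{cases}
\displaystyle\int_\Omega (Lu)\, v \, dx = \langle f, v \rangle_{L^2(\Omega)} & \text{in } \Omega, \\
u = 0 & \text{on } \partial\Omega,
\end{cases}
\tag{4.2.5}
\]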
is called the weak (or variational) formulation of problem (4.2.4). If there exists $u \in H_0^1(\Omega)$ which satisfies (4.2.5) (i.e., (4.2.3)) for all $v \in H_0^1(\Omega)$, then $u$ is a weak solution of problem (4.2.4).

Here, we need to emphasize that $\langle f, v \rangle_{L^2(\Omega)}$ is not really an inner product on $H_0^1(\Omega)$ but rather a convenient notation for the action of $f$ on $v$, which behaves in the same way. Since $|\langle f, v \rangle| \le \|f\|_{L^2(\Omega)} \|v\|_{H_0^1(\Omega)} < \infty$, $f$ defines a bounded linear functional on $H_0^1(\Omega)$ by $f(v) = \langle f, v \rangle$. So the problem is equivalent to the following: given $f \in H^{-1}(\Omega)$, find $u \in H_0^1(\Omega)$ such that
\[
\langle f, v \rangle = B[u, v] \quad \text{for all } v \in H_0^1(\Omega).
\]
We need to emphasize the following important observation: If problem (4.2.4) has a classical solution and a weak solution, then, with sufficient regularity conditions, the weak solution is a classical solution. It is evident from the formulation above that a classical solution of (4.2.4) is also a weak solution. On the other hand, let $u \in C^2(\Omega)$ be a weak solution of (4.2.4), i.e., it satisfies (4.2.5) for every $v \in H_0^1(\Omega)$, and suppose $a_{ij}, b_i, c \in C^1(\Omega)$ and $f \in C(\Omega)$. Performing integration by parts again in the divergence term (the first term of the LHS of (4.2.5)) gives
\[
\int_\Omega \left( -\sum_{i,j=1}^{n} \frac{\partial}{\partial x_i}\left( a_{ij} \frac{\partial u}{\partial x_j} \right) + \sum_{i=1}^{n} b_i \frac{\partial u}{\partial x_i} + cu \right) v \, dx = \int_\Omega f v \, dx,
\]
which implies
\[
\int_\Omega \left( -\sum_{i,j=1}^{n} \frac{\partial}{\partial x_i}\left( a_{ij}(x) \frac{\partial u}{\partial x_j} \right) + \sum_{i=1}^{n} b_i(x) \frac{\partial u}{\partial x_i}(x) + c(x)u(x) - f \right) v \, dx = 0.
\]
By the Fundamental Lemma of the Calculus of Variations (Lemma 3.2.9), we see that $u$ satisfies (4.2.4) almost everywhere, but since the terms on both sides of (4.2.4) are continuous, the result extends by continuity to all $x \in \Omega$, and thus $u$ is a classical solution of (4.2.4).

Define $B[u, v]$ to be the integral in the LHS of (4.2.5), in the form (4.2.3). This is called the bilinear form associated with $L$, and equation (4.2.5) can be written as
\[
B[u, v] = f(v).
\]
This $B$ will play a dominant role in establishing the existence of weak solutions of elliptic PDEs.
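The identity $B[u, v] = f(v)$ can be explored numerically. The following is a minimal sketch, not a method from the text: it assumes the one-dimensional model problem $-u'' = f$ on $(0, 1)$ with $u(0) = u(1) = 0$, restricts $B[u, v] = \int_0^1 u'v'\,dx$ to piecewise-linear "hat" functions, and approximates $f(v)$ by a one-point rule. The name `solve_poisson_1d` and the test data are hypothetical choices made for illustration.

```python
import math

def solve_poisson_1d(f, n):
    """Galerkin sketch for -u'' = f on (0, 1) with u(0) = u(1) = 0.

    Uses n interior nodes and hat functions phi_i, so the discrete problem
    reads B[u_h, phi_i] = f(phi_i) with B[u, v] = int_0^1 u' v' dx.
    """
    h = 1.0 / (n + 1)
    nodes = [(i + 1) * h for i in range(n)]
    # Stiffness entries B[phi_j, phi_i]: 2/h on the diagonal, -1/h next to it.
    diag = [2.0 / h] * n
    off = [-1.0 / h] * (n - 1)
    # Load entries f(phi_i) = int f phi_i dx, approximated by h * f(x_i).
    rhs = [h * f(x) for x in nodes]
    # Thomas algorithm: forward elimination for the symmetric tridiagonal system.
    for i in range(1, n):
        m = off[i - 1] / diag[i - 1]
        diag[i] -= m * off[i - 1]
        rhs[i] -= m * rhs[i - 1]
    # Back substitution.
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (rhs[i] - off[i] * u[i + 1]) / diag[i]
    return nodes, u

# Hypothetical test case: f = pi^2 sin(pi x), whose exact solution is sin(pi x).
nodes, u_h = solve_poisson_1d(lambda x: math.pi ** 2 * math.sin(math.pi * x), 99)
max_err = max(abs(ui - math.sin(math.pi * xi)) for xi, ui in zip(nodes, u_h))
print(max_err)  # small, and it shrinks as n grows
```

The design choice here is that only first derivatives of the basis functions enter the computation, which mirrors the fact that the weak formulation asks for $u \in H_0^1$ rather than $u \in C^2$.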
4.2.3 Classical Versus Strong Versus Weak Solutions

We adopt the following definitions to compare all types of solutions to a differential equation.
(1) Classical Solution: If $u$ satisfies equation (4.2.4) pointwise for all $x \in \Omega$, then $u$ is said to be a classical solution of (4.2.4).
(2) Strong Solution: If $u$ satisfies equation (4.2.4) pointwise for almost all $x \in \Omega$, then $u$ is said to be a strong solution of (4.2.4).
(3) Weak Solution: If $u$ satisfies the weak formulation (4.2.5) of equation (4.2.4), then $u$ is said to be a weak solution of (4.2.4).

It is easy to see that every classical solution is a strong solution, but the converse is not necessarily true. We illustrate the idea by the Poisson equation
\[
\begin{cases}
-\Delta u = f & \text{in } \Omega, \\
u = 0 & \text{on } \partial\Omega,
\end{cases}
\]
for some $f \in L^2(\Omega)$. If $u \in C^2(\Omega)$ satisfies the equation pointwise for every $x \in \Omega$ together with the boundary condition, then $u$ is a classical solution to the problem, and in this case we get $f \in C(\Omega)$. When it comes to applications in science and engineering, requiring continuous data is somewhat restrictive in practice, and requiring $f$ to be measurable and square integrable seems more realistic in many situations.

If it happens that $u \in H^2(\Omega) \cap H_0^1(\Omega)$ satisfies the equation at all $x \in \Omega$ except for a set of measure zero, then $u$ is a strong solution. Notice the difference between the two notions: $u$ is now a Sobolev function, so it is measurable but not necessarily continuous, and consequently the boundary condition cannot be taken pointwise, because the boundary of $\Omega$ has measure zero and measurable functions are only determined up to sets of measure zero (remember that in $L^p$ spaces we are dealing with classes of functions rather than functions). Nevertheless, the function belongs to $H^2$, so it possesses second weak derivatives, and this allows it to satisfy the equation pointwise almost everywhere and produces $f \in L^2(\Omega)$ as a result of the calculations.

If $u \in H_0^1(\Omega)$ satisfies the variational weak formulation
\[
\int_\Omega Du \cdot Dv \, dx = \int_\Omega f v \, dx
\]
for all $v \in H_0^1(\Omega)$, then $u$ is a weak solution of the equation. Observe here that $u$ satisfies neither the equation nor the boundary condition pointwise, but rather globally via integration over the domain. We only require the first weak derivative of $u$ to exist, and since $H^2(\Omega) \subset H^1(\Omega)$ we see that every strong solution is indeed a weak solution, but the converse is not necessarily true. Thus it may happen that the equation has a weak solution but not a strong solution; if the weak solution turns out to be in $H^2(\Omega)$, then it is a strong solution. Sections 4.9 and 4.10 investigate this direction thoroughly.

We end the section with the following important remark: The notion of weak solution should not be confused with the notion of weak derivative. The former is called "weak" because it satisfies the weak formulation of the PDE, which is a weaker condition than satisfying the equation pointwise, but this does not mean that it satisfies the equation with its weak derivatives. For example, the function (3.1.1) is a weak solution of the Laplace equation $\Delta u = 0$ although it is not weakly differentiable, as illustrated in Sect. 3.1.
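To make the gap between weak and strong solutions concrete, here is a small one-dimensional illustration. It is an example chosen for this purpose rather than one taken from the text, and it assumes $\Omega = (-1, 1)$, the Dirac delta $\delta_0$, and the fact that point evaluation $v \mapsto v(0)$ is a bounded functional on $H_0^1(-1, 1)$ in one dimension.

```latex
u(x) = 1 - |x|, \qquad
u'(x) = \begin{cases} 1, & -1 < x < 0, \\ -1, & 0 < x < 1, \end{cases}
\qquad u \in H_0^1(-1,1).

% For every v \in H_0^1(-1,1), using v(-1) = v(1) = 0:
\int_{-1}^{1} u' v' \, dx
  = \int_{-1}^{0} v' \, dx - \int_{0}^{1} v' \, dx
  = \bigl[ v(0) - v(-1) \bigr] - \bigl[ v(1) - v(0) \bigr]
  = 2 v(0)
  = \langle 2\delta_0, v \rangle .
```

So $u$ is a weak solution of $-u'' = 2\delta_0$ with right-hand side in $H^{-1}(-1, 1)$. Since $u'$ has a jump at $0$, $u \notin H^2(-1, 1)$, and $u$ is not a strong solution.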
4.3 Poincare Equivalent Norm

4.3.1 Poincare Inequality on $H_0^1$

In this section, we will deduce some important results that are very useful in establishing existence and uniqueness of weak solutions. Theorem 3.9.8 discussed the Poincare inequality in $W_0^{1,p}$ as a consequence of the Gagliardo–Nirenberg–Sobolev (GNS) inequality. According to the remark after Theorem 3.9.8, the inequality holds for $p = 2$ for all $n$. Here is a restatement of the result with an alternative proof that doesn't depend on the GNS inequality. The same argument works for $1 \le p < \infty$, and the domain may be unbounded in general as long as it is bounded in at least one direction, which gives extra flexibility in the choice of $\Omega$.

Theorem 4.3.1 (Poincare Inequality on $H_0^1$) Let $\Omega \subset \mathbb{R}^n$ be an open set that is bounded in at least one direction of $\mathbb{R}^n$. Then there exists $C > 0$ (which depends only on $\Omega$) such that for every $u \in H_0^1(\Omega)$, we have
\[
\|u\|_{L^2(\Omega)} \le C \|Du\|_{L^2(\Omega)}.
\]

Proof For $n > 1$, we assume that $\Omega$ is bounded in the $x_i$ direction, that is, $|x_i| \le M$ for all $x \in \Omega$, with $x_i$ being the $i$th component of $x$. For $u \in C_c^\infty(\Omega)$:
\[
\|u\|_{L^2(\Omega)}^2 = \int_\Omega |u|^2 \, dx = -\int_\Omega 2 x_i u \frac{\partial u}{\partial x_i} \, dx,
\]
where we perform integration by parts in $x_i$ (writing $|u|^2 = |u|^2 \, \partial x_i / \partial x_i$). By the Cauchy–Schwarz inequality, this gives
\[
\|u\|_{L^2(\Omega)}^2 \le 2M \int_\Omega |u| \left| \frac{\partial u}{\partial x_i} \right| dx \le C \|u\|_{L^2(\Omega)} \|D_i u\|_{L^2(\Omega)}.
\]
The result follows by dividing both sides by $\|u\|_{L^2(\Omega)}$. For $n = 1$ we have $\Omega = (a, b)$ and $u(a) = u(b) = 0$, and the inequality can be established by the same argument as above. Now, using the fact that $H_0^1(\Omega) = \overline{C_c^\infty(\Omega)}$, the inequality extends by density to $u \in H_0^1(\Omega)$: we take a sequence $u_n \in C_c^\infty(\Omega)$ such that $u_n \to u$ in $H_0^1$, which implies
\[
\|D^i u_n\|_{L^2(\Omega)} \to \|D^i u\|_{L^2(\Omega)} \quad \text{for } i = 0, 1.
\]

Remark The Poincare constant $C = 2M$ depends only on $\Omega$ and may be regarded as the least diameter of $\Omega$, i.e., its extent in the bounded direction. The value $2M$ is not the best value that can be obtained for $C$, but it suffices for our needs.
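The constant $C = 2M$ can be checked numerically on a simple case. The sketch below is illustrative only: it assumes $\Omega = (-M, M)$ with a hypothetical value of $M$, picks the test function $u(x) = \cos(\pi x / 2M)$ (which vanishes at $\pm M$), and approximates the $L^2$ norms with a midpoint rule. The helper `l2_norm` is not from the text.

```python
import math

def l2_norm(g, a, b, n=20000):
    """Approximate the L^2(a, b) norm of g with the composite midpoint rule."""
    h = (b - a) / n
    total = sum(g(a + (k + 0.5) * h) ** 2 for k in range(n))
    return math.sqrt(total * h)

M = 1.5  # hypothetical half-width: Omega = (-M, M), so |x| <= M on Omega

def u(x):
    # Test function vanishing at x = -M and x = M.
    return math.cos(math.pi * x / (2 * M))

def du(x):
    # Derivative of u.
    return -math.pi / (2 * M) * math.sin(math.pi * x / (2 * M))

# Ratio ||u|| / ||u'||, which the Poincare inequality bounds by C = 2M.
ratio = l2_norm(u, -M, M) / l2_norm(du, -M, M)
print(ratio, 2 * M)
```

For this particular $u$ the ratio equals $2M/\pi$, which is below $2M$ and is consistent with the remark that $2M$ is not the best constant.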
4.3.2 Equivalent Norm on $H_0^1$

An important consequence of this inequality is the following.

Corollary 4.3.2 Let $\Omega \subset \mathbb{R}^n$ be an open set that is bounded in at least one direction. Then the norm
\[
\|u\|_\partial = \|Du\|_{L^2(\Omega)}
\]
is induced by an inner product on $H_0^1(\Omega)$, and $H_0^1(\Omega)$ endowed with this inner product is a Hilbert space.

Proof For every $u \in H_0^1(\Omega)$, we use the Poincare inequality to obtain
\[
\|u\|_{H^1(\Omega)}^2 \le (C^2 + 1) \|Du\|_{L^2(\Omega)}^2 \le (C^2 + 1) \|u\|_{H^1(\Omega)}^2.
\tag{4.3.1}
\]
This implies that the norm $\|Du\|_{L^2(\Omega)}$ is equivalent to the standard norm $\|u\|_{H_0^1(\Omega)}$, and so we can define the following inner product on $H_0^1(\Omega)$:
\[
(u, v)_\partial = \langle Du, Dv \rangle_{L^2(\Omega)} = \int_\Omega Du \cdot Dv \, dx,
\tag{4.3.2}
\]
such that
\[
\|u\|_\partial^2 = \|Du\|_{L^2(\Omega)}^2 = (u, u)_\partial,
\]
and the result follows since $H_0^1(\Omega)$ endowed with $\|\cdot\|_{H_0^1(\Omega)}$ is a complete space.

The norm $\|u\|_\partial = \|Du\|_{L^2(\Omega)}$ on $H_0^1(\Omega)$ shall be called the Poincare norm. In a similar fashion, we shall write $\|u\|_{\partial^2}$ to denote $\|D^2 u\|_{L^2(\Omega)}$.
4.3.3 Poincare–Wirtinger Inequality

One main concern about the Poincare inequality is the case when $u$ is a nonzero constant over $\Omega$. Of course, in this case $u \notin H_0^1(\Omega)$. So we need to generalize the inequality to include this case in the general space $H^1(\Omega)$. To motivate this, we need the following definition:

Definition 4.3.3 (Mean Value) Let $\Omega \subset \mathbb{R}^n$. Then the mean value of a function $u$ over $\Omega$, denoted by $\bar{u}$, is given by
\[
\bar{u} = \frac{1}{|\Omega|} \int_\Omega u \, dx.
\]
The next inequality generalizes Poincare's inequality.

Theorem 4.3.4 (Poincare–Wirtinger Inequality) Let $\Omega$ be an open connected Lipschitz set in $\mathbb{R}^n$, $n \ge 2$. Then there exists $C > 0$ such that for every $u \in H^1(\Omega)$, we have
\[
\|u - \bar{u}\|_{L^2(\Omega)} \le C \|Du\|_{L^2(\Omega)}.
\]

Proof If the estimate above doesn't hold, then for every $m > 0$ there exists $u_m \in H^1(\Omega)$ such that
\[
\|u_m - \bar{u}_m\|_{L^2(\Omega)} > m \|Du_m\|_{L^2(\Omega)}.
\]
Define the following sequence:
\[
v_m = \frac{u_m - \bar{u}_m}{\|u_m - \bar{u}_m\|_2}.
\]
Then it is clear that $\|v_m\|_2 = 1$ and $\bar{v}_m = 0$. Moreover, $\|Dv_m\|_{L^2}$
0 for all nonzero $u \in H$.
(4) $B$ is strongly positive (or coercive) on $H$ if there exists $\eta > 0$ such that $B[u, u] \ge \eta \|u\|_H^2$ for all $u \in H$.

In view of Definition 4.4.2, we expect that many of the properties of linear mappings extend to bilinear mappings. In particular, we have the following, whose proof is left to the reader.

Proposition 4.4.3 Let $B : H \times H \to \mathbb{R}$ be a bilinear mapping. Then the following are equivalent:
(1) $B$ is bounded.
(2) $B$ is continuous everywhere in $H \times H$.
(3) $B$ is continuous at $(0, 0)$.

Proof Exercise.
4.4.2 Elliptic Bilinear Mapping

Now we come to our elliptic bilinear map that we already introduced in the previous section.

Definition 4.4.4 (Elliptic Bilinear Map) Let $a_{ij}, b_i, c \in L^\infty(\Omega)$ for some open $\Omega \subseteq \mathbb{R}^n$. We define the elliptic bilinear map $B : H_0^1(\Omega) \times H_0^1(\Omega) \to \mathbb{R}$ by
\[
B[u, v] = \int_\Omega \left( \sum_{i,j=1}^{n} a_{ij} \frac{\partial u}{\partial x_i} \frac{\partial v}{\partial x_j} + \sum_{i=1}^{n} b_i \frac{\partial u}{\partial x_i} v + cuv \right) dx.
\tag{4.4.1}
\]
This $B$ is the bilinear form associated with the elliptic operator $L$ defined in (4.1.1), which serves our weak formulation given in (4.2.5). The conditions adopted in the definition suffice for our needs in this text. Before we establish our estimates, we need the following.

Lemma 4.4.5 (Cauchy's inequality) Let $\epsilon > 0$. Then for $s, t > 0$ we have
\[
st \le \epsilon s^2 + \frac{t^2}{4\epsilon}.
\]

Proof We have
\[
st = \left( (2\epsilon)^{1/2} s \right) \frac{t}{(2\epsilon)^{1/2}}.
\]
The result follows using Young's inequality $ab \le \frac{a^2}{2} + \frac{b^2}{2}$ with $a = (2\epsilon)^{1/2} s$ and $b = \dfrac{t}{(2\epsilon)^{1/2}}$.
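As a cross-check on Lemma 4.4.5, the inequality can also be obtained by completing a square, without citing Young's inequality. The derivation below is a short alternative sketch; the numerical values at the end are hypothetical and serve only as a sanity check.

```latex
0 \le \left( \sqrt{\epsilon}\, s - \frac{t}{2\sqrt{\epsilon}} \right)^{2}
  = \epsilon s^{2} - st + \frac{t^{2}}{4\epsilon}
\quad\Longrightarrow\quad
st \le \epsilon s^{2} + \frac{t^{2}}{4\epsilon},
\qquad \text{with equality iff } t = 2\epsilon s .

% Sanity check with s = 3,\ t = 4,\ \epsilon = 1/2:
st = 12 \le \tfrac{1}{2}\cdot 9 + \frac{16}{2} = 12.5 .
```

The role of $\epsilon$ is to shift weight between the two terms: a small $\epsilon$ makes the $s^2$ coefficient small at the price of a large $t^2$ coefficient, which is how the inequality is used in the proof of Garding's inequality.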
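4.4.3 Garding's Inequality

Theorem 4.4.6 (Elliptic Estimates) Let $B$ be the elliptic bilinear map (4.4.1) for some open $\Omega$ in $\mathbb{R}^n$.
(1) If $a_{ij}, b_i, c \in L^\infty(\Omega)$, then $B$ is bounded, i.e., there exists $\alpha > 0$ such that for every $u, v \in H_0^1(\Omega)$ we have
\[
|B[u, v]| \le \alpha \|u\|_{H_0^1(\Omega)} \|v\|_{H_0^1(\Omega)}.
\]
(2) Garding's inequality: If $B$ is associated with a uniformly elliptic operator $L$ on a domain $\Omega$ that is bounded in at least one direction, then there exist $\beta > 0$ and $\gamma \ge 0$ such that for all $u \in H_0^1(\Omega)$ we have
\[
B[u, u] \ge \beta \|u\|_{H_0^1(\Omega)}^2 - \gamma \|u\|_{L^2(\Omega)}^2.
\]

Remark For convenience, we write $\|\cdot\|_{L^\infty(\Omega)} = \|\cdot\|_\infty$ and $\|\cdot\|_{L^2(\Omega)} = \|\cdot\|_2$.

Proof For (1), let
\[
M = \max\left\{ \sum_{i,j=1}^{n} \|a_{ij}\|_\infty, \; \sum_{i=1}^{n} \|b_i\|_\infty, \; \|c\|_\infty \right\}.
\]
Then we have
\[
\begin{aligned}
|B[u, v]| &\le \sum_{i,j=1}^{n} \|a_{ij}\|_\infty \int_\Omega \left| \frac{\partial u}{\partial x_i} \right| \left| \frac{\partial v}{\partial x_j} \right| dx + \sum_{i=1}^{n} \|b_i\|_\infty \int_\Omega \left| \frac{\partial u}{\partial x_i} \right| |v| \, dx + \|c\|_\infty \int_\Omega |u| |v| \, dx \\
&\le M \int_\Omega \bigl( |Du| |Dv| + |Du| |v| + |u| |v| \bigr) dx \\
&\le M \bigl( \|Du\|_2 \|Dv\|_2 + \|Du\|_2 \|v\|_2 + \|u\|_2 \|v\|_2 \bigr) \quad \text{(by the C-S inequality)} \\
&\le \alpha \|u\|_{H_0^1(\Omega)} \|v\|_{H_0^1(\Omega)} \quad \text{(since } \|\cdot\|_{L^2} \le \|\cdot\|_{H_0^1} \text{)}
\end{aligned}
\]
for some suitable $\alpha = 3M > 0$.

For (2), since $L$ is uniformly elliptic, there exists $\lambda_0 > 0$ such that
\[
\sum_{i,j=1}^{n} a_{ij}(x) \xi_i \xi_j \ge \lambda_0 |\xi|^2
\tag{4.4.2}
\]
for all $\xi \in \mathbb{R}^n$. Let $\xi = Du$, substitute in (4.4.2), and integrate both sides over $\Omega$. Then we have
\[
\begin{aligned}
\lambda_0 \int_\Omega |Du|^2 \, dx &\le \int_\Omega \sum_{i,j=1}^{n} a_{ij}(x) \frac{\partial u}{\partial x_i} \frac{\partial u}{\partial x_j} \, dx \\
&= B[u, u] - \int_\Omega \sum_{i=1}^{n} b_i(x) \frac{\partial u}{\partial x_i} u \, dx - \int_\Omega c u^2 \, dx \\
&\le B[u, u] + \sum_{i=1}^{n} \|b_i\|_\infty \int_\Omega |Du| |u| \, dx + \|c\|_\infty \int_\Omega |u|^2 \, dx.
\end{aligned}
\]
Now we substitute $|Du| = s$ and $|u| = t$ in Cauchy's inequality with
\[
0 < \epsilon < \frac{\lambda_0}{2 \sum_{i=1}^{n} \|b_i\|_\infty}.
\]
This gives
\[
\lambda_0 \int_\Omega |Du|^2 \, dx \le B[u, u] + \epsilon \sum_{i=1}^{n} \|b_i\|_\infty \int_\Omega |Du|^2 \, dx + \left( \frac{1}{4\epsilon} \sum_{i=1}^{n} \|b_i\|_\infty + \|c\|_\infty \right) \int_\Omega u^2 \, dx.
\]
It follows that
\[
\frac{\lambda_0}{2} \int_\Omega |Du|^2 \, dx \le \left( \lambda_0 - \epsilon \sum_{i=1}^{n} \|b_i\|_\infty \right) \int_\Omega |Du|^2 \, dx \le B[u, u] + \left( \frac{1}{4\epsilon} \sum_{i=1}^{n} \|b_i\|_\infty + \|c\|_\infty \right) \int_\Omega u^2 \, dx,
\]
which, by (4.3.1), implies
\[
\beta \|u\|_{H_0^1(\Omega)}^2 \le \frac{\lambda_0}{2} \|Du\|_2^2 \le B[u, u] + \gamma \|u\|_2^2,
\]
where
\[
\beta = \frac{\lambda_0}{2(C^2 + 1)}, \qquad \gamma = \frac{1}{4\epsilon} \sum_{i=1}^{n} \|b_i\|_\infty + \|c\|_\infty.
\]
If $b_i = 0$, then we have
\[
\lambda_0 \int_\Omega |Du|^2 \, dx \le B[u, u] + \|c\|_\infty \int_\Omega |u|^2 \, dx,
\]
which, by using the Poincare inequality again, implies
\[
\beta \|u\|_{H_0^1(\Omega)}^2 \le B[u, u] + \gamma \|u\|_2^2,
\]
where
\[
\beta = \frac{\lambda_0}{C^2 + 1}, \qquad \gamma = \|c\|_\infty.
\]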
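A one-dimensional instance may help to see why the term $-\gamma \|u\|_2^2$ cannot be dropped in general. The example below is an illustration chosen here, assuming $\Omega = (0, \pi)$ and the operator $Lu = -u'' - k^2 u$ with a constant $k$, so that $\lambda_0 = 1$, $b = 0$ and $\|c\|_\infty = k^2$.

```latex
B[u,u] = \int_0^{\pi} \bigl( |u'|^{2} - k^{2} u^{2} \bigr)\, dx
       = \|u'\|_2^{2} - k^{2}\|u\|_2^{2}
       \ge \frac{1}{C^{2}+1}\,\|u\|_{H_0^1}^{2} - k^{2}\|u\|_2^{2}
       \qquad \text{(by (4.3.1))},

% so Garding's inequality holds with
\beta = \frac{\lambda_0}{C^{2}+1} = \frac{1}{C^{2}+1}, \qquad \gamma = \|c\|_\infty = k^{2}.

% For k = 1 and u(x) = \sin x \in H_0^1(0,\pi):
B[u,u] = \int_0^{\pi} \cos^{2}x \, dx - \int_0^{\pi} \sin^{2}x \, dx
       = \frac{\pi}{2} - \frac{\pi}{2} = 0 .
```

Since $B[u, u] = 0$ for a nonzero $u$ when $k = 1$, this $B$ is not coercive, yet Garding's inequality still holds with $\gamma = 1$.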
4.5 Symmetric Elliptic Operators

4.5.1 Riesz Representation Theorem for Hilbert Spaces

Before establishing results on the existence and uniqueness of solutions of elliptic PDEs, we need to recall the famous Riesz Representation Theorem on Hilbert spaces and give a brief proof of it.

Theorem 4.5.1 (Riesz Representation Theorem (RRT) for Hilbert Spaces) Let $H$ be a Hilbert space, and $0 \ne f \in H^*$. Then there exists a unique element $u \in H$ such that
\[
f(v) = \langle v, u \rangle \quad \text{for all } v \in H,
\]
and
\[
\|f\| = \|u\|.
\]

Proof Note that $Y = \ker(f)$ is a closed proper subspace of $H$, and so $Y^\perp$ contains a nonzero element, say $y_0$. Then $f(v) y_0 - f(y_0) v \in Y$, which implies that
\[
0 = f(v) \langle y_0, y_0 \rangle - f(y_0) \langle v, y_0 \rangle,
\]
from which we get
\[
f(v) = \frac{f(y_0)}{\langle y_0, y_0 \rangle} \langle v, y_0 \rangle = \left\langle v, \frac{f(y_0)}{\langle y_0, y_0 \rangle} y_0 \right\rangle.
\]
Then
\[
u = \frac{f(y_0)}{\langle y_0, y_0 \rangle} y_0
\]
establishes the existence. If there exists another element, say $u' \in H$, such that $f(v) = \langle v, u' \rangle$, then
\[
\langle v, u - u' \rangle = 0.
\]
Choosing $v = u - u'$ gives
\[
u = u',
\]
which establishes uniqueness. Note that $|f(v)| \le \|u\| \|v\|$ by the Cauchy–Schwarz inequality, so $f \in H^*$ and $\|f\| \le \|u\|$. On the other hand,
\[
\|u\|^2 = \langle u, u \rangle = f(u) \le \|f\| \|u\|.
\]
Divide by $\|u\|$ to get the other direction.
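In finite dimensions the theorem reduces to solving a linear system, which can make the statement easier to picture. The sketch below is an illustration under stated assumptions: the space is $\mathbb{R}^2$ with the inner product $\langle v, u \rangle_G = v^T G u$ for a hypothetical symmetric positive definite matrix `G`, and the functional is $f(v) = a \cdot v$ for a hypothetical vector `a`. The representer is then the solution of $Gu = a$.

```python
G = [[2.0, 1.0], [1.0, 3.0]]  # hypothetical SPD matrix defining <v, u>_G = v^T G u
a = [1.0, -4.0]               # hypothetical functional f(v) = a . v

def inner(v, u):
    """Inner product <v, u>_G on R^2."""
    return sum(v[i] * G[i][j] * u[j] for i in range(2) for j in range(2))

def f(v):
    """Bounded linear functional on R^2."""
    return a[0] * v[0] + a[1] * v[1]

def riesz_representer():
    """Solve G u = a by Cramer's rule, so that f(v) = <v, u>_G for all v."""
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    return [(G[1][1] * a[0] - G[0][1] * a[1]) / det,
            (G[0][0] * a[1] - G[1][0] * a[0]) / det]

u = riesz_representer()
for v in ([1.0, 0.0], [0.0, 1.0], [2.5, -1.0]):
    print(f(v), inner(v, u))  # the two values agree for every v
```

The identity $\|u\|^2 = f(u)$ used at the end of the proof can be read off here as `inner(u, u) == f(u)` up to rounding.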
4.5.2 Existence and Uniqueness Theorem—Poisson's Equation

The first equation to start with is Poisson's equation.

Theorem 4.5.2 (First Existence Theorem) Consider the Dirichlet problem
\[
\begin{cases}
-\nabla^2 u = f & \text{in } \Omega, \\
u = 0 & \text{on } \partial\Omega,
\end{cases}
\tag{4.5.1}
\]
where $f \in L^2(\Omega)$ for some open $\Omega \subset \mathbb{R}^n$ that is bounded in at least one direction. Then there exists a unique weak solution $u \in H_0^1(\Omega)$ for problem (4.5.1).

Proof Note that $L = -\nabla^2$ is elliptic with $a_{ij}(x) = \delta_{ij}$ and $b_i(x) = c = 0$ for all $i = 1, \dots, n$. So the elliptic bilinear map is of the form
\[
B[u, v] = \int_\Omega Du \cdot Dv \, dx = \langle u, v \rangle_\partial,
\]
which is the Poincare inner product defined in (4.3.2). Corollary 4.3.2 asserts that $H_0^1(\Omega)$ with this inner product is a Hilbert space, and since $f \in L^2(\Omega)$, the weak formulation takes the form
\[
(u, v)_\partial = \langle f, v \rangle = f(v) = \int_\Omega f v \, dx.
\]
Indeed, by Holder's (or the C-S) inequality,
\[
|f(v)| \le \int_\Omega |f v| \, dx \le \|f\|_{L^2(\Omega)} \|v\|_{H_0^1(\Omega)},
\]
so
\[
f \in (H_0^1(\Omega))^* = H^{-1}(\Omega),
\]
and therefore by the Riesz Representation Theorem (RRT) there exists a unique $u \in H_0^1(\Omega)$ satisfying the equation for all $v \in H_0^1(\Omega)$, and clearly $u|_{\partial\Omega} = 0$. Hence $u$ is the unique weak solution of problem (4.5.1).

The second equation to discuss is the nonhomogeneous Helmholtz equation.
4.5.3 Existence and Uniqueness Theorem—Helmholtz Equation

Theorem 4.5.3 (Second Existence Theorem) Consider the Dirichlet problem
\[
\begin{cases}
-\nabla^2 u + u = f & \text{in } \Omega, \\
u = 0 & \text{on } \partial\Omega,
\end{cases}
\tag{4.5.2}
\]
where $f \in H^{-1}(\Omega)$ for some $\Omega \subseteq \mathbb{R}^n$. Then there exists a unique weak solution $u \in H_0^1(\Omega)$ for problem (4.5.2).

Proof Note that $Lu = -\nabla^2 u + u$, which is elliptic with $a_{ij}(x) = \delta_{ij}$, $b_i(x) = 0$ for all $i = 1, \dots, n$, and $c(x) = 1$. Here, the elliptic bilinear map takes the following form:
\[
B[u, v] = \int_\Omega (Du \cdot Dv + uv) \, dx = \langle u, v \rangle_{H_0^1(\Omega)}.
\]
So $B$ defines the standard Sobolev inner product on $H_0^1(\Omega)$, the weak formulation reads
\[
B[u, v] = \langle f, v \rangle = f(v)
\]
for all $v \in H_0^1(\Omega)$, and $f$ is a bounded linear functional on $H_0^1(\Omega)$. Thus, by the Riesz Representation Theorem (RRT) there exists a unique function $u \in H_0^1(\Omega)$ satisfying the equation, and of course $u|_{\partial\Omega} = 0$. Hence, $u$ is the unique weak solution of problem (4.5.2).

We observe two important points from the two preceding theorems.
(1) The domain for the second problem was chosen to be $\Omega \subseteq \mathbb{R}^n$, which could be $\Omega = \mathbb{R}^n$, whereas the domain for the first problem was chosen to be open and bounded in at least one direction. This is because the first problem requires the Poincare inequality in order to use the Poincare norm, while the second problem used the standard Sobolev norm without the use of the Poincare inequality.
(2) Both operators in the two problems are symmetric. In fact, this condition is essential for $B$ to define an inner product on $H_0^1$. If $b_i \ne 0$ for some $i$, then $L$ is not symmetric, and thus $B$ cannot define an inner product and the Riesz Representation Theorem won't be applicable.
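4.5.4 Ellipticity and Coercivity

Now, we investigate equations of the form
\[
Lu = -\sum_{i,j=1}^{n} \frac{\partial}{\partial x_i}\left( a_{ij}(x) \frac{\partial u}{\partial x_j} \right) + c(x)u(x).
\]
The following result connects uniform ellipticity with coercivity.

Theorem 4.5.4 Consider the elliptic operator
\[
L = -\sum_{i,j=1}^{n} \partial_{x_i}\left( a_{ij}(x) \partial_{x_j} \right) + c(x),
\tag{4.5.3}
\]
defined on $H_0^1(\Omega)$ for some open $\Omega$ in $\mathbb{R}^n$ that is bounded in at least one direction, and let $a_{ij}, c \in L^\infty(\Omega)$ with $c(x) \ge 0$.
(1) If $L$ is uniformly elliptic, then the associated elliptic bilinear map $B[u, v]$ is coercive.
(2) Moreover, if $A = (a_{ij})$ is symmetric, then $B$ defines a complete inner product on $H_0^1(\Omega)$.

Proof The elliptic bilinear map associated with $L$ takes the form
\[
B[u, v] = \int_\Omega \left( \sum_{i,j=1}^{n} a_{ij}(x) \frac{\partial v}{\partial x_i} \frac{\partial u}{\partial x_j} + c(x)u(x)v(x) \right) dx.
\]
By the uniform ellipticity of $L$, there exists $\lambda_0 > 0$ such that for all $\xi \in \mathbb{R}^n$
\[
\sum_{i,j=1}^{n} a_{ij}(x) \xi_i \xi_j \ge \lambda_0 |\xi|^2.
\tag{4.5.4}
\]
Substituting $\xi = Du$ in (4.5.4), and then substituting the result in $B$, we get
\[
\begin{aligned}
B[u, u] &\ge \int_\Omega \left( \lambda_0 |Du|^2 + c(x) u^2 \right) dx \\
&\ge \int_\Omega \lambda_0 |Du|^2 \, dx \\
&= \int_\Omega \frac{\lambda_0}{2} |Du|^2 \, dx + \int_\Omega \frac{\lambda_0}{2} |Du|^2 \, dx \\
&\ge \int_\Omega \frac{\lambda_0}{2} |Du|^2 \, dx + \int_\Omega \frac{\lambda_0}{2C^2} u^2 \, dx \quad \text{(by the Poincare inequality)} \\
&\ge \sigma \int_\Omega \left( |Du|^2 + u^2 \right) dx \\
&= \sigma \|u\|_{H_0^1(\Omega)}^2
\end{aligned}
\]
for any $u \in H_0^1(\Omega)$, where
\[
\sigma = \min\left\{ \frac{\lambda_0}{2}, \frac{\lambda_0}{2C^2} \right\}.
\]
This proves that $B$ is coercive. So $B[u, u] > 0$ for $u \ne 0$, and if $B[u, u] = 0$ then we clearly have $u = 0$. Moreover, by symmetry of $A$, we have $B[u, v] = B[v, u]$. Hence, $B[u, v]$ defines an inner product $\langle \cdot, \cdot \rangle_B$ on $H_0^1(\Omega)$ and
\[
B[u, u] = \langle u, u \rangle_B = \|u\|_B^2 \ge \sigma \|u\|_{H_0^1(\Omega)}^2,
\]
or
\[
\|u\|_B \ge \sqrt{\sigma} \, \|u\|_{H_0^1(\Omega)}.
\tag{4.5.5}
\]
On the other hand,
\[
\langle u, u \rangle_B \le M \int_\Omega \left( |Du|^2 + u^2 \right) dx = M \|u\|_{H_0^1(\Omega)}^2,
\tag{4.5.6}
\]
where
\[
M = \max\left\{ \sum_{i,j=1}^{n} \|a_{ij}\|_{L^\infty(\Omega)}, \; \|c\|_{L^\infty(\Omega)} \right\}.
\]
Then (4.5.5) and (4.5.6) imply that the inner product $\langle \cdot, \cdot \rangle_B$ is equivalent to the standard inner product $\langle \cdot, \cdot \rangle_{H_0^1(\Omega)}$, and thus the space $(H_0^1(\Omega), \langle \cdot, \cdot \rangle_B)$ is a Hilbert space.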
4.5.5 Existence and Uniqueness Theorem—Symmetric Uniformly Elliptic Operator

The next theorem provides an existence and uniqueness result for (4.2.4) for a symmetric uniformly elliptic operator of the form (4.5.3). Remember that the condition $b_i = 0$ is essential for the symmetry of $L$.

Theorem 4.5.5 Consider the Dirichlet elliptic problem
\[
\begin{cases}
Lu = f & \text{in } \Omega, \\
u = 0 & \text{on } \partial\Omega,
\end{cases}
\tag{4.5.7}
\]
where $L$ is a uniformly elliptic operator of the form (4.5.3) defined on some open set $\Omega$ in $\mathbb{R}^n$ that is bounded in at least one direction. If $A = (a_{ij})$ is symmetric, $a_{ij}, c \in L^\infty(\Omega)$, $f \in L^2(\Omega)$, and $c(x) \ge 0$, then there exists a unique weak solution for problem (4.5.7).

Proof As in Theorem 4.5.4, we have
\[
B[u, v] = \int_\Omega \left( \sum_{i,j=1}^{n} a_{ij}(x) \frac{\partial v}{\partial x_i} \frac{\partial u}{\partial x_j} + c(x)u(x)v(x) \right) dx.
\]
Then $B$ is bounded by estimate (1) of Theorem 4.4.6, and it is symmetric since $A$ is symmetric. Since $L$ is uniformly elliptic and $c \ge 0$, by Theorem 4.5.4, $B$ is coercive and defines a complete inner product $\langle \cdot, \cdot \rangle_B$ on $H_0^1(\Omega)$ such that $(H_0^1(\Omega), \langle \cdot, \cdot \rangle_B)$ is a Hilbert space. Moreover, $\langle f, v \rangle = f(v)$ is a bounded linear functional on $L^2(\Omega)$, and thus on $H_0^1(\Omega)$. The existence and uniqueness of the weak solution of problem (4.5.7) now follow from the Riesz Representation Theorem.

Remark We end the section with the following remarks.
(1) The symmetry condition for $A$ was required to prove that $B$ defines an inner product, but it wasn't required to prove coercivity of $B$.
(2) The above results are still valid if $c(x)$ admits negative values (see Problems 4.11.20 and 4.11.21).
4.6 General Elliptic Operators

4.6.1 Lax–Milgram Theorem

The elliptic bilinear map $B$ in Theorem 4.5.4 for a symmetric elliptic operator defines the most general inner product on $H_0^1(\Omega)$ in the sense that if $A$ is the identity matrix and $c = 1$ then
\[
\langle u, u \rangle_B = \langle u, u \rangle_{H_0^1(\Omega)},
\]
and if $A$ is the identity matrix and $c = 0$ then
\[
\langle u, u \rangle_B = \langle u, u \rangle_\partial.
\]
If $b_i \ne 0$ for at least one $i$, then $L$ is not symmetric, which implies that $B$ is not symmetric. In this case, $B$ cannot define an inner product and consequently we cannot apply the Riesz representation theorem. Therefore, we need a more general version of the Riesz representation theorem that allows us to deal with general elliptic operators that are not symmetric. The following theorem is fundamental and serves our needs.

Theorem 4.6.1 (Lax–Milgram) Let $B : H \times H \to \mathbb{R}$ be a bilinear mapping for the Hilbert space $H$. If $B$ is bounded and coercive, then for every $f \in H^*$ there exists a unique $u \in H$ such that
\[
B[u, v] = \langle f, v \rangle
\]
for all $v \in H$.

Proof Let $u \in H$ be a fixed element. Then the mapping $v \mapsto B[u, v]$ defines a bounded linear functional on $H$, and so by the Riesz representation theorem, there exists a unique $w = w_u \in H$ such that
\[
B[u, v] = \langle w_u, v \rangle
\]
for all $v \in H$. Our claim is that the mapping $u \mapsto w_u$ is onto and one-to-one, which implies the existence of $u \in H$ such that $w_u = f$ (where $f$ is identified with its Riesz representative). So consider the mapping $T : H \to H$, $T(u) = w$. Then for all $v \in H$, we have
\[
B[u, v] = \langle Tu, v \rangle.
\]
Clearly $T$ is linear due to the linearity of $B$. Moreover, by boundedness of $B$,
\[
\|Tu\|^2 = \langle Tu, Tu \rangle = B[u, Tu] \le C \|u\| \|Tu\|.
\]
We divide by $\|Tu\|$ and conclude that $T$ is bounded, i.e., continuous. Moreover, by coercivity of $B$ there exists $\eta > 0$ such that
\[
\eta \|u\|_H^2 \le B[u, u] = \langle Tu, u \rangle \le \|Tu\|_H \|u\|_H.
\]
Again, dividing by $\|u\|_H$ implies that $T$ is bounded below:
\[
\eta \|u\|_H \le \|Tu\|_H.
\tag{4.6.1}
\]
The next step is to show that $R(T)$ is closed, and then to show that $R(T)^\perp = \{0\}$. This implies that $R(T) = H$, and so $T$ is onto. To show that $R(T)$ is closed, let $(w_n)$ be a Cauchy sequence in $R(T)$. Then for each $n$, there exists $u_n$ such that $T(u_n) = w_n$. Then by (4.6.1), we have
\[
\|u_n - u_m\| \le \frac{1}{\eta} \|T(u_n - u_m)\| = \frac{1}{\eta} \|w_n - w_m\| \to 0,
\]
and so $(u_n)$ is Cauchy. By completeness of $H$, $u_n \to u$ for some $u \in H$ and $w_n \to w$ for some $w \in H$. Hence
\[
w = \lim w_n = \lim T(u_n) = T(\lim u_n) = T(u),
\]
therefore $w \in R(T)$, and consequently $R(T)$ is closed in $H$. It follows by the orthogonal decomposition of Hilbert spaces that
\[
H = R(T) \oplus R(T)^\perp.
\]
Now, if $R(T) \subsetneq H$, then there exists $0 \ne y \in R(T)^\perp$, so $\langle y, R(T) \rangle = 0$. By coercivity of $B$,
\[
\eta \|y\|_H^2 \le B[y, y] = \langle Ty, y \rangle = 0.
\]
Therefore $y = 0$, a contradiction, and hence
\[
R(T) = H,
\]
i.e., $T$ is onto. This means that there exists $u \in H$ such that $w_u = f$. To show uniqueness of $u$, let $Tu_1 = Tu_2$; then substituting $u = u_1 - u_2$ in (4.6.1) gives
\[
u_1 = u_2,
\]
which implies that $T$ is one-to-one, and hence $u$ is unique.

Note that in the proof of the theorem, we didn't assume that $B$ is symmetric.
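A finite-dimensional toy case shows what the theorem adds over the Riesz representation theorem. The sketch below is only an illustration: it assumes $H = \mathbb{R}^2$ with the Euclidean inner product and a hypothetical non-symmetric matrix `A`, and defines $B[u, v] = v^T A u$, so that the operator $T$ of the proof is multiplication by `A`. The names `B`, `solve_2x2`, and the data are not from the text.

```python
A = [[2.0, 1.0], [-1.0, 2.0]]  # hypothetical non-symmetric matrix; B[u, v] = v^T A u
f = [3.0, 1.0]                 # hypothetical functional f(v) = f . v

def B(u, v):
    """Bilinear form B[u, v] = v^T A u (not symmetric, since A != A^T)."""
    return sum(v[i] * A[i][j] * u[j] for i in range(2) for j in range(2))

def solve_2x2(mat, rhs):
    """Solve mat * x = rhs by Cramer's rule."""
    det = mat[0][0] * mat[1][1] - mat[0][1] * mat[1][0]
    return [(rhs[0] * mat[1][1] - mat[0][1] * rhs[1]) / det,
            (mat[0][0] * rhs[1] - mat[1][0] * rhs[0]) / det]

# Coercivity: B[u, u] = 2 * (u0^2 + u1^2), so eta = 2 works for this A.
u = solve_2x2(A, f)  # T u = f, i.e. B[u, v] = f . v for all v

e1, e2 = [1.0, 0.0], [0.0, 1.0]
print(B(e1, e2), B(e2, e1))   # different values: B is not symmetric
print(B(u, e1), B(u, e2), f)  # B[u, e_i] reproduces the components of f
```

Because `B` is not symmetric it is not an inner product, so the Riesz theorem cannot be applied to it directly; coercivity is what guarantees that $Tu = f$ is still uniquely solvable.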
4.6.2 Dirichlet Problems

The next theorem is basically the same as Theorem 4.5.5 except that the symmetry condition for $A$ is relaxed.

Theorem 4.6.2 Consider the Dirichlet elliptic problem
\[
\begin{cases}
Lu = f & \text{in } \Omega, \\
u = 0 & \text{on } \partial\Omega,
\end{cases}
\tag{4.6.2}
\]
where $L$ is a uniformly elliptic operator of the form (4.5.3), such that $a_{ij}, c \in L^\infty(\Omega)$, $f \in L^2(\Omega)$, and $c(x) \ge 0$, for some open $\Omega$ in $\mathbb{R}^n$ that is bounded in at least one direction. Then there exists a unique weak solution in $H_0^1(\Omega)$ for problem (4.6.2).

Proof Define
\[
B[u, v] = \int_\Omega \left( \sum_{i,j=1}^{n} a_{ij}(x) \frac{\partial v}{\partial x_i} \frac{\partial u}{\partial x_j} + c(x)u(x)v(x) \right) dx.
\tag{4.6.3}
\]
Then $B$ is an elliptic bilinear map with $a_{ij}, c \in L^\infty(\Omega)$, and so it is bounded by estimate (1) of Theorem 4.4.6. Moreover, since $L$ is uniformly elliptic, $B[u, v]$ is coercive by Theorem 4.5.4(1). So by the Lax–Milgram theorem, for every $f \in L^2(\Omega) \subset H^{-1}(\Omega)$, there exists a unique $u \in H_0^1(\Omega)$ such that
\[
f(v) = B[u, v]
\]
for all $v \in H_0^1(\Omega)$, which is the desired weak solution.
Recall that in an elliptic operator $L$, the coefficient matrix $A$ is positive definite, so its eigenvalues are all positive. In the previous theorem, it was assumed that $c \ge 0$, so it cannot be an eigenvalue of $A$. But for arbitrary $c$, we need to avoid the values that would make $c$ an eigenvalue of $-L$, since this would give zero in the LHS of (4.5.7) and hence $f$ cannot be obtained. Now we will study the solution of the equation
\[
Lu + \mu u = f \quad \text{in } \Omega,
\]
where $L$ is the operator
\[
Lu = -\sum_{i,j=1}^{n} \frac{\partial}{\partial x_i}\left( a_{ij}(x) \frac{\partial u}{\partial x_j} \right) + c(x)u(x),
\]
so the zeroth-order term in the equation is $(c(x) + \mu)u$. When we relax the condition $c \ge 0$, we will make use of $\gamma = \|c\|_\infty$ in Garding's elliptic estimate. If we assume $\mu \ge \gamma$, then the coefficient of the zeroth-order term satisfies
\[
c(x) + \mu \ge c(x) + \|c\|_\infty \ge 0
\]
for all choices of $c$, so by Theorem 4.5.4 the elliptic bilinear map $B[u, v]$ is coercive. Thus we have the following.

Theorem 4.6.3 Consider the Dirichlet elliptic problem
\[
\begin{cases}
Lu + \mu u = f & \text{in } \Omega, \\
u = 0 & \text{on } \partial\Omega,
\end{cases}
\tag{4.6.4}
\]
for some uniformly elliptic operator $L$ of the form (4.5.3), such that $a_{ij}, c \in L^\infty(\Omega)$ and $f \in L^2(\Omega)$, for some open $\Omega$ in $\mathbb{R}^n$ that is bounded in at least one direction. If $\mu \ge \gamma$, where $\gamma$ is the constant obtained in Garding's inequality, then there exists a unique weak solution for problem (4.6.4).

Proof The result follows immediately from Theorem 4.6.2 since $c(x) + \mu \ge c(x) + \|c\|_\infty \ge 0$.

In other words, for $\mu \ge \gamma$, the operator
\[
L_\mu = (L + \mu I) : H_0^1 \to H^{-1}
\tag{4.6.5}
\]
is onto (by existence), one-to-one (by uniqueness), and bounded.
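4.6.3 Neumann Problems

Now we investigate elliptic PDEs with Neumann conditions. Consider the problem
\[
\begin{cases}
Lu = f & \text{in } \Omega, \\
\nabla u \cdot n = 0 & \text{on } \partial\Omega,
\end{cases}
\tag{4.6.6}
\]
where $L$ is the elliptic operator in (4.5.3). Here, we will assume that $\Omega$ is a bounded open set in $\mathbb{R}^n$ for $n \ge 2$. The weak formulation of the problem takes the following form: Find $u \in H^1(\Omega)$ such that
\[
\begin{cases}
\displaystyle\int_\Omega \left( \sum_{i,j=1}^{n} a_{ij}(x) \frac{\partial v}{\partial x_i} \frac{\partial u}{\partial x_j} + c(x)u(x)v(x) \right) dx = \langle f, v \rangle & \text{in } \Omega, \\
\nabla u \cdot n = 0 & \text{on } \partial\Omega.
\end{cases}
\]
The argument for finding the weak formulation of the problem is almost the same as for the Dirichlet problem except for one thing: the solution doesn't vanish on the boundary, so our test functions will be in $C^\infty(\overline{\Omega})$, and consequently our solution space will be $H^1(\Omega)$ rather than $H_0^1(\Omega)$. This means that we won't be able to use the Poincare inequality, since it requires $H_0^1(\Omega)$. To solve the problem we assume that $a_{ij}, c \in L^\infty(\Omega)$, $f \in L^2(\Omega)$, and $c(x) \ge 0$ on $\Omega$. Here, the case $c = 0$ should be treated with extra care for reasons that will be discussed shortly. So we need to discuss the two cases separately: the first case is when $c(x)$ is bounded away from 0, and the second case is when $c = 0$.

Theorem 4.6.4 Let $a_{ij}, c \in L^\infty(\Omega)$ and $f \in L^2(\Omega)$ in problem (4.6.6) defined on a bounded open $\Omega$ in $\mathbb{R}^n$ for $n \ge 2$. If $c(x) \ge m > 0$, then there exists a unique weak solution $u \in H^1(\Omega)$ for problem (4.6.6), and for some $C > 0$, we have
\[
\|u\|_{H^1(\Omega)} \le C \|f\|_{L^2(\Omega)}.
\tag{4.6.7}
\]

Proof The elliptic bilinear map associated with $L$ is
\[
B[u, v] = \int_\Omega \left( \sum_{i,j=1}^{n} a_{ij}(x) \frac{\partial v}{\partial x_i} \frac{\partial u}{\partial x_j} + c(x)u(x)v(x) \right) dx.
\]
Then,
\[
\begin{aligned}
|B[u, v]| &\le \sum_{i,j=1}^{n} \|a_{ij}\|_{L^\infty(\Omega)} \int_\Omega |Du| |Dv| \, dx + \|c\|_{L^\infty(\Omega)} \int_\Omega |u| |v| \, dx \\
&\le \alpha \left( \int_\Omega |Du| |Dv| \, dx + \int_\Omega |u| |v| \, dx \right) \\
&\le 2\alpha \|u\|_{H^1(\Omega)} \|v\|_{H^1(\Omega)},
\end{aligned}
\]
where
\[
\alpha = \max\left\{ \sum_{i,j=1}^{n} \|a_{ij}\|_{L^\infty(\Omega)}, \; \|c\|_{L^\infty(\Omega)} \right\}.
\]
Hence $B$ is bounded on $H^1(\Omega) \times H^1(\Omega)$. Moreover, by letting $\xi = Du$ in the uniform ellipticity condition,
\[
B[u, u] \ge \lambda_0 \int_\Omega |Du|^2 \, dx + m \int_\Omega u^2 \, dx \ge \beta \|u\|_{H^1(\Omega)}^2,
\]
where $\beta = \min\{\lambda_0, m\}$. Hence $B$ is coercive. By applying the Lax–Milgram theorem, we obtain a unique $u \in H^1(\Omega)$ such that
\[
\beta \|u\|_{H^1(\Omega)}^2 \le B[u, u] = \langle f, u \rangle \le \|f\|_{L^2(\Omega)} \|u\|_{H^1(\Omega)}.
\]
Dividing by $\beta \|u\|_{H^1(\Omega)}$, we arrive at estimate (4.6.7) with $C = 1/\beta$.

Now, we discuss the second case when $c = 0$. The problem reduces to
\[
\begin{cases}
Lu = f & \text{in } \Omega, \\
\nabla u \cdot n = 0 & \text{on } \partial\Omega.
\end{cases}
\tag{4.6.8}
\]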
The weak formulation of the problem takes the following form: Find $u \in H^1(\Omega)$ such that
\[
\begin{cases}
\displaystyle\int_\Omega \sum_{i,j=1}^{n} a_{ij}(x) \frac{\partial u}{\partial x_i} \frac{\partial v}{\partial x_j} \, dx = \langle f, v \rangle & \text{in } \Omega, \\
\nabla u \cdot n = 0 & \text{on } \partial\Omega.
\end{cases}
\]
The difficulty of this problem stems from the fact that uniqueness is not guaranteed: if $u$ is a weak solution to the problem, then $u + k$ is also a solution for any constant $k$. Moreover, taking the test function to be a constant $k$ in the weak formulation gives $\langle f, k \rangle = 0$, which is equivalent to having
\[
\bar{f} = 0.
\]
In this case, two conditions will be added: the first is the normalization $\bar{u} = 0$, and the second is the compatibility condition $\langle f, 1 \rangle = 0$. So the Poincare–Wirtinger inequality and the quotient Sobolev space will be invoked here.

Theorem 4.6.5 Let $a_{ij} \in L^\infty(\Omega)$ and $f \in L^2(\Omega)$ in problem (4.6.8) defined on a bounded connected open $\Omega$ in $\mathbb{R}^n$ for $n \ge 2$. If
\[
\bar{u} = 0 \quad \text{and} \quad \langle f, 1 \rangle_{L^2(\Omega)} = 0,
\]
then there exists a unique weak solution $u \in H^1(\Omega)$ for problem (4.6.8). Moreover, for some $C > 0$ we have
\[
\|u\|_{\tilde{H}^1(\Omega)} \le C \|f\|_{L^2(\Omega)}.
\tag{4.6.9}
\]
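Proof Consider the quotient Sobolev space $\tilde{H}^1(\Omega)$ with the Poincare norm
\[
\|u\|_{\tilde{H}^1(\Omega)} = \|u\|_\partial = \|Du\|_{L^2},
\]
which is a Hilbert space by Proposition 4.3.5. The associated elliptic bilinear map satisfies
\[
|B[\tilde{u}, \tilde{v}]| \le \sum_{i,j=1}^{n} \|a_{ij}\|_{L^\infty(\Omega)} \int_\Omega |Du| |Dv| \, dx \le \alpha \|\tilde{u}\|_{\tilde{H}^1(\Omega)} \|\tilde{v}\|_{\tilde{H}^1(\Omega)},
\]
where
\[
\alpha = \sum_{i,j=1}^{n} \|a_{ij}\|_{L^\infty(\Omega)},
\]
and so $B$ is bounded on $\tilde{H}^1(\Omega) \times \tilde{H}^1(\Omega)$. Moreover, letting $\xi = Du$ in the uniform ellipticity condition,
\[
B[\tilde{u}, \tilde{u}] \ge \lambda_0 \int_\Omega |Du|^2 \, dx = \lambda_0 \|\tilde{u}\|_{\tilde{H}^1(\Omega)}^2,
\tag{4.6.10}
\]
so $B$ is coercive. Lastly, we show that $f \in (\tilde{H}^1(\Omega))^*$. Since $\langle f, 1 \rangle = 0$, we have
\[
|f(\tilde{v})| = |\langle f, v \rangle| = |\langle f, v - \bar{v} \rangle| \le \|f\|_{L^2} \|v - \bar{v}\|_{L^2}.
\]
Using the Poincare–Wirtinger inequality, this gives
\[
|f(\tilde{v})| \le C \|f\|_{L^2} \|Dv\|_{L^2} = C \|f\|_{L^2(\Omega)} \|\tilde{v}\|_{\tilde{H}^1(\Omega)},
\]
thus $f$ is a bounded linear functional on $\tilde{H}^1(\Omega)$. Therefore, applying the Lax–Milgram theorem, we see that there exists a unique $\tilde{u} \in \tilde{H}^1(\Omega)$ such that
\[
B[\tilde{u}, \tilde{v}] = \langle f, v \rangle
\]
for every $\tilde{v} \in \tilde{H}^1(\Omega)$, where $f \in L^2(\Omega)$. From (4.6.10), this gives
\[
\lambda_0 \|\tilde{u}\|_{\tilde{H}^1(\Omega)}^2 \le B[\tilde{u}, \tilde{u}] = \langle f, u \rangle \le C \|f\|_{L^2(\Omega)} \|\tilde{u}\|_{\tilde{H}^1(\Omega)},
\]
and dividing by $\lambda_0 \|\tilde{u}\|_{\tilde{H}^1(\Omega)}$ we arrive at estimate (4.6.9).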
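The two side conditions can be seen at work in a simple explicit case. The example below is a one-dimensional analogue chosen for illustration (the theorem itself is stated for $n \ge 2$); it assumes $\Omega = (0, 1)$, $L = -d^2/dx^2$, and the right-hand side $f(x) = \cos(\pi x)$.

```latex
-u'' = \cos(\pi x) \ \text{in } (0,1), \qquad u'(0) = u'(1) = 0,
\qquad \langle f, 1 \rangle = \int_0^1 \cos(\pi x)\, dx = 0 .

% All solutions, and the one selected by the normalization \bar{u} = 0:
u(x) = \frac{\cos(\pi x)}{\pi^{2}} + k, \qquad
\bar{u} = k = 0 \ \Longrightarrow\ u(x) = \frac{\cos(\pi x)}{\pi^{2}} .

% Estimate (4.6.9) in this case:
\|Du\|_{L^2} = \frac{1}{\pi}\,\|\sin(\pi x)\|_{L^2} = \frac{1}{\pi\sqrt{2}}, \qquad
\|f\|_{L^2} = \frac{1}{\sqrt{2}}, \qquad
\|u\|_{\tilde{H}^1} = \frac{1}{\pi}\,\|f\|_{L^2} .

% If instead f = 1, integrating -u'' = 1 over (0,1) gives
u'(0) - u'(1) = 1 \neq 0 ,
```

so for $f = 1$ the compatibility condition fails and no solution with $u'(0) = u'(1) = 0$ exists.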
4.7 Spectral Properties of Elliptic Operators

4.7.1 Resolvent of Elliptic Operators

We return to problem (4.6.4), which states that the equation $Lu + \mu u = g$ with $u = 0$ on $\partial\Omega$ has a unique weak solution in $H_0^1(\Omega)$ for all $\mu \ge \gamma$. We also concluded that the operator $L_\mu$ in (4.6.5) is invertible for $\mu \ge \gamma$, and
\[
L_\mu^{-1} : H^{-1}(\Omega) \to H_0^1(\Omega).
\]
The objective of this section is to investigate the spectral properties of elliptic operators. Our ultimate goal is to show that $L_\mu$ is a Fredholm operator. Consider the problem
\[
\begin{cases}
Lu = f & \text{in } \Omega, \\
u = 0 & \text{on } \partial\Omega,
\end{cases}
\tag{4.7.1}
\]
for some uniformly elliptic operator
\[
L = -\sum_{i,j}^{n} \partial_i a_{ij} \partial_j + c(x) + \mu
\]
defined on an open bounded set $\Omega \subset \mathbb{R}^n$. It was proved that for every $f \in L^2(\Omega)$, there exists a unique weak solution $u \in H_0^1(\Omega)$ such that
\[
B[u, v] = \langle f, v \rangle
\]
for every $v \in H_0^1(\Omega)$. Adding the term $\mu u$ to both sides of the equation gives
\[
Lu + \mu u = f + \mu u.
\]
Writing $g = f + \mu u$ gives the same equation as in problem (4.6.4). Hence, we denote
\[
L_\mu = L + \mu I : H_0^1(\Omega) \to H^{-1}(\Omega),
\]
and the associated bilinear map is
\[
B_\mu[u, v] = B[u, v] + \mu (u, v) = \langle g, v \rangle.
\tag{4.7.2}
\]
Then
\[
u = L_\mu^{-1}(g) = L_\mu^{-1}(f) + \mu L_\mu^{-1}(u).
\tag{4.7.3}
\]
Let us denote
\[
\mu L_\mu^{-1} = K,
\]
which is the resolvent of $L$, and $L_\mu^{-1}(f) = h$, provided that we have $\mu \ge \gamma$. Then (4.7.3) can be written as
\[
(I - K)u = h,
\tag{4.7.4}
\]
with
\[
K = \mu L_\mu^{-1} : H^{-1}(\Omega) \to H_0^1(\Omega).
\tag{4.7.5}
\]
The following theorem gives the first result of this section, which implies that a uniformly elliptic operator has a compact resolvent.

Theorem 4.7.1 The operator
\[ K = \mu L_\mu^{-1} : L^2(\Omega) \longrightarrow L^2(\Omega) \]
defined above is compact; consequently, $I - \lambda K$ is a Fredholm operator for any $\lambda \neq 0$.

Proof If we restrict $K$ in (4.7.4) to $L^2(\Omega)$, still calling it $K$, then
\[ K|_{L^2} = K : L^2(\Omega) \longrightarrow H_0^1(\Omega). \]
We prove that $K$ is bounded. Indeed, $B_\mu$ is clearly an elliptic bilinear map, so it is bounded by elliptic estimate (1), and using Gårding's inequality we have
\[ B_\mu[u, u] \ge \beta \|u\|^2_{H_0^1(\Omega)} + (\mu - \gamma)\|u\|^2_{L^2(\Omega)} \ge \beta \|u\|^2_{H_0^1(\Omega)}. \tag{4.7.6} \]
On the other hand, from (4.7.2) we have
\[ \beta \|u\|^2_{H_0^1(\Omega)} \le B_\mu[u, u] = \langle g, u\rangle \le \|g\|_{L^2(\Omega)} \|u\|_{L^2(\Omega)} \le \|g\|_{L^2(\Omega)} \|u\|_{H_0^1(\Omega)}. \tag{4.7.7} \]
Then (4.7.6) and (4.7.7) give
\[ \|u\|_{H_0^1(\Omega)} \le \frac{1}{\beta} \|g\|_{L^2(\Omega)}. \]
From (4.7.3) and the definition of $K$, this implies
\[ \|K(g)\|_{H_0^1(\Omega)} = \mu\,\|L_\mu^{-1}(g)\|_{H_0^1(\Omega)} \le \frac{\mu}{\beta} \|g\|_{L^2(\Omega)}, \]
and hence $K$ is a bounded linear operator which maps bounded sequences in $L^2(\Omega)$ to bounded sequences in $H_0^1(\Omega)$. The space $H_0^1(\Omega)$, in turn, is compactly embedded in $L^2(\Omega)$ (by the Rellich–Kondrachov theorem for $n > 2$ and Theorem 3.10.7 for $n = 1, 2$), and therefore $K = \iota \circ K$ is compact, where $\iota$ denotes this embedding.
4.7.2 Fredholm Alternative for Elliptic Operators

Since we concluded that $K$ is compact, we can obtain a Fredholm alternative theorem for elliptic operators.

Theorem 4.7.2 (Fredholm Alternative for Elliptic Operators) Let $L$ be a uniformly elliptic operator defined on an open bounded set $\Omega$ of $\mathbb{R}^n$. Then either
1. for every $f \in L^2(\Omega)$ there exists a unique weak solution $u \in H_0^1(\Omega)$ of the problem
\[ \begin{cases} Lu = f & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega, \end{cases} \]
or
2. there exists a nonzero weak solution $u \in H_0^1(\Omega)$ of the equation $Lu = 0$.

Proof We start from the fact that the operator $I - K$ is Fredholm by Theorem 4.7.1. So either (i) for every $h \in L^2(\Omega)$ the equation
\[ (I - K)u = h \tag{4.7.8} \]
has a unique weak solution $u \in H_0^1(\Omega)$, or (ii) the equation $(I - K)u = 0$ has a nontrivial weak solution $u \in H_0^1(\Omega)$.

Suppose statement (i) holds. Substituting $\mu L_\mu^{-1} = K$ and $L_\mu^{-1}(f) = h$ in (4.7.8) gives
\[ (I - \mu L_\mu^{-1})u = L_\mu^{-1}(f) = h. \tag{4.7.9} \]
Apply $L_\mu$ to both sides of (4.7.9) and rearrange terms to finally obtain statement (1) of the theorem.

Suppose statement (ii) above holds. Again, letting $K = \mu L_\mu^{-1}$,
\[ (I - \mu L_\mu^{-1})u = 0, \]
which, after applying $L_\mu$ to both sides, implies
\[ \frac{1}{\mu} L_\mu(u) = u. \]
Multiplying both sides by $\mu$ gives $Lu + \mu u = \mu u$, that is, $Lu = 0$, which is statement (2) of the theorem. This completes the proof.
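The two cases of the alternative can be seen in a small numerical sketch. The model below is an assumption made purely for illustration and is not taken from the text: it replaces $L = -d^2/dx^2 - \lambda$ on $(0, 1)$ with zero boundary values by its standard finite-difference matrix, and the helper names `apply_operator` and `solve` are hypothetical. For $\lambda = 5$ (not an eigenvalue) the discrete equation $Lu = f$ is solved uniquely; for $\lambda$ equal to the smallest discrete eigenvalue the homogeneous equation has a nonzero solution.

```python
import math

def apply_operator(v, h, lam):
    """Apply the finite-difference model of L = -d^2/dx^2 - lam (zero boundary values)."""
    n = len(v)
    out = []
    for j in range(n):
        left = v[j - 1] if j > 0 else 0.0
        right = v[j + 1] if j < n - 1 else 0.0
        out.append((2.0 * v[j] - left - right) / h**2 - lam * v[j])
    return out

def solve(h, lam, f):
    """Thomas algorithm for the same tridiagonal system; assumes no zero pivot occurs."""
    n = len(f)
    diag = 2.0 / h**2 - lam
    off = -1.0 / h**2
    c = [0.0] * n  # modified upper diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = off / diag
    d[0] = f[0] / diag
    for j in range(1, n):
        denom = diag - off * c[j - 1]
        c[j] = off / denom
        d[j] = (f[j] - off * d[j - 1]) / denom
    u = [0.0] * n
    u[-1] = d[-1]
    for j in range(n - 2, -1, -1):
        u[j] = d[j] - c[j] * u[j + 1]
    return u

def max_abs(values):
    return max(abs(x) for x in values)

n = 99                       # number of interior grid points (illustrative choice)
h = 1.0 / (n + 1)
lam1 = 4.0 / h**2 * math.sin(math.pi * h / 2.0) ** 2  # smallest discrete eigenvalue

# Case (1): lam = 5 is not an eigenvalue, so L u = f has exactly one solution.
f = [1.0] * n
u = solve(h, 5.0, f)
residual_case1 = max_abs([a - b for a, b in zip(apply_operator(u, h, 5.0), f)])

# Case (2): lam = lam1, and the sampled sine is a nonzero solution of L u = 0.
kernel = [math.sin(math.pi * (j + 1) * h) for j in range(n)]
residual_case2 = max_abs(apply_operator(kernel, h, lam1))

print(residual_case1, residual_case2)
```

Both residuals are at rounding level, so the first run exhibits unique solvability and the second a nontrivial kernel, mirroring statements (1) and (2) in this finite-dimensional stand-in.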
4.7.3 Spectral Theorem for Elliptic Operators

An immediate conclusion is the following.

Corollary 4.7.3 Let $K$ be the resolvent of a uniformly elliptic operator of the form
\[ L = -\partial_{x_j}\big(a_{ij}\,\partial_{x_i}\big) + b_i\,\partial_{x_i} + c \]
for some $a_{ij} \in C^1(\Omega)$ and constants $b_i \in \mathbb{R}$, where $\Omega$ is open and bounded in $\mathbb{R}^n$. Then the eigenfunctions of $K$ form a countable orthonormal basis for the space $L^2(\Omega)$, and their corresponding eigenvalues behave as $\lambda_1 > \lambda_2 > \lambda_3 > \dots$, with $\lambda_n \longrightarrow 0$.

Proof This is a consequence of the Hilbert–Schmidt theorem and the Spectral theorem for self-adjoint compact operators.

The above corollary provides us with a justification of the Fredholm alternative for elliptic operators. Indeed, if $Lu = 0$ has only the trivial solution, then $L$ doesn't have 0 as an eigenvalue, so by Proposition 1.6.11 this implies that the orthonormal basis is finite, but such a set cannot span all of the Hilbert space $L^2$, which is infinite-dimensional, i.e., the operator cannot be surjective, so we cannot find a solution of the equation for every $f$.
4.8 Self-adjoint Elliptic Operators

4.8.1 The Adjoint of Elliptic Bilinear

The adjoint form of $B$ is denoted by $B^*$ and is defined as
\[ B^*[u, v] = B[v, u], \]
and the adjoint problem is defined as finding the weak solution $v \in H_0^1(\Omega)$ of the adjoint equation
\[ \begin{cases} L^* v = f & \text{in } \Omega, \\ v = 0 & \text{on } \partial\Omega, \end{cases} \]
such that
\[ B^*[v, u] = \langle f, u\rangle \]
for all $u \in H_0^1(\Omega)$. Moreover,
\[ B_\mu^*[v, u] = B^*[v, u] + \mu \langle v, u\rangle = \langle g, u\rangle. \]
We will investigate the eigenvalue problem of $K = \mu L_\mu^{-1}$ rather than $L$ in order to make use of the spectral properties of compact operators studied in Chap. 1. Consider the elliptic operator
\[ L = -\partial_{x_j}\big(a_{ij}\,\partial_{x_i}\big) + b_i\,\partial_{x_i} + c. \]
To make $L$ self-adjoint, it is required that $B^*[u, v] = B[u, v]$, so that $L = L^*$ and $\langle Lu, v\rangle = \langle u, Lv\rangle$. To achieve this equality, we integrate $(Lu)v$ by parts (ignoring the summations for simplicity). This gives
\[
\begin{aligned}
\int_\Omega (Lu)v\,dx &= \int_\Omega \big[-(a_{ij}u_{x_i})_{x_j} + b_i u_{x_i} + cu\big]\,v\,dx \\
&= \int_\Omega a_{ij}u_{x_i}v_{x_j} + \big[(b_i u)_{x_i} - (b_i)_{x_i}u\big]v + cuv\,dx \\
&= \int_\Omega \big[-(a_{ij}v_{x_i})_{x_j} + b_i v_{x_i} + (c - (b_i)_{x_i})v\big]\,u\,dx \\
&= \int_\Omega (L^* v)u\,dx.
\end{aligned}
\]
Letting $b_i$ be constant gives $(b_i)_{x_i} = 0$, and so
\[ \int_\Omega (Lu)v\,dx = \int_\Omega (L^* v)u\,dx = \int_\Omega (Lv)u\,dx. \]
Thus:

Theorem 4.8.1 Let $a_{ij} \in C^1(\Omega)$, and let $b_i$ be a constant number. Then the elliptic operator
\[ L = -\partial_{x_j}\big(a_{ij}\,\partial_{x_i}\big) + b_i\,\partial_{x_i} + c \]
is self-adjoint; consequently its resolvent $K = \mu L_\mu^{-1}$ is self-adjoint.

Proof The argument above implies $L = L^*$, and since $I$ is also self-adjoint, so is $L + \mu I$. Hence
\[ K^* = \mu\big((L + \mu I)^{-1}\big)^* = \mu (L^* + \mu I)^{-1} = \mu (L + \mu I)^{-1} = K. \]
4.8.2 Eigenvalue Problem of Elliptic Operators

Recall that Theorem 4.6.3 asserts that the problem
\[ \begin{cases} Lu + \mu u(x) = f & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega, \end{cases} \tag{4.8.1} \]
has a unique weak solution whenever $\mu \ge \gamma$ (where $\gamma$ is the constant in Gårding's inequality), where
\[ L = -\sum_{i,j=1}^{n} \partial_{x_i}\big(a_{ij}(x)\,\partial_{x_j}\big) + c(x) \tag{4.8.2} \]
is a uniformly elliptic and self-adjoint operator, $a_{ij}, c \in L^\infty(\Omega)$ and $f \in L^2(\Omega)$, for some open $\Omega$ that is bounded in at least some direction in $\mathbb{R}^n$. If $\mu = 0$ and $c \ge 0$, then Theorem 4.6.2 asserts that the problem has a unique weak solution. If we write $\mu = -\lambda$, then the equation $Lu + \mu u = 0$ is written as $Lu = \lambda u$ and we have an eigenvalue problem. We will discuss two cases. If $\lambda \le -\gamma$ then $\mu \ge \gamma$ and the solution of (4.8.1) exists and is unique. If $\lambda > -\gamma$ then we may not have a nontrivial solution of the equation $(L - \lambda)u = 0$, but rather have the following Fredholm alternative: either $Lu - \lambda u = f$ has a unique weak solution for every $f \in L^2(\Omega)$, or the equation $Lu - \lambda u = 0$ has a nontrivial solution, which implies that $\lambda$ is an eigenvalue of $L$, and problem (4.8.1) turns into the eigenvalue problem
\[ \begin{cases} Lu = \lambda u & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega. \end{cases} \]
4.8.3 Spectral Theorem of Elliptic Operator

The following theorem provides the main property of the spectrum of $L$.

Theorem 4.8.2 (Spectral Theorem of Elliptic Operators) Consider the uniformly elliptic operator $L$ in (4.8.2) defined on an open bounded set $\Omega$ in $\mathbb{R}^n$. Then the eigenfunctions of $L$ form a countable orthonormal basis for the space $L^2(\Omega)$, and their corresponding eigenvalues are increasing,
\[ 0 < \lambda_1 \le \lambda_2 \le \lambda_3 \le \dots, \qquad \lambda_n \longrightarrow \infty. \]

Proof Consider the case when $Lu - \lambda u = 0$ has a nontrivial solution, which identifies $\lambda$ as an eigenvalue of $L$, where $\lambda > -\gamma$. Add the term $\gamma u$ to both sides of the equation. This gives $Lu + \gamma u - \lambda u = \gamma u$, or
\[ L_\gamma u = (\gamma + \lambda)u. \tag{4.8.3} \]
Apply $L_\gamma^{-1}$ to both sides of (4.8.3):
\[ u = (\gamma + \lambda)L_\gamma^{-1} u. \]
Substituting $K = \gamma L_\gamma^{-1}$,
\[ Ku = \frac{\gamma}{\gamma + \lambda}\,u. \]
From Corollary 4.7.3, the eigenvalues of $K$ are countable and decreasing, so let them be
\[ \nu_n = \frac{\gamma}{\gamma + \lambda_n}. \]
Then the $\lambda_n$ are the eigenvalues of $L$, and they increase, and since $\nu_n \longrightarrow 0$ we must have $\lambda_n \longrightarrow \infty$.
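The relation $\nu_n = \gamma/(\gamma + \lambda_n)$ can be tabulated for a concrete model. The sketch below is only an illustration under stated assumptions: it takes the finite-difference matrix of $-d^2/dx^2$ on $(0, 1)$ with zero boundary values, whose eigenvalues are known in closed form, as a stand-in for $L$; the value $\gamma = 1$ and the function names are hypothetical choices, not from the text.

```python
import math

def discrete_dirichlet_eigenvalues(n, count):
    """First `count` eigenvalues of the n-by-n finite-difference matrix of -d^2/dx^2 on (0, 1)."""
    h = 1.0 / (n + 1)
    return [4.0 / h**2 * math.sin(k * math.pi * h / 2.0) ** 2 for k in range(1, count + 1)]

def resolvent_eigenvalues(lams, gamma):
    """Map eigenvalues lam of L to eigenvalues gamma / (gamma + lam) of K = gamma * (L + gamma I)^{-1}."""
    return [gamma / (gamma + lam) for lam in lams]

lams = discrete_dirichlet_eigenvalues(n=199, count=6)
nus = resolvent_eigenvalues(lams, gamma=1.0)

for k, (lam, nu) in enumerate(zip(lams, nus), start=1):
    # lam is close to (k * pi)^2, the eigenvalue of the continuous problem
    print(k, round(lam, 3), round((k * math.pi) ** 2, 3), round(nu, 5))
```

The printed columns show the $\lambda_k$ increasing (close to $k^2\pi^2$) while the corresponding $\nu_k$ decrease toward zero, which is the mechanism used in the proof.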
4.9 Regularity for the Poisson Equation

4.9.1 Weyl's Lemma

Investigating the regularity of weak solutions of elliptic PDEs has been a major research direction since the 1940s. We begin with one of the earliest and most basic regularity results. The result, perhaps surprisingly, asserts that a weak solution of the Laplace equation is, in fact, a classical solution.

Theorem 4.9.1 (Weyl's Lemma) If $u \in H^1(\Omega)$ is such that
\[ \int_\Omega Du \cdot Dv\,dx = 0 \]
for every $v \in C_c^\infty(\Omega)$, then $u \in C^\infty(\Omega)$ and $\nabla^2 u = 0$ in $\Omega$.

Proof Consider the mollification $u_\varepsilon \in C^\infty(\Omega_\varepsilon)$. Performing integration by parts yields
\[ \nabla^2 u_\varepsilon(x) = \int_\Omega \nabla^2 \varphi_\varepsilon(x - y)\,u(y)\,dy = -\int_\Omega \nabla\varphi_\varepsilon(x - y) \cdot \nabla u(y)\,dy = 0. \]
Hence $u_\varepsilon$ is harmonic on $\Omega_\varepsilon$ and
\[ \nabla^2 u_\varepsilon = 0. \]
Letting $\Omega' \subset K \subset \Omega$ for some compact set $K$, it can be easily shown that the sequence $u_\varepsilon$ is uniformly bounded and equicontinuous on $K$, hence there exists $v \in C^\infty(K)$ such that $u_\varepsilon \longrightarrow v$ uniformly on $K$, so $\nabla^2 v = 0$ in $K$, and since $u_\varepsilon \longrightarrow u$ in $L^2(\Omega)$ we conclude that $u = v$.

Corollary 4.9.2 If $u \in H^1(\Omega)$ is a weak solution to the Laplace equation, then $u \in C^\infty(\Omega)$. In other words, weak solutions of the Laplace equation are classical solutions.

The significance of the result is that it shows that the weak solution is actually smooth and gives a classical solution. This demonstrates interior regularity.
4.9.2 Difference Quotients

Now we turn our discussion to the Poisson equation. The treatment of this equation is standard and can be used to establish regularity results for other general elliptic equations. The main tool of this topic is the difference quotient. In calculus, the difference quotient of a function $u \in L^p(\Omega)$ is given by the formula
\[ D_k^h u(x) = \frac{u(x + he_k) - u(x)}{h}, \]
and this ratio leads to the derivative of $u$ as $h \longrightarrow 0$. We always need to ensure that $x + he_k$ is inside $\Omega$, and this can be achieved by defining
\[ \Omega_h = \{x \in \Omega : d(x, \partial\Omega) > h > 0\}, \]
which clearly implies that $\Omega_h \to \Omega$ as $h \to 0$. This is similar to the setting adopted for the mollifiers.

Definition 4.9.3 (Difference Quotient) Let $\Omega \subseteq \mathbb{R}^n$ be an open set, and let $\{e_1, \dots, e_n\}$ be the standard basis of $\mathbb{R}^n$. Let $u$ be defined on $\Omega$. Then the difference quotient of $u$ in the direction of $e_k$, denoted by $D_k^h u$, is defined on $\Omega_h$ by the ratio
\[ D_k^h u(x) = \frac{u(x + he_k) - u(x)}{h}. \]
If we choose $\Omega' \subset\subset \Omega$ such that $\Omega' \subseteq \Omega_h$, we ensure that the difference quotient is well-defined, and this setting will be helpful in later results. Since the difference quotients are meant to be a pre-stage for derivatives, our guess is that they obey the same basic rules. The following proposition confirms that this guess is correct.

Proposition 4.9.4 Let $u, v \in W^{1,p}(\Omega)$, $1 \le p < \infty$. Let $\Omega' \subset\subset \Omega$ be such that $\Omega' \subseteq \Omega_h$, and suppose that $\mathrm{supp}(v) \subseteq \Omega'$. Then
(1) Higher Derivative: $D(D_k^h u) = D_k^h(Du)$ for all $x \in \Omega'$.
(2) Sum Rule: $D_k^h(u + v)(x) = D_k^h u(x) + D_k^h v(x)$ for all $x \in \Omega'$.
(3) Product Rule: $D_k^h(uv)(x) = u(x)D_k^h v(x) + v(x + he_k)D_k^h u(x)$ for all $x \in \Omega'$.
(4) Integration by Parts:
\[ \int_\Omega u(x)\,D_k^h v(x)\,dx = -\int_\Omega v(x)\,D_k^{-h} u(x)\,dx. \]

Proof The first three statements can be proved by arguments similar to those for classical derivatives in ordinary calculus, and they are thus left to the reader. For (4), note that by using the substitution $y = x + he_k$,
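\[ \int_\Omega \frac{u(x)v(x + he_k)}{h}\,dx = \int_\Omega \frac{u(y - he_k)v(y)}{h}\,dy = \int_\Omega \frac{u(x - he_k)v(x)}{h}\,dx. \]
Therefore we have
\[
\begin{aligned}
\int_\Omega u(x)\,D_k^h v(x)\,dx &= \int_\Omega \frac{u(x)v(x + he_k) - u(x)v(x)}{h}\,dx \\
&= \int_\Omega \frac{u(x - he_k)v(x) - u(x)v(x)}{h}\,dx \\
&= -\int_\Omega \frac{u(x - he_k)v(x) - u(x)v(x)}{-h}\,dx \\
&= -\int_\Omega v(x)\,\frac{u(x - he_k) - u(x)}{-h}\,dx \\
&= -\int_\Omega v(x)\,D_k^{-h} u(x)\,dx.
\end{aligned}
\]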
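The integration-by-parts identity for difference quotients can be checked numerically in one dimension. The following sketch rests on assumptions chosen only for illustration: $\Omega = (0, 1)$, a particular smooth $u$, a bump function $v$ supported in $(0.3, 0.7)$, and a uniform grid whose step divides $h$, so that the shift $x \mapsto x + h$ maps grid points to grid points, as in the substitution used in the proof.

```python
import math

def u(x):
    """An arbitrary smooth function (illustrative choice)."""
    return x * x + math.sin(3.0 * x)

def v(x):
    """Smooth bump supported in (0.3, 0.7), well inside (0, 1)."""
    s = (x - 0.5) / 0.2
    t = 1.0 - s * s
    return math.exp(-1.0 / t) if t > 0.0 else 0.0

def diff_quotient(func, x, h):
    """D^h func(x) = (func(x + h) - func(x)) / h; h may be negative."""
    return (func(x + h) - func(x)) / h

dx = 0.001                                   # quadrature step on (0, 1)
h = 0.05                                     # shift: an exact multiple of dx
grid = [j * dx for j in range(1001)]

# Riemann sums of  u * D^h v  and of  -v * D^{-h} u  over the same grid
lhs = sum(u(x) * diff_quotient(v, x, h) for x in grid) * dx
rhs = -sum(v(x) * diff_quotient(u, x, -h) for x in grid) * dx

print(lhs, rhs)
```

Because $v$ vanishes near the boundary, shifting the summation index reproduces the change of variables $y = x + he_k$ exactly, so the two sums agree up to rounding error.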
The next theorem investigates the relation between the difference quotients of a Sobolev function in $W^{1,p}(\Omega)$ and its weak derivatives. Of course, if $u \in W^{1,p}(\Omega)$ then $Du \in L^p(\Omega)$, so $\|D_k u\|_{L^p(\Omega)} < \infty$. How do the norms of $D_k u$ and $D_k^h u$ compare? In view of the next theorem, the difference quotients of a function in $W^{1,p}(\Omega)$ are bounded above by its partial derivatives. On the other hand, if $u \in L^p(\Omega)$ and its difference quotients $D_k^h u$ are bounded above uniformly in $h$, then its weak derivative exists and is bounded by the same bound. That is, $u \in W^{1,p}(\Omega)$.

Theorem 4.9.5 Let $\Omega \subseteq \mathbb{R}^n$ and suppose $\Omega' \subset\subset \Omega$ and $\Omega' \subseteq \Omega_h$.
(1) If $u \in W^{1,p}(\Omega)$, $1 \le p$, then $D_k^h u \in L^p(\Omega')$ and
\[ \|D_k^h u\|_{L^p(\Omega')} \le \|D_k u\|_{L^p(\Omega)}. \]
(2) If $u \in L^p(\Omega)$, $1 < p$, and there exists $M > 0$ such that for any $\Omega'$ and $h$ as above we have $D_k^h u \in L^p(\Omega')$ and
\[ \|D_k^h u\|_{L^p(\Omega')} \le M, \]
then $D_k u \in L^p(\Omega)$ and
\[ \|D_k u\|_{L^p(\Omega')} \le M. \]
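Proof (1): By a density argument, it suffices to prove the result for
\[ u \in W^{1,p}(\Omega) \cap C^1(\Omega). \]
This enables us to use the fundamental theorem of calculus. We write
\[ D_k^h u(x) = \frac{u(x + he_k) - u(x)}{h} = \frac{1}{h}\int_0^h D_k u(x + te_k)\,dt. \]
Now, using Hölder's inequality and integrating over $\Omega'$,
\[
\begin{aligned}
\int_{\Omega'} |D_k^h u(x)|^p\,dx &\le \frac{1}{h}\int_{\Omega'}\int_0^h |D_k u(x + te_k)|^p\,dt\,dx \\
&= \frac{1}{h}\int_0^h\int_{\Omega'} |D_k u(x + te_k)|^p\,dx\,dt \quad \text{(by Fubini's theorem)} \\
&\le \frac{1}{h}\int_0^h\int_{\Omega} |D_k u(x)|^p\,dx\,dt \\
&= \int_\Omega |D_k u(x)|^p\,dx,
\end{aligned}
\]
where the third line uses that $x + te_k \in \Omega$ for $x \in \Omega' \subseteq \Omega_h$ and $0 \le t \le h$.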
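(2): A well-known result in functional analysis states that every bounded sequence in a reflexive space has a weakly convergent subsequence (see Theorem 5.2.6 in the next chapter). So, letting $h = h_n$ such that $h_n \longrightarrow 0$ as $n \longrightarrow \infty$, there exists a weakly convergent subsequence, say $D_k^{h_n} u$, such that $D_k^{h_n} u \overset{w}{\longrightarrow} v \in L^p(\Omega')$, and so for every $\varphi \in C_0^\infty(\Omega')$ we have, by the definition of weak convergence,
\[ \lim \int_{\Omega'} \varphi\,D_k^{h_n} u\,dx = \int_{\Omega'} \varphi v\,dx. \tag{4.9.1} \]
On the other hand, note that $D_k^{-h_n}\varphi$ converges to $D_k\varphi$ uniformly on $\Omega'$, hence by Proposition 4.9.4(4)
\[ \lim \int_{\Omega'} \varphi\,D_k^{h_n} u\,dx = -\lim \int_{\Omega'} u\,D_k^{-h_n}\varphi\,dx = -\int_{\Omega'} u\,D_k\varphi\,dx. \]
Combining this result with (4.9.1) gives
\[ \int_{\Omega'} \varphi v\,dx = -\int_{\Omega'} u\,D_k\varphi\,dx, \]
which implies that $v = D_k u \in L^p(\Omega')$, and since $D_k^{h_n} u \overset{w}{\longrightarrow} v$, we obtain (see Proposition 5.3.3(3) in the next chapter)
\[ \|D_k u\|_{L^p(\Omega')} \le \liminf \|D_k^{h_n} u\|_{L^p(\Omega')} \le M. \]
Note that this holds for all $\Omega' \subset\subset \Omega$ with $\Omega' \subseteq \Omega_h$. Define
\[ \Omega_n = \Big\{x \in \Omega : d(x, \partial\Omega) > \frac{1}{n}\Big\}. \]
Then $u\chi_{\Omega_n} \longrightarrow u$, and using the Dominated Convergence Theorem we get
\[ \|D_k u\|_{L^p(\Omega)} \le M. \]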
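Part (1) of Theorem 4.9.5 can be illustrated with a quick numerical experiment. The sketch below is not part of the text and rests on assumed choices: $\Omega = (0, 1)$, $h = 0.1$, $\Omega' = (0.2, 0.8) \subseteq \Omega_h$, $p = 3$, a specific smooth $u$, and a midpoint-rule approximation of the $L^p$ norms through the hypothetical helper `lp_norm`.

```python
import math

def u(x):
    """A smooth test function on Omega = (0, 1) (illustrative choice)."""
    return math.sin(3.0 * x) + x * x

def du(x):
    """Its classical derivative, which is also its weak derivative."""
    return 3.0 * math.cos(3.0 * x) + 2.0 * x

def lp_norm(func, a, b, p, steps=20000):
    """Midpoint-rule approximation of the L^p norm of func on (a, b)."""
    width = (b - a) / steps
    total = sum(abs(func(a + (i + 0.5) * width)) ** p for i in range(steps))
    return (total * width) ** (1.0 / p)

h = 0.1
p = 3.0

# Difference quotient measured on Omega' = (0.2, 0.8), derivative on Omega = (0, 1)
quotient_norm = lp_norm(lambda x: (u(x + h) - u(x)) / h, 0.2, 0.8, p)
derivative_norm = lp_norm(du, 0.0, 1.0, p)

print(quotient_norm, derivative_norm)
```

The first printed value does not exceed the second, in line with $\|D_k^h u\|_{L^p(\Omega')} \le \|D_k u\|_{L^p(\Omega)}$; repeating the experiment with smaller $h$ or other exponents $p$ is a simple way to probe the estimate.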
The above results will help us establish regularity results. The regularity results contain tedious calculations, so we will start with the simplest settings of Poisson equation in Hilbert space, and the results for general equations can be proved by a similar argument.
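4.9.3 Caccioppoli's Inequality

The following estimate is helpful to establish our first regularity theorem.

Lemma 4.9.6 (Caccioppoli's Inequality) Let $u \in H^1(\Omega)$ be a weak solution to the Poisson equation $-\nabla^2 u = f$ for some $f \in L^2(\Omega)$. Then for any $\Omega' \subset\subset \Omega$ with $\Omega' \subseteq \Omega_h$, we have
\[ \|Du\|_{L^2(\Omega')} \le C\big[\|u\|_{L^2(\Omega)} + \|f\|_{L^2(\Omega)}\big]. \]

Proof Since $u$ is a weak solution to the Poisson equation, it satisfies the weak formulation
\[ \int_\Omega Du \cdot Dv\,dx = \int_\Omega f v\,dx \tag{4.9.2} \]
for every $v \in H_0^1(\Omega)$. Define $v = \xi^2 u$, where $\xi \in C_c^\infty$ is a cut-off function having the following properties:
1. On $\Omega'$, we have $\xi = 1$.
2. On $\Omega \setminus \Omega'$, we have $0 \le \xi \le 1$, and $|\nabla\xi| \le M$ for some $M > 0$.
3. $\mathrm{supp}(\xi) \subseteq \Omega$.
Substituting it in (4.9.2) gives
\[ \int_\Omega \xi^2 |Du|^2 + 2\xi\,\nabla\xi \cdot u\,Du\,dx = \int_\Omega \xi^2 f u\,dx. \]
It follows that
\[ \int_\Omega \xi^2 |Du|^2\,dx \le 2\int_\Omega |\xi\nabla u|\,|u\nabla\xi|\,dx + \int_\Omega |\xi f|\,|\xi u|\,dx. \]
Use Cauchy's inequality (Lemma 4.4.5) with $\varepsilon = \frac{1}{4}$ on the first integral on the right-hand side, and Young's inequality on the second integral, to get
\[ \int_\Omega \xi^2 |Du|^2 \le \frac{1}{2}\int_\Omega \xi^2 |Du|^2 + 2\int_\Omega u^2 |\nabla\xi|^2 + \frac{1}{2}\int_\Omega \xi^2 f^2 + \frac{1}{2}\int_\Omega \xi^2 u^2. \]
Using the properties of $\xi$ given above, we obtain
\[ \int_{\Omega'} |Du|^2 \le \int_\Omega \xi^2 |Du|^2 \le C\big[\|u\|^2_{L^2(\Omega)} + \|f\|^2_{L^2(\Omega)}\big]. \]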
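4.9.4 Interior Regularity for Poisson Equation

Now we state our second regularity result.

Theorem 4.9.7 (Interior Regularity for Poisson Equation) Let $u \in H^1(\Omega)$ be a weak solution to the Poisson equation $-\nabla^2 u = f$ for some $f \in L^2(\Omega)$. Then for any $\Omega' \subset\subset \Omega$, we have $u \in H^2(\Omega')$ and
\[ \|u\|_{H^2(\Omega')} \le C\big[\|u\|_{L^2(\Omega)} + \|f\|_{L^2(\Omega)}\big]. \]

Proof We will use the same setting as in the preceding lemma. Let $\Omega' \subset\subset \Omega_h \subset \Omega$, and define a cut-off function $\xi \in C_c^\infty$ having the following properties:
1. In $\Omega'$, we have $\xi = 1$.
2. In $\Omega_h \setminus \Omega'$, we have $0 \le \xi \le 1$, and $|\nabla\xi| \le M$ for some $M > 0$.
3. $\mathrm{supp}(\xi) \subseteq \Omega_h$.
Since $u$ is a weak solution to the Poisson equation, it satisfies the weak formulation (4.9.2). Consider $v \in H_0^1(\Omega)$ with $\mathrm{supp}(v) \subseteq \Omega_h$. Substitute it in (4.9.2) with $u$ replaced by $-D_k^h u$. This gives
\[
\begin{aligned}
\int_{\Omega_h} D(-D_k^h u) \cdot Dv\,dx &= -\int_{\Omega_h} D_k^h(Du) \cdot Dv\,dx && \text{(Proposition 4.9.4(1))} \\
&= \int_{\Omega_h} Du \cdot D_k^{-h}(Dv)\,dx && \text{(Proposition 4.9.4(4))} \\
&= \int_{\Omega_h} Du \cdot D(D_k^{-h} v)\,dx && \text{(Proposition 4.9.4(1))} \\
&= \int_{\Omega_h} f\,D_k^{-h} v\,dx \\
&\le \Big(\int_{\Omega_h} f^2\,dx\Big)^{1/2}\Big(\int_{\Omega_h} |D_k v|^2\,dx\Big)^{1/2} && \text{(Theorem 4.9.5(1))}.
\end{aligned}
\]
So we have
\[ \int_{\Omega_h} D(-D_k^h u) \cdot Dv\,dx \le \|f\|_{L^2(\Omega)}\,\|Dv\|_{L^2(\Omega_h)}. \tag{4.9.3} \]
Now, define $v = -\xi^2 D_k^h u$ in (4.9.3). Note that, using Proposition 4.9.4(3), the expression $D(\xi^2 D_k^h u)$ can be written as
\[ D(\xi^2 D_k^h u) = 2\xi\,\nabla\xi\,D_k^h u + \xi^2 D(D_k^h u). \tag{4.9.4} \]
Substituting $v$ in (4.9.3) and taking into account (4.9.4) gives
\[
\begin{aligned}
\int_{\Omega_h} D(-D_k^h u) \cdot D(-\xi^2 D_k^h u)\,dx &= \int_{\Omega_h} D(D_k^h u) \cdot D(\xi^2 D_k^h u)\,dx \\
&= \int_{\Omega_h} |\xi D(D_k^h u)|^2\,dx + 2\int_{\Omega_h} \xi\,\nabla\xi\,D_k^h u \cdot D(D_k^h u)\,dx,
\end{aligned}
\]
so
\[ \|\xi D(D_k^h u)\|^2_{L^2(\Omega_h)} = \int_{\Omega_h} D(D_k^h u) \cdot D(\xi^2 D_k^h u)\,dx - 2\int_{\Omega_h} \xi\,\nabla\xi\,D_k^h u \cdot D(D_k^h u)\,dx. \tag{4.9.5} \]
Using property 2 of $\xi$ in (4.9.4) gives
\[ |D(\xi^2 D_k^h u)| \le 2M\,|D_k^h u| + |\xi D(D_k^h u)|. \tag{4.9.6} \]
Using (4.9.6) in (4.9.5),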
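\[
\begin{aligned}
\|\xi D(D_k^h u)\|^2_{L^2(\Omega_h)} &= \int_{\Omega_h} D(D_k^h u) \cdot D(\xi^2 D_k^h u)\,dx - 2\int_{\Omega_h} \xi\,\nabla\xi\,D_k^h u \cdot D(D_k^h u)\,dx \\
&\le \|D(\xi^2 D_k^h u)\|_{L^2(\Omega_h)}\,\|f\|_{L^2(\Omega)} + 2\int_{\Omega_h} |\xi D(D_k^h u)|\,|\nabla\xi\,D_k^h u|\,dx \\
&\le 2M\,\|D_k^h u\|_{L^2(\Omega_h)}\,\|f\|_{L^2(\Omega)} + \|\xi D(D_k^h u)\|_{L^2(\Omega_h)}\,\|f\|_{L^2(\Omega)} \\
&\quad + \int_{\Omega_h} |\xi D(D_k^h u)|\,2M\,|D_k^h u|\,dx \qquad \text{(by (4.9.6))}.
\end{aligned}
\]
Again, invoking the Cauchy inequality on the right-hand side of the inequality above, with $\varepsilon = \frac{1}{4}$ and the values of $s, t$ taken in the order in which they appear in the inequality, we have
\[
\begin{aligned}
\|\xi D(D_k^h u)\|^2_{L^2(\Omega_h)} &\le M^2\,\|D_k^h u\|^2_{L^2(\Omega_h)} + \|f\|^2_{L^2(\Omega)} \\
&\quad + \frac{1}{4}\|\xi D(D_k^h u)\|^2_{L^2(\Omega_h)} + \|f\|^2_{L^2(\Omega)} \\
&\quad + \frac{1}{4}\|\xi D(D_k^h u)\|^2_{L^2(\Omega_h)} + 4M^2\,\|D_k^h u\|^2_{L^2(\Omega_h)} \\
&\le \frac{1}{2}\|\xi D(D_k^h u)\|^2_{L^2(\Omega_h)} + 2\|f\|^2_{L^2(\Omega)} + 5M^2\,\|Du\|^2_{L^2(\Omega_h)} \qquad \text{(Theorem 4.9.5(1))},
\end{aligned}
\]
which implies
\[
\begin{aligned}
\|\xi D(D_k^h u)\|^2_{L^2(\Omega_h)} &\le 4\|f\|^2_{L^2(\Omega)} + 10M^2\,\|Du\|^2_{L^2(\Omega_h)} \\
&\le C\big[\|f\|^2_{L^2(\Omega)} + \|Du\|^2_{L^2(\Omega_h)}\big] \\
&\le C\big[\|f\|_{L^2(\Omega)} + \|Du\|_{L^2(\Omega_h)}\big]^2.
\end{aligned}
\]
Note that $\xi = 1$ on $\Omega'$, so Proposition 4.9.4(1) and Theorem 4.9.5(2) yield
\[ \|D^2 u\|^2_{L^2(\Omega')} \le \|\xi D(D_k^h u)\|^2_{L^2(\Omega_h)}. \]
Substituting the above and using Lemma 4.9.6 (Caccioppoli's Inequality) with $\Omega' = \Omega_h$ in the Lemma gives
\[ \|D^2 u\|_{L^2(\Omega')} \le C\big[\|f\|_{L^2(\Omega)} + \|Du\|_{L^2(\Omega)}\big]. \tag{4.9.7} \]
The combination of (4.9.7) and Caccioppoli's Inequality yields the result.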
The above theorem asserts that a weak solution of the Poisson equation is in fact a strong solution that belongs to $H^2$, so its second weak derivatives exist in $L^2$. Consequently, we can safely perform integration by parts in (4.9.2) to obtain the Poisson equation for almost every $x \in \Omega$, that is, for all $x$ except for a set of measure 0.
In standard calculus, it is well-known that $u \in C^2$ if $u'' \in C$. This observation may lead someone to believe that we can do the same in the case above and conclude that $u \in H^2$ because
\[ \nabla^2 u = f \in L^2, \]
which is incorrect. It should also be noted here that having $\nabla^2 u \in L^2$ doesn't necessarily mean that the weak derivatives $D_i^2 u$ exist and belong to $L^2$ because, as discussed earlier in Section 3.1, the existence of pointwise derivatives doesn't always imply the existence of weak derivatives, so our case should be handled with extra care. For example, consider the equation $u' = 0$ in some interval in $\mathbb{R}$. Then, if $u$ is a strong solution to the equation, it is also a weak solution since it satisfies the weak formulation, but we cannot conclude that $u \in W^{1,p}$ because $u$ might be a step function, and step functions of the form (3.1.1) are not weakly differentiable.
4.10 Regularity for General Elliptic Equations

4.10.1 Interior Regularity

Now we take the result one step further. We will prove it for a general elliptic equation (4.2.4), namely
\[ -\sum_{i,j=1}^{n} \frac{\partial}{\partial x_i}\big(a_{ij}(x)\,D_j u\big) + \sum_{i=1}^{n} b_i(x)\,D_i u(x) + c(x)u(x) = f. \tag{4.10.1} \]
The weak formulation is given by
\[ \int_\Omega \Big[\sum_{i,j=1}^{n} a_{ij}\,D_i u\,D_j v + \sum_{i=1}^{n} b_i\,D_i u\,v + cuv\Big]\,dx = \langle f, v\rangle_{L^2(\Omega)}. \tag{4.10.2} \]
The argument of the proof is similar to that of the preceding theorem, so we will give a sketch of the proof and leave the details to the reader. Note also that the Caccioppoli inequality can be proved for the operator in (4.10.1) by a similar argument.

Theorem 4.10.1 (Interior Regularity Theorem) Consider the elliptic equation $Lu = f$, where $L$ is a uniformly elliptic operator given by (4.10.1) for $a_{ij} \in C^1(\Omega)$, $b_i, c \in L^\infty(\Omega)$, and $f \in L^2(\Omega)$ for some bounded open $\Omega$ in $\mathbb{R}^n$. If $u \in H^1(\Omega)$ is a weak solution to the equation $Lu = f$, then $u \in H^2_{\mathrm{loc}}(\Omega)$. Furthermore, for any $\Omega' \subset\subset \Omega$ and some constant $C > 0$, we have
\[ \|u\|_{H^2(\Omega')} \le C\big[\|u\|_{H^1(\Omega)} + \|f\|_{L^2(\Omega)}\big]. \]
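Proof We will use the same setting as before. Namely, let $\Omega' \subset\subset \Omega_h \subset \Omega$, and define a cut-off function $\xi \in C_c^\infty$ having the following properties: in $\Omega'$, we have $\xi = 1$; in $\Omega_h \setminus \Omega'$, we have $0 \le \xi \le 1$ and $|\nabla\xi| \le M$ for some $M > 0$; and $\mathrm{supp}(\xi) \subseteq \Omega_h$. Since $u$ is a weak solution to (4.10.1), it satisfies (4.10.2), so
\[ \int_\Omega \sum_{i,j=1}^{n} a_{ij}(x)\,D_i u\,D_j v\,dx = \int_\Omega \hat{f} v\,dx \tag{4.10.3} \]
for every $v \in H_0^1(\Omega)$, where
\[ \hat{f}(x) = f - \sum_{i=1}^{n} b_i(x)\,D_i u - c(x)u(x). \]
Choose
\[ v(x) = -D_k^{-h}(\xi^2 D_k^h u), \]
and substitute it in (4.10.3). This gives
\[ \int_\Omega \sum_{i,j=1}^{n} a_{ij}(x)\,D_i u\,D_j\big(-D_k^{-h}(\xi^2 D_k^h u)\big)\,dx = -\int_\Omega \hat{f}\,D_k^{-h}(\xi^2 D_k^h u)\,dx. \tag{4.10.4} \]
Employing all the results that we used in the preceding theorem, the left-hand side of the equation can be written as
\[ \int_\Omega \sum_{i,j=1}^{n} a_{ij}(x)\,D_i u\,D_j\big(-D_k^{-h}(\xi^2 D_k^h u)\big)\,dx = A + B, \]
where
\[
\begin{aligned}
A &= \int_{\Omega_h} \sum_{i,j=1}^{n} a_{ij}\,D_k^h D_j u\,\big(\xi^2 D_k^h D_i u\big)\,dx, \\
B &= \int_{\Omega_h} \sum_{i,j=1}^{n} \Big[a_{ij}\,D_k^h D_j u\,\big(2\xi\,\nabla\xi\,D_k^h u\big) + D_k^h a_{ij}\,D_j u\,\big(2\xi\,\nabla\xi\,D_k^h u\big) + D_k^h a_{ij}\,D_j u\,\big(\xi^2 D_k^h D_i u\big)\Big]\,dx.
\end{aligned}
\]
Since $L$ is uniformly elliptic, the integral $A$ can be estimated as
\[ A \ge \lambda \int_{\Omega_h} |\xi D_k^h Du|^2\,dx, \]
and for the integral $B$, we can use Cauchy's inequality with $\varepsilon = \frac{\lambda}{2}$, then Theorem 4.9.5(1). We obtain
\[ |B| \le \frac{\lambda}{2}\int_{\Omega_h} |\xi D_k^h Du|^2\,dx + C\int_\Omega |Du|^2\,dx. \]
Using the two estimates for $A$ and $B$ in (4.10.4) yields
\[ \frac{\lambda}{2}\int_{\Omega_h} |\xi D_k^h Du|^2\,dx - C_1\int_\Omega |Du|^2\,dx \le \int_\Omega \sum_{i,j=1}^{n} a_{ij}\,D_i u\,D_j v\,dx = \int_\Omega \hat{f} v\,dx. \tag{4.10.5} \]
On the other hand, by doing the necessary calculations on the right-hand side of the inequality above and using Cauchy's inequality with $\varepsilon = \frac{\lambda}{4}$, Theorem 4.9.5(1) gives
\[ \Big|\int_\Omega \hat{f}\,D_k^{-h}(\xi^2 D_k^h u)\,dx\Big| \le \frac{\lambda}{4}\int_{\Omega_h} |\xi D_k^h Du|^2\,dx + C_2\int_\Omega \big(|f|^2 + |u|^2 + |Du|^2\big)\,dx. \]
Substituting in (4.10.5) and rearranging terms yields
\[ \frac{\lambda}{4}\int_{\Omega_h} |\xi D_k^h Du|^2\,dx \le C_3\big[\|u\|^2_{L^2(\Omega)} + \|Du\|^2_{L^2(\Omega)} + \|f\|^2_{L^2(\Omega)}\big], \]
where $C_3 = \max\{C_1, C_2\}$. Therefore
\[ \int_{\Omega_h} |\xi D_k^h Du|^2\,dx \le C\big[\|u\|^2_{L^2(\Omega)} + \|Du\|^2_{L^2(\Omega)} + \|f\|^2_{L^2(\Omega)}\big]. \]
Using the same argument as in the preceding proof, we conclude that $D^2 u \in L^2(\Omega')$ and
\[ \|u\|_{H^2(\Omega')} \le C\big[\|u\|_{H^1(\Omega)} + \|f\|_{L^2(\Omega)}\big]; \]
therefore $u \in H^2(\Omega')$, and since $\Omega' \subset\subset \Omega$ is arbitrary, we have $u \in H^2_{\mathrm{loc}}(\Omega)$.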
Remark It should be noted that the estimate of Theorem 4.10.1 can be expressed in terms of the $L^2$-norm of $u$ rather than the $H^1$-norm, and becomes
\[ \|u\|_{H^2(\Omega')} \le C\big[\|u\|_{L^2(\Omega)} + \|f\|_{L^2(\Omega)}\big] \]
(see Problem 4.11.41).
4.10.2 Higher Order Interior Regularity

Now that we have proved that a weak solution $u \in H^1$ in fact belongs to $H^2$ locally, one can use induction to repeat the argument above, iterate the estimates, and obtain higher order regularity. Indeed, if $f \in H^1(\Omega)$, then by the preceding theorem $u \in H^2_{\mathrm{loc}}(\Omega)$. Let us for simplicity consider the Poisson equation. Recall that a weak solution satisfies (4.9.2), which involves only the first derivative, and consequently one cannot perform integration by parts to obtain the equation $-\nabla^2 u = f$, because there is no guarantee that $D^2 u$ and $D_i f$ exist and belong to $L^2$. This is the main reason why weak solutions cannot be automatically regarded as strong or classical solutions to the original equation. However, if the weak solution is found to be in $H^2$ and $D_i f \in L^2$, then we can perform integration by parts. In this case, we can choose to deal with $Dv$ instead of $v \in C_c^\infty$. Substituting it in (4.9.2) gives
\[ \int_\Omega Du \cdot D(Dv)\,dx = \int_\Omega f\,Dv\,dx. \]
Performing integration by parts gives
\[ \int_\Omega D^2 u \cdot Dv\,dx = \int_\Omega (Df)\,v\,dx. \tag{4.10.6} \]
The solution $u$ in this case satisfies the original equation pointwise almost everywhere, and is thus a strong solution.

Corollary 4.10.2 Under the assumptions of the preceding theorem, if $u \in H^1(\Omega)$ is a weak solution of the equation $Lu = f$, then $u$ is a strong solution to the equation.

Another important consequence is that, based on (4.10.6) and the argument above, if we repeat the proof of the preceding theorem and iterate our estimates we will obtain $u \in H^3_{\mathrm{loc}}(\Omega)$. This process can be inductively repeated for $k \in \mathbb{N}$. In general, we have the following.

Theorem 4.10.3 (Higher Order Interior Regularity Theorem) Consider the elliptic equation $Lu = f$, where $L$ is a uniformly elliptic operator given by (4.10.1) for $a_{ij} \in C^{k+1}(\Omega)$, $b_i, c \in L^\infty(\Omega)$, and $f \in H^k(\Omega)$ for some open bounded $\Omega$ in $\mathbb{R}^n$. If $u \in H^1(\Omega)$ is a weak solution to the equation $Lu = f$, then $u \in H^{k+2}_{\mathrm{loc}}(\Omega)$. Furthermore, for any $\Omega' \subset\subset \Omega$ and some constant $C > 0$, we have
\[ \|u\|_{H^{k+2}(\Omega')} \le C\big[\|u\|_{L^2(\Omega)} + \|f\|_{H^k(\Omega)}\big]. \]

According to the theorem, the smoother the data $a_{ij}, b_i, c, f$ of the equation, the smoother the solution we get. Note here that if $k > \frac{n}{2}$ then, using the Sobolev Embedding Theorem, we conclude that $f \in C(\Omega)$ and consequently $u \in C^2(\Omega)$, which describes a classical solution of the equation.
4.10.3 Interior Smoothness

A natural question arises now: if the preceding theorem holds for all $k \in \mathbb{N}$, shouldn't that imply that the solution is smooth? The following theorem answers the question positively.

Theorem 4.10.4 (Interior Smoothness Theorem) Let $a_{ij}, b_i, c, f \in C^\infty(\Omega)$ and let $u \in H^1(\Omega)$ be a weak solution to the equation $Lu = f$. Then $u \in C^\infty(\Omega)$ and $u$ is a classical solution to the equation.

Proof If $f \in C^\infty(\Omega)$ then $f \in H^k(\Omega)$ for all $k \in \mathbb{N}$, so by the preceding theorem $u \in H^{k+2}_{\mathrm{loc}}(\Omega)$, i.e., $u \in H^{k+2}(\Omega')$ for every $\Omega' \subset\subset \Omega$, and so by the Sobolev Embedding Theorem (Theorem 3.10.10(3)) $u \in C^\infty(\Omega')$, and since $\Omega'$ is arbitrary, we can extend this by continuity to $\Omega$.
4.10.4 Boundary Regularity

All the preceding results establish interior regularity for weak solutions, that is, regularity in $H^k(\Omega')$ for sets $\Omega' \subset\subset \Omega$. This shows that a solution of an elliptic equation with regular/smooth data is locally regular/smooth in the interior of its domain of definition, but it gives no information about the smoothness of the solution at the boundary. In other words, the results above didn't yield a solution in $H^k(\Omega)$. In order to obtain a smooth solution at the boundary, and based on the treatment we gave to obtain interior regularity and smoothness, we require the boundary itself to be sufficiently regular. The following theorem can also be iterated to yield smoothness provided the data are smooth. The proof is long and very technical and may not fit the scope of the present book. The interested reader may consult books on the theory of partial differential equations for the details of the proof.

Theorem 4.10.5 (Boundary Regularity Theorem) In addition to the assumptions of Theorem 4.10.1, suppose that $\partial\Omega$ is of $C^1$-class, and $a_{ij} \in C^1(\overline{\Omega})$. If $u \in H_0^1(\Omega)$ is a weak solution to the equation $Lu = f$ under the boundary condition $u = 0$ on $\partial\Omega$, then $u \in H^2(\Omega)$. Furthermore, for some constant $C > 0$, we have
\[ \|u\|_{H^2(\Omega)} \le C\big[\|u\|_{L^2(\Omega)} + \|f\|_{L^2(\Omega)}\big]. \]

It becomes clear now that the weak solutions of elliptic equations can become strong or even classical solutions provided the data of the equation are sufficiently regular. It is well-known that establishing the existence of a solution of an elliptic equation is not an easy task. Now, in light of the present section, one can seek weak solutions over Sobolev spaces, then use regularity techniques to show that these weak solutions are in fact strong or classical solutions. In the next chapter, we will see that weak solutions of elliptic partial differential equations are closely connected with minimizers of integral functionals that are related to these elliptic PDEs through their weak formulation.
4.11 Problems

(1) If $a_{ij}(x) \in C^2(\Omega)$ and $b_i = 0$ in the elliptic operator $L$ in (4.1.1), prove that $B[u, v]$ is symmetric.
(2) Suppose that $L$ is an elliptic operator and there exists $0 < \Lambda < \infty$ such that
\[ \sum_{i,j=1}^{n} a_{ij}\,\xi_i \xi_j \le \Lambda |\xi|^2 \]
for all $\xi \in \mathbb{R}^n$. Show that $a_{ij} \in L^\infty$.
(3) Show that an elliptic operator defined on $\Omega \subseteq \mathbb{R}^n$ makes sense in divergence form only if $a_{ij} \in C^1(\Omega)$.
(4) (a) Prove that $\|\cdot\|_\partial$ defines a norm on $\tilde{H}^1(\Omega)$.
(b) Prove that every Cauchy sequence in $\tilde{H}^1(\Omega)$ converges in $\tilde{H}^1(\Omega)$.
(c) Prove that (4.3.4) defines an inner product.
(d) Conclude Proposition 4.3.5.
(5) Prove Proposition 4.4.3.
(6) Determine whether the cross-product mapping $P : \mathbb{R}^3 \times \mathbb{R}^3 \longrightarrow \mathbb{R}^3$ is bilinear.
(7) Show that if the elliptic bilinear map $B[u, v]$ is symmetric then it is coercive.
(8) (Poincaré–Friedrichs inequality): Let $\Omega$ be $C^1$, open and bounded in $\mathbb{R}^2$. Show that there exists $C > 0$ such that
\[ \int_\Omega |u|^2\,dx \le C\Big[\int_\Omega |Du|^2\,dx + \Big(\int_\Omega u\,dx\Big)^2\Big] \]
for all $u \in H^1(\Omega)$.
(9) Show that if $u \in C^2(\Omega) \cap C(\overline{\Omega})$ is a weak solution to the problem
\[ \begin{cases} Lu = f, & x \in \Omega, \\ \dfrac{\partial u}{\partial n} = 0, & x \in \partial\Omega, \end{cases} \]
for some bounded $\Omega \subset \mathbb{R}^n$, then $u$ is a classical solution to the problem.
(10) Consider $H_0^1(\Omega)$ with the Sobolev norm
\[ \|u\|_{H_0^1(\Omega)} = \|u\|_{L^2(\Omega)} + \|Du\|_{L^2(\Omega)} \]
for $u \in H_0^1(\Omega)$. Show that the norm $\|Du\|_{L^2(\Omega)}$ is equivalent to $\|u\|_{H_0^1(\Omega)}$ on $H_0^1(\Omega)$.
(11) Determine whether or not $H_0^1(\mathbb{R}^n)$ with the inner product
\[ (u, v) = \int_{\mathbb{R}^n} \nabla u \cdot \nabla v\,dx \]
is a Hilbert space.
(12) Show that if $f \in L^2(\Omega)$ and $Df$ is its distributional derivative, then
\[ \|Df\|_{H^{-1}(\Omega)} \le \|f\|_{L^2(\Omega)}. \]
Deduce that $L^2(\Omega) \subset H^{-1}(\Omega)$.
(13) Let $L$ be a uniformly elliptic operator with $b_i = 0$ for all $i$.
(a) If $\min(c(x)) < \lambda_0$, show that $\gamma$ in Gårding's Inequality can be estimated as $\gamma = \lambda_0 - \min(c(x))$.
(b) Show that the bilinear map associated with the operator $Lu + \mu u = f$ is coercive for $\mu \ge \gamma$.
(c) Establish the existence and uniqueness of the weak solution of the Dirichlet problem for $Lu + \mu u = f$ with $u(\partial\Omega) = 0$.
(14) Let $L$ be a uniformly elliptic general operator with $a_{ij}, b_i, c \in L^\infty(\Omega)$ for some open and bounded $\Omega$ in $\mathbb{R}^2$. Write Gårding's Inequality with
\[ \gamma = \frac{\lambda_0}{2} - m + \frac{M}{2\lambda_0}, \]
where $m = \min(c(x))$ and $M = \max \|b_i\|_\infty$.
(15) Consider a bounded $\Omega \subset \mathbb{R}^n$ and $f \in L^2(\Omega)$, $g \in L^2(\partial\Omega)$. Prove the existence and uniqueness of the weak solution of the following Neumann problems:
(a) $\nabla^2 u = 0$ in $\Omega$ and $\dfrac{\partial u}{\partial n} = g$.
(b) $-\nabla^2 u + u = f$ in $\Omega$ and $\dfrac{\partial u}{\partial n} = 0$.
(16) Consider the problem
\[ -\nabla^2 u = f \ \text{in } \Omega, \qquad \frac{\partial u}{\partial n} = g, \]
on $L^2(\Omega)$ for bounded $\Omega \subset \mathbb{R}^n$.
(a) Show that a necessary condition for the solution to exist is that
\[ \int_\Omega f\,dx + \int_{\partial\Omega} g\,dx = 0. \]
(b) Establish the existence and uniqueness of the weak solution of the problem.
(17) Prove that there exists a weak solution for the nonhomogeneous Helmholtz problem
\[ \begin{cases} -\nabla^2 u + u = f, & x \in \Omega, \\ u = 0, & x \in \partial\Omega. \end{cases} \]
Do we need to use the Lax–Milgram theorem? Justify your answer.
(18) Let $\Omega \subset \mathbb{R}^n$ be bounded and $f \in L^2(\Omega)$, $g \in L^2(\partial\Omega)$. Consider the problem
\[ -\mathrm{div}(\nabla u) + u = f, \qquad \frac{\partial u}{\partial n} = g. \]
(a) Find the weak formulation of the problem.
(b) Prove the existence and uniqueness of the weak solution of the problem.
(19) Consider the problem
\[ (-pu')' + qu = f, \qquad u(0) = u(1) = 0, \]
on $I = [0, 1]$, for $p, q \in C[I]$ and $f \in L^2(I)$, and $p, q > 0$.
(a) Find the associated bilinear form $B$ on $H_0^1(I)$.
(b) Show that $B$ is coercive.
(c) Prove the existence and uniqueness of the weak solution of the problem.
(20) In Theorem 4.5.4, find the best value of $c_0 > -\infty$ such that if $c(x) \ge c_0$ then $B[u, v]$ is coercive.
(21) Generalize Theorem 4.5.5 assuming that $c(x) \ge m$ for some $m \in \mathbb{R}$ (not necessarily positive) such that $m \ge -\dfrac{\lambda}{C^2}$.
(22) Solve the Dirichlet problem (4.6.2) if $c(x) \ge \lambda$ (where $\lambda$ is the ellipticity constant of $L$).
(23) Prove the existence and uniqueness of the weak solution of problem (4.6.4) for a symmetric $A$ using the Riesz Representation Theorem.
(24) Let $\lambda_{\max} > 0$ be the largest eigenvalue of the matrix $A$, and let $\mu < -\lambda_{\max}$. Prove the existence and uniqueness of the weak solution of the Dirichlet problem
\[ \begin{cases} Lu + \mu u = f, & x \in \Omega, \\ u = 0, & x \in \partial\Omega, \end{cases} \]
where $L$ is a uniformly elliptic operator of the form (4.5.3), such that $a_{ij}, c \in L^\infty(\Omega)$, for some open $\Omega$ that is bounded in at least some direction in $\mathbb{R}^n$.
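(25) Consider the following uniformly elliptic operator
\[ L = -\sum_{i,j=1}^{n} \frac{\partial}{\partial x_i}\Big(a_{ij}(x)\frac{\partial}{\partial x_j}\Big) - \sum_{i=1}^{n} \frac{\partial}{\partial x_i}(b_i\,\cdot) + \sum_{i=1}^{n} c(x)\frac{\partial}{\partial x_i} + d, \]
with $a_{ij}, b_i, c, d \in L^\infty(\Omega)$ for some open and bounded $\Omega$ in $\mathbb{R}^2$.
(a) Show that the associated elliptic bilinear map $B[u, v]$ is bounded.
(b) Write Gårding's Inequality with
\[ \beta = \frac{\lambda_0}{2}, \qquad \gamma = \frac{1}{2\lambda_0}\big(\max \|b_i\|_\infty + \|c\|_\infty\big)^2 + \|d\|_\infty. \]
(c) If $\mu \ge \gamma$, show that
\[ B_\mu[u, v] = B[u, v] + \mu \langle u, v\rangle_{L^2} \]
is bounded and coercive in $H_0^1(\Omega)$.
(d) Show that for $f \in L^2(\Omega)$, there exists a unique weak solution in $H_0^1(\Omega)$ to the Dirichlet problem
\[ \begin{cases} Lu = f, & x \in \Omega, \\ u = 0, & x \in \partial\Omega. \end{cases} \]
(26) Without using Theorem 4.5.4, show that the elliptic bilinear map associated with $Lu + \mu u = f$ is coercive for $\mu \ge \gamma$.
(27) Show that the norm
\[ \|u\|_{\partial^2} = \|\nabla^2 u\|_{L^2(\Omega)} \]
is equivalent to the standard norm $\|u\|_{H_0^1(\Omega)}$ for some open and bounded $\Omega$ in $\mathbb{R}^2$.
(28) Consider the equation
\[ -\sum_{i,j=1}^{n} \frac{\partial}{\partial x_i}\Big(a_{ij}(x)\frac{\partial u}{\partial x_j}\Big) + \sum_{i=1}^{n} \frac{\partial}{\partial x_i}(b_i u) + c(x)u(x) + \mu u(x) = f \quad \text{in } \Omega. \]
Assume $a_{ij}, c \in L^\infty(\Omega)$, $b_i \in W^{1,\infty}(\Omega)$, and $f \in L^2(\Omega)$ for some open $\Omega$ in $\mathbb{R}^2$ that is bounded in at least one direction. Discuss the existence and uniqueness of the weak solution of the equation under:
(a) the homogeneous Dirichlet condition $u(\partial\Omega) = 0$.
(b) the homogeneous Neumann condition $\nabla u \cdot n = 0$ on $\partial\Omega$.
(c) Impose whatever conditions are needed on $b, c, \mu$ to ensure the results in (a) and (b).
(d) Discuss the two cases: $\mu = 0$, and $\mu > 0$.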
(29) Consider the fourth-order biharmonic BVP
\[ \begin{cases} \nabla^2\nabla^2 u = f & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega, \\ \nabla u \cdot n = 0 & \text{on } \partial\Omega, \end{cases} \]
where $f \in L^2(\Omega)$ and $\Omega$ is open and bounded in $\mathbb{R}^2$. Use the preceding problem to show the existence and uniqueness of the weak solution of the problem.
(30) Show that the eigenvalues of a self-adjoint uniformly elliptic operator are bounded from below.
(31) Show that $K^* = (L^* + \mu I)^{-1}|_{L^2}$ for the operator $K = L_\mu^{-1}$.
(32) In the preceding problem, show that if the only solution of $K^*(v) = -v$ is $v = 0$, then for every $g \in L^2$ the equation $u + \gamma K u = g$ has a unique solution.
(33) Let $L$ be a uniformly elliptic operator defined on some open bounded $\Omega$ in $\mathbb{R}^2$, and $f, h \in L^2(\Omega)$, $\lambda \in \mathbb{R}$.
(a) Show that $u - Ku = h$ has a weak solution in $H_0^1(\Omega)$ iff $\langle h, v\rangle = 0$ for every weak solution $v \in H_0^1(\Omega)$ of $K^* v - v = 0$.
(b) If $Lu = 0$ has a nontrivial solution, show that $Lu = f$ has a weak solution in $H_0^1(\Omega)$ iff $\langle f, v\rangle = 0$ for every weak solution $v \in H_0^1(\Omega)$ of $L^* v = 0$.
(34) In the previous problem, show that if $Lu - \lambda u = f$ has a weak solution in $H_0^1(\Omega)$ for every $f \in L^2(\Omega)$, then
\[ \|u\|_{L^2} \le C\,\|f\|_{L^2} \]
for some $C = C(\lambda, \Omega)$.
(35) Let $L = -\nabla^2$ and $\Omega = \mathbb{R}^n$.
(a) Show that $L$ has a continuous spectrum.
(b) Conclude that the boundedness of $\Omega$ in Theorem 4.8.2 is essential to obtain a discrete set of eigenvalues.
(36) Let $\{\varphi_n\}$ be the orthonormal basis for $L^2(\Omega)$ consisting of the eigenfunctions of the Laplacian equation $-\nabla^2 \varphi_n = \lambda_n \varphi_n$. Show that $\{\varphi_n / \sqrt{\lambda_n}\}$ is an orthonormal basis for $H_0^1(\Omega)$ endowed with the Poincaré inner product
$$\langle u, v \rangle_{\partial} = \int_\Omega Du \cdot Dv \, dx.$$
(37) Prove the first three statements of Proposition 4.9.4.
(38) Prove that $u \in W^{1,p}(\mathbb{R}^n)$ if and only if $u \in L^p(\mathbb{R}^n)$ and
$$\limsup_{h \to 0} \frac{\|u(x + h) - u(x)\|_p}{|h|} < \infty.$$
(39) Prove the estimate in Theorem 4.9.5(1) for $p = \infty$.
(40) Give an example to show that the estimate in Theorem 4.9.5(2) is not valid for $p = 1$.
(41) (a) Prove the Caccioppoli inequality (Lemma 4.9.6) for the general elliptic operator in (4.10.1).
(b) Show that the estimate obtained in Theorem 4.10.1 can be refined to
$$\|u\|_{H^2(\Omega')} \le C\left(\|u\|_{H^1(\Omega_h)} + \|f\|_{L^2(\Omega_h)}\right).$$
(c) Choose $v = \xi^2 u$ for some cut-off function $\xi$ with $\xi = 1$ in $\Omega_h$ and $\mathrm{supp}(\xi) \subset \Omega$. Under the same assumptions of Theorem 4.10.1, use (a) and (b) to establish the estimate
$$\|u\|_{H^2(\Omega')} \le C\left(\|u\|_{L^2(\Omega)} + \|f\|_{L^2(\Omega)}\right).$$
(42) In Theorem 4.10.3, show that if $k > \frac{n}{2}$ then $u \in C^2(\Omega)$ is a classical solution for the equation.
Chapter 5
Calculus of Variations
5.1 Minimization Problem

5.1.1 Definition of Minimization Problem

The problem of finding the maximum or minimum value of a function over some set in its domain is called a variational problem. This problem, and the search for ways to deal with it, is as old as humanity itself. When we are looking for a minimum value, the problem is called a minimization problem. We will focus on minimization problems due to their particular importance: in physics and engineering we look for the minimum energy, in geometry we look for the shortest distance or the smallest area or volume, and in economics we look for the minimum cost. Historically, the minimization problems in physics and geometry were the main motivation for developing the tools and techniques that laid the foundations of this field of mathematics, which connects functional analysis with applied mathematics. We will discuss the relations between minimization problems and the existence and uniqueness of solutions of PDEs. We confine ourselves to elliptic equations. We now give a formal definition of the problem.

Definition 5.1.1 (Minimization Problem) Let $f : X \to \mathbb{R}$ be a real-valued function. The minimization problem is the problem of searching for some $x_0 \in A$, for some set $A \subseteq X$, such that
$$f(x_0) = \inf_{x \in A} f(x).$$
If there exists such $x_0 \in A$, then we say that $x_0$ is the solution to the problem, and the point $x_0$ is said to be the minimizer of $f$ over $A$. If $f(x_0)$ is the minimum value over $X$, then $x_0$ is the global minimizer of $f$. The set $A$ is called the admissible set, and it consists of all the values of $x$ that compete to attain the minimum value over $A$. We observe the following points:
(1) Not every minimization problem has a minimizer.
(2) If a minimizer is not found over a set $A$, it may still be possible to find one if the admissible set $A$ is enlarged; the larger the admissible set, the better the chance that the problem has a solution.
(3) The maximization problem is the dual of the minimization problem: over an admissible set $A$ we always have $\sup f(x) = -\inf(-f(x))$, so solving one problem automatically solves the other.

We emphasize that we are concerned with searching for the minimizer rather than the infimum of the function. This “searching” has two levels: the first level is merely to prove the existence of such a minimizer. The second level (which usually comes after achieving the goal of the first level) is to find such a minimizer (exactly or approximately) using various analytical, numerical, and computational techniques. In functional analysis we usually confine ourselves to the first level, and once the job is done, it is the task of applied mathematicians to carry out the second level. To simplify our task, we start with simple minimization problems in finite-dimensional spaces, which will motivate the discussion of the problem in infinite-dimensional spaces. The first time we encountered this type of problem was in a calculus course. We begin with a lovely theorem that all readers have certainly studied in a first course of calculus.

Theorem 5.1.2 (Extreme Value Theorem) If $f$ is continuous on a closed interval $[a, b]$, then $f$ attains both an absolute maximum value and an absolute minimum value somewhere in $[a, b]$.

The theorem predicts the existence of a minimizer and a maximizer over $[a, b]$, and it can be easily generalized to $\mathbb{R}^n$. The admissible set in the theorem is the compact interval $[a, b]$, and the only requirement on $f$ is continuity.
The continuity of the function and the compactness of the admissible set are the most important conditions for establishing existence results for minimizers and maximizers. Another fundamental theorem is:

Theorem 5.1.3 (Bolzano–Weierstrass Theorem) Every sequence in a compact set in $\mathbb{R}^n$ has a subsequence that converges to some point in that set.

The theorem can be stated equivalently as: “Every bounded sequence in $\mathbb{R}^n$ has a convergent subsequence”. Note that the Extreme Value Theorem delivers both a maximum and a minimum. The following function
$$f(x) = \begin{cases} x^2, & -1 \le x \le 0, \\ 2 - x, & 0 < x \le 1, \end{cases}$$
does have a minimum value of 0, but it does not have a maximum value: its supremum 2 is approached as $x \to 0^+$ but is never attained. If we restrict our goal to exploring minimizers only, then functions such as the above may serve as a good example to motivate us. In fact, this function is lower semicontinuous.
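The behaviour of this function can be checked numerically. The following is a minimal sketch in Python; the name `f`, the sampling grid, and the variable names are illustrative choices and not part of the text. It samples the function on $[-1, 1]$ and shows that the minimum value 0 is attained at $x = 0$, while the sampled values approach 2 from below as $x \to 0^+$ without reaching it.

```python
def f(x: float) -> float:
    """Piecewise function from the text: x^2 on [-1, 0], 2 - x on (0, 1]."""
    if -1.0 <= x <= 0.0:
        return x * x
    if 0.0 < x <= 1.0:
        return 2.0 - x
    raise ValueError("x must lie in [-1, 1]")


# Uniform sample grid on [-1, 1] (an illustrative choice); it contains x = 0.
grid = [-1.0 + k / 1000.0 for k in range(2001)]
values = [f(x) for x in grid]

min_value = min(values)        # the minimum 0 is attained at x = 0
largest_sampled = max(values)  # close to 2, but strictly below it

# Points approaching 0 from the right: f gets closer to 2 but never equals it.
right_limit_samples = [f(10.0 ** (-k)) for k in range(1, 8)]

print("minimum on grid:", min_value)
print("largest sampled value:", largest_sampled)
print("values as x -> 0+:", right_limit_samples)
```

A finite sample cannot prove that the supremum is not attained; the sketch only illustrates the claim, which follows directly from the formula $2 - x < 2$ for $x > 0$.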
5.1.2 Lower Semicontinuity

Definition 5.1.4 Let $X$ be a normed space and let $f$ be a real-valued function defined on $X$.
(1) Lower semicontinuous. The function $f$ is said to be lower semicontinuous at $x_0 \in D(f)$ if for every $\varepsilon > 0$ there exists $\delta > 0$ such that $f(x_0) < f(x) + \varepsilon$ for all $x \in D(f)$ with $\|x - x_0\| < \delta$. A lower semicontinuous function is denoted simply by l.s.c.
(2) Upper semicontinuous. The function $f$ is said to be upper semicontinuous at $x_0 \in D(f)$ if for every $\varepsilon > 0$ there exists $\delta > 0$ such that $f(x) - \varepsilon < f(x_0)$ for all $x \in D(f)$ with $\|x - x_0\| < \delta$. An upper semicontinuous function is denoted simply by u.s.c.

It is evident from the definitions above that a function is continuous at a point iff it is l.s.c. and u.s.c. at that point. Also, if $f$ is l.s.c. then $-f$ is u.s.c. We will only discuss l.s.c. functions. A more convenient definition based on sequences is the following:

Definition 5.1.5 (Sequentially lower semicontinuous) A function $f : X \to (-\infty, \infty]$ is said to be sequentially lower semicontinuous at a point $x_0 \in D(f)$ if
$$f(x_0) \le \liminf_{n \to \infty} f(x_n)$$
for every sequence $x_n \in X$ converging to $x_0$.

In normed spaces, the two definitions are equivalent, so we will continue to use Definition 5.1.5 and omit the term sequentially for convenience. The following result is fundamental, and some authors use it as an equivalent definition of lower semicontinuity.

Proposition 5.1.6 Let $f$ be defined on some normed space $X$. Then $f$ is l.s.c. iff $f^{-1}((-\infty, r]) = \{x : f(x) \le r\}$ is a closed set in $X$ for every $r \in \mathbb{R}$.

The proof is straightforward using the definitions. We leave it to the reader. The set $\{x : f(x) \le r\}$ is called the lower-level set of $f$. Another important notion linked to real-valued functions is the following:

Definition 5.1.7 (Epigraph) Let $f$ be defined on some normed space $X$. Then the epigraph of $f$, denoted by epi$(f)$, is given by
$$\mathrm{epi}(f) = \{(x, r) \in X \times \mathbb{R} : r \ge f(x)\}.$$
In words, the epigraph of a function $f$ is the set of all points in $X \times \mathbb{R}$ lying above or on the graph of $f$. The next result shows that l.s.c. functions can be identified by their epigraphs.

Proposition 5.1.8 Let $f$ be defined on some normed space $X$. Then $f$ is l.s.c. iff epi$(f)$ is closed.

Proof Let $f$ be l.s.c. and consider a sequence $(x_n, r_n) \in \mathrm{epi}(f)$, which means that $f(x_n) \le r_n$. Let $(x_n, r_n) \to (x_0, r_0)$ for some $x_0 \in X$ and $r_0 \in \mathbb{R}$. Then by lower semicontinuity,
$$f(x_0) \le \liminf f(x_n) \le \liminf r_n = \lim r_n = r_0.$$
Hence $(x_0, r_0) \in \mathrm{epi}(f)$, and this proves one direction. Conversely, let epi$(f)$ be closed, let $x_n \to x_0$ in $X$, and set $r_0 = \liminf f(x_n)$. If $r_0 = \infty$ there is nothing to prove. Otherwise, choose a subsequence $(x_{n_k})$ with $f(x_{n_k}) \to r_0$ and fix any real $r > r_0$. Then $f(x_{n_k}) \le r$ for all large $k$, so $(x_{n_k}, r) \in \mathrm{epi}(f)$, and since epi$(f)$ is closed and $(x_{n_k}, r) \to (x_0, r)$, we get $(x_0, r) \in \mathrm{epi}(f)$, i.e., $f(x_0) \le r$. Letting $r$ decrease to $r_0$ gives $f(x_0) \le r_0 = \liminf f(x_n)$.
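5.1.3 Minimization Problems in Finite-Dimensional Spaces

With the use of the l.s.c. notion, the Bolzano–Weierstrass Theorem can be reduced to the following version:

Theorem 5.1.9 If $f : X \to (-\infty, \infty]$ is l.s.c. on a compact set $K \subseteq X$, then $f$ is bounded from below and attains its infimum on $K$, i.e., there exists $x_0 \in K$ such that
$$f(x_0) = \inf_{x \in K} f(x).$$

Proof Let
$$m = \inf_{x \in K} f(x) \ge -\infty.$$
Define the sets
$$C_n = \left\{x \in K : f(x) \le m + \frac{1}{n}\right\}$$
(if $m = -\infty$, take $C_n = \{x \in K : f(x) \le -n\}$ instead). Each $C_n$ is nonempty by the definition of the infimum. Since $f$ is l.s.c., every $C_n$ is closed, and since they are subsets of the compact set $K$, they are all compact; moreover $C_{n+1} \subseteq C_n$ for all $n$. By Cantor's intersection theorem,
$$\bigcap_{n=1}^{\infty} C_n \ne \emptyset,$$
i.e., there exists $x_0 \in K$ such that
$$x_0 \in \bigcap_{n=1}^{\infty} C_n.$$
Since $f(x_0) > -\infty$, the case $m = -\infty$ is excluded, and $f(x_0) \le m + \frac{1}{n}$ for every $n$ gives $f(x_0) \le m$. Therefore,
$$f(x_0) = \inf_{x \in K} f(x) = m > -\infty.$$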
The above theorem establishes the existence of minimizers without giving a constructive procedure to find it. To find the minimizer, we need to implement various analytical and numerical tools to obtain the exact or approximate value of it, but this is beyond the scope of the book. To determine whether the minimizer predicted by the above theorem is global or not, we need to impose further conditions to give extra information about this minimizer. One interesting and important condition is the notion of convexity.
5.1.4 Convexity

Recall that a set $C$ is convex if whenever $x, y \in C$ we have $\theta x + (1 - \theta) y \in C$ for every $0 \le \theta \le 1$. In the following proposition, we remind the reader of some of the basic properties of convex sets that can be easily proved using the definition.

Proposition 5.1.10 The following statements hold in every normed space:
(1) The empty set is trivially convex. Moreover, a singleton set is convex.
(2) All open and closed balls in any normed space are convex.
(3) If $C$ is convex, then $\overline{C}$ (the closure of $C$) and Int$(C)$ (the interior of $C$) are convex sets.
(4) The intersection of any collection of convex sets is convex.
(5) The intersection of all convex sets containing a subset $A$ is convex. This is called the convex hull of $A$, and is denoted by Conv$(A)$. Moreover, if $A$ is compact in a finite-dimensional space, then Conv$(A)$ is compact.
(6) For any set $A$, Conv(Conv$(A)$) = Conv$(A)$.

Recall that a function $f$ is convex if whenever $x, y \in D(f)$ we have
$$f(\theta x + (1 - \theta) y) \le \theta f(x) + (1 - \theta) f(y)$$
for all $0 < \theta < 1$. If the inequality is strict whenever $x \ne y$, then $f$ is called strictly convex. It is evident from the definition of a convex function that the domain $D(f)$ must be convex. The following theorem benefits from the property of convexity.

Theorem 5.1.11 Let $f$ be convex. If $f$ has a local minimizer, then it is a global minimizer.

Proof If $x_0$ is a local minimizer, then for any $y \in D(f)$ we can choose $0 < \theta < 1$ close enough to 1 that $\theta x_0 + (1 - \theta) y$ lies in the neighborhood on which $x_0$ is minimal, so
$$f(x_0) \le f(\theta x_0 + (1 - \theta) y) \le \theta f(x_0) + (1 - \theta) f(y).$$
Subtracting $\theta f(x_0)$ from both sides and dividing by $1 - \theta > 0$, this implies $f(x_0) \le f(y)$.
5.1.5 Minimization in Infinite-Dimensional Space We have seen that compactness and lower semicontinuity were the basic tools used to search for minimizers in finite-dimensional spaces. This situation totally changes when it comes to infinite-dimensional spaces. As discussed in a previous course of functional analysis, compactness is not easy to achieve in these spaces, and Heine– Borel Theorem, which states that every closed bounded set is compact, fails in infinite-dimensional spaces. In fact, a well-known result in analysis states that a space is finite-dimensional if and only if its closed unit ball is compact, and hence the closed unit ball in infinite-dimensional spaces is never compact. A good suggestion to remedy this difficult situation is to change the topology on the space. More specifically, we replace the norm topology with another “weaker” topology that allows more closed and compact sets to appear in the space. This is the weak topology, which is the subject of the next section.
5.2 Weak Topology 5.2.1 Notion of Weak Topology It is well-known that in normed spaces, the stronger the norm the more open sets we obtain in the space, which makes it harder to obtain convergence and compactness. Replacing the norm topology with a coarser, or weaker topology results in fewer open sets, which will simplify our task. We are concerned with the smallest topology that makes any bounded linear functional continuous. This merely requires f −1 (O)
to be open in the topology for any O open in R. The generated topology is called the weak topology. Historically, the minimization problem was one of the main motivations to explore the weak topology. As noted in the preface, the reader of this book must have prior knowledge of weak topology in a previous course of analysis, but we shall give a quick overview and remind the reader of the important results that will be used later in the chapter.
5.2.2 Weak Convergence

Remark Throughout, we always assume the space is normed.

Recall that a sequence $x_n \in X$ is weakly convergent to $x$, written $x_n \rightharpoonup x$ (or $x_n \xrightarrow{w} x$), if $f(x_n) \to f(x)$ for all bounded linear functionals $f$ on $X$. In particular, let $u_n \in L^p$ for some $1 \le p < \infty$ and let $q$ be its Hölder conjugate. If $u_n \rightharpoonup u$, then $f(u_n) \to f(u)$ for all bounded linear functionals $f$ on $L^p$, that is, $\int u_n g \, dx \to \int u g \, dx$ for all $g \in L^q$. Moreover, if $u_n \rightharpoonup u$ in a Hilbert space $H$, then $\langle u_n, z \rangle \to \langle u, z \rangle$ for all $z \in H$. A useful property for obtaining convergence in norm from weak convergence is known as the Radon–Riesz property:

Theorem 5.2.1 (Radon–Riesz Theorem) Let $f_n, f \in L^p$, $1 < p < \infty$. If $f_n \xrightarrow{w} f$ in $L^p$ and $\|f_n\|_p \to \|f\|_p$, then $\|f_n - f\|_p \to 0$.

Proof For $p = 2$, use the Riesz Representation Theorem and note that
$$\|f_n - f\|_2^2 = \langle f_n - f, f_n - f \rangle = \|f_n\|^2 - 2\langle f_n, f \rangle + \|f\|^2.$$
For $p \ne 2$, normalize so that $\|f\|_p = 1$ and use the Hahn–Banach Theorem to show the existence of $g \in L^q$ such that $\|g\|_q = 1$ and $g(f) = \|f\|_p = 1$. Then use Hölder's inequality and the fact that $L^p$ is uniformly convex.

It is worth noting that when we describe a topological property by “weakly”, it means that the property holds in the weak topology. The following is a brief description of the sets in the weak topology with some implications:
(1) $A$ is said to be weakly open if for every element $x \in A$ there exist functionals $\{f_i\} \subset X^*$ and sets $\{O_i\}$ open in $\mathbb{R}$, for $i = 1, 2, \dots, n$, such that
$$x \in \bigcap_{k=1}^{n} f_k^{-1}(O_k) \subset A.$$
(2) $A$ is said to be weakly closed if $A^c$ is weakly open.
(3) $A$ is said to be weakly sequentially closed if for every $x_n \in A$ with $x_n \xrightarrow{w} x$, we have $x \in A$.
(4) $A$ is said to be weakly bounded if $f(A)$ is bounded for all $f \in X^*$.
(5) $A$ is said to be weakly compact if it is compact in the weak topology, i.e., every cover of $A$ by weakly open sets has a finite subcover.
(6) $A$ is said to be weakly sequentially compact if for every sequence $x_n \in A$ there exists a subsequence $x_{n_j} \xrightarrow{w} x \in A$.
(7) If $\{x_n\}$ is convergent in norm, then it is weakly convergent, but the converse is not necessarily true.
(8) If $A$ is a weakly open set, then it is an open set, but the converse is not necessarily true.
(9) If $A$ is a weakly closed set, then it is a closed set, but the converse is not necessarily true.
(10) $A$ is bounded if and only if it is weakly bounded.
(11) If $A$ is compact, then it is weakly compact, but the converse is not necessarily true.
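Item (7) can be illustrated with the orthonormal sequence $u_n(x) = \sqrt{2/\pi}\,\sin(nx)$ in $L^2(0, \pi)$, which converges weakly to 0 while $\|u_n\|_2 = 1$ for every $n$. The following is a minimal numerical sketch in Python; the helper names (`integrate`, `pairing`, `norm`), the midpoint quadrature, and the single test function $g(x) = x$ are assumptions made for illustration. Checking one $g$ does not prove weak convergence, which requires all $g \in L^2$; it only makes the phenomenon visible.

```python
import math


def integrate(func, a, b, m=20000):
    """Composite midpoint rule on [a, b] with m subintervals."""
    h = (b - a) / m
    return h * sum(func(a + (i + 0.5) * h) for i in range(m))


def u(n):
    """Orthonormal sequence u_n(x) = sqrt(2/pi) * sin(n x) in L^2(0, pi)."""
    return lambda x: math.sqrt(2.0 / math.pi) * math.sin(n * x)


def g(x):
    """A fixed test function in L^2(0, pi) (illustrative choice)."""
    return x


def pairing(n):
    """Approximate inner product <u_n, g> in L^2(0, pi)."""
    un = u(n)
    return integrate(lambda x: un(x) * g(x), 0.0, math.pi)


def norm(n):
    """Approximate L^2(0, pi) norm of u_n."""
    un = u(n)
    return math.sqrt(integrate(lambda x: un(x) ** 2, 0.0, math.pi))


indices = (1, 10, 100)
pairings = [abs(pairing(n)) for n in indices]  # decays like sqrt(2*pi)/n
norms = [norm(n) for n in indices]             # stays equal to 1

for n, p, nm in zip(indices, pairings, norms):
    print(f"n={n:4d}  |<u_n, g>|={p:.6f}  ||u_n||={nm:.6f}")
```

The pairings shrink toward 0 while the norms do not, so the sequence cannot converge to 0 in norm.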
5.2.3 Weakly Closed Sets

Now, we state two important theorems about closed sets in the weak topology.

Theorem 5.2.2 (Mazur’s Theorem) If a set is convex and closed, then it is weakly closed.

Proof Let $A$ be convex and closed, and let $x_0 \in A^c$. By a consequence of the Hahn–Banach Theorem, there exist $f \in X^*$ and $b \in \mathbb{R}$ such that $f(x) < b < f(x_0)$ for all $x \in A$. Then the set $U = \{x \in X : f(x) > b\}$ is weakly open and satisfies $x_0 \in U \subset A^c$. Hence $A^c$ is weakly open, and therefore $A$ is weakly closed.
Theorem 5.2.3 (Heine–Borel Theorem/Weak Version) Any weakly compact set is weakly closed and weakly bounded. Any weakly closed subset of a weakly compact set is weakly compact. Proof Use the fact that every weak topology is Hausdorff.
5.2.4 Reflexive Spaces Recall that a space X is said to be reflexive if it is isometrically isomorphic to its second dual X ∗∗ under a natural embedding. In what follows, we list theorems that are among the most important results in this theory. Theorem 5.2.4 A space X is reflexive if and only if its closed unit ball
$$B_X = \{x \in X : \|x\| \le 1\}$$
is weakly compact.

Proof This can be proved using the Banach–Alaoglu Theorem.
Theorem 5.2.5 Let X be normed space and A be a subset of X . Then the following statements are equivalent: (1) X is reflexive. (2) A is weakly compact if and only if A is weakly closed and weakly bounded. Proof Use Theorems: 5.2.2, 5.2.3, and 5.2.4.
Theorem 5.2.6 (Kakutani Theorem) If $X$ is reflexive, then every bounded sequence in $X$ has a weakly convergent subsequence.

Proof Let $Y = \overline{\mathrm{span}}\{x_n\} \subset X$. Then $Y$ is reflexive and separable. Now, use the Banach–Alaoglu Theorem.

The following are well-known results that can be easily proved using the above theorems:
(1) Every reflexive space is a Banach space.
(2) Every Hilbert space is reflexive.
(3) All $L^p$ spaces for $1 < p < \infty$ are reflexive.
(4) A closed subspace of a reflexive space is reflexive.

We provided only brief proofs for the theorems above. For details, the reader can consult volume 2 of this series, or alternatively any other textbook on functional analysis.

5.2.5 Weakly Lower Semicontinuity

In the weak topology, lower semicontinuity takes the following definition:

Definition 5.2.7 (Weak Lower Semicontinuous Mapping) A mapping $f : X \to (-\infty, \infty]$ is said to be weakly lower semicontinuous, denoted by w.l.s.c., at $x_0 \in D(f)$ if
$$f(x_0) \le \liminf_{n \to \infty} f(x_n)$$
for every sequence $x_n \in X$ converging weakly to $x_0$.

Remark We allow convex functions to take the value $\infty$.
Since convergence in norm implies weak convergence, it is evident that every w.l.s.c. mapping is indeed l.s.c. The converse, however, is not necessarily true, unless the two types of convergence are equivalent, which is the case in finite-dimensional spaces. The convexity property once again proves its power and efficiency. We first need to prove the following result.

Lemma 5.2.8 A function $f$ is convex if and only if epi$(f)$ is convex.

Proof Let $f$ be convex and suppose $(x, r), (y, s) \in \mathrm{epi}(f)$. Let $0 < \theta < 1$ and set $(z, t) = \theta (x, r) + (1 - \theta)(y, s)$. Then
$$t = \theta r + (1 - \theta) s \ge \theta f(x) + (1 - \theta) f(y) \ge f(\theta x + (1 - \theta) y) = f(z).$$
Hence $(z, t) \in \mathrm{epi}(f)$. Conversely, let epi$(f)$ be convex and suppose $x, y \in D(f)$. Then $(x, f(x)), (y, f(y)) \in \mathrm{epi}(f)$, and so
$$(\theta x + (1 - \theta) y, \ \theta f(x) + (1 - \theta) f(y)) = \theta (x, f(x)) + (1 - \theta)(y, f(y)) \in \mathrm{epi}(f).$$
But this implies f (θx + (1 − θ)y) ≤ θf (x) + (1 − θ)f (y).
Theorem 5.2.9 If $f$ is convex and l.s.c., then $f$ is w.l.s.c.

Proof Since $f$ is convex, by the preceding lemma epi$(f)$ is convex, and since $f$ is l.s.c., by Proposition 5.1.8 epi$(f)$ is closed. Hence by Mazur’s theorem (Theorem 5.2.2) epi$(f)$ is weakly closed, i.e., closed in the weak topology, and by Proposition 5.1.8 again, $f$ is l.s.c. in the weak topology.

The next result shows that if a sequence converges weakly, then there exist finite convex combinations of its terms that converge strongly to the weak limit of the sequence.

Lemma 5.2.10 (Mazur’s Lemma) Let $x_n \xrightarrow{w} x$ in a normed space $X$ with norm $\|\cdot\|_X$. Then for every $n \in \mathbb{N}$ there exist a corresponding $k = k(n)$ and an element
$$y_n = \sum_{j=n}^{k} \lambda_j x_j,$$
for some nonnegative coefficients $\lambda_j$ satisfying $\lambda_n + \lambda_{n+1} + \cdots + \lambda_k = 1$, such that $\|y_n - x\|_X \to 0$.

Proof Fix $n \in \mathbb{N}$, and define the set
$$C_n = \left\{ \sum_{j=n}^{k} \lambda_j x_j \ : \ k \ge n, \ \lambda_j \ge 0, \ \sum_{j=n}^{k} \lambda_j = 1 \right\}.$$
Then it is clear from Proposition 5.1.10(5) that $C_n$ is the convex hull of the set $\{x_j : j \ge n\}$, so it is convex, hence by Proposition 5.1.10(3) its closure $\overline{C_n}$ is convex and closed. By Mazur’s Theorem 5.2.2, this implies that $\overline{C_n}$ is weakly closed, and since the tail $(x_j)_{j \ge n} \subset C_n$ converges weakly to $x$, we must have $x \in \overline{C_n}$. Consequently, we can find a member $y_n \in C_n$ such that
$$\|y_n - x\|_X \le \frac{1}{n}. \qquad (5.2.1)$$
Since this holds for every $n$, we can take the limit as $n \to \infty$ in (5.2.1), which gives the desired result.
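Mazur's Lemma can be made concrete in $\ell^2$. The standard unit vectors $e_j$ converge weakly to 0 but $\|e_j\| = 1$ for all $j$; the convex combinations $y_n = \frac{1}{n}(e_n + \cdots + e_{2n-1})$, however, satisfy $\|y_n\| = 1/\sqrt{n} \to 0$. The Python sketch below computes these norms; the choice $k(n) = 2n - 1$ with equal weights, and the helper names `convex_combination` and `l2_norm`, are illustrative assumptions and only one of many admissible choices.

```python
import math


def convex_combination(n):
    """Coefficients of y_n = (1/n) * (e_n + ... + e_{2n-1}) as {index: weight}."""
    return {j: 1.0 / n for j in range(n, 2 * n)}


def l2_norm(coeffs):
    """l^2 norm of a finitely supported sequence given by its coefficients."""
    return math.sqrt(sum(c * c for c in coeffs.values()))


for n in (1, 4, 100, 10000):
    y = convex_combination(n)
    # The weights are nonnegative and sum to one, so y_n lies in Conv{e_j : j >= n}.
    assert all(w >= 0.0 for w in y.values())
    assert abs(sum(y.values()) - 1.0) < 1e-9
    print(f"n={n:6d}  ||y_n - 0|| = {l2_norm(y):.6f}  (1/sqrt(n) = {1 / math.sqrt(n):.6f})")
```

Since the $e_j$ are orthonormal, $\|y_n\|^2 = n \cdot (1/n)^2 = 1/n$, which is what the computed norms reproduce.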
5.3 Direct Method

5.3.1 Direct Versus Indirect Methods

Differential calculus is the most important tool used by mathematicians to study minimization and maximization problems. This classical method was begun by Euler in the 1750s, and it is considered today the standard technique. The procedure operates on the first and second derivatives of the function rather than on the function itself. More specifically, we find critical points using the first derivative, and then examine them using the first or the second derivative to determine whether these critical points are minimum, maximum, or saddle points. The method is therefore described as an “indirect method” in the sense that no direct work on the function takes place. In 1900,
Hilbert began research on these minimization problems during his investigation of the Dirichlet principle, which will be discussed in the next section. He soon found a nonconstructive proof in which only the existence of the minimizer can be provided, without finding the minimizer itself. The method was described as: “direct method ” because the procedure is implemented directly on the function without regard to its derivatives. The direct method is purely analytic, and many fundamental results of functional analysis were established during the attempts to modify the method or improve it. It is one of the greatest ideas that laid the foundations of modern analysis. The indirect calculus-based method is constructive, but it is too technical and restrictive in the sense that it may require several conditions on the regularity of the function and its derivatives to facilitate the task of implementing the method. On the other hand, the direct method is nonconstructive, but it is not too technical. It only needs a weak topology and some conditions on the function itself. One advantage of this method is that, under some conditions on the function and the space, it deals with general functionals defined on spaces, and can provide abstract existence theorems of minimizers, and consequently they can be applied to different problems and in various settings.
5.3.2 Minimizing Sequence

The main tool and the key to solving the problem is to start with minimizing sequences.

Definition 5.3.1 (Minimizing Sequence) Let $f$ be a functional defined on $A$. Then a minimizing sequence of $f$ on $A$ is a sequence $(x_n)$ such that $x_n \in A$ and
$$f(x_n) \to \inf_{x \in A} f(x).$$
Here are some observations: (1) The infimum of a function always exists, but it may be −∞ if f is unbounded from below. However, we usually assume or impose conditions to avoid this case. (2) By the definition of infimum, we can show that minimizing sequences always exist (verify). (3) The definition doesn’t say anything about the convergence of the sequence.
5.3.3 Procedure of Direct Method

Generally speaking, the method is based on the following main steps:
(1) Construct a minimizing sequence, whose values converge to the infimum of the function.
(2) Prove that a subsequence of it converges.
(3) Show that the limit is the minimizer.

Step 2 is the crucial one here. The general argument is to extract a subsequence of the minimizing sequence which converges in the weak topology. We will use the Kakutani Theorem (Theorem 5.2.6), which is a generalization of the Bolzano–Weierstrass Theorem. For step 1, the construction of the minimizing sequence comes naturally, i.e., from the definition of the infimum. One way of showing the finiteness of the infimum is to prove that the function is bounded from below, which implies that the infimum can never be $-\infty$. It is important to note that if the function is not proved to be bounded from below, its infimum may or may not be $-\infty$, and assuming that $f : X \to (-\infty, \infty]$ doesn’t help in this regard, because the infimum need not be a value taken by the function. For example, if $f(x) = e^{-x}$ then $f : \mathbb{R} \to (0, \infty)$ never takes the value 0 although $\inf f = 0$; similarly, $f(x) = -e^{x}$ never takes the value $-\infty$ although $\inf f = -\infty$. One advantage of defining the range of the function to be $(-\infty, \infty]$ is to guarantee that there is no $x_0 \in D(f)$ with $f(x_0) = -\infty$. So if we can show the existence of a minimizer $x_0$, then we have
$$\inf_{x \in A} f(x) = f(x_0) > -\infty.$$
In fact, a function that is convex and l.s.c. cannot take the value $-\infty$, since otherwise it would be nowhere finite (verify). Thus, it is important to avoid this situation by assuming the function to be proper. A function is called proper if it takes values in $(-\infty, \infty]$ and is not identically $\infty$. So, a proper functional $f$ means neither $f \equiv \infty$, nor does it take the value $-\infty$.

Remark Throughout, we use the assumption that $f$ is proper in all subsequent results.
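The role of a minimizing sequence, and why its convergence is a separate question, can be seen in a small numerical sketch. The Python code below is an illustration only: the functions `f` and `g` and the particular sequences are hypothetical choices. For $f(x) = e^{-x}$ the sequence $x_n = n$ is minimizing but unbounded, so no subsequence converges and no minimizer exists; for the coercive function $g(x) = (x - 1)^2$ the minimizing sequence $x_n = 1 + 1/n$ stays bounded and converges to the minimizer.

```python
import math


def f(x):
    """f(x) = exp(-x): bounded below by 0, but the infimum 0 is not attained."""
    return math.exp(-x)


def g(x):
    """g(x) = (x - 1)^2: coercive, with infimum 0 attained at x = 1."""
    return (x - 1.0) ** 2


# x_n = n is a minimizing sequence for f: f(x_n) -> 0 = inf f, yet x_n is unbounded.
xs_f = [float(n) for n in range(1, 41)]

# x_n = 1 + 1/n is a minimizing sequence for g, and it stays in a bounded set.
xs_g = [1.0 + 1.0 / n for n in range(1, 41)]

print("f along x_n = n:      ", [f(x) for x in xs_f[-3:]])
print("g along x_n = 1 + 1/n:", [g(x) for x in xs_g[-3:]])
print("largest |x_n| for f:", max(abs(x) for x in xs_f))
print("largest |x_n| for g:", max(abs(x) for x in xs_g))
```

This is the situation that the coercivity assumption of the next subsection is designed to rule out: it forces every minimizing sequence to be bounded.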
5.3.4 Coercivity

We will assume that our space $X$ is a reflexive Banach space in order to take advantage of the Banach–Alaoglu Theorem and its consequences, such as the Kakutani Theorem. We will also assume the function to be proper, convex, and l.s.c., and the admissible set to be bounded, closed, and convex. If the admissible set is bounded, then any sequence in the set is also bounded, so we can use the Kakutani Theorem if the space is a reflexive Banach space, and we extract a subsequence that converges weakly. If the set is not bounded, then we can control the boundedness of the sequence using the coercivity of the function. One equivalent variant of coercivity of $f$ is that
$$f(x_n) \to \infty \quad \text{if} \quad \|x_n\| \to \infty$$
for $x_n \in D(f)$. It follows that if $(x_n)$ is a minimizing sequence, so that $f(x_n) \to \inf f < \infty$, and $f$ is bounded from below, then $f(x_n) \not\to \infty$, and consequently $\|x_n\| \not\to \infty$; applying the same reasoning to every subsequence proves that $(x_n)$ is bounded. We can look at it from a different point of view. Letting $z \in D(f)$ be such that $f(z) < \infty$, we can choose $r > 0$ large enough that $f(x) > f(z)$ whenever $\|x\| > r$. We can then exclude all these members $x$ by taking the following intersection:
$$M = D(f) \cap \overline{B_r(0)},$$
which is clearly a bounded set, is also closed if $D(f)$ is closed, and convex if $D(f)$ is convex, and the infimum of $f$ over $D(f)$ is the same as the infimum over $M$. We may therefore take our minimizing sequence $(x_n)$ inside the bounded set $M$, so it is bounded. If the space is a reflexive Banach space, then $M$ lies in a large fixed closed ball which is weakly compact, so we can extract a subsequence from $(x_n)$ that converges weakly to some member $x_0$. It remains to show that this limit is the minimizer and that it belongs to $M$, keeping in mind that
$$\inf_{M} f = \inf_{D(f)} f.$$
5.3.5 The Main Theorem on the Existence of Minimizers

Now we state the main result of the section, which solves the minimization problem.

Theorem 5.3.2 Let $X$ be a reflexive Banach space. If a functional $J : A \subset X \to (-\infty, \infty]$ is proper, convex, bounded from below, coercive, and l.s.c., and is defined on a set $A$ that is closed and convex in $X$, then there exists $u \in A$ such that
$$J[u] = \inf_{v \in A} J[v].$$
If $J$ is strictly convex, then the minimizer $u$ is unique.

Proof Since $J$ is bounded from below, let
$$\inf_{v \in A} J[v] = m > -\infty.$$
Let $(u_n)$ be a minimizing sequence, so that $J[u_n] \to m$, which implies that $J[u_n]$ is bounded, and so by coercivity $(u_n)$ is bounded in $A$. Since $X$ is reflexive, we use Theorem 5.2.6 to conclude that there exists a subsequence $(u_{n_j})$, for convenience called again $(u_n)$, which converges weakly to, say, $u \in X$. But since $A$ is closed and convex, by Mazur’s Theorem it is weakly closed, and therefore $u \in A$. Finally, since $J$ is convex and l.s.c., it is w.l.s.c., and so $J[u] \le \liminf J[u_n]$. It follows that
$$m \le J[u] \le \liminf J[u_n] \le m,$$
and therefore $J[u] = m > -\infty$. Now, suppose that $J$ is strictly convex, and suppose that $u_1, u_2 \in A$ are two distinct minimizers, so that $J[u_1] = J[u_2] = m$. Consider
$$u_0 = \frac{u_1 + u_2}{2},$$
which belongs to $A$ since $A$ is convex. Then by strict convexity,
$$J[u_0] = J\left[\frac{u_1 + u_2}{2}\right] < \frac{1}{2}\left(J[u_1] + J[u_2]\right) = m,$$
and this contradicts the fact that $m$ is the infimum. The proof is complete.
In light of the preceding theorem, we see that the property of being reflexive and Banach is very useful in establishing the existence of minimizers. We already proved in Theorem 3.5.3 that all Sobolev spaces $W^{k,p}(\Omega)$ are Banach, separable, and reflexive for $1 < p < \infty$, which justifies the great importance of Sobolev spaces in applied mathematics. Also, we notice the importance of the admissible set being convex, as this is needed for $J$ to be convex and for $u$ to be in the admissible set. Proving that a functional is w.l.s.c. could be the most challenging condition to satisfy. Once the functional is proved to be l.s.c. and convex, Theorem 5.2.9 can be used. We recall some basic results in analysis:

Proposition 5.3.3 Let $u_n, u \in L^p$, $1 \le p < \infty$. The following statements hold:
(1) If $u_n \to u$ in norm, then there exists a subsequence $(u_{n_j})$ of $(u_n)$ such that $u_{n_j} \to u$ a.e.
(2) If $(u_n)$ is bounded, then there exists a subsequence $(u_{n_j})$ of $(u_n)$ such that $u_{n_j} \to \liminf u_n$ a.e.
(3) If $u_n \xrightarrow{w} u$, then $(u_n)$ is bounded and
$$\|u\|_{L^p(\Omega)} \le \liminf \|u_n\|_{L^p(\Omega)}.$$

Proof (1) follows immediately from the convergence in measure. (2) is proved using the definition of infimum. (3) can be proved using the definition of weak convergence, noting that the sequence $\{f(u_n)\}$ is convergent for every bounded linear functional $f \in X^*$. Now, we use the Hahn–Banach Theorem to choose $f$ such that $\|f\| = 1$ and $f(u) = \|u\|$, so that $\|u\| = \lim f(u_n) \le \liminf \|u_n\|$.

The third statement is especially important and can be a very efficient tool in proving the weak lower semicontinuity property for functionals, as we shall see in the next section.
5.4 The Dirichlet Problem

5.4.1 Variational Integral

The goal of this section is to employ the direct method in establishing minimizers of some functionals. Then we proceed to investigate connections between these functionals and weak solutions of some elliptic PDEs. It turns out that there is a close relation between the weak formulation (4.2.5) of a PDE, of the form
$$B[u, v] - \langle f, v \rangle = 0,$$
and some integral functional. If we set $u = v$ in the bilinear form of the weak formulation, with the customary factor $\frac{1}{2}$ for a symmetric $B$, we obtain the following integral functional:
$$J[v] = \frac{1}{2} B[v, v] - \langle f, v \rangle.$$
It was observed that the minimizer of the functional $J$, i.e., the element $u_0$ that minimizes (locally or globally) the value of $J$, is the solution of the weak formulation, which in turn implies that $u_0$ is the weak solution of the associated PDE from which the bilinear form $B$ was derived, and given that $B$ is identified by an integral, the same applies to $J$. The corresponding functional $J$ is thus called a variational functional, or variational integral. The word “variational” here refers to the process of extremization of the functional, i.e., finding extrema and extremizers of a functional. As we mentioned earlier in the chapter, we are only interested in minimization problems. This observation of the link between the two concepts (i.e., weak solutions of PDEs and minimizers of their corresponding variational integrals) is old, and its roots go back to Gauss, but it was Dirichlet who formulated the problem for the Laplace equation and provided a framework for this principle, which is regarded as one of the greatest mathematical ideas that shaped the structure of modern analysis.
5.4.2 Dirichlet Principle

Consider the Dirichlet problem of the Laplace equation
$$\nabla^2 u = 0 \ \text{ in } \Omega, \qquad u = 0 \ \text{ on } \partial\Omega. \qquad (5.4.1)$$
Recall that the weak formulation takes the form
$$\int_\Omega \nabla u \cdot \nabla v \, dx = \int_\Omega f v \, dx,$$
with $f = 0$ for the Laplace equation. Letting $u = v$ on the left-hand side and inserting the factor $\frac{1}{2}$, we obtain the nonnegative functional
$$E[v] = \frac{1}{2} \int_\Omega |\nabla v|^2 \, dx.$$

Definition 5.4.1 (Dirichlet Variational Integral) The integral
$$E[v] = \frac{1}{2} \int_\Omega |\nabla v|^2 \, dx$$
is called the Dirichlet Integral.

Physically, $v$ refers to the electric potential and the integral $E$ stands for the energy of a continuous charge distribution; consequently, the integral is sometimes called the energy integral, or the Dirichlet energy. The problem of minimizing this functional is justified by a physical principle which asserts that all natural systems tend to minimize their energy. Now we introduce the Dirichlet principle, due to Dirichlet in 1876, which is regarded as a cornerstone of analysis and a landmark in the history of mathematics.
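Theorem 5.4.2 (Dirichlet Principle) Let $\Omega \subset \mathbb{R}^n$ be bounded, and consider the collection $A = \{ v \in C_c^2(\Omega) : v = 0 \text{ on } \partial\Omega \}$. A function $u \in C_c^2(\Omega) \cap C^0(\overline{\Omega})$ is a solution to problem (5.4.1) if and only if $u$ is the minimizer over $A$ of the Dirichlet integral
\[ E(u) = \frac{1}{2} \int_\Omega |\nabla u|^2 \, dx . \]

Proof Note that for $u, v \in C_c^2(\Omega)$, integration by parts (the divergence theorem) gives
\[ \int_\Omega \nabla u \cdot \nabla v \, dx = \big[ v \nabla u \big]_{\partial\Omega} - \int_\Omega v \nabla^2 u \, dx = - \int_\Omega v \nabla^2 u \, dx . \tag{5.4.2} \]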
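Now, let $u \in C_c^2(\Omega)$ be a solution to the Laplace equation, and let $v \in A$. Then by (5.4.2),
\[ 0 = \int_\Omega (\nabla^2 u) v \, dx = - \int_\Omega \nabla u \cdot \nabla v \, dx , \]
so
\begin{align*}
\int_\Omega |\nabla (u + v)|^2 \, dx &= \int_\Omega |\nabla u|^2 \, dx + 2 \int_\Omega \nabla u \cdot \nabla v \, dx + \int_\Omega |\nabla v|^2 \, dx \\
&= \int_\Omega |\nabla u|^2 \, dx + \int_\Omega |\nabla v|^2 \, dx \\
&\ge \int_\Omega |\nabla u|^2 \, dx
\end{align*}
for an arbitrary $v$ in $A$. Multiplying both sides of the inequality by $\tfrac{1}{2}$, we conclude that $u$ is a minimizer of $E$. Conversely, assume that the functional $E$ has the minimizer $u$, so that $E(u) \le E(v)$ for all $v \in A$. Let $v \in A$ and $t \in \mathbb{R}$. Then $E(u) \le E(u + tv)$. Define the function $h : \mathbb{R} \to \mathbb{R}$ by $h(t) = E(u + tv)$; then its derivative takes the form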
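\begin{align*}
h'(t) &= \frac{d}{dt} \, \frac{1}{2} \int_\Omega |\nabla (u + tv)|^2 \, dx \\
&= \frac{d}{dt} \, \frac{1}{2} \int_\Omega \left( |\nabla u|^2 + 2t \, \nabla u \cdot \nabla v + t^2 |\nabla v|^2 \right) dx .
\end{align*}
Note that the derivative is taken in $t$ while the integration is in $x$, and since $u \in C_c^2(\Omega) \cap C^0(\overline{\Omega})$ we can take the derivative inside the integral, and so
\[ h'(t) = \int_\Omega \nabla u \cdot \nabla v \, dx + t \int_\Omega |\nabla v|^2 \, dx . \]
Since $h$ has a minimum at $0$, we have
\[ 0 = h'(0) = \int_\Omega \nabla u \cdot \nabla v \, dx . \]
Again, using (5.4.2) and noting that $v = 0$ on $\partial\Omega$ gives
\[ 0 = - \int_\Omega v \nabla^2 u \, dx \]
for every $v \in A$. By Theorem 3.3.4, we conclude that $u$ is a solution to the Laplace equation.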
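The energy splitting used in the first half of the proof can be checked on a one-dimensional grid. The following is only a minimal sketch under stated assumptions: the interval (0, 1), the grid size N, the nonzero boundary values of u (chosen so that the harmonic function is not identically zero, unlike in problem (5.4.1)), and all function names are illustrative choices, not part of the text.

```python
import math

def dirichlet_energy(values, h):
    """Discrete analogue of E[v] = 1/2 * integral of |v'|^2 on a uniform grid."""
    return 0.5 * sum((b - a) ** 2 for a, b in zip(values, values[1:])) / h

N = 200                      # number of subintervals (assumed for illustration)
h = 1.0 / N
xs = [i * h for i in range(N + 1)]

# Discrete harmonic function on (0, 1): it is linear, so its second differences vanish.
u = [2.0 + 3.0 * x for x in xs]
# Admissible variation: it vanishes at both endpoints.
v = [math.sin(math.pi * x) for x in xs]
v[0] = v[-1] = 0.0           # remove rounding residue at the boundary

E_u = dirichlet_energy(u, h)
E_v = dirichlet_energy(v, h)

def perturbed_energy(t):
    """Energy of the competitor u + t v, which has the same boundary values as u."""
    return dirichlet_energy([a + t * b for a, b in zip(u, v)], h)

for t in (-1.0, -0.1, 0.1, 1.0):
    # E(u + t v) = E(u) + t^2 E(v), because the cross term vanishes.
    print(t, perturbed_energy(t) - E_u, t * t * E_v)
```

The two printed columns should agree up to rounding, which is the discrete counterpart of the identity obtained from (5.4.2): every competitor with the same boundary values has energy at least E(u).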
5.4.3 Weierstrass Counterexample

The principle gives the equivalence between the two famous problems. Based on this principle, a harmonic function solving the Dirichlet problem exists if and only if a minimizer of the Dirichlet integral exists. It is well known that harmonic functions, i.e., solutions of Laplace's equation, exist. For problem (5.4.1) itself the situation is simple: if $\nabla^2 u = 0$ in $\Omega$ and $u = 0$ on $\partial\Omega$, then taking $v = u$ in (5.4.2) gives $\int_\Omega |\nabla u|^2 \, dx = 0$, so $\nabla u = 0$ and $u$ is constant; since $u = 0$ on the boundary, we must have $u = 0$ in the entire region, and so the energy integral equals zero. However, that wasn't the end of the story. Going the other way around, mathematicians in the late nineteenth century started to wonder: does there exist a minimizer for the integral? Riemann, a pupil of Dirichlet, argued that the minimizer of the integral exists since
\[ E \ge 0 , \]
so that a greatest lower bound exists, from which he incorrectly concluded that this bound must be attained. In 1869, Weierstrass provided the following example of a functional with an infimum but with no minimizer, i.e., the functional doesn't attain its infimum.

Example 5.4.3 (Weierstrass's Example) Consider the integral
\[ E[v] = \int_{-1}^{1} \big( x \, v'(x) \big)^2 \, dx , \]
where $v \in A = \{ v \in C^2([-1, 1]) : v(-1) = -1, \ v(1) = 1 \}$. Clearly $E \ge 0$. Define on $[-1, 1]$ the sequence
\[ u_n(x) = \frac{\tan^{-1} nx}{\tan^{-1} n} . \]
Then it is easy to see that $E[u_n] \to 0$, and so we conclude that
\[ \inf_{u \in A} E[u] = 0 . \]
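The convergence $E[u_n] \to 0$ can be checked numerically. The sketch below is illustrative only: the midpoint rule, the number of cells, and the function names are assumptions made here, and the closed form is obtained with the substitution $s = nx$, which gives $E[u_n] = \big(\tan^{-1} n - n/(1+n^2)\big) / \big(n (\tan^{-1} n)^2\big)$.

```python
import math

def u_n_prime(n, x):
    """Derivative of u_n(x) = arctan(n x) / arctan(n)."""
    return n / ((1.0 + (n * x) ** 2) * math.atan(n))

def energy_numeric(n, cells=200_000):
    """Midpoint-rule approximation of E[u_n], the integral over [-1, 1] of (x u_n'(x))^2."""
    h = 2.0 / cells
    total = 0.0
    for i in range(cells):
        x = -1.0 + (i + 0.5) * h
        total += (x * u_n_prime(n, x)) ** 2
    return total * h

def energy_exact(n):
    """Closed form of E[u_n] obtained with the substitution s = n x."""
    a = math.atan(n)
    return (a - n / (1.0 + n * n)) / (n * a * a)

for n in (1, 10, 100, 1000):
    print(n, energy_numeric(n), energy_exact(n))
```

The values decrease roughly like $2/(\pi n)$, so they approach the infimum $0$ without ever reaching it.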
Now, if there exists $u \in A$ such that $E[u] = 0$, then $x \, u'(x) = 0$ on $[-1, 1]$, so $u' = 0$, which implies that $u$ is constant, but this contradicts the boundary conditions. The example came as a shock to the mathematical community at that time, and the credibility of the Dirichlet principle became questionable. Several mathematicians set out to rebuild confidence in the principle by providing a rigorous proof of the existence of the minimizer of the Dirichlet integral. In 1904, Hilbert provided a "long and complicated" proof of the existence of the minimizer of $E$ in $C^2$ using the technique of minimizing sequences. After the rise of functional analysis, it soon became clear that the key to a fully accessible proof lies in the direct method. The Dirichlet principle is valid on $C^2$, but the existence of the minimizer cannot be guaranteed unless we work in a reflexive Banach space, and $C^2$ is not one. The best idea was to enlarge the space to the completion of $C^2$, which is nothing but the Sobolev space. It comes as no surprise that defining Sobolev spaces as the completion of smooth functions $C^\infty$ was for a long time the standard definition of these spaces. Furthermore, according to Theorem 3.5.3, Sobolev spaces are reflexive Banach spaces for $1 < p < \infty$. It turns out that Sobolev spaces are the perfect spaces to use in order to handle these problems in light of the direct method that we discussed in Sect. 5.3. This was among the main reasons that motivated Sergei Sobolev to develop a theory to construct these spaces, and this justifies Definition 3.7.4 from a historical point of view.
5.5 Dirichlet Principle in Sobolev Spaces

5.5.1 Minimizer of the Dirichlet Integral in $H_0^1$

The next two theorems solve the minimization problem of the Dirichlet integral over two different admissible sets in a simple manner by applying Theorem 5.3.2.

Theorem 5.5.1 Let $\Omega$ be bounded in $\mathbb{R}^n$. There exists a unique minimizer of the Dirichlet integral over $A = H_0^1(\Omega)$.

Proof It is clear that $E$ is bounded from below by $0$, and so $E$ has a finite infimum. Moreover, $E$ is coercive by Poincaré's inequality (or by considering the Poincaré norm on $H_0^1$). Let $u_n \rightharpoonup u$ in $H_0^1(\Omega)$. Then $u_n \rightharpoonup u$ and $Du_n \rightharpoonup Du$ in $L^2(\Omega)$. By Proposition 5.3.3(3),
\[ \| Du \|_2 \le \liminf \| Du_n \|_2 , \]
which implies
\[ E[u] \le \liminf E[u_n] , \tag{5.5.1} \]
and so $E[\cdot]$ is w.l.s.c. Finally, it can be easily proved that $E[\cdot]$ is strictly convex due to the strict convexity of $|\cdot|^2$ and the linearity of the integral. We therefore see that $E$ is bounded from below, coercive, w.l.s.c., and strictly convex, and it is defined on the reflexive Banach space $H_0^1(\Omega)$. The result now follows from Theorem 5.3.2.
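The strict convexity claimed in the proof can be made explicit. The following worked identity is a sketch, assuming $u, v \in H_0^1(\Omega)$ and $\theta \in (0, 1)$; it uses only the pointwise identity $|\theta a + (1-\theta) b|^2 = \theta |a|^2 + (1-\theta) |b|^2 - \theta (1-\theta) |a - b|^2$ for vectors $a, b \in \mathbb{R}^n$.

```latex
\begin{align*}
E[\theta u + (1-\theta) v]
  &= \frac{1}{2} \int_\Omega |\theta Du + (1-\theta) Dv|^2 \, dx \\
  &= \theta E[u] + (1-\theta) E[v]
     - \frac{\theta (1-\theta)}{2} \int_\Omega |Du - Dv|^2 \, dx .
\end{align*}
```

If $u \neq v$ in $H_0^1(\Omega)$, then $\int_\Omega |Du - Dv|^2 \, dx > 0$ by Poincaré's inequality, so $E[\theta u + (1-\theta) v] < \theta E[u] + (1-\theta) E[v]$.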
5.5.2 Minimizer of the Dirichlet Integral in $H^1$

Now, we investigate the same equation, but with $u = g$ on $\partial\Omega$ this time. Here we assume $g \in H^1(\Omega)$, and consequently we also have $u \in H^1(\Omega)$. This boundary condition can be interpreted by saying that the functions $u$ and $g$ have the same trace on the boundary. Since both functions are Sobolev functions, they are only defined up to sets of measure zero, so their pointwise values on the boundary, which is such a set, have no effect. To avoid this problematic issue on the boundary, we reformulate the condition in the form $u - g \in H_0^1(\Omega)$. Accordingly, the admissible set consists of all functions in $H^1(\Omega)$ that are equal, in the trace sense, to a fixed function $g \in H^1(\Omega)$ on $\partial\Omega$. The following result proves an interesting property of such a set.

Proposition 5.5.2 For a fixed $g \in H^1(\Omega)$, the admissible set
\[ A = \{ u \in H^1(\Omega) : u - g \in H_0^1(\Omega) \} \]
is weakly closed.

Proof Notice that the set $A$ is convex. Indeed, let $u, v \in A$, $\theta \in [0, 1]$, and $w = \theta u + (1 - \theta) v$. We clearly have $w \in H^1(\Omega)$, which is a linear space. We also have
\[ w - g = \theta u + (1 - \theta) v - g = \theta (u - v) + v - g = \theta (u - g) - \theta (v - g) + (v - g) \in H_0^1(\Omega) , \]
since $u - g, \ v - g \in H_0^1(\Omega)$, which is a linear space. Hence, $w \in A$. Further, let $u_n \in A$ with $u_n \to u$ in $H^1(\Omega)$. Then $u_n - g \in H_0^1(\Omega)$, which is a complete space, so $\lim (u_n - g) = u - g \in H_0^1(\Omega)$, hence $u \in A$. We conclude that $A$ is closed, and, being convex, it is therefore weakly closed.

The next result is a variant of Theorem 5.5.1 for the space $H^1(\Omega)$. The source of difficulty here is that the Poincaré inequality is not directly applicable.

Theorem 5.5.3 For a bounded set $\Omega$ in $\mathbb{R}^n$ and a fixed $g \in H^1(\Omega)$, there exists a unique minimizer of the Dirichlet integral over the set $A = \{ u \in H^1(\Omega) : u - g \in H_0^1(\Omega) \}$.

Proof In view of the proof of Theorem 5.5.1, it suffices to show that a minimizing sequence is bounded. If $u_j \in A$ is a minimizing sequence, so that $\sup_j E[u_j] < \infty$, then $\sup_j \| Du_j \|_{L^q} < \infty$ (here $q = 2$). We also have
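$u_j - g \in W_0^{1,q}(\Omega)$, so using Poincaré's inequality we have
\begin{align*}
\| u_j \|_{L^q} &= \| u_j - g + g \|_{L^q} \\
&\le \| u_j - g \|_{L^q} + \| g \|_{L^q} \\
&\le C \, \| D(u_j - g) \|_{L^q} + \| g \|_{L^q} \\
&= C \, \| Du_j - Dg \|_{L^q} + C_1 \\
&\le C_2 .
\end{align*}
Hence,
\[ \sup_j \| u_j \|_{L^q} < \infty , \]
and consequently, $(u_j)$ is bounded in $W^{1,q}(\Omega)$.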
5.5.3 Dirichlet Principle

Now we are ready to prove the Dirichlet principle in the general setting of Sobolev spaces.

Theorem 5.5.4 Let $\Omega \subset \mathbb{R}^n$ be bounded. A function $u \in A = H_0^1(\Omega)$ is a weak solution to problem (5.4.1) if and only if $u$ is the minimizer over $A$ of the Dirichlet integral.

Proof If $u \in A$ is a weak solution to the Laplace equation, then $u$ satisfies the weak formulation of the Laplace equation, namely,
\[ \int_\Omega Du \cdot Dv \, dx = 0 \quad \text{for all } v \in A . \]
Let $w \in A$, which can be written as $w = u + v$ for some $v \in A$. Then, we have
\begin{align*}
\int_\Omega |D(u + v)|^2 \, dx &= \int_\Omega |Du|^2 \, dx + 2 \int_\Omega Du \cdot Dv \, dx + \int_\Omega |Dv|^2 \, dx \\
&= \int_\Omega |Du|^2 \, dx + \int_\Omega |Dv|^2 \, dx \\
&\ge \int_\Omega |Du|^2 \, dx ,
\end{align*}
and so $u$ is a minimizer of $E[\cdot]$. Conversely, let
\[ E(u) \le E(v) \]
for all $v \in A$. Theorem 4.5.2 proved the existence and uniqueness of the weak solution of the problem (with $f = 0$), and it was proved above that this weak solution is a minimizer of the variational integral $E[\cdot]$. Since Theorem 5.5.1 proved the uniqueness of the minimizer, $u$ must coincide with the weak solution, and so the other direction is proved.
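5.5.4 Dirichlet Principle with Neumann Condition

The next result discusses the Dirichlet principle under a Neumann boundary condition,
\[ \begin{cases} -\nabla^2 u = 0 & x \in \Omega, \\[2pt] \dfrac{\partial u}{\partial n} = g & x \in \partial\Omega, \end{cases} \tag{5.5.2} \]
for a bounded Lipschitz domain $\Omega \subset \mathbb{R}^n$ and $g \in C^1(\partial\Omega)$. It can be seen from the problem that the solution is unique up to a constant. Let $u \in C^2(\overline{\Omega})$ be a classical solution of problem (5.5.2). Multiply the equation by $v \in C^2(\overline{\Omega})$ and integrate over $\Omega$; then, using Green's formula,
\[ 0 = \int_\Omega v \nabla^2 u \, dx = \int_{\partial\Omega} \frac{\partial u}{\partial n} v \, ds - \int_\Omega \nabla u \cdot \nabla v \, dx , \]
or
\[ \int_\Omega \nabla u \cdot \nabla v \, dx = \int_{\partial\Omega} \frac{\partial u}{\partial n} v \, ds , \tag{5.5.3} \]
which gives the corresponding variational integral
\[ J[v] = E[v] - \int_{\partial\Omega} g v \, ds . \tag{5.5.4} \]
Now, letting $v = 1$ and substituting in (5.5.3) gives
\[ \int_{\partial\Omega} \frac{\partial u}{\partial n} \, ds = 0 . \]
This is a compatibility condition, which is necessary for the problem to have a solution. As we did in Theorem 5.4.2, we will first prove the principle in the classical case.

Theorem 5.5.5 A function $u \in C^2(\overline{\Omega})$ is a solution of problem (5.5.2) if and only if $u$ is the minimizer of the variational integral (5.5.4) over
\[ A = \Big\{ v \in C^2(\overline{\Omega}) : \int_{\partial\Omega} \frac{\partial v}{\partial n} \, ds = 0 \Big\} . \]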
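Proof Note that the admissible space here is $C^2(\overline{\Omega})$, and so our functions $u, v$ don't necessarily vanish on $\partial\Omega$. Let $u \in C^2(\overline{\Omega})$ be a solution of problem (5.5.2). Let $w \in A$, and write $w = u - v$ for some $v \in C^2(\overline{\Omega})$. Then, since $g = \partial u / \partial n$ on $\partial\Omega$,
\begin{align*}
J[w] = J[u - v] &= \frac{1}{2} \int_\Omega |\nabla (u - v)|^2 \, dx - \int_{\partial\Omega} (u - v) g \, ds \\
&= \frac{1}{2} \int_\Omega |\nabla u|^2 \, dx + \frac{1}{2} \int_\Omega |\nabla v|^2 \, dx - \int_\Omega \nabla u \cdot \nabla v \, dx - \int_{\partial\Omega} g u \, ds + \int_{\partial\Omega} \frac{\partial u}{\partial n} v \, ds .
\end{align*}
Note that if $v$ is a constant, then $J[w] = J[u]$. Substituting $v = c$ in the above equality, we see that the condition
\[ \int_{\partial\Omega} \frac{\partial u}{\partial n} \, ds = 0 \]
is verified, and so $u \in A$. Using the first Green's formula in the two integrals on the RHS that involve both $u$ and $v$ yields
\[ - \int_\Omega \nabla u \cdot \nabla v \, dx + \int_{\partial\Omega} \frac{\partial u}{\partial n} v \, ds = \int_\Omega v \nabla^2 u \, dx = 0 . \]
Substituting above gives
\[ J[w] = J[u] + \frac{1}{2} \int_\Omega |\nabla v|^2 \, dx \ge J[u] , \]
hence $u$ is the minimizer of $J$. Conversely, let $u$ be a minimizer of $J$. Let $v \in A$ and $t \in \mathbb{R}$. Then $J(u) \le J(u + tv)$. So if we define the function $h : \mathbb{R} \to \mathbb{R}$ by
\begin{align*}
h(t) = J(u + tv) &= \frac{1}{2} \int_\Omega |\nabla (u + tv)|^2 \, dx - \int_{\partial\Omega} g (u + tv) \, ds \\
&= \frac{1}{2} \int_\Omega |\nabla u|^2 \, dx + t \int_\Omega \nabla u \cdot \nabla v \, dx + \frac{1}{2} t^2 \int_\Omega |\nabla v|^2 \, dx - \int_{\partial\Omega} g u \, ds - t \int_{\partial\Omega} g v \, ds ,
\end{align*}
then it is clear that $h$ has a minimum value at $0$. If $h$ is differentiable at $0$ then $h'(0) = 0$. This gives
\[ 0 = h'(0) = \int_\Omega \nabla u \cdot \nabla v \, dx - \int_{\partial\Omega} g v \, ds . \]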
Applying Green's formula again gives
\[ 0 = \int_{\partial\Omega} \frac{\partial u}{\partial n} v \, ds - \int_\Omega v \nabla^2 u \, dx - \int_{\partial\Omega} g v \, ds , \]
or
\[ \int_\Omega v \nabla^2 u \, dx = \int_{\partial\Omega} \frac{\partial u}{\partial n} v \, ds - \int_{\partial\Omega} g v \, ds . \tag{5.5.5} \]
Remember that we are yet to prove that $u$ is a solution of the Laplace equation, so we cannot say that $\nabla^2 u = 0$. The trick here is to reduce the admissible space by adding a suitable condition, namely $v = 0$ on $\partial\Omega$. This gives the following reduced admissible space
\[ A_0 = \{ v \in C_c^2(\Omega) \} \subset A = \{ v \in C^2(\overline{\Omega}) \} . \]
So, if (5.5.5) holds for all $v \in A$ then it holds for all $v \in A_0$, i.e., with $v = 0$ on $\partial\Omega$, and consequently the integrals on the RHS of (5.5.5) vanish and we thus obtain
\[ \int_\Omega v \nabla^2 u \, dx = 0 \]
for every $v \in A_0$, from which we conclude that $\nabla^2 u = 0$. Now, it remains to show that $u$ satisfies the boundary condition. Getting back to our admissible space $A = \{ v \in C^2(\overline{\Omega}) \}$, and since the left-hand side of (5.5.5) is zero, this gives
\[ 0 = \int_{\partial\Omega} \Big( \frac{\partial u}{\partial n} - g \Big) v \, ds \]
for all $v \in A$. By the Fundamental Lemma of the calculus of variations, we get the Neumann boundary condition, and the proof is complete.

It is important to observe how the variational integral changes its formula although it corresponds to the same equation, namely, the Laplace equation. It is therefore essential in this type of problem to determine the admissible set in which the candidate functions compete, since changing the boundary conditions usually causes a change in the corresponding variational integral.
5.5.5 Dirichlet Principle with Neumann B.C. in Sobolev Spaces

We end the section with the Dirichlet principle with Neumann condition over Sobolev spaces.

Theorem 5.5.6 A function $u \in H^1(\Omega)$, for bounded $\Omega \subset \mathbb{R}^n$, is a weak solution of problem (5.5.2) if and only if $u$ is the minimizer of the variational integral (5.5.4) over
\[ A = \Big\{ v \in H^1(\Omega) : \int_{\partial\Omega} \frac{\partial v}{\partial n} \, ds = 0 \Big\} . \]

Proof The proof that a weak solution is a minimizer is quite similar to the argument for the preceding theorem. Conversely, let $u$ be a minimizer of $J$. It has been shown that problem (5.5.2) has a unique weak solution (see Problem 4.11.15(a)), and the argument above showed that a weak solution is a minimizer of the variational integral (5.5.4), so it suffices to prove that there exists a unique minimizer for (5.5.4), but this is true since $J$ is strictly convex.

We end the section with the following important remark. In the proof of the preceding theorem, to show that a minimizer is a weak solution of the problem, one may argue as in the proof of Theorem 5.5.5. Namely, let $t \in \mathbb{R}$, and define the function
\[ h(t) = J(u + tv) = \frac{1}{2} \int_\Omega |\nabla (u + tv)|^2 \, dx - \int_{\partial\Omega} (u + tv) g \, ds . \]
Again, $h$ has a minimum at $0$, and so $h'(0) = 0$. Differentiating both sides, then substituting $t = 0$, gives
\[ 0 = h'(0) = \int_\Omega \nabla u \cdot \nabla v \, dx - \int_{\partial\Omega} g v \, ds . \tag{5.5.6} \]
Hence, $u$ satisfies the weak formulation (5.5.3). Although the argument seems valid, there is a technical issue with it. Generally speaking, moving the derivative inside the integral when one of the functions in the integrand is not smooth can be problematic, and this operation should be performed with extra care. We should also note that differentiating $h$ presupposes that the integral functional $J$ is differentiable in some sense and that the two derivatives are equal. The next section elaborates on this point and legitimizes the previous argument by introducing a generalization of the notion of derivative that can be applied to functionals defined on infinite-dimensional spaces.
5.6 Gateaux Derivatives of Functionals

5.6.1 Introduction

In Sect. 5.3, we discussed the direct method and the indirect method and compared them. The former is based on functional analysis, whereas the latter is based on calculus. Two reasons for choosing the direct method are:
1. discussing the direct method fits the objective and scope of this book, and
2. we didn't have tools to differentiate the variational integrals $E[\cdot]$ and $J[\cdot]$.
In this section, we introduce the notion of differentiability of functionals and apply it to our variational integrals. As the title of the section suggests, we are merely concerned with the idea of differentiating the variational integrals; the topic of differential calculus on metric function spaces is beyond the scope of the book.

5.6.2 Historical Remark

One of the main reasons that caused the direct method to appear and thrive was the lack, at that time, of the differential calculus required to deal with functionals defined on function spaces. Hilbert and his pupils, as well as their contemporaries, didn't have the calculus machinery to deal with these "functionals". The theory of differential calculus in function and metric spaces was developed by Maurice Fréchet and René Gateaux, both of whom were still students when Hilbert published his direct method in 1900. The first work to appear in this differential calculus was Fréchet's thesis in 1906, but the theory wasn't clear and rich enough to use at that time. Gateaux started to work on the theory in 1913, and his work was not published until 1922. The theory of differential calculus in metric and Banach spaces soon started to attract attention, and it became an indispensable tool in the calculus of variations due to its efficiency and richness in techniques. The present section gives a brief introduction to the theory. We will not give a comprehensive treatment, but rather highlight the important results that suffice for our needs in this chapter, and show how calculus can enrich the area of variational methods and simplify the task of solving minimization problems.
5.6.3 Gateaux Derivative

Recall that the directional derivative of a function $f : \mathbb{R}^n \to \mathbb{R}$ at $x \in \mathbb{R}^n$ in the direction of $v$ is given by
\[ D_v f(x) = \lim_{t \to 0} \frac{f(x + tv) - f(x)}{t} . \tag{5.6.1} \]
To extend this derivative to infinite dimensions in a general normed space $X$, we let $D_v f(x) = Df(x; v) : X \times X \to \mathbb{R}$ be given by the limit definition (5.6.1). If the limit above exists, then the quantity $D_v f(x)$ is called the Gateaux differential of $f$ at $x$ in the direction of $v \in X$, or simply the G-differential. So at each single point, there are two G-differentials in one dimension and infinitely many of them in two or more dimensions. Let us take a closer look at the differential in (5.6.1). Let $x_0 \in X$, for some normed space $X$, be a fixed point at which the G-differential in the direction of $v = x - x_0$, for some $x \in X$, exists. Then for every $t > 0$, we have
\[ | f(x) - f(x_0) - Df(x_0; x - x_0) | \le t \, \| x - x_0 \| . \]
In order for the above inequality to make sense, it is required that $Df(x_0; x - x_0) \in \mathbb{R}$. On the other hand, if $f$ in (5.6.1) is a mapping defined on $\mathbb{R}^n$, then the G-differential reduces to
\[ Df(x_0; v) = \nabla f(x_0) \cdot v . \tag{5.6.2} \]
It is well known in linear algebra that a function $f : \mathbb{R}^n \to \mathbb{R}$ is linear if and only if there exists $v \in \mathbb{R}^n$ such that $f(x) = x \cdot v$ for every $x \in \mathbb{R}^n$; hence, when we extend to an infinite-dimensional space $X$, the dot product $\nabla f(x_0) \cdot v$ gives rise to a functional $Df(x_0; \cdot)$ from $X$ to $\mathbb{R}$, and we can alternatively use the inner product notation
\[ Df(x_0; v) = \langle Df(x_0), v \rangle . \tag{5.6.3} \]
Now, letting $v = x - x_0$ and substituting in (5.6.2) gives
\[ Df(x_0; v) = Df(x_0)(x - x_0) , \tag{5.6.4} \]
and since $x - x_0 \in X$ and $Df(x_0; v) \in \mathbb{R}$, the quantity $Df(x_0)$ must be a functional defined on $X$. This gives a good definition for the "Gateaux derivative" to exist and be defined on $X^* \times X$ rather than $X \times X$ as for the directional derivative in $\mathbb{R}^n$, but this requires the G-differential to exist in all directions $v \in X$, so that we can define $Df(x_0)$ as a functional on $X$. Moreover, the form (5.6.4) suggests that the derivative $Df(x_0; v)$ should be linear in $v$ and bounded. Indeed,
\[ | Df(x_0; v) | = | Df(x_0) v | \le \| Df(x_0) \| \, \| v \| , \]
and
\[ Df(x_0; v + w) = Df(x_0)(v + w) = Df(x_0) v + Df(x_0) w = Df(x_0; v) + Df(x_0; w) . \]
So we have $Df(x_0) \in X^*$, and $Df(x_0; v) : X^* \times X \to \mathbb{R}$. The functional $D_v f(x) = Df(x; v)$ is the Gateaux derivative of $f$ at $x$, and $f$ is said to be Gateaux differentiable at $x$. This motivates the following definition of the Gateaux derivative of functionals.

Definition 5.6.1 (Gateaux Derivative) Let $x_0 \in X$ for some normed space $X$, and consider the functional $f : X \to \mathbb{R}$. Define the mapping $(Df)(\cdot, \cdot) : X^* \times X \to \mathbb{R}$ by
\[ Df(x_0; v) = Df(x_0) v = \lim_{t \to 0} \frac{f(x_0 + tv) - f(x_0)}{t} \]
(i.e., $Df(x_0)$ is bounded and linear in $v$). If $Df(x_0; v)$ exists at $x_0$ in all directions $v \in X$, then $Df(x_0; v)$ is called the Gateaux derivative of $f$, and $f$ is said to be Gateaux differentiable at $x_0$.

Care must be taken in case the domain of $f$ is not the whole space $X$, as we must ensure the definition applies to points in the interior of $D(f)$. Furthermore, as we observe from the definition, it requires the Gateaux derivative to be bounded and linear in $v$. This is especially helpful in the calculus of variations to obtain results that are consistent with the classical theory of differential calculus.
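As a short worked instance of the definition, consider the squared norm on a real Hilbert space. This is only an illustrative sketch: the choice of a Hilbert space $X$ with inner product $\langle \cdot, \cdot \rangle$ and of the functional $f(u) = \|u\|^2$ is an assumption made here, not an example taken from the text.

```latex
\begin{align*}
\frac{f(u + tv) - f(u)}{t}
  &= \frac{\|u\|^2 + 2t \langle u, v \rangle + t^2 \|v\|^2 - \|u\|^2}{t}
   = 2 \langle u, v \rangle + t \|v\|^2 , \\
Df(u) v &= \lim_{t \to 0} \big( 2 \langle u, v \rangle + t \|v\|^2 \big)
   = 2 \langle u, v \rangle , \\
|Df(u) v| &\le 2 \|u\| \, \|v\| .
\end{align*}
```

The limit exists in every direction $v$, it is linear in $v$, and the last line (Cauchy-Schwarz) shows it is bounded, so $Df(u) \in X^*$ as the definition requires.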
5.6.4 Basic Properties of G-Derivative

The following proposition shows that the rules for the G-derivative operate much like those for the classical derivative.

Proposition 5.6.2 Let $f, g : X \to \mathbb{R}$ be G-differentiable on $X$, and let $c \in \mathbb{R}$. Then
(1) $Df(u, cv) = c \, Df(u, v)$.
(2) $D(cf) = c \, Df$.
(3) $D(f \pm g) = Df \pm Dg$.
(4) $D(f \cdot g) = f \, Dg + g \, Df$.
(5) $D_v(e^x) = v e^x$.

Proof Exercise.
5.6.5 G-Differentiability and Continuity

One significant difference between the classical derivative $f'$ and the G-derivative is that if a functional is differentiable at $x_0$ in the classical sense, then it is continuous at $x_0$. This is not the case for $Df(x_0, v)$. The following example demonstrates this fact.

Example 5.6.3 Consider the function $f : \mathbb{R}^2 \to \mathbb{R}$ given by
\[ f(x, y) = \begin{cases} \dfrac{x y^4}{x^2 + y^8} & (x, y) \neq (0, 0), \\[4pt] 0 & (x, y) = (0, 0). \end{cases} \]
It is easy to show that $f$ is not continuous at $(0, 0)$ by showing that the limit at $(0, 0)$ doesn't exist. However, let $v = (v_1, v_2) \in \mathbb{R}^2$. Then applying the limit definition at $x_0 = (0, 0)$ gives
\begin{align*}
\lim_{t \to 0} \frac{f(t v_1, t v_2)}{t} &= \lim_{t \to 0} \frac{1}{t} \, \frac{t v_1 (t v_2)^4}{(t v_1)^2 + (t v_2)^8} \\
&= \lim_{t \to 0} \frac{t^2 v_1 v_2^4}{v_1^2 + t^6 v_2^8} \\
&= 0 ,
\end{align*}
so $Df(0, 0) v = 0$ for all $v \in \mathbb{R}^2$, which implies that $Df(0, 0) = 0$. One way to explain the above example is that the G-derivative, from its very definition, doesn't require a norm on the space, and consequently it is not related to the convergence property of the space. This is why a functional can be G-differentiable at a point and discontinuous at that point.
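Both halves of Example 5.6.3 can be observed numerically. The sketch below is illustrative only; the sample directions, the step sizes, and the curve $x = y^4$ used to exhibit the discontinuity are choices made here for demonstration.

```python
def f(x, y):
    """The function of Example 5.6.3."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y**4 / (x**2 + y**8)

def difference_quotient(v1, v2, t):
    """(f(0 + t v) - f(0)) / t for the direction v = (v1, v2)."""
    return f(t * v1, t * v2) / t

directions = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (-2.0, 3.0)]
for v1, v2 in directions:
    quotients = [difference_quotient(v1, v2, 10.0 ** (-k)) for k in range(1, 6)]
    print((v1, v2), quotients)

# Along the curve x = y^4 the function equals 1/2, however close to the origin.
curve_values = [f(y**4, y) for y in (0.5, 0.1, 0.01)]
print(curve_values)
```

The difference quotients shrink to zero in every sampled direction, while the values along $x = y^4$ stay at $1/2$, so $f$ has a vanishing G-derivative at the origin without being continuous there.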
5.6.6 Frechet Derivative

A stronger form of derivative which ensures this property is the Frechet derivative, which results from the G-derivative if the convergence of the limit is uniform in all directions and does not depend on $v$, unlike the case of G-differentiability. Writing $x - x_0 = v$, the derivative $Df(x_0)$ then satisfies
\[ \lim_{v \to 0} \frac{| f(x) - f(x_0) - Df(x_0) v |}{\| v \|_X} = 0 , \]
where the norm $\| \cdot \|_X$ is the norm on the space $X$. Here, if the Frechet derivative exists at a point $x_0$, then the G-derivative exists at $x_0$ and they coincide. Moreover, the function is continuous at $x_0$. However, the G-derivative may exist while the F-derivative does not. That is why dealing with the "weaker" type of differentiation can be more flexible and realistic in many cases. Moreover, due to the norm and the uniform convergence, demonstrating Frechet differentiability is sometimes not an easy task, and evaluating the G-derivative is usually easier than evaluating the F-derivative. More importantly, since the G-derivative can differentiate discontinuous functions, it suits Sobolev functions and suffices for our needs in this regard, so we will confine the discussion to it throughout the chapter.
5.6.7 G-Differentiability and Convexity

It is well known in analysis that one way to interpret a convex function geometrically is that its graph lies above all of its tangent lines. This can be expressed by saying that
\[ f(x) - f(y) \ge f'(y)(x - y) \]
for all $x, y \in D(f)$. The following result extends the above fact to infinite-dimensional spaces.

Theorem 5.6.4 Let $f : \Omega \subseteq X \to \mathbb{R}$ be G-differentiable on a convex set $\Omega$ in a normed space $X$. Then $f$ is convex if and only if
\[ f(v) - f(u) \ge Df(u, v - u) \]
for all $u, v \in \Omega$.

Proof Let $f$ be convex. Then for $u, v \in \Omega$ and $t \in (0, 1]$,
\[ tv + (1 - t) u = u + t(v - u) \in \Omega , \]
and
\[ f(tv + (1 - t) u) \le t f(v) + (1 - t) f(u) = f(u) + t \big( f(v) - f(u) \big) , \]
which implies that
\[ \frac{f(tv + (1 - t) u) - f(u)}{t} \le f(v) - f(u) . \]
Passing to the limit as $t \to 0$ gives the first direction. Conversely, let $u, v \in \Omega$. Since $\Omega$ is convex,
\[ w = tv + (1 - t) u = u + t(v - u) \in \Omega . \]
We apply the inequality to the pair $w, u$, using $u - w = -t(v - u)$ as the direction, and apply it again to the pair $w, v$, using $v - w = (1 - t)(v - u)$ as the direction. Taking into account Proposition 5.6.2(1), this gives
\begin{align*}
f(u) - f(w) &\ge Df(w, -t(v - u)) = -t \, Df(w, v - u) , \tag{5.6.5} \\
f(v) - f(w) &\ge Df(w, (1 - t)(v - u)) = (1 - t) \, Df(w, v - u) . \tag{5.6.6}
\end{align*}
Multiplying (5.6.5) by $(1 - t)$ and (5.6.6) by $t$, then adding the two inequalities, gives
\[ f(u) + t \big( f(v) - f(u) \big) \ge f(w) , \]
which implies that the function is convex.
One important consequence is the following remarkable result, which may facilitate our efforts in demonstrating the lower semicontinuity of functionals, which in many cases can be a complicated task. According to the following result, this property is guaranteed if the functional is G-differentiable and convex.

Theorem 5.6.5 If $f : X \to \mathbb{R}$ is convex and G-differentiable at $x$, then $f$ is weakly lower semicontinuous at $x$.

Proof Let $u_n \rightharpoonup u$ in $X$. By Theorem 5.6.4, for each $n$ we have
\[ f(u_n) - f(u) \ge Df(u, u_n - u) = Df(u)(u_n - u) . \]
Since $Df(u) \in X^*$, the definition of weak convergence gives
\[ Df(u)(u_n - u) \to 0 . \]
Taking the liminf of both sides of the inequality above, we get
\[ \liminf f(u_n) - f(u) \ge 0 , \]
and the result follows.
5.6.8 Higher Gateaux Derivative

We can use the same discussion to define a second-order G-derivative. Let $Df(x_0) \in X^*$ be the G-derivative of a functional $f : X \to \mathbb{R}$. We define the G-derivative of $Df(x_0, v)$ in the direction of $w \in X$ as
\[ D^2 f(x_0, v, w) = \lim_{t \to 0} \frac{Df(x_0 + tw, v) - Df(x_0, v)}{t} . \]
This gives the second G-derivative of $f$ in the directions $v$ and $w$, taking the form
\[ (D^2 f(x_0) v) w = \langle D^2 f(x_0) v, w \rangle \in \mathbb{R} , \]
which defines a pairing on $X \times X$. The second G-derivative $D^2 f(x_0)$ also defines a continuous bilinear form $B : X \times X \to \mathbb{R}$ in the directions $v, w$, given by
\[ B[v, w] = (D^2 f(x_0) v) w . \]
This suggests the following Gateaux variant of Taylor's theorem.

Theorem 5.6.6 Let $f : X \to \mathbb{R}$, for some normed space $X$, be twice G-differentiable. Then for some $t_0$ between $0$ and $t$, we have
\[ f(u + tv) = f(u) + t \langle Df(u), v \rangle + \frac{t^2}{2} \langle D^2 f(u + t_0 v) v, v \rangle . \tag{5.6.7} \]

Proof Define the real-valued function $\varphi(t) = f(u + tv)$. Since $\varphi$ is twice differentiable, we can apply a second-order Taylor expansion on $(0, t)$ to obtain
\[ \varphi(t) = \varphi(0) + t \varphi'(0) + \frac{t^2}{2!} \varphi''(t_0) , \]
where $\varphi'(0) = \langle Df(u), v \rangle$ and $\varphi''(t_0) = \langle D^2 f(u + t_0 v) v, v \rangle$. This yields the expansion (5.6.7).
Theorem 5.6.6 will be used to characterize the convexity of functionals in connection with their first and second G-derivatives.

Theorem 5.6.7 Let $f : \Omega \subseteq X \to \mathbb{R}$ be twice G-differentiable on a convex set $\Omega$ in a normed space $X$. Then the following are equivalent:
(1) $f$ is convex.
(2) $f(v) - f(u) \ge Df(u, v - u)$ for all $u, v \in \Omega$.
(3) $\langle D^2 f(u) v, v \rangle \ge 0$ for all $u, v \in \Omega$.

Proof The equivalence between (1) and (2) has been proved in Theorem 5.6.4. The equivalence between (2) and (3) follows easily from (5.6.7).

The next theorem provides a procedure for finding the G-derivative of a general variational form.
Theorem 5.6.8 Let
\[ J[u] = \frac{1}{2} B[u, u] - L(u) \]
be the variational integral associated with some elliptic PDE, for some bounded symmetric bilinear form $B$ and bounded linear functional $L$. Then:
(1) $DJ(u) v = B[u, v] - L(v)$.
(2) $D^2 J(u, v) v = B[v, v]$.

Proof For (1), we have
\begin{align*}
DJ(u) v &= \lim_{t \to 0} \frac{J[u + tv] - J[u]}{t} \\
&= \lim_{t \to 0} \frac{1}{t} \Big( \frac{1}{2} B[u + tv, u + tv] - L(u + tv) - \frac{1}{2} B[u, u] + L(u) \Big) \\
&= \lim_{t \to 0} \frac{1}{t} \Big( t B[u, v] + \frac{1}{2} t^2 B[v, v] - t L(v) \Big) \\
&= B[u, v] - L(v) .
\end{align*}
For (2), we have
\begin{align*}
D^2 J(u, v) v &= \lim_{t \to 0} \frac{DJ(u + tv) v - DJ(u) v}{t} \\
&= \lim_{t \to 0} \frac{1}{t} \big[ B[u + tv, v] - L(v) - (B[u, v] - L(v)) \big] \\
&= \lim_{t \to 0} \frac{1}{t} \, t B[v, v] = B[v, v] .
\end{align*}
The arguments were fairly straightforward, but we preferred to write out the details due to the significance of the result. The functional $J$ in the theorem represents the variational functional from which an equivalent minimization problem can be defined for an elliptic PDE. One consequence of the above theorem is that the G-derivative of the variational functional is nothing but the weak formulation of the associated PDE. Moreover, the second G-derivative of the functional is the elliptic bilinear form $B$ evaluated at the new variation $v$. If $B$ is coercive, then the second G-derivative of $J$ is positive for all $v \neq 0$. Another consequence is that any critical point of $J$ is a weak solution of the PDE associated with the functional $J$. Note from Theorem 5.6.8(1) that the equation $DJ(u, v) = 0$ gives
\[ B[u, v] = L(v) , \]
which is the weak formulation of the equation associated with the variational integral $J$.
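Theorem 5.6.8 can be illustrated in finite dimensions, where a symmetric matrix plays the role of the bilinear form. The following is only a sketch under stated assumptions: the matrix A, the vector b, the sample points u and v, and all helper names are hypothetical choices made here, standing in for $B[u, v] = u^T A v$ and $L(v) = b^T v$.

```python
# Hypothetical finite-dimensional stand-in: B[u, v] = u^T A v, L(v) = b^T v.
A = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]      # symmetric positive definite
b = [1.0, 0.0, 1.0]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def matvec(M, x):
    return [dot(row, x) for row in M]

def B(u, v):
    return dot(u, matvec(A, v))

def L(v):
    return dot(b, v)

def J(u):
    return 0.5 * B(u, u) - L(u)

def gateaux_quotient(u, v, t):
    """Difference quotient (J[u + t v] - J[u]) / t."""
    shifted = [ui + t * vi for ui, vi in zip(u, v)]
    return (J(shifted) - J(u)) / t

u = [0.3, -1.2, 0.7]
v = [1.0, 2.0, -1.0]
print(gateaux_quotient(u, v, 1e-6), B(u, v) - L(v))   # the two values nearly agree

# Critical point: A u_star = b, i.e. B[u_star, v] = L(v) for every v.
u_star = [1.0, 1.0, 1.0]
print(matvec(A, u_star), B(u_star, v) - L(v))
```

At the critical point the first variation vanishes, and the increment of J in the direction v reduces to the second-order term $\tfrac{1}{2} B[v, v]$, in line with parts (1) and (2) of the theorem.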
5.6.9 Minimality Condition

We end the section with some calculus techniques for exploring minimizers of functionals using the first and second G-derivatives. The first concerns the critical points of a functional. The term "local minimum" carries the same meaning as in calculus: a local minimum point (or function) is an interior point at which the function/functional takes a minimum value relative to nearby points.

Theorem 5.6.9 Let $J : \Omega \subset X \to \mathbb{R}$ be a functional that is G-differentiable on a convex set $\Omega$.
(1) If $u \in \Omega$ is a local minimizer, then $DJ(u, v) = 0$ for all $v \in X$.
(2) If $J$ is convex, then $DJ(u, v) = 0$ for all $v \in X$ if and only if $u$ is a local minimizer of $J$. If $\Omega = X$ then $u$ is a global minimizer.
(3) If $J$ is strictly convex, then $u$ is the unique minimizer.

Proof We will only prove (1) and (2), leaving (3) as an easy exercise for the reader. (1): Let $u$ be a local minimizer of a G-differentiable functional $J$. Let $v \in X$ be any vector, and choose $t > 0$ small enough that $u + tv \in \Omega$ and
\[ J(u + tv) - J(u) \ge 0 . \]
Dividing by $t$ and letting $t \to 0^+$ gives
\[ DJ(u, v) \ge 0 . \]
Since this holds for arbitrary $v$, we can replace $v$ by $-v$, and using Proposition 5.6.2(1),
\[ -DJ(u, v) = DJ(u, -v) \ge 0 . \]
Hence $DJ(u, v) = 0$. (2): Assume $J$ is convex and $DJ(u, v) = 0$ for all $v$. Then for $v \in \Omega$, we have by Theorem 5.6.4
\[ J(v) - J(u) \ge DJ(u, v - u) = 0 , \]
so $u$ is a minimizer. The converse is (1).

In elementary calculus, it is well known that a critical point, at which $f' = 0$, is not necessarily a local minimum, as it may be a maximum or a saddle point. We need to apply the first derivative test or the second derivative test to check whether the point is an extremum. We will do the same here. The next result gives the corresponding second-order necessary condition: at a local minimizer, the second derivative of the functional is nonnegative.

Theorem 5.6.10 Let $J : \Omega \subset X \to \mathbb{R}$ be a twice G-differentiable functional on a convex set $\Omega$. If $u_0$ is a local minimizer for $J$, then
\[ D^2 J(u_0)(v, v) \ge 0 \quad \text{for all } v \in X . \]
Proof The result follows easily using Theorem 5.6.9(1) above and Taylor’s Formula (5.6.7).
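The step from Taylor's formula to the second-order condition can be written out as follows. This is a sketch under an extra assumption that is not stated in the text, namely that $s \mapsto \langle D^2 J(u_0 + s v) v, v \rangle$ is continuous at $s = 0$; here $u_0$ is the local minimizer, $v \in X$ is arbitrary, and $t > 0$ is small enough that $J(u_0 + tv) \ge J(u_0)$.

```latex
\begin{align*}
0 \le J(u_0 + tv) - J(u_0)
  &= t \, \langle DJ(u_0), v \rangle
     + \frac{t^2}{2} \langle D^2 J(u_0 + t_0 v) v, v \rangle
     && \text{by (5.6.7), with } 0 < t_0 < t \\
  &= \frac{t^2}{2} \langle D^2 J(u_0 + t_0 v) v, v \rangle
     && \text{since } DJ(u_0, v) = 0 \text{ by Theorem 5.6.9(1)} .
\end{align*}
```

Dividing by $t^2 / 2$ and letting $t \to 0$, so that $t_0 \to 0$ as well, gives $\langle D^2 J(u_0) v, v \rangle \ge 0$ under the stated continuity assumption.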
5.7 Poisson Variational Integral

5.7.1 Gateaux Derivative of Poisson Integral

In this section, we employ the tools and results of the previous section to establish some existence results and necessary conditions for minimizers. This "calculus-based" method seems more flexible than the direct method (which deals only with existence problems), and provides us with various tools from calculus. We pick the Poisson variational integral (which is a generalization of the Dirichlet integral) as our first example, since we have already established minimization results and the equivalence to weak solutions for the Laplace equation. Recall that the Poisson variational integral takes the form
\[ J[v] = \frac{1}{2} \int_\Omega |Dv|^2 \, dx - \int_\Omega f v \, dx . \tag{5.7.1} \]

Proposition 5.7.1 The Poisson variational integral $J : H^1(\Omega) \to \mathbb{R}$ given by (5.7.1) is G-differentiable.

Proof The G-derivative of $J$ at $u$ can be calculated as follows. Let $0 < t \le 1$. From (5.7.1), we write
\[ J[u + tv] = \frac{1}{2} \int_\Omega \left( |Du|^2 + t^2 |Dv|^2 + 2t \, Du \cdot Dv \right) dx - \int_\Omega f (u + tv) \, dx . \]
This implies
\[ \frac{J[u + tv] - J[u]}{t} = \int_\Omega \Big( \frac{t}{2} |Dv|^2 + Du \cdot Dv \Big) dx - \int_\Omega f v \, dx . \]
The first term of the integrand in the first integral is clearly dominated by $|Dv|^2$, which is integrable, so applying the Dominated Convergence Theorem gives
\[ \lim_{t \to 0} \int_\Omega \frac{t}{2} |Dv|^2 \, dx = \int_\Omega \lim_{t \to 0} \frac{t}{2} |Dv|^2 \, dx = 0 . \]
Therefore, the G-derivative of $J[\cdot]$ is
\[ DJ(u) v = \int_\Omega Du \cdot Dv \, dx - \int_\Omega f v \, dx . \]
We have two important observations.
(1) One advantage of using the G-derivative is that, in many cases (not all!), it is easy to exchange integration and limits. This is because the integration takes place with respect to $x$, whereas the limit is taken with respect to $t$, which enables us to find an integrable function that dominates our integrand, and so the Dominated Convergence Theorem can be applied.
(2) Observe that the G-derivative found in the preceding result has the same form as the one found in (5.5.6) (except possibly for the second integral term). It shows that by defining $h(t) = J(u + tv)$, we can evaluate the G-derivative of the functional by differentiating $h$. The Poisson variational functional is G-differentiable, and consequently the operation $h'(t)$ is legitimate and makes sense. Therefore, according to the results of the preceding section, if $u$ is a minimizer of a G-differentiable variational integral $J$, and we set $h(t) = J(u + tv)$, then
\[ h'(0) = DJ(u, v) . \]
In other words, we can think of the minimizer $u$ as a critical point of $J$.

Now, we discuss the results using the new tools that we learned in the previous sections. The first result is the Dirichlet principle of the equivalence between the minimization problem and the weak solution of the Poisson equation.

Theorem 5.7.2 Let $\Omega$ be bounded in $\mathbb{R}^n$ and $f \in L^2(\Omega)$. There exists a unique minimizer over $H_0^1(\Omega)$ of the Poisson variational integral (5.7.1). Furthermore, $u \in H_0^1(\Omega)$ is the weak solution of the Dirichlet problem for the Poisson equation
\[ \begin{cases} -\nabla^2 u = f & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega, \end{cases} \]
if and only if $u$ is the minimizer of the Poisson variational integral.

Proof For the first part of the theorem, note that $J$ is clearly strictly convex, coercive by the Poincaré inequality, G-differentiable by Proposition 5.7.1, and so weakly l.s.c. by Theorem 5.6.5. The result follows from Theorem 5.3.2. Now we prove the equivalence of the two problems. Let $u \in H_0^1(\Omega)$ be the weak solution of the Poisson problem. Let $v \in H_0^1(\Omega)$, and write $v = u - w$ for some $w \in H_0^1(\Omega)$. Then
\[ J[v] = J[u - w] = \frac{1}{2} \int_\Omega |\nabla (u - w)|^2 \, dx - \int_\Omega (u - w) f \, dx . \]
With simple calculations, this can be written as
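\[ J[v] = \frac{1}{2}\int_\Omega |\nabla u|^2\,dx - \int_\Omega fu\,dx + \frac{1}{2}\int_\Omega |\nabla w|^2\,dx - \int_\Omega \nabla u \cdot \nabla w\,dx + \int_\Omega fw\,dx. \]
Performing integration by parts on the fourth integral on the right-hand side, given that $w|_{\partial\Omega} = 0$, yields
\[ -\int_\Omega \nabla u \cdot \nabla w\,dx = \int_\Omega w \nabla^2 u\,dx = -\int_\Omega fw\,dx. \]
(For a weak solution, the identity $\int_\Omega \nabla u \cdot \nabla w\,dx = \int_\Omega fw\,dx$ is precisely the weak formulation tested with $w$.) Substituting above gives
\[ J[v] = J[u] + \frac{1}{2}\int_\Omega |\nabla w|^2\,dx \ge J[u], \]
which holds for every $v \in H_0^1(\Omega)$; hence $u$ is the minimizer of $J$.

Conversely, let $u$ be a minimizer of $J$. We have
\[ J[u + tv] - J[u] = \frac{1}{2}B[u + tv, u + tv] - \int_\Omega f(u + tv)\,dx - \frac{1}{2}B[u, u] + \int_\Omega fu\,dx = t\left( B[u, v] - \int_\Omega fv\,dx \right) + \frac{t^2}{2}B[v, v]. \]
Dividing by $t$ and then passing to the limit as $t \to 0$ gives
\[ DJ(u)v = B[u, v] - \int_\Omega fv\,dx. \]
Since $u$ is a minimizer, the left-hand side vanishes for every $v \in H_0^1(\Omega)$, so $B[u, v] = \int_\Omega fv\,dx$ for all such $v$; that is, $u$ is a weak solution.

As a consequence, we have

Corollary 5.7.3 $u \in H_0^1(\Omega)$ is a minimizer for the Poisson variational integral $J[\cdot]$ if and only if $DJ(u) = 0$.

Proof By Theorem 5.6.8, we have
\[ DJ(u)v = B[u, v] - \int_\Omega fv\,dx, \]
and note that the right-hand side represents the weak formulation of the Poisson equation with homogeneous Dirichlet condition. Hence, $DJ(u) = 0$ if and only if $u$ is a weak solution to the Poisson equation, and by the previous theorem this occurs if and only if $u$ is a minimizer for the functional $J$.

The above corollary views the vanishing of $DJ(u)$ as a necessary and sufficient condition for minimization, but only for variational integrals of Poisson type.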
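The Dirichlet principle above has a transparent discrete analogue. The following is a minimal sketch, assuming $\Omega = (0, 1)$, $f \equiv 1$, a uniform grid, and second-order finite differences; these choices and the function names are illustrative assumptions, not part of the text. It solves the discrete Poisson problem and checks that the solution has lower energy than a few perturbed competitors.

```python
import math

def solve_dirichlet_poisson(f_inner, h):
    """Thomas algorithm for (2u_i - u_{i-1} - u_{i+1}) / h^2 = f_i, u_0 = u_n = 0."""
    m = len(f_inner)
    b = [2.0] * m                       # main diagonal; both off-diagonals are -1
    d = [fi * h * h for fi in f_inner]
    for i in range(1, m):               # forward elimination
        w = -1.0 / b[i - 1]
        b[i] += w
        d[i] -= w * d[i - 1]
    u = [0.0] * m
    u[-1] = d[-1] / b[-1]
    for i in range(m - 2, -1, -1):      # back substitution
        u[i] = (d[i] + u[i + 1]) / b[i]
    return [0.0] + u + [0.0]

def energy(u, f, h):
    """Discrete J[u] = 1/2 * int |u'|^2 dx - int f u dx."""
    grad = sum((u[i + 1] - u[i]) ** 2 for i in range(len(u) - 1)) / h
    load = sum(fi * ui for fi, ui in zip(f, u)) * h
    return 0.5 * grad - load

n = 50
h = 1.0 / n
x = [i * h for i in range(n + 1)]
f = [1.0] * (n + 1)
u = solve_dirichlet_poisson(f[1:-1], h)

# Compare J[u] with J[u + w] for perturbations w vanishing on the boundary.
gaps = []
for k in (1, 2, 5):
    w = [0.1 * math.sin(k * math.pi * xi) for xi in x]
    w[0] = w[-1] = 0.0
    competitor = [ui + wi for ui, wi in zip(u, w)]
    gaps.append(energy(competitor, f, h) - energy(u, f, h))

print(u[n // 2], gaps)
```

For $f \equiv 1$ the exact solution is $u(x) = x(1 - x)/2$, so the midpoint value is $1/8$, and every gap is positive, in line with $J[u + w] = J[u] + \frac{1}{2}\int |\nabla w|^2$.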
5.7.2 Symmetric Elliptic PDEs

Now we attempt to generalize our discussion of the Dirichlet principle to the following problem:
\[ \begin{cases} Lu = f & \text{in } \Omega, \\ u = 0 & \text{on } \partial\Omega, \end{cases} \tag{5.7.2} \]
for some uniformly elliptic operator $L$ as in (4.5.3) with symmetric $a_{ij}, c \in L^\infty(\Omega)$, and $c(x) \ge 0$ for a.e. $x \in \Omega$, for some open and bounded $\Omega$ in $\mathbb{R}^n$, and $f \in L^2(\Omega)$. Recall that the elliptic bilinear map $B[u, v]$ associated with an elliptic operator $L$ is continuous, and $B[v, v]$ is coercive if $L$ is uniformly elliptic. Moreover, by uniform ellipticity, we have
\[ \sum_{i,j=1}^{n} a_{ij}(x)\xi_i \xi_j \ge \lambda_0 |\xi|^2. \]
It has been shown (Theorem 4.5.5) that there exists a unique weak solution in $H_0^1(\Omega)$ for the problem. The bilinear map takes the form
\[ B[u, v] = \int_\Omega \left( \sum_{i,j=1}^{n} a_{ij}(x) \frac{\partial v}{\partial x_i} \frac{\partial u}{\partial x_j} + c(x)u(x)v(x) \right) dx, \tag{5.7.3} \]
and so the variational integral associated to (5.7.2) (written in short form) is
\[ J[v] = \frac{1}{2}B[v, v] - \int_\Omega fv\,dx = \frac{1}{2}\int_\Omega \left( A(x)|Dv|^2 + cv^2 \right) dx - \int_\Omega fv\,dx. \tag{5.7.4} \]
We will follow the same plan. Namely, we prove the existence of the minimizer, then we prove that the problem of finding the minimizer of (5.7.4) is equivalent to the problem of finding the weak solution of (5.7.2), i.e., the solution of the equation and the minimizer are the same. Remember that Theorem 4.5.5 states that there exists only one weak solution, so if our result is valid, there should be only one minimizer. Before establishing the result, we recall that for bounded sequences, the following relations are known and can be easily proved:
\[ \liminf(-x_n) = -\limsup(x_n), \tag{5.7.5} \]
and
\[ \liminf(x_n) + \liminf(y_n) \le \liminf(x_n + y_n). \tag{5.7.6} \]
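Theorem 5.7.4 There exists a unique minimizer over $H_0^1(\Omega)$, where $\Omega$ is bounded in $\mathbb{R}^n$, for the variational integral (5.7.4) for $f \in L^2(\Omega)$.

Proof Let $u \in H_0^1(\Omega)$. Then, we have
\begin{align*}
J[u] &= \frac{1}{2}\int_\Omega A(x)|Du|^2\,dx + \frac{1}{2}\int_\Omega cu^2\,dx - \int_\Omega fu\,dx \\
&\ge \frac{1}{2}\lambda_0 \|Du\|_{L^2}^2 - \int_\Omega fu\,dx \\
&\ge \frac{\lambda_0}{2C^2 + 2}\|u\|_{H^1}^2 - \int_\Omega fu\,dx && \text{(Poincaré's inequality)} \\
&\ge C\|u\|_{H^1}^2 - \|f\|_{L^2} \cdot \|u\|_{L^2} && \text{(C-S inequality)} \\
&\ge C\|u\|_{H^1}^2 - \int_\Omega f^2\,dx - \frac{1}{4}\|u\|_{L^2}^2 && \text{(Lemma 4.4.5)} \\
&\ge c\left( \|u\|_{H^1}^2 - \|u\|_{L^2}^2 \right) - \int_\Omega f^2\,dx && (c = \min\{C, \tfrac{1}{4}\}) \\
&= c\|Du\|_{L^2}^2 - \int_\Omega f^2\,dx && \text{(definition of } \|\cdot\|_{H^1}) \\
&\ge -\int_\Omega f^2\,dx > -\infty && \text{(since } f \in L^2).
\end{align*}
Here $C$ in the third line is the Poincaré constant, and from the fourth line on $C$ denotes a generic positive constant. Hence, $J$ is bounded from below. Also, from the inequalities above, we have
\[ J[v] \ge C_1 \|v\|_{H^1}^2 - \int_\Omega f^2\,dx - C_2 \int_\Omega v^2\,dx, \]
which can be written as
\begin{align*}
J[u] + \int_\Omega f^2\,dx &\ge C_1 \|u\|_{H^1}^2 - C_2 \|u\|_{L^2}^2 \\
&\ge C_3 \left( \|u\|_{H^1}^2 - \|u\|_{L^2}^2 \right) && (C_3 = \min\{C_1, C_2\}) \\
&= C_3 \|Du\|_{L^2}^2 \\
&\ge C' \|u\|_{H^1}^2 && \left( C' = \frac{C_3}{C^2 + 1}, \ C \text{ the Poincaré constant} \right).
\end{align*}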
Hence, $J[\cdot]$ is coercive. Further, $J$ is strictly convex, being the sum of two strictly convex terms (i.e., $|Du|^2$ and $u^2$) and a linear term. Finally, let $u_n \xrightarrow{w} u$ in $H^1(\Omega)$. Since $f$ is a bounded linear functional on $H^1(\Omega)$, by definition of weak convergence, we have
\[ \lim \int_\Omega u_n f\,dx = \int_\Omega uf\,dx, \]
which implies that
\[ \liminf \int_\Omega u_n f\,dx = \limsup \int_\Omega u_n f\,dx = \int_\Omega uf\,dx. \tag{5.7.7} \]
By Proposition 5.3.3(3), we have
\[ c\|u\|^2 \le \liminf c\|u_n\|^2 \tag{5.7.8} \]
and
\[ \|Du\|^2 \le \liminf \|Du_n\|^2, \]
and given that $A = [a_{ij}(x)] \in L^\infty(\Omega)$, it is readily seen (verify) that
\[ \frac{1}{2}\int_\Omega A|Du|^2\,dx \le \liminf \frac{1}{2}\int_\Omega A|Du_n|^2\,dx. \tag{5.7.9} \]
Write $I[u] = \frac{1}{2}\int_\Omega A|Du|^2\,dx$. Using (5.7.5), we add $-\limsup(\langle u_n, f \rangle)$ and $\liminf(-\langle u_n, f \rangle)$ to the left-hand side and to the right-hand side of (5.7.9), respectively. This gives
\[ I[u] - \limsup \int_\Omega u_n f\,dx \le \liminf I[u_n] + \liminf \left( -\int_\Omega u_n f\,dx \right). \tag{5.7.10} \]
Using (5.7.6) and (5.7.7) in (5.7.10) yields
\[ I[u] - \int_\Omega uf\,dx \le \liminf \left( I[u_n] - \int_\Omega u_n f\,dx \right), \]
and adding (5.7.8) to this inequality, again with the help of (5.7.6), we conclude that $J[\cdot]$ is w.l.s.c. The result now follows from Theorem 5.3.2.

The step where we proved strict convexity is not necessary for existence, since we already proved that the functional is weakly l.s.c. In fact, establishing strict convexity in these cases is important only to prove uniqueness of the solution, which has already been verified by Theorem 4.5.5.
5.7.3 Dirichlet Principle of Symmetric Elliptic PDEs

Next, we consider the problem of minimizing the functional (5.7.4) over the set of admissible functions $\mathcal{A} = \{v \in H_0^1(\Omega)\}$. The next theorem shows that the problem of finding a weak solution of (5.7.2) and the problem of finding the minimizer of (5.7.4) are equivalent.

Theorem 5.7.5 Consider the uniformly elliptic operator $L$ defined in (5.7.2) for symmetric $a_{ij} \in L^\infty(\Omega)$, $f \in L^2(\Omega)$, and $c(x) \ge 0$ for a.e. $x \in \Omega$, for some open and bounded $\Omega$ in $\mathbb{R}^n$, and let $B[u, v]$ be the elliptic bilinear map associated with $L$. Then $u \in H_0^1(\Omega)$ is a weak solution to (5.7.2) if and only if $u$ is a minimizer for the variational integral (5.7.4) over $\mathcal{A} = \{v \in H_0^1(\Omega)\}$.

Proof Let $u$ be a weak solution of (5.7.2). Then it satisfies the weak form
\[ B[u, v] = f(v) \tag{5.7.11} \]
for every $v \in H_0^1(\Omega)$. Let $w \in H_0^1(\Omega)$ be arbitrary and write $w = u + v$ with $v \in H_0^1(\Omega)$. Our claim is that $J[w] - J[u] \ge 0$. We have
\[ J[u + v] - J[u] = \frac{1}{2}B[u + v, u + v] - \int_\Omega f(u + v)\,dx - \frac{1}{2}B[u, u] + \int_\Omega fu\,dx. \tag{5.7.12} \]
By simple computations, taking into account that $B$ is symmetric, and the fact from Theorem 4.5.4 that if $L$ is a uniformly elliptic operator then $B$ is coercive, we obtain
\begin{align*}
J[u + v] - J[u] &= -\int_\Omega fv\,dx + B[u, v] + \frac{1}{2}B[v, v] \\
&= \frac{1}{2}B[v, v] && \text{(by (5.7.11))} \\
&\ge \beta \|v\|_{H_0^1(\Omega)}^2 && \text{(by coercivity of } B) \\
&\ge 0.
\end{align*}
This implies that $u$ is the unique minimizer of (5.7.4). Note that the above inequality becomes an equality only when $v = 0$.

Conversely, let $u \in H_0^1(\Omega)$ be the minimizer of (5.7.4). Theorem 4.5.5 proved the existence and uniqueness of a weak solution of problem (5.7.2), which by the first part is a minimizer of (5.7.4). On the other hand, Theorem 5.7.4 proved the existence and uniqueness of the minimizer of (5.7.4). Hence the minimizer $u$ coincides with the weak solution. This completes the proof.
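The role of symmetry in this proof can be seen in a finite-dimensional analogue. The following is a minimal sketch in which matrices stand in for the bilinear map, $B[u, v] = u^{T} A v$, and $J(x) = \frac{1}{2}B[x, x] - b \cdot x$; the matrices, vectors and function names are illustrative assumptions, not from the text.

```python
def bilinear(A, x, y):
    """B[x, y] = x^T A y."""
    return sum(x[i] * A[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))

def energy(A, b, x):
    """J(x) = 1/2 * B[x, x] - b . x."""
    return 0.5 * bilinear(A, x, x) - sum(bi * xi for bi, xi in zip(b, x))

def add(x, y):
    return [xi + yi for xi, yi in zip(x, y)]

# Symmetric, positive definite case: u solves B[u, v] = b . v for all v.
A_sym = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
b_sym = [1.0, 0.0, 1.0]
u_sym = [1.0, 1.0, 1.0]
v = [0.3, -0.7, 0.2]
gap_sym = energy(A_sym, b_sym, add(u_sym, v)) - energy(A_sym, b_sym, u_sym)
half_quadratic = 0.5 * bilinear(A_sym, v, v)

# Non-symmetric case: u still satisfies B[u, v] = b . v for all v,
# but the cross terms no longer cancel and u fails to be a minimizer.
A_ns = [[1.0, 2.0], [0.0, 1.0]]
b_ns = [1.0, 3.0]
u_ns = [1.0, 1.0]
gap_ns = energy(A_ns, b_ns, add(u_ns, [-1.0, 0.0])) - energy(A_ns, b_ns, u_ns)

print(gap_sym, half_quadratic, gap_ns)
```

In the symmetric case the gap equals $\frac{1}{2}B[v, v]$, as in the proof; in the non-symmetric case the gap is negative for the chosen direction, so the solution of the weak form is not a minimizer of $J$.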
5.8 Euler–Lagrange Equation

5.8.1 Lagrangian Integral

Now, it seems that we are ready to generalize the work to more variational functionals. We developed all the necessary tools and techniques to implement a "calculus-based" method to solve minimization problems and to study their connections with their original PDEs. Our goal is to investigate variational problems and see whether the minimizers of these problems are weak solutions of the corresponding partial differential equations. Consider the general variational integral
\[ J[u] = \int_\Omega L(\nabla u, u, x)\,dx. \tag{5.8.1} \]
Here, $L$ is a $C^2$ multi-variable function $L : \mathbb{R}^n \times \mathbb{R} \times \Omega \to \mathbb{R}$, where $u, v \in C^1(\Omega)$, and $\Omega$ is a $C^1$ open and bounded set. The first variable, in place of $\nabla u$, is denoted by $p$; the second variable, in place of $u$, is denoted by $z$. This is a common practice in the differentiation process when a function and its derivatives are the arguments of another function, so that the chain rule is not misused. Such a function with the properties above is known as the Lagrangian functional. The functional (5.8.1) shall be called the Lagrangian Integral. To differentiate $L$ with respect to any of the variables, we write
\[ \nabla_p L = (L_{p_1}, \cdots, L_{p_n}), \qquad L_z, \qquad \nabla_x L = (L_{x_1}, \cdots, L_{x_n}). \]
We will establish an existence theorem for the minimizer of the general variational functional (5.8.1).
5.8.2 First Variation

One of the consequences of the preceding section is that by defining the function $h : \mathbb{R} \to \mathbb{R}$ by
\[ h(t) = J(u + tv), \]
we see that, assuming sufficient smoothness of the integrand, the function $h$ is differentiable if and only if $J$ is G-differentiable. Indeed, if $L$ is $C^2$, then $J$ is G-differentiable and both $\nabla_p L$ and $L_z$ are continuous, so we can apply the chain rule, and since $u, v \in C^1(\Omega)$, $\frac{dL}{dt}$ is continuous, which implies that $h$ is differentiable on $\mathbb{R}$. Now, if $u$ is a minimizer of a G-differentiable variational integral $J$, then
\[ h'(0) = \frac{\partial}{\partial t} J[u + tv]\Big|_{t=0} = DJ(u, v). \tag{5.8.2} \]
Equation (5.8.2) is called the first variation of $J[\cdot]$, and it provides the weak form of the PDE which is associated with $J$.
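5.8.3 Necessary Condition for Minimality I

Let us see how to obtain the weak formulation explicitly from the first variation. Write
\[ h(t) = J(u + tv) = \int_\Omega L(\nabla(u + tv), u + tv, x)\,dx, \]
for $v \in C_0^1(\Omega)$ and $t \in \mathbb{R}$, so that $u + tv \in C_0^1(\Omega)$. Let $u \in C_0^1(\Omega)$ be a minimizer of $J$, so $0$ is a minimizer for $h$, hence $h'(0) = 0$. Then we have
\begin{align*}
h'(t) &= \int_\Omega \left[ \nabla_p L(\nabla u + t\nabla v, u + tv, x) \cdot Dv + \frac{\partial L}{\partial z}(\nabla u + t\nabla v, u + tv, x)\,v \right] dx \\
&= \int_\Omega \left[ \sum_{i=1}^{n} L_{p_i}(\nabla u + t\nabla v, u + tv, x)\,v_{x_i} + \frac{\partial L}{\partial z}(\nabla u + t\nabla v, u + tv, x)\,v \right] dx.
\end{align*}
Thus, we have
\[ 0 = h'(0) = \int_\Omega \left[ \sum_{i=1}^{n} L_{p_i}(\nabla u, u, x)\,v_{x_i} + \frac{\partial L}{\partial z}(\nabla u, u, x)\,v \right] dx. \]
This is the weak formulation of the PDE associated with the variational integral $J$. To find the equation, we use Green's identity (or integrate by parts with respect to $x$ in the first $n$ terms of the integral). Taking into account $v|_{\partial\Omega} = 0$, we obtain
\[ 0 = \int_\Omega \left[ -\sum_{i=1}^{n} \frac{\partial}{\partial x_i} L_{p_i}(\nabla u, u, x) + \frac{\partial L}{\partial z}(\nabla u, u, x) \right] v\,dx. \]
By the Fundamental Lemma of COV, we obtain
\[ -\sum_{i=1}^{n} \frac{\partial}{\partial x_i} L_{p_i}(\nabla u, u, x) + \frac{\partial L}{\partial z}(\nabla u, u, x) = 0, \]
or in vector form
\[ -\operatorname{div}(\nabla_p L) + \frac{\partial L}{\partial z}(\nabla u, u, x) = 0. \]

Theorem 5.8.1 (Necessary Condition For Minimality I) Let $L(\nabla u, u, x) \in C^2(\Omega)$ for some $\Omega \subset \mathbb{R}^n$, and consider the Lagrangian Integral (5.8.1). If $u \in C_0^1(\Omega)$ is a minimizer for $J$ over $\mathcal{A} = \{v \in C_0^1(\Omega)\}$, then $u$ is a solution for the equation
\[ -\sum_{i=1}^{n} \frac{\partial}{\partial x_i}\left( \frac{\partial L}{\partial p_i}(\nabla u, u, x) \right) + \frac{\partial L}{\partial z}(\nabla u, u, x) = 0, \tag{5.8.3} \]
with the homogeneous Dirichlet condition $u = 0$ on $\partial\Omega$.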
5.8.4 Euler–Lagrange Equation

Definition 5.8.2 (Euler–Lagrange Equation) The equation (5.8.3) is called the Euler–Lagrange equation, and its weak form is given by
\[ \int_\Omega \left[ \sum_{i=1}^{n} L_{p_i}(\nabla u, u, x)\,v_{x_i} + \frac{\partial L}{\partial z}(\nabla u, u, x)\,v \right] dx = 0. \]
The Euler–Lagrange equation is a quasilinear second-order partial differential equation. It is one of the most important partial differential equations in applied mathematics, and it is a landmark in the history and development of the calculus of variations. If the functional is the variational integral of some PDE, then the associated E-L equation reduces to that PDE. For example, if $I[\cdot]$ is the Dirichlet integral, then it is easy to see that the E-L equation reduces to the Laplace equation. Here,
\[ L(\nabla u, u, x) = \frac{1}{2}|p|^2, \]
so $L_z = 0$ and $\nabla_p L = p = \nabla u$, and consequently $\operatorname{div}(\nabla_p L) = \nabla^2 u$. Therefore, the E-L equation reduces to the Laplace equation
\[ \nabla^2 u = 0. \]
The Lagrangian of the Poisson variational integral is written in the form
\[ L(\nabla u, u, x) = \frac{1}{2}|p|^2 - fu. \]
Then $\operatorname{div}(\nabla_p L) = \nabla^2 u$ and $L_z = -f$, so the E-L equation reduces to the Poisson equation
\[ -\nabla^2 u = f. \]
In the preceding theorem, the minimizer of the functional $J$ was taken over the space $C_0^1(\Omega)$, and consequently, the direction vector $v$ was also chosen to be in $C_0^1(\Omega)$ so that $u + tv \in C_0^1(\Omega)$. If the admissible set is chosen to be $C^1(\Omega)$, then we must have $v \in C^1(\Omega)$. We write again the first variation as
\[ 0 = h'(0) = \int_\Omega \left[ \nabla_p L(\nabla u, u, x) \cdot Dv + \frac{\partial L}{\partial z}(\nabla u, u, x)\,v \right] dx. \]
Applying the first Green's identity to the first term yields
\[ 0 = \int_{\partial\Omega} v\,\nabla_p L(\nabla u, u, x) \cdot n\,ds + \int_\Omega \left[ -\operatorname{div}(\nabla_p L) + \frac{\partial L}{\partial z}(\nabla u, u, x) \right] v\,dx, \]
where $n$ is the outward normal vector, and this equation holds for all $v \in C^1(\Omega)$. We thus have the following:

Theorem 5.8.3 Let $L(\nabla u, u, x) \in C^2(\Omega)$ for some $\Omega \subset \mathbb{R}^n$. If $u \in C^2(\Omega)$ is a minimizer of the Lagrangian Integral $J$ over $C^2(\Omega)$, then $u$ is a solution for the Euler–Lagrange equation with the Neumann boundary condition
\[ \nabla_p L(\nabla u, u, x) \cdot n = 0 \quad \text{on } \partial\Omega. \]
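As a further illustration of how (5.8.3) is computed from a given Lagrangian, consider the area integrand, a standard example that is not discussed in the text and is included here only as a worked sketch. The Lagrangian depends on $p$ alone, so only the divergence term survives.

```latex
% Worked example (illustrative): the area functional
%   J[u] = \int_\Omega \sqrt{1 + |\nabla u|^2}\,dx .
\begin{align*}
L(p, z, x) &= \sqrt{1 + |p|^2},
&
L_{p_i} &= \frac{p_i}{\sqrt{1 + |p|^2}},
&
L_z &= 0, \\[4pt]
\text{so (5.8.3) reads}\quad
-\sum_{i=1}^{n} \frac{\partial}{\partial x_i}
  \left( \frac{u_{x_i}}{\sqrt{1 + |\nabla u|^2}} \right) &= 0,
&
\text{i.e.}\quad
\operatorname{div}\!\left( \frac{\nabla u}{\sqrt{1 + |\nabla u|^2}} \right) &= 0 .
\end{align*}
```

Unlike the Laplace and Poisson cases, the coefficient of the second derivatives now depends on $\nabla u$, which shows concretely why the Euler–Lagrange equation is quasilinear in general.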
5.8.5 Second Variation

A natural question arises: does the converse of Theorem 5.8.1 hold? We know from the discussion above that a weak solution to the E-L equation is a critical point of the associated variational integral. But is it necessarily a minimizer? The answer in general is no, as it may also be a maximizer, or neither. As we usually do in elementary calculus, the second derivative needs to be invoked here. Consider again the function
\[ h(t) = J(u + tv) = \int_\Omega L(\nabla(u + tv), u + tv, x)\,dx. \]
If $h$ has a minimum value at $0$, then $h'(0) = 0$ and $h''(0) \ge 0$. The first derivative was found to be
\[ h'(t) = \int_\Omega \left[ \sum_{i=1}^{n} L_{p_i}(\nabla u + t\nabla v, u + tv, x)\,v_{x_i} + \frac{\partial L}{\partial z}(\nabla u + t\nabla v, u + tv, x)\,v \right] dx. \]
Then
\[ h''(t) = \int_\Omega \left[ \sum_{i,j=1}^{n} L_{p_i p_j} v_{x_i} v_{x_j} + 2\sum_{j=1}^{n} L_{z p_j} v\,v_{x_j} + L_{zz} v^2 \right] dx, \]
where
\[ L_{p_i p_j} = L_{p_i p_j}(\nabla u + t\nabla v, u + tv, x), \quad L_{z p_j} = L_{z p_j}(\nabla u + t\nabla v, u + tv, x), \quad L_{zz} = L_{zz}(\nabla u + t\nabla v, u + tv, x). \]
Thus, the second variation takes the form
\[ 0 \le h''(0) = \int_\Omega \left[ \sum_{i,j=1}^{n} L_{p_i p_j} v_{x_i} v_{x_j} + 2\sum_{j=1}^{n} L_{z p_j} v\,v_{x_j} + L_{zz} v^2 \right] dx, \]
where
\[ L_{p_i p_j} = L_{p_i p_j}(\nabla u, u, x), \quad L_{z p_j} = L_{z p_j}(\nabla u, u, x), \quad L_{zz} = L_{zz}(\nabla u, u, x). \]
More precisely,
\[ 0 \le \int_\Omega \left[ \sum_{i,j=1}^{n} L_{p_i p_j}(\nabla u, u, x)\,v_{x_i} v_{x_j} + 2\sum_{i=1}^{n} L_{z p_i}(\nabla u, u, x)\,v\,v_{x_i} + L_{zz}(\nabla u, u, x)\,v^2 \right] dx, \]
which is valid for all $v \in C_c^\infty(\Omega)$. The above integral is called the second variation.
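The formulas for the first and second variations can be checked against finite differences of $h(t) = J(u + tv)$. The following is a minimal sketch, assuming $\Omega = (0, 1)$ and the illustrative Lagrangian $L(p, z, x) = \frac{1}{2}p^2 + \frac{1}{4}z^4$, for which $L_p = p$, $L_z = z^3$, $L_{pp} = 1$, $L_{zp} = 0$ and $L_{zz} = 3z^2$; the grid, the functions $u$ and $v$, and all names are assumptions for illustration only.

```python
import math

def lagrangian_energy(u, h):
    """Discrete J[u] = int (1/2 u'^2 + 1/4 u^4) dx: gradient on cells, u^4 at nodes."""
    grad = sum(((u[i + 1] - u[i]) / h) ** 2 for i in range(len(u) - 1)) * h
    pot = sum(ui ** 4 for ui in u) * h
    return 0.5 * grad + 0.25 * pot

n = 100
h = 1.0 / n
x = [i * h for i in range(n + 1)]
u = [math.sin(math.pi * xi) for xi in x]
v = [xi * (1.0 - xi) for xi in x]          # vanishes on the boundary

def h_of_t(t):
    """The scalar function t -> J[u + t v]."""
    return lagrangian_energy([ui + t * vi for ui, vi in zip(u, v)], h)

# Central finite differences for h'(0) and h''(0).
t = 1e-3
first_fd = (h_of_t(t) - h_of_t(-t)) / (2 * t)
second_fd = (h_of_t(t) - 2 * h_of_t(0.0) + h_of_t(-t)) / t ** 2

# Same quantities from the variation formulas.
du = [(u[i + 1] - u[i]) / h for i in range(n)]
dv = [(v[i + 1] - v[i]) / h for i in range(n)]
first_formula = (sum(a * b for a, b in zip(du, dv)) * h
                 + sum(ui ** 3 * vi for ui, vi in zip(u, v)) * h)
second_formula = (sum(b * b for b in dv) * h
                  + sum(3 * ui ** 2 * vi ** 2 for ui, vi in zip(u, v)) * h)

print(first_fd - first_formula, second_fd - second_formula)
```

Both differences are small, and `second_formula` is positive for every direction $v \neq 0$ here because $L_{pp} = 1 > 0$ and $L_{zz} = 3u^2 \ge 0$. Note that $u$ in this sketch is not a critical point, so `first_formula` need not vanish; only the agreement between the formulas and the difference quotients is being illustrated.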
5.8.6 Legendre Condition

Consider the function
\[ v = \epsilon\,\xi(x)\,\varphi\!\left( \frac{\eta \cdot x}{\epsilon} \right), \]
for some cut-off function $\xi(x) \in C_c^\infty(\Omega)$ and fixed $\eta \in \mathbb{R}^n$, such that $\varphi \to 0$ as $\epsilon \to 0$ and $\varphi' = 1$. This gives
\[ v_{x_i}(x) = \eta_i \xi + O(\epsilon). \]
Substituting above with a suitable choice of $\xi$ gives
\[ 0 \le \int_\Omega \sum_{i,j=1}^{n} L_{p_i p_j}(\nabla u, u, x)\,\eta_i \eta_j\,dx, \]
which holds for
\[ \sum_{i,j=1}^{n} L_{p_i p_j}(\nabla u, u, x)\,\eta_i \eta_j \ge 0. \tag{5.8.4} \]
Condition (5.8.4) is thus a necessary condition for the critical point $u$ to be a minimizer for $J$. Moreover, the inequality reminds us of the convexity property of $L$ with respect to $p$. We therefore have the following theorem:

Theorem 5.8.4 (Necessary Condition For Minimality II (Legendre)) Let $L(\nabla u, u, x) \in C^2(\Omega)$ for some $\Omega \subset \mathbb{R}^n$. If $u \in C_0^1(\Omega)$ is a minimizer of the functional $J$ in (5.8.1) over $\mathcal{A} = \{v \in H_0^1(\Omega)\}$, then for all $\eta \in \mathbb{R}^n$, we have
\[ \sum_{i,j=1}^{n} L_{p_i p_j}(\nabla u, u, x)\,\eta_i \eta_j \ge 0. \]
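The Legendre condition says that the matrix $[L_{p_i p_j}]$ is positive semidefinite along the minimizer, which is easy to test pointwise. The following is a minimal sketch for $n = 2$, using a hypothetical double-well Lagrangian $L(p) = \frac{1}{4}|p|^4 - \frac{1}{2}|p|^2$ chosen only for illustration; its Hessian in $p$ is $(|p|^2 - 1)I + 2pp^{T}$. The Hessian is approximated by central differences, and the helper names are assumptions.

```python
import math

def integrand(p):
    """Hypothetical double-well Lagrangian L(p) = |p|^4 / 4 - |p|^2 / 2 on R^2."""
    s = p[0] ** 2 + p[1] ** 2
    return 0.25 * s * s - 0.5 * s

def hessian_in_p(L, p, step=1e-4):
    """Central-difference approximation of the 2x2 matrix [L_{p_i p_j}](p)."""
    def shifted(i, j, si, sj):
        q = list(p)
        q[i] += si * step
        q[j] += sj * step
        return L(q)
    H = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            H[i][j] = (shifted(i, j, 1, 1) - shifted(i, j, 1, -1)
                       - shifted(i, j, -1, 1) + shifted(i, j, -1, -1)) / (4 * step ** 2)
    return H

def min_eigenvalue(H):
    """Smallest eigenvalue of a (symmetrized) 2x2 matrix."""
    a, d = H[0][0], H[1][1]
    b = 0.5 * (H[0][1] + H[1][0])
    return 0.5 * (a + d) - math.sqrt((0.5 * (a - d)) ** 2 + b * b)

def satisfies_legendre(L, p, tol=1e-6):
    """True if sum_{i,j} L_{p_i p_j}(p) eta_i eta_j >= 0 for all eta (up to tol)."""
    return min_eigenvalue(hessian_in_p(L, p)) >= -tol

print(satisfies_legendre(integrand, [0.0, 0.0]))   # gradient value p = 0
print(satisfies_legendre(integrand, [2.0, 0.0]))   # gradient value p = (2, 0)
```

At $p = 0$ the Hessian is $-I$, so a function with $\nabla u = 0$ on an open set cannot minimize the corresponding integral; at $p = (2, 0)$ the Hessian is $\operatorname{diag}(11, 3)$ and the condition holds.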
The second variation can be written in the quadratic form
\[ Q[u, v] = \int_\Omega \left[ A(\nabla v)^2 + 2Bv\nabla v + Cv^2 \right] dx, \]
where
\[ A = \sum_{i,j=1}^{n} L_{p_i p_j}, \qquad B = \sum_{i=1}^{n} L_{z p_i}, \qquad C = L_{zz}. \]
It can then be seen that if $u$ is a local minimizer then $Q[u, u]$ is positive definite, which implies that the integrand satisfies
\[ A(\nabla u)^2 + 2Bu\,Du + Cu^2 \ge 0. \]
If the inequality is strict, then by (5.6.7), we have
\[ J[u + tv] = J[u] + t\,DJ(u)v + \frac{t^2}{2}Q[u, v], \]
where
\[ Q[u, v] = h''(0) = \frac{d^2}{dt^2} J[u + tv]\Big|_{t=0}. \]
Therefore, we have

Theorem 5.8.5 Let $L(\nabla u, u, x) \in C^2(\Omega)$ for some bounded $\Omega \subset \mathbb{R}^n$, and suppose $u \in C_0^1(\Omega)$ is a critical point of the Lagrangian integral $J$. Then $u$ is a local minimizer for $J$ over $\mathcal{A} = \{v \in C_0^1(\Omega)\}$ if and only if $Q[u, v] > 0$ (i.e., positive definite), equivalently,
\[ A(\nabla u)^2 + 2Bu\nabla u + Cu^2 > 0. \]
5.9 Dirichlet Principle for Euler–Lagrange Equation

5.9.1 The Lagrangian Functional

Consider again the variational Lagrangian integral
\[ J[u] = \int_\Omega L(Du, u, x)\,dx \tag{5.9.1} \]
for some $C^1$ open and bounded $\Omega \subset \mathbb{R}^n$. We will establish an existence theorem for the minimizer of $J$ over the Sobolev spaces $W^{1,q}(\Omega)$, $1 < q < \infty$. To ensure the variational integral is well-defined, we assume throughout that $L$ is Carathéodory. A function $L(p, z, x)$ is called Carathéodory if $L$ is $C^1$ in $z$ and $p$ for a.e. $x$, and measurable in $x$ for all $(p, z)$.

5.9.2 Gateaux Derivative of the Lagrangian Integral

Our first task is to prove that $J$ is G-differentiable and find its derivative. We need to impose further conditions on $L$. To ensure the functional $J[\cdot]$ is finite, we assume $L$, $L_p$, and $L_z$ are all Carathéodory. We also assume the $q$-growth condition
\[ |L(p, z, x)| \le C\left( |p|^q + |z|^q + 1 \right), \tag{5.9.2} \]
together with
\[ \max\{ |L_p(p, z, x)|, |L_z(p, z, x)| \} \le C\left( |p|^{q-1} + |z|^{q-1} + 1 \right). \tag{5.9.3} \]
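To see what the growth conditions ask for in practice, here is a worked sketch with a model Lagrangian that is an illustrative assumption and not taken from the text. Both (5.9.2) and (5.9.3) hold with $C = 1$.

```latex
% Worked example (illustrative): a model Lagrangian with q-growth, 1 < q < \infty.
\begin{align*}
L(p, z, x) &= \frac{1}{q}\left( |p|^q + |z|^q \right)
  && \Longrightarrow \quad |L(p, z, x)| \le |p|^q + |z|^q + 1, \\
L_p(p, z, x) &= |p|^{q-2}\,p
  && \Longrightarrow \quad |L_p(p, z, x)| = |p|^{q-1}, \\
L_z(p, z, x) &= |z|^{q-2}\,z
  && \Longrightarrow \quad |L_z(p, z, x)| = |z|^{q-1},
\end{align*}
% hence \max\{|L_p|, |L_z|\} \le |p|^{q-1} + |z|^{q-1} + 1.
```

Since $q > 1$, the maps $p \mapsto |p|^q$ and $z \mapsto |z|^q$ are $C^1$, so this $L$ is Carathéodory in the sense used above, and the integrals defining $J$ and its derivative are finite on $W^{1,q}(\Omega)$.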
Theorem 5.9.1 Consider the Lagrangian Integral Functional $J : W^{1,q}(\Omega) \to \mathbb{R}$, $1 < q < \infty$, given by
\[ J[u] = \int_\Omega L(Du, u, x)\,dx \]
for some bounded $\Omega \subset \mathbb{R}^n$, where $L$ is the Lagrangian. If $L$ is Carathéodory and satisfies conditions (5.9.2) and (5.9.3), then the functional $J$ is G-differentiable and
\[ DJ(u, v) = \int_\Omega \left[ D_p L(Du, u, x)\,Dv + L_z(Du, u, x)\,v \right] dx. \tag{5.9.4} \]

Proof The G-derivative of $J$ can be found as the limit as $t \to 0$ of the expression
\[ \frac{1}{t}\left( J[u + tv] - J[u] \right) = \int_\Omega \frac{1}{t}\left( L(Du + tDv, u + tv, x) - L(Du, u, x) \right) dx. \tag{5.9.5} \]
The key to the proof is the Dominated Convergence Theorem. Set
\[ f_t = \frac{1}{t}\left( L(Du + tDv, u + tv, x) - L(Du, u, x) \right). \]
If we show that $f_t \to f$ a.e. as $t \to 0$ and $|f_t| \le g \in L^1(\Omega)$, then by the Dominated Convergence Theorem we conclude that
\[ \lim_{t \to 0} \int_\Omega f_t\,dx = \int_\Omega f\,dx. \]
It is clear that
\[ f_t \xrightarrow{\text{a.e.}} \frac{dL}{dt}\Big|_{t=0} = D_p L(Du, u, x)\,Dv + L_z(Du, u, x)\,v. \tag{5.9.6} \]
On the other hand, letting $0 < t \le 1$, $f_t$ can be written as
\begin{align*}
f_t &= \frac{1}{t}\int_0^t \frac{d}{d\tau}\left( L(Du + \tau Dv, u + \tau v, x) \right) d\tau \\
&= \frac{1}{t}\int_0^t \left[ D_p L(Du + \tau Dv, u + \tau v, x)\,Dv + L_z(Du + \tau Dv, u + \tau v, x)\,v \right] d\tau.
\end{align*}
Next, we make use of the condition (5.9.3). This gives
\begin{align*}
|f_t| &\le \frac{1}{t}\int_0^t \left[ |D_p L(Du + \tau Dv, u + \tau v, x)|\,|\nabla v| + |L_z(Du + \tau Dv, u + \tau v, x)|\,|v| \right] d\tau \\
&\le \frac{1}{t}\int_0^t \left[ C\left( |Du + \tau Dv|^{q-1} + |u + \tau v|^{q-1} + 1 \right)\left( |\nabla v| + |v| \right) \right] d\tau.
\end{align*}
Note that since $u, v \in W^{1,q}(\Omega)$, we have $v, \nabla v, (Du + \tau Dv), (u + \tau v) \in L^q$. Given that the Hölder conjugate of $q$ is the number $q^*$, so that $q^*(q - 1) = q$, we have
\[ \left( |Du + \tau Dv|^{q-1} \right)^{q^*} = |Du + \tau Dv|^q \in L^1, \]
and similarly for $u + \tau v$. Then using Young's inequality on the terms
\[ |Du + \tau Dv|^{q-1}|\nabla v|, \quad |Du + \tau Dv|^{q-1}|v|, \quad |u + \tau v|^{q-1}|Dv|, \quad \text{and} \quad |u + \tau v|^{q-1}|v|, \]
we obtain a bound in terms of the $L^1$-integrable functions
\[ |Du + \tau Dv|^q, \quad |u + \tau v|^q, \quad |v|^q, \quad |Dv|^q \in L^1(\Omega). \]
Since $0 \le \tau \le 1$, the first two are in turn bounded by constant multiples of $|Du|^q + |Dv|^q$ and $|u|^q + |v|^q$, which do not depend on $\tau$. Together with some constant $C$, collect them all into a function $g(x) \in L^1(\Omega)$. We then have
\[ |f_t| \le \frac{1}{t}\int_0^t g(x)\,d\tau = g(x) \in L^1(\Omega). \]
Now from (5.9.5) and (5.9.6), the Dominated Convergence Theorem gives (5.9.4).

Theorem 5.9.2 Under the assumptions of the preceding theorem, if $u \in W^{1,q}(\Omega)$ is a local minimizer of the functional $J$ over
\[ \mathcal{A} = \{v \in W^{1,q}(\Omega) : v = g \text{ on } \partial\Omega, \text{ for some } g \in W^{1,q}(\Omega)\}, \]
then $u$ is a weak solution of the Euler–Lagrange equation
\[ \begin{cases} -\displaystyle\sum_{i=1}^{n} \frac{\partial}{\partial x_i} L_{p_i}(Du, u, x) + \frac{\partial L}{\partial z}(Du, u, x) = 0, & x \in \Omega, \\ u = g, & x \in \partial\Omega. \end{cases} \]

Proof Multiplying the equation above by $v \in C_c^\infty(\Omega)$ and integrating by parts gives
\[ \int_\Omega \left[ D_p L(Du, u, x)\,Dv + L_z(Du, u, x)\,v \right] dx = 0. \tag{5.9.7} \]
So (5.9.7) is the weak formulation of the Euler–Lagrange equation. Now, since $u$ is a local minimizer of $J$, by Theorem 5.9.1 and Theorem 5.6.9(1), the derivative (5.9.4) vanishes, which is exactly (5.9.7).

The task of proving that a minimizer is unique for such general functionals is a bit challenging, and some further conditions should be imposed. One way to deal with this problem is to assume convexity in the two variables $(p, z)$ rather than in $p$ alone. Such a property is called joint convexity.

Definition 5.9.3 (Jointly Convex Functional) A function $F(x, y) : X \times Y \to \mathbb{R}$ is called jointly convex if for $x_1, x_2 \in X$ and $y_1, y_2 \in Y$ and every $0 \le \theta \le 1$, we have
\[ F\left( \theta x_1 + (1 - \theta)x_2, \theta y_1 + (1 - \theta)y_2 \right) \le \theta F(x_1, y_1) + (1 - \theta)F(x_2, y_2). \]
If the inequality is strict, then the functional $F$ is said to be jointly strictly convex.
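Joint convexity is strictly stronger than convexity in each variable separately, and the definition can be probed with the $\theta = 1/2$ instance of the inequality. The following is a minimal sketch using the bilinear function $F(x, y) = xy$ on $\mathbb{R} \times \mathbb{R}$; the helper name `midpoint_gap` is hypothetical.

```python
def F(x, y):
    """Bilinear test function: linear (hence convex) in each variable separately."""
    return x * y

def midpoint_gap(F, a, b):
    """theta = 1/2 instance of the joint-convexity inequality.

    Returns F(midpoint of a and b) minus the average of F(a) and F(b);
    a jointly convex F must give a value <= 0 for every pair (a, b).
    """
    (x1, y1), (x2, y2) = a, b
    mid = F(0.5 * (x1 + x2), 0.5 * (y1 + y2))
    return mid - 0.5 * (F(x1, y1) + F(x2, y2))

# Moving only x (y frozen at 2): the inequality holds, with equality.
separate = midpoint_gap(F, (1.0, 2.0), (-3.0, 2.0))

# Moving x and y together: the midpoint value exceeds the average.
joint = midpoint_gap(F, (1.0, -1.0), (-1.0, 1.0))

print(separate, joint)
```

The second gap is positive, because $F(0, 0) = 0$ while $F(1, -1) = F(-1, 1) = -1$, so $F(x, y) = xy$ is not jointly convex even though it is convex in $x$ and in $y$ separately.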
In an analogous way to Theorem 5.6.4, we have the following useful property:

Proposition 5.9.4 If $L = L(p, z, x)$ is jointly convex in $(z, p)$, then
\[ L(x, v, Dv) - L(x, u, Du) \ge L_p(x, u, Du)(Dv - Du) + L_z(x, u, Du)(v - u). \tag{5.9.8} \]

Proof For $0 < t \le 1$, set
\[ w = tv + (1 - t)u. \]
Then, by joint convexity, we have
\[ L(x, w, Dw) \le tL(x, v, Dv) + (1 - t)L(x, u, Du) = t\left( L(x, v, Dv) - L(x, u, Du) \right) + L(x, u, Du). \]
This implies
\begin{align*}
L(x, v, Dv) - L(x, u, Du) &\ge \frac{L(x, w, Dw) - L(x, u, Du)}{t} \\
&= \frac{L(x, w, Dw) - L(x, w, Du) + L(x, w, Du) - L(x, u, Du)}{t} \\
&= \frac{L(x, w, Dw) - L(x, w, Du)}{t} + \frac{L(x, w, Du) - L(x, u, Du)}{t}.
\end{align*}
Taking the limit $t \to 0$, making use of Theorem 5.6.4, and noting that $w \to u$, we get (5.9.8).

Using this property of joint convexity, we can easily establish an existence and uniqueness theorem for the minimizer.

Theorem 5.9.5 Under the assumptions of Theorem 5.9.1, and assuming $L$ is jointly strictly convex in $(z, p)$, there exists $u \in W^{1,q}(\Omega)$ for any $1 < q < \infty$ such that $u$ is the unique minimizer of the Lagrangian variational integral
\[ J[u] = \int_\Omega L(Du, u, x)\,dx. \]

Proof The variational integral $J$ is G-differentiable by Theorem 5.9.1. Let $u_n \xrightarrow{w} u$ in $W^{1,q}(\Omega)$. Then by (5.9.8)
\[ L(x, u_n, Du_n) - L(x, u, Du) \ge L_p(x, u, Du)(Du_n - Du) + L_z(x, u, Du)(u_n - u). \]
Note that, by (5.9.3), integration against $L_p(x, u, Du)$ and $L_z(x, u, Du)$ defines bounded linear functionals, and $u_n \xrightarrow{w} u$, $Du_n \xrightarrow{w} Du$, so
\[ \int_\Omega L_p(x, u, Du)(Du_n - Du)\,dx \to 0 \]
and
\[ \int_\Omega L_z(x, u, Du)(u_n - u)\,dx \to 0. \]
Integrating the inequality over $\Omega$ and then taking the limit inferior therefore gives
\[ \liminf J[u_n] \ge J[u]. \]
So, $J$ is weakly l.s.c., and therefore, by Theorem 5.3.2, there exists a minimizer. The uniqueness of the minimizer follows from the joint strict convexity by an argument similar to that of Theorem 5.3.2, and this is left to the reader as an easy exercise.

Now we are ready to establish the Dirichlet principle for the Lagrangian integral.

5.9.3 Dirichlet Principle for Euler–Lagrange Equation

Theorem 5.9.6 Under the assumptions of Theorem 5.9.1, and assuming $L$ is jointly convex in $(z, p)$, $u \in W^{1,q}(\Omega)$ is a weak solution of the Euler–Lagrange equation if and only if $u$ is a minimizer of the Lagrangian integral.

Proof Let $u \in W^{1,q}(\Omega)$ be a weak solution of the Euler–Lagrange equation. Integrating both sides of the inequality (5.9.8) over $\Omega$ yields
\[ J[v] - J[u] \ge \int_\Omega \left[ L_p(x, u, Du)(Dv - Du) + L_z(x, u, Du)(v - u) \right] dx = 0. \]
This gives $J[v] \ge J[u]$, and hence $u$ is a minimizer. Theorem 5.9.2 gives the other direction.
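This Dirichlet principle can be illustrated with a jointly convex Lagrangian and non-zero boundary data. The following is a minimal sketch, assuming $\Omega = (0, 1)$, $L(p, z, x) = \frac{1}{2}p^2 + \frac{1}{2}z^2$, and boundary values $u(0) = 0$, $u(1) = 1$; these choices, the finite-difference discretization and the function names are illustrative assumptions. The Euler–Lagrange equation is then $-u'' + u = 0$, with exact solution $u(x) = \sinh(x)/\sinh(1)$.

```python
import math

def solve_euler_lagrange(n):
    """Discrete E-L system (2u_i - u_{i-1} - u_{i+1}) / h^2 + u_i = 0,
    with u_0 = 0 and u_n = 1, solved by the Thomas algorithm."""
    h = 1.0 / n
    m = n - 1
    b = [2.0 + h * h] * m            # main diagonal; off-diagonals are -1
    d = [0.0] * m
    d[-1] = 1.0                      # boundary value u_n = 1 moved to the right side
    for i in range(1, m):            # forward elimination
        w = -1.0 / b[i - 1]
        b[i] += w
        d[i] -= w * d[i - 1]
    u = [0.0] * m
    u[-1] = d[-1] / b[-1]
    for i in range(m - 2, -1, -1):   # back substitution
        u[i] = (d[i] + u[i + 1]) / b[i]
    return [0.0] + u + [1.0]

def energy(u):
    """Discrete J[u] = int (1/2 u'^2 + 1/2 u^2) dx (trapezoidal rule for u^2)."""
    h = 1.0 / (len(u) - 1)
    grad = sum((u[i + 1] - u[i]) ** 2 for i in range(len(u) - 1)) / h
    mass = (sum(ui * ui for ui in u[1:-1]) + 0.5 * (u[0] ** 2 + u[-1] ** 2)) * h
    return 0.5 * (grad + mass)

n = 100
x = [i / n for i in range(n + 1)]
u_h = solve_euler_lagrange(n)
exact = [math.sinh(xi) / math.sinh(1.0) for xi in x]
max_error = max(abs(a - b) for a, b in zip(u_h, exact))

# A competitor with the same boundary values: the straight line u(x) = x.
linear = list(x)
print(max_error, energy(u_h), energy(linear))
```

In the continuum, $J[u] = \frac{1}{2}\coth(1) \approx 0.657$ for the exact solution, against $J = \frac{2}{3} \approx 0.667$ for the straight line, so the solution of the Euler–Lagrange equation has the smaller energy among admissible competitors, as the theorem predicts.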
5.10 Variational Problem of Euler–Lagrange Equation

5.10.1 p−Convex Lagrangian Functional

In this section, we solve a variational problem for the Euler–Lagrange equation. Namely, we find a minimizer for the corresponding variational integral, then we show that this minimizer is a weak solution of the Euler–Lagrange equation. In the previous section, we have already solved a variant of this problem, in addition to a Dirichlet principle for the Euler–Lagrange equation, provided the Lagrangian is jointly convex. However, the property of joint convexity is restrictive, and not too many functions satisfy it. For example, the function $f(x, y) = xy$ is convex in $x$ and convex in $y$, but not jointly convex in $(x, y)$. A main motivation for us is the Legendre condition (Theorem 5.8.4), in the sense that the inequality
\[ \sum_{i,j=1}^{n} L_{p_i p_j}(Du, u, x)\,\eta_i \eta_j \ge 0 \]
is essential for the critical point to be a minimizer. The above inequality implies that $L$ is convex in $p$, which seems to be the natural property to replace the joint convexity of $L$, so we adopt this property in the next two results.

A classical result in real analysis shall be invoked here. Recall that Egoroff's theorem states the following: if $\{f_n\}$ is a sequence of measurable functions and $f_n \to f$ a.e. on a set $E$ of finite measure, then for every $\epsilon > 0$, there exists a set $A$ with $\mu(A) < \epsilon$ such that $f_n \to f$ uniformly on $E \setminus A$. The theorem shall be used to prove the following.

Theorem 5.10.1 Let $L = L(p, z, x)$ be the Lagrangian functional that is bounded from below. If $L$ is convex in $p$, then $J[u]$ is weakly l.s.c. on $W^{1,q}(\Omega)$ for any $1 < q < \infty$, where $\Omega$ is $C^1$ open bounded in $\mathbb{R}^n$; that is, for every $u_n \xrightarrow{w} u$ in $W^{1,q}(\Omega)$, we have
\[ J[u] \le \liminf J[u_n]. \]

Proof Let $u_n \xrightarrow{w} u$ in $W^{1,q}(\Omega)$. We divide the proof into three parts. Firstly, since $L$ is convex in $p$, we use Theorem 5.6.4 to get
\[ L(Du_n, u_n, x) - L(Du, u_n, x) \ge L_p(Du, u_n, x)(Du_n - Du), \]
but we know that $L_p(Du, u_n, x)$ is a bounded linear functional, and $Du_n \xrightarrow{w} Du$, so
\[ L_p(Du, u_n, x)(Du_n - Du) \to 0, \]
from which we get
\[ \int_\Omega L(Du_n, u_n, x)\,dx \ge \int_\Omega L(Du, u_n, x)\,dx. \tag{5.10.1} \]
Secondly, since $L$ is bounded from below, $J$ is also bounded from below, so let
\[ m = \liminf J[u_n]. \]
Moreover, by Proposition 5.3.3(3), $(u_n)$ is bounded, so by the Rellich–Kondrachov Theorem 3.10.5, there exists a subsequence $u_{n_m} = u_m$ of $u_n$ such that $u_m$ converges strongly to $u$ in $L^q$, and so by Proposition 5.3.3(1), there exists a subsequence $u_{m_j} = u_j$ of $u_m$ such that $u_j$ converges to $u$ a.e. Now, since $\Omega$ is bounded, by Egoroff's theorem, there exists a subsequence $u_{j_k} = u_k$ of $u_j$ such that $u_k$ converges to $u$ uniformly on some open set $\Omega_\epsilon$ such that $\mu(\Omega \setminus \Omega_\epsilon) < \epsilon$ (where $\mu$ denotes the Lebesgue measure), so we can assume within $\Omega_\epsilon$ that both $u_n$ and $Du_n$ are bounded, and since $L$ is $C^1$, this implies
\begin{align*}
\lim \int_{\Omega_\epsilon} L(Du, u_n, x)\,dx &= \int_{\Omega_\epsilon} \lim L(Du, u_n, x)\,dx \\
&= \int_{\Omega_\epsilon} L(Du, \lim u_n, x)\,dx \\
&= \int_{\Omega_\epsilon} L(Du, u, x)\,dx,
\end{align*}
i.e., we have
\[ \lim \int_{\Omega_\epsilon} L(Du, u_n, x)\,dx = \int_{\Omega_\epsilon} L(Du, u, x)\,dx. \tag{5.10.2} \]
Thirdly, since $L$ is bounded from below by, say, $c > -\infty$, WLOG we can assume $L > 0$ (since we can use the shift transformation $L \to L + c$). Also, we note that as $\epsilon \to 0$, $\Omega_\epsilon \nearrow \Omega$, so we can write
\[ \int_{\Omega_\epsilon} L(Du, u_n, x)\,dx = \int_\Omega \chi_{\Omega_\epsilon} L(Du, u_n, x)\,dx, \]
where $\chi_A$ is the characteristic function which equals 1 on $A$ and zero otherwise. Writing $\epsilon = \frac{1}{n}$, we see that $(\chi_{\Omega_\epsilon} L)$ is an increasing sequence of nonnegative measurable functions that converges to $\chi_\Omega L$. Hence, by the Monotone Convergence Theorem (Theorem 1.1.6),
\[ \lim_{\epsilon \to 0} \int_{\Omega_\epsilon} L(Du, u, x)\,dx = \int_\Omega L(Du, u, x)\,dx. \tag{5.10.3} \]
From (5.10.2) and (5.10.3), we conclude that
\[ \lim \int_\Omega L(Du, u_n, x)\,dx = \int_\Omega L(Du, u, x)\,dx, \]
and from (5.10.1), we obtain
\begin{align*}
\liminf J[u_n] &= \liminf \int_\Omega L(Du_n, u_n, x)\,dx \\
&\ge \liminf \int_\Omega L(Du, u_n, x)\,dx \\
&\ge \liminf \int_{\Omega_\epsilon} L(Du, u_n, x)\,dx \\
&= \int_{\Omega_\epsilon} L(Du, u, x)\,dx,
\end{align*}
and letting $\epsilon \to 0$, (5.10.3) gives $\liminf J[u_n] \ge J[u]$.
5.10.2 Existence of Minimizer

We have seen that the two main conditions to guarantee the existence of minimizers are coercivity and lower semicontinuity. The preceding theorem deals with the latter condition, and we need assumptions to guarantee the former. As in the preceding theorem, we gave conditions on $L$ rather than $J$, so we continue to do that for the existence theorem.

Theorem 5.10.2 Let $L = L(p, z, x)$ be the Lagrangian functional. Suppose that $L$ is bounded from below and convex in $p$. Moreover, suppose there exist $\alpha > 0$ and $\beta \ge 0$ such that
\[ L \ge \alpha |p|^q - \beta, \]
for $1 < q < \infty$. Then there exists $u \in W^{1,q}(\Omega)$, for some $C^1$ open bounded $\Omega \subset \mathbb{R}^n$, such that $u$ is the minimizer of the Lagrangian variational integral
\[ J[u] = \int_\Omega L(Du, u, x)\,dx \]
over the admissible set
\[ \mathcal{A} = \{v \in W^{1,q}(\Omega) : v = g \text{ on } \partial\Omega, \text{ for some } g \in W^{1,q}(\Omega)\}. \]

Remark To avoid triviality of the problem, we assume that $\inf J < \infty$ and $\mathcal{A} \ne \emptyset$.

Proof WLOG we can assume $\beta = 0$, or use the shift $L \to L + \beta$. The bound condition on $L$ implies that
\[ J[u] = \int_\Omega L(Du, u, x)\,dx \ge \alpha \int_\Omega |Du|^q\,dx, \]
hence $J$ is bounded from below, and note also from the preceding theorem that $J$ is weakly l.s.c. So let $u_n \in \mathcal{A}$ be such that $\sup J[u_n] < \infty$. Then
\[ \sup \|Du_n\|_q < \infty. \]
For any $v \in \mathcal{A}$, we have
\[ u_n - v \in W_0^{1,q}(\Omega), \]
so using the Poincaré inequality,
\begin{align*}
\|u_n\|_{L^q} &= \|u_n - v + v\|_{L^q} \\
&\le \|u_n - v\|_{L^q} + \|v\|_{L^q} \\
&\le C\|D(u_n - v)\|_{L^q} + \|v\|_{L^q} \\
&= C\|Du_n - Dv\|_{L^q} + C_1 \\
&\le C_2 + C_1.
\end{align*}
Therefore, $\sup \|u_n\|_q < \infty$, and consequently, $(u_n)$ is bounded in $W^{1,q}(\Omega)$, which shows that $J$ is coercive. Finally, Proposition 5.5.2 shows that $\mathcal{A}$ is weakly closed. The result now follows from Theorem 5.3.2.

Lastly, we prove that the minimizer of the Lagrangian variational integral is a weak solution to the Euler–Lagrange equation.

Theorem 5.10.3 Suppose that the Lagrangian functional $L$ satisfies all the assumptions of Theorems 5.9.1 and 5.10.2. If $u \in W^{1,q}(\Omega)$ is a local minimizer of the functional $J$ over
\[ \mathcal{A} = \{v \in W^{1,q}(\Omega) : v = g \text{ on } \partial\Omega, \text{ for some } g \in W^{1,q}(\Omega)\}, \]
then $u$ is a weak solution of the Euler–Lagrange problem
\[ \begin{cases} -\displaystyle\sum_{i=1}^{n} \frac{\partial}{\partial x_i} L_{p_i}(Du, u, x) + \frac{\partial L}{\partial z}(Du, u, x) = 0, & x \in \Omega, \\ u = g, & x \in \partial\Omega. \end{cases} \]

Proof Same as Theorem 5.9.2.
5.11 Problems

(1) Prove Proposition 5.1.10.
(2) Give an example to show that the result of Mazur's Lemma 5.2.10 doesn't hold for every finite convex combination of the sequence $x_n$.
(3) Prove Theorem 5.2.9 from Mazur's Lemma.
(4) Let $f : X \to \mathbb{R}$ be coercive and weakly l.s.c., defined on a reflexive Banach space $X$. Show that $f$ is bounded from below.
(5) Show that if $f : \mathbb{R} \to \mathbb{R}$ and $f(x) \ge \alpha |x|^p - \beta$ for some $\alpha, \beta > 0$ and $1 < p < \infty$, then $f$ has a minimizer over $\mathbb{R}$.
(6) Give an example of a function $f : \mathbb{R} \to \mathbb{R}$ such that $f$ is coercive, bounded from below, but does not have a minimizer on $\mathbb{R}$.
(7) Let $f : X \to \mathbb{R}$ be convex and l.s.c. If $f < \infty$ and there exists $x_0 \in X$ such that $f(x_0) = -\infty$, then show that $f \equiv -\infty$.
(8) Give an example of a minimizing sequence with no subsequence converging in norm.
(9) Let $\{f_i : i \in I\}$ be a family of convex functionals defined on a Hilbert space. Show that $\sup\{f_i : i \in I\}$ is convex.
(10) Show that if $f$, $g$ are l.s.c. and both are bounded from below, then $f + g$ is l.s.c.
(11) Show that if $f_n$ is a sequence of l.s.c. functions and $f_n$ converges uniformly to $f$, then $f$ is l.s.c.
(12) Suppose $f$ is bounded from below, convex, and l.s.c. Prove or disprove: $f$ is continuous on its domain.
(13) Use Proposition 5.3.3(3) to prove the statement of Theorem 4.9.5(2).
(14) (a) Show that a function $f$ is coercive if and only if its lower level sets $\{x : f(x) \le b\}$, $b \in \mathbb{R}$, are bounded.
 (b) Deduce from (a) that if $f : H \to (-\infty, \infty]$ is proper and coercive, then every minimizing sequence of $f$ is bounded.
(15) A function is called quasi-convex if its lower level sets $\{x : f(x) \le b\}$, $b \in \mathbb{R}$, are convex.
 (a) Show that every convex function is quasi-convex.
 (b) Show that every monotone function is quasi-convex.
 (c) Let $f : H \to (-\infty, \infty]$ be quasi-convex. Show that $f$ is l.s.c. if and only if $f$ is weakly l.s.c.
(16) Let $f : H \to (-\infty, \infty]$ be quasi-convex and l.s.c., and suppose $C \subset H$ is weakly closed. If there exists $b \in \mathbb{R}$ such that $C \cap \{x : f(x) \le b\}$ is bounded, prove that there exists a minimizer of $f$ over $C$.
(17) (a) Show that the Dirichlet integral $I$ is not coercive on $W_0^{1,p}(\Omega)$ for $p > 2$.
 (b) Show that the Dirichlet integral $I$ is strictly convex on $W^{1,p}(\Omega)$ for $p \ge 2$.
(18) (a) Show that $\|x\|^2$ is weakly l.s.c.
 (b) Determine the values of $p$ for which $\|x\|^p$ is weakly l.s.c.
(19) Let $F : \mathbb{R}^n \to (-\infty, \infty]$ be l.s.c. and convex. Let $\Omega$ be bounded Lipschitz in $\mathbb{R}^n$, and define the variational integral $J : W^{1,p}(\Omega) \to \mathbb{R}$,
\[ J[u] = \int_\Omega F(Du)\,dx. \]
 (a) Show that $J$ is convex.
 (b) Show that $J$ is l.s.c.
(20) Consider the variational integral $J : H^1(0, 1) \to \mathbb{R}$,
\[ J[u] = \int_0^1 \left( \left( |u'| - 1 \right)^2 + u^2 \right) dx. \]
 (a) Show that $J$ is coercive.
 (b) Show that $J$ is not convex.
 (c) Show that the minimum of $J$ is zero but $J$ doesn't attain its minimum.
(21) Consider the variational integral
\[ J[u] = \int_0^1 \left( (u')^2 - 1 \right)^2 dx. \]
 (a) Show that $J$ has minimum value 0.
 (b) Show that there exists no minimizer over $C^1[0, 1]$.
 (c) Show that there exists a minimizer over $C[0, 1]$.
(22) Consider the variational integral
\[ J[u] = \int_{-1}^{1} \left( u' - 2|x| \right)^2 dx. \]
 (a) Show that $J$ has minimum value 0.
 (b) Show that there exists no minimizer over $C^2[-1, 1]$.
 (c) Show that there exists a minimizer over $C^1[-1, 1]$.
(23) Consider the variational integral $J : H_0^1(\Omega) \to \mathbb{R}$, where $\Omega$ is bounded in $\mathbb{R}^n$, given by
\[ J[u] = \int_\Omega \left( \frac{1}{2}|Du|^2 + \frac{1}{3}u^3 + f(x)u \right) dx. \]
 (a) Show that $J$ is strictly convex.
 (b) Show that $J$ is l.s.c.
 (c) Show that there exists a minimizer for $J$.
(d) Show that the minimizer of $J$ is a weak solution of the problem
\[ \begin{cases} \nabla^2 u - u^2 = f, & x \in \Omega, \\ u = 0, & x \in \partial\Omega. \end{cases} \]
(24) Let $\psi : \mathbb{R}^n \to \mathbb{R}$ be l.s.c. and convex. Consider the functional $J : W^{1,p}(\Omega) \to \mathbb{R}$, for some open Lipschitz domain $\Omega$ in $\mathbb{R}^n$ and $1 < p < \infty$, defined by
\[ J[u] = \int_\Omega \psi(Du)\,dx. \]
Show that $J$ is weakly l.s.c.
(25) Let $f : \mathbb{R}^2 \to \mathbb{R}$ be given by
\[ f(x, y) = \begin{cases} \dfrac{x}{y}\,(x^2 + y^2), & (x, y) \in \mathbb{R}^2 \setminus \{(0, 0)\}, \\ 0, & (x, y) = (0, 0). \end{cases} \]
(a) Show that $f$ is not continuous at $(0, 0)$. (b) Find the G-derivative of $f$ at $(0, 0)$.
(26) Prove Proposition 5.6.2.
(27) Let $f$ be G-differentiable on a normed space $X$. Prove that $f$ is convex if and only if
\[ \langle Df(v) - Df(u), v - u \rangle \ge 0 \]
for all $u, v \in X$.
(28) Consider the integral functional $J : C^1[0, 1] \to \mathbb{R}$ defined by
\[ J[u] = \int_0^1 |u|\,dx. \]
(a) Find the G-derivative of $J$ at all $u \ne 0$. (b) Show that the G-derivative does not exist at $u = 0$.
(29) Let $f : X \to \mathbb{R}$ for some Banach space $X$. Show that
\[ |f(x_1) - f(x_2)| \le \sup_{t \in [0,1]} \|Df(t x_1 + (1 - t) x_2)\|_{X^*} \, \|x_2 - x_1\|_X. \]
(30) Show that if $f$ is Fréchet differentiable at $x$, then it is G-differentiable at $x$.
(31) Show that $\|\cdot\|_p^p$ is not G-differentiable at $u = 0$.
(32) Consider the integral functional $J : W^{1,p}(\Omega) \to \mathbb{R}$, $\Omega \subset \mathbb{R}^n$, $1 < p < \infty$, defined by
\[ J[u] = \int_\Omega |Du|^p\,dx. \]
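(a) Show that $J$ is convex. (b) Find the G-derivative of $J$. (c) Show that $J$ has a minimizer $u$ over the set
\[ A = \{ v \in W^{1,p}(\Omega) : v - g \in W_0^{1,p}(\Omega), \text{ for some } g \in W^{1,p}(\Omega) \}. \]
(d) Show that the minimizer $u$ is the weak solution of the problem
\[ \begin{cases} \operatorname{div}\left( |\nabla u|^{p-2} \nabla u \right) = 0, & x \in \Omega, \\ u = g, & x \in \partial\Omega. \end{cases} \]
(33) Let $J[u] = B[u, u] + L[u]$ for some bilinear form $B$ and linear $L$. Show that $D^2 J(u, v)w = B[v, w] + B[w, v]$.
(34) Find the variational integral for the equation
\[ \operatorname{div}\left( \frac{\nabla u}{\left( 1 + |\nabla u|^2 \right)^{1/2}} \right) = 0. \]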
(35) Consider the problem of minimizing the variational integral (5.7.1) over $H_0^1(\Omega)$, where $\Omega$ is bounded in at least one direction in $\mathbb{R}^n$ and $f \in L^2(\Omega)$. (a) Prove the following inequality:
\[ \inf_{v \in X} J[v] \le \frac{1}{2} J[u_i] + \frac{1}{2} J[u_j] - \frac{1}{4} \int_\Omega |\nabla(u_i - u_j)|^2\,dx. \]
(b) Show that $\|\nabla(u_i - u_j)\|_{L^2} \to 0$. (c) Show that $(u_n)$ is Cauchy in $H_0^1$. (d) Use (c) to prove the existence of the minimizer. (e) Use (a) to prove the uniqueness of the minimizer.
(36) (a) Show that $f(x) = \|x\|^p$ is strictly convex for $1 < p < \infty$. (b) Show that the variational integral defined in (5.7.4) is strictly convex. (c) Deduce that the weak solution to problem (5.7.2) is unique.
(37) Alternative proof for (5.7.9): Consider the problem (5.7.2) with all the assumptions. (a) Show that $A\,Dv \in L^2(\Omega)$. (b) Show that
\[ 0 \le \int_\Omega A(x)\, |D(v_n - v)|^2\,dx. \]
(c) Show that
\[ \int_\Omega A(x)\, Dv_n\, Dv\,dx \to \int_\Omega A(x)\, |Dv|^2\,dx. \]
(d) Prove (5.7.9) in the proof of Theorem 5.7.5.
(38) In Theorem 5.7.5, find the associated variational integral, then use any method to prove the theorem for the same symmetric operator with the boundary condition: (a) $u = f$ on $\partial\Omega$. (b) $\dfrac{\partial u}{\partial n} = 0$ on $\partial\Omega$. (c) $\dfrac{\partial u}{\partial n} = g$ on $\partial\Omega$.
(39) Consider the following Neumann problem for the Poisson equation
\[ \begin{cases} -\nabla^2 u = f, & x \in \Omega, \\ \dfrac{\partial u}{\partial n} = g, & x \in \partial\Omega, \end{cases} \]
for some $f, g \in L^2(\Omega)$, where $\Omega$ is bounded in $\mathbb{R}^n$. (a) Find the associated variational integral $J[u]$ of the problem. (b) Define the minimization problem and its admissible set. (c) Show that $u \in H^1(\Omega)$ is the weak solution of the problem if and only if $u$ is the minimizer of the variational integral $J$ obtained in (a) over the admissible set obtained in (b).
(40) Prove that there exists a minimizer for the functional $J : H_0^2(\Omega) \to \mathbb{R}$ given by
\[ J[u] = \int_\Omega \left( \frac{1}{2} |D^2 u|^2 - f(x)\,Du - g(x)\,u \right) dx \]
for some bounded $\Omega \subset \mathbb{R}^n$ and $f, g \in C_c^\infty(\Omega)$, over the admissible set
\[ A = \{ u \in H_0^2(\Omega) : u = 0 \text{ on } \partial\Omega \}. \]
(41) The $p$-Laplacian operator $\Delta_p$ is defined by
\[ \Delta_p u = \nabla \cdot \left( |\nabla u|^{p-2} \nabla u \right). \]
(a) Find the G-derivative of $u \mapsto \|u\|_{L^p}^p$. (b) Consider the $p$-Laplace equation (for $1 < p < \infty$)
\[ \begin{cases} -\Delta_p u = f, & x \in \Omega, \\ u = 0, & x \in \partial\Omega. \end{cases} \]
Show that the corresponding variational integral is
\[ J_p[u] = \frac{1}{p} \int_\Omega |\nabla u|^p\,dx - \int_\Omega f u\,dx. \]
(c) Show that $J_p$ is G-differentiable and convex. Deduce that it is weakly l.s.c. (d) Show that the functional $J_p[\cdot]$ admits a unique minimizer over $H_0^1(\Omega)$. (e) Establish a Dirichlet principle between the $p$-Laplace equation and its variational integral.
(42) Find the Euler–Lagrange equation corresponding to the following Lagrangians over $\{u \in C^1 : u = 0 \text{ on } \partial\Omega\}$. (a) $L(p, z, x) = \frac{1}{2} |p|^2 + F(z)$ for some nonlinear function $F$. (b) $L(p, z, x) = \frac{1}{2} |p|^q + z^q$ for some $1 < q < \infty$. (c) $L(p, z, x) = \frac{1}{r+2} |p|^{r+2} + f z$, $f \in C^1$.
(43) Show that the functional defined by
\[ J[u] = \int_{-1}^{1} x \sqrt{1 + (u')^2}\,dx \]
has no minimizer on $C^1[-1, 1]$.
(44) Determine whether the functional $J : C^1[0, 1] \to \mathbb{R}$ defined by
\[ J[u] = \int_0^1 \sqrt{u^2 + (u')^2}\,dx \]
has a minimizer over $A = \{u \in C^1[0, 1] : u(0) = 0 \text{ and } u(1) = 1\}$.
(45) Consider the variational integral $J : H_0^1(\Omega) \to \mathbb{R}$ given by
\[ J[u] = \int_\Omega \left( \frac{1}{2} |Du|^2 - f(x)\,Du \right) dx \]
for some bounded $\Omega \subset \mathbb{R}^n$ and $f \in C_c^\infty(\Omega)$. (a) Prove that there exists a minimizer over the admissible set $A = \{u \in H_0^1(\Omega) : u = 0 \text{ on } \partial\Omega\}$. (b) Find the corresponding Euler–Lagrange equation.
(46) Consider the variational integral $J : C[a, b] \to \mathbb{R}$ given by
\[ J[u] = \int_a^b \sqrt{1 + (u')^2}\,dx. \]
(a) Find the corresponding Euler–Lagrange equation. (b) Show that the associated bilinear form $B[u, v]$ is positive. (c) Conclude that the line is the shortest distance between two points.
(47) Find the Euler–Lagrange equation corresponding to the quadratic form
\[ Q[u] = \int \left( (u')^2 - u^2 \right) dx. \]
(48) Determine whether the following functions are jointly convex or not. (a) $L(p, z, x) = z p_i$. (b) $L(p, z, x) = |p|^2 - z$. (c) $L(p, z, x) = \frac{1}{2} |p|^2 - zx$.
(49) Show that the Poisson variational integral is jointly strictly convex in $z$ and $p$.
(50) Prove or disprove: (a) If $f(x, y)$ and $g(y, z)$ are strictly convex functions, then $h(x, y, z) = f(x, y) + g(y, z)$ is strictly convex. (b) If $f(x, y)$ is convex in $(x, y)$, strictly convex in $x$, and strictly convex in $y$, then $f$ is strictly convex. (c) If $f(x, y)$ is jointly convex in $(x, y)$, strictly convex in $x$, and strictly convex in $y$, then $f$ is jointly strictly convex.
(51) A function $f : \mathbb{R}^n \to \mathbb{R}$ is said to be strongly convex if there exists $\beta > 0$ such that $f(x) - \beta \|x\|^2$ is convex. (a) Show that if a function is strongly convex, then it is strictly convex. (b) Give an example of a function that is strictly convex but not strongly convex. (c) Show that if a function is strongly convex, then
\[ f(y) \ge f(x) + \nabla f(x) \cdot (y - x) + \frac{\beta}{2} \|y - x\|^2. \]
(d) In Theorem 5.10.2, in addition to all the assumptions of the theorem, if $L = L(x, p)$ is strongly convex, show that the minimizer predicted by the theorem is unique.
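The following Python sketch is a numerical companion to Problem (21) and is not part of the original problem set. It assumes the boundary conditions $u(0) = u(1) = 0$, which the problem does not state, and the helper names `sawtooth` and `J` are hypothetical. For piecewise-linear zigzag functions $u_n$ with slopes $\pm 1$, the integrand $((u')^2 - 1)^2$ vanishes on every segment, so $J[u_n]$ should be zero up to rounding while $\max |u_n| = 1/(2n)$ shrinks; the zero function, by contrast, gives $J[0] = 1$.

```python
def sawtooth(n):
    """Nodes and nodal values of a zigzag with n teeth on [0, 1].

    The function is piecewise linear with slopes +1 and -1 and
    satisfies u(0) = u(1) = 0 (an assumed boundary condition).
    """
    m = 2 * n  # number of linear segments
    xs = [k / m for k in range(m + 1)]
    us = [1.0 / m if k % 2 == 1 else 0.0 for k in range(m + 1)]
    return xs, us


def J(xs, us):
    """Integral of ((u')^2 - 1)^2 over [0, 1] for piecewise-linear u.

    The slope is constant on each segment, so the integral is a finite
    sum (exact up to floating-point rounding).
    """
    total = 0.0
    for k in range(len(xs) - 1):
        h = xs[k + 1] - xs[k]
        slope = (us[k + 1] - us[k]) / h
        total += (slope * slope - 1.0) ** 2 * h
    return total


if __name__ == "__main__":
    for n in (1, 4, 16, 64):
        xs, us = sawtooth(n)
        # max(us) is the sup norm of u_n; J(xs, us) is its energy
        print(n, max(us), J(xs, us))

    # Energy of the zero function on the same mesh
    xs, _ = sawtooth(4)
    print("J[0] =", J(xs, [0.0] * len(xs)))
```

Under these assumptions the zigzags converge uniformly to zero while their energies stay at zero and the limit has energy one, which illustrates why lower semicontinuity can fail when the integrand is not convex in the derivative.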
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024 A. Khanfer, Applied Functional Analysis, https://doi.org/10.1007/978-981-99-3788-2
Index
A Adjoint of general operators, 52 Adjoint operator, 5 Adjoint operator on Hilbert space, 7 Admissible set, 295 Arzela–Ascoli theorem, 9
B Banach–Alaoglu theorem, 303 Banach space, 3 Bessel operator, 78 Bessel’s inequality, 4 Bilinear form, 251 Bolzano–Weierstrass theorem, 296 Boundary regularity theorem, 287 Bounded below, 29 Bounded inverse theorem, 4 Bounded linear operator, 5
C Caccioppoli’s inequality, 279 Carathéodory function, 344 Cauchy–Schwarz inequality, 3 Cauchy’s inequality, 252 Chain rule for Sobolev spaces, 171 Chebyshev, 78 Classical solution, 246 Closed graph theorem, 4 Closed operator, 47 Closed range theorem, 48 Coercivity, 307 Compact embedding, 220 Compact inclusion, 163 Compact operator, 8 Convex function, 299
Convex hull, 299 Convex set, 299 Convolution of distribution, 125 Cut-off function, 144
D Deficiency spaces, 54 Delta distribution, 91 Delta sequence, 91 Densely defined operator, 52 Diagonal operator, 39 Diffeomorphism, 188 Difference quotient, 276 Directional derivative, 322 Direct method, 306 Dirichlet energy, 311 Dirichlet integral, 311 Dirichlet principle, 311 Dirichlet problem, 243 Distribution, 83 Distributional derivative, 97 Dominated convergence theorem, 2 Dual of Sobolev space, 175
E Eigenfunction, 24 Eigenvalue, 24 Eigenvector, 24 Elliptic bilinear map, 252 Elliptic equation, 239 Embedding, 219 Epigraph, 297 Euler–Lagrange equation, 340 Extended Holder’s inequality, 202 Extension operator, 193
Extreme value theorem, 296
F Fatou’s Lemma, 2 Finite-rank operator, 13 First variation, 338 Frechet derivative, 325 Fredholm alternative, 43 Fredholm alternative for elliptic operators, 270 Fredholm operator, 44 Functions of slow growth, 113 Fundamental lemma of calculus of variations, 148
G Gagliardo–Nirenberg–Sobolev inequality, 204 Garding’s inequality, 253 Gateaux derivative, 324 Gateaux differential, 323 Gaussian function, 119 Generalized Plancherel theorem, 154 Green’s function, 63
H Heine–Borel theorem, 302 Helmholtz equation, 242 Higher order interior regularity theorem, 286 High-order Sobolev estimate, 225 Hilbert–Schmidt operator, 15 Hilbert–Schmidt theorem, 35 Hilbert space, 3 Holder-continuous function, 210 Holder’s inequality, 2 Holder space, 211
I Inclusion map, 219 Infimum, 296 Inner product space, 3 Interior regularity theorem, 283 Interior smoothness theorem, 287 Interpolation inequality, 203 Invariant subspace, 34
J Jointly convex function, 346
K Kakutani’s theorem, 303 Kronecker delta function, 62
L Lagrangian integral, 338 Laguerre operator, 78 Laplace equation, 241 Laplacian operator, 64 Lax–Milgram theorem, 261 Lebesgue space, 1 Legendre operator, 78 Lipschitz domain, 187 Locally finite cover, 146 Locally integrable function, 84 Local Sobolev space, 163 Lower semicontinuous, 297
M Mazur’s lemma, 304 Mazur’s theorem, 302 Meyers-Serrin theorem, 177 Minimization problem, 295 Minimizer, 295 Minimizing sequence, 306 Minkowski’s inequality, 2 Mollifier, 141 Momentum operator, 69 Monotone convergence theorem, 2 Morrey’s inequality, 213 Multidimensional Fourier transform, 102
N Nested inequality, 203 Neumann series, 32 Normed space, 1
O Open mapping theorem, 4
P Parseval’s identity, 4 Partition of unity, 146 Plancherel theorem, 104 Poincare inequality, 207 Poincare norm, 249 Poincare–Wirtinger inequality, 249 Poisson equation, 241 Proper function, 307
Q Quotient Sobolev spaces, 250
R Radon-Riesz property, 301 Rapidly decreasing function, 106 Reflexive space, 302 Regular distribution, 85 Regular value, 28 Rellich-Kondrachov theorem, 222 Resolvent, 28 Riesz-Fischer theorem, 2 Riesz Representation Theorem for Hilbert space, 255 Riesz’s lemma, 2
S Schwartz space, 107 Self-adjoint operator, 8 Sequentially lower semicontinuous, 297 Singular distribution, 87 Smooth domain, 187 Smooth functions, 82 Sobolev conjugate, 200 Sobolev embedding theorem, 226 Sobolev exponent, 202 Sobolev’s inequality, 208 Sobolev space, 156 Spectral mapping theorem, 33 Spectral theorem for self-adjoint compact operators, 39 Spectral theorem of elliptic operator, 274 Spectrum, 29 Strictly convex, 300 Strongly diffeomorphism, 189 Strong solution, 246 Sturm–Liouville operator, 67 Subordinate, 146
T Tempered distribution, 113 Test function, 83 Toeplitz theorem, 51
U Uniform boundedness principle, 4 Uniformly elliptic operator, 240 Upper semicontinuous, 297
V Variational integral, 311 Variational problem, 295 Volterra equation, 45
W Weak derivative, 134 Weak formulation of elliptic equation, 245 Weakly bounded set, 302 Weakly closed, 301 Weakly closed set, 301 Weakly compact, 302 Weakly compact set, 302 Weak convergence, 301 Weakly differentiable, 134 Weakly lower semicontinuous, 303 Weakly sequentially closed set, 301 Weakly sequentially compact set, 302 Weak solution, 245 Weak topology, 301 Weierstrass’s example, 314 Weyl’s lemma, 275
Z Zero-boundary Sobolev space, 166, 179 Zero extension, 182