Invitation to Partial Differential Equations 0821836404, 9780821836408

This book is based on notes from a beginning graduate course on partial differential equations. Prerequisites for using

253 98 2MB

English Pages 319 [341] Year 2020

Table of contents :
Cover
Title page
Foreword
Preface
Selected notational conventions
Chapter 1. Linear differential operators
1.1. Definition and examples
1.2. The total and the principal symbols
1.3. Change of variables
1.4. The canonical form of second-order operators with constant coefficients
1.5. Characteristics. Ellipticity and hyperbolicity
1.6. Characteristics and the canonical form of second-order operators and second-order equations for 𝑛=2
1.7. The general solution of a homogeneous hyperbolic equation with constant coefficients for 𝑛=2
1.8. Appendix. Tangent and cotangent vectors
1.9. Problems
Chapter 2. One-dimensional wave equation
2.1. Vibrating string equation
2.2. Unbounded string. The Cauchy problem. D’Alembert’s formula
2.3. A semibounded string. Reflection of waves from the end of the string
2.4. A bounded string. Standing waves. The Fourier method (separation of variables method)
2.5. Appendix. The calculus of variations and classical mechanics
2.6. Problems
Chapter 3. The Sturm-Liouville problem
3.1. Formulation of the problem
3.2. Basic properties of eigenvalues and eigenfunctions
3.3. The short-wave asymptotics
3.4. The Green’s function and completeness of the system of eigenfunctions
3.5. Problems
Chapter 4. Distributions
4.1. Motivation of the definition. Spaces of test functions
4.2. Spaces of distributions
4.3. Topology and convergence in the spaces of distributions
4.4. The support of a distribution
4.5. Differentiation of distributions and multiplication by a smooth function
4.6. A general notion of the transposed (adjoint) operator. Change of variables. Homogeneous distributions
4.7. Appendix. Minkowski inequality
4.8. Appendix. Completeness of distribution spaces
4.9. Problems
Chapter 5. Convolution and Fourier transform
5.1. Convolution and direct product of regular functions
5.2. Direct product of distributions
5.3. Convolution of distributions
5.4. Other properties of convolution. Support and singular support of a convolution
5.5. Relation between smoothness of a fundamental solution and that of solutions of the homogeneous equation
5.6. Solutions with isolated singularities. A removable singularity theorem for harmonic functions
5.7. Estimates of derivatives of a solution of a hypoelliptic equation
5.8. Fourier transform of tempered distributions
5.9. Applying the Fourier transform to find fundamental solutions
5.10. Liouville’s theorem
5.11. Problems
Chapter 6. Harmonic functions
6.1. Mean-value theorems for harmonic functions
6.2. The maximum principle
6.3. Dirichlet’s boundary value problem
6.4. Hadamard’s example
6.5. Green’s function for the Laplacian
6.6. Hölder regularity
6.7. Explicit formulas for Green’s functions
6.8. Problems
Chapter 7. The heat equation
7.1. Physical meaning of the heat equation
7.2. Boundary value problems for the heat and Laplace equations
7.3. A proof that the limit function is harmonic
7.4. A solution of the Cauchy problem for the heat equation and Poisson’s integral
7.5. The fundamental solution for the heat operator. Duhamel’s formula
7.6. Estimates of derivatives of a solution of the heat equation
7.7. Holmgren’s principle. The uniqueness of solution of the Cauchy problem for the heat equation
7.8. A scheme for solving the first and second initial-boundary value problems by the Fourier method
7.9. Problems
Chapter 8. Sobolev spaces. A generalized solution of Dirichlet’s problem
8.1. Spaces 𝐻^{𝑠}(Ω)
8.2. Spaces \HH^{𝑠}(Ω)
8.3. Dirichlet’s integral. The Friedrichs inequality
8.4. Dirichlet’s problem (generalized solutions)
8.5. Problems
Chapter 9. The eigenvalues and eigenfunctions of the Laplace operator
9.1. Symmetric and selfadjoint operators in Hilbert space
9.2. The Friedrichs extension
9.3. Discreteness of spectrum for the Laplace operator in a bounded domain
9.4. Fundamental solution of the Helmholtz operator and the analyticity of eigenfunctions of the Laplace operator at the interior points. Bessel’s equation
9.5. Variational principle. The behavior of eigenvalues under variation of the domain. Estimates of eigenvalues
9.6. Problems
Chapter 10. The wave equation
10.1. Physical problems leading to the wave equation
10.2. Plane, spherical, and cylindric waves
10.3. The wave equation as a Hamiltonian system
10.4. A spherical wave caused by an instant flash and a solution of the Cauchy problem for the three-dimensional wave equation
10.5. The fundamental solution for the three-dimensional wave operator and a solution of the nonhomogeneous wave equation
10.6. The two-dimensional wave equation (the descent method)
10.7. Problems
Chapter 11. Properties of the potentials and their computation
11.1. Definitions of potentials
11.2. Functions smooth up to Γ from each side, and their derivatives
11.3. Jumps of potentials
11.4. Calculating potentials
11.5. Problems
Chapter 12. Wave fronts and short-wave asymptotics for hyperbolic equations
12.1. Characteristics as surfaces of jumps
12.2. The Hamilton-Jacobi equation. Wave fronts, bicharacteristics, and rays
12.3. The characteristics of hyperbolic equations
12.4. Rapidly oscillating solutions. The eikonal equation and the transport equations
12.5. The Cauchy problem with rapidly oscillating initial data
12.6. Problems
Chapter 13. Answers and hints. Solutions
Bibliography
Index
Back Cover

Recommend Papers

Partial Differential Equations

120 71 7MB Read more

Partial Differential Equations 9781442653993

This book is an attempt to make available to the student a coherent modern view of the theory of partial differential eq

137 99 10MB Read more

Partial Differential Equations

104 48 10MB Read more

Analysis of Partial Differential Equations

117 42 507KB Read more

Partial Differential Equations 9780821807729, 0821807722

This text gives a comprehensive survey of modern techniques in the theoretical study of partial differential equations (

576 40 5MB Read more

Partial Differential Equations: An Introduction 9780367613228, 9781003105183

608 87 10MB Read more

Partial Differential Equations 9780470054567, 9781118313169, 111831316X

Cover; Title Page; Copyright Page; Preface; Contents; Chapter 1: Where PDEs Come From; What is a Partial Differential Eq

460 20 3MB Read more

Classical Aspects of Partial Differential Equations

447 82 16MB Read more

Partial differential equations [1 ed.] 9781470475055

This book is a gem. It fills the gap between the standard introductory material on PDEs that an undergraduate is likely

212 25 36MB Read more

Elements of Partial Differential Equations 9783110206753, 9783110191240

This textbook presents a first introduction to PDEs on an elementary level, enabling the reader to understand what parti

191 107 28MB Read more

Invitation to Partial Differential Equations
0821836404, 9780821836408

Author / Uploaded
Maxim Braverman

0 0 0
Like this paper and download? You can publish your own PDF file online for free in a few minutes! Sign Up

File loading please wait...

Citation preview

GRADUATE STUDIES I N M AT H E M AT I C S

205

Invitation to Partial Differential Equations Mikhail Shubin E D I T E D BY

Maxim Braverman Robert McOwen Peter Topalov

10.1090/gsm/205

Invitation to Partial Differential Equations

GRADUATE STUDIES I N M AT H E M AT I C S

205

Invitation to Partial Differential Equations Mikhail Shubin E D I T E D BY

Maxim Braverman Robert McOwen Peter Topalov

EDITORIAL COMMITTEE Daniel S. Freed (Chair) Bjorn Poonen Gigliola Staﬃlani Jeﬀ A. Viaclovsky This work was originally published in Russian by MCNMO under the title “Lekcii c ob uravnenih matematiqesko i fiziki”2001. The present translation was created under license for the American Mathematical Society and is published by permission. 2010 Mathematics Subject Classiﬁcation. Primary 35-01.

For additional information and updates on this book, visit www.ams.org/bookpages/gsm-205

Library of Congress Cataloging-in-Publication Data Names: Shubin, M. A. (Mikhail Aleksandrovich), 1944- author. | Braverman, Maxim, 1966- editor. | McOwen, Robert C., editor. | Topalov, P., 1968- editor. Title: Invitation to partial diﬀerential equations / Mikhail Shubin ; edited by Maxim Braverman, Robert McOwen, Peter Topalov. Description: Providence, Rhode Island : American Mathematical Society, [2020] | Series: Graduate studies in mathematics, 1065-7339 ; 205 | Includes bibliographical references and index. Identiﬁers: LCCN 2020001923 | ISBN 9780821836408 (hardcover) | ISBN 9781470456979 (ebook) Subjects: LCSH: Diﬀerential equations, Partial–Textbooks. | Diﬀerential equations, Partial– Problems, exercises, etc. | AMS: Partial diﬀerential equations – Instructional exposition (textbooks, tutorial papers, etc.). Classiﬁcation: LCC QA374 .S518 2020 | DDC 515/.353–dc23 LC record available at https://lccn.loc.gov/2020001923

Copying and reprinting. Individual readers of this publication, and nonproﬁt libraries acting for them, are permitted to make fair use of the material, such as to copy select pages for use in teaching or research. Permission is granted to quote brief passages from this publication in reviews, provided the customary acknowledgment of the source is given. Republication, systematic copying, or multiple reproduction of any material in this publication is permitted only under license from the American Mathematical Society. Requests for permission to reuse portions of AMS publication content are handled by the Copyright Clearance Center. For more information, please visit www.ams.org/publications/pubpermissions. Send requests for translation rights and licensed reprints to [email protected]. c 2020 by the American Mathematical Society. All rights reserved. The American Mathematical Society retains all rights except those granted to the United States Government. Printed in the United States of America. ∞ The paper used in this book is acid-free and falls within the guidelines

established to ensure permanence and durability. Visit the AMS home page at https://www.ams.org/ 10 9 8 7 6 5 4 3 2 1

25 24 23 22 21 20

Contents

Foreword

xi

Preface

xiii

Selected notational conventions

xvii

Chapter 1. Linear diﬀerential operators

1

§1.1. Deﬁnition and examples

1

§1.2. The total and the principal symbols

2

§1.3. Change of variables

4

§1.4. The canonical form of second-order operators with constant coeﬃcients

5

§1.5. Characteristics. Ellipticity and hyperbolicity

7

§1.6. Characteristics and the canonical form of second-order operators and second-order equations for n = 2

8

§1.7. The general solution of a homogeneous hyperbolic equation with constant coeﬃcients for n = 2

11

§1.8. Appendix. Tangent and cotangent vectors

12

§1.9. Problems

14

Chapter 2. One-dimensional wave equation

15

§2.1. Vibrating string equation

15

§2.2. Unbounded string. The Cauchy problem. D’Alembert’s formula

20

v

vi

Contents

§2.3. A semibounded string. Reﬂection of waves from the end of the string

24

§2.4. A bounded string. Standing waves. The Fourier method (separation of variables method)

26

§2.5. Appendix. The calculus of variations and classical mechanics

33

§2.6. Problems

39

Chapter 3. The Sturm-Liouville problem

43

§3.1. Formulation of the problem

43

§3.2. Basic properties of eigenvalues and eigenfunctions

44

§3.3. The short-wave asymptotics

47

§3.4. The Green’s function and completeness of the system of eigenfunctions

50

§3.5. Problems

54

Chapter 4. Distributions

57

§4.1. Motivation of the deﬁnition. Spaces of test functions

57

§4.2. Spaces of distributions

63

§4.3. Topology and convergence in the spaces of distributions

67

§4.4. The support of a distribution

70

§4.5. Diﬀerentiation of distributions and multiplication by a smooth function

74

§4.6. A general notion of the transposed (adjoint) operator. Change of variables. Homogeneous distributions

87

§4.7. Appendix. Minkowski inequality

91

§4.8. Appendix. Completeness of distribution spaces

94

§4.9. Problems

96

Chapter 5. Convolution and Fourier transform §5.1. Convolution and direct product of regular functions

99 99

§5.2. Direct product of distributions

101

§5.3. Convolution of distributions

104

§5.4. Other properties of convolution. Support and singular support of a convolution

107

§5.5. Relation between smoothness of a fundamental solution and that of solutions of the homogeneous equation

109

§5.6. Solutions with isolated singularities. A removable singularity theorem for harmonic functions

113

Contents

vii

§5.7. Estimates of derivatives of a solution of a hypoelliptic equation

114

§5.8. Fourier transform of tempered distributions

116

§5.9. Applying the Fourier transform to ﬁnd fundamental solutions

120

§5.10. Liouville’s theorem

121

§5.11. Problems

123

Chapter 6. Harmonic functions

127

§6.1. Mean-value theorems for harmonic functions

127

§6.2. The maximum principle

129

§6.3. Dirichlet’s boundary value problem

131

§6.4. Hadamard’s example

133

§6.5. Green’s function for the Laplacian

136

§6.6. H¨older regularity

140

§6.7. Explicit formulas for Green’s functions

143

§6.8. Problems

147

Chapter 7. The heat equation

151

§7.1. Physical meaning of the heat equation

151

§7.2. Boundary value problems for the heat and Laplace equations

153

§7.3. A proof that the limit function is harmonic

155

§7.4. A solution of the Cauchy problem for the heat equation and Poisson’s integral

156

§7.5. The fundamental solution for the heat operator. Duhamel’s formula

161

§7.6. Estimates of derivatives of a solution of the heat equation

164

§7.7. Holmgren’s principle. The uniqueness of solution of the Cauchy problem for the heat equation

165

§7.8. A scheme for solving the ﬁrst and second initial-boundary value problems by the Fourier method

168

§7.9. Problems

170

Chapter 8. Sobolev spaces. A generalized solution of Dirichlet’s problem §8.1. Spaces

171

H s (Ω)

171

˚s (Ω) §8.2. Spaces H

177

viii

Contents

§8.3. Dirichlet’s integral. The Friedrichs inequality

181

§8.4. Dirichlet’s problem (generalized solutions)

183

§8.5. Problems

190

Chapter 9. The eigenvalues and eigenfunctions of the Laplace operator

193

§9.1. Symmetric and selfadjoint operators in Hilbert space

193

§9.2. The Friedrichs extension

197

§9.3. Discreteness of spectrum for the Laplace operator in a bounded domain

201

§9.4. Fundamental solution of the Helmholtz operator and the analyticity of eigenfunctions of the Laplace operator at the interior points. Bessel’s equation

202

§9.5. Variational principle. The behavior of eigenvalues under variation of the domain. Estimates of eigenvalues

209

§9.6. Problems

212

Chapter 10. The wave equation

215

§10.1. Physical problems leading to the wave equation

215

§10.2. Plane, spherical, and cylindric waves

220

§10.3. The wave equation as a Hamiltonian system

222

§10.4. A spherical wave caused by an instant ﬂash and a solution of the Cauchy problem for the three-dimensional wave equation 228 §10.5. The fundamental solution for the three-dimensional wave operator and a solution of the nonhomogeneous wave equation

234

§10.6. The two-dimensional wave equation (the descent method)

236

§10.7. Problems

239

Chapter 11. Properties of the potentials and their computation

241

§11.1. Deﬁnitions of potentials

242

§11.2. Functions smooth up to Γ from each side, and their derivatives

244

§11.3. Jumps of potentials

251

§11.4. Calculating potentials

253

§11.5. Problems

256

Contents

ix

Chapter 12. Wave fronts and short-wave asymptotics for hyperbolic equations

257

§12.1. Characteristics as surfaces of jumps

257

§12.2. The Hamilton-Jacobi equation. bicharacteristics, and rays

262

Wave fronts,

§12.3. The characteristics of hyperbolic equations

269

§12.4. Rapidly oscillating solutions. The eikonal equation and the transport equations

271

§12.5. The Cauchy problem with rapidly oscillating initial data

281

§12.6. Problems

287

Chapter 13. Answers and hints. Solutions

289

Bibliography

311

Index

315

Foreword

Misha Shubin, our dear colleague at Northeastern University, worked for many years trying to ﬁnish this book on partial diﬀerential equations. Unfortunately, his failing health prevented him from completing this task. Sergei Gelfand of the American Mathematical Society approached us asking whether we could put the ﬁnishing touches on the manuscript so that it could be published. We agreed to do this hoping to preserve Misha’s writing style which is at the same time rigorous yet accessible. Many chapters in the book required only minor changes; others required substantial revision and additions because they had not been completed before his health deteriorated. It has taken us quite a while to ﬁnish this project, but we now feel that the book is in suﬃciently good condition that it can be shared with the mathematical community. We hope that it will be appreciated as a lasting tribute to the legacy of Professor Mikhail Shubin. Maxim Braverman Robert McOwen Peter Topalov Boston, November 2019

xi

Preface

I shall be telling this with a sigh Somewhere ages and ages hence: Two roads diverged in a wood, and I – I took the one less traveled by, And that has made all the diﬀerence. From The Road Not Taken by Robert Frost

It will not be an overstatement to say that everything we see around us (as well as hear, touch, etc.) can be described by partial diﬀerential equations or, in a slightly more restricted way, by equations of mathematical physics. Among the processes and phenomena whose adequate models are based on such equations, we encounter, for example, string vibrations, waves on the surface of water, sound, electromagnetic waves (in particular, light), propagation of heat and diﬀusion, gravitational attraction of planets and galaxies, behavior of electrons in atoms and molecules, and also the Big Bang that led to the creation of our Universe. It is no wonder, therefore, that today the theory of partial diﬀerential equations (PDE) forms a vast branch of mathematics and mathematical physics that uses methods of all the remaining parts of mathematics (from which PDE theory is inseparable). In turn, PDE inﬂuence numerous parts of mathematics, physics, chemistry, biology, and other sciences. A disappointing conclusion is that no book, to say nothing of a textbook, can be suﬃciently complete in describing this area. Only one possibility remains: to show the reader pictures at an exhibition—several typical xiii

xiv

Preface

methods and problems—chosen according to the author’s taste. In doing so, the author has to severely restrict the choice of material. This book originally contained a practically intact transcript of the PDE course which I taught twice to students in experimental groups at the Department of Mechanics and Mathematics of the Moscow State University and later, in a shortened version, in Northeastern University (Boston). Since then several other people taught this course in other universities (in particular, I. S. Louhivaara was the ﬁrst to teach it outside Moscow State University—namely, at Freie Universit¨at (Berlin). Now the transcript has been considerably extended, becoming practically a new book. In Moscow, all problems presented in this book were used in recitations accompanying the lectures. There was one lecture per week during the whole academic year (two semesters); recitations were held once a week during the ﬁrst semester and twice a week during the second one. My intention was to create a course that is both concise and modern. These two requirements proved to be contradictory. To make the course concise, I was, ﬁrst, forced to explain some ideas on simplest examples and, second, to omit many topics which should be included in a modern course but would have expanded the course beyond a reasonable length. With particular regret I have omitted the Schr¨odinger equation and related material. I hope the gap will be ﬁlled by a parallel course on quantum mechanics. In any modern course on equations of mathematical physics, something should, perhaps, be said about nonlinear equations but I found it extremely diﬃcult to select this “something”. The bibliography at the end of the book contains a number of textbooks where the reader can ﬁnd information complementing this course. Naturally, the material of theses books overlaps, but it was diﬃcult to do a more precise selection, so I leave this selection to the reader’s taste. The problems in the book are not just exercises. Some of them add essential information but require no new ideas to solve them. Acknowledgments. I am very grateful to Professor S. P. Novikov, who organized the experimental groups and invited me to teach PDE to them. I am also greatly indebted to the professors of the chair of diﬀerential equations of the Moscow State University for many valuable discussions. I owe a lot to Professor Louhivaara, who provided unexpectedly generous help in editing an early version of this text and encouraged me to publish this book. I also had very productive discussions about this course with Professor P. Albin, which led to extremely many useful improvements and corrections. Professors R. Mazzeo and I. Polterovich taught the course using the lecture notes, and I owe them thanks for many useful suggestions.

Preface

xv

I am thankful to Professor O. Milatovic for his help in verifying problems. I am also grateful to Professor D. A. Leites, who participated considerably in the translation of this book into English.

Selected notational conventions

Cross references inside the book are natural: Theorem 6.2 stands for Theorem 2 in Chapter 6, Problem 6.2 is Problem 2 from Chapter 6, etc. := means “denotes by deﬁnition”. ≡ means “identically equal to”. By Z, Z+ , N, R, and C we will denote the set of all integers, nonnegative integers, positive integers, real numbers, and complex numbers, respectively. I denotes the identity operator. 1 The Heaviside function is θ(z) = 0

for z ≥ 0, for z < 0.

xvii

10.1090/gsm/205/01

Chapter 1

Linear diﬀerential operators

1.1. Deﬁnition and examples We will start by introducing convenient notation for functions of several variables and diﬀerential operators. By Z, R, and C we will denote the set of integer, real, and complex numbers, respectively. A multi-index α is an ntuple α = (α1 , . . . , αn ), where αj ∈ Z+ (i.e., the αj are nonnegative integers). If α is a multi-index, then we set |α| = α1 + · · · + αn and α! = α1 ! . . . αn !; for a vector x = (x1 , . . . , xn ) ∈ Cn we deﬁne xα = xα1 1 . . . xαn n . If Ω is an open subset of Rn , then C ∞ (Ω) denotes the space of complex-valued inﬁnitely diﬀerentiable functions on Ω, and C0∞ (Ω) denotes the subspace of inﬁnitely diﬀerentiable functions with compact support; i.e., ϕ ∈ C0∞ (Ω) if ϕ ∈ C ∞ (Ω) and there exists a compact K ⊂ Ω such that ϕ vanishes on Ω \ K. (Here K depends on ϕ.) √ Denote ∂j = ∂x∂ j : C ∞ (Ω) −→ C ∞ (Ω), Dj = −i∂j , where i = −1, D = (D1 , . . . , Dn );

∂ α = ∂1α1 . . . ∂nαn ,

D α = D1α1 . . . Dnαn .

∂ |α| α n ∂x1 1 ...∂xα n −|α| α i ∂ .

Therefore, ∂ α = C ∞ (Ω), D α =

∂ = (∂1 , . . . , ∂n ),

is a mixed derivative operator ∂ α : C ∞ (Ω) −→

If f ∈ C ∞ (Ω), then instead of ∂ α f we will sometimes write f (α) . 1

2

1. Linear diﬀerential operators

Exercise 1.1. Let f ∈ C[x1 , . . . , xn ]; i.e., f is a polynomial of n variables x1 , . . . , xn . Prove the Taylor formula (1.1)

f (x) =

f (α) (x0 ) α

α!

(x − x0 )α ,

where the sum runs over all multi-indices α. (Actually, the sum is ﬁnite; i.e., only a ﬁnite number of terms may be nonvanishing.) A linear diﬀerential operator is an operator A : C ∞ (Ω) −→ C ∞ (Ω) of the form aα (x)D α , (1.2) A= |α|≤m

where aα (x) ∈ C ∞ (Ω). Clearly, instead of D α we may write ∂ α , but the expression in terms of D α is more convenient as we will see in the future. Here m ∈ Z+ , and we will say that A is an operator of order ≤ m. We say that A is an operator of order m if it can be expressed in the form (1.2) and there exists a multi-index α such that |α| = m and aα (x) ≡ 0. Examples. 1) The Laplace operator, or Laplacian, Δ = −(D12

+··· +

Dn2 ).

∂2 ∂x21

+ ··· +

∂2 ∂x2n

=

∂ 2) The heat operator ∂t − Δ (here the number of independent variables is equal to n + 1 and they are denoted by t, x1 , . . . , xn ).

3) The wave operator, or d’Alembertian, =

∂2 ∂t2

− Δ.

4) The Sturm-Liouville operator L deﬁned for n = 1 and Ω = (a, b) by the formula d du Lu = p(x) + q(x)u, dx dx where p, q ∈ C ∞ ((a, b)). Operators in Examples 1)–3) are operators with constant coeﬃcients. The Sturm-Liouville operator has variable coeﬃcients.

1.2. The total and the principal symbols The symbol, or the total symbol, of the operator (1.2) is aα (x)ξ α , x ∈ Ω, ξ ∈ Rn , (1.3) a(x, ξ) = |α|≤m

and its principal symbol is the function aα (x)ξ α , (1.4) am (x, ξ) = |α|=m

x ∈ Ω, ξ ∈ Rn .

1.2. The total and the principal symbols

3

The symbol belongs to C ∞ (Ω)[ξ], i.e., it is a polynomial in ξ1 , . . . , ξn with coeﬃcients in C ∞ (Ω), and the principal symbol is a homogeneous polynomial of degree m in ξ1 , . . . , ξn with coeﬃcients in C ∞ (Ω). Examples. 1) The total symbol and the principal symbol of the Laplace operator coincide and are equal to −ξ 2 = −(ξ12 + · · · + ξn2 ). 2) The total symbol of the heat operator and its principal symbol is a2 (t, x, τ, ξ) = ξ 2 .

∂ ∂t

− Δ is a(t, x, τ, ξ) = iτ + ξ 2

3) The total symbol and the principal symbol of the wave operator coincide and are equal to a(t, x, τ, ξ) = a2 (t, x, τ, ξ) = −τ 2 + ξ 2 . 4) The total symbol of the Sturm-Liouville operator is a(x, ξ) = −p(x)ξ 2 + ip (x)ξ + q(x); its principal symbol is a2 (x, ξ) = −p(x)ξ 2 . The symbol of an operator A is recovered from the operator by the formula a(x, ξ) = e−ix·ξ A(eix·ξ ),

(1.5)

where A is applied with respect to x. Here we used the notation x · ξ = x1 ξ1 + · · · + xn ξn . Formula (1.5) is obtained from the relation D α eix·ξ = ξ α eix·ξ , which is easily veriﬁed by induction with respect to |α|. Formula (1.5) also implies that the coeﬃcients aα (x) of A are uniquely deﬁned by this operator, because the coeﬃcients of the polynomial a(x, ξ) for each ﬁxed x are uniquely deﬁned by this polynomial viewed as a function of ξ. It is convenient to apply A to a more general exponent than eix·ξ , namely to eiλϕ(x) , where ϕ ∈ C ∞ (Ω) and λ is a parameter. Then we get Lemma 1.1. If f ∈ C ∞ (Ω), then e−iλϕ(x) A(f (x)eiλϕ(x) ) is a polynomial of λ of degree ≤ m with coeﬃcients in C ∞ (Ω) and (1.6)

e−iλϕ(x) A(f (x)eiλϕ(x) ) = λm f (x)am (x, ϕx ) + λm−1 (. . .) + · · · ;

i.e., the highest coeﬃcient of this polynomial (that of λm ) is equal to ∂ϕ ∂ϕ , . . . , ∂x ) = grad ϕ is the gradient of ϕ. f (x)am (x, ϕx ), where ϕx = ( ∂x n 1

Proof. We have Dj [f (x)eiλϕ(x) ] = λ(∂j ϕ)(x)f (x)eiλϕ(x) + (Dj f )eiλϕ(x) ,

4

1. Linear diﬀerential operators

so that the lemma holds for the operators of order ≤ 1. In what follows, to ﬁnd the derivatives D α eiλϕ(x) we will have to diﬀerentiate products of the form f (x)eiλϕ(x) , where f ∈ C ∞ (Ω). A new factor λ will only appear when we diﬀerentiate the exponent. Clearly, D α eiλϕ(x) = λ|α| ϕαx eiλϕ(x) + λ|α|−1 (. . .) + · · · , implying the statement of the lemma. Corollary 1.2. Let A, B be two linear diﬀerential operators in Ω, let k, l be their orders, ak , bl their principal symbols, C = A ◦ B their composition, ck+l the principal symbol of C. Then (1.7)

ck+l (x, ξ) = ak (x, ξ)bl (x, ξ).

Remark. Clearly, C is a diﬀerential operator of order ≤ k + l. Proof of Corollary 1.2. We have C(eiλϕ(x) ) = A(λl bl (x, ϕx )eiλϕ(x) + λl−1 (. . .) + · · · ) = λk+l ak (x, ϕx )bl (x, ϕx )eiλϕ(x) + λk+l−1 (. . .) + · · · . By Lemma 1.1, we obtain (1.8)

ck+l (x, ϕx ) = ak (x, ϕx )bl (x, ϕx ),

for any ϕ ∈ C ∞ (Ω).

But then, by choosing ϕ(x) = x · ξ, we get ϕx = ξ and (1.8) becomes ck+l (x, ξ) = ak (x, ξ)bl (x, ξ), as required.

1.3. Change of variables Let κ : Ω −→ Ω1 be a diﬀeomorphism between two open subsets Ω, Ω1 in Rn ; i.e., κ is a C ∞ map having an inverse C ∞ map κ−1 : Ω1 −→ Ω. We will denote the coordinates in Ω by x and coordinates in Ω1 by y. The map κ can be deﬁned by a set of functions y1 (x1 , . . . , xn ), . . . , yn (x1 , . . . , xn ), yj ∈ C ∞ (Ω), whose values at x are equal to the respective coordinates of the point κ(x). If f ∈ C ∞ (Ω1 ), set κ∗ f = f ◦ κ; i.e., (κ∗ f )(x) = f (κ(x)) or (κ∗ f )(x) = f (y1 (x), . . . , yn (x)). The function κ∗ f is obtained from f by a change of variables or by passage to coordinates x1 , . . . , xn from coordinates y1 , . . . , yn . Recall that the diﬀeomorphism κ has the inverse κ−1 which is also a diﬀeomorphism. Clearly, κ∗ is a linear isomorphism κ∗ : C ∞ (Ω1 ) −→ C ∞ (Ω) and (κ∗ )−1 = (κ−1 )∗ .

1.4. The canonical form

5

Given an operator A : C ∞ (Ω) −→ C ∞ (Ω), deﬁne the operator A1 : C ∞ (Ω1 ) −→ C ∞ (Ω1 ) with the help of the commutative diagram C ∞ (Ω) ⏐ ∗ ⏐κ

A

−→ C ∞ (Ω) ⏐ ∗ ⏐κ A

1 C ∞ (Ω1 ) −→ C ∞ (Ω1 )

i.e., by setting A1 = (κ∗ )−1 Aκ∗ = (κ−1 )∗ Aκ∗ . In other words, (1.9)

A1 v(y) = {A[v(y(x))]}|x=x(y),

where y(x) = κ(x), x(y) = κ−1 (y). Therefore, the operator A1 is, actually, just the expression of A in coordinates y. If A is a linear diﬀerential operator, then so is A1 , as easily follows from the chain rule and (1.9). Let us investigate a relation between the principal symbols of A and A1 . To this end we will use the notions of tangent and cotangent vectors in Ω, as well as tangent and cotangent bundles over Ω (see the appendix in this chapter). In particular, for x ∈ Ω we identify RN with Tx∗ Ω as follows: using the coordinates x1 , . . . , xn we deﬁne the basis dx1 , . . . , dxn in Tx∗ Ω and we identify each vector ξ = (ξ1 , . . . , ξn ) ∈ Rn with a covector ξ1 dx1 + · · · + ξn dxn ∈ Tx∗ Ω, which we also denote by ξ. Similarly, for y := κ(x) ∈ Ω1 we use coordinates y1 , . . . , yn to identify Rn with Ty∗ Ω1 . Theorem 1.3. The value of the principal symbol am of an operator A at the cotangent vector (x, ξ) is the same as that of the principal symbol am of A1 at the corresponding vector (y, η); i.e., (1.10)

am (x, ξ) = am (κ(x), (t κ∗ )−1 ξ).

In other words, the principal symbol is a well-deﬁned function on T ∗ Ω (it does not depend on the choice of C ∞ -coordinates in Ω). Proof. A cotangent vector at x can be expressed as the gradient ϕx (x) of a function ϕ ∈ C ∞ (Ω) at x. It is clear from Lemma 1.1 that am (x, ϕx (x)) does not depend on the choice of coordinates. But on the other hand, it is clear that this value does not depend on the choice of a function ϕ with a prescribed diﬀerential at x. Therefore, the principal symbol is well-deﬁned on T ∗ Ω.

1.4. The canonical form of second-order operators with constant coeﬃcients By a change of variables we can try to transform an operator to a simpler one. Consider a second-order operator with constant coeﬃcients of the principal

6

1. Linear diﬀerential operators

part: A=−

n i,k=1

∂ ∂2 + bl + c, ∂xi ∂xk ∂xl n

aik

l=1

where the aik are real constants such that aik = aki (the latter can always 2 2 be assumed since ∂x∂i ∂xk = ∂x∂k ∂xi ). Ignoring the terms of lower order, we will transform the highest-order part of A A0 = −

n

aik

i,k=1

∂2 ∂xi ∂xk

to a simpler form with the help of a linear change of variables: yk =

(1.11)

n

ckl xl ,

l=1

where the ckl are real constants. Consider the quadratic form Q(ξ) =

n

aik ξi ξk,

i,k=1

which only diﬀers from the principal symbol of A by a sign. With the help of a linear change of variables η = F ξ, where F is a nonsingular constant matrix, we can transform Q(ξ) to the following form: (1.12)

Q(ξ) = (±η12 ± η22 ± · · · ± ηr2 )|η=F ξ

for some r ≤ n. Denote by C the matrix (ckl )nk,l=1 in (1.11). By Theorem 1.3, the operator A in coordinates y has the form of a second-order operator A1 whose principal symbol is a quadratic form Q1 (η) such that Q(ξ) = Q1 (η)|η=(t C)−1 ξ , where t C is the transpose of C. From this and (1.12) it follows that we should choose C so that (t C)−1 = F or C = (t F )−1 . Then the principal part of A under the change of variables y = Cx of the form (1.11) will be reduced to the form (1.13)

±

∂2 ∂2 ∂2 ± ± · · · ± , ∂yr2 ∂y12 ∂y22

called the canonical form. Remark. An operator with variable coeﬃcients may be reduced to the form (1.13) at any ﬁxed point by a linear change of variables.

1.5. Characteristics. Ellipticity and hyperbolicity

7

1.5. Characteristics. Ellipticity and hyperbolicity Let A be an mth-order diﬀerential operator, am (x, ξ) its principal symbol. A nonzero cotangent vector (x, ξ) is called a characteristic vector if am (x, ξ) = 0. A surface (of codimension 1) in Ω is called characteristic at x0 if its normal vector at this point is a characteristic vector. The surface is called a characteristic if it is characteristic at every point. If a surface S is deﬁned by the equation ϕ(x) = 0, where ϕ ∈ C 1 (Ω), ϕx |S = 0, then S is characteristic at x0 if am (x0 , ϕx (x0 )) = 0. The surface S is a characteristic if am (x, ϕx (x)) identically vanishes on S. All level surfaces ϕ = const are characteristics if and only if am (x, ϕx (x)) ≡ 0. Theorem 1.3 implies that the notion of a characteristic does not depend on the choice of coordinates in Ω. Examples. 1) The Laplace operator Δ has no real characteristic vectors. ∂ − Δ the vector (τ, ξ) = (1, 0, . . . , 0) ∈ Rn+1 is 2) For the heat operator ∂t a characteristic one. The surfaces t = const are characteristics. The surface t = |x|2 (paraboloid) is characteristic at a single point only (the origin). 2

∂ 3) Consider the wave operator ∂t 2 − Δ. Its characteristic vectors at each 2 point (t, x) constitute a cone τ = |ξ|2 . Any cone (t − t0 )2 = |x − x0 |2 is a characteristic. In particular, for n = 1 (i.e., for x ∈ R1 ) the lines x + t = const and x − t = const are characteristics.

Deﬁnition. a) An operator A is called elliptic if am (x, ξ) = 0 for all x ∈ Ω and all ξ ∈ Rn \ {0}, i.e., if A has no nonzero characteristic vectors. b) An operator A in the space of functions of t, x, where t ∈ R1 , x ∈ Rn , is hyperbolic with respect to t if the equation am (t, x, τ, ξ) = 0 considered as an equation in τ for any ﬁxed t, x, ξ has exactly m distinct real roots if ξ = 0. In this case one might say that the characteristics of A are real and distinct. Examples. a) The Laplace operator Δ is elliptic. b) The heat operator is neither elliptic nor hyperbolic with respect to t. c) The wave operator is hyperbolic with respect to t since the equation τ 2 = |ξ|2 has two diﬀerent real roots τ = ±|ξ| if ξ = 0. d) The Sturm-Liouville operator Lu ≡ (a, b) if p(x) = 0 for x ∈ (a, b).

d d dx (p(x) dx u) + q(x)u

is elliptic on

8

1. Linear diﬀerential operators

By Theorem 1.3, the ellipticity of an operator does not depend on the choice of coordinates. The hyperbolicity with respect to t does not depend on the choice of coordinates in Rnx . Let us see what it means for the surface x1 = const to be characteristic. The coordinates of the normal vector are (1, 0, . . . , 0). Substituting them into the principal symbol, we get aα (x)(1, 0, . . . , 0)α = a(m,0,...,0) (x); am (x; 1, 0, . . . , 0) = |α|=m

i.e., the statement “the surface x1 = const is characteristic at x” means that ∂m a(m,0,...,0) (x), the coeﬃcient of ( 1i )m ∂x m , vanishes at x. 1

All surfaces x1 = const are characteristics if and only if a(m,0,...,0) ≡ 0. This remark will be used below to reduce second-order operators with two independent variables to a canonical form. It is also possible to ﬁnd characteristics by solving the Hamilton-Jacobi equation am (x, ϕx (x)) = 0. If ϕ is a solution of this equation, then all surfaces ϕ = const are characteristic. Integration of the Hamilton-Jacobi equation is performed with the help of the Hamiltonian system of ordinary diﬀerential equations with the Hamiltonian am (x, ξ) (see Chapter 12 or any textbook on classical mechanics, e.g., Arnold [3]).

1.6. Characteristics and the canonical form of second-order operators and second-order equations for n = 2 For n = 2 characteristics are curves and are rather easy to ﬁnd. For example, consider a second-order operator (1.14)

A=a

∂2 ∂2 ∂2 + c + 2b + ··· , ∂x2 ∂x∂y ∂y 2

where a, b, c are smooth functions of x, y deﬁned in an open set Ω ⊂ R2 and the dots stand for terms containing at most ﬁrst-order derivatives. Let (x(t), y(t)) be a curve in Ω, (dx, dy) its tangent vector, (−dy, dx) the normal vector. The curve is a characteristic if and only if a(x, y)dy 2 − 2b(x, y)dxdy + c(x, y)dx2 = 0 on it. If a(x, y) = 0, then in a neighborhood of the point (x, y) we may assume that dx = 0; hence, x can be taken as a parameter along the characteristic which, therefore, is of the form y = y(x). Then the equation for the characteristics is of the form ay 2 − 2by + c = 0.

1.6. Characteristics and the canonical form (n = 2)

9

If b2 − ac > 0, then the operator (1.14) is hyperbolic and has two families of real characteristics found from ordinary diﬀerential equations: √ b + b2 − ac (1.15) y = , a √ b − b2 − ac . (1.16) y = a Observe that in this case two nontangent characteristics pass through each point (x, y) ∈ Ω. We express these families of characteristics in the form ϕ1 (x, y) = C1 and ϕ2 (x, y) = C2 , where ϕ1 , ϕ2 ∈ C ∞ (Ω). This means that ϕ1 , ϕ2 are ﬁrst integrals of equations (1.15) and (1.16), respectively. Suppose that grad ϕ1 = 0 and grad ϕ2 = 0 in Ω. Then grad ϕ1 and grad ϕ2 are linearly independent, since characteristics of diﬀerent families are not tangent. Let us introduce new coordinates: ξ = ϕ1 (x, y), η = ϕ2 (x, y). In these coordinates, characteristics are lines ξ = const and η = const. But ∂2 ∂2 then the coeﬃcients of ∂ξ 2 and ∂η 2 vanish identically and A takes the form (1.17)

A = p(ξ, η)

∂2 +··· , ∂ξ∂η

called the canonical form. Here p(ξ, η) = 0 (recall that we have assumed a = 0). If c(x, y) = 0, then the reduction to the canonical form (1.17) is similar. It is not necessary to consider these two cases separately, since we have only used the existence of integrals ϕ1 (x, y), ϕ2 (x, y) with the described properties. These integrals can also be found when coeﬃcients a and c vanish (at one point or on an open set). This case can be reduced to one of the above cases, e.g., by a rotation of coordinate axes (if a2 + b2 + c2 = 0). One often considers diﬀerential equations of the form (1.18)

Au = f,

where f is a known function, A a linear diﬀerential operator, and u an unknown function. If A is a hyperbolic second-order operator with two independent variables, i.e., an operator of the form (1.14) with b2 − ac > 0 (in this case the equation (1.18) is also called hyperbolic), then, introducing the above coordinates ξ, η and dividing by p(ξ, η), we reduce (1.18) to the canonical form ∂2u + · · · = 0, ∂ξ∂η where the dots stand for terms without second-order derivatives of u. Now let b2 − ac ≡ 0 (then the operator (1.14) and the equation (1.18) with this operator are called parabolic). Let us assume that a = 0. Then,

10

1. Linear diﬀerential operators

for characteristics we get the diﬀerential equation (1.19)

y =

b . a

Let us ﬁnd characteristics and write them in the form ϕ(x, y) = const, where ϕ is the ﬁrst integral of (1.19) and grad ϕ = 0. Choose a function ψ ∈ C ∞ (Ω) such that grad ϕ and grad ψ are linearly independent and introduce new coordinates ξ = ϕ(x, y), η = ψ(x, y). In the new coordinates the operator ∂2 A does not have the term ∂ξ 2 , since the lines ϕ = const are characteristics. 2

∂ also vanishes, since the principal But then the coeﬃcient of the term ∂ξ∂η symbol should be a quadratic form of rank 1. Thus, the canonical form of a parabolic operator is ∂2 A = p(ξ, η) 2 + · · · . ∂η

For a parabolic equation (1.18) the canonical form is ∂2 + · · · = 0. ∂η 2 Observe that if b2 − ac = 0 and a2 + b2 + c2 = 0, then a and c cannot vanish simultaneously, since then we would also have b = 0. Therefore, we always have either a = 0 or c = 0 and the above procedure is always applicable. Finally, consider the case b2 − ac < 0. The operator (1.14) and the equation (1.18) are in this case called elliptic. Suppose for simplicity that functions a, b, c are real analytic. Then the existence theorem for complex analytic solutions of the complex equation √ b + b2 − ac y = a implies the existence of a local ﬁrst integral ϕ(x, y) + iψ(x, y) = C, where ϕ, ψ are real-valued analytic functions and grad ϕ and grad ψ are linearly independent. It is easy to verify that by introducing new coordinates ξ = ϕ(x, y), η = ψ(x, y), we can reduce A to the canonical form 2 ∂2 ∂ + + ··· , A = p(ξ, η) ∂ξ 2 ∂η 2 where p(ξ, η) = 0. Dividing (1.18) by p, we reduce it to the form ∂ 2u ∂ 2u + 2 + · · · = 0. ∂ξ 2 ∂η

1.7. The general solution (n = 2)

11

1.7. The general solution of a homogeneous hyperbolic equation with constant coeﬃcients for n = 2 As follows from Section 1.6, the hyperbolic equation (1.20)

a

∂ 2u ∂ 2u ∂ 2u + c + 2b = 0, ∂x2 ∂x∂y ∂y 2

where a, b, c ∈ R and b2 − ac > 0, can be reduced by the change of variables ξ = y − λ1 x,

η = y − λ2 x,

where λ1 , λ2 are roots of the quadratic equation aλ2 − 2bλ + c = 0, to the form ∂ 2u = 0. ∂ξ∂η Assuming u ∈ C 2 (Ω), where Ω is a convex open set in R2 , we get implying ∂u ∂η = F (η) and

∂ ∂u ∂ξ ( ∂η )

=0

u = f (ξ) + g(η), where f, g are arbitrary functions of class C 2 . In variables x, y we have (1.21)

u(x, y) = f (y − λ1 x) + g(y − λ2 x).

It is useful to consider functions u(x, y) of the form (1.21), where f, g are not necessarily of the class C 2 but from a wider class, e.g., f, g ∈ L1loc (R); i.e., f, g are locally integrable. Such functions u are called generalized solutions of equation (1.20). For example let f (ξ) have a jump at ξ0 ; i.e., it has diﬀerent limits as ξ → ξ0 + and as ξ → ξ0 −. Then u(x, y) has a discontinuity along the line y − λ1 x = ξ0 . Note that the lines y − λ1 x=const, y − λ2 x = const are characteristics. Therefore, in this case discontinuities of solutions are propagating along characteristics. A similar phenomenon occurs in the case of general hyperbolic equations. 2

2

Example. Characteristics of the wave equation ∂∂t2u = a2 ∂∂xu2 are x − at = const, x+at = const. The general solution of this equation can be expressed in the form u(t, x) = f (x − at) + g(x + at). Observe that f (x − at) is the wave running to the right with the speed a and g(x + at) is the wave running to the left with the same speed a. The general solution is the sum (or superposition) of two such waves.

12

1. Linear diﬀerential operators

1.8. Appendix. Tangent and cotangent vectors Let Ω be an open set in Rn . The tangent vector to Ω at a point x ∈ Ω is a vector v which is tangent to a parametrized curve x(t) in Ω passing through x at t = 0. Here we assume that the function t → x(t) is a C ∞ function deﬁned for t ∈ (−ε, +ε) with some ε > 0 and takes values in Ω. In coordinates x1 , . . . , xn we have x(t) = (x1 (t), . . . , xn (t)) and the tangent vector v is given as v = x(0) ˙ =

dx dt t=0

= (x˙ 1 (0), . . . , x˙ n (0)) dxn dx1 ,..., = dt t=0 dt t=0 = (v1 , . . . , vn ).

Interpreting t as the time variable, we can say that the tangent vector v = x(0) ˙ is the velocity vector at the moment t = 0 for the motion given by the curve x(t). The real numbers v1 , . . . , vn are called the coordinates of the tangent vector x(0). ˙ They depend on the choice of the coordinate system (x1 , . . . , xn ). In fact, they are deﬁned even for any curvilinear coordinate system. Two tangent vectors at x can be added (by adding the corresponding coordinates of these vectors), and any tangent vector can be multiplied by any real number (by multiplying each of its coordinates by this number). These operations do not depend on the choice of coordinates in Ω. So all tangent vectors at x ∈ Ω form a well-deﬁned vector space (independent of coordinates in Ω), which will be denoted Tx Ω and called the tangent space to Ω at x. A cotangent vector ξ at x ∈ Ω is a real-valued linear function on the tangent space Tx Ω. All cotangent vectors at x form a linear space which is denoted Tx∗ Ω and called the cotangent space to Ω at x. Denote by T Ω the union of all tangent spaces over all points x ∈ Ω; i.e., TΩ =

Tx Ω ∼ = Ω × Rn .

x

It is called the tangent bundle of Ω. Similarly, we will introduce the cotangent bundle T ∗Ω =

x

Tx∗ Ω ∼ = Ω × Rn .

1.8. Appendix. Tangent and cotangent vectors

13

For every tangent vector v at x ∈ Ω and any function f ∈ C ∞ (Ω) we can deﬁne the derivative of f in the direction v: it is vf =

∂f (x) df (x(t)) ∂f (x) = x˙ j (0) = vj , dt ∂xj ∂xj n

n

j=1

j=1

where x(t) is a curve in Ω such that x(0) = x and x(0) ˙ = v. Sometimes, vf is also called the directional derivative. Taking f to be a coordinate function xj , we obtain vxj = vj , so v is uniquely deﬁned by the corresponding directional derivative, and we can identify v with its directional derivative which is a linear function on C ∞ (Ω). In particular, take vectors at x ∈ Ω which are tangent to the coordinate axes. Then the corresponding directional derivatives will be simply ∂ ∂ ∂ ∂x1 , . . . , ∂xn , and they form a basis of Tx Ω. (For example, ∂x1 is tangent to the curve x(t) = (x1 + t, x2 , . . . , xn ) and has coordinates (1, 0, . . . , 0). For the jth partial derivative all coordinates vanish except the jth one which equals 1.) The dual basis in Tx∗ Ω consists of linear functions dx1 , . . . , dxn on Tx Ω such that dxi ( ∂x∂ j ) = δij , where δij = 1 for i = j and δij = 0 for i = j. For any tangent vector x(0) ˙ ∈ Tx(0) Ω which is the velocity vector of a curve x(t) at t = 0, its coordinates in the basis ∂x∂ 1 , . . . , ∂x∂n are equal dxn 1 to dx dt |t=0 , . . . , dt |t=0 . An example of a cotangent vector is given by the diﬀerential df of a function f ∈ C ∞ (Ω): it is a function on Tx(0) Ω whose C∞

value at the tangent vector x(0) ˙ is equal to df (x(t)) dt |t=0 and whose coordinates ∂f ∂f in the basis dx1 , . . . , dxn are ∂x , . . . , , so ∂xn 1 n ∂f df = dxj . ∂xj j=1

In Euclidean coordinates, the diﬀerential of a smooth function f can be identiﬁed with its gradient ∇f , which is the tangent vector with the same coordinates. Given a diﬀeomorphism κ : Ω −→ Ω1 , we have a natural map κ∗ : Tx Ω −→ Tκ(x) Ω1 (the tangent map, or the diﬀerential of κ). Namely, for a tangent vector x(0), ˙ which is the velocity vector of a curve x(t) at t = 0, the ˙ is the velocity vector of the curve (κ ◦ x)(t) = κ(x(t)) at t = 0. vector κ∗ x(0) ∗ Ω −→ The corresponding dual map of spaces of linear functions t κ∗ : Tκ(x) 1 ∗ t ∗ ˙ = ξ(κ∗ x(0)), ˙ where ξ ∈ Tκ(x) Ω. Tx Ω is deﬁned by ( κ∗ ξ)(x(0)) Choose bases { ∂x∂ j }nj=1 , { ∂y∂ j }nj=1 , {dyj }nj=1 , {dxj }nj=1 in spaces Tx Ω, ∗ Ω , T ∗ Ω, respectively. In these bases the matrix of κ is Tκ(x) Ω1 , Tκ(x) 1 ∗ x (κ∗ )kl =

∂yk ∂xl .

It is called the Jacobi matrix. The matrix of t κ∗ is the

14

1. Linear diﬀerential operators

transpose of κ∗ . Since κ is a diﬀeomorphism, the maps κ∗ and t κ∗ are isomorphisms at each point x ∈ Ω. It is convenient to deﬁne a cotangent vector at x by a pair (x, ξ), where ξ = (ξ1 , . . . , ξn ), ξ ∈ Rn , is the set of coordinates of this vector in the basis {dxj }nj=1 in Tx∗ Ω. Under the isomorphism (t κ∗ )−1 : T ∗ Ω −→ T ∗ Ω1 the covector (y, η) corresponds to the covector (x, ξ), where y = κ(x), η = (t κ∗ )−1 ξ.

1.9. Problems 1.1. Reduce the equations to the canonical form: a) uxx + 2uxy − 2uxz + 2uyy + 6uzz = 0; b) uxy − uxz + ux + uy − uz = 0. 1.2. Reduce the equations to the canonical form: a) x2 uxx + 2xyuxy − 3y 2 uyy − 2xux + 4yuy + 16x4 u = 0; b) y 2 uxx + 2xyuxy + 2x2 uyy + yuy = 0; c) uxx − 2uxy + uyy + ux + uy = 0. 1.3. Find the general solution of the equations: a) x2 uxx − y 2 uyy − 2yuy = 0; b) x2 uxx − 2xyuxy + y 2 uyy + xux + yuy = 0.

10.1090/gsm/205/02

Chapter 2

One-dimensional wave equation

2.1. Vibrating string equation Let us derive an equation describing small vibrations of a string. Note right away that our derivation will not be mathematical but rather physical or mechanical. However, its understanding is essential to grasp the physical meaning of, ﬁrst, the wave equation itself and, second, but not less importantly, the initial and boundary conditions. A knowledge of the derivation and physical meaning helps also to ﬁnd various mathematical tools in the study of this equation (the energy integral, standing waves, etc.). Generally, derivation of equations corresponding to diﬀerent physical and mechanical problems is important for understanding mathematical physics and is essentially a part of it. So, let us derive the equation of small vibrations of a string. We consider the vibrations of a taut string such that each point moves in a direction perpendicular to the direction of the string in its equilibriuim. We assume here that all the forces appearing in the string are negligible compared to the tension directed along the string (we assume the string to be perfectly ﬂexible, i.e., nonresistant to bending). First of all, choose the variables describing the form of the string. Suppose that in equilibrium the taut string is stretched along the x-axis. To start, we will consider the inner points of the string disregarding its ends. Suppose that the vibrations keep the string in the (x, y)-plane so that each point of the string is only shifted parallel to the y-axis and this shift at time t equals u(t, x); see Figure 2.1. 15

16

2. One-dimensional wave equation y

u(t, x) x

x Figure 2.1. Vibrating string.

Therefore, at a ﬁxed t the graph of u(t, x), as a function of x, is the shape of the string at the moment t, and for a ﬁxed x the function u(t, x) describes the motion of one point of the string. Let us compute the length of the part of the string corresponding to the interval (a, b) on the x-axis. It is equal to

b 1 + u2x dx. a

Our main assumption is the negligibility of the additional string lengthening, compared with the equilibrium length. More precisely, we assume that u2x 1 and consider u2x negligible compared to 1. Note that if α = α(t, x) is the angle between the tangent to the string and the x-axis, then tan α = ux , cos α = √ 1 2 , sin α = √ ux 2 ; under our assumptions we should consider 1+ux

1+ux

cos α ≈ 1 and sin α ≈ ux . If T = T (t, x) is the string’s tension, then its horizontal component is T cos α ≈ T and the vertical one is T sin α ≈ T ux . By Hooke’s law in elasticity theory (which should be considered an assumption in our model) the tension of the string at any point and at any given moment of time is proportional to the relative lengthening of a small piece of string at this point, i.e., the lengthening (the increase of the length compared with the equilibrium) divided by the equilibrium length of this piece. Since we assume the additional lengthening to be negligible, we should have T (t, x) = T = const. (Alternatively, we can argue as follows. Since all points of the string move in the vertical direction, along the y-axis, the sum of the horizontal components of all forces acting on the segment of the string over [a, b] should vanish. This means that T (t, a) = T (t, b) which due to the arbitrariness of a and b yields T (t, x) = T (t); that is, T is independent of x. Then, the negligibility of the lengthening implies that in fact T (t) = T = const.) Let us write the equation of motion of the part of the string above [a, b]; see Figure 2.2. Let ρ(x) be the linear density of the string at x (the ratio of the mass of the inﬁnitesimal segment to the length of this segment at x).

2.1. Vibrating string equation

17

y

T

T a

b

x

Figure 2.2. Tension forces in the string.

The vertical component of the force is

b

(T ux )|x=b − (T ux )|x=a =

(2.1)

T uxx (t, x)dx, a

assuming the absence of external forces. In the presence of vertical external forces distributed with density g(t, x) (per unit mass of the string), we should add to (2.1) the term

b ρ(x)g(t, x)dx. (2.2) a

The vertical component of the momentum of this segment of the string is equal to

b (2.3) ρ(x)ut (t, x)dx. a

Let us use the well-known corollary of Newton’s Second and Third Laws which states that the rate of the momentum’s change in time is equal to the sum of all external forces. (See the appendix in this chapter.) Then (2.1)–(2.3) give

b [ρ(x)utt (t, x) − T uxx (t, x) − ρ(x)g(t, x)]dx = 0, a

or, since a and b are arbitrary, (2.4)

ρ(x)utt − T uxx − ρ(x)g(t, x) = 0.

In particular, for ρ(x) = ρ = const and g(t, x) ≡ 0 we get (2.5) utt − a2 uxx = 0, where a = T /ρ. We would have arrived at the same equation (2.5) if we had applied d’Alembert’s principle equating the sum of all the external and inertial forces to zero. One may arrive at (2.4) via another route by using Lagrange’s equations. (See the appendix in this chapter for an elementary introduction to the

18

2. One-dimensional wave equation

calculus of variations and Lagrange’s equations.) Suppose the external forces are absent. Clearly, the kinetic energy of the interval (a, b) of the string is

1 b K= ρ(x)u2t (t, x)dx. 2 a To calculate the potential energy of the string whose form is that of the graph of v(x), x ∈ (a, b), we should calculate the work necessary to move the string from the equilibrium position to v(x). Suppose this move is deﬁned by the “curve” v(x, α), α ∈ [0, 1], so that v(x, 0) = 0, v(x, 1) = v(x). The force

x+Δx T vxx (x, α)dx T vx (x + Δx, α) − T vx (x, α) = x

acts on the part of the string corresponding to the interval (x, x + Δx) on the x-axis. As α varies from α to α + Δα, the displacement of the point with the abscissa x equals

α+Δα vα (x, α)dα. v(x, α + Δα) − v(x, α) = α

Therefore, to move the part of the string (x, x + Δx) from α to α + Δα, the work −T vxx (x, α)vα (x, α)Δx · Δα + o(Δx · Δα) should be performed. Integrating with respect to x and α we see that the total work performed over the segment (a, b) is

1 b A=− T vxx vα dxdα. 0

a

Integrating by parts, we get

b

b

b d 1 b b T vxx vα dx = −T vx vα |a + T vx vxα dx = −T vx vα |a + T vx2 dx, − dα a 2 a a and integrating with respect to α from 0 to 1, we get

1 1 b 2 x=b T vx (x, α)vα (x, α)|x=a dα + T vx (x)dx. (2.6) A=− 2 a 0 Suppose that the ends of the string are ﬁxed at the points 0 and l; i.e., their displacements are equal to zero during the whole process of motion. Then we may assume that v(0, α) = v(l, α) = 0 for α ∈ [0, 1] and write

1 l 2 T vx (x)dx A= 2 0

2.1. Vibrating string equation

19

instead of (2.6) for a = 0, b = l. This clearly implies that the potential energy of the string with ﬁxed ends 0, l at time t is equal to

1 l U= T u2x (t, x)dx. 2 0 Now we can write the Lagrangian of the string:

1 l 1 l ρ(x)u2t (t, x)dx − T u2x (t, x)dx, L=K −U = 2 0 2 0 which is a functional in u and ut (here u(t, x) plays the role of the set of coordinates at time t and ut (t, x) the role of the set of velocities). Let us t write the action t01 Ldt and equate the variation of the action with respect to u to 0. The usual integration by parts under the assumption that δu|t=t0 = δu|t=t1 = 0 gives

t1 l

t1 Ldt = ρ(x)ut (t, x)δut (t, x)dxdt δ t0

t0

− =−

0 t1 l

T ux (t, x)δux (t, x)dxdt t0

t1 t0

0

l

[ρ(x)utt (t, x) − T uxx (t, x)]δu(t, x)dxdt,

0

which, since δu is arbitrary, implies (2.7)

ρ(x)utt (t, x) − T uxx (t, x) = 0.

This means that we obtained equation (2.4) (with g(t, x) ≡ 0). Note an important circumstance here: the total energy of the string is equal to

1 l 1 l 2 H =K +U = ρ(x)ut (t, x)dx + T u2x (t, x)dx. 2 0 2 0 Let us prove the energy conservation law: the energy of an oscillating string with ﬁxed ends is conserved. We will verify this law by using (2.7). We have

l

l dH = ρ(x)ut (t, x)utt (t, x)dx + T ux (t, x)utx (t, x)dx. dt 0 0 Integrating by parts in the second integral, we get

l dH = ut (t, x)[ρ(x)utt(t, x) − T uxx (t, x)]dx + T ux (t, x)ut (t, x)|x=l (2.8) x=0 . dt 0 The last summand in (2.8) vanishes due to the boundary conditions u|x=0 = d (u|x=0 ) = 0 and similarly ut |x=l = 0. The u|x=l = 0, since then ut |x=0 = dt ﬁrst summand vanishes due to (2.7). Thus, dH dt = 0, yielding the energy conservation law: H(t) = const.

20

2. One-dimensional wave equation

The conservation of energy implies the uniqueness of the solution for the string equation (2.4) provided ρ(x) > 0 everywhere and the law of motion for the string’s ends is given: (2.9)

u|x=0 = α(t),

u|x=l = β(t),

and the initial values (the position and the velocity of the string) are ﬁxed: (2.10)

u|t=0 = ϕ(x),

ut |t=0 = ψ(x)

for x ∈ [0, 1].

Indeed, if u1 , u2 are two solutions of the equation (2.4) satisfying (2.9) and (2.10), then their diﬀerence v = u1 − u2 satisﬁes the homogeneous equation (2.7) and the homogeneous boundary and initial conditions v|x=0 = v|x=l = 0, (2.11)

v|t=0 = vt |t=0 = 0

for x ∈ [0, l].

But then the energy conservation law for v implies

l [ρ(x)vt2 (t, x) + T vx2 (t, x)]dx ≡ 0, 0

yielding vt ≡ 0 and vx ≡ 0; i.e., v = const. By (2.11) we now have v ≡ 0; i.e., u1 ≡ u2 , as required. Finally, note that the derivation of the string equation could be, of course, carried out without the simplifying assumption u2x 1. As a result, we would have obtained a nonlinear equation that hardly admits investigation by simple methods. The equation (2.4) is in fact the linearization (principal linear part) of this nonlinear equation. Properties of the solution of (2.4) give some idea of the behavior of solutions of the nonlinear equation. Observe, however, that for large deformations this nonlinear equation is also inadequate to describe the physical problem, since the resistance to the bending and other eﬀects which were unaccounted for will arise.

2.2. Unbounded string. The Cauchy problem. D’Alembert’s formula From the physical viewpoint, an unbounded string is an idealization, meaning that we consider an inner part of the string assuming the ends to be suﬃciently far away, so that during the observation time they do not aﬀect the given part of the string. As we will see, the consideration of an unbounded string helps in the investigation of a half-bounded string (which is a similar idealization) and of a bounded string. Going on to the mathematical discussion, consider the one-dimensional wave equation (2.5) for x ∈ R, t ≥ 0. A natural problem here is the Cauchy

2.2. Unbounded string. D’Alembert’s formula

21

problem, i.e., an initial value problem for (2.5) with the initial conditions (2.12)

u|t=0 = ϕ(x),

ut |t=0 = ψ(x).

From the physical viewpoint (2.12) means that the initial position and the initial velocity of the string are given. We might expect that, like ﬁnitedimensional mechanical problems, this Cauchy problem is well-posed; i.e., there exists a unique solution continuously depending on the initial data ϕ and ψ. As we will see shortly, this is really so. We use the general solution of the equation (2.5): u(t, x) = f (x − at) + g(x + at) found in Section 1.7. From (2.12) we get the system of two equations to determine two arbitrary functions f and g: (2.13)

f (x) + g(x) = ϕ(x),

−af (x) + ag (x) = ψ(x).

(2.14)

The integration of the second equation yields

1 x (2.15) −f (x) + g(x) = ψ(ξ)dξ + C. a x0 Now, from (2.13) and (2.15), we get 1 1 f (x) = ϕ(x) − 2 2a 1 1 g(x) = ϕ(x) + 2 2a Therefore, 1 1 u(t, x) = ϕ(x − at) − 2 2a

x−at x0

x

ψ(ξ)dξ −

C , 2

ψ(ξ)dξ +

C . 2

x0

x x0

1 1 ψ(ξ)dξ + ϕ(x + at) + 2 2a

or (2.16)

ϕ(x − at) + ϕ(x + at) 1 u(t, x) = + 2 2a

x+at

ψ(ξ)dξ x0

x+at

ψ(ξ)dξ. x−at

Thus, the solution u(t, x) actually exists, is unique, and continuously depends on the initial data ϕ, ψ under an appropriate choice of topologies in the sets of initial data ϕ, ψ and functions u(t, x). For example, it is clear that if u1 is a solution corresponding to the initial conditions ϕ1 , ψ1 and if for a suﬃciently small δ > 0 we have (2.17)

sup |ϕ1 (x) − ϕ(x)| < δ,

sup |ψ1 (x) − ψ(x)| < δ,

x∈R

x∈R

22

2. One-dimensional wave equation

then for arbitrarily small ε > 0 we have (2.18)

|u1 (t, x) − u(t, x)| < ε.

sup x∈R, t∈[0,T ]

(More precisely, for any ε > 0 and T > 0 there exists δ > 0 such that (2.17) implies (2.18).) Therefore, the Cauchy problem is well-posed. The formula (2.16) which determines the solution of the Cauchy problem is called d’Alembert’s formula. Let us make several remarks concerning its derivation and application. First, note that this formula makes sense for any locally integrable functions ϕ and ψ and in this case it gives a generalized solution of (2.5). Here we will only consider continuous solutions. Then we can take any continuous function for ϕ and any locally integrable one for ψ. It is natural to call the functions u(t, x) obtained in this way the generalized solutions of the Cauchy problem. We will consider generalized solutions together with ordinary ones on equal footing and skip the adjective “generalized”. Second, it is clear from d’Alembert’s formula that the value of the solution at (t0 , x0 ) only depends on values of ϕ(x) at x = x0 ± at0 and values of ψ(x) at x ∈ (x0 −at0 , x0 +at0 ). In any case, it suﬃces to know ϕ(x) and ψ(x) on [x0 −at0 , x0 +at0 ]. The endpoints of the closed interval [x0 −at0 , x0 +at0 ] can be determined as intersection points (in (t, x)-space) of the characteristics passing through (t0 , x0 ), with the x-axis; see Figure 2.3. The triangle formed by these characteristics and the x-axis is the set of points in the half-plane t ≥ 0 where the value of the solution is completely t t0 , x 0

x0 − at0

x0 + at0

x

Figure 2.3. Dependence domain.

c

x−

= at

at =

x+

d

t

c

d

Figure 2.4. Inﬂuence domain.

x

2.2. Unbounded string. D’Alembert’s formula

23

determined by initial values on the segment [x0 −at0 , x0 +at0 ]. This triangle is called the dependence domain for the interval [x0 − at0 , x0 + at0 ]. An elementary analysis of the derivation of d’Alembert’s formula shows that the formula is true for any solution deﬁned in the triangle with the characteristics as its sides and an interval [c, d] of the x-axis as its base (i.e., it is not necessary to require a solution to be deﬁned everywhere on the half-plane t ≥ 0). Indeed, from (2.13) and (2.14) we ﬁnd the values f (x) and g(x) at x ∈ [c, d] (if the initial conditions are deﬁned on [c, d]). But this determines the values of u(t, x) when x − at ∈ [c, d] and x + at ∈ [c, d], i.e., when characteristics through (t, x) intersect the segment [c, d]. We can, of course, assume f (x) and g(x) to be deﬁned on [c, d]; i.e., u(t, x) is deﬁned in the above-described triangle which is the dependence domain of the interval [c, d]. The physical meaning of the dependence domain is transparent: it is the complement to the set of points (t, x) such that the wave starting the motion with the velocity a at a point outside [c, d] at time t = 0 cannot reach the point x by time t. Further, the values of ϕ and ψ on [c, d] do not aﬀect u(t, x) if x + at < c or x − at > d, i.e., if the wave cannot reach the point x during the time t starting from an inside point of the interval [c, d]. This explains why the domain bounded by the segment [c, d] and the rays of the lines x + at = c, x − at = d belonging to the half-plane t ≥ 0 is called the inﬂuence domain of the interval [c, d]. This domain is shaded on Figure 2.4. This domain is the complement to the set of points (t, x) for which u(t, x) does not depend on ϕ(x) and ψ(x) for x ∈ [c, d]. Example 2.1. Let us draw the form of the string at diﬀerent times if ψ(x) = 0 and ϕ(x) has the form shown in the uppermost graph of Figure 2.5; i.e., the graph of ϕ(x) has the form of an isosceles triangle with the base [c, d]. Let d − c = 2l. We draw the form of the string at the most interesting times to show all of its essentially diﬀerent forms. All these pictures together form an animation depicting vibrations of the string such that at the initial moment the string is pulled at the point 1 2 (c + d) whereas the points c and d are kept ﬁxed, and then at t = 0 the whole string is released. The formula u(t, x) =

ϕ(x − at) + ϕ(x + at) 2

makes it clear that the initial perturbation ϕ(x) splits into two identical waves one of which runs to the left, and the other to the right with the speed a.

24

2. One-dimensional wave equation u

u

c

t=0 x

d

t=ε x

u

t= u t= u t= u t=

l 2a x l +ε 2a x l a x l +ε a x

Figure 2.5. Animation to Example 2.1.

The dotted lines on Figure 2.5 depict divergent semiwaves 12 ϕ(x − at) and 12 ϕ(x + at) whose sum is u(t, x). We assume ε > 0 to be suﬃciently small compared to the characteristic time interval al (actually it suﬃces that l ). ε < 2a

2.3. A semibounded string. Reﬂection of waves from the end of the string Consider the equation (2.5) for x ≥ 0, t ≥ 0. We need to impose a boundary condition at the end x = 0. For instance, let the motion of the string’s left end be given by (2.19)

u|x=0 = α(t),

t ≥ 0,

and the initial conditions be given only for x ≥ 0: (2.20)

u|t=0 = ϕ(x),

ut |t=0 = ψ(x),

x ≥ 0.

2.3. A semibounded string. Reﬂection of waves

25

The problem with initial and boundary conditions is usually called a mixed problem. Therefore, the problem with conditions (2.19) and (2.20) for a wave equation (2.5) is an example of a mixed problem (sometimes called the ﬁrst boundary value problem for the string equation). Returning to the boundary value problem above, we can see that the simplest way to get an analytic solution is as in the Cauchy problem. The quadrant t ≥ 0, x ≥ 0 is a convex domain; therefore, we can again represent u in the form u(t, x) = f (x − at) + g(x + at), where g(x) is now deﬁned and needed only for x ≥ 0 because x + at ≥ 0 for x ≥ 0, t ≥ 0. (At the same time f (x) is deﬁned, as before, for all x.) The initial conditions (2.20) deﬁne f (x) and g(x) for x ≥ 0 as in the case of an unbounded string and via the same formula. Therefore, for x − at ≥ 0 the solution is determined by d’Alembert’s formula, which is, of course, already clear. But now we can make use of the boundary condition (2.19) which gives f (−at) + g(at) = α(t), implying (2.21)

z − g(−z), f (z) = α − a

t ≥ 0,

z ≤ 0.

The summands in (2.21) for z = x − at give the sum of two waves running to the right. One of them is α(t − xa ) and is naturally interpreted as the wave generated by oscillations of the end of the string; the other one is −g(−(x − at)) and is interpreted as the reﬂection of the wave g(x + at) running to the left from the left end (note that this reﬂection results in the change of the sign). If the left end is ﬁxed, i.e., α(t) ≡ 0, only one wave, −g(−(x − at)), remains and if there is no wave running to the left, then only the wave α(t − xa ) induced by the oscillation α(t) of the end is left. Let us indicate a geometrically more transparent method for solving the problem with a ﬁxed end. We wish to solve the wave equation (2.5) for t ≥ 0, x ≥ 0, with the initial conditions (2.20) and the boundary condition u|x=0 = 0,

t ≥ 0.

Let us try to ﬁnd a solution in the form of the right-hand side of d’Alembert’s formula (2.16), where ϕ(x), ψ(x) are some extensions of functions ϕ(x), ψ(x) onto the whole axis deﬁned for x ≥ 0 by initial conditions (2.20). The boundary condition yields

at 1 ϕ(−at) + ϕ(at) + ψ(ξ)dξ, 0= 2 2a −at

26

2. One-dimensional wave equation

implying that we will reach our goal if we extend ϕ and ψ as odd functions, i.e., if we set ϕ(z) = −ϕ(−z),

ψ(z) = −ψ(−z),

z ≤ 0.

Thus, we can construct a solution of our problem. The uniqueness of the solution is not clear from this method, but, luckily, the uniqueness was proved above. Example 2.2. Let an end be ﬁxed, let ψ(x) = 0, and let the graph of ϕ(x) again have the form of an isosceles triangle with the base [l, 3l]. Let us draw an animation describing the form of the string. Continuing ϕ(x) as an odd function, we get the same problem as in Example 2.1. With a dotted line we depict the graph over the points of the left half-axis (introducing it is just a mathematical trick since actually only the half-axis x ≥ 0 exists). We get the animation described on Figure 2.6. Of interest here is the moment t = 2l a when near the end the string is horizontal and so does not look perturbed. At the next moment, t = 2l a + ε, however, the perturbation arises due to a nonvanishing initial velocity. For t > 3l a + ε we get two waves running to the right, one of which is negative and is obtained by reﬂection from the left end of the wave running to the left.

2.4. A bounded string. Standing waves. The Fourier method (separation of variables method) Consider oscillations of a bounded string with ﬁxed ends; i.e., solve the wave equation (2.5) with initial conditions (2.22)

u|t=0 = ϕ(x),

ut |t=0 = ψ(x),

x ∈ [0, l],

and boundary conditions (2.23)

u|x=0 = u|x=l = 0,

t ≥ 0.

This can be done as in the preceding section. We will, however, apply another method which also works in many other problems. Note that the uniqueness of the solution is already proved with the help of the energy conservation law. We will ﬁnd standing waves, i.e., the solutions of the wave equation (2.5) deﬁned for x ∈ [0, l], satisfying (2.23) and having the form u(t, x) = T (t)X(x).

2.4. A bounded string. Standing waves. The Fourier method

27

u −3l

−l l

3l

t=0 x

u t=ε x u

t=

u

t=

u

t=

u

t=

u

t=

u

t=

u

t=

l a x l +ε a x 3l 2a x 3l +ε 2a x 2l a x 2l +ε a x 3l +ε a x

Figure 2.6. Animation to Example 2.2.

The term “standing wave” is justiﬁed since the form of the string does not actually change with the time except for a factor which is constant in x and only depends on time. Substituting u(t, x) into (2.5), we get T (t)X(x) = a2 T (t)X (x).

28

2. One-dimensional wave equation

Excluding the uninteresting trivial cases when T (t) ≡ 0 or X(x) ≡ 0, we may divide this equation by a2 T (t)X(x) and get T (t) X (x) = . a2 T (t) X(x)

(2.24)

The left-hand side of this relation does not depend on x and the right-hand side does not depend on t. Therefore, they are constants; i.e., (2.24) is equivalent to the relations (2.25)

T (t) − λa2 T (t) = 0, X (x) − λX(x) = 0

with the same constant λ. Further, boundary conditions (2.23) imply X(0) = X(l) = 0. The boundary value problem (2.26)

X = λX,

X(0) = X(l) = 0

is a particular case of the so-called Sturm-Liouville problem. We will discuss the general Sturm-Liouville problem later. Now, we solve the problem (2.26); i.e., ﬁnd the values λ (eigenvalues) for which the problem (2.26) has nontrivial solutions (eigenfunctions), and ﬁnd these eigenfunctions. Consider the following possible cases. a) λ = μ2 , μ > 0. Then the general solution of the equation X = λX is X(x) = C1 sinh μx + C2 cosh μx. (Recall that sinh x = 12 (ex − e−x ) and cosh x = 12 (ex + e−x ).) From X(0) = 0 we get C2 = 0, and from X(l) = 0 we get C1 sinh μl = 0 implying C1 = 0, i.e., X(x) ≡ 0; hence, λ > 0 is not an eigenvalue. b) λ = 0. Then X(x) = C1 x + C2 and boundary conditions imply again C1 = C2 = 0 and X(x) ≡ 0. c) λ = −μ2 , μ > 0. Then X(x) = C1 sin μx + C2 cos μx; hence, X(0) = 0 implies C2 = 0 and X(l) = 0 implies C1 sin μl = 0. Assuming C1 = 0, we get sin μl = 0; therefore, the possible values of μ are kπ , k = 1, 2, . . . , μk = l and the eigenvalues and eigenfunctions are 2 kπ kπx , k = 1, 2, . . . . , Xk = sin λk = − l l

2.4. A bounded string. Standing waves. The Fourier method

29

X1 (x)

0

l

Figure 2.7. Standing wave 1.

X2 (x) l/2

l

0

Figure 2.8. Standing wave 2.

X3 (x)

0

l/3

2l/3 l

Figure 2.9. Standing wave 3.

Let us draw graphs of several ﬁrst eigenfunctions determining the form of standing waves (see Figures 2.7–2.9). It is also easy to ﬁnd the corresponding functions T (t). Namely, from (2.25) with λ = λk we get kπat kπat + Bk sin , l l implying the following general form of the standing wave: kπx kπat kπat + Bk sin sin , k = 1, 2, . . . . uk (t, x) = Ak cos l l l Tk (t) = Ak cos

The frequency of the oscillation of each point x corresponding to the solution uk is equal to kπa , k = 1, 2, . . . . ωk = l These frequencies are called the eigenfrequencies of the string. Now let us ﬁnd the general solution of (2.5) with boundary conditions (2.23) as a sum (superposition) of stationary waves, i.e., in the form ∞ kπx kπat kπat + Bk sin sin . Ak cos (2.27) u(t, x) = l l l k=1

30

2. One-dimensional wave equation

We need to satisfy the initial conditions (2.22). Substituting (2.27) into these conditions, we get (2.28)

ϕ(x) =

∞

Ak sin

k=1

(2.29)

ψ(x) =

∞ kπa k=1

l

kπx , l

Bk sin

x ∈ [0, l],

kπx , l

x ∈ [0, l];

i.e., it is necessary to expand both ϕ(x) and ψ(x) into series in eigenfunctions kπx sin : k = 1, 2, . . . . l First of all, observe that this system is orthogonal on the segment [0, l]. This can be directly veriﬁed, but we may also refer to the general fact of orthogonality of eigenvectors of a symmetric operator corresponding to distinct eigenvalues. For such an operator one should take the operator d2 2 L = dx 2 with the domain DL consisting of functions v ∈ C ([0, l]) such that v(0) = v(l) = 0. Integrating by parts, we see that (Lv, w) = (v, Lw),

v, w ∈ DL .

The parentheses denote the inner product in L2 ([0, l]) deﬁned via

l v1 (x)v2 (x)dx. (v1 , v2 ) = 0 kπx l }k=1,2,...

Thus, the system {sin to establish its completeness.

is orthogonal in L2 ([0, l]). We would like

For simplicity, take l = π (the general case reduces to this one by introducing the variable y = πx l ) so that the system takes on the form ∞ {sin kx}k=1 and is considered on [0, π]. Let f ∈ L2 ([0, π]). Extend f as an odd function onto [−π, π] and expand it into the Fourier series in the system {1, cos kx, sin kx : k = 1, 2, . . .} on the interval [−π, π]. Clearly, this expansion does not contain constants and the terms with cos kx, since the extension is an odd function. Therefore, on [0, π], we get the expansion in the system {sin kx : k = 1, 2, . . .} only. Note, further, that if f is continuous, has a piecewise continuous derivative on [0, π], and f (0) = f (π) = 0, then its 2π-periodic odd extension is also continuous with a piecewise continuous derivative. Therefore, it can be expanded into a uniformly convergent series in terms of {sin kx : k = 1, 2, . . .}. If this 2π-periodic extension is in C k and its (k + 1)st derivative is piecewise continuous, then this expansion can be diﬀerentiated k times and the uniform convergence is preserved under this diﬀerentiation.

2.4. A bounded string. Standing waves. The Fourier method

31

Thus, expansions into series (2.28) and (2.29) exist, and imposing some smoothness and boundary conditions on ϕ(x), ψ(x), we can achieve arbitrarily fast convergence of these series. Coeﬃcients of these series are uniquely deﬁned: l

kπx 2 l kπx 0 ϕ(x) sin l dx Ak = l = dx, ϕ(x) sin 2 kπx l l 0 l dx 0 sin l

l 2 l 0 ψ(x) sin kπx kπx l dx = dx. ψ(x) sin Bk = l 2 kπx kπa kπa 0 l sin dx 0

l

Substituting these values of Ak and Bk into (2.27), we get the solution of our problem. The Fourier method can also be used to prove the uniqueness of solution. (More precisely, the completeness of the trigonometric system enables us to reduce it to the uniqueness theorem for ODE.) We will illustrate this below by an example of a more general problem. Consider the problem of a string to which a distributed force f (t, x) is applied: (2.30)

utt = a2 uxx + f (t, x),

t ≥ 0, x ∈ [0, l].

The ends of the string are assumed to be ﬁxed (i.e., conditions (2.23) are satisﬁed) and for t = 0 the initial position and velocity of the string, i.e., conditions (2.22), are deﬁned as above. Since the eigenfunctions {sin kπx l : k = 1, 2, . . .} constitute a complete orthonormal system on [0, l], any function g(t, x), deﬁned for x ∈ [0, l] and satisfying reasonable smoothness conditions and boudnary conditions, can be expanded in terms of this system, with coeﬃcients depending on t. In particular, we can write ∞ kπx , uk (t) sin u(t, x) = l k=1

f (t, x) =

∞ k=1

fk (t) sin

kπx , l

where the fk (t) are known and the uk (t) are to be determined. Substituting these decompositions into (2.30), we get (2.31)

∞ kπx = 0, [¨ uk (t) + ωk2 uk (t) − fk (t)] sin l k=1

where the ωk = kπa l are the eigenfrequencies of the string. It follows from (2.31), due to the orthogonality of the system of eigenfunctions, that (2.32)

u ¨k (t) + ωk2 uk (t) = fk (t), k = 1, 2, . . . .

32

2. One-dimensional wave equation

The initial conditions ∞

uk (0) sin

kπx = ϕ(x), l

u˙ k (0) sin

kπx = ψ(x) l

k=1 ∞ k=1

determine uk (0) and u˙ k (0); hence, we can uniquely determine functions uk (t) and, therefore, the solution u(t, x). In particular, of interest is the case when f (t, x) is a harmonic oscillation in t; e.g., f (t, x) = g(x) sin ωt. Then, clearly, the right-hand sides fk (t) of equations (2.32) are fk (t) = gk sin ωt. For example, let gk = 0. Then for ω = ωk (the nonresonance case) the equation (2.32) has an oscillating particular solutions of the form uk (t) = Uk sin ωt, Uk =

gk2 . ωk2 − ω 2

Therefore, its general solution also has an oscillating form: uk (t) = Uk sin ωt + Ak cos ωk t + Bk sin ωk t. For ω = ωk (the resonance case) there is a particular solution of the form uk (t) = t(Mk cos ωt + Nk sin ωt), which the reader can visualize as an oscillation with the frequency ω and an increasing amplitude. The general solution uk (t) and the function u(t, x) are unbounded in this case. Observe that there is a more general problem that can be reduced to the problem above, namely the one with ends that are not ﬁxed but whose motion is given by (2.33)

u|x=0 = α(t),

u|x=l = β(t), t ≥ 0.

Namely, if u0 (t, x) is an δ-function satisfying (2.33), e.g., x l−x α(t) + β(t), l l then for v(t, x) = u(t, x) − u0 (t, x) we get a problem with ﬁxed ends. u0 (t, x) =

Finally, note that all the above problems are well-posed, which is obvious from their explicit solutions.

2.5. Appendix. The calculus of variations and mechanics

33

2.5. Appendix. The calculus of variations and classical mechanics The calculus of variations is an analogue of the usual diﬀerential calculus in the setting of an inﬁnite-dimensional space. Functions on an inﬁnitedimensional space are usually called functionals to distinguish them from functions on a ﬁnite-dimensional space (which can be arguments of the functionals). 2.5.1. Variations of functionals. Euler-Lagrange equations. Let us consider a functional

t1 L(t, x(t), x(t))dt, ˙ (2.34) S= t0

where t ∈ [t0 , t1 ], t0 < t1 , and we usually assume that x = x(t) ∈ C 1 ([t0 , t1 ]), ˙ = dx(t)/dt are the usual so that x˙ ∈ C([t0 , t1 ]). The functions x˙ = x(t) derivatives of x(t) with respect to t. The functions x = x(t) ∈ C([t0 , t1 ]) can be considered as arguments of the functional S. Let L = L(t, x(t), x(t)) ˙ = L(t, x, x)| ˙ x=x(t),x= ˙ x(t) ˙ be the substitution result of x(t), x(t) ˙ instead of x, x, ˙ respectively. All functions are assumed to be real valued, with the smoothness suﬃcient for the calculations. We will specify the appropriate level of smoothness wherever we need it. Now we can describe the variational setting and boundary conditions for our problem. Let us assume that we need to ﬁnd a minimum of the functional S on a set of functions x ∈ C 1 ([t0 , t1 ]), whose values at the ends t0 , t1 are ﬁxed. (In other words, we impose boundary conditions x(t0 ) = b0 , x(t1 ) = b1 where b0 , b1 are ﬁxed constants.) Let x = x(t) be a minimizer: a function where this minimum is attained. For any function δx ∈ C 1 ([t0 , t1 ]), satisfying the boundary conditions (2.35)

δx(t0 ) = δx(t1 ) = 0,

take also a function f = f (s) of one variable s ∈ R, deﬁned by

t1 L(t, x(t) + s δx(t), x(t) ˙ + s δ x(t))dt. ˙ (2.36) f (s) = S[x + s δx] = t0

Clearly, it should attain a minimum at s = 0. (In this formula δ x˙ should be understood as (d/dt)δx(t).) Therefore,

t1 ∂L ∂L δx + δ x˙ dt = 0, f (0) = ∂x ∂ x˙ t0

34

2. One-dimensional wave equation

where ∂L/∂ x˙ should be understood as ∂L(t, x(t), v)/∂v|v=x(t) ˙ . Integrating by parts in the second term, and taking into account the boundary conditions for δx, we obtain

t1 ∂L d ∂L − δx dt = 0. ∂x dt ∂ x˙ t0 Since δx can be an arbitrary function from C 2 ([t0 , t1 ]), vanishing at the ends t0 , t1 , we easily obtain that the expression in the parentheses should vanish identically; i.e., we get the equation d ∂L ∂L − = 0, dt ∂ x˙ ∂x on the interval [t0 , t1 ]. This is a second-order ordinary diﬀerential equation that is called the Euler-Lagrange equation. Its solution satisfying the boundary conditions x(t0 ) = b0 , x(t1 ) = b1 is called the extremal or the extremal point of the action functional S. All minimizers and maximizers of S should be extremal points. Sometimes, if it is known for some reasons that the minimum is attained and there is only one extremal, then this extremal must coincide with the minimizer. Algebraic machinery of the variational calculus Here we give a short list of algebraic notation in the calculus with one degree of freedom. Denote by δS a linear functional on the space of functions δx ∈ C 2 ([t0 , t1 ]), vanishing at the ends of the interval [t0 , t1 ], deﬁned by

t1 ∂L d ∂L δx + δ x˙ dt = δS[δx] = S[x + s δx] ds ∂x ∂ x˙ t0 s=0

t1 d ∂L ∂L − δx dt. = ∂x dt ∂ x˙ t0 It is the principal linear part of the functional S at the point x, and it is called the variation of the functional S. Using the variation, we can rewrite the extremality condition in the form δS = 0 which means that δS[δx] = 0 for all variations δx, which vanish at the ends of the interval [t0 , t1 ]. Clearly, this is equivalent to the Euler-Lagrange equation. We can rewrite the last equation in the form

t1 δS δx dt, δS = δS[δx] = t0 δx where ∂L d ∂L δS = − . δx ∂x dt ∂ x˙ Here the right-hand side is called the variational derivative of S.

2.5. Appendix. The calculus of variations and mechanics

35

The case of several degrees of freedom There is a straightforward extension of the arguments given above to the case when the action functional S depends upon several functions x1 (t), . . . , xn (t) belonging to C 2 ([t0 , t1 ]). In fact, it can be obtained if we just use the same formulas as above but for a vector function x instead of a scalar function. (This vector function can be interpreted as a parametrized curve in Rn , or, alternatively, a trajectory of a point moving in Rn .) Nevertheless, let us reproduce the main formulas, leaving the details of the arguments as an exercise for the readers. The action S is given by

t1 L(t, x1 (t), . . . , xn (t), x˙ 1 (t), . . . , x˙ n (t))dt, S[x] = t0

where L = L(t, x1 , . . . , xn , v1 , . . . , vn ) is the Lagrangian, which is a scalar real-valued function. The variations of the given (real-valued) functions x1 (t), . . . , xn (t) are real-valued C 2 -functions δx1 (t), . . . , δxn (t), vanishing at the ends t0 , t1 . The extremum conditions are obtained from the equations ∂ S[x1 + s1 δx1 , . . . , xn + sn δxn ] = 0, j = 1, . . . , n, ∂sj s1 =···=sn =0 which lead to the Euler-Lagrange equations ∂L d ∂L − = 0, dt ∂ x˙ j ∂xj

j = 1, . . . , n.

This system of equations is equivalent to the vanishing of the variation δS, which is a linear functional on vector functions δx = (δx1 , . . . , δxn ); it is given by

t1 n ∂L ∂L δxj + δ x˙ j dt δS = δS[δx] = ∂ x˙ j t0 j=1 ∂xj

t1 n d ∂L ∂L − δxj dt. = ∂xj dt ∂ x˙ j t0 j=1

This can be rewritten in the form

t1

δS = δS[δx] = t0

n δS δxj dt δxj j=1

where ∂L d ∂L δS = − δxj ∂xj dt ∂ x˙ j is called the partial variational derivative of S with respect to xj .

36

2. One-dimensional wave equation

2.5.2. Lagrangian approach in classical mechanics. The most important equations of classical mechanics can be presented in the form of the Euler-Lagrange equations. (In mechanics they are usually called Lagrange’s equations.) Take, for example, a point particle of mass m > 0 moving along the x-axis R in a time-independent (or stationary) ﬁeld of forces F (x) (which is assumed to be a real-valued continuous function on R). Let x(t) denote the coordinate of the moving particle at the time t. Then by Newton’s Second Law, (mass)·(acceleration)=(force), or, more precisely, m¨ x(t) = F (x(t)) for all t. It is sometimes more convenient to rewrite this equation in the form dp = F, dt where p = mx˙ is called the momentum of the moving particle. Let us introduce a potential energy of the particle, which is a real-valued function U = U (x) on R, such that F (x) = −U (x) for all x. In other words,

x F (s)ds, U (x) = U (x0 ) − x0

which means that to move the particle from x0 to x you need to do work U (x) − U (x0 ) against the force F . (If U (x) < U (x0 ), then in fact the force does the work.) The Newton equation can be rewritten then as m¨ x = −U (x). Multiplying both parts by x, ˙ we can rewrite the result as d d mx˙ 2 = − U (x), dt 2 dt where x = x(t). Taking antiderivative, we obtain mx˙ 2 + U (x) = E = const, 2 where the constant E depends on the solution x = x(t) (but is constant along the trajectory). This is the energy conservation law for the one-dimensional motion. The quantity mx˙ 2 K= 2 is called the kinetic energy of the moving particle, and the quantity H = K + U = H(x, x) ˙ is called the energy, or full energy, or Hamiltonian. The energy conservation law can be rewritten in the form H(x, x) ˙ = E = const.

2.5. Appendix. The calculus of variations and mechanics

37

Now let us take the Lagrangian L = L(x, x) ˙ =K−U =

mx˙ 2 − U (x) 2

and the corresponding action S. Then it is easy to see that the extremals of S will be exactly the solutions of the Newton Second Law equation, satisfying the chosen boundary conditions. In other words, the Euler-Lagrange equation for this action coincides with the Newton Second Law equation. Now consider a motion of a particle in Rn . (Note that any motion of k particles in R3 can be equivalently presented as a motion of one particle in R3k , the space whose coordinates consist of all coordinates of all particles.) The particle coordinates can be presented as a point x ∈ Rn , x = (x1 , . . . , xn ), so the motion of the particle is given by functions x1 (t), . . . , xn (t), or by one vector function x(t) (with values in Rn ). Let us assume that the force is F (x) = (F1 (x), . . . , Fn (x)), so that F (x) ∈ Rn for every x ∈ Rn , and F is continuous. (In the case when x represents a system of particles, F may include forces of interaction between these particles, which may depend on the conﬁguration of these particles.) Then by Newton’s Second Law, the motion satisﬁes the equations ¨j = Fj (x), mj x

j = 1, . . . , n.

Let us assume also that the ﬁeld of forces is a potential ﬁeld, given by a potential energy U = U (x) (which is a scalar, real-valued function on Rn ), so that ∂U , j = 1, . . . , n, Fj (x) = − ∂xj or, in the vector analysis notation, F = −∇U = − grad U. In this case, it is easy to see that the equations of motion coincide with the Euler-Lagrange equations for the action with the Lagrangian L = K − U , where the kinetic energy K is given by K=

n mj x˙ 2j j=1

2

.

The Hamiltonian H = H(x, x) ˙ = K + U satisﬁes the energy conservation law H(x(t), x(t)) ˙ = E = const, where x(t) is the solution of the equations of motion (or, equivalently, the extremal of the action). This can be veriﬁed

38

2. One-dimensional wave equation

by a straightforward calculation: d ˙ dt H(x(t), x(t))

∂H ∂H · x˙ + ·x ¨ ∂x ∂ x˙ n ∂H ∂H x˙ j + x ¨j = ∂xj ∂ x˙ j j=1 n ∂U 1 ∂U ∂K − x˙ j + = ∂xj ∂ x˙ j mj ∂xj j=1 n ∂U 1 ∂U = x˙ j + mj x˙ j − = 0. ∂xj mj ∂xj =

j=1

Now let us consider a system of k particles in R3 . Assume that n = 3k and write x = (x1 , . . . , xk ), where xj ∈ R3 , j = 1, . . . , k. Let us view xj as a point in R3 where the jth particle is located. For moving particles we assume that xj = xj (t) is a C 2 -function of t. Let us introduce the momentum of the jth particle as pj = mj x˙ j , which is a vector in R3 . The momentum of the whole system of such particles is deﬁned as p = p1 + · · · + pk , which is again a vector in R3 . Now assume that all forces acting on the particles are split into pairwise interaction forces between the particles and external forces, so that the force acting on the jth particle is Fij + Fext Fj = j , i=j

where Fij = Fij (x) ∈ R3 , and the summation is taken over i (with j ﬁxed), 3 = Fext Fext j j (x) ∈ R . Here Fij is the force acting on the jth particle due to its interaction with the ith particle, and Fext is the external force acting j upon the jth particle. The equations of motion look like dpj = Fj = Fij + Fext j , dt

j = 1, . . . , k.

i=j

Now assume that Newton’s Third Law holds for two interacting particles: Fij (x) = −Fji (x), for all i, j with i = j and for all x. Then we get dp = Fext = Fext j . dt k

j=1

2.6. Problems

39

In words, the rate of change of the total momentum in time equals the total external force. (The pairwise interaction forces, satisfying Newton’s Third Law, do not count.) Indeed, calculating the derivative of p, we see that all interaction forces cancel due to Newton’s Third Law. In particular, in the absence of external forces the conservation of momentum law holds: for any motion of these particles p(t) = const . Indeed, from the relations above we immediately see that dp/dt = 0. Remark. The mechanical system, describing a particle, moving in Rn , is said to have n degrees of freedom. In the main text of this chapter we discuss the motion of a string, which naturally has inﬁnitely many degrees of freedom, because its position is described not by a ﬁnite set of numbers but by a function (of one variable), or by an inﬁnite set of its Fourier coeﬃcients.

2.6. Problems 2.1. a) Write the boundary condition on the left end of the string if a ring which can slide without friction along the vertical line is attached to this end. The ring is assumed to be massless (see Figure 2.10). b) The same as in a) but the ring is of mass m. c) The same as in b) but the ring slides with a friction which is proportional to its velocity. 2.2. a) Derive the energy conservation law for a string if at the left end the boundary condition of Problem 2.1 a) is satisﬁed and the right end is ﬁxed.

Figure 2.10. The ring attached to the end of the string.

Figure 2.11. The end of the rod is elastically attached.

40

2. One-dimensional wave equation

b) The same as in Problem 2.1 b) (take into account the energy of the ring). c) The same as in Problem 2.1 c) (take into account the loss of energy caused by friction). 2.3. Derive the equation for longitudinal elastic vibrations of a homogeneous rod. 2.4. Write the boundary condition for the left end of the rod if this end is free. 2.5. Write the boundary condition for the left end of the rod if this end is elastically restrained; see Figure 2.11. 2.6. Derive the energy conservation law for a rod with a) ﬁxed ends; b) free ends; c) one end ﬁxed and the other restrained elastically. 2.7. Derive an equation for longitudinal vibrations of a conic rod. 2.8. Prove that the energy of an interval of a string between c + at and d − at is a decreasing function of time. Deduce from this the uniqueness of the solution of the Cauchy problem and information about the dependence and inﬂuence domains for [c, d]. 2.9. Formulate and prove the energy conservation law for an inﬁnite string. 2.10. Draw an animation describing oscillations of an inﬁnite string with 1 for x ∈ [c, d], the initial values u|t=0 = 0, ut |t=0 = ψ(x), where ψ(x) = 0 for x ∈ [c, d]. 2.11. Describe oscillations of an inﬁnite string for t ∈ (−∞, +∞) such that some interval (x0 − ε, x0 + ε) of the string is in equilibrium position at all times. 2.12. The same as in the above problem but the interval (x0 − ε, x0 + ε) is ﬁxed only for t ≥ 0. 2.13. Draw an animation describing oscillations of a semibounded string with a free end (the boundary condition ux |x=0 = 0, see Problem 2.1 a), and the initial conditions u|t=0 = ϕ(x), ut |t=0 = 0, where the graph ϕ(x) has the form of an isosceles triangle with the base [l, 3l]).

2.6. Problems

41

2.14. Draw an animation describing oscillations of a semibounded string with a ﬁxed end and the initial conditions u|t=0 = 0, ut |t=0 = ψ(x), where ψ(x) is the characteristic function of the interval (l, 3l). 2.15. Derive the energy conservation law for a semibounded string with a ﬁxed end. 2.16. Given the wave sin ω(t + xa ) running for t ≤ 0 along the rod whose left end is elastically attached, ﬁnd the reﬂected wave. 2.17. Given a source of harmonic oscillations with frequency ω running along an inﬁnite string at a speed v < a, i.e., u|x=vt = sin ωt for t ≥ 0, describe the oscillations of the string to the left and to the right of the source. Find frequencies of induced oscillations of a ﬁxed point of the string to the left and to the right of the source and give a physical interpretation of the result (Doppler’s eﬀect). 2.18. Describe and draw standing waves for a string with free ends. 2.19. Prove the existence of an inﬁnite series of standing waves in a rod with the ﬁxed left end and elastically attached right one (see Problem 2.5). Prove the orthogonality of eigenfunctions. Find the shortwave asymptotics of eigenvalues. 2.20. The force F = F0 sin ωt, where ω > 0, is applied to the right end of the rod for t ≥ 0. Derive a formula for the solution describing forced oscillations of the rod and ﬁnd the conditions for a resonance if the left end is a) ﬁxed; b) free. (Comment. The forced oscillations are the oscillations which have the frequency which coincides with the frequency of the external force or other external perturbation.) 2.21. The right end of a rod is oscillating via A sin ωt for t ≥ 0. Write a formula for the solution describing forced oscillations and ﬁnd the conditions for resonance if the left end is a) ﬁxed; b) free; c) elastically restrained.

10.1090/gsm/205/03

Chapter 3

The Sturm-Liouville problem

3.1. Formulation of the problem Let us consider the equation ρ(x)utt = (p(x)ux )x − q(x)u,

(3.1)

which is more general than the equation derived in Chapter 2 for oscillations of a nonhomogeneous string. For the future calculations to be valid, we will assume that ρ ∈ C 1 ([0, l]), ρ > 0 on [a, b], p ∈ C 1 ([a, b]), and q ∈ C([a, b]) are real valued. Let us try to simplify the equation by a change of variables u(t, x) = z(x)v(t, x), where z(x) is a known function. This will allow us to obtain an equation of the form (3.1) with ρ(x) ≡ 1. Indeed, we have utt = zvtt , ux = zvx + zx v, uxx = zvxx + 2zx vx + zxx v. Substituting this into (3.1) and dividing by zρ, we get p 2p zx px vtt = vxx + · + vx + B(x)v. ρ ρ z ρ This equation is of the form (3.1) with ρ(x) ≡ 1 whenever 2p zx px p · + . = ρ x ρ z ρ Performing the diﬀerentiation in the left-hand side we see that this can be equivalently rewritten in the form ρx zx =− z 2ρ 43

44

3. The Sturm-Liouville problem

and we may take, for example, z(x) = ρ(x)−1/2 . Since we always assume ρ(x) > 0, such a substitution is possible. Thus, instead of (3.1), we may consider the equation utt = (p(x)ux )x − q(x)u.

(3.2)

Solving it by separation of variables on the closed interval [0, l] and assuming that the ends are ﬁxed, we get the following eigenvalue problem: (3.3)

−(p(x)X (x)) + q(x)X(x) = λX(x),

(3.4)

X(a) = X(b) = 0.

Generalizing it, we can take the same diﬀerential operator while replacing the boundary conditions by more general conditions: αX (a) + βX(a) = 0, γX (b) + δX(b) = 0, where α, β, γ, δ are real numbers, α2 + β 2 = 0, γ 2 + δ 2 = 0. This eigenvalue problem is called the Sturm-Liouville problem. Considering this problem we usually suppose that p ∈ C 1 ([a, b]), q ∈ C([a, b]), and p(x) = 0 for x ∈ [0, l]. For simplicity, we assume that p(x) ≡ 1 and the boundary conditions are as in (3.4). Thus, consider the problem −X (x) + q(x)X(x) = λX(x), with the boundary conditions (3.4) and assume also that (3.5)

q(x) ≥ 0.

This is not a restriction since we can achieve (3.5) by adding the same constant to q and to λ.

3.2. Basic properties of eigenvalues and eigenfunctions We will start with the following oscillation theorem of C. Sturm (1836): Theorem 3.1. Let y, z, q1 , q2 ∈ C 2 ([a, b]) be real-valued functions, such that z(a) = z(b) = 0, z ≡ 0, y and z satisfy the equations (3.6)

y (x) + q1 (x)y(x) = 0,

(3.7)

z (x) + q2 (x)z(x) = 0.

Assume also that (3.8)

q1 (x) ≥ q2 (x) for all x ∈ [a, b].

3.2. Eigenvalues and eigenfunctions

45

Then, either there exists x0 ∈ (a, b) such that y(x0 ) = 0, or q1 (x) ≡ q2 (x) and y(x) ≡ cz(x) on [a, b], where c is a constant. Proof. Note that, due to the existence and uniqueness theorem for secondorder ODEs, all zeros of the function z are simple (i.e., if z(x0 ) = 0 for a point x0 ∈ [a, b], then z (x0 ) = 0). In particular, all zeros of z are isolated. Without loss of generality, we may assume that a and b are neighboring zeros of z(x) and z(x) > 0 for all x ∈ (a, b). Then z (a) > 0 and z (b) < 0. If y(x) does not vanish on (a, b), then we may assume that y(x) > 0 on (a, b). Let us consider the Wronskian (3.9)

W (x) = y(x)z (x) − y (x)z(x),

x ∈ [a, b].

Diﬀerentiating W (x), we obtain (3.10)

W (x) = y(x)z (x) − y (x)z(x) = (q1 (x) − q2 (x))y(x)z(x).

Integrating over [a, b], we get W (b) − W (a) ≥ 0, and equality holds if and only if q1 (x) ≡ q2 (x) on [a, b]. On the other hand, by assumption, y(a), y(b) ≥ 0. Then we have W (b) − W (a) = y(b)z (b) − y(a)z (a) ≤ 0, with equality if and only if y(a) = y(b) = 0. Comparing the obtained inequalities, we see that W (b) − W (a) = 0, implying both q1 (x) ≡ q2 (x) and y(a) = y(b) = 0. Since z(a) = z(b) = 0, the solutions y and z must be proportional; i.e., y(x) = c z(x), where c is a constant. (Indeed, taking c such that y (a) = c z (a), the uniqueness theorem for ODEs implies y(x) ≡ c z(x).) Remark 3.2. Arguing as in the proof of Theorem 3.1, we can also prove that the Wronskian W (x) of the two eigenfunctions with the same eigenvalues (and the identical potentials q(x) = q1 (x) = q2 (x)) are proportional. 2

d Let us introduce the Sturm-Liouville operator L = − dx 2 + q(x), which is also called the one-dimensional Schr¨ odinger operator. We consider L as an (unbounded) linear operator in L2 ([0, l]), with the domain DL consisting of all complex-valued functions v(x) ∈ C 2 ([0, l]) such that v(0) = v(l) = 0. The Sturm-Liouville problem is to ﬁnd the eigenvalues and eigenvectors of this operator.

Note that the operator L is symmetric; i.e., (Lv1 , v2 ) = (v1 , Lv2 )

for v1 , v2 ∈ DL ,

where (·, ·) is the inner product in L2 ([0, l]) deﬁned by

l u(x) v(x) dx. (u, v) = 0

46

3. The Sturm-Liouville problem

Indeed,

l

(Lv1 , v2 ) − (v1 , Lv2 ) =

0

[−v1 v¯2 + v1 v¯2 ] dx

l

d [−v1 v¯2 + v1 v¯2 ] dx = [−v1 v¯2 + v1 v¯2 ]|l0 = 0. dx 0 This implies that all the eigenvalues are real and the eigenfunctions corresponding to diﬀerent eigenvalues are orthogonal in L2 ([0, l]). =

In addition, all the eigenvalues are simple: eigenspaces are one-dimensional since any two solutions of the equation −y + q(x) y − λ y = 0 that vanish at some point x0 are proportional to each other (and each of them is proportional, due to the uniqueness theorem, to the solution with the initial values y(x0 ) = 0, y (x0 ) = 1). Furthermore, the condition q(x) ≥ 0 and Sturm’s theorem imply that all eigenvalues are positive. To establish this, let us compare the solutions to −z + q(x)z = λ z with those to −y = λ y. If an eigenfunction z(x) with the eigenvalue λ is given, then, by Sturm’s theorem, any solution of the equation −y = λ y should vanish at a point of the segment [0, l]. But if λ ≤ 0, then √ among the solutions of the equation −y = λ y there is a solution cosh −λ x which never vanishes. √ For λ > 0, consider the solution y = sin √λ x which vanishes on [0, l] at least at one point besides 0 precisely when λ ≥ πl . This clearly implies that any eigenvalue λ of L satisﬁes

π 2 λ≥ . l √ From now on set for convenience k = λ so that the eigenvalue equation takes on the form (3.11)

−X + q(x)X = k 2 X.

We will always assume that k > 0. Denote by ψ = ψ(x, k) the solution of (3.11) with initial conditions ψ(0, k) = 0,

ψx (0, k) = k,

where ψx denotes the derivative with respect to x. (For example, if q(x) ≡ 0, then ψ(x, k) = sin kx.) The eigenvalues of the Sturm-Liouville operator L are of the form λ = k 2 , where k is such that ψ(l, k) = 0. The Sturm theorem implies that the number of zeros of ψ(x, k) belonging to any ﬁxed segment [0, a], where a ≤ l, is an increasing function of k. Indeed, if 0 = x1 (k) < x2 (k) < · · · < xn (k) are all such zeros of ψ(·, k) (xn (k) ≤ a) and k > k, then every open interval (xj (k), xj+1 (k)) contains at least one zero of ψ(·, k ). Recalling that x1 (k) = 0 for all k, we see that

3.3. The short-wave asymptotics

47

the total number of zeros of ψ(·, k ) on [0, a] is greater than or equal to n, which proves the claim. Therefore, as k grows, all zeros of ψ(x, k) move to the left. In fact, it is easy to see that this motion is continuous, even smooth. Indeed, the zeros xj (k) satisfy the equation ψ(xj (k), k) = 0. But the function ψ = ψ(x, k) has continuous partial derivatives in x and k (cf. [12], Sec. V.3) and ψx (xj (k), k) = 0, as observed above. By the implicit function theorem, the function xj = xj (k) is continuous. In fact, if we diﬀerentiate the equation ψ(xj (k), k) = 0 with respect to k, we obtain ψx (xj (k), k) · xj (k) = −ψk (xj (k), k), showing that xj (k) is also continuous. The eigenvalues correspond to the values k for which a new zero appears at l. Since the number of these zeros is ﬁnite for any k, this implies that the eigenvalues form a discrete sequence λ1 < λ2 < λ3 < · · · , which is either ﬁnite or tends to inﬁnity. The eigenfunction Xn (x) corresponding to the eigenvalue λn has exactly n − 1 zeros on the open interval (0, l). It is easy to see that the number of eigenvalues is actually inﬁnite. Indeed, Sturm’s theorem implies that the number of zeros of ψ(x, k) on (0, l) is not less than the number of zeros on (0, l) for the corresponding solution of the equation −y + M y = k 2 y, where M = sup q(x). x∈[0,l]

√

But this solution is c sin k 2 − M x with c = 0, and the number of its zeros on (0, l) tends to inﬁnity as k → +∞. Thus, we have proved Theorem 3.3. The eigenvalues are simple and form an increasing sequence λ1 < λ2 < λ3 < · · · tending to +∞. If q1 ≥ 0, then λ1 > 0. The eigenfunctions Xn (x), corresponding to diﬀerent eigenvalues λn , are orthogonal. The eigenfunction Xn (x) has exactly n − 1 zeros on the open interval (0, l).

3.3. The short-wave asymptotics Let us describe the asymptotic behavior of large eigenvalues and the corresponding eigenfunctions. The asymptotic formulas obtained are often called the high-frequency asymptotics or short-wave asymptotics, meaning that they correspond to high frequencies and therefore to short wavelengths in the nonstationary (i.e., time-dependent) problem.

48

3. The Sturm-Liouville problem

The behavior of eigenvalues is easily described with the help of oscillation d2 theorems. Namely, the eigenvalues of the operator L = − dx 2 + q(x) are d2 contained between the eigenvalues of L1 = − dx2 and the eigenvalues of d2 L2 = − dx 2 + M , where M = max q(x). Since the eigenvalues of L1 and L2 x∈[0,l]

πn 2 2 are equal to ( πn l ) and ( l ) + M , respectively (with n = 1, 2, . . .), we get

πn 2 l

≤ λn ≤

πn 2 l

+ M.

In particular, this implies the asymptotic formula

πn 2 1 1+O as n → ∞, λn = l n2 √ or, putting kn = λn , πn 1 1 πn = 1+O +O as n → ∞. (3.12) kn = l n2 l n Now, let us ﬁnd the asymptotics of the eigenfunctions Xn (x). The idea is that for large k the term k 2 X in (3.11) plays a greater role than q(x)X. Therefore, let us solve the equation ψ + k 2 ψ = q(x)ψ

(3.13) with the initial conditions (3.14)

ψ(0) = 0,

ψ (0) = k.

Treating (3.13) as a nonhomogeneous equation with right-hand side q(x)ψ, we want to obtain an integral equation for ψ = ψ(x) = ψ(x, k) that can be solved by successive approximations. Let us write ψ(x) = C1 (x) cos kx + C2 (x) sin kx. Then the equations for C1 (x) and C2 (x), obtained by variation of parameters, are (3.15)

C1 (x) cos kx + C2 (x) sin kx = 0,

(3.16)

−kC1 (x) sin kx + kC2 (x) cos kx = q(x)ψ.

Solving these equations, we ﬁnd C1 (x) and C2 (x) in terms of ψ. After integration this determines C1 (x) and C2 (x) up to arbitrary additive constants which are ﬁxed by the initial values (3.17)

C1 (0) = 0,

C2 (0) = 1,

3.3. The short-wave asymptotics

obtained from (3.14). Then

49

x

C1 (x) = −

q(τ )ψ(τ ) 0

sin kτ dτ, k

x

cos kτ dτ, k 0 so ψ satisﬁes an integral equation (of so-called Volterra type):

1 x sin k(x − τ )q(τ )ψ(τ )dτ. (3.18) ψ(x) = sin kx + k 0 C2 (x) = 1 +

q(τ )ψ(τ )

Clearly, the solution ψ of this equation satisﬁes (3.13) and (3.14) so it is equivalent to (3.13) with initial conditions (3.14). To solve (3.18) by successive approximations, consider the integral operator A deﬁned by the formula

x Aψ(x) = sin k(x − τ )q(τ )ψ(τ )dτ. 0

The equation (3.18) becomes 1 A ψ = sin kx, I− k and its solution may be expressed in the form −1 ∞ 1 n 1 sin kx = A (sin kx), (3.19) ψ= I− A k kn n=0

provided the series in the right-hand side of (3.19) converges and A can be applied termwise. It can be proved that the series (3.19) indeed converges uniformly in x ∈ [0, l] for all k but, since we are only interested in large k, we can restrict ourselves to an obvious remark that it converges uniformly on [0, l] for large k because the norm of A is bounded uniformly in k as an operator in the Banach space C([0, l]) (with sup norm) and the norm of sin kx in C([0, l]) is not greater than 1. In particular, we get 1 as k → +∞. (3.20) ψ(x, k) = sin kx + O k This gives the short-wave asymptotics of the eigenfunctions Xn . Indeed, it is clear from (3.12) and (3.20) that 1 1 1 πn +O +O ; = sin Xn (x) = ψ(x, kn ) = sin(kn x) + O kn l n n hence πnx +O Xn (x) = sin l

1 as n → +∞. n

50

3. The Sturm-Liouville problem

3.4. The Green’s function and completeness of the system of eigenfunctions 2

d Consider the Schr¨odinger operator L = − dx 2 + q(x) as a linear operator mapping its domain

DL = {v(x) ∈ C 2 ([0, l]) : v(0) = v(l) = 0} into C([0, l]). We will see that for q ≥ 0 this operator is invertible and the inverse operator L−1 can be expressed as an integral operator

l −1 G(x, y)f (y)dy for f ∈ C([0, l]), (3.21) L f (x) = 0

where G ∈ C([0, l]×[0, l]). The function G(x, y) is called the Green’s function of L. In general, if an operator is expressed in the form of the right-hand side of (3.21), then G(x, y) is called its kernel or Schwartz’s kernel (in honor of L. Schwartz). Therefore, L−1 has a continuous Schwartz kernel equal to G(x, y). Clearly, the function G(x, y) is uniquely deﬁned by (3.21). To ﬁnd L−1 f , we have to solve the equation (3.22)

−v (x) + q(x)v(x) = f (x)

with the boundary conditions (3.23)

v(0) = v(l) = 0.

This is performed by the variation of parameters. It is convenient to use two solutions y1 (x), y2 (x) of the homogeneous equation −y (x) + q(x)y(x) = 0 satisfying the boundary conditions (3.24)

y1 (0) = 0,

y1 (0) = 1

and (3.25)

y2 (l) = 0,

y2 (l) = −1.

Clearly, the solutions y1 (x) and y2 (x) are linearly independent since 0 is not an eigenvalue of L because q(x) ≥ 0. Now, by the variation of parameters, writing v(x) = C1 (x)y1 (x) + C2 (x)y2 (x), we arrive to the equations (3.26)

C1 (x)y1 (x) + C2 (x)y2 (x) = 0,

(3.27)

C1 (x)y1 (x) + C2 (x)y2 (x) = −f (x).

The determinant of this system of linear equations for C1 (x) and C2 (x) is the Wronskian W (x) = y1 (x)y2 (x) − y1 (x)y2 (x).

3.4. The Green’s function and eigenfunctions

51

It is well known (and easy to verify by diﬀerentiation of W (x)—see the calculation in the proof of Theorem 3.1) that W (x) = const and W (x) = 0 if and only if y1 and y2 are linearly dependent. Therefore, in our case W (x) = W = const = 0. Solving the system (3.26), (3.27) by Cramer’s rule, we get 1 1 f (x)y2 (x), C2 (x) = − f (x)y1 (x). W W Taking into account the boundary condition (3.23) for v and the boundary conditions (3.24), (3.25) for y1 , y2 , we see that C1 (x) and C2 (x) should be chosen so that C1 (l) = 0, C2 (0) = 0. C1 (x) =

This implies 1 C1 (x) = − W

l x

1 f (ξ)y2 (ξ)dξ, C2 (x) = − W

x

f (ξ)y1 (ξ)dξ. 0

Therefore, v exists, is unique, and is determined by the formula

x

l 1 1 y1 (ξ)y2 (x)f (ξ)dξ − y1 (x)y2 (ξ)f (ξ)dξ, v(x) = − W 0 W x which can be expressed in the form

l G(x, ξ)f (ξ)dξ, (3.28) v(x) = 0

where 1 [θ(x − ξ)y1 (ξ)y2 (x) + θ(ξ − x)y1 (x)y2 (ξ)] W and θ(z) is the Heaviside function: θ(z) = 1 for z ≥ 0, θ(z) = 0 for z < 0.

(3.29)

G(x, ξ) = −

Note that G(x, ξ) is continuous and symmetric; i.e., (3.30)

G(x, ξ) = G(ξ, x).

(The property (3.30) is clear without explicit computation. Indeed, L−1 should be symmetric due to the symmetry of L, and the symmetry of L−1 is equivalent to that of G(x, ξ).) Consider now the operator G in L2 ([0, l]) with the kernel G(x, ξ):

l G(x, ξ)f (ξ)dξ. Gf (x) = 0

Since G(x, ξ) is continuous and symmetric, G is a symmetric compact operator. By the Hilbert-Schmidt theorem it has a complete orthogonal system of eigenfunctions Xn (x) with real eigenvalues μn , where n = 1, 2, . . . and μn → 0 as n → ∞. (For the proofs of these facts from functional analysis

52

3. The Sturm-Liouville problem

see, e.g., Sect. VI.5, especially Theorem VI.16, in Reed and Simon [25].) We have GXn = μn Xn .

(3.31)

If the Xn are continuous, then, applying L to (3.31), we see that Xn = μn LXn . This implies that μn = 0 and λn = 1/μn is an eigenvalue of L. The continuity of Xn follows easily from (3.31), provided μn = 0. In general, if f ∈ L2 ([0, l]), then Gf ∈ C([0, l]) because by the Cauchy-Schwarz inequality l |(Gf )(x ) − (Gf )(x )| = [G(x , ξ) − G(x , ξ)]f (ξ)dξ 0

≤ sup |G(x , ξ) − G(x , ξ)| · ξ∈[0,l]

≤ sup |G(x , ξ) − G(x , ξ)| ·

√

l

|f (ξ)|dξ

0 l

1/2 |f (ξ)| dξ 2

l 0

ξ∈[0,l]

and because of the fact that G(x, ξ) is uniformly continuous on [0, l] × [0, l]. It remains to prove that G cannot have zero as an eigenvalue on L2 ([0, l]). But if u ∈ L2 ([0, l]) and Gu = 0, then u is orthogonal to the image of G, since then (3.32)

(u, Gf ) = (Gu, f ) = 0

for all f ∈ L2 ([0, l]).

But all the functions v ∈ DL can be represented in the form Gf since we have v = G(Lv) by construction of G. Since DL is dense in L2 ([0, l]), we see that (3.32) implies u = 0. Thus, the set of eigenfunctions of G coincides exactly with the set of eigenfunctions of L. In particular, we have proved the completeness of the system of eigenfunctions of L in L2 ([0, l]). The following properties of the Green’s function are easily deducible with the help of (3.29): a) G(x, ξ) has continuous derivatives up to the second order for x = ξ and d2 satisﬁes the equation Lx G(x, ξ) = 0 (also for x = ξ), where Lx = − dx 2 +q(x).

b) G(x, ξ) is continuous everywhere, whereas its derivative Gx has left and right limits at x = ξ and the jump is equal to −1: Gx (ξ + 0, ξ) − Gx (ξ − 0, ξ) = −1.

c) the boundary conditions G(0, ξ) = G(l, ξ) = 0 are satisﬁed for all ξ ∈ [0, l]. These properties may be used to ﬁnd G without variation of parameters. It is easy to verify that G(x, ξ) is uniquely deﬁned by these conditions.

3.4. The Green’s function and eigenfunctions

53

Conditions a) and b) can be easily expressed with the help of the Dirac δ-function. Later on we will do this more carefully, but now we just give some heuristic arguments (on the “physical” level of rigor). It is convenient to apply formula (3.21) which expresses L−1 in terms of its kernel from the very beginning. Let us formally apply L to (3.21) to obtain

l f (x) = Lx G(x, ξ)f (ξ) dξ. 0

If we denote δ(x, ξ) = Lx G(x, ξ), then this object formally satisﬁes

l δ(x, ξ)f (ξ) dξ. (3.33) f (x) = 0

It is clear that δ(x, ξ) vanishes for ξ = x and yet (taking f (x) ≡ 1 in (3.33)) we have

l δ(x, ξ) dξ = 1. 0

We can say more about δ(x, ξ). Taking f with a small support and shifting the support, we obtain from (3.33) that δ(x + z, ξ + z) = δ(x, ξ) for any z. Therefore, δ(x, ξ) should only depend on x − ξ; i.e., δ(x, ξ) = δ(x − ξ) for an object δ(x) that vanishes for all x ∈ R\{0} and satisﬁes

∞ δ(x) dx = 1. −∞

There is no locally integrable function with this property; nevertheless, it is convenient to make use of the symbol δ(x) if it is a part of an integrand, keeping in mind that

∞ δ(x)f (x) dx = f (0) for any f ∈ C(R). −∞

The “function” δ(x) is called Dirac’s δ-function or the Dirac measure. In the theory of distributions (that we shall discuss in the following chapter), δ(x) is understood as a linear functional on functions whose value on a continuous function f (x) is equal to f (0). The δ-function can be interpreted as a point mass, or a point charge, or a point load. For example, let a force f be applied to a nonhomogeneous string. Then, when oscillations subside (e.g., due to friction), we see that the stationary form of the string v(x) should satisfy (3.22), (3.23) and, therefore, is expressed by formula (3.28). If the load is concentrated near a point ξ and the total load equals 1, i.e., f (x)dx = 1, then the form of the string

54

3. The Sturm-Liouville problem

is precisely the graph of G(x, ξ). This statement can easily be made into a more precise one by passing to the limit when a so-called δ-like sequence of loads fn is chosen, e.g., such that fn (x) ≥ 0; fn (x) = 0 for |x − ξ| ≥ 1/n; and fn (x)dx = 1. We skip the details leaving them as an easy exercise for the reader.

3.5. Problems 3.1. Derive an integral equation for a function Φ(x) satisfying −Φ + q(x)Φ = k 2 Φ, Φ(0) = 1, Φ (0) = 0, with q ∈ G([0, l]) and deduce from this integral equation the asymptotics of Φ(x) and Φ (x) as k → +∞. 3.2. Using the result of Problem 3.1, prove the existence of an inﬁnite number of eigenvalues for the following Sturm-Liouville problem: −X + q(x)X = λX,

X (0) = X(l) = 0.

Prove the symmetry of the corresponding operator and the orthogonality of eigenfunctions. 2

d 3.3. Write explicitly the Green’s function of the operator L = − dx 2 with the boundary conditions v(0) = v(l) = 0. Give a physical interpretation.

2

d 3.4. Write explicitly the Green’s function of the operator L = − dx 2 +1 with the boundary conditions v (0) = v (l) = 0. Give a physical interpretation.

3.5. With the help of the Green’s function, prove the completeness of the d2 system of eigenfunctions for the operator L = − dx 2 +q(x) with the boundary conditions v (0) = v (l) = 0 corresponding to free ends. 2

d 3.6. Prove that if q(x) ≥ 0, then the Green’s function G(x, ξ) of L = − dx 2 + q(x) with the boundary conditions v(0) = v(l) = 0 is positive for x = 0, l and ξ = 0, l.

3.7. Prove that under the conditions of the previous problem, the Green’s function G(x, ξ) is a positive kernel; i.e., the matrix (G(xi , xj ))ni,j=1 is positive deﬁnite for any ﬁnite set of distinct points x1 , . . . , xn ∈ (0, l).

3.5. Problems

55

3.8. Expand the Green’s function G(x, ξ) of the Sturm-Liouville problem into a series involving the eigenfunctions Xn (x) and express in terms of the Green’s function the following sums: ∞ ∞ 1 1 a) , b) (diﬃcult) , λn λ2n n=1

n=1

where λn (n = 1, 2, . . .) is the set of eigenvalues for this problem. In partic∞ ∞ 1 1 and . ular, use these to calculate n2 n4 n=1

n=1

10.1090/gsm/205/04

Chapter 4

Distributions

4.1. Motivation of the deﬁnition. Spaces of test functions In analysis and mathematical physics one often meets technical diﬃculties caused by the nondiﬀerentiability of certain functions. The theory of distributions (generalized functions) makes it possible to overcome these diﬃculties. The Dirac δ-function, which naturally appears in many situations (one of them was mentioned in Section 3.4), is an example of a distribution. In the theory of distributions a number of notions and theorems of analysis acquire simpler formulations and are free of unnatural and irrelevant restrictions. The origin of the notion of distribution or generalized function may be explained as follows. Let f (x) be a physical quantity (such as temperature, pressure, etc.) which is a function of x ∈ Rn . If we measure this quantity at a point x0 with the help of some device (thermometer, pressure gauge, etc.), then we actually measure a certain average of f (x) over a neighbor hood of x0 , that is, an integral of the form f (x)ϕ(x)dx, where ϕ(x) is a function characterizing the measuring gadget and smeared somehow over a neighborhood of x0 . The idea is to completely abandon the function f (x) and consider instead a linear functional by assigning to each test function ϕ the number

(4.1)

f, ϕ =

f (x)ϕ(x)dx.

Considering arbitrary linear functionals (not necessarily of the form (4.1)) we come to the notion of a distribution or generalized function. 57

58

4. Distributions

Let us introduce now the needed spaces of test functions. Let Ω be an open subset of Rn . Let E(Ω) = C ∞ (Ω) be the space of smooth (inﬁnitely diﬀerentiable) functions on Ω; let D(Ω) = C0∞ (Ω) be the space of smooth functions with compact support in Ω, i.e., functions ϕ ∈ C ∞ (Ω) such that there exists a compact K ⊂ Ω with the property ϕ|Ω\K = 0. In general, the support of ϕ ∈ C(Ω) (denoted by supp ϕ) is the closure (in Ω) of the set {x ∈ Ω : ϕ(x) = 0}. Therefore, supp ϕ is the minimal closed set F ⊂ Ω such that ϕ|Ω\F = 0 or, equivalently, the complement to the maximal open set G ⊂ Ω such that ϕ|G = 0. Therefore, D(Ω) consists of ϕ ∈ C ∞ (Ω) such that supp ϕ is compact in Ω. Let K be the closure of a bounded open set in Rn and let D(K) = C0∞ (K) be the space of functions ϕ ∈ C ∞ (Rn ) such that supp ϕ ⊂ K. Clearly, D(Ω) is a union of all D(K) over compacts K ⊂ Ω. Finally, as a space of test functions we will also use S(Rn ), L. Schwartz’s space, consisting of functions ϕ(x) ∈ C ∞ (Rn ) such that sup |xα ∂ β ϕ(x)| < x∈Rn

+∞ for any multi-indices α and β. The theory of distributions makes use of topology in the spaces of test functions. We will introduce it with the help of a system of seminorms. First, we will give general deﬁnitions. A seminorm in a linear space E is a map p : E −→ [0, +∞) such that a) p(x + y) ≤ p(x) + p(y), x, y ∈ E; b) p(λx) = |λ|p(x), x ∈ E, λ ∈ C. The property p(x) = 0 does not necessarily imply x = 0 (if this implication holds, then the seminorm is called a norm). Let a system of seminorms {pj }j∈J , where J is a set of indices, be deﬁned on E. For any ε > 0 and j ∈ J, set Uj,ε = {x : x ∈ E, pj (x) < ε} and introduce the topology in E whose basis of neighborhoods of 0 consists of all ﬁnite intersections of the sets Uj,ε (a basis of neighborhoods of any other point x0 is obtained by the shift by x0 , i.e., consists of ﬁnite intersections of the sets x0 + Uj,ε ). Therefore, G ⊂ E is open if and only if together with each point x0 it also contains a ﬁnite intersection of sets x0 + Uj,ε . Clearly, lim xn = x if and only if lim pj (xn − x) = 0 for any j ∈ J. n→+∞

n→+∞

To avoid perpetual reference to ﬁnite intersections, we may assume that for any j1 , j2 ∈ J there exists j3 ∈ J such that pj1 (ϕ) ≤ pj3 (ϕ) and pj2 (ϕ) ≤ pj3 (ϕ) for any ϕ ∈ E (in this case Uj3 ,ε ⊂ (Uj1 ,ε ∩ Uj2 ,ε )). If this does not hold, we can add (to the existing system of seminorms) additional seminorms

4.1. Motivation. Spaces of test functions

59

of the form pj1 ...jk (ϕ) = max{pj1 (ϕ), . . . , pjk (ϕ)}. This does not aﬀect the topology deﬁned by seminorms in E and we will always assume that this is already done. In other words, we will assume that the basis of neighborhoods of 0 in E consists of all sets Uj,ε . The same topology in E can be deﬁned by a diﬀerent set of seminorms. Namely, assume that we have another system of seminorms {pj }j ∈J , with the corresponding neighborhoods of 0 denoted Uj ,ε . Clearly, the systems of seminorms {pj }j∈J , {pj }j ∈J deﬁne the same topology if and only if the following two conditions are satisﬁed: • for every j ∈ J and ε > 0, there exist j ∈ J and ε > 0 such that Uj,ε ⊂ Uj ,ε ; • for every j ∈ J and ε > 0, there exist j ∈ J and ε > 0 such that ⊂ Uj,ε .

Uj ,ε

It is easy to see that these conditions are satisﬁed if and only if • for every j ∈ J there exist j ∈ J and C = Cj ,j > 0 such that pj (x) ≤ Cpj (x), x ∈ E; • for every j ∈ J there exist j ∈ J and C = Cj,j > 0 such that pj (x) ≤ C pj (x), x ∈ E.

If these conditions are satisﬁed for two systems of seminorms, {pj }, {pj }, then we will say that the systems are equivalent. Since we are only interested in translation invariant topologies on vector spaces E, we can freely replace a system of seminorms by any equivalent system. If f is a linear functional on E, then it is continuous if and only if there exist j ∈ J and C > 0 such that (4.2)

|f, ϕ| ≤ Cpj (ϕ),

ϕ ∈ E,

where f, ϕ denotes the value of f on ϕ. Indeed, the continuity of f is equivalent to the existence of a set Uj,ε such that |f, ϕ| ≤ 1 for ϕ ∈ Uj,ε and the latter is equivalent to (4.2) with C = 1/ε. Note that the seminorm pj itself, as a function on E, is continuous on E. Indeed, this is obvious at the point ϕ = 0 because the set {ϕ| pj (ϕ) < ε} is a neighborhood of 0 for any ε > 0. Now the continuity of pj at an arbitrary point ϕ0 ∈ E follows, because the triangle inequality a) above implies p(ϕ0 ) − p(ϕ) ≤ pj (ϕ0 + ϕ) ≤ p(ϕ0 ) + p(ϕ), so the continuity at ϕ0 follows from the continuity at 0.

60

4. Distributions

We will usually suppose that the following separability condition is satisﬁed: if pj (ϕ) = 0 for all j ∈ J, then ϕ = 0.

(4.3)

This implies that any two distinct points x, y ∈ E have nonintersecting neighborhoods (the Hausdorﬀ property of the topology). Indeed, there exists j ∈ J such that pj (x − y) > 0. But then the neighborhoods x + Uj,ε , y + Uj,ε do not intersect for ε < 12 pj (x − y), because if z = x + t = y + s ∈ (x + Uj,ε ) ∩ (y + Uj,ε ), then pj (x − y) = pj (s − t) ≤ pj (s) + pj (t) ≤ 2ε < pj (x − y). Now, consider the case which is most important for us when J is a countable (inﬁnite) set. We may assume that J = N. Introduce a metric on E by the formula ∞ 1 pk (x − y) . (4.4) ρ(x, y) = 2k 1 + pk (x − y) k=1

The symmetry and the triangle inequality are easily checked. Due to separability condition (4.3), ρ(x, y) = 0 implies x = y. Let us prove that the topology deﬁned by the metric coincides with the topology deﬁned above by seminorms. Since the metric (4.4) is invariant with respect to the translations (i.e., ρ(x + z, y + z) = ρ(x, y)), it is suﬃcient to verify that each neighborhood Uj,ε contains a ball Bδ (0) = {x : ρ(x, 0) < δ} of radius δ > 0 with the center at the origin, and, conversely, each ball Bδ (0) contains a set of the form Uj,ε . (Note that in the future we will also write B(δ, 0) instead of Bδ (0).) First, take an arbitrary δ > 0. Let us prove that there exist j, ε such that Uj,ε ⊂ Bδ (0). If j is such that p1 (x) ≤ pj (x), . . . , pN (x) ≤ pj (x), then ρ(x, 0) ≤

N pk (x) k=1

2k

+

∞ k=N +1

1 1 pk (x) ≤ pj (x) + N . k 2 1 + pk (x) 2

Therefore, if ε < δ/2 and 1/2N < δ/2, then Uj,ε ⊂ Bδ (0). Conversely, given j and ε, we deduce the existence of δ > 0 such that Bδ (0) ⊂ Uj,ε from the obvious inequality 1 pj (x) ≤ ρ(x, 0). 2j 1 + pj (x) 1 ε t . Then, since f (t) = 21j 1+t is strictly 2j 1 + ε increasing for t > 0, the inequality ρ(x, 0) < δ implies pj (x) < ε.

Namely, take δ so that δ
0, such that for every j, k, which are both greater than m, we have xj −xk ∈ U . Clearly, the last condition only depends upon the topology in E. But the notion of convergence also depends on the topology only. Therefore, the completeness property of a countably normed space is independent of the choice of a system of seminorms in the same equivalence class. Now we can also formulate the notion of completeness in terms of seminorms. Using (4.5) and the deﬁnition of topology in terms of seminorms, we see that a sequence {xj | j = 1, 2, . . . } is a Cauchy sequence (in terms of topology or metric) if and only if (4.6)

pl (xj − xk ) → 0

as j, k → +∞,

for every seminorm pl . So to check the completeness of a countably normed space it suﬃces to check convergence of the sequences satisfying (4.6), which is sometimes more convenient. Deﬁnition 4.2. A Fr´echet space is a topological vector space whose topology is deﬁned by a structure of a complete countably normed topological

62

4. Distributions

vector space. In other words, it is a complete topological vector space with a countable system of seminorms deﬁned up to the equivalence explained above. Since the topology in a countably normed space E can be determined by a metric ρ(x, y), the continuity of functions on E and, in general, of any maps of E into a metric space can be determined in terms of sequences. For instance, a linear functional f on E is continuous if and only if lim ϕk = 0 k→+∞

implies lim f, ϕk = 0. k→+∞

Deﬁnition 4.3. The space of continuous linear functionals in E is called the dual space to E and is denoted by E . Now, let us turn D(K), E(Ω), and S(Rn ) into countably normed spaces. 1) Space D(K). Denote sup |∂ α ϕ(x)|, pm (ϕ) = |α|≤m

ϕ ∈ D(K).

x∈K

The seminorms pm (ϕ), m = 1, 2, . . . , deﬁne a structure of a countably normed space on D(K). Clearly, the convergence ϕp → ϕ in the topology of D(K) means that if α is any multi-index, then ∂ α ϕp (x) → ∂ α ϕ(x) uniformly on K. Let us show that D(K) is in fact a Fr´echet space. The only nonobvious thing to check is its completeness. Let {ϕj | j = 1, 2, . . . } be a Cauchy sequence in D(K). This means that for any multi-index α the sequence {∂ α ϕj } is a Cauchy sequence in C0 (K), the space of all continuous functions on Rn supported in K (i.e., vanishing in Rn \K), with the sup norm p0 . Then there exists ψα ∈ C0 (K) such that ∂ α ϕj → ψα in C0 (K) as j → +∞ (i.e., converges uniformly on Rn ). It remains to prove that in fact ψα = ∂ α ψ0 for all α. But this follows from the standard analysis results on the possibility to diﬀerentiate a uniformly convergent sequence, if the sequence of derivatives also converges uniformly. 2) Space E(Ω). Let Kl ⊂ Ω, l = 1, 2, . . ., be a sequence of compacts such that K1 ⊂ K2 ⊂ K3 ⊂ · · · and for every x ∈ Ω there exists l such that x belongs to the interior of Kl (i.e., K contains a neighborhood of x). For instance, we may set Kl = {x : x ∈ Ω, |x| ≤ l, ρ(x, ∂Ω) ≥ 1/l}, ¯ where ∂Ω is the boundary of Ω (i.e., ∂Ω = Ω\Ω) and ρ is the usual Euclidean n distance in R . Set sup |∂ α ϕ(x)|, ϕ ∈ E(Ω). pl (ϕ) = |α|≤l

x∈Kl

4.2. Spaces of distributions

63

These seminorms turn E(Ω) into a countably normed space. Clearly, the convergence ϕp → ϕ in the topology of E(Ω) means that ∂ α ϕp (x) → ∂ α ϕ(x) uniformly on any compact K ⊂ Ω and for any ﬁxed multi-index α. In particular, this implies that the topology deﬁned in this way does not depend on the choice of a system of compacts Kl described above. 3) Space S(Rn ). Deﬁne the system of seminorms pk (ϕ) = sup |xα ∂ β ϕ(x)|, |α|+|β|≤k

x∈Rn

and deﬁne the convergence ϕp → ϕ in the topology of S(Rn ) means that if A is a diﬀerential operator with polynomial coeﬃcients on Rn , then Aϕp (x) → Aϕ(x) uniformly on Rn . Both E(Ω) and S(Rn ) are also Fr´echet spaces. Their completeness can be established by the same arguments as the ones used for the space D(K) above.

4.2. Spaces of distributions The above procedure of transition from E to its dual E makes it possible to deﬁne E (Ω) and S (Rn ) as dual spaces of E(Ω) and S(Rn ), respectively. We have not yet introduced topology in D(Ω) (the natural topology in D(Ω) is not easy to deﬁne) but we will not need it. Instead we deﬁne D (Ω) as the space of linear functionals f on D(Ω) such that f |D(K) is continuous on D(K) for any compact K ⊂ Ω. This continuity can also be deﬁned in terms of sequences (rather than seminorms), because the topology in D(K) can be deﬁned by a metric. It is also convenient to introduce the convergent sequences in D(Ω), saying that a sequence {ϕk }k=1,2,... , ϕk ∈ D(Ω), converges to ϕ ∈ D(Ω) if the following two conditions are satisﬁed: (a) there exists a compact K ⊂ Ω such that ϕ ∈ D(K) and ϕk ∈ D(K) for all k; (b) ϕk → ϕ in D(K), or, in other words, for every ﬁxed multi-index α, ∂ α ϕk → ∂ α ϕ uniformly on K (or on Ω). Deﬁnition 4.4. (1) The elements of D (Ω) are called distributions in Ω. (2) The elements of E (Ω) are called distributions with compact support in Ω. (The notion of support of a distribution that justiﬁes the above term will be deﬁned later.) (3) The elements of S (Rn ) are called tempered distributions on Rn .

64

4. Distributions

Example 4.1. The “regular” distributions on Ω. Let L1loc (Ω) be the space of (complex-valued) functions on Ω which are Lebesgue integrable with respect to the Lebesgue measure on any compact K ⊂ Ω. To any f ∈ L1loc (Ω), assign the functional on D(Ω) (denoted by the same letter), setting

(4.7) f, ϕ = f (x)ϕ(x)dx, Ω

where ϕ ∈ D(Ω). Clearly, we get a distribution on Ω (called regular). An important fact is that if two functions f1 , f2 ∈ L1loc (Ω) determine the same distribution, they coincide almost everywhere. This is a corollary of the following lemma. Lemma 4.5. Let f ∈ L1loc (Ω) and Ω f (x)ϕ(x)dx = 0 for any ϕ ∈ D(Ω). Then f (x) = 0 for almost all x. Proof of this lemma requires the existence of an ample set of functions in D(Ω). We do not know yet whether there exist such nontrivial (not identically vanishing) functions. Let us construct a supply of functions belonging to D(Rn ). First, let ψ(t) = θ(t)e−1/t , where t ∈ R and θ(t) is the Heaviside function. Clearly, ψ(t) ∈ C ∞ (R). Therefore, if we consider the function ϕ(x) = ψ(1 − |x|2 ) in Rn , we get a function ϕ(x) ∈ C0∞ (Rn ) such that ϕ(x) = 0 for |x| ≥ 1 and ϕ(x) > 0 for |x| < 1. It is convenient to normalize ϕ considering instead the function ϕ1 (x) = c−1 ϕ(x), where c = ϕ(x)dx. Now, set ϕε (x) = ε−n ϕ1 (x/ε). Then ϕε (x) = 0 for |x| ≥ ε and ϕε (x) > 0 for |x| < ε. Besides, ϕε (x)dx = 1. Now, introduce an important averaging (or mollifying) operation: for any f (x) ∈ L1loc (Ω) take a family of convolutions

(4.8) fε (x) = f (x − y)ϕε (y)dy = f (y)ϕε (x − y)dy deﬁned for x ∈ Ωε = {x : x ∈ Ω, ρ(x, ∂Ω) > ε}. It is clear from the dominated convergence theorem that the last integral in (4.8) is a C ∞ function and

∂ α fε (x) =

f (y)(∂ α ϕε )(x − y)dy;

hence fε ∈ C ∞ (Ωε ). Note also the following properties of averaging: a) If f (x) = 0 outside of a compact K ⊂ Ω, then fε (x) = 0 outside of the ε-neighborhood of K. In particular, in this case fε ∈ D(Ω) for a suﬃciently small ε > 0. b) If f (x) = 1 for x ∈ Ω2ε , then fε (x) = 1 for x ∈ Ω3ε . In particular, if f is the characteristic function of Ω2ε , then fε ∈ D(Ω) and fε (x) = 1 for x ∈ Ω3ε .

4.2. Spaces of distributions

65

c) If f ∈ C(Ω), then fε (x) → f (x) uniformly on any ﬁxed compact K ⊂ Ω as ε → 0+. Indeed,

fε (x) − f (x) =

[f (x − y) − f (x)]ϕε (y)dy,

implying |fε (x) − f (x)| ≤ sup |f (x − y) − f (x)| |y|≤ε

so that the uniform continuity of f on the ε-neighborhood of K implies our statement. d) If f ∈ Lploc (Ω), where p ≥ 1, then fε → f with respect to the norm in for any ﬁxed compact K ⊂ Ω as ε → 0+.

Lp (K)

Since the values fε (x) at x ∈ K only depend on the values of f on the ε-neighborhood of the compact K, we may assume that f (x) = 0 for x ∈ Ω\K1 , where K1 is a compact in Ω. Let the norm in Lp (Ω) be denoted by up ; i.e.,

1/p |u(x)|p dx . up = Ω

Due to Minkowski’s inequality (see the appendix in this chapter: Section 4.7)

fε p = f (· − y)ϕε (y)dy ≤ f (· − y)p |ϕε (y)|dy = f p . p

Due to c), the property d) holds for f ∈ C(Ω). But if f ∈ Lp (Ω) and f = 0 outside K1 , then we can approximate f with respect to the norm in Lp (Ω), ﬁrst by step functions and then by continuous functions on Ω which vanish outside a compact K2 ⊂ Ω. Let g be a continuous function on Ω which vanishes outside K2 and such that f − gp < δ. Then lim sup fε − f p ≤ lim sup fε − gε p + lim sup gε − gp + lim sup g − f p ε→0+

ε→0+

ε→0+

ε→0+

= lim sup (f − g)ε p + g − f p ≤ 2f − gp ≤ 2δ, ε→0+

which due to the arbitrariness of δ > 0 implies lim sup fε − f p = 0, as ε→0+

required. Proof of Lemma 4.5. Let f ∈ L1loc (Ω) and Ω f (x)ϕ(x)dx = 0 for every ϕ ∈ D(Ω). This implies fε (x) = 0 for all x ∈ Ωε . Therefore, the statement of the lemma follows from the property d).

66

4. Distributions

Lemma 4.5 allows us to identify the functions f ∈ L1loc (Ω) with the distributions that they deﬁne by formula (4.7). Note that the expression

(4.9) f, ϕ = f (x)ϕ(x)dx is also often used for distributions f (in this case the left-hand side of (4.9) is the deﬁnition for the right-hand side; when the right-hand side makes sense, this deﬁnition is consistent). Example 4.2. “Regular” distributions in E (Ω) and S (Rn ). If f ∈ L1comp (Ω), i.e., f ∈ L1 (Rn ) and f (x) = 0 outside a compact K ⊂ Ω, then we may construct via the standard formula (4.9) a functional on E(Ω) which is a distribution with compact support. We will identify f with the corresponding element of E (Ω). If f ∈ L1loc (Rn ) and (4.10)

|f (x)|(1 + |x|)−N < +∞

for some N > 0, then (4.9) deﬁnes a functional on S(Rn ) which is a tempered distribution. In particular, (4.10) holds if f is measurable and |f (x)| ≤ C(1 + |x|)N −n−1 . Example 4.3. Dirac’s δ-function. This is the functional deﬁned as follows: (4.11)

δ(x − x0 ), ϕ(x) = ϕ(x0 ).

If x0 ∈ Ω, then δ(x − x0 ) ∈ D (Ω) and, moreover, δ(x − x0 ) ∈ E (Ω). Obviously, δ(x − x0 ) ∈ S (Rn ) for any x0 ∈ Rn . Instead of (4.11) one often writes, in accordance with the above-mentioned convention,

δ(x − x0 )ϕ(x)dx = ϕ(x0 ), though, clearly, δ(x − x0 ) is not a regular distribution. Indeed, if we assume that δ(x−x0 ) is regular, then Lemma 4.5 would imply that δ(x−x0 ) vanishes almost everywhere in Rn \{x0 }; hence, it vanishes almost everywhere in Rn , and so vanishes as a distribution in Rn . Example 4.4. Let L be a linear diﬀerential operator in Rn , Γ a smooth compact hypersurface in Rn , dS the area element of Γ. Then the formula

(4.12) ϕ → (Lϕ)|Γ dS Γ

deﬁnes a nonregular distribution with compact support in Rn . When L is the identity operator, we denote the distribution (4.12) by δΓ . For Γ we may take a compact submanifold (perhaps with a boundary) of any codimension in Rn . In this case for dS we may take any density on Γ

4.3. Topology and convergence

67

(or a diﬀerential form of the maximal order if the orientation of Γ is ﬁxed). In particular, the Dirac δ-function appears as a particular case when Γ is a point.

4.3. Topology and convergence in the spaces of distributions Let F be one of the spaces of test functions D(Ω), E(Ω), or S(Rn ), and let F be the corresponding dual space of distributions (continuous linear functionals on F ). We will consider the space F with so-called weak topology, i.e., the topology deﬁned by the seminorms pϕ (f ) = |f, ϕ|,

ϕ ∈ F , f ∈ F .

In particular, the convergence fk → f in this topology means that fk , ϕ → f, ϕ for any ϕ ∈ F . In this case we will also use the usual notation lim fk = f . k→∞

Up to now we assumed that the limit f is itself a distribution. Now let us simply assume that {fk | k = 1, 2, . . . } is a sequence of distributions (from F ), such that for any ϕ ∈ F there exists a limit f, ϕ := lim fk , ϕ. k→+∞

Will the limit f always be a distribution too? It is clear that the limit is a linear functional with respect to ϕ. But will it be continuous? The answer is aﬃrmative, and the corresponding property is called completeness (or weak completeness) of the distribution space. It is a particular case of a general result which holds for any Fr´echet space (see the appendix in this chapter: Section 4.8). Example 4.5. δ-like sequence. Consider the above-constructed functions ϕε (x) ∈ D(Rn ). Recall that they satisfy a) ϕε (x) ≥ 0 for all x; b) ϕε (x)dx = 1; c) ϕε (x) = 0 for |x| ≥ ε. Let us prove that the properties a)–c) imply lim ϕε (x) = δ(x)

(4.13)

ε→0+

in D (Rn ). Indeed, this means that

ϕε (x)ψ(x)dx = ψ(0) for any ψ ∈ D(Rn ). (4.14) lim ε→0+

With the help of b) we may rewrite (4.14) in the form of the relation

ϕε (x)[ψ(x) − ψ(0)]dx = 0, lim ε→0+

68

4. Distributions

which is obviously true due to the estimate

ϕε (x)[ψ(x) − ψ(0)]dx ≤ sup |ψ(x) − ψ(0)|. |x|≤ε

Note that (4.13) holds not only in D (Rn ) but also in S (Rn ), E (Rn ), and even in E (Ω) if 0 ∈ Ω. Sequences of “regular” functions converging to the δ-function are called δ-like sequences. Properties a)–c) can be considerably weakened retaining δ-likeness of the sequence. Thus, in the theory of Fourier series one proves the δ-likeness (e.g., in D ((−π, π))) of the sequence of Dirichlet kernels Dk (t) =

1 sin(k + 12 )t , 2π sin 2t

deﬁned by the condition that Dk , ϕ is the sum of the ﬁrst k terms of the Fourier series of ϕ at t = 0. Similarly, a δ-like sequence in D ((−π, π)) is constituted by the Fej´er kernel Fk (t) =

1 sin2 kt 2 , 2πk sin2 2t

deﬁned from the condition that Fk , ϕ is the arithmetic mean of the ﬁrst k partial sums of the Fourier series. Later on we will encounter a number of other important examples of δ-like sequences. Note that the δ-likeness is essentially always proved in the same way as for ϕε (x). 1 1 and v.p. x1 . The “regular” functions x+iε Example 4.6. Distributions x±i0 √ 1 and x−iε (here i = −1) of x ∈ R1 for ε > 0 determine tempered distributions (elements of S (R1 )). It turns out that there exist limits in S (R1 )

1 1 = lim , x + i0 ε→0+ x + iε

1 1 = lim . x − i0 ε→0+ x − iε

The left-hand sides here are, by deﬁnition, the limits on the right-hand sides. Let us prove, e.g., the existence of the limit

1 x+i0

in S (R1 ).

We must prove that if ϕ ∈ S(R1 ), then

∞ ϕ(x) dx (4.15) lim ε→0+ −∞ x + iε exists and is a continuous linear functional of ϕ ∈ S(R1 ). The simplest way to do this is to integrate by parts. Namely, we have d 1 = ln(x + iε), x + iε dx

4.3. Topology and convergence

69

where we take an arbitrary branch of ln(x + iε) (but keeping continuity for all x ∈ R, ε > 0). Integrating by parts, we get

∞

∞ ϕ(x) (4.16) dx = − ϕ (x) ln(x + iε)dx. x + iε −∞ −∞ Since ln(x + iε) = ln |x + iε| + iarg (x + iε), it is clear that, by the dominated convergence theorem, the limit of the right-hand side of (4.16), as ε → 0+, is equal to

0

∞ (ln |x|)ϕ (x)dx − iπ ϕ (x)dx − −∞ −∞ (4.17)

∞ =− (ln |x|)ϕ (x)dx − πiϕ(0). −∞

and ln |x| grows at inﬁnity not faster than |x|ε for any Since ln |x| ∈ ε > 0, it is obvious that the right-hand side of (4.17) is ﬁnite and deﬁnes a 1 . continuous functional of ϕ ∈ S(Rn ). This functional is denoted by x+i0 L1loc (R)

Let us study in detail the ﬁrst summand in (4.17). We have

−

∞

−∞

(ln |x|)ϕ (x)dx = lim

(4.18)

ε→0+

−

ln |x| ·

= lim

ε→0+

= lim

|x|≥ε

ε→0+ |x|≥ε

ln |x| · ϕ (x)dx

ϕ(x)|ε−ε

+ |x|≥ε

1 ϕ(x)dx x

1 ϕ(x)dx, x

since lim ln ε[ϕ(ε) − ϕ(−ε)] = lim O(ε) ln ε = 0. In particular, we have ε→0+

ε→0+

proved that the last limit in (4.18) exists and deﬁnes a functional belonging to S (R1 ). This functional is denoted v.p. x1 (the letters v.p. stand for the ﬁrst letters of the French words “valeur principale”, meaning the principal value). Therefore, by deﬁnition,

1 1 (4.19) v.p. , ϕ = lim ϕ(x)dx. ε→0+ x |x|≥ε x Moreover, we may now rewrite (4.17) in the form (4.20)

1 1 = v.p. − πiδ(x). x + i0 x

Similarly, we get (4.21)

1 1 = v.p. + πiδ(x). x − i0 x

70

4. Distributions

Formulas (4.20) and (4.21) are called Sokhotsky’s formulas. ticular, imply

They, in par-

1 1 − = −2πiδ(x). x + i0 x − i0 The existence of limits in (4.15) and (4.19) may also be proved by expanding ϕ(x) in the sum of a function equal to ϕ(0) in a neighborhood of 0 and a function which vanishes at 0. For each summand, the existence of limits is easy to verify. In the same way, one gets the Sokhotsky formulas, too. 1 and v.p. x1 are distinct “regularizations” of the noninDistributions x±i0 1 tegrable ∞ 1 function x ; i.e., they allow us to make sense of the divergent integral −∞ x ϕ(x)dx. We see that there is an ambiguity: various distributions correspond to the nonintegrable function x1 . The regularization procedure is important if we wish to use x1 as a distribution (e.g., to diﬀerentiate it).

This procedure (or a similar one) is applicable to many other nonintegrable functions.

4.4. The support of a distribution Let Ω1 , Ω2 be two open subsets of Rn and let Ω1 ⊂ Ω2 . Then D(Ω1 ) ⊂ D(Ω2 ). Therefore, if f ∈ D (Ω2 ), we may restrict f to D(Ω1 ) and get a distribution on Ω1 denoted by f |Ω1 . It is important that the operation of restriction has the following properties: a) Let{Ωj : j ∈ J} be a cover of an open set Ω by open sets Ωj , j ∈ J; i.e., Ω = j∈J Ωj . If f ∈ D (Ω) and f |Ωj = 0 for j ∈ J, then f = 0. b) Let again Ω = j∈J Ωj and let fj ∈ D (Ωj ) be a set of distributions such that fk |Ωk ∩Ωl = fl |Ωk ∩Ωl for any k, l ∈ J. Then there exists a distribution f ∈ D (Ω) such that f |Ωj = fj for any j ∈ J. Properties a) and b) mean that distributions constitute a sheaf. Let us prove the property a), which is important for us. To do this, let us use a partition of unity, i.e., a family of functions {ϕj }j∈J , such that 1) ϕj ∈ C ∞ (Ω), supp ϕj ⊂ Ωj ; 2) the family {ϕj }j∈J is locally ﬁnite; i.e., any point x0 ∈ Ω has a neighborhood in which only a ﬁnite number of functions of this family, ϕj1 , . . . , ϕjk , do not vanish; 3) j∈J ϕj ≡ 1. The existence of a partition of unity is proved in courses of geometry and analysis; see, e.g., Theorem 10.8 in Rudin [27] or Theorem 1.11 in Warner [38].

4.4. The support of a distribution

71

If ϕ ∈ D(Ω), then property 1) implies ϕj ϕ ∈ D(Ωj ) and properties 2) and 3) imply ϕ= ϕj ϕ, j∈J

where the sum locally contains only a ﬁnite number of summands which are not identically zero, since the compact supp ϕ intersects, due to property 2), only a ﬁnite number of supports supp ϕj . If f |Ωj = 0 for all j, then f, ϕj ϕ = 0, f, ϕ = j∈J

implying f = 0, due to the arbitrariness of ϕ. This proves property a). Property b) may be proved by constructing f ∈ D (Ω) from distributions fj ∈ D (Ωj ) with the help of the formula fj , ϕj ϕ. f, ϕ = j∈J

If ϕ ∈ D(Ωk ), where k is ﬁxed, then f, ϕ =

j∈J

fj , ϕj ϕ =

fk , ϕj ϕ =

j∈J

fk ,

ϕj ϕ

= fk , ϕ

j∈J

since supp(ϕj ϕ) ⊂ supp ϕj ∩ supp ϕ ⊂ Ωj ∩ Ωk . We have therefore veriﬁed that f |Ωk = fk , proving property b). Property a) allows us to introduce for f ∈ D (Ω) a well-deﬁned maximal open subset Ω ⊂ Ω such that f |Ω = 0 (here Ω is the union of all open sets Ω1 ⊂ Ω such that f |Ω1 = 0). The closed subset F = Ω\Ω is called the support of the distribution f (denoted by supp f ). Therefore, supp f is the smallest closed subset F ⊂ Ω such that f |Ω\F = 0. Example 4.7. supp δ(x) = {0}. The support of the distribution from Example 4.4 belongs to Γ. In general, if supp f ⊂ F , then f is said to be supported on F . Therefore, a distribution of the form (4.12) is supported on Γ. Now, let us deﬁne supports of distributions from E (Ω) and S (Rn ). This is easy if we note that there are canonical embeddings E (Ω) ⊂ D (Ω), S (Rn ) ⊂ D (Rn ). They are constructed with the help of embeddings D(Ω) ⊂ E(Ω), D(Rn ) ⊂ S(Rn ),

72

4. Distributions

which enable us to deﬁne maps (4.22)

E (Ω) −→ D (Ω)

and (4.23)

S (Rn ) −→ D (Rn )

by restricting functionals onto the smaller subspace. The continuity of these functionals on D(Ω) and D(Rn ) follows from the continuity of embeddings D(K) ⊂ E(Ω) for a compact K in Ω, D(K) ⊂ S(Rn ) for a compact K in Rn . Finally, the dual maps (4.22), (4.23) are embeddings, because D(Ω) is dense in E(Ω) and D(Rn ) is dense in S(Rn ); therefore any continuous linear functional on E(Ω) (respectively, S(Rn )) is uniquely determined by its values on a dense subset D(Ω) (respectively, D(Rn )). In what follows, we will identify distributions from E (Ω) and S (Rn ) with their images in D (Ω) and D (Rn ), respectively. Proposition 4.6. Let f ∈ D (Ω). Then f ∈ E (Ω) if and only if supp f is a compact subset of Ω, or, in other words, if f is supported on a compact subset of Ω. Proof. Let f ∈ E (Ω). Then f is continuous with respect to a seminorm in E(Ω); i.e., (4.24)

|f, ϕ| ≤ C

|α|≤l

sup |∂ α ϕ(x)|,

x∈K

where the compact K ⊂ Ω and numbers C, l do not depend on ϕ. But this implies that f, ϕ = 0 for any ϕ ∈ D(Ω\K), so f |Ω\K = 0; i.e., supp f ⊂ K. Conversely, let f ∈ D (Ω) and supp f ⊂ K1 , where K1 is a compact in Ω. Clearly, if K ⊂ Ω contains a neighborhood of K1 , then f is determined by its restriction onto D(K) and (4.24) holds for ϕ ∈ D(K), hence, for ϕ ∈ D(Ω) (since f, ϕ does not depend on ϕ|Ω\K ). Therefore, f can be continuously extended to a functional f ∈ E (Ω). Note that this extension essentially reduces to the following two steps: 1) decompose ϕ ∈ E(Ω) into a sum ϕ = ϕ1 + ϕ2 , where ϕ1 ∈ D(Ω) and ϕ2 = 0 in a neighborhood of supp f and 2) set f, ϕ = f, ϕ1 . Proposition 4.6 is proved.

4.4. The support of a distribution

73

The following important theorem describes distributions supported at a point. Theorem 4.7. Let f ∈ E (Rn ) and supp f = {0}. Then f has the form cα (∂ α ϕ)(0), (4.25) f, ϕ = |α|≤l

where l and constants cα do not depend on ϕ. Proof. To any ϕ ∈ E(Rn ) assign its l-jet at 0, i.e., the set of derivatives j0l (ϕ) = {(∂ α ϕ)(0)}|α|≤l . Choose l so as to satisfy (4.24), where K is a compact which is a neighborhood of 0. Let us verify that f, ϕ actually depends only on j0l (ϕ). Then the statement of the theorem becomes obvious since it reduces to the description of linear functionals on a ﬁnite-dimensional vector space of l-jets at 0 and these functionals are deﬁned as in the right-hand side of (4.25). Thus, it remains to verify that j0l (ϕ) = 0 implies f, ϕ = 0. Let χ ∈ D(Rn ) be chosen so that χ(x) = 1 if |x| ≤ 1/2 and χ(x) = 0 if |x| ≥ 1. Set χε (x) = χ( xε ), ε > 0. Clearly, supp f = {0} implies f, ϕ = f, χε ϕ for any ε > 0. We will see below that j0l (ϕ) = 0 implies (4.26)

sup |∂ α [χε (x)ϕ(x)]| → 0

x∈Rn

for |α| ≤ l as ε → 0 + .

Therefore f, ϕ = f, χε ϕ = lim f, χε ϕ = 0 ε→0+

due to (4.24). It remains to prove (4.26). But by the Leibniz rule cα α [∂ α χε (x)][∂ α ϕ(x)] ∂ α [χε (x)ϕ(x)] = α +α =α

=

α +α =α

x [∂ α ϕ(x)]. cα α ε−|α | (∂ α χ) ε

It follows from Taylor’s formula that

∂ α ϕ(x) = O(|x|l+1−|α | )

as x → 0.

Note that |x| ≤ ε on supp χ( xε ). Therefore, as → 0+, x α sup ε−|α | (∂ α χ) |∂ ϕ(x)| = O(ε−|α |+l+1−|α | ) = O(εl+1−|α| ) → 0, n ε x∈R implying (4.26).

74

4. Distributions

4.5. Diﬀerentiation of distributions and multiplication by a smooth function Diﬀerentiation of distributions should be a natural extension of diﬀerentiation of smooth functions. To deﬁne it note that if u ∈ C 1 (Ω) and ϕ ∈ D(Ω), then

∂ϕ ∂u ϕdx = − u dx (4.27) ∂xj ∂xj or

(4.28)

∂u ,ϕ ∂xj

∂ϕ = − u, ∂xj

.

∂u Therefore, it is natural to deﬁne ∂x for u ∈ D (Ω) by the formula (4.28) j (which we will sometimes write in the form (4.27) in accordance with the above-mentioned convention). Namely, deﬁne the value of the functional ∂u ∂xj at ϕ ∈ D(Ω) (i.e., the left-hand side of (4.27)) as the right-hand side

which makes sense because

∂ϕ ∂xj

∈ D(Ω). Since

to D(K) for any compact K ⊂ Ω, we see deﬁne higher derivatives

∂ ∂xj continuously maps D(K) ∂u ∈ D (Ω). We can now that ∂x j

∂ α : D (Ω) −→ D (Ω), where α = (α1 , . . . , αn ) is a multi-index, as products of operators

∂ ∂xj ,

namely, by setting ∂ α = ( ∂x∂ 1 )α1 . . . ( ∂x∂n )αn . Another (equivalent) way to do this is to write (4.29)

∂ α u, ϕ = (−1)|α| u, ∂ α ϕ,

which enables us to immediately compute derivatives of an arbitrary order. The successive application of (4.28) shows that the result obtained from (4.29) is equal to that obtained by applying ∂ α considered as the composition of operators ∂x∂ j . Since ∂ α continuously maps E(Ω) to E(Ω) and S(Rn ) to S(Rn ), it is clear that ∂ α maps E (Ω) to E (Ω) and S (Rn ) to S (Rn ). It is also clear from (4.29) that ∂ α is a continuous operator in D (Ω), E (Ω), and S (Rn ). Therefore, we may say that ∂ α is the extension by continuity of the usual diﬀerentiation onto the space of distributions. Such an extension is unique, because, as we will see in the next section, D(Ω) is dense in D (Ω). Finally, it is clear that supp(∂ α u) ⊂ supp u

for all u ∈ D (Ω).

4.5. Diﬀerentiation and multiplication by a smooth function

75

Example 4.8. The derivative of the Heaviside function θ(x). The distributional derivative of θ(x) can be calculated as follows:

∞ θ , ϕ = −θ, ϕ = − ϕ (x)dx = ϕ(0), ϕ ∈ D(R1 ); 0

i.e., θ (x) = δ(x). Example 4.9. The derivative ∂ α δ(x) of δ(x) ∈ D (Rn ). By deﬁnition, ∂ α δ(x), ϕ(x) = (−1)|α| δ(x), ∂ α ϕ(x) = (−1)|α| ϕ(α) (0). Therefore, ∂ α δ(x) assigns the number (−1)|α| ϕ(α) (0) to the function ϕ(x). Theorem 4.7 may now be formulated as follows: any distribution supported at 0 is a ﬁnite linear combination of derivatives of the δ-function. Example 4.10. The fundamental solution of the Laplace operator. Consider the Laplace operator Δ in Rn and try to ﬁnd a distribution u ∈ D (Rn ) such that Δu = δ(x). Such u is called a fundamental solution for the operator Δ. It is not unique: we can add to it any distributional solution of the Laplace equation Δv = 0 (in this case v will be actually a C ∞ - and even real-analytic function, as we will see later). Fundamental solutions play an important role in what follows. We will try to select a special (and the most important) fundamental solution, using symmetry considerations. (It will later be referred to as the fundamental solution.) The operator Δ commutes with rotations of Rn (or, equivalently, preserves its form under an orthogonal transformation of coordinates). Therefore, by rotating a solution u we again get a solution of the same equation (it is easy to justify the meaning of the process, but, anyway, we now reason heuristically). But then we may average with respect to all rotations. It is also natural to suppose that u has no singularities at x = 0. Therefore, we will seek a distribution u(x) of the form f (r), where r = |x|, for x = 0. Let us calculate Δf (r). We have xj ∂r ∂ ∂ f (r) = f (r) = f (r) x21 + · · · + x2n = f (r) ∂xj ∂xj ∂xj r implying

2 2 x x 1 ∂2 j j − 3 . f (r) = f (r) 2 + f (r) r r r ∂x2j

Summing over j = 1, . . . , n we get Δf (r) = f (r) +

n−1 f (r). r

76

4. Distributions

Let us solve the equation f (r) +

n−1 f (r) = 0. r

Denoting f (r) = g(r), we get for g(r) a separable ODE g (r) +

n−1 g(r) = 0, r

which yields g(r) = Cr−(n−1) , where C is a constant. Integrating, we get C1 r−(n−2) + C2 for n = 2, f (r) = for n = 2. C1 ln r + C2 Since constants C2 are solutions of the homogeneous equation Δu = 0, we may take f (r) = Cr2−n

for n = 2,

f (r) = C ln r

for n = 2.

We will only consider the case n ≥ 2 (the case n = 1 is elementary and will later be analyzed in a far more general form). For convenience of the future calculations we introduce 1 2−n r for n ≥ 3, (4.30) un (x) = 2−n u2 (x) = ln r.

(4.31)

Then un (x) is locally integrable in Rn and hence may be considered as a 1 1−n n is introduced so that un satisﬁes ∂u distribution. The factor 2−n ∂r = r for all n ≥ 2. Now it is convenient to make a break in our exposition to include some integral formulas for the fundamental solutions (4.30), (4.31) and related functions. Some integral formulas We will use several formulas relating integrals over a bounded domain (open set) Ω with some integrals over its boundary. The ﬁrst formula of this kind is the fundamental theorem of calculus

b f (x)dx = f (b) − f (a), f ∈ C 1 ([a, b]), a

which is due to Newton and Leibniz. All other formulas of the described type (due to Gauss, Ostrogradsky, Green, Stokes, and others) follow from this one, most often through integration by parts.

4.5. Diﬀerentiation and multiplication by a smooth function

77

We will proceed through the following Green’s formula:

∂u ∂v −v dS, (uΔv − vΔu)dx = (4.32) u ∂n ¯ ∂n ¯ Ω ∂Ω where the notation and conditions will be explained now. Notation and assumptions • • •

Ω is a bounded domain (bounded open subset) in Rn . ¯ such that Ω ⊂ Ω ¯ ⊂ Rn . The closure of Ω is the smallest closed set Ω ¯ \ Ω. The boundary ∂Ω of Ω in Rn is Ω

It seems natural that ∂Ω should be called the boundary of Ω. However, ¯ and ∂Ω. To this would allow too much freedom in the global structure of Ω avoid these problems we will later introduce some additional “smoothness” restrictions. ¯ = Ω ∪ ∂Ω, but it is possible ¯ of Ω in Rn we have Ω • For the closure Ω that Ω ∩ ∂Ω = ∅. (For example, consider a nonempty closed straight line interval J ⊂ Ω and then replace Ω by Ω \ J.) •

dS is the area element on the boundary.

• Let k be ﬁxed, k ∈ {1, 2, . . . }, or k = ∞. (Sometimes it is also convenient to use k = 0, which means that only continuity of the integrand is required.) The space C k (Ω) is deﬁned as the space of all functions u ∈ C(Ω) such that any derivative ∂ α u with |α| ≤ k exists and is continuous in Ω. The space C ∞ (Ω) can be deﬁned as the intersection of all spaces C k (Ω). This intersection is a Fr´echet space. • n ¯ = n ¯ (x) is an outgoing unit vector ﬁeld which is based at every point x ∈ ∂Ω, tangent to Rn and orthogonal to ∂Ω at all points x ∈ ∂Ω. • Let us also assume that n ¯ (x) is continuous with respect to x. It is easy to see from the implicit function theorem that locally the problem has at most one solution. This means that the solution exists and is unique in a small neighborhood of any outgoing unit vector ﬁeld which is orthogonal to ∂Ω and Rn (at the points x where a local continuous outgoing normal vector exists). • Now we will formulate conditions which guarantee that the closure ¯ of Ω in Rn and the boundary ∂Ω ⊂ Rn are suﬃciently regular at a point Ω ¯ is a ﬁnite union of simplest cells such as x ¯. For simplicity it suﬃces that Ω formulated above. Deﬁnition 4.8 (Smoothness of the boundary). Assume that U ⊂ Rn is open and bounded in Rn , k ∈ {1, 2, . . . }. We will say that the boundary ¯ ∈ ∂Ω, there exist r > 0 and a ∂Ω is C k (or ∂Ω ∈ C ∞ ) if for every x n−1 k → R such that after relabeling and reorienting the C -function h : R

78

4. Distributions

coordinates axes, if necessary, we have (4.33)

U ∩ B(¯ x, r) = {x ∈ B(¯ x, r) | xn > h(x1 , . . . , xn−1 )}. The outward unit normal vector ﬁeld

Deﬁnition 4.9. (i) If ∂U is C 1 , then along ∂U we can deﬁne the outward unit normal vector ﬁeld ν = (ν1 , . . . , νn ). This ﬁeld is normal at any point x ¯ ∈ ∂U if ν(¯ x) = ν = (ν1 , . . . νn ). (ii) Let u ∈ C 1 (∂Ω). We deﬁne ∂u = ν · D u, ∂ν which is called the (outward ) normal derivative of u. (Here D = ∂/∂x.) We will sometimes need to change coordinates near a point x ¯ ∈ ∂Ω so as to “ﬂatten out” the boundary. To be more speciﬁc, ﬁx x ¯ ∈ ∂Ω, and choose r, h, etc., as above. Then deﬁne Φ1 (x), . . . , Φn (x) by the formulas (i = 1, . . . , n − 1) yi = xi = Φi (x) n yn = xn − h(x1 , . . . , xn−1 ) = Φ (x), and write y = Φ(x). Ψ1 (x), . . . , Ψn (x)

by the formulas Similarly, we deﬁne (i = 1, . . . , n − 1) yi = xi = Ψi (x) n yn = xn + h(x1 , . . . , xn−1 ) = Ψ (x) and write y = Ψ(x). Ψ−1 ,

Then Φ = and the mapping x → Φ(x) = y “ﬂattens out ∂Ω” near x ¯. Observe also that det Φ = det Ψ = 1. Let us discuss what kind of regularity we need to require for the boundary to deserve to be called “suﬃciently regular”. Unfortunately, there is no easy answer to this question: the answer is dictated by the goal. More speciﬁcally, suppose that we need to make sense of the formula (4.32), so as to give suﬃcient conditions for it to hold. Obviously, it is reasonable to request that we have a continuous unit normal vector and the area element (at least almost everywhere). To make sense of the left-hand side, it is nat¯ and ∂Ω ∈ C 1 , which will guarantee that ural to require that u, v ∈ C 2 (Ω) ¯ and the integrals on the both sides make sense. In fact, if u, v ∈ C 2 (Ω) 1 ∂Ω ∈ C , then Green’s formula (4.32) can indeed be veriﬁed.

4.5. Diﬀerentiation and multiplication by a smooth function

79

Additionally, if we need a C 2 change of variables, which moves the boundary (say, if we would like to straighten the C 1 boundary locally), then the second-order derivatives of the equation of the boundary (i.e., of the function h in Deﬁnition 4.8) will show up in the transformed operator Δ and will be out of control. Therefore, to guarantee the continuity of the coeﬃcients of the transformed equation, it is natural to assume ∂Ω ∈ C 2 . Other integral formulas Here we will establish more elementary integral formulas (Gauss, Green, Ostrogradsky, Stokes, . . . ). We will start with the deduction of Green’s formula (4.32), from simpler integral identities, because the intermediate formulas have at least an equal importance. The divergence formula, alternatively called the Gauss-Ostrogradsky for¯ mula, holds: If ∂Ω ∈ C 1 and F is a C 1 vector ﬁeld in a neighborhood of Ω n in R , then

div F dx = F·n ¯ dS, (4.34) Ω

∂Ω

where div F = ∇ · F =

n ∂Fj j=1

∂xj

.

Here F1 , . . . , Fn are the coordinates of the vector F. (For the proof see, for example, Rudin [27, Theorem 10.51].) For our purposes it suﬃces to have this formula for the bounded domains Ω with smooth boundary ∂Ω (say, class C k , with k ≥ 1), that is, the boundary which is a compact closed hypersurface (of class C k ) in Rn . More generally, we can allow the domains with a piecewise C k boundary. But the analysis of “piecewise smooth” situations is very diﬃcult compared with the “smooth” case. So we will not discuss the piecewise smooth objects in this book. Let us return to the deduction of Green’s formula (4.32). Taking F = ¯ and v ∈ C 2 (Ω), ¯ we have u∇v, we ﬁnd that for u ∈ C 1 (Ω) div(u∇v) = uΔv + ∇u · ∇v; hence (4.34) takes the form

(uΔv + ∇u · ∇v)dx = (4.35) Ω

u ∂Ω

∂v dS. ∂n ¯

¯ and subtracting from this formula the one Assuming now that u, v ∈ C 2 (Ω) obtained by transposition u and v, we obtain (4.32).

80

4. Distributions

The Green’s formula (4.32) is also a particular case of the Stokes formula for general diﬀerential forms (see, e.g., [27, Theorem 10.33]):

(4.36) dω = ω, Ω

∂Ω 1 C , and

dω is its exterior diﬀerential. where ω is an (n − 1)-form on Ω, ω ∈ Take the diﬀerential form n ∂u ∂v j−1 j ∧ . . . ∧ dxn , (−1) −v ω= u dx1 ∧ . . . ∧ dx ∂xj ∂xj j=1

j means that dxj is omitted. Clearly, where dx dω = (uΔv − vΔu)dx1 ∧ . . . ∧ dxn , so the left-hand side of (4.36) coincides with the left-hand side of (4.32). The right-hand side of (4.36) in this case may be easily reduced to the righthand side of (4.32) and we leave the corresponding details to the reader as an exercise. Let us return to the calculation of Δun in D (Rn ). For any ϕ ∈ D(Rn ),

un (x)Δϕ(x)dx = lim un (x)Δϕ(x)dx. Δun , ϕ = un , Δϕ = ε→0+ |x|≥ε

Rn

We now use Green’s formula (4.32) with u = un , v = ϕ, and Ω = {x : ε ≤ |x| ≤ R}, where R is so large that ϕ(x) = 0 for |x| ≥ R − 1. Since Δun (x) = 0 for x = 0, we get

∂un ∂ϕ un −ϕ dS. un (x)Δϕ(x)dx = ∂n ¯ ∂n ¯ |x|≥ε |x|=ε Clearly, ∂ϕ ∂ϕ =− , ∂n ¯ ∂r implying

(4.37)

∂un ∂un =− = −r1−n , ∂n ¯ ∂r

|x|≥ε

un (x)Δϕ(x)dx = −un (ε)

|x|=ε

∂ϕ dS + ε1−n ∂r

ϕ(x)dS, |x|=ε

1 where un (ε) = 2−n ε2−n for n ≥ 3 and un (ε) = ln ε for n = 2. Since the area of a sphere of radius ε in Rn is equal to σn−1 εn−1 (here σn−1 is the area of a sphere of radius 1), the ﬁrst summand in the right-hand side of (4.37) tends to 0 as ε → 0+, and the second summand tends to σn−1 ϕ(0). Finally, we obtain Δun , ϕ = σn−1 ϕ(0),

or Δun = σn−1 δ(x).

4.5. Diﬀerentiation and multiplication by a smooth function

81

Now, setting En (x) =

(4.38)

1 r2−n (2 − n)σn−1 E2 (x) =

(4.39)

for n ≥ 3,

1 ln r, 2π

we clearly get ΔEn (x) = δ(x); i.e., En (x) is a fundamental solution of the Laplace operator in Rn . It will later be referred to as the fundamental solution of the Laplace operator. It will be clear later that for n ≥ 3 it is the only fundamental solution such that it is a function of r = |x| and, besides, tends to 0 as |x| → ∞. For n = 2 there is no obvious way to choose the additive constant. Remark 4.10. The area σn−1 of the unit sphere in Rn . Let us compute σn−1 . To do this consider Gaussian integrals

∞ 2 e−x dx, I1 =

In =

e−|x| dx = 2

−∞

Rn

e−(x1 +···+xn ) dx1 . . . dxn = I1n . 2

Rn

2

Expressing In in polar coordinates, we get

∞ 2 In = e−r σn−1 rn−1 dr. 0

r2

= t, we get

∞ √ 1 ∞ n −1 −t σn−1 n −t n−1 Γ , e t 2 d( t) = σn−1 · t 2 e dt = In = σn−1 2 0 2 2 0

Setting

where Γ is the Euler Gamma-function. This implies σn−1 = same time, for n = 2, we have

∞ 2 2 e−r rdr = −πe−r |+∞ = π; I2 = 2π 0 therefore I1 = (4.40)

√

2In Γ( n ). 2

At the

0

π and In = π n/2 . Thus, σn−1 =

2π n/2 . Γ( n2 )

Note that this formula can also be rewritten without Γ-function, since the numbers Γ( n2 ) are easy to compute. Indeed, the functional equation Γ(s + 1) = sΓ(s) enables us to express Γ( n2 ) in terms of Γ(1) for even n and in terms of Γ( 12 ) for odd n. But for n = 1 we deduce from (4.40) that √ Γ( 12 ) = π. (Clearly, σ0 = 2. However, we could also use the formula σ2 = 4π taking n = 3 and using the relation Γ( 32 ) = 12 Γ( 12 ).)

82

4. Distributions

For n = 2, we deduce from (4.40) that Γ(1) = 1, which is also easy to verify without this formula. Therefore, for n = 2k + 2 we get σ2k+1 =

2π k+1 , k!

and for n = 2k + 1 we get σ2k =

2 · (2π)k , where (2k − 1)!! = 1 · 3 · 5 · · · · · (2k − 1). (2k − 1)!!

Remark 4.11. A physical meaning of the fundamental solution of the Laplace operator. For an appropriate choice of units, the potential u(x) of the electrostatic ﬁeld of a system of charges distributed with density ρ(x) in R3 satisﬁes the Poisson equation (4.41)

Δu = ρ.

In particular, for a point charge at zero we have ρ(x) = δ(x); hence, u(x) is a fundamental solution for Δ. Only potentials decaying at inﬁnity have physical meaning. In what follows, we will prove Liouville’s theorem, which implies in particular that a solution u(x) of the Laplace equation Δu = 0 deﬁned on Rn tending to 0 as |x| → +∞ is identically zero. Therefore, there is a unique fundamental solution tending to 0 as |x| → +∞; namely 1 . It deﬁnes the potential of the point unit charge situated at 0. E3 (x) = − 4πr Besides, the potential of an arbitrary distribution of charges should, clearly, by the superposition principle, be determined by the formula

(4.42) u(x) = E3 (x − y)ρ(y)dy. Applying formally the Laplace operator, we get

Δu(x) = δ(x − y)ρ(y)dy = ρ(x); i.e., Poisson’s equation (4.41) holds. These arguments may serve to deduce Poisson’s equation if, from the very beginning, the potential is deﬁned by (4.42). Its justiﬁcation may be obtained with the help of the convolution of distributions introduced below. Let us indicate a meaning of the fundamental solution E2 (x) in R2 . In consider an inﬁnite uniformly charged string (with the constant linear density of the charge equal to 1) placed along x3 -axis. From symmetry considerations, it is clear that its potential u(x) does not depend on x3 and only depends on r = x21 + x22 . Let x = (x1 , x2 ). Poisson’s equation for u 1 ln r + C. Therefore, in this takes the form Δu(x) = δ(x) implying u(x) = 2π case the potential is of the form E2 (x1 , x2 ) + C. R3

Note, however, that the potential of the string is impossible to compute with the help of (4.42) which, in this case, is identically equal to +∞. What

4.5. Diﬀerentiation and multiplication by a smooth function

83

is the meaning of this potential? This is easy to understand if we recall that the observable physical quantity is not the potential but the electrostatic ﬁeld E = −grad u. This ﬁeld can be computed via the Coulomb inverse square law, and the integral that deﬁnes it converges (at the points outside the string). The potential u, which can be recovered from E up to an additive constant, equals E2 (x1 , x2 ) + C. This potential can be deﬁned by another method when we ﬁrst assume that the string is of ﬁnite length 2l and subtract from the potential of the ﬁnite string a constant depending on l (this does not aﬀect E!) and then pass to the limit as l → +∞. It is easy to see that it is possible to choose the constants in such a way that the limit of the potentials of the ﬁnite strings really exists. Then this limit must be equal to E2 (x, y) + C by the above reasoning. We actually subtract an inﬁnite constant from the potential of the string of the form (4.42), but this does not aﬀect E. This procedure is called in physics renormalization of charge and has its analogues in quantum electrodynamics. Let us prove the possibility of renormalizing the charge. Let us write the potential of a segment of the string x3 ∈ [−l, l]:

l 1 dt . ul (x1 , x2 , x3 ) = − 2 2 4π −l x1 + x2 + (t − x3 )2 Since x21 + x22 + (t − x3 )2 ∼ |t| as t → ∞, it is natural to consider instead of ul the function

l 1 1 1 −√ vl (x1 , x2 , x3 ) = − dt 4π −l 1 + t2 x21 + x22 + (t − x3 )2 that diﬀers from ul by a constant depending on l:

l 1 dt √ . vl = ul + 4π −l 1 + t2 The integrand in the formula for vl is

√ 1 + t2 − x21 + x22 + (t − x3 )2 1 √ −√ = 2 1 + t2 x21 + x22 + (t − x3 )2 x1 + x22 + (t − x3 )2 · 1 + t2 1

1 + 2tx3 − x21 − x22 − x23

, √ √ x21 + x22 + (t − x3 )2 · 1 + t2 x21 + x22 + (t − x3 )2 + 1 + t2 which is asymptotically O t12 as t → +∞, so that there exists a limit of the integral as l → +∞. Clearly, the integral itself and its limit may =

84

4. Distributions

be computed explicitly but they are of no importance to us, since we can compute the limit up to a constant by another method. Multiplication by a smooth function This operation is introduced similarly to diﬀerentiation. Namely, if f ∈ a ∈ C ∞ (Ω), ϕ ∈ D(Ω), then, clearly,

L1loc (Ω),

af, ϕ = f, aϕ.

(4.43)

This formula may serve as a deﬁnition of af when f ∈ D (Ω), a ∈ C ∞ (Ω). It is easy to see that then we again get a distribution af ∈ D (Ω). If f ∈ E (Ω), then af ∈ E (Ω) and supp(af ) ⊂ supp f . Now let f ∈ S (Rn ). Then we can use (4.43) for ϕ ∈ S(Rn ) only if aϕ ∈ S(Rn ). Moreover, if we want af ∈ S (Rn ), it is necessary that the operator mapping ϕ to aϕ be a continuous operator from S(Rn ) to S(Rn ). To this end, it suﬃces, e.g., that a ∈ C ∞ (Rn ) satisﬁes |∂ α a(x)| ≤ Cα (1 + |x|)mα ,

x ∈ Rn ,

for every multi-index α with some constants Cα and mα . In particular, the multiplication by a polynomial maps S (Rn ) into S (Rn ). A function f ∈ L1loc (Ω) can be multiplied by any continuous function a(x). For a distribution f , conditions on a(x) justifying the existence of af can also be weakened. Namely, let, e.g., f ∈ D (Ω) be such that for some integer m ≥ 0 and every compact K ⊂ Ω we have (4.44)

|f, ϕ| ≤ CK

|α|≤m

sup |∂ α ϕ(x)|, ϕ ∈ D(K).

x∈K

Then we say that f is a distribution of ﬁnite order (not exceeding m). Denote by Dm (Ω) the set of such distributions. In the general case the estimate (4.44) only holds for a constant m depending on K, and now we require m to be independent of K. Clearly, if f ∈ E (Ω), then the order of f is ﬁnite. If f ∈ L1loc (Ω), then f ∈ D0 (Ω). (Ω), then we may extend f by continuity to a linear functional If f ∈ Dm on the space Dm (Ω)consisting of C m -functions with compact support in Ω. Clearly, Dm (Ω) = K Dm (K), where K is a compact in Ω and Dm (K) is the subset of Dm (Ω) consisting of functions with the support belonging to K. Introducing in Dm (K) the norm equal to the right-hand side of (4.44) without CK , we see that Dm (K) becomes a Banach space and f ∈ Dm (Ω) can be extended to a continuous linear functional on Dm (K). Therefore, it deﬁnes a linear functional on Dm (Ω). It is continuous in the sense that its restriction on Dm (K) is continuous for any compact K ⊂ Ω.

4.5. Diﬀerentiation and multiplication by a smooth function

85

For instance, it is obvious that δ(x − x0 ) ∈ D0 (Ω) for x0 ∈ Ω, so that δfunction deﬁnes a linear functional on D0 (Ω). It actually deﬁnes a continuous linear functional on C(Ω).

In general, if f ∈ Dm (Ω) and supp f ⊂ K, then f can be extended to a continuous linear functional on C m (Ω) if the topology in C m (Ω) is deﬁned by seminorms given by the right-hand side of (4.44) with any compact K ⊂ Ω.

Now, let f ∈ Dm (Ω) and a ∈ C m (Ω). Then (4.43) makes sense for ϕ ∈ D(Ω), since then aϕ ∈ Dm (Ω). Therefore, af is deﬁned for f ∈ Dm (Ω) and a ∈ C m (Ω). Example 4.11. a(x)δ(x) = a(0)δ(x) for a ∈ C(Rn ). Example 4.12. Let n = 1. Let us compute a(x)δ (x) for a ∈ C ∞ (R1 ). We have aδ , ϕ = δ , aϕ = −δ, (aϕ) = −a (0)ϕ(0) − a(0)ϕ (0) implying

a(x)δ (x) = a(0)δ (x) − a (0)δ(x).

Note, in particular, that a(0) = 0 does not necessarily imply a(x)δ (x) = 0. Example 4.13. Leibniz rule. Let f ∈ D (Ω), a ∈ C ∞ (Ω). Let us prove that ∂a ∂f ∂ (af ) = ·f +a . ∂xj ∂xj ∂xj

(4.45)

Indeed, if ϕ ∈ D(Ω), then ∂ϕ ∂ ∂ϕ (af ), ϕ = − af, = − f, a ∂xj ∂xj ∂xj ∂a ∂f ∂a ∂ (aϕ) + f, ϕ = , aϕ + f, ϕ = − f, ∂xj ∂xj ∂xj ∂xj ∂f ∂a = a + f, ϕ , ∂xj ∂xj as required. By (4.45) we may diﬀerentiate the product of a distribution by a smooth function as the product of the regular functions. It follows that the Leibniz rule for higher-order derivatives also holds. Example 4.14. A fundamental solution for an ordinary (one-dimensional) diﬀerential operator. On R1 consider a diﬀerential operator dm−1 d dm + a0 , + a + · · · + a1 m−1 m m−1 dx dx dx where the aj are constants. Let us ﬁnd its fundamental solution E(x), i.e., a solution of Lu = δ(x). Clearly, E(x) is determined up to a solution of the L=

86

4. Distributions

homogeneous equation Lu = 0. Besides, LE(x) = 0 should hold for x = 0. Therefore, it is natural to seek E(x) in the form y1 (x) for x < 0, E(x) = y2 (x) for x > 0, where y1 , y2 are solutions of Ly = 0. Moreover, subtracting y1 (x), we may assume that (4.46)

E(x) = θ(x)y(x),

where y(x) is a solution of Ly = 0. Now, let us ﬁnd LE(x). Clearly, (θ(x)y(x)) = y(0)δ(x) + θ(x)y (x), and the higher-order diﬀerentiations give rise to derivatives of δ(x). If we wish these derivatives to vanish and the δ-function to appear only at the very last step, we should assume y(0) = y (0) = · · · = y (m−2) (0) = 0, y (m−1) (0) = 1. Such a solution y(x) exists and is unique. For such a choice we get (θ(x)y(x)) = θ(x)y (x), (θ(x)y(x)) = (θ(x)y (x)) = θ(x)y (x), ··· (θ(x)y(x))(m−1) = θ(x)y (m−1) (x), (θ(x)y(x))(m) = y (m−1) (0)δ(x) + θ(x)y (m) (x) = δ(x) + θ(x)y (m) (x). Therefore, L(θ(x)y(x)) = θ(x)Ly(x) + δ(x) = δ(x), as required. Why should the fundamental solution be a regular C ∞ -function for x = 0? This follows from Theorem 4.12. Let u ∈ D ((a, b)) be a solution of the diﬀerential equation (4.47)

u(m) + am−1 (x)u(m−1) + · · · + a0 (x)u = f (x),

where aj (x), f (x) are C ∞ -functions on (a, b). Then u ∈ C ∞ ((a, b)). Proof. Subtracting a smooth particular solution of (4.47) which exists, as is well known, we see that everything is reduced to the case when f ≡ 0. Further, for f ≡ 0, equation (4.47) is easily reduced to the system of the form (4.48)

v = A(x)v,

4.6. Transposed operator. Change of variables

87

where v is a vector-valued distribution (i.e., a vector whose components are distributions) and A(x) a matrix with entries from C ∞ ((a, b)). Let Ψ(x) be an invertible matrix of class C ∞ satisfying Ψ (x) = A(x)Ψ(x); i.e., Ψ(x) is a matrix whose columns constitute a fundamental system of solutions of (4.48). Set v = Ψ(x)w; i.e., denote w = Ψ−1 (x)v. Then w is a vector-valued distribution on (a, b). Substituting v = Ψ(x)w in (4.48), we get w = 0. Now, it remains to prove Lemma 4.13. Let u ∈ D ((a, b)) and u = 0. Then u = const. Proof. The condition u = 0 means that u, ϕ = 0 for any ϕ ∈ D((a, b)). But, clearly, ψ ∈ D((a, b)) can be represented as ψ = ϕ , where ϕ ∈ D((a, b)), if and only if

b ψ(x)dx = 0. (4.49) 1, ψ = a

ϕ

then (4.49) holds due to the main theorem of calculus. Indeed, if ψ = x Conversely, if (4.49) holds, then we may set ϕ(x) = a ψ(t)dt and, clearly, ϕ ∈ D((a, b)) and ϕ = ψ. Consider the map I : D((a, b)) −→ C sending ψ to 1, ψ. Since u, ψ = 0 for ψ ∈ Ker I, it is clear that u, ψ only depends on Iψ, but since this dependence is linear, then u, ψ = C1, ψ, where C is a constant. But this means that u = C.

4.6. A general notion of the transposed (adjoint) operator. Change of variables. Homogeneous distributions The diﬀerentiation and multiplication by a smooth function are particular cases of the following general construction. Let Ω1 , Ω2 be two domains in Rn and let L : C ∞ (Ω2 ) −→ C ∞ (Ω1 ) be a linear operator. Let t L : D(Ω1 ) −→ D(Ω2 ) be a linear operator such that (4.50)

Lf, ϕ = f, t Lϕ, ϕ ∈ D(Ω1 ),

for f ∈ C ∞ (Ω2 ). The operator t L is called the transposed or adjoint or dual of L. Suppose that, for any compact K1 ⊂ Ω1 , there exists a compact K2 ⊂ Ω2 such that t LD(K1 ) ⊂ D(K2 ) and t L : D(K1 ) −→ D(K2 ) is a continuous map. Then formula (4.50) enables us to extend L to a linear operator L : D (Ω2 ) −→ D (Ω1 ). If Ω1 = Ω2 = Rn and the operator t L can be extended to a continuous linear operator t L : S(Rn ) −→ S(Rn ), then it is clear that L determines a linear map L : S (Rn ) −→ S (Rn ).

88

4. Distributions

Note that L thus constructed is always weakly continuous. This is obvious from (4.50), since |Lf, ϕ| is a seminorm of the general form for Lf and |f,t Lϕ| is a seminorm for f . Examples of transposed operators. a) If L =

∂ ∂xj ,

then t L = − ∂x∂ j .

b) If L = a(x) (the left multiplication by a(x) ∈ C ∞ (Ω)), then t L = L = a(x). In particular, the operators

∂ ∂xj

and a(x) are continuous on D (Ω) with

respect to the weak topology. Moreover,

∂ ∂xj

is continuous on S (Rn ).

c) A diﬀeomorphism χ : Ω1 −→ Ω2 naturally deﬁnes a linear map χ∗ : −→ C ∞ (Ω1 ) by the formula

C ∞ (Ω2 )

(χ∗ f )(x) = f (χ(x)). Clearly, the same formula deﬁnes a linear map χ∗ : L1loc (Ω2 ) −→ L1loc (Ω1 ). Besides, χ∗ transforms D(Ω2 ) to D(Ω1 ). We wish to extend χ∗ to a continuous linear operator χ∗ : D (Ω2 ) −→ D (Ω1 ). To do so, we ﬁnd the transposed operator. Let ϕ ∈ D(Ω1 ). Then −1

∂χ (z) ∗ −1 dz χ f, ϕ = f (χ(x))ϕ(x)dx = f (z)ϕ(χ (z)) ∂z Ω1 Ω2 −1 ∂χ (z) · ϕ(χ−1 (z)) , = f (z), ∂z ∂χ−1 (z) is its Jacobian. ∂z It is clear now that the transposed operator is of the form −1 ∂χ (z) t ∗ · ϕ(χ−1 (z)). ( χ ϕ)(z) = ∂z

where χ−1 is the inverse of χ and

−1

Since | ∂χ ∂z(z) | ∈ C ∞ (Ω2 ), it is clear that t χ∗ maps D(Ω1 ) to D(Ω2 ) and deﬁnes a continuous linear map of D(K1 ) to D(χ(K1 )). Therefore, the general scheme determines a continuous linear map χ∗ : D (Ω2 ) −→ D (Ω1 ). If Ω1 = Ω2 = Rn and χ is an aﬃne transformation (or at least coincides with an aﬃne transformation outside a compact set), then t χ∗ continuously maps −1 S(Rn ) to S(Rn ), and in this case | ∂χ ∂z (z) | = 0 everywhere (it is constant everywhere if χ is an aﬃne map, and it is a constant near inﬁnity in the general case). So a continuous linear map χ∗ : S (Rn ) −→ S (Rn ) is welldeﬁned.

4.6. Transposed operator. Change of variables

89

Examples. a) Translation operator f (x) → f (x − t) with t ∈ Rn can be extended to a continuous linear operator from D (Ω) to D (t + Ω) and from S (Rn ) to S (Rn ). In particular, the notation δ(x − x0 ) used above agrees with the deﬁnition of the translation by x0 . b) The homothety f (x) → f (tx), where t ∈ R\0, is deﬁned on D (Rn ) and maps D (Rn ) to D (Rn ) and S (Rn ) to S (Rn ). For instance, let us calculate δ(tx). We have

z |t|−n dz = |t|−n ϕ(0) = |t|−n δ(z)ϕ(z)dz. δ(tx)ϕ(x)dx = δ(z)ϕ t Therefore, δ(tx) = |t|−n δ(x), t ∈ R\0. Deﬁnition 4.14. A distribution f ∈ S (Rn ) is called homogeneous of order m ∈ R if (4.51)

f (tx) = tm f (x),

t > 0.

It is easily checked that the requirement f ∈ S (Rn ) can actually be replaced by f ∈ D (Rn ) (then (4.51) implies f ∈ S (Rn )). Sometimes, one says “homogeneous of degree m” instead of “homogeneous of order m”. Examples. a) If f ∈ L1loc (Rn ) and for any t > 0 (4.51) holds almost everywhere, then f is a homogeneous distribution of order m. b) δ-function δ(x) is homogeneous of order −n.

∂f It is easy to verify that ∂x∂ j f (tx) = t ∂x (tx). Therefore, if f is j ∂f is homogeneous of order m − 1. For homogeneous of order m, then ∂x j α instance, ∂ δ(x) is homogeneous of order −n − |α|.

Homogeneity considerations allow us to examine without computing the structure of the fundamental solution for the Laplace operator and its powers. Namely, the order of homogeneity of the distribution r2−n for n ≥ 3 is 2 − n, and for r = 0 this distribution satisﬁes the equation Δu = 0. Therefore, Δr2−n is supported at 0 and its order of homogeneity is −n. But, since the order of homogeneity of ∂ α δ(x) is −n − |α|, it is clear that Δr2−n = Cδ(x). For n = 2, we have Δ(ln r) = 0 for r = 0, and ∂x∂ j (ln r) is homogeneous of order −1. Therefore, Δ(ln r) is supported at 0 and its order of homogeneity is −2, so Δ(ln r) = Cδ(x).

90

4. Distributions

It is not clear a priori why the following does not hold: Δr2−n = 0 in D (Rn ). Later on we will see, however, that if u ∈ D (Ω) and Δu = 0, then u ∈ C ∞ (Ω) (and, moreover, u is real analytic in Ω). Since r2−n has a singularity at 0, it follows that Δr2−n = 0 implying Δr2−n = Cδ(x), where C = 0. Similarly, for n = 2 we have Δ(ln r) = Cδ(x) for C = 0. Finally, note that a rotation operator is deﬁned in D (Rn ) and in S (Rn ). This enables us to give a precise meaning to our arguments on averaging on which we based our conclusion about the existence of a spherically symmetric fundamental solution for Δ. Example 4.15. Fundamental solutions for powers of the Laplace operator. The spherical symmetry and homogeneity considerations make it clear that in order to ﬁnd the fundamental solution of Δm in Rn , it is natural to consider the function r2m−n . Then we get Δm−1 r2m−n = C1 r2−n . Can it be C1 = 0? It turns out that if 2m − n ∈ 2Z+ (i.e., 2m − n is not an even nonnegative integer), or, equivalently, r2m−n ∈ C ∞ (Rn ), then C1 = 0 and, therefore, Δm r2m−n = Cδ(x), where C = 0. Indeed, for r = 0 we have, for any α ∈ R, 2 d n−1 d α rα = [α(α−1)+α(n−1)]rα−2 = α(α+n−2)rα−2 . + Δr = dr2 r dr Therefore, Δr2k−n = (2k − n)(2k − 2)r2k−n−2 and consecutive applications of Δ to r2m−n yield a factor of the form (2m − 2)(2m − n), then (2m − 4)(2m − n − 2), etc., implying C1 = 0 for 2m − n ∈ 2Z+ . Therefore, for 2m − n ∈ 2Z+ the fundamental solution Enm (x) of Δm is of the form Enm (x) = Cm,n r2m−n . Further, let 2m−n ∈ 2Z+ so that r2m−n is a polynomial in x. Then consider the function r2m−n ln r. We get C1 ln r for n = 2, Δm−1 (r2m−n ln r) = 2−n for n ≥ 4 C2 r (C2 depends on n), where C1 = 0 and C2 = 0. Consider for the sake of deﬁniteness the case n ≥ 4. Since m−1 2 d n−1 d m−1 f (r) = + f (r) Δ dr2 r dr and

d dr

ln r = 1r , it follows that Δm−1 (r2m−n ln r) = Δm−1 (r2m−n ) · ln r + C2 r2−n = C2 r2−n

4.7. Appendix. Minkowski inequality

91

because r2m−n is a polynomial in x of degree 2m − n. Here C2 = 0 since r2m−n ln r ∈ C ∞ (Rn ) and the equation Δu = 0 has no solutions with singularities. (Indeed, if it had turned out that C2 = 0, then for some k we would have had that u = Δk−1 (r2m−n ln r) had a singularity at 0 and satisﬁed Δu = 0). Note though that C2 may be determined directly. Therefore, for 2m − n ∈ 2Z+ , Enm (x) = Cm,n r2m−n ln r.

4.7. Appendix. Minkowski inequality Let (M, μ) be a space with a positive measure. Here M is a set where a σ-algebra of subsets is chosen. (We will call a subset S ⊂ M measurable if it belongs to the σ-algebra.) The measure μ is assumed to be deﬁned on the σ-algebra and take values in [0, +∞]. We will also assume that μ is σ-ﬁnite; i.e., there exist measurable sets M1 ⊂ M2 ⊂ · · · , such that μ(Mj ) is ﬁnite for every j and M is the union of all Mj ’s. Then the function spaces Lp (M, μ), 1 ≤ p < +∞, are well-deﬁned, with the norm

1/p p |f (x)| μ(dx) , 1 ≤ p < +∞. f p = M

The Minkowski inequality in its simplest form is the triangle inequality for the norm · p : f + gp ≤ f p + gp ,

f, g ∈ Lp (M, μ).

By induction, this easily extends to several functions: f1 + f2 + · · · + fk p ≤ f1 p + f2 p + · · · + fk p , where fj ∈ Lp (M, μ), j = 1, 2, . . . , k. Multiplying each fj by aj > 0, we obtain (4.52) a1 f1 + a2 f2 + · · · + ak fk p ≤ a1 f1 p + a2 f2 p + · · · + ak fk p . A natural generalization of this inequality can be obtained if we replace summation by integration over a positive measure. More precisely, let (N, ν) be another space with a positive measure ν, and let f = f (x, y) be a measurable function on (M × N, μ × ν), where μ × ν is the product measure deﬁned by the property (μ × ν)(A × B) = μ(A)ν(B) for any measurable sets A ⊂ M, B ⊂ N with ﬁnite μ(A) and ν(B). Then we have Theorem 4.15 (Minkowski’s inequality). Under the conditions above

f (·, y)p ν(dy), (4.53) f (·, y)ν(dy) ≤ N

p

N

92

4. Distributions

or, equivalently, (4.54) p

1/p

f (x, y)ν(dy) μ(dx) ≤ M

N

N

1/p |f (x, y)| μ(dx) p

ν(dy).

M

The inequality (4.52) is a particular case of (4.53). To see this it suﬃces to choose N to be a ﬁnite set of k points and take measures of these points to be a1 , . . . , ak . Vice versa, if we know that (4.52) holds, then (4.53) becomes heuristically obvious. For example, if we could present the integrals in (4.53) as limits of the corresponding Riemann sums, then (4.53) would follow from (4.52). But this is not so easy to do rigorously even if the functions are suﬃciently regular. The easiest way to give a rigorous proof is by a direct use of the following very important Theorem 4.16 (H¨ older’s inequality). Let f ∈ Lp (M, μ), g ∈ Lq (M, μ), where 1 < p, q < +∞, and 1 1 + = 1. (4.55) p q Then f g ∈ L1 (M, μ) and

≤ f p gq . f (x)g(x)μ(dx) (4.56) M

Proof. Let us start with the following elementary inequality: (4.57)

ab ≤

ap bq + , p q

where a ≥ 0, b ≥ 0, and p, q are as in (4.55). It is enough to consider the case a > 0, b > 0. Consider the curve t = sp−1 , s ≥ 0, in the positive quadrant of the real plane R2s,t , and two disjoint bounded domains G1 and G2 , so that each of them is bounded by this curve together with the coordinate axes and straight line intervals as follows: the boundary of G1 consists of two straight line intervals {(s, 0) : 0 ≤ s ≤ a}, {(a, t) : 0 ≤ t ≤ ap−1 } and the curvilinear part {(s, t) : t = sp−1 , 0 ≤ s ≤ a}; the boundary of G2 also consists of two straight line intervals {(0, t) : 0 ≤ t ≤ b}, {(s, b) : 0 ≤ s ≤ b1/(p−1) } and the generally diﬀerent curvilinear piece of the same curve, namely, {(s, t) : s = t1/(p−1) , 0 ≤ t ≤ b}. It is easy to see that the areas of G1 , G2 are ap /p and bq /q, respectively, and their union contains the rectangle [0, a] × [0, b] up to a set of measure 0. Therefore, (4.57) follows by comparison of the areas. To prove (4.56) note ﬁrst that replacing f , g by |f |, |g|, we may assume that f ≥ 0 and g ≥ 0. Dividing both parts of (4.56) by f p gq , we

4.7. Appendix. Minkowski inequality

93

see that it suﬃces to prove it for normalized functions f, g, i.e., such that f p = gq = 1. To this end apply (4.57) with a = f (x), b = g(x) to get f (x)g(x) ≤

f (x)p g(x)q + . p q

Integrating this over M with respect to μ and using (4.55), we come to the desired result. Proof of Theorem 4.15. To prove (4.54) let us make some simplifying assumptions. Without loss of generality we can assume that f is positive (otherwise, replace it by |f |, which does not change the right-hand side and can only increase the left-hand side). We can assume that both sides of (4.54) are ﬁnite. If the right-hand side is inﬁnite, then the inequality is obvious. If the left-hand side is inﬁnite, we can appropriately truncate f (using σ-ﬁniteness of the measures) and then deduce the proof in the general case in the limit from a monotone approximation. Introducing a function

f (x, y)ν(dy),

h(x) := N

which is measurable on M by Fubini’s theorem, we can rewrite the pth power of the left-hand side of (4.54) as

p f (x, y)ν(dy) h(x)p−1 μ(dx) M h(x) μ(dx) = M N

f (x, y)h(x)p−1 μ(dx) ν(dy). = N

M

Now applying H¨older’s inequality (4.56) to the internal integral in the righthand side and taking into account that q = p/(p − 1), we obtain

1/p

(p−1)/p

f (x, y)h(x)p−1 μ(dx) ≤ f (x, y)p μ(dx) h(x)p μ(dx) . M

M

M

Using this to estimate the right-hand side of the previous identity, we obtain 1/p

(p−1)/p

p p p h(x) μ(dx) ≤ f (x, y) μ(dx) ν(dy) h(x) μ(dx) . M

N

M

M

Now dividing both sides by the last factor in the right-hand side, we obtain the desired inequality (4.54). This division is possible if we assume that the last factor does not vanish. But in the opposite case f = 0 almost everywhere and the statement is trivially true.

94

4. Distributions

4.8. Appendix. Completeness of distribution spaces Let E be a Fr´echet space with the topology given by seminorms {pj | j = 1, 2, . . . }, such that p1 (x) ≤ p2 (x) ≤ · · ·

for all x ∈ E.

E

Let denote the dual space of all linear continuous functionals f : E → R. Let us say that a set F ⊂ E is bounded if there exist an integer j ≥ 1 and a positive constant C such that (4.58)

|f, ϕ| ≤ Cpj (ϕ) for all f ∈ F, ϕ ∈ E.

Note that for F consisting of just one element f , this is always true, being equivalent to the continuity of f ; see (4.2). So the requirement (4.58) means equicontinuity, or uniform continuity of all functionals f ∈ F . Let us also say that a family F ⊂ E is weakly bounded if for every ϕ ∈ E there exists a positive Cϕ such that (4.59)

|f, ϕ| ≤ Cϕ

for all f ∈ F.

The following theorem is a simplest version of the Banach-Steinhaus theorem, which is also called the uniform boundedness principle. Theorem 4.17. If, under the assumptions described above, F is weakly bounded, then it is bounded. The completeness of E plays an important role in the proof, which relies on the following theorem. Theorem 4.18 (Baire’s theorem). A nonempty complete metric space cannot be presented as an at most countable union of nowhere dense sets. Here, nowhere dense set S in a metric space M is a set whose closure does not have interior points. So the theorem can be equivalently reformulated as follows: a nonempty complete metric space cannot be presented as an at most countable union of closed sets without interior points. Proof of Theorem 4.18. Let M be a nonempty complete metric space with the metric ρ, and let {Sj | j = 1, 2, . . . } be a sequence of nowhere dense subsets of M . We will produce a point x ∈ M which does not belong to any Sj . Passing to the closures we can assume that Sj is closed for every j. Now, clearly S1 = M because otherwise S1 would be open in M , so it would ¯ 1 , r1 ) = ¯1 = B(x not be nowhere dense. Hence there exists a closed ball B ¯ {x ∈ M | ρ(x, x1 ) ≤ r1 }, such that B1 ∩ S1 = ∅. By induction we can ¯ j , rj ), j = 1, 2, . . . , such that ¯j = B(x construct a sequence of closed balls B ¯ ¯ ¯j ¯ Bj+1 ⊂ Bj , rj+1 ≤ rj /2, and Bj ∩ Sj = ∅ for all j. It follows that B

4.8. Appendix. Completeness of distribution spaces

95

does not intersect with S1 ∪ S2 ∪ · · · ∪ Sj . On the other hand, the sequence of centers {xj | j = 1, 2, . . . } is a Cauchy sequence because by the triangle inequality, for m < n, ρ(xm , xn ) ≤ ρ(xm , xm+1 ) + · · · + ρ(xn−1 , xn ) ≤ rm + rm+1 + · · · ≤ 2rm ≤ 2−m+2 r1 . Therefore, due to the completeness of M , xm → x in M , and x belongs to ¯j . It follows that x ∈ Sj for all j. all balls B Proof of Theorem 4.17. Let us take for any j = 1, 2, . . . ! ϕ ∈ E sup |f, ϕ| ≤ jpj (ϕ) Sj = =

"#

f ∈F

$ ϕ ∈ E |f, ϕ| ≤ jpj (ϕ) .

f ∈F

Then Sj ⊂ Sj+1 for all j, each set Sj is closed, and by the assumed weak boundedness of the family F ⊂ E the union of all Sj , j = 1, 2, . . . , is the whole of E. Therefore, by Baire’s theorem, there exists j such that Sj has a nonempty interior. Taking a fundamental system of neighborhoods of any point ϕ0 ∈ E, consisting of sets Bj (ϕ0 , ε) = {ϕ0 + ϕ| pj (ϕ) < ε}, where ε > 0, j = 1, 2, . . . , we see that we can ﬁnd ϕ0 , j, and ε, such that Bj (ϕ0 , ε) ⊂ Sj . Then for any f ∈ F and any ϕ ∈ Bj (0, ε) |f, ϕ| ≤ |f, ϕ+ϕ0 |+|f, ϕ0 | ≤ jpj (ϕ+ϕ0 )+C1 ≤ jpj (ϕ)+C2 ≤ jε+C2 , where C1 , C2 > 0 are independent of ϕ and f (but may depend on ϕ0 , j, and ε). We proved this estimate for ϕ ∈ E such that pj (ϕ) < ε. By scaling we deduce that |f, ϕ| ≤ (j + ε−1 C2 )pj (ϕ) for all f ∈ F and ϕ ∈ E, which implies the desired result.

Corollary 4.19 (Completeness of the dual space). Let E be a Fr´echet space, and let {fk ∈ E | k = 1, 2, . . . } be a sequence of continuous linear functionals such that for every ϕ ∈ E there exists the limit f, ϕ := lim fk , ϕ. k→+∞

Then the limit functional f is also continuous; i.e., f ∈ E .

96

4. Distributions

Proof. Clearly, the set F = {fk | k = 1, 2, . . . } is weakly bounded. Therefore, by Theorem 4.17 there exists a seminorm pj , which is one of the seminorms deﬁning the topology in E, and a constant C > 0, such that |fk , ϕ| ≤ Cpj (ϕ),

ϕ ∈ E, k = 1, 2, . . . .

Taking the limit as k → +∞, we obtain |f, ϕ| ≤ Cpj (ϕ),

ϕ ∈ E,

with the same C and j. This implies continuity of f .

Applying this corollary of Theorem 4.17 to the distribution spaces D (Ω), and S (Rn ), we obtain their completeness. (In the case of D (Ω) we should apply the corollary to E = D(K).)

E (Ω),

4.9. Problems √ 4.1. Find lim χε (x) in D (R), where χε (x) = 1/ ε for x ∈ (−ε, ε) and ε→0+

χε (x) = 0 for x ∈ (−ε, ε). 4.2. Find lim ψε (x) in D (R), where ψε (x) = 1/ε for x ∈ (−ε, ε) and ε→0+

ψε (x) = 0 for x ∈ (−ε, ε). 1 ε→0+ x

4.3. Find lim

sin xε in D (R).

4.4. Prove that if f ∈ C ∞ (R), then x1 (f (x) − f (0)) ∈ C ∞ (R). 4.5. Consider the canonical projection p : R −→ S 1 = R/2πZ. It induces ∞ (R), where C ∞ (R) is the set of all 2πan isomorphism p∗ : C ∞ (S 1 ) −→ C2π 2π ∞ periodic C -functions on R. Introduce a natural Fr´echet topology in the space C ∞ (S 1 ) and deﬁne the set of distributions on S 1 as the dual space D (S 1 ) = (C ∞ (S 1 )) . Identify each function f ∈ L1 (S 1 ) (or, equivalently, a locally integrable 2π-periodic function f on R) with a distribution on S 1 by the formula

2π f (x)ϕ(x)dx. f, ϕ = 0

Prove that

p∗

can be extended by continuity to an isomorphism (R), D (S 1 ) −→ D2π

(R) is the set of all 2π-periodic distributions. where D2π

4.9. Problems

97

4.6. Prove that the trigonometric series fk eikx k∈Z

D (R)

converges in if and only if there exist constants C and N such that N |fk | ≤ C(1 + |k|) . Prove that if f is the sum of this series, then f ∈ D2π (R) and the series above is in fact its Fourier series; that is, 1 f, e−ikx , fk = 2π where the angle brackets denote the natural pairing between D (S 1 ) and C ∞ (S 1 ). 4.7. Prove that any 2π-periodic distribution (or, equivalently, distribution on S 1 ) is equal to the sum of its Fourier series. 4.8. Find the Fourier series expansion of the δ-function on S 1 , or, equivalently, of the distribution ∞ δ(x + 2kπ) ∈ D2π (R). k=−∞

4.9. Use the result of the above problem to prove the following Poisson summation formula: ∞ ∞ 1 ˜ f (k), f (2kπ) = 2π k=−∞

k=−∞

where f ∈ S(R) and f˜ is the Fourier transform of f . 4.10. Find all u ∈ D (R), such that a) xu(x) = 1; b) xu (x) = 1; c) xu (x) = δ(x). eikr , where r = |x|, is the fundamental solution of the 4πr Helmholtz operator Δ + k 2 in R3 . 4.11. Verify that −

10.1090/gsm/205/05

Chapter 5

Convolution and Fourier transform

5.1. Convolution and direct product of regular functions The convolution of locally integrable functions f and g on Rn is the integral

(5.1) (f ∗ g)(x) = f (x − y)g(y)dy, over Rn . Obviously, the integral (5.1) is not always deﬁned. Let us consider it when f, g ∈ C(Rn ) and one of these functions has a compact support (in this case (5.1) makes sense and, clearly, f ∗ g ∈ C(Rn )). The change of variables x − y = z turns (5.1) into

(f ∗ g)(x) = f (z)g(x − z)dz; i.e., the convolution is commutative: f ∗ g = g ∗ f. Further, let f ∈ C 1 (Rn ). Then it is clear that (5.1) may be diﬀerentiated under the integral sign, and we get ∂f ∂ (f ∗ g) = ∗ g. ∂xj ∂xj In general, if f ∈ C m (Rn ), then (5.2)

∂ α (f ∗ g) = (∂ α f ) ∗ g

for |α| ≤ m. 99

100

5. Convolution and Fourier transform

The convolution is associative; i.e., f ∗ (g ∗ h) = (f ∗ g) ∗ h if f, g, h ∈ C(Rn ) and two of the three functions have compact supports. This can be veriﬁed by the change of variables but a little later we will prove this in another way. Let ϕ ∈ C(Rn ) and let ϕ have a compact support. Then

f ∗ g, ϕ = f (x − y)g(y)ϕ(x)dydx may be transformed by the change x = y + z to the form

(5.3) f ∗ g, ϕ = f (z)g(y)ϕ(z + y)dydz. By Fubini’s theorem, this makes obvious the already proved commutativity of the convolution. Further, this also implies

(f ∗ g) ∗ h, ϕ = f (x)g(y)h(z)ϕ(x + y + z)dxdydz = f ∗ (g ∗ h), ϕ, proving the associativity of the convolution. Note the important role played in (5.3) by the function f (x)g(y) ∈ C(R2n ) called the direct or tensor product of functions f and g and denoted by f ⊗ g or just f (x)g(y) if we wish to indicate the arguments. Suppose that f and g have compact supports. Then in (5.3) we may set ϕ(x) = e−iξ·x , where ξ ∈ Rn . We get f ∗ g(ξ) = f˜(ξ) · g˜(ξ), where the tilde denotes the Fourier transform:

f˜(ξ) = f (x)e−iξ·x dx. Therefore, the Fourier transform turns the convolution into the usual product of Fourier transforms. Let us analyze the support of a convolution. Let A, B ⊂ Rn . Set A + B = {x + y : x ∈ A, y ∈ B} (sometimes A + B is called the arithmetic sum of subsets A and B). It turns out that supp(f ∗ g) ⊂ supp f + supp g. % Indeed, it follows from (5.3) that if supp ϕ (supp f + supp g) = ∅, then f ∗ g, ϕ = 0, yielding (5.4). (5.4)

5.2. Direct product of distributions

101

5.2. Direct product of distributions Let Ω1 ⊂ Rn1 , Ω2 ⊂ Rn2 , f1 ∈ D (Ω1 ), f2 ⊂ D (Ω2 ). Let us deﬁne the direct or tensor product f1 ⊗ f2 ∈ D (Ω1 × Ω2 ) of distributions f1 and f2 . Instead of f1 ⊗ f2 , we will often also write f1 (x)f2 (y), where x ∈ Ω1 , y ∈ Ω2 . Let ϕ(x, y) ∈ D(Ω1 × Ω2 ). For fj ∈ L1loc (Ωj ) the Fubini theorem yields (5.5)

f1 ⊗ f2 , ϕ = f1 (x), f2 (y), ϕ(x, y) = f2 (y), f1 (x), ϕ(x, y).

This formula should be considered as a basis for deﬁnition of f1 ⊗ f2 if f1 and f2 are distributions. Let us show that it makes sense. First, note that any compact K in Ω1 × Ω2 is contained in a compact of the form K1 × K2 , where Kj is a compact in Ωj , j = 1, 2. In particular, supp ϕ ⊂ K1 × K2 for some K1 , K2 . Further, ϕ(x, y) can now be considered as a C ∞ -function of x with values in D(K2 ) (the C ∞ -function with values in D(K2 ) means that all the derivatives are limits of their diﬀerence quotients in the topology of D(K2 )). Therefore, f2 (y), ϕ(x, y) ∈ D(K1 ), since f2 is a continuous linear functional on D(K2 ). This implies that the functional F, ϕ = f1 (x), f2 (y), ϕ(x, y) is well-deﬁned. It is easy to verify that F ∈ D (Ω1 × Ω2 ) and supp F ⊂ supp f1 × supp f2 . We can also construct the functional G ∈ D (Ω1 × Ω2 ) by setting G, ϕ = f2 (y), f1 (x), ϕ(x, y). Let us verify that F = G. Note that if ϕ = ϕ1 ⊗ ϕ2 , where ϕj ∈ D(Ωj ), i.e., ϕ(x, y) = ϕ1 (x)ϕ2 (y), then, clearly, F, ϕ = G, ϕ = f1 , ϕ1 f2 , ϕ2 . Therefore, obviously, to prove F = G it suﬃces to prove the following. Lemma 5.1. In D(Ω1 × Ω2 ), ﬁnite linear combinations of the functions ϕ1 ⊗ ϕ2 , where ϕj ∈ D(Ωj ) for j = 1, 2, are dense. More precisely, if Kj ˆ j is a compact in Ωj , such that Kj is is a compact in Ωj , j = 1, 2, and K ˆ j , then any function ϕ ∈ (K1 × K2 ) can be contained in the interior of K ˆ1 × K ˆ 2 ) by ﬁnite sums of functions approximated, as close as we like, in D(K ˆ of the form ϕ1 ⊗ ϕ2 , where ϕj ∈ D(Kj ). Proof. Consider the cube Qd in Rn1 +n2 : d d Qd = (x, y) : |xj | ≤ , |yl | ≤ , j = 1, . . . , n1 ; l = 1, . . . , n2 , 2 2 i.e., the cube with the center at the origin with edges of length d parallel to the coordinate axes. Take d large enough for the compact set K1 × K2 to

102

5. Convolution and Fourier transform

belong to the interior of Qd . In L2 (Qd ), the exponentials 2πi n n exp k · z : k ∈ Z , n = n1 + n2 ; z ∈ R d considered as functions of z, constitute a complete orthonormal system. A function ϕ = ϕ(x, y) = ϕ(z) can be expanded into the Fourier series 2πi k·z , ϕk exp (5.6) ϕ(z) = d k

where 1 ϕk = n d

2πi k · z dz. ϕ(z) exp − d Qd

The Fourier series (5.6) converges absolutely and uniformly in Qd to ϕ(z) and it can be termwise diﬀerentiated any number of times. Indeed, integrating by parts shows that the absolute values of N

1 d2 2πi 2 N k · z dz (1 + |k| ) ϕk = n ϕ(z) 1 − 2 Δz exp − d Qd 4π d N

2πi 1 d2 k · z dz ϕ(z) exp − = n 1 − 2 Δz d Qd 4π d are bounded by a constant CN independent of k; i.e., |ϕk | ≤ CN (1 + |k|2 )−N . This implies the convergence of (5.6) and the possibility of its termwise diﬀerentiation. Therefore, (5.6) converges in the topology of C ∞ (Qd ). Now, let χj ∈ ˆ j ), where χj = 1 in a neighborhood of Kj . We get D(K ϕ(x, y) = χ1 (x)χ2 (y)ϕ(x, y) 2πi 2πi = k1 · x χ2 (y) exp k2 · y , ϕk χ1 (x) exp d d k1 ,k2

Zn1 ,

k2 ∈ Zn2 , k = (k1 , k2 ), and the series converges in the where k1 ∈ ˆ 2 ). But then the partial sums of this series give the ˆ1 × K topology D(K required approximation for ϕ(x, y). Now we are able to introduce the following. Deﬁnition 5.2. If fj ∈ D (Ωj ), j = 1, 2, then the direct or tensor product of f1 and f2 is the distribution on Ω1 × Ω2 deﬁned via (5.5) and denoted by f1 ⊗ f2 or f1 (x)f2 (y) or f1 (x) ⊗ f2 (y).

5.2. Direct product of distributions

103

Clearly, supp(f1 ⊗ f2 ) ⊂ supp f1 × supp f2 , and if fj ∈ that

S (Rnj ),

then f1 ⊗ f2 ∈ S (Rn1 +n2 ). Further, it is easy to verify

∂ ∂f1 (f1 ⊗ f2 ) = ⊗ f2 , ∂xj ∂xj

∂ ∂f2 (f1 ⊗ f2 ) = f1 ⊗ . ∂yl ∂yl

Example 5.1. δ(x)δ(y) = δ(x, y). Example 5.2. Let t ∈ R, x ∈ Rn−1 , α(x) ∈ C(Rn−1 ). The distribution δ(t) ⊗ α(x) ∈ D (Rnt,x ) is called the single layer with density α on the plane t = 0. Its physical meaning is as follows: it describes the charge supported on the plane t = 0 and smeared over the plane with the density α(x). Clearly,

α(x)ϕ(0, x)dx. (5.7) δ(t) ⊗ α(x), ϕ(t, x) = Rn−1 x

For β(x) ∈ C(Rn−1 ), the distribution δ (t) ⊗ β(x) is called the double layer with density β on the plane t = 0. Its physical meaning is as follows: it describes the distribution of dipoles smeared with the density β over the plane t = 0 and oriented along the t-axis. Recall that a dipole in electrostatics is a system of two very large charges of the same absolute value and of opposite signs placed very close to each other. The product of charges by the distance between them should be a deﬁnite ﬁnite number called the dipole moment. Clearly, in D (Rn ), the following limit relation holds: ' & δ(t − ε) δ(t + ε) ⊗ β(x) − ⊗ β(x) , δ (t) ⊗ β(x) = lim ε→+0 2ε 2ε implying the above interpretation of the double layer. It is also clear that

∂ϕ (5.8) δ (t) ⊗ β(x), ϕ(t, x) = − (0, x)β(x)dx. n−1 ∂t Rx The formulas similar to (5.7), (5.8) can be used to deﬁne the single and double layers on any smooth surface Γ of codimension 1 in Rn . Namely, if ∂ (βδΓ ) by the α, β ∈ C(Γ), then we can deﬁne the distributions αδΓ and ∂n formulas

α(x)ϕ(x)dS, αδΓ , ϕ = Γ

∂ϕ ∂ (βδΓ ), ϕ = − β(x) dS, ∂n ∂n Γ where dS is the area element of Γ and n the external normal to Γ. Note, however, that, locally, a diﬀeomorphism reduces these distributions to the above direct products δ(t) ⊗ α(x) and δ (t) ⊗ β(x).

104

5. Convolution and Fourier transform

Note another important property of the direct product: it is associative; i.e., if fj ∈ D (Ωj ), j = 1, 2, 3, then (f1 ⊗ f2 ) ⊗ f3 = f1 ⊗ (f2 ⊗ f3 ) in D (Ω1 × Ω2 × Ω3 ). The proof follows from the fact that ﬁnite linear combinations of functions of the form ϕ1 ⊗ ϕ2 ⊗ ϕ3 with ϕj ∈ D(Ωj ), j = 1, 2, 3, are dense in D(Ω1 × Ω2 × Ω3 ) in the same sense as in Lemma 5.1.

5.3. Convolution of distributions It is clear from (5.3) that it is natural to try to deﬁne the convolution of distributions f, g ∈ D (Rn ) with the help of the formula (5.9)

f ∗ g, ϕ = f (x)g(y), ϕ(x + y),

ϕ ∈ D(Rn ).

This formula (and the convolution itself) does not always make sense, since ϕ(x + y) ∈ D(R2n ) for ϕ = 0. Nevertheless, sometimes the right-hand side of (5.9) can be naturally deﬁned. Let, for example, f ∈ E (Rn ). Let us set (5.10)

f (x)g(y), ϕ(x + y) = f (x), g(y), ϕ(x + y),

which makes sense since g(y), ϕ(x + y) ∈ E(Rnx ). We may proceed the other way around, setting (5.11)

f (x)g(y), ϕ(x + y) = g(y), f (x), ϕ(x + y),

which also makes sense, since f (x), ϕ(x + y) ∈ D(Rny ). Let us show that both methods give the same answer. To this end, take a function χ ∈ D(Rn ) such that χ(x) = 1 in a neighborhood of supp f . Then f = χf and we have f (x), g(y), ϕ(x + y) = χ(x)f (x), g(y), ϕ(x + y) = f (x), χ(x)g(y), ϕ(x + y) = f (x), g(y), χ(x)ϕ(x + y), and, similarly, g(y), f (x), ϕ(x + y) = g(y), f (x), χ(x)ϕ(x + y). The right-hand sides of these equations are equal, due to the property of the direct product (5.5) proved above since χ(x)ϕ(x + y) ∈ D(R2n ). Therefore, their left-hand sides coincide. Now we may introduce the following deﬁnition. Deﬁnition. The convolution of two distributions f, g ∈ D (Rn ) one of which has a compact support is the distribution f ∗ g ∈ D (Rn ) deﬁned via (5.9) understood in the sense of (5.10) or (5.11). It follows from the arguments above that the convolution is commutative; i.e., f ∗ g = g ∗ f,

f ∈ E (Rn ), g ∈ D (Rn ).

5.3. Convolution of distributions

105

Besides, it is associative; i.e., (f ∗ g) ∗ h = f ∗ (g ∗ h) if two of the three distributions have compact support. This is easy to verify from the associativity of the direct product and the equation (f ∗ g) ∗ h, ϕ = f (x)g(y)h(z), ϕ(x + y + z) because the same holds if (f ∗ g) ∗ h is replaced by f ∗ (g ∗ h). The rule (5.2) of diﬀerentiating the convolution applies also to distributions. Indeed, ∂ α (f ∗ g), ϕ = (−1)|α| f ∗ g, ∂ α ϕ = (−1)|α| f (x)g(y), (∂ αϕ)(x + y), but (∂ α ϕ)(x + y) can be represented in the form ∂xα ϕ(x + y) or ∂yα ϕ(x + y) and then, moving ∂ α backwards, we see that (5.2) holds: ∂ α (f ∗ g) = (∂ α f ) ∗ g = f ∗ (∂ α g). Example 5.3. δ(x) ∗ f (x) = f (x), f ∈ D (Rn ). From the algebraic viewpoint, we see that the addition and convolution endow E (Rn ) with the structure of an associative and commutative algebra (over C) with the unit δ(x), and D (Rn ) is a two-sided module over this algebra. Example 5.4. A solution of a nonhomogeneous equation with constant coeﬃcients. We begin this example with Deﬁnition. Let p(D) be a linear diﬀerential operator with constant coeﬃcients in Rn . Distribution E(x) ∈ D (Rn ) is its fundamental solution if p(D)E(x) = δ(x). Then the equation p(D)u = f, where f ∈

E (Rn ),

has a particular solution u = E ∗ f.

Indeed, p(D)(E ∗ f ) = [p(D)E] ∗ f = δ ∗ f = f, as required. Example 5.5. Using the above example we can ﬁnd a particular solution of a nonhomogeneous ordinary diﬀerential equation with constant coeﬃcients without variation of parameters. Indeed, let n = 1 and p(D) = L, where L=

dm−1 d dm + a0 + a + · · · + a1 m−1 m m−1 dx dx dx

106

5. Convolution and Fourier transform

and all coeﬃcients aj are constants. In this case, we may take E(x) = θ(x)y(x), where y is the solution of the equation Ly = 0 satisfying the initial conditions y(0) = y (0) = · · · = y (m−2) (0) = 0, y (m−1) (0) = 1 (see (4.46)). This implies that if f ∈ C(R) has a compact support, then

x y(x − t)f (t)dt u1 (x) = E(x − t)f (t)dt = −∞

is a particular solution of the equation Lu = f . Note that since y(x − t) satisﬁes, as a function of x, the equation Ly(x − t) = 0 for any t ∈ R, it follows that

0 u2 (x) = y(x − t)f (t)dt −∞

satisﬁes the equation Lu2 = 0. Therefore, together with u1 (x), the function

x (5.12) u(x) = y(x − t)f (t)dt 0

is also a particular solution of Lu = f . Formula (5.12) gives a solution of Lu = f for an arbitrary f ∈ C(R) (not necessarily with a compact support) since the equality Lu = f at x depends only on the behavior of f in a neighborhood of x and the solution u itself given by this formula is determined in a neighborhood of x by the values of f on a ﬁnite segment only. Here is an example: a particular solution of the equation u + u = f is given by the formula

x sin(x − t)f (t)dt. u(x) = 0

Example 5.6. Potentials. If En is the fundamental solution of the Laplace operator Δ in Rn deﬁned by (4.38), (4.39), then u = −En ∗f with f ∈ E (Rn ) is called a potential and satisﬁes the Poisson equation Δu = −f . In particular, if f = ρ, where ρ is piecewise continuous, then for n = 3 we get the Newtonian potential

ρ(y) 1 dy, u(x) = 4π |x − y| and for n = 2, we get the logarithmic potential

1 ρ(y) ln |x − y|dy. u(x) = − 2π

5.4. Support of a convolution

107

In these examples, the convolution En ∗ f is locally integrable for any n. As a distribution, it coincides with a locally integrable function

u(x) = − En (x − y)ρ(y)dy, since En (x − y)ρ(y) is locally integrable in R2n x,y and, by the Fubini theorem, we can make the change of variables in the integral u, ϕ which leads us to the deﬁnition of the convolution of distributions. Let us give other examples of potentials. Recall the deﬁnition of the operator δΓ in Example 4.4. The convolution u = −En ∗ αδΓ ,

(5.13)

where Γ is a smooth compact surface of codimension 1 in Rn and α ∈ C(Γ), is a simple layer potential and may be interpreted as a potential of a system of charges smeared over Γ with the density α. The convolution ∂ (βδΓ ) ∂n is called a double layer potential and denotes the potential of a system of dipoles smeared over Γ and oriented along the exterior normal with the density of the dipole moment equal to β(x).

(5.14)

u = −En ∗

Note that outside Γ both potentials (5.13) and (5.14) are C ∞ -functions given by the formulas

u(x) = − En (x − y)α(y)dSy ,

Γ

∂En (x − y) β(y)dSy , ∂ny respectively. The behavior of the potentials near Γ is of interest, and we will discuss it later in detail. u(x) =

5.4. Other properties of convolution. Support and singular support of a convolution First, let us study the convolution of a smooth function and a distribution. Proposition 5.3. Let f ∈ E (Rn ), g ∈ C ∞ (Rn ) or f ∈ D (Rn ), g ∈ C0∞ (Rn ). Then f ∗ g ∈ C ∞ (Rn ) and (f ∗ g)(x) = f (y), g(x − y) = f (x − y), g(y). Proof. First, observe that g(x − y) is an inﬁnitely diﬀerentiable function of x with values in C ∞ (Rny ). If, moreover, g ∈ C0∞ (Rn ), then the support

108

5. Convolution and Fourier transform

of g(x − y) as of a function of y belongs to a compact set if x runs over a compact set. Therefore, f (y), g(x − y) ∈ C ∞ (Rn ). The equation f (x − y), g(y) = f (y), g(x − y) follows since, by deﬁnition, the change of variables in distributions is performed as if we are dealing with an ordinary integral. It remains to verify that if ϕ ∈ D(Rn ), then f ∗ g, ϕ = f (y), g(x − y), ϕ(x), or f (x), g(y), ϕ(x + y) = f (y), g(x − y), ϕ(x). this means that Since g ∈

f (x), g(y)ϕ(x + y)dy = f (y), g(x − y)ϕ(x)dx, C ∞ (Rn ),

or

f (y), g(x − y)ϕ(x)dx = f (y), g(x − y)ϕ(x)dx;

i.e., we are to prove that the continuous linear functional f may be inserted under the integral sign. This is clear, since the integral (Riemann) sums of the integral

g(x − y)ϕ(x)dx, considered as functions of y, tend to the integral in the topology in C ∞ (Rny ), because derivatives ∂yα of the Riemann integral sums are Riemann integral sums themselves for the integrand in which g(x − y) is replaced with ∂yα g(x − y). Proposition 5.3 is proved. Proposition 5.4. Let fk → f in D (Rn ) and g ∈ E (Rn ) or fk → f in E (Rn ) and g ∈ D (Rn ). Then fk ∗ g → f ∗ g in D (Rn ). Proof. The proposition follows immediately from fk ∗ g, ϕ = fk (x), g(y), ϕ(x + y).

Corollary 5.5. C0∞ (Rn ) is dense in D (Rn ) and in E (Rn ). Proof. If ϕk ∈ C0∞ (Rn ), ϕk (x) → δ(x) in E (Rn ) as k → +∞ and f ∈ D (Rn ), then ϕk ∗ f ∈ C ∞ (Rn ) and lim ϕk ∗ f = δ ∗ f = f in D (Rn ). k→+∞

Therefore, C ∞ (Rn ) is dense in D (Rn ). But, clearly, C0∞ (Rn ) is dense in C ∞ (Rn ). Therefore, C0∞ (Rn ) is dense in D (Rn ). It is easy to see that if fl ∈ E (Rn ) and suppfl ⊂ K, where K is a compact in Rn (independent of l), then fl → f in D (Rn ) if and only if fl → f in E (Rn ). But then it follows from the above argument that C0∞ (Rn ) is dense in E (Rn ).

5.5. Smoothness of solutions

109

A similar argument shows that C0∞ (Ω) is dense in E (Ω) and D (Ω) and, besides, C0∞ (Rn ) is dense in S (Rn ). These facts also say that in order to verify any distributional formula both of whose sides continuously depend, for example, on f ∈ D (Rn ), it suﬃces to do so for f ∈ C0∞ (Rn ). Proposition 5.6. Let f ∈ E (Rn ), g ∈ D (Rn ). Then supp(f ∗ g) ⊂ supp f + supp g. % Proof. We have to prove that if ϕ ∈ D(Rn ) and supp ϕ (supp f + supp g) = ∅, then f ∗ g, ϕ = 0. But this is clear from (5.9) since in this case supp ϕ(x + y) does not intersect with supp[f (x)g(y)]. Deﬁnition 5.7. The singular support of a distribution f ∈ D (Ω) is the smallest closed subset F ⊂ Ω such that f |Ω\F ∈ C ∞ (Ω\F ) and is denoted sing supp f . Proposition 5.8. Let f ∈ E (Rn ), g ∈ D (Rn ). Then (5.15)

sing supp(f ∗ g) ⊂ sing supp f + sing supp g.

Proof. Using cut-oﬀ functions we can, for any ε > 0, represent any distribution h ∈ D (Rn ) as a sum h = hε + hε , where supp hε is contained in the ε-neighborhood of sing supp h and hε ∈ C ∞ (Rn ). Let us choose such representation for f and g: f = fε + fε ,

g = gε + gε .

Then f ∗ g = fε ∗ gε + fε ∗ gε + fε ∗ gε + fε ∗ gε . In this sum the last three summands belong to C ∞ (Rn ) by Proposition 5.3. Therefore, sing supp(f ∗ g) ⊂ sing supp(fε ∗ gε ) ⊂ supp(fε ∗ gε ) ⊂ supp fε + supp gε . The latter set is contained in the 2ε-neighborhood of sing supp f +sing supp g. Since ε is arbitrary, this implies (5.15), as required.

5.5. Relation between smoothness of a fundamental solution and that of solutions of the homogeneous equation We will ﬁrst prove the following simple but important lemma. Lemma 5.9. Let p(D) be a linear diﬀerential operator with constant coefﬁcients, E = E(x) its fundamental solution, u ∈ E (Rn ), and p(D)u = f . Then u = E ∗ f .

110

5. Convolution and Fourier transform

Proof. We have E ∗ f = E ∗ p(D)u = p(D)E ∗ u = δ ∗ u = u,

as required.

Remark 5.10. The requirement u ∈ E (Rn ) is essential here (the assertion is clearly wrong without it). Namely, it allows us to move p(D) from u to E in the calculation given above. Theorem 5.11. Assume that E is a fundamental solution of p(D) and E ∈ C ∞ (Rn \{0}). If u ∈ D (Ω) is a solution of p(D)u = f with f ∈ C ∞ (Ω), then u ∈ C ∞ (Ω). Proof. Note that the statement of the theorem is local; i.e., it suﬃces to prove it in a neighborhood of any point x0 ∈ Ω. Let χ ∈ D(Ω), χ(x) = 1 in a neighborhood of x0 . Consider the distribution χu and apply p(D) to it. Denoting p(D)(χu) by f1 , we see that f1 = f in a neighborhood of x0 and, in particular, x0 ∈ sing supp f1 . By Lemma 5.9, we have χu = E ∗ f1 .

(5.16)

We now use Proposition 5.8. Since sing supp E = {0}, we get sing supp(χu) ⊂ sing supp f1 . Therefore, x0 ∈ sing supp χu and x0 ∈ sing supp u; i.e., u is a C ∞ -function in a neighborhood of x0 , as required. A similar fact is true for the analyticity of solutions. Recall that a function u = u(x) deﬁned in an open set Ω ⊂ Rn , with values in R or C, is called real analytic if for every y ∈ Ω there exists a neighborhood Uy of y in Ω, such that u|Uy = u|Uy (x) can be presented in Uy as a sum of a convergent power series cα (x − y)α , x ∈ Uy , u(x) = α

with the sum over all multi-indices α. Consider ﬁrst the case of the homogeneous equation p(D)u = 0. Theorem 5.12. Assume that a fundamental solution E(x) of p(D) is real analytic in Rn \{0}, u ∈ D (Ω), and p(D)u = 0. Then u is a real-analytic function in Ω. Proof. We know already that u ∈ C ∞ (Ω). Using the same argument as in the proof of Theorem 5.11, consider (5.16), where now f1 ∈ D(Ω) and f1 (x) = 0 in a neighborhood of x0 . Then for x close to x0 we have

E(x − y)f1 (y)dy. (5.17) u(x) = |y−x0 |≥ε>0

5.5. Smoothness of solutions

111

Note that the analyticity of E(x) for x = 0 is equivalent to the possibility to extend E(x) to a holomorphic function in a complex neighborhood of the set Rn \{0} in Cn , i.e., in an open set G ⊂ Cn such that G ∩ Rn = Rn \{0}. Recall that a complex-valued function ϕ = ϕ(z) deﬁned for z ∈ G, where G is an open subset in Cn , is called holomorphic or complex analytic or even simply analytic if one of the following two equivalent conditions is fulﬁlled: a) ϕ(z) can be expanded in a power series in z − z0 in a neighborhood of each point z0 ∈ G; i.e., cα (z − z0 )α , ϕ(z) = α

where z is close to z0 , cα ∈ C, and α is a multi-index. b) ϕ(z) is continuous in G and diﬀerentiable with respect to each complex variable zj (i.e., is holomorphic with respect to each zj ). A proof can be found on the ﬁrst pages of any textbook on functions of several complex variables (e.g., Gunning and Rossi [11, Ch. 1, Theorem A2] or H¨ormander [14, Ch. 2]). Note that in (5.17) the variable y varies over a compact K ⊂ Rn . Fix y0 = x0 . Then for |z − x0 | < δ, |w − y0 | < δ with z, w ∈ Cn and δ suﬃciently small, we can deﬁne E(z − w) as a holomorphic function in z and w. Then, clearly,

|y−y0 |≤δ

E(z − y)f1 (y)dy

is well-deﬁned for |z −x0 | < δ and holomorphic in z since we can diﬀerentiate with respect to z under the integral sign. But using a partition of unity, we can express the integral in (5.17) as a ﬁnite sum of integrals of this form. Therefore, we can deﬁne u(z) for complex z, which are close to x0 , so that u(z) is analytic in z. In particular, u(x) can be expanded in a Taylor series in a neighborhood of x0 ; i.e., u(x) is real analytic. The case of a nonhomogeneous equation reduces to the above due to the Cauchy-Kovalevsky theorem which we will give in a form convenient for us. Theorem 5.13. (Cauchy-Kovalevsky) Consider the equation ∂mu + aα (x)D α u(x) = f (x), (5.18) ∂xm n |α|≤m,αn 0 such that (5.30) |∂ α f˜(ξ)| ≤ Cα (1 + |ξ|)N ξ

for any multi-index α. Proof. Let us choose constants N , C and a compact K ⊂ Rn so that sup |∂ α ϕ(x)|, ϕ ∈ C ∞ (Rn ). (5.31) |f, ϕ| ≤ C |α|≤N

x∈K

Set g(ξ) = f, e−iξ·x . Since e−iξ·x is an inﬁnitely diﬀerentiable function in ξ with values in C ∞ (Rn ), then, clearly, g(ξ) ∈ C ∞ (Rnξ ) and ∂ξα g(ξ) = f, (−ix)α e−iξ·x , implying that (5.30) holds for g(ξ). It remains to verify that g(ξ) = f˜(ξ), i.e., that

−iξ·x f (x), ϕ(ξ)e dξ = ϕ(ξ)f (x), e−iξ·x dξ, ϕ ∈ S(Rnξ ). But this follows from the convergence of ϕ(ξ)e−iξ·x dξ in the topology of C ∞ (Rnx ) which means that this integral and integrals obtained from it by taking derivatives of any order converge uniformly with respect to x ∈ K (where K is a compact set in Rn ). Remark 5.22. The condition f˜ ∈ C ∞ (Rn ) and the estimate (5.30) are necessary but not suﬃcient for a distribution f to have a compact support. For instance, if f ∈ E (Rn ), then, clearly, f˜(ξ) may be deﬁned via (5.29) for ξ ∈ Cn and we get an entire (i.e., homomorphic everywhere in Cn )-function satisfying |f˜(ξ)| ≤ C(1 + |ξ|)N eA|Im ξ| . It turns out that this condition on f˜ is already necessary and suﬃcient for f ∈ E (Rn ). This fact is a variant of the Paley-Wiener theorem and its proof may be found in H¨ ormander [13, Theorem 1.7.7] or [15, Theorem 7.3.1].

120

5. Convolution and Fourier transform

Proposition 5.23. Let f ∈ E (Rn ), g ∈ S (Rn ). Then (5.32)

f ∗ g(ξ) = f˜(ξ)˜ g (ξ).

Proof. Note that the right-hand side of (5.32) makes sense due to Proposition 5.21. It is also easy to verify that f ∗ g ∈ S (Rn ) so that the left-hand side of (5.32) makes sense. Indeed, if ϕ ∈ S(Rn ), then f (y), ϕ(x + y) ∈ S(Rnx ) due to (5.31). Therefore, for ϕ ∈ S(Rn ) the equality f ∗ g, ϕ = g(x), f (y), ϕ(x + y) makes sense and we clearly get f ∗ g ∈ S (Rn ). Now let us verify (5.32). For ϕ ∈ S(Rn ) we get

−iξ·x f ∗ g(ξ), ϕ(ξ) = f ∗ g, ϕ(ξ)e dξ

−iξ·(x+y) dξ = g(x), f (y), ϕ(ξ)e

. / = g(x), ϕ(ξ)e−iξ·x f (y), e−iξ·y dξ

−iξ·x ˜ dξ = g(x), ϕ(ξ)f (ξ)e = ˜ g (ξ), f˜(ξ)ϕ(ξ) = f˜(ξ)˜ g (ξ), ϕ(ξ),

as required.

5.9. Applying the Fourier transform to ﬁnd fundamental solutions Let p(D) be a diﬀerential operator with constant coeﬃcients, E ∈ S (Rn ) its ˜ ∈ S (Rn ) satisﬁes fundamental solution. The Fourier transform E(ξ) (5.33)

˜ = 1. p(ξ)E(ξ)

˜ = 1/p(ξ) on the open set {ξ : p(ξ) = 0}. If p(ξ) = 0 everywhere Clearly, E(ξ) n ˜ = 1/p(ξ) and apply the inverse Fourier in R , then we should just take E(ξ) transform. (It can be proved, though nontrivially, that in this case 1/p(ξ) is estimated by (1 + |ξ|)N with some N > 0; hence 1/p(ξ) ∈ S (Rnξ )—see, e.g., H¨ ormander [15, Example A.2.7 in Appendix A2, vol. II].) When p(ξ) has zeros we should regularize 1/p(ξ) in neighborhoods of zeros of p(ξ) so as ˜ to get a distribution E(ξ) equal to 1/p(ξ) for p(ξ) = 0 and satisfying (5.33). ˜ = 1 ˜ = v.p. 1 or E(ξ) For instance, for n = 1 and P (ξ) = ξ we can set E(ξ) ξ ξ±i0 ˜ = v.p. 1 + Cδ(ξ), where C is an arbitrary constant. or, in general, E(ξ) ξ In the general case, we must somehow give meaning to integrals of the form

ϕ(ξ) dξ, ϕ ∈ S(Rn ), p(ξ) n R

5.10. Liouville’s theorem

121

to get a distribution with the needed properties. It can often be done, e.g., 1 is by entering the complex domain (the construction of the distribution ξ±i0 an example). Sometimes 1/p(ξ) is locally integrable. Then it is again natural ˜ to assume E(ξ) = 1/p(ξ). For instance, for p(D) = Δ, i.e., p(ξ) = −ξ 2 , we ˜ can set E(ξ) = −1/ξ 2 for n ≥ 3. Since, as is easy to verify, the Fourier transform and its inverse preserve the spherical symmetry and transform a distribution of homogeneity degree α into a distribution of homogeneity degree −α−n, then, clearly, F −1 ( |ξ|12 ) = C|x|2−n , where C = 0. Therefore, we get the fundamental solution En (x) described above. For n = 2 the function |ξ|12 is not integrable in a neighborhood of 0. Here a regularization is required. For instance, we can set

ϕ(ξ) − χ(ξ)ϕ(0) ˜ dξ, E2 (ξ), ϕ(ξ) = − |ξ|2 R2 where χ(ξ) ∈ C0∞ (R2 ), χ(ξ) = 1 in a neighborhood of 0. It can be veriﬁed 1 ln |x| + C, where C depends on the choice of χ(ξ). that E2 (x) = 2π

5.10. Liouville’s theorem Theorem 5.24. Assume that the symbol p(ξ) of an operator p(D) vanishes (for ξ ∈ Rn ) only at ξ = 0. If u ∈ S (Rn ) and p(D)u = 0, then u(x) is a polynomial in x. In particular, if p(D)u = 0, where u(x) is a bounded measurable function on Rn , then u = const. Remarks. a) The inclusion u ∈ S (Rn ) holds, for instance, if u ∈ C(Rn ) and |u(x)| ≤ C(1 + |x|)N , x ∈ Rn , for some C and N . b) If p(ξ0 ) = 0 for some ξ0 = 0, then p(D)u = 0 has a bounded nonconstant solution eiξ0 ·x . Therefore, p(ξ) = 0 for ξ = 0 is a necessary condition for the validity of the theorem. c) If p(ξ) = 0 for all ξ ∈ Rn , then p(D)u = 0 for u ∈ S (Rn ) implies, as we will see from the proof of the theorem, that u ≡ 0. Proof of Theorem 5.24. Applying the Fourier transform we get p(ξ)˜ u(ξ) = 0. Since p(ξ) = 0 for ξ = 0, then, clearly, the support of u ˜(ξ) is equal to {0} (provided u ≡ 0). We give more detail: if ϕ(ξ) ∈ D(Rn \{0}), then ϕ(ξ) n p(ξ) ∈ D(R \{0}) implying ϕ(ξ) ϕ(ξ) = p(ξ)˜ u(ξ), = 0. ˜ u(ξ), ϕ(ξ) = u ˜(ξ), p(ξ) p(ξ) p(ξ)

122

5. Convolution and Fourier transform

Thus, supp u ˜(ξ) ⊂ {0}. Therefore, aα δ (α) (ξ), u ˜(ξ) =

aα ∈ C,

|α|≤p

implying u(x) =

c α xα ,

cα ∈ C,

|α|≤p

as required.

Example. If u(x) is a harmonic function everywhere on Rn and |u(x)| ≤ M , then u = const (this statement is often called Liouville’s theorem for harmonic functions). If u(x) is a harmonic function in Rn and |u(x)| ≤ C(1 + |x|)N , then u(x) is a polynomial. The same holds for solutions of Δm u = 0. Later we will prove a more precise statement: If u is harmonic and bounded below (or above), e.g., u(x) ≥ −C, x ∈ Rn , then u = const (see, e.g., Petrovskii [24]). This statement fails already for the equation Δ2 u = 0 satisﬁed by the nonnegative polynomial u(x) = |x|2 . Let us give one more example of an application of Liouville’s theorem. Let for n ≥ 3 a function u(x) be deﬁned and harmonic in the exterior of a ball, i.e., for |x| > R and u(x) → 0 as |x| → +∞. We wish to describe in detail the behavior of u(x) as |x| → ∞. To this end we may proceed, for example, as follows. Take a function χ(x) ∈ C ∞ (Rn ) such that 0 for |x| ≤ R + 1, χ(x) = 1 for |x| ≥ R + 2. Now, consider the following function u ˆ(x) ∈ C ∞ (Rn ): χ(x)u(x) for |x| > R, u ˆ(x) = 0 for |x| ≤ R. Clearly, Δˆ u(x) = f (x) ∈ C0∞ (Rn ). Let us verify that (5.34)

u ˆ = En ∗ f,

where En is the standard fundamental solution of the Laplace operator. Clearly, (En ∗ f )(x) → 0 as |x| → ∞ and, by assumption, the same holds for u(x) (and, hence, for u ˆ(x), since u ˆ(x) = u(x) for |x| > R + 2). Moreover, Δ(En ∗ f ) = ΔEn ∗ f = f, so Δ(ˆ u − En ∗ f ) = 0 everywhere on Rn . But, by Liouville’s theorem, this implies that u ˆ − En ∗ f = 0, as required.

5.11. Problems

123

Formula (5.34) can be rewritten in the form

u(x) = En (x − y)f (y)dy, |x| > R + 2. |y|≤R+2

The explicit form of En now implies |u(x)| ≤

C , |x|n−2

|x| > R + 2.

The above argument is also applicable in a more general situation. For harmonic functions more detailed information on the behavior of u(x) as |x| → ∞ can also be obtained with the help of the Kelvin transformation x 2−n u . u(x) → v(x) = |x| |x|2 It is easy to verify that the Kelvin transformation preserves harmonicity. For n = 2 the Kelvin transformation is the change of variables induced by the inversion x → x/|x|2 . If u(x) is deﬁned for large |x|, then v(x) is deﬁned in a punctured neighborhood of 0, i.e., in Ω\{0}, where Ω is a neighborhood of 0. If u(x) → 0 as |x| → +∞, then v(x)|x|n−2 → 0 as |x| → 0, and by the removable singularity theorem, v(x) can be extended to a function which is harmonic everywhere in Ω. (We will denote the extension by the same letter v.) Observe that the Kelvin transformation is its own inverse (or, as they say, is an involution); i.e., we may express u in terms of v via the same formula u(y) = |y|2−n v( |y|y 2 ). Expanding v(x) into Taylor series at x = 0 we see that u(y) has a convergent expansion into a series of the form yα cα 2|α| u(y) = |y|2−n |y| α in a neighborhood of inﬁnity. In particular, u(y) = c0 |y|2−n + O(|y|1−n ) as |y| → +∞.

5.11. Problems 5.1. Find the Fourier transforms of the following distributions: a)

1 x+i0 ;

b) δ(r − r0 ), r = |x|, x ∈ R3 ; c)

sin(r0 |x|) , |x|

x ∈ R3 ;

d) xk θ(x), k ∈ Z+ .

124

5. Convolution and Fourier transform

5.2. Find the inverse Fourier transforms of the following distributions: a) 1/|ξ|2 , ξ ∈ Rn , n ≥ 3; ξ ∈ R3 , k > 0;

b)

1 , |ξ|2 +k2

c)

1 , |ξ|2 −k2 ±i0

ξ ∈ R3 , k > 0.

5.3. Using the result of Problem 5.2 b), ﬁnd the fundamental solution for the operator −Δ + k 2 in R3 . 5.4. We will say that the convolution f ∗ g of two distributions f, g ∈ D (Rn ) is deﬁned if for any sequence χk ∈ D(Rn ) such that χk (x) = 1 for |x| ≤ k, k = 1, 2, . . ., the limit f ∗ g = lim f ∗ (χk g) exists in D (Rn ) and k→∞

does not depend on the choice of the sequence χk . Prove that if f, g ∈ D (R) and the supports of f and g belong to [0, +∞), then f ∗ g is deﬁned and addition and convolution deﬁne the structure of an associative commutative ¯ + ) of distributions with supports in [0, +∞). ring with unit on the set D (R 5.5. Prove that δ (k) (x) ∗ f (x) = f (k) (x) for f ∈ D (R). eixt x+i0 t→−∞

5.6. Find the limits lim

eixt x+i0 t→+∞

and lim

in D (R1 ).

¯ + ) deﬁne an integration operator setting If = x f (ξ)dξ. 5.7. For f ∈ L1loc (R 0 ¯ + ). Prove that If = f ∗ θ and I can be extended by continuity to D (R 5.8. Using the associativity of the convolution, prove that & ' 1 xm−1 θ(x) ∗ f, I mf = (m − 1)! ¯ + ). where m = 1, 2, . . ., for f ∈ D (R ¯ + ) and, in 5.9. Set xλ+ = θ(x)xλ for Re λ > −1, so that xλ+ ∈ L1loc (R λ λ ¯ + ). Deﬁne the fractional integral I by the formula particular, x+ ∈ D (R I λf =

λ−1 x+ ∗ f, Re λ > 0. Γ(λ)

Prove that I λ I μ = I λ+μ , Re λ > 0, Re μ > 0. 5.10. For any λ ∈ C deﬁne the fractional integral I λ by k d λ ¯ + ), I λ+k f for f ∈ D (R I f= dx

5.11. Problems

125

where k ∈ Z+ is such that Re λ + k > 0. Prove that I λ is well-deﬁned; i.e., the result does not depend on the choice of k ∈ Z+ . Prove that I λ I μ = I λ+μ for any λ, μ ∈ C. Prove that I 0 f = f and I −k f = f (k) for k ∈ Z+ and any ¯ + ). So I −λ f for λ > 0 can be considered a fractional derivative (of f ∈ D (R order λ) of the distribution f .

10.1090/gsm/205/06

Chapter 6

Harmonic functions

6.1. Mean-value theorems for harmonic functions A harmonic function in an open set Ω ⊂ Rn (n ≥ 1) is a (real-valued or complex-valued) function u ∈ C 2 (Ω) satisfying the Laplace equation (6.1)

Δu = 0,

where Δ is the standard Laplacian on Rn . As we already know from Chapter 5, any harmonic function is in fact in C ∞ (Ω) and even real-analytic in Ω. This is true even if we a priori assume only that u ∈ D (Ω), i.e., if u is a distribution in Ω. For n = 1 the harmonicity of a function on an interval (a, b) means that this function is linear. Since all statements discussed here are true but usually trivial for n = 1, we can concentrate on the case n ≥ 2. In this chapter we will discuss classical properties of harmonic functions and related topics. We will start with a property of harmonic functions that is called the mean value theorem, or Gauss’s law of the arithmetic mean. Theorem 6.1 (Mean value theorem for spheres). Let B = Br (x) be an open ball in Rn with the center at x and radius r > 0, let S = Sr (x) be its ¯ be the boundary (the sphere with the same center and radius), and let B ¯ = B ∪ S). Let u be a continuous function on B ¯ such that closure of B (so B its restriction to B is harmonic. Then the average of u over S equals u(x) (the value of u at the center ). In other words, (6.2)

u(x) =

1 voln−1 (S)

u(y)dSy = S

1 σn−1 rn−1

u(y)dSy , Sr (x)

127

128

6. Harmonic functions

where dSy means the standard area element on the sphere S and σn−1 is the (n − 1)-volume of the unit sphere. Proof. Let us rewrite the average of u in (6.2) as an average over the unit sphere. To this end, denote z = r−1 (y − x) and make the change of variables y → z in the last integral. We get then

1 1 u(y)dSy = u(x + rz)dSz . σn−1 rn−1 Sr (x) σn−1 |z|=1 Now, if we replace r by a variable ρ, 0 ≤ ρ ≤ r, then the integral becomes a function I = I(ρ) on [0, r], which is continuous due to the uniform continuity ¯ Clearly, I(0) = u(x). Our theorem will be proved if we show of u on B. that I(ρ) = const, or, equivalently, that I (ρ) ≡ 0 on [0, r). To this end, let us calculate the derivative I (ρ) by diﬀerentiating under the integral sign (which is clearly possible):

d ∂u 1 1 u(x + ρz)dSz = I (ρ) = (x + ρz)dSz , σn−1 |z|=1 dρ σn−1 |z|=1 ∂ n ¯ρ where n ¯ ρ is the outgoing unit normal to the sphere Sρ (x), at the point x+ρz. We will see that the last integral vanishes if we apply the following corollary of Green’s formula (4.32), which is valid for any bounded domain with a C 2 boundary

∂u (6.3) dS Δu dx = ¯ Ω ∂Ω ∂ n (take v ≡ 1 in (4.32)). For our purposes we should take Ω = Bρ (x). Since u is harmonic, we come to the desired conclusion that I (ρ) = 0. The sphere in Theorem 6.1 can be replaced by a ball: Theorem 6.2 (Mean value theorem for balls). In the same notation as in ¯ = B ¯r (x) such that its Theorem 6.1, let u be a continuous function on B restriction to B is harmonic. Then the average of u over B equals u(x). In other words,

n 1 u(y)dy = u(y)dy, (6.4) u(x) = vol(B) B σn−1 rn Br (x) where dy means the standard volume element (Lebesgue measure) on Rn . Proof. Let ρ denote the distance from x, considered as a function on Br (x), as in the proof of Theorem 6.1. Since | grad ρ| = 1, we can replace integration over Br (x) by two subsequent integrations: ﬁrst integrate over the spheres

6.2. Maximum principle

129

Sρ (x) for every ρ ∈ [0, r], and then integrate with respect to the radii ρ (no additional Jacobian type factor is needed). Using Theorem 6.1, we obtain

r

r u(y)dy = u(y)dSρ (y)dρ = u(x)voln−1 (Sρ (x))dρ Br (x)

0

Sρ (x)

0

= u(x) vol(Br (x)), which is what we need.

Remark 6.3. Theorem 6.1 shows one way that a local property can imply a global property: (6.1) implies the mean value property (6.2). It suggests that we can try to connect the desired global property with a trivial one by a path (in our case, the path is parametrized by the radius ρ ∈ [0, r]) and expand the validity of the desired property along this path, starting from a trivial case (ρ = 0 in our case), relying on the local information when moving by an inﬁnitesimal step with respect to the parameter. (The proof of Theorem 6.2 does not belong to this type, because it simply deduces a global result from another global one.)

6.2. The maximum principle In this section we will often assume for simplicity of formulation that Ω is an open connected subset in Rn . (In case Ω is not connected, we can then apply the results on each connected component separately.) Recall that connectedness of an open set Ω ⊂ Rn can be deﬁned by one of the following equivalent conditions: (i) Ω cannot be written as a union of two disjoint nonempty open subsets. (ii) If F ⊂ Ω is nonempty open and at the same time closed (in Ω), then F = Ω. (iii) Every two points x, y ∈ Ω can be connected by a continuous path in Ω (that is, there exists a continuous map γ : [0, 1] → Ω, such that γ(0) = x and γ(1) = y). (For the proofs see, e.g., Armstrong [1, Sect. 3.5].) The maximum principle for harmonic functions can be formulated as follows: Theorem 6.4 (Maximum principle for harmonic functions). Let u be a realvalued harmonic function in a connected open set Ω ⊂ Rn . Then u cannot have a global maximum in Ω unless it is constant. In other words, if x0 ∈ Ω and u(x0 ) ≥ u(x) for all x ∈ Ω, then u(x) = u(x0 ) for all x ∈ Ω. Proof. Let x0 ∈ Ω be a point of global maximum for u and let B = Br (x0 ) ¯ ⊂ Ω. (This is possible be a ball in Rn (with the center at x0 ) such that B

130

6. Harmonic functions

if the radius r is suﬃciently small.) Now, applying (6.4) with x = x0 , we obtain

1 1 0 = u(x0 ) − u(y)dy = (u(x0 ) − u(y))dy. vol(B) B vol(B) B Since u(x0 ) − u(y) ≥ 0 for all y ∈ Ω, we conclude that u(y) = u(x0 ) for all y ∈ B. Let us consider the set M = {y ∈ Ω | u(y) = u(x0 )}. As we just observed, for every point y ∈ M there exist a ball B centered at y, such that B ⊂ M . This means that M is open. But it is obviously closed too. It is nonempty because it contains x0 . Since Ω is connected, we conclude that M = Ω which means that u(y) = u(x0 ) = const for all y ∈ Ω. Corollary 6.5. Let u be a real-valued harmonic function in a connected open set Ω ⊂ Rn . Then u cannot have a global minimum in Ω unless it is constant. Proof. Applying Theorem 6.4 to −u, we obtain the desired result.

Corollary 6.6. Let u be a complex-valued harmonic function in a connected open set Ω ⊂ Rn . Then |u| cannot have a global maximum in Ω unless u = const. Proof. Assume that x0 ∈ Ω and |u(x0 )| ≥ |u(x)| for all x ∈ Ω. If |u(x0 )| = 0, then u ≡ 0 and the result is obvious. Otherwise, write u(x0 ) = |u(x0 )|eiϕ0 , ˜ = e−iϕ0 u, which is also 0 ≤ ϕ0 < 2π, and consider a new function u harmonic. Denote the real and imaginary parts of u ˜ by v, w, so that u ˜ = v + iw, where v and w are real-valued harmonic functions on Ω. Since u ˜(x0 ) = |u(x0 )| is real valued, w(x0 ) = 0 and so for all x ∈ Ω we have u(x)| = v 2 (x) + w2 (x) ≥ v(x). (6.5) v(x0 ) = |u(x0 )| ≥ |u(x)| = |˜ Applying Theorem 6.4 to v, we see that v(x) = v(x0 ) = const for all x ∈ Ω, and all the inequalities in (6.5) become equalities. In particular, we have v(x0 ) = v 2 (x0 ) + w2 (x); hence w ≡ 0, which implies the desired result. For bounded domains Ω we can take into account what happens near the boundary to arrive at other important facts. Theorem 6.7. 1) Let u be a real-valued harmonic function in a connected open bounded Ω ⊂ Rn such that u can be extended to a continuous function (still denoted by u) on the closure Ω. Then for any x ∈ Ω (6.6)

min u ≤ u(x) ≤ max u, ∂Ω

where ∂Ω = Ω \ Ω is the boundary of Ω.

∂Ω

6.3. Dirichlet’s problem

131

2) Let u be a complex-valued harmonic function on a connected open bounded Ω that can be extended to a continuous function on Ω. Then for any x ∈ Ω (6.7)

|u(x)| ≤ max |u|. ∂Ω

3) The inequalities in (6.6) and (6.7) become strict for x in those connected components of Ω where u is not constant. Proof. Since Ω is compact, u should attain its maximum and minimum somewhere on Ω. But due to Theorem 6.4 and Corollary 6.5 this may happen only on the boundary or in the connected components of Ω where u is constant. The desired results for a real-valued u immediately follow. For complex-valued u the same argument works if we use Corollary 6.6 instead of Theorem 6.4 and Corollary 6.5. Corollary 6.8. Let Ω be an open bounded set in Rn and let u, v be harmonic functions that can be extended to continuous functions on Ω such that u = v on ∂Ω. Then u ≡ v everywhere in Ω. Proof. Taking w = u − v, we see that w is harmonic and w = 0 on ∂Ω. Now it follows from (6.7) that w ≡ 0 in Ω.

6.3. Dirichlet’s boundary value problem For a given open set Ω and ϕ ∈ C(∂Ω), the following boundary value problem is called Dirichlet’s problem for the Laplace equation: Δu(x) = 0, x ∈ Ω, (6.8) u|∂Ω = ϕ(x), x ∈ ∂Ω. We are interested in a classical solution u, that is, u ∈ C 2 (Ω) ∩ C(Ω). Corollary 6.8 shows that for any bounded Ω and any continuous ϕ, there may be at most one solution of this problem; i.e., we have uniqueness for (6.8). It is natural to ask whether such a solution exists for a given Ω and ϕ ∈ C(∂Ω). This proves to be true if Ω is bounded and ∂Ω is smooth or piecewise smooth. But, in the general case, the complete answer, characterizing open sets Ω for which the Dirichlet problem is solvable for any continuous ϕ, is highly nontrivial. The answer was given by Norbert Wiener, and it is formulated in terms of capacity, the notion which appeared in electrostatics and was brought by Wiener to mathematics in the 1920s. The whole story is beyond the scope of this course and can be found in books on potential theory

132

6. Harmonic functions

(see, e.g., Wermer [39]). On the other hand, it occurs that in some weaker sense the problem is always solvable. The appropriate terminology and necessary techniques (functional analysis, Sobolev spaces) will be discussed in Chapters 8 and 9. It is quite natural to require the existence and uniqueness of a solution of a mathematical problem that appears as a model of a real-life situation. (For example, if it is not unique, then we should additionally investigate which one corresponds to reality.) But this is not enough. Another reasonable requirement is the well-posedness of the problem, which means that the solution continuously depends on the data of the problem, where the continuity is understood with respect to some natural topologies in the spaces of data and solutions. The well-posedness of Dirichlet’s problem in the sup-norm immediately follows from the maximum principle. More precisely, we have Corollary 6.9. Let Ω be an open bounded domain in Rn . Let u1 , u2 be harmonic functions in Ω that can be extended to continuous functions on Ω, and let ϕj = uj |∂Ω , j = 1, 2. Then (6.9)

max |u1 − u2 | ≤ max |ϕ1 − ϕ2 |. Ω

∂Ω

Proof. The result immediately follows from (6.7) in Theorem 6.7.

Now assume that {uk | k = 1, 2, . . . } is a sequence of harmonic functions in Ω, continuously extendable to Ω, whose restrictions to ∂Ω (call them ϕk ) converge uniformly to ϕ on ∂Ω as k → ∞. Then {ϕk } is a Cauchy sequence in the Banach space C(∂Ω) with the sup-norm. It follows from (6.9) that {uk } is a Cauchy sequence in C(Ω). Hence, uk → u ∈ C(Ω) with respect to this norm. It is easy to see that the limit function u is also harmonic. Indeed, the uniform convergence in Ω obviously implies convergence uk → u as distributions (that is, in D (Ω)); hence Δuk → Δu. But Δuk ≡ 0 in Ω, so Δu ≡ 0 in Ω; i.e., u is harmonic. The Laplace equation and the Dirichlet problem appear in electrostatics as the equation and boundary value problem for a potential in a domain Ω which is free of electric charges. It is also natural to consider a modiﬁcation which allows charges, including point charges and also charges distributed in some volume or on a surface in Ω. To this end we should replace the Laplace equation by the Poisson equation Δu = f (see (4.41)), where f is an appropriate distribution. This leads to Dirichlet’s problem for the Poisson equation: Δu(x) = f (x), x ∈ Ω, (6.10) x ∈ ∂Ω. u|∂Ω = ϕ(x),

6.4. Hadamard’s example

133

To make this precise, we consider u as a distribution in Ω that is in fact a continuous function in a neighborhood U in Ω of the boundary ∂Ω. We can also take f ∈ D (Ω) and ϕ ∈ C(∂Ω). Then we can extend the result of Corollary 6.8 to the problem (6.10): Corollary 6.10. Assume that u, v ∈ D (Ω) are solutions of the boundary value problem (6.10) with the same f and ϕ. Then u = v in Ω. Proof. Clearly, w = u − v is a distribution in Ω, satisfying Δw = 0. Therefore, by Theorem 5.11, w is a harmonic function in Ω. By our assumptions, w is continuous up to the boundary and w = 0 on ∂Ω. Hence, w ≡ 0 by the maximum principle.

6.4. Hadamard’s example In this section we encounter an example, due to Jacques Hadamard, which shows that there exist uniquely solvable problems which are not well-posed with respect to natural norms; i.e., small perturbations of the data may lead to uncontrollably large perturbations of the solution. Let us consider the initial value problem for the Laplacian in a twodimensional strip ΠT = {(x, y) ∈ R2 | x ∈ R, 0 ≤ y < T }: Δu = 0 in ΠT , (6.11) u(x, 0) = ϕ(x), ∂u ∂y (x, 0) = ψ(x) for all x ∈ R, where we assume that u ∈ C 2 (ΠT ), ϕ ∈ C 2 (R), and ψ ∈ C 1 (R). A solution of this problem may not exist (see Problem 6.5), but if it exists, then it is unique. To prove the uniqueness we will need the following, in which we write coordinates for Rn as x = (x , xn ), where x ∈ Rn−1 and xn ∈ R. Lemma 6.11. Let Ω be an open set in Rn and assume that a function u ∈ C 1 (Ω) is harmonic in the open sets Ω± = {x ∈ Ω| ± xn > 0}, which are the portions of Ω lying, respectively, above and below the hyperplane xn = 0. Then u is in fact a harmonic function in Ω. Proof. It suﬃces to prove that Δu = 0 in the sense of distributions, i.e., that for any test function Φ ∈ C0∞ (Ω) we have

u ΔΦ dx = 0. (6.12) Ω

Note that it is suﬃcient to prove this equality locally, i.e., for test functions Φ supported in a small neighborhood of any chosen point x∗ ∈ Ω. Since this is obvious for points x∗ ∈ Ω± , we may assume that x∗ lies on the hyperplane

134

6. Harmonic functions

xn = 0 and that Ω is a small ball centered at such a point. Then, due to Green’s formula (4.32), we can write

∂Φ ∂u u dx , u ΔΦ dx = −Φ ∂xn ∂xn Ω− Ω∩{xn =0} where we used that Δu = 0 in Ω− . Similarly,

∂u ∂Φ u ΔΦ dx = − −Φ u dx , ∂x ∂x + n n Ω Ω∩{xn =0} where the minus sign in the right-hand side appears due to the fact that the direction of the xn -axis is outward for Ω− and inward for Ω+ . We also used the continuity of u and ∂u/∂xn which implies that these quantities are the same from both sides of {xn = 0}. Adding these equalities leads to (6.12), which ends the proof. Now let us turn to the proof of the uniqueness in (6.11). For two solutions u1 , u2 , their diﬀerence u = u1 −u2 is a solution of (6.11) with ϕ ≡ ψ ≡ 0. By reﬂection as an odd function in y, extend u to a function u ˜ on the doubled strip ˜ T = {(x, y) ∈ R2 | x ∈ R, y ∈ (−T, T )}. Π (In detail, deﬁne u ˜(x, y) = −u(x, −y) for −T < y < 0.) Due to the initial ˜ T ). Also, u ˜ is harmonic in the conditions on u, we clearly have u ˜ ∈ C 1 (Π ˜ ˜ T except at the xdivided strip ΠT \ {(x, 0)| x ∈ R}, i.e., everywhere in Π ˜ T . Therefore, axis. It follows from Lemma 6.11 that u ˜ is harmonic in Π ˜ it is real-analytic in ΠT by Theorem 5.14 (or Corollary 5.15). But it is also easy to check that the equation Δ˜ u = 0 and the initial conditions ˜/∂y|y=0 = 0 imply, by induction in |α|, that ∂ α u ˜|y=0 = 0 for u ˜|y=0 = ∂ u every multi-index α = (α1 , α2 ). Hence, using the Taylor expansion of u ˜ at ˜ T , which proves the desired the point x = y = 0, we conclude that u ˜ ≡ 0 in Π uniqueness. Remark 6.12. The uniqueness of C m -solutions for any mth-order equation with real-analytic coeﬃcients and initial conditions on any noncharacteristic hypersurface is the content of a Holmgren theorem (see, e.g., John [16, Sect. 3.5]). This theorem is deduced, by a duality argument, from the existence of a solution for an adjoint problem in real-analytic functions, which is the content of the Cauchy-Kovalevsky theorem (see Theorem 5.14). In the next chapter we will explain Holmgren’s method in detail but for the heat equation. Now we can proceed to the Hadamard example. Let us try to ﬁnd complex exponential solutions of the Laplace equation Δu(x, y) = 0, that is, solutions of the form u(x, y) = ekx+ly ,

6.4. Hadamard’s example

135

where k, l ∈ C. Substituting this function into the Laplace equation, we immediately see that it is a solution if and only if k 2 + l2 = 0. For example, we can take k = iω, l = ω for a real ω, to obtain uω (x, y) = eiωx eωy . We can also take the real part of this solution, which is vω (x, y) = eωy cos ωx, to obtain a real-valued solution of Δu = 0. Let us try to understand whether we can use the C k -norm on functions ϕ ∈ C k (R), i.e., (6.13) ϕ(k) = sup |∂ α ϕ(x)|, |α|≤k

x∈R

for the initial data of the problem (6.11) and investigate whether a value of the solution u at a point can be made small if the C k -norms of the initial data are small. Equivalently, we can ask whether a small deviation of ϕ, ψ from the equilibrium values ϕ ≡ ψ ≡ 0 necessarily leads to small perturbation of the solution u. More precisely, ﬁxing an arbitrarily large k and choosing, for example, a point (0, σ) ∈ ΠT , where σ > 0, we would like to know the answer to the following question: is it true that, for any ε > 0, there exists δ > 0 such that the inequality ϕ(k) + ψ(k) < δ implies that |u(0, σ)| < ε? The answer to this question is emphatically negative. Indeed, for u = vω with ω > 1, we obtain ϕ(x) = vω (x, 0) = cos ωx and ψ(x) = ∂y vω (x, 0) = ω cos ωx, so ϕ(k) + ψ(k) ≤ (k + 1)ω k + (k + 1)ω k+1 ≤ 2(k + 1)ω k+1 . However, vω is much bigger near y = 0 if ω is large. In fact, if we ﬁx σ > 0 and instead take the solution v˜ω = e−ωσ/2 vω with the corresponding initial data ˜ ϕ(x) ˜ = e−ωσ/2 cos ωx, ψ(x) = e−ωσ/2 ω cos ωx, we see that v˜ω (0, σ) = eωσ/2 grows exponentially as ω → +∞, whereas ϕω (k) + ψω (k) ≤ 2(k + 1)ω k+1 e−ωσ/2 decays exponentially as ω → +∞. In particular, now the negative answer to the question becomes obvious: the value of u at a point cannot be controlled by the C k -norms of the initial data. As we have seen in Chapter 2, the situation is quite opposite for the two-dimensional wave equation: there the values of u are estimated by the sup-norms of the initial data. In fact, this holds for hyperbolic equations in higher dimensions too. We will see this for the wave equation in Chapter 10.

136

6. Harmonic functions

6.5. Green’s function for the Laplacian In Section 3.4 we introduced the Green’s function for the Sturm-Liouville operator L on [0, l] with vanishing boundary conditions at the ends 0 and l, and we used it to deﬁne the inverse operator L−1 (cf. (3.21)). The properties found in Section 3.4 imply that G(x, ξ) is a distribution solution of the following boundary value problem involving the Dirac distribution δ(x − ξ): Lx G(x, ξ) = δ(x − ξ), x, ξ ∈ (0, l), (6.14) G(0, ξ) = 0 = G(l, ξ), ξ ∈ [0, l]. In the present section we deﬁne a similar multidimensional object G(x, ξ) for the Dirichlet problem (6.8) in a bounded open set Ω ⊂ Rn with a C 2 -smooth boundary. The Green’s function will be useful in solving the following boundary value problem, called the Poisson problem: Δu(x) = f (x), x ∈ Ω, (6.15) u(x) = 0, x ∈ ∂Ω. The Poisson problem is analogous to the Dirichlet problem (6.8) and is uniquely solvable in appropriate classes of functions and domains. For example, if Ω is an open bounded domain such that ∂Ω is a C 2 -hypersurface in ¯ then we will show in subsequent sections of this chapter Rn and f ∈ C 1 (Ω), ¯ that we can solve (6.15) with u ∈ C 2 (Ω) ∩ C(Ω). We use the fundamental solution En = En (x) of the Laplacian Δ in Rn . Recall that En is spherically symmetric (with the center at 0), and if n ≥ 3, then En (x) → 0 as |x| → ∞. For an open bounded domain Ω, we will construct the Green’s function G = G(x, y), for x, y ∈ Ω, x = y, in the form (6.16)

G(x, y) := En (x − y) + g(x, y),

where g = g(x, y) is a function for all x, y ∈ Ω called the remainder. It is required to satisfy the following properties: i) For ﬁxed y ∈ Ω, Δx g(x, y) = 0 for all x ∈ Ω. ¯ and g(x, y) = ii) For ﬁxed y ∈ Ω, g(x, y) extends continuously to x ∈ Ω −En (x − y) for x ∈ ∂Ω. This means that, for each y ∈ Ω, uy (x) = G(x, y) is the solution of (6.15) with f (x) replaced by δ(x − y); i.e., Δx G(x, y) = δ(x − y), x, y ∈ Ω, (6.17) G(x, y) = 0, x ∈ ∂Ω, y ∈ Ω.

6.5. Green’s function for the Laplacian

137

As follows from Corollary 6.10, if for some y ∈ Ω such a distribution G(·, y) exists, then it is unique. Concerning the existence, we have Lemma 6.13. Assume that, for any ϕ ∈ C(∂Ω), the Dirichlet problem (6.8) has a solution. Then the Green’s function G(x, y) for Ω exists. Proof. Let us ﬁx y ∈ Ω and start with a particular solution u(x) of the equation Δu(x) = δ(x − y), which is u(x) = En (x − y), where En (x) is the fundamental solution of the Laplacian Δ, given by (4.38) and (4.39). Then the remainder (6.18)

g(x, y) := G(x, y) − En (x − y)

should be harmonic with respect to x and satisfy the boundary condition g(x, y) = −En (x − y) for x ∈ ∂Ω. This means that g(·, y) is a solution of the Dirichlet problem (6.8) with the boundary condition ϕ(x) = −En (x − y). Due to our assumption about the solvability of Dirichlet’s problem in Ω, such a function g exists, which ends the proof. Remark 6.14. In fact, for the existence of the Green’s function G(x, y), we need to solve (6.8) not for all ϕ ∈ C(∂Ω), but only for speciﬁc functions of the form ϕy = −En (· − y)|∂Ω , for ﬁxed y ∈ Ω. The preceeding remark allows that the Green’s function may have additional regularity near ∂Ω that is important for obtaining certain properties. Hence we usually impose the following condition on Ω: Condition G. Ω is a bounded domain with C 1 boundary and there exists a (unique) Green’s function G(x, y) whose remainder function g(x, y) has the ¯ property that for any y ∈ Ω, the function uy (x) = g(x, y) is in C 2 (Ω). Remark 6.15. The signiﬁcance of Condition G is that it allows us to apply ¯ For Green’s formula (see (6.19) below) to uy and an arbitrary v ∈ C 2 (Ω). 1 2 ¯ by this reason, we may relax the conditions ∂Ω ∈ C and uy ∈ C (Ω) 1 2 1 ¯ allowing ∂Ω to be piecewise C and uy ∈ C (Ω) ∩ C (Ω). In the next section we will give a regularity condition on ∂Ω that guarantees that Condition G holds. We now obtain the ﬁrst result that follows from Condition G. Proposition 6.16. Let Ω be a domain that satisﬁes Condition G. Then G is symmetric, that is, G(x, y) = G(y, x) for all x, y ∈ Ω, x = y. Proof. Let us recall Green’s formula (4.32):

∂u ∂v −v dS, (uΔv − vΔu)dx = u (6.19) ∂n ¯ ∂n ¯ Ω ∂Ω

138

6. Harmonic functions

¯ Let us try to apply this to the functions u = G(·, y) where u, v ∈ C 2 (Ω). and v = G(·, z), for ﬁxed y, z ∈ Ω, y = z. Due to the boundary conditions on G, we have u|∂Ω = v|∂Ω = 0; hence the right-hand side vanishes. We also have Δu = δ(· − y), Δv = δ(· − z), so the left-hand side equals u(z) − v(y) = G(z, y) − G(y, z) if the integrals there are understood in the sense of distributions. Therefore, if Green’s formula can be extended to such u, v, then we will get the desired symmetry, G(z, y) = G(y, z). Now the only reason that (6.19) cannot be directly applied to our u and v is that they have singularities at the points y and z, respectively. So a ¯ where natural approach is to approximate u, v by functions uε , vε ∈ C 2 (Ω), ε ∈ (0, ε0 ), so that the desired formula results in the limit as ε ↓ 0. To this end, let us take an arbitrary function χ ∈ C ∞ (Rn ), such that χ = 0 in B1/2 (0), χ = 1 in Rn \B1 (0), 0 ≤ χ(x) ≤ 1 for all x ∈ Rn . Then take χε (x) = χ(ε−1 x). Finally, deﬁne uε (x) = χε (x − y)u(x) and vε (x) = χε (x − z)u(x). ¯ε (y) and B ¯ε (z) are disjoint We choose ε0 > 0 so that the closed balls B 0 0 subsets in Ω (that is, ε0 < |y − z|/2, and ε0 is strictly less than the distances from y and z to ∂Ω). Clearly, |uε (x)| ≤ |u(x)| and |vε (x)| ≤ |v(x)| for all ¯ \ Bε (y), and vε = v in Ω ¯ \ Bε (z). In particular, ¯ Also, uε = u in Ω x ∈ Ω. these equalities hold in a neighborhood of ∂Ω. Note that u and v are locally integrable near their singularities, hence integrable in Ω. By the Lebesgue dominated convergence theorem uε → u, vε → v, as ε ↓ 0, in L1 (Ω), hence, in D (Ω). It follows that Δuε → Δu and Δvε → Δv in D (Ω). Since uε = u ∈ C ∞ in a neighborhood of supp Δvε (which is a subset of Bε0 (z)), we have

uε Δvε dx = Δvε , u → Δv, u = δ(· − z), u = u(z) = G(z, y). ¯ Ω

Here the angle brackets ·, · stand for the duality between E (Ω1 ) and C ∞ (Ω1 ), where Ω1 is an appropriate open subset in Ω. Similarly, we obtain

vε Δuε dx = Δuε , v → Δu, v = δ(· − y), v = v(y) = G(y, z). ¯ Ω

We conclude G(z, y) = G(y, z), as desired.

Remark 6.17. Under Condition G, the remainder function g(x, y) is also symmetric; i.e., g(x, y) = g(y, x) for all x, y ∈ Ω. Remark 6.18. The symmetry of the Green’s function can be interpreted as the selfadjointness of an operator in L2 (Ω) which is inverse to the Laplacian with Dirichlet’s boundary condition. The boundary condition deﬁnes an appropriate domain of the Laplacian, and Green’s function becomes the Schwartz kernel of this operator. We will discuss this approach in later

6.5. Green’s function for the Laplacian

139

chapters. It requires the powerful technique of Sobolev spaces, which will be discussed in Chapter 8. It is noticeable that this technique does not require any restrictions on Ω except its boundedness. Now we show that the Green’s function on a domain Ω satisfying Condition G can be used to obtain an integral representation for any C 2 -function ¯ let vy (x) = G(·, y) for ﬁxed ¯ Starting with an arbitrary u(x) ∈ C 2 (Ω), on Ω. y ∈ Ω. Then Δvy (x) = Δx G(x, y) = δ(x − y) and it is clear from the argument in the proof of Proposition 6.16 that we may plug u, v into (6.19) to obtain

∂G(x, y) G(x, y)Δu(x)dx = u(x) dSx u(y) − ∂n ¯x Ω ∂Ω (where we have used G(x, y) = 0 for x ∈ ∂Ω). Interchanging the roles of x and y and using the symmetry of G(x, y), we have proved the following: Proposition 6.19. Assume that Ω satisﬁes Condition G. Then for every ¯ the following integral representation holds: u ∈ C 2 (Ω)

∂G(x, y) G(x, y)Δu(y)dy + u(y)dSy , x ∈ Ω. (6.20) u(x) = ∂n ¯y Ω ∂Ω ¯ is the (unique) solution of the Dirichlet problem In particular, if u ∈ C 2 (Ω) (6.10), then

∂G(x, y) G(x, y)f (y)dy + ϕ(y)dSy , x ∈ Ω. (6.21) u(x) = ∂n ¯y Ω ∂Ω This proposition gives an explicit representation of the solution for the Dirichlet problem in terms of the Green’s function. Two important particular cases are obtained when ϕ ≡ 0 or f ≡ 0: if ϕ ≡ 0, then

G(x, y)f (y)dy, x ∈ Ω; (6.22) u(x) = Ω

if f ≡ 0, then (6.23)

u(x) = ∂Ω

∂G(x, y) ϕ(y)dSy , x ∈ Ω. ∂n ¯y

The equation (6.23) is called the Poisson integral formula. ¯ be harmonic in Corollary 6.20. Let Ω satisfy Condition G and u ∈ C 2 (Ω) Ω. Then u(x) for x ∈ Ω is given by (6.23), where ϕ = u|∂Ω . Remark 6.21. Formula (6.21) gives the solution of the Dirichlet boundary value problem for the Poisson equation (6.10) if such a solution exists; but it may happen that such solution does not exist! Of course, similar statements apply to the formula (6.22) with regard to (6.15), and (6.23) with regard to (6.8). Note also that the right-hand side of the formula (6.23) is nothing else but the double layer potential (see Chapter 5).

140

6. Harmonic functions

Let us end this section with a discussion of the physical meaning in electrostatics of the formulas (6.21)–(6.23). (For the simplest notions of electrostatics, see physics textbooks, e.g., [8, Vol. 2, Chapters 4–8] and also Chapters 4 and 5 in this book.) The situation to describe is a unit charge ¯ which is bounded by a thin closed conductive shell ∂Ω. sitting in a body Ω, Let us explain what these words mean mathematically. We are interested in the electric ﬁeld in Ω which contains only a single unit charge at y. The fact that ∂Ω is a thin shell made of a conductive material (e.g., a metal) means that no energy is needed to transport a small charge along ∂Ω. Since the energy needed to transport a small charge along a curve is equal to the diﬀerence in the potentials at the ends of the curve, this means that the potential u is locally constant on the boundary ∂Ω; i.e., it is constant on each of the connected components of ∂Ω. If ∂Ω is connected, then u is constant on ∂Ω, and by adding a constant to u we can achieve the boundary condition u|∂Ω = 0. Now let us consider Green’s function G(x, y) for Ω. For ﬁxed y ∈ Ω, ¯ of a unit charge u(x) = G(x, y) is interpreted as the potential at x ∈ Ω located at y. The boundary condition G(x, y) = 0 for x ∈ ∂Ω reﬂects the fact that Ω is bounded by a thin conductive shell ∂Ω. For a system of charges in Ω, the potentials add up by the superposition principle, and for a distributed charge f (x), the potential of this charge becomes an integral, as in (6.22). If the boundary is not connected, then the condition that the boundary is conductive means that the potential u is constant on each of the connected components of the boundary. By adding an appropriate solution of the Dirichlet problem (6.8), we can force these constants to vanish; hence we obtain an adjusted potential u ˜ which again satisﬁes the boundary condition ˜(x), we again get the Green’s function which u ˜|∂Ω = 0. If we let G(x, y) = u satisﬁes G(x, y) = 0 for x ∈ ∂Ω.

6.6. H¨ older regularity Now we show that a bounded domain Ω satisﬁes Condition G when ∂Ω is suﬃciently regular. This boundary regularity can be expressed more precisely using H¨ older spaces, which also allow a more precise description of the smoothness of solutions of the Dirichlet problem for the Poisson equation (6.10) in terms of the smoothness of f and φ. ¯ k = 0, 1, 2, . . . , introduce the norm First, for functions u ∈ C k (Ω), (6.24)

u(k) =

|α|≤k

sup |∂ α u(x)|, x∈Ω

6.6. H¨older regularity

141

¯ and the right-hand side makes sense (i.e., if it deﬁned whenever u ∈ C k (Ω) is ﬁnite). (We already used this norm on R: see (6.13).) With this norm, ¯ becomes a Banach space. C k (Ω) ¯ introduce a seminorm Now for any γ ∈ (0, 1) and any function u ∈ C(Ω) |u|(γ) =

sup x,y∈Ω, x=y

|u(x) − u(y)| . |x − y|γ

If this seminorm is ﬁnite, then we will say that this function is a H¨ older function with the exponent γ. (If the same is true with γ = 1, then one says that u is a Lipschitz function, but we will rarely use Lipschitz functions in this book.) We will denote the space of all H¨ older functions with the γ ¯ ( Ω). Note that 0 < γ < γ < 1 implies exponent γ by C 1 2 ¯ = C 0 (Ω) ¯ ⊃ C γ1 (Ω) ¯ ⊃ C γ2 (Ω) ¯ ⊃ C 1 (Ω). ¯ C(Ω) ¯ Similarly, for any integer k ≥ 0 and γ ∈ (0, 1) deﬁne the space C k,γ (Ω), k α γ ¯ such that ∂ u ∈ C (Ω) ¯ for all multiconsisting of functions u ∈ C (Ω) indices α with |α| = k. It is a Banach space with the norm

uk,γ = u(k) +

|∂ α u|(γ) .

{α: |α|=k}

¯ ⊃ C k,γ (Ω) ¯ ⊃ C k+1 (Ω). ¯ Note that C k (Ω) The C k,γ -norm can be used to measure the smoothness of the boundary as well. Namely, we will say that Ω has a C k,γ boundary and write ∂Ω ∈ C k,γ if ∂Ω ∈ C k and the functions h(x ), which enter into local representations of the boundary as the graph xn = h(x ) (see Deﬁnition 4.8), are in fact in C k,γ on every compact subset of the domain where they are deﬁned. ¯ and the C k,γ If k ≥ 1, then it is easy to see that the space C k,γ (Ω) k,γ smoothness of ∂Ω are invariant under C diﬀeomorphisms. (This follows from the statements formulated in Problem 6.6.) In fact, on the C k,γ boundary ∂Ω itself, for k ≥ 1, there exists a structure of a C k,γ manifold, which is given by the sets of local coordinates related to each other in the usual way, by C k,γ diﬀeomorphisms between two domains in Rn−1 , which are images ¯ Such local of the intersection of the two coordinate neighborhoods in Ω. coordinates can be taken to be x for the point (x , h(x )) in the local graph presentation of ∂Ω. Let us return to (6.10), the Dirichlet problem for the Poisson equation. Here is a remarkable theorem by J. Schauder, which gives precise solvability and regularity for the problem (6.10) in terms of the H¨ older spaces C k,γ .

142

6. Harmonic functions

Theorem 6.22. Assume that k ≥ 0 is an integer, γ ∈ (0, 1), ∂Ω ∈ C k+2,γ , ¯ and ϕ ∈ C k+2,γ (∂Ω). f ∈ C k,γ (Ω), ¯ of the (a) (Existence and uniqueness) A solution u ∈ C 2 (Ω) ∩ C(Ω) boundary value problem (6.10) exists and is unique. ¯ (b) (Regularity) In fact, the unique solution u belongs to C k+2,γ (Ω). (c) (Isomorphism operator) The operator ¯ → C k,γ (Ω) ¯ × C k+2,γ (∂Ω), (6.25) A : C k+2,γ (Ω)

Au = {Δu, u|∂Ω },

is a linear topological isomorphism (that is, a bounded linear operator which has a bounded inverse). ¯ and ϕ ∈ C ∞ (∂Ω). Corollary 6.23. Assume that ∂Ω ∈ C ∞ , f ∈ C ∞ (Ω), 2 ¯ ¯ Then (6.10) has a unique solution u ∈ C (Ω)∩C(Ω) and in fact u ∈ C ∞ (Ω). Moreover, the operator ¯ → C ∞ (Ω) ¯ × C ∞ (∂Ω), Au = {Δu, u|∂Ω }, (6.26) A : C ∞ (Ω) is a linear topological isomorphism of Fr´echet spaces. Proof. The statement immediately follows from Theorem 6.22 if we note ¯ is the intersection of all spaces that for any ﬁxed γ ∈ (0, 1), the space C ∞ (Ω) k,γ ¯ can be deﬁned C (Ω), k = 0, 1, 2, . . . , and the Fr´echet topology in C ∞ (Ω) by norms · k,γ . Remark 6.24. I will not provide a proof of Theorem 6.22. The interested reader can ﬁnd complete proofs in the monographs by Ladyzhenskaya and Ural’tseva [19, Ch. III] and Gilbarg and Trudinger [9, Ch. 6]. In both books a substantial extension of this result to general second-order elliptic operators is proved. (It is also due to Schauder.) ¯ All three statements of Theorem 6.22 fail in the usual classes C k (Ω) even locally. Namely, the equation Δu = f with f ∈ C(Ω) may fail to have a solution u ∈ C 2 (Ω1 ) where Ω1 is a proper subdomain of Ω. Even ¯ does not imply u ∈ C 2 (Ω). However, it is possible to deduce f ∈ C(Ω) weaker existence and regularity statements in terms of the usual C k spaces. Here is the regularity result: Corollary 6.25. Assume that k ≥ 1 is an integer and Ω is a bounded ¯ be a solution domain in Rn such that ∂Ω ∈ C k+2 . Let u ∈ C 2 (Ω) ∩ C(Ω) k k+2 ¯ (∂Ω). Then u ∈ of (6.10) with f = Δu ∈ C (Ω) and ϕ = u|∂Ω ∈ C ¯ for all γ ∈ (0, 1). In particular, if ∂Ω ∈ C k+2 , u ∈ C(Ω) ¯ is C k+1,γ (Ω) k+2 k+1 ¯ (∂Ω), then u ∈ C (Ω). harmonic in Ω, and ϕ = u|∂Ω ∈ C Proof. Choose γ ∈ (0, 1), and note that the given conditions imply the in¯ and ϕ ∈ C k+2,γ (∂Ω). Then Theorem clusions ∂Ω ∈ C k+1,γ , f ∈ C k−1,γ (Ω), ¯ hence u ∈ C k+1 (Ω). ¯ 6.22 implies that u ∈ C k+1,γ (Ω);

6.7. Explicit formulas for Green’s functions

143

For the existence result for C k spaces, see Problem 6.8. Now let us return to the existence and smoothness of the Green’s function for a given domain, i.e., Condition G. Theorem 6.26. If Ω is a bounded domain in Rn with ∂Ω ∈ C 2,γ for some γ ∈ (0, 1), then Ω satisﬁes Condition G. Consequently, the following statements hold: (a) The Green’s function satisﬁes G(x, y) = G(y, x) for all x, y ∈ Ω, x = y. ¯ × Ω). ¯ (b) The remainder function satisﬁes g(x, y) ∈ C 2 (Ω ¯ × Ω\diag( ¯ ¯ Ω)). ¯ (c) The Green’s function satisﬁes G ∈ C 2 (Ω Ω, Proof. For any ﬁxed y ∈ Ω, observe that ϕ(x) := −En (x − y) is a smooth function on ∂Ω, so by Theorem 6.22 we can ﬁnd g = g(x, y) such that ¯ ⊂ C 2 (Ω). ¯ This shows that Ω satisﬁes Condition G, and g(·, y) ∈ C 2,γ (Ω) hence property (a) holds by Proposition 6.16. But (a) means that regularity in x implies regularity in y, so properties (b) and (c) follow as well.

6.7. Explicit formulas for Green’s functions The cases where the Green’s function can be found explicitly are rare. Nevertheless, they are worth learning because they show what you can expect in the general case. We will construct Green’s functions using the method of images (sometimes called the method of reﬂections). We are usually interested in the Green’s function for a bounded domain, but let us ﬁrst consider the case of a half-space, since the method of images is simplest there. In fact, we assume Ω is the upper half-space: (6.27)

Ω = Rn+ := {x = (x1 , . . . , xn ) : xn > 0}.

To construct the Green’s function, we recall (6.16) to write G(x, y) = En (x − y) + g(x, y) for x, y ∈ Rn+ , x = y, where g(x, y) is harmonic in x ∈ Rn+ and satisﬁes g(x, y) = −En (x − y) for x ∈ ∂ Rn+ = Rn−1 . But such a function is g(x, y) := −En (x − y ∗ ), where (6.28)

y ∗ = (y1 , . . . , yn−1 , −yn )

is the image of y = (y1 , . . . , yn ) as reﬂected through the boundary Rn−1 . Notice that g(x, y) is harmonic in x ∈ Rn+ since y ∗ ∈ Rn+ , and g(x, y) = −E(x − y) for x ∈ ∂ Rn+ since then |x − y| = |x − y ∗ |. We have: The Green’s function for Rn+ : (6.29)

G(x, y) = En (x − y) − En (x − y ∗ )

for x, y ∈ Rn+ , x = y.

144

6. Harmonic functions

Before we turn to constructing the Green’s function on other domains, let us observe that for Ω = Rn+ the correction term g(x, y) = −En (x − y ∗ ) is just the potential with charge −1 at the point y ∗ , which lies outside of Rn+ . Let us use this idea for a bounded domain Ω; i.e., for each y ∈ Ω let (6.30)

G(x, y) = En (x − y) + q ∗ (y)En (x − y ∗ (y)) for x, y ∈ Ω, x = y,

¯ is its where q ∗ = q ∗ (y) ∈ R is the correcting charge and y ∗ = y ∗ (y) ∈ Rn \Ω location. For this to be the Green’s function for Ω, we need (6.31)

En (x − y) + q ∗ (y)En (x − y ∗ (y)) = 0

for x ∈ ∂Ω, y ∈ Ω.

Thus we can calculate G(x, y) by ﬁnding y ∗ (y) and q ∗ (y) so that (6.31) holds. We will see that such functions exist only for very special domains Ω, which we will completely describe (for n ≥ 3). Now assume n ≥ 3. The explicit form of En allows us to rewrite (6.31) in the form (6.32)

|x − y|2−n + q ∗ |x − y ∗ |2−n = 0,

¯ x ∈ ∂Ω, y ∈ Ω, y ∗ ∈ Rn \Ω.

We can equivalently write (since x = y) (6.33)

|x − y ∗ |2 = (−q ∗ )2/(n−2) , |x − y|2

¯ x ∈ ∂Ω, y ∈ Ω, y ∗ ∈ Rn \Ω,

which implies, in particular, that the left-hand side is independent of x ∈ ∂Ω. (Note that (6.32) implies q ∗ < 0.) If we denote Q∗ > 0 to be (6.34)

Q∗ = Q∗ (q ∗ ) = (−q ∗ )−2/(n−2),

then the equation in (6.33) is equivalent to the quadratic equation (6.35)

|x − y|2 − Q∗ |x − y ∗ |2 = 0.

Proposition 6.27. Assume n ≥ 3. For ﬁxed y, y ∗ ∈ Rn with y = y ∗ and Q∗ > 0, the set of points x ∈ Rn satisfying (6.35) is a) a hyperplane if Q∗ = 1 and b) a geometric sphere if Q∗ = 1. Proof. Denoting the standard scalar product of vectors x, y ∈ Rn by x · y, (6.35) implies 0 =Q∗ (|x|2 − 2x · y ∗ + |y ∗ |2 ) − (|x|2 − 2x · y + |y|2 ) =(Q∗ − 1)|x|2 − 2x · (Q∗ y ∗ − y) + (Q∗ |y ∗ |2 − |y|2 ). If Q∗ = 1, we obtain −2x · (y ∗ − y) + (y ∗ + y) · (y ∗ − y) = 0, or y∗ + y · (y ∗ − y) = 0, (6.36) x− 2

6.7. Explicit formulas for Green’s functions

145

which is the equation of the hyperplane passing through the point (y ∗ +y 2 )/2 and perpendicular to the vector y ∗ −y. If Q∗ = 1, then completing the square gives ' & 2x · (Q∗ y ∗ − y) |Q∗ y ∗ − y|2 ∗ 2 0 =(Q − 1) |x| − + Q∗ − 1 (Q∗ − 1)2 |Q∗ y ∗ − y|2 − (Q∗ |y ∗ |2 − |y|2 )(Q∗ − 1) Q∗ − 1 Q∗ y ∗ − y 2 Q∗ ∗ |y ∗ − y|2 , − =(Q − 1) x − Q∗ − 1 Q∗ − 1 −

or, equivalently, (6.37)

2 ∗ ∗ Q∗ x − Q y − y = |y ∗ − y|2 . Q∗ − 1 (Q∗ − 1)2

This means that our quadric is a sphere SR (c) with center c ∈ Rn and radius R > 0 given by the formulas (6.38)

c=

Q∗ y ∗ − y , Q∗ − 1

R=

(Q∗ )1/2 ∗ |y − y|. |Q∗ − 1|

Since the equation (6.35) was obtained by ﬁnding where G(x, y) vanishes, we see that constructing a Green’s function in the form of (6.30) will only work when Ω is a hyperplane or a sphere. Moreover, we can obtain additional conclusions about the possible relationships between y and y ∗ in these cases. Solving the ﬁrst equation in (6.38) with respect to y ∗ and y, we get 1 1 ∗ (6.39) y = ∗ y + 1 − ∗ c, y = Q∗ y ∗ + (1 − Q∗ )c. Q Q It follows that c, y, and y ∗ are all on the same line in Rn . In fact, if Q∗ −1 > 0 (respectively, Q∗ −1 < 0) , then they follow in the order c, y ∗ , y (respectively, c, y, y ∗ ). Substituting the expression for y ∗ from (6.39) into (6.38) and then solving the resulting equation with respect to Q∗ , we easily ﬁnd that |y − c|2 . R2 Now, using the ﬁrst equation of (6.38), we obtain

(6.40)

(6.41)

Q∗ =

y∗ − c =

y−c R2 · , |y − c| |y − c|

which implies that y ∗ − c and y − c are proportional and |y − c| · |y ∗ − c| = R2 . If (6.41) is considered as a map ∗ : Rn \{c} → Rn \{c},

given by y → y ∗ ,

146

6. Harmonic functions

then it is involutive or an involution, which means that ∗2 = Id (the identity map), with SR (c) being its ﬁxed point set. The particular involution ∗ : y → y ∗ of Rn \{c} given by (6.41) is called the reﬂection or inversion with respect to the sphere SR (c). Let us summarize the arguments above in the following theorem. Theorem 6.28. Let n ≥ 3 and let Ω be a bounded domain with a C ∞ boundary ∂Ω in Rn . Assume that for every y ∈ Ω and every q > 0 there ¯ and a number q ∗ < 0, such that the potential u of exist a point y ∗ ∈ Rn \Ω the system of two charges 1, q ∗ at the points y, y ∗ , respectively, has zero level set u−1 (0) = {x ∈ Rn : u(x) = 0}, which coincides with ∂Ω. Then the following statements hold: (a) Ω is a ball BR (c) with the center c ∈ Ω and radius R > 0. (b) The quantities R, c are unique and are given by the formulas (6.38), where Q∗ is given by (6.34). (c) The points c, y, y ∗ all lie on a straight line and satisfy (6.39). Making the calculations in the opposite direction, we can conclude that the following proposition holds: Proposition 6.29. Let n ≥ 3. For every sphere SR (c) and every point y ∈ Rn \SR (c), there exist a unique point y ∗ ∈ Rn and a unique number q ∗ ∈ R \{0}, such that the potential u of the system of two charged point particles at y, y ∗ with the charges 1, q ∗ , respectively, has zero level set which coincides with the sphere SR (c). Remark 6.30. In the limit R → +∞, with the appropriate choice of the center c = c(R), we can obtain a limit relation SR (c(R)) → S∞ , where S∞ is a hyperplane in Rn and the convergence is in the C 2 -topology of hypersurfaces (deﬁned by C 2 -topologies on functions representing S∞ as graphs of C 2 -functions in local coordinates). Then, for any ﬁxed y we get ∗ , where y ∗ is the mirror the reﬂection y ∗ = y ∗ (R), such that y ∗ (R) → y∞ ∞ image of y with respect to the limit hyperplane S∞ . (See more about this below in this section.) Since we now know that the method of images will succeed in producing the Green’s function for an open ball Ω = BR (0) in the form (6.42)

G(x, y) = En (x − y) + q ∗ (y)En (x − y ∗ (y)),

let us describe this formula explicitly. Note that the formulas we obtain work for n = 2 as well as n ≥ 3, and we have taken the center of the ball to

6.8. Problems

147

be c = 0 for convenience. In this case, the involution (6.41) is just y∗ =

R2 y , |y|2

for y ∈ Rn \{0} and extended to the points y = 0 and y = ∞ by continuity. Combining (6.33), (6.34), and (6.40), we obtain R |x − y ∗ | = |x − y| |y|

for |x| = R, |y| < R,

which we can easily check holds also for n = 2. So for |x| = R, |y| < R we have

⎧ ⎨ 1 log |y| |x − y ∗ | for n = 2, 2π R

n−2 En (x − y) = ⎩ R En (x − y ∗ ) for n ≥ 3. |y| But the expressions on the right are harmonic for all |x| < R, so we obtain: The Green’s function for the ball BR (0):

⎧ ⎨E2 (x − y) − 1 log |y| |x − y ∗ | for n = 2, 2π R

n−2 (6.43) G(x, y) = ⎩En (x − y) − R En (x − y ∗ ) for n ≥ 3. |y| Notice that for n = 2, G(x, y) is not quite of the form (6.30) but rather G(x, y) = E2 (x − y) − E2 (|x − y ∗ |) + p(y) where p(y) = −(2π)−1 log(|y|/R).

6.8. Problems 6.1. Prove the statements inverse to the ones of Theorems 6.1, 6.2: if a C 2 -function u in an open set Ω ⊂ Rn has mean value property for all balls ¯ ⊂ Ω or for all spheres which are boundaries of such balls (that B with B is, either (6.2) holds for such spheres or (6.4) holds for all such balls), then u is harmonic in Ω. Moreover, it suﬃces to take balls or spheres, centered at every point x with suﬃciently small radii r ∈ (0, ε(x)), where ε(x) > 0 continuously depends upon x ∈ Ω. 6.2. Let u be a real-valued C 2 -function on an open ball B = Br (x) ⊂ Rn , ¯ and let Δu ≥ 0 which extends to a continuous function on the closure B, everywhere in B. Prove that

1 1 u(y)dSy = u(y)dSy (6.44) u(x) ≤ voln−1 (S) S σn−1 rn−1 Sr (x) and (6.45)

1 u(x) ≤ vol(B)

u(y)dy = B

n

σn−1 rn

where the notation is the same as in Section 6.1.

u(y)dy, Br (x)

148

6. Harmonic functions

6.3. Prove the inverse statement to the one in the previous problem: if ¯ ⊂ Ω (or for all spheres u ∈ C 2 (Ω) is real valued and for all balls B with B which are boundaries of such balls), (6.45) (respectively, (6.44)) holds, then Δu ≥ 0 in Ω. Moreover, it suﬃces to take balls or spheres, centered at every point x with suﬃciently small radii r ∈ (0, ε(x)), where ε(x) > 0 continuously depends upon x ∈ Ω. Remark 6.31. A real-valued function u ∈ C 2 (Ω) is called subharmonic if Δu ≥ 0 in Ω. Note, however, that the general notion of subharmonicity does not require that u ∈ C 2 and allows more general functions u (or even distributions, in which case Δu ≥ 0 means that Δu is a positive Radon measure). 6.4. Let u be a real-valued subharmonic function in a connected open set Ω ⊂ Rn . Prove that u cannot have local maxima in Ω, unless it is constant. In other words, if x0 ∈ Ω and u(x0 ) ≥ u(x) for all x in a neighborhood of x0 , then u(x) = u(x0 ) for all x ∈ Ω. 6.5. Consider the initial value problem (6.11) with ϕ ∈ C0∞ (R) and ψ ≡ 0. Assume that it has a solution u. Prove that in this case ϕ ≡ 0 on R and u ≡ 0 on ΠT . 6.6. Let U, V, W be open sets in Rm , Rn , Rp , respectively, and let g : U → V , f : V → W be maps which are locally C k,γ (that is, they belong to C k,γ on any relatively compact subdomain of the corresponding domain of deﬁnition). (a) Assume that k ≥ 1. Then prove that the composition f ◦ g : U → W is also locally C k,γ . (b) Prove that the statement (a) above does not hold for k = 0. (c) Prove that if k ≥ 1, Ω is a domain in Rn with ∂Ω ∈ C k,γ , U is a ¯ in Rn , and ψ : U → Rn is a C k,γ diﬀeomorphism of U neighborhood of Ω onto a domain V ⊂ Rn , then the image Ω = ψ(Ω) is a domain with a C k,γ boundary, Ω ⊂ V . 6.7. (a) Prove that if Ω is a bounded domain in Rn and ∂Ω ∈ C k,γ with k ≥ 1, then ∂Ω is a C k,γ manifold, dimR ∂Ω = n − 1. (b) Prove that for any nonnegative integers k, k and numbers

γ, γ ∈ (0, 1), such that k +γ ≥ k +γ , k ≥ 1, the class C k ,γ is locally invari ant under C k,γ diﬀeomorphisms. In particular, function spaces C k ,γ (∂Ω) are well-deﬁned. (c) Under the same assumptions as in (a) and (b), for a real-valued function ϕ : ∂Ω → R, the inclusion ϕ ∈ C k ,γ (∂Ω) is equivalent to the ˆ ∂Ω = ϕ. existence of an extension ϕˆ ∈ C k ,γ (Ω), such that ϕ|

6.8. Problems

149

6.8. Assume that k ≥ 1 is an integer, Ω is a bounded domain in Rn , such ¯ and ϕ ∈ C k+2 (∂Ω). Using Schauder’s theorem, that ∂Ω ∈ C k+2 , f ∈ C k (Ω), ¯ Theorem 6.22, prove that the problem (6.10) has a solution u ∈ C k+1 (Ω). 6.9. Assume that ∂Ω ∈ C 2 and there exists Green’s function G = G(x, y) of ¯ Consider an integral representation Ω, such that G ∈ C 2 for x = y, x, y ∈ Ω. 2 ¯ ¯ and of a function u ∈ C (Ω) given by the formula (6.21), with f ∈ C(Ω) ϕ ∈ C(∂Ω). Prove that the pair {f, ϕ}, representing a ﬁxed function u ∈ ¯ by (6.21), is unique. C 2 (Ω) 6.10. Provide detailed proofs of identities (6.39), (6.40), and (6.41). 6.11. Deduce the expression for the Green’s function in a half-space of Rn , n ≥ 3, from symmetry considerations.

10.1090/gsm/205/07

Chapter 7

The heat equation

7.1. Physical meaning of the heat equation The heat equation is (7.1)

∂u = a2 Δu, ∂t

where u = u(t, x), t ∈ R, x ∈ Rn , a > 0, and Δ is the Laplace operator with respect to x. This equation is an example of a parabolic type equation. It describes, for instance, the temperature distribution of a homogeneous and isotropic medium (in which case u(t, x) is the temperature at the point x at the time t). This equation is also satisﬁed by the density of a diﬀusing matter, e.g., the density of Brownian particles when there are suﬃciently many of them so that we can talk about their density and consider the evolution of this density as a continuous process. This is a reason why the equation (7.1) is often called the diﬀusion equation. Deriving of the heat equation is based on a consideration of the energy balance. More precisely, we will deal with the transmission of the heat energy, which is the kinetic energy of the atoms’ and molecules’ chaotic motion in the gases and liquids or the energy of the atoms’ vibrations in metals, alloys, and other solid materials. Instead of “heat energy” we will use the word “heat”, which is shorter and historically customary in the books on thermodynamics. The heat equation for temperature u(t, x) is derived from the following natural empirically veriﬁed physical assumptions: 1) The heat Q needed to warm up a part of matter with the mass m from the temperature u1 to u2 is proportional to m and u2 − u1 : Q = cm(u2 − u1 ). 151

152

7. The heat equation

The coeﬃcient c is called the speciﬁc heat (per unit of mass and unit of temperature). 2) Fourier’s law: The quantity of heat ΔQ passing through a small surface element with the area S during the time Δt is proportional to S, Δt, and ∂∂un¯ (the rate of growth of u in the direction n ¯ of the unit normal vector to the surface). More precisely, ∂u Δt, ∂n ¯ where k > 0 is the heat conductivity coeﬃcient characterizing heat properties of the media; the minus sign indicates that heat is moving in the direction opposite to that of the growth of the temperature. ΔQ = −kS

One should keep in mind that both hypotheses are only approximations, and so is the equation (7.1) which they imply. As we will see below, a consequence of equation (7.1) is an inﬁnite velocity for propagation of heat, which is physically absurd. In the majority of applied problems, however, these hypotheses and equation (7.1) are suﬃciently justiﬁed. A more precise model of heat distribution should take into account the molecular structure of the matter leading to problems of statistical physics far more diﬃcult than the solution of the model equation (7.1). Note, however, that, regardless of its physical meaning, equation (7.1) and its analogues play an important role in mathematics. They are used, e.g., in the study of elliptic equations. Let us derive the heat equation (for n = 3). We use the energy conservation law (heat conservation in our case) in a volume Ω ⊂ R3 . Let us assume that Ω has a smooth boundary ∂Ω. The rate of change of the heat energy of the matter in Ω is obviously equal to

d dQ = cρu dV, dt dt Ω where ρ is the volume density of the matter (mass per unit volume) and dV is the volume element in R3 . Assuming that c = const and ρ = const, we get

∂u dQ = cρ dV. dt Ω ∂t Now, let P be the rate of the heat escape through ∂Ω. Clearly,

∂u k dS, P =− ¯ ∂Ω ∂ n where n ¯ is the outward unit normal to ∂Ω , dS the area element of ∂Ω. Here in the right-hand side we have the ﬂow of the vector −k grad u through ∂Ω.

7.2. Boundary value problems

153

By the divergence formula (4.34) we get

P = − div(k grad u)dV Ω

or, for k = const,

P = −k

Δu dV. Ω

The heat energy conservation law in the absence of heat sources means that dQ = −P. dt Substituting the above expressions for dQ dt and P , we get

∂u cρ dV = kΔu dV, ∂t Ω Ω which, since Ω is arbitrary, results in (7.1) with a2 =

k cρ .

7.2. Boundary value problems for the heat and Laplace equations A physical interpretation of the heat equation suggests a natural setting for boundary value problems for this equation. The Cauchy problem (sometimes also called the initial value problem) is the simplest one: ∂u 2 t > 0, x ∈ Rn , ∂t = a Δu, (7.2) u|t=0 = ϕ(x) x ∈ Rn . Its physical meaning is to determine the temperature of the medium at all moments t > 0 provided the temperature distribution is known at t = 0. Sometimes we are only interested in the temperature in a domain Ω ⊂ Rn regardless of what takes place outside of it. Then, on ∂Ω, additional conditions should be imposed: e.g., the temperature or the heat ﬂow should be given. Thus we come to the two following initial-boundary value problems. The ﬁrst initial-boundary value problem (or Cauchy-Dirichlet’s initialboundary value problem): ⎧ ∂u t > 0, x ∈ Ω, ⎨ ∂t = a2 Δu, (7.3) u|t=0 = ϕ(x), x ∈ Ω, ⎩ u|∂Ω = α(t, x), x ∈ ∂Ω, t > 0. The second boundary value problem (or Cauchy-Neumann’s boundary value problem) is obtained by replacing the latter of conditions (7.3) by ∂u = β(t, x), x ∈ ∂Ω, t > 0, ∂n ¯ ∂Ω

where n ¯ is the outward unit normal to ∂Ω.

154

7. The heat equation

For time-independent boundary conditions it is often interesting to determine temperature behavior as t → +∞. Suppose that a limit v(x) of the solution u(t, x) of the heat equation exists as t → +∞ (we say that the solution “stabilizes” or “tends to an equilibrium”). Then it is natural to expect that v(x) satisﬁes the Laplace equation Δv(x) = 0; i.e., v(x) is a harmonic in x, at least because the only possible ﬁnite limit of ∂u(t, x)/∂t can be 0. It is very easy to make the meaning of this statement precise but we will not do so at the moment in order not to divert our attention. The ﬁrst and second initial-boundary value problems for the heat equation turn in the limit into the following boundary value problems for the Laplace equation: Dirichlet’s problem (7.4)

Δv(x) = 0, x ∈ Ω, v|∂Ω = α(x), x ∈ ∂Ω.

Neumann’s problem: Δv(x) x ∈ Ω, = 0, (7.5) ∂v = β(x), x ∈ ∂Ω. ∂n ¯ ∂Ω In this model Dirichlet’s problem means the following: ﬁnd the steadystate temperature inside Ω when the temperature α(x) is maintained on the boundary. The meaning of Neumann’s problem is to ﬁnd the steady-state temperature inside Ω if the heat ﬂow β(x) on the boundary is given. We already mentioned the Dirichlet problem in the previous chapter; see Corollary 6.8, where the uniqueness of the classical solution was established. Clearly, the solution of Neumann’s problem is not unique: any constant can always be added to the solution. Further, it is clear that the steady-state distribution of heat can exist in Ω only if the total heat ﬂux through the boundary vanishes; i.e.,

β(x)dS = 0. (7.6) ∂Ω

(This also follows from the formula (6.3), which implies that if v is a solution of (7.5), then (7.6) holds.) We will see later that the above problems for the heat and Laplace equations are well-posed in the sense that they have a unique solution (up to adding a constant in the case of the Neumann problem) in natural classes of functions, and, besides, the solutions depend continuously on initial and boundary values for an appropriate choice of spaces. Note that the Dirichlet and Neumann problems for the Laplace equation have many physical interpretations apart from the above ones. For instance,

7.3. A proof that the limit function is harmonic

155

a small vertical deﬂection of a membrane u(t, x1 , x2 ) and a small displacement u(t, x1 , x2 , x3 ) of an elastic solid body satisfy the wave equation ∂ 2u = a2 Δu. ∂t2 In the time-independent case we come again to the Laplace equation and then the Dirichlet problem. For instance, for a membrane this is the problem of recovering the membrane form from that of its boundary. The boundary condition of Neumann’s problem is interpreted as prescribing the boundary value of the vertical component of the force acting on the membrane along its boundary. This topic is related to the famous question by Mark Kac: “Can one hear the shape of a drum?” (1968), which led to a wide development of spectral geometry.

7.3. A proof that the limit function is harmonic There are several ways to justify the harmonicity of the limit function. We will give one of them. Proposition 7.1. Let a solution u(t, x) of the heat equation be deﬁned and bounded for t > 0, x ∈ Ω (for an open subset Ω ⊂ Rn ). Suppose that for almost all x ∈ Ω there exists a limit v(x) = lim u(t, x).

(7.7)

t→+∞

Then v(x) is a harmonic function in Ω. Remark 7.2. The solution u(t, x) may be a priori considered as a bounded measurable function (here the equation (7.1) should be understood in the distributional sense). However we will see below that the heat operator has a fundamental solution which is inﬁnitely diﬀerentiable for |t| + |x| = 0 and, therefore, actually, u(t, x) ∈ C ∞ (R+ × Ω), where R+ = {t : t > 0}. Proof of Proposition 7.1. Clearly, lim u(t + τ, x) = v(x)

τ →+∞

for all t > 0 and almost all x ∈ Ω. If ϕ(t, x) ∈ D(R+ × Ω), then the dominated convergence theorem implies

u(t + τ, x)ϕ(t, x)dtdx = v(x)ϕ(t, x)dtdx, (7.8) lim τ →+∞

or, in other words, lim u(t + τ, x) = v(x) in D (R+ × Ω)

τ →+∞

(here v(x) should be understood as 1 ⊗ v, i.e., as the right-hand side of ∂ − a2 Δx to the equality above, we can (7.8)). Applying the heat operator ∂t

156

7. The heat equation

interchange this operator with the passage to the limit (this can be done in the space of distributions; for example, in our case everything is reduced ∂ − a2 Δx )ϕ(t, x) to replacing ϕ(t, x) in (7.8) by another function (− ∂t ∂ ∈ D(R+ × Ω)). Then ( ∂t − a2 Δx )v(x) = 0; i.e., v(x) is a distributional solution of Δv(x) = 0. But, as we know, v(x) is then a regular harmonic function, as required.

7.4. A solution of the Cauchy problem for the heat equation and Poisson’s integral Let us investigate the Cauchy problem (7.2). First, assume that ϕ(x) ∈ S(Rn ) and u(t, x) ∈ S(Rnx ) for any t ≥ 0, so that u(t, x) is a continuous function of t for t ≥ 0 with values in S(Rnx ). In addition, assume that u(t), considered as a curve in S(Rn ), is diﬀerentiable for t > 0. As we will see below, a solution u(t, x) with the described properties exists (and is unique in a very broad class of solutions). For the time being, note that we may perform the Fourier transform in x and since the Fourier transform is a topological isomorphism F : S(Rnx ) −→ S(Rnξ ), it commutes ∂ . Thus, let with ∂t

u ˜(t, ξ) = e−ix·ξ u(t, x)dx (hereafter, integration is performed over the whole of Rn unless otherwise indicated).

The standard integration by parts gives '

& ∂u ∂ ∂u ˜ 2 2 e−ix·ξ u(t, x)dx = e−ix·ξ − a2 Δx u dx = + a2 |ξ|2 +a |ξ| u ˜. ∂t ∂t ∂t

Therefore (7.1) is equivalent to ∂u ˜(t, ξ) + a2 |ξ|2 u ˜(t, ξ) = 0. ∂t The initial condition takes the form (7.9)

(7.10)

˜ where ϕ˜ = F ϕ. u ˜|t=0 = ϕ(ξ),

Equation (7.9) is an ordinary diﬀerential equation with the independent variable t and with a parameter ξ. It is easy to solve it, and we get with (7.10) 2 2 ˜ u ˜(t, ξ) = e−ta |ξ| ϕ(ξ). From this formula we see that u ˜(t, ξ) is actually a (continuous for t ≥ 0 and ∞ C for t > 0) function of t with values in S(Rnξ ). Therefore, applying the inverse Fourier transform, we get a solution u(t, x) of the Cauchy problem satisfying the above conditions.

7.4. Poisson’s integral

157

Let us derive an explicit formula. We have

2 2 −n ix·ξ −n ˜(t, ξ)dξ = (2π) ˜ u(t, x) = (2π) e u eix·ξ−ta |ξ| ϕ(ξ)dξ

2 2 = (2π)−n ei(x−y)·ξ−ta |ξ| ϕ(y)dydξ. Interchanging the order of integration (by Fubini’s theorem) we get

u(t, x) = Γ(t, x − y)ϕ(y)dy, where (7.11)

Γ(t, x) = (2π)

−n

eix·ξ−ta

2 |ξ|2

dξ

(we have essentially repeated the argument proving that the Fourier transform turns multiplication into convolution). Let us express Γ(t, x) explicitly, computing the integral in (7.11). For this, complete the square in the exponent: ix ix |x|2 2 2 2 2 . · ξ − − ix · ξ − ta |ξ| = −ta ξ · ξ + ix · ξ = −ta ξ − 2ta2 2ta2 4ta2 ix The change of coordinates ξ − 2ta 2 = η, i.e., the shift of the contour of integration over each ξj , yields

√ −n |x|2 |x|2 2 −n − 4ta2 −ta2 |η|2 −n − 4ta2 dη = (2π) e (a t) e e−|ξ| dξ Γ(t, x) = (2π) e √ (we have also made the change of variables ξ = a tη).

Using the formula

e−|ξ| dξ = π n/2 , 2

we ﬁnally get

√ −n |x|2 Γ(t, x) = (2a πt) exp − 2 . 4ta The solution of the Cauchy problem is expressed in the form

|x−y|2 1 − 4a2 t ϕ(y)dy, √ e (7.12) u(t, x) = (2a πt)n called Poisson’s integral. Let us analyze this formula. First, observe that the formula makes sense for a larger class of initial functions ϕ(x) than the class S(Rn ) from which we have started. For instance, it is clear that if ϕ(x) is continuous and (7.13)

2

|ϕ(x)| ≤ Ceb|x| , 1 4a2 t

b > 0,

> b, i.e., for 0 < t < 4a12 b . Here the then the integral (7.12) converges if integral itself and integrals obtained from it after diﬀerentiation with respect

158

7. The heat equation

to t or x any number of times uniformly converge for t ∈ [A, B], |x| ≤ R, where 0 < A < B < 4a12 b . Therefore, we may diﬀerentiate any number of times under the integral sign. The function u(t, x) is a solution of the heat equation (7.1) for the indicated values of t. Indeed, it suﬃces to show that Γ(t, x) is a solution of (7.1) for t > 0. This can be directly veriﬁed, but to avoid this calculation we may also note that, as we already know,

∂ ∂ 2 2 − a Δx u(t, x) = − a Δx Γ(t, x − y)ϕ(y)dy 0= ∂t ∂t for any ϕ ∈ S(Rn ), implying ∂ 2 − a Δx Γ(t, x − y) = 0, ∂t

t > 0.

It is easy to verify that if ϕ is continuous and satisﬁes (7.13) and u is given by (7.12), then u|t=0 = ϕ(x) in the sense that (7.14)

lim u(t, x) = ϕ(x) for any x ∈ Rn .

t→0+

We will even prove this in two ways.

√ First method. Note that Γ(t, x) = ε−n f ( xε ), where ε = 2a t and f (x) = π −n/2 exp(−|x|2 ), so that f (x) ≥ 0, f (x)dx = 1, and ε → 0+ as t → 0+. So we deal with a classical δ-like family of positive functions (in particular, Γ(t, x) → δ(x) in D (Rn ) as t → 0+). Now, (7.14) is veriﬁed via the standard scheme of the proof of δ-likeness; see Example 4.5. We skip the corresponding details; the reader should consider this a recommended exercise. Second method. Suppose we wish to verify (7.14) for x = x0 . Let us decompose ϕ(x) into the sum ϕ(x) = ϕ0 (x) + ϕ1 (x), where ϕ0 (x) is continuous, coincides with ϕ(x) for |x−x0 | ≤ 1, and vanishes for |x−x0 | ≥ 2. Accordingly, ϕ1 (x) is continuous, vanishes for |x − x0 | ≤ 1, and satisﬁes (7.13). Poisson’s integral (7.12) for ϕ splits into the sum of similar integrals for ϕ0 and ϕ1 (we denote them by u0 (t, x) and u1 (t, x)). It suﬃces to verify (7.14) for u0 (t, x) and u1 (t, x) separately. First, let us deal with u1 (t, x). We have

Γ(t, x0 − y)ϕ1 (y)dy. u1 (t, x0 ) = |y−x0 |≥1

Clearly, the integrand tends to zero as t → 0+ (since lim Γ(t, x) = 0 for x = 0) and is majorized for 0 < t ≤ B < function of the form C1 exp(−ε|y|2 ), where ε > convergence theorem lim u1 (t, x0 ) = 0. t→+0

t→+0 1 , by an independent of t 4a2 b 0, C1 > 0. By the dominated

7.4. Poisson’s integral

159

It remains to consider the case of an initial function with a compact support. Here the argument of Example 4.5 will do without any modiﬁcations. There is, however, another method based on approximation by smooth functions with a subsequent passage to the limit. It is based on the following important lemma. Lemma 7.3 (The maximum principle). Let a function u(t, x) be deﬁned by Poisson’s integral (7.12). Then inf ϕ(x) ≤ u(t, x) ≤ sup ϕ(x).

(7.15)

x∈Rn

x∈Rn

If one of these inequalities turns into the equality for some t > 0 and x ∈ Rn , then u = const. Proof. We have

u(t, x) = Γ(t, x − y)ϕ(y)dy ≤ sup ϕ(x) · Γ(t, x − y)dy = sup ϕ(x). x∈Rn

x∈Rn

The second inequality in (7.15) is proved similarly. Since Γ(t, x) > 0 for all t > 0, x ∈ Rn , then the equality is only possible if ϕ(x) = const almost everywhere, and then u = const as required. Remark 7.4. As we will see later, a solution u(t, x) of the Cauchy problem (7.2), satisfying the growth restriction of type (7.13) with respect to x uniformly in t, is unique. It will follow that the solution u(t, x) in fact can be represented by the Poisson integral and hence satisﬁes the maximum principle (7.15). Let us ﬁnish the proof of (7.14). Let ϕ be a continuous function with a compact support; let ϕk , k = 1, 2, . . . , be a sequence of functions from C0∞ (Rn ) such that sup |ϕ(x) − ϕk (x)| → 0 for k → +∞.

x∈Rn

(It is possible to obtain a sequence ϕk , for example, via mollifying; cf. Section 4.2.) Let

uk (t, x) =

Γ(t, x − y)ϕk (y)dy.

Since ϕk ∈ S(Rn ), it follows that lim uk (t, x) = ϕk (x) in the topology of t→+0

S(Rn ) (in particular, uniformly on Rn ). But by Lemma 7.3 sup t>0, x∈Rn

|uk (t, x) − u(t, x)| ≤ sup |ϕk (x) − ϕ(x)|; x∈Rn

this, clearly, implies lim u(t, x) = ϕ(x). Indeed, given ε > 0, we may t→+0

choose k such that sup |ϕk (x) − ϕ(x)| < x∈Rn

ε 3,

and then t0 > 0 such that

160

7. The heat equation

sup |uk (t, x) − ϕk (x)|
0, x ∈ Rn . It is also easily veriﬁed that u(t, x) is a solution of (7.1). The relation lim u(t, ·) = ϕ(·) now holds in t→+0

the sense of weak convergence in S (Rn ). Indeed, it is easy to verify that if ψ(x) ∈ S(Rn ), then

u(t, x)ψ(x)dx = ϕ(·), Γ(t, x − ·)ψ(x)dx

= ϕ(·), Γ(t, x − ·)ψ(x)dx = ϕ(·), v(t, ·), where v(t, x) is the Poisson integral corresponding to the initial function ψ(x) (we justify the possibility to interchange the integration with respect to x and application of the functional ϕ by the convergence of the integral in the topology of S(Rny ); cf. an identical argument in the proof of Proposition 5.3). Since v(t, ·) → ψ(·) as t → +0 in the topology of S(Rn ), it follows that

lim u(t, x)ψ(x)dx = ϕ, ψ, t→+0

implying that lim u(t, ·) = ϕ(·) in S (Rn ).

t→+0

We have already noted above that lim Γ(t, x) = δ(x) in S (Rn ). This t→+0

relation is often written in the shorter form Γ|t=0 = δ(x) and is a particular case of the statement just proved. Finally, let (7.13) be true for any b > 0 (with a constant C > 0 depending on b). Then the Poisson integral and, therefore, solution of the Cauchy problem are deﬁned for any t > 0. This is the case, e.g., if |ϕ(x)| ≤ C exp(b0 |x|2−ε ) for some ε > 0, b0 > 0 and, in particular, if ϕ(x) is bounded.

7.5. The fundamental solution. Duhamel’s formula

161

If ϕ ∈ Lp (Rn ) for p ∈ [1, ∞), then Poisson’s integral converges by H¨older’s inequality. In this case lim u(t, ·) = ϕ(·) with respect to the norm t→+0

in Lp (Rn ) (the proof is similar to the argument proving the convergence of molliﬁers with respect to the Lp -norm; see Section 4.2). It is easy to verify that if ϕ(x) satisﬁes (7.13), then the function u(t, x) determined by Poisson’s integral satisﬁes a similar estimate: 1 2 |u(t, x)| ≤ C1 eb1 |x| , 0 ≤ t ≤ B < 2 , 4a b where the constants C1 , b1 depend on ϕ and B. It turns out that a solution satisfying such an estimate is unique; we will prove this later. At the same time it turns out that the estimate 2+ε

|u(t, x)| ≤ Ceb|x|

does not guarantee the uniqueness for any ε > 0. (For the proof see, e.g., John [16, Ch. 7].) The uniqueness of solution makes it natural to apply Poisson’s integral in order to ﬁnd a meaningful solution (as a rule, solutions of fast growth have no physical meaning). For instance, the uniqueness of a solution takes place in the class of bounded functions. By the maximum principle (Lemma 7.3), the Cauchy problem is well-posed in the class of bounded functions (if ϕ is perturbed by adding a uniformly small function, then the same happens with u(t, x)). In conclusion notice that the fact that Γ(t, x) > 0 for all t > 0, x ∈ Rn implies that “the speed of the heat propagation is inﬁnite” (since the initial perturbation is supported at the origin!). But Γ(t, x) decreases very fast as |x| → +∞, which is practically equivalent to a ﬁnite speed of the heat propagation.

7.5. The fundamental solution for the heat operator. Duhamel’s formula Theorem 7.5. The following locally integrable function √ −n |x|2 E(t, x) = θ(t)Γ(t, x) = (2a πt) θ(t) exp − 2 4a t is a fundamental solution for function.

∂ ∂t

− a2 Δx in Rn+1 . Here θ(t) is the Heaviside

Proof. First, let us give a heuristic explanation for why E(t, x) is a fundamental solution. We saw that Γ(t, x) is a continuous function in t with values in S (Rnx ) for t ≥ 0. The function E(t, x) = θ(t)Γ(t, x) is not continuous as a function of t with values in S (Rnx ) but has a jump at t = 0 equal

162

7. The heat equation

∂ to δ(x). Therefore, applying ∂t , we get δ(t) ⊗ δ(x) + f (t, x), where f (t, x) is continuous in t in the similar sense. But then ∂ 2 − a Δx E(t, x) = δ(t) ⊗ δ(x) = δ(t, x), ∂t since the summand which is continuous with respect to t vanishes because ∂ ( ∂t − a2 Δx )E(t, x) = 0 for |t| + |x| = 0. This important argument makes it possible to write down the fundamental solution in all cases when we can write the solution of the Cauchy problem in a Poisson integral-like form. We will not bother with its justiﬁcation, however, since it is simpler to verify directly that E(t, x) is a fundamental solution.

We begin with the veriﬁcation of local integrability of E(t, x). Clearly, E(t, x) ∈ C ∞ (Rn+1 \{0}), so we only have to verify local integrability in a neighborhood of the origin. Since 1 exp − ≤ Cα τ α for τ > 0, for any α ≥ 0, τ we have t α −n/2 = Cα t−n/2+α |x|−2α . (7.16) |E(t, x)| ≤ Cα t |x|2 Let α ∈ ( n2 − 1, n2 ) so that − n2 + α > −1 and −2α > −n. It is clear from (7.16) that E(t, x) is majorized by an integrable function in a neighborhood of the origin; therefore, it is locally integrable itself. Now, let f (t, x) ∈ C0∞ (Rn+1 ). We have ∂ ∂ 2 2 − a Δx E(t, x), f (t, x) = E(t, x), − − a Δx f (t, x) ∂t ∂t

∂ E(t, x) − − a2 Δx f (t, x)dtdx. (7.17) = lim ε→+0 t≥ε ∂t ∂ and Δx in the integral back onto E(t, x). Since Let us move ∂t ∂ ( ∂t −a2 Δx )E(t, x) = 0 for t > 0, the result only contains a boundary integral over the plane t = ε obtained after integrating by parts with respect to t:

E(ε, x)f (ε, x)dx = Γ(ε, x)f (ε, x)dx.

The latter integral is uε (ε, 0), where uε (t, x) is Poisson’s integral with the initial function ϕε (x) = f (ε, x). Consider also Poisson’s integral u(t, x) with the initial function ϕ(x) = f (0, x). By the maximum principle |u(ε, 0) − uε (ε, 0)| ≤ sup |ϕε (x) − ϕ(x)| = sup |f (ε, x) − f (0, x)|. x

x

This, clearly, implies that u(ε, 0) − uε (ε, 0) → 0

for ε → +0.

7.5. The fundamental solution. Duhamel’s formula

163

But, as we have already seen, u(ε, 0) → ϕ(0) = f (0, 0) for ε → +0, implying lim uε (ε, 0) = f (0, 0) which, in turn, yields ε→+0

lim

ε→+0

∂ 2 E(t, x) − − a Δx f (t, x)dtdx = f (0, 0). ∂t t≥ε

This, together with (7.17), proves the theorem. Corollary 7.6. If u(t, x) ∈ D (Ω), where Ω is a domain in Rn+1 , and ∂ − a2 Δx )u = 0, then u ∈ C ∞ (Ω). ( ∂t

Therefore, the heat operator is an example of a hypoelliptic but not elliptic operator. Note that the heat operator has C ∞ but nonanalytic solutions (e.g., E(t, x) in a neighborhood of (0, x0 ), where x0 ∈ Rn \{0}). The knowledge of the fundamental solution makes it possible to solve the nonhomogeneous equation. For instance, a solution u(t, x) of the problem ∂u 2 n ∂t − a Δx u = f (t, x), t > 0, x ∈ R , (7.18) x ∈ Rn , u|t=0 = ϕ(x), can be written in the form

E(t − τ, x − y)f (τ, y)dτ dy + Γ(t, x − y)ϕ(y)dy u(t, x) = τ ≥0

(7.19)

t

dτ

= 0

Rn

Γ(t − τ, x − y)f (τ, y)dy +

Γ(t, x − y)ϕ(y)dy

under appropriate assumptions concerning f (t, x) and ϕ(x) which we will not formulate exactly. The ﬁrst integral in (7.19) is of interest. Its inner integral

v(τ, t, x) =

Γ(t − τ, x − y)f (τ, y)dy

is a solution of the Cauchy problem ∂ ( ∂t − a2 Δx )v(τ, t, x) = 0, t > τ, x ∈ Rn , (7.20) v|t=τ = f (τ, x). The Duhamel formula is a representation of the solution of the Cauchy problem (7.18) with zero initial function ϕ in the form

t v(τ, t, x)dτ, (7.21) u(t, x) = 0

where v(τ, t, x) is the solution of the Cauchy problem (7.20) (for a homogeneous equation!).

164

7. The heat equation

A similar representation is possible for any evolution equation. In par∂ − ticular, precisely this representation will do for operators of the form ∂t A(x, Dx ), where A(x, Dx ) is a linear diﬀerential operator with respect to x with variable coeﬃcients. Needless to say, in this case a2 Δx should be replaced by A(x, Dx ) in (7.20). The formal veriﬁcation is very simple: by ∂ − A(x, Dx ) to (7.21) we get in the right-hand side applying ∂t

t ∂ − A(x, Dx ) v(τ, t, x)dτ = v(t, t, x) = f (t, x), v(t, t, x) + ∂t 0 as required.

7.6. Estimates of derivatives of a solution of the heat equation Since the heat operator is hypoelliptic, we can use Proposition 5.20 to estimate derivatives of a solution of the heat equation through the solution itself on a slightly larger domain. We will only need it on an inﬁnite horizontal strip. Lemma 7.7. Let us assume that a solution u(t, x) of the heat equation (7.1) is deﬁned in the strip {t, x : t ∈ (a, b), x ∈ Rn } and satisﬁes the estimate p

|u(t, x)| ≤ Ced|x| ,

(7.22)

where p ≥ 0, C > 0, d ∈ R. Then in a somewhat smaller strip {t, x : t ∈ (a , b ), x ∈ Rn }, where a < a < b < b, similar estimates hold:

|∂ α u(t, x)| ≤ Cα ed |x| ,

(7.23)

p

where the constant p is the same as in (7.22); d = 0 if d = 0 and d = d + ε with an arbitrary ε > 0 if d = 0; the Cα depend upon the (n+1)-dimensional multi-index α and also upon ε, a , and b . Proof. Let us use Proposition 5.20 with Ω1 = {t, x : t ∈ (a, b), |x| < 1}, Ω2 = {t, x : t ∈ (a , b ), |x| < 1/2}, and estimate (5.22) applied to the solution uy (t, x) = u(t, x + y) of the heat equation depending on y ∈ Rn as on a parameter. We get |∂ α u(t, x)| ≤ Cα

sup |u(t, x )| ≤ Cα

|x −x| 0, c > 0, |x| ≥ R, and R is suﬃciently large. The same estimate holds also for any derivative ∂ α v(t, x), where α is a (n + 1)-dimensional multi-index. Let t0 > 0 be so small that 4ac2 t0 > b; i.e., t0 < 4ac2 b . Then the integral

u(t, x), v(t, x) = u(t, x)v(t, x)dx makes sense and converges uniformly for 0 ≤ t ≤ t0 − ε. As Lemma 7.7 shows, the integrals obtained by replacing u or v by their derivatives with respect to t or x also converge uniformly for 0 < ε ≤ t ≤ t0 − ε, where ε > 0 is arbitrary. Therefore, the function χ(t) = u(t, x), v(t, x) is continuous at t ∈ [0, t0 ) and vanishes at t = 0, and its derivative with respect to t at t ∈ (0, t0 ) is

∂(u(t, x)v(t, x)) ∂u ∂v dχ(t) = dx = ·v+u· dx dt ∂t ∂t ∂t

= a2 [(Δx u)v − u(Δx v)]dx.

7.7. Holmgren’s principle. The uniqueness of solution

167

The latter integral vanishes, since, by Green’s formula (4.32), it may be expressed in the form

lim [(Δx u)v − u(Δx v)]dx R→∞ |x|≤R

= lim

R→∞ |x|=R

∂v ∂u ·v−u· ∂n ¯ ∂n ¯

dS = 0,

because ∂∂un¯ ·v and u· ∂∂vn¯ tend to 0 exponentially fast as |x| → +∞. Therefore, χ(t) = const = 0. Now, let us verify that

u(t, x)v(t, x)dx. u(t0 , x)ψ(x)dx = lim t→t0 −0

(Actually, the right-hand side equals lim χ(t) = 0.) We have t→t0 −0

u(t, x)v(t, x)dx = u(t, x)Γ(t0 − t, x − y)ψ(y)dydx

= Γ(t0 − t, x − y)u(t, x)dx ψ(y)dy → u(t0 , y)ψ(y)dy, t → t0 − 0, since the argument we used in the consideration of Poisson’s integral shows that the convergence

Γ(t0 − t, x − y)u(t, x)dx = u(t0 , y) lim t→t0 −0

is uniform on any compact in Rny . Thus, clearly,

u(t0 , x)ψ(x)dx = 0, 0 ≤ t0 ≤ δ, ψ ∈ D(Rn ), if δ
0, x ∈ Ω, ⎨ ∂t = a2 u, (7.27) t > 0, x ∈ ∂Ω, u|x∈∂Ω = 0, ⎩ u|t=0 = ϕ(x), x ∈ Ω, by the Fourier method. Here we assume Ω to be a bounded domain in Rn . If the function u(t, x) = ψ(t)v(x) satisﬁes the ﬁrst two conditions in (7.27), we formally get (7.28)

Δv(x) ψ (t) = = −λ, v|∂Ω = 0, 2 a ψ(t) v(x)

leading to the eigenvalue problem for the operator −Δ with zero boundary conditions on ∂Ω: −Δv = λv, x ∈ Ω, (7.29) v|∂Ω = 0. For n = 1 this problem becomes the simplest Sturm-Liouville problem considered above (with the simplest boundary conditions: v should vanish at the endpoints of the segment). We will prove below that problem (7.29) possesses properties very similar to those of the Sturm-Liouville problem. In particular, it has a complete orthogonal system of eigenfunctions vk (x), k = 1, 2, . . ., with eigenvalues λk , such that λk > 0 and λk → +∞ as k → +∞. It follows from (7.28) that ψ(t) = Ce−λa t which makes it clear that it is natural to seek u(t, x) as a sum of the series 2

u(t, x) =

∞

Ck e−λk a t vk (x). 2

k=1

The coeﬃcients Ck are chosen from the initial condition u|t=0 = ϕ(x) and are coeﬃcients of the expansion of ϕ(x) in terms of the orthogonal system {vk (x)}∞ k=1 .

7.8. Fourier method

169

The second initial-boundary value ⎧ ∂u ⎨ ∂t = a2 Δu, ∂u | = 0, ⎩ ∂ n¯ x∈∂Ω u|t=0 = ϕ(x),

problem t > 0, x ∈ Ω, t > 0, x ∈ ∂Ω, x ∈ Ω,

is solved similarly. It leads to the eigenvalue problem −Δv = λv, x ∈ Ω, (7.30) ∂v ∂n ¯ |∂Ω = 0 with the same properties as the problem (7.29), the only exception being that λ = 0 is an eigenvalue of (7.30). The same approach, as for n = 1, yields solutions of more general problems; for example, ⎧ ∂u ⎨ ∂t = a2 Δu + f (t, x), t > 0, x ∈ Ω, = α(t, x), t > 0, x ∈ ∂Ω, u| ⎩ x∈∂Ω x ∈ Ω. u|t=0 = ϕ(x), For this, subtracting an auxiliary function, we make α(t, x) ≡ 0 and then seek the solution u(t, x) in the form of a series u(t, x) =

∞

ψk (t)vk (x),

k=1

which leads to ﬁrst-order ordinary diﬀerential equations for ψk (t). The uniqueness of the solution of these problems may be proved by Holmgren’s principle or derived from the maximum principle: if u(t, x) is ¯ and if it satisﬁes the heat a function continuous in the cylinder [0, T ] × Ω equation on (0, T ) × Ω, then 3 (7.31)

u(t, x) ≤ max sup u(0, x), x∈Ω

sup

u(t, x) ;

x∈∂Ω,t∈[0,T ]

and if the equality takes place, then u = const. We will not prove this physically natural principle. Its proof is quite elementary and may be found, for instance, in Petrovskii [24]. In what follows, we give the proof of existence of a complete orthogonal system of eigenfunctions for the problem (7.29). The boundary condition in (7.29) will be understood in a generalized sense. The Dirichlet problem will also be solved in a similar sense. All this requires introducing Sobolev spaces whose theory we are about to present.

170

7. The heat equation

7.9. Problems 7.1. Assume a rod of length l with a heat-insulated surface (the lateral sides and ends). For t = 0, the temperature of the left half of the rod is u1 and that of the right half is u2 . Find the rod temperature for t > 0 by the Fourier method. Find the behavior of the temperature as t → +∞. Estimate the relaxation time needed to decrease by half the diﬀerence between the maximal and minimal temperatures of the rod; in particular, ﬁnd the dependence of this time on the parameters of the rod: length, thickness, density, speciﬁc heat, and thermal conductivity. 7.2. The rod is heated by a constant electric current. Zero temperature is maintained at the ends and the side surface is insulated. Assuming that the temperature in any cross section of the rod is a constant, ﬁnd the distribution of the temperature in the rod from an initial distribution and describe the behavior of the temperature as t → +∞. 7.3. Let a bounded solution u(t, x) of the heat equation ut = a2 uxx (x ∈ R1 ) be determined for t > 0 and let it satisfy the initial condition u|t=0 = ϕ(x), where ϕ ∈ C(R1 ), lim ϕ(x) = b, lim ϕ(x) = c. Describe the behavior of x→+∞

u(t, x) as t → +∞.

x→−∞

7.4. Prove the maximum principle for the solutions of the heat equation in the formulation given at the end of Section 7.8 (see (7.31)).

10.1090/gsm/205/08

Chapter 8

Sobolev spaces. A generalized solution of Dirichlet’s problem

8.1. Spaces H s (Ω) Let s ∈ Z+ , and let Ω be an open subset in Rn . Deﬁnition 8.1. The Sobolev space H s (Ω) consists of u ∈ L2 (Ω) such that ∂ α u ∈ L2 (Ω) for |α| ≤ s, where α is a multi-index and the derivative is understood in the distributional sense. In H s (Ω), introduce the inner product (∂ α u, ∂ α v), (u, v)s = |α|≤s

where (·, ·) is the inner product in L2 (Ω), and also the corresponding norm ⎡ ⎤1 2

1 α 2 |∂ u(x)| dx⎦ . (8.1) us = (u, u)s2 = ⎣ |α|≤s Ω

Clearly, H s (Ω) ⊂ H s (Ω) for s ≥ s . Furthermore, H 0 (Ω) = L2 (Ω). In particular, each space H s (Ω) is embedded into L2 (Ω). Proposition 8.2. The inner product (u, v)s deﬁnes in H s (Ω) the structure of a separable Hilbert space (i.e., a Hilbert space which admits at most a countable orthonormal basis). 171

172

8. Sobolev spaces. Dirichlet’s problem

Proof. Only the completeness and separability are not obvious. Let us prove the completeness. Let a sequence {uk }∞ k=1 be a Cauchy sequence in H s (Ω). The deﬁnition of · s implies that all the sequences {∂ α uk }∞ k=1 (for |α| ≤ s) are Cauchy sequences in L2 (Ω). But then, by the completeness of L2 (Ω), they converge; i.e., lim ∂ α uk = vα with respect to the norm in k→∞

L2 (Ω). In particular, uk → v0 in L2 (Ω). But then ∂ α uk → ∂ α v0 in D (Ω) and since, on the other hand, ∂ α uk → vα in D (Ω), it follows that vα = ∂ α v0 . Therefore, v0 ∈ H s (Ω) and ∂ α uk → ∂ α v0 in L2 (Ω) for |α| ≤ s. But this means that uk → v0 in H s (Ω). Let us prove separability. The map u → {∂ α u}|α|≤s deﬁnes an isometric embedding of H s (Ω) into the direct sum of several copies of L2 (Ω) as a closed subspace. Now, the separability of H s (Ω) follows from that of L2 (Ω). Let us consider the case Ω = Rn in more detail. The space H s (Rn ) can be described in terms of the Fourier transform (understood in the sense of distributions, in S (Rn )). Recall that, by Plancherel’s theorem, the Fourier transformation

F : u(x) → u ˜(ξ) = e−ix·ξ u(x)dx determines an isometry of L2 (Rnx ) onto L2 (Rnξ ), where L2 (Rnξ ) is deﬁned by the measure (2π)−n dξ. Let us explain the meaning of this statement and its interpretation from the distributional viewpoint. We know from analysis that the Parseval identity

2 −n |˜ u(ξ)|2 dξ (8.2) |u(x)| dx = (2π) holds for u ∈ S(Rn ). The identity (8.2) for u ∈ S(Rn ) follows easily, for instance, from the inversion formula (5.23). Indeed,

2 −ix·ξ iy·ξ e u(x)dx e u(y)dy dξ |˜ u(ξ)| dξ =

= ei(y−x)·ξ u(x)u(y) dx dy dξ &

'

i(y−x)·ξ n |u(y)|2 dy. e u(x) dx dξ dy = (2π) = u(y) Since F deﬁnes an isomorphism of S(Rnx ) onto S(Rnξ ) and S(Rn ) is dense in L2 (Rn ), then F can be extended to an isometry F : L2 (Rnx ) −→ L2 (Rnξ ). Since the convergence in L2 (Rn ) implies a (weak) convergence in S (Rn ), the map F : L2 (Rnx ) −→ L2 (Rnξ ) is continuous in restrictions of the weak topologies of S (Rnx ) and S (Rnξ ). Therefore, this F is, actually, a restriction of the generalized Fourier transform F : S (Rnx ) −→ S (Rnξ ).

8.1. Spaces H s (Ω)

173

Now, note that the Fourier transform maps D α u(x) to ξ α u ˜(ξ). Theres n α 2 n ˜(ξ) ∈ L (R ) for |α| ≤ s. But instead fore, u ∈ H (R ) if andonly if ξ u α |2 |˜ |ξ u(ξ)|2 ∈ L1 (Rn ). Due to the obvious of this, we may write |α|≤s estimates C −1 (1 + |ξ|2 )s ≤ |ξ α |2 ≤ C(1 + |ξ|2 )s , |α|≤s

where C > 0 does not depend on ξ, we have u ∈ H s (Rn ) if and only if ˜(ξ) ∈ L2 (Rn ) and the norm (8.1) for Ω = Rn is equivalent to (1 + |ξ|2 )s/2 u the norm &

'1 2 2 s 2 (1 + |ξ| ) |˜ u(ξ)| dξ . (8.3) us = We will denote the norm (8.3) by the same symbol as the norm (8.1); there should not be any misunderstanding. Thus, we have proved Proposition 8.3. The space H s (Rn ) consists of u ∈ S (Rn ) such that ˜(ξ) ∈ L2 (Rn ). It can be equipped with the norm (8.3) which is (1 + |ξ|2 )s/2 u equivalent to the norm (8.1). This proposition may serve as a basis for the deﬁnition of H s (Rn ) for any real s. Namely, for any s ∈ R we may construct the space of u ∈ S (Rn ), ˜(ξ) ∈ L2 (Rn ), with the norm such that u ˜(ξ) ∈ L2loc (Rn ) and (1 + |ξ|2 )s/2 u (8.3). We again get a separable Hilbert space also denoted, as for s ∈ Z+ , by H s (Rn ). In what follows we will need the Banach space Cbk (Rn ) (here k ∈ Z+ ) consisting of functions u ∈ C k (Rn ), whose derivatives ∂ α u(x), |α| ≤ k, are bounded for all x ∈ Rn . The norm in Cbk (Rn ) is deﬁned by the formula u(k) = sup |∂ α u(x)|. |α|≤k

x∈Rn

Theorem 8.4 (Sobolev’s embedding theorem). For s > continuous embedding (8.4)

n 2

+ k, we have a

H s (Rn ) ⊂ Cbk (Rn ).

Let us clarify the formulation. At ﬁrst glance, it seems that it is meaningless, since a function u ∈ H s (Rn ) can be arbitrarily modiﬁed on any set of measure 0 without aﬀecting the corresponding distribution (and the element of H s (Rn )). Changing u at all points with rational coordinates we may make it discontinuous everywhere. Therefore, we should more precisely describe the embedding (8.4), as follows: if u ∈ H s (Rn ), then there exists a unique function u1 ∈ Cbk (Rn ) which coincides with the initial function u

174

8. Sobolev spaces. Dirichlet’s problem

almost everywhere (and for brevity we write u instead of u1 ). In other words, H s (Rn ) is a subspace of Cbk (Rn ) provided both are considered as subspaces of S (Rn ) or D (Rn ). Observe that the uniqueness of a continuous representative is obvious, since in any neighborhood of x0 ∈ Rn there exist points of any set of total measure, so that any modiﬁcation of a continuous function on a set of measure 0 leads to a function which is discontinuous at all points where the modiﬁcation took place. Proof of Theorem 8.4. First, let us prove the estimate u(k) ≤ Cus , u ∈ S(Rn ),

(8.5)

where C is a constant independent of u. The most convenient way is to apply the Fourier transform. We have

1 ˜(ξ)dξ, (iξ)α eix·ξ u ∂ α u(x) = (2π)n implying 1 |∂ u(x)| ≤ (2π)n α

hence,

|ξ α u ˜(ξ)|dξ;

u(k) ≤ C

(1 + |ξ|2 )k/2 |˜ u(ξ)|dξ.

Dividing and multiplying the integrand by (1 + |ξ|2 )s/2 , we have by the Cauchy-Schwarz inequality

u(ξ)|dξ u(k) ≤ C (1 + |ξ|2 )(k−s)/2 (1 + |ξ|2 )s/2 |˜

≤C

(1 + |ξ| )

2 (k−s)

1

1 2 2 2 s/2 2 dξ · (1 + |ξ| ) |˜ u(ξ)| dξ .

The integral in the ﬁrst factor in the right-hand side converges, since 2(k − s) < −n, and, therefore, the integrand decays faster than |ξ|−n−ε as |ξ| → +∞ for a suﬃciently small ε > 0. The second factor is equal to us . Therefore, for s > k + n2 we get (8.5). Now, note that S(Rn ) is dense in H s (Rn ) for any s ∈ R. Indeed, introduce the operator Λs which acts on u = u(x) by multiplying the Fourier transform u ˜(ξ) by (1 + |ξ|2 )s/2 . This operator is a unitary map of H s (Rn ) 2 n onto L (R ) mapping S(Rn ) isomorphically onto S(Rn ). Therefore, S(Rn ) is dense in H s (Rn ), because S(Rn ) is dense in L2 (Rn ). Now, let v ∈ H s (Rn ), um ∈ S(Rn ), um → v in H s (Rn ). But (8.5) implies that the sequence um is a Cauchy sequence with respect to the norm · (k) . Therefore, um → v1 in Cbk (Rn ). But then v and v1 coincide as distributions

8.1. Spaces H s (Ω)

175

and, therefore, almost everywhere. By continuity, (8.5) holds for all u ∈ H s (Rn ) proving the continuity of the embedding (8.4). In particular, Theorem 8.4 shows that it makes sense to talk about values of functions u ∈ H s (Rn ) at a point for s > n2 . Also of interest is the question of a meaning of restriction (or trace) of a function from a Sobolev space on a submanifold of an arbitrary dimension. We will not discuss this problem in full generality, however, and we only touch the most important case of codimension l. For simplicity, consider only the case of a hyperplane xn = 0 in Rn . A point x ∈ Rn will be expressed in the form x = (x , xn ), where x ∈ Rn−1 . Theorem 8.5 (Sobolev’s theorem on restrictions (or traces)). If s > 1/2, then the restriction operator S(Rn ) → S(Rn−1 ), u → u|xn =0 , can be extended to a linear continuous map 1

H s (Rn ) −→ H s− 2 (Rn−1 ). Proof. Let us express the restriction operator in terms of the Fourier transform (for u ∈ S(Rn )):

1 ˜(ξ , ξn )dξ dξn . eix ·ξ u u(x , 0) = n (2π) For brevity, set v(x ) = u(x , 0) and apply the Fourier transform F with respect to x . We obtain

1 (8.6) v˜(ξ ) = u ˜(ξ , ξn )dξn . 2π The needed statement takes on the form of the estimate

vs− 1 ≤ Cus ,

(8.7)

2

1

where · s− 1 is the norm in H s− 2 (Rn−1 ), u ∈ S(Rn ), and C does not 2

depend on u. Let us prove this estimate. We have 2

1 (8.8) vs− 1 = (1 + |ξ |2 )s− 2 |˜ v (ξ )|2 dξ . 2

From (8.6) we deduce for any s > 0 (8.9)

&

'2 1 2 −s/2 2 s/2 (1 + |ξ| ) u ˜(ξ)dξn (1 + |ξ| ) |˜ v (ξ )| = 4π 2

1 2 −s ≤ ) dξ · (1 + |ξ|2 )s |˜ u(ξ)|2 dξn . (1 + |ξ| n 4π 2 2

176

8. Sobolev spaces. Dirichlet’s problem

Let us estimate the ﬁrst factor. We have

(1 + |ξ|2 )−s dξn =

(1 + |ξ |2 + |ξn |2 )−s dξn ⎛ 2 ⎞−s

ξn ⎠ = (1 + |ξ |2 )−s ⎝1 + dξn 2 1 + |ξ |

∞ 1 2 −s+ 21 = (1 + |ξ | ) (1 + |t|2 )−s dt = C(1 + |ξ |2 )−s+ 2 . −∞

Here we used the fact that s > 12 , which ensures the convergence of the last integral. Therefore, (8.9) is expressed in the form 2

2 −s+ 12

|˜ v (ξ )| ≤ C(1 + |ξ | )

u(ξ)|2 dξn . (1 + |ξ|2 )s |˜

By applying this inequality to estimate |˜ v (ξ )|2 in (8.8), we get the desired estimate (8.7). 1

Remark. The restriction map H s (Rn ) → H s− 2 (Rn−1 ), u → u|xn =0 , is actually surjective (see, e.g., H¨ ormander [13, Ch. II], so the statement of the theorem is the exact one (the index s − 12 cannot be increased). Theorem 8.5 means that if u ∈ H s (Rn ), then the trace u|xn =0 of u on 1 the hyperplane xn = 0 is well-deﬁned and is an element of H s− 2 (Rn−1 ). It is obtained as follows: represent u in the form of the limit u = lim um of m→∞

functions um ∈ S(Rn ) with respect to the norm · s . Then the restrictions um |xn =0 have a limit as m → ∞ with respect to the norm · s− 1 in 1

2

the space H s− 2 (Rn−1 ), and this limit does not depend on the choice of an approximating sequence. If we wish to restrict ourselves to integer values of s, we may use the fact 1 that H s− 2 (Rn−1 ) ⊂ H s−1 (Rn−1 ). Therefore, the trace u|xn =0 of u ∈ H s (Rn ) belongs to H s−1 (Rn−1 ). The space H s (M ), where M is a smooth compact manifold (without boundary), can also be deﬁned. Namely, we will write u ∈ H s (M ) if for any coordinate diﬀeomorphism κ : U −→ Ω (here U is an open subset of M , Ω an open subset of Rn ) and any ϕ ∈ C0∞ (Ω) we have ϕ(u ◦ κ−1 ) ∈ H s (Rn ). Here u◦κ−1 is a distribution u pulled back to Ω via κ−1 . It is multiplied by ϕ to get a distribution with support in Ω. Therefore, ϕ(u ◦ κ−1 ) ∈ E (Ω) ⊂ E (Rn ); in particular, ϕ(u ◦ κ−1 ) ∈ S (Rn ), and it makes sense to talk about the

˚s (Ω) 8.2. Spaces H

177

inclusion ϕ(u ◦ κ−1 ) ∈ H s (Rn ). The localization helps us to prove the following analogue of Theorem 8.5: If Ω is a bounded domain with smooth boundary ∂Ω, then for s > 12 the ¯ to a linear continuous restriction map u → u|∂Ω can be extended from C ∞ (Ω) 1 operator H s (Ω) −→ H s− 2 (∂Ω).

˚s (Ω) 8.2. Spaces H ˚s (Ω) is the closure of D(Ω) in H s (Ω) (we Deﬁnition 8.6. The space H assume that s ∈ Z+ ). ˚s (Ω) is a closed subspace in H s (Ω). Hence, it is a separable Therefore, H Hilbert space itself. ˚1 (Ω) does not neces˚0 (Ω) = H 0 (Ω) = L2 (Ω). But already H Clearly, H sarily coincide with H 1 (Ω). Indeed, if Ω is a bounded domain with a smooth boundary, then, as had been noted above, a function u ∈ H 1 (Ω) possesses ˚1 (Ω), then u|∂Ω = 0. Therefore, a trace u|∂Ω ∈ L2 (∂Ω). However, if u ∈ H ∞ ¯ ˚1 (Ω). if, e.g., u ∈ C (Ω) and u|∂Ω = 0, then u ∈ H ˚s (Ω) is isometrically embedded into H s (Rn ) as well. This The space H embedding extends the embedding D(Ω) ⊂ D(Rn ). Such an extension is deﬁned by continuity, since the norm ·s , considered on D(Ω), is equivalent to the norm in H s (Rn ). The following theorem on compact embeddings is important. Theorem 8.7. Let Ω be a bounded domain in Rn , s, s ∈ Z+ , s > s . Then ˚s (Ω) ⊂ H s (Ω) is a compact operator. the embedding H Recall that a linear operator A : E1 −→ E2 , where E1 , E2 are Banach spaces, is called compact if the image of the unit ball B1 = {x : x ∈ E1 , xE1 ≤ 1} is a precompact subset in E2 . In turn, a set Q in a metric space E is called precompact (in E) if its closure is compact, or, equivalently, if from any sequence of points of Q we can choose a subsequence converging to a point of E. If E is a complete metric space, then a set Q ⊂ E is precompact if and only if it is totally bounded; i.e., for any ε > 0 it has a ﬁnite ε-net: a ﬁnite set x1 , . . . , xN ∈ E, such that Q⊂

N

B(xj , ε)

j=1

(here B(x, r) means open ball in E of radius r centered at x), or, in other words, for any point x ∈ Q there exists k ∈ {1, . . . , N } such that ρ(x, xk ) < ε, where ρ is the metric in E.

178

8. Sobolev spaces. Dirichlet’s problem

Let K be a compact subset in Rn (i.e., K is closed and bounded). Consider the space C(K) of continuous complex-valued functions on K. Let F ⊂ C(K). The Arzel` a-Ascoli theorem gives a criterion of precompactness of F : F is precompact if and only if it is uniformly bounded and equicontinuous. Here the uniform boundedness of the set F means the existence of a constant M > 0, such that supx∈K |f (x)| ≤ M for any f ∈ F ; the equicontinuity means that for any ε > 0 there exists δ > 0, such that |x − x | < δ with x , x ∈ K implies |f (x ) − f (x )| < ε for any f ∈ F . Proofs of the above general facts on compactness and the Arzel`a-Ascoli theorem may be found in any textbook on functional analysis; see, e.g., Kolmogorov and Fomin [17, Chapter II, Sect. 6 and 7]. Note the obvious corollary of general facts on compactness: for a set Q ⊂ E, where E is a complete metric space, to be precompact it suﬃces that for any ε > 0 there exists a precompact set Qε ⊂ E, such that Q belongs to the ε-neighborhood of Qε . Indeed, a ﬁnite ε/2-net of Qε/2 is an ε-net for Q. Proof of Theorem 8.7. We will use the mollifying operation (see the proof of Lemma 4.5). We have introduced there a family ϕε ∈ D(Rn ), such that supp ϕε ⊂ {x : |x| ≤ ε}, ϕε ≥ 0, and ϕε (x)dx = 1. Then from f ∈ L1loc (Rn ) we may form a molliﬁed family:

fε (x) = f (x − y)ϕε (y)dy = f (y)ϕε (x − y)dy. In Section 4.2 we formulated and proved a number of important properties of mollifying. Now we will make use of these properties.

First, observe that a set F is precompact in H s (Ω) if and only if all the sets Fα = {∂ α f : f ∈ F }, where α is a multi-index such that |α| ≤ s , are ˚s (Ω) clearly implies ∂ α f ∈ H ˚ s−|α| (Ω), precompact in L2 (Ω). Further, f ∈ H where |α| ≤ s. Since s > s , it is clear that all sets Fα belong to a bounded ˚1 (Ω). Therefore, it suﬃces to verify the compactness of the subset of H ˚1 (Ω) ⊂ L2 (Ω). embedding H ˚ 1 (Ω), u1 ≤ 1}. We should verify that F is precomLet F = {u : u ∈ H 2 pact in L (Ω) (or, equivalently, in L2 (Rn )). The idea of this veriﬁcation is as follows: consider the set Fε = {fε (x), f ∈ F } obtained by ε-mollifying all functions from F . Then prove that for a small ε > 0 the set F belongs to an arbitrarily small neighborhood of Fε (with ¯ for a ﬁxed respect to the norm in L2 (Rn )) and Fε is precompact in C(Ω)

˚s (Ω) 8.2. Spaces H

179

ε (hence, precompact in L2 (Ω)). The remark before the proof shows then that F is precompact in L2 (Ω). Observe also that the deﬁnition of the distributional derivative easily implies

∂f ∂f (y) (x) = ϕε (x − y)dy ∂xj ε ∂yj

∂ ϕε (x − y)dy = − f (y) ∂yj

∂ϕε (x − y) dy = f (y) ∂xj ∂fε (x) ˚1 (Ω). for f ∈ H = ∂xj In other words, distributional diﬀerentiation commutes with mollifying. On the other side, we clearly have from the Cauchy-Schwarz inequality that

|fε (x)| ≤ Cε |f (y)|dy

≤ Cε

1 |f (y)| dy 2

2

, f ∈ F.

Therefore, Fε is uniformly bounded (on Rn ) for a ﬁxed ε > 0. Further, ∂fε ∂f = ∂x are uniformly bounded by similar reasoning the derivatives ∂x j j ε for f ∈ F and a ﬁxed ε > 0. Therefore, Fε is equicontinuous. By the Arzel`a-Ascoli theorem, Fε is precompact in C(Rn ) (we may assume that ¯ in Rn ). It follows Fε ⊂ C(K), where K is a compact neighborhood of Ω that Fε is precompact in L2 (Rn ) and Fε |Ω = {fε |Ω , fε ∈ Fε } is precompact in L2 (Ω). It remains to verify that for any δ > 0 there exists ε > 0, such that F belongs to the δ-neighborhood of Fε . To this end, let us estimate the norm of f − fε in L2 (Rn ). (This norm will be, for the time being, denoted by just · .) First, assume that f ∈ D(Ω). We have

[f (x) − f (x − y)]ϕε (y)dy

1 d dt [f (x) − f (x − ty)]ϕε (y) = dy dt 0

1

n ∂f dt dy yj (x − ty)ϕε (y). = ∂xj 0

f (x) − fε (x) =

j=1

180

8. Sobolev spaces. Dirichlet’s problem

Using Minkowski’s inequality (4.53) with p = 2 and the Cauchy-Schwarz inequality for ﬁnite sums, we obtain ∂f f − fε ≤ dt dy |yj | (· − ty) ϕε (y) ∂xj 0 j=1

1

n n ∂f (·) ∂f dt dy |yj | ϕε (y) = = ∂xj ϕε (y)|yj |dy ∂xj 0 j=1 j=1 n ∂f √ ≤ δ1 ∂xj ≤ δ1 nf 1 ,

1

n

j=1

n

Clearly, δ1 ≤ Cε; hence δ1 → 0 as ε → 0. √ Since f ∈ F means f 1 ≤ 1, it follows that f − fε ≤ δ = δ1 n for f ∈ F , f ∈ D(Ω). where δ1 =

sup

x∈suppϕε

j=1 |xj |.

In the resulting inequality (8.10)

f − fε ≤ C1 εf 1

(which we proved for all f ∈ D(Ω)) we can pass to the limit along a sequence ˚ 1 (Ω) in the H 1 -norm · 1 , to conclude {fk ∈ D(Ω)}, converging to f ∈ H ˚1 (Ω). To make it more precise, we that the inequality holds for all f ∈ H need the estimate fε ≤ f which can be proved as follows:

fε = f (x − y)ϕε (y)dy ≤ f (· − y)ϕε (y)dy = f , where we again used Minkowski’s inequality (4.53). If the fk ∈ D(Ω), k = 1, 2, . . . , are chosen so that f − fk 1 ≤ 1/k, hence f − fk ≤ 1/k, then fε − (fk )ε = (f − fk )ε ≤ 1/k and, therefore, f − fε ≤ f − fk + fk − (fk )ε + fε − (fk )ε ≤ C1 εfk 1 +

2 C1 + 2 ≤ C1 εf 1 + . k k

In the resulting inequality we can pass to the limit as k → +∞, to conclude ˚( Ω) with the same constant C1 . It follows that that (8.10) holds for f ∈ H F lies in C1 ε-neighborhood of Fε which ends the proof of Theorem 8.7. On a compact manifold M without boundary one can prove compactness of the embedding H s (M ) ⊂ H s (M ) for any s, s ∈ R such that s > s . The embedding H s (Rn ) ⊂ H s (Rn ) is never compact whatever s and s .

8.3. Dirichlet’s integral. The Friedrichs inequality

181

8.3. Dirichlet’s integral. The Friedrichs inequality Dirichlet’s integral is

n ∂u 2 dx, D(u) = ∂xj Ω

where u ∈ H 1 (Ω).

j=1

One of the possible physical interpretations of this integral is the potential energy for small vibrations of a string (for n = 1) or a membrane (for n = 2), where u(x) is the displacement (assumed to be small) from the equilibrium. (In case of the membrane we should assume that the membrane points only move orthogonally to the coordinate plane. Besides, we will assume that the boundary of the membrane is ﬁxed; i.e., it cannot move.) Taking any displacement function u, let us try to understand when u represents an equilibrium. In the case of mechanical systems with ﬁnitely many degrees of freedom (see Section 2.5) the equilibriums are exactly the critical (or stationary) points of the potential energy. It is natural to expect that the same is true for systems with inﬁnitely many degrees of freedom, such as a string or a membrane, so the equilibrium property means vanishing of the variation of the Dirichlet integral under inﬁnitesimal deformations preserving u|∂Ω , i.e., the Euler equation for the functional D(u). We will see that it is just the Laplace equation. For the string this had actually been veriﬁed in Section 2.1, where the string’s Lagrangian was given. Let us verify this in the multidimensional case. ¯ δu ∈ C ∞ (Ω), ¯ and δu|∂Ω = 0, Let us take, for simplicity, u ∈ C ∞ (Ω), where Ω is assumed to be bounded with a smooth boundary. Let us write explicitly the condition for D(u) to be stationary, δD(u) = 0, where d δD(u) = D(u + t δu) . dt t=0

(See Section 2.5.) We have n

δD(u) = 2 j=1

= −2

Ω

∂u ∂ (δu)dx ∂xj ∂xj

n ∂ 2u Ω j=1

∂x2j

δudx = −2

Δu(x) · δu(x)dx. Ω

(The integral over ∂Ω appearing due to integration by parts vanishes since ¯ and ﬁxed u|∂Ω is δu|∂Ω = 0.) Thus, Dirichlet’s integral for u ∈ C ∞ (Ω) stationary if and only if Δu = 0. Therefore, the Dirichlet problem for the Laplace equation can be considered as a minimization problem for the Dirichlet integral in the class of functions such that u|∂Ω = ϕ, where ϕ is a ﬁxed function on ∂Ω.

182

8. Sobolev spaces. Dirichlet’s problem

The Laplace equation Δu(x) = 0 is also easy to derive for u ∈ H 1 (Ω) if u is a stationary point of Dirichlet’s integral for permissible variations δu ∈ C0∞ (Ω). The same argument ﬁts, but integration by parts should be replaced by the deﬁnition of the distributional derivative. Note that, in fact, the stationary points of Dirichlet’s integral are the points of minima (this is clear from what will follow). Therefore, Dirichlet’s problem may be considered as a minimization problem for Dirichlet’s integral in the class of functions with ﬁxed boundary values. Proposition 8.8. Let Ω be a bounded domain in Rn . Then there exists a constant C > 0, such that

n ∂u(x) 2 |u(x)| dx ≤ C ∂xj dx, 2

(8.11) Ω

˚1 (Ω). u∈H

Ω j=1

This inequality is called the Friedrichs inequality. Proof. Both parts of (8.11) continuously depend on u ∈ H 1 (Ω) in the norm ˚1 (Ω) is the closure of D(Ω) in H 1 (Ω), it is clear that it of H 1 (Ω). Since H suﬃces to prove (8.11) for u ∈ D(Ω) with C independent of u. Let Ω be contained in the strip {x : 0 < xn < R} (this can always be achieved by taking a suﬃciently large R and shifting Ω in Rn with the help of a translation that does not aﬀect (8.11)). For u ∈ D(Ω) we have

xn ∂u (x , t)dt. u(x) = u(x , xn ) = ∂x n 0 By the Cauchy-Schwarz inequality, this implies

xn

R ∂u 2 ∂u(x) 2 2 |u(x)| ≤ |xn | · ∂xn (x , t) dt ≤ R ∂xn dxn . 0 0 Integrating this inequality over Ω, we get

∂u 2 |u(x)| dx ≤ R ∂xn dx. 2

Ω

2

Ω

In particular, this implies that (8.11) holds for u ∈ D(Ω) with a constant C = R2 . The Friedrichs inequality implies that, for a bounded open set Ω, the ˚ 1 (Ω) is equivalent to the norm deﬁned by Dirichlet’s integral; norm ·1 in H 1 i.e., u → D(u) 2 . Indeed, u21 = u2 + D(u),

u ∈ H 1 (Ω),

8.4. Dirichlet’s problem (generalized solutions)

183

where · is the norm in L2 (Ω). The Friedrichs inequality means that ˚1 (Ω), implying u2 ≤ CD(u), for all u ∈ H D(u) ≤ u21 ≤ C1 D(u),

˚1 (Ω), u∈H

as required. Together with Dirichlet’s integral, we will use the corresponding inner product

n ∂u ∂v dx, u, v ∈ H 1 (Ω). [u, v] = ∂x ∂x j j Ω j=1

˚1 (Ω), H

˚ 1 (Ω) is a closed On it deﬁnes a Hilbert space structure, since H ˚1 (Ω) to the norm [u, u] 21 = subspace in H 1 (Ω) and u1 is equivalent in H 1 D(u) 2 .

8.4. Dirichlet’s problem (generalized solutions) Consider two generalized settings of Dirichlet’s boundary value problem for the Laplace operator. In both cases, the existence and uniqueness of a solution easily follows, after the above preliminary argument, from general theorems of functional analysis. The ﬁrst of the settings contains, as a classical analogue, the following problem (the boundary value problem with zero boundary condition for the Poisson equation): Δu(x) = f (x), x ∈ Ω, (8.12) u|∂Ω = 0. Instead of u|∂Ω = 0, write (8.13)

˚ 1 (Ω). u∈H

¯ u|∂Ω = 0. If Now let us try to make sense of Δu = f . First, let u ∈ C 2 (Ω), v ∈ D(Ω), then, integrating by parts, we get

n ∂u ∂¯ v (Δu(x))¯ v(x)dx = − dx = −[u, v]. ∂x ∂x j j Ω Ω j=1

If Δu = f , this means that (8.14)

[u, v] = −(f, v),

where (·, ·) is the inner product in L2 (Ω). Now note that if both parts of (8.14) are considered as functionals of v, then these functionals are linear and continuous with respect to the norm · 1 in H 1 (Ω). Therefore, (8.14) ˚1 (Ω). holds when v belongs to the closure of D(Ω) in H 1 (Ω), i.e., when v ∈ H

184

8. Sobolev spaces. Dirichlet’s problem

Conversely, let Ω be bounded and have a smooth boundary, let u ∈ % ˚1 ˚1 (Ω). Let us H (Ω), f ∈ L2 (Ω), and let (8.14) hold for any v ∈ H prove that u is a solution of (8.12). ˚1 (Ω) implies u|∂Ω = 0. This follows from the First, observe that u ∈ H fact that, due to Theorem 8.5, the map u → u|∂Ω extends to a continuous map H 1 (Ω) −→ L2 (∂Ω) which maps all functions from D(Ω) and, therefore, ˚1 (Ω) to zero. Note that here we only use, actually, the all functions from H fact that ¯ ∩H ˚1 (Ω) =⇒ u|∂Ω = 0. u ∈ C 2 (Ω) ¯ C 2 (Ω)

This fact can be obtained directly, avoiding Theorem 8.5 and its generalization to manifolds. Let us sketch the corresponding argument, omitting the details. Using a partition of unity and locally rectifying the boundary with the help of a diﬀeomorphism, we can, by the same arguments as those used in the proof of the Friedrichs inequality, get

|u(x)|2 dSx ≤ CD(u) ∂Ω

¯ with the support in a small neighborhood of a ﬁxed point of for u ∈ C 1 (Ω) the boundary. Therefore, if u|∂Ω = 0, then it is impossible to approximate u by functions from D(Ω) with respect to the norm in H 1 (Ω). Thus, if ¯ %H ˚ 1 (Ω), then u|∂Ω = 0. u ∈ C 2 (Ω) ¯ then Conversely, if u ∈ C 1 (Ω), ˚1 (Ω). u|∂Ω = 0 =⇒ u ∈ H This may be proved as follows. The ε-molliﬁed characteristic function of Ω2ε , where Ωδ = {x : x ∈ Ω, ρ(x, ∂Ω) ≥ δ}, is a function χε ∈ D(Ω), such that χε (x) = 1 for x ∈ Ω3ε and |∂ α χε (x)| ≤ ¯ then Cα ε−|α| (the constant Cα does not depend on ε). Now, if u ∈ C 1 (Ω), 1 ˚ χε u ∈ H (Ω), and u|∂Ω = 0 implies that if x ∈ Ω\Ω3ε , then |u(x)| ≤ Cε, where C hereafter depends on u but does not depend on ε. This enables us to estimate u − χε u21 = u − χε u2 + D(u − χε u), where · is the norm in L2 (Ω) and D is the Dirichlet integral. Since the Lebesgue measure of Ω\Ω3ε does not exceed Cε, we have u − χε u2 ≤ Cε sup |u − χε u|2 ≤ C1 ε3 Ω\Ω3ε

8.4. Dirichlet’s problem (generalized solutions)

185

and

2 n ∂ D(u − χε u) ≤ mes(Ω\Ω3ε ) · sup (u − χ u) ε ∂xj Ω\Ω3ε j=1

2 n ∂u 2 ∂ 2 2 ≤ Cε sup ∂xj |(1 − χε )| + |u| ∂xj (1 − χε ) Ω\Ω3ε j=1

≤ C1 ε, where C1 does not depend upon ε (but depends upon u). The product of u and ∂x∂ j (1 − χε ) is bounded because in the 3ε-neighborhood of the boundary we have u = O(ε) whereas the derivative is O(ε−1 ). Thus, it is possible to approximate u with respect to the norm · 1 by ¯ whose supports are compact in Ω. In turn, they can functions from C 1 (Ω) be easily approximated with respect to the norm · 1 by functions from D(Ω) with the help of mollifying. ¯ for a bounded open set Ω with a smooth Thus, supposing u ∈ C 2 (Ω) ˚1 (Ω). This justiﬁes the boundary, we get u|∂Ω = 0 if and only if u ∈ H replacement of the boundary conditions (8.12) with (8.13). ˚1 (Ω) and any v ∈ H ˚1 (Ω) if and only Further, (8.14) holds for u ∈ H if Δu = f , where Δ is understood in the distributional sense. Indeed, if (8.14) holds, then, taking v ∈ D(Ω) and moving the derivatives from u to v (using the deﬁnition of the distributional derivatives) we obtain that (u, Δv) = (f, v) and that Δu = f , because v can be an arbitrary function from D(Ω). ˚1 (Ω), and f ∈ L2 (Ω), then (8.14) holds for Conversely, if Δu = f , u ∈ H ˚1 (Ω). v ∈ D(Ω), hence, by continuity, for any v ∈ H All this justiﬁes the following generalized setting of the problem (8.12): • Let Ω be an arbitrary bounded open set in Rn , f ∈ L2 (Ω). Find u ∈ ˚1 (Ω) (or, equivalently, ˚1 (Ω), such that (8.14) holds for any v ∈ H H for any v ∈ D(Ω)). In this case u is called a generalized solution of the problem (8.12). We have seen above that if Ω is a bounded open set with a smooth boundary ¯ then u is a generalized solution of (8.12) if and only if it is and u ∈ C 2 (Ω), a solution of this problem in the classical sense. Theorem 8.9. A generalized solution of (8.12) exists and is unique. Proof. Rewrite (8.14) in the form [v, u] = −(v, f )

186

8. Sobolev spaces. Dirichlet’s problem

˚1 (Ω). This and consider the right-hand side as a functional in v for v ∈ H is a linear continuous functional. Its linearity is obvious and its continuity follows from the Cauchy-Schwarz inequality: |(v, f )| ≤ f · v ≤ f · v1 , where · is the norm in L2 (Ω). By the Riesz representation theorem we may (uniquely) write this func˚1 (Ω). This proves tional in the form [v, u], where u is a ﬁxed element of H the unique solvability of (8.12) (in the generalized setting). The generalized setting of the Dirichlet problem leaves two important questions open: 1) the exact description of the class of all functions u which appear as the solutions when f runs through all possible functions from L2 (Ω), 2) the existence of a more regular solution for a more regular f . Observe that local regularity problems are easily solved by the remark that if E(x) is a fundamental solution of the Laplace operator in Rn , then the convolution

E(x − y)f (y)dy u0 (x) = (E ∗ f )(x) = Ω

(deﬁned for almost all x) is a solution of Δu0 = f . Therefore, Δ(u−u0 ) = 0; i.e., u − u0 is an analytic function in Ω. Therefore, the regularity of u coincides locally with that of u0 and, therefore, can be easily described (e.g., if f ∈ C ∞ (Ω), then u ∈ C ∞ (Ω), as proved above). The question of regularity near the boundary is more complicated. Moreover, for nonsmooth boundary the answer is not completely clear even now. However, if ∂Ω is suﬃciently smooth (say, of class C ∞ ), there are exact (though not easy to prove) theorems describing the regularity of u in terms of the regularity of f . We will formulate one such theorems without proof. Theorem 8.10. Let the boundary of Ω be of class C ∞ . If f ∈ H s (Ω), ˚1 (Ω). s ∈ Z+ , then u ∈ H s+2 (Ω) ∩ H (For the proof see, e.g., Gilbarg and Trudinger [9, Sect. 8.4].) Conversely, it is clear that if u ∈ H s+2 (Ω), then Δu = f ∈ H s (Ω). Therefore, ˚1 (Ω) −→ H s (Ω), s ∈ Z+ , Δ : H s+2 (Ω) ∩ H is an isomorphism. In particular, in this case all generalized solutions (for ˚1 (Ω). f ∈ L2 (Ω)) lie in H 2 (Ω) ∩ H

8.4. Dirichlet’s problem (generalized solutions)

187

Another version of an exact description of the regularity, which we already discussed above, is Schauder’s theorem, Theorem 6.22, in H¨older functional spaces. There are numerous other descriptions, depending on the choice of functional spaces. Most of them are useful in miscellaneous problems of mathematical physics. ¯ even locally. Note that the last statement fails in the usual classes C k (Ω) 2 ¯ ¯ Namely, f ∈ C(Ω) does not even imply u ∈ C (Ω). Let us pass to the generalized setting of the usual Dirichlet problem for the Laplace equation: Δu = 0, (8.15) u|∂Ω = ϕ, where Ω is a bounded open set in Rn . First, there arises a question of interpretation of the boundary condition. We will do it by assuming that there exists a function v ∈ H 1 (Ω), such that v|∂Ω = ϕ, and formulating ˚ 1 (Ω). This implies the boundary condition for u as follows: u − v ∈ H 1 u ∈ H (Ω). Introducing v saves us from describing the smoothness of ϕ (which requires using Sobolev spaces with noninteger indices on ∂Ω if Ω has a smooth boundary). Besides, it becomes possible to consider the case of domains with nonsmooth boundaries. Let us give the generalized formulation of Dirichlet’s problem (8.15): • Given a bounded domain Ω ⊂ Rn and v ∈ H 1 (Ω), ﬁnd a function ˚1 (Ω). u ∈ H 1 (Ω), such that Δu = 0 and u − v ∈ H At ﬁrst sight this problem seems to be reducible to the above one if ˚1 (Ω) and put f = Δ(u − v) = −Δv. But we have no we seek u − v ∈ H reason to think that Δv ∈ L2 (Ω), and we did not consider a more general setting. Therefore, we will consider this problem independently of the ﬁrst problem, especially because its solution is as simple as the solution of the ﬁrst problem. The reasoning above makes it clear that if Ω has a smooth boundary, ¯ and u is a generalized solution of (8.15), ¯ v|∂Ω = ϕ, u ∈ C 2 (Ω), v ∈ C(Ω), then u is a classical solution of this problem. Theorem 8.11. A generalized solution u of Dirichlet’s problem exists and is unique for any bounded domain Ω ⊂ Rn and any v ∈ H 1 (Ω). This solution has strictly and globally minimal Dirichlet integral D(u) among all ˚1 (Ω). u ∈ H 1 (Ω) such that u − v ∈ H Conversely, if u is a critical (stationary) point of the Dirichlet integral ˚1 (Ω), then the Dirichlet in the class of all u ∈ H 1 (Ω) such that u − v ∈ H integral actually reaches the strict and global minimum at u and u is a generalized solution of Dirichlet’s problem.

188

8. Sobolev spaces. Dirichlet’s problem

Proof. Let u ∈ H 1 (Ω). The equation Δu = 0 can be expressed in the form (u, Δw) = 0,

w ∈ D(Ω),

or, equivalently, in the form [u, w] = 0,

w ∈ D(Ω).

Since [u, w] is a continuous functional in w on H 1 (Ω), then, by continuity, we get ˚1 (Ω). (8.16) [u, w] = 0, w ∈ H This condition is equivalent to the Laplace equation Δu = 0. Now we can invoke a geometric intepretation of the problem: we have ˚1 (Ω) with respect to [·, ·] and to ﬁnd u ∈ H 1 (Ω), which is orthogonal to H 1 ˚ (Ω). Imagining for a moment that H 1 (Ω) is a Hilbert such that v − u ∈ H space with respect to the inner product [·, ·], we see that u is the component in the decomposition v = (v − u) + u in the direct orthogonal decomposition ˚1 (Ω) + (H ˚1 (Ω))⊥ . H 1 (Ω) = H It is well known from functional analysis that such a decomposition exists and is unique (in any Hilbert space and for any closed subspace of this ˚1 (Ω)) component u Hilbert space), and, besides, the perpendicular (to H has the strictly minimal norm compared with all other decompositions v = ˚ 1 (Ω). Vice versa, in this case, if u is stationary, (v − w) + w with v − w ∈ H ˚1 (Ω). then it is orthogonal to H However, the “inner product” [·, ·] deﬁnes a Hilbert space structure on ˚ 1 (Ω), but not on H 1 (Ω) (it is not strictly positive; for example, [1, 1] = 0 H whereas 1 = 0). Therefore, the above argument should be made more ˚1 (Ω) so that v = u + z. It follows from precise. Denote z = v − u ∈ H (8.16) that (8.17)

[w, v] = [w, z],

˚1 (Ω). w∈H

Now it is easy to prove the existence of the solution. Namely, let us consider ˚1 (Ω). By the Riesz theorem [w, v] as a linear continuous functional in w on H it can be uniquely represented in the form [w, z], where z is a ﬁxed element ˚1 (Ω). It remains to set u = v − z. The deﬁnition of z implies (8.17) of H ˚1 (Ω) is obvious. which in turn implies (8.16). The condition u − v ∈ H The uniqueness of the solution is clear, since if u1 , u2 are two solutions, ˚ 1 (Ω) satisﬁes (8.16), and this yields u = 0. then u = u1 − u2 ∈ H Let us now prove that the Dirichlet integral is minimal on the solution ˚1 (Ω). Set z = u1 − u; u in the class of all u1 ∈ H 1 (Ω) such that u1 − v ∈ H 1 ˚ hence, u1 = u + z. Clearly, z ∈ H (Ω) and (8.16) yields D(u1 ) = [u1 , u1 ] = [u, u] + [z, z] = D(u) + D(z) ≥ D(u),

8.4. Dirichlet’s problem (generalized solutions)

189

where the equality is only attained for D(z) = 0, i.e., for z = 0 or, equivalently, for u = u1 . Finally, let us verify that if Dirichlet’s integral is stationary on u ∈ H 1 (Ω) ˚1 (Ω), then u is a in the class of all u1 ∈ H 1 (Ω) such that u1 − v ∈ H generalized solution of the Dirichlet problem (hence the Dirichlet integral reaches the strict global minimum at u). The stationarity property of the Dirichlet integral at u means that d ˚1 (Ω). D(u + tz)|t=0 = 0, z ∈ H dt But & ' d d D(u + tz) [u + tz, u + tz] = = [u, z] + [z, u] = 2 Re[u, z]. dt dt t=0

t=0

˚1 (Ω). Replacing The stationarity property gives Re[u, z] = 0 for any z ∈ H ˚1 (Ω), yielding (8.16). Theorem 8.11 z by iz, we get [u, z] = 0 for any z ∈ H is proved. Finally, we will give without proof a precise description of the regularity of solutions from the data on ϕ in the boundary value problem (8.15) for a domain Ω with a smooth boundary. It was already mentioned that if 1 u ∈ H s (Ω), s ∈ Z+ , s ≥ 1, then u|∂Ω ∈ H s− 2 (∂Ω). It turns out that the 1 converse also holds: if ϕ ∈ H s− 2 (∂Ω), where s ∈ Z+ , s ≥ 1, then there exists a unique solution u ∈ H s (Ω) of the problem (8.15) (see, e.g., Lions and Magenes [22]). Similarly, if ϕ ∈ C m,γ (∂Ω), where m ∈ Z+ , γ ∈ (0, 1), then there exists ¯ a (unique) solution u ∈ C m,γ (Ω). Combining the described regularity properties of solutions of problems (8.12) and (8.15), it is easy to prove that the map u → {Δu, u|Γ } can be extended to a topological isomorphism 1

H s (Ω) −→ H s−2 (Ω) ⊕ H s− 2 (∂Ω), and also to a topological isomorphism ¯ −→ C m,γ (Ω) ¯ ⊕ C m+2,γ (∂Ω), C m+2,γ (Ω)

s ∈ Z+ , s ≥ 2, m ∈ Z+ , γ ∈ (0, 1).

(See, e.g., the above-quoted books Gilbarg and Trudinger [9] and Ladyzhenskaya and Ural’tseva [19].) Similar theorems can also be proved for general elliptic equations with general boundary conditions satisfying an algebraic condition called the ellipticity condition (or coercitivity, or Shapiro-Lopatinsky condition). For the general theory of elliptic boundary value problems, see, e.g., books by Lions and Magenes [22], H¨ormander [13, Ch. X], or [15, vol. 3, Ch. 20].

190

8. Sobolev spaces. Dirichlet’s problem

8.5. Problems 8.1. Using the Fourier method, solve the following Dirichlet problem: ﬁnd a function u = u(x, y) such that Δu = 0 in the disc r ≤ a, where r = x2 + y 2 , u|r=a = (x4 − y 4 )|r=a . 8.2. By the Fourier method solve the Dirichlet problem in the disk r ≤ a for a smooth boundary function f (ϕ), where ϕ is the polar angle; i.e., ﬁnd a function u = u(x, y) such that Δu = 0 in the disc r ≤ a and u|r=a = f (ϕ). Justify the solution. Derive for this case Poisson’s formula:

2π (a2 − r2 )f (α)dα 1 . u(r, ϕ) = 2π 0 a2 − 2ar cos(ϕ − α) + r2 8.3. Using the positivity of Poisson’s kernel K(r, ϕ, α) =

2π[a2

a2 − r 2 , − 2ar cos(ϕ − α) + r2 ]

prove the maximum principle for the solutions of Dirichlet’s problem obtained in the preceding problem. Derive from the maximum principle the solvability of Dirichlet’s problem for any continuous function f (ϕ). 8.4. Find a function u = u(x, y) such that Δu = x2 − y 2 for r ≤ a and u|r=a = 0. 8.5. In the ring 0 < a ≤ r ≤ b < +∞, ﬁnd a function u = u(x, y) such that 2 Δu = 0, u|r=a = 1, ∂u ∂r r=b = (cos ϕ) . 8.6. By the Fourier method solve the Dirichlet problem for the Laplace equation in the rectangle 0 ≤ x ≤ a, 0 ≤ y ≤ b with the boundary conditions of the form u|x=0 = ϕ0 (y), u|x=a = ϕ1 (y) (0 ≤ y ≤ b), u|y=0 = ψ0 (x), u|y=b = ψ1 (x) (0 ≤ x ≤ a), where (the continuity condition for the boundary function) ϕ0 (0) = ψ0 (0), ϕ0 (b) = ψ1 (0), ϕ1 (0) = ψ0 (a), ϕ1 (b) = ψ1 (a). Describe a way for solving the general problem, and then solve the problem in the following particular case: πx , ϕ1 (y) = ψ1 (x) = 0. ϕ0 (y) = Ay(b − y), ψ0 (x) = B sin a 8.7. In the half-plane {(x, y) : y ≥ 0} solve Dirichlet’s problem for the Laplace equation in the class of bounded continous functions. For this purpose use the Fourier transform with respect to x to derive the following

8.5. Problems

191

formula for the solutions which are in L2 (R) with respect to x:

1 ∞ y u(x, y) = ϕ(z)dz, 2 π −∞ y + (x − z)2 where ϕ(x) = u(x, 0). Then investigate the case of a general bounded continuous function ϕ. 8.8. Investigate for which n, s, and α (α > −n) the function u(x) = |x|α belongs to H s (B1 ), where B1 is the open ball of radius 1 with the center at the origin of Rn . ˚1 ((−1, 1))? 8.9. Does the function u(x) ≡ 1, where x ∈ (−1, 1), belong to H 8.10. Prove that the inclusion H s (Rn ) ⊂ L2 (Rn ), s > 0, is not a compact operator.

10.1090/gsm/205/09

Chapter 9

The eigenvalues and eigenfunctions of the Laplace operator

9.1. Symmetric and selfadjoint operators in Hilbert space Let H be a Hilbert space with inner product (, ). Recall that a linear operator in H is a collection of two objects: a) a linear (but not necessarily closed) subspace D(A) ∈ H called the domain of the operator A, b) a linear map A : D(A) −→ H. In this case we will write, by an abuse of notation, that a linear operator A : H −→ H is deﬁned, though, strictly speaking, there is no map H −→ H, because in general A may not be deﬁned on the whole of H. It is possible to add linear operators and take their products (compositions). Namely, given two linear operators A1 , A2 in H, we set D(A1 + A2 ) = D(A1 ) ∩ D(A2 ), (A1 + A2 )x = A1 x + A2 x

for x ∈ D(A1 + A2 ).

Further, set D(A1 A2 ) = {x : x ∈ D(A2 ), A2 x ∈ D(A1 )}, (A1 A2 )x = A1 (A2 x)

for x ∈ D(A1 A2 ). 193

194

9. Eigenvalues and eigenfunctions

Deﬁne also Ker A = {x : x ∈ D(A), Ax = 0}, Im A = {x : x = Ay; y ∈ D(A)}. If Ker A = 0, then the inverse operator A−1 is deﬁned. Namely, set D(A−1 ) = Im A and A−1 x = y for x = Ay. If there are two linear operators A, B in H, such that D(A) ⊂ D(B) and Ax = Bx for x ∈ D(A), then we say that B is an extension of A and write A ⊂ B. An operator A is called symmetric if (Ax, y) = (x, Ay),

x, y ∈ D(A).

In what follows, we will also introduce the notion of a selfadjoint operator. (For unbounded operators symmetry and selfadjointness need not coincide!) Let A be a linear operator such that D(A) is dense in H. Then the adjoint operator A∗ is deﬁned via the identity (9.1)

(Ax, y) = (x, A∗ y).

More precisely, let D(A∗ ) consist of y ∈ H such that there exists z ∈ H satisfying (Ax, y) = (x, z) for x ∈ D(A). Since D(A) is dense in H, we see that z is uniquely deﬁned and we set A∗ y = z. If the domain of A is dense and A is symmetric, then, clearly, A ⊂ A∗ . An operator A is selfadjoint if A∗ = A (it is understood that the domain of A is dense and D(A∗ ) = D(A)). Clearly, any selfadjoint operator is symmetric. The converse is not, generally, true. Deﬁne the graph of A as the set GA = {x, Ax : x ∈ D(A)} ⊂ H × H. An operator is closed if its graph is closed in H × H, i.e., if xn ∈ D(A), xn → x, and Axn → y (the convergence is understood with respect to the norm in H) yield x ∈ D(A) and Ax = y. A selfadjoint operator is closed. A more general fact is true: the operator A∗ is always closed for any A. Indeed, if yn ∈ D(A∗ ), yn → y, and A∗ yn → z, then passing to the limit as n → ∞ in the identity (Ax, yn ) = (x, A∗ yn ),

x ∈ D(A),

we get (Ax, y) = (x, z), yielding y ∈ D(A∗ ) and A∗ y = z.

x ∈ D(A),

9.1. Symmetric and selfadjoint operators

195

If D(A) is a closed subspace in H (in particular, if D(A) = H, i.e., A is deﬁned everywhere), then the closedness of A is equivalent to its boundedness, due to the closed graph theorem. In particular, a selfadjoint everywhere deﬁned operator A is necessarily bounded. Let us indicate a way of constructing selfadjoint operators. Proposition 9.1. Let A be a selfadjoint operator in H and let Ker A = 0. Then A−1 is selfadjoint. Proof. Denote by E ⊥ the orthogonal complement to E ⊂ H; namely E ⊥ = {y : y ∈ H, (x, y) = 0 for all x ∈ E}. Given a linear operator A with D(A) dense in H, we easily derive from the deﬁnition of A∗ that (9.2)

Ker A∗ = (Im A)⊥ .

Indeed, the equality (Ax, y) = 0 for all x ∈ D(A) means, due to (9.1), exactly that y ∈ D(A∗ ) and A∗ y = 0. Note that E ⊥ = 0 if and only if the linear span of E is dense in H. Thus if A is selfadjoint and Ker A = {0}, then from (9.2) it follows that D(A−1 ) = Im A is dense in H. Therefore, A−1 and (A−1 )∗ are deﬁned. It remains to verify that (A−1 )∗ = A−1 . We have y ∈ D((A−1 )∗ ) and (A−1 )∗ y = z if and only if (9.3)

(A−1 x, y) = (x, z),

x ∈ Im A.

Setting x = Af for f ∈ D(A), we see that (9.3) is equivalent to (f, y) = (Af, z),

f ∈ D(A),

yielding z ∈ D(A∗ ) and A∗ z = y. But, since A∗ = A, this means that z ∈ D(A) and Az = y which may also be rewritten as z = A−1 y. We have proved that y ∈ D((A−1 )∗ ) with (A−1 )∗ y = z is equivalent to y ∈ D((A−1 )) with A−1 y = z. Therefore, (A−1 )∗ = A−1 as required. Corollary 9.2. If B is a bounded everywhere deﬁned symmetric operator with Ker B = 0, then A = B −1 is selfadjoint. The importance of selfadjoint operators follows from the spectral theorem. Before we formulate it, let us give an example of a selfadjoint operator. Let M be a space with measure μ; i.e., μ is a countably additive measure on a σ-algebra of subsets of M . Construct in the usual way the space L2 (M, dμ) consisting of classes of measurable functions on M , such that the square of their absolute value is integrable. (Two functions are in the same class if they coincide almost everywhere.) Then L2 (M, dμ) is a Hilbert space. Let a(m) be a real-valued measurable and almost everywhere ﬁnite

196

9. Eigenvalues and eigenfunctions

function on M . The operator A of multiplication by a(m) is deﬁned by the formula Af (m) = a(m)f (m) for f ∈ D(A), where D(A) := {f : f ∈ L2 (M, dμ), af ∈ L2 (M, dμ)}. This operator is always selfadjoint. Indeed, its symmetry is obvious and we have only to verify that D(A∗ ) = D(A). Let u ∈ D(A∗ ), A∗ u = v. Let us check that u ∈ D(A) and a(m)u(m) = v(m) for almost all m. Consider the set MN := {m : m ∈ M, |a(m)| ≤ N }, and let χN (m) = 1 for m ∈ MN and χN (m) = 0 for m ∈ MN . The identity A∗ u = v means that

a(m)f (m)u(m)dμ = f (m)v(m)dμ for any f ∈ D(A). M

M

then χN g ∈ D(A), implying Note that if g ∈

a(m)g(m)u(m)dμ = g(m)v(m)dμ for any g ∈ L2 (MN , dμ). L2 (M, dμ),

MN

MN

But it is clear that (au)|MN ∈ L2 (MN , dμ) since a is bounded on MN . Therefore, the arbitrariness of g implies au|MN = 8 = ∞ MN , then au| = v| . v|MN (almost everywhere). Therefore, if M N =1 M M 8 is equal to Since a(m) is almost everywhere ﬁnite, then the measure of M \M 0. Hence, a(m)u(m) = v(m) almost everywhere. In particular, a(m)u(m) ∈ L2 (M, dμ); i.e., u ∈ D(A) and au = v, as required. Here is an important example. Let l2 be the Hilbert space ⎧ ⎫ ∞ ⎨ ⎬ |xj |2 < +∞ , l2 = x = {x1 , x2 , . . .} : xj ∈ C, ⎩ ⎭ j=1

with inner product ({x1 , . . . }, {y1 , . . . }) =

∞

xj yj .

j=1

For any sequence of real numbers λ1 , λ2 , . . . , consider the operator A in l2 which is deﬁned as follows: ⎫ ⎧ ∞ ⎬ ⎨ |λj |2 |xj |2 < +∞ , D(A) = {x1 , x2 , . . .}, xj ∈ C, ⎭ ⎩ j=1

A({x1 , x2 , . . .}) = {λ1 x1 , λ2 x2 , . . .}.

9.2. The Friedrichs extension

197

To interpret this example as a multiplication operator, let M := {1, 2, . . . } and let the measure of each integer be equal to 1. Operators A1 : H1 −→ H1 and A2 : H2 −→ H2 are called unitarily equivalent if there exists a unitary (i.e., invertible and inner product preserving) operator U : H1 −→ H2 such that U −1 A2 U = A1 . (Recall that the equality of operators by deﬁnition implies the coincidence of their domains.) In other words, in this notation the unitary equivalence of A1 and A2 means that the diagram A

1 −→ H H⏐ 1 ⏐1 ⏐ ⏐ U< 0 to A0 , we get a new operator (denote it again by A0 ) such that (9.5)

(A0 ϕ, ϕ) ≥ ε(ϕ, ϕ),

ϕ ∈ D(A0 ).

9.2. The Friedrichs extension

199

We will assume from the very beginning that this stronger estimate (9.5) holds, since from the point of view of the eigenvalue problem the addition of (C + ε)I is of no importance, because it only shifts eigenvalues without aﬀecting eigenfunctions. Moreover, the Friedrichs inequality shows that in the above example (9.5) holds already. Set u, v ∈ D(A0 ).

[u, v] = (A0 u, v),

This scalar product deﬁnes a pre-Hilbert structure on D(A0 ) (the positive deﬁniteness follows from (9.5)). Denote by · 1 the corresponding norm (i.e., u1 = [u, u]1/2 ). Then the convergence with respect to · 1 implies convergence with respect to the norm · in H. In particular, if a sequence ϕk ∈ D(A0 ) is a Cauchy sequence with respect to · 1 , then it is a Cauchy sequence with respect to · ; hence, ϕk converges in H. Denote by H1 the completion of D(A0 ) with respect to · 1 . (In the ˚1 (Ω).) The inequality (9.5) and the above arguments show example, H1 = H the existence of a natural map H1 −→ H. Namely, the image g ∗ of g ∈ H1 is the limit in H of a sequence {ϕk }, ϕk ∈ D(A0 ), converging to g in H1 . Let us prove that this map is injective. First of all note that by continuity of scalar products [ϕ, g] = (A0 ϕ, g ∗ ),

ϕ ∈ D(A0 ).

Now, if g ∗ = 0, then, using the above-mentioned sequence {ϕk }, we obtain [g, g] = lim [ϕk , g] = lim (A0 ϕk , g ∗ ) = 0 k→∞

k→∞

implying g = 0, as required. Therefore, there is a natural inclusion H1 ⊂ H, so we will identify the elements of H1 with the corresponding elements of H. Now set D(A) = H1 ∩ D(A∗0 ) and A = A∗0 |D(A) . We get an operator A : H −→ H called the Friedrichs extension of A0 . Theorem 9.3. The Friedrichs extension of a semibounded operator A0 is a selfadjoint operator A : H −→ H. It is again semibounded from below and (Au, u) ≥ ε(u, u),

u ∈ D(A),

where ε > 0 is the same constant as in the similar estimate for A0 .

200

9. Eigenvalues and eigenfunctions

Proof. Let us verify that A−1 exists and is everywhere deﬁned, bounded, and symmetric. This implies that A is selfadjoint (see Corollary 9.2). First, verify that (9.6)

(Au, v) = [u, v],

u, v ∈ D(A).

Indeed, this is true for u, v ∈ D(A0 ). By continuity this is also true for u ∈ D(A0 ), v ∈ H1 . Now, make use of the identity (A0 u, v) = (u, A∗0 v),

u ∈ D(A0 ), v ∈ D(A∗0 ).

In particular, this is true for u ∈ D(A0 ), v ∈ D(A). Then A0 u = Au, A∗0 v = Av, and therefore, we have (Au, v) = (u, Av) = [u, v],

u ∈ D(A0 ), v ∈ D(A).

But this, in particular, implies (u, Av) = [u, v],

u ∈ D(A0 ), v ∈ D(A),

where we may pass to the limit with respect to u (if the limit is taken in H1 ). We get (9.7)

(u, Av) = [u, v],

u ∈ H1 , v ∈ D(A).

In particular, this holds for u ∈ D(A), v ∈ D(A). Interchanging u and v, we get (9.6), implying (9.8)

(Au, v) = (u, Av) = [u, v],

u, v ∈ D(A).

In particular, A is symmetric and (9.9)

(Au, u) = [u, u] ≥ ε(u, u),

u ∈ D(A),

where ε > 0 is the same as in (9.5). It follows from (9.9) that Ker A = 0 and A−1 is well-deﬁned. The symmetry of A implies that of A−1 . It follows from (9.9) that A−1 is bounded. Indeed, if u ∈ D(A), then u2 ≤ ε−1 (Au, u) ≤ ε−1 u · Au, implying u ≤ ε−1 Au,

u ∈ D(A).

Setting v = Au, we get A−1 v ≤ ε−1 v, which means the boundedness of A−1 .

v ∈ D(A−1 ),

9.3. Discreteness of spectrum in a bounded domain

201

Finally, let us prove that D(A−1 ) = H, i.e., Im A = H. We want to prove that if f ∈ H, then there exists u ∈ D(A), such that Au = f . But this means that u ∈ H1 ∩ D(A∗0 ) and A∗0 u = f ; i.e., (9.10)

(A0 ϕ, u) = (ϕ, f ),

ϕ ∈ D(A0 ).

Taking (9.8) into account, we may rewrite this in the form (9.11)

[ϕ, u] = (ϕ, f ),

ϕ ∈ D(A0 ).

Now, the Riesz theorem implies that for any f ∈ H there exists u ∈ H1 such that (9.11) holds. It remains to verify that u ∈ D(A∗0 ). But this is clear since, due to (9.7), we may rewrite (9.11) in the form (9.10), as required. Theorem 9.3 is proved. Now, let us pass to the model example. Let A0 be the operator −Δ on D(Ω), where Ω is a bounded domain in Rn . As we have already seen, the adjoint operator A∗0 maps u to −Δu for u ∈ L2 (Ω) and Δu ∈ L2 (Ω) (here Δ is understood in the distributional sense). Due to the Friedrichs inequality, A0 is positive; i.e., (9.5) holds. Let us construct the Friedrichs extension of A. This is an operator in L2 (Ω), such that A0 ⊂ A ⊂ A∗0 and ˚1 (Ω) ∩ D(A∗ ); D(A) = H 0 i.e., ˚1 (Ω), Δu ∈ L2 (Ω)}. D(A) = {u : u ∈ H By Theorem 9.3, it follows that A is selfadjoint and positive. It is usually called the selfadjoint operator in L2 (Ω) deﬁned by −Δ and the Dirichlet boundary condition u|∂Ω = 0. The operator A−1 is everywhere deﬁned and bounded.

9.3. Discreteness of spectrum for the Laplace operator in a bounded domain Let Ω be a bounded open subset in Rn . Theorem 9.4. The spectrum of the operator A deﬁned in L2 (Ω) by −Δ and the Dirichlet boundary condition is discrete. More precisely, there exists an ˚1 (Ω) orthonormal basis of eigenvectors ψj , j = 1, 2, . . . , such that ψj ∈ H 2 and −Δψj = λj ψj in L (Ω), where λj → +∞ as j → +∞. Proof. First, let us make a general remark on the Friedrichs extensions. Let A0 be a positive symmetric operator on a Hilbert space H, A its Friedrichs extension, and H1 the Hilbert space obtained as the completion of D(A0 ) with respect to the norm ϕ1 = (A0 ϕ, ϕ)1/2 . Since D(A) ⊂ H1 , it follows

202

9. Eigenvalues and eigenfunctions

that A−1 maps H to H1 . Let us prove that A−1 , considered as an operator from H to H1 , is continuous. The simplest way to deduce it is to use the closed graph theorem. Namely, verify that the graph of A−1 : H −→ H1 is closed; i.e., ϕk → ϕ in H and A−1 ϕk → f in H1 imply that A−1 ϕ = f . But, due to continuity of the embedding H1 ⊂ H, the above implies that A−1 ϕk → f in H, and the continuity of A−1 : H −→ H yields A−1 ϕk → A−1 ϕ in H; hence, A−1 ϕ = f , as required. ˚1 (Ω) and the embedding Let us return to our situation. We have H1 = H 1 2 ˚ operator H (Ω) ⊂ L (Ω) is not only continuous, but also compact. Therefore, A−1 : L2 (Ω) −→ L2 (Ω) as a composition of the continuous operator ˚1 (Ω) and the compact embedding H ˚1 (Ω) ⊂ L2 (Ω) is a A−1 : L2 (Ω) −→ H compact selfadjoint operator. By the Hilbert-Schmidt theorem, A−1 has an orthonormal eigenbasis ψ1 , ψ2 , . . . in L2 (Ω), and if μj are the corresponding eigenvalues, i.e., A−1 ψj = μj ψj , then μj → 0 as j → ∞. Note that 0 is not an eigenvalue of A−1 , since the deﬁnition of A−1 implies Ker A−1 = 0. Further, A−1 ψj = μj ψj may be rewritten as ψj = μj Aψj or Aψj = λj ψj , where λj = μ−1 j . The positivity of A implies that λj > 0, and condition μj → 0 yields λj → +∞ as j → +∞, as required.

9.4. Fundamental solution of the Helmholtz operator and the analyticity of eigenfunctions of the Laplace operator at the interior points. Bessel’s equation We would like to prove the analyticity of eigenfunctions of the Laplace operator at interior points of the domain. Since eigenvalues are positive, it suﬃces to prove analyticity of any solution u ∈ D (Ω) of the equation (9.12)

Δu + k 2 u = 0,

k > 0,

called the Helmholtz equation. First, let us ﬁnd explicitly a fundamental solution of the Helmholtz operator Δ+k 2 . Denote this fundamental solution (k) (k) by En (x). If it turns out that En (x) is analytic for x = 0, then any solution u ∈ D (Ω) of the equation (9.12) is analytic in Ω; cf. Theorem 5.12. Since Δ + k 2 commutes with rotations, it is natural to seek a spherically (k) symmetric fundamental solution. Let En (x) = f (r), where r = |x| for x = 0. Then f (r) for r > 0 satisﬁes the ordinary diﬀerential equation (9.13)

f (r) +

n−1 f (r) + k 2 f (r) = 0. r

(See how to compute Δf (r) in Example 4.10.) Clearly, to get a fundamental solution we should ﬁnd a solution f (r) with the same singularity as r → 0+ (0) as that of the fundamental solution En (x) = En (x) of the Laplace operator.

9.4. Fundamental solution of the Helmholtz operator

203

For instance, if it turns out that we have found a solution f (r) of the equation (9.13) such that f (r) = En (r)g(r), g ∈ C 2 ([0, +∞)), g(0) = 1, then the verbatim repetition of the veriﬁcation that ΔEn (x) = δ(x) yields (Δ + k 2 )f (|x|) = δ(x). We will reduce (9.13) to the Bessel equation. This may be done even for a more general equation α β 2 (9.14) f (r) + f (r) + k + 2 f (r) = 0, r r where α, β, k are arbitrary real (or even complex) constants. First, let us rewrite (9.14) in the form r2 f (r) + αrf (r) + (β 2 + k 2 r2 )f (r) = 0. Here the ﬁrst three terms of the left-hand side above constitute Euler’s operator d d2 L = r2 2 + αr + β, dr dr applied to f . The Euler operator commutes with the homotheties of the real line (i.e., with the transformations x → λx, with λ ∈ R \ 0, acting upon functions on R by the change of variables), and rκ for any κ ∈ C are its eigenfunctions. (It is also useful to notice that the change of variable r = et transforms Euler’s operator into an operator commuting with all translations with respect to t, i.e., into an operator with constant coeﬃcients.) Therefore, it is natural to make the following change of the unknown function in (9.14): f (r) = rκ z(r). We have

f (r) = rκ z (r) + κrκ−1 z(r), f (r) = rκ z (r) + 2κrκ−1 z (r) + κ(κ − 1)rκ−2 z(r). Substituting this into (9.14) and dividing by rκ , we get the following equation for z(r): & ' α + 2κ κ(κ + α − 1) + β 2 z (r) + k + z(r) = 0. (9.15) z (r) + r r2 Here the parameter κ is at our disposal. The ﬁrst idea is to set κ = −α/2 in order to cancel z (r). Then we get c (9.16) z (r) + k 2 + 2 z(r) = 0. r It is easy to derive from (9.16) that for k = 0 the solutions z(r) of this equation behave, as r → +∞, as solutions of the equation u (r) + k 2 u(r) = 0.

204

9. Eigenvalues and eigenfunctions

More precisely, let, for example, k > 0 (this case is most important to us). Then there exist solutions z1 (r) and z2 (r) of (9.16) whose asymptotics as r → +∞ are 1 1 (9.17) z1 (r) = sin kr + O , z2 (r) = cos kr + O . r r This can be proved by passing to an integral equation as we did when seeking asymptotics of eigenfunctions and eigenvalues of the Sturm-Liouville operator. Namely, let us rewrite (9.16) in the form z (r) + k 2 z(r) = −

c z(r) r2

and seek z(r) in the form z(r) = C1 (r) cos kr + C2 (r) sin kr. The variation of parameters method for functions Cj (r) yields C1 (r) cos kr + C2 (r) sin kr = 0, −kC1 (r) sin kr + kC2 (r) cos kr = − rc2 z(r), which imply

r

C1 (r) = A + r0

C2 (r) = B −

r r0

c sin kρ z(ρ)dρ, ρ2 k c cos kρ z(ρ)dρ. ρ2 k

We wish that C1 (r) → A and C2 (r) → B as r → +∞. But then it is natural to take r0 = +∞ and write

c ∞ 1 (9.18) z(r) = A cos kr + B sin kr + sin k(r − ρ)z(ρ)dρ, k r ρ2 which is an integral equation for z(r). Let us rewrite it in the form (I − T )z = A cos kr + B sin kr, where T is the integral operator mapping z(r) to the last summand in (9.18). Consider this equation in the space Cb ([r0 , ∞)) of continuous bounded functions on [r0 , ∞). Let z = supr≥r0 |z(r)|. It is clear that

c |c| ∞ 1 dρz = z. T z ≤ 2 |k| r0 ρ |k|r0 If r0 is suﬃciently large, then T < 1 and the integral equation (9.18) has a unique solution in Cb ([r0 , ∞)). But its continuous bounded solutions are

9.4. Fundamental solution of the Helmholtz operator

205

solutions of (9.16), and if z(r) is such a solution, then

∞ C 1 dρ = ; |z(r) − A cos kr − B sin kr| ≤ C 2 ρ r r i.e., (9.19)

1 z(r) = A cos kr + B sin kr + O r

as r → +∞.

Thus, for any A and B, the equation (9.16) has a solution with the asymptotic (9.19). In particular, there are solutions z1 (r) and z2 (r) with asymptotics (9.17). Clearly, these solutions are linearly independent. The asymptotic (9.19) may also be expressed in the form 1 , z(r) = A sin k(r − r0 ) + O r which makes it clear that if z(r) is real (for instance, if all constants are real), then z(r) has inﬁnitely many zeros with approximately the same behavior as that of the zeros of sin k(r − r0 ), as r → +∞. Now, return to (9.15) and choose the parameter κ so as to have α + 2κ = 1. Then the equation for z(r) becomes & ' 1 ν2 2 (9.20) z (r) + z (r) + k − 2 z(r) = 0, r r where ν 2 = κ2 − β. (In particular, we can take ν = ±κ in the case β = 0, which was our starting point; generally, ν may be not real.) Put r = k −1 x or x = kr. Introducing x as a new independent variable, we get instead of (9.20) the following equation for y(x) = z(k −1 x) (or z(x) = y(kx)): 1 ν2 (9.21) y (x) + y (x) + 1 − 2 y(x) = 0. x x (One should not confuse the real variable x > 0 here with the x ∈ Rn used earlier.) The equation (9.21) is called the Bessel equation and its solutions are called cylindric functions. Cylindric functions appear when we apply separation of variables (or the Fourier method) to ﬁnd eigenfunctions of the Laplace operator in a disc, and hence solutions of the Dirichlet problem in a cylinder (over a disc) of ﬁnite or inﬁnite height; this accounts for the term “cylindric functions”. Eliminating, as above, y (x) in (9.21), we see that the asymptotic of any cylindric function as x → +∞ has the form 1 A . (9.22) y(x) = √ sin(x − x0 ) + O x x

206

9. Eigenvalues and eigenfunctions

Incidentally, substituting all parameters, we see that (9.15) in our case (for α = 1, κ = −1/2, k = 1, β = −ν 2 ) takes the form 1 2 4 −ν z (r) + 1 + z(r) = 0, r2 which makes it clear that for ν = ± 12 the Bessel equation is explicitly solvable and the solutions are of the form coinciding with the ﬁrst term in (9.22); i.e., 1 y(x) = √ (A cos x + B sin x). x In particular, for n = 3 we can solve explicitly the equation (9.13) that arose from the Helmholtz equation. Namely, setting α = 2, β = 0, κ = −1, we see that (9.15) takes the form z + k 2 z = 0, implying that for n = 3 the spherically symmetric solutions of the Helmholtz equation are of the form 1 f (x) = (A cos kr + B sin kr), r We get the fundamental solution if A = −

eikr , 4πr

−

1 4π .

e−ikr , 4πr

r = |x|.

In particular, functions

−

cos kr 4πr

are fundamental solutions, and the function u = homogeneous equation

sin kr r

is a solution of the

(Δ + k 2 )u(x) = 0, x ∈ R3 . Let us return to the general equation (9.21). We should describe somehow the behavior of its solutions as x → 0+. The best method for this is to 2 try to ﬁnd the solution in the form of a series in x. Since xν 2 plays a more important role than 1 as x → 0+, we may expect that y(x) behaves at zero like a solution of the Euler equation obtained from (9.21) by removing the 1 from the coeﬃcient of y(x). But solutions of Euler’s equation are linear combinations of functions xσ and, perhaps, xσ ln x for appropriate σ. In our case, substituting xσ into Euler’s equation, we get σ = ±ν. Therefore, it is natural to expect that we may ﬁnd a pair of solutions of (9.21) that behave like x±ν as x → +0. This is, however, true up to some corrections only. For the time being, let us seek solutions y(x) of (9.21) in the form (9.23)

y(x) = xσ (a0 + a1 x + a2 x2 + · · · ) = a0 xσ + a1 xσ+1 + · · · .

9.4. Fundamental solution of the Helmholtz operator

207

Substituting this series in (9.21) and equating coeﬃcients of all powers of x to zero, we get a0 σ(σ − 1) + a0 σ − a0 ν 2 = 0, a1 (σ + 1) + a1 (σ + 1) − a1 ν 2 = 0, ··· ak (σ + k)(σ + k − 1) + ak (σ + k) + ak−2 − ak ν 2 = 0, k = 2, 3, . . . , or (9.24)

a0 (σ 2 − ν 2 ) = 0,

(9.25)

a1 [(σ + 1)2 − ν 2 ] = 0, ak [(σ + k)2 − ν 2 ] + ak−2 = 0,

(9.26)

k = 2, 3, . . . .

Without loss of generality, we may assume that a0 = 0 (for a0 = 0 we may replace σ by σ − k, where k is the number of the ﬁrst nonzero coeﬃcient ak and then reenumerate the coeﬃcients). But then (9.24) implies σ = ±ν. If ν is neither integer nor half-integer, then (9.25) and (9.26) imply a1 = a3 = · · · = 0, ak−2 ak−2 ak−2 ak = − =− . =− 2 2 (σ + k) − ν (σ + ν + k)(σ − ν + k) k(k + 2σ) If k = 2m, then this yields a2m = −a2m−2 Set a0 =

(−1)m a0 1 = · · · = . 22 m(m + σ) 22m m!(m + σ)(m + σ − 1) . . . (σ + 1)

1 2σ Γ(σ+1) .

Then, using the identity sΓ(s) = Γ(s + 1), we get a2m =

(−1)m . 22m+σ m!Γ(m + σ + 1)

The corresponding series for σ = ν is denoted by Jν (x) and is called the Bessel or cylindric function of the ﬁrst kind. Thus, ∞

x 2k+ν 1 (−1)k . (9.27) Jν (x) = k!Γ(k + ν + 1) 2 k=0

Clearly, this series converges for all complex x = 0. All these calculations also make sense for an integer or half-integer σ = ν if ν ≥ 0. We could assume that ν ≥ 0 without loss of generality, because only ν 2 enters the equation. But it makes sense to consider ν with Re ν < 0 in the formulas for the solutions, to get the second solution supplementing the one given by (9.27). (In other words, we could take σ = −ν, returning to (9.23).) For ν = −1, −2, . . ., the series (9.27) may be deﬁned by continuity, since Γ(z) has no zeros but only poles at z = 0, −1, −2, . . ..

208

9. Eigenvalues and eigenfunctions

So, the Bessel equation (9.21) for Re ν ≥ 0 always has a solution of the form Jν (x) = xν gν (x), where gν (x) is an entire analytic function of x, such 1

= 0. that gν (0) = Γ(ν+1) We will neither seek the second solution (in the cases when we did not do this yet) nor investigate other properties of cylindric functions. We will only remark that in reference books and textbooks on special functions, properties of cylindric functions are discussed with the same minute detail as the properties of trigonometric functions are discussed in textbooks on trigonometry. Cylindric functions are tabulated and are a part of standard mathematics software, such as Maple, Mathematica, and MATLAB. The same applies to many other special functions. Many of them are, as cylindric functions, solutions of certain second-order linear ordinary diﬀerential equations. Let us return to Helmholtz’s equation. We have seen that (9.13) reduces to Bessel’s equation. Indeed, it is of the form (9.14) with α = n − 1, β = 0. 2−n Let us pass to (9.15) and set κ = 1−α 2 = 2 . Then we get (9.20), where & ' 2−n n−2 2 n−2 n−2+ = , ν = 2 2 2 2

implying ν = ± n−2 2 . Therefore, in particular, we get a solution of (9.13) of the form f (r) = r

2−n 2

J n−2 (kr). 2

Actually, this solution does not have any singularities (it expands in a series in powers of r2 ). We may also consider the solution f (r) = r

2−n 2

J 2−n (kr). 2

If n is odd, then, multiplying this function by a constant, we may get a fundamental solution for Helmholtz’s operator (this solution is, actually, an elementary function). If n is even, then, together with J n−2 (x), we should 2 use the second solution of Bessel’s equation (it is called a cylindric function of the second kind or Neumann’s function and its expansion contains ln x as x → 0+). It can also be expressed in terms of J n−2 (x) via a quadrature 2 with the help of a well-known Wronskian trick (which gives for the second solution a ﬁrst-order linear diﬀerential equation). In any case, we see that (k) there exists a fundamental solution En (r), analytic in r for r = 0. This implies that all eigenfunctions of the Laplace operator are analytic at interior points of the domain.

9.5. Variational principle

209

Note that there is a more general theorem which guarantees the analyticity of any solution of an elliptic equation with analytic coeﬃcients. Solutions of an elliptic equation with smooth (C ∞ ) coeﬃcients are also smooth.

9.5. Variational principle. The behavior of eigenvalues under variation of the domain. Estimates of eigenvalues Let A be a selfadjoint semibounded from below operator in a separable Hilbert space H with discrete spectrum. Let ϕ1 , ϕ2 , . . . be a complete orthonormal system of eigenvectors of A with eigenvalues λ1 , λ2 , . . .; i.e., Aϕj = λj ϕj for j = 1, 2, . . .. The discreteness of the spectrum means that λj → +∞ as j → +∞. Therefore, we may assume that the eigenvalues increase with j; i.e., λ1 ≤ λ2 ≤ λ3 ≤ · · · . Instead of the set of eigenvalues, it is convenient to consider a (nonstrictly) increasing function N (λ) deﬁned for λ ∈ R as the number of eigenvalues not exceeding λ. This function takes only nonnegative integer values, is constant between eigenvalues, and at eigenvalues its jumps are equal to the multiplicities of these eigenvalues. It is continuous from the right; i.e., N (λ) = N (λ + 0). The eigenvalues λj are easily recovered from N (λ). Namely, if λ is a real number, then it is an eigenvalue of multiplicity N (λ + 0) − N (λ − 0). (If N (λ + 0) − N (λ − 0) = 0, then λ is not an eigenvalue.) If N (λ − 0) < j ≤ N (λ + 0) and j is a nonnegative integer, then λj = λ. In short, λj is uniquely deﬁned from the condition N (λj − 0) < j ≤ N (λj + 0). The following proposition is important. Proposition 9.5 (Glazman’s lemma). (9.28)

N (λ) = sup {dim L : (Au, u) ≤ λ(u, u) for all u ∈ L ⊂ D(A)} ,

where L ⊂ D(A) denotes that L is a ﬁnite-dimensional linear subspace of D(A). If A is the Friedrichs extension of A0 and λ is not an eigenvalue, then L ⊂ D(A) may be replaced by the stronger condition L ⊂ D(A0 ). Proof. Let Lλ be a ﬁnite-dimensional subspace in H spanned by all eigenvectors ϕj with eigenvalues λj ≤ λ. Clearly, Lλ ⊂ D(A) and A − λI has only nonpositive eigenvalues on Lλ and hence is nonpositive itself; i.e., (Au, u) ≤ λ(u, u) for all u ∈ Lλ . In particular, we may take L = Lλ in the right-hand side of (9.28). But dim Lλ = N (λ), implying that the left-hand side of (9.28) is not greater than the right-hand side. Now, let us verify that the left-hand side of (9.28) is not less than the right-hand side. Let L ⊂ D(A) and (Au, u) ≤ λ(u, u), u ∈ L. We wish to prove that dim L ≤ N (λ). Fix such a subspace L and set Mλ = L⊥ λ (the

210

9. Eigenvalues and eigenfunctions

orthogonal complement to Lλ ). Clearly, if u ∈ Mλ , then u can be expanded with respect to the eigenfunctions ϕj with the eigenvalues λj > λ. Therefore, if u ∈ D(A) and u = 0, then (Au, u) > λ(u, u) implying L ∩ Mλ = 0. Now, consider the projection Eλ : H −→ H onto Lλ parallel to Mλ ; i.e., if u = v + w, where v ∈ Lλ , w ∈ Mλ , then Eλ u = v. Clearly, Ker Eλ = Mλ . Now, consider the operator Eλ |L : L −→ Lλ . Since (Ker Eλ ) ∩ L = 0, then Eλ |L is a monomorphism. Therefore, dim L ≤ dim Lλ = N (λ), as required. It remains to verify the last statement of the proposition. Without loss of generality, we may assume that (Au, u) ≥ (u, u), for u ∈ D(A0 ). Let us make use of the Hilbert space H1 , the completion of D(A0 ) with respect to the norm deﬁned by the inner product [u, v] = (A0 u, v), where u, v ∈ D(A0 ). Instead of (Au, u) in (9.28), we may write [u, u]. Let ·1 be the norm in H1 . Let L be any ﬁnite-dimensional subspace of D(A). In particular, L ⊂ H1 . In L, choose an orthonormal (with respect to the inner product in H) basis e1 , . . . , ep and let vectors f1 , . . . , fp ∈ D(A0 ) be such that ej − fj 1 < δ, where δ > 0 is suﬃciently small. Let L1 be a subspace spanned by f1 , . . . , fp . p It is easy to see that dim L1 = dim L = p for a small δ. Indeed, if j=1 cj fj = 0 with cj ∈ C, then by taking the inner product with fk , we get p

cj (fj , fk ) = 0, k = 1, . . . , p.

j=1

But (fj , fk ) = δjk + εjk , where εjk → 0 as δ → 0. Therefore, the matrix (δjk + εjk ), being close to the identity matrix, is invertible for a small δ. Therefore, cj = 0 and f1 , . . . , fp are linearly independent; i.e., dim L1 = p. Further, if [u, u] ≤ λ(u, u) for all u ∈ L, then [v, v] ≤ (λ+ε)(v, v), v ∈ L1 , with ε = ε(δ), such that ε(δ) → 0 as δ → 0. Indeed, if v = pj=1 cj fj , then p u = j=1 cj ej is close to v with respect to · 1 , implying the required statement. Now, set N1 (λ) = sup {dim L1 : (Au, u) ≤ λ(u, u) for all u ∈ L1 ⊂ D(A0 )} . Then from the above argument, it follows that N (λ − ε) ≤ N1 (λ) ≤ N (λ) for any λ ∈ R and ε > 0. If λ is not an eigenvalue, then N (λ) is continuous at λ implying N1 (λ) = N (λ). Remark. Obviously, N (λ) can be recovered from N1 (λ); namely, N (λ) = N1 (λ + 0). Another description of the eigenvalues in terms of the quadratic form of A is sometimes more convenient. Namely, the following statement holds.

9.5. Variational principle

211

Proposition 9.6 (Courant’s variational principle). (9.29)

λ1 =

(Aϕ, ϕ) , ϕ∈D(A)\0 (ϕ, ϕ) min

(9.30) λj+1 =

max

L⊂D(A), dim L=j

(Aϕ, ϕ) , ϕ∈(L⊥ \0)∩D(A) (ϕ, ϕ) min

j = 1, 2, . . . .

If A is the Friedrichs extension of A0 , then, by replacing max with sup and min with inf, we may write ϕ ∈ D(A0 )\{0} and L ⊂ D(A0 ), instead of ϕ ∈ D(A)\{0} and L ⊂ D(A), respectively. Proof. We will use the same notation as in the proof of Proposition Let ∞ 9.5. 2 λ2 < c ϕ for ϕ ∈ D(A) (i.e., if |c | us prove (9.29) ﬁrst. If ϕ = ∞ j j=1 j j j=1 j ∞ 2 and (ϕ, ϕ) = 2 . Therefore, λ |c | |c | +∞), then (Aϕ, ϕ) = ∞ j=1 j j j=1 j (Aϕ, ϕ) =

∞

λj |cj | ≥ λ1 2

j=1

∞

|cj |2 = λ1 (ϕ, ϕ),

j=1

and the equality is attained at ϕ = ϕ1 . This proves (9.29). Now let us prove (9.30). Let Φj be the subspace spanned by ϕ1 , . . . , ϕj . Take L = Φj . We see that the right-hand side of (9.30) is not smaller than the left-hand side. Next we should verify that the right-hand side is not greater than the left-hand side, i.e., if L ⊂ D(A) and dim L = j, then there exists a nonzero vector ϕ ∈ L⊥ ∩ D(A) such that (Aϕ, ϕ) ≤ λj+1 (ϕ, ϕ). But, as in the proof of Proposition 9.5, it is easy to verify that Φj+1 ∩ L⊥ = 0 (if we had Φj+1 ∩ L⊥ = 0, then the orthogonal projection onto L would be an injective map of Φj+1 in L, contradicting the fact that dim Φj+1 = j +1 > j = dim L). Therefore, we may take ϕ ∈ Φj+1 ∩ L⊥ , ϕ = 0, and then it is clear that (Aϕ, ϕ) ≤ λj+1 (ϕ, ϕ). The last statement of Proposition 9.6 is proved as the last part of Proposition 9.5. Let us return to the eigenvalues of −Δ in a bounded domain Ω ⊂ Rn (with the Dirichlet boundary conditions). Denote them by λj (Ω), j = 1, 2, . . ., and their distribution function, N (λ), by NΩ (λ). In this case D(A0 ) = D(Ω). Now let Ω1 and Ω2 be two bounded domains such that Ω1 ⊂ Ω2 . Then D(Ω1 ) ⊂ D(Ω2 ), and Proposition 9.5 implies that (9.31)

NΩ1 (λ) ≤ NΩ2 (λ);

therefore, λj (Ω1 ) ≥ λj (Ω2 ),

j = 1, 2, . . . .

212

9. Eigenvalues and eigenfunctions

The eigenfunctions and eigenvalues may be found explicitly if Ω is a rectangular parallelepiped. Suppose Ω is of the form Ω = (0, a1 ) × · · · × (0, an ). It is easy to verify that the eigenfunctions can be taken in the form ϕk1 ,...,kn = Ck1 ,...,kn sin

k1 πx1 kn πxn · · · sin , a1 an

where k1 , . . . , kn are (strictly) positive integers and Ck1 ,...,kn are normalizing constants. The eigenvalues are of the form k1 π 2 kn π 2 +··· + . λk1 ,...,kn = a1 an The function NΩ (λ) in our case is equal to the number of√points of the form ( ka11π , . . . , kannπ ) ∈ Rn belonging to the closed ball of radius λ with the center at the origin. If the ki are allowed to be any integers, we see that points ( ka11π , . . . , kannπ ) run over a lattice in Rn obtained by obvious homotheties of an integer lattice. It is easy to see that NΩ (λ) for √ large λ has a two-sided estimate in terms of the volume of a ball of radius λ; i.e., C −1 λn/2 ≤ NΩ (λ) ≤ Cλn/2 ,

(9.32) where C > 0.

Now note that we may inscribe a small cube in Ω and, the other way around, embed Ω into a suﬃciently large cube. With (9.31) we see that (9.32) holds for any bounded domain Ω. Making these arguments more precise, it is possible to prove the following asymptotic formula by H. Weyl: NΩ (λ) ∼ (2π)−n ωn · mes(Ω) · λn/2 ,

as λ → +∞,

where ωn is the volume of the unit ball in Rn and mes(Ω) is the Lebesgue measure of Ω; see, e.g., Courant and Hilbert, Volume I of [5].

9.6. Problems 9.1. Let Ω be a bounded domain in Rn with a smooth boundary. Let L be a selfadjoint operator in L2 (Ω) determined by the Laplace operator in Ω with the Dirichlet boundary conditions. The Green’s function of the Laplace operator in Ω is the Schwartz kernel of L−1 , i.e., a distribution G(x, y), x, y ∈ Ω, such that

G(x, y)f (y)dy. L−1 f (x) = Ω

9.6. Problems

213

Prove that Δx G(x, y) = δ(x − y), G|x∈∂Ω = 0, and G is uniquely deﬁned by these conditions. Give a physical interpretation of the Green’s function. 9.2. Find out what kind of singularity the Green’s function G(x, y) has at x = y ∈ Ω. 9.3. Prove that the Green’s function is symmetric; i.e., G(x, y) = G(y, x),

x, y ∈ Ω.

9.4. Prove that the solution of the problem Δu = f , u|∂Ω = ϕ in the domain Ω is expressed via the Green’s function by the formula

∂G(x, y) G(x, y)f (y)dy − ϕ(y) dSy , u(x) = ∂ny Ω ∂Ω where ny is the exterior normal to the boundary at y and dSy is the area element of the boundary at y. 9.5. The Green’s function of the Laplace operator in an unbounded domain Ω ⊂ Rn is a function G(x, y), x, y ∈ Ω, such that Δx G(x, y) = δ(x − y), G|x∈∂Ω = 0, and G(x, y) → 0 as |x| → +∞ (for every ﬁxed y ∈ Ω) if n ≥ 3; G(x, y) is bounded for |x| → +∞ for every ﬁxed y ∈ Ω if n = 2. Find the Green’s function for the half-space xn ≥ 0. 9.6. Using the result of the above problem, write a formula for the solution of Dirichlet’s problem in the half-space xn ≥ 0 for n ≥ 3. Solve this Dirichlet problem also with the help of the Fourier transform with respect to x = (x1 , . . . , xn−1 ) and compare the results. 9.7. Find the Green’s function of a disc in R2 and a ball in Rn , n ≥ 3. 9.8. Write a formula for the solution of the Dirichlet problem in a disc and a ball. Obtain in this way the Poisson formula for n = 2 from Problem 8.2. 9.9. Find the Green’s function of a half-ball. 9.10. Find the Green’s function for a quarter of R3 , i.e., for {(x1 , x2 , x3 ) : x2 > 0, x3 > 0} ⊂ R3 .

214

9. Eigenvalues and eigenfunctions

9.11. The Bessel function Jν (x) is deﬁned for x > 0 and any ν ∈ C, by the series ∞

x 2k+ν (−1)k Jν (x) = . k!Γ(k + ν + 1) 2 k=0

Prove that if n ∈ Z+ , then J−n (x) = (−1)n Jn (x). 9.12. Prove that if n ∈ Z+ , then the Bessel equation 1 n2 y + y + 1− 2 y =0 x x has for a solution not only Jn (x) but also a function deﬁned by the series ∞ ck xk , c0 = 0. (ln x) · x−n k=0 x

1

9.13. Prove that the Laurent expansion of e 2 (t− t ) with respect to t (for t = 0, ∞) has the form ∞ x (t− 1t ) 2 = Jn (x)tn . e −∞

9.14. Prove that the expansion of

eix sin ϕ

into the Fourier series of ϕ is of

the form e

ix sin ϕ

= J0 (x) + 2

∞

[J2n (x) cos 2nϕ + iJ2n−1 (x) sin(2n − 1)ϕ].

n=1

9.15. Find eigenvalues and eigenfunctions of the Laplace operator in a rectangle. Prove that these eigenfunctions form a complete orthogonal system. 9.16. Let k ∈ Z+ and αk,1 < αk,2 < · · · be all the zeros of Jk (x) for x > 0. Prove that

R αk,n αk,m r Jk r rdr = 0, for m =

n. Jk R R 0 9.17. Using the result of Problem 9.16, ﬁnd a complete orthonormal system of eigenfunctions of the Laplace operator in a disc. 9.18. Describe a way to use the Fourier method to solve the Dirichlet problem in a straight cylinder (of ﬁnite height) with a disc as the base.

10.1090/gsm/205/10

Chapter 10

The wave equation

10.1. Physical problems leading to the wave equation There are many physical problems leading to the wave equation (10.1)

utt = a2 Δu,

where u = u(t, x), t ∈ R, x ∈ Rn , Δ is the Laplacian with respect to x. The following more general nonhomogeneous equation is also often encountered: (10.2)

utt = a2 Δu + f (t, x).

We have already seen that small oscillations of the string satisfy (10.1) (for n = 1), and in the presence of external forces they satisfy (10.2). It can be shown that small oscillations of a membrane satisfy similar equations, if we denote by u = u(t, x), for x ∈ R2 , the vertical displacement of the membrane from the (horizontal) equilibrium position. Similarly, under small oscillations of a gas (acoustic wave) its parameters (e.g., pressure, density, displacement of gas particles) satisfy (10.1) with n = 3. One of the most important ﬁelds, where equations of the form (10.1) and (10.2) play a leading role, is classical electrodynamics. Before explaining this in detail, let us recall some notation from vector analysis. (See, e.g., Ch. 10 in [27].) Let F = F(x) = (F1 (x), F2 (x), F3 (x)) be a vector ﬁeld deﬁned in an open set Ω ⊂ R3 . Here x = (x1 , x2 , x3 ) ∈ Ω, and we will assume that F is suﬃciently smooth, so that all derivatives of the components Fj , which we need, are continuous. It is usually enough to assume that F ∈ C 2 (Ω) (which means that every component Fj is in C 2 (Ω)). If f = f (x) is a suﬃciently 215

216

10. The wave equation

smooth function on Ω, then its gradient ∂f ∂f ∂f grad f = ∇f = , , ∂x1 ∂x2 ∂x3 is an example of a vector ﬁeld in Ω. Here it is convenient to consider ∇ as a vector ∂ ∂ ∂ , , ∇= ∂x1 ∂x2 ∂x3 whose components are operators acting, say, in C ∞ (Ω), or, as operators from C k+1 (Ω) to C k (Ω), k = 0, 1, . . . . The divergence of a vector ﬁeld F is a scalar function ∂F1 ∂F2 ∂F3 + + . div F = ∂x1 ∂x2 ∂x3 For a vector ﬁeld F in Ω its curl is another vector ﬁeld ∂F3 ∂F2 ∂F1 ∂F3 ∂F2 ∂F1 . − , − , − curl F = ∇ × F = ∂x2 ∂x3 ∂x3 ∂x1 ∂x1 ∂x2 Here the cross “×” in ∇×F stands for the cross product of two 3-component vectors with the product of ∂/∂xj and Fk interpreted as the derivative ∂Fk /∂xj . The following identities, which hold for every smooth scalar function f and vector ﬁeld F, are easy to verify: (10.3)

div grad f = ∇ · ∇f = Δf,

(10.4)

curl grad f = ∇ × ∇f = 0,

(10.5)

div curl F = 0,

(10.6)

curl(curl F) = ∇ × (∇ × F) = grad div F − ΔF,

where in the last term the Laplacian Δ is applied to every component of F. Vector analysis is easier to understand if we use the language of diﬀerential forms. To this end, with each vector ﬁeld F we associate two diﬀerential forms λF = F1 dx1 + F2 dx2 + F3 dx3 and ωF = F1 dx2 ∧ dx3 + F2 dx3 ∧ dx1 + F3 dx1 ∧ dx2 . It is easy to see that dλF = ωcurl F , which can serve as a deﬁnition of curl F. Also, dωF = div F dx1 ∧ dx2 ∧ dx3 . In particular, the identities (10.4) and (10.5) immediately follow from the well-known relation d2 = 0 for the de Rham diﬀerential d on diﬀerential forms.

10.1. Physical problems

217

In what follows we will deal with functions and vector ﬁelds which additionally depend upon the time variable t. In this case all operations grad, div, curl should be applied with respect to the space variables x (for every ﬁxed t). The equations of classical electrodynamics (Maxwell’s equations) are

(M.3)

ρ , ε0 ∂B , curl E = − ∂t div B = 0,

(M.4)

c2 curl B =

(M.1) (M.2)

div E =

j ∂E . + ε0 ∂t

Here E, B are vectors of electric and magnetic ﬁelds, respectively (threedimensional vectors depending on t and x ∈ R3 ), ρ a scalar function of t and x (density of the electric charge), j a three-dimensional vector of density of the electric current, also depending on t and x (if charges with density ρ at a given point and time move with velocity v, then j = ρv). The numbers c, ε0 are universal constants depending on the choice of units: the speed of light and the dielectric permeability of vacuum. For applications of the Maxwell equations it is necessary to complement them with the formula for the Lorentz force acting on a moving charge. This force is of the form (M.5)

F = q(E + v × B),

where q is the charge, v its velocity. Formula (M.5) may be used for experimental measurement of E and B (or, say, to deﬁne them as physical quantities). The discussion of experimental facts on which equations (M.1)–(M.5) are based may be found in textbooks on physics (see, e.g., Feynmann, Leighton, and Sands [8, vol. 2, Ch. 18]). Let us rewrite the Maxwell equations, introducing scalar and vector potentials ϕ and A. For simplicity assume that the ﬁelds E and B are deﬁned and suﬃciently smooth in the whole space R4t,x . By the Poincar´e lemma, it follows from (M.3) that B = curl A, where A is a vector function of t and x determined up to a ﬁeld A0 such that curl A0 = 0; hence A0 = grad ψ, where ψ is a scalar function. Substituting B = curl A into (M.2), e lemma again, we see we get curl(E + ∂A ∂t ) = 0. Hence, using the Poincar´ ∂A that E + ∂t = − grad ϕ, where ϕ is a scalar function. Therefore, instead of

218

10. The wave equation

(M.2) and (M.3), we may write (M.6)

B = curl A,

∂A . ∂t The vector function A is called the vector potential, and the scalar function ϕ is called the scalar potential. Note that equations (M.6) and (M.7) do not uniquely determine potentials A and ϕ. Without aﬀecting E and B, we may replace A and ϕ by E = − grad ϕ −

(M.7)

∂ψ ∂t (this transformation of potentials is called a gauge transformation). Transformations (M.8) may be used to get simpler equations for potentials. For instance, using (M.8) we may get any value of A = A + grad ψ, ϕ = ϕ −

(M.8)

div A = div A + div grad ψ = div A + Δψ, since solving Poisson’s equation Δψ = α(t, x) we may get any value of Δψ, hence, of div A. Let us derive equations for A and ϕ using two remaining Maxwell equations. Substituting the expression for E in terms of the potentials into (M.1), we get −Δϕ −

(M.9)

ρ ∂ div A = . ∂t ε0

Substituting E and B in (M.4), we get ∂A j ∂ 2 c curl curl A − − grad ϕ − = . ∂t ∂t ε0 Using (10.6), we obtain ∂ 2A ∂ j grad ϕ + = . 2 ∂t ∂t ε0 Now, choose A with the help of the gauge transformation such that (M.10)

−c2 ΔA + c2 grad(div A) +

1 ∂ϕ . c2 ∂t More precisely, ﬁrst let some potentials ϕ and A be given. We want to ﬁnd the function ψ in the gauge transformation (M.8) so that ϕ and A satisfy (M.11). This yields for ψ an equation of the form

(M.11)

div A = −

1 ∂ 2ψ − Δψ = θ(t, x), c2 ∂t2 where θ(t, x) is a known function. This is a nonhomogeneous wave equation. Suppose that we have solved this equation and, ﬁnding ψ, obtained ϕ and (M.12)

10.1. Physical problems

219

A satisfying (M.11). Then the second and the third terms in (M.10) cancel and we get (M.13)

∂2A j − c2 ΔA = , ∂t2 ε0

and (M.9) takes the form (M.14)

c2 ρ ∂ 2ϕ 2 − c Δϕ = . ∂t2 ε0

Therefore, we may assume that ϕ and A satisfy nonhomogeneous wave equations (M.13), (M.14), which, together with (M.11), are equivalent to the Maxwell equations. The gauge condition (M.11) is called the Lorentz gauge condition. Denote by the wave operator (or d’Alembertian) (M.15)

=

∂2 − c2 Δ ∂t2

and let −1 be the convolution with a fundamental solution of this operator. Then −1 commutes with diﬀerentiations. Suppose −1 j and −1 ρ make 2 sense. Then the potentials A = −1 εj0 and ϕ = −1 cε0ρ satisfy (M.13) and (M.14). Will the Lorentz gauge condition be satisﬁed for this particular choice of the potentials? Substituting the found A and ϕ into (M.11) we see that this condition is satisﬁed if and only if (M.16)

∂ρ + div j = 0. ∂t

But this condition means the conservation of charge. Therefore, if the conservation of charge is observed, then a solution of the Maxwell equations may be found if we can solve a nonhomogeneous wave equation. Reducing Maxwell’s equations to the form (M.13), (M.14) shows that Maxwell’s equations are invariant with respect to Lorentz transformations (linear transformations of the space-time R4t,x preserving the Minkowski metric c2 dt2 − dx2 − dy 2 − dz 2 ). Moreover, they are invariant under the transformations from the Poincar´e group, which is generated by the Lorentz transformations and translations. In the empty space (in the absence of currents and charges) the potentials ϕ and A satisfy the wave equation of the form (10.1) for n = 3 and a = c. This implies that all components of E and B satisfy the same equation.

220

10. The wave equation

10.2. Plane, spherical, and cylindric waves A plane wave is a solution of (10.1) which, for a ﬁxed t, is constant on each hyperplane which is parallel to a given one. By a rotation in x-space we may make this family of planes to be of the form x1 = const. Then a plane wave is a solution of (10.1) depending only on t and x1 . But then it is a solution of the one-dimensional wave equation utt = a2 ux1 x1 ; i.e., it is of the form u(t, x) = f (x1 − at) + g(x1 + at), where f, g are arbitrary functions. Before the rotation, each of the summands had clearly been of the form f (r · x − at), where r ∈ Rn , |r| = 1. Another form of expression is f (k · x − ωt), where k ∈ Rn , but may have an arbitrary length. Such a form is convenient to make the argument of f dimensionless and is most often used in physics and mechanics. If the dimension of t is [second] and that of x is [meter], then that of ω is [sec−1 ] and that of k is [m−1 ]. Substituting f (k · x − ωt) into (10.1), we see that ω 2 = a2 |k|2 or a = ω |k| . Clearly, the speed of the plane wave front (the plane where f takes a prescribed value) is equal to a and the equation of such a front is k · x − ωt = const, where k and ω satisfy ω 2 − a2 |k|2 = 0. An important example of a plane wave is ei(k·x−ωt) . (Up to now we only considered real solutions in this section, and we will do this in the future, but we write this exponent assuming that either the real or imaginary part is considered. They will also be solutions of the wave equation.) In this case each ﬁxed point x oscillates sine-like with frequency ω and for a ﬁxed t the wave depends sine-like on k · x, so that it varies sine-like in the direction of k with the speed |k|. The vector k is often called the wave vector. The relation ω 2 = a2 |k|2 is the dispersion law for the considered waves (in physics more general dispersion laws of the form ω = ω(k) are encountered). Many solutions of the wave equation may be actually obtained as a superposition of plane waves with diﬀerent k. We will, however, obtain the most important types of such waves directly. In what follows, we will assume that n = 3 and will study spherical waves, solutions of (10.1) depending only on t and r, where r = |x|. Thus, let u = u(t, r), where r = |x|. Then (10.1) can be rewritten in the form (10.7)

2 1 utt = urr + ur . 2 a r

Let us multiply (10.7) by r and use the identity rurr + 2ur =

∂2 (ru). ∂r2

10.2. Plane, spherical, and cylindric waves

221

Then, clearly, (10.7) becomes 1 (ru)tt = (ru)rr , a2 implying ru(t, r) = f (r − at) + g(r + at) and (10.8)

u(t, r) =

f (r − at) g(r + at) + . r r

This is the general form of spherical waves. The wave r−1 f (r − at) diverges from the origin 0 ∈ R3 , as from its source, and the wave r−1 g(r + at), conversely, converges to the origin (coming from inﬁnity). In electrodynamics one usually considers the wave coming out of a source only, discarding the second summand in (10.8) by physical considerations. In this case the function f (r) characterizes properties of the source. The front of the emanating spherical wave is the sphere r − at = const. It is clear that the speed of the front is a, as before. Let us pass to cylindric waves—solutions of (10.1) depending only on t and the distance to the x3 -axis. A cylindric wave only depends, actually, on t, x1 , x2 (and even only on t and ρ = x21 + x22 ) so that it is a solution of the wave equation (10.1) with n = 2. It is convenient, however, to consider it as a solution of the wave equation with n = 3, but independent of x3 . One of the methods to construct cylindric waves is as follows: take a superposition of identical spherical waves with sources at all points of the x3 -axis (or converging to all points of the x3 -axis). Let e3 be the unit vector of the x3 -axis. Then we get a cylindric wave, by setting

∞

∞ f (|x − ze3 | − at) g(|x − ze3 | + at) dz + dz. (10.9) v(t, x) = |x − ze3 | |x − ze3 | −∞ −∞ Set r = |x − ze3 | =

ρ2 + (x3 − z)2 , ρ =

x21 + x22 .

Clearly, v(t, x) does not depend on x3 and only depends on t and ρ. Therefore, we may assume that x3 = 0 in (10.9). Then the integrands are even functions of z, and it suﬃces to calculate the integrals over (0, +∞). For the variable of integration take r instead of z, so that dr =

z dz, r

dz =

r r dr = dr. z r2 − ρ2

Since r runs from ρ to ∞ (for z ∈ [0, ∞)), we get

∞

∞ f (r − at) g(r + at) dr + 2 dr v(t, ρ) = 2 2 2 r −ρ r2 − ρ2 ρ ρ

222

10. The wave equation

or

(10.10)

∞

v(t, ρ) = 2 ρ−at

f (ξ)dξ (ξ +

at)2

− ρ2

∞

+2 ρ+at

g(ξ)dξ (ξ − at)2 − ρ2

.

All this makes sense when the integrals converge. For instance, if f, g are continuous functions of ξ, then for the convergence it suﬃces that

∞

∞ |f (ξ)|dξ |g(ξ)|dξ < +∞, < +∞ for M > 0. |ξ| |ξ| M M It can be shown that (10.10) is the general form of the cylindric waves. Let us brieﬂy describe another method of constructing cylindric waves. Seeking a cylindric wave of the form v(t, ρ) = eiωt f (ρ), we easily see that f (ρ) should satisfy the equation 1 ω f + f + k 2 f = 0, k = , ρ a which reduces to the Bessel equation with ν = 0 (see Chapter 9). Then we can take a superposition of such waves (they are called monochromatic waves) by integrating over ω.

10.3. The wave equation as a Hamiltonian system In Section 2.1 we have already discussed for n = 1 the possibility of expressing the wave equation as a Lagrange equation of a system with an inﬁnite number of degrees of freedom. Now, let us brieﬂy perform the same for n = 3, introducing the corresponding quanitites by analogy with the one-dimensional case. Besides, we will discuss a possibility to pass to the Hamiltonian formalism. For simplicity, we will only consider real-valued functions u(t, x), which are smooth functions of t with values in S(R3x ); i.e., we will analyze the equation (10.1) in the class of rapidly decaying functions with respect to x. Here temporarily we will denote by S(R3x ) the class of real-valued functions satisfying the same restrictions as in the deﬁnition of S(R3 ) which was previously formulated. We prefer to do this rather than complicate notation, and we hope that this will not lead to confusion. All other functions in this section will be also assumed to be real valued. Let us introduce terminology similar to one used in classical mechanics. The space M = S(R3x ) will be referred to as the conﬁguration space. An element u = u(x) ∈ S(R3x ) may be considered as a “set of coordinates” of the three-dimensional system, where x is the “label” of the coordinate and u(x) is the coordinate itself with the index x. The tangent bundle on M is the direct product T M = S(R3x ) × S(R3x ). As usual, the set of pairs {(u0 , v) ∈ T M } for a ﬁxed u0 ∈ M is denoted by T Mu0 and is called the tangent space at u0 (this space is canonically isomorphic to M as in the case

10.3. The wave equation as a Hamiltonian system

223

when M is a ﬁnite-dimensional vector space). Elements of T M are called tangent vectors. A path in M is a function u(t, x) , t ∈ (a, b), inﬁnitely diﬀerentiable with respect to t with values in S(R3x ). The velocity of u(t, x) at t0 is the tangent ˙ 0 , x)}, where u˙ = ∂u vector {u(t0 , x), u(t ∂t . The kinetic energy is the function K on the tangent bundle introduced by the formula

1 K({u, v}) = v 2 (x)dx. 2 R3 Given a path u(t, x) in M then, taking for each t a tangent vector {u, u} ˙ and the value of the kinetic energy at it, we get the function of time

1 [u(t, ˙ x)]2 dx, K(u) = 2 R3 which we will call the kinetic energy along the path u. The potential energy is the following function on M :

a2 |ux (x)|2 dx, U (u) = 2 R3

∂u ∂u ∂u . , , where ux = grad u(x) = ∂x 1 ∂x2 ∂x3 If there is a path in M , then the potential energy along this path is a function of time. With the help of the canonical projection T M −→ M , the function U is lifted to a function on T M , again denoted by U . Therefore, U ({u, v)} = U (u). The Lagrangian or the Lagrange function is the function L = K − U on T M . The action along the path u(t, x), t ∈ [t0 , t1 ], is the integral along this path:

t1 Ldt, S = S[u] = t0

where 1 L = L({u(t, x), u(t, ˙ x)}) = 2

a2 [u(t, ˙ x)] dx − 2 R3

2

R3

|ux (x)|2 dx.

The wave equation (10.1) may be expressed in the form δS = 0, where δS is the variation (or the diﬀerential) of the functional S taken along paths u(t, x) with ﬁxed source u(t0 , x) and the target u(t1 , x). Indeed, if δu(t, x) is an admissible variation of the path, i.e., a smooth function of t ∈ [t0 , t1 ] with values in S(R3x ), such that δu(t0 , x) = δu(t1 , x) = 0, then for such a

224

10. The wave equation

variation of the path, δS takes the form

t1

t1

d δS = S[u + ε(δu)]|ε=0 = u(δ ˙ u)dxdt ˙ − a2 ux · δux dxdt dε t0 R3 t0 R3

2 ∂ u 2 δudxdt + a (Δu)δudxdt = (−u)δudxdt, = − ∂t2 where =

∂2 ∂t2

− a2 Δx is the d’Alembertian.

It is clear now that δS = 0 is equivalent to u = 0, i.e., to the wave equation (10.1). To pass to the Hamiltonian formalism, we should now introduce the cotangent bundle T ∗ M . But in our case each T Mu0 is endowed with the inner product determined by the quadratic form 2K which makes it natural to set T ∗ Mu0 = T Mu0 and T ∗ M = T M . The Hamiltonian or energy is the following function on T M :

1 a2 2 (10.11) H = H({u, v}) = K + U = v (x)dx + |ux (x)|2 dx. 2 R3 2 R3 Given a path u(t, x), we see that H({u, u}) ˙ is a function of time along this path. Proposition 10.1 (The energy conservation law). Energy is constant along any path u(t, x) satisfying u = 0. Proof. Along the path u(t, x), we have

dH 2 2 = u¨ ˙ udx + a ux · u˙ x dx = u¨ ˙ udx − a Δu · udx ˙ dt R3 R3 R3 R3

= ( u) · udx ˙ = 0, R3

as required. Corollary 10.2. For any ϕ, ψ ∈ S(R3x ) there exists at most one path u(t, x) satisfying u = 0 and initial conditions u(t0 , x) = ϕ(x),

u(t ˙ 0 , x) = ψ(x).

Proof. It suﬃces to verify that if ϕ ≡ ψ ≡ 0, then u ≡ 0. But this follows immediately from H = const = 0 along this path. The Hamiltonian expression of equations uses also the symplectic form on T ∗ M . To explain and motivate further calculations, let us start with a toy model case of a particle which moves along the line R in a potential ﬁeld given by a potential energy U = U (x) (see the appendix in Chapter 2). Here M = R, T M = R × R, T ∗ M = R × R, the equation of motion is

10.3. The wave equation as a Hamiltonian system

225

m¨ x = −U (x), and the energy is H(x, x) ˙ = 12 mx˙ 2 + U (x). Introducing the momentum p = mx˙ and using q instead of x (as physicists do) we can rewrite the Hamiltonian in the form p2 (10.12) H(q, p) = + U (q) 2m and then the equation of motion is equivalent to a Hamiltonian system q˙ = ∂H ∂p , (10.13) p˙ = − ∂H ∂q . Let us agree that when we talk about a point in coordinates q, p, then it is understood that we mean a point in T ∗ M , and we will simply write {q, p} ∈ T ∗ M . Another important object (in this toy model) is the symplectic 2-form ω = dp ∧ dq on T ∗ M . It can also be considered as a skew-symmetric form on each tangent space Tq,p (T ∗ M ), which in this case can be identiﬁed with the set of 4-tuples {q, p, Xq , Xp }, where all entries are reals and {Xq , Xp } are the natural coordinates along the ﬁber of T (T ∗ M ) over the point {q, p}: ω({q, p, Xq , Xp }, {q, p, Yq , Yp }) = Xp Yq − Xq Yp . Now with any vector ﬁeld X on T ∗ M we can associate a 1-form αX on T ∗ M given by the relation (10.14)

αX (Y ) = ω(Y, X).

Note that the form ω is nondegenerate in the following sense: if X is a tangent vector to T ∗ M at z ∈ T ∗ M , such that ω(X, Y ) = 0 for all Y ∈ Tz (T ∗ M ), then X = 0. Therefore, the vector ﬁeld X is uniquely deﬁned by its 1-form αX . So we will denote by Xα the vector ﬁeld corresponding to any 1-form α on T ∗ M ; i.e., the relations α = αX and X = Xα are equivalent. Let us try to ﬁnd X = XdH where H is a smooth function on T ∗ M (a Hamiltonian). By deﬁnition we should have for any vector ﬁeld Y on T ∗ M ∂H ∂H Yq + Yp = Xq Yp − Xp Yq ; ∂q ∂p hence

∂H ∂H , Xp = − , ∂p ∂q so XdH is the vector ﬁeld corresponding to the Hamiltonian system (10.13). Note that this argument works for any smooth Hamiltonian H, not only for the speciﬁc one given by (10.12). Xq =

The arguments and calculations above can be easily extended to the case of motion in M = Rn . In this case we should allow x, p, q to be vectors in Rn , x = (x1 , . . . , xn ), p = (p1 , . . . , pn ), q = (q1 , . . . , qn ), the potential energy

226

10. The wave equation

U = U (x) to be a smooth scalar (real-valued) function on Rn ; then take the Hamiltonian n p2j + U (q), (10.15) H(q, p) = 2mj j=1

where q = x. Here H(q, p) should be considered as a function on T ∗ M = R2n . Now the equations of motion ∂U mj x ¨j = − , j = 1, . . . , n, ∂xj again can be rewritten as a Hamiltonian system (10.13), where ∂H/∂p and ∂H/∂q should be understood as gradients with respect to variables p and q, respectively; for example, ∂H ∂H ∂H = ,..., . ∂p ∂p1 ∂pn Instead of the form dp ∧ dq we should take the 2-form n dpj ∧ dqj . (10.16) ω= j=1

which is usually called the canonical symplectic form on T ∗ M . It is nondegenerate in the sense explained above. Therefore, (10.14) again deﬁnes a one-to-one correspondence between vector ﬁelds and 1-forms on T ∗ M . Then again it is easy to check that the 1-form dH corresponds to the vector ﬁeld which generates the Hamiltonian system (10.13). The reader may wonder why we need fancy notation like M , T ∗ M for the usual vector spaces Rn , R2n . The point is that in fact M can be an arbitrary smooth manifold. In fact, choosing local coordinates x = (x1 , . . . , xn ) in an open set G ⊂ M , we can deﬁne natural coordinates q, p in the cotangent bundle T ∗ M , so that q, p represent the cotangent vector pdq = p1 dq1 + · · · + pn dqn at q = x. Then the 1-form pdq is well-deﬁned on M ; i.e., it does not depend on the choice of local coordinates. (We leave the proof to the reader as an easy exercise.) Therefore, the 2-form ω = d(pdq) is also well-deﬁned on M . In an even more general context, a 2-form ω on a manifold Z is called symplectic if it is closed and nondegenerate. Then the same machinery works. So, starting with a Hamiltonian H (a smooth function on Z), we can take the 1-form dH and the corresponding vector ﬁeld XH . The system of ODE (10.17)

z˙ = XH (z)

is called a Hamiltonian system with the Hamiltonian H. Locally, it is nothing new: by the Darboux theorem we can choose local coordinates so that ω is

10.3. The wave equation as a Hamiltonian system

227

given by (10.16) in these coordinates. Therefore, the Hamiltonian system (10.17) has the canonical form (10.13). More details on symplectic structure, Hamiltonian systems, and related topics can be found in [3]. Now let us make a leap to inﬁnite-dimensional spaces; namely, let us return to M = S(R3x ). The symplectic form should be deﬁned on each tangent space to T ∗ M at a ﬁxed point. In our case this tangent space is naturally identiﬁed with T ∗ M = T M itself, so we have T (T ∗ M ) = T ∗ M × T ∗ M = T M × T M . An element of T (T ∗ M ) should be expressed as a 4-tuple {u, v, δu, δv} consisting of functions belonging to S(Rn ). The symplectic form itself depends only on δu, δv. For brevity let us set δu = α , δv = β. Then deﬁne the symplectic form to be (10.18)

[{u, v, α, β}, {u, v, α1 , β1 }] = [{α, β}, {α1 , β1 }] =

(αβ1 − βα1 )dx. R3 n j=1 dpj ∧ dqj , where

(This is an analogue of the canonical 2-form ω = summation is replaced by integration, though we use brackets instead of ω to simplify notation.)

Choosing an arbitrary {u, v} ∈ T ∗ M , let us take a variation of H = H({u, v}) (which replaces the diﬀerential in the inﬁnite-dimensional case) in the direction of a vector {u, v, δu, δv} ∈ T (T ∗ M ). In our case u, v, δu, δv are arbitrary functions from S(R3x ). We obtain, diﬀerentiating the expression of H in (10.11) and integrating by parts,

2 δH · δux + v δv dx a δH(δu, δv) = δux R3

(−a2 Δu δu + v δv)dx = (α δu + β δv)dx, = R3

R3

where α = −a2 Δu and β = v. Now, if we take another vector Y = {u, v, α1 , β1 } ∈ T (T ∗ M ) (with the same u, v) and denote by XH the Hamiltonian vector ﬁeld with the Hamiltonian H, then we should have [Y, XH ] = δH(Y ), which gives

2 (α1 β − β1 α)dx = (α1 v + β1 a Δu)dx = (α1 Xα + β1 Xβ )dx, R3

R3

R3

where {u, v, Xα , Xβ } is the vector of the Hamiltonian vector ﬁeld on T ∗ M . Since α1 and β1 are arbitrary, we should have Xα = v, Xβ = a2 Δu, and the corresponding diﬀerential equation on T ∗ M has the form u˙ = v, v˙ = a2 Δu,

228

10. The wave equation

which is equivalent to the wave equation (10.1) for u. So we established that the wave equation can be rewritten as a Hamiltonian system with the Hamiltonian (10.11) and the symplectic form given by (10.18). Later we will use the fact that the ﬂow of our Hamiltonian system preserves the symplectic form (or, as one says, the transformations in this ﬂow are canonical). This is true for general Hamiltonian systems (see, e.g., [3]) but we will not prove this. Instead we will directly establish the corresponding analytical fact for the wave equation. Proposition 10.3. Let u(t, x) and v(t, x) be two paths. Consider the following function of t:

(uv˙ − uv)dx. ˙ [u, v] = R3

If u = v = 0, then [u, v] = const. Proof. We have

d d [u, v] = (uv˙ − uv)dx ˙ = (u¨ v−u ¨v)dx dt R3 dt R3

2 (u · Δv − Δu · v)dx = 0, =a R3

as required.

10.4. A spherical wave caused by an instant ﬂash and a solution of the Cauchy problem for the three-dimensional wave equation Let us return to spherical waves and consider the diverging wave (10.19)

f (r − at) , r

r = |x|.

The function f (−at) characterizes the intensity of the source (located at x = 0) at the moment t. It is interesting to take the wave emitted by an instant ﬂash at the source. This corresponds to f (ξ) = δ(ξ). We get the wave δ(r − at) δ(r − at) = , r at whose meaning is to be clariﬁed.

(10.20)

r = |x|,

It is possible to make sense of (10.20) in many equivalent ways. We will consider it as a distribution with respect to x depending on t as a parameter. First, consider the wave (10.19) and set f = fk , where fk (ξ) ∈ C0∞ (R1 ), fk (ξ) ≥ 0, fk (ξ) = 0 for |ξ| ≥ 1/k and fk (ξ)dξ = 1, so that fk (ξ) → δ(ξ)

10.4. Spherical waves and the Cauchy problem

229

as k → +∞. Now, set δ(r − at) = lim fk (r − at),

(10.21)

k→∞

D (R3 )

if this limit exists in (or, equivalently, in E (R3 ), since supports of all functions fk (r − at) belong to a ﬁxed compact provided t belongs to a ﬁxed ﬁnite interval of R). Clearly, this limit exists for t < 0 and is 0; i.e., δ(r − at) = 0 for t < 0. We will see that the limit exists also for t > 0. Then this limit is the distribution δ(r − at) ∈ E (R3 ) and, obviously, supp δ(r − at) = {x : |x| = at}. Since 1/r = 1/|x| is a C ∞ -function in a neighborhood of supp δ(r − at) for t = 0, then the distribution δ(r−at) is well-deﬁned and r fk (r − at) δ(r − at) = lim , t = 0. k→∞ r r Let us prove the existence of the limit (10.21) for t > 0 and compute it. Let ϕ ∈ D(R3 ). Let us express the integral fk (r − at), ϕ in polar coordinates

fk (|x| − at)ϕ(x)dx fk (r − at), ϕ = R3 >

=

(10.22)

∞

= |x|=r

0

∞

=

fk (r − at)ϕ(x)dSr =

fk (r − at)

0

dr >

|x|=r

ϕ(x)dSr

dr,

where dSr is the area element of the sphere of radius r. Clearly, |x|=r ϕ(x)dSr is an inﬁnitely diﬀerentiable function of r for r > 0. Since lim fk (ξ) = δ(ξ), k→∞

we obviously get

lim fk (r − at), ϕ =

k→∞

|x|=at

ϕ(x)dSat

and, therefore, the limit of (10.21) exists and

δ(r − at), ϕ = ϕ(x)dSat . |x|=at

Passing to the integration over the unit sphere we get

2 2 ϕ((at)x )dS1 , δ(r − at), ϕ = a t |x |=1

implying (10.23)

δ(r − at) ,ϕ r

= at

|x |=1

ϕ((at)x )dS1 .

230

10. The wave equation

It is worthwhile to study the dependence on the parameter t. It is clear from (10.23) that (10.24)

lim

t→0

δ(r − at) =0 r

|t=0 = 0. Finally, it is obvious from (10.23) and, by continuity, we set δ(r−at) r δ(r−at) that is inﬁnitely diﬀerentiable with respect to t for t > 0 (in the weak r topology). Derivatives with respect to t are easy to compute. For instance, d δ(r − at) ∂ δ(r − at) ,ϕ = ,ϕ (10.25) ∂t r dt r

3

∂ϕ 2 = a ϕ((at)x )dS1 + a t xj ((at)x )dS1 . ∂x j |x |=1 |x |=1 j=1

This, in particular, implies that lim

t→+0

∂ δ(r − at) = 4πaδ(x) ∂t r

(the ﬁrst of the integrals in (10.25) tends to 4πaϕ(0) and the second one to 0). Finally, by (10.22) it is clear that Therefore, the distribution with the initial conditions (10.26)

δ(r − at) = 0, t > 0. r

δ(r−at) r

is a solution of the wave equation for t > 0

∂ δ(r − at) δ(r − at) = 0, = 4πaδ(x). r ∂t r t=0 t=0

is similarly constructed. It is equal to 0 for The converging wave δ(r+at) r t > 0, and for t < 0 it satisﬁes the wave equation with the initial conditions δ(r + at) ∂ δ(r + at) (10.27) = 0, = −4πaδ(x) r ∂t r t=0 t=0 is obtained from δ(r−at) by re(this is clear, for instance, because δ(r+at) r r placing t with −t). Besides, we can perform translations with respect to the spatial and time variables and consider waves δ(|x − x0 | − a(t − t0 )) , |x − x0 |

δ(|x − x0 | + a(t − t0 )) , |x − x0 |

which are also solutions of the wave equation with initial values (for t = t0 ) obtained by replacing δ(x) with δ(x − x0 ) in (10.26) and (10.27).

10.4. Spherical waves and the Cauchy problem

231

Now, we want to apply the invariance of the symplectic form (Proposition 10.3) to an arbitrary solution u(t, x) of the equation u = 0 and to the converging wave δ(|x − x0 | + a(t − t0 )) v(t, x) = . |x − x0 | We will assume that u ∈ C 1 ([0, t0 ] × R3x ) and u = 0 holds (where is understood in the distributional sense). Then we can deﬁne

∂ ∂ v(t, x), u(t, x) − v(t, x), u(t, x) , (uv˙ − uv)dx ˙ = [u, v] = ∂t ∂t R3 where ·, · denotes the value of the distribution of the variable x at a test function (t is considered as a parameter) which makes sense for u ∈ C 1 (not only for u ∈ C0∞ ) due to explicit formulas (10.23), (10.25). Now, we use [u, v] = const, which holds for u ∈ C 2 and is proved similarly to Proposition 10.3 or by approximating u and v by smooth solutions of the wave equation with compact (with respect to x) supports (this can be performed, for instance, with the help of mollifying). Actually, we only have to verd [u, v] = [u, ˙ v] + [u, v], ˙ which is clear, for example, from explicit ify that dt formulas (10.23), (10.25) which deﬁne the functionals v and v. ˙ Now, let us explicitly write the relation (10.28)

[u, v]|t=0 = [u, v]|t=t0 . Set u|t=0 = ϕ(x), ∂u ∂t t=0 = ψ(x). Then ∂ δ(|x − x0 | + a(t − t0 )) , ϕ(x) [u, v]|t=0 = ∂t |x − x0 | t=0 δ(|x − x0 | + a(t − t0 )) , ψ(x) − |x − x0 | t=0 δ(|x − x0 | − at0 ) ∂ δ(|x − x0 | − at0 ) , ϕ(x) − , ψ(x) = − ∂t0 |x − x0 | |x − x0 | = >

1 ∂ 1 = − ψ(x)dSat0 − ϕ(x)dSat0 . at0 |x−x0 |=at0 ∂t0 at0 |x−x0 |=at0

Further, by (10.27), we have [u, v]|t=t0 = −4πaδ(x − x0 ), u(t0 , x) = −4πau(t0 , x0 ). Therefore, (10.28) can be expressed in the form − 4πau(t0 , x0 ) = >

∂ 1 1 ψ(x)dSat0 − ϕ(x)dSat0 . =− at0 |x−x0 |=at0 ∂t0 at0 |x−x0 |=at0

232

10. The wave equation

Replacing t0 , x0 , x by t, x, y we get (10.29) u(t, x) = = >

∂ 1 1 ϕ(y)dSat + ψ(y)dSat , ∂t 4πa2 t |y−x|=at 4πa2 t |y−x|=at where dSat is the area element of the sphere {y : |y − x| = at}. We obtained the formula for the solution of the Cauchy problem ∂u = ψ(x). (10.30) u = 0, u|t=0 = ϕ(x), ∂t t=0

Formula (10.29) is called the Kirchhoﬀ formula. Let us derive some corollaries of it. 1) The solution of the Cauchy problem is unique and continuously depends on initial data in appropriate topology (e.g., if ψ continuously varies in C(R3 ) and ϕ continuously varies in C 1 (R3 ), then u(t, x) continuously varies in C(R3 ) for each t > 0). 2) The value u(t, x) only depends on the initial data near the sphere {y : |y − x| = at}. Suppose that for t = 0 the initial value diﬀers from 0 in a small neighborhood of one point x0 only. But then at t the perturbation diﬀers from 0 in a small neighborhood of the sphere {x : |x − x0 | = at} only. This means that a perturbation spreads at a speed a and vanishes after a short time at each observation point. Therefore, the spreading wave has a leading edge and a trailing one. A strictly localized initial state is observed later at another point as a phenomenon, which is also strictly localized. This phenomenon is called the Huygens principle. Since Huygens’s principle holds for the wave equation for n = 3, we may transmit information with the help of sound or light. As we will see later, the Huygens principle does not hold for n = 2 (once started, oscillations never cease, e.g., waves on the water). Therefore, the residents of Flatland (the ﬂat two-dimensional world described by E. A. Abbott in 1884) would ﬁnd it very diﬃcult to pass information through their space (the plane). We have not yet proved the existence of the solution for the Cauchy problem (10.30). The simplest way to do this is to take the function u(t, x) constructed by the Kirchhoﬀ formula (10.29) and verify that it is a solution of (10.30). The fulﬁllment of u = 0 may be deduced from the fact that ∂ 1 δ(r−at) the right-hand side of (10.29) is the sum of two convolutions ϕ ∗ ∂t 4π r 1 δ(r−at) 3 with t as a parameter) of ϕ and ψ ∗ 4π (taken with respect to x ∈ R r

10.4. Spherical waves and the Cauchy problem

233

and ψ with distributions of x depending smoothly on t and satisfying in a natural sense the wave equation (for t > 0). We will use more convenient arguments. In order not to bother with distributions depending on a parameter, it is convenient to consider δ(r−at) r as a distribution on R4t,x by setting δ(r − at) , ϕ(t, x) r t,x

∞

∞ δ(r − at) 1 , ϕ(t, x) dt = dt = ϕ(t, x)dSat r at 0 0 |x|=at x

∞

= dt at ϕ(t, atx )dS1 , ϕ ∈ D(R4 ). 0

|x |=1

This formula implies, in particular, that the functional on D(R4 ) determined . by the left-hand side is a distribution which we will again denote by δ(r−at) r δ(r−at) 4 Therefore, we may assume that ∈ D (R ). r = K + , where K + is the upper sheet of the light Clearly, supp δ(r−at) r 2 2 2 cone {(t, x) : |x| − a t = 0}; namely, K + = {(t, x) : |x| = at} = {(t, x) : |x|2 − a2 t2 = 0, t ≥ 0}. It is also easy to verify that δ(r−at) = 0 for |t|+|x| = 0. Indeed, δ(r−at) =0 r r δ(r−at) + = 0 for t > 0. But outside K ; therefore, it suﬃces to check that r this is clear, for example, from (10.22) which is true also when the limit is understood in D ({(t, x) : t > ε > 0}) for any ε > 0. It is easy to prove that the right-hand side of (10.29) can be written in the form 1 δ(r − at) ∂ 1 δ(r − at) + [ϕ(x) ⊗ δ(t)] ∗ 4π r ∂t 4π r and then u = 0 obviously holds due to the properties of convolutions.

(10.31) u = [ψ(x) ⊗ δ(t)] ∗

The initial conditions in (10.30) may also be deduced from (10.31); we will, however, deduce them directly from the structure of formula (10.29). Note that Kirchhoﬀ’s formula is of the form ∂ u = uψ + uϕ , ∂t where

1 ψ(y)dSat uψ (t, x) = 4πa2 t |y−x|=at and uϕ is a similar expression obtained by replacing ψ by ϕ. This structure of Kirchhoﬀ’s formula is not accidental. Indeed, suppose we have proved

234

10. The wave equation

that uψ is a solution of the Cauchy problem (10.32)

uψ = 0,

Let us prove then that vϕ = (10.33)

uψ |t=0 = 0,

∂uψ = ψ. ∂t t=0

∂ ∂t uϕ

vϕ = 0,

is a solution of the Cauchy problem ∂vϕ vϕ |t=0 = ϕ, = 0. ∂t t=0

The equation vϕ = 0 obviously holds and so does vϕ |t=0 = ϕ due to the second equation in (10.32). It remains to verify the second equality in (10.33). Assuming that uϕ ∈ C 2 for t ≥ 0, we get ∂ 2 uϕ ∂vϕ = = a2 Δuϕ |t=0 = a2 Δ(uϕ |t=0 ) = 0, ∂t t=0 ∂t2 t=0 as required. The condition uϕ ∈ C 2 (t ≥ 0) holds, for example, for ϕ ∈ C 2 (R3 ), as is clear for example from the expression

t ψ(x + aty )dS1 . uψ (t, x) = 4π |y |=1 This expression makes (10.32) obvious for ψ ∈ C 1 (R3 ) also. Thus, if ψ ∈ C 1 (R3 ), ϕ ∈ C 2 (R3 ), then Kirchhoﬀ’s formula (10.29) gives a solution of the Cauchy problem (10.30) (the equation u = 0 holds in the distributional sense for t > 0). If we additionally assume that ψ ∈ C 2 (R3 ) and ϕ ∈ C 3 (R3 ), then u ∈ C 2 (R3 ) and u = 0 holds in the classical sense. Thus the Cauchy problem (10.30) is uniquely solvable. The Kirchhoﬀ formula also makes it clear that it is well-posed.

10.5. The fundamental solution for the three-dimensional wave operator and a solution of the nonhomogeneous wave equation Set E3 (t, x) =

1 δ(|x| − at) ∈ D (R4 ). 4πa |x|

Theorem 10.4. The distribution E3 (t, x) satisﬁes E3 (t, x) = δ(t, x); i.e., it is a fundamental solution for the d’Alembert operator. Proof. The statement actually follows from the fact that E3 (t, x) = 0 for |t| + |x| = 0, E3 (t, x) = 0 3 E3 (+0, x) = 0, ∂E ∂t (+0, x) = δ(x).

for t < 0,

10.5. The fundamental solution for the three-dimensional operator

235

The function E3 (t, x) may be considered as a smooth function of t (for t = 0) 3 with values in D (R3 ) and for t = 0 it is continuous while ∂E ∂t has a jump 2 ∂ 2 equal to δ(x). Therefore, applying = ∂t 2 −a Δx to E3 , we get δ(t)⊗δ(x) = δ(t, x). It is diﬃcult to make the above arguments entirely rigorous (if we try to do this, we face the necessity of a cumbersome struggle with the topology in D (R3 )). We may, however, consider all the above as a heuristic hint and directly verify that E3 = δ. This can be done similarly to the proof for the heat equation (see the proof of Theorem 7.5) and it is left to the reader as an exercise. Now, using E3 (t, x) we can solve the nonhomogeneous wave equation u = f by writing (10.34)

u = E3 ∗ f,

if the convolution in the right-hand side makes sense. The convolution (10.34) is called the retarded potential. In a more detailed expression, the retarded potential is of the form

∞

f (t − τ, x − y) 1 dSaτ (10.35) dτ u(t, x) = 4πa 0 |y| |y|=aτ

∞ dτ 1 f (t − τ, x − y)dSaτ = 4πa2 0 τ |y|=aτ (here the integration in the second integrals is done over y). With the help of E3 (t, x), we may obtain the Kirchhoﬀ formula (10.29) for the solution of the Cauchy problem (10.30) in a diﬀerent way. Namely, let a solution u(t, x) of the Cauchy problem (10.30) be given for t > 0. Extend it by zero onto the half-space {t : t < 0} and consider the obtained distribution u ∈ D (R4 ). Clearly, (10.36)

u = δ (t) ⊗ ϕ(x) + δ(t) ⊗ ψ(x).

A solution of this equation vanishing for t < 0 may be found in the form of a retarded potential (10.34) taking f equal to the right-hand side of (10.36). Clearly, we again get the Kirchhoﬀ formula. Note also that the retarded potential u(t, x) obtained via (10.35) depends only on values f (t , x ) for t ≤ t and |x −x| = a(t −t ), i.e., on values of f at points of the lower part of the light cone with the vertex at (t, x) (the word “lower” here is understood with reference to the direction of the t-axis). This is what the term “retarded potential” really means. This property of the retarded potential is ensured by the fact that supp E3 ⊂ K + .

236

10. The wave equation

A fundamental solution for with this property is unique. Even a stronger result holds: Theorem 10.5. There exists exactly one fundamental solution for with the support belonging to the half-space {(t, x) : t ≥ 0}, namely, E3 (t, x). Proof. If there were two such solutions, then their diﬀerence u(t, x) ∈ D (R4 ) would have satisﬁed the wave equation u = 0 and would have vanished for t < 0. If u were a smooth function, then the uniqueness of the solution of the Cauchy problem would imply u ≡ 0 (Cauchy initial values vanish for t = t0 < 0). We can, however, make u smooth by mollifying. Let ϕε ∈ D(R4 ), supp ϕε ⊂ {(t, x) : |t|+|x| ≤ ε}, ϕε ≥ 0, and ϕε dtdx = 1, so that ϕε → δ(t, x) in D (R4 ) as ε → +0. Then limε→+0 (u ∗ ϕε ) = u in D (R4 ) (see Proposition 5.4). But u ∗ ϕε ∈ C ∞ (R4 ) and (u ∗ ϕε ) = (u) ∗ ϕε = 0, and also supp(u ∗ ϕε ) ⊂ supp u + supp ϕε ⊂ {(t, x) : t ≥ −ε}. Therefore, due to the uniqueness of the smooth solution of the Cauchy problem for the wave equation, u ∗ ϕε ≡ 0 for any ε > 0, implying u ≡ 0, as required. The importance of Theorem 10.5 stems from the fact that among all the fundamental solutions it chooses the only one which satisﬁes the causality principle, which claims that it is impossible to transmit information to the “past”. Therefore, in electrodynamics, to solve equations of the form u = f , satisﬁed by the scalar and vector potentials, one uses just this fundamental solution. It seems that Nature itself has chosen the solution satisfying the causality principle among all the solutions of the equation u = f (within the limits of the current accuracy of experiments).

10.6. The two-dimensional wave equation (the descent method) Let us solve the Cauchy problem for the equation 2 ∂ 2u ∂2u 2 ∂ u =a + , u = u(t, x1 , x2 ), (10.37) ∂t2 ∂x21 ∂x22 with the initial conditions (10.38)

u|t=0 = ϕ(x1 , x2 ),

∂u = ψ(x1 , x2 ). ∂t t=0

The idea of the solution (the descent method) is very simple: introduce an additional variable x3 and ﬁnd the solution of the Cauchy problem for the

10.6. The two-dimensional wave equation (the descent method)

237

three-dimensional wave equation utt = a2 Δu (where Δ is the Laplacian with respect to x1 , x2 , x3 ), with the initial conditions (10.38), which are independent of x3 . Then the solution u(t, x1 , x2 , x3 ) does not actually depend on x3 , since the function uz (t, x1 , x2 , x3 ) = u(t, x1 , x2 , x3 + z) is a solution of the same equation utt = a2 Δu for any z with the same initial conditions (10.38); therefore, the uniqueness theorem for the solution of the Cauchy problem for the three-dimensional wave equation implies that uz does not depend on z; i.e., u does not depend on x3 . Therefore the solution of the Cauchy problem (10.37)–(10.38) does exist (e.g., for any ϕ ∈ C 2 (R2 ), ψ ∈ C 1 (R2 )). It is unique due to the uniqueness theorem concerning the three-dimensional case since the solution of (10.37)–(10.38) can also be regarded as a solution of the three-dimensional Cauchy problem. Now, let us express u(t, x1 , x2 ) via the Kirchhoﬀ formula = >

1 ∂ ϕ(y1 , y2 )dSat (10.39) u(t, x) = ∂t 4πa2 t |y−x|=at

1 ψ(y1 , y2 )dSat , + 4πa2 t |y−x|=at where y = (y1 , y2 , y3 ), dSat is the area element of the sphere {y : y ∈ R3 , |y − x| = at}, and x = (x1 , x2 ) = (x1 , x2 , 0) (we identify the points of R2 with the points of R3 whose third coordinate is 0; this coordinate might have any other value since it does not aﬀect anything). Let us rewrite the second summand in formula (10.39), which we will denote uψ :

1 ψ(y1 , y2 )dSat (10.40) uψ = 4πa2 t |y−x|=at ∂ (the ﬁrst summand in (10.39) is of the form ∂t uϕ ). 3 Consider the sphere in Ry over which we integrate in (10.40). This is the sphere with the center at x of radius at (see Figure 10.1). We are to integrate a function, which is independent of y3 , over the sphere. This actually means that we integrate twice over the projection of the sphere onto the plane y3 = 0. Let dy1 dy2 be the Lebesgue measure of this plane, dSat the area

y3 y2 y1

x

Figure 10.1. To the descent method.

238

10. The wave equation

element of the sphere at y whose projection is equal to dy1 dy2 . Clearly, dy1 dy2 = | cos α(y)|dSat , where α(y) is the angle between the normal to the sphere and the y3 -axis. But the normal is proportional to the vector y − x = (y1 − x1 , y2 − x2 , y3 ) of length at. Let us integrate over the upper half-sphere (y3 > 0) and then double the result. Then |y − x| = at implies y3 = a2 t2 − (y1 − x1 )2 − (y2 − x2 )2 , 1 2 2 y3 = a t − (y1 − x1 )2 − (y2 − x2 )2 , cos α(y) = at at atdy1 dy2 dy1 dy2 dSat = = . 2 2 cos α(y) a t − (y1 − x1 )2 − (y2 − x2 )2 Therefore, we can rewrite (10.40) in the form

1 ψ(y)dy uψ (t, x) = , 2πa |y−x|≤at a2 t2 − |y − x|2 where x = (x1 , x2 ), y = (y1 , y2 ), dy = dy1 dy2 . The solution of the Cauchy problem (10.37)–(10.38) is given by the formula = >

1 ϕ(y)dy ∂ u(t, x) = ∂t 2πa |y−x|≤at a2 t2 − |y − x|2 (10.41)

ψ(y)dy 1 + 2πa |y−x|≤at a2 t2 − |y − x|2 known as Poisson’s formula. It is clear from Poisson’s formula that the value of the solution u(t, x) at x for n = 2 depends upon initial values ϕ(y) and ψ(y) in the disc {y : |y − x| ≤ at} and not only on its boundary (recall that for n = 3 it was suﬃcient to know initial values on the sphere {y : |y − x| = at}). In particular, if the supports on ϕ, ψ are near the origin, then the solution in a neighborhood of x is always nonzero after a certain moment. Therefore, a localized perturbation is not seen as a localized one from another point; i.e., the wave does not go tracelessly but leaves an aftereﬀect, though decreasing with time as 1t , as is clear from (10.41). In other words, the Huygens principle fails for n = 2. The method of descent enables us also to derive d’Alembert’s formula from Poisson’s formula, the former determining the solution of the Cauchy problem for the one-dimensional wave equation. However, we have already obtained this formula by another method. Let us also ﬁnd a fundamental solution for the two-dimensional wave operator. As in the three-dimensional case, we need to solve the Cauchy problem with the initial conditions u|t=0 = 0, ut |t=0 = δ(x).

10.7. Problems

239

But it is clear from Poisson’s formula that such a solution E2 (t, x) has the form θ(at − |x|) E2 (t, x) = , x ∈ R2 . 2πa a2 t2 − |x|2 Now, it is easy to verify directly that E2 (t, x) is locally integrable and is a fundamental solution for the two-dimensional wave operator. The latter fact is veriﬁed in exactly the same way as in the three-dimensional case and we leave this to the reader as an exercise. Finally, d’Alembert’s formula makes it clear that a fundamental solution ∂2 2 ∂2 for the one-dimensional wave operator ∂t 2 − a ∂x2 is of the form E1 (t, x) =

1 θ(at − |x|). 2a

10.7. Problems 10.1. By separation of variables ﬁnd cylindric waves in R3 . 10.2. Solve the “explosion-of-a-ball problem” in the three-dimensional space: ﬁnd u = u(t, x), x ∈ R3 , if utt = a2 Δu,

u|t=0 = ϕ,

ut |t=0 = 0,

where ϕ is the characteristic function of the ball {x : |x| ≤ R}. Draw an animation describing the behavior of u(t, x) as a function of t and |x|. 10.3. The same as in the above problem but with diﬀerent initial conditions: u|t=0 = 0, ut |t=0 = ψ, where ψ is the characteristic function of the ball {x : |x| ≤ R}. 10.4. Using the result of Problem 5.1 c), write, with the help of the Fourier transform, a formula for the solution of the Cauchy problem for the threedimensional wave equation. (Derive once more Kirchhoﬀ’s formula in this way.) 10.5. With the help of the Fourier transform with respect to x, solve the Cauchy problem for the wave equation utt = a2 Δu,

u = u(t, x),

x ∈ Rn .

Write the fundamental solution for the n-dimensional d’Alembertian = ∂2 − a2 Δ in the form of an integral and prove that its singularities belong ∂t2 to the light cone {(t, x) : |x|2 = a2 t2 }. 10.6. Describe the location of the singularities of the fundamental solution ∂2 ∂2 ∂2 for the operator ∂t 2 − ∂x2 − 4 ∂y 2 .

240

10. The wave equation

10.7. Let u(t, x) be a solution of the Cauchy problem of the equation utt = Δu, x = (x1 , x2 ) ∈ R2 with the initial conditions u|t=0 = ϕ, ut |t=0 = ψ. a) Let ϕ, ψ be known in the rectangle x1 ∈ [0, a], x2 ∈ [0, b]. Where can u be determined? Draw the domain in R3t,x1 ,x2 in which u can be determined (for t > 0). b) Let the supports of ϕ, ψ belong to the rectangle x1 ∈ [0, a], x2 ∈ [0, b]. Where is u = 0 for t > 0? Draw this domain in R3t,x1 ,x2 . 10.8. Solve Problem 10.7 for x ∈ R3 with the rectangle replaced by a rectangular parallelepiped. Instead of the domain in R4t,x , draw its section by the hyperplane t = 1.

10.1090/gsm/205/11

Chapter 11

Properties of the potentials and their computation

We have already met the potentials in this book. In Remark 4.11 we discussed the physical meaning of the fundamental solution of the Laplace operator in R3 and R2 . It is explained there that the fundamental solution E3 (x) of the Laplace operator in R3 has the meaning of the potential of a point charge, and the fundamental solution E2 (x) of the Laplace operator in R2 has the meaning of the potential of a thin uniformly charged string. It is impossible to deﬁne the potential of the string as an ordinary integral (it diverges). We indicated two methods for deﬁning this potential: its recovery from the strength of the ﬁeld (which is minus the gradient of the potential) and making the divergent integral meaningful by renormalization of the charge. Both methods deﬁne the potential of the string up to a constant, which suﬃces since in the long run we are only interested in the strength of the ﬁeld which is physically observable, whereas the potential is an auxiliary mathematical object (at least in the classical electrodynamics and gravitation theory). In Example 5.2 we introduced distributions, called single and double layers on the plane t = 0 and on an arbitrary hypersurface, and explained their physical meaning. In Example 5.6 we introduced the potentials of the simple and double layers as convolutions of the fundamental solution En (x) with the single and double layers on the surface.

241

242

11. Potentials and their computation

In Section 10.1 the role of the scalar and vector potentials in electrodynamics was clariﬁed. Here we will discuss in detail properties of potentials (and also how to deﬁne and calculate them). However, before we start the details, we advise the reader to reread the above-indicated passages. Potentials will be understood with suﬃcient versatility. When the preliminary deﬁnitions that we begin with become unsuitable, we will modify them at our convenience trying all the time to preserve physical observables.

11.1. Deﬁnitions of potentials Let En (x) be the standard fundamental solution for Δ in Rn ; i.e., 1 En (x) = |x|2−n , for n ≥ 3, (2 − n)σn−1 where σn−1 is the area of the unit sphere in Rn and 1 ln |x|. E2 (x) = 2π In particular, the case n = 3 is the most important one: 1 . E3 (x) = − 4π|x| E3 (x) is the potential of a point charge, which is equal to +1 for a proper system of units, and E2 (x) is the potential of an inﬁnite uniformly charged string (see, e.g., Feynman, Leighton, and Sands [8, vol. 2, Ch. 4]). All potentials have the form (11.1)

−En ∗ f

for diﬀerent distributions f ∈ E (Rn ) and, therefore, for n = 3 and 2 have the meaning of potentials of some distributed systems of charges. Every potential u of the form (11.1) satisﬁes in D (Rn ) the Poisson equation Δu = −f. Now we will describe the potentials which we need, turn by turn. a) The volume potential of charges distributed with density ρ(x) is the integral

En (x − y)ρ(y)dy. (11.2) u(x) = − Rn

We will always suppose that ρ(y) is a locally integrable function with compact support and ρ(y) is piecewise smooth; i.e., ρ is a C ∞ -function outside of ﬁnitely many smooth hypersurfaces in Rn on which it may have jumps. All the derivatives ∂ α ρ are supposed to satisfy the same condition (∂ α ρ is

11.1. Deﬁnitions of potentials

243

taken here outside of the hypersurfaces). An example of an admissible function ρ(y) is the characteristic function of an arbitrary bounded domain with smooth boundary. The most important property of the volume potential is that it satisﬁes the equation Δu = −ρ understood in the distributional sense; this follows from the equalities u = −En ∗ ρ and ΔEn (x) = δ(x). b) The single layer potential is the integral

(11.3) u(x) = − En (x − y)σ(y)dSy , Γ

where Γ is a smooth hypersurface in Rn , σ is a C ∞ -function on Γ, and dSy is the area element on Γ. For the time being, we assume that σ has a compact support (in the sequel, we will encounter examples when this does not hold). This potential satisﬁes Δu = −σδΓ , where σδΓ is the distribution with the support on Γ deﬁned in Example 5.2. c) The double layer potential is the integral

∂En (x − y) α(y)dSy , (11.4) u(x) = ∂n ¯y Γ where n ¯ y is the outward (or in some other way chosen) normal to Γ at y and the remaining notation is the same as in the preceding case. The potential of the double layer satisﬁes ∂ Δu = − (α(y)δΓ ), ∂n ¯ ∂ where ∂ n¯ (αδΓ )—the “double layer”—is the distribution deﬁned in Example 5.2. In this case we assume for the time being that α has a compact support and that α ∈ C ∞ (Γ). It is easy to see that the potentials (11.2)–(11.4) are deﬁned and inﬁnitely diﬀerentiable on Rn \Γ (where in the case of the volume potential (11.2), Γ is understood as the union of surfaces where ρ or its derivatives have jumps). We will understand these potentials everywhere in Rn as distributions obtained as convolutions (11.1), where f for the volume potential is the same as ρ in (11.2); for the case of the single layer potential f = σδΓ , where

σ(x)ϕ(x)dSx , ϕ ∈ D(Rn ), σδΓ , ϕ = Γ

whereas for the case of the double layer potential ∂ (αδΓ ), f= ∂n ¯

244

11. Potentials and their computation

∂ϕ(x) ∂ (αδΓ ), ϕ = − α(x) dSx . ∂n ¯ ∂n ¯x Γ Clearly, the convolutions u = −En ∗ f in all the above cases coincide for x ∈ Rn \Γ with the expressions given by the integrals (11.2)–(11.4). The physical meaning of the potentials (11.2)–(11.4) was described in Chapter 5. where

11.2. Functions smooth up to Γ from each side, and their derivatives Let Ω be a domain in Rn , let Γ be a smooth hypersurface in Ω (i.e., Γ is a submanifold of codimension 1 that is closed in Ω), and let u ∈ C ∞ (Ω\Γ). We will say that u is smooth up to Γ from each side if all the derivatives ∂ α u are continuous up to Γ from each side of the hypersurface. More precisely, if x0 ∈ Γ, then there should exist a neighborhood U of x0 in Ω such that U = U + ∪ ΓU ∪ U − , where ΓU = Γ ∩ U , U + and U − are open, and the restrictions u|U + and u|U − can be extended to functions u+ ∈ C ∞ (U + ) and u− ∈ C ∞ (U − ). In what follows we will often be interested in local questions only, where one should set from the beginning U = Ω and the above decomposition is of the form Ω = Ω+ ∪ Γ ∪ Ω− , where u+ ∈ C ∞ (Ω+ ) and u− ∈ C ∞ (Ω− ). Our aim is to prove that all the potentials are smooth up to Γ from each side (this in turn enables us to prove theorems about jumps of the potentials and calculate the potentials). To this end we will ﬁrst prove several auxiliary lemmas about functions which are smooth up to Γ from each side. If u ∈ C ∞ (Ω\Γ) is smooth up to Γ from each side, then it uniquely determines a distribution [u] ∈ D (Ω), such that [u] ∈ L1loc (Ω) and [u]|Ω\Γ = u. Our immediate goal is to learn how to apply to [u] diﬀerential operators with smooth coeﬃcients. Let us agree to write [u] only when u is smooth up to Γ from each side. In a neighborhood U of every point x0 ∈ Γ we can construct a diﬀeomorphism of this neighborhood onto a domain in Rn that transforms U ∩ Γ into a part of the plane {x : xn = 0}. Here x = (x1 , . . . , xn−1 , xn ) = (x , xn ), where x = (x1 , . . . , xn−1 ) ∈ Rn−1 . This diﬀeomorphism induces a change of variables in any diﬀerential operator transforming it into another diﬀerential operator of the same order (see Section 1.3). Thus, we will ﬁrst consider a local version in which U is a suﬃciently small neighborhood of 0 ∈ Rn and the coordinates in U may be selected so that Γ = U ∩ {x : xn = 0}, U ± = {x : x ∈ U, x = (x , xn ), ±xn > 0},

11.2. Functions smooth up to the boundary

245

∂u and the normal n ¯ to Γ is of the form n ¯ = (0, . . . , 0, 1); i.e., ∂∂un¯ = ∂x . For n simplicity of notation we also temporarily assume that U = Ω, so U ∩ Γ = Γ.

Denote by ψk the jump of the kth normal derivative of u on Γ; i.e., ∂ k u+ ∂ k u− ψk = − ∈ C ∞ (Γ), ∂xkn Γ ∂xkn Γ where k = 0, 1, 2, . . .. Note that the jumps of all the other derivatives are expressed via ψk by the formulas

(∂ α u+ )|Γ − (∂ α u− )|Γ = ∂ α ψαn , if α = (α , αn ), where α is an (n − 1)-dimensional multi-index. 2

2

∂[u] ∂ [u] Lemma 11.1. The derivatives ∂x , ∂x2 , and ∂x∂ n[u] ∂xj (j = 1, 2, . . . , n − 1) n n are expressed by the formulas ' & ∂u ∂[u] + ψ0 (x ) ⊗ δ(xn ), = (11.5) ∂xn ∂xn

(11.6)

& ' ∂u ∂[u] = , ∂xj ∂xj

j = 1, . . . , n − 1,

(11.7)

& 2 ' ∂ 2 [u] ∂ u + ψ1 (x ) ⊗ δ(xn ) + ψ0 (x ) ⊗ δ (xn ), = 2 ∂xn ∂x2n

(11.8)

& ' ∂ 2u ∂ψ0 ∂ 2 [u] = ⊗ δ(xn ). + ∂xn ∂xj ∂xn ∂xj ∂xj

Proof. For any ϕ ∈ D(Ω) we have

∂ϕ ∂ϕ ∂ϕ ∂[u] , ϕ = − [u], [u] dx − [u] dx =− ∂xn ∂xn ∂xn ∂xn xn ≥0 xn ≤0 & ∞ '

0 ∂ϕ ∂ϕ = − dx [u] dxn − [u] dxn ∂xn ∂xn 0 −∞ '

&

∂u + = ϕ(x)dx + u (x , 0)ϕ(x , 0)dx − u− (x , 0)ϕ(x , 0)dx ∂xn & '

∂u = , ϕ + ψ0 (x )ϕ(x , 0)dx , ∂xn which proves (11.5). Integrating by parts with respect to xj easily leads to (11.6). The formula (11.7) is obtained if we twice apply (11.5), whereas (11.8) follows from (11.5) and (11.6). We need to learn how to multiply distributions encountered in the righthand sides of (11.5)–(11.8) by smooth functions. To this end, let a ∈ C ∞ (Ω)

246

11. Potentials and their computation

and expand a = a(x , xn ) by the Taylor formula in xn near xn = 0. We have a(x , xn ) =

(11.9)

N −1

ak (x )xkn + xN n a(N ) (x , xn ),

k=0

where ak ∈ C ∞ (Γ), a(N ) ∈ C ∞ (V ) (here V is a neighborhood of Γ), and 1 ∂ k a ak (x ) = , k! ∂xkn xn =0

1 N 1 ∂ a (x , txn )(1 − t)N −1 dt. a(N ) (x , xn ) = (N − 1)! 0 ∂xN n Lemma 11.2. If ψ ∈ C ∞ (Γ), then a(x)(ψ(x ) ⊗ δ(xn )) = [a0 (x )ψ(x )] ⊗ δ(xn ),

(11.10)

(11.11) a(x)(ψ(x )⊗δ (xn )) = [a0 (x )ψ(x )]⊗δ (xn )−[a1 (x )ψ(x )]⊗δ(xn ). Proof. Clearly, ak (x )xkn (ψ(x ) ⊗ f (xn )) = [ak (x )ψ(x )] ⊗ [xkn f (xn )] for any distribution f ∈ D (R). Further, N a(N ) (x , xn )xN n (ψ(x ) ⊗ f (xn )) = a(N ) (x , xn )(ψ(x ) ⊗ xn f (xn )).

But a direct calculation shows that xpn δ (k) (xn ) = 0 for p > k and xn δ (xn ) = −δ(xn ), which, with a(x , xn ) in the form (11.9), immediately yields (11.10) and (11.11). Lemma 11.3. For any set of functions ψ0 , ψ1 , . . . , ψN ∈ C ∞ (Γ), there exists a function u ∈ C ∞ (Ω\Γ) smooth up to Γ from each side, such that ψk is the ∂k u jump of ∂x k on Γ for all k = 0, 1, . . . , N . n

Proof. We may, for example, set u(x) =

N 1 ψk (x )xkn θ(xn ), k! k=0

where θ(t) is the Heaviside function. Lemma 11.4. Let a distribution f ∈ D (Ω) be of the form (11.12)

f (x) = [f0 (x)] + b0 (x ) ⊗ δ(xn ) + b1 (x ) ⊗ δ (xn ),

where b0 , b1 ∈ C ∞ (Γ) and f0 is smooth up to Γ from each side. Let A be a second-order linear diﬀerential operator such that Γ is noncharacteristic for A at every point. Then for any integer N ≥ 0 there exists a function u on a neighborhood Ω of Γ in Ω such that u is smooth up to Γ from each side, and (11.13)

A[u] − f ∈ C N (Ω ).

11.2. Functions smooth up to the boundary

247

Proof. Given ψ0 , ψ1 , . . . , ψN ∈ C ∞ (Γ), we can use Lemma 11.3 to construct u ∈ C ∞ (Ω\Γ) having ψk as the jump of ∂u/∂xkn on Γ. It is clear from Lemmas 11.1 and 11.2 that the distribution A[u] = f˜ always has the form (11.12) (but perhaps with other f0 , b0 , b1 ). We need to select the jumps ψ0 , ψ1 , . . . , ψN +2 in such a way that f˜(x) = [f˜0 (x)] + b0 (x ) ⊗ δ(xn ) + b1 (x ) ⊗ δ (xn ) with the same functions b0 , b1 as in (11.12) and also the jumps of and

∂ k f˜0 ∂xkn

∂ k f0 ∂xkn

on Γ coincide for k = 0, 1, . . . , N . (Then we will obviously have ˜ f − f ∈ C N (Ω), as required.) First, note that Γ being noncharacteristic for A means that A can be expressed in the form (11.14)

A = a(x)

∂2 + A , ∂x2n

∂ where a ∈ C ∞ (Ω), a(x , 0) = 0, and A does not contain ∂x 2 . Since a(x) = 0 n in a neighborhood of Γ, we can divide (11.13) by a(x) and reduce the problem to the case when a(x) = 1. Therefore, in what follows we assume that 2

(11.15)

A=

∂2 + A , ∂x2n

where A is a second-order diﬀerential operator that does not contain

∂2 . ∂x2n

Let us start with selecting a function v0 such that A[v0 ] = [f˜0 (x)] + ˜b0 (x ) ⊗ δ(xn ) + b1 (x ) ⊗ δ (xn ), where f˜0 is any function (smooth up to Γ from each side), ˜b0 ∈ C ∞ (Γ). (0) Lemmas 11.1 and 11.2 show that it suﬃces to take ψ0 (x ) = b1 (x ), where (0) ψ0 is the jump of v0 on Γ. Now, (11.13) may be rewritten in the form (11.16)

A[u − v0 ] = [f1 ] + b2 (x ) ⊗ δ(xn )

with some f1 and b2 , where b2 ∈ C ∞ (Γ). Now, select a function v1 such that A[v1 ] = [f˜1 ] + b2 (x ) ⊗ δ(xn ), (1) (1) where b2 is the same as in (11.16) and f˜1 is arbitrary. If ψ0 and ψ1 are (1) (1) ∂v1 on Γ, then it suﬃces to take ψ0 = 0 and ψ1 (x ) = jumps of v1 and ∂x n b2 (x ). Setting u2 = u − v0 − v1 we see that the condition (11.13) for u reduces to A[u2 ] − [f2 ] ∈ C N (Ω ), where f2 is a function that is smooth up to Γ from each side. We will prove the possibility to select such u2 by induction with respect to N , starting

248

11. Potentials and their computation

with N = −1 and deﬁning C −1 (Ω ) as the set of all functions in Ω that are smooth up to Γ from each side. For N = −1 everything reduces to the simple condition u2 ∈ C 1 (Ω ), ∂u2 on Γ. i.e., to the absence of jumps of u2 and ∂x n Now suppose that we have already selected a function uk+2 such that (11.17)

A[uk+2 ] − [f2 ] ∈ C k−1 (Ω ),

where k is a nonnegative integer (as we have just seen this is possible for k = 0). Then, as an induction step, we need to select a function uk+3 such that (11.18)

A[uk+3 ] − [f2 ] ∈ C k (Ω ).

To this end set uk+3 = uk+2 + vk+2 , where vk+2 ∈ C k+1 (Ω ) (we do this in order to preserve (11.17) when we replace uk+2 by uk+3 ). Clearly, A [uk+2 ] ∈ C k (Ω ) and, therefore, (11.18) reduces to ∂ 2 [vk+2 ] − [f˜k ] ∈ C k (Ω ), ∂x2n where f˜k = −(A[uk+2 ] − [f2 ]) ∈ C k−1 (Ω ). This can be obtained by setting, for example, k f˜+ k f˜− ∂ ∂ 1 k k xk+2 vk+2 (x) = − θ(xn ) n k k (k + 2)! ∂xn ∂xn Γ

Γ

∂ 2 [v

]

in order to make the jumps of the kth derivatives of the functions ∂xk+2 2 n ˜ and fk on Γ coincide. Thus, we have proved that the induction with respect to N can be applied. This completes the proof. Global version. Fermi coordinates Now we will brieﬂy describe a global version of the results above. Let us introduce appropriate notation. Let Γ be a closed hypersurface in Rn , that is an (n − 1)-dimensional compact closed submanifold of Rn without boundary. Let us take a function s : Rn → R, deﬁned as follows: s(x) = dist(x, Γ) if x is outside of the (closed, compact) body B bounded by Γ (in particular, Γ = ∂B), and s(x) = −dist(x, Γ) if x is inside this body, and s = 0 on Γ. Clearly, s < 0 on the interior of B, and s > 0 on Rn \ B. It is easy to see that s ∈ C ∞ in a suﬃciently small neighborhood of Γ. Indeed, let n ¯ = n ¯ (z) denote the outward unit normal vector to Γ at the point z ∈ Γ. Consider the map (11.19)

Γ × R → Rn ,

(z, s) → z + s¯ n.

11.2. Functions smooth up to the boundary

249

Its diﬀerential is obviously bijective at the points (z, 0) ∈ Γ × R. Therefore, by the inverse function theorem, the map (11.19) restricts to a diﬀeomorphism of Γ × (−ε, ε) onto an open subset Γε of Rn , provided ε > 0 is suﬃciently small. In Γε , it is convenient to use Fermi coordinates which have the form where y = (y1 , . . . , yn−1 ) is a set of local coordinates on Γ and yn = s ∈ (−ε, ε). If the (n − 1)-tuple y represents a point z ∈ Γ, then, by n ∈ Γε . If we have deﬁnition, y = (y , s) are coordinates of the point z + s¯ a second-order diﬀerential operator A such that Γ is noncharacterisic for A at every point, then, as in (11.14), we can write in any Fermi coordinate system (y , yn ): (y , yn ),

(11.20)

A = b(y)

∂2 + B, ∂yn2

∂ where b ∈ C ∞ (Γε ), b(y , 0) = 0, B does not contain ∂y 2 . Now, if we pass n from one Fermi coordinate system to another one y˜ = (˜ y , yn ), then the ﬁrst term in the right-hand side of (11.20) will not change its form, except for change of variables in the scalar function b. Therefore, the function b is a well-deﬁned scalar function on Γε . Then in any equation of the form Au = f we can divide by b on a small neighborhood of Γ = Γ × {0}. For example, we can replace ε by a smaller value and assume that b(y) = 0 for all y ∈ Γε . 2

For the Laplacian Δ in Rn and Fermi coordinates y with respect to any hypersurface Γ, the formula (11.20) simpliﬁes: then b(y) = 1 for all y, so that we have (11.21)

Δ=

∂2 + B, ∂yn2

and, moreover, B does not contain any second derivatives of the form ∂yj∂∂xn with j ≤ n. Both statements follow from a simple geometric fact (a generalization of a Gauss lemma) that in Fermi coordinates ∂y∂n is the gradient of s = yn , or, equivalently, that the invariantly deﬁned tangent vector ﬁeld ∂ ∂yn is orthogonal to the level sets of the signed distance function s = yn . (See the proof in a much more general context in Gray [10, Chapter 2]; Rn can be replaced by an arbitrary Riemannian manifold M , and Γ by a closed submanifold of an arbitrary dimension of M . As an exercise, the reader can ﬁnd an elementary proof based on the fact that for small ε the straight line interval {z + ts¯ n| 0 ≤ t ≤ 1} is the shortest way between the level hypersurfaces yn = 0 and yn = s , where s ∈ [0, ε] is arbitrarily ﬁxed.) 2

For more information and details about Fermi coordinates see Gray [10, Chapter 2].

250

11. Potentials and their computation

To understand how the statement about the second derivatives follows from the above geometric fact, let us recall that in any local coordinates y = (y1 , . . . , yn ) the Laplacian can be written in the form of the LaplaceBeltrami operator, which is given by 1 ∂ ij √ ∂u g (11.22) Δu = √ g . g ∂yi ∂yj 1≤i,j≤n

(g ij )

is the inverse matrix for (gij ), where gij = gij (y) is the symmetric Here positive deﬁnite matrix of the standard Riemannian metric on Rn (i.e., ds2 = dx21 + · · · + dx2n ) in coordinates y; that is, ∂ ∂ , , gij (x) = ∂yi ∂yj x where the tangent vectors ∂y∂ j are considered as vectors in Rn and ·, · is the standard scalar product in Rn . (See, for example, Taylor [35, Vol. I, Sect. 2.4] for the details about the Laplace-Beltrami operator.) In Fermi coordinates, note that the vector

∂ ∂yn

has length one; hence

gnn = 1. The absence of the mixed second derivatives

∂2 ∂yj ∂xn ,

j < n, is

equivalent to gjn = 0, which implies that for the inverse matrix (g ij ) we also have g nn = 1 and g jn = 0 if j < n. This implies the desired absence of the mixed derivatives. Now we can globalize the arguments given above in the proofs of Lemmas 11.1–11.4 (the local case), except we need to replace ψ(x ) ⊗ δ(xn ) by ψδΓ where ψ ∈ C ∞ (Γ), and ψ(x )⊗δ (xn ) by − ∂∂n¯ (ψδΓ ). This leads, in particular, to the following global version of Lemma 11.4: Lemma 11.5. Let Γ be a closed compact hypersurface in Rn , and let f ∈ D (Rn ) be of the form ∂ (b1 (x)δΓ ), ∂n ¯ where b0 , b1 ∈ C ∞ (Γ) and f0 is a locally integrable function that is smooth up to Γ from each side. Let A be a second-order linear diﬀerential operator in Rn , such that Γ is noncharacteristic for A at every point. Then for any integer N ≥ 0 there exists an integrable compactly supported function u in Rn such that u is smooth up to Γ from each side, and (11.23)

(11.24)

f (x) = [f0 (x)] + b0 (x)δΓ +

A[u] − f ∈ C N (Rn ).

We leave the details of the proof as an exercise for the reader. Note only that if we found u satisfying (11.24) but without compact support, then we can replace u by χu where χ ∈ C0∞ (Rn ), χ = 1 in a neighborhood of Γ. Then χu will satisfy all the requirements of the lemma.

11.3. Jumps of potentials

251

Remark 11.6. Lemma 11.5 is a simpliﬁed version of more general results. It is not necessary to require compactness of Γ. For example, the proof works if Γ is a hyperplane or an inﬁnite cylinder with a compact base. In fact, it suﬃces that Γ is a closed submanifold in Rn , though the proof requires a minor modiﬁcation: we need to glue local solutions into a global one by a partition of unity, and this should be done separately on every induction step from the proof of Lemma 11.4. Moreover, the result holds in an even more general setting, where Rn is replaced by any general Riemannian manifold (for example, any open set in Rn ) and Γ by a closed submanifold.

11.3. Jumps of potentials Now we are able to prove the main theorem on jumps of potentials. Theorem 11.7. 1) Let u be one of the potentials (11.2)–(11.4). Then u is smooth up to Γ from each side. 2) If u is a volume potential, then u and u ∈ C 1 (Rn ).

∂u ∂n ¯

do not have jumps on Γ, so

3) If u is a single layer potential, then u does not have any jump on Γ, so u ∈ C(Rn ), whereas the jump of ∂∂un¯ on Γ is (−σ), where σ is the density of the charge on Γ in the integral (11.3) deﬁning u. 4) If u is a double layer potential, then the jump of u on Γ is equal to (−α), where α is the density (of dipole moment) on Γ in the integral (11.4) deﬁning u. Here the direction of the jumps is naturally determined by the normal vector ﬁeld n ¯. Proof. I will ﬁrst explain the main idea of the proof. As we have seen in the previous section, the diﬀerentiation (of order 1 or 2) of a function u which is smooth up to Γ from each side leads to another function of the same kind but with additional singular terms: terms containing the surface δ-function δΓ and its normal derivative, with coeﬃcients which are sums of jumps of u and its normal derivative on Γ, taken with coeﬃcients from C ∞ (Γ). For our proof, we need, vice versa, to ﬁnd some jumps by the information from the right-hand side of the equation Δu = −f , which is satisﬁed for all potentials with an explicitly given distribution f that is supported on Γ and has exactly the structure that we have described. Now, from the results of the previous section, we can produce a solution u ˆ of Δˆ u = −f1 , where f1 diﬀers from f by an arbitrarily smooth function. Then Δ(u − u ˆ) = f1 − f can be made arbitrarily smooth, and so u − u ˆ can be made arbitrarily smooth too. It will follow that u is smooth up to Γ from each side, and the jumps can be recovered from f .

252

11. Potentials and their computation

Now let us give more detail for each of the statements 1)–4) above. 1) By Lemma 11.5, for each of the potentials u we may ﬁnd a compactly supported integrable function uN smooth up to Γ from each side, such that Δ[uN ] + f ∈ C N (Rn ) (here f is the same as in (11.1)). Therefore, Δ(u − uN ) = fN ∈ C N (Rn ). Let us deduce from this that u − uN ∈ C N (Rn ). Without loss of generality we may assume that uN and fN have compact supports. But then Δ(u − uN − En ∗ fN ) = 0, which implies that u − uN − En ∗ fN ∈ C ∞ (Rn ). At the same time, clearly, En ∗ fN ∈ C N (Rn ), because expressing this convolution as an integral

En (y)fN (x − y)dy, we can diﬀerentiate, by the dominated convergence theorem, under the integral N times with respect to x. Therefore, u − uN ∈ C N (Rn ). This implies that the derivatives ∂ α u for |α| ≤ N are continuous up to Γ from each side. Since N is arbitrary, this proves the ﬁrst statement. 2) Let u be a volume potential. Clearly, u ∈ C ∞ (Rn \Γ). In a neighborhood of every point of Γ we can rectify Γ and use the equation Au = −f , obtained by the change of variables from the equation Δu = −f . Since Δ is an elliptic operator, A is also elliptic. Therefore, Γ is noncharacteristic. But then the inclusion f ∈ L1loc (Rn ) and Lemmas 11.1 and 11.2 imply u ∈ C 1 (Rn ). (If we assume the opposite, then either u or ∂∂un¯ has a nontrivial jump on Γ, and the formulas (11.5)–(11.8), (11.10), and (11.11) would clearly lead to the occurrence of a nonvanishing singular distribution of type ψ0 (x ) ⊗ δ(xn ) or ψ0 (x ) ⊗ δ (xn ) in the right-hand side of the equation Au = −f , whereas f is locally integrable due to our assumptions in the case of the volume potential.) 3) In a neighborhood of a point x0 ∈ Γ it is convenient to use Fermi ∂u at the boundary points. Let coordinates y = (y , yn ). In particular, ∂∂un¯ = ∂y n u be a single layer potential, deﬁned by (11.3), with the density σ ∈ C0∞ (Γ). Then the Poisson equation Δu = −σδΓ implies that u itself cannot have a jump, because any nonvanishing jump ψ0 of u would lead to the appearance of the term −ψ0 (y )⊗δ (yn ) in f = −Δu, which would contradict the Poisson equation. Therefore, u is continuous. On the other hand, it is clear from the form of the Laplacian in coordinates y given by (11.21) and from Lemma 11.1 that the jump of ∂∂un¯ should be −σ for u to satisfy the Poisson equation. 4) The same argument works for the double layer potential. Namely, a jump ψ1 of u leads to the term ∂∂n¯ (ψ1 )δΓ in f = −Δu, which implies the desired statement.

11.4. Calculating potentials

253

Remark. In a similar way we may ﬁnd the jumps of any derivative of the potentials.

11.4. Calculating potentials We will give here simple examples of the calculation of potentials. The explicit calculation becomes possible in these examples with the help of a) symmetry considerations, b) the Poisson equation satisﬁed by the potential, c) theorems on jumps, d) asymptotics at inﬁnity. As a rule, these considerations even give a surplus of information which enables us to verify the result. Example 11.1. The potential of a uniformly charged sphere. Let Γ be the sphere of radius R in R3 with the center at the origin. Suppose a charge is uniformly (with the surface density σ0 ) distributed over the sphere. Denote the total charge of the sphere by Q; i.e., Q = 4πR2 σ0 . Let us ﬁnd the potential of the sphere and its ﬁeld. Denote the desired potential by u. First, symmetry considerations show that u is invariant under rotations; hence u(x) = f (r), where r = |x|; i.e., u only depends on the distance to the center of the sphere. Now use the fact that u is a harmonic function outside Γ. Since any spherically symmetric harmonic function is of the form C1 + C2 r−1 , we have A + Br−1 , r < R, f (r) = C + Dr−1 , r > R. Here A, B, C, D are real constants to be determined. First, B = 0 since u is harmonic in a neighborhood of 0. This, in particular, implies that u = const for |x| < R and, therefore, E = − grad u = 0 for |x| < R; i.e., the ﬁeld vanishes inside a uniformly charged sphere. To ﬁnd the three remaining constants, we need three more conditions for u. They can be obtained in diﬀerent ways. Let us indicate the simplest of them. 1◦ . The continuity of u(x) on Γ implies A = C + DR−1 . 2◦ . Considering the representation of u(x) as an integral, we see that u(x) → 0 as |x| → ∞; hence C = 0.

254

11. Potentials and their computation

3◦ . Consider the integral that deﬁnes u(x), in somewhat more detail. We have

σ0 dSy dSy σ0 1 1 = u(x) = y x 4π |y|=R |x − y| 4π |y|=R |x| | |x| − |x| |

1 σ0 Q 1 = = 1+O dSy 1 + O 4π|x| |y|=R |x| 4π|x| |x| 1 Q 1+O , = 4πr r as r → ∞. Hence D = Q . Thus, A = 4πR

Q 4π .

From 1◦ –3◦ we deduce that B = C = 0, D =

u(x) =

Q 4πR Q 4πr

Q 4π ,

for |x| < R, for |x| > R.

This means that outside the sphere the potential is the same as if the whole charge were concentrated at the center. Therefore the homogeneous sphere attracts the charges situated out of it as if the whole charge were supported in its center. This fact was ﬁrst proved by Newton who considered gravitational forces. (He actually proved this theorem in a diﬀerent way: by considering directly the integral that deﬁnes the attraction force.) We will indicate other arguments that may be used to get equations on A, C, D. We can use them to verify the result. 4◦ . The jump of the normal derivative of u on the sphere should be Q Q 2 equal to −σ0 ; i.e., Dr−2 |r=R = σ0 or D = σ0 R2 = 4πR 2 · R = 4π , which agrees with the value of D found above. 5◦ . The value of u(x) at the center of the sphere is equal to

σ0 dSy 1 Q Q A= = · σ0 4πR2 = , 4π |y|=R R 4πR 4πR which agrees with the value of A found above. Using the obtained result, we can immediately ﬁnd the attracting force (and the potential) of the uniformly charged ball by summing the potentials (or attracting forces) of the spherical layers constituting the ball. In particular, the ball attracts a charge outside of it as if the total charge of the ball were concentrated at its center. Example 11.2. The potential of the double layer of the sphere. Consider the sphere of radius R over which dipoles with the density of the dipole moment α0 are uniformly distributed. The potential of the double layer here is the integral (11.4), where Γ = {x : x ∈ R3 , |x| = R},

α(y) = α0 = const .

11.4. Calculating potentials

255

Let us calculate this potential. This can be done in the same way as in the preceding example. We may immediately note, however, that u(x) = 0 for |x| > R since u(x) is the limit of the potentials of simple layers of a pair of concentric spheres uniformly charged by equal charges with opposite signs. In the limit, these spheres tend to each other, and charges grow in inverse proportion to the distance between the spheres. Even before the passage to the limit, the potential of this pair of spheres vanishes outside the largest of them. Therefore, u(x) = 0 for |x| > R. It is also clear that u(x) = const for |x| < R, and the theorem on jumps implies (when the exterior normal is chosen) that u(x) = α0 for |x| < R. Example 11.3. The potential of a uniformly charged plane. Let us deﬁne and calculate the potential of a uniformly charged plane with the surface density of the charges σ0 . Let x3 = 0 be the equation of the plane in R3 . Observe that we cannot use the deﬁnition of the potential by (11.3) because the integral diverges. Therefore, it is necessary to introduce ﬁrst a new deﬁnition of the potential. This may be done in several ways. We will indicate two of them. The ﬁrst is to preserve the main equation (11.25)

Δu = −σ0 δΓ = −σ0 δ(x3 )

and as many symmetry properties of the potential as possible. (Physicists often determine a value of an observable quantity from symmetry considerations without bothering much with the deﬁnitions.) If the integral (11.3) were convergent in our case, then (11.25) would be satisﬁed and the following symmetry conditions would hold: a) u is not changed by translations along our plane; i.e., u = u(x3 ). b) u is invariant under reﬂections with respect to our plane; i.e., u(−x3 ) = u(x3 ). Now, try to calculate u(x3 ) using (11.25) and the above symmetry properties. It turns out that u(x3 ) is uniquely deﬁned up to a constant. Therefore, we can deﬁne the potential of the plane to be a distribution u satisfying (11.25), a), and b). From (11.25) it is clear that u(x3 ) is linear in x3 separately for x3 > 0 and x3 < 0; i.e., A + Bx3 , x3 > 0, u(x3 ) = C + Dx3 , x3 < 0. Equation (11.25) means, besides, that the theorem on jumps holds; i.e., u ∂u on Γ is equal is continuous on Γ (implying A = C) and the jump of ∂x 3 to −σ0 (implying B − D = −σ0 ). Finally, the fact that u(x3 ) is even (condition b)) once again yields A = C and, besides, implies D = −B.

256

11. Potentials and their computation

Hence, B = −D = − σ20 and σ0 |x3 |. 2 Thus we have found u up to an irrelevant additive constant. u = u(x3 ) = A −

Now, recall that our main goal is to ﬁnd E = − grad u. We see that E is everywhere perpendicular to the plane and if the direction from the plane is taken as the positive one, then the magnitude of the ﬁeld is equal to σ20 . Here is another method to deﬁne u. The ﬁeld E can be found by the Coulomb law (E is given, as is easy to see, by a converging integral!) and then u is found from the equation E = − grad u. Since the symmetry properties clearly hold and div E = σ0 δ(x3 ) leads to (11.25), we can apply the same method to ﬁnd u. Therefore we have explicitly found the ﬁeld strength E given by a relatively complicated integral.

11.5. Problems 11.1. Calculate the potential of a uniformly charged circle on the plane. 11.2. Calculate the potential of a uniformly charged disc on the plane. 11.3. Deﬁne and calculate the potential of the uniformly charged surface of a straight cylinder in R3 . 11.4. Find the potential of the double layer of the circle in R2 and the surface of a straight cylinder in R3 with a constant density of the dipole moment. 11.5. Find the potential of a uniformly charged sphere in Rn for n ≥ 3. 11.6. A point charge Q is placed at the distance d from the center of a grounded conducting sphere of radius R (R < d). Find the total charge induced on the sphere. 11.7. Under the conditions of the preceding problem ﬁnd the distribution of the induced charge over the surface of the sphere. 11.8. A thin elliptic wire is charged with the total charge Q. How will the charge distribute along the wire?

10.1090/gsm/205/12

Chapter 12

Wave fronts and short-wave asymptotics for hyperbolic equations

12.1. Characteristics as surfaces of jumps Let Ω be an open set in Rn , S a smooth hypersurface in Ω, i.e., a closed submanifold of Ω of codimension 1. Let a linear diﬀerential operator of degree m (see Chapter 1) (12.1)

A = a(x, D) =

aα (x)D α

|α|≤m

be given in Ω, with aα ∈ C ∞ (Ω), and let a(x, ξ) be the total symbol of A (we usually consider (12.1) to be the deﬁnition of both A and a(x, D)). Recall that (12.2) a(x, ξ) = aα (x)ξ α , |α|≤m

so that a(x, D) is obtained from a(x, ξ) if all the ξ α are written on the right (as in (12.2)) and then replaced by D α . The principal symbol am (x, ξ) of A is of the form aα (x)ξ α . am (x, ξ) = |α|=m

257

258

12. Wave fronts and short-wave asymptotics

Recall that a hypersurface S is called a characteristic of A if for any point x ∈ S the unit normal n ¯ x to S at x satisﬁes (12.3)

am (x, n ¯ x ) = 0.

Unfortunately, deﬁnitions of characteristic are ambiguous: there are many (not necessarily equivalent) versions. However, there are many important common features, which allow us to create a common use of them. Let us take now a real-valued function u ∈ C ∞ (Ω\S) which is inﬁnitely diﬀerentiable up to S from each side (cf. the deﬁnition in Section 11.2). Using the same notation as in Section 11.2, we locally have Ω = Ω+ ∪ S ∪ Ω− , and there exist u+ ∈ C ∞ (Ω+ ) and u− ∈ C ∞ (Ω− ) such that u|Ω+ = u+ , u|Ω− = u− . We can consider u as a locally integrable function on Ω, and we will denote by u the corresponding distribution too. Note immediately that it may actually turn out that there is no discontinuity; i.e., u can be extended to a function from C ∞ (Ω). We will say that S is a surface of jumps for u if, ﬁrst, u is inﬁnitely diﬀerentiable up to S from each side and, secondly, for each point x0 ∈ S there exists a multi-index α such that ∂ α u is discontinuous at x0 ; i.e., ∂ α u does not extend to a function on Ω continuous at x0 . This actually means that (∂ α u+ )(x0 ) = (∂ α u− )(x0 ). Note that the same inequality also holds on S for x close to x0 (with the same α). Theorem 12.1. If S is a surface of jumps for u ∈ L1loc (Ω) and u is a solution of the equation Au = f in Ω, with f ∈ C ∞ (Ω), then S is a characteristic of A. Proof. Since the theorem is local and the notion of a characteristic is invariant under diﬀeomorphisms, we may assume that S is of the form S = {x : x ∈ Ω, xn = 0}, where x = (x , xn ) = (x1 , . . . , xn−1 , xn ). Since the normal to S is of the form (0, 0, . . . , 1), the characteristic property of S means that the coeﬃcient of ∂m ∂xm in A vanishes on S. n

We will prove the theorem arguing by contradiction. Let S be nonchar∂m acteristic at some point. This means that the coeﬃcient of ∂x m is nonzero n at this point, hence, near it. Then we may assume that it is nonzero everywhere in Ω. Dividing the equation Au = f by this coeﬃcient, we may

12.1. Characteristics as surfaces of jumps

259

assume that the equation is of the form ∂mu ∂ m−j u + aj (x, D ) m−j = f, m ∂xn ∂xn m

(12.4)

j=1

where aj (x, D ) is a diﬀerential operator in Ω without any derivatives with respect to xn (its order is ≤ j though this is of no importance now). Let u be a C ∞ -function up to S from each side, let Ω± = Ω ∩ {x : ±xn > 0}, and let u± ∈ C ∞ (Ω± ) be such that u|Ω± = u± . We want to prove that u can be extended to a function from C ∞ (Ω); i.e., the jumps of all the derivatives actually vanish. Denote these jumps by ϕα ; i.e., ϕα = (∂ α u+ )|S − (∂ α u− )|S ∈ C ∞ (S). As we have seen in Section 11.2, a special role is played by the jumps of the normal derivatives ∂ k u+ ∂ k u− − ∈ C ∞ (S). (12.5) ψk = ∂xkn S ∂xkn S All the jumps ϕα are expressed in terms of the jumps of the normal derivatives ψk via the already-mentioned formula

ϕα = ∂xα ψαn ,

α = (α , αn ).

Therefore, it suﬃces to prove that ψk = 0 for k = 0, 1, 2, . . . . Let us ﬁnd Au in D (Ω). By Lemma 11.1 we have ' & ∂u ∂u , = ψ0 (x ) ⊗ δ(xn ) + ∂xn ∂xn

& 2 ' ∂ 2u ∂ u . = ψ0 (x ) ⊗ δ (xn ) + ψ1 (x ) ⊗ δ(xn ) + 2 ∂xn ∂x2n

Further calculations of the same type easily yield (12.6)

& k ' k−1 ∂ u ∂ku (l) = ψk−l−1 (x ) ⊗ δ (xn ) + . k ∂xn ∂xkn l=0

Now, if α is an (n − 1)-dimensional multi-index and ∂ α = ∂xα , then (12.7)

& ' k−1 k ∂ k u α (l) α ∂ u = [∂ ψk−l−1 (x )] ⊗ δ (xn ) + ∂ . ∂ ∂xkn ∂xkn α

l=0

260

12. Wave fronts and short-wave asymptotics

Now, let a ∈ C ∞ (Ω). Using the same arguments as in the proof of Lemma 11.2, we easily obtain a(x , xn )(ψ(x ) ⊗ δ (l) (xn )) =

(12.8)

l

aq (x ) ⊗ δ (q) (xn ),

q=0

where aq ∈ C ∞ (S). Formulas (12.6)–(12.8) show that if Au = f outside S, then (12.4) may be rewritten in the form ψ0 (x ) ⊗ δ (m−1) (xn ) +

(12.9)

m−1

al (x ) ⊗ δ (m−1−l) (xn ) = g(x),

l=1

(x )

C ∞ (S),

al ∈ and ψ0 is the jump of u on S (see formula where g ∈ (12.5) for k = 0). It is clear from (12.9) that g ≡ 0 since the left-hand side of (12.9) is supported on S. Further, after applying the distribution at the in left-hand side of (12.9) to a test function ϕ that is equal to ϕ1 (x )xm−1 n ∞ a neighborhood of S (here ϕ1 ∈ C0 (S)), we see that all the terms but the ﬁrst vanish, and the ﬁrst one is equal to (−1)m−1 (m − 1)! ψ0 (x )ϕ1 (x )dx . Since ϕ1 is arbitrary, it is clear that ψ0 (x ) ≡ 0; i.e., u cannot have a jump on S. L1loc (Ω),

One similarly proves that the derivatives up to order m − 1 cannot have jumps on S. Namely, let p be the smallest of all k such that ψk ≡ 0. Therefore, ψ0 ≡ ψ1 ≡ · · · ≡ ψp−1 ≡ 0,

(12.10)

ψp ≡ 0.

We wish to distinguish the most singular term in the left-hand side of (12.4). This is easy to do starting from the formulas (12.6)–(12.8). Namely, calcu∂k u 1 lating the normal derivatives ∂x k for k ≤ p, we get functions from Lloc (Ω). n For k = p + 1 we get & p+1 ' ∂ u ∂ p+1 u p+1 = ψp (x ) ⊗ δ(xn ) + ∂xn ∂xp+1 n and for p ≤ m − 1 equation (12.4) takes on the form

ψp (x ) ⊗ δ

(m−p−1)

(xn ) +

m−p−1

al (x ) ⊗ δ (m−p−1−l) (x ) = g(x),

l=1

where al (x ) ∈ C ∞ (S), g ∈ L1loc (Ω). As above, this implies ψp ≡ 0, contradicting the hypothesis. Thus, ψp = 0 for p ≤ m − 1. Let us prove that ψm = 0. This is immediately clear from (12.4) since all the derivatives ∂ α u, where α = (α , αn ) and αn ≤ m − 1, are continuous in Ω. Therefore, the normal derivative of order m has no jumps either.

12.1. Characteristics as surfaces of jumps

261

Now, consider the case when p > m, where p is the number of the ﬁrst discontinuous normal derivative; see (12.10). This case can be easily reduced p−m to the case p = m. Indeed, applying ∂ p−m to (12.4), we see that u satisﬁes ∂xn an equation of the same form as (12.4) but of order p instead of m. This implies that the case p > m is impossible. This concludes the proof. Theorem 12.1 implies that solutions of elliptic equations with smooth (C ∞ ) coeﬃcients and smooth right-hand side cannot have jumps of the structure described above. It can even be proved that they are always C ∞ functions. (The reader can ﬁnd proofs in more advanced textbooks; see, for example, Taylor [35, vol. II, Sect. 7.4], H¨ormander [13, Ch. X], or [15, vol. I, Corollary 8.3.2].) The examples given below show that nonelliptic equations may have discontinuous solutions. Example 12.1. The equation utt = a2 uxx has a solution θ(x − at) with a jump along the characteristic line x = at. A jump of derivatives of arbitrarily large order along the characteristic x = at can be arranged by considering the solution u(t, x) = (x − at)N θ(x − at), where N ∈ Z+ . Similar discontinuities can obviously be constructed along any characteristic (all the characteristics here can be written in one of the forms x − at = c1 , x + at = c2 ). Example 12.2. Let us consider a more intricate case of the three-dimensional wave equation utt = a2 Δu,

u = u(t, x),

x ∈ R3 .

The simplest example of a discontinuous solution of this equation is the spherical wave u(t, x) = θ(|x|−at) . This solution, clearly, has a jump on |x| the upper sheet of the light cone {(t, x) : |x| = at, t > 0}. The sphere {x : |x| = at} ∈ R3 is naturally called the wave front. The wave front moves at a speed a. In geometrical optics, a ray is a parametrically deﬁned (with the parameter t) curve which is normal to the wave front at any moment of time. In our example the rays are straight lines x = ae t, where t > 0, e ∈ R3 , |e| = 1. Sometimes any curve normal to all the wave fronts (without an explicitly given parameter) is also called a ray. In what follows we will introduce more terminology of the geometric optics in connection with the Hamilton-Jacobi equation. Example 12.3. Consider again the three-dimensional wave equation utt = a2 Δu, but this time try to construct a solution which is discontinuous on a given smooth compact surface Γ ⊂ R3 at the initial moment. Let, for

262

12. Wave fronts and short-wave asymptotics

instance, Γ = ∂Ω, where Ω is a bounded domain in R3 and the initial conditions are of the form u|t=0 = χΩ ,

ut |t=0 = 0,

where χΩ is the characteristic function of Ω. Let us try to ﬁnd out how the jumps of u behave as t grows by looking at the Kirchhoﬀ formula which we will use to construct u. In the Kirchhoﬀ formula we are to integrate χΩ over the sphere of radius at with the center at x, divide the result by 4πa2 t, and then take the derivative with respect to t. It is easy to see that this u(t, x) is a smooth function of t, x for t, x such that the sphere of radius at with the center at x is not tangent to Γ (i.e., this sphere either does not intersect Γ or intersects it transversally). Therefore, the jumps may occur only at the points where this sphere is tangent to Γ. For small t, the set of all such x forms exactly the set of the points x which are situated at the distance at from Γ (“wave front”). For small t, the wave front can be constructed as follows: draw all the normals to Γ (“rays”) and mark on them points at the distance at on both sides. It is easy to see that on the wave front thus constructed, there is actually a jump of u. For large t normals may begin to intersect. An envelope of the family of rays appears and is called a caustic. The singularity of the solution on the caustic has a much more sophisticated structure and we will not deal with it. Even this example shows how objects of geometric optics appear in the wave theory. In what follows we will consider this problem in somewhat more detail.

12.2. The Hamilton-Jacobi equation. Wave fronts, bicharacteristics, and rays The Hamilton-Jacobi equation is an equation of the form ∂S (12.11) H x, = 0, ∂x where H = H(x, ξ) is a given real-valued function of variables x ∈ Rn and ξ ∈ Rn (we will usually assume that H ∈ C 2 ) and S = S(x) is an unknown (also real-valued) function with the gradient ∂S ∂x . The HamiltonJacobi equation plays an important role in mechanics and physics. Recall brieﬂy a method for integrating (12.11), more precisely, a connection of this equation with the Hamiltonian system x˙ = Hξ (x, ξ), (12.12) ξ˙ = −Hx (x, ξ),

12.2. The Hamilton-Jacobi equation

263

where the Hamiltonian H = H(x, ξ) is the same function as in (12.11), Hξ = ∂H ∂H ∂H ∂H ∂H ∂H ∂ξ = ( ∂ξ1 , . . . , ∂ξn ), Hx = ∂x = ( ∂x1 , . . . , ∂xn ). (We already discussed Hamiltonian systems in Section 10.3; here we use x, ξ instead of q, p in (10.13).) Solutions of this system (curves (x(t), ξ(t))) are called the Hamiltonian curves (or, more exactly, Hamiltonian curves of the Hamiltonian H(x, ξ)). Vector ﬁeld (Hξ , −Hx ) in R2n x,ξ deﬁning the system (12.12) is called a Hamiltonian vector ﬁeld with the Hamiltonian H. As was mentioned in Section 10.3, any Hamiltonian vector ﬁeld is welldeﬁned on T ∗ Rnx . This means that the system (12.12) retains the same form for any choice of curvilinear coordinates in Rnx assuming that ξ is a cotangent vector expressed in the coordinate system corresponding to a given coordinate system in Rnx (i.e., considering ξ1 , . . . , ξn as coordinates of a cotangent vector with respect to the basis dx1 , . . . , dxn ; see Section 1.1). The reader can ﬁnd more detail on this in Arnold [3]. Proposition 12.2. The Hamiltonian H(x, ξ) is a ﬁrst integral of the system (12.12); i.e., H(x(t), ξ(t)) = const along any Hamiltonian curve (x(t), ξ(t)). Proof. We have ∂H ∂H ˙ d H(x(t), ξ(t)) = x˙ + ξ = Hx Hξ + Hξ (−Hx ) = 0. dt ∂x ∂ξ (Here and below for the sake of simplicity we omit the dot standing for the inner product of two vectors.) This immediately implies the desired result. In classical mechanics, Proposition 12.2 has the meaning of the energy conservation law. In fact, we already proved it for particular inﬁnitedimensional linear Hamiltonian system, which is equivalent to the wave equation (see Proposition 10.1). The Hamiltonian curves with H(x(t), ξ(t)) ≡ 0 are of particular importance for us. If H is the principal symbol of a diﬀerential operator, then they are called bicharacteristics (of the operator or of its symbol). By some abuse of terminology, we will call such a curve bicharacteristic for any H (even if H is not a principal symbol of any diﬀerential operator). As Proposition 12.2 says, if (x(t), ξ(t)) is a Hamiltonian curve deﬁned on an interval of the t-axis and H(x(t0 ), ξ(t0 )) = 0 for some t0 , then this curve is a bicharacteristic. Now, consider the graph of the gradient of S(x), i.e., the n-dimensional ∂S n submanifold in R2n x,ξ of the form ΓS = {(x, Sx (x)), x ∈ R }, where Sx = ∂x . Let us formulate the main statement necessary to integrate the HamiltonianJacobi equation.

264

12. Wave fronts and short-wave asymptotics

Proposition 12.3. If S is a C 2 -solution of the Hamilton-Jacobi equation (12.11), then the Hamiltonian ﬁeld is tangent to ΓS at its points. Proof. The statement means that if (x(t), ξ(t)) is a bicharacteristic and (x0 , ξ0 ) = (x(t0 ), ξ(t0 )) ∈ ΓS , then d d (x(t), Sx (x(t)))|t=t0 = (x(t), ξ(t))|t=t0 dt dt or, in a shorter form, d ˙ 0 ). Sx (x(t))|t=t0 = ξ(t dt But d Sx (x(t))|t=t0 = Sxx (x(t0 ))x(t ˙ 0 ) = Sxx (x0 )Hξ (x0 , ξ0 ), dt 2

S where Sxx = ∂x∂i ∂x n is considered as an n × n symmetric matrix. On j i,j=1 ˙ 0 ) = −Hx (x0 , ξ0 ), so we need to prove that the other hand, ξ(t

(Sxx Hξ + Hx )|ΓS = 0. But this immediately follows by diﬀerentiation of the Hamilton-Jacobi equation (12.11) with respect to x. Proposition 12.3 directly implies Theorem 12.4. ΓS is invariant with respect to the Hamiltonian ﬂow (i.e., with respect to the motion along the bicharacteristics). This means that if (x(t), ξ(t)) is a bicharacteristic deﬁned for t ∈ (a, b) and ξ(t0 ) = Sx (x(t0 )) for some t0 ∈ (a, b), then Sx (x(t)) = ξ(t) for all t ∈ (a, b). Using Theorem 12.4, we can construct a manifold ΓS if it is known over a submanifold M of codimension 1 in Rnx . If we also know S over M , then S is recovered by simple integration over ΓS (note that if ΓS is found, then S is recovered from a value at a single point). It should be noted, however, that all these problems are easy to study locally whereas we are rarely successful in recovering S globally. We more often succeed in extending ΓS to a manifold which is not the graph of a gradient any longer but is a so-called Lagrangian manifold; this often suﬃces to solve the majority of problems of mathematical physics (the ambiguity of projecting a Lagrangian manifold onto Rnx causes the appearance of caustics). The detailed analysis of these questions, however, is beyond the scope of this textbook. Let us indicate how to construct (locally) S if it, itself, and its gradient are known over an initial manifold M of codimension 1 in Rnx . If the

12.2. The Hamilton-Jacobi equation

265

bicharacteristic L = {(x(t), ξ(t))} begins at (x0 , ξ0 ), where x0 ∈ M and ξ0 ∈ Sx (x0 ), and terminates at (x1 , ξ1 ), then, clearly,

(12.13) S(x1 ) − S(x0 ) = Sx dx = ξdx. L

L

The projections of bicharacteristics to Rnx are called rays. We will only consider bicharacteristics starting over M at the points of the form (x, Sx (x)), where x ∈ M . Take two rays x1 (t) and x2 (t) corresponding to these bicharacteristics. They may intersect at some point x3 ∈ Rnx and then (12.13) gives two (generally, distinct) values for S(x) at x3 . For this reason the Cauchy problem for the Hamilton-Jacobi equation (ﬁnding S from the values of S|M and Sx |M ) is rarely globally solvable. The Cauchy problem is, however, often solvable locally. For its local solvability it suﬃces, for example, that rays of the described type that start at the points from M be transversal to M at the initial moment. Then we can consider the map f : M × (−ε, ε) −→ Rnx , which to a pair {x0 ∈ M, t ∈ (−ε, ε)} assigns a point x(t) of the ray starting at x0 (i.e., x(0) = x0 ) which is the projection of the bicharacteristic (x(t), ξ(t)) such that ξ(0) = Sx (x0 ). (We assume that this map is well-deﬁned; i.e., the corresponding bicharacteristics are deﬁned for t ∈ (−ε, ε).) By the implicit function theorem, f determines a diﬀeomorphism of U × (−ε, ε) (where U is a small neighborhood of x0 in M and ε is suﬃciently small) with an ˙ is not tangent to M (this is a necessary open set in Rnx if and only if x(0) and suﬃcient condition for the derivative map of f to be an isomorphism (Tx0 M × R −→ Tx0 Rnx ). In this case locally the rays do not intersect and the Cauchy problem is locally solvable. Let us give an exact formulation of the Cauchy problem. Given a submanifold M ⊂ Rnx and S0 ∈ C ∞ (M ), let ΓS |M be a submanifold in (T ∗ Rnx )|M which projects to the graph of the gradient of S0 under the natural projection onto T ∗ M and belongs to the zero level surface of H(x, ξ). The Cauchy problem is then to ﬁnd S such that S|M = S0 and ΓS |M coincides with the submanifold thus denoted above and given over M . If the above transversality condition holds at all points of M , the Cauchy problem has a solution in a neighborhood of M . The level surfaces of S are usually called wave fronts. We will see below why the use of this term here agrees with that in Example 12.2. Example 12.4. Consider a Hamilton-Jacobi equation of the form 1 (12.14) |Sx (x)|2 = 2 . a The Hamiltonian in this case is 1 H(x, ξ) = |ξ|2 − 2 a

266

12. Wave fronts and short-wave asymptotics

and the Hamiltonian equations are ξ˙ = 0, x˙ = 2ξ, which implies that the Hamiltonian curves are given by ξ = ξ0 , x = 2ξ0 t + x0 . In particular, the rays are straight lines in all directions. If such a ray is a projection of a bicharacteristic belonging to the graph ΓS of the gradient of the solution S(x) (due to Theorem 12.4 this means that Sx (x0 ) = ξ0 ), then x˙ Sx (x(t)) = ξ(t) = ξ0 = , 2 along the whole ray, and the ray x(t) is orthogonal to all the wave fronts S(x) = const which it intersects. Note that in general rays are not necessarily orthogonal to wave fronts. The function S(x) = a−1 |x| in Rn \0 is one of the solutions of (12.14). Its level lines, i.e., wave fronts, are spheres {x : S(x) = t}, and they coincide with the wave fronts of Example 12.2. The rays in this case are straight lines which also coincide with the rays, which we mentioned in Example 12.2. Now consider the following particular case of the Hamilton-Jacobi equation: ∂S ∂S + A x, = 0, (12.15) ∂t ∂x where S = S(t, x), A = A(x, ξ), t ∈ R1 , x, ξ ∈ Rn , A ∈ C 2 is assumed to be real valued. This is a Hamilton-Jacobi equation in Rn+1 t,x with the Hamiltonian H(t, x, τ, ξ) = τ + A(x, ξ). The Hamiltonian system that deﬁnes the Hamiltonian curves (t(s), x(s), τ (s), ξ(s)) is ⎧ ⎪ t˙ = 1, ⎪ ⎪ ⎪ ⎨x˙ = A (x, ξ), ξ

⎪ τ˙ = 0, ⎪ ⎪ ⎪ ⎩ξ˙ = −A (x, ξ) x (the “dot” stands for the derivative with respect to the parameter s). It follows that τ (s) = τ0 , t(s) = s + t0 , and (x(s), ξ(s)) is a Hamiltonian curve of the Hamiltonian A(x, ξ). Since a shift of the parameter of a Hamiltonian curve does not aﬀect anything, we may assume that t0 = 0, i.e., t = s,

12.2. The Hamilton-Jacobi equation

267

and regard x and ξ as functions of t. Further, when interested in bicharacteristics, we may arbitrarily set x(0) = x0 , ξ(0) = ξ0 , and then τ0 is uniquely determined: τ0 = −A(x0 , ξ0 ). Therefore arbitrary Hamiltonian curves (x(t), ξ(t)) of the Hamiltonian A(x, ξ) are in one-to-one correspondence with bicharacteristics for H(t, x, τ, ξ) starting at t = 0. The rays corresponding to the bicharacteristics described above are of the form (t, x(t)). In particular, they are transversal to all the planes t = const. We may set the Cauchy problem for the equation (12.15) imposing the initial condition (12.16)

S|t=0 = S0 (x).

∂S 0 This implies Sx |t=0 = ∂S ∂x , and from (12.15) we ﬁnd ∂t t=0 . Therefore, the graph ΓS of the gradient of S is known over the plane M = {(t, x) : t = 0}. Since the rays are transversal to M at the initial moment, ΓS may be extended to deﬁne ﬁrst ΓS and then S over a neighborhood of M . Namely, in accordance with (12.13), we obtain

t [τ0 dt + ξ(t)x(t)dt], ˙ S(t, x) = S0 (x0 ) + 0

where (x(t), ξ(t)) is a Hamiltonian curve of the Hamiltonian A(x, ξ) such that x(0) = x0 and x(t) = x; i.e., the points x0 and x are connected by a ray (clearly, x0 should be chosen so that the ray with x0 as the source hits x at the time t); besides, ∂S0 (x0 ) = −A(x(t), ξ(t)) τ0 = −A(x0 , ξ0 ) = −A x0 , ∂x (this means that the corresponding bicharacteristic of H(t, x, τ, ξ) belongs to ΓS ). Thus,

x (12.17) S(t, x) = S0 (x0 ) + (ξdx − Adt), x0

where the integral is taken over a Hamiltonian curve of the Hamiltonian A such that its projection joins x0 and x. Instead of (12.17) we can also write

x Ldt, (12.18) S(t, x) = S0 (x0 ) + x0

where L = ξ x˙ − A is called the Lagrangian corresponding to the Hamiltonian A and the integral is taken again over the curve. Formulas (12.17) and (12.18) show that S is similar to the action of classical mechanics (ξ ∈ Rn plays the role of momentum). Therefore, the wave fronts are level surfaces of the action.

268

12. Wave fronts and short-wave asymptotics

We will not go any deeper into the Hamilton-Jacobi equation theory and its relations with mechanics and geometric optics. Details can be found in textbooks on mechanics (see, e.g., Arnold [3]). Note, in particular, that we have proved here the uniqueness of a solution for the Cauchy problem and gave a method of its construction but did not verify its existence, i.e., we did not verify that this method actually produces a solution. This is really so in the domain where the above system of rays starting at t = 0 is diﬀeomorphic to the above system of parallel segments (in particular, this is true in a small neighborhood of the plane {(t, x) : t = 0}), but we skip the veriﬁcation of this simple fact. Let us give one further remark on the Hamilton-Jacobi equation of the form (12.15). Suppose we are interested not in the solution itself but only in the wave fronts {(t, x) : S(t, x) = c}, where c is a constant. Fix one value of the constant and consider the system of surfaces {x : S(t, x) = c} belonging to Rnx and depending on t as on a parameter. If ∂S ∂t = 0 somewhere, then we may (locally) solve the equation S(t, x) = c for t and express this system of ˜ ˜ surfaces in the form {x : S(x) = t}, i.e., in the form of level surfaces of S. ˜ How do we write an equation for S? ˜ S(S(x), x) ≡ c with respect to x, we get

Diﬀerentiating the identity

∂S ˜ · Sx + Sx = 0. ∂t Substituting Sx = −S˜x · (12.19)

∂S ∂t

into (12.15), we get ∂S ∂S ˜ + A x, −Sx (x) = 0. ∂t ∂t

Consider the case most often encountered and the most important for partial diﬀerential equations: the case when A(x, ξ) is homogeneous of ﬁrst order in ξ; i.e., A(x, λξ) = λA(x, ξ), x ∈ Rn , ξ ∈ Rn , λ > 0. We may assume that ∂S ∂t < 0 (otherwise, replace S by −S, which does not aﬀect the wave fronts). Dividing (12.19) by (− ∂S ∂t ) we get (12.20)

A(x, S˜x (x)) = 1.

We could get the same result just assuming that ˜ (12.21) S(t, x) = S(x) − t + c. Then (12.20) is immediately obtained without requiring A(x, ξ) to be homogeneous. The role of homogeneity is in the fact that under the assumption of homogeneity, (12.15) is actually just an equation for the direction of the normal vector to the wave fronts and does not depend in any way on parameterization of the wave fronts and the length of the normal vector.

12.3. The characteristics of hyperbolic equations

269

Thus, assuming that S(t, x) is of the form (12.21) or A(x, ξ) is homogeneous of order 1 in ξ, we see that the wave fronts are determined by ˜ ˜ the equation S(x) = t, where S(x) is a solution of the time-independent Hamilton-Jacobi equation (12.20).

12.3. The characteristics of hyperbolic equations We have not yet discussed the existence of characteristics. Clearly, the elliptic operator has no characteristics. We will see that a hyperbolic operator has plenty of characteristics. Let A = a(t, x, Dt , Dx ) =

aα (t, x)Dtα0 Dxα ,

|α|≤m

where t ∈ R, x ∈ Rn , α = (α0 , α ) is an (n + 1)-dimensional multi-index, aα (t, x) ∈ C ∞ (Ω), with Ω a domain in Rn+1 . Let us recall what the hyperbolicity of A with respect to a distinguished variable t means (see Section 1.1). Consider the principal symbol aα (t, x)τ α0 ξ α . (12.22) am (t, x, τ, ξ) = |α|=m

The hyperbolicity of A means that the equation (12.23)

am (t, x, τ, ξ) = 0,

regarded as an equation with respect to τ , has exactly m real distinct roots for any (t, x) ∈ Ω, ξ ∈ Rn \0. Denote the roots by τ1 (t, x, ξ), . . . , τm (t, x, ξ). Clearly, am is a polynomial of degree m in τ and the hyperbolicity implies that its highest coeﬃcient a(m,0,...,0) (the coeﬃcient by τ m ) does not vanish for (t, x) ∈ Ω (this coeﬃcient does not depend on ξ due to (12.22). If we are only interested in the equation Au = f , then we can divide by this coeﬃcient to make the highest coeﬃcient identically 1. Then all coeﬃcients of the principal symbol am will become real. So we can (and will) assume that a(m,0,...,0) ≡ 1 from the very beginning. Since the roots of am with respect to τ are simple, we have ∂am

= 0. ∂τ τ =τj By the implicit function theorem, we can solve (12.23) for τ in a neighborhood of τj and obtain τj (t, x, ξ) as a locally smooth function of (t, x, ξ) ∈ Ω × (Rn \0), i.e., in a suﬃciently small neighborhood of any ﬁxed point (t0 , x0 , ξ0 ) ∈ Ω × (Rn \0).

270

12. Wave fronts and short-wave asymptotics

The last statement can be made global due to hyperbolicity. Indeed, we can enumerate the roots so that τ1 (t, x, ξ) < τ2 (t, x, ξ) < · · · < τm (t, x, ξ). Then every τj will be a continuous (hence, smooth) function of t, x, ξ on Ω × (Rn \0) because when we continuously move t, x, ξ (keeping ξ = 0), the roots also move continuously and cannot collide (a collision would contradict hyperbolicity); hence their order will be preserved for any such motion. Further, by homogeneity of am , i.e., since am (t, x, sτ, sξ) = sm am (t, x, τ, ξ), it is clear that the set of roots τ1 (t, x, sξ), . . . , τm (t, x, sξ) coincides with the set sτ1 (t, x, ξ), . . . , sτm (t, x, ξ). Since the order of the roots τj is preserved under the homothety, we see that the functions τj (t, x, ξ) are smooth, real valued, and homogeneous of order 1 in ξ: τj (t, x, sξ) = sτj (t, x, ξ), s > 0. In particular, if we know them for |ξ| = 1, then they can be extended on all values of ξ = 0. Suppose the surface {(t, x) : S(t, x) = 0} is a characteristic. Then the vector (τ, ξ) = ( ∂S ∂t , Sx ) on this surface satisﬁes (12.23). But then it is clear that for some j we locally have ∂S ∂S (12.24) = τj t, x, ∂t ∂x (again on the surface S = 0). In particular, for S we may take any solution of the Hamilton-Jacobi equation (12.24). We only need for the surface S = 0 to be nonsingular; i.e., S = 0 should imply grad S = 0. Such a function S may be locally obtained, for example, by solving the Cauchy problem for the equation (12.24) with the initial condition S|t=t0 = S0 (x), where gradx S0 (x) = 0. For instance, we may construct a characteristic passing through (t0 , x0 ) taking S0 (x) = ξ · (x − x0 ), where ξ ∈ Rn \0. In fact, taking S0 (x) to be arbitrary, so that grad S0 (x) = 0 on {x : S0 (x) = 0}, and choosing turn by turn all τj , j = 1, . . . , m, we will ﬁnd m diﬀerent characteristics, whose intersection with the plane t = t0 is the same nonsingular hypersurface in this plane. This means that the wave front, preassigned for t = t0 , may propagate in m diﬀerent directions. We may verify that the obtained characteristics, whose intersection with the plane t = t0 is the given nonsingular hypersurface, do not actually depend on the arbitrariness in the choice of the initial function S0 if the

12.4. The eikonal and transport equations

271

initial surface {x : S0 (x) = 0} is ﬁxed. We skip a simple veriﬁcation of this fact; it follows, for example, from the study of the above method of constructing solutions for the Hamilton-Jacobi equation. For n = 1 such a veriﬁcation is performed in Chapter 1, where we proved that in R2 exactly two characteristics of any second-order hyperbolic equation pass through each point.

12.4. Rapidly oscillating solutions. The eikonal equation and the transport equations Now we wish to understand how the wave optics turns into the geometric optics for very short waves (or, equivalently, for high frequencies). Recall that the plane wave with frequency ω and wave vector k is of the form k u(t, x) = ei(ωt−k·x) = eiω(t− ω ·x) . The length of the vector ωk is equal to a1 , where a is the speed of the wave propagation. For the three-dimensional wave equation, a is a constant, i.e., it does not depend on ω, whereas ω may be viewed as arbitrary, in particular, arbitrarily large. Therefore, for any vector k0 of length a1 and any ω there exists a plane wave of the form (12.25)

u(t, x) = eiω(t−k0 ·x) .

We may assume now k0 to be ﬁxed and let ω tend to +∞. This is the limit transition to geometric optics for the case of a plane wave. The argument of the exponent, ϕ = ω(t − k0 · x), is called the phase of the plane wave (12.25). Let us ﬁx ω and k0 and consider constant phase surfaces ϕ = const, which are called wave fronts. For each t a wave front is the plane {x : k0 · x = t + c} ⊂ R3 . A wave front moves with the speed a in the direction of the vector k0 . Therefore, the straight lines in the direction k0 are naturally called rays. Now consider a general diﬀerential operator aα (x)D α , x ∈ Ω ⊂ Rn , A = a(x, D) = |α|≤m

and try to ﬁnd a solution of the equation Au = 0, by analogy with the plane wave (12.25), in the form (12.26)

u(x) = eiλS(x) ,

where λ is a large parameter and S(x) is a real-valued function called phase.

272

12. Wave fronts and short-wave asymptotics

Calculating Au, we see that (12.27)

Au = λm am (x, Sx (x))eiλS(x) + O(λm−1 ),

where am (x, ξ) is the principal symbol of A (see Section 1.2). For the moment we will ignore the terms denoted in (12.27) by O(λm−1 ) (they are of the form of eiλS(x) times a polynomial of λ of degree not higher than m − 1). But it is clear from (12.27) that if we want Au = 0 to be satisﬁed for arbitrarily large values of λ, then S(x) should satisfy the equation ∂S =0 (12.28) am x, ∂x which is called the eikonal equation in geometric optics. It is a HamiltonJacobi equation if the principal symbol am (x, ξ) is real valued (only this case will be considered in what follows). The equation (12.28) means the fulﬁllment of Au = 0 in the ﬁrst approximation. More precisely, if u(x) is of the form (12.26), then (12.28) is equivalent to Au = O(λm−1 )

as λ → ∞.

We see that the search for wave fronts (constant phase surfaces) reduces in this case to solving the Hamilton-Jacobi equation. Suppose that the solvability conditions for the Cauchy problem hold for the Hamilton-Jacobi equation (12.28) (this is true, for example, if A is hyperbolic with respect to a variable denoted by t). Then we may construct wave fronts by methods of geometric optics proceeding from their disposition on the initial manifold (at t = 0 in the hyperbolic case). Now let us try to satisfy Au = 0 with better precision. It turns out that it is a good idea to seek u in the form (12.29)

u = eiλS(x) b(x, λ),

where b(x, λ) =

∞

bj (x)λ−j

j=0

is a formal power series (in λ) called the amplitude. Therefore u = u(x, λ) is also a formal series. Let us explain the meaning of Au. Clearly, (12.30)

A(f (x)eiλS(x) ) = eiλS(x)

m l=0

λm−l (Al f ),

12.4. The eikonal and transport equations

273

where the Al are linear diﬀerential operators. Applying A to the kth term of (12.29) we get (12.31)

A(e

iλS(x)

−k

bk (x)λ

)=e

iλS(x)

m

λm−l−k (Al bk ).

l=0

Set (12.32)

Au = eiλS(x)

∞

vj (x)λm−j ,

j=0

which is the formal series obtained by applying A termwise to (12.29) and grouping together all the terms with the same power of λ (as is clear from (12.31), there are only a ﬁnite number of terms with any given power of λ). We call u an asymptotic or rapidly oscillating solution if Au = 0, i.e., if all the coeﬃcients vj in (12.32) vanish. Let u be an asymptotic solution of the form (12.29). Consider a ﬁnite segment of the series (12.29): uN (x, λ) = eiλS(x)

N

bj (x)λ−j .

j=0

Then uN is a (regular) function of x depending on λ as on a parameter. The series u − uN begins with λ−N −1 . Therefore, the series A(u − uN ) begins with λ−N −1+m . But Au = 0; therefore, A(u − uN ) = −AuN , implying AuN = O(λ−N −1+m ). Thus, if we know an asymptotic solution u(x, λ), then its ﬁnite segments uN (x, λ) for large λ are “near-solutions” which become ever more accurate as N increases. Similar arguments are applicable to a nonhomogeneous equation Au = f and to boundary value problems for this equation. Let a boundary value problem for Au = f have a unique solution continuously depending on f (this is often possible to deduce, for example, from energy estimates). Then, having found uN such that AuN = fN diﬀers but little from f (for large λ), we see that since A(u − uN ) = f − fN , the approximate solution diﬀers only a bit from the exact solution u (we assume that uN satisﬁes the same boundary conditions as u). The “exact solution”, however, does not have greater physical meaning than the approximate one since in physics the equations themselves are usually approximate. Therefore, in the sequel we will speak about approximate solutions and not bother with “exact” one. Note, however, that an approximate solution satisﬁes the equation with good accuracy only for large λ (for “high frequencies” of “in short-wave range”).

274

12. Wave fronts and short-wave asymptotics

How do we ﬁnd asymptotic solutions? Note ﬁrst that we may assume b0 (x) ≡ 0 (if b0 (x) ≡ 0, then we could factor out λ and reenumerate the coeﬃcients bj ). It may happen that b0 (x) ≡ 0 but b0 |U ≡ 0 for some open subset U ⊂ Rn . In this case we should consider U and its complement separately. With this remark in mind we will assume that b0 |U ≡ 0 for any U ⊂ Ω. But then the highest term in the series Au is (12.33)

λm am (x, Sx (x))b0 (x)eiλS(x) .

If we wish it to be 0, then the phase S(x) should satisfy the Hamilton-Jacobi equation (12.28) and if this equation is satisﬁed, then (12.33) vanishes whatever b0 (x). Thus, let S(x) be a solution of the Hamilton-Jacobi equation. Now, let us try to ﬁnd bj (x) so that the succeeding terms would vanish. For this, let us write the coeﬃcients vj in the series Au (formula (12.32)) more explicitly. In the above notation, we clearly have (Al bk ), j = 0, 1, 2, . . . , (12.34) vj (x) = l+k =j 0≤l≤m k = 0, 1, 2, . . .

where the Al are linear diﬀerential operators depending on S and deﬁned by (12.30). The sum in (12.34) runs over l ∈ {0, 1, . . . , m} and integers k ≥ 0 such that l + k = j; therefore, the sum is ﬁnite. For j = 0 the sum contains one summand: A0 b0 = am (x, Sx (x))b0 (x). This makes it clear that A0 = 0 due to the choice of S(x). Therefore, the sum over l in (12.34) actually runs from 1 to m. We have v1 = A1 b0 , v2 = A1 b1 + A2 b0 , v3 = A1 b2 + A2 b1 + A3 b0 , ··· . Therefore, the conditions vj = 0, j = 0, 1, 2, . . ., yield also, in addition to the already written Hamilton-Jacobi equation, linear diﬀerential equations for functions bj . These equations may be written in the form A1 b0 = 0, A1 b1 = −A2 b0 , A1 b2 = −A2 b1 − A3 b0 , ··· ; i.e., all of them are of the form (12.35)

A1 bj = fj ,

where fj is expressed in terms of b0 , . . . , bj−1 . Equations (12.35) are called transport equations. portance of the role played by A1 .

They show the im-

12.4. The eikonal and transport equations

275

Let us ﬁnd A1 . We assume that S(x) satisﬁes the eikonal equation (12.28). Then A1 is determined from the formula A(f (x)eiλS(x) ) = eiλS(x) λm−1 (A1 f ) + O(λm−2 ). The direct computation, using the Leibniz product diﬀerentiation rule, shows that for |α| = m we have D α (f (x)eiλS(x) ) = λm (Sx (x))α f (x)eiλS(x) + λm−1

n

αj (Sx (x))α−ej [Dj f (x)]eiλS(x)

j=1

+ λm−1 cα (x)f (x)eiλS(x) + O(λm−2 ), where Dj = i−1 ∂x∂ j , ej is the multi-index with only the jth nonzero entry which is equal to 1, and cα is a C ∞ -function which depends upon the ﬁrst and second derivatives of S. Since αj ξ α−ej =

∂ α ξ , ∂ξj

it follows that A(f (x)eiλS(x) ) ⎤ ⎡ n ∂am (Dj f (x)) + c(x)f (x)⎦ eiλS(x) + O(λm−2 ), = λm−1 ⎣ ∂ξj ξ=Sx (x) j=1

implying

n ∂am A1 = ∂ξj j=1

Dj + c(x) = i−1 L1 + c(x),

ξ=Sx (x)

where L1 is the derivative along the vector ﬁeld ∂ ∂am . · V(x) = ∂ξ ξ=Sx (x) ∂x What is the geometric meaning of V(x)? Let us write the Hamiltonian system with the Hamiltonian am (x, ξ): m x˙ = ∂a ∂ξ , ξ˙ = − ∂am ∂x

and take a bicharacteristic belonging to the graph of the gradient S, i.e., a bicharacteristic (x(t), ξ(t)) such that ξ(t) = Sx (x(t)). Then V(x(t)) is

276

12. Wave fronts and short-wave asymptotics

the tangent vector to the ray x(t) corresponding to such a bicharacteristic. Therefore, the transport equations (12.35) are of the form 1 d bj (x(t)) + c(x(t))bj (x(t)) = fj (x(t)); i dt i.e., they are ordinary linear diﬀerential equations along rays. Therefore, the value bj (x(t)) along the ray x(t) is uniquely determined by bj (x(0)). Example 12.5. Consider the wave equation in a nonhomogeneous medium ∂2u − c2 (x)Δu = 0, x ∈ R3 , ∂t2 where c(x) is a smooth positive function of x. Due to the general recipe, we seek a rapidly oscillating solution in the form

(12.36)

(12.37)

u(t, x, λ) = e

iλS(x)

∞

bj (t, x)λ−j .

j=0

The principal symbol of the corresponding wave operator is H(t, x, τ, ξ) = −τ 2 + c2 (x)|ξ|2 , so the phase S(x, t) should satisfy the following eikonal equation: 2 2 ∂S ∂S 2 − c (x) = 0. ∂t ∂x The equations of Hamiltonian curves are ⎧ ⎪ t˙ = −2τ, ⎪ ⎪ ⎪ ⎨x˙ = 2c2 (x)ξ, ⎪ τ˙ = 0, ⎪ ⎪ ⎪ ⎩ξ˙ = −2c(x)|ξ|2 ·

∂c ∂x ,

implying τ = τ0 , t = −2τ0 σ + t0 , where σ is the parameter along the curve. Since we are only interested in bicharacteristics, we should also assume that τ 2 = c2 (x)|ξ|2 or τ = ∓c(x)|ξ|. From this we ﬁnd for |ξ| = 0 x˙ 2c2 (x)ξ dx ξ = = = ±c(x) . ˙ dt −2τ0 |ξ| t In particular, the absolute value of the ray’s speed is c(x). This easily implies that c(x) is the speed of the wave front propagation at each point. The ﬁrst and the last equations of bicharacteristics also imply −2c(x)|ξ|2 cx (x) dξ = = ∓|ξ| · cx (x). dt −2τ0

12.4. The eikonal and transport equations

277

Thus, x(t) and ξ(t) satisfy (12.38)

dx dt dξ dt

ξ = ±c(x) |ξ| ,

= ∓|ξ| · cx (x).

This is a Hamiltonian system with the Hamiltonian H(x, ξ) = ±c(x)|ξ|. In particular, c(x)|ξ| = H0 = const along the trajectories. The choice of the constant H0 (determined by the initial conditions) does not aﬀect the rays x(t) because (x(t), λξ(t)) is a solution of (12.38) for any λ > 0 as soon as (x(t), ξ(t)) is. Using this fact, let us try to exclude ξ(t). Replacing |ξ| by H0 /c(x) we derive from (12.38) that (12.39)

dx dt dξ dt

= ±H0−1 c2 (x)ξ, x (x) = ∓H0 cc(x) .

Now, substituting the expression ξ=±

(12.40)

H0 dx c2 (x) dt

found from the ﬁrst equation into the second one, we get

(12.41)

d dt

1 dx c2 (x) dt

+

cx (x) = 0. c(x)

Thus, we have shown that every ray x(t) satisﬁes the second-order diﬀerential equation (12.41). This is a nonlinear diﬀerential equation (more precisely, a system of n such equations). It can be resolved with respect to d2 x (its coeﬃcient is equal to c21(x) and we have assumed that c(x) > 0). dt2 Therefore, the initial conditions x(0) = x0 , x (0) = v0 (the “prime” stands for the derivative with respect to t) uniquely deﬁne x(t). Not every solution x(t) of this equation, however, determines a ray, since the passage from (12.38) to (12.41) is not exactly an equivalence. Let us consider this in detail. Let x(t) be a solution of (12.41). Determine ξ(t) via (12.40) choosing (so far, arbitrarily) a positive constant H0 . Then (12.41) is equivalent to the second of equations (12.39), whereas the ﬁrst of equations (12.39) means simply that ξ(t) is obtained via (12.40). Therefore, if (x(t), ξ(t)) is a solution

278

12. Wave fronts and short-wave asymptotics

of (12.39) for some H0 > 0, then x(t) is a solution of (12.41) and, conversely, if x(t) is a solution of (12.41) and H0 is an arbitrary positive constant, then there exists a unique function ξ(t), such that (x(t), ξ(t)) is a solution of (12.39). How do we pass from (12.39) to (12.38)? If (x(t), ξ(t)) is a solution of (12.39) and ξ(t) = 0, then clearly (12.38) holds if and only if c(x(t))|ξ(t)| = H0 . Note that for this it suﬃces that c(x(0))|ξ(0)| = H0 . Indeed, it suﬃces to verify that the vector ﬁeld

±H0−1 c2 (x)ξ,

cx (x) ∓H0 c(x)

is tangent to the manifold M = {(x, ξ) : c(x)|ξ| = H0 } ⊂ R2n x,ξ . Take (x0 , ξ0 ) ∈ M and a curve (x(t), ξ(t)) which is a solution of (12.39) such ˙ ˙ ξ(0)) that (x(0), ξ(0)) = (x0 , ξ0 ). In particular, the velocity vector (x(0), ˙ ˙ ξ(0)) coincides with the vector ﬁeld at (x0 , ξ0 ). We are to verify that (x(0), is tangent to M at (x0 , ξ0 ), i.e., that d 2 (c (x(t))|ξ(t)|2)|t=0 = 0. dt But d 2 (c (x(t))|ξ(t)|2 )|t=0 = 2c(x0 )[cx (x0 ) · x (0)]|ξ0 |2 + 2c2 (x0 )ξ0 · ξ (0) dt = 2c(x0 )|ξ0 |2 cx (x0 ) · [±H0−1 c2 (x0 )ξ0 ] + 2c2 (x0 )ξ0 · [∓H0 c−1 (x0 )cx (x0 )] = 2c(x0 )[ξ0 · cx (x0 )][±H0−1 c2 (x0 )|ξ0 |2 ∓ H0 ] = 0, because c(x0 )|ξ0 | = H0 due to our assumption that (x0 , ξ0 ) ∈ M . Thus, if c(x(0))|ξ(0)| = H0 , then a solution (x(t), ξ(t)) of (12.39) is also a solution of (12.38); hence, x(t) is a ray. Now note that equation (12.40) implies that for each ﬁxed t the condition c(x(t))|ξ(t)| = H0 is equivalent to |x (t)| = c(x(t)). In particular, if |x (0)| = c(x(0)) and x(t) is a solution of (12.41), then x(t) is a ray and |x (t)| = c(x(t)) for all t. Thus, the rays x(t) are distinguished among the solutions of (12.41) by the fact that their initial values x(0) = x0 , x (0) = v0 satisfy v0 = c(x0 ). Let us verify “Fermat’s principle”, which means that the rays are extrema of the functional T = T (γ) deﬁning time for our wave (e.g., light) to

12.4. The eikonal and transport equations

279

pass along the curve γ; i.e.,

T (γ) = γ

ds , c

where ds is the arc-length element of the curve γ and c = c(x) is considered as a function on γ. The extremum is understood with respect to all variations preserving the endpoints, i.e., in the class of the paths γ that connect two ﬁxed points. Denote these points x0 and x1 . Since the extremality of a curve does not depend on a choice of a parameter, it is convenient to assume that the parameter runs over [0, 1]. Thus, let γ = {x(τ ) : τ ∈ [0, 1]}. Take variations of γ preserving the conditions x(0) = x0 , x(1) = x1 . A curve is an extremum if it satisﬁes the Euler-Lagrange equations (see Section 2.5) for the functional

1 |x(τ ˙ )|dτ , T ({x(τ )}) = c(x(τ )) 0 where the “dot” denotes the derivative with respect to τ . Lagrange equations d dτ for the Lagrangian L =

(12.42)

d dτ

|x| ˙ c(x)

∂L ∂ x˙

−

The Euler-

∂L =0 ∂x

can be rewritten as

1 1 dx(τ ) c(x(τ )) |x(τ ˙ )| dτ

+

|x(τ ˙ )|

cx (x(τ )) c2 (x(τ ))

= 0.

Clearly, the extremality of a curve is independent upon a choice of the parameter τ and its range, and so the Euler-Lagrange equations hold for every such choice or for none. So the parameter τ may run over any ﬁxed segment [a, b]. Let now {x(t)} be a ray with the parameter t (the same as in (12.38)). We know that |x(t)| ˙ = c(x(t)) along the ray. But then setting τ = t, we see that the Lagrange equation (12.42) turns into (12.41) derived above for rays. Thus, the rays satisfy the Fermat principle. This agrees, in particular, with the fact that in a homogeneous medium (i.e., for c(x) = c = const) the rays are straight lines. Let us write down the transport equations in the described situation. For this purpose, substitute a series u(t, x, λ) of the form (12.37) into equation (12.36). We will assume that S(t, x) satisﬁes the eikonal equation. Then

280

12. Wave fronts and short-wave asymptotics

the series

∂2u ∂t2

− c2 (x)Δu begins with λ1 . We have

ut (t, x, λ) = iλSt e

utt (t, x, λ) =

ux (t, x, λ) =

Δu(t, x, λ) =

iλS

∞

bj λ

−j

+e

iλS

∞ ∂bj

∂t

λ−j ,

j=0 j=0 ∞ ∞ −λ2 St2 eiλS bj λ−j + iλStt eiλS bj λ−j j=0 j=0 ∞ ∞ ∂bj −j ∂ 2 bj −j λ + eiλS +2iλSt eiλS λ , ∂t ∂t2 j=0 j=0 ∞ ∞ ∂bj −j λ , iλSx eiλS bj λ−j + eiλS ∂x j=0 j=0 ∞ ∞ bj λ−j + iλΔSeiλS bj λ−j −λ2 Sx2 eiλS j=0 j=0 ∞ ∞ ∂bj −j λ + eiλS +2iλeiλS Sx · (Δbj )λ−j . ∂x j=0 j=0

(Here Sx2 means Sx · Sx = |Sx |2 .) Now, compute utt − c2 (x)Δu. The principal terms (with λ2 ) in the expressions for utt and c2 (x)Δu cancel due to the eikonal equation. Equating the coeﬃcients of all the powers of λ to zero and dividing by 2i, we get 2 0 St ∂b ∂t − c (x)Sx · 2 1 St ∂b ∂t − c (x)Sx ·

(12.43)

∂b0 1 2 ∂x + 2 [Stt − c (x)ΔS]b0 ∂b1 1 2 ∂x + 2 [Stt − c (x)ΔS]b1 2 +(2i)−1 [ ∂∂tb20 − c2 (x)Δb0 ] = 0, ∂b2 1 2 2 2 St ∂b ∂t − c (x)Sx · ∂x + 2 [Stt − c (x)ΔS]b2 2 +(2i)−1 [ ∂∂tb21 − c2 (x)Δb1 ] = 0,

= 0,

···

∂b St ∂tj

∂b − c2 (x)Sx · ∂xj + 12 [Stt − c2 (x)ΔS]bj ∂2b +(2i)−1 [ ∂tj−1 − c2 (x)Δbj−1 ] = 0. 2

These are the transport equations. The Hamiltonian equations imply that the vector ﬁeld (St , −c2 (x)Sx ) = (τ, −c2 (x)ξ)

12.5. The Cauchy problem with rapidly oscillating initial data

281

is tangent to the rays; i.e., the equations (12.43) are of the form d bj (t(σ), x(σ)) + a(σ)bj (t(σ), x(σ)) + f (σ) = 0, dσ where a(σ) only depends on S and f (σ) depends on S, b0 , . . . , bj−1 while (t(σ), x(σ)) is a ray with the parameter σ considered at the beginning of the discussion of this example. Let us indicate another method of analyzing this example and examples of a similar structure. We may seek solutions of (12.36) in the form (12.44)

u(t, x, ω) = eiωt v(x, ω),

where ω is a large parameter and v is a formal series of the form ∞ iωS(x) vk (x)ω −k . v(x, ω) = e k=0

Equation (12.36) for a function u(t, x, ω) of the form (12.44) turns into an equation of the Helmholtz type for v(x, ω) [Δ + ω 2 c−2 (x)]v(x, ω) = 0. For S(x) we get the eikonal equation Sx2 (x) =

1 . c2 (x)

We could get the same result assuming initially S(t, x) = t − S(x). We will not write now the transport equations for functions vk (x). The structure of these equations is close to that of nonstationary transport equations (12.43). Remark. Transport equations describe transport of energy, polarization, and other important physical phenomena in physical problems. We will not dwell, however, on these questions which rather belong to physics.

12.5. The Cauchy problem with rapidly oscillating initial data Consider the equation Au = 0, where A = a(t, x, Dt , Dx ), t ∈ R, x ∈ Rn , and A is hyperbolic with respect to t. Let us consider a formal series ∞ iλS(t,x) bj (t, x)λ−j (12.45) u(t, x, λ) = e j=0

and try to ﬁnd out which initial data at t = 0 enable us to recover this series if it is an asymptotic solution. We will only consider the phase functions S(t, x) such that gradx S(t, x) = 0. The function S(t, x) should satisfy the eikonal equation (12.46)

am (t, x, St , Sx ) = 0,

282

12. Wave fronts and short-wave asymptotics

where am (t, x, τ, ξ) is the principal symbol of A (which can be assumed real without loss of generality). But as we have seen above in Section 12.3, this is equivalent to the fulﬁllment of one of m equations ∂S = τl (t, x, Sx ) ∂t

(12.47)

where τl (t, x, ξ), for l = 1, 2, . . . , m, is the complete set of roots of the equation am (t, x, τ, ξ) = 0, chosen, say, in increasing order. Setting the initial condition S|t=0 = S0 (x),

(12.48)

where S0 (x) is such that gradx S0 (x) = 0, we may construct exactly m diﬀerent solutions of the eikonal equation (12.46) satisfying this initial condition, namely, solutions of each of the equations (12.47) with the same initial condition (12.48) (such a solution is unique as soon as we have choosen the root τl ). Solutions of equations (12.47) exist in a small neighborhood of the plane t = 0 determined as indicated in Section 12.2. In the same neighborhood, the functions bj (t, x) are uniquely deﬁned from the transport equations provided the initial values are given: bj |t=0 = bj,0 (x), j = 0, 1, 2, . . . . Let us now formulate the Cauchy problem for the equation Au = 0 by imposing the initial conditions of the form (0) −j u|t=0 = eiλS0 (x) ∞ j=0 cj (x)λ , (1) ∞ ut |t=0 = λeiλS0 (x) j=0 cj (x)λ−j , (12.49) · ·· (m−1) ∂ m−1 u = λm−1 eiλS0 (x) ∞ (x)λ−j . j=0 cj ∂tm−1 t=0

Solving this problem would allow us to ﬁnd with arbitrarily good accuracy (for large λ) solutions of the equation Au = 0 with rapidly oscillating initial values that are ﬁnite segments of the series on the right-hand sides of (12.49). A solution of this problem is not just a series of the form (12.45): there may be no such series. We should take a ﬁnite sum of these series with diﬀerent phases S(t, x). Thus, consider a sum (12.50)

u(t, x, λ) =

m

ur (t, x, λ),

r=0

where r

u (t, x, λ) = e

iλS r (t,x)

∞ j=0

bj (t, x)λ−j . (r)

12.5. The Cauchy problem with rapidly oscillating initial data

283

Sum (12.50) is called a solution of the equation Au = 0 if each ur (t, x, λ) k r is a solution of this equation. Since all the series ∂∂tuk are of the form t=0

k iλS0 (x)

λ e

∞

cj (x)λ−j ,

j=0

the initial values u|t=0 , ut |t=0 , . . . ,

∂ m−1 u ∂tm−1 t=0

also are of the same form.

Therefore, it makes sense to say that u(t, x, λ) satisﬁes the initial conditions (12.49). For instance, we may consider a particular case when the series in (12.49) contain one term each: u|t=0 = eiλS0 (x) ϕ0 (x), ··· ∂ m−1 u | = λm−1 eiλS0 (x) ϕm−1 (x). ∂tm−1 t=0 Cutting oﬀ the series for u at a suﬃciently remote term, we see that any asymptotic solution provides (regular) functions u(t, x, λ), such that Au = O(λ−N ), u|t=0 = eiλS0 (x) ϕ0 (x) + O(λ−N ), · ·· ∂ m−1 u = λm−1 eiλS0 (x) ϕm−1 (x) + O(λ−N ), ∂tm−1 t=0

where N may be taken arbitrarily large. Theorem 12.5. The above Cauchy problem for the hyperbolic equation Au = 0 with initial conditions (12.49) is uniquely solvable in a small neighborhood of the initial plane t = 0. Proof. Each of the phases S r (t, x) should satisfy the eikonal equation and the initial condition S|t=0 = S0 (x). There exist, as we have seen, exactly m equations for the amplitudes such phases S 1 , S 2 , . . . , S m . Now the transport (r) (r) bj (t, x) show that it suﬃces to set bj = bj,r (x) to uniquely deﬁne all t=0

(r)

functions bj (t, x). Thus we need to show that the initial conditions (12.49) m r uniquely determine bj,r (x). Substitute u = r=1 u into the initial con(r) ditions (12.49) and write the obtained equations for bj (t, x) equating the coeﬃcients of all the powers of λ. The ﬁrst of equations (12.49) gives m r=1

(r) bj

t=0

(0)

= cj , j = 0, 1, . . . .

284

12. Wave fronts and short-wave asymptotics

The second of equations (12.49) becomes (r) m m ∂bj−1 ∂S r (r) + b i ∂t j t=0 ∂t r=1

r=1

t=0

In general, the equation which determines (12.51) m ∂S r k (r) bj i ∂t r=1

(k)

= fj (x),

(1)

= cj , j = 0, 1, . . . .

∂k u ∂tk t=0

becomes

k = 0, 1, . . . , m − 1, j = 0, 1, . . . ,

t=0

(k)

(r)

(r)

where fj (x) only depends on b0 , . . . , bj−1 . Now, note that ∂S0 ∂S r , = τr 0, x, ∂t t=0 ∂x 0 where τr (0, x, ξ) = τl (0, x, ξ) if r = l and ξ = 0. Therefore, if ∂S ∂x = 0 (which we assume to be the case in the problem considered), then the system (12.51) (r) = bj,r (x) for a ﬁxed j is a system of linear equations with respect to bj t=0 with the determinant equal to the Vandermond determinant 1 . . . 1 iτ1 . . . iτm , ... ... ... (iτ1 )m−1 . . . (iτm )m−1 0 where τr = τr (0, x, ∂S ∂x ). Since, due to hyperbolicity, τj = τk for j = k, the 0 determinant does not vanish anywhere (recall that we assumed ∂S ∂x = 0).

(r)

(r)

The right-hand sides of (12.51) depend only on b0 , . . . , bj−1 . Therefore, the sequence of the systems of linear equations (12.51) enables us to (r) uniquely determine all functions bj by induction. Namely, having found (r)

b0,r (x) for r = 1, . . . , m from (12.51) with j = 0, we may determine b0 (t, x) (r) from the transport equations with the initial conditions b0 |t=0 = b0,r (x). (r) If b0 , . . . , bj−1 are determined, then the functions bj |t=0 , r = 1, . . . , m, are recovered from (12.51), and then the transport equations enable us to (r) determine bj (t, x). Let us brieﬂy indicate the connection of Theorem 12.5 with the behavior of the singularities for solutions of the Cauchy problem. It is well known that the singularities of f (x) are connected with the behavior of its Fourier ˜ at inﬁnity. For instance, the inversion formula transform f (ξ)

(12.52) f (x) = (2π)−n f˜(ξ)eix·ξ dξ

12.5. The Cauchy problem with rapidly oscillating initial data

285

˜ ≤ C(1 + |ξ|)−N , then f ∈ C N −n−1 (Rn ). Therefore, implies that if |f (ξ)| to ﬁnd the singularities of a solution u(t, x) it suﬃces to solve the Cauchy problem up to functions whose Fourier transform with respect to x decreases suﬃciently rapidly as |ξ| → ∞. Formula (12.52) can be regarded as a representation of f (x) in the form of a linear combination of exponents eiξ·x . Since the problem is a linear one, to ﬁnd solutions of the problem with one of the initial conditions equal to f (x), it suﬃces to consider a solution with the initial data f (x) replaced by eix·ξ and then take the same linear combination of solutions. If f ∈ E (Rn ) and supp f ⊂ K, where K is a compact in Rn , it is more convenient to somewhat modify the procedure. Namely, let ϕ ∈ C0∞ (Rn ), ϕ = 1 in a neighborhood of K. Then

−n f˜(ξ)ϕ(x)eix·ξ dξ (12.53) f (x) = ϕ(x)f (x) = (2π) and it suﬃces to solve the initial problem by replacing f (x) by ϕ(x)eix·ξ . For instance, if f (x) = δ(x), ϕ ∈ C0∞ (Rn ), ϕ(0) = 1, then (12.53) holds with f˜(ξ) ≡ 1; therefore, having solved the Cauchy problem ⎧ ⎪ Au = 0, ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨u|t=0 = 0, ut |t=0 = 0, (12.54) ⎪ ⎪ ⎪ · ·· ⎪ ⎪ ⎪ m−1 u ⎪ ∂ ⎩ m−1 = ϕ(x)eix·ξ ∂t

t=0

and having denoted the solution by uξ (t, x), we can write

−n uξ (t, x)dξ E(t, x) = (2π) (if the integral exists in some sense, e.g., converges for any ﬁxed t in the topology of D (Rnx )), and then E(t, x) is a solution of the Cauchy problem ⎧ ⎪ AE = 0, ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨E|t=0 = 0, Et |t=0 = 0, (12.55) ⎪ ⎪ ⎪ · · · ⎪ ⎪ ⎪ m−1 E ⎪ ∂ ⎩ m−1 = δ(x). ∂t

t=0

If we only wish to trace the singularities of E(x), we should investigate the behavior of uξ (x) as |ξ| → +∞. We may ﬁx the direction of ξ introducing ξ . Then the Cauchy problem a large parameter λ = |ξ| and setting η = |ξ|

286

12. Wave fronts and short-wave asymptotics

(12.54) takes on the form of the Cauchy problem with rapidly oscillating initial data. Namely, in (12.49) we should set S0 (x) = η · x, (0)

(1)

(m−2)

cj (x) = cj (x) = · · · = cj (m−1) cj (x) (m−1) cm−1 (x)

= 0,

(x) = 0,

for j = 0, 1, . . . ,

for j = m − 1,

= ϕ(x).

We may ﬁnd a solution uξ (t, x) = uη (t, x, λ) which satisﬁes (12.54) with accuracy up to O(λ−N ) for arbitrarily large N . Then the equations in (12.55) are satisﬁed up to functions of class C N −n−1 (Rn ). It can be proved that singularities of the genuine solution of (12.55) are the same as for the obtained approximate solution; i.e., the genuine solution diﬀers from the approximate one by an arbitrarily smooth function (for large values of N ). Where are the singularities of E(t, x)? To ﬁnd out, we should consider the integral

−n uξ,N (t, x)dξ, EN (t, x) = (2π) where uξ,N is the sum of the ﬁrst N + 1 terms of the series which represents the asymptotic solution uξ . All these terms are of the form

r k λ eiλS (t,x,η) ϕ(t, x, η)dξ, where λ = |ξ|, η = ξ/|ξ|, S r (t, x, η) is one of the phase functions satisfying the eikonal equation with the initial condition S r |t=0 = S0 (x) = η · x; ϕ(t, x, η) is inﬁnitely diﬀerentiable with respect to all variables; ϕ is deﬁned for small t and has the (t, x)-support belonging to an arbitrarily small neighborhood of the set of rays (t, x(t)) with the source in (0, 0). The last statement is deduced from the form of the transport equations. Therefore we see that the support of EN (t, x) belongs to an arbitrarily small neighboorhood of the set of rays passing through (0, 0). Taking into account the fact that the singularities of E and EN coincide (with any accuracy), we see that the distribution E(t, x) is inﬁnitely diﬀerentiable outside of the set of rays with (0, 0) as the source. Here we should take all the rays corresponding to all the functions S r (t, x, η) for each η and each r = 1, . . . , m. We get the set of m curved cones formed by all rays starting from (0, 0) (for each η ∈ Rn , |η| = 1, and each r = 1, . . . , m we get a ray smoothly depending on η). A more detailed description is left to the reader as a useful exercise which should be worked out for second-order equations, for example, for the equation (12.36) describing light propagation in a nonhomogeneous medium.

12.6. Problems

287

12.6. Problems 12.1. For the equation utt = a2 Δu, where u = u(t, x), x ∈ R2 , ﬁnd the characteristics whose intersection with the plane t = 0 is a) the line η · x = 0, where η ∈ R2 \0), b) the circle {x : |x| = R}. 12.2. Describe the relation between the rays and characteristics of the equation utt = c2 (x)uxx . 12.3. For the Cauchy problem ⎧ 2 ⎪ ⎨utt = x uxx , u|t=0 = 0, ⎪ ⎩ ut |t=0 = eiλx , write and solve the eikonal equation and the ﬁrst transport equation. Use this to ﬁnd the function u(t, x, λ), such that for x = 0, we have as λ → ∞ ⎧ 2 −1 ⎪ ⎨utt − x uxx = O(λ ), u|t=0 = O(λ−2 ), ⎪ ⎩ ut |t=0 = eiλx + O(λ−1 ).

10.1090/gsm/205/13

Chapter 13

Answers and hints. Solutions

1.1. a) upp + uqq + urr = 0; the change of variables is ⎧ ⎪ ⎨p = x, q = −x + y, ⎪ ⎩ r = x − 12 y + 12 z. (The reader should keep in mind that a change of variables leading to the canonical form is not unique; the above one is one of them, and it may very well happen that the reader will ﬁnd another one.) b) upp − uqq + 2up = 0; the change of variables is ⎧ ⎪ ⎨p = x + y, q = x − y, ⎪ ⎩ r = y + z. 1.2. a) uzw −

1 w uz

−

b) uzz + uww +

1 4z uw

1 z+w uz

−

1 u z2

+

= 0; the change of variables is

z = yx−3 , w = xy.

1 2w uw

= 0; the change of variables is z = y 2 − x2 , w = x2 . 289

290

13. Answers and hints. Solutions

c) uww + 2uz + uw = 0; the change of variables is z = x + y, w = x. @ 1.3. a) u(x, y) = f ( xy ) + xy g(xy), where f , g are arbitrary functions of one independent variable. (Here and in b) below the form of the general solution is not unique.) Hint. The change of variables z = ln |xy|, w = ln |y/x| ∂ (2uw + u) = 0 which reduces the equation to the form 2uzw + uz = 0; i.e., ∂z implies 2uw + u = F (w) and u(z, w) = a(w) + exp(−w/2)b(z).

b) u(x, y) = ln | xy |f (xy) + g(xy), where f, g are arbitrary functions of one independent variable. Hint. The change of variables z = ln |xy|, w = ln |y/x| reduces the equation to the form uww = 0; hence u(z, w) = wϕ(z) + ψ(z). 2.1. a) ux |x=0 = 0. Hint. Write the dynamical Newton equation of the motion of the ring. b) (mutt − T ux )|x=0 = 0. Hint. See the hint to a). c) (mutt + αut − T ux )|x=0 = 0. l 2.2. a) E(t) = 12 0 [ρu2t (t, x) + T u2x (t, x)]dx = E0 = const . b), c) Let 1 E(t) = mu2t |x=0 + 2

l

[ρu2t (t, x) + T u2x (t, x)]dx. 0

t Then E(t) ≡ E0 = const in case b) and E(t) − E(0) = Af r = − 0 αu2t |x=0 dt in case c) (here Af r is the work of the friction force). 2.3. ρutt = Euxx or utt = a2 uxx with a = E/ρ. Here u = u(t, x) is a longitudinal displacement of a point with position x in equilibrium, ρ is the volume density of the material of the rod, E is the Young modulus (also called the modulus of elongation) which describes the force arising in the deformed material by the formula F = ES Δl l (Hooke’s law), S is the cross

13. Answers and hints. Solutions

291

section where the force is measured, Δl/l is the deformation of the small piece of material around the point of measurement (Δl/l is often called the relative lengthening; here l is the length of the piece in equilibrium position and Δl is the increment of this length caused by the force). Hint. Prove that the relative lengthening of the rod at the point with position x in equilibrium is ux = ux (t, x) so the force F = ESux (t, x) is acting on the left part of the rod in the corresponding cross section at the point x. Consider the motion of the part [x, x + Δx] of the rod; write Newton’s dynamical equation for this part and then take the limit as Δx → 0. 2.4. ux |x=0 = 0. 2.5. (ESux − ku)|x=0 = 0. Here k is the rigidity coeﬃcient of the spring, i.e., the force per unit lengthening of the spring. 2.6. a), b) 1 E(t) ≡ 2

l

[ρu2t (t, x) + Eu2x (t, x)]dx = E0 = const . 0

c) E(t) ≡ 12 ku2 |x=0 +

1 2

l

2 0 [ρSut (t, x) +

ESu2x (t, x)]dx = E0 = const.

∂ (ES(x) ∂u 2.7. ρS(x)utt = ∂x ∂x ), where S(x) is the area of the cross section of the rod at the point with position x in equilibrium (the x-axis should be directed along the axis of the rod). ∞ 2.9. E(t) ≡ 12 −∞ [ρu2t (t, x) + T u2x (t, x)]dx = E0 = const (perhaps +∞ for all t).

Hint. Using Problem 2.8 prove that E(t ) ≤ E(t) for t ≥ t, and then reverse the direction of the time variable. 2.10. See Figure 13.1. 2.11. u ≡ 0.

Hint. The Cauchy initial conditions are u|x=x0 ≡ 0, ∂u ∂x x=x0 ≡ 0. f (x − at) for x > x0 + ε, where f (x) = 0 for x < x0 + ε, 2.12. u = g(x + at) for x < x0 − ε, where g(x) = 0 for x > x0 − ε. 2.13. See Figure 13.2. 2.14. See Figure 13.3. Hint. Use the formula

x+at

z 1 1 ψ(s)ds = Ψ(x + at) − Ψ(x − at), where Ψ(z) = ψ(s)ds. 2a x−at 2a 0

292

13. Answers and hints. Solutions

c

t=0

d

x t = ε < l/2a, l = d − c x t = l/2a x t = l/2a + ε x t

l/2a x

Figure 13.1

The graphs of Ψ(x + at) and Ψ(x − at) are drawn on Figure 13.3 by dashed lines. ∞ 2.15. E(t) ≡ 12 0 (ρu2t (t, x) + T u2x (t, x))dx = E0 = const. Hint. See the hint to Problem 2.9. 2.16. The reﬂected wave is & ' −1 2

x k ω2 2ωk k cos ω t − − exp (x − at) + 2 E2S2 a aES a ES 2

x ω k2 + , − 2 2 sin ω t − a2 E S a with notation as in the answer to Problem 2.3. 2.17. On the left u = sin ω x+at a+v with the frequency ωl = On the right u = sin ω x−at v−a with the frequency ωr =

ωa a+v < ω. ωa a−v > ω.

2.18. The standing waves are X0 (x)(at+b) and Xn (x)(an cos ωn t+bn sin ωn t) πna for n = 1, 2, . . . with Xn (x) = cos nπx l and ωn = l . For the graphs of the ﬁrst Xn (x), n = 1, 2, 3, see Figure 13.4. 2.19. The boundary conditions are u|x=0 = 0 and (ESux + ku)|x=l = 0. γ x The standing waves are Xp (x)(ap cos ωp t+bp sin ωp t), where Xp (x) = sin pl , ωp = γp a/l, and the γp are the solutions of the equation cot γ = −

kl ESγ

13. Answers and hints. Solutions

l

293

t=0 x t = ε < l/2a

3l

x t = l/2a x t = l/2a + ε x t = l/a x t = l/a + ε x t = 3l/2a x t = 3l/2a + ε x t = 2l/a x t = 2l/a + ε x t = 5l/2a x t = 5l/2a + ε x t = 3l/a + ε x t > 3l/a x Figure 13.2

such that γp ∈ (pπ, (p + 1)π) for p = 0, 1, 2, . . . . The system {Xp : p = 0, 1, . . .} is orthogonal in L2 ([0, l]), and every function Xp is an eigenfunction of the operator L = −d2 /dx2 with the boundary conditions X(0) = 0, X (0) + αX(0) = 0 for α = k/(ES). The corresponding eigenvalue is λp = (γp /l)2 and its short-wave asymptotic is ' & π(2p + 1) 2 1 1+O . λp = 2l p2

294

13. Answers and hints. Solutions

−l

l

t=0 x

3l

t = ε < l/a x

t = l/a x

t = l/a + ε x

t = 2l/a x

t = 2l/a + ε x

t = 3l/a x 2l t > 3l/a x

Figure 13.3

2.20. a) The resonance frequencies are ωk = (2k+1)πa , k = 0, 1, 2, . . .; the 2l condition for a resonance is ω = ωk for some k. If ω = ωk , then u(t, x) =

F0 a ω ESω cos ω l sin a x · sin ωt a 2F0 ωa2 k + ∞ k=0 (−1) ωk lES(ω 2 −ω 2 ) k

sin ωk t · sin (2k+1)πx . 2l

13. Answers and hints. Solutions

295

X0 ≡ 1

l

0

x l

0

X1 (x) = cos

πx l

X2 (x) = cos

2πx l

X3 (x) = cos

3πx l

x

l

0

x

l

0

x

Figure 13.4

If ω = ωk for some k, then 2

F0 a 3 ω xω ω ω u(t, x) = (−1)k ESlω 2 [ 2 sin ωt · sin a x − a sin ωt · cos a x − ωt cos ωt · sin a x] (2p+1)πx 2F0 ωa2 + p≥0,p=k (−1)p ωp lES(ω . 2 −ω 2 ) sin ωp t · sin 2l p

Hint. Seek a particular solution in the form u0 = X(x) sin ωt satisfying the boundary conditions (but not the initial conditions); then ﬁnd v = u−u0 by the Fourier method in case ω = ωk . In case of a resonance ω = ωk , take the limit of the nonresonance formula for u as ω → ωk . b) Resonance frequencies are ωk = kπa l , k = 1, 2, . . .; the condition for a resonance is ω = ωk for some k. If ω = ωk , then 2

a F0 a t ω u(t, x) = − ESωF0sin ωl sin ωt · cos a x + ESlω a 2F0 ωa2 kπx k + ∞ k=0 (−1) ω lES(ω 2 −ω 2 ) sin ωk t · cos l . k

k

If ω = ωk for some k, then 2

0a [ 2ω3 2 sin ωt · sin ωa x− ω1 ( xa sin ωt · cos ωak x +t cos ωt · sin ωa x)] u(t, x) = (−1)k FESl 2

0a t + + FESlω

2F0 ωa2 p p≥1, p=k (−1) ωp lES(ω 2 −ωp2 )

sin ωp t · cos pπx l .

Hint. See the hint to a). for k = 1, 2, . . .; the 2.21. a) The resonance frequencies are ωk = kπa l condition for a resonance is ω = ωk for some k. If ω = ωk , then A sin ωx k+1 2ωAa sin ω t · sin kπx . u(t, x) = sin ωla sin ωt + ∞ k k=1 (−1) l l(ω 2 −ω 2 ) a

k

296

13. Answers and hints. Solutions

If ω = ωk for some k, then ω [ 1 sin ωt · sin ωa x + xa sin ωt · cos ωx u(t, x) = (−1)k Aa a + t cos ωt · sin a x] l 2ω pπx + p≥1, p=k (−1)p+1 l(ω2ωAa 2 −ω 2 ) sin ωp t · sin l . p

Hint. See the hint to Problem 2.20 a). b) The resonance frequencies are ωk = (2k+1)πa , k = 0, 1, 2, . . .; the con2l dition for a resonance is ω = ωk for some k. If ω = ωk , then u(t, x) =

A cos ωx a cos ωl a

sin ωt +

∞

k 2ωAa k=0 (−1) l(ω 2 −ω 2 ) k

sin ωk t · cos (2k+1)πa . 2l

If ω = ωk for some k, then 1 ω x ωx ωx u(t, x) = (−1)k Aa l [ 2ω sin ωt · cos a x + a sin ωt · sin a − t cos ωt · cos a ] (2p+1)πa + (−1)p l(ω2ωAa . 2 −ω 2 ) sin ωp t · cos 2l p≥0, p=k

p

Hint. See the hint to Problem 2.20 a). c) The resonance frequencies are ωp = γp a/l, where the γp are as in the answer to Problem 2.19, p = 1, 2, . . .. The condition for a resonance is ω = ωp for some p. If ω = ωp , then with γ = ωl/a ka ωx ka ωx −1 u(t, x) = A(cos ωl + ESω sin ωl a ) (cos a + ESω sin a ) sin ωt ∞ a 4Aγγp sin γp γp x γ x kl sin pl ). + p=0 (2γp −sin(2γp ))(γ 2 −γ 2 ) sin ωp t · (cos l + ESγ p p

If ω = ωp for some p, then u(t, x) =

2γA x x at x 2γ−sin 2γ [ l sin ωt · cos(γ( l − 1)) + l cos ωt · sin(γ( l − 1))] x x Aγ(1−cos 2γ) sin ωt·sin(γ( l −1)) A sin ωt·sin(γ( l +1)) + − 2γ−sin 2γ (2γ−sin 2γ)2 4Aγγn sin γn kl + n≥0, n=p (2γn −sin(2γ sin ωn t · (cos γnl x + ESγ sin γnl x ). 2 2 n n ))(γ −γn )

Hint. See the hint to Problem 2.20 a). x 3.1. Φ(x) = cos kx + k1 0 sin k(x − t)q(t)Φ(t)dt; x Φ(x) = cos kx + O( k1 ) = cos kx + k1 0 sin k(x − t) cos kt q(t)dt + O( k12 ); x Φ (x) = −k sin kx + O(1) = −k sin kx + 0 cos k(x − t) cos kt q(t)dt + O( k1 ). 3.2. Hint. The values of k satisfying Φ(l) = Φ(l, k) = 0 can be found by the implicit function theorem near k such that cos kl = 0. Diﬀerentiation with respect to k of the integral equation of Problem 3.1 gives an integral equation for ∂Φ/∂k providing information needed to apply the implicit function theorem.

13. Answers and hints. Solutions

297

Figure 13.5

3.3. G(x, ξ) = 1l [θ(ξ − x)x(l − ξ) + θ(x − ξ)(l − x)ξ]. Here is a physical interpretation: G(x, ξ) is the form of a string which is pulled at point ξ by the vertical point force F = T , where T is the tension force of the string. 3.4. G(x, ξ) =

1 sinh l [θ(ξ − x) cosh x · cosh(l − ξ) + θ(x − ξ) cosh(l − x) · cosh ξ].

Here is a physical interpretation: G(x, ξ) is the form of a string which is pulled at a point ξ by the vertical point force F = T , has free ends (ends which are allowed to move freely in the vertical direction), and is subject to the action of an elastically returning uniformly distributed force; see Figure 13.5, where the returning force is realized by the little springs attached to the points of the string and to a rigid massive body. 3.6. Hint. A solution y = y(x) ≡ 0 of the equation −y + q(x)y = 0 with q ≥ 0 has at most one root. 3.7. Hint. Use the inequality (Gϕ, ϕ) ≥ 0 for G = L−1 ; take ϕ(x) =

n

cj ϕε (x − xj )

j=1

with a δ-like family ϕε . 3.8. a)

∞ j=1

1 λj

=

l 0

G(x, x)dx.

Hint. Use the expansion G(x, ξ) =

∞ Xj (x)Xj (ξ) j=1

λj

which can be obtained, e.g., as the orthogonal expansion of G(·, ξ) with respect to {Xj (x), j = 1, 2, . . .}.

298

13. Answers and hints. Solutions

b)

∞

1 j=1 λ2j

=

l l 0

0

|G(x, ξ)|2 dxdξ.

Hint. This is the Parseval identity for the orthogonal expansion of G = G(x, ξ) with respect to the system {Xj (x)Xk (ξ) : j, k = 1, 2, . . .}. In the particular case L = −d2 /dx2 on [0, π] with boundary conditions X(0) = X(π) = 0 we obtain λn = n2 , and taking into account the answer to Problem 3.3, we get ∞ 1 π2 , = n2 6

n=1

∞ 1 π4 . = n4 90

n=1

4.1. 0. 4.2. 2δ(x). 4.3. πδ(x). 4.4. Hint. f (x) − f (0) =

1

d 0 dt f (tx)dt

=x

1 0

f (tx)dt.

(R) and its 4.5. Hint. Write explicit formulas for the map D (S 1 ) → D2π inverse.

4.6. Hint. Use the continuity of a distribution from D (S 1 ) with respect to a seminorm in C ∞ (S 1 ). 4.7. Hint. Use duality considerations. 1 imx . 4.8. k∈Z δ(x + 2kπ) = 2π m∈Z e 4.9. Hint. Multiply the formula in the answer to Problem 4.8 by f (x) and integrate. 4.10. a) u(x) = v.p. x1 + Cδ(x) =

1 x+i0

+ C1 δ(x) =

b) u(x) = ln |x| + Cθ(x) + C1 . c) u(x) = −δ(x) + Cθ(x) + C1 . 5.1. a) −2πiθ(ξ). r0 |ξ| . b) 4πr0 sin|ξ|

Hint. Use the spherical symmetry and take ξ = |ξ|(1, 0, 0) = (ξ1 , 0, 0). c)

2π 2 ρ0 δ(|ξ| −

ρ0 ).

Hint. Use the answer to b). d)

(−i)k+1 . (ξ−i0)k+1

1 x−i0

+ C2 δ(x).

13. Answers and hints. Solutions

5.2. a)

299

1 . (n−2)σn−1 |x|n−2

Hint. It should be a spherically symmetric fundamental solution for (−Δ). b)

1 −k|x| . 4π|x| e

Hint. Using the spherical symmetry take ξ = (ξ1 , 0, 0). Use the regularization with multiplication by exp(−ε|ξ|). Calculate the arising onedimensional integral using residues. c)

1 ∓ik|x| . 4π|x| e

Hint. Calculate F −1 ( |ξ|2 −k1 2 ±iε ) as in b) and pass to the limit as ε → +0. 5.3.

1 −k|x| . 4π|x| e

5.6. 0 as t → +∞ and −2πiδ(x) as t → −∞. 6.1. If Δu = 0, then for some ball Bε (x) we have, say, Δu > 0 in Bε (x). Let I(r) be as in the proof of Theorem 6.1; by the mean value property, we have I (r) = 0. But then Bε Δu dx > 0 is in contradiction with (6.3). 6.2. Use the same arguments as in Theorems 6.1 and 6.2 but replace Δu = 0 with Δu ≥ 0. 6.3. If Δu ≤ 0, then for some small ball Bε (x) we have Δu > 0 in Bε (x). Follow the two previous hints/solutions. 6.4. If u(x0 ) ≥ u(x) for all x ∈ Bε (x0 ) but there are some x ∈ Bε (x0 ) for which u(x0 ) > u(x), then

n u(y) dy, u(x0 ) > σn εn Bε (x0 ) contradicting (6.45). 6.5. First reﬂect u(x, y) in the x-axis and use Lemma 6.11 to consider u(x, y) as harmonic in the strip −T < y < T . Then pick a point x0 on the x-axis outside the support of φ, so the Cauchy data vanishes there. By the CauchyKovalevsky theorem, Theorem 5.13, u(x, y) vanishes in a neighborhood of (x0 , 0). By analyticity, u(x, y) is identically zero. 6.8. Since k ≥ 1, we can let k = k − 1 ≥ 0. Apply Schauder’s theorem, Theorem 6.22, with k instead of k, and use C k+2 ⊂ C k+1,γ ⊂ C k+1 . 6.9. The function f can be recovered from u by the formula f = Δu, whereas at any regular point of the boundary x ∈ ∂Ω, ϕ is the jump of the

300

13. Answers and hints. Solutions

outward normal derivative of u on ∂Ω: ∂u ∂u (13.1) ϕ(x) = lim (x + ε¯ n) − (x − ε¯ n) . ε→0+ ∂ n ¯ ∂n ¯

6.11. Use the uniqueness of Green’s function combined with the invariance of the problem with respect to all translations parallel to the mirror L, rotations with a ﬁxed point c ∈ L, and mirror reﬂections with the mirror L. 7.1. u(t, x) =

u1 +u2 2

+

∞

p=0

2(u1 −u2 )(−1)p (2p+1)π 2

exp[−( (2p+1)πa )2 t] cos (2p+1)πx . l l 2

2

l cp l Relaxation time is tr ! 4πl2 a2 ! 40a 2 ! 40k , where k is the thermal conductivity, c the speciﬁc heat (per units of mass and temperature), ρ the density (mass per volume), l the length of the rod.

Hint. The ﬁrst term in the sum above is much greater than the other terms at times comparable with the relaxation time. 7.2. Let I be the current, R the electric resistance of the rod, V the volume of the rod, k the thermal conductivity, 2 cp = l

l& 0

' I 2R pπx ϕ(x) − x(l − x) sin dx, where ϕ = u|t=0 . 2V k l

Then & ' ∞ pπa 2 pπx I 2R x(l − x) + cp exp − t sin u(t, x) = 2V k l l p=1

and lim u(t, x) =

t→+∞

I 2R x(l − x). 2V k

Hint. The equation describing the process is ut = a2 uxx + 7.3. lim u(t, x) = t→+∞

I 2R . cρV

b+c 2 .

Hint. Consider Poisson’s integral giving an explicit formula for u(t, x).

13. Answers and hints. Solutions

301

8.1. u(x, y) = a2 (x2 − y 2 ). Hint. The Fourier method gives in polar coordinates u(r, ϕ) = a0 +

∞

rk (ak cos kϕ + bk sin kϕ).

k=1

8.4. u(t, x) =

1 4 12 (x

− y4) −

a2 2 12 (x

− y 2 ).

Hint. Find a particular solution up of Δu = x2 − y 2 and then seek v = u − up . 8.5. In polar coordinates u(r, ϕ) = 1 +

b3 b r ln + (r2 − a4 r−2 ) cos 2ϕ. 2 a 4(a4 + b4 )

Hint. The Fourier method gives in polar coordinates u(r, ϕ) = A0 + B0 ln r +

∞ [(Ak rk + Bk r−k ) cos kϕ + (Ck rk + Dk r−k ) sin kϕ]. k=1

8.6. The answer to the last question is u(x, y) = B +

sinh π(b−y) a sinh ∞ p=0

πb a

sin

πx a (2p+1)π(a−x)

sinh 8Ab b 2 2 (2p+1)πa (2p + 1) π sinh b

sin

(2p + 1)πy . b

Hint. The general scheme is as follows. Step 1. Reduce the problem to the case ϕ0 (0) = ϕ0 (a) = ϕ1 (0) = ψ0 (0) = ψ0 (b) = ψ1 (0) = ψ1 (b) = 0 by subtracting a function of the form A0 + A1 x + A2 y + A3 xy. Step 2. Seek u in the form u = u1 + u2 with u1 , u2 such that Δu1 = Δu2 = 0; u1 satisﬁes the Dirichlet boundary conditions with ψ0 = ψ1 = 0 and the same ϕ0 , ϕ1 as for u; u2 satisﬁes the Dirichlet boundary conditions with ϕ0 = ϕ1 = 0 and the same ψ0 , ψ1 as for u. Then ∞ kπx kπy − kπx b + Dk e b ) sin u1 (x, y) = k=1 (Ck e b ∞ kπ(a−x) kπx = (A sinh + B sinh ) sin kπy k k b b b , k=1 kπ(b−y) ∞ kπy kπx ) sin a , u2 (x, y) = k=1 (Ak sinh a + Bk sinh a

302

13. Answers and hints. Solutions

where Ak sinh kπa b = Bk sinh kπa b = kπb Ak sinh a = Bk sinh kπb a =

kπy 2 b b 0 ϕ1 (y) sin b dy, kπy 2 b b 0 ϕ0 (y) sin b dy, 2 a kπx a 0 ψ1 (x) sin a dx, 2 a kπx a 0 ψ0 (x) sin a dx.

2 ˜ + ∂∂yu˜2 = 0 and 8.7. Hint. If u ˜(ξ, y) = e−iξx u(x, y)dx, then −ξ 2 u u ˜(ξ, y) = C1 (ξ) exp(−|ξ|y) + C2 (ξ) exp(|ξ|y). The second summand should vanish almost everywhere if we want u to be bounded. Hence, u ˜(ξ, y) = −|ξ|y+i(x−z)ξ 1 ϕ(z)dzdξ. Now, change the e ϕ(ξ) ˜ exp(−|ξ|y) and u(x, y) = 2π order of integration. 8.8. α ∈ 2Z+ or α − s > −n/2. 8.9. No. Hint. Use Friedrichs’s inequality. 8.10. Hint. Work with the Fourier transform. Show that the question can be reduced to the case of a multiplication operator in L2 (Rn ). 9.1. Here is a physical interpretation: G(x, y) is the potential at x of a unit point charge located at the point y ∈ Ω inside a conducting grounded surface ∂Ω. 9.2. The same as for En (x − y), where En is the fundamental solution of Δ. 9.3. Hint. Use the fact that L−1 is selfadjoint. 9.4. Hint. Apply the Green’s formula (4.32) with v(x) = G(x, y) (consider y as a parameter) and Ω replaced by Ω\B(y, ε), where B(y, ε) = {x : |x−y| ≤ ε}. Then pass to the limit as ε → +0. ⎧ 2 2 ⎪ ⎨ 1 ln (x1 − y1 ) + (x2 − y2 ) for n = 2, 2 2 9.5. G(x, y) = 4π (x1 −

y1 ) + (x2 + y2 ) ⎪ 1 1 1 ⎩ for n ≥ 3, (2−n)σn−1 |x−y|n−2 − |x−y|n−2 where y = (y1 , . . . , yn−1 , −yn ) for y = (y1 , . . . , yn−1 , yn ). Hint. Seek the Green’s function in the form G(x, y) = En (x − y) + cEn (x − y), where y ∈ Ω and c is a constant.

13. Answers and hints. Solutions

303

9.6.

2xn ϕ(y )dy σn−1 |x − (y , 0)|n n−1 R

2 xn ϕ(y1 , . . . , yn−1 )dy1 . . . yn−1 = n . σn−1 [(x1 − y1 )2 + · · · + (xn−1 − yn−1 )2 + yn2 ] 2

u(x) =

Rn−1

Hint. Use the formula in Problem 9.4. 9.7. Let the disc or the ball be {x : |x| < R}. Then G(x, y) =

1 2π

R|x−y| ln |y||x−y|

for n = 2,

1

(2−n)σn−1 |x−y|n−2

−

Rn−2 |y|n−2 (2−n)σn−1 |x−y|n−2

for n ≥ 3,

2

y is the point which is obtained from y by the inversion with where y = R |y|2 respect to the circle (sphere) {x : |x| = R}.

Hint. If n = 2, seek G(x, y) in the form E2 (x − y) − E2 (x − y) − c(y). If n ≥ 3, seek G(x, y) in the form En (x − y) − c(y)En (x − y), where (in both 2y |y| |x − y| = does . Show geometrically that if |x| = R, then cases) y = R |y|2 |x − y| R not depend on x. 9.8. u(x) =

1 σn−1 R

|y|=R

R2 − |x|2 f (y)dSy . |x − y|n

Here the disc (or the ball) is Ω = {x : |x| < R}, f = u|∂Ω . Hint. Use the formula from Problem 8.4. When calculating use that |y| = R implies ∂ ∂ny |x

y y−x y − y| = − ∂n∂ y |x − y| = −(∇y |x − y|, |y| ) = −( |y−x| , |y| ) =

∂ ∂ny G(x, y),

(x,y)−R2 R|x−y| .

9.9. For the half-ball {x : |x| ≤ R, xn ≥ 0} the answer is G(x, y) = G0 (x, y) − G0 (x, y), where G0 is the Green’s function of the ball {x : |x| < R} and y = (y1 , . . . , yn−1 , −yn ) for y = (y1 , . . . , yn−1 , yn ).

304

9.10.

13. Answers and hints. Solutions

G(x, y) = − √

1 (x1 −y1 )2 +(x2 −y2 )2 +(x3 −y3 )2 1 + √ 4π (x1 −y1 )2 +(x2 +y2 )2 +(x3 −y3 )2 1 + √ 4π (x1 −y1 )2 +(x2 −y2 )2 +(x3 +y3 )2 1 − √ . 4π (x1 −y1 )2 +(x2 +y2 )2 +(x3 +y3 )2 4π

Hint. G(x, y) = E3 (x − y) − E3 (x − T2 y) − E3 (x − T3 y) + E3 (x − T2 T3 y), where T2 (y1 , y2 , y3 ) = (y1 , −y2 , y3 ), T3 (y1 , y2 , y3 ) = (y1 , y2 , −y3 ). 9.11. Hint. Use the fact that Γ(z) has simple poles for z = 0, −1, −2, . . .. 9.12. Hint. Use the Liouville formula for the Wronskian of two linearly independent solutions of the Bessel equation. 9.13. Hint. Expand the left-hand side in power series in t and x, and use the expansion of Jn (x) given in Problem 9.11. 9.14. Hint. Take t = eiϕ in the formula of Problem 9.13, and use the result of Problem 9.11. 9.15. For the domain {x = (x1 , . . . , xn ) : 0 < xj < aj , j = 1, . . . , n} the A sin k πx eigenfunctions are nj=1 ajj j for kj = 1, 2, . . . and the eigenvalues are λk1 ...kn = −[( ka11π )2 + · · · + ( kannπ )2 ]. 9.16. Hint. Use the fact that Jk (αk,n x) for all n = 1, 2, . . . are eigenfuncd2 1 d k2 tions of the same operator L = dx 2 + x dx − x2 (on the space of functions vanishing at x = 1). This operator is symmetric in the space L2 ((0, 1), xdx) consisting of the functions on (0, 1) which are square integrable with respect to the measure xdx. 9.17. For the disc {x : |x| < R}, in polar coordinates the desired system is

α k,n Jk r eikϕ for k ∈ Z+ , n = 1, 2, . . . , R where the αk,n are deﬁned in Problem 9.16. Hint. Write Δ in polar and expand the eigenfunction into coordinates ikϕ Fourier series u(r, ϕ) = k Xk (r)e . Then each term Xk (r)eikϕ will be an eigenfunction and Xk (r) satisﬁes the Bessel equation. Using regularity at 0, show that Xk (r) = cJk (αr). 9.18. Hint. For the cylinder Ω = {(x, y, z) : x2 + y 2 < R2 , 0 < z < H} reduce the equation to the case Δu = f, u|∂Ω = 0; then expand u and f in the eigenfunctions of Δ in Ω which are of the form !

mπz r sin : k ∈ Z+ , n = 1, 2, . . . , m = 1, 2, . . . . eikϕ Jk αk,n R H

13. Answers and hints. Solutions

10.1. u(t, x) = eiωt J0 ( ωa ρ), ρ = {x : x1 = x2 = 0}).

305

x21 + x22 . (The axis of the cylinder is

10.2. See Figure 13.6.

u

t=0

R

0

r = |x| u

t = ε < R/a 0 u

t = R/a 0

u

t > R/a 0

Figure 13.6

306

13. Answers and hints. Solutions

⎧ ⎪ 1 ⎪ ⎨ 1 at u(t, r) = − ⎪ 2 2r ⎪ ⎩ 0

if r ≤ R − at, if r + at > R and −R < r − at < R, if r − at ≥ R.

Hint. Introduce a new function v = ru instead of u. (Here r = |x|.) Then v(t, r) = 12 [ϕ(r ˆ + at) + ϕ(r ˆ − at)], where ϕˆ is the extension of rϕ to R as an odd function. 10.3. See Figure 13.7.

t=0 |x|

0

t = ε < R/2a

t = R/2a

t = R/2a + ε

t = R/a

t > R/a

Figure 13.7

13. Answers and hints. Solutions

⎧ ⎪ ⎨t u(t, r) =

1 [R2 ⎪ 4ar

⎩

− (r −

at)2 ]

0

307

if r + at ≤ R, if r + at > R and |r − at| < R, if |r − at| ≥ R.

Hint. Introduce a new function v = ru with r = |x|. Then

r+at 1 ˆ ψ(s)ds = Ψ(r + at) − Ψ(r − at), v(t, r) = 2a r−at r 1 ˆ ˆ where Ψ(r) = 2a 0 ψ(s)ds and ψ is the extension of rψ to R as an odd function. 10.5. Hint. Consider the Cauchy problem with initial conditions u|t=0 = 0 and ut |t=0 = ψ(x). Then 1 i[(x−y)·ξ+at|ξ|] e ψ(y)dydξ u(t, x) = (2π)−n 2ia|ξ| 1 i[(x−y)·ξ−at|ξ|] −n ψ(y)dydξ. −(2π) 2ia|ξ| e Consider, e.g., the ﬁrst summand. Multiplying the integrand by a C ∞ function χ(ξ) that is equal to 1 for |ξ| > 1 and vanishes for |ξ| < 1/2 (this does not aﬀect the singularities of u), try to ﬁnd a diﬀerential operator L = nj=1 cj (t, x, y, ξ) ∂ξ∂ j such that Lei[(x−y)·ξ+at|ξ|] = ei[(x−y)·ξ+at|ξ|] . Prove that such an operator L exists if and only if |x − y|2 = a2 t2 (here x, y, t are considered as parameters) and integrate by parts N times, thus moving L from the exponential to other terms. 10.6. {(t, x, y) : t ≥ 0, t2 = x2 + y 2 /4}. Hint. Reduce the operator to the canonical form. 10.7. a) {(t, x, y) : 0 ≤ t ≤ min( a2 ,

b 2 ),

x ∈ [t, a − t], y ∈ [t, b − t]}.

(See Figure 13.8: the lid of a coﬃn.) b) {(t, x, y) : t ≥ 0, dist((x, y), Π) ≤ t}, where Π = [0, a] × [0, b]. (See Figure 13.9: an inﬁnitely high cup of a stadium.) 10.8. a) Let {(x, y, z) : x ∈ [0, a], y ∈ [0, b], z ∈ [0, c]} be the given parallelepiped.

t

x2 b

a Figure 13.8

x1

308

13. Answers and hints. Solutions

Figure 13.9

Figure 13.10

Figure 13.11

Then the answer is a b c , , , (t, x, y, z) : 0 ≤ t ≤ min 2 2 2

t ≤ x ≤ a − t, t ≤ y ≤ b − t, t ≤ z ≤ c − t The section by the hyperplane t = 1 is a smaller parallelepiped, or a rectangle, or an interval, or the empty set (Figure 13.10). b) {(t, x, y, z) : dist((x, y, z), Π) ≤ t}, where Π is the given parallelepiped. The section by the hyperplane t = 1 is plotted in Figure 13.11. 11.1.

u(x) =

Q ln R − 2π Q − 2π ln r

if |x| ≤ R, if |x| ≥ R,

where r = |x|, the circle is {x : |x| = R}, Q is the total charge of the circle, i.e., Q = |y|=R σ(y)dsy , where σ is the density that deﬁnes the potential.

13. Answers and hints. Solutions

11.2.

Qr 2 − 4πR 2 + u(x) = Q − 2π ln r

309

Q 4π

−

Q 2π

ln R

if r ≤ R, if r ≥ R.

Here Q is the total charge of the disc {x : |x| ≤ R}, r = |x|. x2 + y 2 in the 11.3. The same answer as in Problem 11.1 with r = (x, y, z)-space, where the z-axis is the axis of the cylinder and R is the radius of its cross section by the xy-plane. 11.4.

α0 , u(x) = 0,

r < R, r > R,

where α0 is the density of the dipole moment, r is the distance to the center of the circle or to the axis of the cylinder, and R is the radius of the circle or of the cross section of the cylinder. 11.5.

u(x) =

Q (n−2)σn−1 Rn−2 Q (n−2)σn−1 |x|n−2

if |x| ≤ R, if |x| > R,

where the sphere of radius R is taken with the center at the origin and Q is the total charge of the sphere. 11.6. The induced charge is q = − QR d . Hint. The potential vanishes on the surface of the sphere, hence inside of the sphere. Calculate the potential in the center assuming that we know the distribution of the charge on the surface. 11.7. σ(x) = Q is situated.

Q(R2 − d2 ) , where y is the point at which the initial charge 4πR|x − y|3

Hint. Let y be the point obtained by reﬂecting y with respect to the 2y ). Place the charge q at sphere (if the sphere is {x : |x| = R}, then y = R |y|2 q Q the point y; then the potential u(x) = 4π|x−y| + 4π|x−y| coincides with the potential of all real charges (Q and induced ones) outside the sphere because of the uniqueness of the solution of the Dirichlet problem in the exterior. Then ∂u x ∂u(x) , where = ∇u, . σ(x) = − ∂r ∂r |x|

11.8. The charge will be distributed uniformly over the whole wire. 12.1. a) at ± x ·

η |η|

= 0.

b) at ± (|x| − R) = 0.

310

13. Answers and hints. Solutions

12.2. Rays and characteristics are the same curves in the (t, x)-plane. Hint. Use the description of rays as solutions of (12.39) satisfying |x(t)| ˙ = c(x(t)). 12.3. u(t, x, λ) =

iλ−1 −t 2x [exp(iλxe

+ 32 t) − exp(iλxe−t − 32 t)].

Bibliography

[1] M. A. Armstrong, Basic topology, corrected reprint of the 1979 original, Undergraduate Texts in Mathematics, Springer-Verlag, New York-Berlin, 1983. MR705632 [2] V. I. Arnold, Ordinary diﬀerential equations, translated from the Russian and edited by Richard A. Silverman, MIT Press, Cambridge, Mass.-London, 1978. MR0508209 [3] V. I. Arnold, Mathematical methods of classical mechanics, translated from the Russian by K. Vogtmann and A. Weinstein; Graduate Texts in Mathematics, 60, Springer-Verlag, New York-Heidelberg, 1978. MR0690288 [4] F. A. Berezin and M. A. Shubin, Schr¨ odinger equation, Kluwer, Dordrecht, 1987. [5] R. Courant and D. Hilbert, Methods of mathematical physics, Interscience, New York, Vol. I, 1953; Vol. II, 1962. [6] L. C. Evans, Partial diﬀerential equations, 2nd ed., Graduate Studies in Mathematics, vol. 19, American Mathematical Society, Providence, RI, 2010. MR2597943 [7] G. Eskin, Lectures on linear partial diﬀerential equations, Graduate Studies in Mathematics, vol. 123, American Mathematical Society, Providence, RI, 2011. MR2809923 [8] R. Feynman, R. B. Leighton, and N. Sands, The Feynman lectures on physics, SpringerVerlag, Reading, MA, Palo Alto, London, 1967. [9] D. Gilbarg and N. S. Trudinger, Elliptic partial diﬀerential equations of second order, reprint of the 1998 edition, Classics in Mathematics, Springer-Verlag, Berlin, 2001. MR1814364 [10] A. Gray, Tubes, Addison-Wesley Publishing Company, Advanced Book Program, Redwood City, CA, 1990. MR1044996 [11] R. C. Gunning and H. Rossi, Analytic functions of several complex variables, Prentice-Hall, Inc., Englewood Cliﬀs, N.J., 1965. MR0180696 [12] P. Hartman, Ordinary diﬀerential equations, John Wiley & Sons, Inc., New York-LondonSydney, 1964. MR0171038 [13] L. H¨ ormander, Linear partial diﬀerential operators, Die Grundlehren der mathematischen Wissenschaften, Bd. 116, Academic Press, Inc., Publishers, New York; Springer-Verlag, Berlin-G¨ ottingen-Heidelberg, 1963. MR0161012 [14] L. H¨ ormander, An introduction to complex analysis in several variables, D. Van Nostrand Co., Inc., Princeton, N.J.-Toronto, Ont.-London, 1966. MR0203075 [15] L. H¨ ormander, The analysis of partial diﬀerential operators. v. I, II, Springer, Berlin et al., 1983.

311

312

Bibliography

[16] F. John, Partial diﬀerential equations, Springer, New York et al., 1986. [17] A. N. Kolmogorov and S. V. Fomin, Elements of the theory of functions and functional analysis, Dover Publications, Inc., Mineola, New York, 1999. [18] N. V. Krylov, Lectures on elliptic and parabolic equations in H¨ older spaces, Graduate Studies in Mathematics, vol. 12, American Mathematical Society, Providence, RI, 1996. MR1406091 [19] O. A. Ladyzhenskaya and N. N. Ural’tseva, Linear and quasilinear elliptic equations, translated from the Russian by Scripta Technica, Inc., translation editor: Leon Ehrenpreis, Academic Press, New York-London, 1968. MR0244627 [20] N. N. Lebedev, Special functions and their applications, revised edition, translated from the Russian and edited by Richard A. Silverman; unabridged and corrected republication, Dover Publications, Inc., New York, 1972. MR0350075 [21] E. H. Lieb and M. Loss, Analysis, 2nd ed., Graduate Studies in Mathematics, vol. 14, American Mathematical Society, Providence, RI, 2001. MR1817225 [22] J.-L. Lions and E. Magenes, Non-homogeneous boundary value problems and applications. Vol. I, translated from the French by P. Kenneth; Die Grundlehren der mathematischen Wissenschaften, Band 181, Springer-Verlag, New York-Heidelberg, 1972. MR0350177 [23] V. Palamodov, Topics in mathematical physics, Spring semester, preprint, 2002. [24] I. G. Petrovskii, Partial diﬀerential equations, translated from the Russian by Scripta Technica, Ltd, Iliﬀe Books Ltd., London (distributed by W. B. Saunders Co., Philadelphia, Pa.), 1967. MR0211021 [25] M. Reed and B. Simon, Methods of modern mathematical physics. I. Functional analysis, Academic Press, New York-London, 1972. MR0493419 [26] A. P. Robertson and W. J. Robertson, Topological vector spaces, Cambridge Tracts in Mathematics and Mathematical Physics, No. 53, Cambridge University Press, New York, 1964. MR0162118 [27] W. Rudin, Principles of mathematical analysis, 3rd ed., International Series in Pure and Applied Mathematics, McGraw-Hill Book Co., New York-Auckland-D¨ usseldorf, 1976. MR0385023 [28] W. Rudin, Functional analysis, 2nd ed., International Series in Pure and Applied Mathematics, McGraw-Hill, Inc., New York, 1991. MR1157815 [29] H. H. Schaefer, Topological vector spaces, The Macmillan Co., New York; Collier-Macmillan Ltd., London, 1966. MR0193469 [30] L. Schwartz, Mathematics for the physical sciences, Hermann, Paris; Addison-Wesley Publishing Co., Reading, Mass.-London-Don Mills, Ont., 1966. MR0207494 [31] L. Schwartz. Analyse (French), Deuxi` eme partie: Topologie g´ en´ erale et analyse fonctionnelle; Collection Enseignement des Sciences, No. 11, Hermann, Paris, 1970. MR0467223 [32] G. E. Shilov, Generalized functions and partial diﬀerential equations, authorized English edition revised by the author, translated and edited by Bernard D. Seckler, Gordon and Breach, Science Publishers, Inc., New York-London-Paris, 1968. MR0230129 [33] N. Shimakura, Partial diﬀerential operators of elliptic type, translated and revised from the 1978 Japanese original by the author, Translations of Mathematical Monographs, vol. 99, American Mathematical Society, Providence, RI, 1992. MR1168472 [34] S. L. Sobolev and E. R. Dawson, Partial diﬀerential equations of mathematical physics, translated from the third Russian edition by E. R. Dawson; English translation edited by T. A. A. Broadbent, Pergamon Press, Oxford-Edinburgh-New York-Paris-Frankfurt; AddisonWesley Publishing Co., Inc., Reading, Mass.-London, 1964. MR0178220 [35] M. E. Taylor, Partial diﬀerential equations, Vol I, Basic theory; Vol. II, Qualitative studies of linear equations, Springer-Verlag, New York, 1996. [36] F. Treves, Topological vector spaces, distributions and kernels, Academic Press, Inc., (London, New York), LTD, third printing, 1970.

Bibliography

313

[37] V. S. Vladimirov, Equations of mathematical physics, translated from the Russian by Eugene Yankovsky [E. Yankovski˘ı], “Mir”, Moscow, 1984. MR764399 [38] F. W. Warner, Foundations of diﬀerentiable manifolds and Lie groups, corrected reprint of the 1971 edition, Graduate Texts in Mathematics, vol. 94, Springer-Verlag, New York-Berlin, 1983. MR722297 [39] J. Wermer, Potential theory, 2nd ed., Lecture Notes in Mathematics, vol. 408, Springer, Berlin, 1981. MR634962 [40] E. T. Whittaker and G. N. Watson, A course of modern analysis: An introduction to the general theory of inﬁnite processes and of analytic functions; with an account of the principal transcendental functions, reprint of the fourth (1927) edition, Cambridge Mathematical Library, Cambridge University Press, Cambridge, 1996. MR1424469

Index

C k boundary, 77 C k (Ω), 77 C k,γ , 141 C k,γ boundary, 141 Cbk (Rn ), 173 D(Ω) = C0∞ (Ω), 58 E(Ω) = C ∞ (Ω), 58 ˚s (Ω), 177 H H s (Rn ), 173 L2 (M, dμ), 195 N, positive integers, xvii Q∗ , 144 S(Rn ), 58 T ∗ M , 224 Z+ , nonnegative integers, xvii δ-function, 53 δ-like sequence, 54, 67 acoustic wave, 215 action, 267 action along the path, 223 adjoint Cauchy problem, 165 adjoint operator, 194 asymptotic solution, 273 Bessel equation, 205 bicharacteristic, 263 boundary value problem Cauchy-Dirichlet’s, 153 Cauchy-Neumann’s, 153 boundary value problem, ﬁrst, 153 boundary value problem, second, 153 boundary, piecewise C k , 79

boundary, piecewise smooth, 79 bundle, cotangent, 12 bundle, tangent, 12 canonical form of a second-order operator, 5, 9, 10 Cauchy problem, 20, 153, 265 Cauchy problem for the Hamilton-Jacobi equation, 265 Cauchy sequence, 61 Cauchy-Dirichlet’s initial-boundary value problem, 153 Cauchy-Neumann’s boundary value problem, 153 causality principle, 236 caustic, 262 characteristic hypersurface, 258 characteristic vector, 7 closed graph theorem, 115 compact operator, 177 completeness of distribution space, 67 conﬁguration space, 222 conservation of momentum, 39 convergent sequences in D(Ω), 63 convolution, 99 convolution of distributions, 104 convolution of two distributions, 124 cotangent bundle, 12, 224 cotangent vector, 12 countably normed space, 61 Courant’s variational principle, 210 cylindric function of the second kind, 208

315

316

cylindric functions, 205 cylindric wave, 221 d’Alembert’s formula, 20, 22 d’Alembert’s principle, 17 d’Alembertian, 2, 219, 224 degrees of freedom, 39 dependence domain, 23 derivative, directional, 13 derivative, fractional, 125 descent method, 236 dielectric permeability, 217 diﬀerential of a map, 13 diﬀerential operator, linear, 2 diﬀerential operator, order of, 2 diﬀusion equation, 151 dipole, 103 dipole moment, 103 Dirac’s δ-function, 53 Dirac’s measure, 53 direct product of distributions, 101, 102 direct product of functions, 100 directional derivative, 13 Dirichlet kernel, 68 Dirichlet’s integral, 181 Dirichlet’s problem, 131, 154 dispersion law, 220 distribution, 57 distribution with compact support, 63 distribution, homogeneous, 89 distribution, regular, 64 distribution, tempered, 63 distributions, 63 divergence formula, 79 domain of the operator, 193 domain, dependence, 23 domain, inﬂuence, 23 Doppler’s eﬀect, 41 double layer, 103 double layer potential, 107, 243 dual space, 62 Duhamel’s formula, 163 eigenfrequencies, 29 eigenfunctions, 28 eigenvalues, 28 eikonal equation, 272 energy, 36, 224 energy conservation law, 19, 36, 37, 224 energy, full, 36 energy, kinetic, 36, 223 energy, kinetic along the path, 223

Index

energy, potential, 36, 37, 223 equation, diﬀusion, 151 equation, Hamilton-Jacobi, 262 equation, parabolic, 151 equations, Lagrange’s, 17, 36 equations, Maxwell, 217 equations, transport, 274, 280 equicontinuity, 178 equivalent systems of seminorms, 59 Euler-Lagrange equations, 34, 35 extremal point, 34 Fejer kernel, 68 Fermat’s principle, 278 Fermi coordinates, 249 ﬁrst boundary value problem, 25 ﬁrst initial-boundary value problem, 153 forced oscillations, 41 formula, d’Alembert’s, 22 formula, Duhamel’s, 163 formula, Kirchhoﬀ, 232 formula, Poisson’s, 238 formulas, Sokhotsky, 70 Fourier transform, 100 Fourier transformation, 100 Fourier’s law, 152 fractional derivative, 125 fractional integral, 124 Fr´echet space, 61 freedom, degrees of, 39 Friedrichs extension of an operator, 198, 199 Friedrichs inequality, 182 function, Bessel of the ﬁrst kind, 207 function, cylindric of the ﬁrst kind, 207 function, cylindric of the second kind, 208 function, Green’s, 50 function, harmonic, 114 function, Lagrange, 223 function, Neumann’s, 208 function, subharmonic, 148 functional, 33 functions smooth up to G from each side, 244 fundamental solution of Laplace operator, 75 gauge transformation, 218 gauge, Lorentz, 219 Gauss’s law of the arithmetic mean, 127 Gauss-Ostrogradsky formula, 79

Index

Gaussian integrals, 81 generalized function, 57 generalized solution, 11 generalized solution of Dirichlet’s problem, 185 generalized solutions of the Cauchy problem, 22 Glazman’s lemma, 209 gradient, 13 graph of the operator, 194 Green’s function, 50 Hadamard’s example, 134 Hamilton-Jacobi equation, 262 Hamiltonian, 36, 224, 263 Hamiltonian curves, 263 Hamiltonian system, 225, 262 Hamiltonian vector ﬁeld, 263 harmonic function, 114, 127 Hausdorﬀ property, 60 heat conductivity, 152 heat equation, 151 heat operator, 2 Heaviside function, xvii Helmholtz operator, 97 high-frequency asymptotics, 47 H¨ older function, 141 H¨ older’s inequality, 92 Holmgren’s principle, 165 homogeneous distribution, 89 homothety, 89 Hooke’s law, 16, 290 Huygens principle, 232 hyperbolicity of an operator, 269 hypoelliptic operator, 112, 114 identity, Parseval, 298 inequality, Friedrichs, 182 inequality, H¨ older’s, 92 inﬂuence domain, 23 initial value problem, 21 integral, Dirichlet, 181 integral, fractional, 124 integral, Poisson’s, 157 inverse operator, 194 inversion, 146 involution, 146 involutive, 146 Jacobi matrix, 13 Kelvin transformation, 123 kernel, Dirichlet, 68

317

kernel, Fejer, 68 kinetic energy, 36, 223 kinetic energy along the path, 223 Kirchhoﬀ formula, 232 L. Schwartz’s space, 58 Lagrange function, 223 Lagrange’s equations, 17, 36 Lagrangian, 223 Lagrangian corresponding to the Hamiltonian, 267 Lagrangian manifold, 264 Laplace operator, 2 Laplacian, 2 law of dispersion, 220 law, conservation of energy, 224 law, Fourier, 152 law, Hooke’s, 290 linear operator in Hilbert space, 193 Liouville’s theorem for harmonic functions, 122 Lipschitz function, 141 logarithmic potential, 106 Lorentz gauge, 219 map, tangent, 13 maximum principle, 159, 169 maximum principle for harmonic functions, 129 Maxwell equations, 217 mean value theorem for harmonic functions, 127 method of descent, 236 mixed problem, 25 momentum, 36, 38 momentum, conservation of, 39 momentum, total, 38 monochromatic wave, 222 multi-index, 1 Neumann’s function, 208 Neumann’s problem, 154 Newton’s Second Law, 36 Newton’s Third Law, 38 nodal hypersurface, 146 norm, 58 operator semibounded from below, 198 operator, adjoint, 87, 194 operator, closed, 194 operator, compact, 177 operator, dual, 87 operator, elliptic, 7, 10

318

operator, extension of, 194 operator, heat, 2 operator, Helmholtz, 97 operator, hyperbolic, 7, 9 operator, hypoelliptic, 112, 114 operator, inverse, 194 operator, Laplace, 2 operator, of translation, 89 operator, one-dimensional Schr¨ odinger, 45 operator, parabolic, 9 operator, selfadjoint, 194 operator, Sturm-Liouville, 2, 45 operator, symmetric, 194 operator, transposed, 87 operator, wave, 2, 219 order of a distribution, 84 order of diﬀerential operator, 2 Parseval identity, 298 partition of unity, 70 path, 223 phase, 271 piecewise C k boundary, 79 piecewise smooth boundary, 79 plane wave, 220 Poisson problem, 136 Poisson’s equation, 82 Poisson’s formula, 238 Poisson’s integral, 157 potential, 106 potential energy, 36, 37, 223 potential of a uniformly charged sphere, 253 potential of the double layer of the sphere, 254 potential, double layer, 243 potential, retarded, 235 potential, single layer, 243 precompact, 177 principle of maximum, 169 principle, causality, 236 principle, Courant’s variational, 210 principle, d’Alembert’s, 17 principle, Holmgren’s, 165 principle, Huygens, 232 principle, maximum, 159 problem, adjoint Cauchy, 165 problem, Cauchy, 20 problem, ﬁrst boundary value, 25 problem, initial value, 21 problem, mixed, 25

Index

problem, Sturm-Liouville, 28, 44, 45 rapidly oscillating solution, 273 ray, 261, 265, 271 real analytic function, 110 reﬂection with respect to a sphere, 146 regular distribution, 64 removable singularity theorem, 114 renormalization, 83 retarded potential, 235 Schr¨ odinger operator, one-dimensional, 45 Schwartz’s kernel, 50 seminorm, 58 seminorms, equivalent systems of, 59 sheaf, 70 short-wave asymptotics, 47 simple layer potential, 107 single layer, 103 single layer potential, 243 singular support of a distribution, 109 Sobolev space, 171 Sobolev’s embedding theorem, 173 Sobolev’s theorem on restrictions, 175 Sokhotsky’s formulas, 70 solution asymptotic, 273 solution rapidly oscillating, 273 solution, generalized, 11 space, countably normed, 61 space, dual, 62 space, Fr´echet, 61 space, Sobolev, 171 speciﬁc heat, 152 spectral theorem, 197 speed of light, 217 spherical waves, 220 standing waves, 26 string, unbounded, 20 Sturm-Liouville operator, 2, 45 Sturm-Liouville problem, 28, 44, 45 subharmonic function, 148 superposition, 11, 29 support, 58 support of the distribution, 71 surface of jumps, 258 surface, characteristic, 7 surface, characteristic at a point, 7 symbol, 2 symbol, principal, 2 symbol, total, 2

Index

tangent bundle, 12, 222 tangent map, 13 tangent space, 222 tangent vectors, 12, 223 tempered distribution, 63 tensor product of distributions, 101, 102 tensor product of functions, 100 theorem, Liouville, 122 theorem, mean value, for harmonic functions, 127 theorem, Sobolev’s on embedding, 173 theorem, Sobolev’s on restrictions, 175 theorem, spectral, 197 theorem, Tikhonov, 166 transformation, Kelvin, 123 transport equations, 274, 280 uniform boundedness, 178 unitarily equivalent operators, 197 variation of a functional, 34 variational derivative, 34 variational derivative, partial, 35 vector, characteristic, 7 vector, cotangent, 12 vector, tangent, 12 vector, wave, 220 velocity, 223 volume potential, 242 wave equation, the nonresonance case, 32 wave equation, the resonance case, 32 wave front, 261, 265, 271 wave operator, 2, 219 wave vector, 220 wave, cylindric, 221 wave, monochromatic, 222 waves standing, 26 waves, spherical, 220 weak topology, 67 well-posedness, 132

319

Selected Published Titles in This Series 205 204 203 202

Mikhail Shubin, Invitation to Partial Diﬀerential Equations, 2020 Sarah J. Witherspoon, Hochschild Cohomology for Algebras, 2019 Dimitris Koukoulopoulos, The Distribution of Prime Numbers, 2019 Michael E. Taylor, Introduction to Complex Analysis, 2019

201 Dan A. Lee, Geometric Relativity, 2019 200 Semyon Dyatlov and Maciej Zworski, Mathematical Theory of Scattering Resonances, 2019 199 Weinan E, Tiejun Li, and Eric Vanden-Eijnden, Applied Stochastic Analysis, 2019 198 Robert L. Benedetto, Dynamics in One Non-Archimedean Variable, 2019 197 196 195 194

Walter Craig, A Course on Partial Diﬀerential Equations, 2018 Martin Stynes and David Stynes, Convection-Diﬀusion Problems, 2018 Matthias Beck and Raman Sanyal, Combinatorial Reciprocity Theorems, 2018 Seth Sullivant, Algebraic Statistics, 2018

193 192 191 190

Martin Lorenz, A Tour of Representation Theory, 2018 Tai-Peng Tsai, Lectures on Navier-Stokes Equations, 2018 Theo B¨ uhler and Dietmar A. Salamon, Functional Analysis, 2018 Xiang-dong Hou, Lectures on Finite Fields, 2018

189 188 187 186

I. Martin Isaacs, Characters of Solvable Groups, 2018 Steven Dale Cutkosky, Introduction to Algebraic Geometry, 2018 John Douglas Moore, Introduction to Global Analysis, 2017 Bjorn Poonen, Rational Points on Varieties, 2017

185 Douglas J. LaFountain and William W. Menasco, Braid Foliations in Low-Dimensional Topology, 2017 184 Harm Derksen and Jerzy Weyman, An Introduction to Quiver Representations, 2017 183 Timothy J. Ford, Separable Algebras, 2017 182 181 180 179

Guido Schneider and Hannes Uecker, Nonlinear PDEs, 2017 Giovanni Leoni, A First Course in Sobolev Spaces, Second Edition, 2017 Joseph J. Rotman, Advanced Modern Algebra: Third Edition, Part 2, 2017 Henri Cohen and Fredrik Str¨ omberg, Modular Forms, 2017

178 Jeanne N. Clelland, From Frenet to Cartan: The Method of Moving Frames, 2017 177 Jacques Sauloy, Diﬀerential Galois Theory through Riemann-Hilbert Correspondence, 2016 176 Adam Clay and Dale Rolfsen, Ordered Groups and Topology, 2016 175 Thomas A. Ivey and Joseph M. Landsberg, Cartan for Beginners: Diﬀerential Geometry via Moving Frames and Exterior Diﬀerential Systems, Second Edition, 2016 174 Alexander Kirillov Jr., Quiver Representations and Quiver Varieties, 2016 173 Lan Wen, Diﬀerentiable Dynamical Systems, 2016 172 Jinho Baik, Percy Deift, and Touﬁc Suidan, Combinatorics and Random Matrix Theory, 2016 171 Qing Han, Nonlinear Elliptic Equations of the Second Order, 2016 170 169 168 167

Donald Yau, Colored Operads, 2016 Andr´ as Vasy, Partial Diﬀerential Equations, 2015 Michael Aizenman and Simone Warzel, Random Operators, 2015 John C. Neu, Singular Perturbation in the Physical Sciences, 2015

166 Alberto Torchinsky, Problems in Real and Functional Analysis, 2015

For a complete list of titles in this series, visit the AMS Bookstore at www.ams.org/bookstore/gsmseries/.

It is a great pleasure to see this book—written by a great master of the subject—finally in print. This treatment of a core part of mathematics and its applications offers the student both a solid foundation in basic calculations techniques in the subject, as well as a basic introduction to the more general machinery, e.g., distributions, Sobolev spaces, etc., which are such a key part of any modern treatment. As such this book is ideal for more advanced undergraduates as well as mathematically inclined students from engineering or the natural sciences. Shubin has a lovely intuitive writing style which provides a gentle introduction to this beautiful subject. Many good exercises (and solutions) are provided! —Rafe Mazzeo, Stanford University This text provides an excellent semester’s introduction to classical and modern topics in linear PDE, suitable for students with a background in advanced calculus and Lebesgue integration. The author intersperses treatments of the Laplace, heat, and wave equations with developments of various functional analytic tools, particularly distribution theory and spectral theory, introducing key concepts while deftly avoiding heavy technicalities. —Michael Taylor, University of North Carolina, Chapel Hill

For additional information and updates on this book, visit www.ams.org/bookpages/gsm-205

GSM/205 www.ams.org

Mikhail Shubin photo © 2020 Galina Ester Shubina. Photography by Kate Gabor

This book is based on notes from a beginning graduate course on partial differential equations. Prerequisites for using the book are a solid undergraduate course in real analysis. There are more than 100 exercises in the book. Some of them are just exercises, whereas others, even though they may require new ideas to solve them, provide additional important information about the subject.