Functional Analytic Techniques for Diffusion Processes [1 ed.] 9789811910982, 9789811910999

This book is an easy-to-read reference providing a link between functional analysis and diffusion processes.


English · 782 pages [792] · 2022


Table of contents :
Foreword
Preface
Contents
Notation and Conventions
1 Introduction and Summary
1.1 Markov Processes and Semigroups
1.1.1 Brownian Motion
1.1.2 Markov Processes
1.1.3 Transition Functions
1.1.4 Kolmogorov's Equations
1.1.5 Feller Semigroups
1.1.6 Path Functions of Markov Processes
1.1.7 Strong Markov Processes
1.1.8 Infinitesimal Generators of Feller Semigroups
1.1.9 One-Dimensional Diffusion Processes
1.1.10 Multidimensional Diffusion Processes
1.2 Propagation of Maxima
1.3 Construction of Feller Semigroups
1.4 Notes and Comments
Part I Foundations of Modern Analysis
2 Sets, Topology and Measures
2.1 Sets
2.2 Mappings
2.3 Topological Spaces
2.4 Compactness
2.5 Connectedness
2.6 Metric Spaces
2.7 Baire's Category
2.8 Continuous Mappings
2.9 Linear Spaces
2.10 Linear Topological Spaces
2.10.1 The Ascoli–Arzelà Theorem
2.11 Factor Spaces
2.12 Algebras and Modules
2.13 Linear Operators
2.14 Measurable Spaces
2.15 Measurable Functions
2.16 Measures
2.16.1 Lebesgue Measures
2.16.2 Signed Measures
2.16.3 Borel Measures and Radon Measures
2.16.4 Product Measures
2.16.5 Direct Image of Measures
2.17 Integrals
2.18 The Radon–Nikodým Theorem
2.19 Fubini's Theorem
2.20 Notes and Comments
3 A Short Course in Probability Theory
3.1 Measurable Spaces and Functions
3.1.1 The Monotone Class Theorem
3.1.2 The Approximation Theorem
3.1.3 Measurability of Functions
3.2 Probability Spaces
3.3 Random Variables and Expectations
3.4 Independence
3.4.1 Independent Events
3.4.2 Independent Random Variables
3.4.3 Independent Algebras
3.5 Construction of Random Processes with Finite Dimensional Distribution
3.6 Conditional Probabilities
3.7 Conditional Expectations
3.8 Notes and Comments
4 Manifolds, Tensors and Densities
4.1 Manifolds
4.1.1 Topology on Manifolds
4.1.2 Submanifolds
4.2 Smooth Mappings
4.2.1 Partitions of Unity
4.3 Tangent Bundles
4.4 Vector Fields
4.5 Vector Fields and Integral Curves
4.6 Cotangent Bundles
4.7 Tensors
4.8 Tensor Fields
4.9 Exterior Product
4.10 Differential Forms
4.11 Vector Bundles
4.12 Densities
4.13 Integration on Manifolds
4.14 Manifolds with Boundary and the Double of a Manifold
4.15 Stokes's Theorem, Divergence Theorem and Green's Identities
4.16 Notes and Comments
5 A Short Course in Functional Analysis
5.1 Metric Spaces and the Contraction Mapping Principle
5.2 Linear Operators and Functionals
5.3 Quasinormed Linear Spaces
5.3.1 Compact Sets
5.3.2 Bounded Sets
5.3.3 Continuity of Linear Operators
5.3.4 Topologies of Linear Operators
5.3.5 The Banach–Steinhaus Theorem
5.3.6 Product Spaces
5.4 Normed Linear Spaces
5.4.1 Linear Operators on Normed Spaces
5.4.2 Method of Continuity
5.4.3 Finite Dimensional Spaces
5.4.4 The Hahn–Banach Extension Theorem
5.4.5 Dual Spaces
5.4.6 Annihilators
5.4.7 Dual Spaces of Normed Factor Spaces
5.4.8 Bidual Spaces
5.4.9 Weak Convergence
5.4.10 Weak* Convergence
5.4.11 Dual Operators
5.4.12 Adjoint Operators
5.5 Linear Functionals and Measures
5.5.1 The Space of Continuous Functions
5.5.2 The Space of Signed Measures
5.5.3 The Riesz–Markov Representation Theorem
5.5.4 Weak Convergence of Measures
5.6 Closed Operators
5.7 Complemented Subspaces
5.8 Compact Operators
5.9 The Riesz–Schauder Theory
5.10 Fredholm Operators
5.11 Hilbert Spaces
5.11.1 Orthogonality
5.11.2 The Closest-Point Theorem and Applications
5.11.3 Orthonormal Sets
5.11.4 Adjoint Operators
5.12 The Hilbert–Schmidt Theory
5.13 Notes and Comments
6 A Short Course in Semigroup Theory
6.1 Banach Space Valued Functions
6.2 Operator Valued Functions
6.3 Exponential Functions
6.4 Contraction Semigroups
6.4.1 The Hille–Yosida Theory of Contraction Semigroups
6.4.2 The Contraction Semigroup Associated with the Heat Kernel
6.5 (C0) Semigroups
6.5.1 Semigroups and Their Infinitesimal Generators
6.5.2 Infinitesimal Generators and Their Resolvents
6.5.3 The Hille–Yosida Theorem
6.5.4 (C0) Semigroups and Initial-Value Problems
6.6 Notes and Comments
Part II Elements of Partial Differential Equations
7 Distributions, Operators and Kernels
7.1 Notation
7.1.1 Points in Euclidean Spaces
7.1.2 Multi-Indices and Derivations
7.2 Function Spaces
7.2.1 L^p Spaces
7.2.2 Convolutions
7.2.3 Spaces of C^k Functions
7.2.4 Space of Test Functions
7.2.5 Hölder Spaces
7.2.6 Friedrichs' Mollifiers
7.3 Differential Operators
7.4 Distributions and the Fourier Transform
7.4.1 Definitions and Basic Properties of Distributions
7.4.2 Topologies on 𝒟(Ω)
7.4.3 Support of a Distribution
7.4.4 Dual Space of C^∞(Ω)
7.4.5 Tensor Product of Distributions
7.4.6 Convolution of Distributions
7.4.7 The Jump Formula
7.4.8 Regular Distributions with Respect to One Variable
7.4.9 The Fourier Transform
7.4.10 Tempered Distributions
7.4.11 Fourier Transform of Tempered Distributions
7.5 Operators and Kernels
7.5.1 Schwartz's Kernel Theorem
7.5.2 Regularizers
7.6 Layer Potentials
7.6.1 Single and Double Layer Potentials
7.6.2 The Green Representation Formula
7.6.3 Approximation to the Identity via Dirac Measure
7.7 Distribution Theory on a Manifold
7.7.1 Densities on a Manifold
7.7.2 Distributions on a Manifold
7.7.3 Differential Operators on a Manifold
7.7.4 Operators and Kernels on a Manifold
7.8 Domains of Class C^r
7.9 The Seeley Extension Theorem
7.9.1 Proof of Lemma 7.46
7.10 Notes and Comments
8 L^2 Theory of Sobolev Spaces
8.1 The Spaces H^s(R^n)
8.2 The Spaces H^s_loc(Ω) and H^s_comp(Ω)
8.3 The Spaces H^s(M)
8.4 The Spaces H^s(R̄^n_+)
8.5 The Spaces H^s(Ω̄)
8.6 Trace Theorems
8.7 Sectional Trace Theorems
8.8 Sobolev Spaces and Regularizations
8.9 Friedrichs' Mollifiers and Differential Operators
8.10 Notes and Comments
9 L^2 Theory of Pseudo-differential Operators
9.1 Symbol Classes
9.2 Phase Functions
9.3 Oscillatory Integrals
9.4 Fourier Integral Operators
9.5 Pseudo-differential Operators
9.5.1 Definitions and Basic Properties
9.5.2 Symbols of a Pseudo-differential Operator
9.5.3 The Algebra of Pseudo-differential Operators
9.5.4 Elliptic Pseudo-differential Operators
9.5.5 Invariance of Pseudo-differential Operators Under Change of Coordinates
9.5.6 Pseudo-differential Operators and Sobolev Spaces
9.6 Pseudo-differential Operators on a Manifold
9.6.1 Definitions and Basic Properties
9.6.2 Classical Pseudo-differential Operators
9.6.3 Elliptic Pseudo-differential Operators
9.7 Elliptic Pseudo-differential Operators and Their Indices
9.7.1 Pseudo-differential Operators on Sobolev Spaces
9.7.2 The Index of an Elliptic Pseudo-differential Operator
9.8 Potentials and Pseudo-differential Operators
9.8.1 Single and Double Layer Potentials Revisited
9.8.2 The Green Representation Formula Revisited
9.8.3 Surface and Volume Potentials
9.9 The Sharp Gårding Inequality
9.10 Hypoelliptic Pseudo-differential Operators
9.11 Notes and Comments
Part III Maximum Principles and Elliptic Boundary Value Problems
10 Maximum Principles for Degenerate Elliptic Operators
10.1 Introduction
10.2 Maximum Principles
10.3 Propagation of Maxima
10.3.1 Statement of Results
10.3.2 Preliminaries
10.3.3 Proof of Theorem 10.14
10.3.4 Proof of Theorem 10.19
10.3.5 Proof of Theorem 10.17
10.4 Notes and Comments
Part IV L^2 Theory of Elliptic Boundary Value Problems
11 Elliptic Boundary Value Problems
11.1 The Dirichlet Problem in the Framework of Hölder Spaces
11.2 The Dirichlet Problem in the Framework of L^2 Sobolev Spaces
11.3 General Boundary Value Problems
11.3.1 Formulation of Boundary Value Problems
11.3.2 Reduction to the Boundary
11.4 Unique Solvability Theorem for General Boundary Value Problems
11.4.1 Statement of Main Results
11.4.2 Proof of Theorem 11.19
11.4.3 End of Proof of Theorem 11.19
11.4.4 Proof of Corollary 11.20
11.5 Notes and Comments
Part V Markov Processes, Feller Semigroups and Boundary Value Problems
12 Markov Processes, Transition Functions and Feller Semigroups
12.1 Markov Processes and Transition Functions
12.1.1 Definitions of Markov Processes
12.1.2 Transition Functions
12.1.3 Kolmogorov's Equations
12.1.4 Feller and C0 Transition Functions
12.1.5 Path Functions of Markov Processes
12.1.6 Stopping Times
12.1.7 Definition of Strong Markov Processes
12.1.8 Strong Markov Property and Uniform Stochastic Continuity
12.2 Feller Semigroups and Transition Functions
12.2.1 Definition of Feller Semigroups
12.2.2 Characterization of Feller Semigroups in Terms of Transition Functions
12.3 The Hille–Yosida Theory of Feller Semigroups
12.3.1 Generation Theorems for Feller Semigroups
12.3.2 Generation Theorems for Feller Semigroups in Terms of Maximum Principles
12.4 Infinitesimal Generator of Feller Semigroups on a Bounded Domain (i)
12.5 Infinitesimal Generator of Feller Semigroups on a Bounded Domain (ii)
12.6 Feller Semigroups and Boundary Value Problems
12.7 Notes and Comments
13 L^2 Approach to the Construction of Feller Semigroups
13.1 Statements of Main Results
13.2 Proof of Theorem 13.1
13.2.1 Proof of Theorem 13.5
13.3 Proof of Theorem 13.3
13.3.1 Proof of Theorem 13.15
13.4 The Degenerate Diffusion Operator Case
13.4.1 The Regular Boundary Case
13.4.2 The Totally Characteristic Case
13.5 Notes and Comments
14 Concluding Remarks
Appendix A Brief Introduction to the Potential Theoretic Approach
A.1 Hölder Continuity and Hölder Spaces
A.2 Interior Estimates for Harmonic Functions
A.3 Hölder Regularity for the Newtonian Potential
A.4 Hölder Estimates for the Second Derivatives
A.5 Hölder Estimates at the Boundary
A.6 Notes and Comments
Appendix References
Index

Springer Monographs in Mathematics

Kazuaki Taira

Functional Analytic Techniques for Diffusion Processes

Springer Monographs in Mathematics

Editors-in-Chief
Minhyong Kim, School of Mathematics, Korea Institute for Advanced Study, Seoul, South Korea, International Centre for Mathematical Sciences, Edinburgh, UK
Katrin Wendland, School of Mathematics, Trinity College Dublin, Dublin, Ireland

Series Editors
Sheldon Axler, Department of Mathematics, San Francisco State University, San Francisco, CA, USA
Mark Braverman, Department of Mathematics, Princeton University, Princeton, NJ, USA
Maria Chudnovsky, Department of Mathematics, Princeton University, Princeton, NJ, USA
Tadahisa Funaki, Department of Mathematics, University of Tokyo, Tokyo, Japan
Isabelle Gallagher, Département de Mathématiques et Applications, Ecole Normale Supérieure, Paris, France
Sinan Güntürk, Courant Institute of Mathematical Sciences, New York University, New York, NY, USA
Claude Le Bris, CERMICS, Ecole des Ponts ParisTech, Marne la Vallée, France
Pascal Massart, Département de Mathématiques, Université de Paris-Sud, Orsay, France
Alberto A. Pinto, Department of Mathematics, University of Porto, Porto, Portugal
Gabriella Pinzari, Department of Mathematics, University of Padova, Padova, Italy
Ken Ribet, Department of Mathematics, University of California, Berkeley, CA, USA
René Schilling, Institute for Mathematical Stochastics, Technical University Dresden, Dresden, Germany
Panagiotis Souganidis, Department of Mathematics, University of Chicago, Chicago, IL, USA
Endre Süli, Mathematical Institute, University of Oxford, Oxford, UK
Shmuel Weinberger, Department of Mathematics, University of Chicago, Chicago, IL, USA
Boris Zilber, Mathematical Institute, University of Oxford, Oxford, UK

This series publishes advanced monographs giving well-written presentations of the “state-of-the-art” in fields of mathematical research that have acquired the maturity needed for such a treatment. They are sufficiently self-contained to be accessible to more than just the intimate specialists of the subject, and sufficiently comprehensive to remain valuable references for many years. Besides the current state of knowledge in its field, an SMM volume should ideally describe its relevance to and interaction with neighbouring fields of mathematics, and give pointers to future directions of research.

More information about this series at https://link.springer.com/bookseries/3733

Kazuaki Taira

Functional Analytic Techniques for Diffusion Processes

Kazuaki Taira Tsuchiura, Japan

ISSN 1439-7382 ISSN 2196-9922 (electronic) Springer Monographs in Mathematics ISBN 978-981-19-1098-2 ISBN 978-981-19-1099-9 (eBook) https://doi.org/10.1007/978-981-19-1099-9 Mathematics Subject Classification: Primary: 47D07, 35J25, Secondary: 47D05, 60J35, 60J60 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Dedicated to Prof. Kiyosi Itô (1915–2008) in appreciation of his constant encouragement

Foreword

There are dozens of books about Markov processes, some of them very good, but none match the depth and broad coverage of Kazuaki Taira’s books. Let me try to put this into context. Sometimes a massive study is done and leads to a major volume or volumes that redefine a field of study. For instance, the three-volume work of Nelson Dunford and Jack Schwartz did this for abstract mathematical analysis. The famed Charles Misner, Kip Thorne and John Wheeler book did this for general relativity. Taira’s work does this for Markov processes from a broad perspective.

A simple view of Markov processes is that they deal with classes of dependent random variables that have both a nice theory and useful applications. But the general theory of Markov processes turns out to be extremely complicated. It is essential for applications to fields including mathematical biology, ecology, diffusion, statistical physics, etc. The mathematics needed for the hard parts of Markov processes requires up-to-date versions of functional analysis, probability theory, partial and pseudo-differential equations, differential geometry, Fourier analysis, and more. Taira’s books bring these topics all together. They are not easy to explain in their general forms, but Taira does this carefully and quite nicely. These topics are usually hard to follow, but Taira explains things in a more easily readable way than one normally expects. The scope of his work is vast; it has been and continues to be a major influence in stochastic analysis and related fields.

This book is a revised and expanded edition of the previous book [191] published in 1988. But is a new edition needed? In June 2019, Taira and I were both at a meeting in Cesena, Italy. His lecture was wonderful; it was on new, deep results. The topics he covered are among the new results in his new edition. In particular, the new material on the theory of pseudo-differential operators widens the scope of the book (which has a huge scope to begin with).
This is nicely explained in Chap. 1 (Introduction and Summary) and Chap. 13 (L^2 Approach to the Construction of Feller Semigroups) of this edition.

This wonderful book will be a major influence in a very broad field of study for a long time. I thank both Taira and Springer for their great contribution to the mathematical research community in publishing this book.

November 2021

Jerome Arthur Goldstein University of Memphis Memphis, Tennessee, USA

Preface

This book is devoted to the functional analytic approach to the problem of construction of diffusion processes in probability theory. It is well known that, by virtue of the Hille–Yosida theory of semigroups, the problem of construction of Markov processes can be reduced to the study of boundary value problems for degenerate elliptic integro-differential operators of second order. Several recent developments in the theory of partial differential equations have made possible further progress in the study of boundary value problems and hence of the problem of construction of Markov processes. The presentation of these new results is the main purpose of the present book.

Unlike many other books on Markov processes, this book focuses on the relationship between Markov processes and elliptic boundary value problems with emphasis on the study of maximum principles. Our approach here is distinguished by the extensive use of the theory of partial differential equations. Our functional analytic approach to diffusion processes is inspired by the following bird’s-eye view of mathematical studies of Brownian motion (see Tables 1.1, 1.2 and Figure 1.1 in Chap. 1):

Brownian Motion (Physics)
Markov Processes (Probability)
Diffusion Equations (Partial Differential Equations)
Semigroups (Functional Analysis)
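The reduction from Markov processes to boundary value problems invoked above rests on the Hille–Yosida correspondence between a semigroup and its infinitesimal generator. As a point of orientation only (these are the standard formulas for a strongly continuous contraction semigroup, not a statement of this book's theorems), the two directions of the correspondence read:

```latex
% For a strongly continuous contraction semigroup {T_t}_{t \ge 0} with
% infinitesimal generator A, the resolvent of A is the Laplace transform
% of the semigroup:
(\alpha I - A)^{-1} f = \int_0^{\infty} e^{-\alpha t}\, T_t f \, dt,
  \qquad \alpha > 0,
% and, conversely, the semigroup is recovered from the resolvent by the
% exponential formula:
T_t f = \lim_{n \to \infty} \Bigl( I - \tfrac{t}{n} A \Bigr)^{-n} f .
```

Constructing a Feller semigroup thus amounts to solving the resolvent equation (αI − A)u = f with a non-negative, contractive solution operator; when A is a degenerate elliptic operator subject to boundary conditions, this is precisely an elliptic boundary value problem with spectral parameter α, which is the link the book exploits.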


This book grew out of lecture notes for graduate courses given by the author at Sophia University, Waseda University, Hokkaido University, Tôhoku University, Tokyo Metropolitan University, Tokyo Institute of Technology, Hiroshima University and University of Tsukuba. It is addressed to advanced undergraduates, graduate students and mathematicians with interest in probability, functional analysis and partial differential equations.

This book may be considered as the second edition of the book [191] published in 1988, which was found useful by a number of people, but it went out of print after several years. This augmented edition has been revised to streamline some of the analysis and to give better coverage of important examples and applications. I have endeavored to present it in such a way as to make it accessible to undergraduates as well. Moreover, in order to make the book more up-to-date, additional references have been included in the bibliography. This book is amply illustrated; 14 tables and 141 figures are provided.

The contents of the book are divided into five principal parts.

(1) The first part (Chaps. 2 through 6) provides the elements of the Lebesgue theory of measure and integration, probability theory, manifold theory, functional analysis and distribution theory which are used throughout the book. The material in these preparatory chapters is given for completeness, to minimize the necessity of consulting too many outside references. This makes the book fairly self-contained.

(2) In the second part (Chaps. 7–9), the basic definitions and results about Sobolev spaces are summarized and the calculus of pseudo-differential operators—a modern version of classical potentials—is developed. The theory of pseudo-differential operators forms a most convenient tool in the study of elliptic boundary value problems in Chap. 11. It should be emphasized that pseudo-differential operators provide a constructive tool to deal with existence and smoothness of solutions of partial differential equations. The full power of this very refined theory is yet to be exploited. Our approach is not far removed from the classical potential approach.

(3) Our subject proper starts with the third part (Chap. 10), where various maximum principles for degenerate elliptic differential operators of second order are studied. In particular, the underlying analytical mechanism of propagation of maxima is revealed here. This plays an important role in the interpretation and study of Markov processes in terms of partial differential equations in Chap. 12.

(4) The fourth part (Chap. 11) is devoted to general boundary value problems for second order elliptic differential operators. The basic questions of existence, uniqueness and regularity of solutions of general boundary value problems with a spectral parameter are studied in the framework of Sobolev spaces, using the calculus of pseudo-differential operators. A fundamental existence and uniqueness theorem is proved here. The importance of such a theorem is visible in constructing Markov processes in Chaps. 12 and 13.

(5) The fifth and final part (Chaps. 12 and 13) is devoted to the functional analytic approach to the problem of construction of Markov processes. This part is the heart of the subject. General existence theorems for Markov processes in terms of boundary value problems are proved in Chap. 12, and then the construction of Markov processes is carried out in Chap. 13, by solving general boundary value problems with a spectral parameter.

To make the material in Chaps. 10 through 13 accessible to a broad spectrum of readers, I have added an Introduction and Summary (Chap. 1). In this introductory chapter, I have included ten elementary (but important) examples of diffusion processes, and further I have attempted to state our problems and results in such a fashion that a broad spectrum of readers could understand, and also to describe how these problems can be solved, using the mathematics I present in Chaps. 2 through 9.

In the last Chap. 14, as concluding remarks, we give an overview on generation theorems for Feller semigroups proved by the author using the L^p theory of pseudo-differential operators and the Calderón–Zygmund theory of singular integral operators (Table 14.1).

Bibliographical references are discussed primarily in notes at the end of the chapters. These notes are intended to supplement the text and place it in better perspective.

In Appendix A, following Gilbarg–Trudinger [74], we present a brief introduction to the potential theoretic approach to the Dirichlet problem for Poisson's equation. The approach here can be traced back to the pioneering work of Schauder, [158] and [159], on the Dirichlet problem for second order elliptic differential operators. This appendix is included for the sake of completeness.

This book may be considered as an elementary introduction to the more advanced book Boundary Value Problems and Markov Processes (the third edition) which was published in the Lecture Notes in Mathematics series in 2020. In fact, we confined ourselves to the case when the differential operator A is elliptic on D̄. The reason is that when A is not elliptic on D̄ we do not know whether the operator T(α) = LP(α), which plays a fundamental role in the proof, is a pseudo-differential operator or not. This book provides a powerful method for the analysis of elliptic boundary value problems in the framework of L^2 Sobolev spaces.
For advanced undergraduates working in functional analysis, partial differential equations and probability, this book may serve as an effective introduction to these three interrelated fields of analysis. For beginning graduate students about to major in the subject and mathematicians in the field looking for a coherent overview, I hope that the readers will find this book a useful entrée to the subject.

The presentation on some results of this book was given in the “Mathematisch-Physikalisches Kolloquium” which was held on November 3rd, 2015 at Leibniz Universität Hannover (Germany) while I was on leave from Waseda University. I take this opportunity to express my sincere gratitude to these institutions.

In preparing this book, I am indebted to many friends, colleagues and students. It is my great pleasure to thank all of them. In particular, I would like to express my hearty thanks to Kenji Asada, Sunao Ōuchi, Bernard Helffer, Jacques Camus, Charles Rockland, Junjiro Noguchi, Yuji Kasahara, Masao Tanikawa, Yasushi Ishikawa, Elmar Schrohe, Seiichiro Wakabayashi, Silvia Romanelli and Angelo Favini. Kasahara, Tanikawa and Wakabayashi helped me to learn the material that was presented in the previous book [191]. Schrohe and Ishikawa have read and commented on portions of various preliminary drafts. I am deeply indebted to Professors Kôichi Uchiyama, Jean-Michel Bony, Minoru Motoo, Tadashi Ueno, Shinzo Watanabe, Francesco Altomare and Jerome Arthur Goldstein for their constant interest in my work. I am grateful to my students—especially Hideo Deguchi, Nobuyuki Sugino, Takayasu Ito and Yusuke Yoshida—for many comments and corrigenda concerning my original lecture notes. Furthermore, I am very happy to acknowledge the influence of two of my teachers: Prof. Daisuke Fujiwara, from whose lectures I first learned this subject, and Prof. Hikosaburo Komatsu, who has done much to shape my viewpoint of analysis. I would like to extend my warmest thanks to the late Prof. Richard Ernest Bellman (1920–1984) who originally suggested that my work be published in book form.

I am sincerely grateful to the four anonymous referees and a copyeditor for their many valuable suggestions and comments, which have substantially improved the presentation of this book. I would like to extend my hearty thanks to the staff of Springer-Verlag (Tokyo), who have generously complied with all my wishes. Last but not least, I owe a great debt of gratitude to my family, who gave me moral support during the preparation of this book.

Tsuchiura, Ibaraki, Japan
November 2021

Kazuaki Taira


157 159 162 164 166 170 172

A Short Course in Functional Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . 5.1 Metric Spaces and the Contraction Mapping Principle . . . . . . . . . 5.2 Linear Operators and Functionals . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.3 Quasinormed Linear Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.3.1 Compact Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.3.2 Bounded Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.3.3 Continuity of Linear Operators . . . . . . . . . . . . . . . . . . . . . 5.3.4 Topologies of Linear Operators . . . . . . . . . . . . . . . . . . . . . 5.3.5 The Banach–Steinhaus Theorem . . . . . . . . . . . . . . . . . . . . 5.3.6 Product Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.4 Normed Linear Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.4.1 Linear Operators on Normed Spaces . . . . . . . . . . . . . . . . . 5.4.2 Method of Continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.4.3 Finite Dimensional Spaces . . . . . . . . . . . . . . . . . . . . . . . . . 5.4.4 The Hahn–Banach Extension Theorem . . . . . . . . . . . . . . . 5.4.5 Dual Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.4.6 Annihilators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.4.7 Dual Spaces of Normed Factor Spaces . . . . . . . . . . . . . . . 5.4.8 Bidual Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.4.9 Weak Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.4.10 Weak* Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.4.11 Dual Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.4.12 Adjoint Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
5.5 Linear Functionals and Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.5.1 The Space of Continuous Functions . . . . . . . . . . . . . . . . . 5.5.2 The Space of Signed Measures . . . . . . . . . . . . . . . . . . . . . 5.5.3 The Riesz–Markov Representation Theorem . . . . . . . . . . 5.5.4 Weak Convergence of Measures . . . . . . . . . . . . . . . . . . . . 5.6 Closed Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.7 Complemented Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.8 Compact Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.9 The Riesz–Schauder Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.10 Fredholm Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

179 179 181 182 184 185 186 187 187 188 188 190 192 194 195 197 198 198 199 199 201 201 202 204 204 205 206 207 208 210 211 212 213

175 177

xvi

6

Contents

5.11 Hilbert Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.11.1 Orthogonality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.11.2 The Closest-Point Theorem and Applications . . . . . . . . . 5.11.3 Orthonormal Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.11.4 Adjoint Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.12 The Hilbert–Schmidt Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.13 Notes and Comments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

218 220 220 223 224 225 227

A Short Course in Semigroup Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.1 Banach Space Valued Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.2 Operator Valued Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.3 Exponential Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.4 Contraction Semigroups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.4.1 The Hille–Yosida Theory of Contraction Semigroups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.4.2 The Contraction Semigroup Associated with the Heat Kernel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.5 (C0 ) Semigroups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.5.1 Semigroups and Their Infinitesimal Generators . . . . . . . . 6.5.2 Infinitesimal Generators and Their Resolvents . . . . . . . . 6.5.3 The Hille–Yosida Theorem . . . . . . . . . . . . . . . . . . . . . . . . . 6.5.4 (C0 ) Semigroups and Initial-Value Problems . . . . . . . . . . 6.6 Notes and Comments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

229 229 231 233 235

Part II 7

235 248 255 255 262 269 276 280

Elements of Partial Differential Equations

Distributions, Operators and Kernels . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.1.1 Points in Euclidean Spaces . . . . . . . . . . . . . . . . . . . . . . . . . 7.1.2 Multi-Indices and Derivations . . . . . . . . . . . . . . . . . . . . . . 7.2 Function Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.2.1 L p Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.2.2 Convolutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.2.3 Spaces of C k Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.2.4 Space of Test Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.2.5 Hölder Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.2.6 Friedrichs’ Mollifiers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.3 Differential Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.4 Distributions and the Fourier Transform . . . . . . . . . . . . . . . . . . . . . 7.4.1 Definitions and Basic Properties of Distributions . . . . . . 7.4.2 Topologies on D (Ω) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.4.3 Support of a Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.4.4 Dual Space of C ∞ (Ω) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.4.5 Tensor Product of Distributions . . . . . . . . . . . . . . . . . . . . . 7.4.6 Convolution of Distributions . . . . . . . . . . . . . . . . . . . . . . . 7.4.7 The Jump Formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

283 284 284 284 285 285 286 287 290 290 292 294 294 294 300 301 302 303 305 307

Contents

xvii

7.4.8 Regular Distributions with Respect to One Variable . . . . 7.4.9 The Fourier Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.4.10 Tempered Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.4.11 Fourier Transform of Tempered Distributions . . . . . . . . . 7.5 Operators and Kernels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.5.1 Schwartz’s Kernel Theorem . . . . . . . . . . . . . . . . . . . . . . . . 7.5.2 Regularizers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.6 Layer Potentials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.6.1 Single and Double Layer Potentials . . . . . . . . . . . . . . . . . . 7.6.2 The Green Representation Formula . . . . . . . . . . . . . . . . . . 7.6.3 Approximation to the Identity via Dirac Measure . . . . . . 7.7 Distribution Theory on a Manifold . . . . . . . . . . . . . . . . . . . . . . . . . . 7.7.1 Densities on a Manifold . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.7.2 Distributions on a Manifold . . . . . . . . . . . . . . . . . . . . . . . . 7.7.3 Differential Operators on a Manifold . . . . . . . . . . . . . . . . 7.7.4 Operators and Kernels on a Manifold . . . . . . . . . . . . . . . . 7.8 Domains of Class C r . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.9 The Seeley Extension Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.9.1 Proof of Lemma 7.46 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.10 Notes and Comments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

308 310 314 322 329 331 334 336 336 338 339 341 341 342 344 345 346 349 351 355

8

L 2 Theory of Sobolev Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.1 The Spaces H s (R n ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . s s (Ω) and Hcomp (Ω) . . . . . . . . . . . . . . . . . . . . . . . . . 8.2 The Spaces Hloc 8.3 The Spaces H s (M) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . n ) ...................................... 8.4 The Spaces H s (R+ 8.5 The Spaces H s (Ω) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.6 Trace Theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.7 Sectional Trace Theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.8 Sobolev Spaces and Regularizations . . . . . . . . . . . . . . . . . . . . . . . . 8.9 Friedrichs’ Mollifiers and Differential Operators . . . . . . . . . . . . . . 8.10 Notes and Comments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

357 357 361 364 366 369 372 376 378 381 390

9

L 2 Theory of Pseudo-differential Operators . . . . . . . . . . . . . . . . . . . . . 9.1 Symbol Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9.2 Phase Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9.3 Oscillatory Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9.4 Fourier Integral Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9.5 Pseudo-differential Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9.5.1 Definitions and Basic Properties . . . . . . . . . . . . . . . . . . . . 9.5.2 Symbols of a Pseudo-differential Operator . . . . . . . . . . . . 9.5.3 The Algebra of Pseudo-differential Operators . . . . . . . . . 9.5.4 Elliptic Pseudo-differential Operators . . . . . . . . . . . . . . . . 9.5.5 Invariance of Pseudo-differential Operators Under Change of Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9.5.6 Pseudo-differential Operators and Sobolev Spaces . . . . .

391 391 395 396 398 400 400 403 405 406 407 408

xviii

Contents

9.6

Pseudo-differential Operators on a Manifold . . . . . . . . . . . . . . . . . 9.6.1 Definitions and Basic Properties . . . . . . . . . . . . . . . . . . . . 9.6.2 Classical Pseudo-differential Operators . . . . . . . . . . . . . . 9.6.3 Elliptic Pseudo-differential Operators . . . . . . . . . . . . . . . . 9.7 Elliptic Pseudo-differential Operators and Their Indices . . . . . . . . 9.7.1 Pseudo-differential Operators on Sobolev Spaces . . . . . . 9.7.2 The Index of an Elliptic Pseudo-differential Operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9.8 Potentials and Pseudo-differential Operators . . . . . . . . . . . . . . . . . 9.8.1 Single and Double Layer Potentials Revisited . . . . . . . . . 9.8.2 The Green Representation Formula Revisited . . . . . . . . . 9.8.3 Surface and Volume Potentials . . . . . . . . . . . . . . . . . . . . . . 9.9 The Sharp Gårding Inequality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9.10 Hypoelliptic Pseudo-differential Operators . . . . . . . . . . . . . . . . . . . 9.11 Notes and Comments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

409 409 410 412 412 412 416 426 426 427 428 440 445 450

Part III Maximum Principles and Elliptic Boundary Value Problems 10 Maximum Principles for Degenerate Elliptic Operators . . . . . . . . . . . 10.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10.2 Maximum Principles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10.3 Propagation of Maxima . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10.3.1 Statement of Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10.3.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10.3.3 Proof of Theorem 10.14 . . . . . . . . . . . . . . . . . . . . . . . . . . . 10.3.4 Proof of Theorem 10.19 . . . . . . . . . . . . . . . . . . . . . . . . . . . 10.3.5 Proof of Theorem 10.17 . . . . . . . . . . . . . . . . . . . . . . . . . . . 10.4 Notes and Comments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

453 453 458 466 467 470 480 493 511 513

Part IV L2 Theory of Elliptic Boundary Value Problems 11 Elliptic Boundary Value Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.1 The Dirichlet Problem in the Framework of Hölder Spaces . . . . . 11.2 The Dirichlet Problem in the Framework of L 2 Sobolev Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.3 General Boundary Value Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 11.3.1 Formulation of Boundary Value Problems . . . . . . . . . . . . 11.3.2 Reduction to the Boundary . . . . . . . . . . . . . . . . . . . . . . . . . 11.4 Unique Solvability Theorem for General Boundary Value Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.4.1 Statement of Main Results . . . . . . . . . . . . . . . . . . . . . . . . . 11.4.2 Proof of Theorem 11.19 . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.4.3 End of Proof of Theorem 11.19 . . . . . . . . . . . . . . . . . . . . . 11.4.4 Proof of Corollary 11.20 . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.5 Notes and Comments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

517 518 519 527 527 530 544 546 548 560 565 571

Contents

Part V

xix

Markov Processes, Feller Semigroups and Boundary Value Problems

12 Markov Processes, Transition Functions and Feller Semigroups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12.1 Markov Processes and Transition Functions . . . . . . . . . . . . . . . . . . 12.1.1 Definitions of Markov Processes . . . . . . . . . . . . . . . . . . . . 12.1.2 Transition Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12.1.3 Kolmogorov’s Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . 12.1.4 Feller and C0 Transition Functions . . . . . . . . . . . . . . . . . . 12.1.5 Path Functions of Markov Processes . . . . . . . . . . . . . . . . . 12.1.6 Stopping Times . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12.1.7 Definition of Strong Markov Processes . . . . . . . . . . . . . . . 12.1.8 Strong Markov Property and Uniform Stochastic Continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12.2 Feller Semigroups and Transition Functions . . . . . . . . . . . . . . . . . . 12.2.1 Definition of Feller Semigroups . . . . . . . . . . . . . . . . . . . . . 12.2.2 Characterization of Feller Semigroups in Terms of Transition Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12.3 The Hille–Yosida Theory of Feller Semigroups . . . . . . . . . . . . . . . 12.3.1 Generation Theorems for Feller Semigroups . . . . . . . . . . 12.3.2 Generation Theorems for Feller Semigroups in Terms of Maximum Principles . . . . . . . . . . . . . . . . . . . . 12.4 Infinitesimal Generator of Feller Semigroups on a Bounded Domain (i) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12.5 Infinitesimal Generator of Feller Semigroups on a Bounded Domain (ii) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12.6 Feller Semigroups and Boundary Value Problems . . . . . . . . . . . . . 12.7 Notes and Comments . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . .

636 647 679

L 2 Approach to the Construction of Feller Semigroups . . . . . . . . . . . 13.1 Statements of Main Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13.2 Proof of Theorem 13.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13.2.1 Proof of Theorem 13.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13.3 Proof of Theorem 13.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13.3.1 Proof of Theorem 13.15 . . . . . . . . . . . . . . . . . . . . . . . . . . . 13.4 The Degenerate Diffusion Operator Case . . . . . . . . . . . . . . . . . . . . 13.4.1 The Regular Boundary Case . . . . . . . . . . . . . . . . . . . . . . . . 13.4.2 The Totally Characteristic Case . . . . . . . . . . . . . . . . . . . . . 13.5 Notes and Comments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

681 682 685 687 705 708 711 713 715 718

13

575 577 577 583 590 592 595 596 601 602 603 603 604 612 612 619 626

14 Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 719

xx

Contents

Appendix: A Brief Introduction to the Potential Theoretic Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 723 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 761 Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 771

Notation and Conventions

The notation for set-theoretic concepts is standard. For example, the following notation is used for sets of numbers:

(1) N: positive integers.
(2) Z: integers.
(3) Z_0: non-negative integers.
(4) R: real numbers.
(5) C: complex numbers.
(6) [a, b]: the closed interval {x ∈ R : a ≤ x ≤ b}.
(7) [a, b): the semiclosed interval {x ∈ R : a ≤ x < b}.
(8) (a, b): the open interval {x ∈ R : a < x < b}.

The following notation and conventions are used for differentiation:

(1) α = (α_1, ..., α_n) ∈ Z_0^n.
(2) |α| = α_1 + ··· + α_n for α = (α_1, ..., α_n) ∈ Z_0^n.
(3) α! = α_1! ··· α_n! for α = (α_1, ..., α_n) ∈ Z_0^n.
(4) α ≥ β if and only if α_i ≥ β_i for all 1 ≤ i ≤ n.
(5) (α choose β) = α!/(β!(α − β)!).
(6) x^α = x_1^{α_1} ··· x_n^{α_n} for x = (x_1, ..., x_n) ∈ R^n and α = (α_1, ..., α_n) ∈ Z_0^n.
(7) ∂_x^β = ∂_{x_1}^{β_1} ··· ∂_{x_n}^{β_n} for β = (β_1, ..., β_n) ∈ Z_0^n.
(8) D_{ξ_j} = −i ∂/∂ξ_j for 1 ≤ j ≤ n, where i = √(−1).
(9) D_x^β = D_{x_1}^{β_1} ··· D_{x_n}^{β_n} for β = (β_1, ..., β_n) ∈ Z_0^n.
(10) ⟨y⟩ = (1 + |y|²)^{1/2} for y = (y_1, ..., y_n) ∈ R^n.
(11) ⟨D_ξ⟩² = 1 + Σ_{j=1}^{n} D_{ξ_j}² = 1 − Δ_ξ (minus the usual Laplacian).


These conventions greatly simplify many expressions. For example, the Taylor series for a function f(x) takes the form

f(x) = Σ_{α ≥ 0} (1/α!) ∂^α f(0) x^α.
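As a quick sanity check on these multi-index conventions, here is a sketch in plain Python (the particular function f(x) = exp(x_1 + 2x_2) and the truncation order are illustrative choices): its derivatives at 0 are ∂^α f(0) = 1^{α_1} · 2^{α_2}, so the truncated sum Σ_{|α| ≤ N} (1/α!) ∂^α f(0) x^α should reproduce f(x) up to a tiny remainder.

```python
import itertools
import math

def taylor_exp(x, coeffs=(1.0, 2.0), order=10):
    """Truncated multi-index Taylor sum of f(x) = exp(c1*x1 + c2*x2) at 0.

    For this f, the derivative d^alpha f(0) equals c1^a1 * c2^a2, so each
    term d^alpha f(0) * x^alpha / alpha! is computable in closed form.
    """
    total = 0.0
    for a1, a2 in itertools.product(range(order + 1), repeat=2):
        if a1 + a2 > order:          # keep only |alpha| <= order
            continue
        deriv = coeffs[0] ** a1 * coeffs[1] ** a2        # d^alpha f(0)
        monomial = x[0] ** a1 * x[1] ** a2               # x^alpha
        total += deriv * monomial / (math.factorial(a1) * math.factorial(a2))
    return total

x = (0.3, 0.2)
exact = math.exp(1.0 * x[0] + 2.0 * x[1])
approx = taylor_exp(x, order=12)
print(abs(exact - approx) < 1e-9)
```

Here the double loop enumerates all α with |α| ≤ N, and α! and x^α follow conventions (3) and (6) above.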

Chapter 1

Introduction and Summary

In this introductory chapter, ten elementary (but important) examples of diffusion processes are included. Furthermore, our problems and results are stated in a fashion that a broad spectrum of readers can understand, and we describe how these problems can be solved using the mathematics presented in Chaps. 2–9.

(I) First, Table 1.1 below gives a bird's-eye view of strong Markov processes, Feller semigroups and degenerate elliptic Ventcel' (Wentzell) boundary value problems, and of how they relate to each other.
(II) Secondly, our functional analytic approach to strong Markov processes through Feller semigroups may be visualized as in Fig. 1.1 below.
(III) Thirdly, Table 1.2 below gives a bird's-eye view of Markov transition functions, Feller semigroups and Green operators (resolvents), and of how they relate to each other.

1.1 Markov Processes and Semigroups

This section is devoted to the functional analytic approach to the problem of construction of Markov processes in probability theory. General existence theorems for Markov processes are formulated in terms of elliptic boundary value problems with a spectral parameter.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. K. Taira, Functional Analytic Techniques for Diffusion Processes, Springer Monographs in Mathematics, https://doi.org/10.1007/978-981-19-1099-9_1

Fig. 1.1 A functional analytic approach to strong Markov processes (schematic): {T_t}, a Feller semigroup on C(D̄) ↔ p_t(x, ·), with uniform stochastic continuity + the Feller property ↔ X, a right-continuous, strong Markov process.

Table 1.1 A bird's-eye view of strong Markov processes, Feller semigroups and degenerate elliptic boundary value problems

| Probability (microscopic approach) | Functional analysis (macroscopic approach) | Elliptic boundary value problems (mesoscopic approach) |
| Strong Markov process X = (x_t, F, F_t, P_x) | Feller semigroup {T_t}_{t≥0}: T_t f(x) = ∫_D̄ p_t(x, dy) f(y) | Degenerate diffusion operator A |
| Markov transition function p_t(x, dy) = P_x{x_t ∈ dy} | Infinitesimal generator A: T_t = e^{tA} | Function space C(D̄) |
| Chapman–Kolmogorov equation p_{t+s}(x, dz) = ∫_D̄ p_t(x, dy) p_s(y, dz) | Semigroup property T_{t+s} = T_t · T_s | Ventcel' (Wentzell) boundary condition L |
| Absorption, reflection, viscosity phenomena, drift and diffusion along the boundary | | |

Table 1.2 A bird's-eye view of Markov transition functions, Feller semigroups and Green operators

Markov transition function p_t(x, dy)  ⇐ Dynkin ⇒  Feller semigroup T_t = e^{tA}, T_t f(x) = ∫_D̄ p_t(x, dy) f(y)
  (Laplace transform)                               (Hille–Yosida)
Green kernel G_α(x, y)  ⇐ Riesz–Markov ⇒  Green operator (αI − A)^{−1}

1.1.1 Brownian Motion

In 1828, the English botanist Robert Brown observed that pollen grains suspended in water move chaotically, incessantly changing their direction of motion. The physical explanation of this phenomenon is that a single grain suffers innumerable collisions with the randomly moving molecules of the surrounding water [27]. A mathematical theory for Brownian motion was put forward by Albert Einstein in 1905 [50]. Let p(t, x, y) be the probability density function that a one-dimensional Brownian particle starting at position x will be found at position y at time t. Einstein derived the following formula from statistical mechanical considerations:


p(t, x, y) = 1/√(2πDt) · exp(−(y − x)²/(2Dt)).

Here D is a positive constant determined by the radius of the particle, the interaction of the particle with the surrounding molecules, the temperature and the Boltzmann constant. This gives an accurate method of measuring the Avogadro number by observing particles undergoing Brownian motion. Einstein's theory was experimentally tested by Jean Baptiste Perrin between 1906 and 1909 [145].

Brownian motion was put on a firm mathematical foundation for the first time by Norbert Wiener in 1923 [237]. Let Ω be the space of continuous functions ω : [0, ∞) → R with coordinates x_t(ω) = ω(t), and let F be the smallest σ-algebra in Ω which contains all sets of the form

{ω ∈ Ω : a ≤ x_t(ω) < b}, t ≥ 0, a < b.

Wiener constructed probability measures P_x, x ∈ R, on F for which the following formula holds true:

P_x{ω ∈ Ω : a_1 ≤ x_{t_1}(ω) < b_1, a_2 ≤ x_{t_2}(ω) < b_2, ..., a_n ≤ x_{t_n}(ω) < b_n}
= ∫_{a_1}^{b_1} ∫_{a_2}^{b_2} ··· ∫_{a_n}^{b_n} p(t_1, x, y_1) p(t_2 − t_1, y_1, y_2) ··· p(t_n − t_{n−1}, y_{n−1}, y_n) dy_1 dy_2 ··· dy_n,
0 < t_1 < t_2 < ··· < t_n < ∞.    (1.1)

This formula expresses the “starting afresh” property of Brownian motion that if a Brownian particle reaches a position, then it behaves subsequently as though that position had been its initial position. The measure Px is called the Wiener measure starting at x. Paul Lévy found another construction of Brownian motion in stochastic analysis, and gave a profound description of qualitative properties of the individual Brownian path in his book [114]: Processus stochastiques et mouvement brownien.
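Einstein's density can be sanity-checked numerically. The sketch below (plain Python; the constant D, the evaluation point and the integration grid are arbitrary illustrative choices) verifies that p(t, x, ·) integrates to 1 in y and that, for this kernel, ∂p/∂t = (D/2) ∂²p/∂y², the one-dimensional diffusion equation.

```python
import math

def p(t, x, y, D=1.0):
    """Einstein's transition density for one-dimensional Brownian motion."""
    return math.exp(-(y - x) ** 2 / (2 * D * t)) / math.sqrt(2 * math.pi * D * t)

D, t, x, y = 1.0, 0.5, 0.0, 0.3
h = 1e-4

# Normalization: Riemann sum of p(t, x, .) over a wide grid.
grid = [i * 0.01 - 10.0 for i in range(2001)]
mass = sum(0.01 * p(t, x, yy, D) for yy in grid)

# Diffusion equation dp/dt = (D/2) d^2p/dy^2, via central differences.
dp_dt = (p(t + h, x, y, D) - p(t - h, x, y, D)) / (2 * h)
d2p_dy2 = (p(t, x, y + h, D) - 2 * p(t, x, y, D) + p(t, x, y - h, D)) / h ** 2

print(abs(mass - 1.0) < 1e-3, abs(dp_dt - 0.5 * D * d2p_dy2) < 1e-4)
```

Both checks hold up to discretization error; the D/2 factor comes from the variance of the kernel being Dt.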

1.1.2 Markov Processes

Markov processes are an abstraction of the idea of Brownian motion. Let K be a locally compact, separable metric space and B the σ-algebra of all Borel sets in K, that is, the smallest σ-algebra containing all open sets in K. (The reader may content himself with thinking of R while reading about K.) Let (Ω, F, P) be a probability space. A function X defined on Ω taking values in K is called a random variable if it satisfies the condition

{X ∈ E} = X^{−1}(E) ∈ F for all E ∈ B.

We express this by saying that X is F/B-measurable. A family {x_t}_{t≥0} of random variables is called a stochastic process, and may be thought of as the motion in time


of a physical particle. The space K is called the state space and Ω the sample space. For a fixed ω ∈ Ω, the function x_t(ω), t ≥ 0, defines in the state space K a trajectory or path of the process corresponding to the sample point ω.

In this generality the notion of a stochastic process is of course not so interesting. The most important class of stochastic processes is the class of Markov processes, which is characterized by the Markov property. Intuitively, the (temporally homogeneous) Markov property is that the prediction of subsequent motion of a particle, knowing its position at time t, depends neither on the value of t nor on what has been observed during the time interval [0, t]; that is, a particle starts afresh.

Now we introduce a class of Markov processes which we will deal with in this book (Definition 12.3). Assume that we are given the following:

(1) A locally compact, separable metric space K and the σ-algebra B of all Borel sets in K. A point ∂ is adjoined to K as the point at infinity if K is not compact, and as an isolated point if K is compact. We let K_∂ = K ∪ {∂}, and B_∂ = the σ-algebra in K_∂ generated by B.
(2) The space Ω of all mappings ω : [0, ∞] → K_∂ such that ω(∞) = ∂ and that if ω(t) = ∂ then ω(s) = ∂ for all s ≥ t. We let ω_∂ be the constant map ω_∂(t) = ∂ for all t ∈ [0, ∞].
(3) For each t ∈ [0, ∞], the coordinate map x_t defined by x_t(ω) = ω(t) for every ω ∈ Ω.
(4) For each t ∈ [0, ∞], a mapping ϕ_t : Ω → Ω defined by ϕ_t ω(s) = ω(t + s) for ω ∈ Ω. Note that ϕ_∞ ω = ω_∂ and x_t ∘ ϕ_s = x_{t+s} for all t, s ∈ [0, ∞].
(5) A σ-algebra F in Ω and an increasing family {F_t}_{0≤t≤∞} of sub-σ-algebras of F.
(6) For each x ∈ K_∂, a probability measure P_x on (Ω, F).

We say that these elements define a (temporally homogeneous) Markov process X = (x_t, F, F_t, P_x) if the following conditions are satisfied:

(i) For each 0 ≤ t < ∞, the function x_t is F_t/B_∂-measurable, that is, {x_t ∈ E} ∈ F_t for all E ∈ B_∂.
(ii) For each 0 ≤ t < ∞ and E ∈ B, the function

p_t(x, E) = P_x{x_t ∈ E}    (1.2)

is a Borel measurable function of x ∈ K.
(iii) P_x{ω ∈ Ω : x_0(ω) = x} = 1 for each x ∈ K_∂.
(iv) For all t, h ∈ [0, ∞], x ∈ K_∂ and E ∈ B_∂, we have the formula

P_x{x_{t+h} ∈ E | F_t} = p_h(x_t, E),    (1.3)

or equivalently

P_x(A ∩ {x_{t+h} ∈ E}) = ∫_A p_h(x_t(ω), E) dP_x(ω) for every A ∈ F_t.    (1.3′)

Here is an intuitive way of thinking about the above definition of a Markov process. The sub-σ-algebra F_t may be interpreted as the collection of events which are observed during the time interval [0, t]. The value P_x(A), A ∈ F, may be interpreted as the probability of the event A under the condition that a particle starts at position x; hence the value p_t(x, E) expresses the transition probability that a particle starting at position x will be found in the set E at time t. The function p_t is called the transition function of the process X. The transition function p_t specifies the probability structure of the process. The intuitive meaning of the crucial condition (iv) is that the future behavior of a particle, knowing its history up to time t, is the same as the behavior of a particle starting at x_t(ω); that is, a particle starts afresh. A particle moves in the space K until it "dies", at which time it reaches the point ∂; hence the point ∂ is called the terminal point. With this interpretation in mind, we let

ζ(ω) = inf {t ∈ [0, ∞] : x_t(ω) = ∂}.

The random variable ζ is called the lifetime of the process X.

Using the Markov property (1.3′) repeatedly, we easily obtain the following formula, analogous to formula (1.1):

P_x{ω ∈ Ω : x_{t_1}(ω) ∈ A_1, x_{t_2}(ω) ∈ A_2, ..., x_{t_n}(ω) ∈ A_n}
= ∫_{A_1} ∫_{A_2} ··· ∫_{A_n} p_{t_1}(x, dy_1) p_{t_2−t_1}(y_1, dy_2) ··· p_{t_n−t_{n−1}}(y_{n−1}, dy_n),
0 < t_1 < t_2 < ··· < t_n < ∞, A_1, A_2, ..., A_n ∈ B.
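The product formula above can be checked by brute force in a toy discrete setting. The sketch below (plain Python; the two-state chain and the observation times are illustrative choices, not from the book) treats the matrix power P^t as the transition function p_t and compares the product formula for P_x(x_2 = y_1, x_5 = y_2) against a sum over all sample paths.

```python
# A hypothetical 2-state transition matrix (rows sum to 1); discrete time
# plays the role of t, and the t-step matrix power plays the role of p_t.
P = [[0.7, 0.3],
     [0.4, 0.6]]

def matpow(P, t):
    """t-fold matrix product of P with itself (t = 0 gives the identity)."""
    n = len(P)
    R = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(t):
        R = [[sum(R[i][k] * P[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
    return R

def path_prob(x, steps):
    """P_x(x_1 = s_1, ..., x_T = s_T) by multiplying one-step probabilities."""
    prob, cur = 1.0, x
    for s in steps:
        prob *= P[cur][s]
        cur = s
    return prob

# Finite-dimensional distribution P_x(x_2 = y1, x_5 = y2), two ways:
x, t1, t2, y1, y2 = 0, 2, 5, 1, 0
product_formula = matpow(P, t1)[x][y1] * matpow(P, t2 - t1)[y1][y2]

# Brute force: sum path probabilities over all intermediate states.
brute = sum(
    path_prob(x, (a, y1, b, c, y2))
    for a in (0, 1) for b in (0, 1) for c in (0, 1)
)
print(abs(product_formula - brute) < 1e-12)
```

The agreement reflects exactly the "starts afresh" property: the path sum factors through the state at time t_1.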

1.1.3 Transition Functions

From the viewpoint of analysis, the transition function is something more convenient than the Markov process itself. In fact, it can be shown that the transition functions of Markov processes generate solutions of certain parabolic partial differential equations, such as the classical diffusion equation; and, conversely, these differential equations can be used to construct and study the transition functions and the Markov processes themselves.

First, we give the precise definition of a transition function which is adapted to analysis (Definition 12.4):


Let K be a locally compact, separable metric space and B the σ-algebra of all Borel sets in K. A function p_t(x, E), defined for all t ≥ 0, x ∈ K and E ∈ B, is called a (temporally homogeneous) Markov transition function on K if it satisfies the following four conditions:

(a) p_t(x, ·) is a measure on B and p_t(x, K) ≤ 1 for each t ≥ 0 and x ∈ K.
(b) p_t(·, E) is a Borel measurable function for each t ≥ 0 and E ∈ B.
(c) p_0(x, {x}) = 1 for each x ∈ K.
(d) For any t, s ≥ 0 and E ∈ B, we have the formula

p_{t+s}(x, E) = ∫_K p_t(x, dy) p_s(y, E).    (1.4)

Equation (1.4), called the Chapman–Kolmogorov equation [34, 103], expresses the idea that a transition from the position x to the set E in time t + s is composed of a transition from x to some position y in time t, followed by a transition from y to the set E in the remaining time s; the latter transition has probability ps(y, E), which depends only on y. Thus it is just condition (d) which reflects the Markov property that a particle starts afresh. The Chapman–Kolmogorov equation (1.4) asserts that the transition function pt(x, K) is monotonically increasing as t ↓ 0, so that the limit

  p+0(x, K) = lim_{t↓0} pt(x, K)

exists. A transition function pt is said to be normal if it satisfies the condition p+0(x, K) = 1 for all x ∈ K. The next theorem justifies our definition of a transition function, and hence it will be fundamental for our further study of Markov processes:

Theorem 1.1 (Dynkin) For every Markov process, the function pt, defined by (1.2), is a transition function. Conversely, every normal transition function corresponds to some Markov process.

Here are some important examples of normal transition functions on R.

Example 1.2 (uniform motion) If t ≥ 0, x ∈ R and E ∈ B, we let

  pt(x, E) = χE(x + vt),   (1.5)

where v is a constant, and

  χE(y) = 1 if y ∈ E, 0 if y ∉ E.

1.1 Markov Processes and Semigroups


This process, starting at x, moves deterministically with constant velocity v.

Example 1.3 (Poisson process) If t ≥ 0, x ∈ R and E ∈ B, we let

  pt(x, E) = e^{−λt} Σ_{n=0}^{∞} (λt)^n/n! · χE(x + n),   (1.6)

where λ is a positive constant. This process, starting at x, advances one unit by jumps, and the probability of n jumps in time t is equal to e^{−λt}(λt)^n/n!.

Example 1.4 (Brownian motion) If t > 0, x ∈ R and E ∈ B, we let

  pt(x, E) = (1/√(2πt)) ∫_E exp(−(y − x)²/(2t)) dy,   (1.7)

and p0(x, E) = χE(x). This is a mathematical model of one-dimensional Brownian motion.

Example 1.5 (Brownian motion with constant drift) If t > 0, x ∈ R and E ∈ B, we let

  pt(x, E) = (1/√(2πt)) ∫_E exp(−(y − mt − x)²/(2t)) dy,   (1.8)

and p0(x, E) = χE(x), where m is a constant. This represents Brownian motion with constant drift m: the process can be represented as {xt + mt}, where {xt} is Brownian motion.

Example 1.6 (Cauchy process) If t > 0, x ∈ R and E ∈ B, we let

  pt(x, E) = (1/π) ∫_E t/(t² + (y − x)²) dy,   (1.9)

and p0(x, E) = χE(x). This process can be thought of as the "trace" on the real line of trajectories of two-dimensional Brownian motion, and it moves by jumps.
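The Chapman–Kolmogorov equation (1.4) can be verified numerically for the Gaussian kernel (1.7): composing a t-step with an s-step reproduces the (t + s)-step density. A small sketch (the grid and the parameter values are arbitrary choices):

```python
import numpy as np

def p(t, x, y):
    # transition density of Example 1.4 (Brownian motion)
    return np.exp(-(y - x)**2 / (2*t)) / np.sqrt(2*np.pi*t)

t, s, x, z = 0.3, 0.7, 0.5, -1.2
y = np.linspace(-20, 20, 40001)           # integration grid for the intermediate point
lhs = p(t + s, x, z)                      # p_{t+s}(x, z)
rhs = np.sum(p(t, x, y) * p(s, y, z)) * (y[1] - y[0])   # ∫ p_t(x, y) p_s(y, z) dy
assert abs(lhs - rhs) < 1e-8
```

The same check fails for no kernel above: (1.5), (1.6), (1.8) and (1.9) all satisfy (1.4) as well; the Gaussian case is simply the easiest to integrate on a grid.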


1.1.4 Kolmogorov's Equations

Among the first works devoted to Markov processes, the most fundamental was A. N. Kolmogorov's work (1931), where the general concept of a Markov transition function was introduced for the first time and an analytic method of describing Markov transition functions was proposed [103]. We now take a close look at Kolmogorov's work. Let pt be a transition function on R, and assume that the following two conditions are satisfied:

(i) For each ε > 0, we have the assertion

  lim_{t↓0} (1/t) sup_{x∈R} pt(x, R \ (x − ε, x + ε)) = 0.

(ii) The three limits

  lim_{t↓0} (1/t) ∫_{x−ε}^{x+ε} pt(x, dy)(y − x)² := a(x),
  lim_{t↓0} (1/t) ∫_{x−ε}^{x+ε} pt(x, dy)(y − x) := b(x),
  lim_{t↓0} (1/t) (pt(x, R) − 1) := c(x)

exist for each x ∈ R. Physically, the limit a(x) may be thought of as the variance (over ω ∈ Ω) of the instantaneous (with respect to t) velocity when the process is at position x (see Sect. 1.1.2), and the limit b(x) has a similar interpretation as a mean. The transition functions (1.5), (1.7) and (1.8) satisfy conditions (i) and (ii) with

  a(x) = 0, b(x) = v, c(x) = 0;
  a(x) = 1, b(x) = c(x) = 0;
  a(x) = 1, b(x) = m, c(x) = 0,

respectively, whereas the transition functions (1.6) and (1.9) do not satisfy condition (i). Furthermore, we assume that the transition function pt has a density p(t, x, y) with respect to the Lebesgue measure dy. Intuitively, the density p(t, x, y) represents the state of the process at position y at time t, starting from the initial state that a unit mass is at position x. Under certain regularity conditions, Kolmogorov showed that the density p(t, x, y) is, for fixed y, the fundamental solution of the Cauchy problem

  ∂p/∂t = (a(x)/2) ∂²p/∂x² + b(x) ∂p/∂x + c(x) p, t > 0,   (1.10)
  lim_{t↓0} p(t, x, y) = δ(x − y),

and is, for fixed x, the fundamental solution of the Cauchy problem:


  ∂p/∂t = ∂²/∂y² ((a(y)/2) p) − ∂/∂y (b(y) p) + c(y) p, t > 0,   (1.11)
  lim_{t↓0} p(t, x, y) = δ(y − x).

Here δ is the Dirac measure, and δ(x − y) (resp. δ(y − x)) represents a unit mass at position y (resp. x). Equation (1.10) is called Kolmogorov's backward equation, since we consider the terminal state (the variable y) to be fixed and vary the initial state (the variable x). In this context, Eq. (1.11) is called Kolmogorov's forward equation. These equations are also called the Fokker–Planck partial differential equations. In the case of Brownian motion (Example 1.4), Eqs. (1.10) and (1.11) become the classical diffusion (or heat) equations for t > 0:

  ∂p/∂t = (1/2) ∂²p/∂x²,  ∂p/∂t = (1/2) ∂²p/∂y².

Conversely, Kolmogorov raised the problem of constructing Markov transition functions by solving the given Fokker–Planck partial differential equations (1.10) and (1.11) [124, 132]. It is worth pointing out here that the forward equation (1.11) has a more intuitive form than the backward equation (1.10), but the regularity conditions it requires on the functions a and b are more stringent than those needed in the backward case. This suggests that the backward approach is more convenient than the forward approach from the viewpoint of analysis. In 1936, W. Feller treated this problem by classical analytic methods, and proved that Eq. (1.10) (or (1.11)) has a unique solution p(t, x, y) under certain regularity conditions on the functions a, b and c, and that this solution p(t, x, y) determines a Markov process [58]. In 1943, R. Fortet proved that these solutions correspond to Markov processes with continuous paths [64]. On the other hand, Bernstein [19] and Paul Lévy [114] approached this problem probabilistically, using stochastic differential equations.
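In the Brownian case the backward equation can be checked directly: finite-difference quotients of the Gaussian density (1.7) satisfy the heat equation to high accuracy. A quick numerical sketch (the evaluation point and step sizes are arbitrary):

```python
import numpy as np

def p(t, x, y):
    # Gaussian transition density (1.7): the heat kernel
    return np.exp(-(y - x)**2 / (2*t)) / np.sqrt(2*np.pi*t)

t, x, y, h = 0.8, 0.3, 1.1, 1e-3
dp_dt = (p(t + h, x, y) - p(t - h, x, y)) / (2*h)
d2p_dx2 = (p(t, x + h, y) - 2*p(t, x, y) + p(t, x - h, y)) / h**2
# backward equation (1.10) with a = 1, b = c = 0: dp/dt = (1/2) d2p/dx2
assert abs(dp_dt - 0.5 * d2p_dx2) < 1e-5
```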

1.1.5 Feller Semigroups

In the 1950s, the theory of Markov processes entered a new period of intensive development [59, 60, 98, 147]. The Hille–Yosida theory of semigroups in functional analysis made possible further progress in the study of Markov processes [81, 240]. Kolmogorov's backward and forward equations (1.10) and (1.11) can be formulated in terms of semigroup theory, which we now describe. Let K be a locally compact, separable metric space and B(K) the space of real-valued, bounded Borel measurable functions on K; B(K) is a Banach space with the supremum norm

  ‖f‖ = sup_{x∈K} |f(x)|.


We can associate with each transition function pt on K a family {Tt}t≥0 of linear operators acting on B(K) in the following way:

  Tt f(x) = ∫_K pt(x, dy) f(y) for f ∈ B(K).   (1.12)

Then the operators Tt are non-negative and contractive on B(K):

  f ∈ B(K), 0 ≤ f(x) ≤ 1 on K =⇒ 0 ≤ Tt f(x) ≤ 1 on K.

Furthermore, the Chapman–Kolmogorov equation (1.4) implies that the family {Tt} forms a semigroup:

  Tt+s = Tt · Ts for t, s ≥ 0.

We also have T0 = I, the identity operator. The Hille–Yosida theory of semigroups requires the strong continuity of {Tt}t≥0:

  lim_{t↓0} ‖Tt f − f‖ = 0 for each f ∈ B(K).   (1.13)

That is,

  lim_{t↓0} sup_{x∈K} |∫_K pt(x, dy) f(y) − f(x)| = 0 for each f ∈ B(K).   (1.13′)
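The semigroup property can be observed numerically for the Brownian kernel (1.7): applying Tt after Ts agrees with applying Tt+s. A sketch using simple grid quadrature (the grid, times and test function are arbitrary choices):

```python
import numpy as np

y = np.linspace(-20, 20, 4001)      # quadrature grid; the kernels decay fast
dy = y[1] - y[0]

def T(t, fvals, x):
    # (Tt f)(x) = ∫ p_t(x, dy) f(y) with the Brownian kernel (1.7),
    # where f is given by its values on the grid y
    k = np.exp(-(y - x)**2 / (2*t)) / np.sqrt(2*np.pi*t)
    return np.sum(k * fvals) * dy

f = np.cos(y) * np.exp(-y**2 / 10)          # a test function vanishing at infinity
t, s, x = 0.4, 0.9, 0.7
lhs = T(t + s, f, x)                         # (T_{t+s} f)(x)
Tsf = np.array([T(s, f, yi) for yi in y])    # (Ts f) evaluated on the grid
rhs = T(t, Tsf, x)                           # (Tt (Ts f))(x)
assert abs(lhs - rhs) < 1e-8                 # semigroup property
assert abs(lhs) <= np.max(np.abs(f))         # contraction property
```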

We define the infinitesimal generator A of the semigroup {Tt}t≥0 by the formula

  A f = lim_{t↓0} (Tt f − f)/t,

provided that the limit exists in B(K). Then the Hille–Yosida theory tells us that the semigroup {Tt} can be written as Tt = e^{tA}, with a suitable interpretation of the exponential, and that the infinitesimal generator A completely determines the semigroup {Tt}. The exponential differential equation associated with {Tt},

  (d/dt)(Tt f) = A(Tt f),

is a generalization of Kolmogorov's backward equation (1.10). On the other hand, let M(K) be the space of real Borel measures on K; M(K) is a Banach space with the total variation norm. If μ ∈ M(K), we let


  Ut μ(E) = ∫_K μ(dx) pt(x, E) for E ∈ B.

Then the operators Ut also form a contraction semigroup on M(K). The semigroup {Ut} has the probabilistic interpretation that if μ is the initial probability distribution, then Ut μ may be interpreted as the probability distribution at time t. The differential equation

  (d/dt) ∫_K f(x) Ut μ(dx) = ∫_K A f(x) Ut μ(dx)

is a generalization of Kolmogorov's forward equation (1.11). Although the semigroup {Tt} appears less natural than the semigroup {Ut}, as the further development of the theory has shown, it is the more convenient one from the viewpoint of functional analysis. For technical reasons, we will concentrate on the semigroup {Tt}. If pt is the transition function of a Markov process X, then the infinitesimal generator A of the associated semigroup {Tt}t≥0 is called the infinitesimal generator of the process X. Now, by taking f = χ{x} ∈ B(K) in formula (1.13′), we obtain that

  lim_{t↓0} pt(x, {x}) = 1 for x ∈ K.   (1.14)

However, the Brownian motion transition function (1.7), the most important and interesting example, does not satisfy condition (1.14). Thus we shift our attention to continuous functions, instead of measurable functions. Let C(K) be the space of real-valued, bounded continuous functions on K; C(K) is a Banach space with the supremum (maximum) norm

  ‖f‖ = sup_{x∈K} |f(x)|.

We add a new point ∂ to the locally compact space K as the point at infinity if K is not compact, and as an isolated point if K is compact; so the space K∂ = K ∪ {∂} is compact. We say that a function f ∈ C(K) converges to a ∈ R as x → ∂ if, for each ε > 0, there exists a compact subset E of K such that |f(x) − a| < ε for all x ∈ K \ E, and write

  lim_{x→∂} f(x) = a.

Let C0 (K ) be the subspace of C(K ) which consists of all functions satisfying lim x→∂ f (x) = 0; the space C0 (K ) is a closed subspace of C(K ). We remark that C0 (K ) may be identified with C(K ) if K is compact. Namely, we have the formula


  C0(K) = { f ∈ C(K) : lim_{x→∂} f(x) = 0 } if K is not compact,
  C0(K) = C(K) if K is compact.

Now we introduce a useful convention: Any real-valued function f on K is extended to the compact space K ∂ = K ∪ {∂} by setting f (∂) = 0.

From this viewpoint, the space C0(K) is identified with the subspace of C(K∂) which consists of all functions f satisfying the condition f(∂) = 0. More precisely, we have the following decomposition (see Fig. 12.6):

  C(K∂) = {constant functions} + C0(K).

Furthermore, we extend a transition function pt on K to a transition function p̃t on K∂ as follows:

  p̃t(x, E) = pt(x, E) for x ∈ K and E ∈ B,
  p̃t(x, {∂}) = 1 − pt(x, K) for x ∈ K,
  p̃t(∂, K) = 0, p̃t(∂, {∂}) = 1.

Intuitively, this means that a Markovian particle moves in the space K until it dies, at which time it reaches the point ∂; hence the point ∂ is the terminal point or cemetery. We remark that our convention is consistent, since Tt f(∂) = f(∂) = 0. Now we can introduce two important conditions on the measures pt(x, ·) related to continuity in x ∈ K, for fixed t ≥ 0:

(i) A transition function pt is called a Feller function if the function

  Tt f(x) = ∫_K pt(x, dy) f(y)

is a continuous function of x ∈ K whenever f is bounded and continuous on K. That is, the Feller property is equivalent to saying that the space C(K) is an invariant subspace of B(K) for the operators Tt:

  f ∈ C(K) =⇒ Tt f ∈ C(K).

(ii) We say that pt is a C0-function if the space C0(K) is an invariant subspace of C(K) for the operators Tt:

  f ∈ C0(K) =⇒ Tt f ∈ C0(K).

For example, the transition functions in Examples 1.2–1.6 are all Feller and C0-functions.


The next theorem (Theorems 12.32 and 12.33) states the most important relation between Feller transition functions and semigroups on C(K):

Theorem 1.7 If pt is a Feller transition function on K, then its associated operators {Tt}t≥0, defined by formula (1.12), form a non-negative and contraction semigroup on C(K). Conversely, if {Tt}t≥0 is a non-negative and contraction semigroup on C0(K), then there exists a unique C0-transition function pt on K for which formula (1.12) holds true.

The Feller property deals with continuity of a transition function pt(x, E) in x, and, by itself, is not concerned with continuity in t. Now we give a necessary and sufficient condition on pt(x, E) in order that its associated semigroup {Tt}t≥0 is strongly continuous in t on the space C0(K):

  lim_{s↓0} ‖Tt+s f − Tt f‖ = 0 for f ∈ C0(K).

A Markov transition function pt on K is said to be uniformly stochastically continuous on K if the following condition is satisfied:

(U) For each ε > 0 and each compact E ⊂ K, we have the assertion

  lim_{t↓0} sup_{x∈E} [1 − pt(x, Uε(x))] = 0,

where Uε(x) is an ε-neighborhood of x. For example, the transition functions in Examples 1.2–1.6 are all uniformly stochastically continuous. Then we have the following result (Theorem 12.37):

Theorem 1.8 Let pt be a C0-transition function on K. The associated semigroup {Tt}t≥0 is strongly continuous in t on C0(K) if and only if pt is uniformly stochastically continuous on K and satisfies the condition:

(L) For each s > 0 and compact E ⊂ K, we have the assertion

  lim_{x→∂} sup_{0≤t≤s} pt(x, E) = 0.

Remark 1.9 We remark that condition (L) is trivially satisfied, if the state space K is compact. A strongly continuous, non-negative and contraction semigroup {Tt }t≥0 on the space C0 (K ) is called a Feller semigroup. Therefore, Theorems 1.7 and 1.8 can be summarized as follows (Theorem 12.37): Theorem 1.10 If pt is a uniformly stochastically continuous, C0 -transition function on K and satisfies condition (L), then its associated operators {Tt }t≥0 , defined by


formula (1.12), form a Feller semigroup on K. Conversely, if {Tt}t≥0 is a Feller semigroup on K, then there exists a uniformly stochastically continuous, C0-transition function pt on K, satisfying condition (L), such that formula (1.12) holds true.

The most important applications of Theorem 1.10 are of course in the second statement. Theorem 1.10 can be visualized as in Fig. 1.2.

Fig. 1.2 An overview of Theorem 1.10: a transition function pt(x, ·) with uniform stochastic continuity, the C0-property and condition (L) corresponds to a Feller semigroup {Tt} on C0(K)

1.1.6 Path Functions of Markov Processes

It is naturally interesting and important to ask the following question: Given a Markov transition function pt, under which conditions on pt does there exist a Markov process with transition function pt whose paths are almost surely continuous? A Markov process X = (xt, F, Ft, Px) is said to be right-continuous provided that we have, for each x ∈ K,

  Px{ω ∈ Ω : the mapping t → xt(ω) is a right-continuous function from [0, ∞) into K∂} = 1.

Furthermore, we say that X is continuous provided that we have, for each x ∈ K,

  Px{ω ∈ Ω : the mapping t → xt(ω) is a continuous function from [0, ζ) into K} = 1.

Here ζ is the lifetime of the process X. Now we give some useful criteria for path-continuity in terms of transition functions (Theorem 12.21):

Theorem 1.11 (Seregin–Kinney–Dynkin) Let K be a locally compact, separable metric space and pt a normal transition function on K.

(i) Assume that the following conditions are satisfied:


(L) For each s > 0 and each compact E ⊂ K, we have the assertion

  lim_{x→∂} sup_{0≤t≤s} pt(x, E) = 0.

(M) For each ε > 0 and each compact E ⊂ K, we have the assertion

  lim_{t↓0} sup_{x∈E} pt(x, K \ Uε(x)) = 0.

Then there exists a right-continuous Markov process X with transition function pt.

(ii) Assume that condition (L) and the following condition (replacing condition (M)) are satisfied:

(N) For each ε > 0 and each compact E ⊂ K, we have the assertion

  lim_{t↓0} (1/t) sup_{x∈E} pt(x, K \ Uε(x)) = 0.

Then there exists a continuous Markov process X with transition function pt . Remark 1.12 It should be noticed that every uniformly stochastically continuous transition function pt is normal and satisfies condition (M) in Theorem 1.11, that is, condition (U) implies condition (M). For example, the Poisson process (Example 1.3) and the Cauchy process (Example 1.6) are right-continuous Markov processes; uniform motion (Example 1.2), Brownian motion (Example 1.4) and Brownian motion with constant drift (Example 1.5) are continuous Markov processes.
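The distinction in these examples can be made quantitative. For the Gaussian kernel (1.7) the mass outside an ε-neighborhood is o(t) as t ↓ 0, which is condition (N); for the Cauchy kernel (1.9) this tail mass is exactly of order t, which is why the Cauchy process moves by jumps. A quick check (ε and t chosen arbitrarily):

```python
from math import erf, atan, sqrt, pi

eps, t = 0.5, 1e-4

# tail mass p_t(x, R \ (x - eps, x + eps)) for the Brownian kernel (1.7) ...
brownian_tail = 1 - erf(eps / sqrt(2*t))
# ... and for the Cauchy kernel (1.9)
cauchy_tail = 1 - (2/pi) * atan(eps / t)

# Condition (N) asks that tail/t -> 0 as t -> 0: this holds for Brownian
# motion, but fails for the Cauchy process, where tail/t -> 2/(pi*eps) > 0.
assert brownian_tail / t < 1e-6
assert abs(cauchy_tail / t - 2/(pi*eps)) < 0.01
```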

1.1.7 Strong Markov Processes A Markov process is called a strong Markov process if the “starting afresh” property holds true not only for every fixed moment but also for suitable random times. For the precise definition of this “strong” Markov property, see Sect. 12.1.7. We state a useful criterion for the strong Markov property (Theorem 12.27): Theorem 1.13 (Dynkin) Any right-continuous Markov process whose transition function has the C0 -property is a strong Markov process. By combining Theorem 1.13 with the criterion for path-continuity in Sect. 1.1.6, we have the following simple criterion in terms of transition functions (Theorem 12.29):


Fig. 1.3 An overview of Theorems 1.13 and 1.14: uniform stochastic continuity + condition (L) yields a right-continuous Markov process, and the C0-property then yields a strong Markov process

Theorem 1.14 If a uniformly stochastically continuous C0-transition function satisfies condition (L), then it is the transition function of some strong Markov process whose paths are right-continuous and have no discontinuities other than jumps.

For example, the transition functions in Examples 1.2–1.6 correspond to strong Markov processes. Theorems 1.13 and 1.14 can be visualized as in Fig. 1.3. A continuous strong Markov process is called a diffusion process. The next theorem states a sufficient condition for the existence of a diffusion process with a prescribed transition function (Theorem 12.30):

Theorem 1.15 Any uniformly stochastically continuous C0-transition function which satisfies conditions (L) and (N) is the transition function of some diffusion process.

For example, the transition functions in Example 1.2 (uniform motion) and Examples 1.4 and 1.5 (Brownian motion) correspond to diffusion processes. Here are two more examples of diffusion processes on the half line [0, ∞) in which we must take account of the effect of the boundary point 0:

Example 1.16 (reflecting barrier Brownian motion) If t > 0, x ∈ [0, ∞) and E ∈ B, we let

  pt(x, E) = (1/√(2πt)) [∫_E exp(−(y − x)²/(2t)) dy + ∫_E exp(−(y + x)²/(2t)) dy],   (1.15)

and p0 (x, E) = χ E (x). This represents Brownian motion with a reflecting barrier at x = 0; the process may be represented as {|xt |}, where {xt } is Brownian motion on R. Example 1.17 (sticking barrier Brownian motion) If t > 0, x ∈ [0, ∞) and E ∈ B, we let


  pt(x, E) = (1/√(2πt)) [∫_E exp(−(y − x)²/(2t)) dy − ∫_E exp(−(y + x)²/(2t)) dy]
           + (1 − (1/√(2πt)) ∫_{−x}^{x} exp(−z²/(2t)) dz) χE(0),   (1.16)

and p0(x, E) = χE(x). This represents Brownian motion with a sticking barrier at x = 0; when a Brownian particle reaches x = 0 for the first time, it sticks there forever. It is easy to verify that the transition functions (1.15) and (1.16) are uniformly stochastically continuous C0-transition functions which satisfy conditions (L) and (N).
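The kernel (1.15) can itself be checked numerically: it is conservative (pt(x, [0, ∞)) = 1), and it agrees with the law of |xt| for Brownian motion {xt} started at x. A sketch (the parameters t, x and the test interval are arbitrary):

```python
import numpy as np
from math import erf, sqrt

t, x = 0.5, 1.3
y = np.linspace(0, 30, 30001)
dy = y[1] - y[0]
# reflecting barrier density (1.15)
dens = (np.exp(-(y - x)**2/(2*t)) + np.exp(-(y + x)**2/(2*t))) / np.sqrt(2*np.pi*t)

# total mass: p_t(x, [0, inf)) = 1 (no killing at the boundary)
mass = np.sum(dens) * dy
assert abs(mass - 1) < 1e-3

# p_t(x, [0, a]) should equal P(|x + sqrt(t) Z| <= a) for standard normal Z
a = 1.0
num = np.sum(dens[y <= a]) * dy
Phi = lambda u: 0.5 * (1 + erf(u / sqrt(2)))
exact = Phi((a - x)/sqrt(t)) - Phi((-a - x)/sqrt(t))
assert abs(num - exact) < 2e-3
```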

1.1.8 Infinitesimal Generators of Feller Semigroups

Now we return to the consideration of Feller semigroups. Let K be a locally compact, separable metric space and C0(K) the space of continuous functions on K vanishing at the point at infinity ∂. If pt is a uniformly stochastically continuous C0-transition function on K, then its associated operators

  Tt f(x) = ∫_K pt(x, dy) f(y) for f ∈ C0(K),

form a Feller semigroup on K. Recall that the infinitesimal generator A of the semigroup {Tt}t≥0 is defined by the formula

  A f = lim_{t↓0} (Tt f − f)/t,   (1.17)

provided that the limit exists in C0(K). The domain D(A) of A consists of all functions f ∈ C0(K) for which the limit (1.17) exists. First, we write down explicitly the infinitesimal generators of Feller semigroups associated with the transition functions in Examples 1.2–1.17.

Example 1.18 (uniform motion) K = R.

  D(A) = { f ∈ C0(K) ∩ C¹(K) : f′ ∈ C0(K) },
  A f = v f′ for f ∈ D(A).

Example 1.19 (Poisson process) K = R.




  D(A) = C0(K),
  A f(x) = λ( f(x + 1) − f(x) ) for f ∈ D(A).
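This generator can be recovered numerically from the kernel (1.6): for small t, (Tt f(x) − f(x))/t is close to λ(f(x + 1) − f(x)). A quick sketch (the test function, λ and t are arbitrary choices):

```python
from math import exp, factorial

lam, t, x = 2.0, 1e-6, 0.3
f = lambda u: 1.0 / (1.0 + u*u)          # a test function in C0(R)

# Tt f(x) = e^{-lam t} * sum_n (lam t)^n / n! * f(x + n), by kernel (1.6)
Ttf = sum(exp(-lam*t) * (lam*t)**n / factorial(n) * f(x + n) for n in range(30))
gen_approx = (Ttf - f(x)) / t
gen_exact = lam * (f(x + 1) - f(x))
assert abs(gen_approx - gen_exact) < 1e-4
```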

We remark that the operator A is not "local"; the value A f(x) depends on the values f(x) and f(x + 1). This reflects the fact that the Poisson process changes state by jumps.

Example 1.20 (Brownian motion) K = R.

  D(A) = { f ∈ C0(K) ∩ C²(K) : f′ ∈ C0(K), f″ ∈ C0(K) },
  A f = (1/2) f″ for f ∈ D(A).

The operator A is "local", that is, the value A f(x) is determined by the values of f in an arbitrarily small neighborhood of x. This reflects the fact that Brownian motion changes state by continuous motion.

Example 1.21 (Brownian motion with constant drift) K = R.

  D(A) = { f ∈ C0(K) ∩ C²(K) : f′ ∈ C0(K), f″ ∈ C0(K) },
  A f = (1/2) f″ + m f′ for f ∈ D(A).

Example 1.22 (Cauchy process) K = R. The domain D(A) contains the C² functions on K with compact support, and the infinitesimal generator A takes the form

  A f(x) = (1/π) ∫_0^∞ [ f(x + y) + f(x − y) − 2 f(x) ] dy/y²
         = (1/π) ∫_0^∞ ( ∫_0^1 [ f″(x + ty) + f″(x − ty) ] (1 − t) dt ) dy.

Example 1.23 (reflecting barrier Brownian motion) K = [0, ∞).

  D(A) = { f ∈ C0(K) ∩ C²(K) : f′ ∈ C0(K), f″ ∈ C0(K), f′(0) = 0 },
  A f = (1/2) f″ for f ∈ D(A).

Example 1.24 (sticking barrier Brownian motion) K = [0, ∞).

  D(A) = { f ∈ C0(K) ∩ C²(K) : f′ ∈ C0(K), f″ ∈ C0(K), f″(0) = 0 },
  A f = (1/2) f″ for f ∈ D(A).

Here are two more examples where it is difficult to begin with a transition function and the infinitesimal generator is the basic tool of describing the process.


Example 1.25 (sticky barrier Brownian motion) K = [0, ∞).

  D(A) = { f ∈ C0(K) ∩ C²(K) : f′ ∈ C0(K), f″ ∈ C0(K), f′(0) − α f″(0) = 0 },
  A f = (1/2) f″ for f ∈ D(A).

Here α is a positive constant. This process may be thought of as a "combination" of the reflecting and sticking Brownian motions. The reflecting and sticking cases are obtained by letting α → 0 and α → ∞, respectively.

Example 1.26 (absorbing barrier Brownian motion) K = [0, ∞), where the boundary point 0 is identified with the point at infinity ∂.

  D(A) = { f ∈ C0(K) ∩ C²(K) : f′ ∈ C0(K), f″ ∈ C0(K), f(0) = 0 },
  A f = (1/2) f″ for f ∈ D(A).

This represents Brownian motion with an absorbing barrier at x = 0; a Brownian particle "dies" at the first moment when it hits the boundary x = 0. It is worth pointing out here that a strong Markov process cannot stay at a single position for a positive length of time and then leave that position by continuous motion; it must either jump away or leave instantaneously. We give a simple example of a strong Markov process which changes state not by continuous motion but by jumps when the motion reaches the boundary.

Example 1.27 K = [0, ∞). D(A) =



  { f ∈ C0(K) ∩ C²(K) : f′ ∈ C0(K), f″ ∈ C0(K),
    f″(0) = 2c ∫_0^∞ ( f(y) − f(0) ) dF(y) },

and

  A f = (1/2) f″ for f ∈ D(A).

Here c is a positive constant and F is a distribution function on (0, ∞). This process may be interpreted as follows: When a Brownian particle reaches the boundary x = 0, it stays there for a positive length of time and then jumps back to a random point, chosen according to the distribution F, in the interior (0, ∞). The constant c is the parameter in the "waiting time" distribution at the boundary x = 0. We remark that the boundary condition


  f″(0) = 2c ∫_0^∞ ( f(y) − f(0) ) dF(y)

depends on the values of f far away from the boundary x = 0, unlike the boundary conditions in Examples 1.23–1.26.
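Several of the generators above can be observed numerically from the defining limit (1.17). For Brownian motion (Example 1.20), (Tt f − f)/t approaches (1/2)f″ as t ↓ 0. A sketch with the test function f(x) = e^{−x²/2} (all parameter choices are arbitrary):

```python
import numpy as np

y = np.linspace(-20, 20, 4001)
dy = y[1] - y[0]
f = lambda u: np.exp(-u**2 / 2)          # test function; f'' = (u^2 - 1) e^{-u^2/2}

def Ttf(t, x):
    # (Tt f)(x) under the Brownian kernel (1.7), by grid quadrature
    k = np.exp(-(y - x)**2 / (2*t)) / np.sqrt(2*np.pi*t)
    return np.sum(k * f(y)) * dy

x, t = 0.4, 1e-3
gen_approx = (Ttf(t, x) - f(x)) / t
half_fpp = 0.5 * (x**2 - 1) * np.exp(-x**2 / 2)   # (1/2) f''(x)
assert abs(gen_approx - half_fpp) < 1e-3
```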

1.1.9 One-Dimensional Diffusion Processes

A Markov process is said to be one-dimensional or multidimensional according to whether the state space is a subset of R or R^n for n ≥ 2. In the early 1950s, W. Feller characterized completely the analytic structure of one-dimensional diffusion processes; he gave an intrinsic representation of the infinitesimal generator A of a one-dimensional diffusion process and determined all possible boundary conditions which describe the domain D(A) [59, 60]. The probabilistic meaning of Feller's work was clarified by Dynkin [45, 46], Itô and McKean, Jr. [95], Ray [147] and others. One-dimensional diffusion processes have been comprehensively studied both from analytic and probabilistic viewpoints. Now we take a close look at Feller's work [59, 60].

Let X = (xt, F, Ft, Px) be a one-dimensional Markov process with state space K. A point x of K is called a right (resp. left) singular point if xt(ω) ≥ x (resp. xt(ω) ≤ x) for all t ∈ [0, ζ(ω)) with Px-measure one. A point which is both right and left singular is called a trap. For example, the point at infinity ∂ is a trap. A point which is neither right nor left singular is called a regular point. For simplicity, we assume that the state space K is the half line K = [0, ∞) and that all its interior points are regular. Feller proved that there exist a strictly increasing, continuous function s on (0, ∞) and Borel measures m and k on (0, ∞) such that the infinitesimal generator A of the process X can be expressed as

  A f(x) = lim_{y↓x} [ f⁺(y) − f⁺(x) − ∫_{(x,y]} f(z) dk(z) ] / m((x, y]).   (1.18)

Here:

(1) f⁺(x) = lim_{ε↓0} [ f(x + ε) − f(x) ] / [ s(x + ε) − s(x) ] is the right-derivative of f at x with respect to s.
(2) The measure m is positive for non-empty open subsets, and is finite for compact sets.
(3) The measure k is finite for compact subsets.

The function s is called a canonical scale, and the measures m and k are called a canonical measure (or speed measure) and a killing measure for the process X ,


respectively. They determine the behavior of a Markovian particle in the interior of the state space K. We remark that the right-hand side of formula (1.18) is a generalization of the second order differential operator

  a(x) f″ + b(x) f′ + c(x) f,

where a > 0 and c ≤ 0 on K. For example, the formula A f = a(x) f″ + b(x) f′ can be written in the form (1.18) if we take

  s(x) = ∫_0^x exp( −∫_0^y b(z)/a(z) dz ) dy,
  dm(x) = (1/a(x)) exp( ∫_0^x b(y)/a(y) dy ) dx,
  dk(x) = 0.
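The generalized second derivative (d/dm)(df/ds) can be checked against a f″ + b f′ for a concrete choice. Take a(x) = 1 and b(x) = −x (an Ornstein–Uhlenbeck-type drift, chosen purely for illustration), so that s′(x) = e^{x²/2} and the speed density is m′(x) = e^{−x²/2}. A sketch using nested difference quotients:

```python
import numpy as np

sp = lambda u: np.exp(u**2 / 2)     # s'(u) = exp(-int_0^u b/a), with a = 1, b = -z
mp = lambda u: np.exp(-u**2 / 2)    # speed density m'(u) = (1/a) exp(int_0^u b/a)
f = np.sin

x, h = 0.7, 1e-3
dfds = lambda u: (f(u + h) - f(u - h)) / (2*h) / sp(u)    # df/ds at u
gen = (dfds(x + h) - dfds(x - h)) / (2*h) / mp(x)         # (d/dm)(df/ds) at x
expected = -np.sin(x) - x * np.cos(x)                     # f'' + b f' = -sin - x cos
assert abs(gen - expected) < 1e-4
```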

The boundary point 0 is called a regular boundary if it satisfies the following condition: For an arbitrary point r ∈ (0, ∞), we have the assertions

  ∫_{(0,r)} [ s(r) − s(x) ] [ dm(x) + dk(x) ] < ∞,
  ∫_{(0,r)} [ m((x, r)) + k((x, r)) ] ds(x) < ∞.

It can be shown that this notion is independent of the point r used. Intuitively, the regularity of the boundary point means that a Markovian particle approaches the boundary in finite time with positive probability, and also enters the interior from the boundary. The behavior of a Markovian particle at the boundary point is characterized by boundary conditions. In the case of regular boundary points, Feller determined all possible boundary conditions which are satisfied by the functions f in the domain D(A) of A. A general boundary condition is of the form

  γ f(0) − δ A f(0) + μ f⁺(0) = 0,   (1.19)

where γ, δ and μ are constants such that γ ≤ 0, δ ≥ 0, μ ≥ 0 and μ + δ > 0. If we admit jumps from the boundary into the interior, then a general boundary condition takes the form


  γ f(0) − δ A f(0) + μ f⁺(0) + ∫_{(0,∞)} [ f(x) − f(0) ] dν(x) = 0,   (1.20)

where ν is a Borel measure with respect to which the function min(1, s(x) − s(+0)) is integrable. We remark that boundary condition (1.20) is a "combination" of the boundary conditions in Examples 1.23–1.27 if we take s(x) = x, dm(x) = 2 dx and dk(x) = 0.

Fig. 1.4 A bounded domain D with smooth boundary ∂D and the unit inward normal n to ∂D

1.1.10 Multidimensional Diffusion Processes

The main purpose of this book is to generalize Feller's work to the multidimensional case. In 1959, A. D. Ventcel' (Wentzell) studied the problem of determining all possible boundary conditions for multidimensional diffusion processes [236], which we now describe. Let D be a bounded domain in R^N with smooth boundary ∂D, and let C(D̄) be the space of real-valued continuous functions on D̄ = D ∪ ∂D (see Fig. 1.4 above). A Feller semigroup on D̄ is a strongly continuous, non-negative and contraction semigroup {Tt}t≥0 on C(D̄). Theorems 1.10 and 1.14 tell us that there corresponds to a Feller semigroup {Tt}t≥0 on D̄ a strong Markov process X on D̄ whose transition function pt(x, dy) satisfies the condition

  Tt f(x) = ∫_{D̄} pt(x, dy) f(y) for f ∈ C(D̄).

Under certain continuity hypotheses concerning pt(x, dy), such as condition (N) in Sect. 1.1.6, Ventcel' [236] showed that the infinitesimal generator A of {Tt} is described analytically as follows (Theorems 12.55 and 12.57):

(i) Let x be a point of the interior D of the state space D̄. For all u ∈ D(A) ∩ C²(D̄), we have the formula

  Au(x) = Σ_{i,j=1}^{N} a^{ij}(x) ∂²u/∂xi∂xj (x) + Σ_{i=1}^{N} b^i(x) ∂u/∂xi (x) + c(x) u(x),


Fig. 1.5 A local coordinate system x = (x′, xN) = (x1, x2, . . ., xN−1, xN) near the boundary ∂D, with D = {xN > 0} and ∂D = {xN = 0}

where the matrix (a^{ij}(x)) is positive semi-definite and c(x) ≤ 0.

(ii) Let x′ be a point of the boundary ∂D of the state space D̄, and choose a local coordinate system

  x = (x′, xN) = (x1, x2, . . ., xN−1, xN)

in a neighborhood of x′ such that x ∈ D if xN > 0 and x ∈ ∂D if xN = 0 (see Fig. 1.5). Then every function u ∈ D(A) ∩ C²(D̄) satisfies the following boundary condition:

  Lu(x′) = Σ_{i,j=1}^{N−1} α^{ij}(x′) ∂²u/∂xi∂xj (x′) + Σ_{i=1}^{N−1} β^i(x′) ∂u/∂xi (x′)
         + γ(x′) u(x′) + μ(x′) ∂u/∂n (x′) − δ(x′) Au(x′) = 0.

Here the matrix (α^{ij}(x′)) is positive semi-definite, γ(x′) ≤ 0, μ(x′) ≥ 0 and δ(x′) ≥ 0 on ∂D, and

  ∂u/∂n = ∂u/∂xN on ∂D.

The boundary condition L is called a second order Ventcel' boundary condition. Probabilistically, the above result may be interpreted as follows: A Markovian particle of the diffusion process X on D̄ is governed by a degenerate elliptic differential operator A of second order in the interior D of the state space D̄, and it obeys a Ventcel' boundary condition L on the boundary ∂D of D̄. The four terms

  Σ_{i,j=1}^{N−1} α^{ij}(x′) ∂²u/∂xi∂xj (x′) + Σ_{i=1}^{N−1} β^i(x′) ∂u/∂xi (x′),
  γ(x′) u(x′),  μ(x′) ∂u/∂n (x′),  δ(x′) Au(x′)


Fig. 1.6 Absorption and reflection phenomena

Fig. 1.7 Diffusion along ∂D and the viscosity phenomenon

of L are supposed to correspond to a diffusion along the boundary, an absorption phenomenon, a reflection phenomenon and a viscosity phenomenon, respectively (see Figs. 1.6 and 1.7). Analytically, via the Hille–Yosida theory of semigroups, it may be interpreted as follows: A Feller semigroup {Tt}t≥0 on D̄ is described by a degenerate elliptic differential operator A of second order and a Ventcel' boundary condition L if the paths of its corresponding strong Markov process X are continuous. We are thus reduced to the study of (non-)elliptic boundary value problems for (A, L) in the theory of partial differential equations. In this book we shall consider, conversely, the following problem:

Problem 1.28 Construct a Feller semigroup {Tt}t≥0 on D̄ with prescribed analytic data (A, L).

Table 1.3 below gives typical examples of multidimensional diffusion processes and elliptic boundary value problems and how these relate to each other (see [209, Chap. 15]):


Table 1.3 Typical examples of multidimensional diffusion processes and elliptic boundary value problems

  Diffusion process (microscopic approach)                  Elliptic boundary value problem (mesoscopic approach)
  Reflecting barrier Brownian motion                        A = Δ, L_N = Neumann condition
  Reflecting and absorbing barrier Brownian motion          A = Δ, L_R = Robin condition
  Reflecting, absorbing and drift barrier Brownian motion   A = Δ, L_O = Oblique derivative condition
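The first row of Table 1.3 can be illustrated numerically: a random walk on [0, 1] whose excursions past an endpoint are folded back into the interval is a discrete stand-in for reflecting barrier Brownian motion (the Neumann case). This is a minimal sketch; the step size and the mirror-folding rule are illustrative choices, not taken from the text.

```python
import random

def reflecting_walk(x0, n_steps, dt=1e-3, seed=0):
    """Euler scheme for Brownian motion on [0, 1] with reflecting
    barriers at both endpoints: an excursion past an endpoint is
    folded back into the interval (mirror reflection)."""
    rng = random.Random(seed)
    x = x0
    path = [x]
    for _ in range(n_steps):
        x += rng.gauss(0.0, dt ** 0.5)
        # Fold back into [0, 1]; loop in case of an unusually large step.
        while x < 0.0 or x > 1.0:
            x = -x if x < 0.0 else 2.0 - x
        path.append(x)
    return path

path = reflecting_walk(0.5, 10_000)
```

The folded path never leaves [0, 1], which is the probabilistic content of the reflecting (Neumann) barrier.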

1.2 Propagation of Maxima

Now we pause from our main development in order to study an intimate connection between Markov processes and partial differential equations. This will play an important role in the study of Markov processes in terms of partial differential equations [177, 178]. We begin with the following elementary result:

Let I be an open interval of R. If u ∈ C²(I), d²u/dx²(x) ≥ 0 in I and u takes its maximum at a point of I, then u(x) is a constant.

This result can be extended to the N-dimensional case, with the operator d²/dx² replaced by the Laplacian Δ = Σ_{i=1}^N ∂²/∂x_i²:

Theorem 1.29 (the strong maximum principle) Let D be a connected open subset of R^N. If u ∈ C²(D), Δu(x) ≥ 0 in D and u takes its maximum at a point of D, then u(x) is a constant.

Theorem 1.29 is well known by the name of the strong maximum principle for the Laplacian. Now we study the underlying analytical mechanism of propagation of maxima for degenerate elliptic differential operators of second order, which will reveal an intimate connection between partial differential equations and Markov processes. Let A be a second order differential operator with real coefficients such that

  A = Σ_{i,j=1}^N a^{ij}(x) ∂²/∂x_i∂x_j + Σ_{i=1}^N b^i(x) ∂/∂x_i,   (1.21)

where the coefficients a^{ij}, b^i satisfy the following two conditions:

(1) The a^{ij}(x) are C² functions on R^N all of whose derivatives of order ≤ 2 are bounded in R^N, a^{ij}(x) = a^{ji}(x) for all x ∈ R^N and all 1 ≤ i, j ≤ N, and satisfy the degenerate elliptic condition

  Σ_{i,j=1}^N a^{ij}(x) ξ_i ξ_j ≥ 0 for all (x, ξ) ∈ T*(R^N) = R^N × R^N.   (1.22)

Here T*(R^N) is the cotangent bundle of R^N.
(2) The b^i(x) are C¹ functions on R^N with bounded derivatives in R^N for 1 ≤ i ≤ N.

We consider the following:

Problem 1.30 Let D be a connected open subset of R^N and x a point of D. Then determine the largest connected, relatively closed subset D(x) of D, containing x, such that: if

  u ∈ C²(D), Au ≥ 0 in D, sup_D u = M < +∞   (1.23)

and u(x) = M, then u ≡ M throughout D(x).

The set D(x) is called the propagation set of x in D. We now give a coordinate-free description of the set D(x) in terms of subunit vectors, a notion introduced by Fefferman–Phong [57]. A tangent vector

  X = Σ_{j=1}^N γ^j ∂/∂x_j ∈ T_x(D)

at x ∈ D is said to be subunit for the operator

  A_0 = Σ_{i,j=1}^N a^{ij}(x) ∂²/∂x_i∂x_j

if it satisfies the condition

  (Σ_{j=1}^N γ^j η_j)² ≤ Σ_{i,j=1}^N a^{ij}(x) η_i η_j for x ∈ D and η = Σ_{j=1}^N η_j dx_j ∈ T_x*(D),

where T_x*(D) is the cotangent space of D at x. Note that this notion is coordinate-free. So we rotate the coordinate axes so that the matrix (a^{ij}) is diagonalized at x:

  (a^{ij}(x)) = (λ_i δ_{ij}), λ_1 > 0, …, λ_r > 0, λ_{r+1} = … = λ_N = 0.

Here r = rank(a^{ij}(x)). Then it is easy to see that the vector X is subunit for A_0 if and only if it is contained in the following ellipsoid of dimension r:

  (γ¹)²/λ_1 + … + (γ^r)²/λ_r ≤ 1, γ^{r+1} = … = γ^N = 0.   (1.24)
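The equivalence between the defining inequality of a subunit vector and the ellipsoid description (1.24) can be checked numerically in the diagonalized coordinates. A small sketch; the sample eigenvalues (λ_i δ_{ij}) and test vectors are illustrative choices.

```python
import random

def ellipsoid_test(gamma, lam, tol=1e-12):
    """Coordinate form (1.24): gamma is subunit for the diagonalized
    A0 = sum_j lam_j d^2/dx_j^2 iff sum_{lam_j > 0} gamma_j^2 / lam_j <= 1
    and gamma_j = 0 wherever lam_j = 0."""
    s = 0.0
    for g, l in zip(gamma, lam):
        if l > 0:
            s += g * g / l
        elif abs(g) > tol:
            return False
    return s <= 1.0 + tol

def quadratic_test(gamma, lam, trials=2000, seed=1):
    """Defining inequality (gamma . eta)^2 <= sum_j lam_j eta_j^2,
    sampled over random covectors eta."""
    rng = random.Random(seed)
    for _ in range(trials):
        eta = [rng.uniform(-1, 1) for _ in lam]
        lhs = sum(g * e for g, e in zip(gamma, eta)) ** 2
        rhs = sum(l * e * e for l, e in zip(lam, eta))
        if lhs > rhs + 1e-9:
            return False
    return True

lam = [4.0, 1.0, 0.0]       # rank r = 2; the third direction is degenerate
inside = [1.0, 0.5, 0.0]    # 1/4 + 1/4 <= 1: subunit
outside = [0.0, 0.0, 0.3]   # points into the degenerate direction: not subunit
```

The vector `outside` violates the defining inequality already at η = dx₃, matching the requirement γ^{r+1} = … = γ^N = 0 in (1.24).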


A subunit trajectory is a Lipschitz path γ : [t₁, t₂] → D such that the tangent vector γ̇(t) = d/dt(γ(t)) is subunit for the operator A_0 at γ(t) for almost every t. We remark that if γ̇(t) is subunit for A_0, so is −γ̇(t); hence subunit trajectories are not oriented. We let

  X_0(x) = Σ_{i=1}^N (b^i(x) − Σ_{j=1}^N ∂a^{ij}/∂x_j(x)) ∂/∂x_i.

The vector field X_0(x) is called the drift vector field in probability theory, while it is the so-called subprincipal part of the operator A in terms of the theory of partial differential equations. A drift trajectory is a curve θ : [t₁, t₂] → D such that

  θ̇(t) = X_0(θ(t)) on [t₁, t₂],

and this curve is oriented in the direction of increasing t. Now we can state our main result for the strong maximum principle (Theorem 10.14):

Theorem 1.31 (the strong maximum principle) Let the differential operator A of the form (1.21) satisfy the degenerate elliptic condition (1.22). Then the propagation set D(x) of x in D contains the closure D′(x) in D of all points y ∈ D that can be joined to x by a finite number of subunit and drift trajectories.

This result tells us that if the matrix (a^{ij}) is non-degenerate at x, that is, if r = rank(a^{ij}(x)) = N, then the maximum propagates in an open neighborhood of x; but if the matrix (a^{ij}) is degenerate at x, then the maximum propagates only in a "thin" ellipsoid of dimension r (see formula (1.24)) and in the direction of X_0. Now we see the reason why the strong maximum principle (Theorem 1.29) holds true for the Laplacian.

We consider four simple examples in the case where D is the square (−1, 1) × (−1, 1) (N = 2) (see Figs. 1.8 and 1.9).

Example 1.32 A₁ = ∂²/∂x² + x² ∂²/∂y². The subunit vector fields for A₁ are generated by

  ∂/∂x, x ∂/∂y.

Hence we have the assertion (see Fig. 1.8)


Fig. 1.8 The subunit vector fields in Examples 1.32 and 1.33

  The set D′((x, y)) is equal to D for every (x, y) ∈ D.   (1.25)

That is, the strong maximum principle (Theorem 1.29) remains valid for the operator A₁.

Example 1.33 A₂ = x² ∂²/∂x² + ∂²/∂y². The subunit vector fields for A₂ are generated by the following:

  x ∂/∂x, ∂/∂y.   (1.26)

Thus we have the assertion (see Fig. 1.8)

  D′((x, y)) = [0, 1) × (−1, 1)  if x > 0,
               {0} × (−1, 1)     if x = 0,
               (−1, 0] × (−1, 1) if x < 0.

It can be shown that the operator A₂ does not have property (1.23) in some weak sense.

Example 1.34 A₃ = x² ∂²/∂x² + ∂²/∂y² + y ∂/∂x. The subunit vector fields for A₃⁰ = A₂ are generated by formula (1.26), and the drift vector field is given by the formula

  (y − 2x) ∂/∂x.

Thus, by virtue of the drift vector field (see Fig. 1.9 below), we have assertion (1.25), and so the strong maximum principle (Theorem 1.29) remains valid for the operator A₃.


Fig. 1.9 The subunit and drift vector fields in Examples 1.34 and 1.35

Example 1.35 A₄ = x² ∂²/∂x² + ∂²/∂y² + ∂/∂x. The subunit vector fields for A₄⁰ = A₂ are generated by formula (1.26) (see Fig. 1.8), and the drift vector field is given by the formula

  (1 − 2x) ∂/∂x.

Hence we have the assertion (see Fig. 1.9)

  D′((x, y)) = D                 if x < 0,
               [0, 1) × (−1, 1)  if x ≥ 0.

It can also be shown that the operator A₄ does not have the property (1.25) in some weak sense.

It is worth pointing out here that the propagation set D′(x) coincides with the support of the Markov process corresponding to the operator A, which is the closure of the collection of all possible trajectories of a Markovian particle, starting at x, with generator A. In the case where the operator A is written as the sum of squares of vector fields, we can give another (equivalent) description of the set D′(x) (see Bony [21, Théorème 2.1] and Hill [80, Theorem 2]). Now we assume that the operator A is of the form

  A = Σ_{k=1}^r Y_k² + Y_0,   (1.27)

where the Y_k are real C² vector fields on R^N and Y_0 is a real C¹ vector field on R^N. Hill's diffusion trajectory is a curve


  β : [t₁, t₂] → D

such that

  β̇(t) = Y_k(β(t)) ≠ 0 on [t₁, t₂]

for some 1 ≤ k ≤ r. Hill's diffusion trajectories are not oriented; they may be traversed in either direction. Hill's drift trajectory is a curve η : [t₁, t₂] → D such that

  η̇(t) = Y_0(η(t)) ≠ 0 on [t₁, t₂],

but they are oriented in the direction of increasing t. In this case, as a byproduct of Theorem 1.31 we can prove that our propagation set D′(x) coincides with that of Hill [80, Theorem 1]. Namely, our main Theorem 1.31 can be restated as follows (Theorem 10.17):

Theorem 1.36 Assume that the differential operator A of the form (1.21) can be written as the sum (1.27) of squares of vector fields. Then the propagation set D′(x) coincides with the closure in D of all points y ∈ D that can be joined to x by a finite number of Hill's diffusion and drift trajectories.

Hill's result is completely proved and extended to the non-linear case by Redheffer [148, Theorem 2] (see also Bony [21, Théorème 2.1]). Furthermore, Theorem 1.31 may be reformulated in various ways. For example, we have the following result (Theorem 10.19):

Theorem 1.37 (strong maximum principle) Let the differential operator A of the form (1.21) satisfy the degenerate elliptic condition (1.22) and let c(x) be a continuous function on D such that c(x) ≤ 0 in D. If u ∈ C²(D), (A + c(x))u ≥ 0 in D and if u attains its positive maximum M at a point x of D, then u ≡ M throughout D′(x).
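A sum-of-squares form (1.27) for the operator A₂ of Example 1.33 is, for instance, obtained from the identity x² ∂²/∂x² = (x ∂/∂x)² − x ∂/∂x, so that A₂ = Y₁² + Y₂² + Y₀ with Y₁ = x ∂/∂x, Y₂ = ∂/∂y and Y₀ = −x ∂/∂x. This particular decomposition is one convenient choice, not taken from the text; a finite-difference check on a smooth test function (also an illustrative choice):

```python
import math

H = 1e-4  # finite-difference step

def dx(f):
    return lambda x, y: (f(x + H, y) - f(x - H, y)) / (2 * H)

def dy(f):
    return lambda x, y: (f(x, y + H) - f(x, y - H)) / (2 * H)

def Y1(f):  # Y1 = x d/dx
    g = dx(f)
    return lambda x, y: x * g(x, y)

u = lambda x, y: math.sin(x) * math.exp(y)  # smooth test function

def A2u(x, y):
    """x^2 u_xx + u_yy, computed from the exact derivatives of u."""
    return (1.0 - x * x) * math.sin(x) * math.exp(y)

def sum_of_squares(x, y):
    """(Y1^2 + Y2^2 + Y0) u with Y0 = -Y1."""
    return Y1(Y1(u))(x, y) + dy(dy(u))(x, y) - Y1(u)(x, y)
```

The identity behind the check is Y₁²u − Y₁u = x u_x + x² u_xx − x u_x = x² u_xx.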

1.3 Construction of Feller Semigroups

Now we return to the problem of construction of Feller semigroups. Following Sato–Ueno [156] and Bony–Courrège–Priouret [22], we give general existence theorems for Feller semigroups in terms of boundary value problems. Our approach to the construction of Feller semigroups is essentially based on the a priori estimates stated in Chap. 9 and the maximum principle discussed in Chap. 10, respectively.

In Chap. 9 we define pseudo-differential operators and study their basic properties such as the behavior of transposes, adjoints and compositions of such operators, and the effect of a change of coordinates on such operators. Furthermore, we discuss in


detail, via functional analysis, the behavior of elliptic pseudo-differential operators on Sobolev spaces, and formulate the sectional trace theorem (Theorem 9.42) and classical surface and volume potentials (Theorems 9.48 and 9.49) in terms of pseudo-differential operators. This calculus of pseudo-differential operators is applied to elliptic boundary value problems in Chap. 11. Moreover, we give Gårding's inequality and related inequalities (Theorems 9.51 and 9.53), and describe three classes of hypoelliptic pseudo-differential operators (Theorems 9.55, 9.58 and 9.60).

In Chap. 10 we prove various maximum principles for degenerate elliptic differential operators of second order, and reveal the underlying analytical mechanism of propagation of maxima in terms of subunit vectors introduced by Fefferman–Phong [57] (Theorems 10.14 and 10.17). The results may be applied to questions of uniqueness for elliptic boundary value problems. Furthermore, the mechanism of propagation of maxima will play an important role in the interpretation and study of Markov processes from the viewpoint of functional analysis in Part V (see also [202]).

Chapter 11 is devoted to general boundary value problems for second order elliptic differential operators in the framework of L² Sobolev spaces. We begin with a summary of the basic facts about existence, uniqueness and regularity of solutions of the Dirichlet problem in the framework of Hölder spaces (Theorem 11.1). By using the calculus of pseudo-differential operators developed in Sects. 9.8–9.10, we prove an existence, uniqueness and regularity theorem for the Dirichlet problem in the framework of L² Sobolev spaces (Theorem 11.6). Moreover, we formulate a general boundary value problem

  Au = f in Ω, Bγu = ϕ on ∂Ω,   (∗)

and show that these problems can be reduced to the study of pseudo-differential operators on the boundary (Proposition 11.9). The virtue of this reduction is that there is no difficulty in taking adjoints after restricting the attention to the boundary, whereas boundary value problems in general do not have adjoints. This allows us to discuss the existence theory more easily (Theorems 11.10–11.18).

In Sect. 11.4 we study the basic questions of existence and uniqueness of solutions of a general boundary value problem

  (A − α)u = f in Ω,
  Bu := B₀(u|_∂Ω) + B₁(∂u/∂ν|_∂Ω) = ϕ on ∂Ω   (∗)_α

with a spectral parameter α. We prove two existence and uniqueness theorems for the boundary value problem (∗)_α in the framework of Sobolev spaces when α → +∞ (Theorem 11.19 and Corollary 11.20). For this purpose, we make use of a method essentially due to Agmon and Nirenberg (see Agmon [3], Agmon–Nirenberg [5], Lions–Magenes [116]). This is a technique of treating the spectral parameter α as


a second order elliptic differential operator of an extra variable and relating the old problem to a new one with the additional variable. Our presentation of this technique is due to Fujiwara [69]. Theorem 11.19 plays a fundamental role in constructing Feller semigroups (Markov processes) in Chap. 13.

Chapter 12 is the heart of the subject. Now let D be a bounded domain in Euclidean space R^N with C^∞ boundary ∂D, and let A be a second order strictly elliptic differential operator with real coefficients such that

  Au(x) = Σ_{i,j=1}^N a^{ij}(x) ∂²u/∂x_i∂x_j(x) + Σ_{i=1}^N b^i(x) ∂u/∂x_i(x) + c(x)u(x).   (1.28)

Here:
(1) a^{ij} ∈ C^∞(R^N), a^{ij}(x) = a^{ji}(x) for all x ∈ R^N and all 1 ≤ i, j ≤ N, and there exists a constant a₀ > 0 such that

  Σ_{i,j=1}^N a^{ij}(x) ξ_i ξ_j ≥ a₀|ξ|² for all (x, ξ) ∈ T*(R^N) = R^N × R^N,   (1.29)

where T*(R^N) is the cotangent bundle of R^N.
(2) b^i ∈ C^∞(R^N) for 1 ≤ i ≤ N.
(3) c ∈ C^∞(R^N) and c(x) ≤ 0 on D̄.

Let L be a second order Ventcel’ boundary condition such that

  Lu(x′) = Σ_{i,j=1}^{N−1} α^{ij}(x′) ∂²u/∂x_i∂x_j(x′) + Σ_{i=1}^{N−1} β^i(x′) ∂u/∂x_i(x′) + γ(x′)u(x′) + μ(x′) ∂u/∂n(x′) − δ(x′)Au(x′)
         := Qu(x′) + μ(x′) ∂u/∂n(x′) − δ(x′)Au(x′) for all x′ ∈ ∂D.   (1.30)

Here:
(1) The α^{ij}(x′) are the components of a C^∞ symmetric contravariant tensor of type (2, 0) on ∂D, and satisfy the degenerate elliptic condition


Fig. 1.10 An intuitive meaning of the transversality condition (1.32): a viscosity phenomenon on M = {μ = 0} and a reflection phenomenon on ∂D \ M = {μ > 0}

  Σ_{i,j=1}^{N−1} α^{ij}(x′) η_i η_j ≥ 0 for all x′ ∈ ∂D and η = Σ_{j=1}^{N−1} η_j dx_j ∈ T_{x′}*(∂D),   (1.31)

where T_{x′}*(∂D) is the cotangent space of ∂D at x′.
(2) β^i ∈ C^∞(∂D) for 1 ≤ i ≤ N − 1.
(3) γ ∈ C^∞(∂D) and γ(x′) ≤ 0 on ∂D.
(4) μ ∈ C^∞(∂D) and μ(x′) ≥ 0 on ∂D.
(5) δ ∈ C^∞(∂D) and δ(x′) ≥ 0 on ∂D.
(6) n is the unit inward normal to ∂D (see Fig. 1.4).

A Ventcel’ boundary condition L is said to be transversal on ∂D if it satisfies the condition

  μ(x′) + δ(x′) > 0 on ∂D.   (1.32)

Intuitively, the transversality condition implies that either a reflection or a viscosity phenomenon occurs at each point of ∂D (see Fig. 1.10). More precisely, this means that a viscosity phenomenon occurs at each point of the boundary portion

  M = {x′ ∈ ∂D : μ(x′) = 0},

while a reflection phenomenon occurs at each point of the boundary portion

  ∂D \ M = {x′ ∈ ∂D : μ(x′) > 0}.

The next theorem (Theorem 12.81) states sufficient conditions for the general existence of a Feller semigroup in terms of boundary value problems:

Theorem 1.38 (the existence theorem) Let the differential operator A of the form (1.28) satisfy the strict ellipticity condition (1.29) and let the boundary condition L of the form (1.30) satisfy the degenerate elliptic condition (1.31). Assume that L is transversal on ∂D and further that the following two conditions are satisfied:


Fig. 1.11 A Markov process on the boundary ∂D can be "pieced together" with an A-diffusion in the interior D

[I] (Existence) For some constants α ≥ 0 and λ ≥ 0, the boundary value problem

  (α − A)u = 0 in D, (λ − L)u = ϕ on ∂D   (∗)_{α,λ}

has a solution u in C(D̄) for any ϕ in some dense subset of C(∂D).
[II] (Uniqueness) For some constant α > 0, we have the assertion

  u ∈ C(D̄), (α − A)u = 0 in D, Lu = 0 on ∂D ⟹ u = 0 in D.

Then there exists a Feller semigroup {T_t}_{t≥0} on D̄ whose infinitesimal generator 𝔄 is characterized as follows:
(a) The domain D(𝔄) of 𝔄 is the space

  D(𝔄) = {u ∈ C(D̄) : Au ∈ C(D̄), Lu = 0 on ∂D}.

(b) 𝔄u = Au for every u ∈ D(𝔄). Here Au and Lu are taken in the sense of distributions.

Remark 1.39 The probabilistic meaning of the unique solvability of problem (∗)_{α,λ} is that there exists a Markov process Y (with discontinuous paths) on the boundary ∂D [227]. However, the transversality condition (1.32) of L implies that every Markov process on ∂D is the "trace" on ∂D of trajectories of some Markov process on D̄. Hence we can "piece out" the process Y with A-diffusion in D to construct a Markov process X on D̄ and hence a Feller semigroup {T_t}_{t≥0} on D̄ (see Fig. 1.11). It seems that our method of construction of Feller semigroups is, in spirit, not far away from the probabilistic method of construction of diffusion processes by means of Poisson point processes of Brownian excursions used by Watanabe [233].

In this way, we are reduced to the study of the boundary value problem (∗)_{α,λ} with spectral parameters α and λ.


Now we can state our existence theorems for Feller semigroups (Theorems 1.40 and 1.41).

(I) First, as in Sect. 1.2 we say that a tangent vector

  v = Σ_{j=1}^{N−1} v^j ∂/∂x_j ∈ T_{x′}(∂D)

is subunit for the operator

  L_0 = Σ_{i,j=1}^{N−1} α^{ij}(x′) ∂²/∂x_i∂x_j

if it satisfies the condition

  (Σ_{j=1}^{N−1} v^j η_j)² ≤ Σ_{i,j=1}^{N−1} α^{ij}(x′) η_i η_j, η = Σ_{j=1}^{N−1} η_j dx_j ∈ T_{x′}*(∂D).

If ρ > 0, we define a "non-Euclidean" ball B_{L_0}(x′, ρ) of radius ρ about x′ as follows:

  B_{L_0}(x′, ρ) = the set of all points y ∈ ∂D which can be joined to x′ by a Lipschitz path v : [0, ρ] → ∂D for which the tangent vector v̇(t) of ∂D at v(t) is subunit for L_0 for almost every t.

Also we let

  B_E(x′, ρ) = the ordinary Euclidean ball of radius ρ about x′.

Our main result is the following existence theorem of a Feller semigroup (Theorem 13.1):

Theorem 1.40 (the existence theorem) Let the differential operator A of the form (1.28) satisfy the strict ellipticity condition (1.29) and let the boundary condition L of the form (1.30) satisfy the degenerate elliptic condition (1.31). Assume that L is transversal on ∂D and further that:

(A.1) There exist constants 0 < ε ≤ 1 and C > 0 such that we have, for all sufficiently small ρ > 0,

  B_E(x′, ρ) ⊂ B_{L_0}(x′, Cρ^ε) on the set M = {x′ ∈ ∂D : μ(x′) = 0}.

Then there exists a Feller semigroup {T_t}_{t≥0} on D̄ whose infinitesimal generator 𝔄 is characterized as follows:


Fig. 1.12 A Markovian particle exits the boundary portion M = {μ = 0} in finite time through diffusion along the boundary

(a) The domain D(𝔄) of 𝔄 is the space

  D(𝔄) = {u ∈ C(D̄) : Au ∈ C(D̄), Lu = 0 on ∂D}.

(b) 𝔄u = Au for every u ∈ D(𝔄). Here Au and Lu are taken in the sense of distributions.

Furthermore, the generator 𝔄 coincides with the minimal closed extension in C(D̄) of the restriction of A to the space

  {u ∈ C²(D̄) : Lu = 0 on ∂D}.

The proof of Theorem 1.40 (Theorem 13.1) is based on Theorem 9.60, essentially due to Fefferman–Phong [57]. Theorem 1.31 in Sect. 1.2 tells us that the non-Euclidean ball B_{L_0}(x′, ρ) may be interpreted as the set of all points where a Markovian particle with generator L_0, starting at x′, diffuses during the time interval [0, ρ]. Hence the intuitive meaning of hypothesis (A.1) is that a Markovian particle with generator L_0 goes through the set M, where no reflection phenomenon occurs, in finite time (see Fig. 1.12). Therefore, the above result may be stated as follows: if a Markovian particle goes through the set where no reflection phenomenon occurs in finite time, then there exists a Feller semigroup corresponding to such a diffusion phenomenon.

(II) Secondly, we consider the first order case of the Ventcel’ boundary condition L, that is, the case where α^{ij}(x′) ≡ 0 on ∂D:


Fig. 1.13 A C^∞ vector field β on ∂D and the unit inward normal n to ∂D

  Lu(x′) = Σ_{i=1}^{N−1} β^i(x′) ∂u/∂x_i(x′) + γ(x′)u(x′) + μ(x′) ∂u/∂n(x′) − δ(x′)Au(x′)
         := β(x′)·u(x′) + γ(x′)u(x′) + μ(x′) ∂u/∂n(x′) − δ(x′)Au(x′) for all x′ ∈ ∂D.   (1.33)

Here

  β(x′)· = Σ_{i=1}^{N−1} β^i(x′) ∂/∂x_i

is a C^∞ vector field on ∂D (see Fig. 1.13). We let

  L_O u(x′) = Σ_{i=1}^{N−1} β^i(x′) ∂u/∂x_i(x′) + γ(x′)u(x′) + μ(x′) ∂u/∂n(x′),   (1.34)

and consider the term −δ(x′)Au(x′) in Lu(x′) as a term of "perturbation" of L_O u(x′):

  Lu(x′) = L_O u(x′) − δ(x′)Au(x′).

More precisely, we study the oblique derivative boundary value problem

  (α − A)u = f in D, (λ − L_O)u = ϕ on ∂D,   (13.44)

where α and λ are positive constants. Then we have the following existence theorem of a Feller semigroup (Theorem 13.3):

Theorem 1.41 (the existence theorem) Let the differential operator A of the form (1.28) satisfy the strict ellipticity condition (1.29) and let the boundary condition L of the form (1.33) satisfy the transversality condition (1.32). Assume that:

(A.2) The vector field β is non-zero on the set

Fig. 1.14 A Markovian particle exits the boundary portion M in finite time through the drift vector field β

  M = {x′ ∈ ∂D : μ(x′) = 0}

and any maximal integral curve x(t; x₀) of β starting at x₀ ∈ M is not entirely contained in M (see Fig. 1.14).

Then we have the same conclusions as in Theorem 1.40.

The proof of Theorem 1.41 (Theorem 13.3) is based on Theorem 9.58, essentially due to Melin–Sjöstrand [127] (see Theorem 13.15). Note that the vector field β is the drift vector field. Hence Theorem 1.31 in Sect. 1.2 tells us that hypothesis (A.2) has an intuitive meaning similar to hypothesis (A.1) (see Fig. 1.12).

Fig. 1.15 A bird's-eye view of mathematical studies of Brownian motion:
  Brownian Motion (Physics): R. Brown, A. Einstein, J. B. Perrin
  Markov Processes (Probability): N. Wiener, P. Lévy, E. B. Dynkin
  Diffusion Equations (Partial Differential Equations): S. Chapman, A. N. Kolmogorov, H. P. McKean, Jr., E. Nelson
  Feller Semigroups (Functional Analysis): K. Yosida, S. Kakutani, W. Feller, D. Ray
  K. Itô


1.4 Notes and Comments Our functional analytic approach to diffusion processes is inspired by the bird’s-eye view of mathematical studies of Brownian motion in Fig. 1.15 above.

Part I

Foundations of Modern Analysis

Chapter 2

Sets, Topology and Measures

This chapter is a summary of the basic definitions and results about topological spaces, linear spaces and measure spaces which will be used throughout the book. Most of the material will be quite familiar to the reader and may be omitted. This chapter, included for the sake of completeness, should serve to settle questions of notation and such.

2.1 Sets

A set is a collection of elements, and is described either by listing its members or by an expression of the form {x : P}, which denotes the set of those elements x satisfying property P. The empty set is the set with no element, and is denoted by ∅. The words collection, family and class will be used synonymously with set.

The notation x ∈ A (or A ∋ x) means that x is a member or element of the set A. We also say that x belongs to the set A. The elements of A are frequently called points, and in this case the set A is referred to as the space. If every element of a set A is also an element of a set B, then A is said to be a subset of B, and we write A ⊂ B or B ⊃ A. Two sets A and B are said to be equal if A ⊂ B and B ⊂ A, and we write A = B. The negations of ∈, ⊂ and = are denoted by ∉, ⊄ and ≠, respectively. If A ⊂ B but A ≠ B, then A is called a proper subset of B.

The difference between two sets A and B is the set of all those elements of A which do not belong to B, and is denoted by A \ B. If A is a subset of a fixed set X, then the difference X \ A is called the complement of A, and is denoted by A^c.

We will often consider a collection {A_λ : λ ∈ Λ} of sets A_λ indexed by the set Λ. The union of the sets A_λ is the set of all those elements which belong to at least one of the A_λ, and is denoted by

  ⋃_{λ∈Λ} A_λ.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 K. Taira, Functional Analytic Techniques for Diffusion Processes, Springer Monographs in Mathematics, https://doi.org/10.1007/978-981-19-1099-9_2


The intersection of the sets A_λ is the set of all those elements which belong to every A_λ, and is denoted by

  ⋂_{λ∈Λ} A_λ.

A collection {A_λ} of sets is said to be disjoint if every two distinct sets of the A_λ have no element in common. In this case, the union of the sets A_λ is called a disjoint union, and is denoted as follows:

  ⨆_{λ∈Λ} A_λ.

The Cartesian product A1 × A2 × · · · × An of sets A1 , A2 , . . . , An is the set of all ordered n-tuples (a1 , a2 , . . . , an ) with ai ∈ Ai for each i.
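The set operations above have direct counterparts in most programming languages; a small illustration with Python's built-in sets (the indexed family below is an arbitrary example):

```python
from itertools import product

family = {"A1": {1, 2}, "A2": {2, 3}, "A3": {3, 4}}  # {A_lambda : lambda in Lambda}

union = set().union(*family.values())              # elements in at least one A_lambda
intersection = set.intersection(*family.values())  # elements in every A_lambda
cartesian = set(product({1, 2}, {"x", "y"}))       # A1 x A2: ordered pairs (a1, a2)

# The family is disjoint iff every two distinct sets share no element.
disjoint = all(family[i].isdisjoint(family[j])
               for i in family for j in family if i < j)
```

Here the family fails to be disjoint because A1 and A2 share the element 2.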

2.2 Mappings

Let X and Y be two sets. A correspondence f which assigns to each element x of X an element f(x) of Y is called a mapping or map of X into Y, and we write f : X → Y. When describing a mapping f by describing its effect on individual elements, we use the special arrow ↦ and write "x ↦ f(x)". The terms mapping, function and transformation will be used synonymously.

If A is a subset of X, the set f(A) = {f(x) : x ∈ A} is called the image of A under f. If B is a subset of Y, the set f⁻¹(B) = {x ∈ X : f(x) ∈ B} is called the inverse image of B under f. The domain D(f) of f is the set X and the range R(f) of f is the set f(X).

If, for each element y of f(X), there exists only one element x of X such that f(x) = y, then f is called a one-to-one map or injection of X into Y. We also say that f is one-to-one or injective. If f is injective, then the inverse (mapping) f⁻¹, defined by x = f⁻¹(y) = f⁻¹({y}), is a mapping with domain f(X) and range X. A mapping f is called an onto map or surjection if f(X) = Y. We also say that f is onto or surjective. If f is both an injection and a surjection, then it is called a bijection. We also say that f is bijective.

If f : X → Y and g : Y → Z are two mappings, the composite mapping g ∘ f : X → Z is defined by the formula (g ∘ f)(x) = g(f(x)) for x ∈ X.
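For mappings between finite sets these notions can be computed directly; a small sketch (the parity map below is an arbitrary example):

```python
def image(f, A):
    """f(A) = {f(x) : x in A}."""
    return {f(x) for x in A}

def inverse_image(f, X, B):
    """f^{-1}(B) = {x in X : f(x) in B}."""
    return {x for x in X if f(x) in B}

def is_injective(f, X):
    # f is injective iff distinct points have distinct images.
    return len({f(x) for x in X}) == len(X)

def is_surjective(f, X, Y):
    return {f(x) for x in X} == set(Y)

compose = lambda g, f: (lambda x: g(f(x)))  # the composite mapping g o f

X = {0, 1, 2, 3}
Y = {0, 1}
f = lambda x: x % 2  # the parity map X -> Y
```

The parity map is surjective but not injective, and f⁻¹({1}) = {1, 3}.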

2.3 Topological Spaces

Let X be a non-empty set. A collection O of subsets of X is said to be a topology on X if it satisfies the following three conditions:

(T1) The empty set ∅ and the set X itself belong to O.
(T2) If O₁, O₂ are members of O, then the intersection O₁ ∩ O₂ belongs to O.


(T3) If {O_λ}_{λ∈Λ} is an arbitrary collection of members of O, then the union ⋃_{λ∈Λ} O_λ belongs to O.

The pair (X, O) is called a topological space and the members of O are called open sets in X; their complements are called closed sets.

Let (X, O) be a topological space. A neighborhood of a point x of X is an open set which contains x, and the neighborhood system U(x) of x is the collection of all neighborhoods of x. A subcollection U*(x) of U(x) is called a fundamental neighborhood system of x if it has the following property:

(FV) For any U ∈ U(x), there exists V ∈ U*(x) such that V ⊂ U.

A topology on X can be formulated in terms of fundamental neighborhood systems as follows:

(1) A family {U*(x)}_{x∈X} of fundamental neighborhood systems of a topological space (X, O) enjoys the following properties:
  (V1) If V ∈ U*(x), then x ∈ V.
  (V2) For V₁, V₂ ∈ U*(x), there exists V₃ ∈ U*(x) such that V₃ ⊂ V₁ ∩ V₂.
  (V3) If V ∈ U*(x), then for each y ∈ V there exists W ∈ U*(y) such that W ⊂ V.

(2) Conversely, assume that we are given for each point x of a set X a collection U*(x) of subsets of X and that the family {U*(x)}_{x∈X} satisfies conditions (V1), (V2) and (V3). We let

  O = {O ⊂ X : for every point x of O, there exists V ∈ U*(x) such that V ⊂ O}.

Then it is easy to verify that the collection O satisfies axioms (T1), (T2) and (T3) of a topology and that U*(x) is a fundamental neighborhood system of x in the topological space (X, O).

Let O₁ and O₂ be two topologies on the same set X. Then O₁ is said to be stronger than O₂ if every O₂-open set is an O₁-open set. We also say that O₂ is weaker than O₁.

Let X be a topological space (we often omit O and refer to X as a topological space). A point x of X is called an accumulation point of a subset A of X if every neighborhood of x contains at least one point of A different from x. A subset of X is closed if and only if it contains all its accumulation points.
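The passage (2) from fundamental neighborhood systems back to a topology can be carried out verbatim on a finite set: form O = {O ⊂ X : every x ∈ O has some V ∈ U*(x) with V ⊂ O} and check the axioms. A sketch; the three-point system below is an arbitrary example satisfying (V1), (V2) and (V3).

```python
from itertools import combinations

def topology_from_neighborhoods(X, ustar):
    """O = {O subset of X : for every x in O there is V in U*(x)
    with V subset of O}, as in the construction above (finite X)."""
    elems = sorted(X)
    subsets = [frozenset(c) for r in range(len(elems) + 1)
               for c in combinations(elems, r)]
    return {O for O in subsets
            if all(any(V <= O for V in ustar[x]) for x in O)}

X = {"a", "b", "c"}
ustar = {  # fundamental neighborhood systems satisfying (V1)-(V3)
    "a": [frozenset({"a"})],
    "b": [frozenset({"a", "b"})],
    "c": [frozenset({"c"})],
}
opens = topology_from_neighborhoods(X, ustar)
```

For this system {b} is not open (its only basic neighborhood is {a, b}), while ∅, {a}, {c}, {a, b}, {a, c} and X are, and the resulting collection is closed under finite intersections and arbitrary unions.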
The closure Ā of a subset A of X is the smallest closed subset of X which contains A. The interior A° of A is the largest open subset of X contained in A. The set Ā \ A° is called the boundary of A. A subset A of X is said to be everywhere dense or simply dense in X if Ā = X. A topological space is said to be separable if it contains a countable, dense subset.


A topological space X is said to satisfy the first axiom of countability if, for each point x of X, there exists a fundamental neighborhood system of x which has countably many members. A family of open sets in X is called an open base for X if every open set can be expressed as a union of members of this family. A topological space X is said to satisfy the second axiom of countability if there exists an open base for X which has countably many members. A topological space with a countable open base is separable.

A topological space X is called a Hausdorff space if, for any two distinct points x, y of X, there exist a neighborhood U of x and a neighborhood V of y such that U ∩ V = ∅. We also say that X is Hausdorff.

Let Y be a subset of a topological space (X, O). We let

  O_Y = {O ∩ Y : O ∈ O}.

Then the collection O_Y of subsets of Y satisfies axioms (T1), (T2) and (T3) of a topology; hence O_Y is a topology on Y. This topology is called the relative topology of Y as a subset of (X, O), and (Y, O_Y) is called a topological subspace of (X, O).

If X₁, X₂, …, X_n are topological spaces, then a topology is defined on the Cartesian product X₁ × X₂ × ⋯ × X_n by taking as a fundamental neighborhood system of a point (x₁, x₂, …, x_n) all sets of the form U₁ × U₂ × ⋯ × U_n, where U_i is a neighborhood of x_i for each i. This topology is called the product topology and X₁ × X₂ × ⋯ × X_n is called the product topological space.

2.4 Compactness

A collection {U_λ}_{λ∈Λ} of open sets of a topological space X is called an open covering of X if X = ⋃_{λ∈Λ} U_λ. A topological space X is said to be compact if every open covering {U_λ} of X contains some finite subcollection which still covers X. If a subset of X is compact considered as a topological subspace of X, then it is called a compact subset of X. A subset of a topological space X is said to be relatively compact if its closure is a compact subset of X. A topological space X is said to be locally compact if every point of X has a relatively compact neighborhood. A subset of a topological space X is called a σ-compact subset if it is a countable union of compact sets.

Compactness is such a useful property that, given a non-compact space (X, O), it is worthwhile constructing a compact space (X′, O′) with X being its dense subset. Such a space is called a compactification of (X, O). The simplest way in which this can be achieved is by adjoining one extra point ∞ to the space X; a topology O′ can be defined on X′ = X ∪ {∞} in such a way that (X′, O′) is compact and that O is the relative topology induced on X by O′. The topological space (X′, O′) is called the one-point compactification of (X, O), and the point ∞ is called the point at infinity.


2.5 Connectedness

A topological space X is said to be connected if there do not exist two non-empty open subsets O₁, O₂ of X such that O₁ ∩ O₂ = ∅ and X = O₁ ∪ O₂. A subset Y of X is called a connected subset if it is connected considered as a topological subspace of X.

For each point x of X, there exists a maximal connected subset C_x of X which contains x. The subset C_x is called the connected component of X which contains x. If y is a point of C_x, then C_x = C_y. A connected subset C of X is called a connected component of X if C = C_x for each x ∈ C. If C and C′ are connected components of X, then either C = C′ or C ∩ C′ = ∅.

2.6 Metric Spaces

A set X is called a metric space if there is defined a real-valued function ρ on the Cartesian product X × X with the following properties:

(D1) 0 ≤ ρ(x, y) < +∞.
(D2) ρ(x, y) = 0 if and only if x = y.
(D3) ρ(x, y) = ρ(y, x).
(D4) ρ(x, y) ≤ ρ(x, z) + ρ(y, z) (the triangle inequality).

The function ρ is called a metric or distance function on X. If x ∈ X and ε > 0, then B(x; ε) will denote the open ball of radius ε about x, that is, B(x; ε) = {y ∈ X : ρ(x, y) < ε}. The countable family

{B(x; 1/n) : n ∈ N}

of open balls forms a fundamental neighborhood system of x; hence a metric space is a topological space which satisfies the first axiom of countability. A topological space X is said to be metrizable if we can introduce a metric ρ on X in such a way that the topology induced on X by ρ is just the original topology on X. Two metrics ρ_1 and ρ_2 on the same set X are said to be equivalent if, for each ε > 0, there exists δ > 0 such that

ρ_1(x, y) < δ =⇒ ρ_2(x, y) < ε,
ρ_2(x, y) < δ =⇒ ρ_1(x, y) < ε.

Equivalent metrics induce the same topology. If x is a point of X and A is a subset of X, then we define the distance dist(x, A) from x to A by the formula


2 Sets, Topology and Measures

dist(x, A) = inf_{a∈A} ρ(x, a).
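The axioms (D1)-(D4) and the distance to a set are easy to spot-check numerically. The following sketch is ours, not from the book: it uses the taxicab (ℓ¹) metric on R² and a finite set A, for which the infimum defining dist(x, A) is a minimum.

```python
# A small numerical sketch (ours): the taxicab metric on R^2, with spot
# checks of the axioms (D2)-(D4) and of dist(x, A) for a finite set A.

def rho(x, y):
    """Taxicab (l^1) metric on R^2."""
    return abs(x[0] - y[0]) + abs(x[1] - y[1])

points = [(0.0, 0.0), (1.0, 2.0), (-3.0, 0.5)]

for x in points:
    for y in points:
        assert (rho(x, y) == 0) == (x == y)      # (D2)
        assert rho(x, y) == rho(y, x)            # (D3)
        for z in points:
            # (D4) in the book's form rho(x,y) <= rho(x,z) + rho(y,z)
            assert rho(x, y) <= rho(x, z) + rho(y, z)

def dist(x, A):
    """dist(x, A) = inf_{a in A} rho(x, a); a minimum for finite A."""
    return min(rho(x, a) for a in A)

print(dist((0.0, 0.0), [(1.0, 2.0), (-3.0, 0.5)]))  # 3.0
```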

2.7 Baire's Category

Let X be a topological space. A subset of X is said to be nowhere dense in X if its closure does not contain a non-empty open subset of X. Any countable union of nowhere dense sets is called a set of the first category; all other subsets of X are of the second category. Let (X, ρ) be a metric space. A sequence {x_n} in X is called a Cauchy sequence if it satisfies Cauchy's convergence condition

lim_{n,m→∞} ρ(x_n, x_m) = 0.

A metric space X is said to be complete if every Cauchy sequence in X converges to a point in X. The next theorem about complete metric spaces is one of the fundamental theorems in analysis:

Theorem 2.1 (Baire–Hausdorff) A non-empty complete metric space is of the second category.

2.8 Continuous Mappings

Let X and Y be topological spaces. A mapping f : X → Y is said to be continuous at a point x_0 of X if, to every neighborhood V of f(x_0), there corresponds a neighborhood U of x_0 such that f(U) ⊂ V. For metric spaces, this definition of continuity is equivalent to the usual epsilon–delta definition. If f : X → Y is continuous at every point of X, we say that f is continuous. A necessary and sufficient condition for f to be continuous is that the inverse image f^{-1}(V) of every open set V in Y is an open set in X. If f : X → Y is a bijection and both f and f^{-1} are continuous, then f is called a homeomorphic map or homeomorphism of X onto Y. Two topological spaces are said to be homeomorphic if there is a homeomorphism between them. Let X and Y be locally compact Hausdorff topological spaces. A continuous mapping f : X → Y is said to be proper if the inverse image f^{-1}(K) of every compact set K in Y is a compact set in X.


2.9 Linear Spaces

Let the symbol K denote the real number field R or the complex number field C. A set X is called a linear space or a vector space over K if two operations, called addition and scalar multiplication, are defined in X with the following properties:

(i) To every pair of elements x, y of X, there is associated an element x + y of X in such a way that

(a) x + y = y + x,
(b) (x + y) + z = x + (y + z),
(c) there exists a unique element 0 of X, called the zero vector, such that x + 0 = x for every x ∈ X,
(d) for each element x of X, there exists a unique element −x of X, called the inverse element of x, such that x + (−x) = 0.

(ii) To any element x of X and each α ∈ K, there is associated an element αx of X in such a way that

(a) (αβ)x = α(βx),
(b) 1x = x,
(c) α(x + y) = αx + αy,
(d) (α + β)x = αx + βx.

The elements of X are called vectors and the elements of K are called scalars. We also say that K is the coefficient field of the linear space X. A linear space X is said to be real or complex according to whether K = R or K = C.

Let X be a linear space over K. If x_1, x_2, ..., x_n are vectors of X, a vector of the form α_1 x_1 + α_2 x_2 + ... + α_n x_n with α_1, α_2, ..., α_n ∈ K is called a linear combination of x_1, x_2, ..., x_n. The vectors x_1, x_2, ..., x_n are said to be linearly independent if α_1 x_1 + α_2 x_2 + ... + α_n x_n = 0 implies that α_1 = α_2 = ... = α_n = 0. We also say that the set {x_1, x_2, ..., x_n} is linearly independent. The vectors x_1, x_2, ..., x_n are said to be linearly dependent if α_1 x_1 + α_2 x_2 + ... + α_n x_n = 0 with some α_i ≠ 0. If a linear space X contains n linearly independent vectors, but any n + 1 or more vectors are linearly dependent, then X is said to be n-dimensional or to have dimension n; we then write dim X = n. If the number of linearly independent vectors in X is not bounded, then X is said to be infinite dimensional. A set {x_1, x_2, ..., x_n} of n linearly independent vectors in an n-dimensional linear space X is called a basis of X. Then an arbitrary vector x of X can be written uniquely as

x = α_1 x_1 + α_2 x_2 + ... + α_n x_n.


The scalars α_1, α_2, ..., α_n are called the components of x with respect to the basis {x_1, x_2, ..., x_n}. A subset M of a linear space X is called a linear subspace or simply a subspace of X if it is a linear space with respect to the addition and scalar multiplication defined in X. A subset M of X is a subspace if and only if x + y ∈ M and αx ∈ M whenever x, y ∈ M and α ∈ K. For a subset A of X, there exists a smallest subspace [A] of X which contains A. In fact, the space [A] is the intersection of all linear subspaces of X which contain A; equivalently, it is the totality of finite linear combinations of elements of A. The space [A] is called the subspace spanned by A. Let M, N be two subspaces of a linear space X. The linear subspace spanned by the union M ∪ N is called the sum of M and N, and is denoted by M + N. If M ∩ N = {0}, then the sum M + N is called the direct sum of M and N, and is denoted by M ∔ N. An arbitrary element x of the direct sum M ∔ N can be expressed uniquely in the form

x = y + z for y ∈ M and z ∈ N.

A set A in a linear space X is said to be convex if all points of the form αx + (1 − α)y, 0 < α < 1, are in A whenever x, y ∈ A. For example, all linear subspaces are convex.

2.10 Linear Topological Spaces

A linear topological space or a topological vector space is a linear space and at the same time a Hausdorff topological space such that the linear space operations of addition and scalar multiplication are continuous. That is, if X is a linear topological space over the real or complex number field K, then the two mappings

X × X ∋ (x, y) ↦ x + y ∈ X

and

K × X ∋ (α, x) ↦ αx ∈ X

are both continuous. We remark that the topology on X is translation invariant; this means that a subset A of X is open if and only if each of its translates x + A = {x + a : a ∈ A} is open. Hence the topology on X is completely determined by a fundamental neighborhood system of the origin (the zero vector).


A linear topological space X is called a locally convex linear topological space if there exists a fundamental neighborhood system of the origin consisting of convex sets.

2.10.1 The Ascoli–Arzelà Theorem

In this subsection we formulate one of the fundamental theorems about spaces of continuous functions defined on a metric space. Let S be a subset of a metric space (X, ρ) and K the real or complex number field. We let

C(S) := the space of continuous maps of S into K.

We say that a subset Φ of C(S) is equicontinuous at a point x_0 of S if, for any given ε > 0, there exists a constant δ = δ(x_0, ε) > 0 such that we have, for all f ∈ Φ and all x ∈ S,

ρ(x, x_0) < δ =⇒ |f(x) − f(x_0)| < ε.

We say that Φ is equicontinuous on S if it is equicontinuous at every point of S. The next theorem provides a criterion for relative compactness of subsets of function spaces:

Theorem 2.2 (Ascoli–Arzelà) Let S be a compact subset of a metric space (X, ρ) and let Φ be a subset of the space C(S) of continuous functions on S. Then Φ is relatively compact in C(S) if and only if the following two conditions are satisfied:

(I) Φ is equicontinuous.
(II) The set Φ is uniformly bounded, that is, there exists a constant C > 0 such that |f(x)| ≤ C for all f ∈ Φ and x ∈ S.

We remark that a subset Φ is relatively compact in C(S) if and only if every sequence in Φ has a subsequence that converges in C(S). The Ascoli–Arzelà theorem is used mostly when X is a σ-compact space. In that case, we have the following version of the Ascoli–Arzelà theorem:

Corollary 2.3 (Ascoli–Arzelà) Let X be a metric space that is a denumerable union of compact sets. Let {f_n}_{n=1}^∞ be a sequence of continuous functions on X. Assume that:

(1) The sequence {f_n}_{n=1}^∞ is equicontinuous.
(2) For each x ∈ X, the closure of the set {f_n(x) : n = 1, 2, ...} is compact in K.

Then there exists a subsequence of {f_n}_{n=1}^∞ that converges pointwise to a continuous function f ∈ C(X), and this convergence is uniform on every compact subset of X.


2.11 Factor Spaces

Let X be a linear space and M a linear subspace of X. We say that two elements x_1 and x_2 of X are equivalent modulo M if x_1 − x_2 ∈ M; we then write x_1 ∼ x_2 (mod M). The relation ∼ enjoys the so-called equivalence laws:

(E1) x ∼ x (reflexivity).
(E2) If x_1 ∼ x_2, then x_2 ∼ x_1 (symmetry).
(E3) If x_1 ∼ x_2 and x_2 ∼ x_3, then x_1 ∼ x_3 (transitivity).

For each x ∈ X, we let

x̂ = {x′ ∈ X : x′ ∼ x}.

Then we have x̂ = {x + m : m ∈ M}, and hence

x_1 ∼ x_2 ⇐⇒ x̂_1 = x̂_2,   x_1 ≁ x_2 ⇐⇒ x̂_1 ∩ x̂_2 = ∅.   (2.1)

The set x̂ is called an equivalence class modulo M and each element of x̂ is called a representative of the class x̂. Assertion (2.1) implies that the space X can be decomposed into equivalence classes modulo M. We denote by X/M the totality of equivalence classes modulo M. In the set X/M we can define addition and scalar multiplication as follows:

x̂_1 + x̂_2 = (x_1 + x_2)ˆ,   α x̂ = (αx)ˆ   for α ∈ K.

In fact, it is easy to verify that the above definitions do not depend on the choice of representatives x_1, x_2, x of the equivalence classes x̂_1, x̂_2, x̂, respectively. Therefore, the set X/M is a linear space and is called the factor space of X modulo M. If the factor space X/M has finite dimension, then we say that the subspace M has finite codimension, and dim X/M is called the codimension of M and is denoted by codim M. It is easy to see that the subspace M has finite codimension n if and only if there exists an n-dimensional linear subspace N of X such that M ∔ N = X.

2.12 Algebras and Modules A linear space A over a field K is called an (associative) algebra if, to every pair of elements a, b of A, there is associated an element a ◦ b of A in such a way that


α(a ◦ b) = (αa) ◦ b = a ◦ (αb),
(a + b) ◦ c = a ◦ c + b ◦ c,
a ◦ (b + c) = a ◦ b + a ◦ c,
a ◦ (b ◦ c) = (a ◦ b) ◦ c (associative law)

for α ∈ K and a, b, c ∈ A. If a ◦ b = b ◦ a for every pair a, b ∈ A, then A is said to be commutative. A subset J of a commutative algebra A is called an ideal of A if it is a linear subspace of A and satisfies the condition

a ∈ A, b ∈ J =⇒ a ◦ b ∈ J.

For example, A itself and {0} are ideals of A. Let A be an algebra. A linear space M over K is called an A-module if, to every pair of an element a of A and an element x of M, there is associated an element ax of M in such a way that

a(αx) = (αa)x = α(ax),
a(x + y) = ax + ay,
(a + b)x = ax + bx,
a(bx) = (a ◦ b)x

for α ∈ K and a, b ∈ A.

2.13 Linear Operators

Let X, Y be linear spaces over the same scalar field K. A mapping T defined on a linear subspace D of X and taking values in Y is said to be linear if it preserves the operations of addition and scalar multiplication:

T(x_1 + x_2) = Tx_1 + Tx_2 for x_1, x_2 ∈ D;
T(αx) = αTx for x ∈ D and α ∈ K.   (2.2)

We often write Tx, rather than T(x), if T is linear. We let

D(T) = D,
R(T) = {Tx : x ∈ D(T)},
N(T) = {x ∈ D(T) : Tx = 0},

and call them the domain, the range and the null space of T, respectively. The mapping T is called a linear operator from D(T) ⊂ X into Y. We also say that T is a linear operator from X into Y with domain D(T). In the particular case when


Y = K, the mapping T is called a linear functional on D(T). In other words, a linear functional is a K-valued function on D(T) which satisfies condition (2.2). If a linear operator T is a one-to-one map of D(T) onto R(T), then the inverse mapping T^{-1} is a linear operator from R(T) onto D(T). The mapping T^{-1} is called the inverse operator or simply the inverse of T. A linear operator T admits the inverse T^{-1} if and only if Tx = 0 implies that x = 0. Let T_1 and T_2 be linear operators from a linear space X into a linear space Y with domains D(T_1) and D(T_2), respectively. Then T_1 = T_2 if and only if D(T_1) = D(T_2) and T_1 x = T_2 x for all x ∈ D(T_1) = D(T_2). If D(T_1) ⊂ D(T_2) and T_1 x = T_2 x for all x ∈ D(T_1), then we say that T_2 is an extension of T_1 and also that T_1 is a restriction of T_2, and we write T_1 ⊂ T_2.

2.14 Measurable Spaces

Let X be a non-empty set. An algebra of sets on X is a non-empty collection A of subsets of X which is closed under finite unions and complements, that is, which has the following two properties (F1) and (F2):

(F1) If E ∈ A, then its complement E^c = X \ E belongs to A.
(F2) If {E_j}_{j=1}^n is an arbitrary finite collection of members of A, then the union ∪_{j=1}^n E_j belongs to A.

Two subsets E and F of X are said to be disjoint if E ∩ F = ∅, that is, if there are no elements common to E and F. A disjoint union is a union of sets that are mutually disjoint. A collection E of subsets of X is called an elementary family on X if it has the following three properties (EF1), (EF2) and (EF3):

(EF1) The empty set ∅ belongs to E.
(EF2) If E, F ∈ E, then their intersection E ∩ F belongs to E.
(EF3) If E ∈ E, then the complement E^c = X \ E is a finite disjoint union of members of E.

It should be noticed that if E is an elementary family, then the collection of finite disjoint unions of members of E is an algebra. A σ-algebra of sets on X is an algebra which is closed under countable unions and complements. More precisely, a non-empty collection M of subsets of X is called a σ-algebra if it has the following three properties (S1), (S2) and (S3):

(S1) The empty set ∅ belongs to M.
(S2) If A ∈ M, then its complement A^c = X \ A belongs to M.
(S3) If {A_n}_{n=1}^∞ is an arbitrary countable collection of members of M, then the union ∪_{n=1}^∞ A_n belongs to M.

The pair (X, M) is called a measurable space and the members of M are called measurable sets in X.


It is easy to see that the intersection of any family of σ-algebras on X is a σ-algebra. Therefore, for any collection F of subsets of X we can find a unique smallest σ-algebra σ(F) on X which contains F, namely the intersection of all σ-algebras containing F. This σ(F) is called the σ-algebra generated by F. If X is a topological space, then the σ-algebra B(X) generated by the family O_X of open sets in X is called the Borel σ-algebra on X. Namely, we have the formula

B(X) = σ(O_X).

The members of B(X) are called Borel sets in X. We sometimes consider measurability on subsets of X. If Ω is a non-empty Borel set of X, then the collection

B(Ω) = {Ω ∩ A : A ∈ B(X)}

is a σ-algebra on Ω. The next proposition asserts that the Borel σ-algebra B(R) on R can be generated in a number of different ways [63, Proposition 1.2]:

Proposition 2.4 The Borel σ-algebra B(R) is generated by each of the following five collections (a)–(e):

(a) The open rays: E_1 = {(a, ∞) : −∞ < a < ∞} or E_2 = {(−∞, b) : −∞ < b < ∞}.
(b) The closed rays: E_3 = {[a, ∞) : −∞ < a < ∞} or E_4 = {(−∞, b] : −∞ < b < ∞}.
(c) The open intervals: E_5 = {(a, b) : −∞ < a < b < ∞}.
(d) The half-open intervals: E_6 = {(a, b] : −∞ < a < b < ∞} or E_7 = {[a, b) : −∞ < a < b < ∞}.
(e) The closed intervals: E_8 = {[a, b] : −∞ < a < b < ∞}.

Moreover, the next proposition asserts that the Borel σ-algebra B(R^n) generated by the family O_n of open sets in R^n can be generated in a number of different ways:

Proposition 2.5 The Borel σ-algebra B(R^n) on R^n is generated by each of the following three collections (a), (b) and (c):

(a) The open intervals: E_1 = {(a_1, b_1) × ··· × (a_n, b_n) : −∞ < a_i < b_i < ∞ (1 ≤ i ≤ n)}.
(b) The half-open intervals: E_2 = {(a_1, b_1] × ··· × (a_n, b_n] : −∞ < a_i < b_i < ∞ (1 ≤ i ≤ n)} or E_3 = {[a_1, b_1) × ··· × [a_n, b_n) : −∞ < a_i < b_i < ∞ (1 ≤ i ≤ n)}.
(c) The products of Borel sets of R: E_4 = {A_1 × ··· × A_n : A_i ∈ B(R) (1 ≤ i ≤ n)}.
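On a finite set the generated σ-algebra σ(F) can be computed by brute force: start from F together with ∅ and X, and close under complements and unions until nothing new appears. The following sketch is ours (names and sample sets are not from the book):

```python
# A sketch on a finite set (ours): generate the smallest σ-algebra
# containing a collection F by closing under complements and unions
# (finite unions suffice, since X itself is finite).
from itertools import combinations

def generate_sigma_algebra(X, F):
    X = frozenset(X)
    sets = {frozenset(), X} | {frozenset(A) for A in F}
    changed = True
    while changed:
        changed = False
        for A in list(sets):                      # close under complements
            if X - A not in sets:
                sets.add(X - A); changed = True
        for A, B in combinations(list(sets), 2):  # close under unions
            if A | B not in sets:
                sets.add(A | B); changed = True
    return sets

X = {1, 2, 3, 4}
M = generate_sigma_algebra(X, [{1}])
# σ({1}) consists of the four sets ∅, {1}, {2,3,4}, X.
print(sorted(sorted(s) for s in M))
```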


2.15 Measurable Functions

We let R̄ = {−∞} ∪ R ∪ {+∞} with the obvious ordering. The topology on R̄ is defined by declaring that the open sets in R̄ are those which are unions of segments of the types (a, b), [−∞, a), (a, +∞]. The elements of R̄ are called extended real numbers. Let (X, M) be a measurable space. An extended real-valued function f defined on a set A ∈ M is said to be M-measurable or simply measurable if, for every a ∈ R, the set {x ∈ A : f(x) > a} is in M. If A is a subset of X, we let

χ_A(x) = 1 if x ∈ A, 0 if x ∉ A.

The function χ_A is called the characteristic function of A. A real-valued function f on X is called a simple function if it takes on only a finite number of values. Thus, if a_1, a_2, ..., a_m are the distinct values of f, then f can be written as

f = Σ_{j=1}^m a_j χ_{A_j},

where A_j = {x ∈ X : f(x) = a_j}. We remark that the function f is measurable if and only if each A_j is measurable. The next theorem characterizes measurable functions in terms of simple functions:

Theorem 2.6 An extended real-valued function defined on a measurable set is measurable if and only if it is a pointwise limit of a sequence of measurable simple functions. Furthermore, every non-negative measurable function is a pointwise limit of an increasing sequence of non-negative measurable simple functions.
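The increasing sequence in Theorem 2.6 can be realized by the standard dyadic construction s_n(x) = min(⌊2ⁿ f(x)⌋/2ⁿ, n). The numerical sketch below is ours (the sample function is an arbitrary choice); it spot-checks that each s_n takes finitely many values, that the sequence increases, and that it converges pointwise to f.

```python
# A numerical sketch of Theorem 2.6 (ours): the dyadic simple functions
#   s_n(x) = min(floor(2^n f(x)) / 2^n, n)
# increase in n and converge pointwise to a non-negative f.
import math

def s(n, fx):
    return min(math.floor((2 ** n) * fx) / (2 ** n), n)

f = lambda x: x * x            # a sample non-negative function on [0, 2]
xs = [k / 10 for k in range(21)]

for x in xs:
    vals = [s(n, f(x)) for n in range(1, 12)]
    assert all(a <= b for a, b in zip(vals, vals[1:]))  # increasing in n
    assert all(v <= f(x) for v in vals)                 # s_n <= f
    # once n >= f(x), the cap is inactive and the error is at most 2^-n
    assert f(x) - vals[-1] <= 2 ** -11

print(s(3, f(1.5)))  # 2.25
```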


2.16 Measures

Let (X, M) be a measurable space. An extended real-valued function μ defined on M is called a non-negative measure or simply a measure if it has the following three properties:

(M1) 0 ≤ μ(A) ≤ ∞ for every A ∈ M.
(M2) μ(∅) = 0.
(M3) The function μ is countably additive, that is,

μ(∪_{i=1}^∞ A_i) = Σ_{i=1}^∞ μ(A_i)

for any disjoint countable collection {A_i} of members of M.

The triple (X, M, μ) is called a measure space. In other words, a measure space is a measurable space which has a non-negative measure defined on the σ-algebra of its measurable sets. If μ(X) < ∞, then the measure μ is called a finite measure and the space (X, M, μ) is called a finite measure space. If X is a countable union of sets of finite measure, then the measure μ is said to be σ-finite on X. We also say that the measure space (X, M, μ) is σ-finite.
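The simplest examples are discrete: on a countable X, any assignment of non-negative weights w(x) defines a measure μ(A) = Σ_{x∈A} w(x) on the full power set, and (M1)-(M3) are immediate. The sketch below (weights and labels are ours) spot-checks (M2) and additivity on disjoint sets.

```python
# A discrete sketch (ours): point weights on X = {a,b,c,d} define a finite
# measure on the power-set σ-algebra; additivity is easy to spot-check.

w = {"a": 0.5, "b": 2.0, "c": 0.0, "d": 1.5}

def mu(A):
    return sum(w[x] for x in A)

A, B = {"a", "b"}, {"d"}                       # disjoint sets
assert mu(set()) == 0.0                        # (M2)
assert mu(A | B) == mu(A) + mu(B)              # additivity on disjoint sets
assert mu({"a", "b", "c", "d"}) == 4.0         # mu(X) < ∞: a finite measure
print(mu(A | B))  # 4.0
```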

2.16.1 Lebesgue Measures

The next theorem is one of the fundamental theorems in measure theory:

Theorem 2.7 There exist a σ-algebra M in R^n and a non-negative measure μ on M having the following properties:

(i) Every open set in R^n is in M.
(ii) If A ⊂ B, B ∈ M and μ(B) = 0, then A ∈ M and μ(A) = 0.
(iii) If A = {x ∈ R^n : a_j ≤ x_j ≤ b_j (1 ≤ j ≤ n)}, then A ∈ M and we have the formula

μ(A) = Π_{j=1}^n (b_j − a_j).

(iv) The measure μ is translation invariant, that is, if x ∈ R^n and A ∈ M, then the set x + A = {x + y : y ∈ A} is in M and μ(x + A) = μ(A).

The elements of M are called Lebesgue measurable sets in R^n and the measure μ is called the Lebesgue measure on R^n.


2.16.2 Signed Measures

Let (X, M) be a measurable space. A real-valued function μ defined on M is called a signed measure or real measure if it is countably additive, that is,

μ(∪_{i=1}^∞ A_i) = Σ_{i=1}^∞ μ(A_i)

for any disjoint countable collection {A_i} of members of M. We remark that every rearrangement of the series Σ_i μ(A_i) also converges, since the disjoint union ∪_i A_i is not changed if the subscripts are permuted. A signed measure takes its values in (−∞, +∞), but a non-negative measure may take the value +∞; hence the non-negative measures do not form a subclass of the signed measures. If μ is a signed measure, we define a function |μ| on M as follows:

|μ|(A) = sup Σ_i |μ(A_i)| for A ∈ M,

where the supremum is taken over all countable partitions {A_i} of A into members of M. Then the function |μ| is a finite non-negative measure on M. The measure |μ| is called the total variation measure of μ, and the quantity |μ|(X) is called the total variation of μ. Note that

|μ(A)| ≤ |μ|(A) ≤ |μ|(X) for all A ∈ M.   (2.3)
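For a signed measure given by finitely many point weights, the supremum defining |μ| is attained by the finest partition, the one into single points, so |μ| is just the sum of absolute weights. The sketch below (weights are ours) checks inequality (2.3) over all subsets of a three-point space.

```python
# A discrete sketch (ours): a signed measure from point weights, its total
# variation measure, and a spot check of |mu(A)| <= |mu|(A) <= |mu|(X) (2.3).
from itertools import chain, combinations

w = {"a": 1.0, "b": -2.5, "c": 0.75}     # signed point weights

def mu(A):
    return sum(w[x] for x in A)

def tv(A):
    # the partition of A into single points attains the supremum here
    return sum(abs(w[x]) for x in A)

X = set(w)
subsets = chain.from_iterable(combinations(sorted(X), r) for r in range(len(X) + 1))
for A in map(set, subsets):
    assert abs(mu(A)) <= tv(A) <= tv(X)

print(tv(X))  # 4.25
```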

2.16.3 Borel Measures and Radon Measures

Let X be a locally compact Hausdorff space. There exists a smallest σ-algebra B in X which contains all open sets in X. The members of B are called Borel sets in X. A signed measure defined on B is called a real Borel measure on X. A non-negative Borel measure μ is said to be regular if we have, for every B ∈ B,

μ(B) = sup{μ(F) : F ⊂ B, F is compact} = inf{μ(G) : B ⊂ G, G is open}.

We give a useful criterion for regularity of μ:

Theorem 2.8 Let X be a locally compact Hausdorff space in which every open set is σ-compact. If μ is a non-negative Borel measure on X such that μ(K) < +∞ for every compact set K ⊂ X, then μ is regular.


A Radon measure μ on X is a Borel measure that is finite on all compact sets in X, outer regular on all Borel sets in X, and inner regular on all open sets in X; that is, μ satisfies the two conditions

μ(B) = inf{μ(G) : B ⊂ G, G is open} for every Borel set B ⊂ X,
μ(U) = sup{μ(F) : F ⊂ U, F is compact} for every open set U ⊂ X.

2.16.4 Product Measures

Let (X, M) and (Y, N) be measurable spaces. We let

M ⊗ N = the smallest σ-algebra in X × Y which contains all sets of the form A × B, where A ∈ M and B ∈ N.

Then (X × Y, M ⊗ N) is a measurable space. For the product of measure spaces, we have the following:

Theorem 2.9 Let (X, M, μ) and (Y, N, ν) be σ-finite measure spaces. Then there exists a unique σ-finite, non-negative measure λ on M ⊗ N such that

λ(A × B) = μ(A)ν(B) for A ∈ M and B ∈ N.

The measure λ is called the product measure of μ and ν, and is denoted by μ × ν.
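For discrete measures the product measure is explicit: the weight of a pair (x, y) is the product of the point weights. The sketch below (spaces and weights are ours) checks the defining rectangle property λ(A × B) = μ(A)ν(B) of Theorem 2.9.

```python
# A discrete sketch (ours): the product of two point-weight measures on
# finite spaces, with the defining property lambda(A x B) = mu(A) nu(B).

mu_w = {0: 0.5, 1: 1.5}          # weights for mu on X = {0, 1}
nu_w = {"p": 2.0, "q": 1.0}      # weights for nu on Y = {p, q}

def mu(A):  return sum(mu_w[x] for x in A)
def nu(B):  return sum(nu_w[y] for y in B)

def lam(E):
    """Product measure of a subset E of X x Y (a set of pairs)."""
    return sum(mu_w[x] * nu_w[y] for (x, y) in E)

A, B = {0, 1}, {"q"}
rectangle = {(x, y) for x in A for y in B}
assert lam(rectangle) == mu(A) * nu(B)
print(lam(rectangle))  # 2.0
```

Sets E that are not rectangles also get a well-defined value, which is the content of the uniqueness assertion in Theorem 2.9.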

2.16.5 Direct Image of Measures

Let (X, M) and (Y, N) be measurable spaces. A mapping f of X into Y is said to be measurable if the inverse image f^{-1}(B) of every B ∈ N is in M. Let (X, M, μ) be a measure space and (Y, N) a measurable space. If f : X → Y is a measurable mapping, then we can define a measure ν on (Y, N) by the formula

ν(B) = μ(f^{-1}(B)) for B ∈ N.

We then write ν = f_*μ. The measure f_*μ is called the direct image of μ under f.
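The defining formula is directly computable for point-weight measures: push each atom forward through f. The sketch below (weights and the map are ours) uses f(x) = x², under which the atoms at −2 and 2 merge.

```python
# A discrete sketch (ours): the direct image f_* mu of a point-weight
# measure, via the defining formula (f_* mu)(B) = mu(f^{-1}(B)).

mu_w = {-2: 1.0, -1: 0.5, 1: 0.5, 2: 2.0}    # mu on X = {-2, -1, 1, 2}
f = lambda x: x * x                           # a measurable map X -> Y

def mu(A):
    return sum(mu_w[x] for x in A)

def pushforward(B):
    preimage = {x for x in mu_w if f(x) in B}
    return mu(preimage)

# f^{-1}({4}) = {-2, 2}, so (f_* mu)({4}) = mu({-2, 2}) = 3.0
assert pushforward({4}) == 3.0
assert pushforward({1, 4}) == mu(set(mu_w))   # total mass is preserved
print(pushforward({4}))  # 3.0
```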


2.17 Integrals

Let (X, M, μ) be a measure space. If A is a measurable subset of X and if f is a non-negative measurable simple function on A of the form

f = Σ_{j=1}^m a_j χ_{A_j}, a_j ≥ 0,

then we let

∫_A f(x) dμ(x) = Σ_{j=1}^m a_j μ(A_j).   (2.4)

The convention 0 · ∞ = 0 is used here; it may happen that a_j = 0 and μ(A_j) = ∞. If f(x) is a non-negative measurable function on A, we let

∫_A f(x) dμ(x) = sup ∫_A s(x) dμ(x),   (2.5)

where the supremum is taken over all measurable simple functions s(x) on A such that 0 ≤ s(x) ≤ f(x) for x ∈ A. We remark that if f is a non-negative simple function, then the two definitions (2.4) and (2.5) of the integral ∫_A f(x) dμ(x) coincide. If f is a measurable function on A, we can write it in the form f(x) = f^+(x) − f^−(x) where

f^+(x) = max{f(x), 0}, f^−(x) = max{−f(x), 0}.

Both f^+ and f^− are non-negative measurable functions on A. Then we define the integral of f by the formula

∫_A f(x) dμ(x) = ∫_A f^+(x) dμ(x) − ∫_A f^−(x) dμ(x),

provided at least one of the integrals on the right-hand side is finite. If both integrals are finite, we say that f is μ-integrable or simply integrable on A. For simplicity, we abbreviate

∫_A f dμ = ∫_A f(x) dμ(x).

If μ is the Lebesgue measure on R^n, we customarily write

∫_A f(x) dx

instead of ∫_A f(x) dμ(x). A proposition concerning the points of a measurable set A is said to hold true μ-almost everywhere (μ-a.e.) or simply almost everywhere (a.e.) on A if there exists a measurable set N of measure zero such that the proposition holds true for all x ∈ A \ N. For example, if f and g are measurable functions on A and if μ({x ∈ A : f(x) ≠ g(x)}) = 0, then we say that f = g a.e. on A. The next three theorems are concerned with the interchange of integration and limit processes:

Theorem 2.10 (the monotone convergence theorem) If {f_n} is an increasing sequence of non-negative measurable functions on a measurable set A, then we have the formula

lim_{n→∞} ∫_A f_n dμ = ∫_A (lim_{n→∞} f_n) dμ.

Theorem 2.11 (Fatou's lemma) If {f_n} is a sequence of non-negative measurable functions on a measurable set A, then we have the inequality

∫_A (lim inf_{n→∞} f_n) dμ ≤ lim inf_{n→∞} ∫_A f_n dμ.

Theorem 2.12 (the dominated convergence theorem) Let {f_n} be a sequence of measurable functions on a measurable set A which converges pointwise to a function f on A. If there exists a non-negative integrable function g on A such that |f_n(x)| ≤ g(x) for all x ∈ A and all n = 1, 2, ..., then the function f is integrable on A and we have the formula

∫_A f dμ = lim_{n→∞} ∫_A f_n dμ.

Theorem 2.13 (Beppo Levi) Let {f_n}_{n=1}^∞ be an increasing sequence of non-negative measurable functions on a measurable set A. Assume that

sup_{n≥1} ∫_A f_n(x) dμ < ∞.

Then the limit function

f(x) = lim_{n→∞} f_n(x)

is finite for almost all x ∈ A, and is integrable on A.


Moreover, we have the formula

∫_A f(x) dμ = lim_{n→∞} ∫_A f_n(x) dμ = sup_{n≥1} ∫_A f_n(x) dμ.

2.18 The Radon–Nikodým Theorem

Let (X, M) be a measurable space, and let ν be a signed measure on M, that is,

ν(∪_{i=1}^∞ E_i) = Σ_{i=1}^∞ ν(E_i)

for any disjoint countable collection {E_i} of members of M. The motivation for this notion comes from looking at the difference ν = ν_1 − ν_2 of two measures ν_1 and ν_2, defined by the formula

ν(E) = ν_1(E) − ν_2(E) for E ∈ M.

More precisely, we have the following decomposition theorem for a signed measure:

Theorem 2.14 (Jordan–Hahn) (i) If ν is a signed measure, then there exist two measurable sets A and B such that X = A ∪ B, A ∩ B = ∅ and further that we have, for all E ∈ M,

ν(E ∩ A) ≥ 0, ν(E ∩ B) ≤ 0.

The decomposition X = A ∪ B is called a Hahn decomposition for ν.

(ii) If we let

ν^+(E) = ν(E ∩ A), ν^−(E) = −ν(E ∩ B) for any E ∈ M,

then ν^+ and ν^− are measures and we have the formula

ν = ν^+ − ν^−.


The decomposition ν = ν^+ − ν^− is called the Jordan decomposition of ν. If ν is a signed measure, the measure |ν| = ν^+ + ν^− is called the total variation of ν. Note that

|ν(E)| ≤ |ν|(E) for all E ∈ M.

A signed measure ν is said to be absolutely continuous with respect to a measure μ if it satisfies the condition

E ∈ M, |μ|(E) = 0 =⇒ ν(E) = 0.

We then write ν ≪ μ. It should be noticed that if f is μ-integrable, then a Hahn decomposition for the finite signed measure

ν(E) = ∫_E f dμ

is given by the formulas

A = {x ∈ X : f(x) ≥ 0}, B = {x ∈ X : f(x) < 0}.

Moreover, the Jordan decomposition of ν is given by the formulas

ν^+(E) = ∫_{E∩A} f dμ, ν^−(E) = −∫_{E∩B} f dμ.

The Radon–Nikodým theorem reads as follows:

Theorem 2.15 (Radon–Nikodým) Let (X, M, μ) be a σ-finite measure space with μ a measure, and let ν be a σ-finite signed measure on M. If ν is absolutely continuous with respect to μ, then there exists a real-valued measurable function f on X such that

ν(E) = ∫_E f dμ for all E ∈ M for which |ν|(E) < ∞.   (2.6)

Moreover, the density f of ν with respect to μ is uniquely determined in the sense that any two such functions are equal almost everywhere with respect to μ.
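Before turning to the proof, the discrete case makes formula (2.6) transparent. The sketch below is ours, not from the book: if μ puts positive weight on every point of a finite X, then ν ≪ μ automatically, the density is f(x) = ν({x})/μ({x}), and ∫_E f dμ reduces to a finite sum.

```python
# A discrete sketch of Theorem 2.15 (ours): on a finite X with mu({x}) > 0
# for all x, the Radon-Nikodym derivative is f(x) = nu({x}) / mu({x}).

mu_w = {"a": 0.5, "b": 2.0, "c": 1.0}
nu_w = {"a": 1.0, "b": -1.0, "c": 0.0}       # a signed measure, nu << mu

f = {x: nu_w[x] / mu_w[x] for x in mu_w}     # the density dnu/dmu

def nu(E):
    return sum(nu_w[x] for x in E)

def integral_f(E):                            # int_E f dmu
    return sum(f[x] * mu_w[x] for x in E)

# Formula (2.6) on every test set:
for E in [set(), {"a"}, {"a", "b"}, {"a", "b", "c"}]:
    assert abs(nu(E) - integral_f(E)) < 1e-12
print(f["a"])  # 2.0
```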

[Fig. 2.1: X is the countable disjoint union of sets Z_n of finite measure: X = ∪_{n=1}^∞ Z_n.]

The function f is called the Radon–Nikodým derivative of ν with respect to μ. We will write

f = dν/dμ or dν = f dμ.   (2.7)

Proof First, we can write X as a countable disjoint union of sets X_j,

X = ∪_{j=1}^∞ X_j,

each X_j having finite μ-measure: μ(X_j) < ∞. We can also write each X_j as a countable disjoint union of sets Y_{jh},

X_j = ∪_{h=1}^∞ Y_{jh},

each Y_{jh} having finite ν-measure: |ν|(Y_{jh}) < ∞. We will write the double sequence {Y_{jh}} as a sequence {Z_n} (see Fig. 2.1):

X = ∪_{j=1}^∞ X_j = ∪_{j=1}^∞ ∪_{h=1}^∞ Y_{jh} = ∪_{n=1}^∞ Z_n.

The proof of Theorem 2.15 is divided into three steps.

Step 1: If g is a function such that

ν(E) = ∫_E g dμ for all E ∈ M for which |ν|(E) < ∞,

then we have the assertion

∫_E f dμ = ∫_E g dμ for any measurable subset E of each Z_n,

and so f = g almost everywhere in Z_n with respect to μ. This proves that f = g almost everywhere in X with respect to μ; namely, the density f of ν with respect to μ is uniquely determined.

Step 2: Assume that we can prove the existence part of the theorem for each subspace Z_n. Then, on each Z_n we have a μ-integrable function f_n such that

ν(E ∩ Z_n) = ∫_{E∩Z_n} f_n dμ for all E ∈ M,

since |ν|(Z_n) < ∞. If we define f(x) := f_n(x) for x ∈ Z_n, then f is a measurable function on X. If |ν|(E) < ∞, we have the formula

∫_E |f| dμ = Σ_{n=1}^∞ ∫_{E∩Z_n} |f_n| dμ = Σ_{n=1}^∞ |ν|(E ∩ Z_n) = |ν|(E) < ∞.

This proves that f is μ-integrable on E and further that the desired formula (2.6) holds true:

∫_E f dμ = Σ_{n=1}^∞ ∫_{E∩Z_n} f dμ = Σ_{n=1}^∞ ∫_{E∩Z_n} f_n dμ = Σ_{n=1}^∞ ν(E ∩ Z_n) = ν(E)

for all E ∈ M for which |ν|(E) < ∞. Therefore, we have shown that it suffices to prove the existence part of the theorem in each subspace Z_n, where μ and ν are both finite.

Step 3: Without loss of generality, we may assume that

μ(X) < ∞, |ν|(X) < ∞.


Substep 3-1: Let X = A ∪ B be a Hahn decomposition for the signed measure ν. The argument in Step 2 shows that it suffices to prove the existence part of the theorem in each of the subspaces A and B separately. We remark that ν is a measure on A and −ν is a measure on B. Hence, without loss of generality, we may assume that ν is a measure on X.

Now we consider the case where μ and ν are finite measures, and prove the existence of a density function f. To do so, we let

D := the class of all non-negative functions f, μ-integrable on X, such that ∫_E f dμ ≤ ν(E) for all E ∈ M.

Since 0 ∈ D, we can define

α = sup_{f̂∈D} ∫_X f̂ dμ.

Then there exists a sequence {f_n} in D such that

lim_{n→∞} ∫_X f_n dμ = α.

We let

g_n := sup{f_1, ..., f_n},

and write each measurable set E as a disjoint union of measurable sets E_j,

E = ∪_{j=1}^n E_j,

as follows:

E_1 = {x ∈ E : f_1(x) ≥ sup{f_2(x), ..., f_n(x)}},
E_2 = {x ∈ E : x ∉ E_1, f_2(x) ≥ sup{f_3(x), ..., f_n(x)}},
...
E_{n−1} = {x ∈ E : x ∉ E_1 ∪ ... ∪ E_{n−2}, f_{n−1}(x) ≥ f_n(x)},
E_n = E \ (E_1 ∪ ... ∪ E_{n−1}).

Then we remark that

g_n(x) = f_j(x) for x ∈ E_j.   (2.8)


Hence, we have the inequality  gn dμ = E

n   j=1

gn dμ = Ej

n   j=1

f j dμ ≤ Ej

n    ν E j = ν(E), j=1

and so gn ∈ D. If we let f 0 (x) := sup f n (x) = sup = lim gn (x), n∈N

n∈N

n→∞

then the sequence {gn } is a monotone increasing sequence of non-negative integral functions on X . By applying the Lebesgue monotone convergence theorem (Theorem 2.10), we obtain that   f 0 dμ = lim gn dμ ≤ ν(E) for all E ∈ M, n→∞

E

E

so that f 0 ∈ D. By definition of formula (2.8), it follows that  f 0 dμ ≤ α. X

However, we have the inequality 





f 0 dμ = lim X

gn dμ ≥ lim

n→∞

n→∞

X

f n dμ = α, X

Summing up, we have proved the formula  f 0 dμ = α. X

Moreover, since the function f 0 is integrable on X , it is a real-valued function for almost all x ∈ X . If we let f to be a real-valued function that is equal to almost everywhere in X to f , then we have the assertions f ∈ D,  f dμ = α. X

(2.9a) (2.9b)


Substep 3-2: Finally, it remains to show that

∫_E f dμ = ν(E) for all E ∈ M.

To do so, letting

λ(E) := ν(E) − ∫_E f dμ for E ∈ M,

we show that λ ≡ 0. Here we remark that λ ≪ μ and further that λ is a measure. Our proof is based on a reduction to absurdity. Assume, to the contrary, that λ ≢ 0. Then we let X = A_m ∪ B_m be a Hahn decomposition for the signed measure

λ − (1/m) μ for m = 1, 2, ....

If we write

A_0 = ⋃_{m=1}^∞ A_m, B_0 = ⋂_{m=1}^∞ B_m,

it follows that

A_0 ∩ B_0 = ∅, A_0 ∪ B_0 = X.

However, since B_0 ⊂ B_m, we have the assertion

0 ≤ λ(B_0) ≤ (1/m) μ(B_0) −→ 0 as m → ∞,

and so λ(B_0) = 0. This proves that

λ(A_0) > 0,


since λ ≢ 0. We recall that λ is absolutely continuous with respect to μ. Hence, we also have the assertion

μ(A_0) > 0.

We may assume that

μ(A_m) > 0 for some number m,   (2.10)

and further that λ − (1/m)μ is positive on A_m. Hence, we have the inequality

(1/m) μ(E ∩ A_m) ≤ λ(E ∩ A_m) = ν(E ∩ A_m) − ∫_{E∩A_m} f dμ   (2.11)

for all E ∈ M. Now we consider the function

g(x) := f(x) + (1/m) χ_{A_m}(x) = { f(x) + 1/m for x ∈ A_m,
                                    f(x)         otherwise.

We have, by formula (2.9b) and assertion (2.10),

∫_X g dμ = ∫_X f dμ + (1/m) μ(A_m) > α.   (2.12)

However, we have, by inequality (2.11),

∫_E g dμ = ∫_E f dμ + (1/m) μ(E ∩ A_m)
         ≤ ∫_E f dμ + ν(E ∩ A_m) − ∫_{E∩A_m} f dμ
         = ∫_{E\A_m} f dμ + ν(E ∩ A_m)
         ≤ ν(E \ A_m) + ν(E ∩ A_m)
         = ν(E) for all E ∈ M.

This proves that

g = f + (1/m) χ_{A_m} ∈ D,

so that

∫_X g dμ ≤ α,

thereby contradicting inequality (2.12). Now the proof of Theorem 2.15 is complete. ∎
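The density whose existence the proof establishes can be made concrete on a finite measurable space, where the Radon–Nikodým density is simply the ratio of point masses. The following sketch (with hypothetical data, not taken from the book) checks the defining identity ν(E) = ∫_E f dμ exhaustively over all measurable sets:

```python
from fractions import Fraction
from itertools import chain, combinations

# Finite sample space; mu and nu are given by point masses, and nu is
# absolutely continuous with respect to mu (mu({x}) = 0 forces nu({x}) = 0).
X = ["a", "b", "c", "d"]
mu = {"a": Fraction(1), "b": Fraction(2), "c": Fraction(1), "d": Fraction(0)}
nu = {"a": Fraction(3), "b": Fraction(1), "c": Fraction(0), "d": Fraction(0)}

# The Radon-Nikodym density f = dnu/dmu, defined mu-almost everywhere
# (its value on the mu-null point "d" is irrelevant).
f = {x: (nu[x] / mu[x] if mu[x] != 0 else Fraction(0)) for x in X}

def nu_of(E):
    return sum(nu[x] for x in E)

def integral_f_dmu(E):
    return sum(f[x] * mu[x] for x in E)

# nu(E) = integral over E of f dmu for every subset E, checked exhaustively.
subsets = chain.from_iterable(combinations(X, r) for r in range(len(X) + 1))
assert all(nu_of(E) == integral_f_dmu(E) for E in subsets)
```

Exact rational arithmetic (`Fraction`) is used so that the equalities are tested exactly rather than up to floating-point error.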


2.19 Fubini’s Theorem

We consider integration on product spaces. Let (X, M, μ) and (Y, N, ν) be two σ-finite measure spaces, and let μ × ν be the unique product measure of μ and ν on the product σ-algebra M ⊗ N.

First, let E be a subset of X × Y. For a point x ∈ X, we define the x-section E_x of E by the formula

E_x = {y ∈ Y : (x, y) ∈ E}.

For a point y ∈ Y, we define the y-section E^y of E by the formula

E^y = {x ∈ X : (x, y) ∈ E}.

For example, if E = A × B where A ⊂ X and B ⊂ Y, then we have the formulas E_x = B for x ∈ A (and E_x = ∅ otherwise) and E^y = A for y ∈ B (and E^y = ∅ otherwise). More generally, we can prove the following [63, Proposition 2.34]:

Proposition 2.16 (a) If E ∈ M ⊗ N, then it follows that E_x ∈ N for all x ∈ X and that E^y ∈ M for all y ∈ Y.
(b) If f(x, y) is M ⊗ N-measurable, then it follows that f_x(y) is N-measurable for all x ∈ X and that f^y(x) is M-measurable for all y ∈ Y.

Secondly, let f(x, y) be a function defined on X × Y. For a point x ∈ X, we define the x-section f_x of f by the formula

f_x(y) = f(x, y) for y ∈ Y.

For a point y ∈ Y, we define the y-section f^y of f by the formula

f^y(x) = f(x, y) for x ∈ X.

For example, if E = A × B where A ⊂ X and B ⊂ Y, then we have the formulas

(χ_E)_x(y) = χ_{E_x}(y) for y ∈ Y,
(χ_E)^y(x) = χ_{E^y}(x) for x ∈ X,

which equal χ_B(y) when x ∈ A and χ_A(x) when y ∈ B, respectively.

Now we assume that f(x, y) is an M ⊗ N-measurable function on X × Y such that its integral exists. Then we customarily write

∫_{X×Y} f(x, y) d(μ × ν)(x, y).


This integral is called the double integral of f. If it happens that the function

g(x) = ∫_Y f(x, y) dν(y) for x ∈ X

is defined and also its integral exists, then we denote the integral

∫_X g(x) dμ(x)

by any one of the following notation:

∫_X (∫_Y f(x, y) dν(y)) dμ(x),   ∫_X dμ(x) ∫_Y f(x, y) dν(y),
∫∫_{X×Y} f(x, y) dν(y) dμ(x),   ∫∫_{X×Y} f dν dμ.

Similarly, we write

∫_Y (∫_X f(x, y) dμ(x)) dν(y),   ∫_Y dν(y) ∫_X f(x, y) dμ(x),
∫∫_{X×Y} f(x, y) dμ(x) dν(y),   ∫∫_{X×Y} f dμ dν.

These integrals are called the iterated integrals of f(x, y). The next theorem describes the most important relationship between double integrals and iterated integrals:

Theorem 2.17 (Fubini) Assume that (X, M, μ) and (Y, N, ν) are σ-finite measure spaces. Then we have the following three assertions:

(i) If E ∈ M ⊗ N, then it follows that E_x ∈ N for all x ∈ X and that E^y ∈ M for all y ∈ Y. Moreover, ν(E_x) is an M-measurable function of x and μ(E^y) is an N-measurable function of y, respectively, and we have the formula

∫_X ν(E_x) dμ(x) = ∫_Y μ(E^y) dν(y) = (μ × ν)(E).

In particular, if (μ × ν)(E) < ∞, then it follows that ν(E_x) < ∞ for μ-almost all x ∈ X and that μ(E^y) < ∞ for ν-almost all y ∈ Y.

(ii) If f(x, y) is a non-negative, M ⊗ N-measurable function on X × Y, then it follows that f_x(y) is an N-measurable function of y for any x ∈ X and that f^y(x) is an M-measurable function of x for all y ∈ Y. Moreover, the functions


X ∋ x ⟼ g(x) = ∫_Y f_x(y) dν(y) = ∫_Y f(x, y) dν(y),
Y ∋ y ⟼ h(y) = ∫_X f^y(x) dμ(x) = ∫_X f(x, y) dμ(x)

are M-measurable and N-measurable, respectively, and we have the formula

∫_X g(x) dμ(x) = ∫_Y h(y) dν(y) = ∫_{X×Y} f(x, y) d(μ × ν).

(iii) If f(x, y) is a μ × ν-integrable function on X × Y, then it follows that the function f_x(y) is ν-integrable for μ-almost all x ∈ X and that the function f^y(x) is μ-integrable for ν-almost all y ∈ Y. Furthermore, the function g(x) is μ-integrable on X and the function h(y) is ν-integrable on Y, respectively, and we have the formula

∫_X g(x) dμ(x) = ∫_Y h(y) dν(y) = ∫_{X×Y} f(x, y) d(μ × ν).
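On a finite product space the content of Theorem 2.17 can be checked directly: the double sum against the product measure equals both iterated sums. The following toy check (hypothetical measures and integrand, chosen only for illustration) verifies this in exact arithmetic:

```python
from fractions import Fraction
from itertools import product

X, Y = [0, 1, 2], [0, 1]
mu = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}
nu = {0: Fraction(1, 4), 1: Fraction(3, 4)}

def f(x, y):
    return Fraction(x * y + 1)

# Double integral with respect to the product measure mu x nu.
double = sum(f(x, y) * mu[x] * nu[y] for x, y in product(X, Y))

# Iterated integrals in both orders.
iter_xy = sum(sum(f(x, y) * nu[y] for y in Y) * mu[x] for x in X)
iter_yx = sum(sum(f(x, y) * mu[x] for x in X) * nu[y] for y in Y)

assert double == iter_xy == iter_yx
```

For finite sums the equality is just distributivity; Fubini's theorem is the statement that the same interchange is legitimate for genuine integrals, under the integrability hypotheses stated above.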

The next theorem gives a useful version of Fubini’s theorem:

Theorem 2.18 (Fubini) Assume that (X, M, μ) and (Y, N, ν) are σ-finite measure spaces. Then we have the following two assertions:

(i) If f(x, y) is a μ × ν-integrable function on X × Y, then the function f_x(y) on Y is ν-integrable for μ-almost all x ∈ X, and the function f^y(x) is μ-integrable for ν-almost all y ∈ Y. Furthermore, the function g(x), defined by the formula

g(x) = ∫_Y f_x(y) dν(y) = ∫_Y f(x, y) dν(y)

for μ-almost all x ∈ X, is μ-integrable, and the function h(y), defined by the formula

h(y) = ∫_X f^y(x) dμ(x) = ∫_X f(x, y) dμ(x)

for ν-almost all y ∈ Y, is ν-integrable, respectively, and we have the formula

∫_{X×Y} f(x, y) d(μ × ν) = ∫_X g(x) dμ = ∫_Y h(y) dν.

(ii) Conversely, if f(x, y) is an M ⊗ N-measurable function on X × Y, then the functions


ϕ(x) = ∫_Y |f(x, y)| dν(y) for x ∈ X,
ψ(y) = ∫_X |f(x, y)| dμ(x) for y ∈ Y,

are M-measurable and N-measurable, respectively, and we have the formula

∫_{X×Y} |f(x, y)| d(μ × ν) = ∫_X ϕ(x) dμ = ∫_Y ψ(y) dν.

Furthermore, if either ϕ(x) or ψ(y) is integrable, then f(x, y) is integrable, and part (i) applies.

Let (X, M, μ) and (Y, N, ν) be complete, σ-finite measure spaces, and let (X × Y, L, μ × ν) be the completion of (X × Y, M ⊗ N, μ × ν). The next theorem is a version of Fubini’s theorem (Theorem 2.17) for complete measures:

Theorem 2.19 (Fubini) Let (X × Y, L, μ × ν) be the completion of the product measure space (X × Y, M ⊗ N, μ × ν). Then we have the following three assertions:

(i) If E ∈ L, then it follows that E_x ∈ N for μ-almost all x ∈ X and that E^y ∈ M for ν-almost all y ∈ Y. Moreover, ν(E_x) is an M-measurable function of x and μ(E^y) is an N-measurable function of y, respectively, and we have the formula

∫_X ν(E_x) dμ(x) = ∫_Y μ(E^y) dν(y) = (μ × ν)(E).

In particular, if (μ × ν)(E) < ∞, then it follows that ν(E_x) < ∞ for μ-almost all x ∈ X and that μ(E^y) < ∞ for ν-almost all y ∈ Y.

(ii) If f(x, y) is an L-measurable function on the product space X × Y such that f(x, y) ≥ 0 for μ × ν-almost all (x, y) ∈ X × Y, then it follows that f_x(y) is an N-measurable function of y for μ-almost all x ∈ X and that f^y(x) is an M-measurable function of x for ν-almost all y ∈ Y. Moreover, the functions

X ∋ x ⟼ g(x) = ∫_Y f(x, y) dν(y),
Y ∋ y ⟼ h(y) = ∫_X f(x, y) dμ(x)

are M-measurable and N-measurable, respectively, and the formula

∫_X g(x) dμ(x) = ∫_Y h(y) dν(y) = ∫_{X×Y} f(x, y) d(μ × ν)

holds true.


(iii) If f(x, y) is an L-measurable and μ × ν-integrable function on X × Y, then it follows that the function f_x(y) is ν-integrable for μ-almost all x ∈ X and that the function f^y(x) is μ-integrable for ν-almost all y ∈ Y. Furthermore, the function g(x) is μ-integrable on X and the function h(y) is ν-integrable on Y, respectively, and we have the formula

∫_X g(x) dμ(x) = ∫_Y h(y) dν(y) = ∫_{X×Y} f(x, y) d(μ × ν).

The next corollary gives a useful version of Fubini’s theorem for complete measures:

Corollary 2.20 Let (X × Y, L, μ × ν) be the completion of the product measure space (X × Y, M ⊗ N, μ × ν). If f(x, y) is an L-measurable function on X × Y, then the functions

ϕ(x) = ∫_Y |f(x, y)| dν(y) for x ∈ X,
ψ(y) = ∫_X |f(x, y)| dμ(x) for y ∈ Y,

are M-measurable and N-measurable, respectively, and we have the formula

∫_{X×Y} |f(x, y)| d(μ × ν) = ∫_X ϕ(x) dμ(x) = ∫_Y ψ(y) dν(y).

Furthermore, if either ϕ(x) or ψ(y) is integrable, then f(x, y) is integrable, and Theorem 2.19 applies.

We consider integration on product spaces. Let (X, M, μ) and (Y, N, ν) be σ-finite measure spaces, and μ × ν the product measure of μ and ν. If f is an M ⊗ N-measurable function on X × Y such that its integral exists, then we customarily write

∫_{X×Y} f(x, y) d(μ × ν)(x, y).

This integral is called the double integral of f. If it happens that the function

g(x) = ∫_Y f(x, y) dν(y) for x ∈ X

is defined and also its integral exists, then we denote the integral ∫_X g dμ by any one of the following notation:

∫_X (∫_Y f(x, y) dν(y)) dμ(x),   ∫_X dμ(x) ∫_Y f(x, y) dν(y),
∫∫_{X×Y} f(x, y) dν(y) dμ(x),   ∫∫_{X×Y} f dν dμ.

Similarly, we write

∫_Y (∫_X f(x, y) dμ(x)) dν(y),   ∫_Y dν(y) ∫_X f(x, y) dμ(x),
∫∫_{X×Y} f(x, y) dμ(x) dν(y),   ∫∫_{X×Y} f dμ dν.

These integrals are called the iterated integrals of f. The next theorem describes the most important relationship between double integrals and iterated integrals:

Theorem 2.21 (Fubini) (i) If f is a μ × ν-integrable function on X × Y, then the function f_x on Y, defined by f_x(y) = f(x, y) for y ∈ Y, is ν-integrable for μ-almost all x ∈ X, and the function f^y on X, defined by the formula f^y(x) = f(x, y) for x ∈ X, is μ-integrable for ν-almost all y ∈ Y. Furthermore, the function g(x), defined by the formula

g(x) = ∫_Y f_x(y) dν(y) = ∫_Y f(x, y) dν(y)

for μ-almost all x ∈ X, is μ-integrable, and the function h(y), defined by the formula

h(y) = ∫_X f^y(x) dμ(x) = ∫_X f(x, y) dμ(x)

for ν-almost all y ∈ Y, is ν-integrable; and we have the formula

∫_{X×Y} f d(μ × ν) = ∫_X g dμ = ∫_Y h dν.

(ii) Conversely, if f is an M ⊗ N-measurable function on X × Y, then the functions

ϕ(x) = ∫_Y |f(x, y)| dν(y) for x ∈ X,
ψ(y) = ∫_X |f(x, y)| dμ(x) for y ∈ Y,

are M-measurable and N-measurable, respectively; and we have the formula

∫_{X×Y} |f| d(μ × ν) = ∫_X ϕ dμ = ∫_Y ψ dν.

Furthermore, if either ϕ or ψ is integrable, then f is integrable, and part (i) applies.
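The integrability hypothesis in part (ii) cannot be dropped. A standard illustration (not from the book) uses counting measure on N × N with a(m, n) = 1 if n = m, a(m, n) = −1 if n = m + 1, and 0 otherwise: the two iterated sums exist but differ, which is consistent with Theorem 2.21 because Σ|a| = ∞. A truncated numerical check:

```python
def a(m, n):
    if n == m:
        return 1
    if n == m + 1:
        return -1
    return 0

N = 50  # rows/columns checked; inner sums below are exact for these indices

# Sum over n first, then m: every row m contains +1 (at n = m) and -1
# (at n = m + 1), so each row sums to 0 and the iterated sum is 0.
row_sums = [sum(a(m, n) for n in range(N + 2)) for m in range(N)]
assert all(s == 0 for s in row_sums)
assert sum(row_sums) == 0

# Sum over m first, then n: column 0 contains only +1, while every later
# column contains both +1 and -1, so the iterated sum is 1.
col_sums = [sum(a(m, n) for m in range(N + 2)) for n in range(N)]
assert col_sums[0] == 1 and all(s == 0 for s in col_sums[1:])
assert sum(col_sums) == 1
```

Since 0 ≠ 1, the order of summation matters here; Fubini's theorem does not apply because a is not integrable with respect to the product counting measure.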

2.20 Notes and Comments

For topological spaces, see Friedman [66], Jameson [96], Schaefer [157] and Treves [223]. For the theory of measure and integration, see Folland [63], Lang [112] and Rudin [153]. This chapter is adapted from these books in such a way as to make it accessible to graduate students and advanced undergraduates as well.

Chapter 3

A Short Course in Probability Theory

This chapter is intended as a short introduction to probability theory. First, we present a brief dictionary of probabilists’ dialect due to Folland [63, Chap. 10] (see Table 3.1 below). Section 3.1 serves to illustrate some results of measure theory, since measure spaces are the natural setting for the study of probability. In particular, we prove the monotone class theorem (Theorem 3.2) and the Dynkin class theorem (Corollary 3.3), which will be useful for the study of measurability of functions in Chap. 12. In Sect. 3.2 we introduce probability spaces, and in Sect. 3.3 we consider random variables and their expectations. One of the most important concepts in probability theory is that of independence. It is the concept of independence more than anything else which gives probability theory a life of its own, distinct from other branches of analysis. In Sect. 3.4 we study independent events, independent random variables and independent algebras. In Sect. 3.6, as an application of the Radon–Nikodým theorem (Theorem 2.15), we introduce conditional probabilities and conditional expectations (Definitions 3.24, 3.27 and 3.33). Sect. 3.7 is devoted to the general theory of conditional expectations, which will play a vital role in the study of Markov processes in Chap. 12.

3.1 Measurable Spaces and Functions

This section serves to illustrate some results of measure theory, since measure spaces are the natural setting for the study of probability. We study measurable spaces and measurable functions. In particular, we prove the monotone class theorem (Theorem 3.2) and the Dynkin class theorem (Corollary 3.3) which will be useful for the study of measurability of functions in Chap. 12.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 K. Taira, Functional Analytic Techniques for Diffusion Processes, Springer Monographs in Mathematics, https://doi.org/10.1007/978-981-19-1099-9_3


Table 3.1 A brief dictionary of probabilists’ dialect

Real analysis | Probability
Measure space (X, M, μ) | Probability space (Ω, F, P)
σ-algebra | σ-field
Measurable set | Event
Real-valued measurable function f(x) | Random variable X(ω)
Integral of f: ∫_X f dμ | Expectation (mean) of X: E(X) = ∫_Ω X(ω) dP
L^p norm: ∫_X |f|^p dμ | p-th moment: E(|X|^p)
Convergence in measure | Convergence in probability
Almost every(where) (a.e.) | Almost sure(ly) (a.s.)
Borel probability measure on R | Distribution
Fourier transform of a measure | Characteristic function of a distribution
Characteristic function | Indicator function

3.1.1 The Monotone Class Theorem

Let X be a non-empty set. Now we introduce a new class of subsets of X which is closely related to σ-algebras:

Definition 3.1 Let F be a collection of subsets of X.

(i) F is called a π-system in X if it is closed under finite intersections.
(ii) F is called a d-system in X if it has the following three properties (a), (b) and (c):

(a) The set X itself belongs to F.
(b) If A, B ∈ F and A ⊂ B, then the difference B \ A belongs to F.
(c) If {A_n}_{n=1}^∞ is an increasing sequence of members of F, then the union ⋃_{n=1}^∞ A_n belongs to F.

It should be emphasized that a collection F is a σ -algebra if and only if it is both a π -system and a d-system. For any collection F of subsets of X , there exists a smallest d-system d(F) which contains F. Indeed, it suffices to note that the intersection of an arbitrary number of d-systems is again a d-system. The next theorem gives a useful criterion for the d-system d(F) to be a σ -algebra: Theorem 3.2 (the monotone class theorem) If a collection F of subsets of X is a π -system, then it follows that d(F) = σ (F).


Proof Since we have the assertion d(F) ⊂ σ(F), we have only to show that d(F) is a σ-algebra. To do this, it suffices to show that d(F) is a π-system. The proof is divided into two steps.

Step 1: First, we let

D_1 := {B ∈ d(F) : B ∩ A ∈ d(F) for all A ∈ F}.

Then it is easy to verify that D_1 is a d-system and further that F ⊂ D_1, since F is a π-system. Hence we have the assertion d(F) ⊂ D_1, and so D_1 = d(F).

Step 2: Secondly, we let

D_2 := {B ∈ d(F) : B ∩ A ∈ d(F) for all A ∈ d(F)}.

Again, it is easy to verify that D_2 is a d-system. Moreover, if A is an arbitrary element of F, then we have, for all B ∈ D_1 = d(F),

B ∩ A ∈ d(F).

This proves that F ⊂ D_2. Hence we have the assertion d(F) ⊂ D_2, and so D_2 = d(F). This implies that d(F) is closed under finite intersections, that is, d(F) is a π-system.

The proof of Theorem 3.2 is complete. ∎

The next version of the monotone class theorem will be useful for the study of measurability of functions in Chap. 12:


Corollary 3.3 (the Dynkin class theorem) Let F be a π -system. If D is a d-system which contains F, then it follows that σ (F) ⊂ D. Indeed, it follows from an application of Theorem 3.2 that σ (F) = d(F) ⊂ D, since D is a d-system which contains F.
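On a finite set both closures in Theorem 3.2 can be computed by brute force, since an increasing union of finitely many sets equals its largest term. The following sketch (with a hypothetical π-system of my own choosing) confirms d(F) = σ(F) computationally:

```python
from itertools import combinations

Xset = frozenset({1, 2, 3, 4})
# A pi-system on X = {1, 2, 3, 4}: closed under intersections.
F = {frozenset({1, 2}), frozenset({2, 3}), frozenset({2})}

def d_closure(fam):
    # Smallest d-system containing fam. We seed with X and the empty set
    # (= X \ X); on a finite set only proper differences B \ A, A subset of B,
    # remain to be added, since increasing unions stabilize.
    fam = set(fam) | {Xset, frozenset()}
    changed = True
    while changed:
        changed = False
        for A, B in list(combinations(fam, 2)):
            for S, T in ((A, B), (B, A)):
                if S <= T and (T - S) not in fam:
                    fam.add(T - S)
                    changed = True
    return fam

def sigma_closure(fam):
    # Smallest sigma-algebra containing fam: close under complements and
    # (finite) unions, which suffices on a finite set.
    fam = set(fam) | {Xset, frozenset()}
    changed = True
    while changed:
        changed = False
        for A in list(fam):
            if (Xset - A) not in fam:
                fam.add(Xset - A); changed = True
        for A, B in list(combinations(fam, 2)):
            if (A | B) not in fam:
                fam.add(A | B); changed = True
    return fam

# Theorem 3.2: for a pi-system F, d(F) = sigma(F).
assert d_closure(F) == sigma_closure(F)
```

Here the generated σ-algebra turns out to be the full power set, since the three generators separate all four points.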

3.1.2 The Approximation Theorem

Let X be a non-empty set equipped with a collection D of subsets of X. An extended real-valued set function μ is a function defined on D taking extended real numbers. The collection D is called the domain of definition of μ.

Let μ be a set function defined on an algebra A on X. The set function μ is said to be finitely additive if we have, for any m ∈ N,

μ(⋃_{j=1}^m A_j) = Σ_{j=1}^m μ(A_j)

provided that the A_j are mutually disjoint sets of A. We say that μ is countably additive if we have the formula

μ(⋃_{j=1}^∞ A_j) = Σ_{j=1}^∞ μ(A_j)

provided that the A_j are mutually disjoint sets of A such that ⋃_{j=1}^∞ A_j ∈ A.

Let (X, M) be a measurable space. An extended real-valued function μ defined on the σ-algebra M is called a non-negative measure or simply a measure if it has the following three properties (M1), (M2) and (M3):

(M1) 0 ≤ μ(E) ≤ ∞ for all E ∈ M.
(M2) μ(∅) = 0.
(M3) The function μ is countably additive, that is,

μ(⋃_{j=1}^∞ E_j) = Σ_{j=1}^∞ μ(E_j)

for any disjoint countable collection {E_j}_{j=1}^∞ of members of M.


The triplet (X, M, μ) is called a measure space. In other words, a measure space is a measurable space which has a non-negative measure defined on the σ-algebra of its measurable sets. If μ(X) < ∞, then the measure μ is called a finite measure and the space (X, M, μ) is called a finite measure space. If X is a countable union of sets of finite measure, then the measure μ is said to be σ-finite on X. We also say that the measure space (X, M, μ) is σ-finite.

Some basic properties of measures are summarized in the following four properties (a)–(d):

(a) (Monotonicity) If E, F ∈ M and E ⊂ F, then μ(E) ≤ μ(F).
(b) (Subadditivity) If {E_j}_{j=1}^∞ ⊂ M, then we have the inequality

μ(⋃_{j=1}^∞ E_j) ≤ Σ_{j=1}^∞ μ(E_j).

(c) (Continuity from below) If {E_j}_{j=1}^∞ ⊂ M and if E_1 ⊂ E_2 ⊂ ..., then we have the formula

μ(⋃_{j=1}^∞ E_j) = lim_{j→∞} μ(E_j).

(d) (Continuity from above) If {E_j}_{j=1}^∞ ⊂ M and if E_1 ⊃ E_2 ⊃ ... and μ(E_1) < ∞, then we have the formula

μ(⋂_{j=1}^∞ E_j) = lim_{j→∞} μ(E_j).

Let (X, M, μ) be a measure space, and let A be an algebra in M. The next approximation theorem asserts that every set in the σ-algebra σ(A) can be approximated by sets in the algebra A:

Theorem 3.4 (the approximation theorem) If Λ ∈ σ(A) and μ(Λ) < ∞, then there exists a sequence {A_n}_{n=1}^∞ in A such that

lim_{n→∞} μ(Λ △ A_n) = 0,   (3.1)

where A △ B = (A \ B) ∪ (B \ A) is the symmetric difference of A and B.

Proof We let

C := {Λ ∈ M : condition (3.1) holds true}.


We have only to show that C is a σ-algebra which contains A. Indeed, we then have the assertion σ(A) ⊂ C. Namely, the desired condition (3.1) holds true for all Λ ∈ σ(A).

(a) First, it follows that A ⊂ C. Indeed, it suffices to take A_n := Λ if Λ ∈ A.

(b) Secondly, if Λ ∈ C, then it is easy to see that Λ^c △ A_n^c = Λ △ A_n, so that

lim_{n→∞} μ(Λ^c △ A_n^c) = lim_{n→∞} μ(Λ △ A_n) = 0.

This proves that Λ^c ∈ C, since A_n^c ∈ A.

(c) Thirdly, if {Λ_n}_{n=1}^∞ ⊂ C, then it follows that Λ := ⋃_{n=1}^∞ Λ_n ∈ C. Indeed, since μ(Λ) < ∞, it follows from the continuity from above of the measure μ that, for any given ε > 0 there exists a positive number N = N(ε) such that

μ(Λ \ ⋃_{n=1}^N Λ_n) < ε.


F_X(x) has a jump of size δ > 0 at x = a if and only if P(X = a) = δ. In particular, F_X(x) is continuous at x = a if and only if P(X = a) = 0. It is a general principle that all properties of random variables which are relevant to probability theory can be formulated in terms of their distributions.

The integral

∫_Ω X(ω) dP

X (ω) d P

is called the expectation or mean of X, and is denoted by E(X). When we speak of E(X), it is understood that the integral of |X(ω)| is finite. If Λ ∈ F, we let

E(X; Λ) := E(X χ_Λ) = ∫_Ω X(ω) χ_Λ(ω) dP = ∫_Λ X(ω) dP.

Some basic properties of the expectations are summarized in the following seven properties (E1)–(E7):

(E1) E(X) exists if and only if E(|X|) exists.
(E2) If either E(|X|) < ∞ or E(|Y|) < ∞, then we have, for all a, b ∈ R,

E(aX + bY) = aE(X) + bE(Y).

(E3) If X = c for some constant c almost everywhere in Ω, then E(X) = c.
(E4) If X = Y almost everywhere in Ω, then E(X) = E(Y).
(E5) If X ≤ Y almost everywhere in Ω, then E(X) ≤ E(Y).
(E6) If X ≥ 0 almost everywhere in Ω, then E(X) ≥ 0.
(E7) |E(X)| ≤ E(|X|).
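On a finite probability space all seven properties reduce to elementary facts about weighted sums, and can be checked directly. A small sketch (the space, weights and random variables are hypothetical choices of mine):

```python
from fractions import Fraction

# A finite probability space: positive weights summing to 1.
Omega = ["w1", "w2", "w3"]
P = {"w1": Fraction(1, 2), "w2": Fraction(1, 3), "w3": Fraction(1, 6)}
assert sum(P.values()) == 1

def E(X):
    # Expectation as a finite weighted sum.
    return sum(X[w] * P[w] for w in Omega)

X = {"w1": Fraction(2), "w2": Fraction(-1), "w3": Fraction(4)}
Y = {"w1": Fraction(0), "w2": Fraction(5), "w3": Fraction(1)}
a, b = Fraction(3), Fraction(-2)

# (E2) linearity and (E7) the triangle inequality for expectations.
aXbY = {w: a * X[w] + b * Y[w] for w in Omega}
assert E(aXbY) == a * E(X) + b * E(Y)
assert abs(E(X)) <= E({w: abs(X[w]) for w in Omega})
```

The same identities hold for general integrable random variables by the corresponding properties of the Lebesgue integral.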

We remark that expectations can be computed using P_X or F_X instead of the integral over Ω as follows:

E(X) = ∫_Ω X(ω) dP = ∫_R x dP_X = ∫_{−∞}^{∞} x dF_X(x),   (3.10)

where the last expression is interpreted as an improper Riemann–Stieltjes integral and the third one is interpreted as a Lebesgue–Stieltjes integral.

The second equality in formula (3.10) is a special case of the following measure-theoretic construction: Let (Ω′, F′) be another measurable space, and assume that a


mapping φ : Ω → Ω′ is measurable in the sense that φ^{−1}(A′) ∈ F for every A′ ∈ F′. Then the measure P induces an image measure P_φ on (Ω′, F′) by the formula

P_φ(A′) = P(φ^{−1}(A′)) for every A′ ∈ F′.   (3.11)

Indeed, it suffices to note that the mapping φ^{−1} preserves unions and intersections. Then we have the following theorem:

Theorem 3.13 If X′ is a measurable function from (Ω′, F′) into (R, B(R)), then the composite function X(ω) = X′(φ(ω)) is a random variable on (Ω, F, P), and we have the formula

E(X) = ∫_Ω X(ω) dP = ∫_{Ω′} X′(ω′) dP_φ,   (3.12)

where the existence of either side implies that of the other.

Proof First, it is clear that X = X′ ∘ φ is F-measurable. The proof of formula (3.12) is divided into three steps.

Step 1: If X′ is a characteristic function of a set A′ ∈ F′, then it follows that X is also a characteristic function of the set φ^{−1}(A′):

X = X′ ∘ φ = χ_{A′} ∘ φ = χ_{φ^{−1}(A′)}.

Hence we have, by definition (3.11),

E(X) = P(φ^{−1}(A′)) = P_φ(A′) = ∫_{Ω′} χ_{A′}(ω′) dP_φ = ∫_{Ω′} X′(ω′) dP_φ.

This proves the desired formula (3.12) for characteristic functions. The extension to simple functions follows by taking finite linear combinations, since the mapping φ^{−1} preserves unions and intersections.

Step 2: If X′ is a non-negative, F′-measurable function, then we can find an increasing sequence of simple functions {X′_n} which converges to X′. Hence it follows from an application of the monotone convergence theorem (Theorem 2.10) that

lim_{n→∞} ∫_{Ω′} X′_n(ω′) dP_φ = ∫_{Ω′} X′(ω′) dP_φ.   (3.13)

However, we remark that the composite functions X_n = X′_n ∘ φ are an increasing sequence of simple functions which converges to X. By applying again the monotone convergence theorem, we obtain that

E(X) = ∫_Ω X(ω) dP = lim_{n→∞} ∫_Ω X_n(ω) dP = lim_{n→∞} E(X_n).   (3.14)


Since formula (3.12) holds true for the simple functions X_n and X′_n, it follows from formulas (3.14) and (3.13) that

E(X) = lim_{n→∞} E(X_n) = lim_{n→∞} ∫_{Ω′} X′_n(ω′) dP_φ = ∫_{Ω′} X′(ω′) dP_φ.

This proves the desired formula (3.12) for non-negative, F′-measurable functions.

Step 3: Finally, the general case of formula (3.12) follows by applying Step 2 separately to the positive and negative parts of X′:

X′(ω′) = X′⁺(ω′) − X′⁻(ω′) for ω′ ∈ Ω′,

where

X′⁺(ω′) = max{X′(ω′), 0},
X′⁻(ω′) = max{−X′(ω′), 0}.

Indeed, since X = X′ ∘ φ, we have the formulas

X′⁺(φ(ω)) = X⁺(ω) = max{X(ω), 0},
X′⁻(φ(ω)) = X⁻(ω) = max{−X(ω), 0},

and so we obtain from Step 2 that

∫_{Ω′} X′(ω′) dP_φ = ∫_{Ω′} X′⁺(ω′) dP_φ − ∫_{Ω′} X′⁻(ω′) dP_φ
= ∫_Ω X⁺(ω) dP − ∫_Ω X⁻(ω) dP = ∫_Ω X(ω) dP
= E(X).

This proves the desired formula (3.12) for general F′-measurable functions. Moreover, it is easily seen that E(X) exists if and only if X′ is integrable with respect to P_φ.

The proof of Theorem 3.13 is complete. ∎

Corollary 3.14 Let g(x) be a Borel measurable function from (R^n, B(R^n)) into (R, B(R)). If a vector-valued function X = (X_1, X_2, ..., X_n) is a random variable on Ω and if the expectation E(g(X)) exists, then we have the formula

E(g(X)) = ∫_Ω g(X_1(ω), X_2(ω), ..., X_n(ω)) dP
        = ∫_{−∞}^{∞} ··· ∫_{−∞}^{∞} g(x_1, x_2, ..., x_n) dP_X.   (3.15)
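The change-of-variables identity behind Theorem 3.13 and Corollary 3.14 is transparent on finite spaces: integrating X′ ∘ φ against P equals integrating X′ against the image measure P_φ. A small sketch with hypothetical data:

```python
from fractions import Fraction

# Finite probability space and a measurable map phi into another finite space.
Omega = [0, 1, 2, 3]
P = {0: Fraction(1, 4), 1: Fraction(1, 4), 2: Fraction(1, 4), 3: Fraction(1, 4)}
phi = {0: "a", 1: "a", 2: "b", 3: "b"}
Omega2 = ["a", "b"]

# Image measure P_phi(A') = P(phi^{-1}(A')), determined by its point masses.
P_phi = {w2: sum(P[w] for w in Omega if phi[w] == w2) for w2 in Omega2}

# A function X' on the target space; X = X' o phi is a random variable on Omega.
Xprime = {"a": Fraction(7), "b": Fraction(-3)}

lhs = sum(Xprime[phi[w]] * P[w] for w in Omega)    # E(X' o phi)
rhs = sum(Xprime[w2] * P_phi[w2] for w2 in Omega2)  # integral of X' dP_phi
assert lhs == rhs
```

This is exactly formula (3.12) with every integral a finite sum; the proof above extends the same bookkeeping to general measurable functions via simple functions and monotone convergence.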


Indeed, Corollary 3.14 follows from an application of Theorem 3.13 with

Ω′ := R^n, F′ := B(R^n), φ := X, X′ := g, X := g(X).

Now, if (Ω_1, F_1) and (Ω_2, F_2) are measurable spaces, then we let

Ω := Ω_1 × Ω_2, F := F_1 ⊗ F_2

be the Cartesian product of the measurable spaces (Ω_1, F_1) and (Ω_2, F_2). We recall that a rectangle is a set of the form A_1 × A_2 where A_1 ∈ F_1 and A_2 ∈ F_2, and further that the collection A of finite disjoint unions of rectangles forms an algebra. Moreover, we have the assertion

σ(A) = F = F_1 ⊗ F_2.

If A ∈ F, then the ω_1-section A_{ω_1} of A is defined by the formula

A_{ω_1} = {ω_2 ∈ Ω_2 : (ω_1, ω_2) ∈ A} for ω_1 ∈ Ω_1.

If X is an F-measurable function on Ω, then the ω_1-section X_{ω_1} of X is defined by the formula

X_{ω_1}(ω_2) = X(ω_1, ω_2) for ω_2 ∈ Ω_2.

The next theorem will be useful for the study of measurability of functions in Chap. 12:

Theorem 3.15 Let P(ω_1, A_2) be a function defined on Ω_1 × F_2. Assume that the following two conditions (i) and (ii) are satisfied:

(i) For each ω_1 ∈ Ω_1, P(ω_1, ·) is a probability measure on (Ω_2, F_2).
(ii) For each A_2 ∈ F_2, P(·, A_2) is an F_1-measurable function on Ω_1.

If h is a bounded, F-measurable function on Ω, we let

H(ω_1) := ∫_{Ω_2} h(ω_1, ω_2) P(ω_1, dω_2).

Then H is a bounded, F_1-measurable function on Ω_1.

Proof The proof is divided into two steps.

Step 1: We prove the boundedness of H on Ω_1. First, since h(ω_1, ·) is F_2-measurable for all ω_1, it follows that the function H is well-defined. Moreover, we have the inequality

|H(ω_1)| = |∫_{Ω_2} h(ω_1, ω_2) P(ω_1, dω_2)| ≤ ∫_{Ω_2} |h(ω_1, ω_2)| P(ω_1, dω_2)
         ≤ sup_Ω |h| ∫_{Ω_2} P(ω_1, dω_2) = sup_Ω |h| · P(ω_1, Ω_2)
         = sup_Ω |h|.

This proves that H is bounded on Ω_1.

Step 2: Secondly, we prove the F_1-measurability of H.

Step 2-1: If h = χ_{A_1 × A_2} is a characteristic function with A_1 ∈ F_1 and A_2 ∈ F_2, then it follows that

H(ω_1) = χ_{A_1}(ω_1) ∫_{Ω_2} χ_{A_2}(ω_2) P(ω_1, dω_2) = χ_{A_1}(ω_1) P(ω_1, A_2).

This proves that H is F_1-measurable, since χ_{A_1} and P(·, A_2) are F_1-measurable.

Step 2-2: Let A be the collection of finite disjoint unions of rectangles in Ω. If h = χ_A with A ∈ A, then it follows that h is a simple function of the form

h = Σ_{j=1}^k a_j χ_{A_j^{(1)} × A_j^{(2)}}, A_j^{(1)} ∈ F_1, A_j^{(2)} ∈ F_2.

Hence we have the formula

H(ω_1) = Σ_{j=1}^k a_j χ_{A_j^{(1)}}(ω_1) P(ω_1, A_j^{(2)}).

By Step 2-1, this proves that H is F_1-measurable.

Step 2-3: We let

M := {A ∈ F : ω_1 ⟼ ∫_{Ω_2} χ_A(ω_1, ω_2) P(ω_1, dω_2) is F_1-measurable}.

By Step 2-2, it follows that A ⊂ M. Moreover, by applying the monotone convergence theorem (Theorem 2.10) we obtain that M is a d-system. Therefore, it follows from an application of the Dynkin class theorem (Corollary 3.3) that F = σ (A) ⊂ M ⊂ F, so that M = F.


This proves that the function

ω_1 ⟼ ∫_{Ω_2} χ_A(ω_1, ω_2) P(ω_1, dω_2)

is F_1-measurable for every A ∈ F.

Step 2-4: If h is a general simple function of the form

h = Σ_{j=1}^k a_j χ_{A_j} with A_j ∈ F,

then it follows that

H(ω_1) = Σ_{j=1}^k a_j ∫_{Ω_2} χ_{A_j}(ω_1, ω_2) P(ω_1, dω_2).

By Step 2-3, this proves that H is F_1-measurable.

Step 2-5: If h is a bounded, F-measurable function on Ω, it can be decomposed into the positive and negative parts:

h(ω_1, ω_2) = h⁺(ω_1, ω_2) − h⁻(ω_1, ω_2), (ω_1, ω_2) ∈ Ω = Ω_1 × Ω_2,

where

h⁺(ω_1, ω_2) = max{h(ω_1, ω_2), 0},
h⁻(ω_1, ω_2) = max{−h(ω_1, ω_2), 0}.

However, we know that the function h⁺ is a pointwise limit of an increasing sequence {h_n⁺}_{n=1}^∞ of non-negative, F-measurable simple functions and that the function h⁻ is a pointwise limit of an increasing sequence {h_n⁻}_{n=1}^∞ of non-negative, F-measurable simple functions. Hence it follows from an application of the monotone convergence theorem (Theorem 2.10) that

∫_{Ω_2} h^±(ω_1, ω_2) P(ω_1, dω_2) = lim_{n→∞} ∫_{Ω_2} h_n^±(ω_1, ω_2) P(ω_1, dω_2).

By Step 2-4, we find that the functions

ω_1 ⟼ ∫_{Ω_2} h^±(ω_1, ω_2) P(ω_1, dω_2)

are F_1-measurable. Summing up, we have proved that the function


H(ω_1) = ∫_{Ω_2} h(ω_1, ω_2) P(ω_1, dω_2)
       = ∫_{Ω_2} h⁺(ω_1, ω_2) P(ω_1, dω_2) − ∫_{Ω_2} h⁻(ω_1, ω_2) P(ω_1, dω_2)

is F_1-measurable.

The proof of Theorem 3.15 is complete. ∎

3.4.1 Independent Events Let (Ω, F, P) be a probability space. Two events A and B in F are said to be independent if the following product rule holds true: P(A ∩ B) = P(A)P(B). A collection {E 1 , E 2 , . . . , E n } of events in F is said to be independent if the product rule holds true for every subcollection of them, that is, if every subcollection {E i1 , E i2 , . . . , E ik } satisfies the condition P(E i1 ∩ E i2 ∩ . . . ∩ E ik ) = P(E i1 )P(E i2 ) . . . P(E ik ).

(3.16)

A collection A = {E i : i ∈ I } of events in F, where I is a finite or infinite index set, is said to be independent if every finite subcollection of A is independent, that is, if condition (3.16) holds true for all k ∈ N and all distinct i 1 , i 2 , . . ., i k ∈ I . We remark that if A = {E i : i ∈ I } is an independent class, then so is the class A obtained by replacing the E i in any subclass of A by either ∅, Ω or E ic . For example, if A and B are independent, then the following product rules hold true: P(A ∩ B c ) = P(A)P(B c ), P(Ac ∩ B) = P(Ac )P(B), P(Ac ∩ B c ) = P(Ac )P(B c ).

3.4 Independence

103

3.4.2 Independent Random Variables Let (Ω, F, P) be a probability space. A collection {X 1 , X 2 , . . . , X n } of random variables on Ω is said to be independent if the events E 1 = X −1 (B1 ), E 2 = X −1 (B2 ), . . ., E n = X −1 (Bn ) satisfy condition (3.16) for every choice of Borel sets B1 , B2 , . . ., Bn ∈ B(R): k



P X i j ∈ Bi j . P X i1 ∈ Bi1 , X i2 ∈ Bi2 , . . . , X ik ∈ Bik =

(3.17)

j=1

A collection C = {X i : i ∈ I } of random variables on Ω, where I is a finite or infinite index set, is said to be independent if every finite subcollection of C is independent, that is, if condition (3.17) holds true for all k ∈ N and all distinct i 1 , i 2 , . . ., i k ∈ I . For any finite sequence {X 1 , X 2 , . . . , X n } of random variables on Ω, we consider (X 1 , X 2 , . . . , X n ) as a map of Ω into Rn (X 1 , X 2 , . . . , X n ) : Ω −→ Rn , and define the image measure P(X 1 ,X 2 ,...,X n ) on (Rn , B(Rn )) by the formula

P(X 1 ,X 2 ,...,X n ) (B) = P (X 1 , X 2 , . . . , X n )−1 (B) for every Borel set B ∈ B(Rn ). The probability measure P(X 1 ,X 2 ,...,X n ) on Rn is called the joint distribution of (X 1 , X 2 , . . . , X n ). The next theorem gives a characterization of independent random variables in terms of their joint distributions: Theorem 3.16 A collection {X i : i ∈ I } of random variables on Ω is independent if and only if the joint distribution P(X α1 ,X α2 ,...,X αn ) of any finite set {X α1 , X α2 , . . . , X αn } is the product of their individual distributions: P(X α1 ,X α2 ,...,X αn ) =

n 

PX α j on B(Rn ).

(3.18)

j=1

Proof First, we have, for all Borel sets B_1, B_2, ..., B_n ∈ B(R),

P_{(X_{α_1},X_{α_2},...,X_{α_n})}(B_1 × B_2 × ... × B_n)
  = P((X_{α_1}, X_{α_2}, ..., X_{α_n})^{−1}(B_1 × B_2 × ... × B_n))
  = P(X_{α_1}^{−1}(B_1) ∩ X_{α_2}^{−1}(B_2) ∩ ... ∩ X_{α_n}^{−1}(B_n)).   (3.19)

(1) The “only if” part: If we define a measurable rectangle (Borel cylinder set) to be a set of the form


B1 × B2 × . . . × Bn with B1 , B2 , . . . , Bn ∈ B(R), then it follows that the collection A of finite disjoint unions of rectangles forms an algebra and further that A generates the σ -algebra B(Rn ): σ (A) = B(Rn ). If {X i } is independent, then it follows from formula (3.18) that P(X α1 ,X α2 ,...,X αn ) (B1 × B2 . . . × Bn ) n n     = P X α−1j (B j ) = PX α j (B j ) j=1

⎛ =⎝

n 



(3.20)

j=1

PX α j ⎠ (B1 × B2 × . . . × Bn ) .

j=1

We let
$$\mathcal{M} := \left\{ A \in \mathcal{B}(\mathbf{R}^n) : P_{(X_{\alpha_1}, X_{\alpha_2}, \ldots, X_{\alpha_n})}(A) = \left(\prod_{j=1}^{n} P_{X_{\alpha_j}}\right)(A) \right\}.$$
Then we find from formula (3.20) that A ⊂ M. Moreover, it is easy to see that M is a d-system, since P_{(X_{α_1}, X_{α_2}, ..., X_{α_n})} and ∏_{j=1}^{n} P_{X_{α_j}} are measures on B(R^n). Therefore, by applying the Dynkin class theorem (Corollary 3.3) we obtain that σ(A) ⊂ M, so that M = B(R^n). This proves the desired formula (3.18).

(2) The “if” part: If formula (3.18) holds true, then we have, by formulas (3.19) and (3.20),

$$
\begin{aligned}
P\left(X_{\alpha_1}^{-1}(B_1) \cap X_{\alpha_2}^{-1}(B_2) \cap \cdots \cap X_{\alpha_n}^{-1}(B_n)\right)
&= P\left((X_{\alpha_1}, X_{\alpha_2}, \ldots, X_{\alpha_n})^{-1}(B_1 \times B_2 \times \cdots \times B_n)\right) \\
&= P_{(X_{\alpha_1}, X_{\alpha_2}, \ldots, X_{\alpha_n})}(B_1 \times B_2 \times \cdots \times B_n) \\
&= \left(\prod_{j=1}^{n} P_{X_{\alpha_j}}\right)(B_1 \times B_2 \times \cdots \times B_n) \\
&= \prod_{j=1}^{n} P_{X_{\alpha_j}}(B_j) = \prod_{j=1}^{n} P\left(X_{\alpha_j}^{-1}(B_j)\right),
\end{aligned}
$$
or equivalently,


$$P\left(X_{\alpha_1} \in B_1,\, X_{\alpha_2} \in B_2,\, \ldots,\, X_{\alpha_n} \in B_n\right) = \prod_{j=1}^{n} P\left(X_{\alpha_j} \in B_j\right).$$
This proves that {X_i} is independent.
The proof of Theorem 3.16 is complete.
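Theorem 3.16 can be illustrated by simulation: for an independently drawn pair, the empirical joint distribution on rectangles should factor into the product of the empirical marginals. The discrete model below (uniform X on three points, a fair bit Y) is an illustrative assumption, not from the text:

```python
import random

random.seed(1)
N = 100_000

# X uniform on {0, 1, 2} and Y uniform on {0, 1}, drawn independently.
joint = {}
margX = [0, 0, 0]
margY = [0, 0]
for _ in range(N):
    x = random.randrange(3)
    y = random.randrange(2)
    joint[(x, y)] = joint.get((x, y), 0) + 1
    margX[x] += 1
    margY[y] += 1

# Theorem 3.16 (empirically): the joint distribution of (X, Y) on rectangles
# {x} x {y} factors as the product of the marginal distributions.
max_gap = max(
    abs(joint.get((x, y), 0) / N - (margX[x] / N) * (margY[y] / N))
    for x in range(3) for y in range(2)
)
print(max_gap)  # small for independent X, Y
```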



3.4.3 Independent Algebras

Let (Ω, F, P) be a probability space. A collection A = {A_1, A_2, ..., A_n} of subalgebras of F is said to be independent if we have, for any events E_i ∈ A_i,
$$P(E_1 \cap E_2 \cap \cdots \cap E_n) = P(E_1)P(E_2)\cdots P(E_n). \tag{3.21}$$

A collection A = {A_i : i ∈ I} of subalgebras of F, where I is an infinite index set, is said to be independent if every finite subcollection of A is independent, that is, if condition (3.21) holds true for all n ∈ N and all distinct i_1, i_2, ..., i_n ∈ I:
$$P\left(E_{i_1} \cap E_{i_2} \cap \cdots \cap E_{i_n}\right) = P(E_{i_1})P(E_{i_2})\cdots P(E_{i_n}) \quad \text{for } E_{i_k} \in \mathcal{A}_{i_k}.$$
If A is an event of Ω, we define the σ-algebra σ(A) as follows:
$$\sigma(A) = \left\{\emptyset,\, A,\, A^c,\, \Omega\right\}.$$
Then it is easy to see that a collection {A_1, A_2, ..., A_n} of events is independent if and only if the collection {σ(A_1), σ(A_2), ..., σ(A_n)} of σ-algebras is independent.

We recall that a collection {X_1, X_2, ..., X_n} of random variables on Ω is independent if the events E_1 = X_1^{-1}(B_1), E_2 = X_2^{-1}(B_2), ..., E_n = X_n^{-1}(B_n) satisfy condition (3.21) for every choice of Borel sets B_1, B_2, ..., B_n ∈ B(R):
$$P\left(X_1 \in B_1,\, X_2 \in B_2,\, \ldots,\, X_n \in B_n\right) = \prod_{j=1}^{n} P\left(X_j \in B_j\right). \tag{3.22}$$

If X is a random variable on Ω, we define the σ-algebra σ(X) by the formula
$$\sigma(X) = \left\{ X^{-1}(A) : A \in \mathcal{B}(\mathbf{R}) \right\}.$$
Then we have the following theorem:

Theorem 3.17 Let C = {X_i : i ∈ I} be a collection of random variables on Ω, where I is a finite or infinite index set. Then the collection C is independent if and only if the collection A = {σ(X_i) : i ∈ I} of σ-algebras is independent.


Proof The proof is divided into two steps.
Step 1: The “only if” part: Let {X_{i_1}, X_{i_2}, ..., X_{i_n}} be an arbitrary finite subcollection of C. If E_{i_k} = X_{i_k}^{-1}(B_{i_k}), B_{i_k} ∈ B(R), is an element of σ(X_{i_k}), we obtain from formula (3.22) that
$$
\begin{aligned}
P\left(\bigcap_{k=1}^{n} E_{i_k}\right) &= P\left(\bigcap_{k=1}^{n} X_{i_k}^{-1}(B_{i_k})\right) = P\left(X_{i_1} \in B_{i_1},\, X_{i_2} \in B_{i_2},\, \ldots,\, X_{i_n} \in B_{i_n}\right) \\
&= \prod_{k=1}^{n} P\left(X_{i_k} \in B_{i_k}\right) = \prod_{k=1}^{n} P\left(X_{i_k}^{-1}(B_{i_k})\right) = \prod_{k=1}^{n} P\left(E_{i_k}\right).
\end{aligned}
$$

This proves the independence of the collection {σ(X_{i_1}), σ(X_{i_2}), ..., σ(X_{i_n})} of σ-algebras.
Step 2: The “if” part: Let {X_{i_1}, X_{i_2}, ..., X_{i_n}} be an arbitrary finite subcollection of C. For any Borel sets B_{i_k} ∈ B(R), it follows that
$$E_{i_k} = X_{i_k}^{-1}(B_{i_k}) \in \sigma\left(X_{i_k}\right).$$
Hence we have, by the independence of {σ(X_{i_1}), σ(X_{i_2}), ..., σ(X_{i_n})},
$$
\begin{aligned}
P\left(X_{i_1} \in B_{i_1},\, X_{i_2} \in B_{i_2},\, \ldots,\, X_{i_n} \in B_{i_n}\right)
&= P\left(\bigcap_{k=1}^{n} X_{i_k}^{-1}(B_{i_k})\right) = P\left(\bigcap_{k=1}^{n} E_{i_k}\right) \\
&= \prod_{k=1}^{n} P\left(E_{i_k}\right) = \prod_{k=1}^{n} P\left(X_{i_k} \in B_{i_k}\right).
\end{aligned}
$$

This proves the independence of the collection {X_{i_1}, X_{i_2}, ..., X_{i_n}} of random variables.
The proof of Theorem 3.17 is complete.

The next theorem asserts that functions of independent random variables are independent:

Theorem 3.18 Let φ_{i,j} : R^{k_i} → R be Borel measurable functions for 1 ≤ i ≤ n, 1 ≤ j ≤ ℓ_i, and let X_i : Ω → R^{k_i} be random variables for 1 ≤ i ≤ n. If the random variables {X_1, X_2, ..., X_n} are independent, then the random variables Y_i, defined by the formula
$$\mathbf{Y}_i = \left(\varphi_{i,1}(\mathbf{X}_i),\, \varphi_{i,2}(\mathbf{X}_i),\, \ldots,\, \varphi_{i,\ell_i}(\mathbf{X}_i)\right) \quad \text{for every } 1 \le i \le n,$$
are independent.
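Before turning to the proof, the content of Theorem 3.18 can be checked exactly on a small discrete model. The two uniform variables and the Borel functions below are illustrative assumptions, not from the text; exact rational arithmetic makes the factorization an identity rather than an approximation:

```python
from itertools import product
from fractions import Fraction

# X uniform on {-1, 0, 1} and Y uniform on {0, 1, 2, 3}, independent by construction.
xs = [-1, 0, 1]
ys = [0, 1, 2, 3]
pX = Fraction(1, 3)
pY = Fraction(1, 4)

f = lambda x: x * x   # a Borel function of X
g = lambda y: y % 2   # a Borel function of Y

# Exact joint distribution of (f(X), g(Y)) by enumeration.
joint = {}
for x, y in product(xs, ys):
    key = (f(x), g(y))
    joint[key] = joint.get(key, Fraction(0)) + pX * pY

# Marginals of f(X) and g(Y).
mf, mg = {}, {}
for (u, v), p in joint.items():
    mf[u] = mf.get(u, Fraction(0)) + p
    mg[v] = mg.get(v, Fraction(0)) + p

# Independence of f(X) and g(Y): the joint factors exactly into the marginals.
assert all(joint[(u, v)] == mf[u] * mg[v] for u in mf for v in mg)
print(joint)
```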


Proof First, we have, for any set B ∈ B(R),
$$\varphi_{i,j}^{-1}(B) \in \mathcal{B}(\mathbf{R}^{k_i}) \quad \text{for } 1 \le i \le n \text{ and } 1 \le j \le \ell_i,$$
$$\mathbf{Y}_i^{-1}(B) = \mathbf{X}_i^{-1}\left(\varphi_{i,j}^{-1}(B)\right) \quad \text{for } 1 \le i \le n,$$
and so
$$\sigma(\mathbf{Y}_i) = \left\{ \mathbf{Y}_i^{-1}(B) : B \in \mathcal{B}(\mathbf{R}^{\ell_i}) \right\} \subset \sigma(\mathbf{X}_i) = \left\{ \mathbf{X}_i^{-1}(A) : A \in \mathcal{B}(\mathbf{R}^{k_i}) \right\}.$$
This proves that σ(Y_i) is a sub-σ-algebra of σ(X_i) for 1 ≤ i ≤ n.
Now we assume that the random variables {X_1, X_2, ..., X_n} are independent. Then it follows from an application of Theorem 3.17 that the collection {σ(X_1), σ(X_2), ..., σ(X_n)} is independent. Hence we find that the collection {σ(Y_1), σ(Y_2), ..., σ(Y_n)} of σ-algebras is independent, since σ(Y_i) ⊂ σ(X_i) for 1 ≤ i ≤ n. Therefore, by applying again Theorem 3.17 we obtain that the random variables {Y_1, Y_2, ..., Y_n} are independent.
The proof of Theorem 3.18 is complete.

We give examples of operations which preserve the independence of algebras:

Theorem 3.19 Let A = {A_i : i ∈ I} be a collection of subalgebras of F, where I is a finite or infinite index set. If the collection A is independent, then the collection B = {σ(A_i) : i ∈ I} of σ-algebras is independent. Here σ(A_i) is the σ-algebra generated by the algebra A_i.

Proof We have only to prove that if every finite subcollection {A_1, A_2, ..., A_n} of A is independent, then the collection {σ(A_1), σ(A_2), ..., σ(A_n)} of σ-algebras is independent. Let Λ_i be an arbitrary element of σ(A_i) with 1 ≤ i ≤ n. By Theorem 3.4, for any ε > 0 we can find a set Ã_i ∈ A_i such that P(Λ_i △ Ã_i) < ε.

so that P({X ≤ x} ∩ Θ) > P({X ≤ y} ∩ Θ). This is a contradiction, since we have the assertion
$$x < y \Longrightarrow \{X \le x\} \cap \Theta \subset \{X \le y\} \cap \Theta.$$
(ii) The proof of property (CD2): By property (CD1), it follows that the sequence {P(X ≤ k | B)(ω)} is increasing with respect to k, for almost all ω ∈ Ω. Hence we obtain that the limit
$$\lim_{k \to \infty} P(X \le k \mid \mathcal{B})(\omega)$$
exists for almost all ω ∈ Ω. For any given ε > 0, we let
$$\Theta_\varepsilon := \left\{ \omega \in \Omega : \lim_{k \to \infty} P(X \le k \mid \mathcal{B})(\omega) \le 1 - \varepsilon \right\}.$$
Then it follows that Θ_ε ∈ B, and further from condition (3.30) that
$$
\begin{aligned}
P(\Theta_\varepsilon) &= \lim_{k \to \infty} P\left(\{X \le k\} \cap \Theta_\varepsilon\right) = \lim_{k \to \infty} E\left(P(X \le k \mid \mathcal{B});\, \Theta_\varepsilon\right) \\
&= \lim_{k \to \infty} \int_{\Theta_\varepsilon} P(X \le k \mid \mathcal{B})(\omega)\, dP \le (1 - \varepsilon)\, P(\Theta_\varepsilon).
\end{aligned}
$$
This proves that P(Θ_ε) = 0. Since ε > 0 is arbitrary, we find that
$$\lim_{k \to \infty} P(X \le k \mid \mathcal{B})(\omega) = 1$$


for almost all ω ∈ Ω.
(iii) The proof of property (CD3): Similarly, by letting
$$\Lambda_\varepsilon := \left\{ \omega \in \Omega : \lim_{k \to -\infty} P(X \le k \mid \mathcal{B})(\omega) \ge \varepsilon \right\},$$
we can prove that
$$\lim_{k \to -\infty} P(X \le k \mid \mathcal{B})(\omega) = 0$$
for almost all ω ∈ Ω.
The proof of Lemma 3.25 is complete.
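Properties (CD1)–(CD3) become transparent when B is generated by a finite partition, in which case P(X ≤ x | B) is constant on each block of the partition. A minimal sketch (the uniform six-point sample space and the odd/even partition are illustrative assumptions):

```python
from fractions import Fraction

# Finite model: Omega = {1,...,6} with the uniform measure, X(w) = w,
# and B generated by the partition {odd, even}.
omega = range(1, 7)
P = {w: Fraction(1, 6) for w in omega}
blocks = [[w for w in omega if w % 2 == 1], [w for w in omega if w % 2 == 0]]

def cond_cdf(x, block):
    # On a partition block Theta, P(X <= x | B) = P({X <= x} ∩ Theta) / P(Theta).
    num = sum(P[w] for w in block if w <= x)
    den = sum(P[w] for w in block)
    return num / den

odd = blocks[0]
values = [cond_cdf(x, odd) for x in range(0, 8)]
# (CD1): non-decreasing in x; (CD2)/(CD3): tends to 1 (resp. 0) as x -> +/- infinity.
assert all(a <= b for a, b in zip(values, values[1:]))
assert values[0] == 0 and values[-1] == 1
print(values)
```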



Moreover, we can prove the following theorem:

Theorem 3.26 Let X be a random variable on (Ω, F, P). If B is a sub-σ-algebra of F, then there exists a function μ_B on Ω × B(R) which satisfies the following two conditions (1) and (2):
(1) μ_B(ω, ·) is a probability measure on B(R) for almost all ω ∈ Ω.
(2) For every A ∈ B(R), μ_B(·, A) is a version of the conditional probability P(X ∈ A | B). In particular, we have the formula
$$P\left((X \in A) \cap \Theta\right) = \int_{\Theta} \mu_{\mathcal{B}}(\omega, A)\, dP \quad \text{for all } \Theta \in \mathcal{B}.$$
Moreover, the function μ_B is uniquely determined in the sense that any two of them are equal with respect to P. More precisely, if a function μ̃_B on Ω × B(R) satisfies conditions (1) and (2), then we have, for almost all ω ∈ Ω,
$$\tilde{\mu}_{\mathcal{B}}(\omega, A) = \mu_{\mathcal{B}}(\omega, A) \quad \text{for } A \in \mathcal{B}(\mathbf{R}). \tag{3.31}$$

Proof The proof is divided into four steps.
Step 1: First, we prove the uniqueness of the function μ_B, that is, we prove formula (3.31) for almost all ω ∈ Ω. To do this, we let
$$\mathcal{M} := \left\{ A \in \mathcal{B}(\mathbf{R}) : \tilde{\mu}_{\mathcal{B}}(\omega, A) = \mu_{\mathcal{B}}(\omega, A) \text{ for almost all } \omega \in \Omega \right\}.$$
It suffices to show that M = B(R). If we let
$$\mathcal{E} := \text{the collection of sets of the form } (x, y] \text{ or } (x, \infty) \text{ or } \emptyset, \text{ where } -\infty \le x < y < \infty, \tag{3.32}$$


then it is easy to see that E is an elementary family. Hence we find that the collection A of finite disjoint unions of members of E is an algebra, and further that the σ-algebra σ(A) generated by A is B(R).
For every rational number r ∈ Q, it follows that (−∞, r] ∈ B(R). Hence we have, for almost all ω ∈ Ω,
$$\tilde{\mu}_{\mathcal{B}}\left(\omega, (-\infty, r]\right) = \mu_{\mathcal{B}}\left(\omega, (-\infty, r]\right). \tag{3.33}$$

By the right-continuity of measures, we obtain from formula (3.33) that, for all x ∈ R,
$$\tilde{\mu}_{\mathcal{B}}\left(\omega, (-\infty, x]\right) = \lim_{\substack{r_n \in \mathbf{Q} \\ r_n \downarrow x}} \tilde{\mu}_{\mathcal{B}}\left(\omega, (-\infty, r_n]\right) = \lim_{\substack{r_n \in \mathbf{Q} \\ r_n \downarrow x}} \mu_{\mathcal{B}}\left(\omega, (-\infty, r_n]\right) = \mu_{\mathcal{B}}\left(\omega, (-\infty, x]\right). \tag{3.34}$$
Moreover, we have, for all (x, ∞) = R \ (−∞, x] with x ∈ R,
$$
\begin{aligned}
\tilde{\mu}_{\mathcal{B}}\left(\omega, (x, \infty)\right) &= \tilde{\mu}_{\mathcal{B}}\left(\omega, \mathbf{R} \setminus (-\infty, x]\right) = \tilde{\mu}_{\mathcal{B}}(\omega, \mathbf{R}) - \tilde{\mu}_{\mathcal{B}}\left(\omega, (-\infty, x]\right) \\
&= 1 - \tilde{\mu}_{\mathcal{B}}\left(\omega, (-\infty, x]\right) = 1 - \mu_{\mathcal{B}}\left(\omega, (-\infty, x]\right) \\
&= \mu_{\mathcal{B}}\left(\omega, \mathbf{R} \setminus (-\infty, x]\right) = \mu_{\mathcal{B}}\left(\omega, (x, \infty)\right),
\end{aligned} \tag{3.35}
$$
and, for all (x, y] = (−∞, y] \ (−∞, x] with y > x,
$$
\begin{aligned}
\tilde{\mu}_{\mathcal{B}}\left(\omega, (x, y]\right) &= \tilde{\mu}_{\mathcal{B}}\left(\omega, (-\infty, y] \setminus (-\infty, x]\right) = \tilde{\mu}_{\mathcal{B}}\left(\omega, (-\infty, y]\right) - \tilde{\mu}_{\mathcal{B}}\left(\omega, (-\infty, x]\right) \\
&= \mu_{\mathcal{B}}\left(\omega, (-\infty, y]\right) - \mu_{\mathcal{B}}\left(\omega, (-\infty, x]\right) = \mu_{\mathcal{B}}\left(\omega, (x, y]\right).
\end{aligned} \tag{3.36}
$$
Hence, we obtain from formulas (3.34), (3.35) and (3.36) that A ⊂ M. However, it is easy to see that M is a d-system, since μ̃_B and μ_B are probability measures on B(R). Therefore, by applying the Dynkin class theorem (Corollary 3.3) we obtain that
$$\mathcal{B}(\mathbf{R}) = \sigma(\mathcal{A}) \subset \mathcal{M} \subset \mathcal{B}(\mathbf{R}),$$
so that M = B(R). This proves the desired formula (3.31).
Step 2: We prove the existence of the function μ_B. The proof is divided into two steps.
Step 2-1: For every r ∈ Q, we define a B-measurable function G_B(r) by the formula
$$G_{\mathcal{B}}(r)(\omega) = P\left(X \le r \mid \mathcal{B}\right)(\omega) \quad \text{for almost all } \omega \in \Omega.$$


For each integer n ∈ N, we let
$$
\Omega_n := \left\{ \omega \in \Omega : G_{\mathcal{B}}\!\left(\tfrac{k}{n}\right)\!(\omega) \le G_{\mathcal{B}}\!\left(\tfrac{k+1}{n}\right)\!(\omega) \text{ for all } k \in \mathbf{Z},\ \lim_{k \to \infty} G_{\mathcal{B}}\!\left(\tfrac{k}{n}\right)\!(\omega) = 1,\ \lim_{k \to -\infty} G_{\mathcal{B}}\!\left(\tfrac{k}{n}\right)\!(\omega) = 0 \right\}.
$$
Then it follows from an application of Lemma 3.25 that
$$\Omega_n \in \mathcal{B}, \quad P(\Omega_n) = 1 \quad \text{for all } n \in \mathbf{N},$$
so that
$$\tilde{\Omega} := \bigcap_{n=1}^{\infty} \Omega_n \in \mathcal{B}, \quad P(\tilde{\Omega}) = 1.$$
Hence we can define a function F_B on Ω × R as follows:
$$
F_{\mathcal{B}}(\omega, x) = \begin{cases} \displaystyle\lim_{\substack{r \in \mathbf{Q} \\ r \downarrow x}} G_{\mathcal{B}}(r)(\omega) & \text{if } \omega \in \tilde{\Omega}, \\ 0 & \text{if } \omega \notin \tilde{\Omega}. \end{cases}
$$
Then it is easy to see that F_B(ω, ·) is a distribution function for each ω ∈ Ω̃ and that F_B(·, x) is B-measurable for each x ∈ R. Moreover, it follows from an application of the monotone convergence theorem (Theorem 2.10) that
$$
P\left(\{X \le x\} \cap \Theta\right) = \lim_{\substack{r \in \mathbf{Q} \\ r \downarrow x}} P\left(\{X \le r\} \cap \Theta\right) = \lim_{\substack{r \in \mathbf{Q} \\ r \downarrow x}} \int_{\Theta} G_{\mathcal{B}}(r)(\omega)\, dP = \int_{\Theta} F_{\mathcal{B}}(\omega, x)\, dP \quad \text{for every } \Theta \in \mathcal{B}.
$$
This proves that F_B(·, x) is a version of the conditional probability P(X ≤ x | B).
Step 2-2: We remark that the distribution function F_B(ω, ·) determines a probability measure μ_B(ω) on R for each ω ∈ Ω̃. In particular, we have the formula
$$F_{\mathcal{B}}(\omega, x) = \mu_{\mathcal{B}}(\omega)\left((-\infty, x]\right) \quad \text{for } x \in \mathbf{R}.$$
Therefore, for every A ∈ B(R) we can define a function μ_B(·, A) on Ω by the formula
$$
\mu_{\mathcal{B}}(\omega, A) = \begin{cases} \mu_{\mathcal{B}}(\omega)(A) & \text{if } \omega \in \tilde{\Omega}, \\ 0 & \text{if } \omega \notin \tilde{\Omega}. \end{cases} \tag{3.37}
$$


We show that the function μ_B(·, A) is B-measurable for every A ∈ B(R). To do this, we let
$$\mathcal{L} := \left\{ A \in \mathcal{B}(\mathbf{R}) : \mu_{\mathcal{B}}(\cdot, A) \text{ is } \mathcal{B}\text{-measurable} \right\}.$$
It suffices to show that L = B(R).
(a) First, it follows that (−∞, y] ∈ L for all y ∈ R, since the function
$$
\mu_{\mathcal{B}}\left(\omega, (-\infty, y]\right) = \begin{cases} \mu_{\mathcal{B}}(\omega)\left((-\infty, y]\right) = F_{\mathcal{B}}(\omega, y) & \text{if } \omega \in \tilde{\Omega}, \\ 0 & \text{if } \omega \notin \tilde{\Omega} \end{cases}
$$
is B-measurable.
(b) Secondly, it follows that (x, y] ∈ L for all y > x, since we have, for (x, y] = (−∞, y] \ (−∞, x],
$$\mu_{\mathcal{B}}\left(\omega, (x, y]\right) = \mu_{\mathcal{B}}\left(\omega, (-\infty, y]\right) - \mu_{\mathcal{B}}\left(\omega, (-\infty, x]\right).$$
(c) Thirdly, it follows that (x, ∞) ∈ L for all x ∈ R, since we have the formula
$$\mu_{\mathcal{B}}\left(\omega, (x, \infty)\right) = \sum_{j=0}^{\infty} \mu_{\mathcal{B}}\left(\omega, (x + j, x + j + 1]\right).$$
Therefore, we find that the elementary family E defined by formula (3.32) is contained in L, and further that the collection A of finite disjoint unions of members of E is an algebra contained in L.
(d) However, it is easy to see that L is a d-system, since the μ_B(ω, ·) are probability measures on B(R). Therefore, by applying the Dynkin class theorem (Corollary 3.3) we obtain that
$$\mathcal{B}(\mathbf{R}) = \sigma(\mathcal{A}) \subset \mathcal{L} \subset \mathcal{B}(\mathbf{R}),$$
so that L = B(R).
Step 2-3: Finally, we prove that μ_B(·, A) is a version of the conditional probability P(X ∈ A | B). It remains to show that we have, for every Θ ∈ B,
$$P\left((X \in A) \cap \Theta\right) = \int_{\Theta} \mu_{\mathcal{B}}(\omega, A)\, dP. \tag{3.38}$$
To do this, we let
$$\mathcal{N} := \left\{ A \in \mathcal{B}(\mathbf{R}) : \text{formula (3.38) holds true for all } \Theta \in \mathcal{B} \right\}.$$
It suffices to show that N = B(R).


By arguing just as in Step 2-1, we obtain that E ⊂ A ⊂ N, and further that N is a d-system. Indeed, if {A_n}_{n=1}^∞ is an increasing sequence of members of N, then it follows from an application of the monotone convergence theorem (Theorem 2.10) that
$$
P\left(\left(X \in \bigcup_{n=1}^{\infty} A_n\right) \cap \Theta\right) = \lim_{n \to \infty} P\left((X \in A_n) \cap \Theta\right) = \lim_{n \to \infty} \int_{\Theta} \mu_{\mathcal{B}}(\omega, A_n)\, dP = \int_{\Theta} \mu_{\mathcal{B}}\left(\omega, \bigcup_{n=1}^{\infty} A_n\right) dP.
$$
This proves that the union ∪_{n=1}^∞ A_n belongs to N. Therefore, by applying the Dynkin class theorem (Corollary 3.3) we obtain that
$$\mathcal{B}(\mathbf{R}) = \sigma(\mathcal{A}) \subset \mathcal{N} \subset \mathcal{B}(\mathbf{R}),$$
so that N = B(R). Summing up, we have proved that
$$\mu_{\mathcal{B}}(\omega, A) = P(X \in A \mid \mathcal{B})(\omega) \quad \text{for every } A \in \mathcal{B}(\mathbf{R}).$$
Now the proof of Theorem 3.26 is complete.

Definition 3.27 The function μ_B on Ω × B(R) is called a version of the conditional probability of X with respect to B, and will be denoted by P(X ∈ · | B). If B is generated by the random variables X_1, X_2, ..., X_n, that is, if B = σ(X_1, X_2, ..., X_n), then P(X ∈ · | B) will be denoted as follows:
$$P\left(X \in \cdot \mid X_1, X_2, \ldots, X_n\right) = P\left(X \in \cdot \mid \sigma(X_1, X_2, \ldots, X_n)\right).$$

The next theorem is an R^n-version of Theorem 3.26:

Theorem 3.28 Let X be a random variable and let X = (X_1, X_2, ..., X_n) be a vector-valued random variable on the probability space (Ω, F, P). Then there exists a function ψ(x, A) on R^n × B(R) which satisfies the following two conditions (i) and (ii):
(i) ψ(x, ·) is a probability measure on B(R) for μ_X-almost all x ∈ R^n. Here μ_X is the joint distribution of X = (X_1, X_2, ..., X_n).
(ii) For every A ∈ B(R), ψ(·, A) is a Borel measurable function on R^n, and we have, for almost all ω ∈ Ω,
$$\psi\left(X_1(\omega), X_2(\omega), \ldots, X_n(\omega), \cdot\right) = P\left(X \in \cdot \mid X_1, X_2, \ldots, X_n\right)(\omega). \tag{3.39}$$


Moreover, the function ψ is uniquely determined in the sense that any two of them are equal with respect to μ_X. More precisely, if a function ψ̃ on R^n × B(R) satisfies conditions (i) and (ii), then we have, for μ_X-almost all x ∈ R^n,
$$\tilde{\psi}(x, A) = \psi(x, A) \quad \text{for } A \in \mathcal{B}(\mathbf{R}). \tag{3.40}$$

Proof The proof is divided into three steps.
Step 1: First, we construct a function ψ by using the function μ_B of Theorem 3.26. Since B = σ(X_1, X_2, ..., X_n) and μ_B(·, A) is B-measurable for every A ∈ B(R), it follows from an application of Theorem 3.12 that there exists a Borel measurable function Φ(·, A) on R^n such that
$$
\Phi\left(X_1(\omega), X_2(\omega), \ldots, X_n(\omega), A\right) = \mu_{\mathcal{B}}(\omega, A) = \begin{cases} \mu_{\mathcal{B}}(\omega)(A) & \text{if } \omega \in \tilde{\Omega}, \\ 0 & \text{if } \omega \notin \tilde{\Omega}. \end{cases} \tag{3.41}
$$

Here we recall that μ_B(ω) is a probability measure on R for every ω ∈ Ω̃.
If we let
$$G(x, y) := \Phi\left(x, (-\infty, y]\right) \quad \text{for } x \in \mathbf{R}^n \text{ and } y \in \mathbf{R},$$
it follows that the function G(x, y) = Φ(x, (−∞, y]) is a distribution function of y for every x ∈ X(Ω̃). Hence, if we let
$$
\Gamma := \left\{ x \in \mathbf{R}^n : G(x, r) \le G(x, r') \text{ for all } r < r' \text{ with } r, r' \in \mathbf{Q},\ \lim_{\substack{r \in \mathbf{Q} \\ r \to -\infty}} G(x, r) = 0,\ \lim_{\substack{r' \in \mathbf{Q} \\ r' \to \infty}} G(x, r') = 1 \right\},
$$
then we have the assertions
$$\Gamma \in \mathcal{B}(\mathbf{R}^n), \quad \mathbf{X}(\tilde{\Omega}) \subset \Gamma.$$
If we define a function F(x, y) on R^n × R by the formula
$$
F(x, y) = \begin{cases} \displaystyle\lim_{\substack{r \in \mathbf{Q} \\ r \downarrow y}} G(x, r) & \text{if } x \in \Gamma, \\ 0 & \text{if } x \notin \Gamma, \end{cases}
$$
we have the following four assertions (a)–(d):


(a) For each x ∈ R^n, the function R ∋ y ↦ F(x, y) is right-continuous on R.
(b) For each y ∈ R, the function R^n ∋ x ↦ F(x, y) is Borel measurable on R^n.
(c) For each x ∈ Γ, the function F(x, y) is a distribution function of y, and F(x, y) = G(x, y) for all x ∈ X(Ω̃).
(d) For each x ∈ Γ, the function F(x, ·) determines a probability measure ψ(x, ·) on R. In particular, we have, for all x ∈ X(Ω̃),
$$\psi(x, \cdot) = \Phi(x, \cdot).$$
Moreover, if we define a function ψ(x, A) by the formula
$$
\psi(x, A) = \begin{cases} \Phi(x, A) & \text{if } x \in \Gamma, \\ 0 & \text{if } x \notin \Gamma, \end{cases}
$$
then we have the formula
$$\psi\left(\mathbf{X}(\omega), A\right) = \Phi\left(\mathbf{X}(\omega), A\right) \quad \text{for all } \omega \in \Omega.$$
Indeed, it suffices to note that
$$\psi(x, A) = \Phi(x, A) = 0 \quad \text{for all } x \in \mathbf{X}(\tilde{\Omega}^c) \subset \Gamma^c.$$
Therefore, we obtain from formula (3.41) that, for almost all ω ∈ Ω,
$$\psi\left(X_1(\omega), X_2(\omega), \ldots, X_n(\omega), A\right) = \Phi\left(\mathbf{X}(\omega), A\right) = \mu_{\mathcal{B}}(\omega, A) = P\left(X \in A \mid X_1, X_2, \ldots, X_n\right)(\omega).$$
This proves the desired formula (3.39).
Step 2: Secondly, we show that ψ(x, A) is a Borel measurable function of x ∈ R^n for every A ∈ B(R). To do this, we let
$$\mathcal{M} := \left\{ A \in \mathcal{B}(\mathbf{R}) : \psi(\cdot, A) \text{ is Borel measurable on } \mathbf{R}^n \right\}.$$
Then, by arguing just as in the proof of Theorem 3.26, we can prove that M = B(R).


Moreover, since ψ(x, ·) is a probability measure on R for each x ∈ Γ and since X(Ω̃) ⊂ Γ, it follows that
$$1 = P(\tilde{\Omega}) \le P\left(\mathbf{X}^{-1}(\Gamma)\right) = \mu_{\mathbf{X}}(\Gamma) \le 1.$$
Therefore, we have proved that ψ(x, ·) is a probability measure on B(R) for μ_X-almost all x ∈ R^n.
Step 3: Finally, we prove the uniqueness of the function ψ. Assume that we have, for every A ∈ B(R),
$$
\psi\left(X_1(\omega), X_2(\omega), \ldots, X_n(\omega), A\right) = P\left(X \in A \mid X_1, X_2, \ldots, X_n\right)(\omega) = \tilde{\psi}\left(X_1(\omega), X_2(\omega), \ldots, X_n(\omega), A\right). \tag{3.42}
$$

If we let
$$F(x, y) := \psi\left(x, (-\infty, y]\right), \qquad \tilde{F}(x, y) := \tilde{\psi}\left(x, (-\infty, y]\right) \quad \text{for } x \in \mathbf{R}^n \text{ and } y \in \mathbf{R},$$
then we obtain that the functions F(x, y) and F̃(x, y) are distribution functions of y for μ_X-almost all x ∈ R^n. For each y ∈ Q, we let
$$B_1 := \left\{ x \in \mathbf{R}^n : F(x, y) > \tilde{F}(x, y) \right\}.$$
Then it follows that B_1 ∈ B(R^n). Moreover, we have, by assertion (3.42) with x := X(ω) and A := (−∞, y],
$$
\begin{aligned}
\int_{B_1} \tilde{F}(x, y)\, d\mu_{\mathbf{X}}(x) &= \int_{(\mathbf{X} \in B_1)} \tilde{\psi}\left(\mathbf{X}(\omega), (-\infty, y]\right) dP = P\left(\left(X \in (-\infty, y]\right) \cap (\mathbf{X} \in B_1)\right) \\
&= \int_{(\mathbf{X} \in B_1)} \psi\left(\mathbf{X}(\omega), (-\infty, y]\right) dP = \int_{B_1} F(x, y)\, d\mu_{\mathbf{X}}(x),
\end{aligned}
$$
so that
$$0 = \int_{B_1} \left( F(x, y) - \tilde{F}(x, y) \right) d\mu_{\mathbf{X}}(x).$$
This proves that μ_X(B_1) = 0, since the integrand is positive on B_1.


Similarly, if we let
$$B_2 := \left\{ x \in \mathbf{R}^n : \tilde{F}(x, y) > F(x, y) \right\},$$
it follows that μ_X(B_2) = 0. Therefore, we obtain that
$$\mu_{\mathbf{X}}\left(\left\{ x \in \mathbf{R}^n : F(x, y) \ne \tilde{F}(x, y) \right\}\right) = \mu_{\mathbf{X}}(B_1 \cup B_2) = 0 \quad \text{for all } y \in \mathbf{Q}.$$
Hence we have the assertion
$$\mu_{\mathbf{X}}\left(\bigcup_{r \in \mathbf{Q}} \left\{ x \in \mathbf{R}^n : F(x, r) \ne \tilde{F}(x, r) \right\}\right) \le \sum_{r \in \mathbf{Q}} \mu_{\mathbf{X}}\left(\left\{ x \in \mathbf{R}^n : F(x, r) \ne \tilde{F}(x, r) \right\}\right) = 0.$$
Namely, we have the assertion
$$
\mu_{\mathbf{X}}\left(\left\{ x \in \mathbf{R}^n : F(x, r) = \tilde{F}(x, r) \text{ for all } r \in \mathbf{Q} \right\}\right) = \mu_{\mathbf{X}}\left(\left(\bigcup_{r \in \mathbf{Q}} \left\{ x \in \mathbf{R}^n : F(x, r) \ne \tilde{F}(x, r) \right\}\right)^c\right) = 1.
$$
Moreover, since F(x, y) and F̃(x, y) are right-continuous functions of y, we have, for μ_X-almost all x ∈ R^n, the assertion
$$F(x, y) = \lim_{\substack{r \in \mathbf{Q} \\ r \downarrow y}} F(x, r) = \lim_{\substack{r \in \mathbf{Q} \\ r \downarrow y}} \tilde{F}(x, r) = \tilde{F}(x, y) \quad \text{for every } y \in \mathbf{R}.$$
This proves that
$$F(x, \cdot) = \tilde{F}(x, \cdot) \quad \text{for } \mu_{\mathbf{X}}\text{-almost all } x \in \mathbf{R}^n.$$
Summing up, we have proved the desired assertion (3.40).
The proof of Theorem 3.28 is complete.



Definition 3.29 The function ψ on R^n × B(R) is called a conditional distribution of X with respect to (X_1, X_2, ..., X_n). We shall write
$$P\left(X \in A \mid X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n\right) = \psi(x_1, x_2, \ldots, x_n, A) \quad \text{for } A \in \mathcal{B}(\mathbf{R}).$$


Example 3.30 Let Y be a random variable and let X = (X_1, X_2, ..., X_n) be a vector-valued random variable on the probability space (Ω, F, P). Then Y and X are independent if and only if we have, for μ_X-almost all x ∈ R^n,
$$P\left(Y \in A \mid X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n\right) = P(Y \in A) \quad \text{for every } A \in \mathcal{B}(\mathbf{R}). \tag{3.43}$$

Proof (i) The “if” part: Since we have the formula
$$\mathcal{B} = \sigma(\mathbf{X}) = \left\{ \mathbf{X}^{-1}(B) : B \in \mathcal{B}(\mathbf{R}^n) \right\},$$
it follows from formula (3.39) and condition (3.43) that
$$\psi(x, A) = P\left(Y \in A \mid \mathbf{X} = x\right) = P(Y \in A) \quad \text{for every } A \in \mathcal{B}(\mathbf{R}).$$
Hence we have, for every B ∈ B(R^n),
$$
\begin{aligned}
P\left((Y \in A) \cap (\mathbf{X} \in B)\right) &= \int_{(\mathbf{X} \in B)} P\left(Y \in A \mid \mathbf{X}\right)(\omega)\, dP = \int_B \psi(x, A)\, d\mu_{\mathbf{X}} = \int_B P(Y \in A)\, d\mu_{\mathbf{X}} \\
&= P(Y \in A)\, \mu_{\mathbf{X}}(B) = P(Y \in A)\, P(\mathbf{X} \in B).
\end{aligned}
$$
This proves that the random variables Y and X are independent.
(ii) The “only if” part: If Y and X are independent random variables, it follows from an application of Theorem 3.17 that the σ-algebras σ(Y) and B = σ(X) are independent. Hence we have, for every B ∈ B(R^n),
$$P\left((Y \in A) \cap (\mathbf{X} \in B)\right) = P(Y \in A)\, P(\mathbf{X} \in B) = \int_{(\mathbf{X} \in B)} P(Y \in A)\, dP.$$
This proves that P(Y ∈ A | X)(ω) = P(Y ∈ A), or equivalently,
$$P\left(Y \in A \mid \mathbf{X} = x\right) = P(Y \in A) \quad \text{for } \mu_{\mathbf{X}}\text{-almost all } x \in \mathbf{R}^n.$$
This proves the desired condition (3.43).
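The equivalence in Example 3.30 can be verified exactly in a finite model, where the conditional distribution is simply the ratio of joint to marginal probabilities. The uniform product joint below is an illustrative assumption:

```python
from fractions import Fraction
from itertools import product

# Independent pair: X uniform on {0, 1}, Y uniform on {0, 1, 2}.
joint = {(x, y): Fraction(1, 2) * Fraction(1, 3)
         for x, y in product(range(2), range(3))}

def cond_Y_given_X(x):
    # P(Y = y | X = x) = P(X = x, Y = y) / P(X = x)
    px = sum(p for (xx, _), p in joint.items() if xx == x)
    return {y: joint[(x, y)] / px for y in range(3)}

margY = {y: sum(p for (_, yy), p in joint.items() if yy == y) for y in range(3)}

# Example 3.30: for independent X, Y the conditional distribution of Y given
# X = x coincides with the (unconditional) distribution of Y, for every x.
assert all(cond_Y_given_X(x) == margY for x in range(2))
print(margY)
```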



Example 3.31 Let X and Y be random variables such that the joint distribution of (X, Y) has a density f(x, y). Then a version of the conditional distribution of Y with respect to X has the density function
$$\frac{f(x, y)}{\displaystyle\int_{-\infty}^{\infty} f(x, y)\, dy}. \tag{3.44}$$

Proof We have, for all Borel sets A, B ∈ B(R),
$$P(X \in B,\, Y \in A) = \int_B \int_A f(x, y)\, dy\, dx.$$
By taking A = R, we obtain that the distribution of X is given by the formula
$$P(X \in B) = \int_B \left( \int_{-\infty}^{\infty} f(x, y)\, dy \right) dx.$$
Namely, the distribution μ = P ∘ X^{-1} of X has the density
$$g(x) = \int_{-\infty}^{\infty} f(x, y)\, dy,$$
since we have the formula
$$\mu(B) = P \circ X^{-1}(B) = P(X \in B) = \int_B g(x)\, dx \quad \text{for every } B \in \mathcal{B}(\mathbf{R}).$$
If we let
$$\Delta := \left\{ x \in \mathbf{R} : g(x) = 0 \right\},$$
then we have the assertion
$$\mu(\Delta) = \int_{\Delta} g(x)\, dx = 0.$$
Therefore, we can define a function ψ(x, A) on R × B(R) by the formula
$$
\psi(x, A) = \begin{cases} \dfrac{1}{g(x)} \displaystyle\int_A f(x, y)\, dy & \text{if } x \notin \Delta, \\ 0 & \text{if } x \in \Delta. \end{cases}
$$
Then it follows that ψ(x, ·) is a probability measure on R if x ∉ Δ. Moreover, we obtain that ψ is a version of the conditional distribution of Y with respect to X. Indeed, it suffices to note that


$$
\begin{aligned}
P\left((Y \in A) \cap (X \in B)\right) &= \int_B \left( \int_A f(x, y)\, dy \right) dx = \int_{B \setminus \Delta} g(x)\psi(x, A)\, dx + \int_{B \cap \Delta} \left( \int_A f(x, y)\, dy \right) dx \\
&= \int_{B \setminus \Delta} g(x)\psi(x, A)\, dx = \int_B g(x)\psi(x, A)\, dx = \int_B \psi(x, A)\, d\mu \\
&= \int_{(X \in B)} \psi(X(\omega), A)\, dP \quad \text{for every } B \in \mathcal{B}(\mathbf{R}).
\end{aligned}
$$
Summing up, we have proved that ψ(x, ·) = P(Y ∈ · | X = x) has the density function (3.44).
The proof of Example 3.31 is complete.
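Formula (3.44) can be checked numerically for a concrete joint density. The sketch below uses f(x, y) = x + y on the unit square (an illustrative assumption, not from the text) and verifies that the resulting conditional density integrates to 1 in y:

```python
def f(x, y):
    # Joint density on the unit square: f(x, y) = x + y (zero elsewhere).
    return x + y if 0 <= x <= 1 and 0 <= y <= 1 else 0.0

def g(x, steps=10_000):
    # Marginal density of X: g(x) = integral of f(x, y) dy (here = x + 1/2).
    h = 1.0 / steps
    return h * sum(0.5 * (f(x, k * h) + f(x, (k + 1) * h)) for k in range(steps))

def cond_density(y, x):
    # Formula (3.44): the conditional density of Y given X = x.
    return f(x, y) / g(x)

# The conditional density integrates to 1 in y, for each x with g(x) > 0.
h = 1.0 / 10_000
total = h * sum(0.5 * (cond_density(k * h, 0.3) + cond_density((k + 1) * h, 0.3))
                for k in range(10_000))
print(total, g(0.3))  # close to 1.0 and 0.8
```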

3.7 Conditional Expectations

The general theory of conditional expectations plays a vital role in the study of Markov processes in Chap. 12.
Let X be a random variable on the probability space (Ω, F, P). If B is a sub-σ-algebra of F, then it follows from an application of Theorem 3.26 that there exists a conditional probability P(X ∈ · | B) of X with respect to B.
The next theorem will play a crucial role in the study of Markov processes in Sect. 11.1:

Theorem 3.32 Assume that E(|X|) < ∞. Then the integral
$$Y(\omega) = \int_{-\infty}^{\infty} x\, P(X \in dx \mid \mathcal{B})(\omega) \tag{3.45}$$
exists for almost all ω ∈ Ω, and satisfies the following two conditions (CE1) and (CE2):
(CE1) The function Y is B-measurable.
(CE2) E(X; Λ) = E(Y; Λ) for every Λ ∈ B. Namely, we have, for every Λ ∈ B,
$$\int_{\Lambda} X(\omega)\, dP = \int_{\Lambda} Y(\omega)\, dP.$$
Moreover, the function Y is uniquely determined in the sense that any two of them are equal with respect to P.


[Fig. 3.4 The approximations Z_n(ω) to |X(ω)|]

Proof The proof is divided into three steps.
Step 1: First, we show that the function Y(ω) is a real-valued, B-measurable random variable, that is, it satisfies condition (CE1). To do this, we let
$$A_{n,k} := \left(-\tfrac{k+1}{2^n}, -\tfrac{k}{2^n}\right] \cup \left[\tfrac{k}{2^n}, \tfrac{k+1}{2^n}\right) \quad \text{for } n, k \in \mathbf{N},$$
$$\Lambda_k^{(n)} := X^{-1}(A_{n,k}) = \left\{ \omega \in \Omega : X(\omega) \in A_{n,k} \right\} \quad \text{for } n, k \in \mathbf{N},$$
and
$$Z_n(\omega) := \sum_{k=0}^{\infty} \frac{k}{2^n}\, \chi_{\Lambda_k^{(n)}}(\omega) = \sum_{k=0}^{\infty} \frac{k}{2^n}\, \chi_{A_{n,k}}(X(\omega)) \quad \text{for } n \in \mathbf{N}.$$
Then it is easy to see the following two assertions (a) and (b) (see Fig. 3.4):
(a) The Z_n are B-measurable functions.
(b) Z_n ↑ |X| almost everywhere in Ω.
Hence, by applying the monotone convergence theorem (Theorem 2.10) we obtain that
$$
\begin{aligned}
\int_{\Omega} |X(\omega)|\, dP &= \lim_{n \to \infty} \int_{\Omega} Z_n(\omega)\, dP = \lim_{n \to \infty} \sum_{k=0}^{\infty} \frac{k}{2^n} \int_{\Omega} \chi_{\Lambda_k^{(n)}}(\omega)\, dP \\
&= \lim_{n \to \infty} \sum_{k=0}^{\infty} \frac{k}{2^n}\, P\left(X \in A_{n,k}\right) = \lim_{n \to \infty} \sum_{k=0}^{\infty} \frac{k}{2^n}\, P\left(|X| \in \left[\tfrac{k}{2^n}, \tfrac{k+1}{2^n}\right)\right).
\end{aligned} \tag{3.46}
$$

However, we have the formulas
$$\int_{-\infty}^{\infty} |x|\, P(X \in dx \mid \mathcal{B})(\omega) = \lim_{n \to \infty} \sum_{k=0}^{\infty} \frac{k}{2^n}\, P\left(X \in A_{n,k} \mid \mathcal{B}\right)(\omega), \tag{3.47a}$$
$$\int_{\Omega} \chi_{\Lambda_k^{(n)}}(\omega)\, dP = P\left(\Lambda_k^{(n)}\right) = \int_{\Omega} P\left(\Lambda_k^{(n)} \mid \mathcal{B}\right)(\omega)\, dP = \int_{\Omega} P\left(X \in A_{n,k} \mid \mathcal{B}\right)(\omega)\, dP. \tag{3.47b}$$
Hence, by using the monotone convergence theorem (Theorem 2.10) we obtain from formulas (3.46), (3.47a) and (3.47b) that
$$
\begin{aligned}
\int_{\Omega} \left( \int_{-\infty}^{\infty} |x|\, P(X \in dx \mid \mathcal{B})(\omega) \right) dP
&= \int_{\Omega} \lim_{n \to \infty} \sum_{k=0}^{\infty} \frac{k}{2^n}\, P\left(X \in A_{n,k} \mid \mathcal{B}\right)(\omega)\, dP \\
&= \lim_{n \to \infty} \sum_{k=0}^{\infty} \frac{k}{2^n} \int_{\Omega} P\left(X \in A_{n,k} \mid \mathcal{B}\right)(\omega)\, dP \\
&= \lim_{n \to \infty} \sum_{k=0}^{\infty} \frac{k}{2^n} \int_{\Omega} \chi_{A_{n,k}}(X(\omega))\, dP \\
&= E(|X|) < \infty.
\end{aligned}
$$
This proves that the B-measurable function
$$\int_{-\infty}^{\infty} |x|\, P(X \in dx \mid \mathcal{B})(\omega)$$
is finite for almost all ω ∈ Ω. Therefore, by letting
$$Y_n(\omega) = \sum_{k=-\infty}^{\infty} \frac{k}{2^n}\, P\left(X \in \left[\tfrac{k}{2^n}, \tfrac{k+1}{2^n}\right) \Big| \mathcal{B}\right)(\omega),$$
we obtain from assertion (3.45) that the series
$$\sum_{k=-\infty}^{\infty} \frac{k}{2^n}\, P\left(X \in \left[\tfrac{k}{2^n}, \tfrac{k+1}{2^n}\right) \Big| \mathcal{B}\right)(\omega)$$
converges absolutely for almost all ω ∈ Ω, and further that the limit function
$$\lim_{n \to \infty} Y_n(\omega) = \lim_{n \to \infty} \sum_{k=-\infty}^{\infty} \frac{k}{2^n}\, P\left(X \in \left[\tfrac{k}{2^n}, \tfrac{k+1}{2^n}\right) \Big| \mathcal{B}\right)(\omega) = \int_{-\infty}^{\infty} x\, P(X \in dx \mid \mathcal{B})(\omega) = Y(\omega)$$
is finite for almost all ω ∈ Ω.
Step 2: Secondly, we let
$$X_n(\omega) := \sum_{k=-\infty}^{\infty} \frac{k}{2^n}\, \chi_{[k/2^n,\,(k+1)/2^n)}(X(\omega)) \quad \text{for } n \in \mathbf{N}.$$

Then it is easy to see the following three assertions (a), (b) and (c):
(a) The X_n are B-measurable functions.
(b) |X_n| ≤ |X| almost everywhere in Ω.
(c) X_n → X almost everywhere in Ω.
Therefore, by using the dominated convergence theorem (Theorem 2.12) we obtain from formula (3.45) and condition (3.47a) that
$$
\begin{aligned}
E(Y; \Lambda) &= \int_{\Lambda} Y(\omega)\, dP = \int_{\Lambda} \left( \int_{-\infty}^{\infty} x\, P(X \in dx \mid \mathcal{B})(\omega) \right) dP \\
&= \int_{\Lambda} \lim_{n \to \infty} \sum_{k=-\infty}^{\infty} \frac{k}{2^n}\, P\left(X \in \left[\tfrac{k}{2^n}, \tfrac{k+1}{2^n}\right) \Big| \mathcal{B}\right)(\omega)\, dP \\
&= \lim_{n \to \infty} \sum_{k=-\infty}^{\infty} \frac{k}{2^n} \int_{\Lambda} P\left(X \in \left[\tfrac{k}{2^n}, \tfrac{k+1}{2^n}\right) \Big| \mathcal{B}\right)(\omega)\, dP \\
&= \lim_{n \to \infty} \sum_{k=-\infty}^{\infty} \frac{k}{2^n}\, E\left(P\left(X \in \left[\tfrac{k}{2^n}, \tfrac{k+1}{2^n}\right) \Big| \mathcal{B}\right); \Lambda\right) \\
&= \lim_{n \to \infty} \sum_{k=-\infty}^{\infty} \frac{k}{2^n}\, P\left(\left(X \in \left[\tfrac{k}{2^n}, \tfrac{k+1}{2^n}\right)\right) \cap \Lambda\right) \\
&= \lim_{n \to \infty} \int_{\Lambda} \sum_{k=-\infty}^{\infty} \frac{k}{2^n}\, \chi_{[k/2^n,\,(k+1)/2^n)}(X(\omega))\, dP = \lim_{n \to \infty} \int_{\Lambda} X_n(\omega)\, dP \\
&= \int_{\Lambda} X(\omega)\, dP = E(X; \Lambda) \quad \text{for every } \Lambda \in \mathcal{B}.
\end{aligned}
$$
This proves that Y satisfies condition (CE2).


Step 3: Finally, we assume that a function Ỹ satisfies conditions (CE1) and (CE2). If we let
$$\Lambda_1 := \left\{ \omega \in \Omega : Y(\omega) > \tilde{Y}(\omega) \right\},$$
it follows that Λ_1 ∈ B. Moreover, since we have the formula
$$E(X; \Lambda_1) = E(Y; \Lambda_1) = E(\tilde{Y}; \Lambda_1),$$
it follows that
$$\int_{\Lambda_1} \left( Y(\omega) - \tilde{Y}(\omega) \right) dP = E(Y - \tilde{Y}; \Lambda_1) = 0.$$
This proves that P(Λ_1) = 0, since the integrand Y − Ỹ is positive on the set Λ_1.
Similarly, if we let
$$\Lambda_2 := \left\{ \omega \in \Omega : \tilde{Y}(\omega) > Y(\omega) \right\},$$
it follows that P(Λ_2) = 0.
Summing up, we have proved that
$$P\left(\left\{ \omega \in \Omega : Y(\omega) \ne \tilde{Y}(\omega) \right\}\right) = P(\Lambda_1 \cup \Lambda_2) = 0,$$
so that Y and Ỹ are equal with respect to P.
Now the proof of Theorem 3.32 is complete.



Definition 3.33 Let B be a sub-σ-algebra of F. An integrable random variable Y is called a version of the conditional expectation of X for given B if it satisfies conditions (CE1) and (CE2). We shall write
$$Y = E(X \mid \mathcal{B}) = \int_{-\infty}^{\infty} x\, P(X \in dx \mid \mathcal{B}).$$
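For a σ-algebra generated by a finite partition, the integral in Definition 3.33 reduces to averaging X over each block, and condition (CE2) can be checked directly. The six-point uniform model below is an illustrative assumption, not from the text:

```python
from fractions import Fraction

# Finite model: Omega = {1,...,6}, uniform P, X(w) = w,
# and B generated by the partition {1,2,3} | {4,5,6}.
omega = range(1, 7)
P = {w: Fraction(1, 6) for w in omega}
partition = [[1, 2, 3], [4, 5, 6]]

def cond_exp(X, partition):
    # On each block Theta, E(X | B) equals the constant E(X; Theta) / P(Theta).
    Y = {}
    for block in partition:
        pb = sum(P[w] for w in block)
        val = sum(X(w) * P[w] for w in block) / pb
        for w in block:
            Y[w] = val
    return Y

X = lambda w: w
Y = cond_exp(X, partition)
# (CE2): E(X; Theta) = E(Y; Theta) for each block Theta of the partition.
for block in partition:
    assert sum(X(w) * P[w] for w in block) == sum(Y[w] * P[w] for w in block)
print(Y)  # Y = 2 on {1,2,3} and 5 on {4,5,6}
```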

Theorem 3.34 Assume that a Borel function f(x) on R satisfies the condition
$$E\left(|f(X)|\right) = \int_{\Omega} |f(X(\omega))|\, dP < \infty.$$
Then we have the formula
$$E\left(f(X) \mid \mathcal{B}\right)(\omega) = \int_{-\infty}^{\infty} f(x)\, P(X \in dx \mid \mathcal{B})(\omega) \tag{3.48}$$
for almost all ω ∈ Ω.

Proof First, by arguing just as in Theorem 3.32 we obtain that the right-hand side of formula (3.48),




$$\int_{-\infty}^{\infty} f(x)\, P(X \in dx \mid \mathcal{B})(\omega) = \lim_{n \to \infty} \sum_{k=-\infty}^{\infty} \frac{k}{2^n}\, P\left(f(X) \in \left[\tfrac{k}{2^n}, \tfrac{k+1}{2^n}\right) \Big| \mathcal{B}\right)(\omega),$$
is B-measurable and finite for almost all ω ∈ Ω. Indeed, it suffices to note that
$$
\int_{\Omega} \left( \int_{-\infty}^{\infty} |f(x)|\, P(X \in dx \mid \mathcal{B})(\omega) \right) dP = \lim_{n \to \infty} \sum_{k=0}^{\infty} \frac{k}{2^n} \int_{\Omega} P\left(f(X) \in A_{n,k} \mid \mathcal{B}\right)(\omega)\, dP = E\left(|f(X)|\right) < \infty.
$$
Moreover, by using the dominated convergence theorem (Theorem 2.12) we have the formula
$$
\begin{aligned}
E\left(f(X); \Lambda\right)
&= \lim_{n \to \infty} \sum_{k=-\infty}^{\infty} \frac{k}{2^n}\, P\left(\left(X \in f^{-1}\left[\tfrac{k}{2^n}, \tfrac{k+1}{2^n}\right)\right) \cap \Lambda\right) \\
&= \lim_{n \to \infty} \sum_{k=-\infty}^{\infty} \frac{k}{2^n}\, E\left(P\left(X \in f^{-1}\left[\tfrac{k}{2^n}, \tfrac{k+1}{2^n}\right) \Big| \mathcal{B}\right); \Lambda\right) \\
&= \lim_{n \to \infty} \sum_{k=-\infty}^{\infty} \frac{k}{2^n} \int_{\Lambda} P\left(f(X) \in \left[\tfrac{k}{2^n}, \tfrac{k+1}{2^n}\right) \Big| \mathcal{B}\right)(\omega)\, dP \\
&= E\left(\lim_{n \to \infty} \sum_{k=-\infty}^{\infty} \frac{k}{2^n}\, P\left(f(X) \in \left[\tfrac{k}{2^n}, \tfrac{k+1}{2^n}\right) \Big| \mathcal{B}\right); \Lambda\right) \\
&= E\left(\int_{-\infty}^{\infty} f(x)\, P(X \in dx \mid \mathcal{B}); \Lambda\right) \quad \text{for every } \Lambda \in \mathcal{B}.
\end{aligned}
$$
This proves that the right-hand side of formula (3.48) satisfies condition (CE2).
The proof of Theorem 3.34 is complete.

When X is the characteristic function χ_A of a set A ∈ F, then we have the formula
$$E\left(\chi_A \mid \mathcal{B}\right) = P(A \mid \mathcal{B}).$$
Indeed, it suffices to note that the conditional probability P(A | B) is a B-measurable random variable which satisfies the condition
$$\int_{\Lambda} P(A \mid \mathcal{B})(\omega)\, dP = P(A \cap \Lambda) = \int_{\Lambda} \chi_A(\omega)\, dP \quad \text{for every } \Lambda \in \mathcal{B}.$$


The next theorem summarizes the basic properties of the conditional expectation:

Theorem 3.35 Assume that E(|X|) < ∞. Then we have the following seven assertions (i)–(vii):
(i) If X is B-measurable, then it follows that X(ω) = E(X | B)(ω) for almost all ω ∈ Ω. In particular, we have the formula E(X | B)(ω) = E(X) for almost all ω ∈ Ω if B = {∅, Ω}.
(ii) Conditional expectation is linear in X. Namely, we have, for all a_1, a_2 ∈ R,
$$E\left(a_1 X_1 + a_2 X_2 \mid \mathcal{B}\right)(\omega) = a_1 E(X_1 \mid \mathcal{B})(\omega) + a_2 E(X_2 \mid \mathcal{B})(\omega) \tag{3.49}$$
for almost all ω ∈ Ω.
(iii) If X_1(ω) ≤ X_2(ω) for almost all ω ∈ Ω, then it follows that E(X_1 | B)(ω) ≤ E(X_2 | B)(ω) for almost all ω ∈ Ω.
(iv) If X(ω) ≥ 0 for almost all ω ∈ Ω, then it follows that E(X | B)(ω) ≥ 0 for almost all ω ∈ Ω. More precisely, we have the inequality
$$\left|E(X \mid \mathcal{B})(\omega)\right| \le E\left(|X| \mid \mathcal{B}\right)(\omega) \quad \text{for almost all } \omega \in \Omega. \tag{3.50}$$
(v) If Y is B-measurable and if E(|XY|) < ∞, then it follows that E(XY | B)(ω) = Y(ω) E(X | B)(ω) for almost all ω ∈ Ω.
(vi) If X_n(ω) ↑ X(ω) for almost all ω ∈ Ω, then it follows that E(X_n | B)(ω) ↑ E(X | B)(ω) for almost all ω ∈ Ω.
(vii) If the σ-algebras σ(X) and B are independent, then we have the formula
$$E(X \mid \mathcal{B})(\omega) = E(X) \quad \text{for almost all } \omega \in \Omega. \tag{3.51}$$

Proof (i) This is trivial, since the function X itself satisfies conditions (CE1) and (CE2). Moreover, if B = {∅, Ω}, it follows that E(X | B)(ω) = E(X) for almost all ω ∈ Ω. Indeed, we have the formula
$$\int_{\Omega} E(X)\, dP = E(X) = \int_{\Omega} X(\omega)\, dP.$$
(ii) First, it follows that the function a_1 E(X_1 | B) + a_2 E(X_2 | B) is B-measurable. Moreover, we have, by condition (CE2),
$$
\begin{aligned}
E\left(a_1 E(X_1 \mid \mathcal{B}) + a_2 E(X_2 \mid \mathcal{B}); \Lambda\right)
&= \int_{\Lambda} \left( a_1 E(X_1 \mid \mathcal{B})(\omega) + a_2 E(X_2 \mid \mathcal{B})(\omega) \right) dP \\
&= a_1 \int_{\Lambda} E(X_1 \mid \mathcal{B})(\omega)\, dP + a_2 \int_{\Lambda} E(X_2 \mid \mathcal{B})(\omega)\, dP \\
&= a_1 \int_{\Lambda} X_1(\omega)\, dP + a_2 \int_{\Lambda} X_2(\omega)\, dP = \int_{\Lambda} \left( a_1 X_1(\omega) + a_2 X_2(\omega) \right) dP \\
&= E\left(a_1 X_1 + a_2 X_2; \Lambda\right) \quad \text{for every } \Lambda \in \mathcal{B}.
\end{aligned}
$$
This proves the desired formula (3.49).
(iii) If we let
$$\Lambda := \left\{ \omega \in \Omega : E(X_2 \mid \mathcal{B})(\omega) < E(X_1 \mid \mathcal{B})(\omega) \right\},$$
then it follows that Λ ∈ B. However, we have, by assertions (i) and (ii),
$$0 \le \int_{\Lambda} \left( E(X_1 \mid \mathcal{B})(\omega) - E(X_2 \mid \mathcal{B})(\omega) \right) dP = \int_{\Lambda} E(X_1 - X_2 \mid \mathcal{B})(\omega)\, dP = \int_{\Lambda} (X_1 - X_2)(\omega)\, dP \le 0,$$
so that
$$\int_{\Lambda} \left( E(X_1 \mid \mathcal{B})(\omega) - E(X_2 \mid \mathcal{B})(\omega) \right) dP = 0.$$
This proves that P(Λ) = 0, since the integrand E(X_1 | B) − E(X_2 | B) is positive on Λ.
(iv) Since we have the inequality
$$-|X(\omega)| \le X(\omega) \le |X(\omega)| \quad \text{for almost all } \omega \in \Omega,$$
it follows from an application of assertion (iii) that
$$-E\left(|X| \mid \mathcal{B}\right)(\omega) \le E(X \mid \mathcal{B})(\omega) \le E\left(|X| \mid \mathcal{B}\right)(\omega) \quad \text{for almost all } \omega \in \Omega.$$
This proves the desired inequality (3.50).
(v) We let
$$Y_n(\omega) := \sum_{k=-\infty}^{\infty} \frac{k}{2^n}\, \chi_{[k/2^n,\,(k+1)/2^n)}(Y(\omega)) \quad \text{for } n \in \mathbf{N}.$$
Then it is easy to see the following two assertions (a) and (b):
(a) The Y_n are B-measurable functions.
(b) Y_n(ω) → Y(ω) for almost all ω ∈ Ω.
Moreover, since |X Y_n| ≤ |X Y| in Ω and E(|XY|) < ∞, by using the dominated convergence theorem (Theorem 2.12) we obtain that


$$
\begin{aligned}
E(XY; \Lambda) &= \int_{\Lambda} X(\omega)Y(\omega)\, dP = \lim_{n \to \infty} \int_{\Lambda} X(\omega)Y_n(\omega)\, dP \\
&= \lim_{n \to \infty} \sum_{k=-\infty}^{\infty} \frac{k}{2^n} \int_{\Lambda} X(\omega)\, \chi_{[k/2^n,\,(k+1)/2^n)}(Y(\omega))\, dP \\
&= \lim_{n \to \infty} \sum_{k=-\infty}^{\infty} \frac{k}{2^n}\, E\left(X\, \chi_{[k/2^n,\,(k+1)/2^n)}(Y); \Lambda\right) \\
&= \lim_{n \to \infty} \sum_{k=-\infty}^{\infty} \frac{k}{2^n}\, E\left(X; \Lambda \cap Y^{-1}\left[\tfrac{k}{2^n}, \tfrac{k+1}{2^n}\right)\right) \quad \text{for every } \Lambda \in \mathcal{B}.
\end{aligned} \tag{3.52}
$$
However, since Y is B-measurable, it follows that
$$\Lambda \cap Y^{-1}\left[\tfrac{k}{2^n}, \tfrac{k+1}{2^n}\right) \in \mathcal{B}.$$
Hence we obtain from condition (CE2) and the dominated convergence theorem that
$$
\begin{aligned}
\lim_{n \to \infty} \sum_{k=-\infty}^{\infty} \frac{k}{2^n}\, E\left(X; \Lambda \cap Y^{-1}\left[\tfrac{k}{2^n}, \tfrac{k+1}{2^n}\right)\right)
&= \lim_{n \to \infty} \sum_{k=-\infty}^{\infty} \frac{k}{2^n}\, E\left(E(X \mid \mathcal{B}); \Lambda \cap Y^{-1}\left[\tfrac{k}{2^n}, \tfrac{k+1}{2^n}\right)\right) \\
&= \lim_{n \to \infty} \int_{\Lambda} E(X \mid \mathcal{B})(\omega) \sum_{k=-\infty}^{\infty} \frac{k}{2^n}\, \chi_{[k/2^n,\,(k+1)/2^n)}(Y(\omega))\, dP \\
&= \lim_{n \to \infty} \int_{\Lambda} Y_n(\omega)\, E(X \mid \mathcal{B})(\omega)\, dP \\
&= E\left(Y E(X \mid \mathcal{B}); \Lambda\right) \quad \text{for every } \Lambda \in \mathcal{B}.
\end{aligned} \tag{3.53}
$$
By combining formulas (3.52) and (3.53), we have proved that
$$E(XY; \Lambda) = E\left(Y E(X \mid \mathcal{B}); \Lambda\right) \quad \text{for every } \Lambda \in \mathcal{B}.$$
This proves that E(XY | B)(ω) = Y(ω) E(X | B)(ω) for almost all ω ∈ Ω.
(vi) Since X_n(ω) ↑ X(ω) for almost all ω ∈ Ω, by applying assertion (iii) we find that E(X_n | B)(ω) is increasing in n. Hence, if we let
$$Y(\omega) := \min\left\{ \limsup_{n \to \infty} E(X_n \mid \mathcal{B})(\omega),\ E(X \mid \mathcal{B})(\omega) \right\},$$


then it follows that $Y$ is $\mathcal{B}$-measurable and further that $E(X_n \mid \mathcal{B})(\omega) \uparrow Y(\omega)$ for almost all $\omega \in \Omega$. Therefore, we obtain from the monotone convergence theorem (Theorem 2.10) and condition (CE2) that
$$
E(X;\ \Lambda) = \lim_{n\to\infty} E(X_n;\ \Lambda) = \lim_{n\to\infty} E\bigl( E(X_n \mid \mathcal{B});\ \Lambda \bigr)
= E(Y;\ \Lambda) \quad \text{for every } \Lambda \in \mathcal{B}.
$$
This proves that $Y(\omega) = E(X \mid \mathcal{B})(\omega)$ for almost all $\omega \in \Omega$.

(vii) Since $X$ is independent of every set $B$ in $\mathcal{B}$, it follows that
$$
P\bigl( B \cap \{X \in A\} \bigr) = P(B)\, P(X \in A) \quad \text{for every } A \in \mathcal{B}(\mathbb{R}).
$$
Therefore, we have the formula
$$
\int_B E(X \mid \mathcal{B})(\omega)\,dP = \int_B X(\omega)\,dP
= \lim_{n\to\infty} \int_B \sum_{k=-\infty}^{\infty} \frac{k}{2^n}\,\chi_{[k/2^n,(k+1)/2^n)}\bigl(X(\omega)\bigr)\,dP
$$
$$
= \lim_{n\to\infty} \sum_{k=-\infty}^{\infty} \frac{k}{2^n} \int_{B \cap X^{-1}[k/2^n,(k+1)/2^n)} dP
= \lim_{n\to\infty} \sum_{k=-\infty}^{\infty} \frac{k}{2^n}\, P\bigl( B \cap \{X \in [k/2^n,\,(k+1)/2^n)\} \bigr)
$$
$$
= \lim_{n\to\infty} \sum_{k=-\infty}^{\infty} \frac{k}{2^n}\, P\bigl( X \in [k/2^n,\,(k+1)/2^n) \bigr) \cdot P(B)
= \int_\Omega X(\omega)\,dP \cdot P(B)
= \int_B E(X)\,dP \quad \text{for every } B \in \mathcal{B}.
$$
This proves the desired formula (3.51).

Now the proof of Theorem 3.35 is complete. $\square$
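The defining conditions become transparent on a finite probability space, where $E(X \mid \mathcal{B})$ is just the $P$-weighted average of $X$ over each atom of $\mathcal{B}$. The following sketch (plain Python; the sample space, the atom-based encoding of $\mathcal{B}$ and all values are hypothetical choices for illustration) verifies assertion (v) — $E(XY \mid \mathcal{B}) = Y\,E(X \mid \mathcal{B})$ for $\mathcal{B}$-measurable $Y$ — and assertion (vii) for an $X$ independent of $\mathcal{B}$:

```python
from fractions import Fraction

# Finite probability space Omega = {0,...,5} with uniform P.
# A sub-sigma-algebra B is encoded by its partition into atoms.
omega = range(6)
P = {w: Fraction(1, 6) for w in omega}
atoms = [{0, 1}, {2, 3}, {4, 5}]            # atoms of B

def cond_exp(X, atoms, P):
    """E(X | B) is constant on each atom: the P-weighted average there."""
    E = {}
    for A in atoms:
        pa = sum(P[w] for w in A)
        avg = sum(X[w] * P[w] for w in A) / pa
        for w in A:
            E[w] = avg
    return E

X = {0: 1, 1: 3, 2: 0, 3: 4, 4: 2, 5: 2}    # arbitrary random variable
Y = {0: 5, 1: 5, 2: -1, 3: -1, 4: 2, 5: 2}  # constant on atoms => B-measurable

EX = cond_exp(X, atoms, P)
EXY = cond_exp({w: X[w] * Y[w] for w in omega}, atoms, P)
# assertion (v): E(XY | B) = Y * E(X | B)
assert all(EXY[w] == Y[w] * EX[w] for w in omega)

# assertion (vii): X independent of B  =>  E(X | B) = E(X).
# Z has the same distribution on every atom, hence is independent of B.
Z = {0: 1, 1: 2, 2: 1, 3: 2, 4: 1, 5: 2}
EZ = cond_exp(Z, atoms, P)
mean_Z = sum(Z[w] * P[w] for w in omega)
assert all(EZ[w] == mean_Z for w in omega)
```

Exact rational arithmetic (`Fraction`) makes the identities hold on the nose, with no floating-point tolerance.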



Theorem 3.36 Assume that $E(|X|) < \infty$. If $\mathcal{B}_1 \subset \mathcal{B}_2$, then we have the formula
$$
E(X \mid \mathcal{B}_1)(\omega) = E\bigl( E(X \mid \mathcal{B}_2) \mid \mathcal{B}_1 \bigr)(\omega)
= E\bigl( E(X \mid \mathcal{B}_1) \mid \mathcal{B}_2 \bigr)(\omega) \tag{3.54}
$$
for almost all $\omega \in \Omega$.


Proof First, we have, for every $\Lambda \in \mathcal{B}_1 \subset \mathcal{B}_2$,
$$
E(X;\ \Lambda) = \int_\Lambda X(\omega)\,dP = \int_\Lambda E(X \mid \mathcal{B}_2)(\omega)\,dP
= E\bigl( E(X \mid \mathcal{B}_2);\ \Lambda \bigr). \tag{3.55}
$$
However, it follows from an application of assertion (iv) of Theorem 3.35 that
$$
|E(X \mid \mathcal{B}_2)(\omega)| \le E(|X| \mid \mathcal{B}_2)(\omega) \quad \text{for almost all } \omega \in \Omega,
$$
so that
$$
E\bigl( |E(X \mid \mathcal{B}_2)| \bigr) \le E\bigl( E(|X| \mid \mathcal{B}_2) \bigr)
= \int_\Omega E(|X| \mid \mathcal{B}_2)(\omega)\,dP
= \int_\Omega |X(\omega)|\,dP
= E(|X|) < \infty.
$$
Hence, by taking the conditional expectation of $Z := E(X \mid \mathcal{B}_2)$ with respect to $\mathcal{B}_1$ we obtain that, for every $\Lambda \in \mathcal{B}_1$,
$$
E\bigl( E(X \mid \mathcal{B}_2);\ \Lambda \bigr) = E(Z;\ \Lambda) = E\bigl( E(Z \mid \mathcal{B}_1);\ \Lambda \bigr)
= E\bigl( E(E(X \mid \mathcal{B}_2) \mid \mathcal{B}_1);\ \Lambda \bigr). \tag{3.56}
$$
Therefore, it follows from formulas (3.55) and (3.56) that
$$
E(X;\ \Lambda) = \int_\Lambda E\bigl( E(X \mid \mathcal{B}_2) \mid \mathcal{B}_1 \bigr)(\omega)\,dP
\quad \text{for every } \Lambda \in \mathcal{B}_1.
$$
This proves the first equality in formula (3.54).

Moreover, since $E(X \mid \mathcal{B}_1)$ is $\mathcal{B}_2$-measurable, by applying assertion (i) of Theorem 3.35 with $X := E(X \mid \mathcal{B}_1)$ and $\mathcal{B} := \mathcal{B}_2$, we obtain that
$$
E\bigl( E(X \mid \mathcal{B}_1) \mid \mathcal{B}_2 \bigr)(\omega) = E(X \mid \mathcal{B}_1)(\omega)
\quad \text{for almost all } \omega \in \Omega.
$$
This proves the second equality in formula (3.54). Now the proof of Theorem 3.36 is complete. $\square$
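On a finite model, formula (3.54) amounts to the observation that averaging over the atoms of the finer $\sigma$-algebra $\mathcal{B}_2$ first and then over the coarser $\mathcal{B}_1$ gives the same result as averaging over $\mathcal{B}_1$ directly. A self-contained sketch (the partitions and values are hypothetical):

```python
from fractions import Fraction

omega = range(8)
P = {w: Fraction(1, 8) for w in omega}
B1 = [{0, 1, 2, 3}, {4, 5, 6, 7}]            # coarse sigma-algebra
B2 = [{0, 1}, {2, 3}, {4, 5}, {6, 7}]        # refinement: B1 subset of B2

def cond_exp(X, atoms, P):
    """E(X | B): P-weighted average of X over each atom of B."""
    E = {}
    for A in atoms:
        pa = sum(P[w] for w in A)
        avg = sum(X[w] * P[w] for w in A) / pa
        for w in A:
            E[w] = avg
    return E

X = {0: 3, 1: 1, 2: 4, 3: 1, 4: 5, 5: 9, 6: 2, 7: 6}

lhs = cond_exp(X, B1, P)                      # E(X | B1)
mid = cond_exp(cond_exp(X, B2, P), B1, P)     # E(E(X | B2) | B1)
rhs = cond_exp(cond_exp(X, B1, P), B2, P)     # E(E(X | B1) | B2)
assert lhs == mid == rhs                      # formula (3.54)
```

The third expression equals the first because $E(X \mid \mathcal{B}_1)$ is already constant on the atoms of $\mathcal{B}_2$, exactly the mechanism used in the second half of the proof.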



Example 3.37 Let $X$ be a random variable and let $\mathbf{X} = (X_1, X_2, \ldots, X_n)$ be a vector-valued random variable on the probability space $(\Omega, \mathcal{F}, P)$. We recall (see Definition 3.29) that the conditional distribution $\psi$ of $X$ with respect to the random variable $(X_1, X_2, \ldots, X_n)$ is given by the formula
$$
\psi(x_1, x_2, \ldots, x_n, A) = P\bigl( X \in A \mid X_1 = x_1,\ X_2 = x_2,\ \ldots,\ X_n = x_n \bigr)
$$
for $\mu_{\mathbf{X}}$-almost all $(x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$ and $A \in \mathcal{B}(\mathbb{R})$. Assume that a Borel measurable function $g(z, x_1, \ldots, x_n)$ on $\mathbb{R}^{n+1}$ satisfies the condition
$$
E\bigl( |g(X, X_1, X_2, \ldots, X_n)| \bigr)
= \int_\Omega |g(X(\omega), X_1(\omega), X_2(\omega), \ldots, X_n(\omega))|\,dP < \infty.
$$
Then we have the formula
$$
E\bigl( g(X, X_1, \ldots, X_n) \mid X_1, X_2, \ldots, X_n \bigr)(\omega)
= \int_{-\infty}^{\infty} g\bigl( x, X_1(\omega), \ldots, X_n(\omega) \bigr)\, \psi\bigl( X_1(\omega), X_2(\omega), \ldots, X_n(\omega),\ dx \bigr) \tag{3.57}
$$
for almost all $\omega \in \Omega$. If we define a Borel measurable function $h(x_1, x_2, \ldots, x_n)$ on $\mathbb{R}^n$ by the formula
$$
h(x_1, x_2, \ldots, x_n) = \int_{-\infty}^{\infty} g(x, x_1, x_2, \ldots, x_n)\, \psi(x_1, x_2, \ldots, x_n,\ dx),
$$
then we obtain from formula (3.57) that
$$
E\bigl( g(X, X_1, \ldots, X_n) \mid X_1, X_2, \ldots, X_n \bigr)(\omega) = h\bigl( X_1(\omega), X_2(\omega), \ldots, X_n(\omega) \bigr)
$$
for almost all $\omega \in \Omega$. We shall write
$$
h(x_1, x_2, \ldots, x_n) = E\bigl( g(X, X_1, X_2, \ldots, X_n) \mid X_1 = x_1,\ X_2 = x_2,\ \ldots,\ X_n = x_n \bigr).
$$

Example 3.38 Let $\mathbf{X} = (X_1, X_2, \ldots, X_n)$ and $\mathbf{Y} = (X_{n+1}, X_{n+2}, \ldots, X_{n+\ell})$ be vector-valued random variables on the probability space $(\Omega, \mathcal{F}, P)$. Then there exists a conditional distribution $\Phi$ of $\mathbf{Y}$ with respect to the random variable $(X_1, X_2, \ldots, X_n)$:
$$
\Phi(x_1, x_2, \ldots, x_n, A) = P\bigl( (X_{n+1}, X_{n+2}, \ldots, X_{n+\ell}) \in A \mid X_1 = x_1,\ X_2 = x_2,\ \ldots,\ X_n = x_n \bigr)
$$
for $\mu_{\mathbf{X}}$-almost all $(x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$ and $A \in \mathcal{B}(\mathbb{R}^\ell)$.


Proof Let $\psi_k$ be the conditional distribution of $X_{k+1}$ with respect to the random variable $(X_1, X_2, \ldots, X_k)$:
$$
\psi_k(x_1, x_2, \ldots, x_k,\ dx_{k+1}) = P\bigl( X_{k+1} \in dx_{k+1} \mid X_1 = x_1,\ X_2 = x_2,\ \ldots,\ X_k = x_k \bigr).
$$
We let
$$
\Phi(x_1, x_2, \ldots, x_n, A)
:= \int_{-\infty}^{\infty} \psi_n(x_1, \ldots, x_n,\ dx_{n+1}) \int_{-\infty}^{\infty} \psi_{n+1}(x_1, \ldots, x_{n+1},\ dx_{n+2}) \cdots
\times \int_{-\infty}^{\infty} \chi_A(x_{n+1}, \ldots, x_{n+\ell})\, \psi_{n+\ell-1}(x_1, \ldots, x_{n+\ell-1},\ dx_{n+\ell})
$$
for $A \in \mathcal{B}(\mathbb{R}^\ell)$. Then we can prove the following two assertions (1) and (2):

(1) For $\mu_{\mathbf{X}}$-almost all $x \in \mathbb{R}^n$, $\Phi(x_1, x_2, \ldots, x_n, \cdot)$ is a probability measure on $\mathbb{R}^\ell$.
(2) For every $A \in \mathcal{B}(\mathbb{R}^\ell)$, $\Phi(\cdot, A)$ is a Borel measurable function on $\mathbb{R}^n$.

Moreover, by applying Theorem 3.36 we have the formula
$$
\Phi\bigl( X_1(\omega), \ldots, X_n(\omega), A \bigr)
= E\Bigl( E\bigl( \cdots E\bigl( E( \chi_A(X_{n+1}, \ldots, X_{n+\ell}) \mid X_1, \ldots, X_{n+\ell-1} ) \mid X_1, \ldots, X_{n+\ell-2} \bigr) \cdots \mid X_1, \ldots, X_n \Bigr)(\omega)
$$
$$
= E\bigl( \chi_A(X_{n+1}, \ldots, X_{n+\ell}) \mid X_1, \ldots, X_n \bigr)(\omega)
= P\bigl( (X_{n+1}, \ldots, X_{n+\ell}) \in A \mid X_1, \ldots, X_n \bigr)(\omega)
$$
for almost all $\omega \in \Omega$. This proves that
$$
\Phi(x_1, x_2, \ldots, x_n, A) = P\bigl( (X_{n+1}, X_{n+2}, \ldots, X_{n+\ell}) \in A \mid X_1 = x_1,\ X_2 = x_2,\ \ldots,\ X_n = x_n \bigr).
$$
The proof of Example 3.38 is complete. $\square$
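For discrete random variables the conditional distribution $\psi$ and the function $h$ of Example 3.37 reduce to elementary ratios of probabilities, so formula (3.57) can be checked by brute force. A sketch with a hypothetical joint law of $(X, X_1)$ on $\{0,1\}^2$; the check performed is the defining identity $E(g(X, X_1)) = E(h(X_1))$:

```python
from fractions import Fraction
from itertools import product

# Joint law of (X, X1) on {0,1} x {0,1} (hypothetical numbers).
p = {(0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
     (1, 0): Fraction(3, 8), (1, 1): Fraction(1, 8)}

g = lambda x, x1: x * (x1 + 2)                # any bounded Borel function

def psi(x1, x):
    """Conditional distribution psi(x1, {x}) = P(X = x | X1 = x1)."""
    marg = sum(p[(xx, x1)] for xx in (0, 1))
    return p[(x, x1)] / marg

# h(x1) = "integral" of g(x, x1) against psi(x1, dx), here a finite sum.
h = {x1: sum(g(x, x1) * psi(x1, x) for x in (0, 1)) for x1 in (0, 1)}

# E(g(X, X1)) must equal E(h(X1)) -- the content of formula (3.57).
lhs = sum(g(x, x1) * p[(x, x1)] for x, x1 in product((0, 1), repeat=2))
rhs = sum(h[x1] * sum(p[(x, x1)] for x in (0, 1)) for x1 in (0, 1))
assert lhs == rhs
```

Exact rational arithmetic again turns the almost-everywhere identity into an exact equality on this finite model.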




3.8 Notes and Comments

The results discussed here are based on Blumenthal–Getoor [20], Lamperti [110], [111], Nishio [134] and Folland [63]. This chapter is adapted from these books in such a way as to make it accessible to graduate students and advanced undergraduates as well.

Section 3.1: The monotone class theorem (Theorem 3.2) and the Dynkin class theorem (Corollary 3.3) were first proved by Dynkin [45]. Our proof is due to Blumenthal–Getoor [20, Chap. 0]. The approximation theorem (Theorem 3.4) is taken from Nishio [134, Chap. 2, Sect. 3, Theorem 5].

Sections 3.2–3.4: The material in these sections is taken from Nishio [134].

Section 3.6: Theorems 3.26 and 3.28 are adapted from Nishio [134, Chap. 7, Sect. 1].

Section 3.7: Theorems 3.32, 3.34 and 3.35 are adapted from Nishio [134, Chap. 7, Sect. 2] and Lamperti [111, Appendix 2].

Chapter 4

Manifolds, Tensors and Densities

The purpose of this chapter is to summarize the basic facts about manifolds and mappings between them which are most frequently used in the theory of partial differential equations. Manifolds are an abstraction of the idea of a surface in Euclidean space. The virtue of manifold theory is that it provides geometric insight into the study of partial differential equations, and intrinsic properties of partial differential equations may be revealed.

4.1 Manifolds

Let $M$ be a set and $0 \le r \le \infty$. An atlas or coordinate neighborhood system of class $C^r$ on $M$ is a family of pairs $\mathcal{A} = \{(U_i, \varphi_i)\}_{i \in I}$ satisfying the following three conditions:

(MA1) Each $U_i$ is a subset of $M$ and $M = \bigcup_{i \in I} U_i$.
(MA2) Each $\varphi_i$ is a bijection of $U_i$ onto an open subset of $\mathbb{R}^n$, and for every pair $i$, $j$ of $I$ with $U_i \cap U_j \ne \emptyset$ the set $\varphi_i(U_i \cap U_j)$ is open in $\mathbb{R}^n$.
(MA3) For each pair $i$, $j$ of $I$ with $U_i \cap U_j \ne \emptyset$ the mapping
$$
\varphi_j \circ \varphi_i^{-1} : \varphi_i(U_i \cap U_j) \longrightarrow \varphi_j(U_i \cap U_j)
$$
is a $C^r$ diffeomorphism. Here a $C^0$ diffeomorphism means a homeomorphism.

In other words, $M$ is a set which can be covered by subsets $U_i$, each of which is parametrized by an open subset of $\mathbb{R}^n$. Each pair $(U_i, \varphi_i)$ is called a chart or coordinate neighborhood of $\mathcal{A}$. The mappings $\varphi_j \circ \varphi_i^{-1}$ in condition (MA3) are called transition maps or coordinate transformations.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. K. Taira, Functional Analytic Techniques for Diffusion Processes, Springer Monographs in Mathematics, https://doi.org/10.1007/978-981-19-1099-9_4


Let $(U, \varphi)$ be a chart on $M$. If $p$ is a point of $U$, then $\varphi(p)$ is a point of $\mathbb{R}^n$ and hence an $n$-tuple of real numbers. We let
$$
\varphi(p) = \bigl( x^1(p),\ x^2(p),\ \ldots,\ x^n(p) \bigr) \quad \text{for all } p \in U. \tag{4.1}
$$
The $n$-tuple $(x^1(p), x^2(p), \ldots, x^n(p))$ of real numbers is called the local coordinates of $p$ in the chart $(U, \varphi)$, and the $n$-tuple $(x^1, x^2, \ldots, x^n)$ of real-valued functions on $U$ is called the local coordinate system on $(U, \varphi)$. Following standard notation, we shall write formula (4.1) as
$$
\varphi(x) = (x^1, x^2, \ldots, x^n) \quad \text{for all } x \in U. \tag{4.2}
$$

Two atlases $\mathcal{A}_1$ and $\mathcal{A}_2$ on $M$ are said to be compatible if the union $\mathcal{A}_1 \cup \mathcal{A}_2$ is an atlas on $M$. It is easy to see that the relation of compatibility between atlases is an equivalence relation. An equivalence class of atlases on $M$ is said to define a $C^r$ structure $\mathcal{D}$ on $M$. The union
$$
\mathcal{A}_{\mathcal{D}} = \bigcup\, \{\mathcal{A} : \mathcal{A} \in \mathcal{D}\}
$$
of the atlases in $\mathcal{D}$ is called the maximal atlas of $\mathcal{D}$, and a chart $(U, \varphi)$ of $\mathcal{A}_{\mathcal{D}}$ is called an admissible chart.

An $n$-dimensional $C^r$ manifold is a pair consisting of a set $M$ and a $C^r$ structure $\mathcal{D}$ on $M$. We often identify the manifold with the underlying set $M$ for notational convenience. Given an atlas $\mathcal{A}$ on $M$, we can obtain a maximal atlas just by including all charts whose transition maps with those in $\mathcal{A}$ are $C^r$ diffeomorphisms. This maximal atlas is said to define the $C^r$ structure generated by $\mathcal{A}$.
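Conditions (MA1)–(MA3) can be tested concretely for the circle $S^1 \subset \mathbb{R}^2$ with its two stereographic charts, where the transition map works out to $t \mapsto 1/t$ on $\mathbb{R} \setminus \{0\}$, a $C^\infty$ diffeomorphism. A numerical sketch (the chart formulas are the standard stereographic projections, stated here without derivation):

```python
# Two stereographic charts on S^1: phi_N omits the north pole (0, 1),
# phi_S omits the south pole (0, -1).  Together they cover S^1 (MA1).
def phi_N(x, y):  return x / (1.0 - y)
def phi_S(x, y):  return x / (1.0 + y)

def phi_N_inv(t):
    d = 1.0 + t * t
    return (2.0 * t / d, (t * t - 1.0) / d)

# On the overlap the transition map phi_S o phi_N^{-1} is t -> 1/t,
# a C-infinity diffeomorphism of R \ {0} onto itself (condition (MA3)).
for t in (-3.0, -0.5, 0.25, 2.0):
    x, y = phi_N_inv(t)
    assert abs(x * x + y * y - 1.0) < 1e-12     # the point lies on S^1
    assert abs(phi_N(x, y) - t) < 1e-12         # phi_N inverts phi_N_inv
    assert abs(phi_S(x, y) - 1.0 / t) < 1e-12   # transition map is 1/t
```

The two charts are therefore compatible, and the atlas they form generates a $C^\infty$ structure on $S^1$.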

4.1.1 Topology on Manifolds

Now we see how to define a topology on a manifold by means of atlases. Let $M$ be an $n$-dimensional $C^r$ manifold. A subset $O$ of $M$ is defined to be open if and only if, for each $x \in O$, there exists an admissible chart $(U, \varphi)$ such that $x \in U$ and $U \subset O$. It is easy to verify that the open sets in $M$ define a topology. A $C^r$ manifold is said to be Hausdorff if it is Hausdorff as a topological space. From now on we assume that our manifolds are Hausdorff.

Let $X$ be a topological space. A collection $\mathcal{C}$ of subsets of $X$ is said to be locally finite if every point of $X$ has a neighborhood which intersects only finitely many elements of $\mathcal{C}$. A covering $\{V_j\}$ of $X$ is called a refinement of a covering $\{U_i\}$ of $X$ if each $V_j$ is contained in some $U_i$. A topological space $X$ is said to be paracompact if it is a Hausdorff space and every open covering of $X$ has a locally finite refinement which is also an open covering of $X$.


The next theorem gives criteria for paracompactness of a manifold:

Theorem 4.1 If $M$ is a $C^0$ manifold, then the following three conditions are equivalent:
(i) $M$ satisfies the second axiom of countability.
(ii) $M$ is a countable union of compact subsets.
(iii) $M$ is paracompact and the number of connected components of $M$ is at most countable.

4.1.2 Submanifolds

Let $M$ be a $C^r$ manifold ($0 \le r \le \infty$) and $N$ a subset of $M$. We say that $N$ is a submanifold of $M$ if, at each point $x$ of $N$, there exists an admissible chart $(U, \varphi)$ on $M$ such that:

(SM) $\varphi : U \to V_1 \times V_2$, where $V_1$ is open in $\mathbb{R}^m$ and $V_2$ is open in $\mathbb{R}^{n-m}$ ($1 \le m \le n$), and we have the formula
$$
\varphi(U \cap N) = V_1 \times \{0\}.
$$

The number $n - m$ is called the codimension of $N$ in $M$. An open subset of $M$ is a submanifold if we take $m = n$, and is called an open submanifold. A submanifold of $M$ is called a closed submanifold if it is a closed subset of $M$.

If $N$ is a submanifold of $M$, then it is a $C^r$ manifold in its own right with the $C^r$ structure generated by the atlas
$$
\{(U \cap N,\ \varphi|_{U \cap N})\},
$$
where $(U, \varphi)$ is an admissible chart on $M$ having property (SM). Furthermore, the topology on $N$ defined by the above atlas is just the relative topology.

4.2 Smooth Mappings

Let $M$ and $N$ be two $C^\infty$ manifolds. A mapping $f : M \to N$ is said to be of class $C^\infty$ if, for each $x \in M$ and each admissible chart $(V, \psi)$ on $N$ with $f(x) \in V$, there exists a chart $(U, \varphi)$ on $M$ with $x \in U$ and $f(U) \subset V$ such that the mapping
$$
\psi \circ f \circ \varphi^{-1} : \varphi(U) \longrightarrow \psi(V)
$$


is of class $C^\infty$. The mapping $\psi \circ f \circ \varphi^{-1}$ is called a local representative of $f$. A mapping $f : M \to N$ is called a $C^\infty$ diffeomorphism if it is a bijection and both $f$ and $f^{-1}$ are of class $C^\infty$. Two $C^\infty$ manifolds are said to be diffeomorphic if there exists a diffeomorphism between them.

Let $M$ be an $n$-dimensional $C^\infty$ manifold and $\{(U_\alpha, \varphi_\alpha)\}_{\alpha \in I}$ an atlas on $M$. Let $U$ be an open set in $M$. A real-valued continuous function $f$ defined on $U$ is of class $C^\infty$ if and only if, for each $\alpha \in I$, the local representative $f \circ \varphi_\alpha^{-1}$ of $f$ is of class $C^\infty$ on $\varphi_\alpha(U \cap U_\alpha)$.

Let $C^\infty(M)$ denote the space of real-valued $C^\infty$ functions on $M$. The space $C^\infty(M)$ has an algebra structure. In fact, the product $fg$, defined by the formula
$$
(fg)(x) = f(x)\,g(x) \quad \text{for all } x \in M,
$$
enjoys the usual algebraic properties of a product.

Let $\varphi : M \to N$ be a $C^\infty$ mapping of manifolds. If $g \in C^\infty(N)$, the pull-back $\varphi^* g$ of $g$ by $\varphi$ is defined by the formula
$$
\varphi^* g = g \circ \varphi \in C^\infty(M).
$$
If $\varphi$ is a diffeomorphism, then $\varphi^* : C^\infty(N) \to C^\infty(M)$ is an isomorphism and $(\varphi^*)^{-1} = (\varphi^{-1})^*$. If $f \in C^\infty(M)$, the push-forward $\varphi_* f$ of $f$ by $\varphi$ is defined by the formula
$$
\varphi_* f = f \circ \varphi^{-1} \in C^\infty(N).
$$
Note that
$$
\varphi_* = (\varphi^{-1})^*, \qquad \varphi^* = (\varphi^{-1})_*.
$$

4.2.1 Partitions of Unity

Let $\{U_i\}_{i \in I}$ be an open covering of a $C^\infty$ manifold $M$. A family $\{g_i\}_{i \in I}$ of $C^\infty$ functions on $M$ is called a partition of unity subordinate to the covering $\{U_i\}_{i \in I}$ if the following three conditions are satisfied:

(PU1) $0 \le g_i(x) \le 1$ for all $x \in M$ and $i \in I$.
(PU2) $\operatorname{supp} g_i \subset U_i$ for each $i \in I$.
(PU3) The collection $\{\operatorname{supp} g_i\}_{i \in I}$ is locally finite and
$$
\sum_{i \in I} g_i(x) = 1 \quad \text{for each } x \in M.
$$

Here $\operatorname{supp} g_i$ is the support of $g_i$, that is, the closure in $M$ of the set $\{x \in M : g_i(x) \ne 0\}$. We give a general theorem on the existence of partitions of unity:


Theorem 4.2 Every paracompact $C^\infty$ manifold $M$ has a partition of unity subordinate to any given open covering.

The next version of a partition of unity plays an important role in the proof of various inequalities of Gårding's type (see [62, Theorem (7.15)], [239, Theorem 19.2]):

Theorem 4.3 Every paracompact $C^\infty$ manifold $M$ has a family $\{h_i\}_{i \in I}$ of $C^\infty$ functions associated with any given open covering $\{U_i\}_{i \in I}$ that satisfies the following conditions:

(PV1) $0 \le h_i(x) \le 1$ for all $x \in M$ and $i \in I$.
(PV2) $\operatorname{supp} h_i \subset U_i$ for each $i \in I$.
(PV3) The collection $\{\operatorname{supp} h_i\}_{i \in I}$ is locally finite and
$$
\sum_{i \in I} h_i(x)^2 = 1 \quad \text{for each } x \in M.
$$

Indeed, if $\{g_i\}_{i \in I}$ is a partition of unity subordinate to $\{U_i\}_{i \in I}$, we have only to let
$$
h_i(x) := \frac{g_i(x)}{\Bigl( \sum_{j \in I} g_j(x)^2 \Bigr)^{1/2}} \quad \text{for each } i \in I.
$$
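The normalization in Theorem 4.3 is easy to observe numerically: dividing by $(\sum_j g_j^2)^{1/2}$ turns any family of nonnegative smooth bumps whose supports are subordinate to the cover, and which are not all zero at any point, into a family whose squares sum to $1$. A sketch on an interval of $\mathbb{R}$ (the cover and the bump profile are hypothetical choices for illustration):

```python
import math

# Smooth bump supported in (a, b): exp(-1/((x-a)(b-x))) extended by 0.
def bump(a, b):
    def f(x):
        if a < x < b:
            return math.exp(-1.0 / ((x - a) * (b - x)))
        return 0.0
    return f

# Overlapping cover of [0, 3] by three open intervals.
cover = [(-0.5, 1.5), (0.5, 2.5), (1.5, 3.5)]
g_raw = [bump(a, b) for a, b in cover]

def h(i, x):
    """h_i = g_i / sqrt(sum_j g_j^2), as in Theorem 4.3 (PV3)."""
    s = sum(g(x) ** 2 for g in g_raw)
    return g_raw[i](x) / math.sqrt(s)

for x in [0.1 * k for k in range(1, 30)]:
    total = sum(h(i, x) ** 2 for i in range(3))
    assert abs(total - 1.0) < 1e-12              # (PV3): squares sum to 1
    for i, (a, b) in enumerate(cover):
        assert h(i, x) == 0.0 or a < x < b       # (PV2): supp h_i in U_i
```

Each $h_i$ inherits the smoothness of the bumps wherever the denominator is positive, which is exactly on the union of the cover.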

4.3 Tangent Bundles

From the notion of directional derivative in Euclidean space we will obtain the notion of a tangent vector to a smooth manifold. Let $M$ be an $n$-dimensional smooth manifold. At each point $x$ of $M$, we consider triples $(U, \varphi, v)$, where $(U, \varphi)$ is a chart at $x$ and $v$ is a vector in $\mathbb{R}^n$. We say that two such triples $(U, \varphi, v)$ and $(V, \psi, w)$ are equivalent if the derivative $(\psi \circ \varphi^{-1})'$ of $\psi \circ \varphi^{-1}$ at $\varphi(x)$ maps $v$ to $w$, that is, if we have the formula
$$
(\psi \circ \varphi^{-1})'(\varphi(x))\,v = w.
$$
It is easy to verify that this is an equivalence relation. An equivalence class of such triples is called a tangent vector of $M$ at $x$. The set of such tangent vectors is denoted by $T_x(M)$, and is called the tangent space of $M$ at $x$. Each chart $(U, \varphi)$ defines a bijection of $T_x(M)$ onto $\mathbb{R}^n$ in such a way that the equivalence class $\bar{v}$ of $(U, \varphi, v)$ corresponds to the vector $v$. In the space $T_x(M)$, we can define addition and scalar multiplication as follows:
$$
\bar{v}_1 + \bar{v}_2 = \overline{v_1 + v_2}, \qquad c\,\bar{v} = \overline{c\,v} \quad \text{for } c \in \mathbb{R}.
$$


Hence the tangent space $T_x(M)$ is a real linear space, and the mapping $v \mapsto \bar{v}$ is an isomorphism of $\mathbb{R}^n$ onto $T_x(M)$. We consider the family of tangent spaces $T_x(M)$ parametrized by the manifold $M$:
$$
T(M) = \bigcup_{x \in M} T_x(M),
$$
and define a mapping
$$
\pi : T(M) \longrightarrow M
$$
by the formula $\pi(\bar{v}) = x$ for $\bar{v} \in T_x(M)$.

Now we see that in a natural way the collection of all tangent vectors itself forms a smooth manifold, called the tangent bundle. We have a similar dual object, called the cotangent bundle, formed from the linear functionals on the tangent spaces. We make $T(M)$ into a $2n$-dimensional smooth manifold by giving natural charts for it: let $(U, \varphi)$ be a chart on $M$. We define a mapping
$$
\tau_\varphi : \pi^{-1}(U) \longrightarrow \varphi(U) \times \mathbb{R}^n
$$
by the formula
$$
\tau_\varphi(\bar{v}) = (\varphi(x),\ v)
$$
if $\pi(\bar{v}) = x$ and $\bar{v}$ is the tangent vector at $x$ represented by $v$ in the chart $(U, \varphi)$. Then the mapping $\tau_\varphi$ is a bijection. Further, if $(U, \varphi)$ and $(V, \psi)$ are two overlapping charts, that is, if $U \cap V \ne \emptyset$, then we have the formula
$$
\pi^{-1}(U) \cap \pi^{-1}(V) = \pi^{-1}(U \cap V),
$$
and the transition map
$$
\tau_\psi \circ \tau_\varphi^{-1} : \varphi(U \cap V) \times \mathbb{R}^n \longrightarrow \psi(U \cap V) \times \mathbb{R}^n
$$
is given by the formula
$$
(\varphi(x),\ v) \longmapsto \bigl( \psi(x),\ (\psi \circ \varphi^{-1})'(\varphi(x))\,v \bigr) \quad \text{for } x \in U \cap V \text{ and } v \in \mathbb{R}^n.
$$
Since the derivative $(\psi \circ \varphi^{-1})'$ is of class $C^\infty$ and is an isomorphism at $\varphi(x)$, we obtain that the family of pairs $\{(\pi^{-1}(U), \tau_\varphi)\}$, where $(U, \varphi)$ ranges over all admissible charts, is an atlas on $T(M)$. This proves that $T(M)$ is a $2n$-dimensional smooth manifold. We call $T(M)$ the tangent bundle of $M$ and $\pi$ the tangent bundle projection of $M$, respectively. Each chart $(\pi^{-1}(U), \tau_\varphi)$ is called a trivializing chart on $T(M)$ over


$U$. Each such trivializing chart on $T(M)$ identifies the tangent bundle over $U$ with the product $\varphi(U) \times \mathbb{R}^n$.

We will study mappings between smooth manifolds and the effect that mappings have on tangent vectors. Let $M$, $N$ be two smooth manifolds and $f : M \to N$ a smooth mapping. At each point $x$ of $M$, we define a map
$$
T_x f : T_x(M) \longrightarrow T_{f(x)}(N)
$$
as follows: if $(U, \varphi)$ is a chart at $x$ and $(V, \psi)$ is a chart at $f(x)$ with $f(U) \subset V$, and if $\bar{v}$ is a tangent vector of $M$ at $x$ represented by $v \in \mathbb{R}^n$ in $(U, \varphi)$, then we let
$$
T_x f(\bar{v}) = \text{the tangent vector of } N \text{ at } f(x) \text{ represented by } f_{\varphi\psi}'(\varphi(x))\,v,
$$
where $f_{\varphi\psi} = \psi \circ f \circ \varphi^{-1}$ is the local representative of $f$. It is easy to verify that the map $T_x f$ is independent of the charts used, and is linear. The map $T_x f$ is called the tangent map of $f$ at $x$. We define the tangent map
$$
T f : T(M) \longrightarrow T(N)
$$
to be the map equal to $T_x f : T_x(M) \to T_{f(x)}(N)$ on each $T_x(M)$.
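In charts, $T_x f$ is multiplication by the derivative of the local representative, so it can be cross-checked against a numerical directional derivative. A sketch with identity charts on $\mathbb{R}^2$ and a hypothetical smooth map $f$:

```python
# With identity charts on R^2, T_x f sends the representative v of a tangent
# vector to J v, where J is the Jacobian of f at the point p.
def f(p):
    x, y = p
    return (x * x, x * y)                     # a hypothetical smooth map

def jacobian_times(p, v):
    x, y = p
    # Analytic Jacobian of f: rows (2x, 0) and (y, x).
    return (2.0 * x * v[0], y * v[0] + x * v[1])

def directional_derivative(f, p, v, h=1e-6):
    """Central-difference approximation of the derivative of f at p along v."""
    a = f((p[0] + h * v[0], p[1] + h * v[1]))
    b = f((p[0] - h * v[0], p[1] - h * v[1]))
    return ((a[0] - b[0]) / (2 * h), (a[1] - b[1]) / (2 * h))

p, v = (1.5, -2.0), (0.7, 1.1)
exact = jacobian_times(p, v)
approx = directional_derivative(f, p, v)
assert all(abs(e - a) < 1e-6 for e, a in zip(exact, approx))
```

The same computation in a second pair of charts would produce the representative $(\psi \circ \varphi^{-1})'(\varphi(x))\,v$ instead, which is exactly the chart-independence asserted above.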

4.4 Vector Fields

Let $M$ be an $n$-dimensional smooth manifold. A smooth vector field on $M$ is a smooth mapping
$$
\Theta : M \longrightarrow T(M)
$$
such that $\Theta(x) \in T_x(M)$ for each $x \in M$. In other words, a vector field $\Theta$ assigns to each point $x$ of $M$ a tangent vector $\Theta(x)$ of $M$ at $x$. The set $\mathfrak{X}(M)$ of all smooth vector fields on $M$ is a real linear space with the obvious operations of addition and scalar multiplication. Rephrased, the space $\mathfrak{X}(M)$ is the space $C^\infty(M; T(M))$ of all smooth sections of the tangent bundle $T(M)$.

If $(U, \varphi)$ is a chart on $M$, then a smooth vector field $\Theta$ on $M$ induces a smooth vector field $\widetilde{\Theta}$ on $\varphi(U)$ by defining
$$
\widetilde{\Theta}(z) = \tau_\varphi \circ \Theta(\varphi^{-1}(z)) \quad \text{for } z \in \varphi(U).
$$


The vector field $\widetilde{\Theta}$ is called the local representative of $\Theta$ in the chart $(U, \varphi)$. If we identify the tangent bundle over $U$ with the product $U \times \mathbb{R}^n$, then $\Theta$ corresponds to a mapping
$$
U \longrightarrow U \times \mathbb{R}^n, \qquad x \longmapsto \bigl( x,\ \Theta^1(x),\ \ldots,\ \Theta^n(x) \bigr),
$$
where $\Theta^1, \ldots, \Theta^n$ are smooth functions on $U$. The $n$-component vector function $(\Theta^1, \ldots, \Theta^n)$ on $U$ is called the local components of $\Theta$ relative to the chart $(U, \varphi)$. We remark that
$$
\widetilde{\Theta}{}^i(\varphi(x)) = \Theta^i(x) \quad \text{on } U. \tag{4.3}
$$

If $f \in C^\infty(M)$ and $\Theta \in \mathfrak{X}(M)$, then the mapping
$$
M \ni x \longmapsto f(x)\,\Theta(x)
$$
defines a smooth vector field on $M$. This is called the product of $f$ and $\Theta$. It is easy to verify that the space $\mathfrak{X}(M)$ is a $C^\infty(M)$-module with respect to this operation of product.

Now we define how vector fields operate on functions. Let $f \in C^\infty(M)$. Since $Tf : T(M) \to T(\mathbb{R}) = \mathbb{R} \times \mathbb{R}$, we can write $Tf$ acting on each $T_x(M)$ in the form
$$
Tf(\bar{v}) = \bigl( f(x),\ df(x) \cdot \bar{v} \bigr) \quad \text{for } \bar{v} \in T_x(M).
$$
Recall that $Tf = T_x f$ on $T_x(M)$ and that $T_x f : T_x(M) \to \mathbb{R}$ is linear. Hence $df(x)$ is an element of the dual space $T_x^*(M)$ of $T_x(M)$, and is called the differential of $f$ at $x$. The dual space $T_x^*(M)$ is called the space of differentials or cotangent space at $x$.

We work out $df$ in local charts. If $(U, \varphi)$ is a chart on $M$, then the local representative of $Tf$ is given by the formula
$$
\bigl( \tilde{f}(z),\ \tilde{f}'(z)\,v \bigr) \quad \text{for } z \in \varphi(U) \text{ and } v \in \mathbb{R}^n,
$$
where $\tilde{f} = f \circ \varphi^{-1}$ is the local representative of $f$. Hence the local representative of $df$ is the derivative of the local representative of $f$. Namely, if $(x^1, \ldots, x^n)$ is a local coordinate system on $(U, \varphi)$, then the local components of $df$ are given by the formulas
$$
(df)_i(x) = \frac{\partial f}{\partial x^i}(x) = \frac{\partial \tilde{f}}{\partial z^i}(\varphi(x)). \tag{4.4}
$$

If $f \in C^\infty(M)$ and $\Theta \in \mathfrak{X}(M)$, we define the derivative of $f$ in the direction $\Theta$ by the formula
$$
\Theta[f](x) = df(x) \cdot \Theta(x) \quad \text{for } x \in M.
$$
The real-valued function


$$
x \longmapsto \Theta[f](x)
$$
on $M$ is denoted by $\Theta[f]$ or $df(\Theta)$. In view of formulas (4.3) and (4.4), it follows that
$$
\Theta[f](x) = \sum_{i=1}^n \Theta^i(x)\,\frac{\partial f}{\partial x^i}(x)
= \sum_{i=1}^n \widetilde{\Theta}{}^i(\varphi(x))\,\frac{\partial \tilde{f}}{\partial z^i}(\varphi(x)) \quad \text{on } U. \tag{4.5}
$$
This proves that
$$
\Theta[f] \in C^\infty(M).
$$
The derivative $\Theta[f]$ is also occasionally denoted by $\mathcal{L}_\Theta f$, and is called the Lie derivative of $f$ along $\Theta$. It follows from formula (4.5) that the mapping
$$
\mathcal{L}_\Theta : C^\infty(M) \longrightarrow C^\infty(M)
$$
satisfies the condition
$$
\mathcal{L}_\Theta(fg) = \mathcal{L}_\Theta f \cdot g + f \cdot \mathcal{L}_\Theta g \quad \text{for all } f, g \in C^\infty(M). \tag{4.6}
$$

A mapping $D : C^\infty(M) \to C^\infty(M)$ is called a derivation on $C^\infty(M)$ if it is linear and satisfies the condition
$$
D(fg) = Df \cdot g + f \cdot Dg \quad \text{for all } f, g \in C^\infty(M).
$$
The collection of all derivations on $C^\infty(M)$ is a real linear space with the obvious operations of addition and scalar multiplication. Formula (4.6) asserts that, for each $\Theta \in \mathfrak{X}(M)$, the Lie derivative $\mathcal{L}_\Theta$ is a derivation. The next theorem shows the converse:

Theorem 4.4 The collection of all derivations on $C^\infty(M)$ is a real linear space isomorphic to the space $\mathfrak{X}(M)$. More precisely, for each derivation $D$ on $C^\infty(M)$, there exists a unique smooth vector field $\Theta$ on $M$ such that $\mathcal{L}_\Theta = D$.

This theorem provides a local basis for vector fields in the following way: if $(U, \varphi)$ is a chart on $M$ with $\varphi(x) = (x^1, \ldots, x^n)$, we define $n$ derivations
$$
\frac{\partial}{\partial x^1},\ \ldots,\ \frac{\partial}{\partial x^n}
$$
on $C^\infty(U)$ by the formulas
$$
\frac{\partial f}{\partial x^i}(x) = \frac{\partial \tilde{f}}{\partial z^i}(\varphi(x)) \quad \text{for } f \in C^\infty(U) \text{ and } 1 \le i \le n.
$$
These derivations are linearly independent with coefficients in $C^\infty(U)$. Indeed, since we have the formulas


$$
\frac{\partial}{\partial x^i}(x^j) = \delta_i^j \quad \text{for } 1 \le i, j \le n,
$$
it follows that
$$
\sum_{i=1}^n f^i\,\frac{\partial}{\partial x^i} = 0, \quad f^i \in C^\infty(U)
\implies
f^j = \Bigl( \sum_{i=1}^n f^i\,\frac{\partial}{\partial x^i} \Bigr)(x^j) = 0.
$$
Here the $\delta_i^j$ are the usual Kronecker symbols:
$$
\delta_i^j = \begin{cases} 1 & \text{if } j = i, \\ 0 & \text{otherwise.} \end{cases}
$$

Theorem 4.4 tells us that the derivations $\partial/\partial x^1, \ldots, \partial/\partial x^n$ may be identified with smooth vector fields on $U$. If $\Theta \in \mathfrak{X}(M)$ has the local components $(\Theta^1, \ldots, \Theta^n)$ in $(U, \varphi)$, then we have the formula
$$
\mathcal{L}_\Theta = \Theta = \sum_{i=1}^n \Theta^i\,\frac{\partial}{\partial x^i},
$$
with the identification of vector fields with derivations. This proves that the vector fields
$$
\frac{\partial}{\partial x^1},\ \frac{\partial}{\partial x^2},\ \ldots,\ \frac{\partial}{\partial x^n}
$$
form a local basis for the space $\mathfrak{X}(M)$. On the other hand, since we have the formulas
$$
dx^i\Bigl( \frac{\partial}{\partial x^j} \Bigr) = \frac{\partial x^i}{\partial x^j} = \delta_j^i \quad \text{for } 1 \le i, j \le n,
$$
we see that the differentials
$$
dx^1,\ dx^2,\ \ldots,\ dx^n
$$
form a basis of $T_x^*(M)$ dual to the basis $\partial/\partial x^1, \ldots, \partial/\partial x^n$ of $T_x(M)$ at each point $x$ of $U$. Hence, if $f \in C^\infty(U)$, then the differential $df$ has the local expression
$$
df = \sum_{i=1}^n \frac{\partial f}{\partial x^i}\,dx^i,
$$
since we have the formulas
$$
df\Bigl( \frac{\partial}{\partial x^i} \Bigr) = \frac{\partial f}{\partial x^i} \quad \text{for } 1 \le i \le n.
$$
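The derivation property (4.6) can be observed numerically by computing the directional derivative $\sum_i \Theta^i\,\partial f/\partial x^i$ of formula (4.5) with central differences. A sketch in a single chart on $\mathbb{R}^2$ (the vector field components and the functions are hypothetical):

```python
# Lie derivative of f along a vector field with components Theta, computed
# via formula (4.5) with central differences, plus a check of the Leibniz
# rule (4.6):  L(fg) = (Lf) g + f (Lg).
def partial(f, p, i, h=1e-5):
    q1 = list(p); q1[i] += h
    q2 = list(p); q2[i] -= h
    return (f(q1) - f(q2)) / (2 * h)

Theta = lambda p: (p[1], -p[0])               # hypothetical vector field
f = lambda p: p[0] ** 2 + p[1]
g = lambda p: p[0] * p[1]

def lie(func, p):
    comps = Theta(p)
    return sum(c * partial(func, p, i) for i, c in enumerate(comps))

p = [0.8, -1.3]
fg = lambda q: f(q) * g(q)
lhs = lie(fg, p)
rhs = lie(f, p) * g(p) + f(p) * lie(g, p)
assert abs(lhs - rhs) < 1e-6                  # the derivation property (4.6)
```

The finite-difference error is $O(h^2)$, so the identity holds well within the stated tolerance for polynomial test functions.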


Let $(U, \varphi)$ be a chart on $M$ with $\varphi(x) = (x_1, x_2, \ldots, x_n)$. If $v$ is a $C^\infty$ vector field on $M$, then it has the local expression
$$
v = \sum_{i=1}^n \xi^i\,\frac{\partial}{\partial x_i},
$$
where $\xi^1, \xi^2, \ldots, \xi^n$ are $C^\infty$ functions on $U$. The functions $\xi^1, \xi^2, \ldots, \xi^n$ are called the local components of $v$ relative to the chart $(U, \varphi)$. If $(V, \psi)$ is another chart with $\psi(y) = (y_1, y_2, \ldots, y_n)$ such that $U \cap V \ne \emptyset$, then we have the formula
$$
v = \sum_{i=1}^n \xi^i\,\frac{\partial}{\partial x_i} = \sum_{j=1}^n \eta^j\,\frac{\partial}{\partial y_j} \quad \text{on } U \cap V,
$$
where the local components $\xi^i$ and $\eta^j$ are related as follows:
$$
\xi^i = \sum_{\ell=1}^n \eta^\ell\,\frac{\partial x_i}{\partial y_\ell}, \qquad
\eta^j = \sum_{m=1}^n \xi^m\,\frac{\partial y_j}{\partial x_m}.
$$
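The two transformation laws are inverse to one another, since $\partial x/\partial y$ and $\partial y/\partial x$ are inverse Jacobian matrices. A numerical sketch for Cartesian and polar coordinates on $\mathbb{R}^2$ (the point and the components are hypothetical choices):

```python
import math

# Change of local components of a vector field between Cartesian (x1, x2)
# and polar (y1, y2) = (r, theta) coordinates, using
#   xi^i = sum_l eta^l dx_i/dy_l,   eta^j = sum_m xi^m dy_j/dx_m.
r, th = 2.0, 0.7
# Jacobian dx/dy at (r, th), where x1 = r cos th, x2 = r sin th:
dx_dy = [[math.cos(th), -r * math.sin(th)],
         [math.sin(th),  r * math.cos(th)]]
# Inverse Jacobian dy/dx at the same point:
dy_dx = [[ math.cos(th),      math.sin(th)],
         [-math.sin(th) / r,  math.cos(th) / r]]

eta = (1.3, -0.4)                              # polar components
xi = tuple(sum(eta[l] * dx_dy[i][l] for l in range(2)) for i in range(2))
eta_back = tuple(sum(xi[m] * dy_dx[j][m] for m in range(2)) for j in range(2))
assert all(abs(a - b) < 1e-12 for a, b in zip(eta, eta_back))
```

Applying one law and then the other recovers the original components, which is the consistency required for $v$ to be well defined on $U \cap V$.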

4.5 Vector Fields and Integral Curves

Let $U$ be an open subset of $\mathbb{R}^n$. A vector field on $U$ is a mapping $X$ of $U$ into $\mathbb{R}^n$, which we interpret as assigning a vector to each point of $U$. Let $x_0$ be a point of $U$. An integral curve of $X$ at $x_0$ is a $C^1$ map $c$ from an open interval $I$ of $\mathbb{R}$ containing zero into $U$ such that
$$
\begin{cases} \dot{c}(t) = X(c(t)), \\ c(0) = x_0, \end{cases} \tag{4.7}
$$
where $\dot{c} = dc/dt$.

Let $D$ be a subset of $\mathbb{R}^n$. A mapping $f$ of $D$ into $\mathbb{R}^n$ is said to be Lipschitz continuous on $D$ if there exists a constant $K > 0$ such that
$$
|f(x) - f(y)| \le K\,|x - y| \quad \text{for all } x, y \in D.
$$
The constant $K$ is called a Lipschitz constant for $f$. We say that $f$ is locally Lipschitz continuous on $D$ if it is Lipschitz continuous on compact subsets of $D$. By the mean value theorem, we see that a $C^1$ mapping is locally Lipschitz continuous.

The next theorem is one of the fundamental theorems in the theory of ordinary differential equations:

Theorem 4.5 Let $U$ be an open subset of $\mathbb{R}^n$ and $X : U \to \mathbb{R}^n$ a Lipschitz continuous vector field with Lipschitz constant $K$. Let $x_0 \in U$ and assume that the closed ball


$\overline{B}(x_0; 2a)$ of radius $2a$ about $x_0$ is contained in $U$ and that the vector field $X$ is bounded by a constant $L > 0$ on the ball $\overline{B}(x_0; 2a)$. If $b = \min(1/K,\ 2a/L)$, then there exists a unique $C^1$ map $x : (-b, b) \to U$ such that
$$
\begin{cases} \dot{x}(t) = X(x(t)), \\ x(0) = x_0. \end{cases} \tag{4.8}
$$
Furthermore, if we denote by $\alpha_x$ the solution of the problem
$$
\begin{cases} \dot{x}(t) = X(x(t)), \\ x(0) = x, \end{cases} \tag{4.9}
$$
then the mapping $x \mapsto \alpha_x$ of the open ball $B(x_0; a)$ of radius $a$ about $x_0$ into $U$ is Lipschitz continuous.

We restate this theorem in terms of integral curves:

Theorem 4.6 Let $U$ be an open subset of $\mathbb{R}^n$ and $X : U \to \mathbb{R}^n$ a Lipschitz continuous vector field. Then we have the following three assertions:

(i) For each $x_0 \in U$, there exists an integral curve of $X$ at $x_0$.
(ii) If $c_1 : I_1 \to U$ and $c_2 : I_2 \to U$ are two integral curves of $X$ at the same point of $U$, then $c_1 = c_2$ on $I_1 \cap I_2$.
(iii) There exist an open subset $U_0$ of $U$ containing $x_0$, an open interval $I_0$ containing zero and a continuous mapping
$$
\alpha : U_0 \times I_0 \longrightarrow U
$$
such that, for each $x \in U_0$, the mapping $\alpha_x : I_0 \to U$, defined by $\alpha_x(t) = \alpha(x, t)$, is an integral curve of $X$ at $x$. Furthermore, the mapping $\alpha$ is Lipschitz continuous in the variable $x$ and is of class $C^1$ in the variable $t$.

For each $x \in U$, we let
$$
I(x) = \text{the union of all open intervals containing zero on which integral curves of } X \text{ at } x \text{ are defined}.
$$
Parts (i) and (ii) of Theorem 4.6 allow us to define the integral curve uniquely on all of $I(x)$. Furthermore, we let
$$
D_X = \text{the set of those } (x, t) \in U \times \mathbb{R} \text{ such that } t \in I(x),
$$
and define a global flow of $X$ to be the map
$$
\alpha : D_X \longrightarrow U
$$


such that for each $x \in U$ the mapping $\alpha_x : I(x) \to U$, given by $\alpha_x(t) = \alpha(x, t)$, is an integral curve of $X$ at $x$. The curve $\alpha_x$ is called the maximal integral curve of $X$ at $x$. The next theorem describes the set $D_X$ and the mapping $\alpha$:

Theorem 4.7 Let $U$ be an open subset of $\mathbb{R}^n$ and $X : U \to \mathbb{R}^n$ a $C^r$ vector field with $1 \le r \le \infty$. Then we have the following three assertions:

(i) $D_X \supset U \times \{0\}$ and $D_X$ is open in $U \times \mathbb{R}$.
(ii) The mapping $\alpha : D_X \to U$ is of class $C^r$.
(iii) For $(x, t) \in D_X$, the pair $(\alpha(x, t), s)$ is in $D_X$ if and only if the pair $(x, t + s)$ is in $D_X$. In this case, we have the formula
$$
\alpha(x,\ t + s) = \alpha(\alpha(x, t),\ s).
$$

Now let $M$ be a $C^\infty$ manifold. A $C^1$ map $c$ from an open interval $I$ of $\mathbb{R}$ into $M$ is called a curve in $M$. Let $t$ be a point of $I$ and $(U, \varphi)$ a chart at $c(t)$. Shrinking the interval $I$ to an open subinterval $I_0$ such that $c(I_0) \subset U$, we can take the derivative $(\varphi \circ c)'(t)$ as a vector in $\mathbb{R}^n$. This vector represents a tangent vector at $c(t)$, independently of the chart used. In this way we can define a mapping
$$
\dot{c} : I \longrightarrow T(M)
$$
by the formula
$$
\dot{c}(t) = \text{the tangent vector of } M \text{ at } c(t) \text{ represented by } (\varphi \circ c)'(t).
$$

Let $X$ be a $C^r$ vector field on $M$ with $1 \le r \le \infty$. An integral curve of $X$ is a $C^1$ map $c$ from an open interval $I$ into $M$ such that
$$
\dot{c}(t) = X(c(t)) \quad \text{for } t \in I. \tag{4.10}
$$
The local representative of formula (4.10) is given by the formula
$$
\frac{d\tilde{c}}{dt}(t) = \widetilde{X}(\tilde{c}(t)) \quad \text{for } t \in I,
$$
where $\tilde{c} = \varphi \circ c$ is the local representative of $c$ and $\widetilde{X}$ is the local representative of $X$ in the chart $(U, \varphi)$. If the interval $I$ contains zero and $c(0) = x_0$, we say that the map $c$ is an integral curve of $X$ at $x_0$. We remark that Theorems 4.5 and 4.6 extend to this case.
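Theorem 4.7(iii) says that the maximal integral curves fit together into a one-parameter family of maps. A sketch using a classical fourth-order Runge–Kutta integrator (an approximation of the flow, not the flow itself) for the rotation field $X(x^1, x^2) = (-x^2, x^1)$, whose exact flow is rotation through angle $t$:

```python
import math

def X(p):                                      # linear vector field on R^2
    return (-p[1], p[0])                       # generates rotations

def rk4_flow(x0, t, steps=2000):
    """Approximate alpha(x0, t) for the problem (4.8) by RK4 steps."""
    h = t / steps
    p = x0
    for _ in range(steps):
        k1 = X(p)
        k2 = X((p[0] + h/2 * k1[0], p[1] + h/2 * k1[1]))
        k3 = X((p[0] + h/2 * k2[0], p[1] + h/2 * k2[1]))
        k4 = X((p[0] + h * k3[0], p[1] + h * k3[1]))
        p = (p[0] + h/6 * (k1[0] + 2*k2[0] + 2*k3[0] + k4[0]),
             p[1] + h/6 * (k1[1] + 2*k2[1] + 2*k3[1] + k4[1]))
    return p

x0, t, s = (1.0, 0.0), 0.9, 0.6
a = rk4_flow(x0, t + s)
b = rk4_flow(rk4_flow(x0, t), s)
# flow property of Theorem 4.7(iii): alpha(x, t+s) = alpha(alpha(x, t), s)
assert all(abs(u - v) < 1e-9 for u, v in zip(a, b))
exact = (math.cos(t + s), math.sin(t + s))     # exact flow: a rotation
assert all(abs(u - v) < 1e-9 for u, v in zip(a, exact))
```

RK4's global error is $O(h^4)$, far below the tolerances used, so the numerical flow inherits the group property up to discretization error.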


4.6 Cotangent Bundles

In this section we see that the collection of all cotangent spaces itself forms a smooth manifold, a dual object similar to the tangent bundle. Let $M$ be an $n$-dimensional smooth manifold. We let
$$
T^*(M) = \bigcup_{x \in M} T_x^*(M)
$$
be the disjoint union of the cotangent spaces $T_x^*(M)$, and define a mapping
$$
\pi^* : T^*(M) \longrightarrow M
$$
by $\pi^*(\omega) = x$ if $\omega \in T_x^*(M)$.

Now we make $T^*(M)$ into a $2n$-dimensional smooth manifold by giving natural charts for it: let $(U, \varphi)$ be a chart on $M$ with $\varphi(x) = (x^1, \ldots, x^n)$. We define a mapping
$$
\tau_\varphi^* : (\pi^*)^{-1}(U) \longrightarrow \varphi(U) \times \mathbb{R}^n
$$
by the formula
$$
\tau_\varphi^*(\omega) = \bigl( \varphi(x),\ (\omega_1, \ldots, \omega_n) \bigr)
$$
if $\pi^*(\omega) = x$ and $\omega = \sum_{i=1}^n \omega_i\,dx^i$. Then it follows that the mapping $\tau_\varphi^*$ is a bijection, since $(dx^1, \ldots, dx^n)$ is a basis of $T_x^*(M)$ at each point $x$ of $U$. Furthermore, it is easy to see that the family of pairs $\{((\pi^*)^{-1}(U), \tau_\varphi^*)\}$, where $(U, \varphi)$ ranges over all admissible charts, is an atlas on $T^*(M)$. This shows that $T^*(M)$ is a $2n$-dimensional smooth manifold. We call $T^*(M)$ the cotangent bundle of $M$ and $\pi^*$ the cotangent bundle projection of $M$, respectively.

A smooth covector field or differential one-form on $M$ is a smooth mapping
$$
\omega : M \longrightarrow T^*(M)
$$
such that $\omega(x) \in T_x^*(M)$ for each $x \in M$. In other words, a covector field $\omega$ assigns to each point $x$ of $M$ a cotangent vector $\omega(x)$ at $x$. The set $\mathfrak{X}^*(M)$ of all smooth covector fields on $M$ is a real linear space with the obvious operations of addition and scalar multiplication. Rephrased, the space $\mathfrak{X}^*(M)$ is the space $C^\infty(M; T^*(M))$ of all smooth sections of the cotangent bundle $T^*(M)$.

If $\omega \in \mathfrak{X}^*(M)$ and $(U, \varphi)$ is a chart with $\varphi(x) = (x^1, \ldots, x^n)$, then $\omega$ has the local expression
$$
\omega = \sum_{i=1}^n \omega_i\,dx^i,
$$


where $\omega_1, \ldots, \omega_n$ are smooth functions on $U$. The functions $\omega_1, \ldots, \omega_n$ are called the local components of $\omega$ relative to the chart $(U, \varphi)$. If $(V, \psi)$ is another chart with $\psi(y) = (y^1, y^2, \ldots, y^n)$ such that $U \cap V \ne \emptyset$, then we have the formula
$$
\omega = \sum_{i=1}^n \xi_i\,dx^i = \sum_{j=1}^n \eta_j\,dy^j \quad \text{on } U \cap V,
$$
where the local components $\xi_i$ and $\eta_j$ are related as follows:
$$
\xi_i = \sum_{\ell=1}^n \eta_\ell\,\frac{\partial y^\ell}{\partial x^i}, \qquad
\eta_j = \sum_{m=1}^n \xi_m\,\frac{\partial x^m}{\partial y^j}.
$$

If $f \in C^\infty(M)$ and $\omega \in \mathfrak{X}^*(M)$, then the mapping
$$
M \ni x \longmapsto f(x)\,\omega(x)
$$
defines a smooth covector field on $M$. The space $\mathfrak{X}^*(M)$ is a $C^\infty(M)$-module with respect to this operation of product.
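The covariant transformation law for one-form components is dual to the contravariant law for vector fields, and the pairing $\langle \omega, v \rangle = \sum_i \omega_i\,v^i$ is the chart-independent quantity. A numerical sketch with polar and Cartesian coordinates on $\mathbb{R}^2$ (the point and components are hypothetical):

```python
import math

# Covariant change of one-form components, xi_i = sum_l eta_l dy^l/dx^i,
# verified through the chart-independent pairing <omega, v>.
r, th = 1.5, 0.4                               # polar point (y1, y2) = (r, th)
dx_dy = [[math.cos(th), -r * math.sin(th)],    # x1 = r cos th, x2 = r sin th
         [math.sin(th),  r * math.cos(th)]]
dy_dx = [[ math.cos(th),      math.sin(th)],   # inverse Jacobian
         [-math.sin(th) / r,  math.cos(th) / r]]

eta = (0.8, -1.1)        # components of omega in polar coordinates
v_pol = (0.3, 2.0)       # components of a tangent vector in polar coordinates

# To Cartesian coordinates: the covector transforms covariantly,
# the vector contravariantly.
xi = tuple(sum(eta[l] * dy_dx[l][i] for l in range(2)) for i in range(2))
v_cart = tuple(sum(v_pol[l] * dx_dy[i][l] for l in range(2)) for i in range(2))

pair_polar = sum(eta[j] * v_pol[j] for j in range(2))
pair_cart = sum(xi[i] * v_cart[i] for i in range(2))
assert abs(pair_polar - pair_cart) < 1e-12     # <omega, v> is chart-independent
```

The opposite placements of the Jacobian and its inverse are exactly what make the pairing cancel to the identity, mirroring the computation for vector fields at the end of Sect. 4.4.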

4.7 Tensors

Let $\mathbb{K}$ be the real number field $\mathbb{R}$ or the complex number field $\mathbb{C}$, and let $E_1, \ldots, E_p$ be linear spaces over $\mathbb{K}$. A mapping
$$
A : E_1 \times \cdots \times E_p \longrightarrow \mathbb{K}
$$
is said to be $p$-multilinear if $A(v_1, \ldots, v_p)$ is linear in each argument $v_i$ separately, that is,
$$
A(v_1, \ldots, v_{i-1},\ \lambda v_i + \mu w_i,\ v_{i+1}, \ldots, v_p)
= \lambda A(v_1, \ldots, v_i, \ldots, v_p) + \mu A(v_1, \ldots, w_i, \ldots, v_p).
$$
In the case $p = 2$, we say that $A$ is bilinear. The set of all $p$-multilinear mappings of $E_1 \times \cdots \times E_p$ into $\mathbb{K}$ is a linear space over $\mathbb{K}$ with the obvious operations of addition and scalar multiplication. This linear space is denoted by $L(E_1, \ldots, E_p, \mathbb{K})$.

Let $E$ be a finite dimensional linear space over $\mathbb{K}$. We write $E^*$ for $L(E, \mathbb{K})$, the space of all linear functionals on $E$. The space $E^*$ is called the dual space of $E$. We remark that $E$ may be identified with its bidual space $E^{**} = L(E^*, \mathbb{K})$ by the isomorphism $e \mapsto e^{**}$ defined by the formula
$$
e^{**}(\alpha) = \alpha(e) \quad \text{for all } \alpha \in E^*.
$$


We let
$$
T_s^r(E) = L\bigl( \underbrace{E^* \times \cdots \times E^*}_{r \text{ copies}} \times \underbrace{E \times \cdots \times E}_{s \text{ copies}},\ \mathbb{K} \bigr),
$$
with $r$ copies of $E^*$ and $s$ copies of $E$. The elements of $T_s^r(E)$ are called tensors on $E$, contravariant of order $r$ and covariant of order $s$, or simply of type $\binom{r}{s}$. In particular, we have the formulas
$$
T_0^1(E) = L(E^*, \mathbb{K}) = E, \qquad T_1^0(E) = L(E, \mathbb{K}) = E^*.
$$

If $t_1 \in T_{s_1}^{r_1}(E)$ and $t_2 \in T_{s_2}^{r_2}(E)$, we define the tensor product $t_1 \otimes t_2$ of $t_1$ and $t_2$ by the formula
$$
(t_1 \otimes t_2)(\beta^1, \ldots, \beta^{r_1},\ \gamma^1, \ldots, \gamma^{r_2},\ v_1, \ldots, v_{s_1},\ w_1, \ldots, w_{s_2})
= t_1(\beta^1, \ldots, \beta^{r_1},\ v_1, \ldots, v_{s_1})\, t_2(\gamma^1, \ldots, \gamma^{r_2},\ w_1, \ldots, w_{s_2}).
$$
Then we have
$$
t_1 \otimes t_2 \in T_{s_1 + s_2}^{r_1 + r_2}(E).
$$
Also it is easy to see that the operation $\otimes$ is bilinear and associative.

Assume that the linear space $E$ has dimension $n$. Let $(e_1, \ldots, e_n)$ be a basis of $E$ and $(e^1, \ldots, e^n)$ the corresponding dual basis of $E^*$, that is, $e^j(e_i) = \delta_i^j$. Then the $n^{r+s}$ elements
$$
\bigl\{ e_{i_1} \otimes \cdots \otimes e_{i_r} \otimes e^{j_1} \otimes \cdots \otimes e^{j_s} : 1 \le i_k \le n,\ 1 \le j_k \le n \bigr\}
$$
form a basis of $T_s^r(E)$, so that the space $T_s^r(E)$ has dimension $n^{r+s}$. In fact, every element $t$ of $T_s^r(E)$ can be written in the form
$$
t = \sum_{i_1 \ldots i_r,\ j_1 \ldots j_s} t\bigl( e^{i_1}, \ldots, e^{i_r},\ e_{j_1}, \ldots, e_{j_s} \bigr)\, e_{i_1} \otimes \cdots \otimes e_{i_r} \otimes e^{j_1} \otimes \cdots \otimes e^{j_s}.
$$
The coefficients
$$
t_{j_1 \ldots j_s}^{i_1 \ldots i_r} = t\bigl( e^{i_1}, \ldots, e^{i_r},\ e_{j_1}, \ldots, e_{j_s} \bigr)
$$
are called the components of $t$ relative to the basis $(e_1, \ldots, e_n)$.

The interior product $i_v t$ of a vector $v \in E$ with a tensor $t \in T_s^r(E)$ is defined by the formula
$$
(i_v t)(\beta^1, \ldots, \beta^r,\ v_1, \ldots, v_{s-1}) = t(\beta^1, \ldots, \beta^r,\ v,\ v_1, \ldots, v_{s-1}).
$$

It is easy to see that

$$
i_v : T^r_s(E) \longrightarrow T^r_{s-1}(E)
$$

is a continuous linear map, as is the map v ↦ i_v. Similarly, the interior product i_β t of a form β ∈ E∗ with a tensor t ∈ T^r_s(E) is defined by the formula

$$
(i_\beta t)(\beta^1, \ldots, \beta^{r-1}, v_1, \ldots, v_s) = t(\beta, \beta^1, \ldots, \beta^{r-1}, v_1, \ldots, v_s).
$$

It is easy to see that

$$
i_\beta : T^r_s(E) \longrightarrow T^{r-1}_s(E)
$$

is a continuous linear map, as is the map β ↦ i_β. These two operations take respectively the following form in components:

$$
i_{e_k}\left(e_{i_1} \otimes \cdots \otimes e_{i_r} \otimes e^{j_1} \otimes \cdots \otimes e^{j_s}\right)
= \delta^{j_1}_{k} \, e_{i_1} \otimes \cdots \otimes e_{i_r} \otimes e^{j_2} \otimes \cdots \otimes e^{j_s},
$$

$$
i_{e^k}\left(e_{i_1} \otimes \cdots \otimes e_{i_r} \otimes e^{j_1} \otimes \cdots \otimes e^{j_s}\right)
= \delta^{k}_{i_1} \, e_{i_2} \otimes \cdots \otimes e_{i_r} \otimes e^{j_1} \otimes \cdots \otimes e^{j_s}.
$$
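In concrete computations, a tensor of type (r, s) on Kⁿ is just an array of n^{r+s} components. A minimal NumPy sketch (the slot ordering, contravariant axes first, is a convention assumed here for illustration):

```python
import numpy as np

n = 3
rng = np.random.default_rng(0)

t1 = rng.standard_normal((n, n))   # a type (1, 1) tensor: axis 0 contravariant, axis 1 covariant
t2 = rng.standard_normal(n)        # a type (1, 0) tensor, i.e. a vector

# Tensor product: components simply multiply, (t1 ⊗ t2)^i_j^k = t1^i_j * t2^k.
prod = np.tensordot(t1, t2, axes=0)
assert prod.shape == (n, n, n)                       # n^(r+s) = 27 components
assert np.isclose(prod[1, 2, 0], t1[1, 2] * t2[0])

# Interior product i_v t1: contract v into the covariant slot,
# (i_v t1)^i = t1^i_j v^j, lowering the covariant order by one.
v = rng.standard_normal(n)
ivt = t1 @ v
assert ivt.shape == (n,)
```

The tensordot with `axes=0` is exactly the component formula for ⊗, and the matrix-vector product realizes i_v for a (1, 1) tensor.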

4.8 Tensor Fields

Let M be an n-dimensional smooth manifold. We let

$$
T^r_s(T(M)) = \bigsqcup_{x \in M} T^r_s(T_x(M))
$$

be the disjoint union of the spaces T^r_s(T_x(M)) of tensors on the tangent space T_x(M), contravariant of order r and covariant of order s. This T^r_s(T(M)) carries a natural structure of a smooth manifold of dimension n + n^{r+s}, induced by the tangent bundle T(M) and the cotangent bundle T∗(M). The manifold T^r_s(T(M)) is called the vector bundle of tensors, contravariant of order r and covariant of order s, or simply of type (r, s).

Note that

$$
T^1_0(T(M)) = T(M), \qquad T^0_1(T(M)) = T^*(M).
$$

A smooth tensor field of type (r, s) on M is a smooth mapping

$$
t : M \longrightarrow T^r_s(T(M))
$$

such that

$$
t(x) \in T^r_s(T_x(M)) \quad \text{for each } x \in M.
$$

The set T^r_s(M) of all smooth tensor fields of type (r, s) on M carries a real linear space structure, the addition and scalar multiplication of tensor fields being performed pointwise within each T^r_s(T_x(M)) for x ∈ M. Rephrased, the space T^r_s(M) is the space C∞(M; T^r_s(T(M))) of all smooth sections of the vector bundle T^r_s(T(M)). Note that

$$
T^0_0(M) = C^\infty(M), \qquad
T^1_0(M) = X(M) = C^\infty(M; T(M)), \qquad
T^0_1(M) = X^*(M) = C^\infty(M; T^*(M)).
$$

We give the expression of tensor fields in local charts. Recall that if (U, ϕ) is a chart on M with ϕ(x) = (x^1, …, x^n), then the vector fields

$$
\frac{\partial}{\partial x^1}, \frac{\partial}{\partial x^2}, \ldots, \frac{\partial}{\partial x^n}
$$

form a basis of the tangent space T_x(M) and the differentials

$$
dx^1, dx^2, \ldots, dx^n
$$

form the corresponding dual basis of the cotangent space T_x∗(M) at each point x of U. A tensor field t ∈ T^r_s(M) has the local expression

$$
t = \sum_{\substack{i_1 \ldots i_r \\ j_1 \ldots j_s}} t^{i_1 \ldots i_r}_{j_1 \ldots j_s} \,
\frac{\partial}{\partial x^{i_1}} \otimes \cdots \otimes \frac{\partial}{\partial x^{i_r}} \otimes dx^{j_1} \otimes \cdots \otimes dx^{j_s},
$$

where the t^{i1…ir}_{j1…js} are smooth functions on U. The functions t^{i1…ir}_{j1…js} are called the local components of t relative to the chart (U, ϕ).

A Riemannian metric on M is a C∞ tensor field g of type (0, 2) on M such that g(x) ∈ T^0_2(T_x(M)) is an inner product on T_x(M) for each x ∈ M. In terms of local coordinates, if (U, ϕ) is a chart on M with ϕ(x) = (x1, x2, …, xn), then the local components

$$
g_{ij}(x) = g(x)\left(\frac{\partial}{\partial x_i}, \frac{\partial}{\partial x_j}\right) \quad \text{for } 1 \le i, j \le n
$$

of g in (U, ϕ) are C∞ functions on U and the matrix (g_{ij}(x)) is symmetric and positive definite at every point x of U. Moreover, if (V, ψ) is another chart with ψ(y) = (y1, y2, …, yn) such that U ∩ V ≠ ∅, then the local components

$$
g_{k\ell}(y) = g(y)\left(\frac{\partial}{\partial y_k}, \frac{\partial}{\partial y_\ell}\right) \quad \text{for } 1 \le k, \ell \le n
$$
of g in (V, ψ) are C∞ functions on V, and we have the formula

$$
g = \sum_{1 \le i, j \le n} g_{ij}(x) \, dx_i \otimes dx_j
  = \sum_{1 \le k, \ell \le n} g_{k\ell}(y) \, dy_k \otimes dy_\ell \quad \text{on } U \cap V,
$$

where the local components g_{ij}(x) and g_{kℓ}(y) are related there as follows:

$$
g_{ij}(x) = \sum_{1 \le k, \ell \le n} g_{k\ell}(y) \, \frac{\partial y_k}{\partial x_i} \frac{\partial y_\ell}{\partial x_j}, \tag{4.11a}
$$

$$
g_{k\ell}(y) = \sum_{1 \le i, j \le n} g_{ij}(x) \, \frac{\partial x_i}{\partial y_k} \frac{\partial x_j}{\partial y_\ell}. \tag{4.11b}
$$

A C∞ manifold with a Riemannian metric is called a Riemannian manifold. We give a general theorem on the existence of Riemannian metrics.

Theorem 4.8 Every paracompact smooth manifold admits a Riemannian metric.

Finally, if (M, g) is a Riemannian manifold, we can define the divergence div v of a vector field

$$
v = \sum_{i=1}^{n} \xi^i \frac{\partial}{\partial x_i} \in X(M)
$$

by the formula

$$
\operatorname{div} v = \frac{1}{\sqrt{\det(g_{ij})}} \sum_{k=1}^{n} \frac{\partial}{\partial x_k}\left(\sqrt{\det(g_{ij})} \, \xi^k\right) \in C^\infty(M), \tag{4.12}
$$

and also the gradient vector field grad f of a C∞ function f ∈ C∞(M) by the formula

$$
\operatorname{grad} f = \sum_{i=1}^{n} \left( \sum_{j=1}^{n} g^{ij} \frac{\partial f}{\partial x_j} \right) \frac{\partial}{\partial x_i} \in X(M),
$$

where (g^{ij}) denotes the inverse of the matrix (g_{ij}).
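Formula (4.12) can be tested in a familiar chart. The sketch below (SymPy; the polar chart on the punctured plane and the field v = r ∂/∂r, the position field, are illustrative choices) recovers the expected divergence:

```python
import sympy as sp

# Formula (4.12) in the polar chart (x1, x2) = (r, theta) on R^2 minus the origin,
# where the Euclidean metric has local components g = diag(1, r^2).
r, theta = sp.symbols('r theta', positive=True)
g = sp.Matrix([[1, 0], [0, r**2]])
xi = [r, 0]                          # components xi^k of v = r d/dr
coords = [r, theta]

sqrt_det = sp.sqrt(g.det())          # = r
div_v = sp.simplify(
    sum(sp.diff(sqrt_det * xi[k], coords[k]) for k in range(2)) / sqrt_det
)
print(div_v)  # 2, matching div(x1 d/dx1 + x2 d/dx2) = 2 on the plane
```

The coordinate-free answer (divergence of the position field on a 2-dimensional space) is 2, independent of the chart, which is what the formula returns.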

4.9 Exterior Product

The permutation group S_k on k letters consists of all bijections

σ : {1, …, k} → {1, …, k},

usually written in the following form:

$$
\sigma = \begin{pmatrix} 1 & 2 & \ldots & k \\ \sigma(1) & \sigma(2) & \ldots & \sigma(k) \end{pmatrix}.
$$
A transposition is a permutation that swaps two elements of {1, …, k}, leaving the remainder fixed. A permutation is said to be even (resp. odd) if it can be written as the product of an even (resp. odd) number of transpositions. The expression of a permutation as a product of transpositions is not unique, but the parity of the number of transpositions is always the same. We define the signature, sign σ, of a permutation σ by the formula

$$
\operatorname{sign} \sigma =
\begin{cases}
+1 & \text{if } \sigma \text{ is even}, \\
-1 & \text{if } \sigma \text{ is odd}.
\end{cases}
$$

Let K denote the real number field R or the complex number field C. Throughout this section, let E be an n-dimensional linear space over K. Recall that

T^0_k(E) = the space of k-multilinear mappings of E × ⋯ × E into K.

The group S_k acts on T^0_k(E). Indeed, each σ ∈ S_k defines a mapping σ : T^0_k(E) → T^0_k(E) by the formula

$$
(\sigma t)(e_1, \ldots, e_k) = t\left(e_{\sigma(1)}, \ldots, e_{\sigma(k)}\right) \quad \text{for } t \in T^0_k(E),
$$

where e1, …, ek ∈ E. A mapping t ∈ T^0_k(E) is said to be alternating (resp. symmetric) if σt = (sign σ)t (resp. σt = t) for all σ ∈ S_k. It is easy to see that t ∈ T^0_k(E) is alternating if and only if t(e1, …, ek) = 0 whenever ei = ej for some i ≠ j.

The set of all alternating elements of T^0_k(E) is a linear subspace of T^0_k(E). This space is denoted by Λ^k E∗, and is called the k-th exterior product of E∗. The elements of Λ^k E∗ are called exterior k-forms. We remark that

$$
\Lambda^k E^* = \{0\} \quad \text{if } k > n.
$$

We define the alternation mapping A : T^0_k(E) → T^0_k(E) by the formula

$$
At(e_1, \ldots, e_k) = \frac{1}{k!} \sum_{\sigma \in S_k} (\operatorname{sign} \sigma) \, t\left(e_{\sigma(1)}, \ldots, e_{\sigma(k)}\right).
$$

Then we have the following:

Proposition 4.9 The mapping A is a linear mapping of T^0_k(E) onto Λ^k E∗, and is the identity map on Λ^k E∗.
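The alternation mapping is easy to implement directly from the formula above. A small standard-library sketch (the bilinear map t below is an arbitrary illustrative choice):

```python
from itertools import permutations
from math import factorial

def sign(perm):
    """Signature of a permutation given as a tuple of 0-based indices."""
    inv = sum(1 for i in range(len(perm))
                for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inv % 2 else 1

def alternation(t, k):
    """A t, computed as the signed average over S_k from the formula above."""
    def At(*vectors):
        return sum(sign(s) * t(*(vectors[i] for i in s))
                   for s in permutations(range(k))) / factorial(k)
    return At

t = lambda u, v: u[0] * v[1]      # bilinear but not alternating
At = alternation(t, 2)
e1, e2 = (1, 0), (0, 1)
print(At(e1, e2), At(e2, e1), At(e1, e1))  # 0.5 -0.5 0.0
```

The output is alternating (it changes sign under a swap and vanishes on repeated arguments), and applying `alternation` to it again would reproduce it, as Proposition 4.9 states.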

If α ∈ T^0_k(E) and β ∈ T^0_ℓ(E), we define the exterior product α ∧ β of α and β by the formula

$$
\alpha \wedge \beta = \frac{(k+\ell)!}{k! \, \ell!} A(\alpha \otimes \beta).
$$

Then we have α ∧ β ∈ Λ^{k+ℓ} E∗. Furthermore, the following formula is a convenient way to compute exterior products:

$$
(\alpha \wedge \beta)(e_1, \ldots, e_{k+\ell})
= \sum\nolimits' (\operatorname{sign} \sigma) \, \alpha\left(e_{\sigma(1)}, \ldots, e_{\sigma(k)}\right) \cdot \beta\left(e_{\sigma(k+1)}, \ldots, e_{\sigma(k+\ell)}\right),
$$

where Σ′ denotes the sum over all (k, ℓ) shuffles; that is, permutations σ of {1, 2, …, k+ℓ} such that σ(1) < σ(2) < ⋯ < σ(k) and σ(k+1) < σ(k+2) < ⋯ < σ(k+ℓ).

Example 4.10 If α1, …, αk ∈ E∗, then we have the formula

$$
(\alpha_1 \wedge \ldots \wedge \alpha_k)(e_1, \ldots, e_k)
= \sum_{\sigma \in S_k} (\operatorname{sign} \sigma) \, \alpha_1(e_{\sigma(1)}) \cdots \alpha_k(e_{\sigma(k)})
= \det\left(\alpha_i(e_j)\right).
$$

In particular, if (e1, …, en) is a basis of E and (e^1, …, e^n) is the corresponding dual basis of E∗, then we have the formula

$$
(e^1 \wedge \ldots \wedge e^k)(e_1, \ldots, e_k) = 1.
$$

The next proposition summarizes basic properties of the operation ∧:

Proposition 4.11 Let α ∈ T^0_k(E), β ∈ T^0_ℓ(E) and γ ∈ T^0_m(E). Then we have the following:

(i) α ∧ β = Aα ∧ β = α ∧ Aβ.
(ii) The operation ∧ is bilinear.
(iii) α ∧ β = (−1)^{kℓ} β ∧ α.
(iv) α ∧ (β ∧ γ) = (α ∧ β) ∧ γ.
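Example 4.10's determinant identity and the anticommutation rule (iii) can both be checked numerically for 1-forms (a sketch; the particular covectors and vectors are arbitrary):

```python
from itertools import permutations
from math import prod

def sign(perm):
    inv = sum(1 for i in range(len(perm))
                for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inv % 2 else 1

def wedge(*alphas):
    """(a1 ^ ... ^ ak)(v1, ..., vk) = sum over S_k of (sign s) a1(v_s(1)) ... ak(v_s(k))."""
    k = len(alphas)
    def form(*vectors):
        return sum(sign(s) * prod(alphas[i](vectors[s[i]]) for i in range(k))
                   for s in permutations(range(k)))
    return form

a = lambda v: v[0]          # the dual basis covector e^1 on R^2
b = lambda v: v[1]          # e^2
e1, e2 = (1.0, 0.0), (0.0, 1.0)
u, w = (2.0, 3.0), (-1.0, 4.0)

assert wedge(a, b)(e1, e2) == 1.0               # (e^1 ^ e^2)(e1, e2) = det(I) = 1
assert wedge(a, b)(u, w) == 11.0                # = det [[2, 3], [-1, 4]]
assert wedge(a, b)(u, w) == -wedge(b, a)(u, w)  # rule (iii) with k = l = 1
```

For 1-forms the sum over S_k and the sum over shuffles coincide, since every permutation is a (1, …, 1) shuffle composed with sign bookkeeping.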

The next proposition describes bases of Λ^k E∗:

Proposition 4.12 For 2 ≤ k ≤ n, the space Λ^k E∗ has dimension

$$
\binom{n}{k} = \frac{n!}{k! \, (n-k)!}.
$$

More precisely, if (e1, …, en) is a basis of E and (e^1, …, e^n) is the corresponding dual basis of E∗, then the $\binom{n}{k}$ elements

$$
\left\{ e^{i_1} \wedge \ldots \wedge e^{i_k} : 1 \le i_1 < \ldots < i_k \le n \right\}
$$

form a basis of Λ^k E∗.
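The basis of Proposition 4.12 is indexed by strictly increasing k-tuples, so its size can be checked with the standard library (a trivial sketch; n = 5 and k = 3 are arbitrary choices):

```python
from itertools import combinations
from math import comb

# The basis elements e^{i1} ^ ... ^ e^{ik} are indexed by the strictly
# increasing tuples 1 <= i1 < ... < ik <= n.
n, k = 5, 3
basis_indices = list(combinations(range(1, n + 1), k))
print(len(basis_indices), comb(n, k))  # 10 10
```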

4.10 Differential Forms

Let M be an n-dimensional smooth manifold. We let

$$
\Lambda^k T^*(M) = \bigsqcup_{x \in M} \Lambda^k T_x^*(M)
$$

be the disjoint union of the k-th exterior products of the cotangent spaces T_x∗(M). The elements of Λ^k T_x∗(M) are called exterior k-forms at x. The space Λ^k T∗(M) carries a natural structure of smooth manifold of dimension n + $\binom{n}{k}$, induced by the cotangent bundle T∗(M). We call Λ^k T∗(M) the vector bundle of exterior k-forms on the tangent spaces of M.

A differential form of order k, or simply a k-form, on M is a smooth mapping

$$
\omega : M \longrightarrow \Lambda^k T^*(M)
$$

such that ω(x) ∈ Λ^k T_x∗(M). The set Ω^k(M) of all k-forms on M is a real linear space with the obvious operations of addition and scalar multiplication. Rephrased, the space Ω^k(M) is the space C∞(M; Λ^k T∗(M)) of all smooth sections of the vector bundle Λ^k T∗(M). Note that

$$
\Omega^0(M) = C^\infty(M), \qquad \Omega^1(M) = C^\infty(M; T^*(M)) = X^*(M).
$$

If ω ∈ Ω^k(M) and η ∈ Ω^ℓ(M), we define the exterior product ω ∧ η ∈ Ω^{k+ℓ}(M) of ω and η by the formula

$$
(\omega \wedge \eta)(X_1, \ldots, X_{k+\ell})
= \sum\nolimits' (\operatorname{sign} \sigma) \, \omega\left(X_{\sigma(1)}, \ldots, X_{\sigma(k)}\right) \cdot \eta\left(X_{\sigma(k+1)}, \ldots, X_{\sigma(k+\ell)}\right).
$$

Here Σ′ denotes the sum over all (k, ℓ) shuffles; that is, permutations σ of {1, 2, …, k+ℓ} such that σ(1) < σ(2) < ⋯ < σ(k) and σ(k+1) < σ(k+2) < ⋯ < σ(k+ℓ), and X1, …, X_{k+ℓ} are arbitrary vector fields on M.

We give the expression of differential forms in local charts. We remark that if (U, ϕ) is a chart on M with ϕ(x) = (x^1, …, x^n), then the $\binom{n}{k}$ elements

$$
\left\{ dx^{i_1} \wedge \ldots \wedge dx^{i_k} : 1 \le i_1 < \ldots < i_k \le n \right\}
$$

form a basis of Λ^k T_x∗(M) at each point x of U. A differential form ω ∈ Ω^k(M) has the local expression

$$
\omega = \sum_{1 \le i_1 < \ldots < i_k \le n} \omega_{i_1 \ldots i_k} \, dx^{i_1} \wedge \ldots \wedge dx^{i_k}. \tag{4.13}
$$

5.1 Metric Spaces and the Contraction Mapping Principle

Two metrics ρ1 and ρ2 on a set X are said to be equivalent if, for every ε > 0, there exists δ > 0 such that

$$
\rho_1(x, y) < \delta \implies \rho_2(x, y) < \varepsilon, \qquad
\rho_2(x, y) < \delta \implies \rho_1(x, y) < \varepsilon.
$$

Equivalent metrics induce the same topology on X. If x is a point of X and A is a subset of X, then we define the distance dist(x, A) from x to A by the formula

$$
\operatorname{dist}(x, A) = \inf_{a \in A} \rho(x, a).
$$

Let (X, ρ) be a metric space. A sequence {x_n} in X is called a Cauchy sequence if it satisfies Cauchy's convergence condition

$$
\lim_{n, m \to \infty} \rho(x_n, x_m) = 0.
$$

A metric space X is said to be complete if every Cauchy sequence in X converges to a point in X.

Let (X, ρ) be a metric space. A map T from a subset X0 of X into X is called a contraction on X0 if there exists a number θ with 0 < θ < 1 such that

$$
\rho\left(T(x), T(y)\right) \le \theta \, \rho(x, y) \quad \text{for all } x, y \in X_0. \tag{5.1}
$$

The next theorem is the basis of many important existence theorems in analysis (cf. [66, Chap. 3, Theorem 3.8.2]): Theorem 5.1 (the contraction mapping principle) Let T be a map of a complete metric space (X, ρ) into itself. If T is a contraction, then there exists a unique point z ∈ X such that T (z) = z. A point z for which T (z) = z is called a fixed point of T . Hence Theorem 5.1 is also called a fixed point theorem.
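Theorem 5.1 is constructive: iterating T from any starting point converges to the fixed point. A sketch (the map T(x) = cos x on [0, 1], a contraction there since |T′(x)| = |sin x| ≤ sin 1 < 1, and the starting point and tolerances are illustrative choices):

```python
from math import cos

# Fixed-point iteration for a contraction T on a complete metric space;
# here X = [0, 1] with the usual metric and T(x) = cos(x).
def iterate_to_fixed_point(T, x, tol=1e-12, max_steps=10_000):
    for _ in range(max_steps):
        x_next = T(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("no convergence")

z = iterate_to_fixed_point(cos, 0.5)
print(z)  # ~0.739085..., the unique solution of cos(z) = z
```

The geometric convergence rate θ per step is exactly the contraction constant of (5.1).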

5.2 Linear Operators and Functionals

Let X, Y be linear spaces over the same scalar field K. A mapping T defined on a linear subspace D of X and taking values in Y is said to be linear if it preserves the operations of addition and scalar multiplication:

(L1) T(x1 + x2) = Tx1 + Tx2 for all x1, x2 ∈ D.
(L2) T(αx) = αTx for all x ∈ D and α ∈ K.

We often write Tx, rather than T(x), if T is linear. We let

D(T) = D,
R(T) = {Tx : x ∈ D(T)},
N(T) = {x ∈ D(T) : Tx = 0},

and call them the domain, the range and the null space of T, respectively. The mapping T is called a linear operator from D(T) ⊂ X into Y. We also say that T is a linear operator from X into Y with domain D(T). In the particular case when Y = K, the mapping T is called a linear functional on D(T). In other words, a linear functional is a K-valued function on D(T) that satisfies conditions (L1) and (L2).

If a linear operator T is a one-to-one map of D(T) onto R(T), then it is easy to see that the inverse mapping T⁻¹ is a linear operator from R(T) onto D(T). The mapping T⁻¹ is called the inverse operator or simply the inverse of T. A linear operator T admits the inverse T⁻¹ if and only if Tx = 0 implies that x = 0.

5 A Short Course in Functional Analysis

Let T1 and T2 be two linear operators from a linear space X into a linear space Y with domains D(T1 ) and D(T2 ), respectively. Then we say that T1 = T2 if and only if D(T1 ) = D(T2 ) and T1 x = T2 x for all x ∈ D(T1 ) = D(T2 ). If D(T1 ) ⊂ D(T2 ) and T1 x = T2 x for all x ∈ D(T1 ), then we say that T2 is an extension of T1 and also that T1 is a restriction of T2 , and we write T1 ⊂ T2 .

5.3 Quasinormed Linear Spaces

Let X be a linear space over the real or complex number field K. A real-valued function p defined on X is called a seminorm on X if it satisfies the following three conditions (S1), (S2) and (S3):

(S1) 0 ≤ p(x) < ∞ for all x ∈ X.
(S2) p(αx) = |α| p(x) for all α ∈ K and x ∈ X.
(S3) p(x + y) ≤ p(x) + p(y) for all x, y ∈ X.

Let {p_i} be a countable family of seminorms on X such that

$$
p_1(x) \le p_2(x) \le \cdots \le p_i(x) \le \cdots \quad \text{for each } x \in X, \tag{5.2}
$$

and define

$$
V_{ij} = \left\{ x \in X : p_i(x) < \frac{1}{j} \right\} \quad \text{for } i, j = 1, 2, \ldots.
$$
Then it is easy to verify that the countable family of the sets

x + V_{ij} = {x + y : y ∈ V_{ij}}

satisfies the axioms of a fundamental neighborhood system of x; hence X is a topological space which satisfies the first axiom of countability. Furthermore, we have the following:

Theorem 5.2 Let {p_i} be a countable family of seminorms on a linear space X which satisfies condition (5.2). Assume that

$$
\text{for every non-zero } x \in X, \text{ there exists a seminorm } p_i \text{ such that } p_i(x) > 0. \tag{5.3}
$$

Then the space X is metrizable by the metric

$$
\rho(x, y) = \sum_{i=1}^{\infty} \frac{1}{2^i} \, \frac{p_i(x - y)}{1 + p_i(x - y)} \quad \text{for all } x, y \in X.
$$

If we let

$$
|x| = \rho(x, 0) = \sum_{i=1}^{\infty} \frac{1}{2^i} \, \frac{p_i(x)}{1 + p_i(x)} \quad \text{for } x \in X, \tag{5.4}
$$

then the quantity |x| enjoys the following four properties (Q1), (Q2), (Q3) and (Q4):

(Q1) |x| ≥ 0; |x| = 0 if and only if x = 0.
(Q2) |x + y| ≤ |x| + |y| (the triangle inequality).
(Q3) αn → 0 in K ⟹ |αn x| → 0 for every x ∈ X.
(Q4) |xn| → 0 ⟹ |αxn| → 0 for every α ∈ K.
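A small numerical sketch of the series (5.4) (the increasing seminorm family p_i(x) = max(|x_1|, …, |x_i|) on finitely supported real sequences is an assumption made purely for illustration; it satisfies (5.2) and (5.3)):

```python
# The quasinorm |x| of (5.4), truncated to finitely many terms.
def quasinorm(x, terms=30):
    total, p = 0.0, 0.0
    for i in range(1, terms + 1):
        if i <= len(x):
            p = max(p, abs(x[i - 1]))       # p_i(x) = max(|x_1|, ..., |x_i|)
        total += (p / (1.0 + p)) / 2 ** i   # the i-th summand of (5.4)
    return total

x = [3.0, -1.0, 2.0]
y = [0.5, 4.0, 0.0]
xy = [a + b for a, b in zip(x, y)]

assert quasinorm([0.0]) == 0.0                                # (Q1)
assert quasinorm(xy) <= quasinorm(x) + quasinorm(y) + 1e-12   # (Q2)
```

Note that (Q2) holds because t ↦ t/(1+t) is increasing and subadditive; on the other hand |αx| ≠ |α||x| in general, which is exactly why |·| is only a quasinorm and not a norm.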

This quantity |x| is called a quasinorm of x, and the space X is called a quasinormed linear space. Theorem 5.2 may be restated as follows:

Theorem 5.3 A linear space X, topologized by a countable family {p_i} of seminorms satisfying conditions (5.2) and (5.3), is a quasinormed linear space with respect to the quasinorm |x| defined by formula (5.4).

Let X be a quasinormed linear space. The convergence

$$
\lim_{n \to \infty} |x_n - x| = 0
$$

in X is denoted by s-lim_{n→∞} x_n = x or simply by x_n → x, and we say that the sequence {x_n} converges strongly to x. A sequence {x_n} is called a Cauchy sequence if it satisfies Cauchy's condition

$$
\lim_{m, n \to \infty} |x_m - x_n| = 0.
$$

A quasinormed linear space X is called a Fréchet space if it is complete, that is, if every Cauchy sequence in X converges strongly to a point in X.

If a quasinormed linear space X is topologized by a countable family {p_i} of seminorms which satisfies conditions (5.2) and (5.3), then the above definitions may be reformulated in terms of seminorms as follows:

(i) A sequence {x_n} in X converges strongly to a point x in X if and only if, for every seminorm p_i and every ε > 0, there exists a positive integer N = N(i, ε) such that

n ≥ N ⟹ p_i(x_n − x) < ε.

(ii) A sequence {x_n} in X is a Cauchy sequence if and only if, for every seminorm p_i and every ε > 0, there exists a positive integer N = N(i, ε) such that

m, n ≥ N ⟹ p_i(x_m − x_n) < ε.

Let X be a quasinormed linear space. A linear subspace of X is called a closed subspace if it is a closed subset of X. For example, the closure $\overline{M}$ of a linear subspace

184

5 A Short Course in Functional Analysis

M is a closed subspace. Indeed, the elements of $\overline{M}$ are limits of sequences in M; thus, if x = lim_n x_n with x_n ∈ M and y = lim_n y_n with y_n ∈ M, then it follows that

$$
x + y = \lim_n (x_n + y_n), \qquad \alpha x = \lim_n \alpha x_n \quad \text{for } \alpha \in K.
$$

This proves that x + y ∈ $\overline{M}$ and αx ∈ $\overline{M}$ for all α ∈ K.

5.3.1 Compact Sets

A collection {U_λ}_{λ∈Λ} of open sets of a topological space X is called an open covering of X if X = ⋃_{λ∈Λ} U_λ. A topological space X is said to be compact if every open covering {U_λ} of X contains some finite subcollection which still covers X. If a subset of X is compact considered as a topological subspace of X, then it is called a compact subset of X.

A subset of a topological space X is said to be relatively compact (or precompact) if its closure is a compact subset of X. A topological space X is said to be locally compact if every point of X has a relatively compact neighborhood. A subset of a topological space X is called a σ-compact subset if it is a countable union of compact sets.

Compactness is such a useful property that, given a non-compact space (X, O), it is worthwhile constructing a compact space (X′, O′) with X being its dense subset. Such a space is called a compactification of (X, O). The simplest way in which this can be achieved is by adjoining one extra point ∞ to the space X; a topology O′ can be defined on X′ = X ∪ {∞} in such a way that (X′, O′) is compact and that O is the relative topology induced on X by O′. The topological space (X′, O′) is called the one-point compactification of (X, O), and the point ∞ is called the point at infinity.

Let X be a quasinormed linear space. A subset Y of X is said to be sequentially compact if every sequence {y_n} in Y contains a subsequence {y_{n′}} which converges to a point y of Y:

$$
\lim_{n' \to \infty} |y_{n'} - y| = 0.
$$

Then we have the following criterion for compactness: Theorem 5.4 A subset of a quasinormed linear space X is compact if and only if it is sequentially compact.

5.3.2 Bounded Sets

Let (X, |·|) be a quasinormed linear space. A set B in X is said to be bounded if it satisfies the condition

$$
\sup_{x \in B} |x| < \infty.
$$

We remark that every compact set is bounded. A subset K of X is said to be totally bounded if, for any given ε > 0, there is a finite number of balls

B(x_i, ε) = {x ∈ X : |x − x_i| < ε} for 1 ≤ i ≤ n,

of radius ε about x_i ∈ X that cover K:

$$
K \subset \bigcup_{i=1}^{n} B(x_i, \varepsilon).
$$

Example 5.5 Let (X, |·|) be a quasinormed linear space. Assume that a subset A of X satisfies the following three conditions:

(a) For every h > 0, there exists a totally bounded subset A_h of X.
(b) For each point x ∈ A, there exists a point y ∈ A_h such that |x − y| ≤ h.
(c) For each point z ∈ A_h, there exists a point w ∈ A such that |z − w| ≤ h.

Then the subset A is totally bounded.

Proof For any given ε > 0, we have the covering

$$
A \subset \bigcup_{x \in A} B(x, \varepsilon).
$$

Choose a number h_0 such that 0 < h_0 < ε/2.
Theorem 5.7 Let X and Y be quasinormed linear spaces, topologized by countable families {p_i} and {q_j} of seminorms, respectively, and let T be a linear operator from X into Y with domain D(T). Then T is continuous if and only if, for every seminorm q_j on Y, there exist a seminorm p_i on X and a constant C > 0 such that

$$
q_j(Tx) \le C \, p_i(x) \quad \text{for all } x \in D(T).
$$

5.3.4 Topologies of Linear Operators

We let

L(X, Y) = the collection of continuous linear operators on X into Y.

We define in the set L(X, Y) addition and scalar multiplication of operators in the usual way:

(T + S)x = Tx + Sx for x ∈ X,
(αT)x = α(Tx) for α ∈ K and x ∈ X.

Then L(X, Y) is a linear space. We introduce three different topologies on the space L(X, Y):

(1) Simple convergence topology: This is the topology of convergence at each point of X; a sequence {T_n} in L(X, Y) converges to an element T of L(X, Y) in the simple convergence topology if and only if T_n x → Tx in Y for each x ∈ X.

(2) Compact convergence topology: This is the topology of uniform convergence on compact sets in X; T_n → T in the compact convergence topology if and only if T_n x → Tx in Y uniformly for x ranging over compact sets in X.

(3) Bounded convergence topology: This is the topology of uniform convergence on bounded sets in X; T_n → T in the bounded convergence topology if and only if T_n x → Tx in Y uniformly for x ranging over bounded sets in X.

The simple convergence topology is weaker than the compact convergence topology, and the compact convergence topology is weaker than the bounded convergence topology.

5.3.5 The Banach–Steinhaus Theorem

We introduce three different definitions of boundedness for sets in the space L(X, Y):

(1) A set H in L(X, Y) is said to be bounded in the simple convergence topology if, for each x ∈ X, the set {Tx : T ∈ H} is bounded in Y.

(2) A set H in L(X, Y) is said to be bounded in the compact convergence topology if, for every compact set K in X, the set ⋃_{T∈H} T(K) is bounded in Y.

(3) A set H in L(X, Y) is said to be bounded in the bounded convergence topology if, for every bounded set B in X, the set ⋃_{T∈H} T(B) is bounded in Y.

Furthermore, a set H in L(X, Y) is said to be equicontinuous if, for every seminorm q_j on Y, there exist a seminorm p_i on X and a constant C > 0 such that

$$
\sup_{T \in H} q_j(Tx) \le C \, p_i(x) \quad \text{for } x \in X.
$$

The next theorem states one of the fundamental properties of Fréchet spaces.

Theorem 5.8 (Banach–Steinhaus) Let X be a Fréchet space and Y a quasinormed linear space. Then the following four conditions are equivalent for a subset H of L(X, Y):

(i) H is bounded in the simple convergence topology.
(ii) H is bounded in the compact convergence topology.
(iii) H is bounded in the bounded convergence topology.
(iv) H is equicontinuous.

5.3.6 Product Spaces

Let X and Y be quasinormed linear spaces over the same scalar field K. Then the Cartesian product X × Y becomes a linear space over K if we define the algebraic operations coordinatewise:

{x1, y1} + {x2, y2} = {x1 + x2, y1 + y2},
α{x, y} = {αx, αy} for α ∈ K.

It is easy to verify that the quantity

$$
|\{x, y\}| = \left( |x|_X^2 + |y|_Y^2 \right)^{1/2} \tag{5.7}
$$

satisfies axioms (Q1) through (Q4) of a quasinorm; hence the product space X × Y is a quasinormed linear space with respect to the quasinorm defined by formula (5.7). Furthermore, if X and Y are Fréchet spaces, then so is X × Y. In other words, completeness is inherited by the product space.

5.4 Normed Linear Spaces A quasinormed linear space is called a normed linear space if it is topologized by just one seminorm that satisfies condition (5.3). We give the precise definition of a normed linear space.

Let X be a linear space over the real or complex number field K. A real-valued function ‖·‖ defined on X is called a norm on X if it satisfies the following three conditions (N1), (N2) and (N3):

(N1) ‖x‖ ≥ 0; ‖x‖ = 0 if and only if x = 0.
(N2) ‖αx‖ = |α| ‖x‖ for α ∈ K and x ∈ X.
(N3) ‖x + y‖ ≤ ‖x‖ + ‖y‖ for x, y ∈ X (the triangle inequality).

A linear space X equipped with a norm ‖·‖ is called a normed linear space. The topology on X is defined by the metric ρ(x, y) = ‖x − y‖. The convergence

$$
\lim_{n \to \infty} \|x_n - x\| = 0
$$

in X is denoted by s-lim_{n→∞} x_n = x or simply x_n → x, and we say that the sequence {x_n} converges strongly to x. A sequence {x_n} in X is called a Cauchy sequence if it satisfies the condition

$$
\lim_{n, m \to \infty} \|x_n - x_m\| = 0.
$$

A normed linear space X is called a Banach space if it is complete, that is, if every Cauchy sequence in X converges strongly to a point in X.

Two norms ‖·‖₁ and ‖·‖₂ defined on the same linear space X are said to be equivalent if there exist constants c > 0 and C > 0 such that

$$
c \|x\|_1 \le \|x\|_2 \le C \|x\|_1 \quad \text{for all } x \in X.
$$

Equivalent norms induce the same topology.

If X and Y are normed linear spaces over the same scalar field K, then the product space X × Y is a normed linear space by the norm

$$
\|\{x, y\}\| = \left( \|x\|_X^2 + \|y\|_Y^2 \right)^{1/2}.
$$

If X and Y are Banach spaces, then so is X × Y.

Let X be a normed linear space. If Y is a closed linear subspace of X, then the factor space X/Y is a normed linear space by the norm

$$
\|\tilde{x}\| = \inf_{z \in \tilde{x}} \|z\| \quad \text{for each coset } \tilde{x} \in X/Y. \tag{5.8}
$$

If X is a Banach space, then so is X/Y. The space X/Y, normed by formula (5.8), is called a normed factor space.

5.4.1 Linear Operators on Normed Spaces

Throughout the rest of this section, the letters X, Y, Z denote normed linear spaces over the same scalar field K. The next theorem is a normed linear space version of Theorem 5.7:

Theorem 5.9 Let T be a linear operator from X into Y with domain D(T). Then T is continuous everywhere on D(T) if and only if there exists a constant C > 0 such that

$$
\|Tx\| \le C \|x\| \quad \text{for all } x \in D(T). \tag{5.9}
$$

Remark 5.10 In inequality (5.9), the quantity ‖x‖ is the norm of x in X and the quantity ‖Tx‖ is the norm of Tx in Y. Frequently several norms appear together, but it is clear from the context which is which.

One of the consequences of Theorem 5.9 is the following extension theorem for a continuous linear operator:

Theorem 5.11 If T is a continuous linear operator from X into Y with domain D(T) and if Y is a Banach space, then T has a unique continuous extension T̃ whose domain is the closure $\overline{D(T)}$ of D(T).

As another consequence of Theorem 5.9, we give a necessary and sufficient condition for the existence of the continuous inverse of a linear operator:

Theorem 5.12 Let T be a linear operator from X into Y with domain D(T). Then T admits a continuous inverse T⁻¹ if and only if there exists a constant c > 0 such that

$$
\|Tx\| \ge c \|x\| \quad \text{for all } x \in D(T).
$$

A linear operator T from X into Y with domain D(T) is called an isometry if it is norm-preserving, that is, if we have the formula

$$
\|Tx\| = \|x\| \quad \text{for all } x \in D(T).
$$

It is clear that if T is an isometry, then it is injective and both T and T⁻¹ are continuous. If T is a continuous, one-to-one linear mapping of X onto Y and if its inverse T⁻¹ is also a continuous mapping, then it is called an isomorphism of X onto Y. Two normed linear spaces are said to be isomorphic if there is an isomorphism between them.

By combining Theorems 5.9 and 5.12, we obtain the following:

Theorem 5.13 Let T be a linear operator of X onto Y. Then T is an isomorphism if and only if there exist constants c > 0 and C > 0 such that

$$
c \|x\| \le \|Tx\| \le C \|x\| \quad \text{for all } x \in X.
$$

If T is a continuous linear operator from X into Y with domain D(T), we let

$$
\|T\| = \inf \left\{ C : \|Tx\| \le C \|x\|, \ x \in D(T) \right\}.
$$

Then, in view of the linearity of T, we have the formula

$$
\|T\| = \sup_{\substack{x \in D(T) \\ x \ne 0}} \frac{\|Tx\|}{\|x\|}
      = \sup_{\substack{x \in D(T) \\ \|x\| = 1}} \|Tx\|
      = \sup_{\substack{x \in D(T) \\ \|x\| \le 1}} \|Tx\|. \tag{5.10}
$$

This proves that ‖T‖ is the smallest non-negative number such that

$$
\|Tx\| \le \|T\| \cdot \|x\| \quad \text{for all } x \in D(T). \tag{5.11}
$$
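For a matrix acting on (Rⁿ, ‖·‖₂), the supremum in (5.10) is attained and equals the largest singular value. The sketch below (NumPy; the 2×2 matrix is an arbitrary example) checks both the attained value and the bound (5.11) on sampled unit vectors:

```python
import numpy as np

T = np.array([[2.0, 1.0],
              [0.0, 1.0]])

op_norm = np.linalg.norm(T, 2)            # largest singular value of T

# The sup in (5.10) is attained at the top right singular vector:
_, _, vt = np.linalg.svd(T)
assert np.isclose(np.linalg.norm(T @ vt[0]), op_norm)

# And ||Tx|| <= ||T|| ||x|| on random unit vectors, per (5.11):
rng = np.random.default_rng(1)
xs = rng.standard_normal((1000, 2))
xs /= np.linalg.norm(xs, axis=1, keepdims=True)
assert np.all(np.linalg.norm(xs @ T.T, axis=1) <= op_norm + 1e-12)
```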

Theorem 5.9 asserts that a linear operator T on X into Y is continuous if and only if it maps bounded sets in X into bounded sets in Y. Thus a continuous linear operator on X into Y is usually called a bounded linear operator on X into Y. We let

L(X, Y) = the space of bounded linear operators on X into Y.

In the case of normed linear spaces, the simple convergence topology on L(X, Y) is usually called the strong topology of operators, and the bounded convergence topology on L(X, Y) is called the uniform topology of operators.

In view of formulas (5.10) and (5.11), it follows that the quantity ‖T‖ satisfies axioms (N1), (N2) and (N3) of a norm; hence the space L(X, Y) is a normed linear space by the norm ‖T‖ given by formula (5.10). The topology on L(X, Y) induced by the operator norm ‖T‖ is just the uniform topology of operators.

We give a sufficient condition for the space L(X, Y) to be complete:

Theorem 5.14 If Y is a Banach space, then so is L(X, Y).

If T is a linear operator from X into Y with domain D(T) and S is a linear operator from Y into Z with domain D(S), then we define the product ST as follows:

(a) D(ST) = {x ∈ D(T) : Tx ∈ D(S)},
(b) (ST)(x) = S(Tx) for every x ∈ D(ST).

As for the product of linear operators, we have the following:

Proposition 5.15 If T ∈ L(X, Y) and S ∈ L(Y, Z), then it follows that ST ∈ L(X, Z). Moreover, we have the inequality

$$
\|ST\| \le \|S\| \cdot \|T\|.
$$

We often make use of the following theorem in constructing the bounded inverse of a bounded linear operator:

Theorem 5.16 If T is a bounded linear operator on a Banach space X into itself and satisfies the condition ‖T‖ < 1, then the operator I − T has a unique bounded linear inverse (I − T)⁻¹ which is given by C. Neumann's series

$$
(I - T)^{-1} = \sum_{n=0}^{\infty} T^n.
$$

Here I is the identity operator: Ix = x for every x ∈ X, and T⁰ = I.

The next theorem is a normed linear space version of the Banach–Steinhaus theorem (Theorem 5.8):

Theorem 5.17 (the resonance theorem) Let X be a Banach space, Y a normed linear space and H a subset of L(X, Y). Then the boundedness of the set {‖Tx‖ : T ∈ H} at each x ∈ X implies the boundedness of the set {‖T‖ : T ∈ H}.

Corollary 5.18 Let X be a Banach space, Y a normed linear space and {T_n} a sequence in L(X, Y). If the limit

$$
s\text{-}\lim_{n \to \infty} T_n x = T x \tag{5.12}
$$

exists for each x ∈ X, then it follows that T ∈ L(X, Y). Moreover, we have the inequality

$$
\|T\| \le \liminf_{n \to \infty} \|T_n\|.
$$

The operator T obtained above is called the strong limit of the sequence {T_n}, since the convergence in (5.12) is in the strong topology of operators. We then write T = s-lim_{n→∞} T_n.
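The Neumann series of Theorem 5.16 converges geometrically, which is easy to see numerically for a small matrix with ‖T‖ < 1 (NumPy; the matrix and the number of terms are arbitrary illustrative choices):

```python
import numpy as np

T = np.array([[0.2, 0.3],
              [0.1, 0.4]])
assert np.linalg.norm(T, 2) < 1          # hypothesis of Theorem 5.16

I = np.eye(2)
partial_sum = np.zeros_like(T)
power = np.eye(2)
for _ in range(200):                     # sum_{n=0}^{199} T^n
    partial_sum += power
    power = power @ T

exact = np.linalg.inv(I - T)
assert np.allclose(partial_sum, exact)   # the series recovers (I - T)^{-1}
```

Since the n-th term is bounded in norm by ‖T‖ⁿ, 200 terms are far more than enough here.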

5.4.2 Method of Continuity

The following method of continuity plays an essential role in the proof of existence theorems for the Dirichlet problem (see Theorem A.21 in Appendix A):

Theorem 5.19 (the method of continuity) Let B be a Banach space and let V be a normed linear space. If L0 and L1 are two bounded linear operators from B into V, we define a family of bounded linear operators

$$
L_t = (1 - t) L_0 + t L_1 : B \longrightarrow V \quad \text{for } 0 \le t \le 1.
$$

Assume that there exists a positive constant C, independent of x and t, such that

$$
\|x\|_B \le C \, \|L_t x\|_V \quad \text{for all } x \in B. \tag{5.13}
$$

Then the operator L1 maps B onto V if and only if the operator L0 maps B onto V.

Proof Assume that L_s is surjective for some s ∈ [0, 1]. By inequality (5.13), it follows that L_s is bijective, so that the inverse L_s⁻¹ : V → B exists. Here we remark that

$$
\|L_s^{-1}\| \le C.
$$

Now let t be an arbitrary point of the interval [0, 1]. For any given y ∈ V, the equation L_t x = y is equivalent to the equation

$$
L_s x = L_t x + (L_s - L_t) x = y + (s - t)(L_1 x - L_0 x).
$$

Hence we have the equivalent assertions

$$
L_t x = y \iff x = L_s^{-1}\left( y + (s - t)(L_1 x - L_0 x) \right)
\iff \left( I - (s - t) L_s^{-1}(L_1 - L_0) \right) x = L_s^{-1} y.
$$

However, if |t − s| is so small that

$$
|s - t| < \delta := \frac{1}{C \left( \|L_1\| + \|L_0\| \right)},
$$

then it follows that

$$
\left\| (s - t) L_s^{-1}(L_1 - L_0) \right\|
\le |s - t| \, \|L_s^{-1}\| \, \|L_1 - L_0\|
\le C |s - t| \left( \|L_1\| + \|L_0\| \right)
= \frac{|s - t|}{\delta} < 1.
$$

This proves that the operator

$$
I - (s - t) L_s^{-1}(L_1 - L_0)
$$

has, as a Neumann series (Theorem 5.16), the inverse

$$
\left( I - (s - t) L_s^{-1}(L_1 - L_0) \right)^{-1}
= \sum_{n=0}^{\infty} (s - t)^n \left( L_s^{-1}(L_1 - L_0) \right)^n.
$$

Therefore, we obtain that, for all t ∈ [0, 1] satisfying |t − s| < δ,

$$
L_t x = y \iff x = \left( I - (s - t) L_s^{-1}(L_1 - L_0) \right)^{-1} L_s^{-1} y.
$$

By dividing the interval [0, 1] into subintervals of length less than δ, we find that the mapping L_t is surjective for all t ∈ [0, 1], provided that L_s is surjective for some s ∈ [0, 1]. In particular, this proves that L1 maps B onto V if and only if L0 maps B onto V.

The proof of Theorem 5.19 is complete. □

5.4.3 Finite Dimensional Spaces

The next theorem asserts that there is no point in studying abstract finite dimensional normed linear spaces:

Theorem 5.20 All n-dimensional normed linear spaces over the same scalar field K are isomorphic to Kⁿ with the maximum norm

$$
\|\alpha\| = \max_{1 \le i \le n} |\alpha_i|, \qquad \alpha = (\alpha_1, \alpha_2, \ldots, \alpha_n) \in K^n.
$$

Topological properties of the space Kⁿ thus apply to all finite dimensional normed linear spaces.

Corollary 5.21 All finite dimensional normed linear spaces are complete.

Corollary 5.22 Every finite dimensional linear subspace of a normed linear space is closed.

Corollary 5.23 A subset of a finite dimensional normed linear space is compact if and only if it is closed and bounded.

By Corollary 5.23, it follows that the closed unit ball in a finite dimensional normed linear space is compact. Conversely, this property characterizes finite dimensional spaces:

Theorem 5.24 If the closed unit ball in a normed linear space X is compact, then X is finite dimensional.
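Theorem 5.20 implies in particular that any two norms on a finite dimensional space are equivalent. For the maximum and Euclidean norms on R³ the sharp constants are c = 1 and C = √3 (a standard fact, spot-checked below on arbitrary sample vectors):

```python
from math import sqrt

# ||x||_max <= ||x||_2 <= sqrt(3) ||x||_max on R^3.
def norm_max(x): return max(abs(t) for t in x)
def norm_2(x):   return sqrt(sum(t * t for t in x))

samples = [(1.0, 0.0, 0.0), (1.0, -2.0, 3.0), (-0.5, 0.25, 0.0)]
for x in samples:
    assert norm_max(x) <= norm_2(x) <= sqrt(3) * norm_max(x)
```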

5.4 Normed Linear Spaces


5.4.4 The Hahn–Banach Extension Theorem

The Hahn–Banach extension theorem asserts the existence of linear functionals dominated by norms (see [240, Chap. IV, Sect. 5, Theorem 1 and Corollary]):

Theorem 5.25 (Hahn–Banach) Let X be a normed linear space over the real or complex number field K and let M be a linear subspace of X. If f is a continuous linear functional defined on M, then it can be extended to a continuous linear functional f̃ on X so that ‖f̃‖ = ‖f‖.

Let X be a real or complex normed linear space. A closed subset M of X is said to be balanced if it satisfies the condition

x ∈ M, |α| ≤ 1 ⟹ αx ∈ M.

The next two theorems assert the existence of non-trivial continuous linear functionals (see [240, Chap. IV, Sect. 6, Theorem 3]):

Theorem 5.26 (Mazur) Let X be a real or complex normed linear space and let M be a closed, convex, balanced subset of X. Then, for any element x0 ∉ M there exists a continuous linear functional f0 on X such that

f0(x0) > 1,
|f0(x)| ≤ 1 on M.

Proof Since M is closed and x0 ∉ M, it follows that

dist(x0, M) > 0.

If 0 < d < dist(x0, M), we let

B(0, d/2) = {x ∈ X : ‖x‖ ≤ d/2},
B(x0, d/2) = x0 + B(0, d/2) = {x ∈ X : ‖x − x0‖ ≤ d/2},
U = {x ∈ X : dist(x, M) ≤ d/2}.

Then we have the assertions


U ∩ B(x0, d/2) = ∅,
B(0, d/2) ⊂ U,

since 0 ∈ M. Moreover, since M is convex and balanced, it is easy to verify the following three assertions:

(a) U is convex.
(b) U is balanced.
(c) U is absorbing, that is, for any x ∈ X there exists a constant α > 0 such that α⁻¹x ∈ U.

Hence we can define the Minkowski functional pU of U by the formula

pU(x) = inf {α > 0 : α⁻¹x ∈ U} for every x ∈ X.

Since U is closed, it is easy to verify the following assertions:

pU(x) > 1 if x ∉ U,
pU(x) ≤ 1 if x ∈ U.

Therefore, by applying [240, Chap. IV, Sect. 6, Corollary 1 to Theorem 1] to our situation we can find a continuous linear functional f0 on X such that

f0(x0) = pU(x0) > 1,
|f0(x)| ≤ pU(x) on X.

In particular, since M ⊂ U, we have the assertion

|f0(x)| ≤ pU(x) ≤ 1 on M.

The proof of Theorem 5.26 is complete. □
Theorem 5.27 Let X be a normed linear space and let M be a closed linear subspace of X. Then, for any element x0 ∉ M there exists a continuous linear functional f0 on X such that

f0(x0) > 1,
f0(x) = 0 on M.

Proof Indeed, by Theorem 5.26 it suffices to note that

|f0(x)| ≤ 1 on M ⟹ f0(x) = 0 on M,


since M is a linear space. The proof of Theorem 5.27 is complete. □

Finally, the next theorem asserts that, for each point x0 ≠ 0, there exists a continuous linear functional f0 such that f0(x0) ≠ 0:

Theorem 5.28 Let X be a normed linear space. For each non-zero element x0 of X, there exists a continuous linear functional f0 on X such that

f0(x0) = ‖x0‖, ‖f0‖ = 1.

5.4.5 Dual Spaces

Let X be a normed linear space over the real or complex number field K. A continuous linear functional on X is usually called a bounded linear functional on X. The space L(X, K) of all bounded linear functionals on X is called the dual space of X, and is denoted by X′. We shall write f(x) = ⟨f, x⟩ for the value of the functional f ∈ X′ at the vector x ∈ X. The bounded (resp. simple) convergence topology on X′ is called the strong (resp. weak*) topology on X′, and the dual space X′ equipped with this topology is called the strong (resp. weak*) dual space of X. It follows from an application of Theorem 5.14 with Y := K that the strong dual space X′ is a Banach space with the norm

‖f‖ = sup_{‖x‖≤1} |f(x)| = sup_{x∈X\{0}} |f(x)| / ‖x‖.

We remark that

|⟨f, x⟩| ≤ ‖f‖ · ‖x‖ for all x ∈ X.

Theorem 5.27 asserts that the dual space X′ separates points of X, that is, for any two distinct points x1, x2 of X, there exists a functional f ∈ X′ such that f(x1) ≠ f(x2).
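As a concrete finite dimensional instance: on (R³, maximum norm) every functional has the form f(x) = Σ aᵢxᵢ, and its dual norm is the ℓ¹ norm of the coefficient vector a. A numerical sketch (the coefficients are an arbitrary choice):

```python
import numpy as np

# f(x) = a . x on (R^3, maximum norm); its dual norm is the l^1 norm of a,
# attained at the sign vector of a. The coefficients are an arbitrary choice.
a = np.array([1.0, -2.0, 0.5])
x_star = np.sign(a)                      # a point of the closed unit ball
assert np.isclose(a @ x_star, np.sum(np.abs(a)))     # the supremum is attained

rng = np.random.default_rng(1)
xs = rng.uniform(-1.0, 1.0, size=(10000, 3))         # random unit-ball points
assert np.max(np.abs(xs @ a)) <= np.sum(np.abs(a)) + 1e-12
```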


5.4.6 Annihilators

Let A be a subset of a normed linear space X. An element f of the dual space X′ is called an annihilator of A if it satisfies the condition

f(x) = 0 for all x ∈ A.

We let

A⁰ = {f ∈ X′ : f(x) = 0 for all x ∈ A}

be the set of all annihilators of A. The construction also goes the other way: if B is a subset of X′, we let

⁰B = {x ∈ X : f(x) = 0 for all f ∈ B}

be the set of all annihilators of B. Here are some basic properties of annihilators:

(i) The sets A⁰ and ⁰B are closed linear subspaces of X′ and X, respectively.
(ii) If M is a closed linear subspace of X, then ⁰(M⁰) = M.
(iii) If A is a subset of X and M is the closure of the subspace spanned by A, then M⁰ = A⁰ and M = ⁰(A⁰).
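In X = Rⁿ, where X′ can be identified with Rⁿ via the dot product, the annihilator of a subspace is its orthogonal complement, and property (ii) says that taking annihilators twice recovers the subspace. A small numerical check (the subspace is an arbitrary choice):

```python
import numpy as np

# M = span{(1, 2, 0)} in R^3; M^0 is its orthogonal complement, and 0(M^0) = M.
A = np.array([[1.0, 2.0, 0.0]])
_, _, Vt = np.linalg.svd(A)
M0 = Vt[1:]                              # two vectors spanning the annihilator
assert np.allclose(M0 @ A.T, 0.0)        # every f in M^0 vanishes on M

_, _, Wt = np.linalg.svd(M0)
M_again = Wt[2:]                         # annihilator of M^0, back in R^3
assert np.linalg.matrix_rank(np.vstack([A, M_again])) == 1   # same line as M
```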

5.4.7 Dual Spaces of Normed Factor Spaces

Let M be a closed linear subspace of a normed linear space X. Then each element f of M⁰ defines a bounded linear functional f̃ on the normed factor space X/M by the formula

f̃(x̃) = f(x) for all x̃ ∈ X/M.

Indeed, the value f(x) on the right-hand side does not depend on the choice of a representative x of the equivalence class x̃, and we have the formula ‖f̃‖ = ‖f‖. Furthermore, it is easy to see that the mapping π : f ↦ f̃ of M⁰ into (X/M)′ is linear and surjective; hence we have the following:

Theorem 5.29 The strong dual space (X/M)′ of the factor space X/M can be identified with the space M⁰ of all annihilators of M by the linear isometry π.


5.4.8 Bidual Spaces

Each element x of a normed linear space X defines a bounded linear functional Jx on the strong dual space X′ by the formula

Jx(f) = f(x) for all f ∈ X′.    (5.14)

Then Theorem 5.28 asserts that

‖Jx‖ = sup_{f∈X′, ‖f‖≤1} |Jx(f)| = ‖x‖,

so that the mapping J is a linear isometry of X into the strong dual space (X′)′ of X′. The space (X′)′ is called the strong bidual (or second dual) space of X. Summing up, we have the following:

Theorem 5.30 A normed linear space X can be embedded into its strong bidual space (X′)′ by the linear isometry J defined by formula (5.14).

If the mapping J is surjective, that is, if X = (X′)′, then we say that X is reflexive. For example, we have the following (see [2, Theorem 2.46]):

Theorem 5.31 The space Lᵖ(Ω) is reflexive if and only if 1 < p < ∞.

5.4.9 Weak Convergence

A sequence {xn} in a normed linear space X is said to be weakly convergent if a finite lim_{n→∞} f(xn) exists for each f in the dual space X′ of X. A sequence {xn} in X is said to converge weakly to an element x of X if lim_{n→∞} f(xn) = f(x) for every f ∈ X′; we then write w-lim_{n→∞} xn = x, or simply xn → x weakly. Since the space X′ separates points of X, the limit x is uniquely determined. Theorem 5.30 asserts that X may be considered as a linear subspace of its bidual space (X′)′; hence the weak topology on X is just the simple convergence topology induced on X from the bidual space (X′)′ = L(X′, K). For weakly convergent sequences, we have the following:

Theorem 5.32 (i) s-lim_{n→∞} xn = x implies w-lim_{n→∞} xn = x.
(ii) A weakly convergent sequence {xn} is bounded: sup_n ‖xn‖ < +∞. Furthermore, if w-lim_{n→∞} xn = x, then we have the inequality


‖x‖ ≤ lim inf_{n→∞} ‖xn‖.
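The standard example separating weak from strong convergence is the sequence of unit coordinate vectors in ℓ²: it converges weakly to 0 but not strongly. A truncated numerical sketch (the element a of ℓ² is an arbitrary choice):

```python
import numpy as np

# In l^2 the unit vectors e_n satisfy f(e_n) = <e_n, a> = a_n -> 0 for every
# a in l^2, so e_n -> 0 weakly, yet ||e_n|| = 1 for all n (no strong limit).
a = 1.0 / np.arange(1.0, 10001.0)        # an arbitrary element of l^2 (truncated)
pairings = a                             # the n-th pairing f(e_n) is just a_n
assert abs(pairings[-1]) < 1e-3          # f(e_n) -> 0
norms = np.ones_like(a)                  # ||e_n|| = 1 for every n
assert norms.min() == 1.0                # consistent with ||0|| <= liminf ||e_n||
```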

Part (ii) of Theorem 5.32 has a converse:

Theorem 5.33 A sequence {xn} in X converges weakly to an element x of X if the following two conditions (a) and (b) are satisfied:

(a) The sequence {xn} is bounded.
(b) lim_{n→∞} f(xn) = f(x) for every f in some strongly dense subset of X′.

For the strong and weak closures of a linear subspace, we have the following (see [240, Chap. V, Sect. 1, Theorem 11]):

Theorem 5.34 (Mazur) Let X be a normed linear space. If M is a linear subspace of X closed in the strong topology of X, then it is closed in the weak topology of X.

Proof Our proof is based on a reduction to absurdity. Assume, to the contrary, that M is not weakly closed. Then there exists a point x0 ∈ X \ M such that x0 is an accumulation point of the set M in the weak topology of X. Namely, there exists a sequence {xn} of M such that xn converges weakly to x0. However, by applying Mazur's theorem (Theorem 5.26) we can find a continuous linear functional f0 on X such that

f0(x0) > 1,
|f0(x)| ≤ 1 on M.

Hence we have the assertion

1 < |f0(x0)| = lim_{n→∞} |f0(xn)| ≤ 1.

This is a contradiction. The proof of Theorem 5.34 is complete. □

Finally, the next Eberlein–Shmulyan theorem gives a necessary and sufficient condition for reflexivity of a Banach space in terms of sequential weak compactness (see [240, Appendix to Chap. V, Sect. 4, Theorem]): Theorem 5.35 (Eberlein–Shmulyan) A Banach space X is reflexive if and only if it is locally sequentially weakly compact, that is, X is reflexive if and only if every strongly bounded sequence of X contains a subsequence which converges weakly to an element of X .


5.4.10 Weak* Convergence

A sequence {fn} in the dual space X′ is said to be weakly* convergent if a finite lim_{n→∞} fn(x) exists for every x ∈ X. A sequence {fn} in X′ is said to converge weakly* to an element f of X′ if lim_{n→∞} fn(x) = f(x) for every x ∈ X; we then write w*-lim_{n→∞} fn = f, or simply fn → f weakly*. The weak* topology on X′ is just the simple convergence topology on the space X′ = L(X, K). We have the following analogue of Theorem 5.32:

Theorem 5.36 (i) s-lim_{n→∞} fn = f implies w*-lim_{n→∞} fn = f.
(ii) If X is a Banach space, then a weakly* convergent sequence {fn} in X′ converges weakly* to an element f of X′, and we have the inequality

‖f‖ ≤ lim inf_{n→∞} ‖fn‖.

One of the important consequences of Theorem 5.36 is the sequential weak* compactness of bounded sets:

Theorem 5.37 Let X be a separable Banach space. Then every bounded sequence in the strong dual space X′ has a subsequence which converges weakly* to an element of X′.

5.4.11 Dual Operators

The notion of the transposed matrix may be extended to the notion of dual operators as follows: Let T be a linear operator from X into Y with domain D(T) dense in X. Such operators are called densely defined operators. Each element g of the dual space Y′ of Y defines a linear functional G on D(T) by the formula

G(x) = g(Tx) for all x ∈ D(T).

If this functional G is continuous everywhere on D(T) in the strong topology on X, it follows from an application of Theorem 5.11 that G can be extended uniquely to a continuous linear functional g′ on the closure of D(T), which equals X; that is, there exists a unique element g′ of the dual space X′ of X which is an extension of G. So we let

D(T′) = the totality of those g ∈ Y′ such that the mapping x ↦ g(Tx) is continuous everywhere on D(T) in the strong topology on X,


Fig. 5.1 The operators T and T′: T maps x ∈ D(T) ⊂ X to Tx ∈ Y, and T′ maps g ∈ D(T′) ⊂ Y′ to g′ = T′g ∈ X′

and define

T′g = g′.

In other words, the mapping T′ is a linear operator from Y′ into X′ with domain D(T′) such that

g(Tx) = (T′g)(x) for all x ∈ D(T) and g ∈ D(T′).    (5.15)

The operator T′ is called the dual operator or transpose of T. The operators T : X → Y and T′ : Y′ → X′ can be visualized as in Fig. 5.1. Frequently, we write ⟨f, x⟩ or ⟨x, f⟩ for the value f(x) of a functional f at a point x. For example, we write formula (5.15) as follows:

⟨Tx, g⟩ = ⟨x, T′g⟩ for all x ∈ D(T) and g ∈ D(T′).    (5.16)

The next theorem states that the continuity of operators is inherited by the transposes ([240, Chap. VII, Sect. 1, Theorem 2]):

Theorem 5.38 Let X, Y be normed linear spaces and X′, Y′ their strong dual spaces, respectively. If T is a bounded linear operator on X into Y, then its transpose T′ is a bounded linear operator on Y′ into X′, and we have the formula ‖T′‖ = ‖T‖.
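When T is given by a matrix, its transpose in the sense of formula (5.16) is the transposed matrix, and Theorem 5.38 reduces to the equality of the operator norms of a matrix and its transpose. A numerical sketch (random data, operator 2-norms):

```python
import numpy as np

# <Tx, g> = <x, T'g> and ||T'|| = ||T|| for a matrix T (Theorem 5.38).
rng = np.random.default_rng(2)
T = rng.normal(size=(3, 4))              # T : R^4 -> R^3
x = rng.normal(size=4)
g = rng.normal(size=3)                   # a functional on R^3
assert np.isclose(g @ (T @ x), (T.T @ g) @ x)   # formula (5.16)
assert np.isclose(np.linalg.norm(T, 2), np.linalg.norm(T.T, 2))
```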

5.4.12 Adjoint Operators

We assume that a normed linear space X is equipped with a conjugation, that is, with a continuous, unitary operation X ∋ u ↦ ū ∈ X satisfying the following conditions:


(u + v)‾ = ū + v̄,  (αu)‾ = ᾱ ū,  (ū)‾ = u,

for all u, v ∈ X and α ∈ K. For example, if X is a function space, then ū is just the usual pointwise complex conjugate:

ū(x) = (u(x))‾ for all u ∈ X.

A conjugation on X induces a conjugation on the dual space X′ by the formula

⟨f̄, u⟩ = the complex conjugate of ⟨f, ū⟩, for all f ∈ X′ and u ∈ X.

Hence we can define a bounded sesquilinear form (·, ·) on X′ × X by the formula

(f, u) = ⟨f, ū⟩ for all f ∈ X′ and u ∈ X.

Now we consider the case where X and Y are each equipped with a conjugation. The notion of the adjoint matrix may be extended to the notion of adjoints as follows: Let A : X → Y be a linear operator with domain D(A) dense in X. Each element v of Y′ defines a linear functional V on D(A) by the formula

V(u) = (Au, v) for every u ∈ D(A).

If this functional V is continuous everywhere on D(A) in the strong topology on X, by applying Theorem 5.11 we obtain that V can be extended uniquely to a continuous linear functional v* on the closure of D(A), which equals X. So we let

D(A*) = the totality of those v ∈ Y′ such that the mapping u ↦ (Au, v) is continuous everywhere on D(A) in the strong topology on X,

and define

A*v = v*.

In other words, the mapping A* is a linear operator from Y′ into X′ with domain D(A*) such that

(u, A*v) = (Au, v) for all u ∈ D(A) and v ∈ D(A*).

The operator A* is called the adjoint operator or adjoint of A. The operators A : X → Y and A* : Y′ → X′ can be visualized as in Fig. 5.2 below.


Fig. 5.2 The operators A and A*: A maps u ∈ D(A) ⊂ X to Au ∈ Y, and A* maps v ∈ D(A*) ⊂ Y′ to v* = A*v ∈ X′

5.5 Linear Functionals and Measures

One of the fundamental theorems in analysis is the Riesz–Markov representation theorem, which describes an intimate relationship between measures and linear functionals.

5.5.1 The Space of Continuous Functions

A topological space is said to be locally compact if every point has a compact neighborhood. If (X, ρ) is a locally compact, metric space, then we can make X into a compact space X∂ = X ∪ {∂} by adding a single point ∂. The space X∂ is called the one-point compactification of X and the point ∂ is called the point at infinity. Now let C(X) be the collection of real-valued, continuous functions on X. We define in the set C(X) addition and scalar multiplication of functions in the usual way:

(f + g)(x) = f(x) + g(x) for all x ∈ X,
(αf)(x) = αf(x) for all α ∈ R and x ∈ X.

Then C(X) is a real linear space. If f ∈ C(X), the support of f, denoted by supp f, is the smallest closed set outside of which f vanishes, that is, the closure of the set {x ∈ X : f(x) ≠ 0}. If supp f is compact, we say that f is compactly supported. We define a subspace of C(X) as follows:

Cc(X) = {f ∈ C(X) : supp f is compact}.

Namely, Cc(X) is the space of compactly supported, continuous functions on X. If f ∈ C(X), we say that f vanishes at infinity if the set {x ∈ X : |f(x)| ≥ ε} is compact for every ε > 0, and we write

lim_{x→∂} f(x) = 0.

We define a subspace of C(X ) as follows:

C0(X) = { f ∈ C(X) : lim_{x→∂} f(x) = 0 }.

Fig. 5.3 The spaces C(X∂) and C0(X): each f ∈ C(X∂) decomposes as f = g + c with g ∈ C0(X) and c = f(∂)

It is easy to see that C0(X) is a Banach space with the supremum (maximum) norm

‖f‖∞ = sup_{x∈X} |f(x)|.

Then we have the following proposition: Proposition 5.39 Let (X, ρ) be a locally compact, metric space and let T be the collection of all subsets U of X ∂ = X ∪ {∂} such that either (i) U is an open subset of X or (ii) ∂ ∈ U and U c = X ∂ \ U is a compact subset of X . Then the space (X ∂ , T ) is a compact space and the inclusion map ι : X → X ∂ is an embedding. Furthermore, if f ∈ C(X ), then f (x) extends continuously to X ∂ if and only if f (x) = g(x) + c where g ∈ C0 (X ) and c ∈ R, in which case the continuous extension is given by f (∂) = c (see Fig. 5.3). Moreover, the next proposition asserts that C0 (X ) is the uniform closure of Cc (X ) (see [63, Proposition 4.35]): Proposition 5.40 Let (X, ρ) be a locally compact metric space. The space C0 (X ) is the closure of Cc (X ) in the topology of uniform convergence.

5.5.2 The Space of Signed Measures

Let (X, M) be a measurable space. If μ and λ are signed measures on M, we define the sum μ + λ and the scalar multiple αμ (α ∈ R) as follows:

(μ + λ)(A) = μ(A) + λ(A) for A ∈ M,
(αμ)(A) = αμ(A) for α ∈ R and A ∈ M.


Then it is clear that μ + λ and αμ are signed measures. Furthermore, we can verify that the quantity

‖μ‖ = the total variation |μ|(X) of a signed measure μ    (5.17)

satisfies axioms (N1), (N2) and (N3) of a norm. Thus the totality of signed measures μ on M is a normed linear space with the norm ‖μ‖ defined by formula (5.17). Now we define

μ⁺ = (1/2)(|μ| + μ),
μ⁻ = (1/2)(|μ| − μ)

for a signed measure μ. Then it follows from inequality (2.3) that both μ⁺ and μ⁻ are finite non-negative measures on M. Also we have the Jordan decomposition of μ:

μ = μ⁺ − μ⁻.

The measures μ⁺ and μ⁻ are called the positive and negative variation measures of μ, respectively.

5.5.3 The Riesz–Markov Representation Theorem

Let (X, ρ) be a locally compact, metric space. First, we characterize the non-negative linear functionals on C0(X). A linear functional F on C0(X) is said to be non-negative if it satisfies the condition

f ∈ C0(X), f ≥ 0 on X ⟹ F(f) ≥ 0.

Then we have the following Riesz–Markov representation theorem (see [63, Theorems 7.2 and 7.17], [209, Theorems 3.6 and 3.7]):

Theorem 5.41 (Riesz–Markov) Let (X, ρ) be a locally compact, metric space. To each non-negative linear functional F on the space C0(X), there corresponds a unique non-negative Radon measure μ on X such that

F(f) = ∫_X f(x) dμ(x) for all f ∈ C0(X),    (5.18)

and we have the formula

‖F‖ = sup {F(f) : f ∈ C0(X), 0 ≤ f ≤ 1 on X}.    (5.19)


Let (K, ρ) be a compact metric space. Now we characterize the space of all bounded linear functionals T on C(K), that is, the dual space C(K)′ of C(K). Recall that the dual space C(K)′ is a Banach space with the norm

‖T‖ = sup_{f∈C(K), ‖f‖≤1} |Tf|.

A compact space version of the Riesz–Markov representation theorem reads as follows (see [209, Theorem 3.10]):

Theorem 5.42 (Riesz–Markov) Let (K, ρ) be a compact metric space. To each T ∈ C(K)′, there corresponds a unique real Borel measure μ on K such that

Tf = ∫_K f(x) dμ(x) for all f ∈ C(K),    (5.20)

and we have the formula

‖T‖ = the total variation |μ|(K) of μ.    (5.21)

Conversely, every real Borel measure μ on K defines a bounded linear functional T ∈ C(K)′ through formula (5.20), and relation (5.21) holds true.

Remark 5.43 In view of Theorem 2.8, we obtain that the positive and negative variation measures μ⁺, μ⁻ of a real Borel measure μ on K are both regular.

Note that the space of all real Borel measures μ on K is a normed linear space with the norm

‖μ‖ = the total variation |μ|(K) of μ.    (5.22)

Therefore, we can restate Theorem 5.42 as follows:

Theorem 5.44 If (K, ρ) is a compact metric space, then the dual space C(K)′ of C(K) can be identified with the space of all real Borel measures on K normed by formula (5.22).

5.5.4 Weak Convergence of Measures

Let K be a compact metric space and let C(K) be the Banach space of real-valued continuous functions on K with the supremum (maximum) norm

‖f‖ = sup_{x∈K} |f(x)|.

A sequence {μn}∞n=1 of real Borel measures on K is said to converge weakly to a real Borel measure μ on K if we have the assertion

lim_{n→∞} ∫_K f(x) dμn(x) = ∫_K f(x) dμ(x) for all f ∈ C(K).    (5.23)

Theorem 5.44 tells us that the space of all real Borel measures on K normed by (5.22) can be identified with the strong dual space C(K)′ of C(K). Thus the weak convergence (5.23) of real Borel measures is just the weak* convergence in C(K)′. One more result is important when studying the weak convergence of measures:

Theorem 5.45 The Banach space C(K) is separable, that is, it contains a countable, dense subset.

The next theorem is one of the fundamental theorems in measure theory:

Theorem 5.46 Every sequence {μn}∞n=1 of real Borel measures on K satisfying the condition

sup_{n≥1} |μn|(K) < +∞    (5.24)

has a subsequence which converges weakly to a real Borel measure μ on K. Furthermore, if the measures μn are all non-negative, then the measure μ is also non-negative.

Proof By virtue of Theorem 5.45, we can apply Theorem 5.37 with X := C(K) to obtain the first assertion, since condition (5.24) implies the boundedness of the sequence {‖μn‖}∞n=1. The second assertion is an immediate consequence of the first assertion of Theorem 5.41. The proof of Theorem 5.46 is complete. □
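A standard illustration of the weak convergence (5.23): the uniform discrete measures μn = (1/n) Σk δ_{(k+1/2)/n} on K = [0, 1] converge weakly to Lebesgue measure, since the pairing with a continuous f is a Riemann sum. A numerical sketch (the test function is an arbitrary choice):

```python
import numpy as np

# mu_n = (1/n) sum_k delta_{(k+1/2)/n} on K = [0,1] converges weakly to
# Lebesgue measure: the pairing with continuous f is a midpoint Riemann sum.
f = lambda x: np.cos(3.0 * x) + x**2     # an arbitrary test function in C(K)
n = 100000
points = (np.arange(n) + 0.5) / n
pairing_n = np.mean(f(points))           # integral of f against mu_n
exact = np.sin(3.0) / 3.0 + 1.0 / 3.0    # integral of f over [0,1] dx
assert abs(pairing_n - exact) < 1e-6
```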

5.6 Closed Operators

Let X and Y be normed linear spaces over the same scalar field K. Let T be a linear operator from X into Y with domain D(T). The graph G(T) of T is the set

G(T) = {{x, Tx} : x ∈ D(T)}

in the product space X × Y. Note that G(T) is a linear subspace of X × Y. We say that T is closed if its graph G(T) is closed in X × Y. This is equivalent to saying that

{xn} ⊂ D(T), xn → x in X, Txn → y in Y ⟹ x ∈ D(T), Tx = y.


In particular, if T is continuous and its domain D(T) is closed in X, then T is a closed linear operator. We remark that if T is a closed linear operator which is also injective, then its inverse T⁻¹ is a closed linear operator. Indeed, this follows from the fact that the mapping {x, y} ↦ {y, x} is a homeomorphism of X × Y onto Y × X. A linear operator T is said to be closable if the closure of G(T) in X × Y is the graph of a linear operator, say T̄. A linear operator is called a closed extension of T if it is a closed linear operator which is also an extension of T. It is easy to see that if T is closable, then every closed extension of T is an extension of T̄. Thus the operator T̄ is called the minimal closed extension of T. The next theorem gives a necessary and sufficient condition for a linear operator to be closable ([240, Chap. II, Sect. 6, Proposition 2]):

Theorem 5.47 A linear operator T from X into Y with domain D(T) is closable if and only if the following condition is satisfied:

{xn} ⊂ D(T), xn → 0 in X, Txn → y in Y ⟹ y = 0.

The notion of adjoints introduced in Sect. 5.4.12 gives a very simple criterion for closability. In fact, we can prove the following ([100, Chap. 3, Sect. 5, Theorem 5.29]):

Theorem 5.48 Let X and Y be reflexive Banach spaces. If T : X → Y is a densely defined, closable linear operator, then the adjoint T* : Y → X is closed and densely defined. Moreover, T** = (T*)* is the minimal closed extension T̄ of T. Namely, we have the formula

G(T**) = G(T̄).

Now we can formulate three pillars of functional analysis, namely Banach's open mapping theorem ([26, Theorem 2.6]), Banach's closed graph theorem ([26, Theorem 2.9]) and Banach's closed range theorem for closed operators ([26, Theorem 2.19]):

Theorem 5.49 (Banach's open mapping theorem) Let X and Y be Banach spaces. Then every continuous linear operator of X onto Y is open, that is, it maps every open set in X onto an open set in Y.

Theorem 5.50 (Banach's closed graph theorem) Let X and Y be Banach spaces. Then every closed linear operator of X into Y is continuous.

Corollary 5.51 Let X and Y be Banach spaces. If T is a continuous, one-to-one linear operator of X onto Y, then its inverse T⁻¹ is also continuous; hence T is an isomorphism.

Indeed, the inverse T⁻¹ is a closed linear operator, so that Theorem 5.50 applies.

We give useful characterizations of closed linear operators with closed range ([26, Exercise 2.14]):


Theorem 5.52 Let X and Y be Banach spaces and T a closed linear operator from X into Y with domain D(T). Then the range R(T) of T is closed in Y if and only if there exists a constant C > 0 such that

dist(x, N(T)) ≤ C ‖Tx‖ for all x ∈ D(T).

Here

dist(x, N(T)) = inf_{z∈N(T)} ‖x − z‖

is the distance from x to the null space N(T) of T.

The key point of the next theorem is that if the range R(T) is closed in Y, then a necessary and sufficient condition for the equation Tx = y to be solvable is that the given right-hand side y ∈ Y is annihilated by every solution g ∈ Y′ of the homogeneous transposed equation T′g = 0 ([240, Chap. VII, Sect. 5, Theorem]):

Theorem 5.53 (Banach's closed range theorem) Let X and Y be Banach spaces and T a densely defined, closed linear operator from X into Y. Then the following four conditions are equivalent:

(i) The range R(T) of T is closed in Y.
(ii) The range R(T′) of the transpose T′ is closed in X′.
(iii) R(T) = ⁰N(T′) = {y ∈ Y : ⟨y, g⟩ = 0 for all g ∈ N(T′)}.
(iv) R(T′) = N(T)⁰ = {f ∈ X′ : ⟨x, f⟩ = 0 for all x ∈ N(T)}.

5.7 Complemented Subspaces

Let X be a linear space. Two linear subspaces M and N of X are said to be algebraic complements in X if X is the direct sum of M and N, that is, if X = M ∔ N. Algebraic complements M and N in a normed linear space X are said to be topological complements in X if the addition mapping {y, z} ↦ y + z is an isomorphism of M × N onto X. We then write

X = M ⊕ N.

As an application of Corollary 5.51, we obtain the following:

Theorem 5.54 Let X be a Banach space. If M and N are closed algebraic complements in X, then they are topological complements.

A closed linear subspace of a normed linear space X is said to be complemented in X if it has a topological complement. By Theorem 5.54, this is equivalent in Banach spaces to the existence of a closed algebraic complement.


The next theorem gives two criteria for a closed subspace to be complemented ([26, Sect. 2.4]): Theorem 5.55 Let X be a Banach space and M a closed subspace of X . If M has either finite dimension or finite codimension, then it is complemented in X .

5.8 Compact Operators

Let X and Y be normed linear spaces over the same scalar field K. A linear operator T on X into Y is said to be compact or completely continuous if it maps every bounded subset of X onto a relatively compact subset of Y, that is, if the closure of T(B) is compact in Y for every bounded subset B of X. This is equivalent to saying that, for every bounded sequence {xn} in X, the sequence {Txn} has a subsequence which converges in Y. We list some facts which follow at once:

(i) Every compact operator is bounded. Indeed, a compact operator maps the unit sphere onto a bounded set.
(ii) Every bounded linear operator with finite dimensional range is compact. This is an immediate consequence of Corollary 5.23.
(iii) No isomorphism between infinite dimensional spaces is compact. This follows from an application of Theorem 5.24.
(iv) A linear combination of compact operators is compact.
(v) The product of a compact operator with a bounded operator is compact.

The next theorem states that if Y is a Banach space, then the compact operators on X into Y form a closed subspace of L(X, Y) ([26, Theorem 6.1]):

Theorem 5.56 Let X be a normed linear space and Y a Banach space. If {Tn} is a sequence of compact linear operators which converges to an operator T in the space L(X, Y) with the uniform topology, then T is compact.

As for the transposes of compact operators, we have the following theorem ([26, Theorem 6.4]):

Theorem 5.57 (Schauder) Let X and Y be normed linear spaces. If T is a compact linear operator on X into Y, then its transpose T′ is a compact linear operator on Y′ into X′.


5.9 The Riesz–Schauder Theory

Now we state the most interesting results on compact linear operators, which are essentially due to F. Riesz in the Hilbert space setting. The results were extended to Banach spaces by Schauder:

Theorem 5.58 Let X be a Banach space and T a compact linear operator on X into itself. Set

S = I − T.

Then we have the following three assertions (i), (ii) and (iii):

(i) The null space N(S) of S is finite dimensional and the range R(S) of S is closed in X.
(ii) The null space N(S′) of the transpose S′ is finite dimensional and the range R(S′) of S′ is closed in X′.
(iii) dim N(S) = dim N(S′).

By combining Theorems 5.53 and 5.58, we can obtain an extension of the theory of linear mappings in finite dimensional linear spaces ([26, Theorem 6.6]):

Corollary 5.59 (the Fredholm alternative) Let T be a compact linear operator on a Banach space X into itself. If S = I − T is either one-to-one or onto, then it is an isomorphism of X onto itself.

Let T be a bounded linear operator on X into itself. The resolvent set of T, denoted ρ(T), is defined to be the set of scalars λ ∈ K such that λI − T is an isomorphism of X onto itself. In this case, the inverse (λI − T)⁻¹ is called the resolvent of T. The complement of ρ(T), that is, the set of scalars λ ∈ K such that λI − T is not an isomorphism of X onto itself, is called the spectrum of T, and is denoted by σ(T). The set σp(T) of scalars λ ∈ K such that λI − T is not one-to-one forms a subset of σ(T), and is called the point spectrum of T. A scalar λ ∈ K belongs to σp(T) if and only if there exists a non-zero element x ∈ X such that Tx = λx. In this case, λ is called an eigenvalue of T and x an eigenvector of T corresponding to λ. Also the null space N(λI − T) of λI − T is called the eigenspace of T corresponding to λ, and the dimension of N(λI − T) is called the multiplicity of λ. By using C. Neumann's series (Theorem 5.16), we find that the resolvent set ρ(T) is open in K and that

{λ ∈ K : |λ| > ‖T‖} ⊂ ρ(T).

Hence the spectrum σ(T) = K \ ρ(T) is closed and bounded in K. If T is a compact operator and λ is a non-zero element of σ(T), then, by applying Corollary 5.59 to the operator λ⁻¹T we obtain that λI − T is not one-to-one, that is, λ ∈ σp(T). Also note that if X is infinite dimensional, then T is not an isomorphism of X onto itself; hence 0 ∈ σ(T). Therefore the scalar field K can be decomposed as follows:


K = σp(T) ∪ {0} ∪ ρ(T).

We can say rather more about the spectrum σ(T) in terms of transpose operators ([240, Chap. X, Sect. 5, Theorems 1, 2 and 3]):

Theorem 5.60 (Riesz–Schauder) Let T be a compact linear operator on a Banach space X into itself. Then we have the following three assertions (i), (ii) and (iii):

(i) The spectrum σ(T) of T is either a finite set or a countable set accumulating only at zero; and every non-zero element of σ(T) is an eigenvalue of T.
(ii) dim N(λI − T) = dim N(λI − T′) < ∞ for all λ ≠ 0.
(iii) Let λ ≠ 0. The non-homogeneous equation

(λI − T)x = y

has a solution if and only if y is orthogonal to the space N(λI − T′). Similarly, the non-homogeneous transposed equation

(λI − T′)z = w

has a solution if and only if w is orthogonal to the space N(λI − T). Moreover, the operator λI − T is onto if and only if it is one-to-one.
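The spectral picture of Theorem 5.60 can be previewed with a finite truncation of the diagonal operator T e_k = (1/k) e_k on ℓ², whose non-zero spectrum consists of the eigenvalues 1/k accumulating at 0. A numerical sketch (the truncation size is an arbitrary choice):

```python
import numpy as np

# Truncated diagonal operator with entries 1/1, 1/2, ..., 1/n: the non-zero
# spectrum consists of eigenvalues only, crowding toward 0 as n grows.
n = 50
T = np.diag(1.0 / np.arange(1, n + 1))
eigs = np.sort(np.linalg.eigvals(T).real)
assert np.isclose(eigs[-1], 1.0) and eigs[0] < 0.025   # smallest is 1/50

# For lambda = 1/2, an eigenvalue, lambda*I - T fails to be one-to-one:
lam = 0.5
M = lam * np.eye(n) - T
assert np.linalg.matrix_rank(M) == n - 1
```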

5.10 Fredholm Operators

Throughout this section, the letters X, Y, Z denote Banach spaces over the same scalar field K. A linear operator T : X → Y is called a Fredholm operator if the following five conditions are satisfied:

(i) The domain D(T) of T is dense in X.
(ii) T is a closed operator.
(iii) The null space N(T) = {x ∈ D(T) : Tx = 0} of T has finite dimension, that is, dim N(T) < ∞.
(iv) The range R(T) of T is closed in Y.
(v) The range R(T) has finite codimension, that is, codim R(T) = dim Y/R(T) < ∞.

Then the index of T is defined by the formula


ind T := dim N(T) − codim R(T).

For example, we find from Theorems 5.58 and 5.53 that if X = Y and T is compact, then the operator I − T is a Fredholm operator and ind(I − T) = 0. We give a characterization of Fredholm operators. First, we have the following ([125, Theorem 2.24]):

Theorem 5.61 If T : X → Y is a Fredholm operator with domain D(T), then there exist a bounded linear operator S : Y → X and compact linear operators P : X → X, Q : Y → Y such that

(a) ST = I − P on D(T),
(b) TS = I − Q on Y.

Furthermore, we have the formulas

R(P) = N(T), dim R(Q) = codim R(T).

Theorem 5.61 has a converse:

Theorem 5.62 Let T be a closed linear operator from X into Y with domain D(T) dense in X. Assume that there exist bounded linear operators S1 : Y → X and S2 : Y → X and compact linear operators K1 : X → X, K2 : Y → Y such that

(a) S1 T = I − K1 on D(T),
(b) T S2 = I − K2 on Y.

Then T is a Fredholm operator.

Now we state some important properties of Fredholm operators ([125, Theorem 2.21]):

Theorem 5.63 If T : X → Y is a Fredholm operator and if S : Y → Z is a Fredholm operator, then the product ST : X → Z is a Fredholm operator, and we have the formula

ind(ST) = ind S + ind T.

The next theorem states that the index is stable under compact perturbations or small perturbations ([125, Theorem 2.26]):

Theorem 5.64 (i) If T : X → Y is a Fredholm operator and if K : X → Y is a compact linear operator, then the sum T + K : X → Y is a Fredholm operator, and we have the formula

ind(T + K) = ind T.

(ii) The Fredholm operators form an open subset of the space L(X, Y) of bounded operators. More precisely, if E : X → Y is a bounded operator with ‖E‖ sufficiently small, then the sum T + E : X → Y is a Fredholm operator, and we have the formula

5.10 Fredholm Operators

215

ind (T + E) = ind T. As for the transposes of Fredholm operators, we have the following: Theorem 5.65 If T : X → Y is a Fredholm operator and if Y is reflexive, then the transpose T : Y → X of T is a Fredholm operator, and we have the formula ind T = −ind T. Now we can state a generalization of the Fredholm alternative (Corollary 5.51) in terms of adjoint operators ([125, Theorem 2.27]): Theorem 5.66 (the Fredholm alternative) Let A : X → Y be a Fredholm operator with ind A = 0. Then there are two, mutually exclusive possibilities (i) and (ii): (i) The homogeneous equation Au = 0 has only the trivial solution u = 0. In this case, we have the following two assertions: (a) For each f ∈ Y , the non-homogeneous equation Au = f has a unique solution u ∈ X . (b) For each g ∈ X , the adjoint equation A∗ v = g has a unique solution v ∈ Y . (ii) The homogeneous equation Au = 0 has exactly p linearly independent solutions u 1 , u 2 , . . ., u p for some p ≥ 1. In this case, we have the following three assertions: (c) The homogeneous adjoint equation A∗ v = 0 has exactly p linearly independent solutions v1 , v2 , . . ., v p . (d) The non-homogeneous equation Au = f is solvable if and only if the righthand side f satisfies the orthogonal conditions

v j , f = 0 for all 1 ≤ j ≤ p.

(e) The non-homogeneous adjoint equation A∗ v = g is solvable if and only if the right-hand side g satisfies the orthogonal conditions

g, u j = 0 for all 1 ≤ j ≤ p.
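In the finite-dimensional case every linear map T : Kⁿ → Kᵐ is Fredholm (it is everywhere defined, closed, with finite-dimensional kernel and closed finite-codimensional range), and the rank–nullity theorem forces ind T = n − m for every such T, independently of T itself — a toy illustration of the perturbation invariance asserted in Theorem 5.64. A minimal numerical sketch (the `rank` and `index` helpers are our own, not from the text):

```python
def rank(rows, tol=1e-9):
    """Rank of a matrix via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in rows]
    m, n = len(a), len(a[0])
    r = 0
    for c in range(n):
        piv = max(range(r, m), key=lambda i: abs(a[i][c]), default=None)
        if piv is None or abs(a[piv][c]) < tol:
            continue
        a[r], a[piv] = a[piv], a[r]
        for i in range(r + 1, m):
            f = a[i][c] / a[r][c]
            for j in range(c, n):
                a[i][j] -= f * a[r][j]
        r += 1
    return r

def index(T):
    """ind T = dim N(T) - codim R(T) for an m x n matrix T: always n - m."""
    m, n = len(T), len(T[0])
    rk = rank(T)
    return (n - rk) - (m - rk)

T = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]          # a map from R^3 to R^2
K = [[0.1, 0.0, 0.0],
     [0.0, 0.0, 0.1]]          # a perturbation (automatically compact here)
TK = [[T[i][j] + K[i][j] for j in range(3)] for i in range(2)]
print(index(T), index(TK))     # both equal 3 - 2 = 1
```

The index survives the perturbation even though the individual terms dim N(T) and codim R(T) could each change for other choices of T.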

Finally, we give a very useful criterion for conditions (iii) and (iv) in the definition of a Fredholm operator:

Theorem 5.67 (Peetre) Let X, Y, Z be Banach spaces such that X ⊂ Z with compact injection, and let T be a closed linear operator from X into Y with domain D(T). Then the following two conditions are equivalent:

(i) The null space N(T) of T has finite dimension and the range R(T) of T is closed in Y.

Fig. 5.4 The decomposition x = x0 + x1 of an element x of the domain D(T), with x1 ∈ N(T) and x0 ∈ D(T) ∩ X0.

(ii) There is a constant C > 0 such that

  ‖x‖_X ≤ C (‖T x‖_Y + ‖x‖_Z) for all x ∈ D(T).   (5.25)

Proof (i) ⇒ (ii): By Theorem 5.11, the null space N(T) has a closed topological complement X0:

  X = N(T) ⊕ X0.   (5.26)

This gives that

  D(T) = N(T) ⊕ (D(T) ∩ X0).

Hence every element x of D(T) can be written in the form (see Fig. 5.4)

  x = x0 + x1 for x0 ∈ D(T) ∩ X0 and x1 ∈ N(T).

Since the range R(T) is closed in Y, it then follows from an application of Theorem 5.7 that

  ‖x0‖_X ≤ C ‖T x0‖_Y.   (5.27)

Here and in the following the letter C denotes a generic positive constant independent of x. On the other hand, Theorem 5.20 tells us that all norms on a finite-dimensional linear space are equivalent. This gives that

  ‖x1‖_X ≤ C ‖x1‖_Z.   (5.28)

However, since the injection X → Z is compact and hence continuous, we have the inequality

  ‖x1‖_Z ≤ ‖x‖_Z + ‖x0‖_Z ≤ ‖x‖_Z + C ‖x0‖_X.   (5.29)

Thus it follows from inequalities (5.28) and (5.29) that

  ‖x1‖_X ≤ C (‖x‖_Z + ‖x0‖_X).   (5.30)

Therefore, by combining inequalities (5.27) and (5.30), we obtain the desired inequality (5.25):

  ‖x‖_X ≤ ‖x0‖_X + ‖x1‖_X ≤ C (‖T x‖_Y + ‖x‖_Z),

since T x0 = T x.

(ii) ⇒ (i): By inequality (5.25), we have the inequality

  ‖x‖_X ≤ C ‖x‖_Z for all x ∈ N(T).   (5.31)

However, since the null space N(T) is closed in X, it is a Banach space. Since the injection X → Z is compact, it follows from inequality (5.31) that the closed unit ball {x ∈ N(T) : ‖x‖_X ≤ 1} of N(T) is compact. Therefore, we obtain from Theorem 5.24 that dim N(T) < +∞.

Let X0 be a closed topological complement of N(T) as in decomposition (5.26). To prove the closedness of R(T), by virtue of Theorem 5.7, it suffices to show that

  ‖x‖_X ≤ C ‖T x‖_Y for all x ∈ D(T) ∩ X0.

Assume, to the contrary, that for every n ∈ N there is an element xn of D(T) ∩ X0 such that

  ‖xn‖_X > n ‖T xn‖_Y.

If we let

  x̃n = xn / ‖xn‖_X,

then we have the assertions

  x̃n ∈ D(T) ∩ X0, ‖x̃n‖_X = 1,   (5.32)
  ‖T x̃n‖_Y < 1/n.   (5.33)

Since the injection X → Z is compact, by passing to a subsequence we may assume that the sequence {x̃n} is a Cauchy sequence in Z. Then, in view of (5.33), it follows from inequality (5.25) that the sequence {x̃n} is a Cauchy sequence in X, and hence converges to some element x of X. Since the operator T is closed, we obtain that

  x ∈ D(T), T x = 0,

so that

  x ∈ N(T).

On the other hand, in view of assertions (5.32), it follows that x ∈ X0, and further that

  ‖x‖_X = lim_{n→∞} ‖x̃n‖_X = 1.

This is a contradiction, since we have the assertion x ∈ N(T) ∩ X0 = {0}. The proof of Theorem 5.67 is complete. □

5.11 Hilbert Spaces

A complex (or real) linear space X is called a pre-Hilbert space or inner product space if, to each ordered pair of elements x and y of X, there is associated a complex (or real) number (x, y) in such a way that:

(I1) (y, x) = \overline{(x, y)}.
(I2) (αx, y) = α(x, y) for all α ∈ C (or α ∈ R).
(I3) (x + y, z) = (x, z) + (y, z) for all x, y and z ∈ X.
(I4) (x, x) ≥ 0; (x, x) = 0 if and only if x = 0.

Here \overline{(x, y)} denotes the complex conjugate of (x, y). In the real case condition (I1) becomes simply (y, x) = (x, y). The number (x, y) is called the inner product or scalar product of x and y. The following are immediate consequences of conditions (I1), (I2) and (I3):

(i) (αx + βy, z) = α(x, z) + β(y, z) for all α, β ∈ C.
(ii) (x, αy + βz) = \overline{α}(x, y) + \overline{β}(x, z) for all α, β ∈ C.

These properties (i) and (ii) are frequently called sesquilinearity. In the real case they reduce to bilinearity. We list some basic properties of the inner product:

(1) The Schwarz inequality holds true for all x, y ∈ X:

  |(x, y)|² ≤ (x, x)(y, y).

Here the equality holds true if and only if x and y are linearly dependent.


(2) The quantity

  ‖x‖ = √(x, x) (the non-negative square root)

satisfies axioms (N1), (N2) and (N3) of a norm; hence a pre-Hilbert space is a normed linear space with the norm ‖x‖ = √(x, x).

(3) The inner product (x, y) is a continuous function of x and y:

  ‖xn − x‖ → 0, ‖yn − y‖ → 0 ⟹ (xn, yn) → (x, y).

(4) The parallelogram law holds true for all x, y ∈ X:

  ‖x + y‖² + ‖x − y‖² = 2(‖x‖² + ‖y‖²).   (5.34)

Conversely, assume that X is a normed linear space whose norm satisfies condition (5.34). If X is a real normed linear space, we let

  (x, y) = (1/4)(‖x + y‖² − ‖x − y‖²),

and if X is a complex normed linear space, we let

  (x, y) = (1/4)(‖x + y‖² − ‖x − y‖² + i‖x + iy‖² − i‖x − iy‖²), i = √−1.

Then it is easy to verify that the number (x, y) satisfies axioms (I1) through (I4) of an inner product; hence X is a pre-Hilbert space.

A pre-Hilbert space is called a Hilbert space if it is complete with respect to the norm derived from the inner product. If X and Y are pre-Hilbert spaces over the same scalar field K, then the product space X × Y is a pre-Hilbert space with the inner product

  ({x1, y1}, {x2, y2}) = (x1, x2) + (y1, y2) for all x1, x2 ∈ X and y1, y2 ∈ Y.

Furthermore, if X and Y are Hilbert spaces, then so is X × Y.
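The parallelogram law (5.34) and the real polarization identity can be verified numerically in Rⁿ with the usual inner product; the sketch below is illustrative, with helper names of our own choosing:

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm2(x):                      # squared norm ||x||^2 = (x, x)
    return dot(x, x)

def polar(x, y):                   # real polarization: (||x+y||^2 - ||x-y||^2)/4
    xpy = [a + b for a, b in zip(x, y)]
    xmy = [a - b for a, b in zip(x, y)]
    return (norm2(xpy) - norm2(xmy)) / 4.0

x, y = [1.0, 2.0, -1.0], [0.5, -1.0, 3.0]
# parallelogram law (5.34)
lhs = norm2([a + b for a, b in zip(x, y)]) + norm2([a - b for a, b in zip(x, y)])
rhs = 2 * (norm2(x) + norm2(y))
print(abs(lhs - rhs) < 1e-12, abs(polar(x, y) - dot(x, y)) < 1e-12)  # True True
```

The polarization formula recovers the inner product from the norm alone, which is exactly what makes the converse statement above work.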


5.11.1 Orthogonality

Let X be a pre-Hilbert space. Two elements x, y of X are said to be orthogonal if (x, y) = 0; we then write x ⊥ y. We remark that

  x ⊥ y ⟺ y ⊥ x, and x ⊥ x ⟺ x = 0.

If A is a subset of X, we let

  A⊥ = {x ∈ X : (x, y) = 0 for all y ∈ A}.

In other words, A⊥ is the set of all those elements of X which are orthogonal to every element of A. We list some facts which follow at once:

(i) The orthogonal set A⊥ is a linear subspace of X.
(ii) A ⊂ B ⟹ B⊥ ⊂ A⊥.
(iii) A ∩ A⊥ ⊂ {0}.
(iv) The orthogonal set A⊥ is closed.
(v) A⊥ = (\overline{A})⊥ = [A]⊥, where \overline{A} is the closure of A and [A] is the space spanned by A, that is, the space of all finite linear combinations of elements of A.

Facts (iv) and (v) follow from the continuity of the inner product.

5.11.2 The Closest-Point Theorem and Applications

Theorem 5.68 (the closest-point theorem) Let A be a closed convex subset of a Hilbert space X. If x is a point not in A, then there is a unique point a in A such that

  ‖x − a‖ = dist(x, A) = inf {‖x − y‖ : y ∈ A}.

Theorem 5.68 can be proved by using the parallelogram law. One of the consequences of Theorem 5.68 is that every closed linear subspace of a Hilbert space is complemented (see Fig. 5.5 below):

Theorem 5.69 Let M be a closed linear subspace of a Hilbert space X. Then every element x of X can be decomposed uniquely in the form

  x = y + z for y ∈ M and z ∈ M⊥.   (5.35)

Moreover, the mapping x ↦ {y, z} is an isomorphism of X onto M × M⊥.


Fig. 5.5 The orthogonal decomposition (5.35): x = y + z with y = PM x ∈ M and z = PM⊥ x ∈ M⊥.
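The decomposition (5.35) can be illustrated numerically in R³: take M spanned by two orthonormal vectors, compute y as the sum of the components of x along them, and check that z = x − y is orthogonal to M and that projecting again changes nothing. This is an illustrative sketch with helper names of our own choosing:

```python
s = 2 ** -0.5
u1, u2 = [1.0, 0.0, 0.0], [0.0, s, s]      # orthonormal basis of M

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def proj(x):                                # y = (x, u1) u1 + (x, u2) u2 in M
    return [dot(x, u1) * u1[i] + dot(x, u2) * u2[i] for i in range(3)]

x = [1.0, 2.0, -3.0]
y = proj(x)                                 # component in M
z = [xi - yi for xi, yi in zip(x, y)]       # component in M^perp
in_M_perp = abs(dot(z, u1)) < 1e-12 and abs(dot(z, u2)) < 1e-12
idempotent = all(abs(a - b) < 1e-12 for a, b in zip(proj(y), y))
print(in_M_perp, idempotent)                # True True
```

The uniqueness of the pair {y, z} is what makes the map x ↦ {y, z} of Theorem 5.69 well defined.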

We shall write the decomposition (5.35) in the form

  X = M ⊕ M⊥,   (5.36)

emphasizing that the mapping x ↦ {y, z} is an isomorphism of X onto M × M⊥. The space M⊥ is called the orthogonal complement of M.

Corollary 5.70 If M is a closed linear subspace of a Hilbert space X, then it follows that M⊥⊥ = (M⊥)⊥ = M. Furthermore, if A is a subset of X, then we have A⊥⊥ = \overline{[A]}, the closure of the space [A] spanned by A.

With the above notation (5.35), we define a mapping PM of X into M by the formula

  PM x = y.

Since the decomposition (5.35) is unique, it follows that PM is linear. Furthermore, we easily obtain the following:

Theorem 5.71 The operator PM enjoys the following three properties (i), (ii) and (iii):
(i) PM² = PM (idempotent property).
(ii) (PM x, y) = (x, PM y) for all x, y ∈ X (symmetric property).
(iii) ‖PM‖ ≤ 1.

The operator PM is called the orthogonal projection onto M. Similarly, we define a mapping PM⊥ of X into M⊥ by the formula PM⊥ x = z. Then Corollary 5.70 asserts that PM⊥ is the orthogonal projection onto M⊥. It is clear that

  PM + PM⊥ = I.

Now we give an important characterization of bounded linear functionals on a Hilbert space:


Theorem 5.72 (the Riesz representation theorem) Every element y of a Hilbert space X defines a bounded linear functional JX y on X by the formula

  JX y(x) = (x, y) for all x ∈ X,   (5.37)

and we have the formula

  ‖JX y‖ = sup_{x∈X, ‖x‖≤1} |JX y(x)| = ‖y‖.

Conversely, for every bounded linear functional f on X, there exists a unique element y of X such that f = JX y, that is,

  f(x) = (x, y) for all x ∈ X,

and so ‖f‖ = ‖y‖.

In view of formula (5.37), it follows that the mapping JX enjoys the following property:

  JX(αy + βz) = \overline{α} JX y + \overline{β} JX z for all y, z ∈ X and α, β ∈ C.

We express this by saying that JX is conjugate linear or antilinear. In the real case, JX is linear.

Let X′ be the strong dual space of a Hilbert space X, that is, the space of bounded linear functionals on X with the norm

  ‖f‖ = sup_{x∈X, ‖x‖≤1} |f(x)|.

Then Theorem 5.72 may be restated as follows:

  There is a conjugate linear, norm-preserving isomorphism JX of X onto X′.   (5.38)

In this case, we say that X′ is antidual to X. Recall that a sequence {xn} in a normed linear space X is said to converge weakly to an element x of X if f(xn) → f(x) for every f ∈ X′. Assertion (5.38) tells us that a sequence {xn} in a Hilbert space X converges weakly to an element x of X if and only if (xn, y) → (x, y) as n → ∞ for every y ∈ X.

Another important consequence of Theorem 5.72 is the reflexivity of Hilbert spaces:

Corollary 5.73 Every Hilbert space can be identified with its strong bidual space.


5.11.3 Orthonormal Sets

Let X be a pre-Hilbert space. A subset S of X is said to be orthogonal if every pair of distinct elements of S is orthogonal. Furthermore, if each element of S has norm one, then S is said to be orthonormal. We remark that if S is an orthogonal set of non-zero elements, we can construct an orthonormal set from S by normalizing each element of S. If {x1, x2, . . . , xn} is an orthonormal set and if x = Σ_{i=1}^{n} αi xi, then we have the formula

  ‖x‖² = Σ_{i=1}^{n} |αi|² with αi = (x, xi).

Therefore, every orthonormal set is linearly independent.

First, we state the Gram–Schmidt orthogonalization:

Theorem 5.74 (Gram–Schmidt) Let {xi}i∈I be a finite or countably infinite set of linearly independent vectors of X. Then we can construct an orthonormal set {ui}i∈I such that, for each i ∈ I,
(a) ui is a linear combination of {x1, x2, . . . , xi}.
(b) xi is a linear combination of {u1, u2, . . . , ui}.

Corollary 5.75 Every n-dimensional pre-Hilbert space over the scalar field K is isomorphic to the space Kⁿ with the usual inner product.

Let {uλ}λ∈Λ be an orthonormal set of a pre-Hilbert space X. For each x ∈ X, we let

  x̂λ = (x, uλ) for λ ∈ Λ.

The scalars x̂λ are called the Fourier coefficients of x with respect to {uλ}. Then we have the following:

Theorem 5.76 For each x ∈ X, the set of those λ ∈ Λ such that x̂λ ≠ 0 is at most countable. Furthermore, we have the Bessel inequality

  Σ_{λ∈Λ} |x̂λ|² ≤ ‖x‖².

An orthonormal set S of X is called a complete orthonormal system if it is not contained in a larger orthonormal set of X. As for the existence of such systems, we have the following:

Theorem 5.77 Let X be a Hilbert space having a non-zero element. Then, for every orthonormal set S in X, there exists a complete orthonormal system that contains S.

The next theorem gives useful criteria for the completeness of orthonormal sets:
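The Gram–Schmidt construction of Theorem 5.74 can be sketched as follows; this is an illustrative implementation in Rⁿ, with helper names of our own choosing (by construction each ui is a combination of x1, . . . , xi, which is property (a)):

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(xs):
    """Orthonormalize linearly independent vectors x_1, ..., x_n."""
    us = []
    for x in xs:
        w = x[:]
        for u in us:                      # subtract components along earlier u_i
            c = dot(x, u)
            w = [wi - c * ui for wi, ui in zip(w, u)]
        n = dot(w, w) ** 0.5              # nonzero by linear independence
        us.append([wi / n for wi in w])
    return us

xs = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
us = gram_schmidt(xs)
# orthonormality: (u_i, u_j) = delta_ij
ok = all(abs(dot(us[i], us[j]) - (1.0 if i == j else 0.0)) < 1e-12
         for i in range(3) for j in range(3))
print(ok)  # True
```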


Theorem 5.78 Let S = {uλ}λ∈Λ be an orthonormal set in a Hilbert space X. Then the following five conditions (i) through (v) are equivalent:

(i) The set S is complete.
(ii) S⊥ = {0}.
(iii) The space [S] spanned by S is dense in X: \overline{[S]} = X.
(iv) For every x ∈ X, we have the formula

  ‖x‖² = Σ_{λ∈Λ} |x̂λ|².   (5.39)

(v) For every x ∈ X, we have the formula

  x = Σ_{λ∈Λ} x̂λ uλ in X.   (5.40)

Formula (5.39) is called the Parseval identity and formula (5.40) is called the Fourier series expansion of x with respect to {uλ}.
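In a finite-dimensional Hilbert space, conditions (iv) and (v) of Theorem 5.78 can be checked directly: with a complete orthonormal system of R³, the Fourier coefficients satisfy the Parseval identity (5.39) and reconstruct x as in (5.40). An illustrative sketch (the particular basis is our own choice):

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

s = 2 ** -0.5
basis = [[s, s, 0.0], [s, -s, 0.0], [0.0, 0.0, 1.0]]  # complete orthonormal system of R^3
x = [3.0, -1.0, 2.0]
coeffs = [dot(x, u) for u in basis]                   # Fourier coefficients x_hat
parseval = sum(c * c for c in coeffs)                 # should equal ||x||^2 = 14
recon = [sum(c * u[i] for c, u in zip(coeffs, basis)) for i in range(3)]  # (5.40)
print(abs(parseval - dot(x, x)) < 1e-12,
      all(abs(r - xi) < 1e-12 for r, xi in zip(recon, x)))  # True True
```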

5.11.4 Adjoint Operators

Throughout this subsection, the letters X, Y, Z denote Hilbert spaces over the same scalar field K. Let T be a linear operator from X into Y with domain D(T) dense in X. Each element y of Y defines a linear functional f on D(T) by the formula

  f(x) = (T x, y) for every x ∈ D(T).

If this functional f is continuous everywhere on D(T), by applying Theorem 5.11 we obtain that f can be extended uniquely to a continuous linear functional f̃ on \overline{D(T)} = X. Therefore, Riesz's theorem (Theorem 5.72) asserts that there exists a unique element y∗ of X such that

  f̃(x) = (x, y∗) for all x ∈ X.

In particular, we have the formula

  (T x, y) = f(x) = (x, y∗) for all x ∈ D(T).

Hence we let

  D(T∗) = the totality of those y ∈ Y such that the mapping x ↦ (T x, y) is continuous everywhere on D(T),


Fig. 5.6 The operators T and T∗: under T, x ∈ D(T) ↦ T x ∈ Y; under T∗, y ∈ D(T∗) ↦ y∗ = T∗y ∈ X.

and define

  T∗y = y∗.

In other words, the mapping T∗ is a linear operator from Y into X with domain D(T∗) such that

  (T x, y) = (x, T∗y) for all x ∈ D(T) and y ∈ D(T∗).

The operator T∗ is called the adjoint operator or simply the adjoint of T. The operators T : X → Y and T∗ : Y → X can be visualized as in Fig. 5.6. We list some basic properties of adjoints:

(i) The operator T∗ is closed.
(ii) If T ∈ L(X, Y), then T∗ ∈ L(Y, X) and ‖T∗‖ = ‖T‖.
(iii) If T, S ∈ L(X, Y), then (αT + βS)∗ = \overline{α}T∗ + \overline{β}S∗ for all α, β ∈ C.
(iv) If T ∈ L(X, Y) and S ∈ L(Y, Z), then (ST)∗ = T∗S∗.

A densely defined linear operator T from X into itself is said to be self-adjoint if T = T∗. Note that every self-adjoint operator is closed. As for the adjoints of closed operators, we have the following (see Theorem 5.3):

Theorem 5.79 If T is a densely defined, closed linear operator from X into Y, then the adjoint T∗ is a densely defined, closed linear operator from Y into X, and we have the formula

  T∗∗ = (T∗)∗ = T.

Corollary 5.80 If T is a densely defined, closable linear operator, then the adjoint T∗ is densely defined, and the operator T∗∗ coincides with the minimal closed extension \overline{T} of T. Namely, we have the formula

  G(T∗∗) = \overline{G(T)}.
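For matrices acting on Cⁿ with the standard inner product (x, y) = Σ xi \overline{yi}, the adjoint is the conjugate transpose, and the defining identity (T x, y) = (x, T∗y) can be checked numerically. An illustrative sketch, with helper names of our own choosing:

```python
def inner(x, y):                    # (x, y) = sum x_i * conj(y_i), linear in the first slot
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(M, x):
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def adjoint(M):
    """For a matrix, T* is the conjugate transpose."""
    return [[M[j][i].conjugate() for j in range(len(M))] for i in range(len(M[0]))]

T = [[1 + 2j, 0 - 1j],
     [3 + 0j, 2 - 2j]]
x, y = [1 + 1j, 2 - 1j], [0 + 1j, 1 + 0j]
lhs = inner(apply(T, x), y)         # (T x, y)
rhs = inner(x, apply(adjoint(T), y))  # (x, T* y)
print(abs(lhs - rhs) < 1e-12)       # True
```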

5.12 The Hilbert–Schmidt Theory

In the finite dimensional case, the spectral theorem for self-adjoint linear operators states that there exists an orthonormal basis consisting of eigenvectors. We generalize this theorem to the Hilbert space case.


Let T be a self-adjoint linear operator on a Hilbert space X into itself. We remark that every eigenvalue λ of T is real. In fact, if x is an eigenvector of T corresponding to λ, then

  λ(x, x) = (T x, x) = (x, T x) = \overline{λ}(x, x),

so that λ = \overline{λ}. Furthermore, the Riesz–Schauder theorem (Theorem 5.48) tells us that if T is compact, then the non-zero eigenvalues of T form a countable set accumulating only at 0; hence we can order them in a sequence {λj} in such a way that

  |λ1| ≥ |λ2| ≥ . . . ≥ |λj| ≥ |λj+1| ≥ . . . → 0,

where each λj is repeated according to its multiplicity. For each λj, we let

  Vλj = the eigenspace N(λj I − T) of T corresponding to the eigenvalue λj.

The eigenspaces Vλj are mutually orthogonal. Indeed, if x ∈ Vλi and y ∈ Vλj, then we have the formula

  λi(x, y) = (T x, y) = (x, T y) = λj(x, y),

so that (x, y) = 0 if λi ≠ λj. Therefore, we can choose an orthonormal basis of each Vλj and combine these into an orthonormal set {xj} of eigenvectors of T such that

  T xj = λj xj.

In other words, there is an orthonormal basis {xj}_{j=1}^{∞} of X with respect to which the operator T is diagonal:

  T ∼ diag(λ1, λ2, . . . , λj, . . .).

The spectral theorem extends to the Hilbert space case as follows:

Theorem 5.81 (Hilbert–Schmidt) Let T be a self-adjoint compact linear operator on a Hilbert space X into itself. Then, for any x ∈ X we have the formula

  T x = Σ_{j=1}^{∞} λj (x, xj) xj = s-lim_{n→∞} Σ_{j=1}^{n} λj (x, xj) xj.


In particular, if T is one-to-one, then we have the expansion formula

  x = Σ_{j=1}^{∞} (x, xj) xj = s-lim_{n→∞} Σ_{j=1}^{n} (x, xj) xj,

that is, the family {xj} of eigenvectors is a complete orthonormal system of X.

5.13 Notes and Comments

The material in this chapter is adapted from the book of Yosida [240] and also from parts of Brezis [26], Friedman [66] and Schechter [160], in such a way as to make it accessible to graduate students and advanced undergraduates as well. For more thorough treatments of functional analysis, the reader might be referred to Kato [100], Kolmogorov–Fomin [104], Reed–Simon [149] and Rudin [153].

Section 5.3: For more leisurely treatments of linear topological spaces, the reader is referred to Schaefer Sa1971 and Treves [223].

Section 5.4: Theorem 5.19 is taken from [74, Chap. 5, Theorem 5.2].

Section 5.5: The Riesz–Markov representation theorem, Theorem 5.41, is adapted from Rudin [153, Theorem 6.19] and Folland [63, Theorems 7.2 and 7.17], while Theorem 5.42 is taken from Taira [209, Theorem 3.10]. For a proof of Theorem 5.45, see Jameson [96].

Section 5.10: For further material on Fredholm operators, see Gohberg–Kreĭn [75]. Theorem 5.67, first proved by Peetre [144] for bounded operators, is taken from Taira [187].

Chapter 6

A Short Course in Semigroup Theory

This chapter is devoted to the general theory of semigroups. In Sects. 6.1–6.3 we study Banach space valued functions, operator valued functions and exponential functions, generalizing the numerical case. Section 6.4 is devoted to the theory of contraction semigroups. A typical example of contraction semigroups is the semigroup associated with the heat kernel. We consider when a linear operator is the infinitesimal generator of some contraction semigroup. This question is answered by the Hille–Yosida theorem (Theorem 6.10). In Sect. 6.5 we consider when a linear operator is the infinitesimal generator of some (C0 ) semigroup (Theorem 6.27), generalizing the theory of contraction semigroups developed in Sect. 6.4. Moreover, we study an initial-value problem associated with a (C0 ) semigroup, and prove an existence and uniqueness theorem for the initial-value problem (Theorem 6.29).

6.1 Banach Space Valued Functions

Let E be a Banach space over the real or complex number field, equipped with a norm ‖ · ‖. A function u(t) defined on an interval I with values in E is said to be strongly continuous at a point t0 of I if it satisfies the condition

  lim_{t→t0} ‖u(t) − u(t0)‖ = 0.

If u(t) is strongly continuous at every point of I, then it is said to be strongly continuous on I. If u(t) is strongly continuous on I, then the function ‖u(t)‖ is continuous on I and also, for any f in the dual space E′ of E, the function f(u(t)) is continuous on I.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 K. Taira, Functional Analytic Techniques for Diffusion Processes, Springer Monographs in Mathematics, https://doi.org/10.1007/978-981-19-1099-9_6


As in the case of scalar valued functions, the following two results hold:

(1) If u(t) is strongly continuous on a bounded closed interval I, then it is uniformly strongly continuous on I.
(2) If a sequence {un(t)} of strongly continuous functions on I converges uniformly strongly to a function u(t) on I, then the limit function u(t) is strongly continuous on I.

Now let (R, B(R), dt) be the one-dimensional Borel measure space. If u(t) is a strongly continuous function defined on a Borel set I ∈ B(R) such that the function ‖u(t)‖ is Borel integrable on I, that is,

  ∫_I ‖u(t)‖ dt < ∞,   (6.1)

then it follows from an application of [240, Chap. V, Sect. 5, Theorem 1] that the Bochner integral

  ∫_I u(t) dt

can be defined just as in the case of scalar valued functions (see Sect. 2.17). Then we say that the function u(t) is strongly integrable on I. By the triangle inequality, we have the inequality (see [240, Chap. V, Sect. 5, Corollary 1]):

  ‖∫_I u(t) dt‖ ≤ ∫_I ‖u(t)‖ dt.

Furthermore, we easily obtain the following theorem (see [240, Chap. V, Sect. 5, Corollary 2]):

Theorem 6.1 Let u(t) be a strongly continuous function defined on an interval I which satisfies condition (6.1), and let T be a bounded linear operator on E into itself. Then the function T u(t) is strongly integrable on I, and we have the formula

  T (∫_I u(t) dt) = ∫_I T u(t) dt.

Similarly, we have, for any functional f ∈ E′,

  f (∫_I u(t) dt) = ∫_I f(u(t)) dt.

As in the case of scalar valued functions, the following two results hold: (3) If a sequence {u n (t)} of strongly continuous functions on a bounded closed interval I converges uniformly strongly to a function u(t) on I , then the limit function u(t) is strongly integrable on I , and we have the formula

6.1 Banach Space Valued Functions

231



 u(t) dt = lim

u n (t) dt.

n→∞

I

I

(4) If u(t) is strongly continuous in a neighborhood of a point t0 of I, then we have the formula (see [240, Chap. V, Sect. 5, Theorem 2]):

  lim_{h→0} ‖ (1/h) ∫_{t0}^{t0+h} u(t) dt − u(t0) ‖ = 0.

A function u(t) defined on an open interval I is said to be strongly differentiable at a point t0 of I if the limit

  lim_{h→0} (u(t0 + h) − u(t0))/h   (6.2)

exists in E. The value of the limit (6.2) is denoted by (du/dt)(t0) or u′(t0). If u(t) is strongly differentiable at every point of I, then it is said to be strongly differentiable on I. A strongly differentiable function is strongly continuous. As in the case of scalar valued functions, the following two results hold:

(5) If u(t) is strongly differentiable on I and u′(t) is strongly continuous on I, then we have, for any a, b ∈ I,

  u(b) − u(a) = ∫_a^b u′(t) dt.

(6) If u(t) is strongly continuous on I, then, for each c ∈ I, the integral ∫_c^t u(s) ds is strongly differentiable on I, and we have the formula

  (d/dt) ∫_c^t u(s) ds = u(t).

6.2 Operator Valued Functions

Let L(E, E) be the space of all bounded linear operators on a Banach space E into itself. The space L(E, E) is a Banach space with the operator norm

  ‖T‖ = sup_{x∈E, x≠0} ‖T x‖/‖x‖ = sup_{x∈E, ‖x‖≤1} ‖T x‖.


A function T(t) defined on an interval I with values in the space L(E, E) is said to be strongly continuous at a point t0 of I if it satisfies the condition

  lim_{t→t0} ‖T(t)x − T(t0)x‖ = 0 for every x ∈ E.

We say that T(t) is norm continuous at t0 if it satisfies the condition

  lim_{t→t0} ‖T(t) − T(t0)‖ = 0.

If T(t) is strongly (resp. norm) continuous at every point of I, then it is said to be strongly (resp. norm) continuous on I. A norm continuous function is strongly continuous. The next theorem is an immediate consequence of the resonance theorem (see Theorem 5.17):

Theorem 6.2 If T(t) is strongly continuous on I, then the function ‖T(t)‖ is bounded uniformly in t over bounded closed intervals contained in I.

A function T(t) defined on an open interval I is said to be strongly differentiable at a point t0 of I if there exists an operator S(t0) in L(E, E) such that

  lim_{h→0} ‖ (T(t0 + h) − T(t0))/h · x − S(t0)x ‖ = 0 for every x ∈ E.

We say that T(t) is norm differentiable at t0 if it satisfies the condition

  lim_{h→0} ‖ (T(t0 + h) − T(t0))/h − S(t0) ‖ = 0.

The operator S(t0) is denoted by (dT/dt)(t0) or T′(t0). If T(t) is strongly (resp. norm) differentiable at every point of I, then it is said to be strongly (resp. norm) differentiable on I. A norm differentiable function is strongly differentiable.

It should be emphasized that the Leibniz formula can be extended to strongly or norm differentiable functions:

Theorem 6.3 (i) If u(t) and T(t) are both strongly continuous (resp. differentiable) on I, then the function T(t)u(t) is also strongly continuous (resp. differentiable) on I. In the differentiable case, we have the formula

  (d/dt)(T(t)u(t)) = (dT/dt)(t) u(t) + T(t) (du/dt)(t).


(ii) If T(t) and S(t) are both norm (resp. strongly) differentiable on I, then the function S(t)T(t) is also norm (resp. strongly) differentiable on I, and we have the formula

  (d/dt)(S(t)T(t)) = (dS/dt)(t) T(t) + S(t) (dT/dt)(t).

6.3 Exponential Functions

Let E be a Banach space and L(E, E) the space of all bounded linear operators on E into itself. Just as in the case of numerical series, we have the following:

Theorem 6.4 If A ∈ L(E, E), we let

  e^{tA} = Σ_{m=0}^{∞} (t^m/m!) A^m for every t ∈ R.   (6.3)

Then the right-hand side converges in the Banach space L(E, E), and the exponential function enjoys the following three properties:

(a) ‖e^{tA}‖ ≤ e^{|t| ‖A‖} for all t ∈ R.
(b) e^{tA} e^{sA} = e^{(t+s)A} for all t, s ∈ R.
(c) The exponential function e^{tA} is norm differentiable on R, and satisfies the formula

  (d/dt)(e^{tA}) = A e^{tA} = e^{tA} A for all t ∈ R.   (6.4)

Proof (a) Since we have, for any m ∈ N, ‖A^m‖ ≤ ‖A‖^m, it follows that

  ‖e^{tA}‖ = ‖I + tA + (tA)²/2! + (tA)³/3! + . . .‖
    ≤ ‖I‖ + |t| ‖A‖ + |t|² ‖A‖²/2! + |t|³ ‖A‖³/3! + . . .
    = e^{|t| ‖A‖} for all t ∈ R.

This proves that the series (6.3) converges in the space L(E, E) for all t ∈ R, and enjoys property (a).

(b) Just as in the case of numerical series, we can rearrange the series

  e^{(t+s)A} = Σ_{m=0}^{∞} ((t + s)^m/m!) A^m

to obtain that

  e^{(t+s)A} = (Σ_{m=0}^{∞} (t^m/m!) A^m)(Σ_{m=0}^{∞} (s^m/m!) A^m) = e^{tA} e^{sA}.

(c) We remark that the series

  A e^{tA} = Σ_{m=0}^{∞} (t^m/m!) A^{m+1} = e^{tA} A

converges in L(E, E) uniformly in t over bounded intervals of R. Hence we have, by termwise integration,

  ∫_0^t A e^{sA} ds = Σ_{m=0}^{∞} (t^{m+1}/(m+1)!) A^{m+1} = e^{tA} − I.   (6.5)

Therefore, we find that the left-hand side of formula (6.5), and hence the function e^{tA}, is norm differentiable on R, and that the desired formula (6.4) holds true. The proof of Theorem 6.4 is complete. □

Theorem 6.5 If A and B are bounded linear operators on a Banach space E into itself and if A and B commute, then we have the formula

  e^{A+B} = e^A e^B = e^B e^A.   (6.6)

Proof Since AB = BA, it follows that

  A e^{tB} = Σ_{m=0}^{∞} (t^m/m!) A B^m = Σ_{m=0}^{∞} (t^m/m!) B^m A = e^{tB} A,

just as in the numerical case. However, we have, by formula (6.4),

  (d/dt) e^{(1−t)(A+B)} = −(A + B) e^{(1−t)(A+B)} for all t ∈ R.

Hence, if we let

  K(t) = e^{tA} e^{tB} e^{(1−t)(A+B)} for every t ∈ R,

we find from the Leibniz formula that

  (d/dt)(K(t)) = e^{tA} A e^{tB} e^{(1−t)(A+B)} + e^{tA} e^{tB} B e^{(1−t)(A+B)} − e^{tA} e^{tB} (A + B) e^{(1−t)(A+B)}
    = e^{tA} e^{tB} A e^{(1−t)(A+B)} + e^{tA} e^{tB} B e^{(1−t)(A+B)} − e^{tA} e^{tB} (A + B) e^{(1−t)(A+B)}
    = 0 for all t ∈ R,

so that K(t) is a constant function. In particular, we have the assertion

  e^{A+B} = K(0) = K(1) = e^A e^B.

This proves the desired formula (6.6). The proof of Theorem 6.5 is complete. □
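Theorems 6.4(b) and 6.5 can be checked numerically for small matrices by truncating the series (6.3); the `expm` helper below is our own illustrative sketch, not part of the text:

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(A, terms=60):
    """e^A = sum_m A^m / m!  (truncated power series (6.3) at t = 1)."""
    n = len(A)
    S = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # I
    P = [row[:] for row in S]                                           # A^m / m!
    for m in range(1, terms):
        P = matmul(P, A)
        P = [[p / m for p in row] for row in P]
        S = [[s + p for s, p in zip(rs, rp)] for rs, rp in zip(S, P)]
    return S

A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.5, 0.0], [0.0, 0.5]]          # B = (1/2) I commutes with every A
AB = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]
lhs, rhs = expm(AB), matmul(expm(A), expm(B))
print(all(abs(lhs[i][j] - rhs[i][j]) < 1e-10
          for i in range(2) for j in range(2)))  # True: e^{A+B} = e^A e^B
```

For non-commuting A and B the same comparison fails in general, which is why the commutation hypothesis in Theorem 6.5 is essential.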

6.4 Contraction Semigroups

Let E be a Banach space. A one-parameter family {Tt}t≥0 of bounded linear operators on E into itself is called a contraction semigroup of class (C0), or simply a contraction semigroup, if it satisfies the following three conditions:

(i) Tt+s = Tt · Ts for all t, s ≥ 0.
(ii) lim_{t↓0} ‖Tt x − x‖ = 0 for every x ∈ E.
(iii) ‖Tt‖ ≤ 1 for all t ≥ 0.

Condition (i) is called the semigroup property.

Remark 6.6 In view of conditions (i) and (ii), it follows that T0 = I. Hence condition (ii) is equivalent to the strong continuity of {Tt}t≥0 at t = 0. Moreover, it is easy to verify that a contraction semigroup {Tt}t≥0 is strongly continuous on the interval [0, ∞).

6.4.1 The Hille–Yosida Theory of Contraction Semigroups

If {Tt}t≥0 is a contraction semigroup of class (C0), then we let

  D = the set of all x ∈ E such that the limit lim_{h↓0} (Th x − x)/h exists in E.

Then we define a linear operator A from E into itself as follows:


Fig. 6.1 The Hille–Yosida theory of semigroups via the Laplace transform

(a) The domain D(A) of A is the set D.
(b) Ax = lim_{h↓0} (Th x − x)/h for every x ∈ D(A).

The operator A is called the infinitesimal generator of {Tt}t≥0. The Hille–Yosida theory of semigroups via the Laplace transform can be visualized as in Fig. 6.1 (see formula (6.13)).

First, we derive a differential equation associated with a contraction semigroup of class (C0) in terms of its infinitesimal generator:

Proposition 6.7 Let A be the infinitesimal generator of a contraction semigroup {Tt}t≥0. If x ∈ D(A), then we have Tt x ∈ D(A) for all t > 0, and the function Tt x is strongly differentiable on the interval (0, ∞) and satisfies the equation

  (d/dt)(Tt x) = A(Tt x) = Tt(Ax) for all t > 0.   (6.7)
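As a toy illustration of the definition of the infinitesimal generator (the particular semigroup is our own choice, not from the text): on E = R, Tt x = e^{−t} x is a contraction semigroup whose generator is Ax = −x, and the difference quotients (Th x − x)/h converge to Ax as h ↓ 0:

```python
import math

def T(t, x):                 # T_t x = e^{-t} x, a contraction semigroup on E = R
    return math.exp(-t) * x

x = 2.0
Ax = -x                      # generator: A x = -x with D(A) = R
quotients = [(T(h, x) - x) / h for h in (1e-1, 1e-3, 1e-5)]
errors = [abs(q - Ax) for q in quotients]
print(errors)                # decreasing roughly like h itself
```

Here the error of the difference quotient shrinks proportionally to h, consistent with the expansion e^{−h} = 1 − h + h²/2 − ….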

Proof Let h > 0. Then we have, by the semigroup property,

  (Th(Tt x) − Tt x)/h = Tt((Th x − x)/h).

However, since Tt is bounded and x ∈ D(A), it follows that

  Tt((Th x − x)/h) → Tt(Ax) as h ↓ 0.

This implies that

  Tt x ∈ D(A), A(Tt x) = Tt(Ax).

Therefore, we find that Tt x is strongly right-differentiable on (0, ∞) and satisfies the equation

  (d⁺/dt)(Tt x) = A(Tt x) = Tt(Ax) for all t > 0.

Similarly, we have, for each 0 < h < t,

  (Tt−h x − Tt x)/(−h) − Tt(Ax) = Tt−h((Th x − x)/h − Th(Ax)).

However, since we have the inequality ‖Tt−h‖ ≤ 1 and, as h ↓ 0,

  Th(Ax) → Ax,

we obtain that

  (Tt−h x − Tt x)/(−h) → Tt(Ax) as h ↓ 0.

This proves that Tt x is strongly left-differentiable on (0, ∞) and satisfies the equation

  (d⁻/dt)(Tt x) = A(Tt x) = Tt(Ax) for all t > 0.

Summing up, we have proved that Tt x is strongly differentiable on the interval (0, ∞) and satisfies Eq. (6.7). The proof of Proposition 6.7 is complete. □

The next proposition characterizes the infinitesimal generator A of a contraction semigroup {Tt}t≥0:

Proposition 6.8 Let A be the infinitesimal generator of a contraction semigroup {Tt}t≥0. Then A is a densely defined, closed linear operator in E.

Proof The proof is divided into two steps.

Step 1: First, we show that the operator A is closed. To do this, we assume that

  xn ∈ D(A), xn → x0 and Axn → y0 in E.

Then it follows from an application of Eq. (6.7) that

  Tt xn − xn = ∫_0^t (d/ds)(Ts xn) ds = ∫_0^t Ts(Axn) ds.   (6.8)

However, we have, as n → ∞,

  Tt xn − xn → Tt x0 − x0,

and also

  ‖∫_0^t Ts(Axn) ds − ∫_0^t Ts y0 ds‖ = ‖∫_0^t Ts(Axn − y0) ds‖
    ≤ ∫_0^t ‖Ts(Axn − y0)‖ ds ≤ t ‖Axn − y0‖ → 0 as n → ∞.

Hence, by letting n → ∞ in formula (6.8) we obtain that

  Tt x0 − x0 = ∫_0^t Ts y0 ds.   (6.9)

Furthermore, it follows that, as t ↓ 0,

  (1/t)∫_0^t T_s y_0 ds −→ T_0 y_0 = y_0,

since the integrand T_s y_0 is strongly continuous. Therefore, we find from formula (6.9) that, as t ↓ 0,

  (T_t x_0 − x_0)/t = (1/t)∫_0^t T_s y_0 ds −→ y_0.

This proves that

  x_0 ∈ D(A), Ax_0 = y_0.

Therefore, we have proved that the operator A is a closed operator.
Step 2: Secondly, we show the density of the domain D(A) in E. Let x be an arbitrary element of E. For each δ > 0, we let

  x_δ = (1/δ)∫_0^δ T_s x ds.

Then we have, for any 0 < h < δ,

  T_h(x_δ) = (1/δ)∫_0^δ T_h(T_s x) ds = (1/δ)∫_0^δ T_{h+s}x ds = (1/δ)∫_h^{δ+h} T_s x ds.

Hence it follows that

  ((T_h − I)/h)x_δ = (1/δ)[(1/h)∫_h^{δ+h} T_s x ds − (1/h)∫_0^δ T_s x ds]
   = (1/δ)[(1/h)∫_δ^{δ+h} T_s x ds − (1/h)∫_0^h T_s x ds].  (6.10)

However, it follows that, as h ↓ 0,
• (1/h)∫_δ^{δ+h} T_s x ds −→ T_δ x,
• (1/h)∫_0^h T_s x ds −→ T_0 x = x,
since the integrand T_s x is strongly continuous. Therefore, we find from formula (6.10) that

  ((T_h − I)/h)x_δ −→ (1/δ)(T_δ x − x) as h ↓ 0.

This proves that

  x_δ ∈ D(A), Ax_δ = (1/δ)(T_δ x − x).

Moreover, it follows that

  x_δ = (1/δ)∫_0^δ T_s x ds −→ T_0 x = x as δ ↓ 0.

Summing up, we have proved that D(A) is dense in E. The proof of Proposition 6.8 is complete. □



Let {T_t}_{t≥0} be a contraction semigroup. Then the integral

  ∫_0^s e^{−αt} T_t x dt for x ∈ E  (6.11)

is strongly integrable for all s > 0, since the integrand is strongly continuous on the interval [0, ∞). Moreover, if α > 0, then the limit G_α x of the integral (6.11) exists in E as s → ∞:

  G_α x := ∫_0^∞ e^{−αt} T_t x dt = lim_{s→∞} ∫_0^s e^{−αt} T_t x dt for x ∈ E and α > 0.

Thus G_α x is defined for all x ∈ E if α > 0. It is easy to see that the operator G_α is a bounded linear operator from E into itself with norm at most 1/α:

  ‖G_α x‖ ≤ (1/α)‖x‖ for all x ∈ E.  (6.12)

The family {G_α}_{α>0} of bounded linear operators is called the resolvent of the semigroup {T_t}_{t≥0}. The next proposition characterizes the resolvent G_α of a contraction semigroup {T_t}_{t≥0}:

Proposition 6.9 Let {T_t}_{t≥0} be a contraction semigroup defined on a Banach space E and A the infinitesimal generator of {T_t}. For each α > 0, the operator (αI − A) is a bijection of D(A) onto E, and its inverse (αI − A)^{−1} is the resolvent G_α:

  (αI − A)^{−1}x = G_α x = ∫_0^∞ e^{−αt} T_t x dt for every x ∈ E.  (6.13)

Proof The proof is divided into three steps.
Step 1: First, we show that (αI − A) is surjective for each α > 0. Let x be an arbitrary element of E. Then we have, for each h > 0,

  T_h(G_α x) = ∫_0^∞ e^{−αt} T_h(T_t x) dt = ∫_0^∞ e^{−αt} T_{t+h}x dt = e^{αh}∫_h^∞ e^{−αt} T_t x dt.

Hence it follows that

  T_h(G_α x) − G_α x = e^{αh}∫_h^∞ e^{−αt} T_t x dt − ∫_0^∞ e^{−αt} T_t x dt
   = (e^{αh} − 1)∫_h^∞ e^{−αt} T_t x dt − ∫_0^h e^{−αt} T_t x dt,

so that

  (T_h(G_α x) − G_α x)/h = ((e^{αh} − 1)/h)∫_h^∞ e^{−αt} T_t x dt − (1/h)∫_0^h e^{−αt} T_t x dt.  (6.14)

However, we obtain that
• (e^{αh} − 1)/h −→ α as h ↓ 0,
• ∫_h^∞ e^{−αt} T_t x dt −→ ∫_0^∞ e^{−αt} T_t x dt = G_α x as h ↓ 0,
• (1/h)∫_0^h e^{−αt} T_t x dt −→ e^{−αt} T_t x|_{t=0} = x as h ↓ 0.

By letting h ↓ 0 in formula (6.14), we have proved that

  (T_h(G_α x) − G_α x)/h −→ αG_α x − x as h ↓ 0.

This implies that

  G_α x ∈ D(A), A(G_α x) = αG_α x − x,

or equivalently,

  (αI − A)G_α x = x for every x ∈ E.

Therefore, we have proved that the operator (αI − A) is surjective for each α > 0.
Step 2: Secondly, we show that (αI − A) is injective for each α > 0. Now we assume that

  x ∈ D(A), (αI − A)x = 0.

If we introduce a function u(t) by the formula

  u(t) = e^{−αt} T_t x for all t ≥ 0,

then it follows from an application of Proposition 6.7 that

  (d/dt)u(t) = −αe^{−αt} T_t x + e^{−αt} T_t(Ax) = −e^{−αt} T_t(αI − A)x = 0,

so that u(t) is constant for all t > 0. However, we have, by letting t ↓ 0,

  u(t) = u(0) = e^{−αt} T_t x|_{t=0} = x.

On the other hand, we have, by letting t ↑ +∞,

  u(t) = u(+∞) = lim_{t↑+∞} u(t) = 0.

Indeed, it suffices to note that

  ‖u(t)‖ = ‖e^{−αt} T_t x‖ ≤ e^{−αt}‖x‖ for all t > 0.

Hence it follows that x = 0. This proves that the operator (αI − A) is injective for each α > 0.
Step 3: Summing up, we have proved that (αI − A) is a bijection of D(A) onto E and that (αI − A)^{−1} = G_α. The proof of Proposition 6.9 is complete. □
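Formula (6.13) can be illustrated numerically in the finite-dimensional case E = R², where T_t = e^{tA} for a matrix A whose eigenvalues have negative real part. A minimal sketch, assuming an illustrative matrix A, vector x, and quadrature parameters (none of these come from the text):

```python
import numpy as np

# A generates a contraction semigroup on R^2: its eigenvalues are -1 and -2.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
alpha = 1.5
x = np.array([1.0, -1.0])

# Resolvent applied directly: (alpha I - A)^{-1} x.
direct = np.linalg.solve(alpha * np.eye(2) - A, x)

# Laplace transform G_alpha x = int_0^infty e^{-alpha t} e^{tA} x dt,
# computed by the trapezoidal rule via the eigendecomposition A = V diag(w) V^{-1}.
w, V = np.linalg.eig(A)
c = np.linalg.solve(V, x)
ts = np.linspace(0.0, 30.0, 60001)
semigroup = np.real(np.exp(np.outer(ts, w)) * c) @ np.real(V).T  # row t: e^{tA} x
integrand = np.exp(-alpha * ts)[:, None] * semigroup
laplace = np.trapz(integrand, ts, axis=0)

print(np.max(np.abs(laplace - direct)))  # only quadrature/truncation error remains
```

The agreement of the two computations is exactly the statement G_α = (αI − A)^{−1} of Proposition 6.9, specialized to a bounded generator.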



Now we consider when a linear operator is the infinitesimal generator of some contraction semigroup. This question is answered by the following theorem:

Theorem 6.10 (Hille–Yosida) Let A be a linear operator from a Banach space E into itself with domain D(A). In order that A is the infinitesimal generator of some contraction semigroup, it is necessary and sufficient that A satisfies the following three conditions:
(i) The operator A is closed and its domain D(A) is dense in E.
(ii) For every α > 0 the equation (αI − A)x = y has a unique solution x ∈ D(A) for any y ∈ E; we then write x = (αI − A)^{−1}y.
(iii) For any α > 0, we have the inequality

  ‖(αI − A)^{−1}‖ ≤ 1/α.  (6.15)

Proof The necessity of conditions (i) through (iii) follows from Propositions 6.8 and 6.9 and inequality (6.12). The sufficiency is proved in six steps.
Step 1: For each α > 0, we define linear operators

  J_α = α(αI − A)^{−1}, A_α = AJ_α.

Then we can prove the following four assertions:

  ‖J_α‖ ≤ 1,  (6.16a)
  lim_{α→+∞} J_α x = x for every x ∈ E,  (6.16b)

and

  ‖A_α‖ ≤ 2α,  (6.17a)
  lim_{α→+∞} A_α x = Ax for every x ∈ D(A).  (6.17b)

The operators A_α are called the Yosida approximations to A. First, we remark that assertion (6.16a) is an immediate consequence of inequality (6.15). Furthermore, we have, for all x ∈ D(A),

  J_α x − x = α(αI − A)^{−1}x − (αI − A)^{−1}(αI − A)x = (αI − A)^{−1}(αx − αx + Ax) = (αI − A)^{−1}(Ax).

Hence it follows from inequality (6.15) that, as α → +∞,

  ‖J_α x − x‖ ≤ ‖(αI − A)^{−1}‖‖Ax‖ ≤ (1/α)‖Ax‖ −→ 0.

This proves assertion (6.16b), since ‖J_α‖ ≤ 1 and D(A) is dense in E. Assertion (6.17b) follows from assertion (6.16b). Indeed, we have, as α → +∞,

  A_α x = AJ_α x = J_α(Ax) −→ Ax for every x ∈ D(A).

On the other hand, it follows that A_α = −αI + αJ_α, so that

  ‖A_α‖ ≤ α + α‖J_α‖ ≤ 2α.

This proves assertion (6.17a).
Step 2: We define a linear operator

  T_t(α) = exp[tA_α] for every α > 0.

Since we have the formula A_α = −αI + αJ_α, it follows from an application of Theorem 6.5 that the operators

  T_t(α) = e^{−αt} exp[αtJ_α] for t ≥ 0  (6.18)

form a contraction semigroup for each α > 0. Indeed, it suffices to note that

  ‖T_t(α)‖ = e^{−αt}‖exp[αtJ_α]‖ ≤ e^{−αt} Σ_{n=0}^∞ ((αt)^n/n!)‖J_α^n‖ ≤ e^{−αt} Σ_{n=0}^∞ (αt)^n/n! = e^{−αt}e^{αt} = 1.

Step 3: We show that the operator T_t(α) has a strong limit T_t as α → +∞:

  T_t x = lim_{α→+∞} T_t(α)x for every x ∈ E.

Moreover, this convergence is uniform in t over bounded intervals [0, t_0] for all t_0 > 0.
If x is an arbitrary element of D(A), then it follows from an application of Proposition 6.7 that

  T_t(α)x − T_t(β)x = ∫_0^t (d/ds)(T_{t−s}(β)T_s(α)x) ds
   = ∫_0^t [(d/ds)(T_{t−s}(β)) · T_s(α)x + T_{t−s}(β) · (d/ds)(T_s(α)x)] ds
   = ∫_0^t [T_{t−s}(β)(−A_β)T_s(α)x + T_{t−s}(β)T_s(α)(A_α x)] ds
   = ∫_0^t T_{t−s}(β)T_s(α)(A_α x − A_β x) ds.

Hence we have the inequality

  ‖T_t(α)x − T_t(β)x‖ = ‖∫_0^t T_{t−s}(β)T_s(α)(A_α x − A_β x) ds‖
   ≤ ∫_0^t ‖T_{t−s}(β)‖‖T_s(α)‖ ds · ‖A_α x − A_β x‖ ≤ t‖A_α x − A_β x‖ ≤ t_0‖A_α x − A_β x‖ for all t ∈ [0, t_0],

since ‖T_{t−s}(β)‖ ≤ 1 and ‖T_s(α)‖ ≤ 1. However, we recall (see assertion (6.17b)) that, as α → +∞,

  A_α x −→ Ax for every x ∈ D(A).

Therefore, we obtain that, as α, β → +∞,

  ‖T_t(α)x − T_t(β)x‖ −→ 0,

and that this convergence is uniform in t over the interval [0, t_0]. We can define a linear operator T_t by the formula

  T_t x = lim_{α→+∞} T_t(α)x for every x ∈ D(A).

Furthermore, since ‖T_t(α)‖ ≤ 1 and D(A) is dense in E, it follows that the operator T_t(α) has a strong limit T_t as α → +∞:

  T_t x = lim_{α→+∞} T_t(α)x for every x ∈ E,  (6.19)

and further that the convergence is uniform in t over bounded intervals [0, t_0] for each t_0 > 0.
Step 4: We show that the family {T_t}_{t≥0} forms a contraction semigroup of class (C_0).
First, it follows from an application of the resonance theorem (Theorem 5.17) that the operator T_t is bounded and satisfies the condition

  ‖T_t‖ ≤ liminf_{α→+∞} ‖T_t(α)‖ ≤ 1 for all t ≥ 0.

Secondly, the semigroup property of {T_t}

  T_t(T_s x) = T_{t+s}x for x ∈ E

follows from that of {T_t(α)}. Indeed, we have, as α → +∞,

  ‖T_t(T_s x) − T_t(α)(T_s(α)x)‖ ≤ ‖(T_t − T_t(α))T_s x‖ + ‖T_t(α)(T_s x − T_s(α)x)‖
   ≤ ‖(T_t − T_t(α))T_s x‖ + ‖(T_s − T_s(α))x‖ −→ 0,

so that

  T_t(T_s x) = lim_{α→+∞} T_t(α)(T_s(α)x) = lim_{α→+∞} T_{t+s}(α)x = T_{t+s}x for every x ∈ E.

Furthermore, since the convergence of formula (6.19) is uniform in t over bounded sub-intervals of the interval [0, ∞), it follows that the function T_t x, x ∈ E, is strongly continuous on the interval [0, ∞). Consequently, the family {T_t}_{t≥0} forms a contraction semigroup.
Step 5: We show that the infinitesimal generator of the semigroup {T_t}_{t≥0} thus obtained is precisely the operator A.

Let A_0 be the infinitesimal generator of {T_t}_{t≥0} with domain D(A_0). If x is an arbitrary element of the domain D(A), it follows from an application of Proposition 6.7 that

  e^{tA_α}x − x = ∫_0^t (d/ds)(e^{sA_α}x) ds = ∫_0^t e^{sA_α}(A_α x) ds.  (6.20)

However, we have, as α → +∞,

  e^{tA_α}x − x = T_t(α)x − x −→ T_t x − x for every x ∈ D(A),

and also

  ∫_0^t e^{sA_α}(A_α x) ds −→ ∫_0^t T_s(Ax) ds for every x ∈ D(A).

Indeed, it suffices to note that, as α → +∞,

  ‖∫_0^t e^{sA_α}(A_α x) ds − ∫_0^t T_s(Ax) ds‖
   ≤ ‖∫_0^t e^{sA_α}(A_α x − Ax) ds‖ + ‖∫_0^t (e^{sA_α} − T_s)(Ax) ds‖
   ≤ ∫_0^t ‖e^{sA_α}‖ ds · ‖A_α x − Ax‖ + ∫_0^t ‖e^{sA_α}(Ax) − T_s(Ax)‖ ds
   ≤ t‖A_α x − Ax‖ + ∫_0^t ‖T_s(α)(Ax) − T_s(Ax)‖ ds −→ 0,

since the convergence in formula (6.19) is uniform in t over bounded intervals [0, t_0] for each t_0 > 0. Hence, by letting α → +∞ in formula (6.20) we have, for all x ∈ D(A),

  T_t x − x = ∫_0^t T_s(Ax) ds.

Moreover, it follows that, as t ↓ 0,

  (T_t x − x)/t = (1/t)∫_0^t T_s(Ax) ds −→ T_0(Ax) = Ax for every x ∈ D(A),

since the integrand T_s(Ax) is strongly continuous. Summing up, we have proved that

  x ∈ D(A_0), A_0 x = Ax.

This implies that A ⊂ A_0. It remains to show that D(A) = D(A_0). If y is an arbitrary element of D(A_0), we let

  x = (I − A)^{−1}(I − A_0)y.

Then we have the assertions

  x ∈ D(A) ⊂ D(A_0), (I − A)x = (I − A_0)y,

and so

  (I − A_0)x = (I − A_0)y.

This implies that y = x ∈ D(A), since the operator (I − A_0) is bijective.
Step 6: Finally, we show the uniqueness of the semigroup. Let {U_t}_{t≥0} be another contraction semigroup which has A as its infinitesimal generator. For each x ∈ D(A) and each t > 0, we introduce a function w(s) as follows:

  w(s) = T_{t−s}(U_s x) for 0 ≤ s ≤ t.

Then it follows from an application of Proposition 6.7 that

  (d/ds)w(s) = ((d/ds)T_{t−s})(U_s x) + T_{t−s}((d/ds)U_s x)
   = −AT_{t−s}(U_s x) + T_{t−s}(AU_s x) = −T_{t−s}(AU_s x) + T_{t−s}(AU_s x) = 0 for 0 < s < t,

so that w(s) is constant for all s ∈ [0, t]. In particular, we obtain that w(0) = w(t), that is,

  T_t x = U_t x for each x ∈ D(A) and each t > 0.

This implies that T_t = U_t for all t ≥ 0, since T_t and U_t are both bounded and since the domain D(A) is dense in E. Now the proof of Theorem 6.10 is complete. □
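Step 1 of the proof can be watched numerically in E = R², where every matrix is a bounded generator and the Yosida approximations A_α = AJ_α converge to A. A minimal sketch, assuming an illustrative matrix and vector (not taken from the text):

```python
import numpy as np

A = np.array([[0.0, 1.0], [-2.0, -3.0]])   # generator; eigenvalues -1 and -2
x = np.array([1.0, 2.0])
I = np.eye(2)

errs_J, errs_A = [], []
for alpha in (10.0, 100.0, 1000.0):
    J = alpha * np.linalg.inv(alpha * I - A)   # J_alpha = alpha (alpha I - A)^{-1}
    A_alpha = A @ J                            # Yosida approximation A_alpha = A J_alpha
    errs_J.append(np.linalg.norm(J @ x - x))
    errs_A.append(np.linalg.norm(A_alpha @ x - A @ x))

print(errs_J)   # shrinks as alpha grows, cf. assertion (6.16b)
print(errs_A)   # shrinks as alpha grows, cf. assertion (6.17b)
```

Both error sequences decay like O(1/α), reflecting the identity J_α x − x = (αI − A)^{−1}(Ax) used in the proof.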

6.4.2 The Contraction Semigroup Associated with the Heat Kernel

In this subsection we consider the heat kernel

  K_t(x) = (1/(4πt)^{n/2}) e^{−|x|²/4t} for t > 0 and x ∈ R^n

on the function space

  C_0(R^n) = {u ∈ C(R^n) : lim_{x→∞} u(x) = 0}.

We say that a function u ∈ C(R^n) vanishes at infinity if the set

  {x ∈ R^n : |u(x)| ≥ ε}

is compact for every ε > 0, and we write

  lim_{x→∞} u(x) = 0.

It is easy to see that the function space C_0(R^n) is a Banach space with the supremum (maximum) norm

  ‖u‖_∞ = sup_{x∈R^n} |u(x)|.

This subsection is taken from [209, Sect. 4.4.2]. The purpose of this subsection is to prove the following typical example (see [209, Example 4.11]):

Example 6.11 A one-parameter family {T_t}_{t≥0} of bounded linear operators, defined by the formula

  T_t u(x) = u(x) for t = 0, T_t u(x) = ∫_{R^n} K_t(x − y)u(y) dy for t > 0,

forms a contraction semigroup of class (C_0) on the Banach space C_0(R^n).

Proof The proof of Example 6.11 is given by a series of several claims. In the following we shall write T(t) for T_t.
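Before turning to the proof, the contraction and semigroup properties asserted in Example 6.11 can be sampled numerically for n = 1 by discretizing the convolution with K_t. A minimal sketch, assuming an illustrative grid, cutoff, and test function:

```python
import numpy as np

def K(t, x):
    # one-dimensional heat kernel K_t(x)
    return np.exp(-x**2 / (4.0 * t)) / np.sqrt(4.0 * np.pi * t)

def T(t, u, xs):
    # (T(t)u)(x) = int K_t(x - y) u(y) dy, trapezoidal rule on the grid xs
    return np.array([np.trapz(K(t, x - xs) * u, xs) for x in xs])

xs = np.linspace(-20.0, 20.0, 2001)
u = np.exp(-xs**2) * np.cos(3.0 * xs)      # a function vanishing at infinity

contraction = np.max(np.abs(T(0.5, u, xs))) <= np.max(np.abs(u))
semigroup_err = np.max(np.abs(T(0.3, T(0.2, u, xs), xs) - T(0.5, u, xs)))

print(contraction)      # sup-norm does not increase
print(semigroup_err)    # T(0.3)T(0.2) agrees with T(0.5) up to quadrature error
```

The first check is the contraction property ‖T(t)u‖_∞ ≤ ‖u‖_∞, the second the semigroup property of Lemma 6.14 below.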

Step 1: First, the next lemma proves that the operators T(t) map C_0(R^n) into itself:

Lemma 6.12 We have, for all t > 0,

  T(t) : C_0(R^n) −→ C_0(R^n).

Proof Let u(x) be an arbitrary function in C_0(R^n). Then it follows that

  T(t)u(x) = (1/(4πt)^{n/2})∫_{R^n} e^{−|x−y|²/4t} u(y) dy = (1/(4πt)^{n/2})∫_{R^n} e^{−|z|²/4t} u(x − z) dz.

(1) First, we show that T(t)u ∈ C(R^n). Since u ∈ C_0(R^n) is uniformly continuous on R^n, for any given number ε > 0 we can find a constant δ = δ(ε) > 0 such that

  |x_1 − x_2| < δ ⟹ |u(x_1) − u(x_2)| < ε.

In particular, we have, for |x − y| < δ,

  |u(x − z) − u(y − z)| < ε for all z ∈ R^n.

Therefore, we obtain that, for |x − y| < δ,

  |T(t)u(x) − T(t)u(y)| = |(1/(4πt)^{n/2})∫_{R^n} e^{−|z|²/4t} u(x − z) dz − (1/(4πt)^{n/2})∫_{R^n} e^{−|z|²/4t} u(y − z) dz|
   ≤ (1/(4πt)^{n/2})∫_{R^n} e^{−|z|²/4t} |u(x − z) − u(y − z)| dz ≤ (ε/(4πt)^{n/2})∫_{R^n} e^{−|z|²/4t} dz = ε.

This proves that the function T(t)u is uniformly continuous on R^n.
(2) Secondly, we show that T(t)u ∈ C_0(R^n), that is,

  lim_{x→∞} T(t)u(x) = 0.  (6.21)

Since we have the condition

  lim_{x→∞} u(x) = 0,

for any given number ε > 0 we can find a positive integer N = N(ε) ∈ N such that

  |u(y)| < ε for all |y| > N.  (6.22)

Then we decompose the integral T(t)u(x) into the two terms:

  T(t)u(x) = (1/(4πt)^{n/2})∫_{R^n} e^{−|x−y|²/4t} u(y) dy
   = (1/(4πt)^{n/2})∫_{|y|≤N} e^{−|x−y|²/4t} u(y) dy + (1/(4πt)^{n/2})∫_{|y|>N} e^{−|x−y|²/4t} u(y) dy
   := I_1(x) + I_2(x).

However, by condition (6.22) we can estimate the term I_2(x) as follows:

  |I_2(x)| ≤ (1/(4πt)^{n/2})∫_{|y|>N} e^{−|x−y|²/4t} |u(y)| dy < (ε/(4πt)^{n/2})∫_{R^n} e^{−|x−y|²/4t} dy = ε.  (6.23)

The term I_1(x) may be estimated as follows: if |x| > N, then |x − y| ≥ |x| − N for all |y| ≤ N, so that

  |I_1(x)| ≤ (1/(4πt)^{n/2})∫_{|y|≤N} e^{−|x−y|²/4t} |u(y)| dy ≤ (1/(4πt)^{n/2}) e^{−(|x|−N)²/4t}∫_{|y|≤N} |u(y)| dy.

Moreover, we have, as x → ∞,

  e^{−(|x|−N)²/4t} −→ 0,

and hence

  I_1(x) −→ 0 as x → ∞.  (6.24)

Summing up, we obtain from assertions (6.23) and (6.24) that

  limsup_{x→∞} |T(t)u(x)| ≤ limsup_{x→∞} (|I_1(x)| + |I_2(x)|) ≤ ε.

This proves the desired assertion (6.21), since ε is arbitrary. The proof of Lemma 6.12 is complete. □

Moreover, we find that the operators {T(t)}_{t>0} are bounded on the space C_0(R^n). Indeed, it suffices to note that we have, for all x ∈ R^n,

  |T(t)u(x)| ≤ ∫_{R^n} K_t(x − y)|u(y)| dy ≤ ‖u‖_∞ (1/(4πt)^{n/2})∫_{R^n} e^{−|x−y|²/4t} dy = ‖u‖_∞,

and hence

  ‖T(t)u‖_∞ ≤ ‖u‖_∞ for all u ∈ C_0(R^n).

This proves that ‖T(t)‖ ≤ 1 for all t > 0.
Step 2: Secondly, we show that the family {T(t)}_{t≥0} forms a semigroup on the space C_0(R^n).
Step 2-1: To do this, we need the following Chapman–Kolmogorov equation for the heat kernel:

Lemma 6.13 (Chapman–Kolmogorov) For all t, s > 0, we have the equation

  K_{t+s}(x) = ∫_{R^n} K_t(x − y)K_s(y) dy.  (6.25)

Proof (1) The proof is based on the following elementary formula:

  |x − y|²/4t + |y|²/4s = |x|²/4(t+s) + ((t+s)/4ts)|y − (s/(t+s))x|².  (6.26)

Indeed, the right-hand side is calculated as follows:

  |x|²/4(t+s) + ((t+s)/4ts)|y − (s/(t+s))x|²
   = |x|²/4(t+s) + ((t+s)/4ts)(y − (s/(t+s))x, y − (s/(t+s))x)
   = |x|²/4(t+s) + ((t+s)/4ts)|y|² − (1/2t)(x, y) + (s/4t(t+s))|x|²
   = (1/4t)|x|² − (1/2t)(x, y) + ((t+s)/4ts)|y|².  (6.27)

Similarly, the left-hand side is calculated as follows:

  |x − y|²/4t + |y|²/4s = (1/4t)(x − y, x − y) + (1/4s)|y|²
   = (1/4t)|x|² − (1/2t)(x, y) + (1/4t)|y|² + (1/4s)|y|²
   = (1/4t)|x|² − (1/2t)(x, y) + ((t+s)/4ts)|y|².  (6.28)

Therefore, the desired formula (6.26) follows from formulas (6.27) and (6.28).
(2) By using formula (6.26), we can prove the Chapman–Kolmogorov equation (6.25) as follows:

  ∫_{R^n} K_t(x − y)K_s(y) dy
   = (1/(4πt)^{n/2})(1/(4πs)^{n/2})∫_{R^n} e^{−|x−y|²/4t} e^{−|y|²/4s} dy
   = (1/(4πt)^{n/2})(1/(4πs)^{n/2})∫_{R^n} e^{−(|x−y|²/4t + |y|²/4s)} dy
   = (1/(4πt)^{n/2})(1/(4πs)^{n/2}) e^{−|x|²/4(t+s)}∫_{R^n} e^{−((t+s)/4ts)|y − (s/(t+s))x|²} dy
   = (1/(4πt)^{n/2})(1/(4πs)^{n/2}) e^{−|x|²/4(t+s)}∫_{R^n} e^{−((t+s)/4ts)|z|²} dz
   = (1/(4πt)^{n/2})(1/(4πs)^{n/2}) e^{−|x|²/4(t+s)} (4ts/(t+s))^{n/2}∫_{R^n} e^{−|w|²} dw
   = (1/(4πt)^{n/2})(1/(4πs)^{n/2}) (4ts/(t+s))^{n/2} (√π)^n e^{−|x|²/4(t+s)}
   = (1/(4π(t+s))^{n/2}) e^{−|x|²/4(t+s)} = K_{t+s}(x).

The proof of Lemma 6.13 is complete. □
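The identity (6.25) can be spot-checked numerically for n = 1; the sample points and grid below are illustrative choices:

```python
import numpy as np

def K(t, x):
    # one-dimensional heat kernel K_t(x)
    return np.exp(-x**2 / (4.0 * t)) / np.sqrt(4.0 * np.pi * t)

t, s = 0.3, 0.7
ys = np.linspace(-30.0, 30.0, 6001)

errs = []
for x in (-1.0, 0.0, 0.5, 2.0):
    conv = np.trapz(K(t, x - ys) * K(s, ys), ys)   # int K_t(x-y) K_s(y) dy
    errs.append(abs(conv - K(t + s, x)))

print(max(errs))   # quadrature error only
```

The trapezoidal rule is very accurate here because the integrand is a smooth, rapidly decaying Gaussian product.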



Step 2-2: The next lemma proves that the family {T(t)}_{t≥0} forms a semigroup:

Lemma 6.14 For all t, s > 0, we have the formula

  T(t)(T(s)u)(x) = T(t + s)u(x) for u ∈ C_0(R^n).  (6.29)

Proof By using the Chapman–Kolmogorov equation (6.25), we obtain that

  T(t)(T(s)u)(x) = ∫_{R^n} K_t(x − y)T(s)u(y) dy
   = ∫_{R^n} K_t(x − y)(∫_{R^n} K_s(y − z)u(z) dz) dy
   = ∫_{R^n} (∫_{R^n} K_t(x − y)K_s(y − z) dy) u(z) dz
   = ∫_{R^n} K_{t+s}(x − z)u(z) dz
   = T(t + s)u(x) for every u ∈ C_0(R^n).

The proof of Lemma 6.14 is complete. □

Step 3: Finally, the next lemma proves that the operators {T(t)}_{t>0} converge strongly to the identity operator I:

Lemma 6.15 We have, for all u ∈ C_0(R^n),

  lim_{t↓0} ‖T(t)u − u‖_∞ = 0.  (6.30)

Proof Since we have the formula

  (1/(4πt)^{n/2})∫_{R^n} e^{−|x−y|²/4t} dy = (1/(4πt)^{n/2})∫_{R^n} e^{−|y|²/4t} dy = 1,

it follows that

  T(t)u(x) − u(x) = (1/(4πt)^{n/2})∫_{R^n} e^{−|x−y|²/4t} u(y) dy − u(x)
   = (1/(4πt)^{n/2})∫_{R^n} e^{−|x−y|²/4t}(u(y) − u(x)) dy.

However, since u ∈ C_0(R^n) is uniformly continuous on R^n, for any given number ε > 0 we can find a constant δ = δ(ε) > 0 such that

  |x − y| < δ ⟹ |u(x) − u(y)| < ε.

Then we decompose the term T(t)u(x) − u(x) into the two terms:

  T(t)u(x) − u(x) = (1/(4πt)^{n/2})∫_{|x−y|<δ} e^{−|x−y|²/4t}(u(y) − u(x)) dy
   + (1/(4πt)^{n/2})∫_{|x−y|≥δ} e^{−|x−y|²/4t}(u(y) − u(x)) dy.

The first term does not exceed ε in absolute value, while the second term is bounded by

  2‖u‖_∞ (1/(4πt)^{n/2})∫_{|z|≥δ} e^{−|z|²/4t} dz −→ 0 as t ↓ 0,

uniformly in x ∈ R^n. Hence limsup_{t↓0} ‖T(t)u − u‖_∞ ≤ ε, and the desired assertion (6.30) follows, since ε is arbitrary. The proof of Lemma 6.15 is complete, and so is the proof of Example 6.11. □

6.5 (C0) Semigroups

Lemma 6.22 Let A be the infinitesimal generator of a (C_0) semigroup {T_t}_{t≥0}. Then we have the following two assertions:
(i) If x ∈ D(A), then T_t x ∈ D(A) for all t > 0. Moreover, the function T_t x, x ∈ D(A), is continuously differentiable for all t > 0, and satisfies the equation

  (d/dt)(T_t x) = T_t(Ax) = A(T_t x) for all t > 0.  (6.35)

(ii) The operator A is a closed linear operator.

Proof (i) If x ∈ D(A), we have, for all h > 0,

  ((T_h − I)/h)T_t x = T_t((T_h − I)x/h).

However, since x ∈ D(A), it follows that

  T_t((T_h x − x)/h) −→ T_t(Ax) as h ↓ 0.

Therefore, we obtain that

  ((T_h − I)/h)T_t x = T_t((T_h − I)x/h) −→ T_t(Ax) as h ↓ 0.

This proves that T_t x ∈ D(A) and that A(T_t x) = T_t(Ax). Moreover, it follows that T_t x is strongly right-differentiable:

  (d⁺/dt)(T_t x) = lim_{h↓0} (T_{t+h}x − T_t x)/h = lim_{h↓0} T_t((T_h x − x)/h) = T_t(Ax).

On the other hand, we have, for all sufficiently small h > 0 with t − h > 0,

  (T_{t−h}x − T_t x)/(−h) − T_t(Ax) = T_{t−h}((x − T_h x)/(−h) − T_h(Ax)).  (6.36)

However, it follows that

  (x − T_h x)/(−h) = (T_h x − x)/h −→ Ax as h ↓ 0,
  T_h(Ax) −→ Ax as h ↓ 0.

Since we have the inequality (see inequality (6.34))

  ‖T_{t−h}‖ ≤ M e^{ω(t−h)},

we obtain from formula (6.36) that

  (T_{t−h}x − T_t x)/(−h) − T_t(Ax) = T_{t−h}((x − T_h x)/(−h) − T_h(Ax)) −→ 0 as h ↓ 0.

This proves that T_t x is strongly left-differentiable:

  (d⁻/dt)(T_t x) = lim_{h↓0} (T_{t−h}x − T_t x)/(−h) = T_t(Ax).

Therefore, we have proved that T_t x is strongly differentiable, and that Eq. (6.35) holds true. Moreover, we find that the derivative

  (d/dt)(T_t x) = T_t(Ax) for x ∈ D(A)

is strongly continuous.
(ii) Let {u_n} be an arbitrary sequence in the domain D(A) such that u_n → u and Au_n → v. We show that u ∈ D(A) and Au = v. Since we have, for all x ∈ D(A),

  (d/dt)(T_t x) = T_t(Ax),

it follows that

  T_t x − x = ∫_0^t (d/ds)(T_s x) ds = ∫_0^t T_s(Ax) ds.

In particular, we have the formula

  T_t u_n − u_n = ∫_0^t T_s(Au_n) ds.  (6.37)

However, we remark that

  T_t u_n − u_n −→ T_t u − u as n → ∞,

and that

  ‖∫_0^t (T_s(Au_n) − T_s v) ds‖ ≤ ∫_0^t ‖T_s(Au_n) − T_s v‖ ds ≤ t · max_{s∈[0,t]}‖T_s‖ · ‖Au_n − v‖ −→ 0 as n → ∞.

Therefore, by letting n → ∞ in formula (6.37) we obtain that

  T_t u − u = ∫_0^t T_s v ds.

Since the function T_s v is strongly continuous, we have the assertion

  lim_{t↓0} (T_t u − u)/t = lim_{t↓0} (1/t)∫_0^t T_s v ds = T_s v|_{s=0} = v.

This proves that

  u ∈ D(A), Au = v,

so that the operator A is closed. The proof of Lemma 6.22 is complete. □



Lemma 6.23 The domain D(A) of the infinitesimal generator A of a (C_0) semigroup {T_t}_{t≥0} is dense in the space E.

Proof First, we choose a real-valued function φ ∈ C_0^∞(R) such that

  supp φ ⊂ R₊ = (0, ∞).

If x is an arbitrary element of E, we let

  u = ∫_0^∞ φ(t) T_t x dt.

Then we have, for all sufficiently small h > 0,

  T_h u = ∫_0^∞ φ(t) T_h(T_t x) dt = ∫_0^∞ φ(t) T_{t+h}x dt = ∫_h^∞ φ(s − h) T_s x ds = ∫_0^∞ φ(s − h) T_s x ds,

and hence

  (T_h u − u)/h = −∫_0^∞ ((φ(s − h) − φ(s))/(−h)) T_s x ds.

By letting h ↓ 0 in this formula, we obtain from the dominated convergence theorem (Theorem 2.12) that

  lim_{h↓0} (T_h u − u)/h = −∫_0^∞ φ′(t) T_t x dt.  (6.38)

Fig. 6.2 A function φ_ε(t)

This proves that u ∈ D(A) and that

  Au = −∫_0^∞ φ′(t) T_t x dt.

For any given number ε > 0, we choose a real-valued function φ_ε ∈ C_0^∞(R) such that (see Fig. 6.2)

  supp φ_ε ⊂ [ε, 3ε], ∫_0^∞ φ_ε(t) dt = 1.

If we let

  u_ε = ∫_0^∞ φ_ε(t) T_t x dt,

it follows from an application of formula (6.38) with u := u_ε that u_ε ∈ D(A). Moreover, since we have the formula

  x = ∫_0^∞ φ_ε(t) x dt,

we obtain that

  ‖u_ε − x‖ ≤ ∫_0^∞ φ_ε(t)‖T_t x − x‖ dt = ∫_ε^{3ε} φ_ε(t)‖T_t x − x‖ dt
   ≤ sup_{t∈[ε,3ε]} ‖T_t x − x‖ −→ 0 as ε ↓ 0.

This proves the density of D(A) in E. The proof of Lemma 6.23 is complete. □



Corollary 6.24 Every (C_0) semigroup {T_t}_{t≥0} is uniquely determined by its infinitesimal generator A.

Proof Assume that two (C_0) semigroups {T_t} and {S_t} have a closed linear operator A as their infinitesimal generator. For any positive time t_0, it suffices to show that T_{t_0} = S_{t_0}.
If x is an arbitrary element of the domain D(A), we let

  W(t) = T_{t_0−t}(S_t x) for 0 ≤ t ≤ t_0.

Then it follows from an application of formula (6.35) that

  (d/dt)W(t) = ((d/dt)T_{t_0−t})S_t x + T_{t_0−t}((d/dt)S_t x) = T_{t_0−t}(−A)(S_t x) + T_{t_0−t}(AS_t x) = 0,

so that

  dW/dt ≡ 0 for all t ∈ [0, t_0].

This implies that

  T_{t_0}x = W(0) = W(t_0) = S_{t_0}x for all x ∈ D(A).  (6.39)

However, we know from Lemma 6.23 that the domain D(A) is dense in E. Hence we have, by assertion (6.39), T_{t_0} = S_{t_0}, since the operators T_{t_0} and S_{t_0} are bounded. The proof of Corollary 6.24 is complete. □

If {T_t}_{t≥0} is a (C_0) semigroup, we shall write formally

  T_t = e^{tA} = exp(tA),

by using its infinitesimal generator A.

6.5.2 Infinitesimal Generators and Their Resolvents

Let E be a Banach space and let A : E → E be a closed linear operator with domain D(A). We recall the following:
(1) The resolvent set of A, denoted by ρ(A), is defined to be the set of scalars λ ∈ C such that λI − A is injective and (λI − A)^{−1} ∈ L(E, E).
(2) If λ ∈ ρ(A), the inverse operator (λI − A)^{−1} is called the resolvent of A, and is denoted by R(λ; A):

  R(λ; A) = (λI − A)^{−1} for λ ∈ ρ(A).

(3) The complement of ρ(A) is called the spectrum of A, and is denoted by σ(A):

  σ(A) = C \ ρ(A).

The set σ_p(A) of scalars λ ∈ C such that the operator λI − A is not one-to-one forms a subset of σ(A), and is called the point spectrum of A. A scalar λ ∈ C belongs to σ_p(A) if and only if there exists a non-zero element x ∈ E such that Ax = λx. In this case, λ is called an eigenvalue of A and x an eigenvector of A corresponding to λ. Also, the null space N(λI − A) of λI − A is called the eigenspace of A corresponding to λ, and the dimension of N(λI − A) is called the geometric multiplicity of λ.
First, we have the following lemma:

Lemma 6.25 Let A : E → E be a closed linear operator with domain D(A). If λ, μ ∈ ρ(A), we have the following formulas:
(i) (λ − A)^{−1} − (μ − A)^{−1} = (μ − λ)(λ − A)^{−1}(μ − A)^{−1}.
(ii) (λ − A)^{−1}(μ − A)^{−1} = (μ − A)^{−1}(λ − A)^{−1}.
Formula (i) is called the resolvent equation.

Proof (i) The first formula may be proved as follows:

  (λ − A)^{−1} − (μ − A)^{−1}
   = (λ − A)^{−1}(μ − A)(μ − A)^{−1} − (λ − A)^{−1}(λ − A)(μ − A)^{−1}
   = (λ − A)^{−1}{(μ − A) − (λ − A)}(μ − A)^{−1}
   = (λ − A)^{−1}(μ − λ)(μ − A)^{−1} = (μ − λ)(λ − A)^{−1}(μ − A)^{−1}.

(ii) Moreover, if we interchange λ and μ in formula (i), it follows that

  (μ − A)^{−1} − (λ − A)^{−1} = (λ − μ)(μ − A)^{−1}(λ − A)^{−1}.

Therefore, by combining this formula with formula (i) we obtain that

  (μ − A)^{−1}(λ − A)^{−1} = (1/(λ − μ))((μ − A)^{−1} − (λ − A)^{−1})
   = (1/(λ − μ))(λ − μ)(λ − A)^{−1}(μ − A)^{−1} = (λ − A)^{−1}(μ − A)^{−1}.

The proof of Lemma 6.25 is complete. □
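Both formulas of Lemma 6.25 are algebraic identities that can be checked directly for a matrix A; the matrix below and the points λ, μ are illustrative choices that merely avoid the eigenvalues of A:

```python
import numpy as np

A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [-6.0, -11.0, -6.0]])   # companion matrix; eigenvalues -1, -2, -3
I = np.eye(3)

def R(lam):
    return np.linalg.inv(lam * I - A)  # resolvent (lam I - A)^{-1}

lam, mu = 1.0, 2.5
eq_i  = np.allclose(R(lam) - R(mu), (mu - lam) * R(lam) @ R(mu))   # resolvent equation
eq_ii = np.allclose(R(lam) @ R(mu), R(mu) @ R(lam))                # commutation
print(eq_i, eq_ii)
```

Both checks hold up to floating-point rounding, since the identities are exact consequences of the algebra in the proof above.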



Let {T(t)}_{t≥0} be a (C_0) semigroup and let A be its infinitesimal generator. The next theorem characterizes the resolvent set ρ(A) and the resolvent R(λ; A) = (λI − A)^{−1}:

Fig. 6.3 The half-plane {Re λ > ω}

Theorem 6.26 Let {T(t)}_{t≥0} be a (C_0) semigroup that satisfies the inequality

  ‖T_t‖ ≤ M e^{ωt} for all t ≥ 0.  (6.40)

Then we have the following two assertions:
(i) The infinitesimal generator A of {T_t}_{t≥0} is a closed operator and its domain D(A) is dense in E.
(ii) The resolvent set ρ(A) of A contains the half-plane {λ ∈ C : Re λ > ω} (see Fig. 6.3) and the resolvent R(λ; A) = (λI − A)^{−1} is expressed in the integral form

  R(λ; A)u = ∫_0^∞ e^{−λt} T(t)u dt for Re λ > ω and u ∈ E.  (6.41)

Moreover, we have the inequalities for the powers of R(λ; A)

  ‖R(λ; A)^m‖ ≤ M/(Re λ − ω)^m for Re λ > ω and m = 1, 2, ….  (6.42)

Proof The proof is divided into six steps.
Step 1: First, if λ is a complex number such that Re λ > ω, then we have, by inequality (6.40),

  ∫_0^∞ |e^{−λt}|‖T(t)u‖ dt ≤ M‖u‖∫_0^∞ e^{−(Re λ−ω)t} dt

   = M‖u‖[−e^{−(Re λ−ω)t}/(Re λ − ω)]_0^∞ = M‖u‖/(Re λ − ω) for all u ∈ E.

Hence, if we let

  R̃(λ)u = ∫_0^∞ e^{−λt} T(t)u dt for every u ∈ E,  (6.43)

it follows that R̃(λ) ∈ L(E, E) and that

  ‖R̃(λ)‖ ≤ M/(Re λ − ω) for Re λ > ω.  (6.44)

Step 2: Secondly, we show that, for all u ∈ E,

  R̃(λ)u ∈ D(A), (λI − A)R̃(λ)u = u.  (6.45)

For all sufficiently small h > 0, it follows that

  (T(h)(R̃(λ)u) − R̃(λ)u)/h
   = (1/h)∫_0^∞ e^{−λt} T(h)T(t)u dt − (1/h)∫_0^∞ e^{−λt} T(t)u dt
   = (1/h)∫_0^∞ e^{−λt}{T(h + t) − T(t)}u dt
   = (1/h)∫_h^∞ e^{−λ(s−h)} T(s)u ds − (1/h)∫_0^∞ e^{−λt} T(t)u dt
   = ((e^{λh} − 1)/h)∫_0^∞ e^{−λt} T(t)u dt − (1/h)∫_0^h e^{−λ(t−h)} T(t)u dt
   := I_1 + I_2.

However, we have, as h ↓ 0,

  I_1 = ((e^{λh} − 1)/(λh)) λR̃(λ)u −→ λR̃(λ)u,

and also

  I_2 = −(1/h)∫_0^h e^{−λ(t−h)} T(t)u dt −→ −T(0)u = −u.

Therefore, we have proved that

  lim_{h↓0} (T(h)(R̃(λ)u) − R̃(λ)u)/h = lim_{h↓0} (1/h)∫_0^∞ e^{−λt}{T(h + t) − T(t)}u dt = λR̃(λ)u − u.  (6.46)

This proves that

  R̃(λ)u ∈ D(A), A(R̃(λ)u) = λR̃(λ)u − u,

or equivalently,

  (λI − A)R̃(λ)u = u for all u ∈ E.

We remark that the operator λI − A is surjective for all Re λ > ω.
Step 3: Thirdly, we show that

  R̃(λ)(λI − A)u = u for all u ∈ D(A).  (6.47)

We have, by formula (6.43) and assertion (6.46),

  R̃(λ)(Au) = R̃(λ)(lim_{h↓0} (T(h)u − u)/h)
   = lim_{h↓0} (1/h)[∫_0^∞ e^{−λt} T(h)T(t)u dt − ∫_0^∞ e^{−λt} T(t)u dt]
   = lim_{h↓0} (1/h)∫_0^∞ e^{−λt}(T(h + t) − T(t))u dt
   = λR̃(λ)u − u.

This proves the desired assertion (6.47). We remark that the operator λI − A is injective for all Re λ > ω.
Step 4: Fourthly, we show that

  lim_{λ→+∞} λR̃(λ)u = u for every u ∈ E.  (6.48)

Since we have, for all λ > ω,

  ∫_0^∞ λe^{−λt} dt = 1,

it follows that

  ‖u − λR̃(λ)u‖ = ‖∫_0^∞ λe^{−λt}u dt − ∫_0^∞ λe^{−λt}T(t)u dt‖ ≤ ∫_0^∞ λe^{−λt}‖u − T(t)u‖ dt.

However, for any given number ε > 0 we can find a number δ = δ(ε) > 0 such that

  0 ≤ t < δ ⟹ ‖u − T(t)u‖ < ε.

Then we decompose the integral ∫_0^∞ λe^{−λt}‖u − T(t)u‖ dt into the two terms I_3 and I_4:

  ∫_0^∞ λe^{−λt}‖u − T(t)u‖ dt = ∫_0^δ λe^{−λt}‖u − T(t)u‖ dt + ∫_δ^∞ λe^{−λt}‖u − T(t)u‖ dt := I_3 + I_4.

The term I_3 may be estimated as follows:

  I_3 ≤ ε∫_0^δ λe^{−λt} dt = ε[−e^{−λt}]_0^δ = ε(1 − e^{−λδ}) < ε.

The term I_4 may be estimated as follows:

  I_4 ≤ ∫_δ^∞ λe^{−λt}(1 + ‖T(t)‖)‖u‖ dt ≤ ‖u‖∫_δ^∞ λe^{−λt}(1 + Me^{ωt}) dt
   = ‖u‖∫_δ^∞ (λe^{−λt} + Mλe^{−(λ−ω)t}) dt
   = ‖u‖[−e^{−λt} − (Mλ/(λ−ω))e^{−(λ−ω)t}]_δ^∞
   = ‖u‖(e^{−λδ} + (Mλ/(λ−ω))e^{−(λ−ω)δ}).

Therefore, we obtain that

  limsup_{λ→+∞} ‖u − λR̃(λ)u‖ ≤ limsup_{λ→+∞} (I_3 + I_4) ≤ ε.

This proves the desired assertion (6.48), since ε > 0 is arbitrary.
Step 5: By combining assertions (6.45) and (6.48), we obtain that the domain D(A) is dense in the space E. Moreover, it follows from assertions (6.45) and (6.47) and inequality (6.44) that R̃(λ) = (λI − A)^{−1} is the resolvent of A, that is,

  R(λ; A)u = R̃(λ)u = ∫_0^∞ e^{−λt} T(t)u dt for Re λ > ω and u ∈ E,

and further that A = λI − R̃(λ)^{−1} is a closed linear operator.
Step 6: Finally, we prove inequality (6.42). If we differentiate formula (6.41) with respect to λ, it follows that

  (d/dλ)(R(λ; A)u) = −∫_0^∞ t e^{−λt} T(t)u dt.

On the other hand, by using the resolvent equation in the Banach space L(E, E) we obtain that

  (d/dλ)(λI − A)^{−1} = −(λI − A)^{−2} = −R(λ; A)².

Hence we have the formula

  R(λ; A)²u = ∫_0^∞ t e^{−λt} T(t)u dt.

Similarly, if we differentiate this formula with respect to λ, we obtain that

  (d/dλ)(R(λ; A)²u) = −∫_0^∞ t² e^{−λt} T(t)u dt,

and that

  (d/dλ)(λI − A)^{−2} = −2(λI − A)^{−3} = −2R(λ; A)³.

Hence we have the formula

  2R(λ; A)³u = ∫_0^∞ t² e^{−λt} T(t)u dt,

or equivalently,

  R(λ; A)³u = (1/2)∫_0^∞ t² e^{−λt} T(t)u dt.

Continuing this process, we have, after m − 1 steps,

  R(λ; A)^m u = (1/(m − 1)!)∫_0^∞ t^{m−1} e^{−λt} T(t)u dt.

Moreover, by using inequality (6.40) we obtain that

  ‖R(λ; A)^m u‖ ≤ (M‖u‖/(m − 1)!)∫_0^∞ t^{m−1} e^{−(Re λ−ω)t} dt.

However, the integral on the right-hand side can be calculated, by repeated integration by parts, as follows:

  ∫_0^∞ t^{m−1} e^{−(Re λ−ω)t} dt
   = [−t^{m−1} e^{−(Re λ−ω)t}/(Re λ − ω)]_{t=0}^{t=∞} + ∫_0^∞ ((m − 1)t^{m−2}/(Re λ − ω)) e^{−(Re λ−ω)t} dt
   = ∫_0^∞ ((m − 1)t^{m−2}/(Re λ − ω)) e^{−(Re λ−ω)t} dt
   = ⋯
   = ((m − 1)!/(Re λ − ω)^{m−1})∫_0^∞ e^{−(Re λ−ω)t} dt
   = ((m − 1)!/(Re λ − ω)^{m−1})[−e^{−(Re λ−ω)t}/(Re λ − ω)]_{t=0}^{t=∞}
   = (m − 1)!/(Re λ − ω)^m.

Therefore, we have proved that

  ‖R(λ; A)^m u‖ ≤ (M‖u‖/(m − 1)!) · ((m − 1)!/(Re λ − ω)^m) = M‖u‖/(Re λ − ω)^m for all u ∈ E.

This proves the desired inequality (6.42). Now the proof of Theorem 6.26 is complete. □
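The integral representation of the powers R(λ; A)^m derived in Step 6 can be checked numerically for a matrix semigroup; the matrix, vector, λ, and quadrature parameters below are illustrative assumptions:

```python
import numpy as np
from math import factorial

A = np.array([[0.0, 1.0], [-2.0, -3.0]])   # eigenvalues -1 and -2
u = np.array([1.0, 1.0])
lam = 0.5
R = np.linalg.inv(lam * np.eye(2) - A)

# T(t)u = e^{tA} u via the eigendecomposition A = V diag(w) V^{-1}
w, V = np.linalg.eig(A)
c = np.linalg.solve(V, u)
ts = np.linspace(0.0, 60.0, 120001)
Ttu = np.real(np.exp(np.outer(ts, w)) * c) @ np.real(V).T

errs = []
for m in (1, 2, 3):
    weight = ts**(m - 1) * np.exp(-lam * ts) / factorial(m - 1)
    approx = np.trapz(weight[:, None] * Ttu, ts, axis=0)
    errs.append(np.max(np.abs(approx - np.linalg.matrix_power(R, m) @ u)))

print(errs)   # quadrature errors only
```

For m = 1 this is formula (6.41) itself; for m = 2, 3 it is the formula obtained above by differentiating under the integral sign.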



6.5.3 The Hille–Yosida Theorem

Now we consider when a linear operator is the infinitesimal generator of some (C_0) semigroup. This question is answered by the following Hille–Yosida theorem:

Theorem 6.27 (Hille–Yosida) Let E be a Banach space, and let A : E → E be a closed linear operator with domain D(A). The operator A is the infinitesimal generator of some (C_0) semigroup {T_t}_{t≥0} if and only if it satisfies the following two conditions:
(i) The operator A is a densely defined, closed linear operator.
(ii) There exists a real number ω such that the half-line (ω, ∞) is contained in the resolvent set ρ(A) of A, and the resolvent R(λ; A) = (λI − A)^{−1} satisfies the inequality

  ‖R(λ; A)^m‖ ≤ M/(λ − ω)^m for all λ > ω and all m = 1, 2, ….  (6.49)

Proof (I) The "only if" part follows immediately from Theorem 6.26.
(II) The proof of the "if" part is divided into four steps. In the following we shall write T(t) for T_t.
Step 1: If n is a positive integer such that n > ω, we let

  J_n := (I − (1/n)A)^{−1} = n(nI − A)^{−1} = nR(n; A) ∈ L(E, E),
  A_n := AJ_n = nA(nI − A)^{−1} = n{−I + n(nI − A)^{−1}} = −nI + n²R(n; A) ∈ L(E, E).

The operators A_n are called Yosida approximations. First, we show that

  s-lim_{n→∞} J_n = I,  (6.50)
  lim_{n→∞} A_n u = Au for every u ∈ D(A).  (6.51)

Since we have the inequality

  ‖R(λ; A)^m‖ ≤ M/(λ − ω)^m for all λ > ω and all m = 1, 2, …,

it follows that

  ‖R(n; A)‖ = ‖J_n‖/n ≤ M/(n − ω),  (6.52)

so that

  ‖J_n‖ ≤ nM/(n − ω).

This proves that the operators J_n are uniformly bounded in the space L(E, E). On the other hand, we have, for all u ∈ D(A),

  J_n u = nR(n; A)u = R(n; A)(nu − Au + Au) = R(n; A)(nI − A)u + R(n; A)(Au) = u + R(n; A)(Au).

6.5 (C0 ) Semigroups


However, it follows from inequality (6.52) that

$$\|R(n;A)(Au)\| \le \|R(n;A)\|\cdot\|Au\| \le \frac{M\|Au\|}{n-\omega} \longrightarrow 0 \quad \text{as } n \to \infty.$$

Hence we have, as n → ∞, Jₙu → u for every u ∈ D(A). Since D(A) is dense in E and the operators Jₙ are uniformly bounded, we obtain that Jₙu → u for every u ∈ E as n → ∞. This proves the desired assertion (6.50).

Moreover, since we have the formulas

$$\frac{1}{n}J_n A = (nI-A)^{-1}A = (nI-A)^{-1}\{nI - (nI-A)\} = n(nI-A)^{-1} - I,$$
$$\frac{1}{n}A J_n = A(nI-A)^{-1} = \{nI - (nI-A)\}(nI-A)^{-1} = n(nI-A)^{-1} - I,$$

it follows that the operators A and Jₙ are commutative on D(A). Hence we have, by assertion (6.50),

$$A_n u = A J_n u = J_n(Au) \longrightarrow Au \quad \text{for every } u \in D(A) \text{ as } n \to \infty.$$

This proves the desired assertion (6.51).

Step 2: For any positive integer n > ω, we let Tₙ(t) = e^{tAₙ} = e^{tAJₙ}, and show that

$$\|T_n(t)\| \le M e^{n\omega t/(n-\omega)}. \tag{6.53}$$

Since we have the formula

$$T_n(t) = e^{t(-nI + n^2 R(n;A))} = e^{-nt}\,e^{n^2 t\,R(n;A)},$$

it follows that

$$\|T_n(t)\| \le e^{-nt}\sum_{m=0}^\infty \frac{n^{2m}t^m}{m!}\|R(n;A)^m\| \le M e^{-nt}\sum_{m=0}^\infty \frac{n^{2m}t^m}{m!}\cdot\frac{1}{(n-\omega)^m} = M e^{-nt}\,e^{n^2 t/(n-\omega)} = M e^{n\omega t/(n-\omega)}$$

for every positive integer n > ω. This proves the desired inequality (6.53).

Step 3: Now we show the following two assertions:

(i) The strong limit T(t) := s−lim_{n→∞} Tₙ(t) exists in the Banach space L(E,E). More precisely, the function Tₙ(t)u, u ∈ E, converges to T(t)u uniformly in t on bounded intervals of [0, ∞).
(ii) The operators {T(t)}t≥0 form a (C0) semigroup and satisfy the inequality ‖T(t)‖ ≤ M e^{ωt} for all t ≥ 0.

If we let ω₀ := max{0, ω}, then it follows that

$$\frac{n\omega t}{n-\omega} \le 2\omega_0 t \quad \text{for every positive integer } n > 2\omega_0.$$

Let τ > 0 be an arbitrary positive number. Then we have, by Step 2,

$$\|T_n(t)\| \le M e^{n\omega t/(n-\omega)} \le M e^{2\omega_0 t} \le M e^{2\omega_0 \tau} \quad \text{for } t \in [0,\tau] \text{ and } n > 2\omega_0. \tag{6.54}$$

On the other hand, by Theorem 6.4 it follows that

$$\frac{d}{dt}\left(T_n(t)\right) = \frac{d}{dt}e^{tA_n} = T_n(t)A_n.$$

Since Aₘ and Aₙ are commutative and since Aₘ and Tₙ(s) are commutative, we have the formula

$$T_n(t) - T_m(t) = \left[T_m(t-s)T_n(s)\right]_{s=0}^{s=t} = \int_0^t \frac{d}{ds}\{T_m(t-s)T_n(s)\}\,ds$$
$$= \int_0^t \left(\frac{d}{ds}T_m(t-s)\cdot T_n(s) + T_m(t-s)\cdot\frac{d}{ds}T_n(s)\right)ds = \int_0^t \{-T_m(t-s)A_m T_n(s) + T_m(t-s)T_n(s)A_n\}\,ds$$
$$= \int_0^t \{T_m(t-s)T_n(s)A_n - T_m(t-s)T_n(s)A_m\}\,ds = \int_0^t T_m(t-s)T_n(s)\,(A_n - A_m)\,ds \quad \text{for all } t \in [0,\tau].$$

In view of inequality (6.54), this proves that

$$\|T_n(t)u - T_m(t)u\| \le M^2 e^{4\omega_0\tau}\,\tau\,\|A_n u - A_m u\| \quad \text{for all } t \in [0,\tau] \text{ and all } n, m > 2\omega_0. \tag{6.55}$$

However, it follows from assertion (6.51) that we have, for all u ∈ D(A), Aₙu → Au as n → ∞. Therefore, by letting n, m → ∞ in inequality (6.55) we find that the function Tₙ(t)u, u ∈ D(A), converges uniformly in t ∈ [0, τ], for each τ > 0.

We consider the general case where u ∈ E: for any given number ε > 0, we can find an element v ∈ D(A) such that ‖u − v‖ < ε. Then we have the inequality

$$\|T_n(t)u - T_m(t)u\| \le \|T_n(t)(u-v)\| + \|T_n(t)v - T_m(t)v\| + \|T_m(t)(u-v)\| \le 2M\varepsilon\,e^{2\omega_0\tau} + \|T_n(t)v - T_m(t)v\|,$$

for all t ∈ [0, τ] and all n, m > 2ω₀. Since the function Tₙ(t)v, v ∈ D(A), converges uniformly in t ∈ [0, τ], it follows that

$$\limsup_{n,m\to\infty} \|T_n(t)u - T_m(t)u\| \le 2M\varepsilon\,e^{2\omega_0\tau} \quad \text{for all } t \in [0,\tau].$$

This proves that the function Tₙ(t)u, u ∈ E, also converges uniformly in t ∈ [0, τ], for each τ > 0. Therefore, we have proved that the function Tₙ(t)u, u ∈ E, converges uniformly in t over bounded intervals of [0, ∞). We can define a family {T(t)}t≥0 of linear operators on E by the formula

$$T(t)u := \lim_{n\to\infty} T_n(t)u = \lim_{n\to\infty} e^{tA_n}u \quad \text{for every } u \in E.$$

First, we remark that the function T(t)u, u ∈ E, is strongly continuous for all t ≥ 0, since this convergence is uniform in t over bounded intervals of [0, ∞). Secondly, it follows from an application of the resonance theorem (Theorem 5.17) that

$$\|T(t)\| \le \liminf_{n\to\infty} \|T_n(t)\| \le \liminf_{n\to\infty} M e^{n\omega t/(n-\omega)} = M e^{\omega t} \quad \text{for all } t \ge 0,$$

so that T(t) ∈ L(E,E) for all t ≥ 0. Thirdly, since we have the group property for the operators Tₙ,

$$T_n(t+s) = T_n(t)T_n(s) \quad \text{for all } t, s \in \mathbb{R},$$

by passing to the limit we obtain the semigroup property for the operators


$$T(t+s) = T(t)T(s) \quad \text{for all } t, s \ge 0.$$

Summing up, we have proved that {T(t)}t≥0 forms a (C0) semigroup and satisfies the inequality ‖T(t)‖ ≤ M e^{ωt} for all t ≥ 0.

Step 4: Finally, we show that the infinitesimal generator of {T(t)}t≥0, which we denote by Ã, coincides with the operator A. Since we have, for all n > ω₀,

$$\frac{d}{dt}\left(T_n(t)u\right) = \frac{d}{dt}e^{tA_n}u = A_n T_n(t)u = T_n(t)A_n u \quad \text{for } u \in D(A),$$

it follows that

$$T_n(h)u - u = \int_0^h \frac{d}{dt}\left(T_n(t)u\right)dt = \int_0^h T_n(t)A_n u\,dt. \tag{6.56}$$

Hence we have, by inequality (6.53),

$$\|T_n(t)(A_n u) - T(t)(Au)\| \le \|T_n(t)\|\,\|A_n u - Au\| + \|(T_n(t) - T(t))(Au)\| \le M e^{2\omega_0 h}\|A_n u - Au\| + \|(T_n(t) - T(t))(Au)\|.$$

It should be noticed that the convergence Tₙ(t)(Aₙu) → T(t)(Au) is uniform in t ∈ [0, h] as n → ∞. By letting n → ∞ in formula (6.56), we obtain that

$$T(h)u - u = \int_0^h T(t)Au\,dt \quad \text{for } u \in D(A).$$

Hence it follows that

$$\lim_{h\downarrow 0}\frac{T(h)u - u}{h} = \lim_{h\downarrow 0}\frac{1}{h}\int_0^h T(t)Au\,dt = T(0)Au = Au.$$

This proves that u ∈ D(Ã) and Ãu = Au, so that A ⊂ Ã. In order to prove that A = Ã, it suffices to show that D(Ã) ⊂ D(A). Let u be an arbitrary element of D(Ã). Since the operator


λI − A : D(A) → E is bijective for λ > ω, we can find a unique element v ∈ D(A) such that

$$(\lambda I - A)v = (\lambda I - \tilde{A})u,$$

where Ã denotes the infinitesimal generator of {T(t)}t≥0. However, since A ⊂ Ã, it follows that (λI − A)v = (λI − Ã)v, so that

$$(\lambda I - \tilde{A})(u - v) = 0.$$

Since the operator λI − Ã : D(Ã) → E is bijective for λ > ω (see assertion (ii) of Theorem 6.26), this proves that u = v ∈ D(A), that is, D(Ã) ⊂ D(A). Now the proof of Theorem 6.27 is complete. □



The next corollary gives a simple necessary and sufficient condition for contraction (C0) semigroups (cf. Theorem 6.10):

Corollary 6.28 Let E be a Banach space, and let A : E → E be a densely defined, closed linear operator with domain D(A). The operator A is the infinitesimal generator of some contraction (C0) semigroup {Tt}t≥0 if and only if the half-line (0, ∞) is contained in the resolvent set ρ(A) of A and the resolvent R(λ; A) = (λI − A)⁻¹ satisfies the inequality

$$\|R(\lambda;A)\| \le \frac{1}{\lambda} \quad \text{for all } \lambda > 0.$$

Proof The “only if” part follows from Theorem 6.26 with ω := 0 and M := 1. The “if” part may be proved as follows: since we have, for every integer m ∈ N,

$$\|R(\lambda;A)^m\| \le \|R(\lambda;A)\|^m \le \frac{1}{\lambda^m} \quad \text{for all } \lambda > 0,$$

it follows that condition (ii) of Theorem 6.27 is satisfied with ω := 0 and M := 1. Therefore, by applying Theorem 6.27 to our situation we obtain that A is the infinitesimal generator of some contraction (C0) semigroup. The proof of Corollary 6.28 is complete. □
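Corollary 6.28 and the Yosida approximations of Step 1 can be illustrated in the simplest possible setting E = R, where A is multiplication by a fixed number a ≤ 0, so that R(λ; A) = 1/(λ − a) and Tₜ = e^{ta}. The following is only a toy numerical sketch (the scalar a, the sample points, and the tolerances are our own choices, not the book's):

```python
import math

a = -2.0  # toy generator Au = a*u with a <= 0; T_t u = e^(t*a) u is a contraction

# contraction criterion of Corollary 6.28: |R(lam; A)| = 1/(lam - a) <= 1/lam for lam > 0
for lam in (0.5, 1.0, 10.0):
    assert abs(1.0 / (lam - a)) <= 1.0 / lam

# the semigroup itself is a contraction: e^(t*a) <= 1 for t >= 0
for t in (0.0, 1.0, 5.0):
    assert math.exp(t * a) <= 1.0

# Yosida approximations A_n = n*a/(n - a) (Step 1 of Theorem 6.27) converge to a,
# and the bounded semigroups e^(t*A_n) converge to e^(t*a)
t = 1.0
for n in (10, 100, 1000):
    A_n = n * a / (n - a)              # scalar form of A_n = -nI + n^2 R(n; A)
    assert abs(A_n - a) <= a * a / n   # |A_n - a| = a^2/(n - a) <= a^2/n
    assert abs(math.exp(t * A_n) - math.exp(t * a)) <= t * abs(A_n - a)
```

The last assertion uses the elementary bound |e^x − e^y| ≤ |x − y| for x, y ≤ 0, which is exactly why the Yosida approximation scheme converges so tamely for dissipative generators.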


6.5.4 (C0) Semigroups and Initial-Value Problems

Finally, we consider an initial-value problem associated with a (C0) semigroup. More precisely, we prove the following existence and uniqueness theorem:

Theorem 6.29 Let {Tt}t≥0 be a (C0) semigroup with infinitesimal generator A. If x ∈ D(A), then the function u(t) = Tt x is a unique solution of the initial-value problem

$$(*)\qquad \frac{du}{dt} = Au \quad \text{for all } t > 0, \qquad u(0) = x,$$

which satisfies the following three conditions:

(a) The function u(t) is continuously differentiable for all t > 0.
(b) ‖u(t)‖ ≤ M e^{βt} for all t ≥ 0, for some constants M ≥ 1 and β ∈ R.
(c) u(t) → x as t ↓ 0.

In other words, the initial-value problem (∗) is well-posed.

The proof of Theorem 6.29 is based on the following result on the Laplace transform:

Lemma 6.30 Let u(t) be an E-valued, bounded continuous function defined on the open interval R₊ = (0, ∞). If we have, for all λ > 0,

$$\int_0^\infty e^{-\lambda t}u(t)\,dt = 0, \tag{6.57}$$

then it follows that u(t) = 0 for all t ≥ 0.

Proof (1) If f is an arbitrary element of the dual space E′ of E, then it follows that the function R₊ ∋ t ↦ f(u(t)) is bounded and continuous. Moreover, it is easy to verify the formula

$$\int_0^\infty e^{-\lambda t} f(u(t))\,dt = f\left(\int_0^\infty e^{-\lambda t}u(t)\,dt\right), \tag{6.58}$$

since the integrals can be approximated by Riemann sums. Indeed, let Δ = {t₀ = 0, t₁, …, tₙ = M} be a division of the interval [0, M] for any M > 0. Then we have, for the corresponding Riemann sums,


$$\sum_{i=0}^n e^{-\lambda t_i} f(u(t_i))\,|\Delta| = f\left(\sum_{i=0}^n e^{-\lambda t_i} u(t_i)\,|\Delta|\right), \tag{6.59}$$

where

$$|\Delta| = \max_{1\le i\le n}|t_i - t_{i-1}|.$$

However, by letting |Δ| → 0 in formula (6.59), we obtain that

$$\sum_{i=0}^n e^{-\lambda t_i} f(u(t_i))\,|\Delta| \longrightarrow \int_0^M e^{-\lambda t} f(u(t))\,dt,$$
$$f\left(\sum_{i=0}^n e^{-\lambda t_i} u(t_i)\,|\Delta|\right) \longrightarrow f\left(\int_0^M e^{-\lambda t} u(t)\,dt\right).$$

Hence we have, for all M > 0,

$$\int_0^M e^{-\lambda t} f(u(t))\,dt = f\left(\int_0^M e^{-\lambda t} u(t)\,dt\right). \tag{6.60}$$

The desired formula (6.58) follows by letting M → ∞ in formula (6.60).

(2) By combining condition (6.57) and formula (6.58), we obtain that

$$\int_0^\infty e^{-\lambda t} f(u(t))\,dt = 0 \quad \text{for all } \lambda > 0.$$

Hence we have, by the fundamental fact of the Laplace transform, f(u(t)) = 0 for all t ≥ 0. This proves that u(t) = 0 for all t ≥ 0, since f is an arbitrary element of the dual space E′. The proof of Lemma 6.30 is complete. □

Corollary 6.31 Let u(t) be an E-valued, continuous function defined on the open interval R₊ = (0, ∞). Assume that there exist a constant M ≥ 1 and a real number β such that

$$\|u(t)\| \le M e^{\beta t} \quad \text{for all } t \ge 0. \tag{6.61}$$

If we have, for all λ > β,

$$\int_0^\infty e^{-\lambda t}u(t)\,dt = 0,$$

then it follows that u(t) = 0 for all t ≥ 0.


Proof If we let

$$v(t) = u(t)e^{-\beta t}, \qquad \xi = \lambda - \beta,$$

then it follows that the function v(t) satisfies the boundedness condition

$$\|v(t)\| \le M \quad \text{for all } t \ge 0,$$

and the condition

$$\int_0^\infty e^{-\xi t}v(t)\,dt = \int_0^\infty e^{-(\lambda-\beta)t}v(t)\,dt = \int_0^\infty e^{-\lambda t}u(t)\,dt = 0 \quad \text{for all } \xi = \lambda - \beta > 0.$$

Therefore, by applying Lemma 6.30 we obtain that

$$v(t) = u(t)e^{-\beta t} = 0 \quad \text{for all } t \ge 0,$$

so that u(t) = 0 for all t ≥ 0. The proof of Corollary 6.31 is complete. □
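Lemma 6.30 and Corollary 6.31 say that the Laplace transform determines u; contrapositively, a nonzero u must have a nonvanishing transform. The following numerical sketch checks the transform of u(t) = e^{−t} − e^{−2t} against its closed-form value 1/(λ+1) − 1/(λ+2) (the choice of u, the quadrature step, and the tolerance are our own, purely illustrative assumptions):

```python
import math

def u(t):
    # a nonzero continuous function with a known Laplace transform
    return math.exp(-t) - math.exp(-2.0 * t)

def laplace_numeric(lam, T=40.0, N=40000):
    # midpoint-rule approximation of the integral, truncated to [0, T]
    h = T / N
    return sum(math.exp(-lam * (i + 0.5) * h) * u((i + 0.5) * h) for i in range(N)) * h

for lam in (1.0, 2.0, 5.0):
    exact = 1.0 / (lam + 1.0) - 1.0 / (lam + 2.0)  # closed-form transform
    assert exact != 0.0                            # the transform does not vanish
    assert abs(laplace_numeric(lam) - exact) < 1e-4
```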



Proof of Theorem 6.29 (1) By Lemma 6.22 with A := A, it follows that the function u(t) = Tt x, x ∈ D(A), is a solution of problem (∗).

(2) We have only to show the uniqueness of solutions. Assume that u₁ and u₂ are two solutions of problem (∗) which satisfy conditions (a), (b) and (c):

$$\|u_1(t)\| \le M e^{\beta_1 t} \quad \text{for all } t \ge 0,$$
$$\|u_2(t)\| \le M e^{\beta_2 t} \quad \text{for all } t \ge 0.$$

Then it follows that the function v(t) = u₁(t) − u₂(t) is a solution of the initial-value problem

$$\frac{dv}{dt} = Av, \qquad v(0) = 0,$$

which satisfies the condition


$$\|v(t)\| \le 2M e^{\beta t} \quad \text{for all } t \ge 0, \tag{6.62}$$

where β = max{β₁, β₂}. Now we take an arbitrary real number λ such that λ > β, and let

$$W(t) = e^{-\lambda t}v(t) = e^{-\lambda t}\left(u_1(t) - u_2(t)\right).$$

Then it follows that

$$\frac{d}{dt}W(t) = \frac{d}{dt}\left(e^{-\lambda t}v(t)\right) = -\lambda W(t) + e^{-\lambda t}\frac{dv}{dt}(t) = -\lambda W(t) + e^{-\lambda t}Av(t) = -(\lambda I - A)W(t),$$

so that

$$\int_0^s W(t)\,dt = -(\lambda I - A)^{-1}\int_0^s \frac{dW}{dt}(t)\,dt = -(\lambda I - A)^{-1}W(s),$$

since W(0) = v(0) = 0. Hence we have, by inequality (6.62),

$$\left\|\int_0^s W(t)\,dt\right\| = \left\|\int_0^s e^{-\lambda t}v(t)\,dt\right\| = \left\|-(\lambda I - A)^{-1}W(s)\right\| = \left\|-(\lambda I - A)^{-1}e^{-\lambda s}v(s)\right\| \le \frac{1}{\lambda-\beta}\,e^{-\lambda s}\cdot 2M e^{\beta s} = \frac{2M}{\lambda-\beta}\,e^{-(\lambda-\beta)s}.$$

Since λ > β, it follows that

$$\left\|\int_0^s W(t)\,dt\right\| \le \frac{2M}{\lambda-\beta}\,e^{-(\lambda-\beta)s} \longrightarrow 0 \quad \text{as } s \to \infty.$$

Therefore, we have proved that

$$\int_0^\infty e^{-\lambda t}v(t)\,dt = \int_0^\infty W(t)\,dt = 0 \quad \text{for all } \lambda > \beta.$$

In view of condition (6.62), by applying Corollary 6.31 to the function v(t) we obtain that

$$v(t) = u_1(t) - u_2(t) = 0 \quad \text{for all } t \ge 0,$$

so that


$$u_1(t) = u_2(t) \quad \text{for all } t \ge 0.$$

This proves the uniqueness theorem for problem (∗). Now the proof of Theorem 6.29 is complete. □



6.6 Notes and Comments

Hille–Phillips [81] and Yosida [240] are the classics of semigroup theory. The material in this chapter is adapted from the books of Chazarain–Piriou [35], Friedman [66], Goldstein [76], Kreĭn [106], Pazy [142], Tanabe [218], and also part of Taira [203], in such a way as to make it accessible to graduate students and advanced undergraduates. For more leisurely treatments of semigroups, the reader may be referred to Amann [11] and Engel–Nagel [51].

Part II

Elements of Partial Differential Equations

Chapter 7

Distributions, Operators and Kernels

This chapter is a summary of the basic definitions and results about the theory of distributions, or generalized functions, which will be used in subsequent chapters. Distribution theory has become a convenient tool in the study of partial differential equations. Many problems in partial differential equations can be formulated in terms of abstract operators acting between suitable spaces of distributions, and these operators are then analyzed by the methods of functional analysis. The virtue of this approach is that a given problem is stripped of extraneous data, so that the analytic core of the problem is revealed.

Section 7.1 serves to settle questions of notation and such. In Sect. 7.2 we study L^p spaces, the spaces of C^k functions and test functions, and also Hölder spaces on an open subset of Euclidean space. Moreover, we introduce Friedrichs' mollifiers and show how they can be used to approximate a function by smooth functions (Theorem 7.4). In Sect. 7.3 we study differential operators and state that differential operators are local operators (Peetre's theorem 7.7). In Sect. 7.4 we present a brief description of the basic concepts and results of distributions. In particular, the importance of tempered distributions lies in the fact that they have Fourier transforms. In Sect. 7.4.10 we give the Fourier transform of a tempered distribution which is closely related to the stationary phase theorem (Example 7.29). In Sect. 7.5 we formulate the Schwartz kernel theorem (Theorem 7.36), which characterizes continuous linear operators in terms of distributions. In Sect. 7.6 we describe the classical single and double layer potentials arising in the Dirichlet and Neumann problems for the Laplacian Δ in the case of the half-space R₊ⁿ (formulas (7.53) and (7.54)). Moreover, we prove the Green representation formula (7.56). This formula will be formulated in terms of pseudo-differential operators in Chap. 9 (Sect. 9.5). Some results in Sects. 7.3, 7.4 and 7.5 can be extended to distributions, differential operators, and operators and kernels on a manifold in Sect. 7.7. The virtue of manifold theory is that it provides the geometric insight into the study of partial differential equations, and intrinsic properties of partial differential equations may be revealed. In the last Sect. 7.8 we introduce the notion of domains of class C^r from the viewpoint of manifold theory.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. K. Taira, Functional Analytic Techniques for Diffusion Processes, Springer Monographs in Mathematics, https://doi.org/10.1007/978-981-19-1099-9_7

7.1 Notation

This section serves to settle questions of notation and such.

7.1.1 Points in Euclidean Spaces

Let Rⁿ be the n-dimensional Euclidean space. We use the conventional notation x = (x₁, x₂, …, xₙ). If x = (x₁, x₂, …, xₙ) and y = (y₁, y₂, …, yₙ) are points in Rⁿ, we set

$$x \cdot y = \sum_{j=1}^n x_j y_j, \qquad |x| = \left(\sum_{j=1}^n x_j^2\right)^{1/2}.$$

7.1.2 Multi-Indices and Derivations

Let α = (α₁, α₂, …, αₙ) be an n-tuple of non-negative integers. Such an n-tuple α is called a multi-index. We let

$$|\alpha| = \alpha_1 + \alpha_2 + \cdots + \alpha_n, \qquad \alpha! = \alpha_1!\,\alpha_2!\cdots\alpha_n!.$$

If α = (α₁, α₂, …, αₙ) and β = (β₁, β₂, …, βₙ) are multi-indices, we define

$$\alpha + \beta = (\alpha_1+\beta_1,\ \alpha_2+\beta_2,\ \ldots,\ \alpha_n+\beta_n).$$

The notation α ≤ β means that αⱼ ≤ βⱼ for each 1 ≤ j ≤ n. Then we let


$$\binom{\beta}{\alpha} = \binom{\beta_1}{\alpha_1}\binom{\beta_2}{\alpha_2}\cdots\binom{\beta_n}{\alpha_n}.$$

We use the shorthand

$$\partial_j = \frac{\partial}{\partial x_j}, \qquad D_j = \frac{1}{i}\frac{\partial}{\partial x_j} \quad (i = \sqrt{-1})$$

for derivatives on Rⁿ. Higher-order derivatives are expressed by multi-indices as follows:

$$\partial^\alpha = \partial_1^{\alpha_1}\partial_2^{\alpha_2}\cdots\partial_n^{\alpha_n}, \qquad D^\alpha = D_1^{\alpha_1}D_2^{\alpha_2}\cdots D_n^{\alpha_n}.$$

Similarly, if x = (x₁, x₂, …, xₙ) ∈ Rⁿ, we write

$$x^\alpha = x_1^{\alpha_1}x_2^{\alpha_2}\cdots x_n^{\alpha_n}.$$
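The multi-index conventions above translate directly into code. The following sketch (the helper names are our own, hypothetical choices) computes |α|, α!, and x^α with the standard library only:

```python
from math import factorial, prod

def abs_alpha(alpha):
    # |alpha| = alpha_1 + ... + alpha_n
    return sum(alpha)

def fact_alpha(alpha):
    # alpha! = alpha_1! * ... * alpha_n!
    return prod(factorial(a) for a in alpha)

def monomial(x, alpha):
    # x^alpha = x_1^alpha_1 * ... * x_n^alpha_n
    return prod(xj ** aj for xj, aj in zip(x, alpha))

alpha = (2, 0, 1)
assert abs_alpha(alpha) == 3
assert fact_alpha(alpha) == 2
assert monomial((2.0, 5.0, 3.0), alpha) == 12.0
```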

7.2 Function Spaces

In this section we study L^p spaces, the spaces of C^k functions and test functions, and also Hölder spaces on an open subset of Euclidean space.

7.2.1 L^p Spaces

Let Ω be an open subset of Rⁿ. Two Lebesgue measurable functions f, g on Ω are said to be equivalent if they are equal almost everywhere in Ω with respect to the Lebesgue measure dx, that is, if f(x) = g(x) for every x outside a set of Lebesgue measure zero. This is obviously an equivalence relation. If 1 ≤ p < ∞, we let

L^p(Ω) = the space of equivalence classes of Lebesgue measurable functions f(x) on Ω such that |f(x)|^p is integrable on Ω.

The space L^p(Ω) is a Banach space with the norm

$$\|f\|_p = \left(\int_\Omega |f(x)|^p\,dx\right)^{1/p}.$$


Furthermore, the space L²(Ω) is a Hilbert space with the inner product

$$(f, g) = \int_\Omega f(x)\overline{g(x)}\,dx.$$

A Lebesgue measurable function f(x) on Ω is said to be essentially bounded if there exists a constant C > 0 such that |f(x)| ≤ C almost everywhere (a.e.) in Ω. We define

$$\operatorname*{ess\,sup}_{x\in\Omega} |f(x)| = \inf\{C : |f(x)| \le C \text{ a.e. in } \Omega\}.$$

For p = ∞, we let

L^∞(Ω) = the space of equivalence classes of essentially bounded, Lebesgue measurable functions on Ω.

The space L^∞(Ω) is a Banach space with the norm ‖f‖∞ = ess sup_{x∈Ω}|f(x)|. If 1 < p < ∞, we let

$$p' = \frac{p}{p-1},$$

so that 1 < p′ < ∞ and

$$\frac{1}{p} + \frac{1}{p'} = 1.$$

The number p′ is called the exponent conjugate to p. We recall that the most basic inequality for L^p-functions is the following:

Theorem 7.1 (Hölder's inequality) If 1 < p < ∞ and f ∈ L^p(Ω), g ∈ L^{p′}(Ω), then the product f(x)g(x) is in L¹(Ω) and we have the inequality

$$\|fg\|_1 \le \|f\|_p\,\|g\|_{p'}. \tag{7.1}$$

It should be noticed that inequality (7.1) holds true for the two cases p = 1, p′ = ∞ and p = ∞, p′ = 1. Inequality (7.1) in the case p = p′ = 2 is referred to as Schwarz's inequality.
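With counting measure on finitely many points, Hölder's inequality becomes a statement about finite sums that is easy to verify numerically. A minimal sketch (the sample values and exponent are our own hypothetical choices):

```python
# Discrete Hölder inequality:
# sum |f_j g_j| <= (sum |f_j|^p)^(1/p) * (sum |g_j|^q)^(1/q), with 1/p + 1/q = 1.
f = [1.0, -2.0, 3.0]   # hypothetical sample values
g = [0.5, 4.0, -1.0]
p = 3.0
q = p / (p - 1.0)      # the conjugate exponent p' of Sect. 7.2.1

lhs = sum(abs(a * b) for a, b in zip(f, g))
norm_f = sum(abs(a) ** p for a in f) ** (1.0 / p)
norm_g = sum(abs(b) ** q for b in g) ** (1.0 / q)

assert abs(1.0 / p + 1.0 / q - 1.0) < 1e-12
assert lhs <= norm_f * norm_g
```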

7.2.2 Convolutions

We give a general theorem about integral operators on a measure space ([63, Theorem 6.18]):


Theorem 7.2 (Schur) Let (X, M, μ) be a measure space. Assume that K(x, y) is a measurable function on the product space X × X such that

$$\sup_{x\in X}\int_X |K(x,y)|\,d\mu(y) \le C \qquad \text{and} \qquad \sup_{y\in X}\int_X |K(x,y)|\,d\mu(x) \le C,$$

where C is a positive constant. If f ∈ L^p(X) with 1 ≤ p ≤ ∞, then the function Tf(x), defined by the formula

$$Tf(x) = \int_X K(x,y)f(y)\,d\mu(y),$$

is well-defined for almost all x ∈ X, and is in L^p(X). Furthermore, we have the inequality

$$\|Tf\|_p \le C\|f\|_p.$$

Corollary 7.3 (Young's inequality) If f ∈ L¹(Rⁿ) and g ∈ L^p(Rⁿ) with 1 ≤ p ≤ ∞, then the convolution (f ∗ g)(x), defined by the formula

$$(f * g)(x) = \int_{\mathbb{R}^n} f(x-y)g(y)\,dy,$$

is well-defined for almost all x ∈ Rⁿ, and is in L^p(Rⁿ). Furthermore, we have the inequality

$$\|f * g\|_p \le \|f\|_1\,\|g\|_p.$$
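Young's inequality also holds for discrete convolution of sequences (counting measure on Z), which makes it easy to check numerically. A sketch with our own hypothetical sequences:

```python
# Discrete Young's inequality: ||f * g||_p <= ||f||_1 * ||g||_p for sequences.
f = [1.0, -1.0, 2.0]           # hypothetical finitely supported sequences
g = [3.0, 0.5, -2.0, 1.0]

# full discrete convolution: (f*g)[k] = sum_j f[j] * g[k - j]
conv = [0.0] * (len(f) + len(g) - 1)
for j, fj in enumerate(f):
    for k, gk in enumerate(g):
        conv[j + k] += fj * gk

def lp_norm(h, r):
    return sum(abs(x) ** r for x in h) ** (1.0 / r)

for p in (1.0, 2.0, 4.0):
    assert lp_norm(conv, p) <= lp_norm(f, 1.0) * lp_norm(g, p) + 1e-12
```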

7.2.3 Spaces of C^k Functions

Let Ω be an open subset of Rⁿ. We let

C(Ω) = the space of continuous functions on Ω.

If K is a compact subset of Ω, we define a seminorm p_K on C(Ω) by the formula

$$p_K(\varphi) = \sup_{x\in K} |\varphi(x)|, \quad \varphi \in C(\Omega).$$


We equip the space C(Ω) with the topology defined by the family {p_K} of seminorms, where K ranges over all compact subsets of Ω. If k is a positive integer, we let

C^k(Ω) = the space of C^k functions on Ω.

We define a seminorm p_{K,k} on C^k(Ω) by the formula

$$p_{K,k}(\varphi) = \sup_{\substack{x\in K\\ |\alpha|\le k}} |\partial^\alpha\varphi(x)|. \tag{7.2}$$

We equip the space C^k(Ω) with the topology defined by the family {p_{K,k}} of seminorms, where K ranges over all compact subsets of Ω. This is the topology of uniform convergence on compact subsets of Ω of the functions and their derivatives of order ≤ k. We set

$$C^\infty(\Omega) = \bigcap_{k=1}^\infty C^k(\Omega),$$

and C⁰(Ω) = C(Ω). Let m be a non-negative integer or m = ∞. Let {K_ℓ} be a sequence of compact subsets of Ω such that K_ℓ is contained in the interior of K_{ℓ+1} for each ℓ and that

$$\Omega = \bigcup_{\ell=1}^\infty K_\ell.$$

For example, we may take

$$K_\ell = \left\{x \in \Omega : |x| \le \ell,\ \operatorname{dist}(x, \partial\Omega) \ge \frac{1}{\ell}\right\}.$$

Such a sequence {K_ℓ} is called an exhaustive sequence of compact subsets of Ω. It is easy to see that the countable family

$$\{p_{K_\ell, j}\}_{\ell=1,2,\ldots;\ 0\le j\le m}$$

of seminorms suffices to define the topology on C^m(Ω), and further that C^m(Ω) is complete. Hence the space C^m(Ω) is a Fréchet space.


Furthermore, we let

C(Ω̄) = the space of functions in C(Ω) having continuous extensions to the closure Ω̄ of Ω.

If k is a positive integer, we let

C^k(Ω̄) = the space of functions in C^k(Ω) all of whose derivatives of order ≤ k have continuous extensions to Ω̄.

We set

$$C^\infty(\overline{\Omega}) = \bigcap_{k=1}^\infty C^k(\overline{\Omega}),$$

and C⁰(Ω̄) = C(Ω̄). Let m be a non-negative integer or m = ∞. We equip the space C^m(Ω̄) with the topology defined by the family {p_{K,j}} of seminorms, where K ranges over all compact subsets of Ω̄ and 0 ≤ j ≤ m. Let {F_ℓ} be an increasing sequence of compact subsets of Ω̄ such that

$$\bigcup_{\ell=1}^\infty F_\ell = \overline{\Omega}.$$

For example, we may take

$$F_\ell = \left\{x \in \overline{\Omega} : |x| \le \ell\right\}.$$

Such a sequence {F_ℓ} is called an exhaustive sequence of compact subsets of Ω̄. It is easy to see that the countable family

$$\{p_{F_\ell, j}\}_{\ell=1,2,\ldots;\ 0\le j\le m}$$

of seminorms suffices to define the topology on C^m(Ω̄), and further that C^m(Ω̄) is complete. Hence the space C^m(Ω̄) is a Fréchet space. If Ω is bounded and 0 ≤ m < ∞, then the space C^m(Ω̄) is a Banach space with the norm

$$\|\varphi\|_{C^m(\overline{\Omega})} = \sup_{\substack{x\in\overline{\Omega}\\ |\alpha|\le m}} |\partial^\alpha\varphi(x)|.$$


7.2.4 Space of Test Functions

Let Ω be an open subset of Rⁿ and let u(x) be a continuous function on Ω. The support of u, denoted supp u, is the closure in Ω of the set {x ∈ Ω : u(x) ≠ 0}. In other words, the support of u is the smallest closed subset of Ω outside of which u vanishes.

Let m be a non-negative integer or m = ∞. If K is a compact subset of Ω, we let

C_K^m(Ω) = the space of functions in C^m(Ω) with support in K.

The space C_K^m(Ω) is a closed subspace of C^m(Ω). Furthermore, we let

$$C_0^m(\Omega) = \bigcup_{K\subset\Omega} C_K^m(\Omega),$$

where K ranges over all compact subsets of Ω, so that C_0^m(Ω) is the space of functions in C^m(Ω) with compact support in Ω. It should be emphasized that the space C_0^m(Ω) can be identified with the space of functions in C_0^m(Rⁿ) with support in Ω.

If {K_ℓ} is an exhaustive sequence of compact subsets of Ω, we equip the space C_0^m(Ω) with the inductive limit topology of the spaces C_{K_ℓ}^m(Ω), that is, the strongest locally convex linear space topology such that each injection C_{K_ℓ}^m(Ω) → C_0^m(Ω) is continuous. We can verify that this topology on C_0^m(Ω) is independent of the sequence {K_ℓ} used. We list some basic properties of the topology on C_0^m(Ω):

(1) A sequence {ϕⱼ} in C_0^m(Ω) converges to an element ϕ in C_0^m(Ω) if and only if the functions ϕⱼ and ϕ are supported in a common compact subset K of Ω and ϕⱼ → ϕ in C_K^m(Ω).
(2) A subset of C_0^m(Ω) is bounded if and only if it is bounded in C_K^m(Ω) for some compact K ⊂ Ω.
(3) A linear mapping of C_0^m(Ω) into a linear topological space is continuous if and only if its restriction to C_K^m(Ω) for every compact K ⊂ Ω is continuous.

The elements of C_0^∞(Ω) are often called test functions.

7.2.5 Hölder Spaces

Let D be a subset of Rⁿ and let 0 < θ < 1. A function ϕ defined on D is said to be Hölder continuous with exponent θ if the quantity

$$[\varphi]_{\theta;D} = \sup_{\substack{x,y\in D\\ x\neq y}} \frac{|\varphi(x)-\varphi(y)|}{|x-y|^\theta}$$

is finite. We say that ϕ is locally Hölder continuous with exponent θ if it is Hölder continuous with exponent θ on compact subsets of D. Hölder continuity may be viewed as a fractional differentiability.

Let Ω be an open subset of Rⁿ and 0 < θ < 1. We let

C^θ(Ω) = the space of functions in C(Ω) which are locally Hölder continuous with exponent θ on Ω.

If k is a positive integer, we let

C^{k+θ}(Ω) = the space of functions in C^k(Ω) all of whose kth order derivatives are locally Hölder continuous with exponent θ on Ω.

If K is a compact subset of Ω, we define a seminorm q_{K,k} on C^{k+θ}(Ω) by the formula

$$q_{K,k}(\varphi) = \sup_{\substack{x\in K\\ |\alpha|\le k}} |\partial^\alpha\varphi(x)| + \sup_{|\alpha|=k} [\partial^\alpha\varphi]_{\theta;K}.$$

It is easy to see that the Hölder space C^{k+θ}(Ω) is a Fréchet space.

Furthermore, we let

C^θ(Ω̄) = the space of functions in C(Ω̄) which are Hölder continuous with exponent θ on Ω.

If k is a positive integer, we let

C^{k+θ}(Ω̄) = the space of functions in C^k(Ω̄) all of whose kth order derivatives are Hölder continuous with exponent θ on Ω.

Let m be a non-negative integer. We equip the space C^{m+θ}(Ω̄) with the topology defined by the family {q_{K,k}} of seminorms, where K ranges over all compact subsets of Ω̄. It is easy to see that the Hölder space C^{m+θ}(Ω̄) is a Fréchet space. If Ω is bounded, then C^{m+θ}(Ω̄) is a Banach space with the norm

$$\|\varphi\|_{C^{m+\theta}(\overline{\Omega})} = \|\varphi\|_{C^m(\overline{\Omega})} + \sup_{|\alpha|=m}[\partial^\alpha\varphi]_{\theta;\overline{\Omega}} = \sup_{\substack{x\in\overline{\Omega}\\ |\alpha|\le m}} |\partial^\alpha\varphi(x)| + \sup_{|\alpha|=m}[\partial^\alpha\varphi]_{\theta;\overline{\Omega}}.$$


7.2.6 Friedrichs' Mollifiers

Let ρ(x) be a non-negative, bell-shaped C^∞ function on Rⁿ satisfying the following conditions:

$$\operatorname{supp}\rho = \{x \in \mathbb{R}^n : |x| \le 1\}, \tag{7.3a}$$
$$\int_{\mathbb{R}^n} \rho(x)\,dx = 1. \tag{7.3b}$$

For example, we may take

$$\rho(x) = \begin{cases} k\exp\left[-1/(1-|x|^2)\right] & \text{if } |x| < 1, \\ 0 & \text{if } |x| \ge 1, \end{cases}$$

where the constant factor k is so chosen that condition (7.3b) is satisfied. For each ε > 0, we define

$$\rho_\varepsilon(x) = \frac{1}{\varepsilon^n}\,\rho\!\left(\frac{x}{\varepsilon}\right);$$

then ρ_ε(x) is a non-negative C^∞ function on Rⁿ, and satisfies the conditions

$$\operatorname{supp}\rho_\varepsilon = \{x \in \mathbb{R}^n : |x| \le \varepsilon\}, \tag{7.4a}$$
$$\int_{\mathbb{R}^n} \rho_\varepsilon(x)\,dx = 1. \tag{7.4b}$$

The functions {ρ_ε} are called Friedrichs' mollifiers (see Fig. 7.1 below). The next theorem shows how Friedrichs' mollifiers can be used to approximate a function by smooth functions:

Theorem 7.4 Let Ω be an open subset of Rⁿ. Then we have the following two assertions (i) and (ii):

(i) If u ∈ L^p(Ω) with 1 ≤ p < ∞ and u vanishes outside a compact subset K of Ω, then it follows that ρ_ε ∗ u ∈ C_0^∞(Ω) provided that ε < dist(K, ∂Ω), and further that ρ_ε ∗ u → u in L^p(Ω) as ε ↓ 0.
(ii) If u ∈ C_0^m(Ω) with 0 ≤ m < ∞, then it follows that ρ_ε ∗ u ∈ C_0^∞(Ω) provided that ε < dist(supp u, ∂Ω), and further that ρ_ε ∗ u → u in C_0^m(Ω) as ε ↓ 0.

Here dist(K, ∂Ω) = inf{|x − y| : x ∈ K, y ∈ ∂Ω}. The functions ρ_ε ∗ u are called regularizations of the function u.


Fig. 7.1 Friedrichs’ mollifiers {ρε }
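In one dimension the construction above is easy to carry out numerically. The sketch below (the quadrature parameters and the test function u(x) = |x| are our own illustrative choices) checks the normalization (7.4b) for several ε and the convergence ρ_ε ∗ u → u at a point:

```python
import math

def rho_raw(x):
    # the bell-shaped profile: exp(-1/(1 - x^2)) inside |x| < 1, zero outside
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

N = 4000
h = 2.0 / N
# normalizing constant k chosen so that condition (7.3b) holds (midpoint rule)
k = 1.0 / (sum(rho_raw(-1.0 + (i + 0.5) * h) for i in range(N)) * h)

def rho_eps(x, eps):
    # one-dimensional mollifier: rho_eps(x) = (1/eps) * rho(x/eps)
    return k * rho_raw(x / eps) / eps

def mass(eps):
    step = 2.0 * eps / N
    return sum(rho_eps(-eps + (i + 0.5) * step, eps) for i in range(N)) * step

for eps in (1.0, 0.5, 0.1):
    assert abs(mass(eps) - 1.0) < 1e-6   # condition (7.4b) survives the scaling

def mollified_abs_at_zero(eps):
    # (rho_eps * u)(0) = integral of rho_eps(y) * |y| dy for u(x) = |x|
    step = 2.0 * eps / N
    return sum(rho_eps(-eps + (i + 0.5) * step, eps) * abs(-eps + (i + 0.5) * step)
               for i in range(N)) * step

assert mollified_abs_at_zero(0.01) < 0.01   # tends to u(0) = 0 as eps -> 0
```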

Corollary 7.5 The space C_0^∞(Ω) is dense in L^p(Ω) for each 1 ≤ p < ∞.

Indeed, Corollary 7.5 is an immediate consequence of part (i) of Theorem 7.4, since L^p functions with compact support are dense in L^p(Ω). The next result gives another useful construction of smooth functions that vanish outside compact sets:

Corollary 7.6 Let K be a compact subset of Rⁿ. If Ω is an open subset of Rⁿ such that K ⊂ Ω, then there exists a function f ∈ C_0^∞(Ω) such that 0 ≤ f(x) ≤ 1 in Ω and f(x) = 1 on K.

Proof Let δ = dist(K, ∂Ω), and define a relatively compact subset U of Ω, containing K, as follows:

$$U = \left\{x \in \Omega : |x - y| < \frac{\delta}{2} \text{ for some } y \in K\right\}.$$

Then it is easy to verify that the function

$$f(x) = \rho_\varepsilon * \chi_U(x) = \frac{1}{\varepsilon^n}\int_U \rho\!\left(\frac{x-y}{\varepsilon}\right)dy, \qquad 0 < \varepsilon < \frac{\delta}{2},$$

satisfies all the conditions.
0 and a non-negative integer m such that

$$|\langle u, \varphi\rangle| \le C\,p_{K,m}(\varphi) \quad \text{for all } \varphi \in C_K^\infty(\Omega),$$

where

$$p_{K,m}(\varphi) = \sup_{\substack{x\in K\\ |\alpha|\le m}} |\partial^\alpha\varphi(x)|.$$

(iii) ⟨u, ϕⱼ⟩ → 0 whenever ϕⱼ → 0 in C_0^∞(Ω).

Part (ii) of Theorem 7.4 tells us that the space C_0^∞(Ω) is a dense subspace of C_0^m(Ω) for 0 ≤ m < ∞. Also it is clear that the injection of C_0^∞(Ω) into C_0^m(Ω) is continuous. Hence the dual space D′^m(Ω) = L(C_0^m(Ω), C) can be identified with a linear subspace of D′(Ω), by the identification of a continuous linear functional on C_0^m(Ω) with its restriction to C_0^∞(Ω). The elements of D′^m(Ω) are called distributions of order ≤ m on Ω. In other words, the distributions of order ≤ m on Ω are precisely those distributions on Ω that have continuous extensions to C_0^m(Ω). Now we give some important examples of distributions.

Example 7.9 We let

L¹_loc(Ω) = the space of equivalence classes of Lebesgue measurable functions on Ω which are integrable on every compact subset of Ω.

The elements of L¹_loc(Ω) are called locally integrable functions on Ω. For example (n = 1), it is easy to verify the following two assertions (a) and (b):

(a) log|x| ∈ L¹_loc(R).
(b) Y(x) ∈ L¹_loc(R).

Here Y(x) is the Heaviside step function defined by the formula

$$Y(x) = \begin{cases} 1 & \text{for } x > 0, \\ 0 & \text{for } x < 0. \end{cases}$$

Every element f of L¹_loc(Ω) defines a distribution T_f of order zero on Ω by the formula

$$\langle T_f, \varphi\rangle = \int_\Omega f(x)\varphi(x)\,dx \quad \text{for every } \varphi \in C_0^\infty(\Omega).$$


Indeed, we have, for all ϕ ∈ C_K^∞(Ω),

$$|\langle T_f, \varphi\rangle| \le \left(\int_K |f(x)|\,dx\right) p_{K,0}(\varphi).$$

Moreover, we can prove that the mapping f ↦ T_f induces an injection of L¹_loc(Ω) into D′(Ω). Indeed, we can prove the following Du Bois Raymond lemma:

Lemma 7.10 (Du Bois Raymond) Assume that f ∈ L¹_loc(Ω) satisfies the condition

$$\int_\Omega f(x)\varphi(x)\,dx = 0 \quad \text{for all } \varphi \in C_0^\infty(\Omega). \tag{7.6}$$

Then it follows that f(x) = 0 almost everywhere in Ω.

Proof It suffices to show that we have, for any compact subset K of Ω, f(x) = 0 almost everywhere in K. Now we take a function χ ∈ C_0^∞(Ω) such that χ(x) = 1 on K, and let

$$f_\chi(x) = \chi(x)f(x) \quad \text{for } x \in \mathbb{R}^n.$$

Then we remark that f_χ ∈ L¹(Rⁿ). Hence it follows from an application of part (i) of Theorem 7.4 with p := 1 that

$$\rho_\varepsilon * f_\chi \longrightarrow f_\chi \quad \text{in } L^1(\mathbb{R}^n) \text{ as } \varepsilon \downarrow 0. \tag{7.7}$$

However, we have the assertion

$$\rho_\varepsilon * f_\chi(x) = \int_{\mathbb{R}^n} \rho_\varepsilon(x-y)f_\chi(y)\,dy = \int_{\mathbb{R}^n} f(y)\left(\chi(y)\rho_\varepsilon(x-y)\right)dy,$$

and, for all sufficiently small ε > 0,

$$\chi(\cdot)\rho_\varepsilon(x-\cdot) \in C_0^\infty(\Omega) \quad \text{for all } x \in \mathbb{R}^n.$$

Therefore, by applying condition (7.6) to our situation we obtain that

7.4 Distributions and the Fourier Transform


$$\rho_\varepsilon * f_\chi(x) = 0 \quad \text{for all } x \in \mathbb{R}^n.$$

Hence we have, by assertion (7.7),

$$\|f_\chi\|_{L^1} = \lim_{\varepsilon\downarrow 0}\|\rho_\varepsilon * f_\chi\|_{L^1} = 0.$$

This proves that f_χ(x) = χ(x)f(x) = 0 for almost all x ∈ Rⁿ, so that f(x) = 0 for almost all x ∈ K. The proof of Lemma 7.10 is complete. □



By virtue of Lemma 7.10, we can regard locally integrable functions as distributions. We say that such distributions “are” functions. In particular, the functions in C^m(Ω) (0 ≤ m ≤ ∞) and in L^p(Ω) (1 ≤ p ≤ ∞) are distributions on Ω.

Example 7.11 More generally, every complex Borel measure μ on Ω defines a distribution of order zero on Ω by the formula

$$\langle \mu, \varphi\rangle = \int_\Omega \varphi(x)\,d\mu(x) \quad \text{for every } \varphi \in C_0^\infty(\Omega).$$

In particular, if we take μ to be the point mass at a point x₀ of Ω, we obtain the Dirac measure δ_{x₀} defined by the formula

$$\langle \delta_{x_0}, \varphi\rangle = \varphi(x_0) \quad \text{for every } \varphi \in C_0^\infty(\Omega).$$

In other words, the Dirac measure δ_{x₀} is the point evaluation functional for x₀ ∈ Ω. We denote δ₀ just by δ in the case x₀ = 0.

Example 7.12 Let f(x) be a continuous function on Rⁿ \ {0} which is positively homogeneous of degree −n and has mean zero on the unit sphere Σₙ:

$$f(\lambda x) = \lambda^{-n}f(x) \quad \text{for } x \in \mathbb{R}^n\setminus\{0\} \text{ and } \lambda > 0, \tag{7.8a}$$
$$\int_{\Sigma_n} f(\sigma)\,d\sigma = 0. \tag{7.8b}$$

Here dσ is the surface measure on Σₙ. Then the formula

$$\langle \mathrm{v.p.}\,f(x), \varphi\rangle = \lim_{\varepsilon\downarrow 0}\int_{|x|>\varepsilon} f(x)\varphi(x)\,dx \quad \text{for } \varphi \in C_0^\infty(\mathbb{R}^n)$$


Fig. 7.2 The restriction u|V ∈ D (V ) of u ∈ D (Ω)

Fig. 7.3 The derivative ∂ α u of u ∈ D (Ω)

defines a distribution on Rⁿ. Here “v.p.” stands for Cauchy's “valeur principale” in French. For example (n = 1), the distribution v.p.(1/x) is defined by the formula

$$\left\langle \mathrm{v.p.}\frac{1}{x}, \varphi\right\rangle = \lim_{\varepsilon\downarrow 0}\int_{|x|>\varepsilon}\frac{\varphi(x)}{x}\,dx = \int_0^\infty \frac{\varphi(x)-\varphi(-x)}{x}\,dx = \int_0^\infty\left(\int_{-1}^1 \varphi'(tx)\,dt\right)dx \quad \text{for every } \varphi \in C_0^\infty(\mathbb{R}).$$

We define various operations on distributions.

(a) Restriction: If u ∈ D′(Ω) and V is an open subset of Ω, we define the restriction u|_V of u to V by the formula

$$\langle u|_V, \varphi\rangle = \langle u, \varphi\rangle \quad \text{for every } \varphi \in C_0^\infty(V).$$

Then it follows that u|_V ∈ D′(V) (see Figs. 7.2 and 7.3 above).

(b) Differentiation: The derivative ∂^α u of a distribution u ∈ D′(Ω) is the distribution on Ω defined by the formula

$$\langle \partial^\alpha u, \varphi\rangle = (-1)^{|\alpha|}\langle u, \partial^\alpha\varphi\rangle \quad \text{for every } \varphi \in C_0^\infty(\Omega).$$

For example (n = 1), we have the formulas

(1) Y′(x) = δ(x).
(2) (log|x|)′ = v.p. 1/x.

(c) Multiplication by functions: The product au of a function a ∈ C^∞(Ω) and a distribution u ∈ D′(Ω) is the distribution on Ω defined by the formula (see Fig. 7.4 below)


Fig. 7.4 The product au of a ∈ C ∞ (Ω) and u ∈ D (Ω)

\[
\langle a u, \varphi \rangle = \langle u, a \varphi \rangle \quad \text{for every } \varphi \in C_0^\infty(\Omega).
\]
For example (n = 1), we have the formulas
(1) x δ(x) = 0.
(2) x v.p.(1/x) = 1.
The Leibniz formula for the differentiation of a product remains valid:
\[
D^\beta(a u) = \sum_{\alpha \le \beta} \binom{\beta}{\alpha}\, D^{\beta - \alpha} a \cdot D^\alpha u. \tag{7.9}
\]
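For smooth u the distributional and classical derivatives agree, so formula (7.9) can be spot-checked numerically. The sketch below takes a(x) = sin x, u(x) = eˣ and β = 2, so that (au)″ = a″u + 2a′u′ + au″; the check is written with ordinary derivatives (the factors of −i in the convention D = −i∂ cancel on both sides), and the left-hand side is approximated by a central second difference:

```python
import math

x0, h = 0.7, 1e-4

def second_diff(f):
    # central difference approximating f''(x0)
    return (f(x0 + h) - 2.0 * f(x0) + f(x0 - h)) / (h * h)

# a(x) = sin x, u(x) = exp x  (both smooth)
lhs = second_diff(lambda x: math.sin(x) * math.exp(x))

# Leibniz: (au)'' = a''u + 2 a'u' + a u''
a, da, dda = math.sin(x0), math.cos(x0), -math.sin(x0)
u = du = ddu = math.exp(x0)
rhs = dda * u + 2.0 * da * du + a * ddu
print(lhs, rhs)
```

The two values agree up to the finite-difference error, of order h².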

(d) We can combine operations (b) and (c). We let
\[
P(x, D) = \sum_{|\alpha| \le m} a_\alpha(x)\, D^\alpha \quad \text{with } a_\alpha \in C^\infty(\Omega)
\]
be a differential operator of order m on Ω. If u ∈ D′(Ω), we define P(x, D)u by the formula
\[
\langle P(x, D)u, \varphi \rangle = \Big\langle u, \sum_{|\alpha| \le m} (-1)^{|\alpha|}\, D^\alpha (a_\alpha \varphi) \Big\rangle \quad \text{for every } \varphi \in C_0^\infty(\Omega).
\]
Then it follows that P(x, D)u ∈ D′(Ω).
The function p(x, ξ) = Σ_{|α|≤m} a_α(x) ξ^α is called the complete symbol of P(x, D), and the function p_m(x, ξ) = Σ_{|α|=m} a_α(x) ξ^α is called the principal symbol of P(x, D). If P^{(α)}(x, D) is the differential operator of order m − |α| having complete symbol ∂_ξ^α p(x, ξ), then we have the following generalization of formula (7.9):


\[
P(x, D)(a u) = \sum_{|\alpha| \le m} \frac{1}{\alpha!}\, \big( P^{(\alpha)}(x, D)\, a \big)\, D^\alpha u. \tag{7.10}
\]
This is referred to as the Leibniz–Hörmander formula.
(e) Conjugation: The conjugate ū of a distribution u ∈ D′(Ω) is the distribution on Ω defined by the formula
\[
\langle \bar u, \varphi \rangle = \overline{\langle u, \bar\varphi \rangle} \quad \text{for every } \varphi \in C_0^\infty(\Omega),
\]
where the bar denotes complex conjugation.

7.4.2 Topologies on D′(Ω)

Let Ω be an open subset of Rⁿ. There are two natural topologies on the space D′(Ω) of distributions on Ω:
(1) Weak* topology τ_s: This is the topology of convergence at each element of C₀∞(Ω). The space D′(Ω) endowed with this topology is denoted by D′(Ω)_s. A sequence {u_j} of distributions converges to a distribution u in D′(Ω)_s if and only if the sequence ⟨u_j, ϕ⟩ converges to ⟨u, ϕ⟩ for every ϕ ∈ C₀∞(Ω).
(2) Strong topology τ_b: This is the topology of uniform convergence on all bounded subsets of C₀∞(Ω). The space D′(Ω) endowed with this topology is denoted by D′(Ω)_b. A sequence {u_j} of distributions converges to a distribution u in D′(Ω)_b if and only if ⟨u_j, ϕ⟩ converges to ⟨u, ϕ⟩ uniformly in ϕ over all bounded subsets of C₀∞(Ω).

We list some basic topological properties of D′(Ω):
(I) In the case of a sequence of distributions, the two notions of convergence coincide; that is, u_j → u in D′(Ω)_s if and only if u_j → u in D′(Ω)_b. Let Ω₁ and Ω₂ be open subsets of R^{n₁} and R^{n₂}, respectively, and let A be a linear operator from C₀∞(Ω₂) into D′(Ω₁). Then the continuity of A does not depend on the topology (τ_s or τ_b) on D′(Ω₁). Indeed, A : C₀∞(Ω₂) → D′(Ω₁) is continuous if and only if its restriction to C_{K₂}∞(Ω₂) is continuous for every compact K₂ ⊂ Ω₂; so it suffices to base our reasoning on sequences.
(II) If {u_j} is a sequence in D′(Ω) and the limit
\[
\langle u, \varphi \rangle = \lim_{j \to \infty} \langle u_j, \varphi \rangle
\]
exists for every ϕ ∈ C₀∞(Ω), then it follows that u ∈ D′(Ω). Thus we have u_j → u in D′(Ω)_s and hence in D′(Ω)_b. This is one of the important consequences of the Banach–Steinhaus theorem (see Theorem 5.8).
(III) The strong dual space of D′(Ω)_b can be identified with C₀∞(Ω). This fact is referred to as the reflexivity of C₀∞(Ω).


7.4.3 Support of a Distribution

Let Ω be an open subset of Rⁿ. Two distributions u₁ and u₂ on Ω are said to be equal in an open subset V of Ω if the restrictions u₁|_V and u₂|_V are equal. In particular, we have u = 0 in V if and only if ⟨u, ϕ⟩ = 0 for all ϕ ∈ C₀∞(V). The local behavior of a distribution determines it completely. More precisely, we have the following theorem:

Theorem 7.13 The space D′(Ω) has the sheaf property; this means the following two properties (S1) and (S2):
(S1) If {U_λ}_{λ∈Λ} is an open covering of Ω and if a distribution u ∈ D′(Ω) is zero in each U_λ, then u = 0 in Ω.
(S2) Given an open covering {U_λ}_{λ∈Λ} of Ω and a family of distributions u_λ ∈ D′(U_λ) such that u_λ = u_μ in every U_λ ∩ U_μ, there exists a distribution u ∈ D′(Ω) such that u = u_λ in each U_λ.

Proof Let {ϕ_λ}_{λ∈Λ} be a partition of unity subordinate to the open covering {U_λ}_{λ∈Λ} of Ω (see Sect. 7.7.2). Namely, the family {ϕ_λ}_{λ∈Λ} in C∞(Ω) satisfies the following three conditions (a), (b) and (c):
(a) 0 ≤ ϕ_λ(x) ≤ 1 for all x ∈ Ω and λ ∈ Λ.
(b) supp ϕ_λ ⊂ U_λ for each λ ∈ Λ.
(c) The collection {supp ϕ_λ}_{λ∈Λ} is locally finite and
\[
\sum_{\lambda \in \Lambda} \varphi_\lambda(x) = 1 \quad \text{for every } x \in \Omega.
\]
Here supp ϕ_λ is the support of ϕ_λ, that is, the closure in Ω of the set {x ∈ Ω : ϕ_λ(x) ≠ 0}.
(1) For any given ϕ ∈ C₀∞(Ω), it follows that
\[
\varphi = \sum_{\lambda \in \Lambda} \varphi_\lambda\, \varphi, \qquad \varphi_\lambda\, \varphi \in C_0^\infty(U_\lambda) \ \text{for each } \lambda \in \Lambda.
\]
Here it should be emphasized that the summation Σ_{λ∈Λ} is finite, since supp ϕ is compact. Therefore, since u is zero in each U_λ, we have the assertion
\[
\langle u, \varphi \rangle = \Big\langle u, \sum_{\lambda \in \Lambda} \varphi_\lambda\, \varphi \Big\rangle = \sum_{\lambda \in \Lambda} \langle u, \varphi_\lambda\, \varphi \rangle = 0.
\]
This proves that u = 0 in Ω.
(2) If we define a distribution u ∈ D′(Ω) by the formula
\[
\langle u, \varphi \rangle = \sum_{\lambda \in \Lambda} \langle u_\lambda, \varphi_\lambda\, \varphi \rangle \quad \text{for every } \varphi \in C_0^\infty(\Omega),
\]
then it is easy to verify that u = u_λ in each U_λ.

Fig. 7.5 The approximate function χ_j(x)

The proof of Theorem 7.13 is complete.



If u ∈ D′(Ω), the support of u is the smallest closed subset of Ω outside of which u is zero; it is denoted by supp u. We remark that if ϕ ∈ C₀∞(Ω) is such that supp ϕ ∩ supp u = ∅, then we have ⟨u, ϕ⟩ = 0. It should be emphasized that the present definition of support coincides with the previous one if u is a continuous function on Ω.

Example 7.14 In the case where n = 1, it is easy to verify the following three assertions (1), (2) and (3):
(1) supp δ_{x₀} = {x₀}.
(2) supp Y(x) = [0, ∞).
(3) supp v.p.(1/x) = (−∞, ∞).

7.4.4 Dual Space of C∞(Ω)

Let Ω be an open subset of Rⁿ. The injection of C₀∞(Ω) into C∞(Ω) is continuous, and the space C₀∞(Ω) is a dense subspace of C∞(Ω). Indeed, if {K_j} is an exhaustive sequence of compact subsets of Ω, then by using Corollary 7.3 we can construct a sequence {χ_j} of functions in C₀∞(Ω) such that χ_j(x) = 1 on K_j (see Fig. 7.5). For any given function ϕ ∈ C∞(Ω), it is easy to verify that χ_jϕ → ϕ in C∞(Ω) as j → ∞. Hence the dual space
\[
\mathcal E'(\Omega) = L(C^\infty(\Omega), \mathbb C)
\]
can be identified with a linear subspace of the dual space D′(Ω) = L(C₀∞(Ω), C), by the identification of a continuous linear functional on C∞(Ω) with its restriction to C₀∞(Ω). In other words, the elements of E′(Ω) are precisely those distributions that have continuous extensions to C∞(Ω). The situation can be visualized as in Fig. 7.6 below.


Fig. 7.6 The dual spaces D (Ω) and E  (Ω)

More precisely, we have the following theorem:

Theorem 7.15 (i) The dual space E′(Ω) of C∞(Ω) consists of those elements of D′(Ω) with compact support.
(ii) The dual space E′ᵐ(Ω) of Cᵐ(Ω), 0 ≤ m < ∞, consists of those elements of D′ᵐ(Ω) with compact support, and E′(Ω) = ⋃_{m=0}^∞ E′ᵐ(Ω).

As in the case of D′(Ω), we equip the space E′(Ω) with the two natural topologies τ_s and τ_b, and denote (E′(Ω), τ_s) and (E′(Ω), τ_b) by E′(Ω)_s and E′(Ω)_b, respectively. The space E′(Ω) has the same topological properties as D′(Ω).

7.4.5 Tensor Product of Distributions

Let X and Y be open subsets of Rⁿ and Rᵖ, respectively. If ϕ ∈ C₀∞(X) and ψ ∈ C₀∞(Y), we define the tensor product ϕ⊗ψ of ϕ and ψ by the formula
\[
(\varphi \otimes \psi)(x, y) = \varphi(x)\, \psi(y).
\]
It is clear that ϕ⊗ψ ∈ C₀∞(X×Y). We let
C₀∞(X)⊗C₀∞(Y) = the space of finite linear combinations of the form ϕ⊗ψ, where ϕ ∈ C₀∞(X) and ψ ∈ C₀∞(Y).
The space C₀∞(X)⊗C₀∞(Y) is a linear subspace of C₀∞(X×Y). Furthermore, it is sequentially dense in C₀∞(X×Y); this is the content of the next lemma:

Lemma 7.16 The space C₀∞(X)⊗C₀∞(Y) is sequentially dense in C₀∞(X×Y). Namely, for every function Φ ∈ C₀∞(X×Y) there exists a sequence {Φ_j} in C₀∞(X)⊗C₀∞(Y) such that Φ_j → Φ in C₀∞(X×Y).

Proof Let Φ(x, y) be an arbitrary function in C₀∞(X×Y). We choose a closed cube K of side length T such that supp Φ is contained in the interior of K (see Fig. 7.7 below), and we extend Φ to a periodic function Φ̃ ∈ C∞(Rⁿ × Rᵖ)


Fig. 7.7 The interior of the closed cube K

with period T. Moreover, we choose two functions θ(x) ∈ C∞(Rⁿ) and ζ(y) ∈ C∞(Rᵖ) such that
\[
\operatorname{supp}(\theta \otimes \zeta) \subset K, \qquad \theta(x) \otimes \zeta(y) = 1 \ \text{on } \operatorname{supp}\Phi.
\]
Then we have the formula
\[
\Phi(x, y) = (\theta(x) \otimes \zeta(y))\, \widetilde\Phi(x, y), \tag{7.11}
\]
and the Fourier expansion of Φ̃
\[
\widetilde\Phi(x, y) = \sum_{\alpha \in \mathbf N^n} \sum_{\beta \in \mathbf N^p} c_{\alpha\beta}\, e^{\frac{2\pi i}{T}\alpha\cdot x}\, e^{\frac{2\pi i}{T}\beta\cdot y} \quad \text{for } (x, y) \in \mathbb R^n \times \mathbb R^p.
\]
Here the Fourier coefficients c_{αβ} of Φ̃ are given by the formula
\[
c_{\alpha\beta} = \frac{1}{T^{n+p}} \int_{\mathbb R^n} \int_{\mathbb R^p} e^{-\frac{2\pi i}{T}\alpha\cdot x}\, e^{-\frac{2\pi i}{T}\beta\cdot y}\, \Phi(x, y)\, dx\, dy.
\]
However, we have, by integration by parts,
\[
\alpha^\gamma \beta^\delta\, c_{\alpha\beta}
= \frac{1}{T^{n+p}} \left(\frac{T}{2\pi}\right)^{|\gamma+\delta|} \int_{\mathbb R^n} \int_{\mathbb R^p} e^{-\frac{2\pi i}{T}\alpha\cdot x}\, e^{-\frac{2\pi i}{T}\beta\cdot y}\, D_x^\gamma D_y^\delta \Phi(x, y)\, dx\, dy
= \frac{1}{T^{n+p}} \left(\frac{T}{2\pi}\right)^{|\gamma+\delta|} \int_X \int_Y e^{-\frac{2\pi i}{T}\alpha\cdot x}\, e^{-\frac{2\pi i}{T}\beta\cdot y}\, D_x^\gamma D_y^\delta \Phi(x, y)\, dx\, dy \tag{7.12}
\]
for all multi-indices γ and δ. Hence, for any positive integer N we can find a positive constant C_N such that
\[
(1 + |\alpha| + |\beta|)^N\, |c_{\alpha\beta}| \le C_N \quad \text{for all } (\alpha, \beta) \in \mathbf N^n \times \mathbf N^p. \tag{7.13}
\]
Therefore, we obtain from formulas (7.11), (7.12) and inequality (7.13) that the series
\[
\Phi(x, y) = (\theta(x) \otimes \zeta(y))\, \widetilde\Phi(x, y)
= \sum_{\alpha \in \mathbf N^n} \sum_{\beta \in \mathbf N^p} c_{\alpha\beta}\, \big( \theta(x)\, e^{\frac{2\pi i}{T}\alpha\cdot x} \big) \big( \zeta(y)\, e^{\frac{2\pi i}{T}\beta\cdot y} \big)
\]
converges in the space C₀∞(X × Y). Now the proof of Lemma 7.16 is complete.



The sequential density of C₀∞(X)⊗C₀∞(Y) in C₀∞(X×Y) allows us to obtain the following theorem:

Theorem 7.17 If u ∈ D′(X) and v ∈ D′(Y), there exists a unique distribution u⊗v ∈ D′(X×Y) such that
\[
\langle u \otimes v, \Phi \rangle = \langle u, \varphi \rangle = \langle v, \psi \rangle \quad \text{for all } \Phi \in C_0^\infty(X \times Y),
\]
where ϕ(x) = ⟨v, Φ(x, ·)⟩ and ψ(y) = ⟨u, Φ(·, y)⟩.

The distribution u⊗v is called the tensor product of u and v. We list some basic properties of the tensor product:
(1) ⟨u⊗v, ϕ⊗ψ⟩ = ⟨u, ϕ⟩⟨v, ψ⟩ for all ϕ ∈ C₀∞(X) and ψ ∈ C₀∞(Y).
(2) supp(u⊗v) = supp u × supp v.
(3) D_x^α D_y^β (u⊗v) = D_x^α u ⊗ D_y^β v.

7.4.6 Convolution of Distributions

The Young inequality (Corollary 7.3) tells us that if u ∈ L¹(Rⁿ) and v ∈ Lᵖ(Rⁿ) with 1 ≤ p ≤ ∞, then the convolution
\[
(u * v)(x) = \int_{\mathbb R^n} u(x - y)\, v(y)\, dy
\]
is well defined for almost all x ∈ Rⁿ, and belongs to Lᵖ(Rⁿ). Furthermore, it follows from Fubini's theorem (Theorem 2.18) that
\[
\langle u * v, \varphi \rangle = \iint_{\mathbb R^n \times \mathbb R^n} u(x)\, v(y)\, \varphi(x + y)\, dx\, dy \quad \text{for all } \varphi \in C_0^\infty(\mathbb R^n).
\]


We use this formula to extend the definition of convolution to the case of distributions. Let u, v ∈ D′(Rⁿ), and assume that one of them has compact support. If ϕ ∈ C₀∞(Rⁿ), then the support of the function ϕ̃ : (x, y) ↦ ϕ(x + y) is contained in the strip
\[
\{(x, y) \in \mathbb R^n \times \mathbb R^n : x + y \in \operatorname{supp}\varphi\}.
\]
Thus it is easy to see that the intersection supp(u⊗v) ∩ supp ϕ̃ is a compact subset of Rⁿ×Rⁿ. We choose a function θ ∈ C₀∞(Rⁿ×Rⁿ) such that θ = 1 in a neighborhood of supp(u⊗v) ∩ supp ϕ̃, and define
\[
\langle u \otimes v, \tilde\varphi \rangle = \langle u \otimes v, \theta \tilde\varphi \rangle.
\]
Observe that ⟨u⊗v, θϕ̃⟩ is independent of the function θ chosen, and further that the mapping C₀∞(Rⁿ) ∋ ϕ ↦ ⟨u⊗v, ϕ̃⟩ is continuous. This discussion justifies the following definition:

Definition 7.18 The convolution u∗v of two distributions u and v in D′(Rⁿ), one of which has compact support, is the distribution on Rⁿ defined by the formula
\[
\langle u * v, \varphi \rangle = \langle u \otimes v, \tilde\varphi \rangle \quad \text{for every } \varphi \in C_0^\infty(\mathbb R^n).
\]
We state some basic facts concerning the convolution product:
(1) u∗v = v∗u.
(2) supp(u∗v) ⊂ supp u + supp v = {x + y : x ∈ supp u, y ∈ supp v}.
(3) D^α(u∗v) = (D^α u)∗v = u∗(D^α v).
(4) If either u ∈ D′(Rⁿ), v ∈ C₀∞(Rⁿ) or u ∈ E′(Rⁿ), v ∈ C∞(Rⁿ), then we have the assertions
\[
u * v \in C^\infty(\mathbb R^n), \qquad (u * v)(x) = \langle u_y, v(x - y) \rangle,
\]
where the subscript y means that the distribution u acts on v(x − y) as a function of y, with x fixed.
(5) Let ρ(x) be a non-negative C∞ function on Rⁿ such that


\[
\rho(-x) = \rho(x) \ \text{for all } x \in \mathbb R^n, \qquad \operatorname{supp}\rho = \{x \in \mathbb R^n : |x| \le 1\}, \qquad \int_{\mathbb R^n} \rho(x)\, dx = 1,
\]
and define a function ρ_ε(x) by the formula (see Fig. 7.1)
\[
\rho_\varepsilon(x) = \frac{1}{\varepsilon^n}\, \rho\!\left(\frac{x}{\varepsilon}\right) \quad \text{for every } \varepsilon > 0.
\]
If u ∈ D′(Rⁿ) (resp. u ∈ E′(Rⁿ)), then it follows that the convolutions u∗ρ_ε are in C∞(Rⁿ) and further that we have, for every ϕ ∈ C₀∞(Rⁿ) (resp. ϕ ∈ C∞(Rⁿ)),
\[
\langle u * \rho_\varepsilon, \varphi \rangle = \langle u, \rho_\varepsilon * \varphi \rangle \longrightarrow \langle u, \varphi \rangle \quad \text{as } \varepsilon \downarrow 0.
\]
This proves that u∗ρ_ε → u in D′(Rⁿ) (resp. in E′(Rⁿ)) as ε ↓ 0. Rephrased, distributions can be approximated, in the weak* topology of distributions, by smooth functions. The functions u∗ρ_ε are called regularizations of the distribution u.
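For instance (n = 1), one can regularize the Heaviside function Y: the convolution Y∗ρ_ε is a smooth ramp that rises from 0 to 1 across a layer of width 2ε. A minimal numerical sketch, using the standard bump for ρ (the normalizing constant is computed numerically, and all grid sizes are arbitrary choices of this sketch):

```python
import math

# model bump: rho_raw(x) = exp(-1/(1-x^2)) on (-1, 1), zero outside
def rho_raw(x):
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

# normalize so that the integral of rho is 1
N = 4000
h = 2.0 / N
Z = sum(rho_raw(-1.0 + (i + 0.5) * h) for i in range(N)) * h
def rho(x):
    return rho_raw(x) / Z

def regularized_Y(x, eps):
    # (Y * rho_eps)(x) = ∫_{y < x} rho_eps(y) dy : a smooth ramp
    s, n = 0.0, 2000
    hh = 2.0 * eps / n
    for i in range(n):
        y = -eps + (i + 0.5) * hh
        if y < x:
            s += rho(y / eps) / eps * hh
    return s

print(regularized_Y(-0.5, 0.25), regularized_Y(0.5, 0.25))  # ≈ 0 and ≈ 1
```

Away from the jump the regularization reproduces Y exactly; at the jump it interpolates smoothly, with value 1/2 at x = 0 by the symmetry of ρ.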

7.4.7 The Jump Formula

If x = (x₁, x₂, …, x_n) is a point of Rⁿ, we write
\[
x = (x', x_n), \qquad x' = (x_1, x_2, \ldots, x_{n-1}).
\]
If u ∈ C∞($\overline{\mathbb R^n_+}$), we define its extension u⁰ to the whole space Rⁿ by the formula
\[
u^0(x', x_n) = \begin{cases} u(x', x_n) & \text{for } x_n \ge 0, \\ 0 & \text{for } x_n < 0. \end{cases}
\]
Then it follows that u⁰ is a distribution on Rⁿ, and further that its j-th derivative ∂_n^j(u⁰) with respect to the normal variable x_n is expressed as follows:
\[
\frac{\partial^j (u^0)}{\partial x_n^j} = \left( \frac{\partial^j u}{\partial x_n^j} \right)^{\!0} + \sum_{k=0}^{j-1} \gamma_{j-k-1} u \otimes \delta^{(k)}(x_n), \tag{7.14}
\]
where γ_k u is a C∞ function on R^{n−1}_{x′} defined by the formula


\[
(\gamma_k u)(x') = \frac{\partial^k u}{\partial x_n^k}(x', 0) \quad \text{for every } x' \in \mathbb R^{n-1},
\]
and δ(x_n) is the Dirac measure at 0 on R_{x_n}. Furthermore, if Δ is the usual Laplacian
\[
\Delta = \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2} + \cdots + \frac{\partial^2}{\partial x_n^2},
\]
then we have the following formula:
\[
\Delta(u^0) = (\Delta u)^0 + \gamma_1 u \otimes \delta(x_n) + \gamma_0 u \otimes \delta'(x_n)
= (\Delta u)^0 + \frac{\partial u}{\partial x_n}(x', 0) \otimes \delta(x_n) + u(x', 0) \otimes \delta'(x_n). \tag{7.15}
\]
More generally, if the operator
\[
P(x, D_x) = \sum_{j=0}^{m} P_j(x, D_{x'})\, D_n^j
\]
is a differential operator of order m with C∞ coefficients on Rⁿ, then we have the formula
\[
P(u^0) = (P u)^0 + \frac{1}{\sqrt{-1}} \sum_{\ell + k + 1 \le m} P_{\ell+k+1}(x, D_{x'})\, \gamma_\ell u \otimes D_n^k \delta(x_n). \tag{7.16}
\]
Here P_j(x, D_{x′}) is a differential operator of order m − j with respect to x′. Formula (7.16) will be referred to as the jump formula.
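In one dimension, formula (7.14) with j = 1 reads ⟨(u⁰)′, φ⟩ = ⟨(u′)⁰, φ⟩ + u(0)φ(0): differentiating the cut-off function picks up a boundary term γ₀u · δ. A minimal numerical sketch with u(x) = cos x; the compactly supported test function is replaced by the rapidly decaying φ(x) = e^{−x²} for convenience (an assumption of this sketch):

```python
import math

def phi(x):       # test function; Gaussian decay stands in for compact support
    return math.exp(-x * x)

def dphi(x):      # phi'(x)
    return -2.0 * x * math.exp(-x * x)

def integral(f, a=0.0, b=10.0, n=200000):
    # midpoint rule on [a, b]; tail beyond b is negligible here
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

u, du, u_at_0 = math.cos, lambda x: -math.sin(x), 1.0  # gamma_0 u = u(0) = 1

# <(u^0)', phi> = -∫_0^∞ u phi' dx   versus   <(u')^0, phi> + u(0) phi(0)
lhs = -integral(lambda x: u(x) * dphi(x))
rhs = integral(lambda x: du(x) * phi(x)) + u_at_0 * phi(0.0)
print(lhs, rhs)
```

The boundary term u(0)φ(0) is exactly what integration by parts on the half-line produces, which is the content of the jump formula at order one.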

7.4.8 Regular Distributions with Respect to One Variable

If x = (x₁, x₂, …, x_n) is the variable in Rⁿ, we write x = (x′, x_n), x′ = (x₁, x₂, …, x_{n−1}), so that x′ is the variable in R^{n−1}. A function U(x_n), defined on R with values in D′(R^{n−1}_{x′}), is said to be continuous if, for every φ ∈ C₀∞(R^{n−1}_{x′}), the function ⟨U(x_n), φ⟩ is continuous on R.


We let
C(R; D′(R^{n−1}_{x′})) = the space of D′(R^{n−1}_{x′})-valued continuous functions on R.
If U ∈ C(R; D′(R^{n−1}_{x′})), we can associate with it, injectively, a distribution u ∈ D′(Rⁿ) by the formula
\[
\langle u, \varphi \rangle = \int_{\mathbb R} \langle U(x_n), \varphi(\cdot, x_n) \rangle\, dx_n \quad \text{for every } \varphi \in C_0^\infty(\mathbb R^n).
\]
Such a distribution u is said to be continuous with respect to x_n with values in D′(R^{n−1}_{x′}). We let
\[
\gamma_0 u = U(0) \in \mathcal D'(\mathbb R^{n-1}_{x'}).
\]
The distribution γ₀u is called the sectional trace of order zero of u on the hyperplane {x_n = 0}.
Let k be a positive integer. A function U(x_n), defined on R with values in D′(R^{n−1}_{x′}), is said to be of class C^k if, for every φ ∈ C₀∞(R^{n−1}_{x′}), the function ⟨U(x_n), φ⟩ is of class C^k on R. We let
C^k(R; D′(R^{n−1}_{x′})) = the space of D′(R^{n−1}_{x′})-valued C^k functions on R.
If U ∈ C^k(R; D′(R^{n−1}_{x′})), we have, for 0 ≤ j ≤ k,
\[
\langle \partial_n^j u, \varphi \rangle = \int_{\mathbb R} \langle U^{(j)}(x_n), \varphi(\cdot, x_n) \rangle\, dx_n \quad \text{for } \varphi \in C_0^\infty(\mathbb R^n).
\]
This shows that the distribution ∂_n^j u on Rⁿ is the distribution associated with U^{(j)} ∈ C(R; D′(R^{n−1}_{x′})). We say that u is of class C^k with respect to x_n with values in D′(R^{n−1}_{x′}). We define the sectional trace γ_j u of order j on the hyperplane {x_n = 0} of u by the formula
\[
\gamma_j u = D_n^j U(0) \in \mathcal D'(\mathbb R^{n-1}) \quad \text{for } 0 \le j \le k.
\]
We make no distinction between U and u for notational convenience.
It is obvious what we mean by C^m([0, ∞); D′(R^{n−1}_{x′})), 0 ≤ m ≤ ∞. If u ∈ C([0, ∞); D′(R^{n−1}_{x′})), we define a distribution u⁰ ∈ D′(Rⁿ) by the formula
\[
\langle u^0, \varphi \rangle = \int_0^\infty \langle u(x_n), \varphi(\cdot, x_n) \rangle\, dx_n \quad \text{for every } \varphi \in C_0^\infty(\mathbb R^n).
\]
The distribution u⁰ is the extension of u to the whole space Rⁿ which is equal to zero for x_n < 0. If u ∈ C^m([0, ∞); D′(R^{n−1}_{x′})), we define its sectional traces γ_j u, 0 ≤ j ≤ m, on the hyperplane {x_n = 0} by the formula
\[
\gamma_j u = \lim_{x_n \downarrow 0} D_n^j u(\cdot, x_n) \quad \text{in } \mathcal D'(\mathbb R^{n-1}).
\]
Then it is easy to verify that formula (7.14), and hence the jump formula (7.16), can be extended to the space C^m([0, ∞); D′(R^{n−1}_{x′})).

7.4.9 The Fourier Transform

If f ∈ L¹(Rⁿ), we define its (direct) Fourier transform f̂ by the formula
\[
\hat f(\xi) = \int_{\mathbb R^n} e^{-i x\cdot\xi}\, f(x)\, dx \quad \text{for } \xi = (\xi_1, \xi_2, \ldots, \xi_n), \tag{7.17}
\]
where x·ξ = x₁ξ₁ + x₂ξ₂ + … + x_nξ_n. It follows from an application of the Lebesgue dominated convergence theorem (Theorem 2.12) that the function f̂(ξ) is continuous on Rⁿ, and further we have the inequality
\[
\|\hat f\|_\infty = \sup_{\xi \in \mathbb R^n} |\hat f(\xi)| \le \|f\|_1.
\]

We also denote f̂ by F f.

Example 7.19 If f, g ∈ L¹(Rⁿ), then the Fourier transform of the convolution f∗g is given by the formula
\[
\widehat{f * g}(\xi) = \hat f(\xi)\, \hat g(\xi) \quad \text{for } \xi \in \mathbb R^n.
\]
Indeed, we have, by Fubini's theorem (Theorem 2.18),
\[
\widehat{f * g}(\xi)
= \int_{\mathbb R^n} e^{-i x\cdot\xi} \left( \int_{\mathbb R^n} f(x - y)\, g(y)\, dy \right) dx
= \int_{\mathbb R^n} g(y)\, e^{-i y\cdot\xi}\, dy \cdot \int_{\mathbb R^n} f(x - y)\, e^{-i (x-y)\cdot\xi}\, dx
= \hat f(\xi)\, \hat g(\xi).
\]
Similarly, if g ∈ L¹(Rⁿ), we define the function ǧ(x) by the formula
\[
\check g(x) = \frac{1}{(2\pi)^n} \int_{\mathbb R^n} e^{i x\cdot\xi}\, g(\xi)\, d\xi.
\]

The function ǧ(x) is called the inverse Fourier transform of g. We also denote ǧ by F*g.
Now we introduce a subspace of L¹(Rⁿ) which is invariant under the Fourier transform. We let
S(Rⁿ) = the space of C∞ functions ϕ(x) on Rⁿ such that, for any non-negative integer j, the quantity
\[
p_j(\varphi) = \sup_{\substack{x \in \mathbb R^n \\ |\alpha| \le j}} (1 + |x|^2)^{j/2}\, |\partial^\alpha \varphi(x)|
\]
is finite. The space S(Rⁿ) is called the Schwartz space, or the space of C∞ functions on Rⁿ rapidly decreasing at infinity. We equip the space S(Rⁿ) with the topology defined by the countable family {p_j} of seminorms. It is easy to verify that S(Rⁿ) is complete; so it is a Fréchet space. Now we give typical examples of functions in S(Rⁿ):

We equip the space S(R ) with the topology countable family p j of seminorms. It is easy to verify that S(Rn ) is complete; so it is a Fréchet space. Now we give typical examples of functions in S(Rn ): Example 7.20 (1) For every a > 0, it follows that ϕ(x) = e−a|x| ∈ S(Rn ). 2

The Fourier transform  ϕ (ξ ) of ϕ(x) is given by the formula  ϕ (ξ ) =

Rn

e−i x·ξ e−a|x| d x = 2

 π n/2 a

|ξ |2

e− 4a

for ξ ∈ Rn .

(7.18)

(2) The Fourier transform  K t (ξ ) of the heat kernel K t (x) =

|x|2 1 e− 4t n/2 (4π t)

for x ∈ Rn and t > 0,

is given by the formula  K t (ξ ) =

1 (4π t)n/2

= e−t |ξ |

2



e−i x·ξ e−

|x|2 4t

dx

(7.19)

Rn

for ξ ∈ Rn and t > 0.

Physically, the heat kernel K t (x) expresses a thermal distribution of position x at time t in a homogeneous isotropic medium Rn with unit coefficient of thermal diffusivity, given that the initial thermal distribution is the Dirac measure δ(x) (see Fig. 7.8 below).
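Formula (7.18) is easy to check numerically in dimension n = 1, where the imaginary part of the integrand integrates to zero by symmetry; the parameter values a = 0.5 and ξ = 1.3 below are arbitrary choices of this sketch:

```python
import math

a, xi = 0.5, 1.3   # arbitrary sample values

def integrand(x):
    # real part of e^{-i x xi} e^{-a x^2}
    return math.cos(x * xi) * math.exp(-a * x * x)

# midpoint rule on [-12, 12]; the Gaussian tail beyond is negligible
n, L = 200000, 12.0
h = 2.0 * L / n
numeric = sum(integrand(-L + (i + 0.5) * h) for i in range(n)) * h

closed_form = math.sqrt(math.pi / a) * math.exp(-xi * xi / (4.0 * a))
print(numeric, closed_form)
```

The same computation with a = 1/(4t) verifies formula (7.19) for the heat kernel.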


Fig. 7.8 An intuitive meaning of the heat kernel K t (x)

Proof (1) We have only to prove formula (7.18) for n = 1, since we have the formula
\[
\exp\!\big(-a|x|^2\big) = \exp\!\Big(-a \sum_{j=1}^n x_j^2\Big) = \prod_{j=1}^n \exp\!\big(-a x_j^2\big).
\]
The proof is divided into three steps.
Step 1: If ξ = 0, then formula (7.18) is reduced to the well-known formula
\[
\int_{-\infty}^{\infty} e^{-a x^2}\, dx = \sqrt{\frac{\pi}{a}}.
\]
Step 2: Now we consider the case where ξ < 0. Since the function
\[
\mathbb C \ni z \longmapsto e^{-a z^2}\, e^{-i z \xi}
\]
is an entire function of z = x + iy, it follows from an application of Cauchy's theorem that
\[
0 = \int_{\Gamma_R} e^{-a z^2}\, e^{-i z \xi}\, dz
= \int_{-R}^{R} e^{-a x^2}\, e^{-i x \xi}\, dx
+ \int_{0}^{y_0} e^{-a (R + iy)^2}\, e^{-i (R + iy)\xi}\, i\, dy
+ \int_{R}^{-R} e^{-a (x + i y_0)^2}\, e^{-i (x + i y_0)\xi}\, dx
+ \int_{y_0}^{0} e^{-a (-R + iy)^2}\, e^{-i (-R + iy)\xi}\, i\, dy
:= I + II + III + IV. \tag{7.20}
\]
Here Γ_R is the path consisting of the sides of the rectangle shown in Fig. 7.9 below.

Fig. 7.9 The integral path Γ_R consisting of the rectangle

(a) Since we have, for yξ ≤ 0,
\[
\big| e^{-a(\pm R + iy)^2 - i(\pm R + iy)\xi} \big| = e^{-a R^2 + a y^2 + y\xi} \le e^{-a R^2 + a y^2},
\]
we can estimate the second term II and the fourth term IV as follows:
\[
|II|,\ |IV| \le e^{-a R^2} \int_{0}^{y_0} e^{a y^2}\, dy \longrightarrow 0 \quad \text{as } R \to \infty.
\]
(b) In order to estimate the third term III as R → ∞, we remark that
\[
\int_{R}^{-R} e^{-a (x + i y_0)^2}\, e^{-i (x + i y_0)\xi}\, dx
= \int_{R}^{-R} e^{-a x^2 + a y_0^2 + y_0 \xi}\, e^{-i (2 a y_0 + \xi) x}\, dx.
\]
If we take
\[
y_0 = -\frac{\xi}{2a},
\]
then it follows that
\[
a y_0^2 + y_0 \xi = a \left(-\frac{\xi}{2a}\right)^2 + \left(-\frac{\xi}{2a}\right)\xi = -\frac{\xi^2}{4a}.
\]
Hence the third term III can be estimated as follows:
\[
III = \int_{R}^{-R} e^{-a x^2 + a y_0^2 + y_0 \xi}\, dx
= -\, e^{-\frac{\xi^2}{4a}} \int_{-R}^{R} e^{-a x^2}\, dx
\longrightarrow -\, e^{-\frac{\xi^2}{4a}}\, \sqrt{\frac{\pi}{a}} \quad \text{as } R \to \infty.
\]
Therefore, by letting R → ∞ in formula (7.20) we obtain the desired formula (7.18) for ξ < 0:
\[
\int_{-\infty}^{\infty} e^{-i x\xi}\, e^{-a x^2}\, dx = \sqrt{\frac{\pi}{a}}\, e^{-\frac{\xi^2}{4a}}.
\]
Step 3: The case where ξ > 0 can be treated similarly.
(2) Formula (7.19) follows by applying formula (7.18) with a := 1/(4t) to the heat kernel K_t(x).



The proof of Example 7.20 is complete. The next theorem summarizes the basic properties of the Fourier transform:

Theorem 7.21 (i) The Fourier transforms F and F* map S(Rⁿ) continuously into itself. Furthermore, we have, for all multi-indices α and β,
\[
\widehat{D^\alpha \varphi}(\xi) = \xi^\alpha\, \hat\varphi(\xi) \quad \text{for } \varphi \in \mathcal S(\mathbb R^n),
\]
\[
D^\beta \hat\varphi(\xi) = \widehat{(-x)^\beta \varphi}(\xi) \quad \text{for } \varphi \in \mathcal S(\mathbb R^n).
\]
(ii) The Fourier transforms F and F* are isomorphisms of S(Rⁿ) onto itself; more precisely, FF* = F*F = I on S(Rⁿ). In particular, we have the formula
\[
\varphi(x) = \frac{1}{(2\pi)^n} \int_{\mathbb R^n} e^{i x\cdot\xi}\, \hat\varphi(\xi)\, d\xi \quad \text{for every } \varphi \in \mathcal S(\mathbb R^n). \tag{7.21}
\]
(iii) If ϕ, ψ ∈ S(Rⁿ), we have the formulas
\[
\int_{\mathbb R^n} \varphi(x)\, \hat\psi(x)\, dx = \int_{\mathbb R^n} \hat\varphi(\xi)\, \psi(\xi)\, d\xi, \tag{7.22a}
\]
\[
\int_{\mathbb R^n} \varphi(x)\, \psi(x)\, dx = \frac{1}{(2\pi)^n} \int_{\mathbb R^n} \hat\varphi(\xi)\, \hat\psi(-\xi)\, d\xi, \tag{7.22b}
\]
\[
\int_{\mathbb R^n} \varphi(x)\, \overline{\psi(x)}\, dx = \frac{1}{(2\pi)^n} \int_{\mathbb R^n} \hat\varphi(\xi)\, \overline{\hat\psi(\xi)}\, d\xi. \tag{7.22c}
\]
Formula (7.21) is called the Fourier inversion formula, and formulas (7.22b) and (7.22c) are called the Parseval formulas.

7.4.10 Tempered Distributions

For the three spaces C₀∞(Rⁿ), S(Rⁿ) and C∞(Rⁿ), we have the following two inclusions (i) and (ii):
(i) The injection of C₀∞(Rⁿ) into S(Rⁿ) is continuous, and the space C₀∞(Rⁿ) is dense in S(Rⁿ).
(ii) The injection of S(Rⁿ) into C∞(Rⁿ) is continuous, and the space S(Rⁿ) is dense in C∞(Rⁿ).
Indeed, we take a function ψ ∈ C₀∞(Rⁿ) such that (see Fig. 7.10 below)
\[
\psi(x) = \begin{cases} 1 & \text{if } |x| < 1, \\ 0 & \text{if } |x| > 2, \end{cases}
\]
and let

Fig. 7.10 The function ψ(x)

\[
\psi_j(x) = \psi\!\left(\frac{x}{j}\right) \quad \text{for every integer } j \ge 1.
\]
For any given function ϕ ∈ S(Rⁿ) (resp. ϕ ∈ C∞(Rⁿ)), it is easy to verify that ψ_jϕ → ϕ in S(Rⁿ) (resp. in C∞(Rⁿ)) as j → ∞. Hence the dual space S′(Rⁿ) = L(S(Rⁿ), C) can be identified with a linear subspace of D′(Rⁿ) containing E′(Rⁿ), by the identification of a continuous linear functional on S(Rⁿ) with its restriction to C₀∞(Rⁿ). Namely, we have the inclusions
\[
\mathcal E'(\mathbb R^n) \subset \mathcal S'(\mathbb R^n) \subset \mathcal D'(\mathbb R^n).
\]
The elements of S′(Rⁿ) are called tempered distributions on Rⁿ. In other words, the tempered distributions are precisely those distributions on Rⁿ that have continuous extensions to S(Rⁿ). Roughly speaking, the tempered distributions are those which grow at most polynomially at infinity, since the functions in S(Rⁿ) die out faster than any power of x at infinity. In fact, we have the following examples (1) through (4) of tempered distributions:
(1) The functions in Lᵖ(Rⁿ) (1 ≤ p ≤ ∞) are tempered distributions.
(2) A locally integrable function on Rⁿ is a tempered distribution if it grows at most polynomially at infinity.
(3) If u ∈ S′(Rⁿ) and f(x) is a C∞ function on Rⁿ all of whose derivatives grow at most polynomially at infinity, then the product fu is a tempered distribution.
(4) Any derivative of a tempered distribution is also a tempered distribution.
More precisely, we can prove the following structure theorem for tempered distributions:

Theorem 7.22 (the structure theorem) Let u ∈ S′(Rⁿ). Then there exist a non-negative integer m and functions {f_α}_{|α|≤m} in L∞(Rⁿ) such that


\[
u = \sum_{|\alpha| \le m} \partial_{x_1} \cdots \partial_{x_n}\, \partial_x^\alpha \left( (1 + |x|^2)^{m/2} f_\alpha \right). \tag{7.23}
\]
Proof The proof is divided into three steps.
Step 1: Since u : S(Rⁿ) → C is continuous, we can find a seminorm p_m(·) of S(Rⁿ) and a constant δ > 0 such that
\[
\psi \in \mathcal S(\mathbb R^n),\ p_m(\psi) < \delta \implies |\langle u, \psi \rangle| < 1. \tag{7.24}
\]
Here we recall that
\[
p_m(\psi) = \sup_{\substack{x \in \mathbb R^n \\ |\alpha| \le m}} (1 + |x|^2)^{m/2}\, |\partial^\alpha \psi(x)|.
\]
Let ϕ be an arbitrary non-zero function in S(Rⁿ). By letting
\[
\psi = \frac{\delta}{2}\, \frac{\varphi}{p_m(\varphi)},
\]
we obtain from assertion (7.24) that
\[
|\langle u, \varphi \rangle| \le \frac{2}{\delta}\, p_m(\varphi).
\]
It is known (see Aronszajn–Smith [16]) that the function G_α(x) is represented as follows:
\[
G_\alpha(x) = \frac{1}{2^{(n+\alpha-2)/2}\, \pi^{n/2}\, \Gamma(\alpha/2)}\, K_{(n-\alpha)/2}(|x|)\, |x|^{\frac{\alpha - n}{2}},
\]
where K_{(n−α)/2}(z) is the modified Bessel function of the third kind (cf. Watson [234]).
(e) Riesz kernels:
\[
R_j(x) = -\frac{\Gamma((n+1)/2)}{\pi^{(n+1)/2}}\ \mathrm{v.\,p.}\, \frac{x_j}{|x|^{n+1}} \quad \text{for } 1 \le j \le n.
\]
The distribution v.p. x_j/|x|^{n+1} is an extension of v.p.(1/x) to the multi-dimensional case (see Example 7.12).
(f) In the case n = 1, we have the assertions
(1) Y(x) ∈ S′(R).
(2) v.p.(1/x) ∈ S′(R).
The importance of tempered distributions lies in the fact that they have Fourier transforms. If u ∈ S′(Rⁿ), we define its (direct) Fourier transform Fu = û by the formula
\[
\langle F u, \varphi \rangle = \langle u, F \varphi \rangle \quad \text{for all } \varphi \in \mathcal S(\mathbb R^n). \tag{7.33}
\]
Then it follows that

Fu ∈ S′(Rⁿ), since the Fourier transform F : S(Rⁿ) → S(Rⁿ) is an isomorphism. Furthermore, in view of formula (7.22a) it follows that the above definition (7.33) agrees with definition (7.17) if u ∈ S(Rⁿ). We also denote Fu by û. Similarly, if v ∈ S′(Rⁿ), we define its inverse Fourier transform F*v = v̌ by the formula
\[
\langle F^* v, \psi \rangle = \langle v, F^* \psi \rangle \quad \text{for all } \psi \in \mathcal S(\mathbb R^n).
\]
The next theorem, which is a consequence of Theorem 7.21, summarizes the basic properties of Fourier transforms in the space S′(Rⁿ):

Theorem 7.26 (i) The Fourier transforms F and F* map S′(Rⁿ) continuously into itself. Furthermore, we have, for all multi-indices α and β,
\[
F(D^\alpha u)(\xi) = \xi^\alpha\, F u(\xi) \quad \text{for } u \in \mathcal S'(\mathbb R^n),
\]
\[
D_\xi^\beta (F u(\xi)) = F((-x)^\beta u)(\xi) \quad \text{for } u \in \mathcal S'(\mathbb R^n).
\]
(ii) The Fourier transforms F and F* are isomorphisms of S′(Rⁿ) onto itself; more precisely, FF* = F*F = I on S′(Rⁿ).
(iii) The transforms F and F* are norm-preserving operators on L²(Rⁿ) and FF* = F*F = I on L²(Rⁿ).
(iv) If u, v ∈ L²(Rⁿ), we have the Parseval formulas
\[
\int_{\mathbb R^n} u(x)\, \overline{v(x)}\, dx = \frac{1}{(2\pi)^n} \int_{\mathbb R^n} F u(\xi)\, \overline{F v(\xi)}\, d\xi,
\]
\[
\int_{\mathbb R^n} u(x)\, v(x)\, dx = \frac{1}{(2\pi)^n} \int_{\mathbb R^n} F u(\xi)\, F v(-\xi)\, d\xi.
\]
Assertion (iii) is referred to as the Plancherel theorem. We remark that Theorems 7.21 and 7.26 can be visualized as in Figs. 7.13 and 7.14 below.
7.4.11 Fourier Transform of Tempered Distributions In this subsection we show the Fourier transform of some important examples of tempered distributions. More detailed and concise accounts of this subsection are given in [202, Chap. 5, Sect. 5.4]. First, we consider the distributions v. p. x1 and Y (x) in the case n = 1:

7.4 Distributions and the Fourier Transform

323

Fig. 7.13 The mapping properties of the Fourier transform F

F

S (Rn ) − −−−−→ S (Rn ) ⏐ ⏐

⏐ ⏐ F

L2 (Rn ) − −−−− → L2 (Rn ) ⏐ ⏐

⏐ ⏐

S(Rn ) − −−−− → S(Rn ) F

Fig. 7.14 The mapping properties of the inverse Fourier transform F ∗

F∗

S (Rn ) ← −−−−− S (Rn ) ⏐ ⏐

⏐ ⏐ F∗

L2 (Rn ) ← −−−− − L2 (Rn ) ⏐ ⏐

⏐ ⏐

S(Rn ) ← −−− − − S(Rn ) ∗ F

Example 7.27 (1) For the distribution v.p.(1/x), we have the formula
\[
F\!\left( \mathrm{v.\,p.}\,\frac{1}{x} \right)(\xi) = -\pi i\, \operatorname{sgn}\xi = \begin{cases} -\pi i & \text{for } \xi > 0, \\ \pi i & \text{for } \xi < 0. \end{cases} \tag{7.34}
\]
(2) For the Heaviside function Y(x), we have the formula
\[
(F Y)(\xi) = \hat Y(\xi) = \frac{1}{i}\, \mathrm{v.\,p.}\,\frac{1}{\xi} + \pi\, \delta(\xi). \tag{7.35}
\]
Proof (1) We calculate the Fourier transform of the distribution
\[
h(x) = \frac{1}{\pi}\, \mathrm{v.\,p.}\,\frac{1}{x}.
\]
For 0 < ε < μ, we let
\[
h_{\varepsilon,\mu}(x) = \begin{cases} \dfrac{1}{\pi x} & \text{if } \varepsilon < |x| < \mu, \\ 0 & \text{otherwise}, \end{cases}
\qquad
h_\varepsilon(x) = \begin{cases} \dfrac{1}{\pi x} & \text{if } |x| > \varepsilon, \\ 0 & \text{if } |x| \le \varepsilon. \end{cases}
\]
Then it follows that
\[
\widehat{h_{\varepsilon,\mu}}(\xi) = \int_{\varepsilon < |x| < \mu} \frac{e^{-i x\xi}}{\pi x}\, dx = -\frac{2i}{\pi} \int_{\varepsilon}^{\mu} \frac{\sin(x\xi)}{x}\, dx
\longrightarrow -i\, \operatorname{sgn}\xi \quad \text{as } \varepsilon \downarrow 0 \text{ and } \mu \to \infty,
\]
so that
\[
\hat h(\xi) = -i\, \operatorname{sgn}\xi,
\]
which proves the desired formula (7.34).
(2) Since
\[
\operatorname{sgn} x + 1 = \begin{cases} 2 & \text{for } x > 0, \\ 0 & \text{for } x < 0, \end{cases}
\]
we obtain from part (1) and the inversion formula that
\[
F^*\!\left( \frac{2}{i}\, \mathrm{v.\,p.}\,\frac{1}{\xi} + 2\pi\, \delta(\xi) \right) = \operatorname{sgn} x + 1 = 2\, Y(x). \tag{7.38}
\]
Therefore, by applying the Fourier transform F to both sides of formula (7.38) we obtain from part (ii) of Theorem 7.26 that
\[
(F Y)(\xi) = \frac{1}{2}\, F F^*\!\left( \frac{2}{i}\, \mathrm{v.\,p.}\,\frac{1}{\xi} + 2\pi\, \delta(\xi) \right)
= \frac{1}{i}\, \mathrm{v.\,p.}\,\frac{1}{\xi} + \pi\, \delta(\xi).
\]
The proof of the desired formula (7.35) is complete. The proof of Example 7.27 is now complete.
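The limit behind formula (7.34) is the classical Dirichlet integral ∫₀^∞ (sin u)/u du = π/2: after the substitution u = xξ, the truncated transforms tend to −(2i/π)(π/2) sgn ξ = −i sgn ξ. A rough numerical sketch (the cutoff M and the grid size are arbitrary choices; the oscillatory tail makes convergence slow, of order 1/M):

```python
import math

def dirichlet(M, n):
    # midpoint rule for ∫_0^M sin(u)/u du
    h = M / n
    return sum(math.sin((i + 0.5) * h) / ((i + 0.5) * h) for i in range(n)) * h

I = dirichlet(1000.0, 1_000_000)
print(I, math.pi / 2)   # both close to 1.5708
```

The integrand is bounded at u = 0 (it tends to 1), so no principal value is needed here; the delicacy lies entirely in the conditionally convergent tail.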



Secondly, we give the inverse Fourier transform of some homogeneous functions:


Example 7.28 (i) For λ ≠ −n − 2k with k = 0, 1, 2, …, we let σ(ξ) = |ξ|^λ for ξ ≠ 0. Then its inverse Fourier transform
\[
k(x) := (F^* \sigma)(x) = \frac{1}{(2\pi)^n} \int_{\mathbb R^n} e^{i x\cdot\xi}\, |\xi|^\lambda\, d\xi
\]
is given by the formula
\[
k(x) = \frac{2^\lambda}{\pi^{n/2}}\, \frac{\Gamma\!\left(\frac{n+\lambda}{2}\right)}{\Gamma\!\left(-\frac{\lambda}{2}\right)}\, \frac{1}{|x|^{n+\lambda}} \quad \text{for } x \ne 0.
\]
(ii) The Fourier transforms R̂_α(ξ) and N̂(ξ) are obtained from part (i) by taking λ := α − n and λ := 2 − n, respectively. More precisely, we have the following two formulas (a) and (b):
(a) Riesz potentials:
\[
R_\alpha(x) = \frac{\Gamma((n-\alpha)/2)}{2^\alpha\, \pi^{n/2}\, \Gamma(\alpha/2)}\, \frac{1}{|x|^{n-\alpha}} \quad \text{for } 0 < \alpha < n, \qquad \widehat{R_\alpha}(\xi) = \frac{1}{|\xi|^\alpha}.
\]
(b) Newtonian potentials:
\[
N(x) = \frac{\Gamma((n-2)/2)}{4\, \pi^{n/2}}\, \frac{1}{|x|^{n-2}} \quad \text{for } n \ge 3, \qquad \widehat{N}(\xi) = \frac{1}{|\xi|^2}.
\]

iλ 2 x f (x) = exp 2

. for x ∈ R.

Then its Fourier transform (F f )(ξ ) =  f (ξ ) is given by the formula √

. . i 2π iπ λ  f (ξ ) = √ exp for ξ ∈ R. exp − ξ 2 4 |λ| 2λ |λ|

(7.39)

326

7 Distributions, Operators and Kernels

(2) For any symmetric, non-singular q × q matrix Q, we consider a function -

i Qy, y G(y) = exp 2

. for y ∈ Rq .

 Then its Fourier transform (F G)(η) = G(η) is given by the formula . .  i  iπ (2π )q/2  sign Q exp − Q −1 η, η G(η) =√ for η ∈ Rq . exp 4 2 | det Q|

(7.40)

Here the signature sign Q of Q is the number α of plus ones minus the number β of the minus ones in the diagonalized q × q matrix: sign Q = α − β. A detailed proof of formula (7.39) is given in [202, Chap. 5, Example 5.29]. Furthermore, we give the Fourier transforms of Bessel potentials and Riesz potentials in Examples 7.25 as follows: Example 7.30 (a) Bessel potentials: α (ξ ) = G (b) Riesz kernels:

1 for α > 0. (1 + |ξ |2 )α/2

ξj  for 1 ≤ j ≤ n. R j (ξ ) = i |ξ |

Finally, as for distributions with compact support we have the following theorem: Theorem 7.31 Let u ∈ E  (Rn ). Then we have the following two assertions (i) and (ii): (i) Its Fourier transform (Fu)(ξ ) =  u (ξ ) is a C ∞ function on Rn given by the formula   (7.41) (Fu)(ξ ) = u, e−i x·ξ for every ξ ∈ Rn . (ii) The function Fu(ξ ) is slowly increasing, that is, there exist constants C > 0 and μ ∈ R such that |(Fu)(ξ )| ≤ C(1 + |ξ |)μ for all ξ ∈ Rn . Proof The proof is divided into three steps. Step 1: For every ξ ∈ Rn , we let   φ(ξ ) = u, e−i x·ξ .

(7.42)


Since u ∈ E′(Rⁿ), by considering difference quotients of φ we obtain that φ(ξ) ∈ C∞(Rⁿ), with derivatives given by the formula
\[
(\partial^\alpha \phi)(\xi) = (-i)^{|\alpha|}\, \langle u, x^\alpha e^{-i x\cdot\xi} \rangle \quad \text{for every multi-index } \alpha.
\]
Step 2: Now we prove that φ(ξ) = (Fu)(ξ) for all ξ ∈ Rⁿ, that is, formula (7.41). If ϕ ∈ C₀∞(Rⁿ), then it follows that u∗ϕ ∈ C₀∞(Rⁿ) and further that we have the formula
\[
\widehat{u * \varphi}(\xi)
= \int_{\mathbb R^n} e^{-i x\cdot\xi}\, (u * \varphi)(x)\, dx
= \langle u * \varphi, e^{-i x\cdot\xi} \rangle
= \langle u_x \otimes \varphi_y, e^{-i (x+y)\cdot\xi} \rangle
= \langle u, e^{-i x\cdot\xi} \rangle\, \langle \varphi, e^{-i y\cdot\xi} \rangle
= \langle u, e^{-i x\cdot\xi} \rangle\, \hat\varphi(\xi).
\]
In particular, by taking
\[
\varphi(x) := \rho_\varepsilon(x) = \frac{1}{\varepsilon^n}\, \rho\!\left(\frac{x}{\varepsilon}\right) \quad \text{for } \varepsilon > 0,
\]
we obtain that
\[
\widehat{u * \rho_\varepsilon}(\xi) = \langle u, e^{-i x\cdot\xi} \rangle\, \widehat{\rho_\varepsilon}(\xi). \tag{7.43}
\]
(a) First, by letting ε ↓ 0 in the left-hand side of formula (7.43) we obtain that u∗ρ_ε → u in S′(Rⁿ). This proves that
\[
\widehat{u * \rho_\varepsilon} \longrightarrow F u \quad \text{in } \mathcal S'(\mathbb R^n) \text{ as } \varepsilon \downarrow 0, \tag{7.44}
\]
since F : S′(Rⁿ) → S′(Rⁿ) is continuous.
(b) On the other hand, it follows from an application of the Lebesgue dominated convergence theorem (Theorem 2.12) that
\[
\widehat{\rho_\varepsilon}(\xi) = \int_{|y| \le 1} \rho(y)\, e^{-i \varepsilon y\cdot\xi}\, dy \quad (x = \varepsilon y)
\longrightarrow \int_{|y| \le 1} \rho(y)\, dy = 1
\]
uniformly in ξ over compact subsets of Rⁿ as ε ↓ 0. Hence, by letting ε ↓ 0 in the right-hand side of formula (7.43) we obtain that
\[
\langle u, e^{-i x\cdot\xi} \rangle\, \widehat{\rho_\varepsilon}(\xi) \longrightarrow \langle u, e^{-i x\cdot\xi} \rangle \quad \text{in } \mathcal D'(\mathbb R^n) \text{ as } \varepsilon \downarrow 0. \tag{7.45}
\]
328

7 Distributions, Operators and Kernels

Therefore, the desired formula (7.41) follows by combining assertions (7.44) and (7.45).

Step 3: Finally, we prove that the function $(\mathcal{F}u)(\xi)$ is slowly increasing. Since $u \in \mathcal{E}'(\mathbf{R}^n)$, we can find a compact set $K$, a non-negative integer $\mu$ and a positive constant $\gamma$ such that
\[
  p_{K,\mu}(\psi) = \sup_{\substack{x\in K \\ |\alpha|\le\mu}} |\partial^\alpha\psi(x)| < \gamma \implies |\langle u, \psi\rangle| < 1.
\]
For all $\varphi \in C^\infty(\mathbf{R}^n)$ and $\lambda > 0$, by letting
\[
  \psi(x) = \frac{\gamma}{p_{K,\mu}(\varphi)+\lambda}\,\varphi(x) \quad \text{for } x \in \mathbf{R}^n,
\]
we obtain that
\[
  p_{K,\mu}(\psi) = \gamma\,\frac{p_{K,\mu}(\varphi)}{p_{K,\mu}(\varphi)+\lambda} < \gamma,
\]
so that
\[
  |\langle u, \psi\rangle| = \gamma\,\frac{|\langle u, \varphi\rangle|}{p_{K,\mu}(\varphi)+\lambda} < 1.
\]
This proves that
\[
  |\langle u, \varphi\rangle| \le \frac{1}{\gamma}\,p_{K,\mu}(\varphi) \quad \text{for all } \varphi \in C^\infty(\mathbf{R}^n), \tag{7.46}
\]
since $\lambda > 0$ is arbitrary. Therefore, the desired inequality (7.42) follows by taking $\varphi(x) = e^{-ix\cdot\xi}$ in formula (7.46); indeed, since $\partial_x^\alpha e^{-ix\cdot\xi} = (-i\xi)^\alpha e^{-ix\cdot\xi}$, we have $p_{K,\mu}(e^{-ix\cdot\xi}) \le (1+|\xi|)^{\mu}$.

The proof of Theorem 7.31 is now complete.

Example 7.32 If $\delta_{x_0}$ is the Dirac measure at a point $x_0$ of $\mathbf{R}^n$, then it follows that
\[
  \widehat{\delta_{x_0}}(\xi) = \langle \delta_{x_0}, e^{-ix\cdot\xi}\rangle = e^{-ix_0\cdot\xi} \quad \text{for all } \xi \in \mathbf{R}^n, \tag{7.47}
\]
and further that
\[
  |\widehat{\delta_{x_0}}(\xi)| = 1 \quad \text{for all } \xi \in \mathbf{R}^n.
\]
Therefore, by using Theorem 7.26 and formula (7.47), we have (formally) the Fourier inversion formula
\[
  \delta_{x_0}(x) = \frac{1}{(2\pi)^n} \int_{\mathbf{R}^n} e^{i(x-x_0)\cdot\xi}\,d\xi. \tag{7.48}
\]
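Formula (7.47) can be illustrated numerically. The following sketch (the bump width and tolerances are illustrative choices, not taken from the text) replaces $\delta_{x_0}$ by a narrow Gaussian and checks by quadrature that its Fourier transform is close to $e^{-ix_0\xi}$ in one dimension:

```python
import numpy as np

# Illustrative sketch of formula (7.47): the Fourier transform of a narrow
# Gaussian approximating the Dirac measure delta_{x0} is close to e^{-i x0 xi}.
x0, eps = 0.7, 1e-3
x = np.linspace(x0 - 0.1, x0 + 0.1, 20001)
dx = x[1] - x[0]
rho = np.exp(-(x - x0)**2 / (2.0 * eps**2)) / (eps * np.sqrt(2.0 * np.pi))

for xi in (0.0, 1.0, 5.0):
    # (F u)(xi) = <u, e^{-i x xi}>, computed by quadrature
    Fu = np.sum(rho * np.exp(-1j * x * xi)) * dx
    assert abs(Fu - np.exp(-1j * x0 * xi)) < 1e-3
```

The residual error comes from the Gaussian width $\varepsilon$, which multiplies the exact answer by $e^{-\varepsilon^2\xi^2/2}$; as $\varepsilon \downarrow 0$ the transform converges to $e^{-ix_0\xi}$ at every $\xi$.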


7.5 Operators and Kernels

Let $X$ and $Y$ be open subsets of $\mathbf{R}^n$ and $\mathbf{R}^p$, respectively. In this section we characterize continuous linear operators from $C_0^\infty(Y)$ into $\mathcal{D}'(X)$ in terms of distributions. More detailed accounts of this subject are given in [202, Chap. 5, Sect. 5.5].

First, we consider the following example:

Example 7.33 Let $K \in C^\infty(X\times Y)$. If we define a linear operator $A : C_0^\infty(Y) \to C^\infty(X)$ by the formula
\[
  A\psi(x) = \int_Y K(x,y)\,\psi(y)\,dy \quad \text{for every } \psi \in C_0^\infty(Y),
\]
then it follows that $A$ is continuous. Furthermore, the operator $A$ can be extended to a continuous linear operator
\[
  \widetilde{A} : \mathcal{E}'(Y)_b \longrightarrow C^\infty(X).
\]
Here $\mathcal{E}'(Y)_b$ is the space $\mathcal{E}'(Y)$ endowed with the strong topology (see Sect. 7.4.2).

Proof (1) Let $M$ be an arbitrary compact subset of $Y$. If $\psi \in C_0^\infty(Y)$ with $\operatorname{supp}\psi \subset M$, then we have, for any compact subset $L$ of $X$ and any non-negative integer $j$,
\[
  p_{L,j}(A\psi) = \sup_{\substack{x\in L \\ |\alpha|\le j}} \Bigl|\int_Y \partial_x^\alpha K(x,y)\,\psi(y)\,dy\Bigr|
  \le \sup_{\substack{x\in L \\ |\alpha|\le j}} \int_M |\partial_x^\alpha K(x,y)|\,dy \cdot \sup_{y\in M}|\psi(y)|
  = \sup_{\substack{x\in L \\ |\alpha|\le j}} \int_M |\partial_x^\alpha K(x,y)|\,dy \cdot p_{M,0}(\psi).
\]
This proves the continuity of $A : C_0^\infty(Y) \to C^\infty(X)$.

(2) Furthermore, we can extend $A$ to a continuous linear operator $\widetilde{A} : \mathcal{E}'(Y)_b \to C^\infty(X)$ as follows: If we let
\[
  \widetilde{A}v(x) = \langle v, K(x,\cdot)\rangle \quad \text{for every } v \in \mathcal{E}'(Y),
\]


then it follows that $\widetilde{A}v \in C^\infty(X)$, since $\operatorname{supp} v$ is compact. More precisely, we have, for any compact subset $H$ of $X$ and any non-negative integer $m$,
\[
  p_{H,m}(\widetilde{A}v) = \sup_{\substack{x\in H \\ |\alpha|\le m}} |\partial^\alpha \widetilde{A}v(x)| = \sup_{\substack{x\in H \\ |\alpha|\le m}} |\langle v, \partial_x^\alpha K(x,\cdot)\rangle|,
\]
where the functions
\[
  \{\partial_x^\alpha K(x,\cdot) : x \in H,\ |\alpha| \le m\}
\]
form a bounded subset of $C^\infty(Y)$. However, we recall that a sequence $\{v_j\}$ of distributions converges to a distribution $v$ in the space $\mathcal{E}'(Y)_b$ if and only if the sequence $\langle v_j, \psi\rangle$ converges to $\langle v, \psi\rangle$ uniformly in $\psi$ over all bounded subsets of $C^\infty(Y)$. Therefore, we obtain the continuity of $\widetilde{A} : \mathcal{E}'(Y)_b \to C^\infty(X)$.

The proof of Example 7.33 is complete.

More generally, we have the following example:

Example 7.34 If $K$ is a distribution in $\mathcal{D}'(X\times Y)$, we can define a continuous linear operator
\[
  A \in L\bigl(C_0^\infty(Y), \mathcal{D}'(X)\bigr)
\]
by the formula
\[
  \langle A\psi, \varphi\rangle = \langle K, \varphi\otimes\psi\rangle \quad \text{for all } \varphi \in C_0^\infty(X) \text{ and } \psi \in C_0^\infty(Y).
\]
We then write $A = \operatorname{Op}(K)$.

Proof (1) First, we show that $A\psi \in \mathcal{D}'(X)$. Assume that $\varphi_j \to 0$ in $C_0^\infty(X)$ as $j \to \infty$. Then it follows that $\varphi_j\otimes\psi \to 0$ in $C_0^\infty(X\times Y)$ for every $\psi \in C_0^\infty(Y)$. Hence we have the assertion
\[
  \langle A\psi, \varphi_j\rangle = \langle K, \varphi_j\otimes\psi\rangle \longrightarrow 0 \quad \text{as } j \to \infty.
\]
This proves that $A\psi \in \mathcal{D}'(X)$.

(2) Secondly, we show that $A : C_0^\infty(Y) \to \mathcal{D}'(X)$ is continuous. Assume that $\psi_j \to 0$ in $C_0^\infty(Y)$ as $j \to \infty$. Then it follows that $\varphi\otimes\psi_j \to 0$ in $C_0^\infty(X\times Y)$ for every $\varphi \in C_0^\infty(X)$. Therefore, we have the assertion
\[
  \langle A\psi_j, \varphi\rangle = \langle K, \varphi\otimes\psi_j\rangle \longrightarrow 0 \quad \text{as } j \to \infty,
\]
and hence
\[
  A\psi_j \longrightarrow 0 \quad \text{in } \mathcal{D}'(X).
\]
This proves the continuity of $A : C_0^\infty(Y) \to \mathcal{D}'(X)$.
The proof of Example 7.34 is complete.
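On a grid, the smooth-kernel operator of Example 7.33 becomes a matrix-vector product. The following sketch (the kernel $K$ and density $\psi$ are illustrative choices, not from the text) discretizes $A\psi(x) = \int_Y K(x,y)\psi(y)\,dy$ with trapezoid quadrature weights:

```python
import numpy as np

# Discretized sketch of Example 7.33: the integral operator with smooth
# kernel K(x, y) = x * y acting on psi(y) = 1 over Y = [0, 1].
x = np.linspace(0.0, 1.0, 101)
y = np.linspace(0.0, 1.0, 201)
w = np.full_like(y, y[1] - y[0])
w[0] /= 2.0
w[-1] /= 2.0                       # trapezoid quadrature weights

K = np.outer(x, y)                 # K(x, y) = x * y, a C^infty kernel
psi = np.ones_like(y)              # psi(y) = 1 on [0, 1]

Apsi = K @ (w * psi)               # A psi(x) ≈ ∫_0^1 x*y dy = x/2
assert np.max(np.abs(Apsi - x / 2.0)) < 1e-8
```

The continuity estimate of the proof corresponds, in this discrete picture, to the bound of the matrix-vector product by the row sums of $|K|$ times $\sup|\psi|$.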




We give a simple example of the operators $\operatorname{Op}(K)$:

Example 7.35 Let $D = \{(x,x) : x \in X\}$ be the diagonal in the product space $X\times X$, and define a distribution $\delta_D(x,y) = \delta(x-y) \in \mathcal{D}'(X\times X)$ by the formula
\[
  \langle \delta_D, \Phi\rangle = \int_X \Phi(x,x)\,dx \quad \text{for every } \Phi \in C_0^\infty(X\times X).
\]
Then it follows that $\operatorname{Op}(\delta_D) = I$.

Proof Indeed, it suffices to note that we have, for all $\varphi, \psi \in C_0^\infty(X)$,
\[
  \langle \operatorname{Op}(\delta_D)\psi, \varphi\rangle = \langle \delta_D, \varphi\otimes\psi\rangle = \int_X \varphi(x)\,\psi(x)\,dx = \langle \psi, \varphi\rangle.
\]
This proves that $\operatorname{Op}(\delta_D)\psi = \psi$ for all $\psi \in C_0^\infty(X)$, that is, $\operatorname{Op}(\delta_D) = I$.
The proof of Example 7.35 is complete.



7.5.1 Schwartz's Kernel Theorem

By Lemma 7.16, we know that the space $C_0^\infty(X)\otimes C_0^\infty(Y)$ is sequentially dense in $C_0^\infty(X\times Y)$. Hence it follows that the mapping
\[
  \mathcal{D}'(X\times Y) \ni K \longmapsto \operatorname{Op}(K) \in L\bigl(C_0^\infty(Y), \mathcal{D}'(X)\bigr)
\]
is injective. The next theorem asserts that it is also surjective:

Theorem 7.36 (Schwartz's kernel theorem) If $A$ is a continuous linear operator of $C_0^\infty(Y)$ into $\mathcal{D}'(X)$, then there exists a unique distribution $K \in \mathcal{D}'(X\times Y)$ such that $A = \operatorname{Op}(K)$.

A detailed proof of Theorem 7.36 is given in [202, Chap. 5, Theorem 5.36 and Lemma 5.37]. The distribution $K$ is called the kernel of $A$. Formally, we have the formula
\[
  (A\psi)(x) = \int_Y K(x,y)\,\psi(y)\,dy \quad \text{for all } \psi \in C_0^\infty(Y).
\]

Now we give some important examples of distribution kernels:

Example 7.37 (a) Riesz potentials: $X = Y = \mathbf{R}^n$, $0 < \alpha < n$.
\[
  (-\Delta)^{-\alpha/2}u(x) = R_\alpha * u(x)
  = \frac{\Gamma((n-\alpha)/2)}{2^\alpha \pi^{n/2}\,\Gamma(\alpha/2)} \int_{\mathbf{R}^n} \frac{u(y)}{|x-y|^{n-\alpha}}\,dy \quad \text{for } u \in C_0^\infty(\mathbf{R}^n).
\]


Fig. 7.15 The operators $A$ and $A'$: $\psi \in C_0^\infty(Y) \mapsto A\psi \in \mathcal{D}'(X)$ and $\varphi \in C_0^\infty(X) \mapsto A'\varphi \in \mathcal{D}'(Y)$

(b) Newtonian potentials: $X = Y = \mathbf{R}^n$, $n \ge 3$.
\[
  (-\Delta)^{-1}u(x) = N * u(x)
  = \frac{\Gamma((n-2)/2)}{4\pi^{n/2}} \int_{\mathbf{R}^n} \frac{u(y)}{|x-y|^{n-2}}\,dy \quad \text{for } u \in C_0^\infty(\mathbf{R}^n).
\]
(c) Bessel potentials: $X = Y = \mathbf{R}^n$, $\alpha > 0$.
\[
  (I-\Delta)^{-\alpha/2}u(x) = G_\alpha * u(x)
  = \int_{\mathbf{R}^n} G_\alpha(x-y)\,u(y)\,dy \quad \text{for } u \in C_0^\infty(\mathbf{R}^n).
\]
(d) Riesz operators: $X = Y = \mathbf{R}^n$, $1 \le j \le n$.
\[
  Y_j u(x) = R_j * u(x)
  = i\,\frac{\Gamma((n+1)/2)}{\pi^{(n+1)/2}}\ \mathrm{v.\,p.} \int_{\mathbf{R}^n} \frac{x_j-y_j}{|x-y|^{n+1}}\,u(y)\,dy \quad \text{for } u \in C_0^\infty(\mathbf{R}^n).
\]
(e) The Calderón–Zygmund integro-differential operator: $X = Y = \mathbf{R}^n$.
\[
  (-\Delta)^{1/2}u(x) = \frac{1}{\sqrt{-1}} \sum_{j=1}^n Y_j\Bigl(\frac{\partial u}{\partial x_j}\Bigr)(x)
  = \frac{\Gamma((n+1)/2)}{\pi^{(n+1)/2}}\ \mathrm{v.\,p.} \sum_{j=1}^n \int_{\mathbf{R}^n} \frac{x_j-y_j}{|x-y|^{n+1}}\,\frac{\partial u}{\partial y_j}(y)\,dy
  \quad \text{for } u \in C_0^\infty(\mathbf{R}^n).
\]

If $A : C_0^\infty(Y) \to \mathcal{D}'(X)$ is a continuous linear operator, we define its transpose $A'$ by the formula
\[
  \langle A'\varphi, \psi\rangle = \langle \varphi, A\psi\rangle \quad \text{for all } \varphi \in C_0^\infty(X) \text{ and } \psi \in C_0^\infty(Y).
\]
Then the transpose $A'$ is a continuous linear operator of $C_0^\infty(X)$ into $\mathcal{D}'(Y)$. The operators $A$ and $A'$ can be visualized as in Fig. 7.15 above. The distribution kernel of $A'$ is obtained from the distribution kernel $K(x,y)$ of $A$ by interchanging the roles of $x$ and $y$; formally this means that


Fig. 7.16 The operators $A$ and $A^*$: $\psi \in C_0^\infty(Y) \mapsto A\psi \in \mathcal{D}'(X)$ and $\varphi \in C_0^\infty(X) \mapsto A^*\varphi \in \mathcal{D}'(Y)$









\[
  (A'\varphi)(y) = \int_X K(y,x)\,\varphi(x)\,dx \quad \text{for all } \varphi \in C_0^\infty(X).
\]
Also we have the formula $(A')' = A$.

Similarly, we define the adjoint $A^*$ of $A$ by the formula
\[
  (A^*\varphi, \psi) = (\varphi, A\psi) \quad \text{for all } \varphi \in C_0^\infty(X) \text{ and } \psi \in C_0^\infty(Y),
\]
where $(\,\cdot\,,\,\cdot\,)$ denotes the sesquilinear pairing $(f,g) = \langle f, \bar g\rangle$. Then the adjoint $A^*$ is a continuous linear operator of $C_0^\infty(X)$ into $\mathcal{D}'(Y)$. The operators $A$ and $A^*$ can be visualized as in Fig. 7.16 above. The distribution kernel of $A^*$ is obtained from the distribution kernel $K(x,y)$ by interchanging the roles of $x$ and $y$ and taking complex conjugates; formally this means that
\[
  (A^*\varphi)(y) = \int_X \overline{K(y,x)}\,\varphi(x)\,dx \quad \text{for all } \varphi \in C_0^\infty(X).
\]
We also have the formula $(A^*)^* = A$.

Example 7.38 If $X = Y$ is an open subset $\Omega$ of $\mathbf{R}^n$ and if
\[
  A = \sum_{|\alpha|\le m} a_\alpha(x)\,D^\alpha
\]
is a differential operator of order $m$ with $C^\infty$ coefficients in $\Omega$, then we have the formulas
\[
  A' = \sum_{|\alpha|\le m} (-1)^{|\alpha|}\,D^\alpha\bigl(a_\alpha(x)\,\cdot\,\bigr),
  \qquad
  A^* = \sum_{|\alpha|\le m} (-1)^{|\alpha|}\,D^\alpha\bigl(\overline{a_\alpha(x)}\,\cdot\,\bigr).
\]
This shows that both $A'$ and $A^*$ are differential operators of order $m$ with $C^\infty$ coefficients in $\Omega$.
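The defining identity of the transpose can be checked numerically. The following sketch (the coefficient $a$ and the test functions are illustrative choices) takes $A = a(x)\,d/dx$, a real first order operator with $m = 1$, for which Example 7.38 gives $A'\varphi = -\frac{d}{dx}(a\varphi)$, and verifies $\langle A'\varphi, \psi\rangle = \langle \varphi, A\psi\rangle$ by quadrature:

```python
import numpy as np

# Illustrative check of <A'phi, psi> = <phi, A psi> for A = a(x) d/dx,
# with A'phi = -(a phi)' as in Example 7.38 (real coefficients, m = 1).
x = np.linspace(-5.0, 5.0, 200001)
dx = x[1] - x[0]

bump = lambda c, w: np.exp(-((x - c) / w)**2)   # smooth, rapidly decaying
a   = 1.0 + 0.5 * np.sin(x)                      # C^infty coefficient
phi = bump(-1.0, 0.7)
psi = bump(0.5, 0.9)

Apsi    = a * np.gradient(psi, dx)               # A psi = a psi'
Atr_phi = -np.gradient(a * phi, dx)              # A'phi = -(a phi)'

lhs = np.sum(Atr_phi * psi) * dx                 # <A'phi, psi>
rhs = np.sum(phi * Apsi) * dx                    # <phi, A psi>
assert abs(lhs - rhs) < 1e-6
```

The agreement is exactly the integration-by-parts computation behind the formula for $A'$; the boundary terms vanish because the test functions decay rapidly.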


7.5.2 Regularizers

Let $X$ and $Y$ be open subsets of $\mathbf{R}^n$ and $\mathbf{R}^p$, respectively. A continuous linear operator $A : C_0^\infty(Y) \to \mathcal{D}'(X)$ is called a regularizer if it extends to a continuous linear operator of $\mathcal{E}'(Y)$ into $C^\infty(X)$.

The next theorem gives a characterization of regularizers in terms of distribution kernels:

Theorem 7.39 A continuous linear operator $A : C_0^\infty(Y) \to \mathcal{D}'(X)$ is a regularizer if and only if its distribution kernel $K(x,y)$ is in $C^\infty(X\times Y)$.

Proof The proof is divided into two steps.

Step 1: First, we prove the "if" part. If $A = \operatorname{Op}(K)$ with $K \in C^\infty(X\times Y)$, then it follows from Example 7.33 that $A$ extends to a continuous linear operator $\widetilde{A} : \mathcal{E}'(Y) \to C^\infty(X)$.

Step 2: Secondly, we prove the "only if" part. The proof of Step 2 is divided into four steps.

Step 2-a: We assume that $A = \operatorname{Op}(K)$ extends to a continuous linear operator $\widetilde{A} : \mathcal{E}'(Y) \to C^\infty(X)$. First, by letting
\[
  a(x,y) = \bigl(\widetilde{A}\delta_y\bigr)(x) \quad \text{for every } (x,y) \in X\times Y,
\]
we show that $a \in C(X\times Y)$. Here we remark that $\widetilde{A}\delta_y \in C^\infty(X)$.

Now let $(x_0, y_0)$ be an arbitrary point of $X\times Y$. Then it follows that
\[
  |a(x,y) - a(x_0,y_0)| = \bigl|(\widetilde{A}\delta_y)(x) - (\widetilde{A}\delta_{y_0})(x_0)\bigr|
  \le \bigl|(\widetilde{A}\delta_y)(x) - (\widetilde{A}\delta_{y_0})(x)\bigr| + \bigl|(\widetilde{A}\delta_{y_0})(x) - (\widetilde{A}\delta_{y_0})(x_0)\bigr|. \tag{7.49}
\]
However, since $\widetilde{A}\delta_{y_0} \in C^\infty(X)$, for any given $\varepsilon > 0$ we can find a constant $\eta' = \eta'(x_0,\varepsilon) > 0$ such that
\[
  |x - x_0| \le \eta' \implies \bigl|(\widetilde{A}\delta_{y_0})(x) - (\widetilde{A}\delta_{y_0})(x_0)\bigr| \le \frac{\varepsilon}{2}. \tag{7.50}
\]
Moreover, since $\widetilde{A} : \mathcal{E}'(Y) \to C^\infty(X)$ is continuous, for any given $\varepsilon > 0$ we can find a constant $\eta = \eta(x_0,\varepsilon,\eta') > 0$ such that
\[
  |y - y_0| \le \eta \implies \sup_{|x-x_0|\le\eta'} \bigl|(\widetilde{A}\delta_y)(x) - (\widetilde{A}\delta_{y_0})(x)\bigr|
  = p_{\{|x-x_0|\le\eta'\},\,0}\bigl(\widetilde{A}\delta_y - \widetilde{A}\delta_{y_0}\bigr) \le \frac{\varepsilon}{2}. \tag{7.51}
\]
Indeed, it suffices to note that $y \to y_0$ in $Y$ implies that $\delta_y \to \delta_{y_0}$ in $\mathcal{E}'(Y)$.

Therefore, by combining inequalities (7.49), (7.50) and (7.51) we obtain that
\[
  \sup_{\substack{|x-x_0|\le\eta' \\ |y-y_0|\le\eta}} |a(x,y) - a(x_0,y_0)| \le \varepsilon.
\]
This proves that $a(x,y) = (\widetilde{A}\delta_y)(x) \in C(X\times Y)$.

Step 2-b: Secondly, we show that $a \in C^\infty(X\times Y)$. Since $\widetilde{A} : \mathcal{E}'(Y) \to C^\infty(X)$ is continuous, it is easy to verify that we have, for all multi-indices $\alpha$ and $\beta$,
\[
  \partial_x^\alpha\partial_y^\beta a(x,y) = (-1)^{|\beta|}\,\partial_x^\alpha\bigl(\widetilde{A}\delta_y^{(\beta)}\bigr)(x) \quad \text{in } \mathcal{D}'(X\times Y). \tag{7.52}
\]
However, by arguing just as in Step 2-a we find that
\[
  \partial_x^\alpha\partial_y^\beta a(x,y) = (-1)^{|\beta|}\,\partial_x^\alpha\bigl(\widetilde{A}\delta_y^{(\beta)}\bigr)(x) \in C(X\times Y).
\]
This proves that $a(x,y) = (\widetilde{A}\delta_y)(x) \in C^\infty(X\times Y)$.

Step 2-c: Finally, we show that $a(x,y) = K(x,y)$. To do this, we need the following lemma:

Lemma 7.40 The linear combinations of distributions of the form $\delta_y^{(\beta)}$ are dense in the space $\mathcal{E}'(Y)$.

Proof Let $v$ be an arbitrary distribution of $\mathcal{E}'(Y)$. By Theorem 7.24 (the structure theorem for distributions with compact support) we can find a compact neighborhood $V$ of $\operatorname{supp} v$ and a non-negative integer $m$ such that
\[
  v = \partial^{m+2}G,
\]
where $G \in C(Y)$ with $\operatorname{supp} G \subset V$. For every function $\psi \in C^\infty(Y)$, we have the formula
\[
  \langle v, \psi\rangle = \langle \partial^{m+2}G, \psi\rangle = (-1)^{m+2}\,\langle G, \partial^{m+2}\psi\rangle
  = (-1)^{m+2}\int_Y G(y)\,\partial^{m+2}\psi(y)\,dy.
\]
The integrand $G(y)\,\partial^{m+2}\psi(y)$ is continuous and supported in the compact subset $V$, so the integral can be approximated by Riemann sums. More precisely, for each large number $N \in \mathbf{N}$ we can cover $\operatorname{supp} G$ by a union of cubes of side length $2^{-N}$ and volume $2^{-nN}$, centered at points $y_1^N, y_2^N, \ldots, y_{k(N)}^N \in \operatorname{supp} G$. Then we find that the corresponding Riemann sums
\[
  S_N = \frac{(-1)^{m+2}}{2^{nN}} \sum_{j=1}^{k(N)} G(y_j^N)\,\partial^{m+2}\psi(y_j^N)
\]
come from distributions supported in the common compact subset $V$, and converge to $\langle v, \psi\rangle$ as $N \to \infty$. Hence we have, for every $\psi \in C^\infty(Y)$,
\[
  \langle v, \psi\rangle = \lim_{N\to\infty} S_N
  = \lim_{N\to\infty} \Bigl\langle \frac{1}{2^{nN}} \sum_{j=1}^{k(N)} G(y_j^N)\,\delta_{y_j^N}^{(m+2)},\ \psi \Bigr\rangle.
\]
This proves that
\[
  \frac{1}{2^{nN}} \sum_{j=1}^{k(N)} G(y_j^N)\,\delta_{y_j^N}^{(m+2)} \longrightarrow v \quad \text{in } \mathcal{E}'(Y) \text{ as } N \to \infty.
\]
The proof of Lemma 7.40 is complete.



Step 2-d: If $A_1 = \operatorname{Op}(a)$, it follows from Step 1 that $A_1 : C_0^\infty(Y) \to \mathcal{D}'(X)$ is a regularizer. Namely, it extends to a continuous linear operator $\widetilde{A}_1 : \mathcal{E}'(Y) \to C^\infty(X)$. Hence we have, by formula (7.52),
\[
  \widetilde{A}_1\delta_y^{(\beta)} = \bigl\langle a(\cdot, y), \delta_y^{(\beta)}\bigr\rangle = (-1)^{|\beta|}\,\partial_y^\beta a(\cdot, y) = \widetilde{A}\delta_y^{(\beta)}.
\]
However, by Lemma 7.40 it follows that the linear combinations of distributions of type $\delta_y^{(\beta)}$ are dense in the space $\mathcal{E}'(Y)$. Therefore, we obtain from the continuity of $\widetilde{A}_1$ and $\widetilde{A}$ that
\[
  \operatorname{Op}(a) = \widetilde{A}_1 = \widetilde{A} = \operatorname{Op}(K) \quad \text{on } \mathcal{E}'(Y).
\]
By the uniqueness of kernels, this implies that $a(x,y) = K(x,y)$.

Summing up, we have proved that $A = \operatorname{Op}(K)$ with $K \in C^\infty(X\times Y)$.
Now the proof of Theorem 7.39 is complete.
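The Riemann-sum approximation of Lemma 7.40 can be illustrated in one dimension. In the sketch below (the functions $G$ and $\psi$ are illustrative choices, with $m+2 = 2$ and $n = 1$, so the sign $(-1)^{m+2}$ is $+1$), the action of $v = G''$ on a test function is recovered from finite combinations of second derivatives of Dirac measures:

```python
import numpy as np

# 1-D sketch of Lemma 7.40: <G'', psi> = <G, psi''> is approximated by
#   S_N = 2^{-N} sum_j G(y_j) psi''(y_j) = < 2^{-N} sum_j G(y_j) delta''_{y_j}, psi >.
G     = lambda y: np.maximum(0.0, 1.0 - y**2)**3   # continuous, supp G in [-1, 1]
d2psi = lambda y: -4.0 * np.cos(2.0 * y)           # psi'' for psi = cos(2y)

yy = np.linspace(-1.0, 1.0, 400001)                # fine grid for <G, psi''>
exact = np.sum(G(yy) * d2psi(yy)) * (yy[1] - yy[0])

errs = []
for N in (6, 10, 14):
    h = 2.0**(-N)                                  # cubes (intervals) of side 2^{-N}
    y_j = np.arange(-1.0, 1.0, h) + h / 2.0        # centers covering supp G
    S_N = h * np.sum(G(y_j) * d2psi(y_j))
    errs.append(abs(S_N - exact))

assert errs[-1] < errs[0]                           # Riemann sums converge
assert errs[-1] < 1e-6
```

The shrinking error reflects the convergence $2^{-nN}\sum_j G(y_j^N)\delta_{y_j^N}^{(m+2)} \to v$ in $\mathcal{E}'(Y)$ asserted in the lemma.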



7.6 Layer Potentials

The purpose of this section is to describe the classical layer potentials arising in the Dirichlet and Neumann problems for the Laplacian $\Delta$ in the case of the half-space $\mathbf{R}^n_+$.

7.6.1 Single and Double Layer Potentials

Recall that the Newtonian potential is defined by the formula (see Example 7.25)


\[
  (-\Delta)^{-1}f(x) = N * f(x)
  = \frac{\Gamma((n-2)/2)}{4\pi^{n/2}} \int_{\mathbf{R}^n} \frac{f(y)}{|x-y|^{n-2}}\,dy
  = \frac{1}{(n-2)\,\omega_n} \int_{\mathbf{R}^n} \frac{f(y)}{|x-y|^{n-2}}\,dy \quad \text{for } f \in C_0^\infty(\mathbf{R}^n). \tag{7.53}
\]
Here
\[
  \omega_n = \frac{2\pi^{n/2}}{\Gamma(n/2)} \quad \text{for } n \ge 2
\]
is the surface area of the unit sphere in $\mathbf{R}^n$. In the case $n = 3$, we have the classical Newtonian potential
\[
  u(x) = \frac{1}{4\pi} \int_{\mathbf{R}^3} \frac{f(y)}{|x-y|}\,dy.
\]
Up to an appropriate constant of proportionality, the Newtonian potential
\[
  \frac{1}{4\pi}\,\frac{1}{|x-y|}
\]
is the gravitational potential at position $x$ due to a unit point mass at position $y$, and so the function $u(x)$ is the gravitational potential due to a continuous mass distribution with density $f(x)$. In terms of electrostatics, the function $u(x)$ describes the electrostatic potential due to a charge distribution with density $f(x)$.

We define a single layer potential with density $\varphi$ by the formula
\[
  N * \bigl(\varphi(x')\otimes\delta(x_n)\bigr)
  = \frac{1}{(n-2)\,\omega_n} \int_{\mathbf{R}^{n-1}} \frac{\varphi(y')}{(|x'-y'|^2+x_n^2)^{(n-2)/2}}\,dy' \quad \text{for } \varphi \in C_0^\infty(\mathbf{R}^{n-1}). \tag{7.54}
\]
In the case $n = 3$, the function $N * (\varphi\otimes\delta)$ is related to the distribution of electric charge on a conductor $\Omega$. In equilibrium, mutual repulsion causes all the charge to reside on the surface $\partial\Omega$ of the conducting body with density $\varphi$, and $\partial\Omega$ is an equipotential surface.

We define a double layer potential with density $\psi$ by the formula
\[
  N * \bigl(\psi(x')\otimes\delta'(x_n)\bigr)
  = \frac{1}{\omega_n} \int_{\mathbf{R}^{n-1}} \frac{x_n\,\psi(y')}{(|x'-y'|^2+x_n^2)^{n/2}}\,dy' \quad \text{for } \psi \in C_0^\infty(\mathbf{R}^{n-1}). \tag{7.55}
\]
In the case $n = 3$, the function $N * (\psi\otimes\delta')$ is the potential induced by a distribution of dipoles on $\mathbf{R}^2$ with density $\psi(y')$, the axes of the dipoles being normal to $\mathbf{R}^2$.
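The far-field behavior of the Newtonian potential (7.53) matches the point-mass picture above, and this is easy to check numerically. In the sketch below (the density $f$ and all parameters are illustrative choices, $n = 3$), the potential of a radial bump evaluated far from its support agrees with $M/(4\pi|x|)$, $M$ being the total mass:

```python
import numpy as np

# Illustrative check of (7.53) for n = 3: a compactly supported radial density
# produces, outside its support, the potential of a point mass at the origin.
g = np.linspace(-1.0, 1.0, 41)
dV = (g[1] - g[0])**3
X, Y, Z = np.meshgrid(g, g, g, indexing="ij")
r2 = X**2 + Y**2 + Z**2
f = np.where(r2 < 1.0, (1.0 - r2)**2, 0.0)        # smooth bump in the unit ball
mass = np.sum(f) * dV

x0 = np.array([10.0, 0.0, 0.0])                    # evaluation point, far away
dist = np.sqrt((X - x0[0])**2 + (Y - x0[1])**2 + (Z - x0[2])**2)
u = np.sum(f / dist) * dV / (4.0 * np.pi)          # (-Delta)^{-1} f at x0

point_mass = mass / (4.0 * np.pi * np.linalg.norm(x0))
assert abs(u - point_mass) / point_mass < 1e-2
```

For a radial density the agreement is exact in the continuum (Newton's theorem); the small residual here is pure quadrature error.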


7.6.2 The Green Representation Formula

By applying the Newtonian potential to both sides of the jump formula (7.22), we obtain that
\[
\begin{aligned}
  u^0 = N * \bigl((-\Delta)u^0\bigr)
  &= N * \bigl((-\Delta u)^0\bigr) - N * \bigl(\gamma_1 u\otimes\delta(x_n)\bigr) - N * \bigl(\gamma_0 u\otimes\delta'(x_n)\bigr) \\
  &= -\int_{\mathbf{R}^n} N(x-y)\,\Delta u(y)\,dy
     - \int_{\mathbf{R}^{n-1}} N(x'-y', x_n)\,\frac{\partial u}{\partial y_n}(y', 0)\,dy'
     + \int_{\mathbf{R}^{n-1}} \frac{\partial N}{\partial y_n}(x'-y', x_n)\,u(y', 0)\,dy'.
\end{aligned}
\]
Hence we arrive at the Green representation formula:
\[
\begin{aligned}
  u(x) = {}& \frac{1}{(2-n)\,\omega_n} \int_{\mathbf{R}^n_+} \frac{\Delta u(y)}{|x-y|^{n-2}}\,dy \\
  &+ \frac{1}{(2-n)\,\omega_n} \int_{\mathbf{R}^{n-1}} \frac{1}{(|x'-y'|^2+x_n^2)^{(n-2)/2}}\,\frac{\partial u}{\partial y_n}(y', 0)\,dy' \\
  &+ \frac{1}{\omega_n} \int_{\mathbf{R}^{n-1}} \frac{x_n}{(|x'-y'|^2+x_n^2)^{n/2}}\,u(y', 0)\,dy' \quad \text{for } x \in \mathbf{R}^n_+. \tag{7.56}
\end{aligned}
\]
By formulas (7.53), (7.54) and (7.55), we find that the first term is the Newtonian potential and the second and third terms are the single and double layer potentials, respectively.

On the other hand, it is easy to verify that if $\varphi(x')$ is bounded and continuous on $\mathbf{R}^{n-1}$, then the function
\[
  u(x', x_n) = \frac{2}{\omega_n} \int_{\mathbf{R}^{n-1}} \frac{x_n\,\varphi(y')}{(|x'-y'|^2+x_n^2)^{n/2}}\,dy' \tag{7.57}
\]
is well-defined for $(x', x_n) \in \mathbf{R}^n_+$, and is a (unique) solution of the homogeneous Dirichlet problem for the Laplacian
\[
  \begin{cases} \Delta u = 0 & \text{in } \mathbf{R}^n_+, \\ u = \varphi & \text{on } \mathbf{R}^{n-1}. \end{cases} \tag{DP}
\]
Formula (7.57) is called the Poisson integral formula for the solution of the Dirichlet problem. Furthermore, by using the Fourier transform we can express formula (7.57) for $\varphi \in \mathcal{S}(\mathbf{R}^{n-1})$ as follows:


\[
  u(x', x_n) = \frac{1}{(2\pi)^{n-1}} \int_{\mathbf{R}^{n-1}} e^{ix'\cdot\xi'}\,e^{-x_n|\xi'|}\,\hat\varphi(\xi')\,d\xi'. \tag{7.58}
\]
In order to prove formula (7.58), we need the following elementary formula (see e.g. [202, Chap. 5, Lemma 5.42]; [209, Chap. 6, Claim 6.1]):

Lemma 7.41 For any $\beta > 0$, we have the formula
\[
  e^{-\beta} = \frac{1}{\sqrt{\pi}} \int_0^\infty \frac{e^{-s}}{\sqrt{s}}\,e^{-\beta^2/4s}\,ds. \tag{7.59}
\]

By applying Fubini's theorem and formula (7.59) with $\beta := x_n|\xi'|$, we obtain that
\[
\begin{aligned}
  \frac{1}{(2\pi)^{n-1}} &\int_{\mathbf{R}^{n-1}} e^{ix'\cdot\xi'}\,e^{-x_n|\xi'|}\,\hat\varphi(\xi')\,d\xi' \\
  &= \frac{1}{(2\pi)^{n-1}} \int_{\mathbf{R}^{n-1}} \varphi(y') \Bigl(\int_{\mathbf{R}^{n-1}} e^{i(x'-y')\cdot\xi'}\,e^{-x_n|\xi'|}\,d\xi'\Bigr) dy' \\
  &= \frac{1}{\pi^{n/2}} \int_{\mathbf{R}^{n-1}} \varphi(y')\,\frac{1}{x_n^{\,n-1}} \Bigl(\int_0^\infty e^{-s\,(1+|x'-y'|^2/x_n^2)}\,s^{n/2-1}\,ds\Bigr) dy' \\
  &= \frac{1}{\pi^{n/2}} \int_{\mathbf{R}^{n-1}} \varphi(y')\,\frac{x_n^{\,n}}{x_n^{\,n-1}\,(|x'-y'|^2+x_n^2)^{n/2}} \Bigl(\int_0^\infty e^{-\sigma}\,\sigma^{n/2-1}\,d\sigma\Bigr) dy'
  \qquad \bigl(\sigma = s\,(1+|x'-y'|^2/x_n^2)\bigr) \\
  &= \frac{\Gamma(n/2)}{\pi^{n/2}} \int_{\mathbf{R}^{n-1}} \frac{x_n\,\varphi(y')}{(|x'-y'|^2+x_n^2)^{n/2}}\,dy'
  = \frac{2}{\omega_n} \int_{\mathbf{R}^{n-1}} \frac{x_n\,\varphi(y')}{(|x'-y'|^2+x_n^2)^{n/2}}\,dy'. \tag{7.60}
\end{aligned}
\]
This proves the desired formula (7.58).
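The subordination formula (7.59) is easy to verify numerically. In the sketch below (the sample values of $\beta$ and the tolerance are illustrative), the substitution $s = t^2$ removes the integrable singularity at $s = 0$ before quadrature:

```python
import numpy as np

# Illustrative check of Lemma 7.41, formula (7.59):
#   e^{-beta} = (1/sqrt(pi)) * ∫_0^infty e^{-s} s^{-1/2} e^{-beta^2/(4s)} ds.
# With s = t^2 the integrand becomes 2 e^{-t^2} e^{-beta^2/(4 t^2)}, smooth on (0, ∞).
t = np.linspace(1e-6, 12.0, 200001)
dt = t[1] - t[0]
for beta in (0.5, 1.0, 3.0):
    integrand = 2.0 * np.exp(-t**2) * np.exp(-beta**2 / (4.0 * t**2))
    val = np.sum(integrand) * dt / np.sqrt(np.pi)
    assert abs(val - np.exp(-beta)) < 1e-6
```

This is the identity that converts the factor $e^{-x_n|\xi'|}$ into Gaussians in $\xi'$, which is what makes the inner $\xi'$-integral in (7.60) computable in closed form.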

7.6.3 Approximation to the Identity via Dirac Measure

The purpose of this subsection is to prove Table 7.1.

Table 7.1 Approximations to the identity via Dirac measure

| Approximation functions | Integral formulas | Associated problems |
|---|---|---|
| $e^{-x_n\lvert\xi'\rvert}$ | Formulas (7.57) and (7.58) | Dirichlet problem (DP) |
| $e^{-t\lvert\xi\rvert^2}$ | Formulas (7.62) and (7.63) | Initial-value problem (IVP) |

7.6.3.1 Poisson Kernel and Dirac Measure

Poisson integral formula (7.57) is based on the following approximation formula for the Dirac measure $\delta(x'-y')$:
\[
  \delta(x'-y') = \frac{1}{(2\pi)^{n-1}} \lim_{x_n\downarrow 0} \int_{\mathbf{R}^{n-1}} e^{i(x'-y')\cdot\xi'}\,e^{-x_n|\xi'|}\,d\xi'. \tag{7.61}
\]
Indeed, we find from formula (7.60) and the Fourier inversion formula (7.48) with $n := n-1$, $x_0 := 0$ and $x := x'-y'$ that
\[
  \lim_{x_n\downarrow 0} \frac{2}{\omega_n}\,\frac{x_n}{(|x'-y'|^2+x_n^2)^{n/2}}
  = \lim_{x_n\downarrow 0} \frac{1}{(2\pi)^{n-1}} \int_{\mathbf{R}^{n-1}} e^{i(x'-y')\cdot\xi'}\,e^{-x_n|\xi'|}\,d\xi'
  = \delta(x'-y').
\]
Hence we have (formally) the assertion
\[
  \lim_{x_n\downarrow 0} u(x', x_n)
  = \lim_{x_n\downarrow 0} \frac{2}{\omega_n} \int_{\mathbf{R}^{n-1}} \frac{x_n\,\varphi(y')}{(|x'-y'|^2+x_n^2)^{n/2}}\,dy'
  = \int_{\mathbf{R}^{n-1}} \delta(x'-y')\,\varphi(y')\,dy' = \varphi(x').
\]
We remark that formula (7.61) is an approximation to the identity, since the Dirac measure $\delta_D(x', y') = \delta(x'-y') \in \mathcal{D}'(\mathbf{R}^{n-1}\times\mathbf{R}^{n-1})$ is the distribution kernel of the identity operator (see Example 7.35).
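The boundary limit $u(x', x_n) \to \varphi(x')$ can be observed directly. In the sketch below (the boundary datum $\varphi$ and the parameters are illustrative choices, with $n = 2$ so that $\omega_2 = 2\pi$ and the Poisson kernel in (7.57) reads $\frac{1}{\pi}\,\frac{x_n}{(x'-y')^2+x_n^2}$), the Poisson integral is computed for shrinking $x_n$:

```python
import numpy as np

# Illustrative sketch of the approximation to the identity (7.61) for n = 2:
# u(x', x_n) from the Poisson integral (7.57) approaches phi(x') as x_n ↓ 0.
phi = lambda y: np.cos(y) / (1.0 + y**2)     # bounded continuous boundary datum

y = np.linspace(-60.0, 60.0, 600001)
dy = y[1] - y[0]
xp = 0.3                                      # boundary point x'

errs = []
for xn in (0.5, 0.05, 0.005):
    kernel = (1.0 / np.pi) * xn / ((xp - y)**2 + xn**2)
    u = np.sum(kernel * phi(y)) * dy          # Poisson integral (7.57)
    errs.append(abs(u - phi(xp)))

assert errs[-1] < errs[0]                      # the error shrinks with x_n
assert errs[-1] < 0.05
```

The kernel concentrates at $y' = x'$ with unit total mass as $x_n \downarrow 0$, which is exactly the approximate-identity mechanism behind (7.61).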

7.6.3.2 Heat Kernel and Dirac Measure

As in Sect. 6.4.2, we can introduce a one-parameter family $\{T_t\}_{t\ge 0}$ of bounded linear operators on the Banach space $C_0(\mathbf{R}^n)$ (see Sect. 6.4.2), associated with the heat kernel
\[
  K_t(x) = \frac{1}{(4\pi t)^{n/2}}\,e^{-|x|^2/4t} \quad \text{for } t > 0 \text{ and } x \in \mathbf{R}^n,
\]
by the formula
\[
  T_t u(x) =
  \begin{cases}
    u(x) & \text{for } t = 0, \\
    \displaystyle\int_{\mathbf{R}^n} K_t(x-y)\,u(y)\,dy & \text{for } t > 0.
  \end{cases}
\]
Then we can prove that the function
\[
  w(x,t) := T(t)u(x) = \frac{1}{(4\pi t)^{n/2}} \int_{\mathbf{R}^n} e^{-|x-y|^2/4t}\,u(y)\,dy \tag{7.62}
\]


is a solution of the following initial-value problem for the heat equation:
\[
  \begin{cases}
    \dfrac{\partial w}{\partial t} - \Delta w = 0 & \text{in } \mathbf{R}^n\times(0,\infty), \\
    w(\cdot, 0) = u & \text{on } \mathbf{R}^n.
  \end{cases} \tag{IVP}
\]
Roughly speaking, we may express the function $w(x,t)$ in the form
\[
  w(x,t) = T(t)u(x) = e^{t\Delta}u(x).
\]
Indeed, we obtain from formula (7.26) and the Fourier inversion formula (7.48) with $x_0 := 0$ and $x := x-y$ that
\[
  \lim_{t\downarrow 0} K_t(x-y) = \lim_{t\downarrow 0} \frac{1}{(2\pi)^n} \int_{\mathbf{R}^n} e^{i(x-y)\cdot\xi}\,e^{-t|\xi|^2}\,d\xi = \delta(x-y). \tag{7.63}
\]
Hence, we have (formally) the assertion
\[
  \lim_{t\downarrow 0} w(x,t) = \lim_{t\downarrow 0} \int_{\mathbf{R}^n} K_t(x-y)\,u(y)\,dy
  = \int_{\mathbf{R}^n} \delta(x-y)\,u(y)\,dy = u(x).
\]
It should be emphasized that formula (7.63) is an approximation to the identity, since the Dirac measure $\delta_D(x,y) = \delta(x-y) \in \mathcal{D}'(\mathbf{R}^n\times\mathbf{R}^n)$ is the distribution kernel of the identity operator (see Example 7.35).
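Both properties of the heat semigroup, solving (IVP) and recovering the initial datum, can be checked numerically. In the sketch below (the initial datum $u$ and all tolerances are illustrative choices, $n = 1$), the PDE residual is estimated by finite differences at one interior point, and the limit $T_t u \to u$ as $t \downarrow 0$ is verified:

```python
import numpy as np

# Illustrative sketch of (7.62) and (IVP) in one dimension: the convolution
# with the heat kernel solves the heat equation and recovers u as t ↓ 0.
u = lambda x: np.exp(-x**2) * np.sin(3.0 * x)

y = np.linspace(-12.0, 12.0, 240001)
dy = y[1] - y[0]

def w(x, t):
    # heat semigroup T_t u, formula (7.62) with n = 1
    Kt = np.exp(-(x - y)**2 / (4.0 * t)) / np.sqrt(4.0 * np.pi * t)
    return np.sum(Kt * u(y)) * dy

x0, t0, h = 0.4, 0.2, 1e-3
w_t  = (w(x0, t0 + h) - w(x0, t0 - h)) / (2.0 * h)
w_xx = (w(x0 + h, t0) - 2.0 * w(x0, t0) + w(x0 - h, t0)) / h**2
assert abs(w_t - w_xx) < 1e-4               # heat equation residual is small

assert abs(w(x0, 1e-5) - u(x0)) < 1e-3       # T_t u -> u as t ↓ 0
```

The second assertion is the approximate-identity statement (7.63) in action: for small $t$ the kernel $K_t$ concentrates at $y = x$ with unit mass.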

7.7 Distribution Theory on a Manifold

This section is a summary of the basic definitions and results of the theory of distributions on a manifold. The virtue of manifold theory is that it provides geometric insight into the study of distributions, and intrinsic properties of distributions may be revealed.

7.7.1 Densities on a Manifold

Now let $M$ be an $n$-dimensional $C^\infty$ manifold. We remark that if $(U,\varphi)$ is a chart on $M$ with $\varphi(x) = (x_1, x_2, \ldots, x_n)$, then the density
\[
  |dx_1\wedge\cdots\wedge dx_n|
\]
is a basis of the space $\Omega(T_x^*(M))$ of densities on the tangent space $T_x(M)$ of $M$ at each point $x$ of $U$. We let
\[
  \Omega(T^*(M)) = \bigsqcup_{x\in M} \Omega(T_x^*(M))
\]
be the disjoint union of the spaces $\Omega(T_x^*(M))$, and define a mapping
\[
  |\pi| : \Omega(T^*(M)) \longrightarrow M
\]
by the formula
\[
  |\pi|(\rho) = x \quad \text{if } \rho \in \Omega(T_x^*(M)),
\]
and define a mapping
\[
  |\varphi| : |\pi|^{-1}(U) \longrightarrow \varphi(U)\times\mathbf{R}^2
\]
by the formula
\[
  |\varphi|(\rho) = \Bigl(\varphi(x),\ \rho\Bigl(\frac{\partial}{\partial x_1}\wedge\cdots\wedge\frac{\partial}{\partial x_n}\Bigr)\Bigr) \quad \text{if } |\pi|(\rho) = x.
\]
Here we identify $\mathbf{C}$ with $\mathbf{R}^2$.

We can make $\Omega(T^*(M))$ into an $(n+2)$-dimensional $C^\infty$ manifold by giving natural charts for it. Indeed, if $(U,\varphi)$ is a chart on $M$ with $\varphi(x) = (x_1, x_2, \ldots, x_n)$, then the family of pairs $(|\pi|^{-1}(U), |\varphi|)$, where $(U,\varphi)$ ranges over all admissible charts, is an atlas on $\Omega(T^*(M))$. We call $\Omega(T^*(M))$ the fiber bundle of densities on the tangent spaces of $M$.

A $C^\infty$ density on $M$ is a $C^\infty$ mapping $\rho : M \to \Omega(T^*(M))$ such that $\rho(x) \in \Omega(T_x^*(M))$ for each $x \in M$. The set $C^\infty(|M|)$ of all $C^\infty$ densities on $M$ is a complex linear space with the obvious operations of addition and scalar multiplication.

7.7.2 Distributions on a Manifold

Let $M$ be an $n$-dimensional $C^\infty$ manifold (without boundary) which satisfies the second axiom of countability; hence $M$ is paracompact. It is well known that every paracompact $C^\infty$ manifold $M$ has a partition of unity $\{\varphi_\lambda\}_{\lambda\in\Lambda}$ subordinate to any given open covering $\{U_\lambda\}_{\lambda\in\Lambda}$ of $M$. Namely, the family $\{\varphi_\lambda\}_{\lambda\in\Lambda}$ in $C^\infty(M)$ satisfies the following three conditions (PU1), (PU2) and (PU3):

(PU1) $0 \le \varphi_\lambda(x) \le 1$ for all $x \in M$ and $\lambda \in \Lambda$.
(PU2) $\operatorname{supp}\varphi_\lambda \subset U_\lambda$ for each $\lambda \in \Lambda$.
(PU3) The collection $\{\operatorname{supp}\varphi_\lambda\}_{\lambda\in\Lambda}$ is locally finite and
\[
  \sum_{\lambda\in\Lambda} \varphi_\lambda(x) = 1 \quad \text{for every } x \in M.
\]

We let
\[
  C^\infty(M) = \text{the space of } C^\infty \text{ functions on } M.
\]

We equip the space $C^\infty(M)$ with the topology defined by the family of seminorms
\[
  \varphi \longmapsto p(\varphi\circ\chi^{-1}) \quad \text{for } \varphi \in C^\infty(M),
\]
where $(U,\chi)$ ranges over all admissible charts on $M$ and $p$ ranges over all seminorms on $C^\infty(\chi(U))$ such as formula (7.2). By using a partition of unity, we can verify that the topology on $C^\infty(M)$ is defined by the family of seminorms associated with an atlas on $M$ alone. However, since $M$ satisfies the second axiom of countability, there exists an atlas on $M$ consisting of countably many charts. This shows that $C^\infty(M)$ is metrizable. Furthermore, it is easy to see that $C^\infty(M)$ is complete; hence it is a Fréchet space.

If $K$ is a compact subset of $M$, we let
\[
  C_K^\infty(M) = \text{the space of } C^\infty \text{ functions on } M \text{ with support in } K.
\]
The space $C_K^\infty(M)$ is a closed subspace of $C^\infty(M)$. Furthermore, we let
\[
  C_0^\infty(M) = \bigcup_{K\subset M} C_K^\infty(M),
\]
where $K$ ranges over all compact subsets of $M$. We equip the space $C_0^\infty(M)$ with the inductive limit topology of the spaces $C_K^\infty(M)$.

We let
\[
  C^\infty(|M|) = \text{the space of } C^\infty \text{ densities on } M,
\]
\[
  C_0^\infty(|M|) = \text{the space of } C^\infty \text{ densities on } M \text{ with compact support}.
\]
Since $M$ is paracompact, it is known that $M$ admits a strictly positive $C^\infty$ density $\mu$. Hence we can identify $C^\infty(|M|)$ with $C^\infty(M)$ as linear topological spaces by the isomorphism
\[
  C^\infty(M) \ni \varphi \longmapsto \varphi\cdot\mu \in C^\infty(|M|).
\]
Similarly, the space $C_0^\infty(|M|)$ can be identified with the space $C_0^\infty(M)$.

A distribution on $M$ is a continuous linear functional on $C_0^\infty(|M|)$. The space of distributions on $M$ is denoted by $\mathcal{D}'(M)$; namely, $\mathcal{D}'(M)$ is the dual space of $C_0^\infty(|M|)$: $\mathcal{D}'(M) = L(C_0^\infty(|M|), \mathbf{C})$. If $\varphi \in C_0^\infty(M)$ and $u \in \mathcal{D}'(M)$, we denote the action of $u$ on $\varphi\cdot\mu$ by $\langle u, \varphi\cdot\mu\rangle$ or sometimes by $\langle \varphi\cdot\mu, u\rangle$.

A function $u$ defined on $M$ is said to be in $L^1_{\mathrm{loc}}(M)$ if, for any admissible chart $(U,\chi)$ on $M$, the local representative $u\circ\chi^{-1}$ of $u$ is in $L^1_{\mathrm{loc}}(\chi(U))$. The elements of $L^1_{\mathrm{loc}}(M)$ are called locally integrable functions on $M$. Every element $u$ of $L^1_{\mathrm{loc}}(M)$ defines a distribution on $M$ by the formula
\[
  \langle u, \varphi\cdot\mu\rangle = \int_M u\,\varphi\cdot\mu \quad \text{for every } \varphi \in C_0^\infty(M).
\]

We list some basic properties of distributions on a manifold:

(1) If $V$ is an open subset of $M$, then a distribution $u \in \mathcal{D}'(M)$ defines a distribution $u|_V \in \mathcal{D}'(V)$ by restriction to $C_0^\infty(|V|)$.
(2) The space $\mathcal{D}'(M)$ has the sheaf property; this means the following two properties (S1) and (S2):
(S1) If $\{U_\lambda\}_{\lambda\in\Lambda}$ is an open covering of $M$ and if a distribution $u \in \mathcal{D}'(M)$ is zero in each $U_\lambda$, then $u = 0$ in $M$.
(S2) Given an open covering $\{U_\lambda\}_{\lambda\in\Lambda}$ of $M$ and a family of distributions $u_\lambda \in \mathcal{D}'(U_\lambda)$ such that $u_\lambda = u_\mu$ in every $U_\lambda\cap U_\mu$, there exists a distribution $u \in \mathcal{D}'(M)$ such that $u = u_\lambda$ in each $U_\lambda$.
(3) The space of distributions with compact support can be identified with the dual space $\mathcal{E}'(M)$ of $C^\infty(|M|)$.

We have the same topological properties of $\mathcal{D}'(M)$ and $\mathcal{E}'(M)$ as those of $\mathcal{D}'(\Omega)$ and $\mathcal{E}'(\Omega)$ stated in Sect. 7.4.

7.7.3 Differential Operators on a Manifold

Let $M$ be an $n$-dimensional $C^\infty$ manifold (without boundary). If $P$ is a linear mapping of $C^\infty(M)$ into itself and if $(U,\chi)$ is a chart on $M$, we let
\[
  \chi_* P = \chi_* \circ (P|_U) \circ \chi^*,
\]
where $P|_U$ is the restriction of $P$ to $U$, $\chi^* v = v\circ\chi$ is the pull-back of $v$ by $\chi$, and $\chi_* u = u\circ\chi^{-1}$ is the push-forward of $u$ by $\chi$, respectively. Then it follows that $\chi_* P$ is a linear mapping of $C^\infty(\chi(U))$ into itself. The situation can be visualized in the commutative diagram of Fig. 7.17 below.


Fig. 7.17 The differential operator $\chi_* P$: the diagram with horizontal arrows $P|_U : C^\infty(U) \to C^\infty(U)$ and $\chi_* P : C^\infty(\chi(U)) \to C^\infty(\chi(U))$, and vertical arrows $\chi_*$, commutes

A continuous linear mapping $P : C^\infty(M) \to C^\infty(M)$ is called a differential operator of order $m$ on $M$ if, for any chart $(U,\chi)$ on $M$, the mapping $\chi_* P$ is a differential operator of order $m$ on $\chi(U) \subset \mathbf{R}^n$.

Example 7.42 Let $(M,g)$ be an $n$-dimensional, Riemannian smooth manifold. The Laplace–Beltrami operator, or simply the Laplacian, $\Delta_M$ of $M$ is a second order differential operator defined (in local coordinates) by the formula
\[
  \Delta_M f = \operatorname{div}(\operatorname{grad} f)
  = \frac{1}{\sqrt{\det(g_{ij})}} \sum_{k,\ell=1}^n \frac{\partial}{\partial x_k}\Bigl(\sqrt{\det(g_{ij})}\,g^{k\ell}\,\frac{\partial f}{\partial x_\ell}\Bigr)
  \quad \text{for every } f \in C^\infty(M),
\]
where
\[
  g_{ij} = g\Bigl(\frac{\partial}{\partial x_i}, \frac{\partial}{\partial x_j}\Bigr),
  \qquad (g^{ij}) = \text{the inverse matrix of } (g_{ij}).
\]
If $M = \mathbf{R}^n$ with the standard Euclidean metric $(g_{ij}) = (\delta_{ij})$, then the Laplace–Beltrami operator $\Delta_M$ becomes the usual Laplacian
\[
  \Delta = \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2} + \cdots + \frac{\partial^2}{\partial x_n^2}.
\]
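The coordinate formula of Example 7.42 can be tested on a concrete manifold. In the sketch below (the manifold, metric and test function are illustrative choices), we take the unit sphere $S^2$ with the round metric $(g_{ij}) = \operatorname{diag}(1, \sin^2\theta)$ in spherical coordinates $(\theta, \phi)$ and apply the formula to $f = \cos\theta$, which is an eigenfunction with $\Delta_M f = -2\cos\theta$:

```python
import numpy as np

# Illustrative check of the Laplace-Beltrami formula on S^2 with the round
# metric: Delta_M f = (1/sqrt(det g)) sum_k d_k( sqrt(det g) g^{kl} d_l f ).
theta = np.linspace(0.2, np.pi - 0.2, 20001)   # stay away from the poles
dth = theta[1] - theta[0]

f = np.cos(theta)                               # f is independent of phi
sqrtg = np.sin(theta)                           # sqrt(det g) = sin(theta)

flux = sqrtg * 1.0 * np.gradient(f, dth)        # sqrt(g) g^{theta theta} df/dtheta
lap = np.gradient(flux, dth) / sqrtg            # phi-derivatives of f vanish

interior = slice(10, -10)                       # drop one-sided boundary stencils
assert np.max(np.abs(lap[interior] + 2.0 * f[interior])) < 1e-4
```

The eigenvalue $-2 = -\ell(\ell+1)$ with $\ell = 1$ is recovered to finite-difference accuracy, confirming that the divergence-form expression really is the intrinsic Laplacian in these coordinates.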

7.7.4 Operators and Kernels on a Manifold

Let $M$ and $N$ be $C^\infty$ manifolds equipped with strictly positive densities $\mu$ and $\nu$, respectively. If $K \in \mathcal{D}'(M\times N)$, we can define a continuous linear operator
\[
  A : C_0^\infty(N) \longrightarrow \mathcal{D}'(M)
\]
by the formula
\[
  \langle A\psi, \varphi\cdot\mu\rangle = \langle K, \varphi\cdot\mu\otimes\psi\cdot\nu\rangle \quad \text{for all } \varphi \in C_0^\infty(M) \text{ and } \psi \in C_0^\infty(N).
\]
If $A : C_0^\infty(N) \to \mathcal{D}'(M)$ is a continuous linear operator, we define its transpose $A'$ by the formula
\[
  \langle A'\varphi, \psi\cdot\nu\rangle = \langle \varphi\cdot\mu, A\psi\rangle \quad \text{for all } \varphi \in C_0^\infty(M) \text{ and } \psi \in C_0^\infty(N).
\]
Then the transpose $A'$ is a continuous linear operator of $C_0^\infty(M)$ into $\mathcal{D}'(N)$. Also we have $(A')' = A$.

Similarly, we define the adjoint $A^*$ of $A$ by the formula
\[
  (A^*\varphi, \psi\cdot\nu) = (\varphi\cdot\mu, A\psi) \quad \text{for all } \varphi \in C_0^\infty(M) \text{ and } \psi \in C_0^\infty(N),
\]
with the sesquilinear pairing as in Sect. 7.5. Then the adjoint $A^*$ is a continuous linear operator of $C_0^\infty(M)$ into $\mathcal{D}'(N)$, and we have $(A^*)^* = A$.

It should be emphasized that the results in Sect. 7.5 extend to this case.

7.8 Domains of Class $C^r$

In this last section we introduce the notion of domains of class $C^r$ from the viewpoint of manifold theory.

An open set $\Omega$ in $\mathbf{R}^n$ is called a Lipschitz hypograph if its boundary $\partial\Omega$ can be represented as the graph of a Lipschitz continuous function. Namely, there exists a Lipschitz continuous function $\zeta : \mathbf{R}^{n-1} \to \mathbf{R}$ such that (see Fig. 7.18)
\[
  \Omega = \{x = (x', x_n) \in \mathbf{R}^n : x_n < \zeta(x'),\ x' \in \mathbf{R}^{n-1}\}. \tag{7.64}
\]

An open subset of $\mathbf{R}^n$ is called a domain if it is also connected.

Let $0 \le r \le \infty$. A domain $\Omega$ in $\mathbf{R}^n$ with boundary $\partial\Omega$ is said to be of class $C^r$, or a $C^r$ domain, if, at each point $x_0$ of $\partial\Omega$, there exist a neighborhood $U$ of $x_0$ in $\mathbf{R}^n$ and a bijection $\chi$ of $U$ onto $B = \{x = (x_1, x_2, \ldots, x_n) \in \mathbf{R}^n : |x| < 1\}$ such that (see Fig. 7.19 below)

Fig. 7.18 The Lipschitz hypograph Ω


Fig. 7.19 The coordinate neighborhood (U, χ)

\[
  \chi(U\cap\Omega) = B\cap\{x_n > 0\}, \qquad \chi(U\cap\partial\Omega) = B\cap\{x_n = 0\}, \qquad \chi \in C^r(U), \quad \chi^{-1} \in C^r(B).
\]
More precisely, a $C^r$ domain is an $n$-dimensional $C^r$ manifold with boundary (see Sect. 8.1).

Sometimes a different smoothness condition will be needed, so we broaden the above terminology as follows: for any non-negative integer $k$ and any $0 < \theta \le 1$, we say that the domain $\Omega$ defined by formula (7.64) is a $C^{k,\theta}$ hypograph if the function $\zeta$ is of class $C^{k,\theta}$, that is, if $\zeta$ is of class $C^k$ and its $k$-th order partial derivatives are Hölder continuous with exponent $\theta$.

The next definition requires that, roughly speaking, the boundary of $\Omega$ can be represented locally as the graph of a Lipschitz continuous function, by using different systems of Cartesian coordinates for different parts of the boundary:

Definition 7.43 Let $\Omega$ be a bounded domain in Euclidean space $\mathbf{R}^n$ with boundary $\partial\Omega$. We say that $\Omega$ is a Lipschitz domain if there exist finite families $\{U_j\}_{j=1}^J$ and $\{\Omega_j\}_{j=1}^J$ having the following three properties (i), (ii) and (iii) (see Fig. 7.20):

(i) The family $\{U_j\}_{j=1}^J$ is a finite open covering of $\partial\Omega$.
(ii) Each $\Omega_j$ can be transformed to a Lipschitz hypograph by a rigid motion, that is, by a rotation plus a translation.
(iii) The set $\Omega$ satisfies the conditions

J (i) The family U j j=1 is a finite open covering of ∂Ω. (ii) Each Ω j can be transformed to a Lipschitz hypograph by a rigid motion, that is, by a rotation plus a translation. (iii) The set Ω satisfies the conditions

Fig. 7.20 The families $U_j$ and $\Omega_j$ in Definition 7.43


\[
  U_j\cap\Omega = U_j\cap\Omega_j \quad \text{for } 1 \le j \le J.
\]
In the obvious way, we define a domain of class $C^{k,\theta}$, or $C^{k,\theta}$ domain, by substituting "$C^{k,\theta}$" for "Lipschitz" throughout Definition 7.43. It should be emphasized that a Lipschitz domain is the same thing as a $C^{0,1}$ domain.

If $\Omega$ is a Lipschitz hypograph defined by formula (7.64), then we remark that its boundary
\[
  \partial\Omega = \{x = (x', \zeta(x')) : x' \in \mathbf{R}^{n-1}\}
\]
is an $(n-1)$-dimensional $C^{0,1}$ submanifold of $\mathbf{R}^n$ if we apply the following Rademacher theorem (see [119, Corollary 1.73], [174, Theorem]):

Theorem 7.44 (Rademacher) Any Lipschitz continuous function on $\mathbf{R}^n$ admits $L^\infty$ first partial derivatives almost everywhere in $\mathbf{R}^n$.

Indeed, it follows from an application of Rademacher's theorem that the function $\zeta(x')$ is Fréchet differentiable almost everywhere in $\mathbf{R}^{n-1}$ with $\|\nabla\zeta\|_{L^\infty(\mathbf{R}^{n-1})} \le C$,

\[
  \|\nabla\zeta\|_{L^\infty(\mathbf{R}^{n-1})} \le C, \tag{7.65}
\]
where $C$ is any Lipschitz constant for the function $\zeta(x')$. Hence the Riemannian metric $(h_{ij})$ of $\partial\Omega$ is given by the formula
\[
  \begin{pmatrix}
    h_{11} & h_{12} & \cdots & h_{1\,n-1} \\
    h_{21} & h_{22} & \cdots & h_{2\,n-1} \\
    \vdots & \vdots & \ddots & \vdots \\
    h_{n-1\,1} & h_{n-1\,2} & \cdots & h_{n-1\,n-1}
  \end{pmatrix}
  =
  \begin{pmatrix}
    1+\zeta_{x_1}^2 & \zeta_{x_1}\zeta_{x_2} & \cdots & \zeta_{x_1}\zeta_{x_{n-1}} \\
    \zeta_{x_2}\zeta_{x_1} & 1+\zeta_{x_2}^2 & \cdots & \zeta_{x_2}\zeta_{x_{n-1}} \\
    \vdots & \vdots & \ddots & \vdots \\
    \zeta_{x_{n-1}}\zeta_{x_1} & \zeta_{x_{n-1}}\zeta_{x_2} & \cdots & 1+\zeta_{x_{n-1}}^2
  \end{pmatrix},
\]
where $\zeta_{x_i} = \partial\zeta/\partial x_i$ for $1 \le i \le n-1$.

It is easy to see that
\[
  \det(h_{ij}) = 1 + \zeta_{x_1}^2 + \cdots + \zeta_{x_{n-1}}^2 = 1 + |\nabla\zeta(x')|^2.
\]
Therefore, we obtain that the boundary $\partial\Omega$ has the surface measure $d\sigma$ and that the unit exterior normal $\nu$ exists $d\sigma$-almost everywhere in $\mathbf{R}^{n-1}$ (see Fig. 7.21 below), where $d\sigma$ and $\nu$ are given respectively by the formulas
\[
  d\sigma = \sqrt{1+|\nabla\zeta(x')|^2}\,dx',
  \qquad
  \nu = \frac{(-\nabla\zeta(x'),\,1)}{\sqrt{1+|\nabla\zeta(x')|^2}}.
\]


Fig. 7.21 The unit exterior normal ν to ∂Ω

Here it should be noticed that we have, by inequality (7.65),
\[
  1 \le \sqrt{1+|\nabla\zeta(x')|^2} \le \sqrt{1+C^2},
\]
so that the surface measure $d\sigma$ is equivalent locally to the Lebesgue measure $dx'$.
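The surface-measure formula can be illustrated on a genuinely Lipschitz (but not $C^1$) graph. In the sketch below (the graph $\zeta$ is an illustrative choice), the hypograph boundary is the cone $\zeta(x') = -|x'|$ in $\mathbf{R}^3$, with Lipschitz constant $C = 1$; there the density $d\sigma/dx' = \sqrt{1+|\nabla\zeta|^2}$ equals $\sqrt{2}$ almost everywhere, so the lateral area of the cone over the unit disk is $\sqrt{2}\,\pi$:

```python
import numpy as np

# Illustrative sketch: for zeta(x') = -|x'| (a cone, Lipschitz constant C = 1),
# dsigma/dx' = sqrt(1 + |grad zeta|^2) = sqrt(2) a.e., and it lies between
# 1 and sqrt(1 + C^2), as in the displayed inequality above.
g = np.linspace(-1.0, 1.0, 2001)
h = g[1] - g[0]
X1, X2 = np.meshgrid(g, g, indexing="ij")
r = np.sqrt(X1**2 + X2**2)

C = 1.0
density = np.sqrt(1.0 + C**2)                # sqrt(1 + |grad zeta|^2) a.e.
area = np.sum((r < 1.0) * density) * h**2    # surface area over the unit disk
assert 1.0 <= density <= np.sqrt(1.0 + C**2)
assert abs(area - np.sqrt(2.0) * np.pi) < 2e-2
```

The two-sided bound on the density is exactly the local equivalence of $d\sigma$ and $dx'$ stated above; the gradient fails to exist only on the negligible set $\{x' = 0\}$, as Rademacher's theorem permits.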

7.9 The Seeley Extension Theorem

The next theorem, due to Seeley [165, Theorem], shows that if a domain $\Omega$ is of class $C^\infty$, then the functions in $C^\infty(\overline{\Omega})$ are the restrictions to $\overline{\Omega}$ of functions in $C^\infty(\mathbf{R}^n)$:

Theorem 7.45 (Seeley) Let $\Omega$ be either the half-space $\mathbf{R}^n_+$ or a $C^\infty$ domain in $\mathbf{R}^n$ with bounded boundary $\Gamma = \partial\Omega$. Then there exists a continuous linear extension operator
\[
  E : C^\infty(\overline{\Omega}) \longrightarrow C^\infty(\mathbf{R}^n).
\]
Furthermore, the restriction of $E$ to $C_0^\infty(\overline{\Omega})$ is a continuous linear extension operator of $C_0^\infty(\overline{\Omega})$ into $C_0^\infty(\mathbf{R}^n)$.

Proof (i) First we let $\Omega = \mathbf{R}^n_+$. The proof is based on the following lemma (see also [12, Chap. VI, Lemma 1.1.1]):

Lemma 7.46 There exists a function $w(t)$ in the Schwartz space $\mathcal{S}(\mathbf{R})$ such that
\[
  \operatorname{supp} w \subset [1,\infty),
  \qquad
  \int_1^\infty t^n\,w(t)\,dt = (-1)^n \quad \text{for } n = 0, 1, 2, \ldots.
\]
Assuming this lemma for the moment, we shall prove the theorem. If $\varphi \in C^\infty(\overline{\mathbf{R}^n_+})$, we define the extension operator $E$ as follows:
\[
  E\varphi(x', x_n) =
  \begin{cases}
    \varphi(x', x_n) & \text{if } x_n \ge 0, \\
    \displaystyle\int_1^\infty w(s)\,\theta(-x_n s)\,\varphi(x', -s x_n)\,ds & \text{if } x_n < 0,
  \end{cases}
\]


Fig. 7.22 The open covering {V j } and the partition of unity {ω j }

where x = (x  , xn ), x  = (x1 , . . . , xn−1 ) and θ ∈ C0∞ (R) with supp θ ⊂ [−2, 2] and θ (t) = 1 for |t| ≤ 1. Then it is easy to verify the following: (1) Eϕ ∈ C ∞ (Rn ). n ) −→ C ∞ (Rn ) is continuous. (2) E : C ∞ (R+ (3) If supp ϕ ⊂ x ∈ Rn : |x  | ≤ r, 0 ≤ xn ≤ a for some r > 0 and a > 0, then supp Eϕ ⊂ x ∈ Rn : |x  ver t ≤ r, |xn | ≤ a . n This proves the theorem for the half space R+ . ∞ (ii) Now we assume that Ω is a C domain in Rn with bounded boundary  = ∂Ω. Then we can choose a finite covering {U j } Nj=1 of  by open subsets of Rn and C ∞ diffeomorphisms χ j of U j onto B = {x ∈ Rn : |x| < 1} such that the open sets

V_j = χ_j^{−1}( { x ∈ Rⁿ : |x′| < √3/2, |xₙ| < 1/2 } ) for 1 ≤ j ≤ N

form an open covering of the open set Ω_δ = {x ∈ Ω : dist(x, Γ) < δ} for some δ > 0 (see Figs. 7.22 and 7.23 below). Furthermore, we can choose an open set V₀ in Ω, bounded away from Γ, such that (see Fig. 7.24 below)

Ω ⊂ ( ⋃_{j=1}^N V_j ) ∪ V₀.

Let {ω_j}_{j=0}^N be a partition of unity subordinate to the covering {V_j}_{j=0}^N. If ϕ ∈ C^∞(Ω̄), we define the extension operator E by the formula

Eϕ := ω₀ϕ + Σ_{j=1}^N χ_j^*( E( (χ_j^{−1})^*(ω_j ϕ) ) ).


Fig. 7.23 The coordinate transformation χ_j maps V_j onto B(0, 1)

Fig. 7.24 The open sets V₀ and V_j

Then it is easy to verify that this operator E enjoys the desired properties. Theorem 7.45 is proved, apart from the proof of Lemma 7.46.
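A finite-dimensional analogue of the half-space construction may help here. The classical Hestenes reflection uses finitely many reflected dilates instead of Seeley's integral against w; the sketch below (our illustration, with hypothetical parameters λ_k = k + 1, not the book's operator) matches m derivatives across xₙ = 0 by solving the same kind of Vandermonde system that appears later in the proof of Lemma 7.46:

```python
# Hestenes reflection: for x < 0 define Eu(x) = sum_k a_k u(-lambda_k x),
# where the coefficients a_k solve
#     sum_k a_k (-lambda_k)^j = 1,   j = 0, ..., m,
# so that Eu matches the first m derivatives of u at x = 0.  The solution
# of this Vandermonde system is Lagrange interpolation of the constant 1
# at the nodes -lambda_k, evaluated at 1.
import math
from fractions import Fraction

m = 3
lams = [Fraction(k + 1) for k in range(m + 1)]     # lambda_k = 1, ..., m+1

a = [
    math.prod((1 + l) / (l - lk) for l in lams if l != lk)
    for lk in lams
]                                                  # a = [10, -20, 15, -4]

# The moment conditions sum_k a_k (-lambda_k)^j = 1 hold exactly.
for j in range(m + 1):
    assert sum(ak * (-lk) ** j for ak, lk in zip(a, lams)) == 1

def extend(u, x):
    """Extension across 0 of a function u given on [0, infinity)."""
    if x >= 0:
        return u(x)
    return sum(float(ak) * u(float(-lk * x)) for ak, lk in zip(a, lams))

# For u = exp the extension agrees with exp(-h) to order m+1 near 0.
assert abs(extend(math.exp, -0.01) - math.exp(-0.01)) < 1e-6
```

Seeley's function w(t) plays the role of a "continuous" family of such coefficients, which is what makes the extension C^∞ rather than merely C^m.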



7.9.1 Proof of Lemma 7.46

Our elementary proof is a modified version of the original proof of Seeley [165, Theorem], due to Masao Tanikawa.
Step (1): First, we take a function u₀ ∈ C₀^∞(R) such that

u₀ ≥ 0 on R,     (7.66a)
supp u₀ ⊂ [1, 2],     (7.66b)
∫₀^∞ u₀(t) dt = 1.     (7.66c)

If k is a positive integer, we let

u_k(t) := (1/2^k) u₀(t/2^k).     (7.67)

Then we have the following:


u_k ≥ 0 on R,
supp u_k ⊂ [2^k, 2^{k+1}],
∫₀^∞ u_k(t) dt = 1.

Hence, for any sequence {a_k}, the formal sum

w(t) = Σ_{k=0}^∞ a_k u_k(t)

makes sense. We shall choose a sequence {a_k} in such a way that the function w has the desired properties.
Step (2): Next we fix an integer N > 0 and construct the N-th approximation to the function w:

w_N(t) := Σ_{k=0}^N a_{Nk} u_k(t).

The coefficients a_{Nk} will be picked to satisfy the conditions

∫₀^∞ tⁿ w_N(t) dt = (−1)ⁿ for n = 0, 1, . . ., N.     (7.68)

However, by conditions (7.66) and (7.67), this equation may be rewritten as

Σ_{k=0}^N 2^{nk} a_{Nk} = h_n for n = 0, 1, . . ., N,     (7.68′)

where

h_n = (−1)ⁿ / ∫₀^∞ tⁿ u₀(t) dt.

Now equation (7.68′) is a linear system of N + 1 equations in N + 1 unknowns, and its determinant is the Vandermonde determinant

Δ_N = det( 2^{nk} )_{n,k=0}^N = Π_{0≤i<j≤N} (2^j − 2^i).

Hence, by using Cramer's rule, we find the solution {a_{Nk}} of equation (7.68′) as follows:


a_{Nk} = (−1)^k Δ_{Nk} ( Σ_{ℓ=0}^N (−1)^ℓ h_ℓ S_{Nk}^{N−ℓ} ) Δ_N^{−1},

where

Δ_{Nk} = Π_{0≤i<j≤N, i≠k, j≠k} (2^j − 2^i),

and S_{Nk}^m is the elementary symmetric polynomial of degree m in the elements {1, 2, . . ., 2^{k−1}, 2^{k+1}, . . ., 2^N}. Since we have, by condition (7.66), |h_ℓ| ≤ 1 and

Σ_{ℓ=0}^N S_{Nk}^{N−ℓ} = Π_{ℓ=0, ℓ≠k}^N (1 + 2^ℓ),

it follows that

|a_{Nk}| ≤ ( Π_{ℓ=0}^{k−1} (1 + 2^ℓ) ) ( Π_{ℓ=k+1}^N (1 + 2^ℓ) ) ( Π_{ℓ=0}^{k−1} (2^k − 2^ℓ) )^{−1} ( Π_{ℓ=k+1}^N (2^ℓ − 2^k) )^{−1}
        = Π_{ℓ=0}^{k−1} (1 + 2^ℓ)/(2^k − 2^ℓ) · Π_{ℓ=k+1}^N (1 + 2^ℓ)/(2^ℓ − 2^k) =: A_k · B_{Nk},

where

A_k = Π_{ℓ=0}^{k−1} (1 + 2^ℓ)/(2^k − 2^ℓ),   B_{Nk} = Π_{ℓ=k+1}^N (1 + 2^ℓ)/(2^ℓ − 2^k).

However, we remark that

|A_k| ≤ Π_{ℓ=0}^{k−1} 2^{ℓ+1}/2^{k−1} ≤ 2^{(3k−k²)/2},     (7.69)

and also


log B_{Nk} = Σ_{ℓ=k+1}^N log( 1 + (1 + 2^k)/(2^ℓ − 2^k) ) < Σ_{ℓ=k+1}^N (1 + 2^k)/(2^ℓ − 2^k)     (7.70)
          < (1 + 2^k) Σ_{j=k}^∞ 1/2^j < 4,

since we have the inequality log(1 + x) < x for all x > 0. Therefore, by combining estimates (7.69) and (7.70), we obtain that

|a_{Nk}| ≤ e⁴ 2^{(3k−k²)/2} for k = 0, 1, . . ., N.     (7.71)

In order to prove that a finite limit lim_{N→∞} a_{Nk} exists for every integer k ≥ 0, it suffices to show that each sequence {a_{Nk}}_{N=1}^∞ is a Cauchy sequence. Since we have the formula

S_{N+1 k}^{N+1−ℓ} = 2^{N+1} S_{Nk}^{N−ℓ} + S_{Nk}^{N+1−ℓ},

it follows that

a_{N+1 k} − a_{Nk} = (−1)^k ( Σ_{ℓ=0}^N [ (−1)^ℓ 2^k h_ℓ + (−1)^{ℓ+1} h_{ℓ+1} ] S_{Nk}^{N−ℓ} ) Δ_{N+1 k} · Δ_{N+1}^{−1},

so that

|a_{N+1 k} − a_{Nk}| ≤ (1 + 2^k) ( Σ_{ℓ=0}^N S_{Nk}^{N−ℓ} ) Δ_{N+1 k} · Δ_{N+1}^{−1} = |A_k| · B_{Nk} · (1 + 2^k)/(2^{N+1} − 2^k).

Hence, we obtain from estimates (7.69) and (7.70) that we have, for any integer M > 0,

|a_{N+M k} − a_{Nk}| ≤ (1 + 2^k) |A_k| e⁴ Σ_{m=1}^M 1/(2^{N+m} − 2^k) ≤ e⁴ (1 + 2^k) |A_k| · 1/2^N.


This proves that the sequence {a_{Nk}}_{N=1}^∞ is a Cauchy sequence for every integer k ≥ 0.
Step (3): Finally, we let

a_k := lim_{N→∞} a_{Nk} for k = 0, 1, . . ..

Then, by letting N → ∞ in estimate (7.71), we have the inequality

|a_k| ≤ e⁴ 2^{(3k−k²)/2}.     (7.72)

In view of estimates (7.71) and (7.72) and formula (7.68), it is easy to verify that w_N → w in the space S(R) and further that the limit function w enjoys the desired properties.
The proof of Lemma 7.46, and hence that of Theorem 7.45, is complete. □
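The finite system (7.68′) can be checked numerically. In the sketch below (our illustration, not from the book) we take u₀ to be the indicator function of [1, 2] instead of a smooth bump — only the moments of u₀ enter the system — so that everything can be computed in exact rational arithmetic:

```python
# Check of the linear system (7.68'): with u_0 = indicator of [1, 2],
# the moments of u_k(t) = 2^{-k} u_0(t/2^k) are
#     int t^n u_k(t) dt = 2^{nk} * (2^{n+1} - 1)/(n + 1),
# so (7.68) reduces to the Vandermonde system
#     sum_k 2^{nk} a_{Nk} = h_n,   h_n = (-1)^n (n + 1)/(2^{n+1} - 1).
from fractions import Fraction

N = 4
mom0 = lambda n: Fraction(2 ** (n + 1) - 1, n + 1)   # int_1^2 t^n dt
h = [Fraction((-1) ** n) / mom0(n) for n in range(N + 1)]

# Gauss-Jordan elimination on V[n][k] = 2^{nk}; the pivots are nonzero
# because every leading principal minor is a Vandermonde determinant.
V = [[Fraction(2 ** (n * k)) for k in range(N + 1)] for n in range(N + 1)]
a = h[:]
for c in range(N + 1):
    inv = 1 / V[c][c]
    V[c] = [x * inv for x in V[c]]
    a[c] *= inv
    for r in range(N + 1):
        if r != c and V[r][c] != 0:
            f = V[r][c]
            V[r] = [x - f * y for x, y in zip(V[r], V[c])]
            a[r] -= f * a[c]

# The moments of w_N = sum_k a_k u_k are exactly (-1)^n for n <= N.
for n in range(N + 1):
    total = sum(a[k] * Fraction(2 ** (n * k)) * mom0(n) for k in range(N + 1))
    assert total == (-1) ** n, (n, total)
```

With N = 4 the computed moments of w_N are exactly (−1)ⁿ for n = 0, . . ., N, as condition (7.68) requires; letting N → ∞ as in Step (3) produces all moments at once.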

7.10 Notes and Comments

Schwartz [163] and Gelfand–Shilov [73] are the classics for distribution theory. Our treatment here follows the expositions of Chazarain–Piriou [35], Hörmander [84] and Treves [223].
Sections 7.1 and 7.2: The material in these sections is taken from Gilbarg–Trudinger [74] and Folland [62].
Section 7.3: Peetre's theorem (Theorem 7.7) is due to Peetre [143].
Section 7.4: For the Banach–Steinhaus theorem, see Treves [223, Chap. 33]. Example 7.29 is taken from Chazarain–Piriou [35, Chapitre III, Lemme 9.4].
Section 7.5: Schwartz's kernel theorem (Theorem 7.36) is taken from Chazarain–Piriou [35, Chapitre I, Théorème 4.4].
Section 7.6: More detailed and concise accounts of layer potentials are given in the books of Folland [62] and McLean [125].
Section 7.7: Distributions on a manifold were first studied by de Rham [40]. The material here is adapted from Abraham–Marsden–Ratiu [1], Chazarain–Piriou [35] and Lang [113] in such a way as to make it accessible to graduate students and advanced undergraduates as well.
Section 7.8: The definition of a C^r domain is taken from McLean [125].
Section 7.9: This section is adapted from [209, Sect. 4.2].

Chapter 8

L² Theory of Sobolev Spaces

One of the most useful ways of measuring the differentiability properties of functions on Rⁿ is in terms of L² norms; this measurement is provided by the Sobolev spaces on Rⁿ. The great advantage of this approach lies in the fact that the Fourier transform works very well in the Hilbert space L²(Rⁿ). The purpose of this chapter is to summarize the basic definitions and results about Sobolev spaces which will be needed for the study of boundary value problems in Chap. 11.

8.1 The Spaces H^s(Rⁿ)

If s ∈ R, we let

H^s(Rⁿ) = the space of distributions u ∈ S′(Rⁿ) such that û = Fu is a locally integrable function and (1 + |ξ|²)^{s/2} û ∈ L²(Rⁿ).

We equip the space H^s(Rⁿ) with the inner product

(u, v)_s = (1/(2π)ⁿ) ∫_{Rⁿ} (1 + |ξ|²)^s û(ξ) · v̂(ξ)‾ dξ,

and with the associated norm

‖u‖_s = ( (1/(2π)ⁿ) ∫_{Rⁿ} (1 + |ξ|²)^s |û(ξ)|² dξ )^{1/2}.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 K. Taira, Functional Analytic Techniques for Diffusion Processes, Springer Monographs in Mathematics, https://doi.org/10.1007/978-981-19-1099-9_8



The space H^s(Rⁿ) is called the Sobolev space of order s. Roughly speaking, the order s "counts" the number of L²-derivatives of elements of H^s(Rⁿ) (cf. Theorems 8.1 and 8.2 below).
We define a linear map

Λ^s : S′(Rⁿ) → S′(Rⁿ)

by the formula

Λ^s u = F*( (1 + |ξ|²)^{s/2} Fu ) for u ∈ S′(Rⁿ).

This can be visualized as follows:

S′(Rⁿ) —F→ S′(Rⁿ) —(1+|ξ|²)^{s/2}→ S′(Rⁿ) —F*→ S′(Rⁿ).

Thus the map Λ^s is an isomorphism of S′(Rⁿ) onto itself, and its inverse is the map Λ^{−s}. Furthermore, it follows from an application of the Plancherel theorem (Theorem 7.22) that

(a) u ∈ H^s(Rⁿ) if and only if Λ^s u ∈ L²(Rⁿ).
(b) (u, v)_s = ∫_{Rⁿ} Λ^s u(x) Λ^s v(x)‾ dx.

This shows that Λ^s is an isometric isomorphism of H^s(Rⁿ) onto L²(Rⁿ). Hence the Sobolev space H^s(Rⁿ) is a Hilbert space. In particular, we have the assertion H⁰(Rⁿ) = L²(Rⁿ).
We list some basic topological properties of the function spaces H^s(Rⁿ):
(1) If s > t, then we have the inclusions

S(Rⁿ) ⊂ H^s(Rⁿ) ⊂ H^t(Rⁿ) ⊂ S′(Rⁿ)

with continuous injections. We let

H^∞(Rⁿ) = ⋂_{s∈R} H^s(Rⁿ),
H^{−∞}(Rⁿ) = ⋃_{s∈R} H^s(Rⁿ).

Then we have the inclusions

S(Rⁿ) ⊂ H^∞(Rⁿ), E′(Rⁿ) ⊂ H^{−∞}(Rⁿ).


The second inclusion follows from Theorem 7.24.
(2) The space S(Rⁿ) is dense in H^s(Rⁿ) for each s ∈ R. Indeed, since S(Rⁿ) is dense in L²(Rⁿ), it follows that the space S(Rⁿ) = Λ^{−s}(S(Rⁿ)) is dense in H^s(Rⁿ) = Λ^{−s}(L²(Rⁿ)).
(3) The next two theorems give a direct description of H^s(Rⁿ) when s > 0.

Theorem 8.1 If m is a positive integer, then the Sobolev space H^m(Rⁿ) is the space of functions u ∈ L²(Rⁿ) such that D^α u ∈ L²(Rⁿ) for |α| ≤ m. Furthermore, the norm ‖u‖_m is equivalent to the norm

( Σ_{|α|≤m} ∫_{Rⁿ} |D^α u(x)|² dx )^{1/2}.

Theorem 8.2 Let s = m + σ where m is a positive integer and 0 < σ < 1. Then the Sobolev space H^s(Rⁿ) is the space of functions u ∈ H^m(Rⁿ) such that the integral (the Slobodecki˘ı seminorm)

∫∫_{Rⁿ×Rⁿ} |D^α u(x) − D^α u(y)|² / |x − y|^{n+2σ} dx dy

is finite for |α| = m. Furthermore, the norm ‖u‖_s is equivalent to the norm

( ‖u‖²_m + Σ_{|α|=m} ∫∫_{Rⁿ×Rⁿ} |D^α u(x) − D^α u(y)|² / |x − y|^{n+2σ} dx dy )^{1/2}.
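Theorem 8.1 can be illustrated on a periodic grid (a discrete stand-in for Rⁿ; this sketch is ours, not the book's). For m = 1 the Fourier-side quantity Σ (1 + ξ²)|û(ξ)|² coincides with ‖u‖²_{L²} + ‖u′‖²_{L²} when the derivative is computed spectrally, by the discrete Parseval identity:

```python
# Discrete illustration of Theorem 8.1 (m = 1) on a periodic grid:
#     (dx/N) * sum_k (1 + xi_k^2) |U_k|^2
# equals ||u||_{L^2}^2 + ||u'||_{L^2}^2 with a spectral derivative.
import numpy as np

N = 256
x = 2 * np.pi * np.arange(N) / N
dx = 2 * np.pi / N
u = np.exp(np.cos(x))                       # smooth periodic test function

U = np.fft.fft(u)
xi = np.fft.fftfreq(N, d=1.0 / N)           # integer frequencies
du = np.fft.ifft(1j * xi * U).real          # spectral derivative

lhs = dx / N * np.sum((1 + xi ** 2) * np.abs(U) ** 2)
rhs = dx * np.sum(u ** 2 + du ** 2)
assert abs(lhs - rhs) < 1e-8 * rhs
```

The agreement is exact up to rounding because the discrete Parseval identity, like the Plancherel theorem used in the text, converts each x-side derivative into one power of ξ on the Fourier side.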

(4) The next theorem states that the elements of H^s(Rⁿ) are smooth in the classical sense for sufficiently large s > 0 ([35, Chapitre II, Théorème 2.5]; [62, Lemma (6.5)]):

Theorem 8.3 (Sobolev) If s > n/2 + k where k is a non-negative integer, then we have the inclusion

H^s(Rⁿ) ⊂ C^k(Rⁿ)

with continuous injection.

(5) Since the Sobolev space H^s(Rⁿ) is a Hilbert space, it is its own dual space. However, it is more useful to consider the following characterization of the dual space of H^s(Rⁿ).

Theorem 8.4 The bilinear form ⟨·, ·⟩ on the product space S(Rⁿ) × S(Rⁿ), defined by the formula (see the Parseval formula (7.22b))

{u, v} ⟼ ⟨u, v⟩ = ∫_{Rⁿ} u(x) · v(x) dx = (1/(2π)ⁿ) ∫_{Rⁿ} û(ξ) · v̂(−ξ) dξ,


extends uniquely to a continuous bilinear form ⟨·, ·⟩ on the space H^s(Rⁿ) × H^{−s}(Rⁿ) for each s ∈ R, given by the formula

{u, v} ⟼ ⟨u, v⟩ = (1/(2π)ⁿ) ∫_{Rⁿ} û(ξ) · v̂(−ξ) dξ
              = (1/(2π)ⁿ) ∫_{Rⁿ} (1 + |ξ|²)^{s/2} û(ξ) · (1 + |ξ|²)^{−s/2} v̂(−ξ) dξ.

This bilinear form on the space H^s(Rⁿ) × H^{−s}(Rⁿ) permits us to identify the strong dual space of H^s(Rⁿ) with H^{−s}(Rⁿ).
(6) If F is a closed subset of Rⁿ, we let

H^s_F(Rⁿ) = the subspace of H^s(Rⁿ) consisting of the functions with support in F.

Since the injection H^s(Rⁿ) → D′(Rⁿ) is continuous, it follows that H^s_F(Rⁿ) is a closed subspace of H^s(Rⁿ); hence it is a Hilbert space.
The next theorem is a Sobolev space version of the Ascoli–Arzelà theorem (Theorem 2.2):

Theorem 8.5 (Rellich) Let K be a compact subset of Rⁿ. If s > t, then the injection

H^s_K(Rⁿ) → H^t_K(Rⁿ)

is compact.

Table 8.1 below gives a bird's-eye view of the Bolzano–Weierstrass theorem, the Ascoli–Arzelà theorem and Rellich's theorem in analysis.
(7) If ‖·‖_{s₁} and ‖·‖_{s₂} are two Sobolev norms, then the intermediate norms between them are estimated as follows ([35, Chapitre II, Proposition 2.11]):

Proposition 8.6 Let s₁, s, s₂ be real numbers such that s₁ < s < s₂. For every ε > 0, there exists a constant C_ε > 0 such that

Table 8.1 A bird's-eye view of three compactness theorems in analysis

Subject                 | Sequence                         | Compactness theorem
------------------------|----------------------------------|--------------------------------
Theory of real numbers  | Sequence of real numbers         | The Bolzano–Weierstrass theorem
Calculus                | Sequence of continuous functions | The Ascoli–Arzelà theorem
Theory of distributions | Sequence of distributions        | Rellich's theorem


‖u‖²_s ≤ ε‖u‖²_{s₂} + C_ε‖u‖²_{s₁} for u ∈ H^{s₂}(Rⁿ).

This inequality is called the interpolation inequality.
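The interpolation inequality follows from the pointwise bound (1 + |ξ|²)^s ≤ ε(1 + |ξ|²)^{s₂} + C_ε(1 + |ξ|²)^{s₁} applied under the Fourier integral. A quick numerical sanity check of this pointwise bound, with the explicit constant C_ε = ε^{−(s−s₁)/(s₂−s)} coming from Young's inequality (our choice of constant, not the book's):

```python
# Pointwise check: with C_eps = eps**(-(s - s1)/(s2 - s)),
#     w**s <= eps * w**s2 + C_eps * w**s1   for all w = 1 + |xi|^2 >= 1.
s1, s, s2 = 0.0, 1.0, 3.0
for eps in (1.0, 0.1, 0.01):
    C = eps ** (-(s - s1) / (s2 - s))
    for t in [10.0 ** k for k in range(-3, 7)]:
        w = 1.0 + t
        assert w ** s <= eps * w ** s2 + C * w ** s1 + 1e-12
```

The constant comes from writing w^s = (ε w^{s₂})^θ (C_ε w^{s₁})^{1−θ} with θ = (s − s₁)/(s₂ − s₁) and applying the weighted arithmetic–geometric mean inequality.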

8.2 The Spaces H^s_loc(Ω) and H^s_comp(Ω)

Now we study distributions which behave locally just like the distributions in H^s(Rⁿ). In doing so, the next theorem plays a fundamental role.

Theorem 8.7 The multiplication {ϕ, u} ⟼ ϕu is a continuous bilinear mapping of S(Rⁿ) × H^s(Rⁿ) into H^s(Rⁿ) for each s ∈ R; more precisely, we have the inequality

‖ϕu‖_s ≤ 2^{|s|/2} ‖u‖_s ∫_{Rⁿ} (1 + |ξ|²)^{|s|/2} |ϕ̂(ξ)| dξ.

If Ω is an open subset of Rⁿ, we let

H^s_loc(Ω) = the space of distributions u ∈ D′(Ω) such that ϕu ∈ H^s(Rⁿ) for all ϕ ∈ C₀^∞(Ω).

We equip the space H^s_loc(Ω) with the topology defined by the family of seminorms

u ⟼ ‖ϕu‖_s,

where ϕ ranges over the space C₀^∞(Ω). Let {K_j} be an exhaustive sequence of compact subsets of Ω. If we take a sequence {ϕ_j} in C₀^∞(Ω) such that ϕ_j = 1 on K_j, then the topology of H^s_loc(Ω) is defined by the countably many seminorms u ⟼ ‖ϕ_j u‖_s alone. Indeed, for every ϕ ∈ C₀^∞(Ω) we can take j so large that ϕ_j ϕ = ϕ. Then it follows from an application of Theorem 8.7 that

‖ϕu‖_s = ‖ϕ ϕ_j u‖_s ≤ C_s ‖ϕ_j u‖_s,

where C_s > 0 is a constant depending on ϕ but not on u. This shows that H^s_loc(Ω) is metrizable. Furthermore, by virtue of the completeness of the spaces H^s(Rⁿ) and D′(Rⁿ), we can easily check that H^s_loc(Ω) is complete. Hence the space H^s_loc(Ω) is a Fréchet space.


Here are some basic topological properties of H^s_loc(Ω):

(1) We have the inclusions

C^∞(Ω) ⊂ H^s_loc(Ω) ⊂ D′(Ω)

with continuous injections. Furthermore, the space C₀^∞(Ω) is dense in H^s_loc(Ω) for each s ∈ R.
(2) The next theorem is a localized version of Theorem 8.3:

Theorem 8.8 (Sobolev) If s > n/2 + k where k is a non-negative integer, then we have the inclusion

H^s_loc(Ω) ⊂ C^k(Ω)

with continuous injection. Furthermore, we have the assertion

C^∞(Ω) = ⋂_{s∈R} H^s_loc(Ω).

This theorem is one of many Sobolev imbedding theorems.
(3) We let

H^s_comp(Ω) = the union of the spaces H^s_K(Rⁿ) where K ranges over all compact subsets of Ω.

We equip the space H^s_comp(Ω) with the inductive limit topology of the spaces H^s_K(Rⁿ):

H^s_comp(Ω) = lim→_{K⋐Ω} H^s_K(Rⁿ).

We define a bilinear form ⟨·, ·⟩ on the product space H^s_loc(Ω) × H^{−s}_comp(Ω) by the formula

{u, v} ⟼ ⟨u, v⟩ = ⟨ϕu, v⟩,     (8.1)

where ϕ is a function in C₀^∞(Ω) such that ϕ = 1 in a neighborhood of supp v, and ⟨·, ·⟩ on the right-hand side is the pairing of H^s(Rⁿ) and H^{−s}(Rⁿ). It is easy to verify that the quantity ⟨ϕu, v⟩ does not depend on the function ϕ chosen. Then we have the following duality theorem ([35, Chapitre II, Théorème 2.15]):

Theorem 8.9 (the duality theorem) The Sobolev spaces H^s_loc(Ω) and H^{−s}_comp(Ω) are dual to each other with respect to the bilinear pairing of H^s_loc(Ω) and H^{−s}_comp(Ω) defined by formula (8.1):


Fig. 8.1 The inverse image χ*v of v under χ

(H^s_loc(Ω))′ ≅ H^{−s}_comp(Ω),
(H^{−s}_comp(Ω))′ ≅ H^s_loc(Ω).

(4) The characterization of H^s(Rⁿ) in terms of L²-norms in Theorems 8.1 and 8.2 allows us to prove the invariance of the space H^s_loc(Ω) under C^∞ diffeomorphisms. Let Ω₁, Ω₂ be two open subsets of Rⁿ and χ : Ω₁ → Ω₂ a C^∞ diffeomorphism. If v ∈ D′(Ω₂), we define a distribution χ*v ∈ D′(Ω₁) by the formula (cf. formula (4.15))

⟨χ*v, ϕ⟩ = ⟨v, (ϕ ∘ χ^{−1}) · |det J(χ^{−1})|⟩ for all ϕ ∈ C₀^∞(Ω₁),

where J(χ^{−1}) is the Jacobian matrix of χ^{−1}. The situation can be visualized as in Fig. 8.1 above. The distribution χ*v is called the inverse image of v under χ.
Then we have the invariance theorem of the Sobolev spaces H^s_loc(Ω) under C^∞ diffeomorphisms:

Theorem 8.10 Let χ : Ω₁ → Ω₂ be a C^∞ diffeomorphism. Then the mapping v ⟼ χ*v is an isomorphism of H^s_loc(Ω₂) onto H^s_loc(Ω₁), and its inverse is the mapping u ⟼ (χ^{−1})*u (see Fig. 8.2).

Fig. 8.2 The Sobolev spaces H^s_loc(Ω₁) and H^s_loc(Ω₂)


8.3 The Spaces H^s(M)

Theorem 8.10 allows us to define H^s_loc(M), where M is a manifold, as follows (see Fig. 8.3 below). Let M be an n-dimensional C^∞ manifold which satisfies the second axiom of countability. We let

H^s_loc(M) = the space of distributions u ∈ D′(M) such that, for any admissible chart (U, χ) on M, the inverse image (χ^{−1})*(u|_U) of u|_U under χ^{−1} belongs to H^s_loc(χ(U)).

We equip the space H^s_loc(M) with the topology defined by the family of seminorms

u ⟼ ‖ϕ̃ · (χ^{−1})*(u|_U)‖_s,

where (U, χ) ranges over all admissible charts on M and ϕ̃ ranges over the space C₀^∞(χ(U)).
Now we assume that M is an n-dimensional compact C^∞ manifold. By the compactness of M, we can find an atlas {(U_j, χ_j)}_{j=1}^N consisting of finitely many charts on M. Let {ϕ_j}_{j=1}^N be a partition of unity subordinate to the covering {U_j}_{j=1}^N. Then the topology of H^s_loc(M) can be defined by the norm associated with the inner product

(u, v)_s = Σ_{j=1}^N ( (χ_j^{−1})*(ϕ_j u), (χ_j^{−1})*(ϕ_j v) )_s,

where (·, ·)_s on the right-hand side is the inner product of H^s(Rⁿ). Hence the space H^s_loc(M) is a Hilbert space. In the case when M is compact, we write

H^s(M) = H^s_loc(M).

Observe that all the results we stated about H^s(Rⁿ) in Sect. 8.1 are true also for H^s(M), since the spaces H^s(M) are defined to be locally the spaces H^s(Rⁿ).

Fig. 8.3 The Sobolev spaces H^s_loc(M) and H^s_loc(χ(U))


We summarize some basic topological properties of H^s(M):
(1) If s > t, then we have the inclusions

C^∞(M) ⊂ H^s(M) ⊂ H^t(M) ⊂ D′(M)

with continuous injections. Furthermore, we have the formula

D′(M) = ⋃_{s∈R} H^s(M).

(2) The space C^∞(M) is dense in H^s(M) for each s ∈ R.
(3) The next theorem is a manifold version of Sobolev's theorem (Theorem 8.3):

Theorem 8.11 (Sobolev) If s > n/2 + k where k is a non-negative integer, then we have the inclusion

H^s(M) ⊂ C^k(M)

with continuous injection. Furthermore, we have the formula

C^∞(M) = ⋂_{s∈R} H^s(M).

(4) Let μ be a strictly positive density on M. The bilinear form ⟨·, ·⟩ on the product space C^∞(M) × C^∞(M), defined by the formula

{u, v} ⟼ ⟨u, v⟩ = ∫_M u(x) · v(x) dμ(x) = Σ_{j=1}^N ⟨(χ_j^{−1})*(ϕ_j u), (χ_j^{−1})*(ϕ_j v)⟩,

extends uniquely to a continuous bilinear form ⟨·, ·⟩ on the product space H^s(M) × H^{−s}(M) for each s ∈ R. Here the ⟨·, ·⟩ on the last right-hand side are the bilinear forms introduced in Theorem 8.4. The spaces H^s(M) and H^{−s}(M) are dual to each other with respect to this bilinear pairing of H^s(M) and H^{−s}(M):

(H^s(M))′ ≅ H^{−s}(M),
(H^{−s}(M))′ ≅ H^s(M).

Similarly, the spaces H^s(M) and H^{−s}(M) are antidual to each other with respect to an extension of the sesquilinear form (·, ·) on the product space C^∞(M) × C^∞(M) defined by the formula (see the Parseval formula (7.22c))


{u, v} ⟼ (u, v) = ∫_M u(x) · v(x)‾ dμ(x) = Σ_{j=1}^N ( (χ_j^{−1})*(ϕ_j u), (χ_j^{−1})*(ϕ_j v) ).

We denote this sesquilinear form on the product space H^s(M) × H^{−s}(M) again by (·, ·). We remark that

(u, v) = ⟨u, v̄⟩ for u ∈ H^s(M) and v ∈ H^{−s}(M).

(5) Let s₁, s, s₂ be real numbers such that s₁ < s < s₂. For every ε > 0, there exists a constant C_ε > 0 such that

‖u‖²_s ≤ ε‖u‖²_{s₂} + C_ε‖u‖²_{s₁} for u ∈ H^{s₂}(M).

(6) Finally, the next theorem is a manifold version of Rellich's theorem (Theorem 8.5):

Theorem 8.12 (Rellich) If s > t, then the injection H^s(M) → H^t(M) is compact.

8.4 The Spaces H^s(Rⁿ₊)

Preparatory to studying Sobolev spaces on a C^∞ manifold with boundary, we consider Sobolev spaces on the half space Rⁿ₊ = {(x₁, . . ., xₙ) ∈ Rⁿ : xₙ ≥ 0}. We define the restriction map

ρ : H^s(Rⁿ) → D′(Rⁿ₊)

by the formula

ρ(u) = u|_{Rⁿ₊} for u ∈ H^s(Rⁿ).

Then the null space

{u ∈ H^s(Rⁿ) : ρ(u) = 0}

is the closed subspace H^s_{Rⁿ∖Rⁿ₊}(Rⁿ) of H^s(Rⁿ). Hence we have the assertion


The factor space H^s(Rⁿ)/H^s_{Rⁿ∖Rⁿ₊}(Rⁿ) is isomorphic to the range {ρ(u) : u ∈ H^s(Rⁿ)} of ρ.     (8.2)

This leads us to the following definition of a Sobolev space on Rⁿ₊:

H^s(Rⁿ₊) = the space of distributions u ∈ D′(Rⁿ₊) such that there exists a distribution U ∈ H^s(Rⁿ) with ρ(U) = u.

We equip the space H^s(Rⁿ₊) with the norm

‖u‖_s = inf ‖U‖_s,

where the infimum is taken over all such U.
On the other hand, since we have the orthogonal decomposition

H^s(Rⁿ) = H^s_{Rⁿ∖Rⁿ₊}(Rⁿ) ⊕ (H^s_{Rⁿ∖Rⁿ₊}(Rⁿ))^⊥,

it follows from an application of Theorem 5.32 that

the factor space H^s(Rⁿ)/H^s_{Rⁿ∖Rⁿ₊}(Rⁿ) is isomorphic to the space (H^s_{Rⁿ∖Rⁿ₊}(Rⁿ))^⊥.     (8.3)

Therefore, by combining assertions (8.2) and (8.3) we obtain that the space H^s(Rⁿ₊) is isomorphic to the space (H^s_{Rⁿ∖Rⁿ₊}(Rⁿ))^⊥. Hence the Sobolev space H^s(Rⁿ₊) admits a Hilbert space structure. We remark that, for every u ∈ H^s(Rⁿ₊), there exists a unique distribution

U ∈ (H^s_{Rⁿ∖Rⁿ₊}(Rⁿ))^⊥

such that ρ(U) = u and ‖U‖_s = ‖u‖_s.
The characterization of H^s(Rⁿ) in terms of L²-norms in Theorems 8.1 and 8.2 allows us to obtain the following:

Theorem 8.13 If s ≥ 0, then the Seeley extension operator

E : C₀^∞(Rⁿ₊) → C₀^∞(Rⁿ)


extends uniquely to a continuous linear extension operator

E : H^s(Rⁿ₊) → H^s(Rⁿ).

The next theorem gives a direct description of H^s(Rⁿ₊) when s is a non-negative integer.

Theorem 8.14 If m is a non-negative integer, then the Sobolev space H^m(Rⁿ₊) is the space of functions u ∈ L²(Rⁿ₊) such that D^α u ∈ L²(Rⁿ₊) for |α| ≤ m. Furthermore, the norm ‖u‖_m is equivalent to the norm

( Σ_{|α|≤m} ∫_{Rⁿ₊} |D^α u(x)|² dx )^{1/2}.

Here are some basic topological properties of H^s(Rⁿ₊):

(1) We have the inclusions

C₀^∞(Rⁿ₊) ⊂ H^s(Rⁿ₊) ⊂ D′(Rⁿ₊)

with continuous injections.
(2) The space C₀^∞(Rⁿ₊) is dense in H^s(Rⁿ₊) for each s ∈ R.
(3) The next theorem is a half space version of Sobolev's theorem (Theorem 8.3):

Theorem 8.15 (Sobolev) If s > n/2 + k where k is a non-negative integer, then we have the inclusion

H^s(Rⁿ₊) ⊂ C^k(Rⁿ₊)

with continuous injection.

(4) We define a bilinear form ⟨·, ·⟩ on the product space

H^s(Rⁿ₊) × H^{−s}_{Rⁿ₊}(Rⁿ)

by the formula

⟨u, v⟩ = ⟨ũ, v⟩,     (8.4)

where ũ is an extension of u in H^s(Rⁿ) and ⟨·, ·⟩ on the right-hand side is the pairing of H^s(Rⁿ) and H^{−s}(Rⁿ). We can easily verify that the quantity ⟨ũ, v⟩ does not depend on the extension ũ chosen.


Then we have the following duality theorem (see [35, Chapitre II, Proposition 3.5]):

Theorem 8.16 (the duality theorem) The Sobolev spaces H^s(Rⁿ₊) and H^{−s}_{Rⁿ₊}(Rⁿ) are dual to each other with respect to the bilinear pairing of H^s(Rⁿ₊) and H^{−s}_{Rⁿ₊}(Rⁿ) defined by formula (8.4):

(H^s(Rⁿ₊))′ ≅ H^{−s}_{Rⁿ₊}(Rⁿ),
(H^{−s}_{Rⁿ₊}(Rⁿ))′ ≅ H^s(Rⁿ₊).

8.5 The Spaces H^s(Ω)

Now let Ω be a bounded C^∞ domain in Rⁿ. Its closure Ω̄ is an n-dimensional compact C^∞ manifold with boundary. By Theorems 4.19 and 4.20, we may assume the following:
(a) The domain Ω is a relatively compact open subset of an n-dimensional compact C^∞ manifold M without boundary in which Ω has a C^∞ boundary ∂Ω (see Fig. 8.4). The manifold M = Ω̂ is called the double of Ω.
(b) In a tubular neighborhood W of ∂Ω in M a normal coordinate t is chosen so that the points of W are represented as (x′, t), x′ ∈ ∂Ω, −1 < t < 1; t > 0 in Ω, t < 0 in M∖Ω̄ and t = 0 only on ∂Ω (see Fig. 8.5 below).
(c) The manifold M is equipped with a strictly positive density μ which, on W, is the product of a strictly positive density ω on ∂Ω and the Lebesgue measure dt on (−1, 1).
The Sobolev spaces H^s(Ω) are defined similarly to the spaces H^s(Rⁿ₊), with M playing the role of Rⁿ. That is, we let

Fig. 8.4 The domain Ω and the double M = Ω̂


Fig. 8.5 The boundary ∂Ω and the tubular neighborhood W

H^s(Ω) = the space of distributions u ∈ D′(Ω) such that there exists a distribution U ∈ H^s(M) with ρ(U) = u.

Here ρ is the restriction map to Ω. We equip the space H^s(Ω) with the norm

‖u‖_s = inf ‖U‖_s,

where the infimum is taken over all such U.
All the results we stated about the spaces H^s(Rⁿ₊) in Sect. 8.4 are true also for the spaces H^s(Ω). We summarize some basic topological properties of H^s(Ω):
(1) The Sobolev space H^s(Ω) is a Hilbert space.
(2) We have the inclusions

C^∞(Ω̄) ⊂ H^s(Ω) ⊂ D′(Ω)

with continuous injections. Furthermore, the space C^∞(Ω̄) is dense in H^s(Ω) for each s ∈ R.
(3) The next theorem is a bounded domain version of Sobolev's theorem (Theorem 8.3):

Theorem 8.17 (Sobolev) Let Ω be a bounded C^∞ domain in Rⁿ. If s > n/2 + k where k is a non-negative integer, then we have the inclusion

H^s(Ω) ⊂ C^k(Ω̄)

with continuous injection. Furthermore, we have the assertion

C^∞(Ω̄) = ⋂_{s∈R} H^s(Ω).

(4) We let

H^s_{Ω̄}(M) = the subspace of H^s(M) consisting of the functions with support in Ω̄.


Since the injection H^s(M) → D′(M) is continuous, it follows that H^s_{Ω̄}(M) is a closed subspace of H^s(M); hence it is a Hilbert space. We define a bilinear form ⟨·, ·⟩ on the product space H^s(Ω) × H^{−s}_{Ω̄}(M) by the formula

⟨u, v⟩ = ⟨ũ, v⟩,     (8.5)

where ũ is an extension of u in H^s(M) and ⟨·, ·⟩ on the right-hand side is the pairing of H^s(M) and H^{−s}(M). It is easy to verify that the quantity ⟨ũ, v⟩ is independent of the extension ũ chosen. Then we have a bounded domain version of the duality theorem (Theorem 8.16) (see [35, Chapitre II, Proposition 3.5]):

Theorem 8.18 (the duality theorem) Let Ω be a bounded C^∞ domain in Rⁿ. Then the spaces H^s(Ω) and H^{−s}_{Ω̄}(M) are dual to each other with respect to the bilinear pairing of H^s(Ω) and H^{−s}_{Ω̄}(M) defined by formula (8.5):

(H^s(Ω))′ ≅ H^{−s}_{Ω̄}(M),
(H^{−s}_{Ω̄}(M))′ ≅ H^s(Ω).

(5) By covering a neighborhood of ∂Ω with local charts and locally using the Seeley extension operator (cf. the proof of Theorem 7.45), we can obtain an extension operator

E : C^∞(Ω̄) → C^∞(M).

If s ≥ 0, then this operator E extends uniquely to a continuous linear extension operator

E : H^s(Ω) → H^s(M).

The next proposition follows from the proof of Theorem 7.45:

Proposition 8.19 Let E′ : H^{−s}(M) → H^{−s}_{Ω̄}(M) be the transpose of the Seeley extension operator E : H^s(Ω) → H^s(M) (s ≥ 0). If u ∈ H^{−s}(M) is of class C^∞ up to ∂Ω in Ω and also in M∖Ω̄, then E′u ∈ H^{−s}_{Ω̄}(M) is of class C^∞ up to ∂Ω in Ω.

(6) Let s₁, s, s₂ be real numbers such that 0 ≤ s₁ < s < s₂. For every ε > 0, there exists a constant C_ε > 0 such that

‖u‖²_s ≤ ε‖u‖²_{s₂} + C_ε‖u‖²_{s₁} for u ∈ H^{s₂}(Ω).

Indeed, we have, with a constant C > 0 independent of ε,

‖u‖²_s ≤ ‖Eu‖²_s ≤ ε‖Eu‖²_{s₂} + C_ε‖Eu‖²_{s₁} ≤ C( ε‖u‖²_{s₂} + C_ε‖u‖²_{s₁} ).


(7) Finally, we have the following version of Rellich's theorem (Theorem 8.5):

Theorem 8.20 (Rellich) If s ≥ 0 and s > t, then the injection H^s(Ω) → H^t(Ω) is compact.

Indeed, it suffices to note that the injection H^s(Ω) → H^t(Ω) can be written as the composition

H^s(Ω) —E→ H^s(M) —→ H^t(M) —ρ→ H^t(Ω),

in which the middle injection is compact by Theorem 8.12.

8.6 Trace Theorems

In this section we study the restrictions to hyperplanes of functions in the Sobolev space H^s(Rⁿ). To do so, we denote points of Rⁿ and the standard coordinates of Rⁿ as follows:

x = (x′, xₙ) ∈ Rⁿ, x′ = (x₁, x₂, . . ., x_{n−1}) ∈ R^{n−1},

and

ξ = (ξ′, ξₙ) ∈ Rⁿ, ξ′ = (ξ₁, ξ₂, . . ., ξ_{n−1}) ∈ R^{n−1}.

If j is a non-negative integer, we define the trace map

γ_j : C^∞_{(0)}(Rⁿ₊) → C₀^∞(R^{n−1})

by the formula

γ_j u(x′) = lim_{xₙ↓0} Dₙ^j u(x′, xₙ) for u ∈ C^∞_{(0)}(Rⁿ₊).

Here

C^∞_{(0)}(Rⁿ₊) = { u|_{Rⁿ₊} : u ∈ C₀^∞(Rⁿ) }.

(I) We start with the case where j = 0:

Theorem 8.21 If s > 1/2, then the trace map

γ₀ : C^∞_{(0)}(Rⁿ₊) → C₀^∞(R^{n−1})


extends uniquely to a continuous linear map

γ₀ : H^s(Rⁿ₊) → H^{s−1/2}(R^{n−1}),

where

H^s(Rⁿ₊) = the space of distributions u ∈ D′(Rⁿ₊) such that there exists a distribution U ∈ H^s(Rⁿ) with ρ(U) = u.

More precisely, we have the inequality

‖γ₀u‖²_{s−1/2} ≤ C_s ‖u‖²_s for all u ∈ H^s(Rⁿ).     (8.6)

Here C_s is a positive constant given by the formula

C_s = (1/2π) ∫_{−∞}^{∞} (1 + σ²)^{−s} dσ = Γ(s − 1/2) / (2√π Γ(s)) for s > 1/2.
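The two expressions for C_s can be compared numerically (our sanity check, not part of the book; the substitution σ = tan θ turns the integral into ∫ cos^{2s−2}θ dθ on (−π/2, π/2), which is easy to quadrature for s ≥ 1):

```python
# Check of C_s = (1/2pi) Int (1+sigma^2)^{-s} dsigma
#             = Gamma(s-1/2)/(2*sqrt(pi)*Gamma(s))   for s > 1/2,
# via sigma = tan(theta) and the trapezoidal rule.
import math

def C_quad(s, n=20000):
    h = math.pi / n
    total = 0.0
    for i in range(n + 1):
        th = -math.pi / 2 + i * h
        w = 0.5 if i in (0, n) else 1.0
        total += w * abs(math.cos(th)) ** (2 * s - 2)
    return h * total / (2 * math.pi)

def C_exact(s):
    return math.gamma(s - 0.5) / (2 * math.sqrt(math.pi) * math.gamma(s))

for s in (1.0, 1.5, 2.0, 3.0):
    assert abs(C_quad(s) - C_exact(s)) < 1e-6
```

For instance C₁ = 1/2, C_{3/2} = 1/π and C₂ = 1/4, consistent with the Gamma-function formula.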

Proof Since C₀^∞(Rⁿ) is dense in H^s(Rⁿ), it suffices to show inequality (8.6) for every function u ∈ C₀^∞(Rⁿ).
First, by the Fourier inversion formula on Rⁿ it follows that

u(x′, xₙ) = (1/(2π)ⁿ) ∫_{Rⁿ} e^{ix·ξ} û(ξ) dξ.

Hence we have the formula

(γ₀u)(x′) = u(x′, 0) = (1/(2π)ⁿ) ∫_{Rⁿ} e^{ix′·ξ′} û(ξ′, ξₙ) dξ′ dξₙ
          = (1/(2π)^{n−1}) ∫_{R^{n−1}} e^{ix′·ξ′} ( (1/2π) ∫_R û(ξ′, ξₙ) dξₙ ) dξ′.     (8.7)

On the other hand, by the Fourier inversion formula on R^{n−1} we find that

γ₀u(x′) = (1/(2π)^{n−1}) ∫_{R^{n−1}} e^{ix′·ξ′} (γ₀u)ˆ(ξ′) dξ′.     (8.8)

Therefore, it follows from formulas (8.7) and (8.8) that

(γ₀u)ˆ(ξ′) = (1/2π) ∫_R û(ξ′, ξₙ) dξₙ for ξ′ ∈ R^{n−1}.     (8.9)

By applying the Schwarz inequality to formula (8.9), we obtain that

|(γ₀u)ˆ(ξ′)|² = | (1/2π) ∫_R (1 + |ξ|²)^{−s/2} · (1 + |ξ|²)^{s/2} û(ξ′, ξₙ) dξₙ |²     (8.10)
            ≤ (1/2π) ∫_R (1 + |ξ|²)^{−s} dξₙ · (1/2π) ∫_R (1 + |ξ|²)^s |û(ξ′, ξₙ)|² dξₙ.

However, by the change of variable

ξₙ = (1 + |ξ′|²)^{1/2} ηₙ,

we have the formula

(1/2π) ∫_R (1 + |ξ|²)^{−s} dξₙ = (1 + |ξ′|²)^{−s+1/2} · (1/2π) ∫_R (1 + ηₙ²)^{−s} dηₙ = C_s (1 + |ξ′|²)^{−s+1/2}.     (8.11)

Therefore, by carrying formula (8.11) into inequality (8.10) we find that

|(γ₀u)ˆ(ξ′)|² ≤ C_s (1 + |ξ′|²)^{−s+1/2} · (1/2π) ∫_R (1 + |ξ|²)^s |û(ξ′, ξₙ)|² dξₙ.

This proves the desired inequality

‖γ₀u‖²_{s−1/2} = (1/(2π)^{n−1}) ∫_{R^{n−1}} (1 + |ξ′|²)^{s−1/2} |(γ₀u)ˆ(ξ′)|² dξ′
             ≤ C_s (1/(2π)ⁿ) ∫_{R^{n−1}} ∫_R (1 + |ξ|²)^s |û(ξ′, ξₙ)|² dξₙ dξ′ = C_s ‖u‖²_s.

The proof of Theorem 8.21 is complete. □



(II) We can prove the following general case (see [35, Chapitre II, Corollaire 4.4]):

Theorem 8.22 (the trace theorem) If 0 ≤ j < s − 1/2, then the trace map

γ_j : C^∞_{(0)}(Rⁿ₊) → C₀^∞(R^{n−1})

extends uniquely to a continuous linear map

γ_j : H^s(Rⁿ₊) → H^{s−j−1/2}(R^{n−1}).

Furthermore, if u ∈ H^s(Rⁿ₊), then the mapping

xₙ ⟼ Dₙ^j u(·, xₙ)

is a continuous function on [0, ∞) with values in H^{s−j−1/2}(R^{n−1}).


The next theorem shows that the result of Theorem 8.22 is sharp:

Theorem 8.23 If 0 ≤ j < s − 1/2, then the trace map

H^s(Rⁿ₊) → ∏_{0≤j<s−1/2} H^{s−j−1/2}(R^{n−1}), u ⟼ (γ₀u, γ₁u, . . .),

is surjective.

Remark 8.28 Theorem 8.27 is an expression of the fact that if we know about the derivatives of the solution u of Au = f in tangential directions, then we can derive information about the normal derivatives γ_j u by means of the equation Au = f.

If u ∈ D′(Ω) has a sectional trace on ∂Ω of order zero, we can define its extension u⁰ in D′(M) as follows: choose functions θ ∈ C₀^∞(W) and ψ ∈ C₀^∞(Ω) such that θ + ψ = 1 on Ω̄, and define u⁰ by the formula

⟨u⁰, ϕ·μ⟩ = ∫₀¹ ⟨u(t), (θϕ)(·, t)·ω⟩ dt + ⟨u, ψϕ·μ⟩ for ϕ ∈ C^∞(M).

The distribution u⁰ is an extension to M of u which is equal to zero in M∖Ω̄.
If v ∈ D′(∂Ω), we define a multiple layer v ⊗ D_t^j δ (j = 0, 1, . . .) by the formula

⟨v ⊗ D_t^j δ, ϕ·μ⟩ = (−1)^j ⟨v, D_t^j ϕ(·, 0)·ω⟩ for ϕ ∈ C^∞(M).

It is clear that v ⊗ D_t^j δ is a distribution on M with support in ∂Ω.
Let P be a differential operator of order m with C^∞ coefficients on M. In a neighborhood of ∂Ω, we can write P = P(x, D_x) uniquely in the form

P(x, D_x) = Σ_{j=0}^m P_j(x, D_{x′}) D_t^j for x = (x′, t),


where P_j(x, D_{x′}) is a differential operator of order m − j acting along the surfaces parallel to the boundary ∂Ω. Now it is easy to see that formulas (4.4) and (4.5) extend to this case:
(1) If u ∈ D′(Ω) has sectional traces on ∂Ω up to order j, then we have the formula

D_t^j(u⁰) = (D_t^j u)⁰ + (1/i) Σ_{k=0}^{j−1} γ_{j−k−1}u ⊗ D_t^k δ.     (8.13)

(2) If u ∈ D′(Ω) has sectional traces on ∂Ω up to order m, then we have the jump formula

P(u⁰) = (Pu)⁰ + (1/i) Σ_{ℓ+k+1≤m} P_{ℓ+k+1}(x, D_{x′}) ( γ_ℓ u ⊗ D_t^k δ ).     (8.14)

Section 9.8.3 is devoted to the pseudo-differential operator approach to sectional traces (Theorem 9.42).

8.8 Sobolev Spaces and Regularizations

We introduce a two-parameter family of norms on the Sobolev spaces H^s(Rⁿ). If m > 0 and 0 < ρ < 1, we let

‖u‖²_{(s,m,ρ)} = (1/(2π)ⁿ) ∫_{Rⁿ} (1 + |ξ|²)^s (1 + |ρξ|²)^{−m} |û(ξ)|² dξ.     (8.15)

We list two results which follow at once:
(1) For all u ∈ H^{s−m}(Rⁿ), we have the inequalities

ρ^m ‖u‖_{(s,m,ρ)} ≤ ‖u‖_{s−m} ≤ ‖u‖_{(s,m,ρ)},

that is, the norm ‖u‖_{(s,m,ρ)} is equivalent to the norm ‖u‖_{s−m}.
(2) If u ∈ H^s(Rⁿ), then we have the assertion ‖u‖_{(s,m,ρ)} ↑ ‖u‖_s as ρ ↓ 0, so that

‖u‖_s = sup_{0<ρ<1} ‖u‖_{(s,m,ρ)}.

For a function χ we set χ_ε(x) = (1/εⁿ) χ(x/ε) for ε > 0.
The next theorem gives another equivalent expression for the norm ‖u‖_{(s,m,ρ)} in terms of the regularizations u ∗ χ_ε of u.

Theorem 8.31 Assume that the function χ satisfies condition (2) for k > s. Then, for any s₁ ∈ R and t < s + s₁ − m, there exist constants C_{s,s₁,t} > 0 and C′_{s,s₁,t} > 0, independent of ρ, such that we have, for all u ∈ H^{s+s₁−m}(Rⁿ),

‖u‖²_{(s+s₁,m,ρ)} ≤ C_{s,s₁,t} ( ∫₀¹ ‖u ∗ χ_ε‖²_{s₁} (1 + ρ²/ε²)^{−m} ε^{−2s} dε/ε + ‖u‖²_t ) ≤ C′_{s,s₁,t} ‖u‖²_{(s+s₁,m,ρ)}.

Now let $M$ be an $n$-dimensional, compact $C^\infty$ manifold without boundary. If $m > 0$ and $0 < \rho < 1$, we define a norm $\|\cdot\|_{(s,m,\rho)}$ on the Sobolev space $H^{s-m}(M)$ by the formula
$$
\|u\|_{(s,m,\rho)} = \inf_{\substack{u' + u'' = u \\ u', u'' \in \mathcal{D}'(M)}} \left( \frac{1}{\rho^m} \|u'\|_{s-m} + \|u''\|_s \right).
$$
Then the above results are also true for the spaces $H^s(M)$. More precisely, we have the following results (cf. Hörmander [84, 85]):

(1) The norm $\|\cdot\|_{(s,m,\rho)}$ increases as $\rho \downarrow 0$, and we have the formula
$$
\|u\|_s = \sup_{0 < \rho < 1} \|u\|_{(s,m,\rho)} \quad \text{if } u \in H^s(M).
$$

(2) Assume that the function $\chi$ satisfies condition (2) for $k > s$. Then, for any $s_1 \in \mathbf{R}$ and $t < s + s_1 - m$, there exist constants $C_{s,s_1,t} > 0$ and $C'_{s,s_1,t} > 0$, independent of $\rho$, such that we have, for all $u \in H^{s+s_1-m}(M)$ with support in $K$,
$$
\|u\|_{(s+s_1,m,\rho)}^2 \le C_{s,s_1,t} \left( \int_0^1 \|u * \chi_\varepsilon\|_{s_1}^2 \left(1 + \frac{\rho^2}{\varepsilon^2}\right)^{-m} \varepsilon^{-2s}\, \frac{d\varepsilon}{\varepsilon} + \|u\|_t^2 \right) \le C'_{s,s_1,t}\, \|u\|_{(s+s_1,m,\rho)}^2.
$$

8.9 Friedrichs’ Mollifiers and Differential Operators

The purpose of this section is to study the behaviour of the commutator of a differential operator $P(x,D)$ with a regularization $\rho_\varepsilon *$. More precisely, we prove the following theorem, which characterizes the operator norms of the commutator $[P, \rho_\varepsilon *]$ in the framework of Sobolev spaces $H^s(\mathbf{R}^n)$ as $\varepsilon \downarrow 0$ (see [85, Remark to Lemma 1.4.5], [35, Chapitre IV, Sect. 10]):

Theorem 8.34 Let
$$
P = P(x,D) = \sum_{|\alpha| \le m} a_\alpha(x)\, D^\alpha \quad \text{for } a_\alpha \in C_0^\infty(\mathbf{R}^n)
$$
be a differential operator of order $m$ with symbol
$$
p(x,\xi) = \sum_{|\alpha| \le m} a_\alpha(x)\, \xi^\alpha,
$$
and let $\rho \in C_0^\infty(\mathbf{R}^n)$. For every $\varepsilon > 0$, we define the commutator $[P, \rho_\varepsilon *]$ by the formula
$$
[P, \rho_\varepsilon *]\, u := P(\rho_\varepsilon * u) - \rho_\varepsilon * (Pu),
$$
where the $\{\rho_\varepsilon\}$ are Friedrichs’ mollifiers. Then, for any real number $s$ and any non-negative number $\sigma$ we can find a constant $C = C(s,\sigma) > 0$, independent of $\varepsilon$, such that we have, for every $u \in H^{s+m}(\mathbf{R}^n)$,
$$
\|[P, \rho_\varepsilon *]\, u\|_{s+\sigma} \le C\, \varepsilon^{1-\sigma}\, \|u\|_{s+m}. \tag{8.17}
$$


Proof The proof of Theorem 8.34 is divided into two steps.

Step I: First, by using the Fourier transform we can write down $Pv(x)$ for $v \in \mathcal{S}(\mathbf{R}^n)$ in the form
$$
Pv(x) = \frac{1}{(2\pi)^n} \int_{\mathbf{R}^n} e^{ix\xi}\, p(x,\xi)\, \hat v(\xi)\, d\xi = \frac{1}{(2\pi)^n} \int_{\mathbf{R}^n} e^{ix\xi} \sum_{|\alpha|\le m} a_\alpha(x)\, \xi^\alpha\, \hat v(\xi)\, d\xi
= \frac{1}{(2\pi)^{2n}} \iint_{\mathbf{R}^n\times\mathbf{R}^n} e^{ix\eta} \sum_{|\alpha|\le m} \hat a_\alpha(\eta-\xi)\, \xi^\alpha\, \hat v(\xi)\, d\xi\, d\eta.
$$
Hence we have the formulas
$$
P(\rho_\varepsilon * u)(x) = \frac{1}{(2\pi)^{2n}} \iint_{\mathbf{R}^n\times\mathbf{R}^n} e^{ix\eta} \sum_{|\alpha|\le m} \hat a_\alpha(\eta-\xi)\, \xi^\alpha\, \widehat{\rho_\varepsilon * u}(\xi)\, d\xi\, d\eta
= \frac{1}{(2\pi)^{2n}} \iint_{\mathbf{R}^n\times\mathbf{R}^n} e^{ix\eta} \sum_{|\alpha|\le m} \hat a_\alpha(\eta-\xi)\, \xi^\alpha\, \hat\rho_\varepsilon(\xi)\, \hat u(\xi)\, d\xi\, d\eta
= \frac{1}{(2\pi)^{2n}} \iint_{\mathbf{R}^n\times\mathbf{R}^n} e^{ix\eta} \sum_{|\alpha|\le m} \hat a_\alpha(\eta-\xi)\, \xi^\alpha\, \hat\rho(\varepsilon\xi)\, \hat u(\xi)\, d\xi\, d\eta \tag{8.18}
$$
and
$$
\rho_\varepsilon * (Pu)(x) = \frac{1}{(2\pi)^n} \int_{\mathbf{R}^n} e^{ix\eta}\, \widehat{\rho_\varepsilon * (Pu)}(\eta)\, d\eta
= \frac{1}{(2\pi)^n} \int_{\mathbf{R}^n} e^{ix\eta}\, \hat\rho(\varepsilon\eta)\, \widehat{Pu}(\eta)\, d\eta
= \frac{1}{(2\pi)^{2n}} \iint_{\mathbf{R}^n\times\mathbf{R}^n} e^{ix\eta} \sum_{|\alpha|\le m} \hat a_\alpha(\eta-\xi)\, \xi^\alpha\, \hat\rho(\varepsilon\eta)\, \hat u(\xi)\, d\xi\, d\eta. \tag{8.19}
$$
If we let $R_1(\varepsilon)u = [P, \rho_\varepsilon *]\, u = P(\rho_\varepsilon * u) - \rho_\varepsilon * (Pu)$, then it follows from formulas (8.18) and (8.19) that
$$
R_1(\varepsilon)u = \frac{1}{(2\pi)^{2n}} \iint_{\mathbf{R}^n\times\mathbf{R}^n} e^{ix\eta} \sum_{|\alpha|\le m} \hat a_\alpha(\eta-\xi)\, \xi^\alpha \left( \hat\rho(\varepsilon\xi) - \hat\rho(\varepsilon\eta) \right) \hat u(\xi)\, d\xi\, d\eta
= \frac{1}{(2\pi)^{2n}} \iint_{\mathbf{R}^n\times\mathbf{R}^n} e^{ix\eta}\, \tilde p(\eta-\xi, \xi) \left( \hat\rho(\varepsilon\xi) - \hat\rho(\varepsilon\eta) \right) \hat u(\xi)\, d\xi\, d\eta,
$$
where
$$
\tilde p(\eta-\xi, \xi) := \sum_{|\alpha|\le m} \hat a_\alpha(\eta-\xi)\, \xi^\alpha.
$$
However, by using Taylor’s formula we can express the function $\hat\rho(\varepsilon\xi) - \hat\rho(\varepsilon\eta)$ as follows:
$$
r_1(\xi,\eta,\varepsilon) := \hat\rho(\varepsilon\xi) - \hat\rho(\varepsilon\eta) = -\int_0^1 \frac{d}{dt}\, \hat\rho\big(\varepsilon\xi + t(\varepsilon\eta - \varepsilon\xi)\big)\, dt
= -\sum_{j=1}^n \int_0^1 \frac{\partial\hat\rho}{\partial\eta_j}\big(\varepsilon\xi + t(\varepsilon\eta - \varepsilon\xi)\big)\, dt \cdot \varepsilon\,(\eta_j - \xi_j). \tag{8.20}
$$
Hence we have the estimate
$$
|r_1(\xi,\eta,\varepsilon)| = |\hat\rho(\varepsilon\xi) - \hat\rho(\varepsilon\eta)| \le \varepsilon\, |\xi - \eta| \quad \text{for all } \xi, \eta \in \mathbf{R}^n. \tag{8.21}
$$
Moreover, we have, for all $|\xi - \eta| \le \frac12 |\xi|$,
$$
|\varepsilon\xi + t(\varepsilon\eta - \varepsilon\xi)| \ge \varepsilon|\xi| - t\,|\varepsilon\eta - \varepsilon\xi| \ge \varepsilon|\xi| - \frac{\varepsilon}{2}|\xi| = \frac{\varepsilon}{2}|\xi|.
$$
Therefore, for each non-negative integer $k$ we can find a constant $C_k > 0$ such that we have, for all $|\xi - \eta| \le \frac12|\xi|$,
$$
(1 + \varepsilon|\xi|)^k \left| \frac{\partial\hat\rho}{\partial\eta_j}\big(\varepsilon\xi + t(\varepsilon\eta - \varepsilon\xi)\big) \right|
\le 2^k\, \big(1 + |\varepsilon\xi + t(\varepsilon\eta - \varepsilon\xi)|\big)^k \left| \frac{\partial\hat\rho}{\partial\eta_j}\big(\varepsilon\xi + t(\varepsilon\eta - \varepsilon\xi)\big) \right| \le C_k.
$$
In view of formula (8.20), this proves that
$$
|r_1(\xi,\eta,\varepsilon)| = |\hat\rho(\varepsilon\xi) - \hat\rho(\varepsilon\eta)| \le C_k\, \varepsilon\, |\xi - \eta|\, (1 + \varepsilon|\xi|)^{-k} \quad \text{for all } |\xi - \eta| \le \frac12|\xi|. \tag{8.22}
$$
On the other hand, for each non-negative integer $h$ we can find a constant $C_h > 0$ such that we have, for all $\xi, \eta \in \mathbf{R}^n$,
$$
|\tilde p(\eta-\xi,\xi)| = \Big| \sum_{|\alpha|\le m} \hat a_\alpha(\eta-\xi)\, \xi^\alpha \Big| \le C_h\, (1+|\xi|)^m\, (1+|\eta-\xi|)^{-h}. \tag{8.23}
$$
Step II: Now we can estimate the Sobolev norm of the integral operator


$$
R_1(\varepsilon)u = \frac{1}{(2\pi)^{2n}} \iint_{\mathbf{R}^n\times\mathbf{R}^n} e^{ix\eta}\, \tilde p(\eta-\xi,\xi)\, r_1(\xi,\eta,\varepsilon)\, \hat u(\xi)\, d\xi\, d\eta.
$$
To do this, we consider the scalar product
$$
\langle R_1(\varepsilon)u, v \rangle = \frac{1}{(2\pi)^n} \int_{\mathbf{R}^n} \widehat{R_1(\varepsilon)u}(\eta)\, \hat v(-\eta)\, d\eta
= \frac{1}{(2\pi)^{2n}} \iint_{\mathbf{R}^n\times\mathbf{R}^n} \tilde p(\eta-\xi,\xi)\, r_1(\xi,\eta,\varepsilon)\, \hat u(\xi)\, \hat v(-\eta)\, d\xi\, d\eta
$$
for all $u, v \in \mathcal{S}(\mathbf{R}^n)$. If we introduce two functions
$$
U(\xi) := \left(1+|\xi|^2\right)^{(s+m)/2} |\hat u(\xi)| \quad \text{and} \quad V(\eta) := \left(1+|\eta|^2\right)^{-(s+\sigma)/2} |\hat v(-\eta)|,
$$
then, by using estimates (8.21), (8.22) and (8.23) we obtain that
$$
|\langle R_1(\varepsilon)u, v\rangle| \le \frac{1}{(2\pi)^{2n}} \iint_{\mathbf{R}^n\times\mathbf{R}^n} |\tilde p(\eta-\xi,\xi)|\, |r_1(\xi,\eta,\varepsilon)|\, |\hat u(\xi)|\, |\hat v(-\eta)|\, d\xi\, d\eta \tag{8.24}
$$
$$
= \frac{1}{(2\pi)^{2n}} \iint |\tilde p(\eta-\xi,\xi)|\, |r_1(\xi,\eta,\varepsilon)| \left(1+|\xi|^2\right)^{-(s+m)/2} U(\xi) \cdot \left(1+|\eta|^2\right)^{(s+\sigma)/2} V(\eta)\, d\xi\, d\eta
$$
$$
\le \frac{1}{(2\pi)^{2n}} \iint C_h \left(1+|\xi|^2\right)^{m/2} \left(1+|\eta-\xi|^2\right)^{-h/2} |r_1(\xi,\eta,\varepsilon)| \left(1+|\xi|^2\right)^{-(s+m)/2} U(\xi) \cdot \left(1+|\eta|^2\right)^{(s+\sigma)/2} V(\eta)\, d\xi\, d\eta.
$$
However, by using the Peetre inequality
$$
\left(1+|\eta|^2\right)^{(s+\sigma)/2} \le 2^{|s+\sigma|/2} \left(1+|\xi|^2\right)^{(s+\sigma)/2} \left(1+|\xi-\eta|^2\right)^{|s+\sigma|/2} \quad \text{for all } \xi, \eta \in \mathbf{R}^n,
$$
we have the inequality
$$
\left(1+|\xi|^2\right)^{m/2} \left(1+|\eta-\xi|^2\right)^{-h/2} \left(1+|\xi|^2\right)^{-(s+m)/2} \left(1+|\eta|^2\right)^{(s+\sigma)/2}
$$
$$
\le 2^{|s+\sigma|/2} \left(1+|\xi|^2\right)^{m/2} \left(1+|\eta-\xi|^2\right)^{-h/2} \left(1+|\xi|^2\right)^{-(s+m)/2} \left(1+|\xi|^2\right)^{(s+\sigma)/2} \left(1+|\xi-\eta|^2\right)^{|s+\sigma|/2}
$$
$$
= 2^{|s+\sigma|/2} \left(1+|\xi|^2\right)^{\sigma/2} \left(1+|\xi-\eta|^2\right)^{-h/2+|s+\sigma|/2}
\le C\, (1+|\xi|)^\sigma\, (1+|\xi-\eta|)^{|s|+\sigma-h}, \tag{8.25}
$$
with a constant $C > 0$ independent of $\xi, \eta \in \mathbf{R}^n$. Therefore, by combining inequalities (8.24) and (8.25) we obtain that
$$
|\langle R_1(\varepsilon)u, v\rangle| \le C_h' \iint_{\mathbf{R}^n\times\mathbf{R}^n} (1+|\xi|)^\sigma\, (1+|\eta-\xi|)^{|s|+\sigma-h}\, |r_1(\xi,\eta,\varepsilon)|\, U(\xi)\, V(\eta)\, d\xi\, d\eta, \tag{8.26}
$$

with a constant $C_h' > 0$ independent of $\xi, \eta \in \mathbf{R}^n$.

The case (A): $|\xi-\eta| \le \frac12|\xi|$. In this case, we have, by inequality (8.22) with $k := \sigma$,
$$
|r_1(\xi,\eta,\varepsilon)| \le C_\sigma\, \varepsilon\, |\xi-\eta|\, (1+\varepsilon|\xi|)^{-\sigma} \le C_\sigma\, \varepsilon^{1-\sigma}\, |\xi-\eta|\, (1+|\xi|)^{-\sigma} \quad \text{for all } |\xi-\eta| \le \frac12|\xi|,
$$
and so
$$
(1+|\xi|)^\sigma\, (1+|\eta-\xi|)^{|s|+\sigma-h}\, |r_1(\xi,\eta,\varepsilon)| \le C_\sigma\, \varepsilon^{1-\sigma}\, (1+|\eta-\xi|)^{|s|+\sigma+1-h} \quad \text{for all } |\xi-\eta| \le \frac12|\xi|.
$$
Consequently, by taking the positive integer $h$ sufficiently large so that $h \ge 1+|s|+\sigma$, we obtain that
$$
(1+|\xi|)^\sigma\, (1+|\eta-\xi|)^{|s|+\sigma-h}\, |r_1(\xi,\eta,\varepsilon)| \le C_\sigma\, \varepsilon^{1-\sigma} \quad \text{for all } |\xi-\eta| \le \frac12|\xi|. \tag{8.27}
$$


The case (B): $|\xi-\eta| \ge \frac12|\xi|$. In this case, since we have the inequality
$$
1+|\xi| \le 2|\xi-\eta| + 1 \le 2\,(1+|\xi-\eta|),
$$
we obtain from inequality (8.21) that
$$
(1+|\xi|)^\sigma\, (1+|\eta-\xi|)^{|s|+\sigma-h}\, |r_1(\xi,\eta,\varepsilon)| \le 2^\sigma\, \varepsilon\, |\eta-\xi|\, (1+|\eta-\xi|)^{|s|+2\sigma-h} \le 2^\sigma\, \varepsilon\, (1+|\eta-\xi|)^{1+|s|+2\sigma-h} \quad \text{for all } |\xi-\eta| \ge \frac12|\xi|.
$$
Consequently, by taking the positive integer $h$ sufficiently large so that $h \ge 1+|s|+2\sigma \ge 1+|s|+\sigma$, we obtain that
$$
(1+|\xi|)^\sigma\, (1+|\eta-\xi|)^{|s|+\sigma-h}\, |r_1(\xi,\eta,\varepsilon)| \le 2^\sigma\, \varepsilon \quad \text{for all } |\xi-\eta| \ge \frac12|\xi|. \tag{8.28}
$$

Therefore, by combining inequalities (8.26), (8.27) and (8.28) and using the Schwarz inequality we have the following inequality:
$$
|\langle R_1(\varepsilon)u, v\rangle| \le C'\, \varepsilon^{1-\sigma} \iint_{\mathbf{R}^n\times\mathbf{R}^n} U(\xi)\, V(\eta)\, d\xi\, d\eta
\le C'\, \varepsilon^{1-\sigma} \left( \int_{\mathbf{R}^n} |U(\xi)|^2\, d\xi \right)^{1/2} \left( \int_{\mathbf{R}^n} |V(\eta)|^2\, d\eta \right)^{1/2}
= C''\, \varepsilon^{1-\sigma}\, \|u\|_{s+m} \cdot \|v\|_{-s-\sigma},
$$
with a constant $C'' > 0$ independent of $\varepsilon$. This proves the desired inequality (8.17). Indeed, by the duality of the Sobolev spaces $H^{s+\sigma}(\mathbf{R}^n)$ and $H^{-s-\sigma}(\mathbf{R}^n)$ it suffices to note that
$$
\|R_1(\varepsilon)u\|_{s+\sigma} = \sup\left\{ \frac{|\langle R_1(\varepsilon)u, v\rangle|}{\|v\|_{-s-\sigma}} : v \in H^{-s-\sigma}(\mathbf{R}^n) \right\} \le C''\, \varepsilon^{1-\sigma}\, \|u\|_{s+m}.
$$
Now the proof of Theorem 8.34 is complete. $\square$

As a particular case, we have the following Friedrichs lemma:




Corollary 8.35 (Friedrichs) Let $P = P(x,D)$ be a differential operator of order $m$ as in Theorem 8.34. Assume that a function $\rho \in C_0^\infty(\mathbf{R}^n)$ satisfies the condition
$$
\hat\rho(0) = \int_{\mathbf{R}^n} \rho(x)\, dx = 1.
$$
Then there exists a constant $C > 0$, independent of $\varepsilon$, such that
$$
\|[P, \rho_\varepsilon *]\, u\|_{H^s(\mathbf{R}^n)} \le C\, \|u\|_{H^{s+m-1}(\mathbf{R}^n)} \quad \text{for all } u \in H^{s+m-1}(\mathbf{R}^n). \tag{8.29}
$$
Moreover, we have, as $\varepsilon \downarrow 0$,
$$
R_1(\varepsilon)u = [P, \rho_\varepsilon *]\, u \longrightarrow 0 \quad \text{in } H^s(\mathbf{R}^n)
$$
for every $u \in H^{s+m-1}(\mathbf{R}^n)$.

Proof The proof of Corollary 8.35 is divided into three steps.

Step I: The desired inequality (8.29) is obtained by letting $\sigma := 1$ and replacing $s$ by $s-1$ in inequality (8.17).

Step II: If $u \in \mathcal{S}(\mathbf{R}^n)$, it is easy to see that $\rho_\varepsilon * u \to u$ in $\mathcal{S}(\mathbf{R}^n)$ as $\varepsilon \downarrow 0$. Hence we have the assertion
$$
P(\rho_\varepsilon * u) = \sum_{|\alpha|\le m} a_\alpha(x)\, D^\alpha(\rho_\varepsilon * u) \longrightarrow \sum_{|\alpha|\le m} a_\alpha(x)\, D^\alpha u = Pu
$$
in $\mathcal{S}(\mathbf{R}^n)$ as $\varepsilon \downarrow 0$. Similarly, since $Pu \in C_0^\infty(\mathbf{R}^n)$, it follows that $\rho_\varepsilon * (Pu) \to Pu$ in $\mathcal{S}(\mathbf{R}^n)$ as $\varepsilon \downarrow 0$. Therefore, we have, as $\varepsilon \downarrow 0$,
$$
R_1(\varepsilon)u = [P, \rho_\varepsilon *]\, u = P(\rho_\varepsilon * u) - \rho_\varepsilon * (Pu) \longrightarrow 0 \quad \text{in } \mathcal{S}(\mathbf{R}^n)
$$
for every $u \in \mathcal{S}(\mathbf{R}^n)$, and so
$$
R_1(\varepsilon)u \longrightarrow 0 \quad \text{in } H^s(\mathbf{R}^n) \text{ for every } u \in \mathcal{S}(\mathbf{R}^n). \tag{8.30}
$$


Step III: Let $u$ be an arbitrary function in the space $H^{s+m-1}(\mathbf{R}^n)$. Since the Schwartz space $\mathcal{S}(\mathbf{R}^n)$ is dense in $H^{s+m-1}(\mathbf{R}^n)$, for any given $\delta > 0$ we can find a function $v_\delta \in \mathcal{S}(\mathbf{R}^n)$ such that
$$
\|u - v_\delta\|_{s+m-1} < \delta.
$$
Then we have, by inequality (8.29),
$$
\|R_1(\varepsilon)u\|_s \le \|R_1(\varepsilon)v_\delta\|_s + \|R_1(\varepsilon)(u - v_\delta)\|_s \le \|R_1(\varepsilon)v_\delta\|_s + C\, \|u - v_\delta\|_{s+m-1} \le \|R_1(\varepsilon)v_\delta\|_s + C\delta.
$$
Hence it follows from assertion (8.30) with $u := v_\delta$ that
$$
\limsup_{\varepsilon\downarrow 0} \|R_1(\varepsilon)u\|_s \le C\delta.
$$
This proves that
$$
\lim_{\varepsilon\downarrow 0} \|R_1(\varepsilon)u\|_s = 0 \quad \text{for every } u \in H^{s+m-1}(\mathbf{R}^n),
$$
since $\delta$ is arbitrary. The proof of Corollary 8.35 is complete. $\square$
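The convergence of Corollary 8.35 can be illustrated numerically in one dimension. The sketch below is only a discrete analogue on a periodic grid: it takes $P = a(x)D$ with $D = -i\,d/dx$ (so $m = 1$) and a Gaussian mollifier, realized as the Fourier multiplier $e^{-(\varepsilon\xi)^2/2}$; the coefficient, the grid and the test function are arbitrary choices of this sketch.

```python
import numpy as np

# Discrete illustration of the Friedrichs lemma (Corollary 8.35):
# the commutator [P, rho_eps *] u tends to 0 as eps decreases.
# P = a(x) D with D = -i d/dx; mollification acts as the Fourier
# multiplier exp(-(eps*xi)^2 / 2) on a periodic grid.

N = 512
x = np.linspace(0.0, 2.0 * np.pi, N, endpoint=False)
xi = np.fft.fftfreq(N, d=1.0 / N)
a = 1.0 + 0.5 * np.sin(x)            # variable coefficient
u = np.cos(3 * x) + np.sin(7 * x)    # band-limited test function

def D(v):
    """Spectral derivative D = -i d/dx."""
    return np.fft.ifft(xi * np.fft.fft(v))

def mollify(v, eps):
    """Gaussian mollification v -> rho_eps * v."""
    return np.fft.ifft(np.exp(-0.5 * (eps * xi) ** 2) * np.fft.fft(v))

def commutator_norm(eps):
    """Discrete L^2 norm of [P, rho_eps *] u for P = a(x) D."""
    r = a * D(mollify(u, eps)) - mollify(a * D(u), eps)
    return float(np.sqrt(np.mean(np.abs(r) ** 2)))

eps_list = [0.1, 0.05, 0.025, 0.0125]
norms = [commutator_norm(e) for e in eps_list]
```

On this grid the commutator norm shrinks roughly quadratically in $\varepsilon$ for the smooth test function, consistent with (and stronger than) the $O(\varepsilon)$ bound of (8.17) with $\sigma = 0$.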



Remark 8.36 All the above results remain valid when the symbol of $P$ depends continuously on a parameter varying within a compact subset. Indeed, it suffices to note that we can choose the constant $C_h$ in estimate (8.23) independent of this parameter.

As an application of Corollary 8.35, we can prove the equivalence between a weak solution in the distributional sense and a strong solution. Indeed, we have the following result:

Corollary 8.37 Let $P = P(x,D)$ be a differential operator of order $m$ as in Theorem 8.34. Assume that a function $\rho \in C_0^\infty(\mathbf{R}^n)$ satisfies the condition
$$
\hat\rho(0) = \int_{\mathbf{R}^n} \rho(x)\, dx = 1.
$$
If a function $u \in H^{m-1}(\mathbf{R}^n)$ satisfies the equation $Pu = f$ for $f \in L^2(\mathbf{R}^n)$ in the distributional sense, then we can construct a sequence
$$
\{u_j\} \subset C^\infty(\mathbf{R}^n) \cap H^{m-1}(\mathbf{R}^n)
$$


such that we have, as $j \to \infty$,
$$
u_j \longrightarrow u \quad \text{in } H^{m-1}(\mathbf{R}^n), \qquad P u_j \longrightarrow Pu = f \quad \text{in } L^2(\mathbf{R}^n).
$$
In this case, the function $u$ is called a strong solution of the equation $Pu = f$.

Proof For every positive integer $j$, we let
$$
u_j(x) := u * \rho_{1/j}(x) = j^n \int_{\mathbf{R}^n} \rho(j(x-y))\, u(y)\, dy.
$$
Then it is easy to verify that $u_j \in C^\infty(\mathbf{R}^n) \cap H^{m-1}(\mathbf{R}^n)$, and further that $u_j \to u$ in $H^{m-1}(\mathbf{R}^n)$ as $j \to \infty$. Moreover, we have the formula
$$
P u_j = P(\rho_{1/j} * u) = \rho_{1/j} * (Pu) + \left( P(\rho_{1/j} * u) - \rho_{1/j} * (Pu) \right) = \rho_{1/j} * (Pu) + [P, \rho_{1/j} *]\, u. \tag{8.31}
$$

However, it follows from an application of Corollary 8.35 with $s := 0$ that
$$
[P, \rho_{1/j} *]\, u \longrightarrow 0 \quad \text{in } L^2(\mathbf{R}^n) \text{ as } j \to \infty.
$$
Since $Pu = f \in L^2(\mathbf{R}^n)$, we have, as $j \to \infty$,
$$
\rho_{1/j} * (Pu) \longrightarrow Pu \quad \text{in } L^2(\mathbf{R}^n).
$$
Therefore, we obtain from formula (8.31) that
$$
P u_j \longrightarrow Pu = f \quad \text{in } L^2(\mathbf{R}^n) \text{ as } j \to \infty.
$$
The proof of Corollary 8.37 is complete. $\square$



We conclude this section with the following remark:

Remark 8.38 All the above discussion generalizes word for word to the case where $P$ is a matrix-valued differential operator.


8.10 Notes and Comments

Our treatment of $L^2$ Sobolev spaces is adapted from Chazarain–Piriou [35], Folland [62] and also Hörmander [84] in such a way as to make it accessible to graduate students and advanced undergraduates as well. For more leisurely treatments of function spaces, the readers might be referred to Adams–Fournier [2], Amann [12], Bergh–Löfström [18], Lions–Magenes [116] and Triebel [224, 225].

Section 8.1: Rellich’s theorem (Theorem 8.5) is taken from Chazarain–Piriou [35, Chapitre II, Théorème 2.10] and Folland [62, Theorem (6.14)].

Section 8.2: Theorem 8.7 is taken from Chazarain–Piriou [35, Chapitre II, Théorème 2.6].

Section 8.3: Mollifiers were introduced by Ogura [136, Theorem II] and Friedrichs [68, Sect. 2].

Section 8.5: Rellich’s theorem (Theorem 8.20) is taken from Chazarain–Piriou [35, Chapitre II, Proposition 3.4] and Folland [62, Theorem (6.46)].

Section 8.6: Amann [12, Chap. VIII] proves various trace theorems that are fundamental for the theory of boundary value problems for elliptic and parabolic differential equations.

Section 8.7: The sectional trace theorem (Theorem 8.27) is due to Hörmander [85].

Section 8.8: The two-parameter family $\|\cdot\|_{(s,1,\rho)}$ of norms was introduced by Hörmander [84], and was used to prove regularity theorems for linear partial differential equations. See also Hörmander [85], Fediĭ [55] and Oleĭnik–Radkevič [138].

Section 8.9: This section is adapted from Hörmander [85, Remark to Lemma 1.4.5] and Chazarain–Piriou [35, Chapitre IV, Sect. 10].

Chapter 9

L 2 Theory of Pseudo-differential Operators

In recent years there has been a trend in the theory of partial differential equations towards constructive methods. The development of the theory of pseudo-differential operators has made possible such an approach to the study of (non-)elliptic differential operators. The class of pseudo-differential operators is essentially the smallest algebra of operators which contains all differential operators, all fundamental solutions of elliptic differential operators and all integral operators with smooth kernel. In this chapter we define pseudo-differential operators and study their basic properties such as the behavior of transposes, adjoints and compositions of such operators, and the effect of a change of coordinates on such operators. Furthermore, we discuss in detail, via functional analysis, the behavior of elliptic pseudo-differential operators on Sobolev spaces, and formulate the sectional trace theorem (Theorem 9.42) and classical surface and volume potentials (Theorems 9.48 and 9.49) in terms of pseudo-differential operators. This calculus of pseudo-differential operators will be applied to elliptic boundary value problems in Chap. 11. Finally, we give Gårding’s inequality and related inequalities (Theorems 9.51 and 9.53), and describe three classes of hypoelliptic pseudo-differential operators (Theorems 9.56, 9.58 and 9.60), which arise in the construction of Feller semigroups in Chap. 13.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. K. Taira, Functional Analytic Techniques for Diffusion Processes, Springer Monographs in Mathematics, https://doi.org/10.1007/978-981-19-1099-9_9

9.1 Symbol Classes

Let $\Omega$ be an open subset of $\mathbf{R}^n$. If
$$
P(x,D) = \sum_{|\alpha|\le m} a_\alpha(x)\, D^\alpha
$$
is a differential operator of order $m$ with $C^\infty$ coefficients on $\Omega$, then we have, by Theorem 7.21,
$$
P(x,D)u(x) = \frac{1}{(2\pi)^n} \int_{\mathbf{R}^n} e^{ix\cdot\xi}\, p(x,\xi)\, \hat u(\xi)\, d\xi \quad \text{for } u \in C_0^\infty(\Omega), \tag{9.1}
$$
where
$$
p(x,\xi) = \sum_{|\alpha|\le m} a_\alpha(x)\, \xi^\alpha.
$$

We make use of the Fourier integral representation (9.1) to define pseudo-differential operators, by taking $p(x,\xi)$ to belong to a wider class of functions than polynomials. If $m \in \mathbf{R}$ and $0 \le \delta < \rho \le 1$, we let $S^m_{\rho,\delta}(\Omega\times\mathbf{R}^N)$ denote the set of all functions $a(x,\theta) \in C^\infty(\Omega\times\mathbf{R}^N)$ with the property that, for any compact $K \subset \Omega$ and any multi-indices $\alpha$, $\beta$, there exists a constant $C_{K,\alpha,\beta} > 0$ such that, for all $x \in K$ and $\theta \in \mathbf{R}^N$,
$$
\left| \partial_\theta^\alpha \partial_x^\beta a(x,\theta) \right| \le C_{K,\alpha,\beta}\, (1+|\theta|)^{m-\rho|\alpha|+\delta|\beta|}.
$$
The elements of $S^m_{\rho,\delta}(\Omega\times\mathbf{R}^N)$ are called symbols of order $m$. We drop the $\Omega\times\mathbf{R}^N$ and use $S^m_{\rho,\delta}$ when the context is clear.

Example 9.1 (1) A polynomial $p(x,\xi) = \sum_{|\alpha|\le m} a_\alpha(x)\xi^\alpha$ of order $m$ with coefficients in $C^\infty(\Omega)$ is in the class $S^m_{1,0}(\Omega\times\mathbf{R}^n)$.
(2) If $m \in \mathbf{R}$, the function
$$
\Omega\times\mathbf{R}^n \ni (x,\xi) \longmapsto \left(1+|\xi|^2\right)^{m/2}
$$
is in the class $S^m_{1,0}(\Omega\times\mathbf{R}^n)$.
(3) A function $a(x,\theta) \in C^\infty(\Omega\times(\mathbf{R}^N\setminus\{0\}))$ is said to be positively homogeneous of degree $m$ in $\theta$ if it satisfies the condition
$$
a(x,t\theta) = t^m a(x,\theta) \quad \text{for all } t > 0.
$$
If $a(x,\theta)$ is positively homogeneous of degree $m$ in $\theta$ and if $\varphi(\theta)$ is a smooth function such that $\varphi(\theta) = 0$ for $|\theta| \le 1/2$ and $\varphi(\theta) = 1$ for $|\theta| \ge 1$, then the function $\varphi(\theta)a(x,\theta)$ is in the class $S^m_{1,0}(\Omega\times\mathbf{R}^N)$.

If $K$ is a compact subset of $\Omega$ and $j$ is a non-negative integer, we define a seminorm $p_{K,j,m}$ on $S^m_{\rho,\delta}(\Omega\times\mathbf{R}^N)$ by the formula

$$
S^m_{\rho,\delta}(\Omega\times\mathbf{R}^N) \ni a \longmapsto p_{K,j,m}(a) = \sup_{\substack{x\in K,\ \theta\in\mathbf{R}^N \\ |\alpha|+|\beta|\le j}} \frac{\left| \partial_\theta^\alpha \partial_x^\beta a(x,\theta) \right|}{(1+|\theta|)^{m-\rho|\alpha|+\delta|\beta|}}.
$$
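For the model symbol $a(x,\xi) = (1+|\xi|^2)^{m/2}$ of Example 9.1 (2), the symbol estimates and the seminorm $p_{K,j,m}$ can be checked concretely. The sketch below works in one frequency variable with the arbitrary choice $m = 3/2$; the derivatives in $\xi$ are computed analytically, and the supremum is approximated by a maximum over a large finite grid.

```python
import numpy as np

# Discrete check of the symbol estimates for a(xi) = (1 + xi^2)^(m/2),
# an x-independent symbol, so only theta-derivatives appear.  The grid
# and the value m = 1.5 are arbitrary choices of this sketch.

m = 1.5
xi = np.linspace(-1.0e4, 1.0e4, 200001)

a0 = (1.0 + xi**2) ** (m / 2.0)                        # a
a1 = m * xi * (1.0 + xi**2) ** (m / 2.0 - 1.0)         # d a / d xi
a2 = (m * (1.0 + xi**2) ** (m / 2.0 - 1.0)             # d^2 a / d xi^2
      + m * (m - 2.0) * xi**2 * (1.0 + xi**2) ** (m / 2.0 - 2.0))

# ratios |d^k a| / (1 + |xi|)^(m - k); boundedness is the symbol estimate
r0 = float(np.max(np.abs(a0) / (1.0 + np.abs(xi)) ** m))
r1 = float(np.max(np.abs(a1) / (1.0 + np.abs(xi)) ** (m - 1.0)))
r2 = float(np.max(np.abs(a2) / (1.0 + np.abs(xi)) ** (m - 2.0)))
p_K2m = max(r0, r1, r2)   # discrete analogue of the seminorm p_{K,2,m}
```

Each ratio stays bounded over the whole grid, which is exactly the content of the defining estimate for $S^m_{1,0}$.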

We equip the space $S^m_{\rho,\delta}(\Omega\times\mathbf{R}^N)$ with the topology defined by the family $\{p_{K,j,m}\}$ of seminorms, where $K$ ranges over all compact subsets of $\Omega$ and $j = 0, 1, \ldots$. The space $S^m_{\rho,\delta}(\Omega\times\mathbf{R}^N)$ is a Fréchet space. We set
$$
S^\infty(\Omega\times\mathbf{R}^N) = \bigcup_{m\in\mathbf{R}} S^m_{\rho,\delta}(\Omega\times\mathbf{R}^N), \qquad S^{-\infty}(\Omega\times\mathbf{R}^N) = \bigcap_{m\in\mathbf{R}} S^m_{\rho,\delta}(\Omega\times\mathbf{R}^N).
$$
For example, if $\varphi(\xi) \in \mathcal{S}(\mathbf{R}^N)$, then it follows that $\varphi(\xi) \in S^{-\infty}(\Omega\times\mathbf{R}^N)$. More precisely, we have the formula
$$
S^{-\infty}(\Omega\times\mathbf{R}^N) = C^\infty(\Omega)\, \widehat\otimes_\pi\, \mathcal{S}(\mathbf{R}^N),
$$
where the space $C^\infty(\Omega)\,\widehat\otimes_\pi\,\mathcal{S}(\mathbf{R}^N)$ is the completed $\pi$-topology (or projective topology) tensor product of the Fréchet spaces $C^\infty(\Omega)$ and $\mathcal{S}(\mathbf{R}^N)$ (see [157, Chap. III, Sect. 6], [223, Chap. 45]).

We list some facts which follow at once:

(1) $m \le m' \implies S^{-\infty} \subset S^m \subset S^{m'} \subset S^\infty$.
(2) $a \in S^m_{\rho,\delta} \implies \partial_\theta^\alpha \partial_x^\beta a \in S^{m-\rho|\alpha|+\delta|\beta|}_{\rho,\delta}$.
(3) $a \in S^m$, $b \in S^{m'} \implies ab \in S^{m+m'}$.

In particular, it follows that $S^\infty$ is a commutative algebra and that $S^{-\infty}$ is an ideal of $S^\infty$.

The next theorem gives a meaning to a formal sum of symbols of decreasing order:

Theorem 9.2 Let $a_j(x,\theta) \in S^{m_j}_{\rho,\delta}(\Omega\times\mathbf{R}^N)$, $m_j \downarrow -\infty$, $j = 0, 1, \ldots$. Then there exists a symbol $a(x,\theta) \in S^{m_0}_{\rho,\delta}(\Omega\times\mathbf{R}^N)$, unique modulo $S^{-\infty}(\Omega\times\mathbf{R}^N)$, such that we have, for all $k > 0$,
$$
a(x,\theta) - \sum_{j=0}^{k-1} a_j(x,\theta) \in S^{m_k}_{\rho,\delta}(\Omega\times\mathbf{R}^N). \tag{9.2}
$$
If formula (9.2) holds true, we write
$$
a(x,\theta) \sim \sum_{j=0}^{\infty} a_j(x,\theta).
$$


The formal sum $\sum_j a_j(x,\theta)$ is called an asymptotic expansion of $a(x,\theta)$.

A symbol $a(x,\theta) \in S^m_{1,0}(\Omega\times\mathbf{R}^N)$ is said to be classical if there exist smooth functions $a_j(x,\theta)$, positively homogeneous of degree $m-j$ in $\theta$ for $|\theta| \ge 1$, such that
$$
a(x,\theta) \sim \sum_{j=0}^{\infty} a_j(x,\theta).
$$
We remark that the homogeneous functions $a_j(x,\theta)$ are uniquely determined (for $|\theta| \ge 1$) by $a(x,\theta)$. The homogeneous function $a_0(x,\theta)$ of degree $m$ is called the principal part of $a(x,\theta)$. We let
$$
S^m_{\mathrm{cl}}(\Omega\times\mathbf{R}^N) = \text{the set of all classical symbols of order } m.
$$
It should be emphasized that the subspace $S^m_{\mathrm{cl}}(\mathbf{R}^N)$, defined as the set of all $x$-independent elements of $S^m_{\mathrm{cl}}(\Omega\times\mathbf{R}^N)$, is closed in the induced topology, and we have the formula
$$
S^m_{\mathrm{cl}}(\Omega\times\mathbf{R}^N) = C^\infty(\Omega)\, \widehat\otimes_\pi\, S^m_{\mathrm{cl}}(\mathbf{R}^N).
$$
Example 9.3 The symbols in Example 9.1 are all classical, and they have respectively as principal part the following functions:
(1) $p_m(x,\xi) = \sum_{|\alpha|=m} a_\alpha(x)\xi^\alpha$.
(2) $|\xi|^m$.
(3) $a(x,\theta)$.

A symbol $a(x,\theta)$ in $S^m_{\rho,\delta}(\Omega\times\mathbf{R}^N)$ is said to be elliptic of order $m$ if there exists a symbol $b(x,\theta) \in S^{-m}(\Omega\times\mathbf{R}^N)$ such that
$$
a(x,\theta)\, b(x,\theta) \equiv 1 \quad \operatorname{mod} S^{-1}(\Omega\times\mathbf{R}^N).
$$
We give a useful criterion for ellipticity:

Theorem 9.4 A symbol $a(x,\theta)$ in $S^m(\Omega\times\mathbf{R}^N)$ is elliptic if and only if, for any compact $K \subset \Omega$ there exists a constant $C_K > 0$ such that
$$
|a(x,\theta)| \ge C_K\, (1+|\theta|)^m \quad \text{for all } x \in K \text{ and } |\theta| \ge \frac{1}{C_K}.
$$
There is a simple criterion in the case of classical symbols:

Corollary 9.5 Let $a(x,\theta)$ be in $S^m_{\mathrm{cl}}(\Omega\times\mathbf{R}^N)$ with principal part $a_0(x,\theta)$. Then $a(x,\theta)$ is elliptic if and only if we have the condition
$$
a_0(x,\theta) \ne 0 \quad \text{for } x \in \Omega,\ |\theta| = 1. \tag{9.3}
$$


For example, a polynomial
$$
p(x,\xi) = \sum_{|\alpha|\le m} a_\alpha(x)\, \xi^\alpha
$$
of order $m$ is elliptic if and only if we have the condition
$$
p_m(x,\xi) = \sum_{|\alpha|=m} a_\alpha(x)\, \xi^\alpha \ne 0 \quad \text{for all } (x,\xi) \in \Omega\times(\mathbf{R}^n\setminus\{0\}).
$$

9.2 Phase Functions

Let $\Omega$ be an open subset of $\mathbf{R}^n$. A function $\varphi(x,\theta) \in C^\infty(\Omega\times(\mathbf{R}^N\setminus\{0\}))$ is called a phase function on $\Omega\times(\mathbf{R}^N\setminus\{0\})$ if it satisfies the following three conditions:
(a) $\varphi(x,\theta)$ is real-valued.
(b) $\varphi(x,\theta)$ is positively homogeneous of degree one in the variable $\theta$.
(c) The differential $d\varphi$ does not vanish on the space $\Omega\times(\mathbf{R}^N\setminus\{0\})$.

Example 9.6 Let $U$ be an open subset of $\mathbf{R}^p$ and $\Omega = U\times U$. The function $\varphi(x,y,\xi) = (x-y)\cdot\xi$ is a phase function on the space $\Omega\times(\mathbf{R}^p\setminus\{0\})$ ($n = 2p$, $N = p$).

The next lemma will play a fundamental role in defining oscillatory integrals in Sect. 9.3:

Lemma 9.7 (Lax) If $\varphi(x,\theta)$ is a phase function on $\Omega\times(\mathbf{R}^N\setminus\{0\})$, then there exists a first order differential operator
$$
L = \sum_{j=1}^{N} a_j(x,\theta)\frac{\partial}{\partial\theta_j} + \sum_{k=1}^{n} b_k(x,\theta)\frac{\partial}{\partial x_k} + c(x,\theta)
$$
such that
$$
L\left(e^{i\varphi}\right) = e^{i\varphi},
$$
and its coefficients $a_j(x,\theta)$, $b_k(x,\theta)$, $c(x,\theta)$ enjoy the following properties:


$$
a_j(x,\theta) \in S^{0}_{1,0}; \qquad b_k(x,\theta),\ c(x,\theta) \in S^{-1}_{1,0}.
$$
Furthermore, the transpose $L'$ of $L$ has coefficients $a_j'(x,\theta)$, $b_k'(x,\theta)$, $c'(x,\theta)$ in the same symbol classes as $a_j(x,\theta)$, $b_k(x,\theta)$, $c(x,\theta)$, respectively.

For example, if $\varphi(x,y,\xi)$ is a phase function as in Example 9.6,
$$
\varphi(x,y,\xi) = (x-y)\cdot\xi \quad \text{for } (x,y) \in U\times U \text{ and } \xi \in \mathbf{R}^p\setminus\{0\},
$$
then the operator $L$ is given by the formula
$$
L = \frac{1-\rho(\xi)}{i\left(2 + |x-y|^2\right)} \left\{ \sum_{j=1}^{p} (x_j - y_j)\frac{\partial}{\partial\xi_j} + \sum_{k=1}^{p} \frac{\xi_k}{|\xi|^2}\frac{\partial}{\partial x_k} + \sum_{k=1}^{p} \frac{-\xi_k}{|\xi|^2}\frac{\partial}{\partial y_k} \right\} + \rho(\xi),
$$
where $\rho(\xi)$ is a function in $C_0^\infty(\mathbf{R}^p)$ such that $\rho(\xi) = 1$ for $|\xi| \le 1$.
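The identity $L(e^{i\varphi}) = e^{i\varphi}$ can be checked numerically for the operator displayed above with $p = 1$, at a point where the cutoff $\rho$ vanishes (outside $\operatorname{supp}\rho$ we may simply set $\rho = 0$). In the sketch below all derivatives are approximated by central differences, and the evaluation point is an arbitrary choice.

```python
import numpy as np

# Numerical check of Lemma 9.7 for the phase phi(x, y, xi) = (x - y) xi
# with p = 1 and rho = 0 (legitimate away from supp rho, e.g. for
# moderately large |xi|).  Derivatives use central differences.

h = 1e-5

def f(x, y, xi):
    return np.exp(1j * (x - y) * xi)   # e^{i phi}

def L_applied(x, y, xi):
    """The operator L of Lemma 9.7 (with rho = 0) applied to e^{i phi}."""
    d_xi = (f(x, y, xi + h) - f(x, y, xi - h)) / (2 * h)
    d_x = (f(x + h, y, xi) - f(x - h, y, xi)) / (2 * h)
    d_y = (f(x, y + h, xi) - f(x, y - h, xi)) / (2 * h)
    pref = 1.0 / (1j * (2.0 + (x - y) ** 2))
    return pref * ((x - y) * d_xi + (xi / xi**2) * d_x
                   - (xi / xi**2) * d_y)

pt = (0.7, -0.3, 3.0)
err = abs(L_applied(*pt) - f(*pt))
```

The error is of the size of the finite-difference truncation, confirming the algebraic identity behind the lemma.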

9.3 Oscillatory Integrals

Let $\Omega$ be an open subset of $\mathbf{R}^n$ and let
$$
S^\infty_{\rho,\delta}(\Omega\times\mathbf{R}^N) = \bigcup_{m\in\mathbf{R}} S^m_{\rho,\delta}(\Omega\times\mathbf{R}^N).
$$
If $\varphi(x,\theta)$ is a phase function on $\Omega\times(\mathbf{R}^N\setminus\{0\})$, we wish to give a meaning to the integral
$$
I_\varphi(au) = \iint_{\Omega\times\mathbf{R}^N} e^{i\varphi(x,\theta)}\, a(x,\theta)\, u(x)\, dx\, d\theta, \quad u \in C_0^\infty(\Omega), \tag{9.4}
$$
for each symbol $a(x,\theta) \in S^\infty_{\rho,\delta}(\Omega\times\mathbf{R}^N)$. It should be noticed that if $a(x,\theta) \in S^m_{\rho,\delta}(\Omega\times\mathbf{R}^N)$ with $m < -N$, this integral is absolutely convergent.

We consider the general case. By Lemma 9.7, we can replace $e^{i\varphi}$ in formula (9.4) by $L(e^{i\varphi})$. Then a formal integration by parts gives us that
$$
I_\varphi(au) = \iint_{\Omega\times\mathbf{R}^N} e^{i\varphi(x,\theta)}\, L'\big(a(x,\theta)u(x)\big)\, dx\, d\theta.
$$
However, the properties of the coefficients of $L$ imply that $L'$ maps $S^r_{\rho,\delta}$ continuously into $S^{r-\eta}_{\rho,\delta}$ for all $r \in \mathbf{R}$, where $\eta = \min(\rho, 1-\delta)$. Continuing this process, we can reduce the growth of the integrand at infinity until it becomes integrable. In this way, we can give a meaning to the integral (9.4) for each symbol $a(x,\theta) \in S^\infty_{\rho,\delta}(\Omega\times\mathbf{R}^N)$.

397

More precisely, we have the following: Theorem 9.8 (i) The linear functional   S −∞ Ω × R N  a −→ Iϕ (au) ∈ C ∞ (Ω × R N ) whose restriction to each extends uniquely to a linear functional  on Sρ,δ m m Sρ,δ (Ω × R N ) is continuous. Furthermore, the restriction to Sρ,δ (Ω × R N ) of  is expressed in the form

 (a) =

Ω×R N

 k eiϕ(x,θ) L (a(x, θ)u(x)) d xdθ,

where k > (m + N )/η, η = min(ρ, 1 − δ). m (ii) For any fixed symbol a(x, θ) ∈ Sρ,δ (Ω × R N ), the mapping C0∞ (Ω)  u −→ Iϕ (au) = (a) ∈ C

(9.5)

is a distribution of order ≤ k for k > (m + N )/η. ∞ an oscillatory integral, but use the standard We call the linear functional  on Sρ,δ notation as in formula (9.4). The distribution (9.5) is called the Fourier integral distribution associated with the phase function ϕ(x, θ) and the amplitude a(x, θ), and is denoted by the formula

 K (x) =

RN

eiϕ(x,θ) a(x, θ) dθ.

Example 9.9 (a) The Dirac measure $\delta(x)$ may be expressed as an oscillatory integral in the form
$$
\delta(x) = \frac{1}{(2\pi)^n} \int_{\mathbf{R}^n} e^{ix\xi}\, d\xi.
$$
This formula is called the plane-wave expansion of the delta function (cf. Gel’fand–Shilov [73]).

(b) The distributions $\mathrm{v.p.}\,(x_j/|x|^{n+1})$, $1 \le j \le n$, are expressed as
$$
\mathrm{v.p.}\, \frac{x_j}{|x|^{n+1}} = -\frac{i}{2^n\, \Gamma((n+1)/2)\, \pi^{(n-1)/2}} \int_{\mathbf{R}^n} e^{ix\xi}\, \frac{\xi_j}{|\xi|}\, d\xi \quad \text{for } 1 \le j \le n.
$$
Indeed, it suffices to note that
$$
\widehat{R_j}(\xi) = i\, \frac{\xi_j}{|\xi|} \quad \text{for } 1 \le j \le n,
$$
as stated in Examples 7.30 (Riesz kernels).
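The plane-wave expansion of Example 9.9 (a) can be illustrated in dimension one: the regularized integral $\frac{1}{2\pi}\int \chi(\varepsilon\xi)\,\hat u(\xi)\,d\xi$ should converge to $\langle\delta, u\rangle = u(0)$ as $\varepsilon \downarrow 0$. The Gaussian test function and the Gaussian cutoff $\chi$ with $\chi(0) = 1$ in the sketch below are assumptions of the illustration, not part of the text.

```python
import numpy as np

# Regularization of the oscillatory integral for delta in one dimension:
# (1/2pi) int chi(eps*xi) uhat(xi) dxi  ->  u(0)  as eps -> 0.
# The grids, the Gaussian u, and the Gaussian cutoff are arbitrary.

xgrid = np.linspace(-10.0, 10.0, 2001)
dx = xgrid[1] - xgrid[0]
u = np.exp(-xgrid**2)                  # smooth rapidly decaying test function

xigrid = np.linspace(-15.0, 15.0, 3001)
dxi = xigrid[1] - xigrid[0]
# uhat(xi) = int e^{-i x xi} u(x) dx, computed by a Riemann sum
uhat = np.array([np.sum(np.exp(-1j * xgrid * s) * u) * dx for s in xigrid])

def I(eps):
    """Regularized plane-wave integral with cutoff chi(z) = exp(-z^2)."""
    chi = np.exp(-(eps * xigrid) ** 2)
    return float(np.real(np.sum(chi * uhat) * dxi) / (2.0 * np.pi))

vals = [I(e) for e in (0.5, 0.2, 0.1, 0.05)]   # should approach u(0) = 1
```

For this pair of Gaussians the regularized values increase monotonically toward $u(0) = 1$, matching the closed-form value $\tfrac12(\varepsilon^2 + \tfrac14)^{-1/2}$.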


Oscillatory integrals depending on a parameter behave like ordinary integrals. In fact, we have the following:

Theorem 9.10 Let $Y$ be an open subset of $\mathbf{R}^p$ and let $\varphi(x,y,\theta)$ be a phase function on $\Omega\times Y\times(\mathbf{R}^N\setminus\{0\})$ such that
$$
d_{x,\theta}\,\varphi(x,y,\theta) \ne 0 \quad \text{on } \Omega\times Y\times(\mathbf{R}^N\setminus\{0\}).
$$
If $a(x,y,\theta) \in S^m(\Omega\times Y\times\mathbf{R}^N)$, we let
$$
F(y) = \iint_{\Omega\times\mathbf{R}^N} e^{i\varphi(x,y,\theta)}\, a(x,y,\theta)\, u(x,y)\, dx\, d\theta \quad \text{for } u \in C_0^\infty(\Omega\times Y).
$$
Then we have the following two assertions:
(i) The function $F$ is in $C_0^\infty(Y)$, and we may differentiate under the integral sign.
(ii) For the integral of $F$, we have the formula
$$
\int_Y F(y)\, dy = \iiint_{\Omega\times Y\times\mathbf{R}^N} e^{i\varphi(x,y,\theta)}\, a(x,y,\theta)\, u(x,y)\, dx\, dy\, d\theta.
$$
If $u$ is a distribution on $\Omega$, the singular support of $u$ is the smallest closed subset of $\Omega$ outside of which $u$ is smooth. The singular support of $u$ is denoted by $\operatorname{sing\,supp} u$.

The next theorem estimates the singular support of a Fourier integral distribution:

Theorem 9.11 If $\varphi(x,\theta)$ is a phase function on the space $\Omega\times(\mathbf{R}^N\setminus\{0\})$ and if $a(x,\theta)$ is in $S^\infty_{\rho,\delta}(\Omega\times\mathbf{R}^N)$, then the distribution
$$
A = \int_{\mathbf{R}^N} e^{i\varphi(x,\theta)}\, a(x,\theta)\, d\theta \in \mathcal{D}'(\Omega)
$$
satisfies the condition
$$
\operatorname{sing\,supp} A \subset \left\{ x \in \Omega : d_\theta\, \varphi(x,\theta) = 0 \text{ for some } \theta \in \mathbf{R}^N\setminus\{0\} \right\}.
$$

9.4 Fourier Integral Operators

Let $U$ and $V$ be open subsets of $\mathbf{R}^p$ and $\mathbf{R}^q$, respectively. If $\varphi(x,y,\theta)$ is a phase function on $U\times V\times(\mathbf{R}^N\setminus\{0\})$ and if $a(x,y,\theta) \in S^\infty_{\rho,\delta}(U\times V\times\mathbf{R}^N)$, then there is associated a distribution $K \in \mathcal{D}'(U\times V)$ defined by the formula
$$
K(x,y) = \int_{\mathbf{R}^N} e^{i\varphi(x,y,\theta)}\, a(x,y,\theta)\, d\theta.
$$


By applying Theorem 9.11 to our situation, we obtain that
$$
\operatorname{sing\,supp} K \subset \left\{ (x,y) \in U\times V : d_\theta\, \varphi(x,y,\theta) = 0 \text{ for some } \theta \in \mathbf{R}^N\setminus\{0\} \right\}.
$$
The distribution $K \in \mathcal{D}'(U\times V)$ defines a continuous linear operator
$$
A : C_0^\infty(V) \longrightarrow \mathcal{D}'(U)
$$
by the formula
$$
\langle Av, u \rangle = \langle K, u\otimes v \rangle \quad \text{for all } u \in C_0^\infty(U) \text{ and } v \in C_0^\infty(V).
$$
The operator $A$ is called the Fourier integral operator associated with the phase function $\varphi(x,y,\theta)$ and the amplitude $a(x,y,\theta)$, and is denoted by the formula
$$
Av(x) = \iint_{V\times\mathbf{R}^N} e^{i\varphi(x,y,\theta)}\, a(x,y,\theta)\, v(y)\, dy\, d\theta = \int_V K(x,y)\, v(y)\, dy \quad \text{for } v \in C_0^\infty(V).
$$
For example, the distribution
$$
K(x,y) = \frac{1}{(2\pi)^n} \int_{\mathbf{R}^n} e^{i(x-y)\xi}\, \frac{1}{(1+|\xi|^2)^{s/2}}\, d\xi \in \mathcal{D}'(\mathbf{R}^n\times\mathbf{R}^n)
$$
defines the Bessel potential
$$
J^s = (I-\Delta)^{-s/2}
$$
for any $s > 0$. Indeed, it suffices to note that
$$
(I-\Delta)^{-s/2}v(x) = G_s * v(x) = \frac{1}{(2\pi)^n} \int_{\mathbf{R}^n} e^{ix\xi}\, \widehat{G_s}(\xi)\, \hat v(\xi)\, d\xi
= \int_{\mathbf{R}^n} \left( \frac{1}{(2\pi)^n} \int_{\mathbf{R}^n} e^{i(x-y)\xi}\, \frac{d\xi}{(1+|\xi|^2)^{s/2}} \right) v(y)\, dy
$$
for all $v \in C_0^\infty(\mathbf{R}^n)$.

The next theorem summarizes some basic properties of the operator $A$:

Theorem 9.12 (i) If $d_{y,\theta}\,\varphi(x,y,\theta) \ne 0$ on $U\times V\times(\mathbf{R}^N\setminus\{0\})$, then the operator $A$ maps $C_0^\infty(V)$ continuously into $C^\infty(U)$.
(ii) If $d_{x,\theta}\,\varphi(x,y,\theta) \ne 0$ on $U\times V\times(\mathbf{R}^N\setminus\{0\})$, then the operator $A$ extends to a continuous linear operator on $\mathcal{E}'(V)$ into $\mathcal{D}'(U)$.


(iii) If $d_{y,\theta}\,\varphi(x,y,\theta) \ne 0$ and $d_{x,\theta}\,\varphi(x,y,\theta) \ne 0$ on $U\times V\times(\mathbf{R}^N\setminus\{0\})$, then we have, for all $v \in \mathcal{E}'(V)$,
$$
\operatorname{sing\,supp} Av \subset \left\{ x \in U : d_\theta\, \varphi(x,y,\theta) = 0 \text{ for some } y \in \operatorname{sing\,supp} v \text{ and } \theta \in \mathbf{R}^N\setminus\{0\} \right\}.
$$
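The Bessel potential $J^s$ of the example above acts as the Fourier multiplier $(1+|\xi|^2)^{-s/2}$. On a one-dimensional periodic grid this can be imitated with the discrete Fourier transform, and for $s = 2$ one can verify directly that $(I - d^2/dx^2)$ applied to $J^2 f$ recovers $f$; the grid and the test function below are arbitrary choices of this sketch.

```python
import numpy as np

# Spectral sketch of the Bessel potential J^s = (I - Delta)^{-s/2}
# on a periodic grid: J^s is the Fourier multiplier (1 + xi^2)^{-s/2}.

N = 256
x = np.linspace(0.0, 2.0 * np.pi, N, endpoint=False)
xi = np.fft.fftfreq(N, d=1.0 / N)
f = np.exp(np.cos(x)) - 1.0            # smooth periodic test function

def bessel(v, s):
    """Apply J^s = (I - Delta)^{-s/2} as a Fourier multiplier."""
    return np.real(np.fft.ifft((1.0 + xi**2) ** (-s / 2.0) * np.fft.fft(v)))

g = bessel(f, 2.0)                                     # g = (I - Delta)^{-1} f
g_xx = np.real(np.fft.ifft(-(xi**2) * np.fft.fft(g)))  # spectral g''
residual = float(np.max(np.abs(g - g_xx - f)))         # (I - Delta) g - f
```

Since the whole computation is exact multiplier algebra in Fourier space, the residual is at the level of rounding error, and $J^2$ does not increase the $L^2$ norm.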

9.5 Pseudo-differential Operators

In this section we define pseudo-differential operators and study their basic properties such as the behavior of transposes, adjoints and compositions of such operators, and the effect of a change of coordinates on such operators. Furthermore, we formulate classical surface and volume potentials in terms of pseudo-differential operators. This calculus of pseudo-differential operators will be applied to elliptic boundary value problems in Chaps. 4–11.

9.5.1 Definitions and Basic Properties

Let $\Omega$ be an open subset of $\mathbf{R}^n$ and $m \in \mathbf{R}$. A pseudo-differential operator of order $m$ on $\Omega$ is a Fourier integral operator of the form
$$
Au(x) = \iint_{\Omega\times\mathbf{R}^n} e^{i(x-y)\cdot\xi}\, a(x,y,\xi)\, u(y)\, dy\, d\xi \quad \text{for } u \in C_0^\infty(\Omega), \tag{9.6}
$$
with some $a(x,y,\xi) \in S^m_{\rho,\delta}(\Omega\times\Omega\times\mathbf{R}^n)$. Namely, a pseudo-differential operator of order $m$ is a Fourier integral operator associated with the phase function $\varphi(x,y,\xi) = (x-y)\cdot\xi$ and some amplitude $a(x,y,\xi) \in S^m_{\rho,\delta}(\Omega\times\Omega\times\mathbf{R}^n)$. We let
$$
L^m_{\rho,\delta}(\Omega) = \text{the set of all pseudo-differential operators of order } m \text{ on } \Omega.
$$
By applying Theorems 9.12 and 9.11 to our situation, we obtain the following three assertions:

(1) A pseudo-differential operator $A$ maps the space $C_0^\infty(\Omega)$ continuously into the space $C^\infty(\Omega)$ and extends to a continuous linear operator $A : \mathcal{E}'(\Omega) \to \mathcal{D}'(\Omega)$.
(2) The distribution kernel $K_A$, defined by the formula
$$
K_A(x,y) = \int_{\mathbf{R}^n} e^{i(x-y)\cdot\xi}\, a(x,y,\xi)\, d\xi,
$$
of a pseudo-differential operator $A$ satisfies the condition
$$
\operatorname{sing\,supp} K_A \subset \{(x,x) : x \in \Omega\},
$$
that is, the kernel $K_A$ is smooth off the diagonal $\{(x,x) : x \in \Omega\}$ in $\Omega\times\Omega$.
(3) $\operatorname{sing\,supp} Au \subset \operatorname{sing\,supp} u$ for all $u \in \mathcal{E}'(\Omega)$. In other words, $Au$ is smooth whenever $u$ is. This property is referred to as the pseudo-local property.

We set
$$
L^{-\infty}(\Omega) = \bigcap_{m\in\mathbf{R}} L^m_{\rho,\delta}(\Omega).
$$

The next theorem characterizes the class $L^{-\infty}(\Omega)$:

Theorem 9.13 The following three conditions are equivalent:
(i) $A \in L^{-\infty}(\Omega)$.
(ii) $A$ is written in the form (9.6) with some $a(x,y,\xi) \in S^{-\infty}(\Omega\times\Omega\times\mathbf{R}^n)$.
(iii) $A$ is a regularizer, or equivalently, its distribution kernel $K_A$ is in the space $C^\infty(\Omega\times\Omega)$.

Proof (i) $\implies$ (iii): If $A \in L^{-\infty}(\Omega)$, then, for every $m \in \mathbf{R}$ there exists a symbol $a \in S^m$ such that $A$ can be written in the form (9.6). Then its distribution kernel
$$
K_A(x,y) = \int_{\mathbf{R}^n} e^{i(x-y)\cdot\xi}\, a(x,y,\xi)\, d\xi
$$
is in $C^k(\Omega\times\Omega)$ for $k < -m-n$. This proves that $K_A$ is in $C^\infty(\Omega\times\Omega)$.

(iii) $\implies$ (ii): If $K_A$ is in $C^\infty(\Omega\times\Omega)$, we can write $A$ in the form (9.6), by taking
$$
a(x,y,\xi) = e^{-i(x-y)\cdot\xi}\, \theta(\xi)\, K_A(x,y), \quad \text{with } \theta \in C_0^\infty(\mathbf{R}^n),\ \int_{\mathbf{R}^n} \theta(\xi)\, d\xi = 1.
$$
This proves condition (ii), since $a \in S^{-\infty}$.

(ii) $\implies$ (i): This is trivial. The proof of Theorem 9.13 is complete. $\square$



[Fig. 9.1 Condition (a) for the operator $A$]

[Fig. 9.2 Condition (b) for the operator $A$]

We recall that a continuous linear operator $A : C_0^\infty(\Omega) \to \mathcal{D}'(\Omega)$ is said to be properly supported if the following two conditions are satisfied (see Figs. 9.1 and 9.2):

(a) For any compact subset $K$ of $\Omega$, there exists a compact subset $K'$ of $\Omega$ such that
$$
\operatorname{supp} v \subset K \implies \operatorname{supp} Av \subset K'.
$$
(b) For any compact subset $K$ of $\Omega$, there exists a compact subset $K' \supset K$ of $\Omega$ such that
$$
\operatorname{supp} v \cap K' = \emptyset \implies \operatorname{supp} Av \cap K = \emptyset.
$$
If $A$ is properly supported, then it maps $C_0^\infty(\Omega)$ continuously into $\mathcal{E}'(\Omega)$, and further it extends to a continuous linear operator on $C^\infty(\Omega)$ into $\mathcal{D}'(\Omega)$. The situation can be visualized as in Fig. 9.3 below.

The next theorem states that every pseudo-differential operator can be written as the sum of a properly supported operator and a regularizer.

Theorem 9.14 If $A \in L^m_{\rho,\delta}(\Omega)$, then we have the decomposition


[Fig. 9.3 The mapping properties of a properly supported operator $A$: $A$ maps $C^\infty(\Omega)$ into $\mathcal{D}'(\Omega) = L(C_0^\infty(\Omega), \mathbf{C})$ and $C_0^\infty(\Omega)$ into $\mathcal{E}'(\Omega) = L(C^\infty(\Omega), \mathbf{C})$, compatibly with the natural inclusions.]
$$
A = A_0 + R,
$$
where $A_0 \in L^m_{\rho,\delta}(\Omega)$ is properly supported and $R \in L^{-\infty}(\Omega)$.

Proof Choose a function $\rho \in C^\infty(\Omega\times\Omega)$ such that
(a) $\rho(x,y) = 1$ in a neighborhood of the diagonal $\{(x,x) : x \in \Omega\}$ in $\Omega\times\Omega$;
(b) the restrictions to $\operatorname{supp}\rho$ of the projections
$$
p_1 : \Omega\times\Omega \ni (x_1,x_2) \mapsto x_1 \in \Omega, \qquad p_2 : \Omega\times\Omega \ni (x_1,x_2) \mapsto x_2 \in \Omega
$$
are proper mappings.
Then the operators $A_0$ and $R$, defined respectively by the kernels
$$
K_{A_0}(x,y) = \rho(x,y)\, K_A(x,y), \qquad K_R(x,y) = (1-\rho(x,y))\, K_A(x,y),
$$
are the desired ones. The proof of Theorem 9.14 is complete. $\square$



9.5.2 Symbols of a Pseudo-differential Operator

If $p(x,\xi) \in S^m_{\rho,\delta}(\Omega\times\mathbf{R}^n)$, then the operator $p(x,D)$, defined by the formula
$$
p(x,D)u(x) = \frac{1}{(2\pi)^n} \int_{\mathbf{R}^n} e^{ix\cdot\xi}\, p(x,\xi)\, \hat u(\xi)\, d\xi \quad \text{for } u \in C_0^\infty(\Omega), \tag{9.7}
$$
is a pseudo-differential operator of order $m$ on $\Omega$, that is, $p(x,D) \in L^m_{\rho,\delta}(\Omega)$. The next theorem asserts that every properly supported pseudo-differential operator can be reduced to the form (9.7).

Theorem 9.15 If $A \in L^m_{\rho,\delta}(\Omega)$ is properly supported, then we have the formulas
$$
p(x,\xi) = e^{-ix\cdot\xi}\, A(e^{ix\cdot\xi}) \in S^m_{\rho,\delta}(\Omega\times\mathbf{R}^n)
$$


and
$$
A = p(x,D).
$$
Furthermore, if $a(x,y,\xi) \in S^m_{\rho,\delta}(\Omega\times\Omega\times\mathbf{R}^n)$ is an amplitude for $A$, then we have the following asymptotic expansion:
$$
p(x,\xi) \sim \sum_{\alpha\ge 0} \frac{1}{\alpha!}\, \partial_\xi^\alpha D_y^\alpha\big( a(x,y,\xi) \big)\Big|_{y=x}.
$$

Lm ρ,δ (Ω), we choose a properly supported operator −∞

σ(A) = the equivalence class of the complete symbol of A0 in m Sρ,δ (Ω × Rn )/S −∞ (Ω × Rn ).

In view of Theorems 9.13 and 9.15, it follows that σ(A) does not depend on the operator A0 chosen. The equivalence class σ(A) is called the complete symbol of A. It is easy to see that the mapping   −∞   m n Ω × Rn Lm ρ,δ (Ω)  A  −→ σ(A) ∈ Sρ,δ Ω × R /S induces an isomorphism     −∞ m Ω × Rn /S −∞ Ω × Rn . (Ω) −→ Sρ,δ Lm ρ,δ (Ω)/L Similarly, if A ∈ L m ρ,δ (Ω), we define σm (A) = the equivalence class of the complete symbol of A0 in m Sρ,δ (Ω × Rn )/S m−1 (Ω × Rn ).

The equivalence class σm (A) is called the principal symbol of A. It is easy to see that the mapping m n m−1 (Ω × Rn ) Lm ρ,δ (Ω)  A  −→ σm (A) ∈ Sρ,δ (Ω × R )/S

induces an isomorphism     m−1 m Ω × Rn /S m−1 Ω × Rn . (Ω) −→ Sρ,δ Lm ρ,δ (Ω)/L
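For orientation, here is an illustration of ours (not taken from the text): for a differential operator P = Σ_{|α|≤m} a_α(x) D^α with coefficients a_α ∈ C^∞(Ω), which is properly supported, the recipe of Theorem 9.15 recovers the full symbol:

```latex
p(x,\xi) = e^{-ix\cdot\xi}\,P\!\left(e^{ix\cdot\xi}\right)
         = \sum_{|\alpha|\le m} a_\alpha(x)\,\xi^\alpha,
\qquad
\sigma_m(P) = \sum_{|\alpha|=m} a_\alpha(x)\,\xi^\alpha .
```

In particular, the complete symbol of a differential operator is a polynomial in ξ, and its principal symbol is the top-order part; for P = −Δ one gets p(x, ξ) = σ₂(P) = |ξ|².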

9.5 Pseudo-differential Operators


We shall often identify the complete symbol σ(A) with a representative in the class S^m_{ρ,δ}(Ω × R^n) for notational convenience, and call any member of σ(A) a complete symbol of A. We shall do the same for the principal symbol σ_m(A).
A pseudo-differential operator A ∈ L^m_{1,0}(Ω) is said to be classical if its complete symbol σ(A) has a representative in the class S^m_cl(Ω × R^n). We let

  L^m_cl(Ω) = the set of all classical pseudo-differential operators of order m on Ω.

Then the mapping

  L^m_cl(Ω) ∋ A ↦ σ(A) ∈ S^m_cl(Ω × R^n)/S^{−∞}(Ω × R^n)

induces an isomorphism

  L^m_cl(Ω)/L^{−∞}(Ω) → S^m_cl(Ω × R^n)/S^{−∞}(Ω × R^n).

Also we have the formula

  L^{−∞}(Ω) = ∩_{m∈R} L^m_cl(Ω).

If A ∈ L^m_cl(Ω), then the principal symbol σ_m(A) has a canonical representative

  σ_A(x, ξ) ∈ C^∞(Ω × (R^n \ {0}))

which is positively homogeneous of degree m in the variable ξ. The function σ_A(x, ξ) is called the homogeneous principal symbol of A.
For example, the Bessel potential J^s = (I − Δ)^{−s/2} for s ∈ R is a classical pseudo-differential operator on R^n with homogeneous principal symbol |ξ|^{−s}. It should be noticed that

  A ∈ L^{m−1}_cl(Ω) ⟺ σ_A(x, ξ) ≡ 0 on Ω × (R^n \ {0}).

9.5.3 The Algebra of Pseudo-differential Operators

The next two theorems assert that the class of pseudo-differential operators forms an algebra, closed under composition of operators and under taking the transpose or adjoint of an operator.


Theorem 9.16 If A ∈ L^m_{ρ,δ}(Ω), then its transpose A′ and its adjoint A* are both in L^m_{ρ,δ}(Ω), and the complete symbols σ(A′) and σ(A*) have respectively the following asymptotic expansions:

  σ(A′)(x, ξ) ∼ Σ_{α≥0} (1/α!) ∂_ξ^α D_x^α (σ(A)(x, −ξ)),
  σ(A*)(x, ξ) ∼ Σ_{α≥0} (1/α!) ∂_ξ^α D_x^α \overline{σ(A)(x, ξ)}.

Theorem 9.17 If A ∈ L^{m′}_{ρ′,δ′}(Ω) and B ∈ L^{m″}_{ρ″,δ″}(Ω), where 0 ≤ δ″ < ρ′ ≤ 1, and if one of them is properly supported, then the composition AB is in L^{m′+m″}_{ρ,δ}(Ω) with ρ = min(ρ′, ρ″) and δ = max(δ′, δ″). Moreover, we have the following asymptotic expansion:

  σ(AB)(x, ξ) ∼ Σ_{α≥0} (1/α!) ∂_ξ^α σ(A)(x, ξ) · D_x^α σ(B)(x, ξ).

9.5.4 Elliptic Pseudo-differential Operators

A pseudo-differential operator A ∈ L^m_{ρ,δ}(Ω) is said to be elliptic of order m if its complete symbol σ(A) is elliptic of order m. In view of Corollary 9.5, it follows that a classical pseudo-differential operator A ∈ L^m_cl(Ω) is elliptic if and only if its homogeneous principal symbol σ_A(x, ξ) does not vanish on the space Ω × (R^n \ {0}).
The next theorem states that elliptic operators are the "invertible" elements in the algebra of pseudo-differential operators:

Theorem 9.18 An operator A ∈ L^m_{ρ,δ}(Ω) is elliptic if and only if there exists a properly supported operator B ∈ L^{−m}_{ρ,δ}(Ω) such that

  AB ≡ I mod L^{−∞}(Ω),
  BA ≡ I mod L^{−∞}(Ω).

Such an operator B is called a parametrix for A. In other words, a parametrix for A is a two-sided inverse of A modulo L^{−∞}(Ω). It should be emphasized that a parametrix is unique modulo L^{−∞}(Ω).
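For a classical elliptic operator A with σ(A) ∼ Σ_{k≥0} p_{m−k} (each p_{m−k} positively homogeneous of degree m − k), a parametrix symbol can be built term by term. The recursion below is the standard construction, sketched by us from the composition expansion of Theorem 9.17, with q_{−m−j} homogeneous of degree −m − j:

```latex
q_{-m} = \frac{1}{p_m}, \qquad
q_{-m-j} = -\,\frac{1}{p_m}
\sum_{\substack{k+l+|\alpha| = j \\ l < j}}
\frac{1}{\alpha!}\,\partial_\xi^\alpha p_{m-k}\, D_x^\alpha q_{-m-l}
\quad (j \ge 1).
```

Any properly supported B ∈ L^{−m}_cl(Ω) with σ(B) ∼ Σ_j q_{−m−j} then satisfies AB ≡ I mod L^{−∞}(Ω); the analogous recursion with the roles of the two factors exchanged gives BA ≡ I mod L^{−∞}(Ω). Ellipticity enters exactly through the division by p_m.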


9.5.5 Invariance of Pseudo-differential Operators Under Change of Coordinates

We now examine what happens to a pseudo-differential operator under a change of coordinates. The next theorem proves the invariance of pseudo-differential operators under change of coordinates:

Theorem 9.19 Let Ω₁, Ω₂ be two open subsets of R^n and χ : Ω₁ → Ω₂ a C^∞ diffeomorphism. If A ∈ L^m_{ρ,δ}(Ω₁), where 1 − ρ ≤ δ < ρ ≤ 1, then the mapping

  Aχ : C_0^∞(Ω₂) → C^∞(Ω₂), v ↦ A(v ∘ χ) ∘ χ^{−1},

is in L^m_{ρ,δ}(Ω₂), and we have the asymptotic expansion

  σ(Aχ)(y, η) ∼ Σ_{α≥0} (1/α!) (∂_ξ^α σ(A))(x, ᵗχ′(x)·η) · D_z^α e^{i r(x,z,η)}|_{z=x}   (9.8)

with

  r(x, z, η) = ⟨χ(z) − χ(x) − χ′(x)·(z − x), η⟩.

Here x = χ^{−1}(y), χ′(x) is the derivative of χ at x and ᵗχ′(x) its transpose. Moreover, χ^* v = v ∘ χ is the pull-back of v by χ and χ_* u = u ∘ χ^{−1} is the push-forward of u by χ. The situation may be represented as in Fig. 9.4.

Remark 9.20 Formula (9.8) shows that

  σ(Aχ)(y, η) ≡ σ(A)(x, ᵗχ′(x)·η) mod S^{m−(ρ−δ)}_{ρ,δ}.

Note that the mapping

  Ω₂ × R^n ∋ (y, η) ↦ (x, ᵗχ′(x)·η) ∈ Ω₁ × R^n

is just a transition map of the cotangent bundle T^*(R^n). This implies that the principal symbol σ_m(A) of A ∈ L^m_{ρ,δ}(R^n) can be invariantly defined on T^*(R^n) when 1 − ρ ≤ δ < ρ ≤ 1.

Fig. 9.4 The pseudo-differential operators A and Aχ = χ_* ∘ A ∘ χ^*: the commutative diagram

  C_0^∞(Ω₁) ──A──→ C^∞(Ω₁)
    χ^* ↑              ↓ χ_*
  C_0^∞(Ω₂) ──Aχ──→ C^∞(Ω₂)


9.5.6 Pseudo-differential Operators and Sobolev Spaces

A differential operator of order m with smooth coefficients on Ω is continuous from H^s_loc(Ω) into H^{s−m}_loc(Ω) for all s ∈ R. This result extends to pseudo-differential operators. More precisely, we have the following theorem (see Bourdaud [23, Theorem 1]):

Theorem 9.21 (Bourdaud) Every properly supported operator A in the class L^m_{1,δ}(Ω), where 0 ≤ δ < 1, extends to continuous linear operators

  A : H^s_loc(Ω) → H^{s−m}_loc(Ω),
  A : H^s_comp(Ω) → H^{s−m}_comp(Ω)

for all s ∈ R. More precisely, we have Figs. 9.5 and 9.6 below.
By combining Theorems 9.18 and 9.21, we can obtain the following:

Theorem 9.22 (the elliptic regularity theorem) If A ∈ L^m_{1,δ}(Ω), where 0 ≤ δ < 1, is properly supported and elliptic, then we have the following three assertions:
(i) A distribution u ∈ D′(Ω) is in H^{s+m}_loc(Ω) if and only if Au ∈ H^s_loc(Ω).
(ii) sing supp u = sing supp Au for u ∈ D′(Ω). In other words, u is smooth if and only if Au is.
(iii) For every compact K ⊂ Ω, s ∈ R and t < s + m, there exists a constant C_{K,s,t} > 0 such that

Fig. 9.5 The mapping properties of a properly supported pseudo-differential operator A:

  D′(Ω) ──A──→ D′(Ω)
    ∪              ∪
  H^s_loc(Ω) ──A──→ H^{s−m}_loc(Ω)
    ∪              ∪
  C^∞(Ω) ──A──→ C^∞(Ω)

Fig. 9.6 The mapping properties of a properly supported pseudo-differential operator A:

  E′(Ω) ──A──→ E′(Ω)
    ∪              ∪
  H^s_comp(Ω) ──A──→ H^{s−m}_comp(Ω)
    ∪              ∪
  C_0^∞(Ω) ──A──→ C_0^∞(Ω)


  ‖u‖_{H^{s+m}(Ω)} ≤ C_{K,s,t} ( ‖Au‖_{H^s(Ω)} + ‖u‖_{H^t(Ω)} ) for all u ∈ C_K^∞(Ω).

Proof Take a parametrix B ∈ L^{−m}(Ω) for A as in Theorem 9.18:

  AB = I + R₁ with R₁ ∈ L^{−∞}(Ω),
  BA = I + R₂ with R₂ ∈ L^{−∞}(Ω).

Then parts (i) and (iii) follow from an application of Theorem 9.21. Furthermore, we obtain that

  u = BAu − R₂u ≡ BAu mod C^∞(Ω),

so that, by the pseudo-local property of B,

  sing supp u = sing supp B(Au) ⊂ sing supp Au.

This proves part (ii), since the converse inclusion is simply the pseudo-local property of A. The proof of Theorem 9.22 is complete. □

9.6 Pseudo-differential Operators on a Manifold

In this section we define the concept of a pseudo-differential operator on a manifold, and transfer all the machinery of pseudo-differential operators to manifolds. Throughout this section, let M be an n-dimensional compact smooth manifold without boundary.

9.6.1 Definitions and Basic Properties

Theorem 9.19 leads us to the following definition:

Definition 9.23 Let 1 − ρ ≤ δ < ρ ≤ 1. A continuous linear operator A : C^∞(M) → C^∞(M) is called a pseudo-differential operator of order m ∈ R if it satisfies the following two conditions:
(i) The distribution kernel of A is smooth off the diagonal Δ_M = {(x, x) : x ∈ M} in M × M.
(ii) For any chart (U, χ) on M, the mapping

  Aχ : C_0^∞(χ(U)) → C^∞(χ(U)), u ↦ A(u ∘ χ) ∘ χ^{−1},

Fig. 9.7 The mapping properties of a pseudo-differential operator A in terms of Sobolev spaces H^s(M):

  D′(M) ──A──→ D′(M)
    ∪              ∪
  H^s(M) ──A──→ H^{s−m}(M)
    ∪              ∪
  C^∞(M) ──A──→ C^∞(M)

belongs to the class L^m_{ρ,δ}(χ(U)).
We let

  L^m_{ρ,δ}(M) = the set of all pseudo-differential operators of order m on M,

and set

  L^{−∞}(M) = ∩_{m∈R} L^m_{ρ,δ}(M).

Some results about pseudo-differential operators on R^n stated above are also true for pseudo-differential operators on M, since pseudo-differential operators on M are locally pseudo-differential operators on R^n. For example, we have the following five results:
(1) A pseudo-differential operator A extends to a continuous linear operator A : D′(M) → D′(M).
(2) sing supp Au ⊂ sing supp u for every u ∈ D′(M).
(3) A continuous linear operator A : C^∞(M) → D′(M) is a regularizer if and only if it is in L^{−∞}(M).
(4) The class L^m_{ρ,δ}(M) is stable under composition of operators and under taking the transpose or adjoint of an operator.
(5) A pseudo-differential operator A ∈ L^m_{1,δ}(M), where 0 ≤ δ < 1, extends to a continuous linear operator A : H^s(M) → H^{s−m}(M) for all s ∈ R. The situation can be visualized as in Fig. 9.7 above.

9.6.2 Classical Pseudo-differential Operators

A pseudo-differential operator A ∈ L^m_{1,0}(M) is said to be classical if, for any chart (U, χ) on M, the mapping Aχ : C_0^∞(χ(U)) → C^∞(χ(U)) belongs to the class L^m_cl(χ(U)).


We let

  L^m_cl(M) = the set of all classical pseudo-differential operators of order m on M.

We observe that

  L^{−∞}(M) = ∩_{m∈R} L^m_cl(M).

Let A ∈ L^m_cl(M). If (U, χ) is a chart on M, there is associated a homogeneous principal symbol σ_{Aχ} ∈ C^∞(χ(U) × (R^n \ {0})). In view of Remark 9.20, by smoothly patching together the functions σ_{Aχ} we can obtain a smooth function σ_A(x, ξ) on

  T^*(M) \ {0} = {(x, ξ) ∈ T^*(M) : ξ ≠ 0},

which is positively homogeneous of degree m in the variable ξ. The function σ_A(x, ξ) is called the homogeneous principal symbol of A. It should be noticed that

  A ∈ L^{m−1}_cl(M) ⟺ σ_A(x, ξ) ≡ 0 on T^*(M) \ {0}.

A classical pseudo-differential operator A ∈ L^m_cl(M) is said to be elliptic of order m if its homogeneous principal symbol σ_A(x, ξ) does not vanish on the bundle T^*(M) \ {0} of non-zero cotangent vectors. Then we have the following result:
(6) An operator A ∈ L^m_cl(M) is elliptic if and only if there exists a parametrix B ∈ L^{−m}_cl(M) for A:

  AB ≡ I mod L^{−∞}(M),
  BA ≡ I mod L^{−∞}(M).

From now on, we only consider classical pseudo-differential operators, which we often encounter in applications. For example, differential operators and parametrices for elliptic differential operators are classical pseudo-differential operators.

The next theorem asserts that the class L^m_cl(M) of classical pseudo-differential operators is stable under composition of operators and under taking the transpose or adjoint of an operator.

Theorem 9.24 (i) If A ∈ L^m_cl(M), then its transpose A′ and its adjoint A* are both in the class L^m_cl(M), and we have the formulas




  σ_{A′}(x, ξ) = σ_A(x, −ξ),
  σ_{A*}(x, ξ) = \overline{σ_A(x, ξ)}.   (9.9)

(ii) If A ∈ L^{m′}_cl(M) and B ∈ L^{m″}_cl(M), then the composition AB is in the class L^{m′+m″}_cl(M), and we have the formula

  σ_{AB}(x, ξ) = σ_A(x, ξ) · σ_B(x, ξ).   (9.10)

9.6.3 Elliptic Pseudo-differential Operators

A classical pseudo-differential operator A ∈ L^m_cl(M) is said to be elliptic of order m if its homogeneous principal symbol σ_A(x, ξ) does not vanish on the bundle T^*(M) \ {0} of non-zero cotangent vectors.
The next theorem is a generalization of Theorem 9.18:

Theorem 9.25 An operator A ∈ L^m_cl(M) is elliptic if and only if there exists a parametrix B ∈ L^{−m}_cl(M) for A:

  AB ≡ I mod L^{−∞}(M),
  BA ≡ I mod L^{−∞}(M).

9.7 Elliptic Pseudo-differential Operators and Their Indices

In this section, by using the Riesz–Schauder theory we prove some of the most important results about elliptic pseudo-differential operators on a manifold. These results will be useful for the study of interior boundary value problems in Chap. 11. Throughout this section, let M be an n-dimensional compact smooth manifold without boundary.

9.7.1 Pseudo-differential Operators on Sobolev Spaces

Let H^s(M) be the Sobolev space of order s ∈ R on M. Recall that

  C^∞(M) = ∩_{s∈R} H^s(M),
  D′(M) = ∪_{s∈R} H^s(M).


A linear operator T : C^∞(M) → C^∞(M) is said to be of order m ∈ R if it extends to a continuous linear operator from H^s(M) into H^{s−m}(M) for each s ∈ R. For example, every pseudo-differential operator in L^m(M) is of order m. We say that T : C^∞(M) → C^∞(M) is of order −∞ if it extends to a continuous linear operator from H^s(M) into C^∞(M) for each s ∈ R. This is equivalent to saying that T is a regularizer; hence we have the formula

  L^{−∞}(M) = the set of all operators of order −∞.   (9.11)

Let T : H^s(M) → H^t(M) be a linear operator with domain D(T) dense in H^s(M). Each element v of H^{−t}(M) defines a linear functional G on D(T) by the formula

  G(u) = (Tu, v) for u ∈ D(T),

where (·, ·) on the right-hand side is the sesquilinear pairing of H^t(M) and H^{−t}(M). If this functional G is continuous everywhere on the domain D(T), then by applying Theorem 5.11 we obtain that G can be extended uniquely to a continuous linear functional G̃ on the closure of D(T), which is H^s(M). Hence there exists a unique element v* of H^{−s}(M) such that

  G̃(u) = (u, v*) for u ∈ H^s(M),

since the sesquilinear form (·, ·) on the product space H^s(M) × H^{−s}(M) permits us to identify the strong dual space of H^s(M) with H^{−s}(M). In particular, we have the formula

  (Tu, v) = G(u) = (u, v*) for all u ∈ D(T).

So we let

  D(T*) = the totality of those v ∈ H^{−t}(M) such that the mapping u ↦ (Tu, v) is continuous everywhere on the domain D(T),

and define

  T*v = v*.

Therefore, it follows that T* is a linear operator from H^{−t}(M) into H^{−s}(M) with domain D(T*) such that

  (Tu, v) = (u, T*v) for all u ∈ D(T) and v ∈ D(T*).   (9.12)

The operator T* is called the adjoint of T. The operators T : H^s(M) → H^t(M) and T* : H^{−t}(M) → H^{−s}(M) can be visualized as in Fig. 9.8 below. Similarly, the transpose of T is a linear operator T′ from H^{−t}(M) into H^{−s}(M) with domain D(T′) such that

9 L 2 Theory of Pseudo-differential Operators

414

Fig. 9.8 The operators T and T*:

  u ∈ D(T) ⊂ H^s(M) ──T──→ Tu ∈ H^t(M)
  v* = T*v ∈ H^{−s}(M) ←─T*── v ∈ D(T*) ⊂ H^{−t}(M)

Fig. 9.9 The operators T and T′:

  u ∈ D(T) ⊂ H^s(M) ──T──→ Tu ∈ H^t(M)
  v′ = T′v ∈ H^{−s}(M) ←─T′── v ∈ D(T′) ⊂ H^{−t}(M)

  D(T′) = the totality of those v ∈ H^{−t}(M) such that the mapping u ↦ ⟨Tu, v⟩ is continuous everywhere on the domain D(T),

and satisfies the formula

  ⟨Tu, v⟩ = ⟨u, T′v⟩ for all u ∈ D(T) and v ∈ D(T′).   (9.13)

Here ⟨·, ·⟩ on the left-hand (resp. right-hand) side is the bilinear pairing of H^t(M) and H^{−t}(M) (resp. H^s(M) and H^{−s}(M)). The operators T : H^s(M) → H^t(M) and T′ : H^{−t}(M) → H^{−s}(M) can be visualized as in Fig. 9.9.
In view of formulas (9.12) and (9.13), it follows that
(a) v ∈ D(T′) ⟺ v̄ ∈ D(T*),
(b) T′v = \overline{T* v̄} for every v ∈ D(T′).
Here the bar denotes complex conjugation. Hence we have the following two assertions:

  1. The ranges R(T*) and R(T′) are isomorphic.
  2. The null spaces N(T*) and N(T′) are isomorphic.   (9.14)

Now let A ∈ L^m(M). Then it follows from an application of Theorem 9.21 that the operator A : C^∞(M) → C^∞(M) extends uniquely to a continuous linear operator

  A_s : H^s(M) → H^{s−m}(M) for all s ∈ R,

and hence to a continuous linear operator

  Ã : D′(M) → D′(M).

The situation can be visualized as in Fig. 9.10 below.


Fig. 9.10 The mapping properties of A, A_s and Ã:

  D′(M) ──Ã──→ D′(M)
    ∪              ∪
  H^s(M) ──A_s──→ H^{s−m}(M)
    ∪              ∪
  C^∞(M) ──A──→ C^∞(M)

Fig. 9.11 The operators A_s and (A_s)* = A*_{−s+m}:

  H^s(M) ──A_s──→ H^{s−m}(M)
  H^{−s}(M) ←─(A_s)*── H^{−s+m}(M)

The adjoint A* of A is also in L^m(M); hence the operator A* : C^∞(M) → C^∞(M) extends uniquely to a continuous linear operator

  A*_s : H^s(M) → H^{s−m}(M) for all s ∈ R.

The next lemma states a fundamental relationship between the operators A_s and A*_s (see Fig. 9.11):

Lemma 9.26 If A ∈ L^m(M), we have, for all s ∈ R,

  (A_s)* = A*_{−s+m},
  (A*_{−s+m})* = A_s.   (9.15)

Proof If u ∈ D(A_s) = H^s(M) and v ∈ D(A*_{−s+m}) = H^{−s+m}(M), there exist sequences {u_j} and {v_j} in C^∞(M) such that u_j → u in H^s(M) and v_j → v in H^{−s+m}(M) as j → ∞, respectively. Then we have the assertions

  Au_j → A_s u in H^{s−m}(M),
  A*v_j → A*_{−s+m} v in H^{−s}(M),

so that

  (A_s u, v) = lim_{j→∞} (Au_j, v_j) = lim_{j→∞} (u_j, A*v_j) = (u, A*_{−s+m} v).


This proves the desired formulas (9.15). The proof of Lemma 9.26 is complete. □

9.7.2 The Index of an Elliptic Pseudo-differential Operator

In this subsection, we study the operators A_s when A is a classical elliptic pseudo-differential operator. The next theorem is an immediate consequence of Theorem 9.25:

Theorem 9.27 (the elliptic regularity theorem) Let A ∈ L^m_cl(M) be elliptic. Then we have, for all s ∈ R and u ∈ D′(M),

  Au ∈ H^s(M) ⟹ u ∈ H^{s+m}(M).

In particular, we have the assertions

  R(A_s) ∩ C^∞(M) = R(A),   (9.16)
  N(A_s) = N(A).   (9.17)

Here

  R(A_s) = {Au : u ∈ H^s(M)},  R(A) = {Au : u ∈ C^∞(M)};
  N(A_s) = {u ∈ H^s(M) : A_s u = 0},  N(A) = {u ∈ C^∞(M) : Au = 0}.

It is worth pointing out that assertion (9.17) is a generalization of the celebrated Weyl theorem, which states that harmonic functions on Euclidean space are smooth.
The next theorem states that the operators A_s are Fredholm operators:

Theorem 9.28 If A ∈ L^m_cl(M) is elliptic, the operator A_s : H^s(M) → H^{s−m}(M) is a Fredholm operator for all s ∈ R.

Proof Take a parametrix B ∈ L^{−m}_cl(M) for A:

  BA = I + P with P ∈ L^{−∞}(M),
  AB = I + Q with Q ∈ L^{−∞}(M).

Then we have, for all s ∈ R,

  B_{s−m} · A_s = I + P_s,
  A_s · B_{s−m} = I + Q_{s−m}.


Furthermore, in view of assertion (9.11), it follows from an application of Rellich's theorem (Theorem 8.5) that the operators

  P_s : H^s(M) → H^s(M) and Q_{s−m} : H^{s−m}(M) → H^{s−m}(M)

are both compact. Therefore, by applying Theorem 5.62 to our situation we obtain that A_s is a Fredholm operator. The proof of Theorem 9.28 is complete. □

Corollary 9.29 Let A ∈ L^m_cl(M) be elliptic. Then we have the following two assertions:
(i) The range R(A) of A is a closed linear subspace of C^∞(M).
(ii) The null space N(A) of A is a finite dimensional, closed linear subspace of C^∞(M).

Proof (i) It follows from Theorem 9.28 that the range R(A_s) of A_s is closed in H^{s−m}(M); hence it is closed in C^∞(M), since the injection C^∞(M) → H^{s−m}(M) is continuous. In view of formula (9.16), this proves part (i).
(ii) Similarly, in view of formula (9.17) it follows from Theorem 9.28 that N(A) has finite dimension; so it is closed in each H^s(M) and hence in C^∞(M). The proof of Corollary 9.29 is complete. □

The next theorem asserts that the index

  ind A_s = dim N(A_s) − codim R(A_s)

does not depend on s ∈ R:

Theorem 9.30 If A ∈ L^m_cl(M) is elliptic, then we have, for all s ∈ R,

  ind A_s = dim N(A) − dim N(A*).   (9.18)

Here

  N(A*) = {v ∈ C^∞(M) : A*v = 0}.

Proof Since the range R(A_s) is closed in H^{s−m}(M), by applying the closed range theorem (Theorem 5.53) to our situation we obtain that

  codim R(A_s) = dim N((A_s)′).


However, in view of assertion (9.14) it follows that

  dim N((A_s)′) = dim N((A_s)*).

Furthermore, we have, by formulas (9.15) and (9.17),

  N((A_s)*) = N(A*_{−s+m}) = N(A*),   (9.19)

since A* ∈ L^m_cl(M) is also elliptic (see formula (9.9)). Summing up, we obtain that

  codim R(A_s) = dim N(A*).   (9.20)

Therefore, the desired formula (9.18) follows from formulas (9.17) and (9.20). The proof of Theorem 9.30 is complete.
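A minimal illustration of formula (9.18) (our example, not the book's): on the circle M = S¹, the operator A = d/dθ ∈ L¹_cl(S¹) has homogeneous principal symbol σ_A(θ, ξ) = iξ ≠ 0 for ξ ≠ 0, hence is elliptic, and

```latex
N(A) = \{\text{constants}\}, \qquad
A^* = -\frac{d}{d\theta}, \qquad
N(A^*) = \{\text{constants}\},
```

so ind A_s = dim N(A) − dim N(A*) = 1 − 1 = 0 for every s ∈ R, in agreement with Theorem 9.38 below (here A* = −A, so λ = −1).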



We give another useful expression for ind A_s. To do this, we need the following lemma:

Lemma 9.31 Let A ∈ L^m_cl(M) be elliptic. Then the spaces N(A*) and R(A) are orthogonal complements of each other in C^∞(M) relative to the inner product of L²(M) (see Fig. 9.12):

  C^∞(M) = N(A*) ⊕ R(A).   (9.21)

Proof Since the range R(A_m) is closed in L²(M), it follows from an application of the closed range theorem that

  L²(M) = N(A*_m) ⊕ R(A_m).   (9.22)

However we have, by formulas (9.19) and (9.16),

Fig. 9.12 The orthogonal decomposition (9.21) of C^∞(M) in L²(M): within C^∞(M), the subspaces N(A*) = N(A*_m) and R(A) = R(A_m) ∩ C^∞(M) meet only at 0 and are orthogonal complements of each other.


  N(A*_m) = N(A*),   (9.23a)
  R(A_m) ∩ C^∞(M) = R(A).   (9.23b)

Therefore, the desired orthogonal decomposition (9.21) follows from assertions (9.23a) and (9.23b), by restricting the orthogonal decomposition (9.22) to the space C^∞(M). The proof of Lemma 9.31 is complete. □

Now we can prove the following:

Theorem 9.32 If A ∈ L^m_cl(M) is elliptic, then we have, for all s ∈ R,

  ind A_s = dim N(A) − codim R(A).   (9.24)

Here

  codim R(A) = dim C^∞(M)/R(A).

Proof The orthogonal decomposition (9.21) tells us that

  dim N(A*) = codim R(A).

Hence, the desired formula (9.24) follows from formula (9.18). The proof of Theorem 9.32 is complete. □



We let

  ind A = dim N(A) − dim N(A*) = dim N(A) − codim R(A).   (9.25)

The next theorem states that the index of an elliptic pseudo-differential operator depends only on its principal symbol:

Theorem 9.33 If A, B ∈ L^m_cl(M) are elliptic and if they have the same homogeneous principal symbol, then it follows that

  ind A = ind B.   (9.26)

Proof Since the difference A − B belongs to the class L^{m−1}_cl(M), it follows from Rellich's theorem (Theorem 8.5) that the operator

  A_s − B_s : H^s(M) → H^{s−m}(M)

is compact. Hence, by applying Theorem 5.64 we obtain that

  ind A_s = ind (B_s + (A_s − B_s)) = ind B_s.

In view of Theorem 9.32, this proves the desired formula (9.26).




The proof of Theorem 9.33 is complete. □

As for the product of elliptic pseudo-differential operators, we have the following:

Theorem 9.34 If A ∈ L^m_cl(M) and B ∈ L^{m′}_cl(M) are elliptic, then we have the formula

  ind BA = ind B + ind A.   (9.27)

Proof We remark that we have, for each s ∈ R,

  (BA)_s = B_{s−m} · A_s.

Hence, by applying Theorem 5.63 to our situation we obtain that

  ind (BA)_s = ind B_{s−m} + ind A_s.

This proves formula (9.27), since BA is an elliptic operator in L^{m+m′}_cl(M) (see Theorem 9.24). The proof of Theorem 9.34 is complete. □

As for the adjoints, we have the following:

Theorem 9.35 If A ∈ L^m_cl(M) is elliptic, then we have the formula

  ind A* = −ind A.   (9.28)

Indeed, it suffices to note that A** = A.
We give some useful criteria for ind A = 0:

Theorem 9.36 If A ∈ L^m_cl(M) is elliptic and if A and A* have the same homogeneous principal symbol, then it follows that

  ind A = 0.   (9.29)

Indeed, Theorem 9.33 tells us that ind A = ind A*. However, in view of formula (9.28) this implies the desired formula (9.29).

Corollary 9.37 If A ∈ L^m_cl(M) is elliptic and if its homogeneous principal symbol is real, then we have the assertion ind A = 0.

Indeed, by part (i) of Theorem 9.24 it follows that A and A* have the same homogeneous real principal symbol. Therefore, Theorem 9.36 applies.

Theorem 9.38 If A ∈ L^m_cl(M) is elliptic and if A* = λA for some λ ∈ C, then we have the assertions


  |λ| = 1 and ind A = 0.

Proof First, we remark that

  A = A** = (λA)* = λ̄A* = λ̄λA = |λ|² A.

Hence we have |λ| = 1, and so

  N(λA) = N(A),
  R(λA) = R(A).

Therefore, it follows from formula (9.28) that

  ind A = ind λA = ind A* = −ind A.

This proves that ind A = 0. The proof of Theorem 9.38 is complete. □



The next theorem describes conditions under which an elliptic pseudo-differential operator is invertible on Sobolev spaces:

Theorem 9.39 Let A ∈ L^m_cl(M) be elliptic. Assume that

  ind A = 0 and N(A) = {0}.

Then we have the following three assertions:
(i) The operator A : C^∞(M) → C^∞(M) is bijective.
(ii) The operator A_s : H^s(M) → H^{s−m}(M) is an isomorphism for each s ∈ R.
(iii) The inverse A^{−1} of A belongs to the class L^{−m}_cl(M).

Proof (i) Since ind A = 0 and N(A) = {0}, it follows from formula (9.25) that

  N(A*) = {0}.

Hence the surjectivity of A follows from the orthogonal decomposition (9.21).
(ii) Since N(A_s) = N(A) = {0} and ind A_s = ind A = 0, it follows that the operator A_s : H^s(M) → H^{s−m}(M) is bijective for each s ∈ R. Therefore, by applying the closed graph theorem (Theorem 5.50) to our situation we obtain that the inverse


Fig. 9.13 The mapping properties of A^{−1} and A_s^{−1}:

  H^{s−m}(M) ──A_s^{−1}──→ H^s(M)
    ∪                        ∪
  C^∞(M) ──A^{−1}──→ C^∞(M)

  A_s^{−1} : H^{s−m}(M) → H^s(M)

is continuous for each s ∈ R.
(iii) Since we have the formula

  A^{−1} = A_s^{−1}|_{C^∞(M)}

and since each A_s^{−1} : H^{s−m}(M) → H^s(M) is continuous, it follows that the operator A^{−1} : C^∞(M) → C^∞(M) is continuous, and also it is of order −m. The situation can be visualized as in Fig. 9.13.
It remains to prove that A^{−1} ∈ L^{−m}_cl(M). Take a parametrix B ∈ L^{−m}_cl(M) for A:

  AB = I + P with P ∈ L^{−∞}(M),
  BA = I + Q with Q ∈ L^{−∞}(M).

Then we have the formula

  A^{−1} − B = (I − BA)A^{−1} = −Q · A^{−1}.

However, in view of assertion (9.11) it follows that the operator Q · A^{−1} belongs to L^{−∞}(M), since it is of order −∞. This proves that

  A^{−1} = B − Q · A^{−1} ∈ L^{−m}_cl(M).

The proof of Theorem 9.39 is complete. □



The next theorem states that the Sobolev spaces H^s(M) can be characterized in terms of elliptic pseudo-differential operators:

Theorem 9.40 Let A ∈ L^m_cl(M) be elliptic with m > 0. Assume that


A = A∗ , N (A) = {0}.

Then we have the following two assertions: (i) There exists a complete orthonormal system {ϕ j } of L 2 (M) consisting of eigenfunctions of A, and its corresponding eigenvalues {λ j } are real and |λ j | → +∞ as j → ∞. (ii) A distribution u ∈ D (M) belongs to H mr (M) for some integer r if and only if we have the condition ∞   2 λ2rj  u, ϕ j  < +∞. j=1

More precisely, the quantity (u, v)mr =

∞ 

   λ2rj u, ϕ j v, ϕ j

(9.30)

j=1

is an admissible inner product for the Hilbert space H mr (M). Proof (i) Since A = A∗ , it follows from Theorem 9.35 that ind A = 0. Hence, by applying Theorem 9.39 we obtain that the operator A : C ∞ (M) −→ C ∞ (M) is bijective, and its inverse A−1 is an elliptic operator in L −m cl (M). We let  A−1 = the composition of (A−1 )0 : L 2 (M) → H m (M) and the injection: H m (M) → L 2 (M). Then it follows from Rellich’s theorem (Theorem 8.5) that the operator  A−1 : L 2 (M) −→ L 2 (M) is compact. Furthermore, since C ∞ (M) is dense in L 2 (M) and A = A∗ , we have the formula      A−1 v for u, v ∈ L 2 (M), A−1 u, v = u,  A−1 is where (·, ·) is the inner product of L 2 (M). This implies that the operator  self-adjoint. Also, we have the assertion     N  A−1 = N (A−1 )0 = N (A−1 ) = {0},


since A^{−1} ∈ L^{−m}_cl(M) is elliptic. Therefore, by applying the Hilbert–Schmidt theorem (Theorem 5.81) to the self-adjoint operator Ã^{−1} we obtain that there exists a complete orthonormal system {ϕ_j} of L²(M) consisting of eigenfunctions of Ã^{−1}, and the corresponding eigenvalues {μ_j} are real and converge to zero as j → ∞. Since the eigenvalues μ_j are all non-zero, it follows that

  ϕ_j = (1/μ_j) Ã^{−1} ϕ_j = (1/μ_j) (A^{−1})₀ ϕ_j ∈ H^m(M).

However, note that

  (A^{−1})₀ |_{H^m(M)} = (A^{−1})_m

and that

  (A^{−1})_m : H^m(M) → H^{2m}(M).

Hence, we have the assertion

  ϕ_j = (1/μ_j) (A^{−1})_m ϕ_j ∈ H^{2m}(M).

Continuing this way, we obtain that

  ϕ_j ∈ ∩_{k∈N} H^{km}(M) = C^∞(M).

Therefore, we have the formula A^{−1}ϕ_j = μ_j ϕ_j, and so

  Aϕ_j = λ_j ϕ_j for λ_j = 1/μ_j,

where

  |λ_j| = 1/|μ_j| → +∞ as j → ∞.

(ii) For each integer r, we let

  A^r := the r-fold composition of A if r ≥ 0, and A^r := (A^{−1})^{|r|} if r < 0,

where A⁰ = I. Then it follows that A^r is an elliptic operator in L^{mr}_cl(M) and that


  N(A^r) = {0}.

Moreover we have, by Theorem 9.34,

  ind A^r = r ind A = 0 if r ≥ 0,
  ind A^r = |r| ind A^{−1} = 0 if r < 0.

Therefore, by applying Theorem 9.39 we obtain that the operator

  (A^r)_{mr} : H^{mr}(M) → L²(M)

is an isomorphism. Thus the quantity

  (u, v)_{mr} = ((A^r)_{mr} u, (A^r)_{mr} v)

is an admissible inner product for the Hilbert space H^{mr}(M). Furthermore, since {ϕ_j} is a complete orthonormal system of L²(M), we have, by Parseval's formula (see Theorem 5.78),

  (u, v)_{mr} = Σ_{j=1}^∞ ((A^r)_{mr} u, ϕ_j) \overline{((A^r)_{mr} v, ϕ_j)}.   (9.31)

However, we have, by formula (9.15),

  ((A^r)_{mr})* = ((A^r)*)₀ = (A^r)₀,

since A = A*. Hence, it follows that

  ((A^r)_{mr} u, ϕ_j) = (u, (A^r)₀ ϕ_j) = (u, A^r ϕ_j) = λ_j^r (u, ϕ_j).   (9.32)

Therefore, the desired formula (9.30) follows from formulas (9.31) and (9.32). The proof of Theorem 9.40 is complete. □



As one of the important applications of Theorem 9.40, we can obtain the following:

Theorem 9.41 Let Δ be the Laplace–Beltrami operator on M, let {χ_j} be the orthonormal system of L²(M) consisting of eigenfunctions of −Δ, and let {λ_j} be the corresponding eigenvalues:

  −Δχ_j = λ_j χ_j with λ_j ≥ 0.


Then the functions χ_j span the Sobolev space H^s(M) for each s ∈ R. More precisely, the quantity

  (u, v)_s = Σ_{j=1}^∞ (1 + λ_j)^s (u, χ_j) \overline{(v, χ_j)}

is an admissible inner product for the Hilbert space H^s(M).
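On the unit circle (a special case of Theorem 9.41, chosen by us for illustration), the eigenfunctions of −d²/dθ² are e^{ikθ} with eigenvalues k², so the H^s inner products become weighted sums of Fourier coefficients. A minimal numerical sketch with numpy's FFT, assuming a uniform grid:

```python
import numpy as np

# Uniform grid on the circle [0, 2*pi)
N = 256
theta = 2 * np.pi * np.arange(N) / N
k = np.fft.fftfreq(N, d=1.0 / N)       # integer frequencies

def sobolev_inner(u, v, s):
    """(u, v)_s = sum_k (1 + k^2)^s u_hat(k) conj(v_hat(k)): the admissible
    H^s inner product of Theorem 9.41 on the circle (lambda_k = k^2)."""
    uh = np.fft.fft(u) / N
    vh = np.fft.fft(v) / N
    return np.sum((1 + k**2) ** s * uh * np.conj(vh))

# u = sin(3*theta): an eigenfunction pair with eigenvalue lambda = 9
u = np.sin(3 * theta)

# (I - Delta)u = (1 + 9)u here, so the spectral multiplier 1/(1 + k^2)
# inverts I - Delta exactly on this grid
w = 10 * u
u_rec = np.real(np.fft.ifft(np.fft.fft(w) / (1 + k**2)))
assert np.allclose(u_rec, u)

# ||u||_{H^1}^2 = (1 + 9) ||u||_{L^2}^2 for this eigenfunction
ratio = sobolev_inner(u, u, 1) / sobolev_inner(u, u, 0)
assert np.isclose(ratio.real, 10.0)
```

The same two lines of spectral arithmetic are what Theorem 9.40(ii) encodes abstractly: membership in H^{mr}(M) is a summability condition on eigenfunction coefficients.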

9.8 Potentials and Pseudo-differential Operators

The purpose of this section is to describe, in terms of pseudo-differential operators, the classical surface and volume potentials arising in boundary value problems for elliptic differential operators.

9.8.1 Single and Double Layer Potentials Revisited

Recall that the Newtonian potential is defined by the formula

  (−Δ)^{−1} f(x) = N * f(x)
    = (Γ((n−2)/2)/(4π^{n/2})) ∫_{R^n} f(y)/|x − y|^{n−2} dy
    = (1/((n−2)ω_n)) ∫_{R^n} f(y)/|x − y|^{n−2} dy for f ∈ C_0^∞(R^n).

Here

  ω_n = 2π^{n/2}/Γ(n/2)

is the surface area of the unit sphere. In the case n = 3, we have the formula

  u(x) = (1/(4π)) ∫_{R³} f(y)/|x − y| dy.

Up to an appropriate constant of proportionality, the Newtonian potential N(x − y) is the gravitational potential at position x due to a unit point mass at position y, and so the function u(x) is the gravitational potential due to a continuous mass distribution with density f(x). In terms of electrostatics, the function u(x) describes the electrostatic potential due to a charge distribution with density f(x).
We define a single layer potential with density ϕ by the formula


  N * (ϕ(x′) ⊗ δ(x_n))
    = (Γ((n−2)/2)/(4π^{n/2})) ∫_{R^{n−1}} ϕ(y′)/(|x′ − y′|² + x_n²)^{(n−2)/2} dy′
    = (1/((n−2)ω_n)) ∫_{R^{n−1}} ϕ(y′)/(|x′ − y′|² + x_n²)^{(n−2)/2} dy′,
  ϕ ∈ C_0^∞(R^{n−1}).

In the case n = 3, the function N * (ϕ ⊗ δ) is related to the distribution of electric charge on a conductor Ω. In equilibrium, mutual repulsion causes all the charge to reside on the surface ∂Ω of the conducting body with density ϕ, and ∂Ω is an equipotential surface.
We define a double layer potential with density ψ by the formula

  N * (ψ(x′) ⊗ δ′(x_n)) = (1/ω_n) ∫_{R^{n−1}} x_n ψ(y′)/(|x′ − y′|² + x_n²)^{n/2} dy′,
  ψ ∈ C_0^∞(R^{n−1}).

In the case n = 3, the function N * (ψ ⊗ δ′) is the potential induced by a distribution of dipoles on R² with density ψ(y′), the axes of the dipoles being normal to R².
On the other hand, it is easy to verify that if ϕ is bounded and continuous on R^{n−1}, then the function

  u(x′, x_n) = (2/ω_n) ∫_{R^{n−1}} x_n ϕ(y′)/(|x′ − y′|² + x_n²)^{n/2} dy′   (7.57)

is well-defined for (x′, x_n) ∈ R^n_+, and is a (unique) solution of the Dirichlet problem



n , u = 0 in R+ u = ϕ on Rn−1 .

(DP)

Recall that formula (7.57) is the Poisson integral formula for the solution of the Dirichlet problem. Furthermore, by using the Fourier transform we can express formula (7.57) for ϕ ∈ S(Rn−1 ) as follows:  1 ei x ξ e−xn |ξ | ϕ (ξ ) dξ . (7.58) u(x , xn ) = (2π)n−1 Rn−1
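For instance (an illustration we add, taking $n = 2$ so that $\omega_2 = 2\pi$), the Poisson kernel in (7.57) has total mass one on every horizontal line $x_n = \mathrm{const} > 0$, which is consistent with the boundary condition $u = \varphi$ for $\varphi \equiv 1$; a minimal numerical check:

```python
import numpy as np

# Poisson kernel for the half-plane (n = 2, omega_2 = 2*pi):
#   P(x1, x2) = (2/omega_2) * x2 / (x1**2 + x2**2) = (1/pi) * x2 / (x1**2 + x2**2)
x2 = 0.7
t = np.linspace(-500.0, 500.0, 1_000_001)
P = (1.0 / np.pi) * x2 / (t**2 + x2**2)

# trapezoid rule; the exact truncated mass is (2/pi)*arctan(500/x2), close to 1
mass = np.sum(0.5 * (P[1:] + P[:-1]) * np.diff(t))
assert abs(mass - 1.0) < 2e-3
```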

9.8.2 The Green Representation Formula Revisited

By applying the Newtonian potential to both sides of the jump formula (7.15), we obtain that
\[
u^0 = \bigl(N * (-\Delta)\bigr) u^0
= N * (-\Delta u)^0 - N * \bigl(\gamma_1 u \otimes \delta(x_n)\bigr) - N * \bigl(\gamma_0 u \otimes \delta'(x_n)\bigr)
= -\int_{\mathbb{R}^n_+} N(x-y)\, \Delta u(y)\, dy
- \int_{\mathbb{R}^{n-1}} N(x-y', x_n)\, \frac{\partial u}{\partial y_n}(y', 0)\, dy'
+ \int_{\mathbb{R}^{n-1}} \frac{\partial N}{\partial y_n}(x-y', x_n)\, u(y', 0)\, dy'.
\]
Hence we arrive at Green's representation formula:
\[
u(x) = \frac{1}{(2-n)\,\omega_n} \int_{\mathbb{R}^n_+} \frac{\Delta u(y)}{|x-y|^{n-2}}\, dy
+ \frac{1}{(2-n)\,\omega_n} \int_{\mathbb{R}^{n-1}} \frac{1}{(|x'-y'|^2 + x_n^2)^{(n-2)/2}}\, \frac{\partial u}{\partial y_n}(y', 0)\, dy'
+ \frac{1}{\omega_n} \int_{\mathbb{R}^{n-1}} \frac{x_n}{(|x'-y'|^2 + x_n^2)^{n/2}}\, u(y', 0)\, dy'
\quad \text{for } x = (x', x_n) \in \mathbb{R}^n_+.
\tag{9.33}
\]
We remark that the first term in formula (9.33) is the Newtonian potential and the second and third terms are the single and double layer potentials, respectively.

9.8.3 Surface and Volume Potentials

This subsection is a modern version of the classical potential theory in terms of pseudo-differential operators. We start with a formal description of the background. Let $\Omega$ be a bounded domain in Euclidean space $\mathbb{R}^n$ with smooth boundary $\partial\Omega$. Its closure $\overline{\Omega} = \Omega \cup \partial\Omega$ is an $n$-dimensional compact smooth manifold with boundary. We may assume that $\overline{\Omega}$ is the closure of a relatively compact, open subset $\Omega$ of an $n$-dimensional, compact smooth manifold $M$ without boundary in which $\Omega$ has a smooth boundary $\partial\Omega$ (see Fig. 8.1).

Let $P$ be a differential operator of order $m$ with smooth coefficients on $M$. Then we have the jump formula (see formula (7.16))
\[
P(u^0) = (Pu)^0 + P_\gamma u \quad \text{for } u \in C^\infty(\overline{\Omega}),
\]
where $u^0$ is the extension of $u$ to $M$ by zero outside $\overline{\Omega}$, and $P_\gamma u$ is a distribution on $M$ with support in $\partial\Omega$. If $P$ admits an "inverse" $Q$, then the function $u$ may be expressed as follows:
\[
u = Q\bigl((Pu)^0\bigr)\big|_\Omega + Q\bigl(P_\gamma u\bigr)\big|_\Omega.
\]
The first term on the right-hand side is a volume potential and the second term is a surface potential with $m$ "layers". For example, if $P$ is the usual Laplacian $\Delta$ and if $\Omega = \mathbb{R}^n_+$, then the first term is the classical Newtonian potential and the second term is the familiar combination of single and double layer potentials.

Our pseudo-differential operator approach to sectional traces is based on the following theorem due to Chazarain–Piriou [35, Chapitre V, Théorème 2.2]:

Theorem 9.42 (the sectional trace theorem) Let $\Omega$ be an open set in $\mathbb{R}^n$ with smooth boundary $\partial\Omega$. Let $A \in L^\mu_{\mathrm{cl}}(\mathbb{R}^n)$ be properly supported. Assume that
\[
\text{every term in the complete symbol } \sum_{j=0}^\infty a_j(x, \xi) \text{ of } A \text{ is a rational function of } \xi.
\tag{9.34}
\]
If $u \in \mathcal{D}'(\mathbb{R}^n)$ is equal to $0$ in $\Omega$, then the distribution $Au|_\Omega$ has sectional traces of any order on $\partial\Omega$.

Remark 9.43 In view of Theorem 9.19, it follows that condition (9.34) is invariant under change of coordinates. Furthermore, it is easy to see that every parametrix for an elliptic differential operator satisfies condition (9.34).

Proof By using local coordinate systems flattening out $\partial\Omega$, together with a partition of unity, we may assume that
\[
\Omega = \mathbb{R}^n_+, \qquad u \in \mathcal{E}'(\mathbb{R}^n),
\]
and that $u = 0$ in $\mathbb{R}^n_+$.

The proof of Theorem 9.42 is divided into seven steps.

Step (1): First, we can find a constant $C > 0$ and an integer $\ell \geq 0$ such that
\[
|\widehat{u}(\xi)| \leq C\,(1 + |\xi|)^\ell \quad \text{for all } \xi \in \mathbb{R}^n.
\]
Hence we have, for every $\varepsilon > 0$,
\[
u \in H^{-\ell - n/2 - \varepsilon}(\mathbb{R}^n).
\]
If $B \in L^m_{\mathrm{cl}}(\mathbb{R}^n)$ for $m < 0$, it follows that
\[
Bu \in H^{-\ell - m - n/2 - \varepsilon}(\mathbb{R}^n).
\]
By Sobolev's imbedding theorem (Theorem 8.3), we obtain that
\[
Bu \in C^k(\mathbb{R}^n) \quad \text{for } k < -m - n - \ell.
\tag{9.35}
\]


Take a function $\chi(\xi) \in C^\infty(\mathbb{R}^n)$ such that
\[
\chi(\xi) =
\begin{cases}
0 & \text{in a neighborhood of } \xi = 0, \\
1 & \text{for } |\xi| \geq 1.
\end{cases}
\]
If $a(x, \xi) \in S^\mu_{\mathrm{cl}}(\mathbb{R}^n \times \mathbb{R}^n)$ is a symbol of $A \in L^\mu_{\mathrm{cl}}(\mathbb{R}^n)$, we can express it as follows:
\[
a(x, \xi) = \chi(\xi)\, a(x, \xi) + (1 - \chi(\xi))\, a(x, \xi),
\]
where
\[
(1 - \chi(\xi))\, a(x, \xi) \in S^{-\infty}(\mathbb{R}^n \times \mathbb{R}^n).
\]
By applying assertion (9.35) with $m := -\infty$, we find that
\[
(I - \chi(D))\, a(x, D)\, u \in C^\infty(\mathbb{R}^n).
\]
Hence we are reduced to the following case:
\[
\begin{cases}
\widetilde{A} = \widetilde{a}(x, D) \in L^\mu_{\mathrm{cl}}(\mathbb{R}^n), \\
\widetilde{a}(x, \xi) = \chi(\xi)\, a(x, \xi), \\
a(x, t\xi) = t^\mu\, a(x, \xi) \quad \text{for all } t > 0.
\end{cases}
\]

Step (2): Let $\Phi(x)$ be an arbitrary function in $C_0^\infty(\mathbb{R}^n)$. Then we have the formula
\[
\langle Au, \Phi \rangle
= \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} \widehat{u}(\xi) \left( \int_{\mathbb{R}^n} e^{i x \cdot \xi}\, \widetilde{a}(x, \xi)\, \Phi(x)\, dx \right) d\xi
= \frac{1}{(2\pi)^n} \int_{|\xi| \geq 1} \widehat{u}(\xi) \left( \int_{\mathbb{R}^n} e^{i x \cdot \xi}\, a(x, \xi)\, \Phi(x)\, dx \right) d\xi
+ \frac{1}{(2\pi)^n} \int_{|\xi| \leq 1} \widehat{u}(\xi) \left( \int_{\mathbb{R}^n} e^{i x \cdot \xi}\, \chi(\xi)\, a(x, \xi)\, \Phi(x)\, dx \right) d\xi
:= \langle Bu, \Phi \rangle + \langle Ru, \Phi \rangle.
\]
However, by using Fubini's theorem (Theorem 2.18) we can write down the last term as follows:
\[
\langle Ru, \Phi \rangle
= \frac{1}{(2\pi)^n} \int_{|\xi| \leq 1} \widehat{u}(\xi) \left( \int_{\mathbb{R}^n} e^{i x \cdot \xi}\, \chi(\xi)\, a(x, \xi)\, \Phi(x)\, dx \right) d\xi
= \int_{\mathbb{R}^n} \Bigl\langle \frac{1}{(2\pi)^n} \int_{|\xi| \leq 1} e^{i(x-y) \cdot \xi}\, \chi(\xi)\, a(x, \xi)\, d\xi,\; u(y) \Bigr\rangle\, \Phi(x)\, dx
= \int_{\mathbb{R}^n} \langle K_R(x, y), u(y) \rangle\, \Phi(x)\, dx,
\]


where $K_R(x, y)$ is the distribution kernel of $R$, given by the formula
\[
K_R(x, y) = \frac{1}{(2\pi)^n} \int_{|\xi| \leq 1} e^{i(x-y) \cdot \xi}\, \chi(\xi)\, a(x, \xi)\, d\xi.
\]
Since we have the assertion
\[
K_R(x, y) \in C^\infty(\mathbb{R}^n \times \mathbb{R}^n),
\]
it follows that
\[
Ru \in C^\infty(\mathbb{R}^n).
\]
Therefore, we are reduced to the study of the term
\[
Bu := Au - Ru,
\]
where
\[
\langle Bu, \Phi \rangle = \frac{1}{(2\pi)^n} \int_{|\xi| \geq 1} \widehat{u}(\xi)\, F(\xi)\, d\xi,
\qquad
F(\xi) = \int_{\mathbb{R}^n} e^{i x \cdot \xi}\, a(x, \xi)\, \Phi(x)\, dx.
\]

Step (3): Since $a(x, \xi)$ is a rational function of $\xi$, we find that the poles of the function $a(x, \xi', \xi_n)$ of $\xi_n$ remain in some compact subset of the complex plane $\mathbb{C}$ when $x$ belongs to a compact subset of $\mathbb{R}^n$ and $\xi'$ belongs to a compact subset of $\mathbb{R}^{n-1}$, respectively. Hence, for every fixed $\xi' \in \mathbb{R}^{n-1}$ the function
\[
\mathbb{C} \ni \xi_n \longmapsto F(\xi', \xi_n) = \int_{\mathbb{R}^n} e^{i x' \cdot \xi'}\, e^{i x_n \xi_n}\, a(x, \xi', \xi_n)\, \Phi(x)\, dx
\]
is holomorphic, for $|\xi_n|$ sufficiently large. Moreover, we have the following:

Claim 9.44 Assume that $\Phi(x) \in C_0^\infty(\mathbb{R}^n)$ satisfies the condition
\[
\operatorname{supp} \Phi \subset \overline{\mathbb{R}^n_+}.
\]
Then the function
\[
F(\xi', \xi_n) = \int_{\mathbb{R}^n} e^{i x \cdot \xi}\, a(x, \xi)\, \Phi(x)\, dx
\]
is rapidly decreasing with respect to the variable $\xi = (\xi', \xi_n)$, for $\xi' \in \mathbb{R}^{n-1}$ and $\operatorname{Im} \xi_n \geq 0$.


Proof By integration by parts, we have, for any multi-index $\alpha$,
\[
\xi^\alpha F(\xi)
= \int_{\mathbb{R}^n} D_x^\alpha\bigl(e^{i x \cdot \xi}\bigr)\, a(x, \xi', \xi_n)\, \Phi(x)\, dx
= (-1)^{|\alpha|} \int_{\mathbb{R}^n} e^{i x \cdot \xi}\, D_x^\alpha\bigl(a(x, \xi', \xi_n)\, \Phi(x)\bigr)\, dx.
\]
However, we remark that
\[
e^{i x \cdot \xi} = e^{i x' \cdot \xi'}\, e^{i x_n \xi_n} = e^{i x' \cdot \xi'}\, e^{i x_n \operatorname{Re} \xi_n} \cdot e^{-x_n \operatorname{Im} \xi_n},
\qquad
x \in \operatorname{supp} \Phi \implies x_n \geq 0.
\]
Hence we have, for some constant $C_\alpha > 0$,
\[
|\xi^\alpha F(\xi)| \leq C_\alpha\, (1 + |\xi|)^\mu \quad \text{for all } \xi' \in \mathbb{R}^{n-1} \text{ and } \operatorname{Im} \xi_n \geq 0.
\]
Therefore, for any multi-index $\alpha$ we can find a constant $C_\alpha > 0$ such that
\[
|F(\xi)| \leq C_\alpha\, (1 + |\xi|)^{\mu - |\alpha|} \quad \text{for all } \xi' \in \mathbb{R}^{n-1} \text{ and } \operatorname{Im} \xi_n \geq 0.
\]
The proof of Claim 9.44 is complete. □

On the other hand, we have the following:

Claim 9.45 Assume that $u = 0$ in $\mathbb{R}^n_+$, that is,
\[
\operatorname{supp} u \subset \bigl\{ x = (x', x_n) \in \mathbb{R}^n : x_n \leq 0 \bigr\}.
\]
Then the function $\widehat{u}(\xi', \xi_n)$ is slowly increasing with respect to the variable $\xi = (\xi', \xi_n)$, for $\xi' \in \mathbb{R}^{n-1}$ and $\operatorname{Im} \xi_n \geq 0$.

Proof Since $u \in \mathcal{E}'(\mathbb{R}^n)$, we can find a constant $C > 0$ and a non-negative integer $\ell$ such that
\[
|\langle u, \varphi \rangle| \leq C \sup_{x \in \operatorname{supp} u} \Bigl( \sum_{|\alpha| \leq \ell} |\partial^\alpha \varphi(x)| \Bigr) \quad \text{for all } \varphi \in C^\infty(\mathbb{R}^n).
\tag{9.36}
\]
Thus, by taking
\[
\varphi(x) := e^{-i x \cdot \xi},
\]
we have the formula
\[
\widehat{u}(\xi', \xi_n) = \langle u, e^{-i x \cdot \xi} \rangle
= \langle u, e^{-i x' \cdot \xi'}\, e^{-i x_n \xi_n} \rangle
= \langle u, e^{-i x' \cdot \xi'}\, e^{-i x_n \operatorname{Re} \xi_n}\, e^{x_n \operatorname{Im} \xi_n} \rangle.
\]


However, since $u = 0$ in $\mathbb{R}^n_+$, it follows that
\[
e^{x_n \operatorname{Im} \xi_n} \leq 1 \quad \text{for all } x \in \operatorname{supp} u \text{ and } \operatorname{Im} \xi_n \geq 0.
\]
Therefore, just as in the proof of the Paley–Wiener–Schwartz theorem (see [35, Chapitre IV, Théorème 2.9]) we obtain from inequality (9.36) that
\[
|\widehat{u}(\xi)| \leq C\, (1 + |\xi|)^\ell \quad \text{for all } \xi' \in \mathbb{R}^{n-1} \text{ and } \operatorname{Im} \xi_n \geq 0.
\]
This proves that the function $\widehat{u}(\xi', \xi_n)$ is slowly increasing with respect to $\xi = (\xi', \xi_n)$, for $\xi' \in \mathbb{R}^{n-1}$ and $\operatorname{Im} \xi_n \geq 0$. The proof of Claim 9.45 is complete. □

Step (4): Now we make a contour deformation in the integral
\[
\langle Bu, \Phi \rangle
= \frac{1}{(2\pi)^n} \int_{|\xi| \geq 1} \widehat{u}(\xi)\, F(\xi)\, d\xi
= \frac{1}{(2\pi)^n} \int_{|\xi| \geq 1} \widehat{u}(\xi', \xi_n)\, F(\xi', \xi_n)\, d\xi'\, d\xi_n.
\]
We consider the two cases for $|\xi| \geq 1$:
(A) $|\xi'| \geq 1$ and $-\infty < \xi_n < \infty$.
(B) $|\xi'| < 1$ and either $\xi_n \geq \sqrt{1 - |\xi'|^2}$ or $\xi_n \leq -\sqrt{1 - |\xi'|^2}$.

Case (A): In this case, we take a positive contour $\Gamma_{\xi'}$ in the upper half-plane $\{\operatorname{Im} \xi_n > 0\}$ enclosing the poles of $a(x, \xi', \xi_n)$ in $\xi_n$ (see Fig. 9.14). Then we have, by Cauchy's theorem,
\[
\int_{-R}^{R} \widehat{u}(\xi', \xi_n)\, F(\xi', \xi_n)\, d\xi_n
+ \int_{C_R} \widehat{u}(\xi', \xi_n)\, F(\xi', \xi_n)\, d\xi_n
= \int_{\Gamma_{\xi'}} \widehat{u}(\xi', \xi_n)\, F(\xi', \xi_n)\, d\xi_n.
\]

However, by Claims 9.44 and 9.45 it follows that
\[
\lim_{R \to \infty} \int_{C_R} \widehat{u}(\xi', \xi_n)\, F(\xi', \xi_n)\, d\xi_n = 0.
\]
Hence, by passing to the limit we obtain that
\[
\int_{-\infty}^{\infty} \widehat{u}(\xi', \xi_n)\, F(\xi', \xi_n)\, d\xi_n
= \int_{\Gamma_{\xi'}} \widehat{u}(\xi', \xi_n)\, F(\xi', \xi_n)\, d\xi_n
\quad \text{for all } |\xi'| \geq 1.
\]

Fig. 9.14 A positive contour $\Gamma_{\xi'}$ in Case (A) in the upper half-plane

Fig. 9.15 A positive contour $\Gamma_{\xi'}$ in Case (B) in the upper half-plane

Case (B): In this case, we take a positive contour $\Gamma_{\xi'}$ in the upper half-plane $\{\operatorname{Im} \xi_n > 0\}$ enclosing the poles of $a(x, \xi', \xi_n)$ in $\xi_n$, completed by the segment $\bigl[-\sqrt{1 - |\xi'|^2},\, \sqrt{1 - |\xi'|^2}\bigr]$ (see Fig. 9.15). Similarly, we have, by Cauchy's theorem,
\[
\int_{\sqrt{1 - |\xi'|^2}}^{R} \widehat{u}(\xi', \xi_n)\, F(\xi', \xi_n)\, d\xi_n
+ \int_{-R}^{-\sqrt{1 - |\xi'|^2}} \widehat{u}(\xi', \xi_n)\, F(\xi', \xi_n)\, d\xi_n
+ \int_{C_R} \widehat{u}(\xi', \xi_n)\, F(\xi', \xi_n)\, d\xi_n
= \int_{\Gamma_{\xi'}} \widehat{u}(\xi', \xi_n)\, F(\xi', \xi_n)\, d\xi_n.
\]
Hence, by passing to the limit we obtain that
\[
\int_{-\infty}^{-\sqrt{1 - |\xi'|^2}} \widehat{u}(\xi', \xi_n)\, F(\xi', \xi_n)\, d\xi_n
+ \int_{\sqrt{1 - |\xi'|^2}}^{\infty} \widehat{u}(\xi', \xi_n)\, F(\xi', \xi_n)\, d\xi_n
= \int_{\Gamma_{\xi'}} \widehat{u}(\xi', \xi_n)\, F(\xi', \xi_n)\, d\xi_n
\quad \text{for all } |\xi'| < 1.
\]
Summing up, we have proved the formula


\[
\frac{1}{(2\pi)^n} \int_{|\xi| \geq 1} \widehat{u}(\xi', \xi_n)\, F(\xi', \xi_n)\, d\xi'\, d\xi_n
= \frac{1}{(2\pi)^n} \int_{|\xi'| \geq 1} \left( \int_{-\infty}^{\infty} \widehat{u}(\xi', \xi_n)\, F(\xi', \xi_n)\, d\xi_n \right) d\xi'
+ \frac{1}{(2\pi)^n} \int_{|\xi'| < 1} \left( \int_{-\infty}^{-\sqrt{1 - |\xi'|^2}} \widehat{u}\, F\, d\xi_n + \int_{\sqrt{1 - |\xi'|^2}}^{\infty} \widehat{u}\, F\, d\xi_n \right) d\xi'
= \frac{1}{(2\pi)^n} \int_{\mathbb{R}^{n-1}} \left( \int_{\Gamma_{\xi'}} \widehat{u}(\xi', \xi_n)\, F(\xi', \xi_n)\, d\xi_n \right) d\xi'.
\tag{9.37}
\]

Step (5): We take a non-negative function $\rho \in C_0^\infty(\mathbb{R}^n)$ with $\operatorname{supp} \rho \subset \{x_n \leq 0\}$ and $\int_{\mathbb{R}^n} \rho(x)\, dx = 1$, and we set $\rho_k(x) := k^n \rho(kx)$ for every integer $k > 0$, so that $\widehat{\rho_k}(\xi) = \widehat{\rho}(\xi/k)$. Since $u \in \mathcal{E}'(\mathbb{R}^n)$ satisfies the condition
\[
\operatorname{supp} u \subset \bigl\{ x = (x', x_n) \in \mathbb{R}^n : x_n \leq 0 \bigr\},
\]
it follows that
\[
\begin{cases}
\rho_k * u \in C_0^\infty(\mathbb{R}^n), \\
\operatorname{supp}(\rho_k * u) \subset \bigl\{ x = (x', x_n) \in \mathbb{R}^n : x_n \leq 0 \bigr\}, \\
\rho_k * u \longrightarrow u \text{ in } \mathcal{E}'(\mathbb{R}^n) \text{ as } k \to \infty.
\end{cases}
\]
Hence we have the assertion
\[
B(\rho_k * u) \longrightarrow Bu \text{ in } \mathcal{D}'(\mathbb{R}^n) \text{ as } k \to \infty.
\]

Step (6): We now prove the following:

Claim 9.46 For every $\varphi \in C_0^\infty(\mathbb{R}^{n-1})$ and every $\psi \in C_0^\infty(\mathbb{R})$ with $\operatorname{supp} \psi \subset [0, \infty)$, we have the formula
\[
\langle Bu, \varphi(x') \otimes \psi(x_n) \rangle
= \frac{1}{(2\pi)^{n-1}} \int_0^\infty \psi(x_n) \left( \int_{\mathbb{R}^{n-1}} G(x_n, \xi')\, d\xi' \right) dx_n,
\]
where the function $G(x_n, \xi')$ is defined at the end of the proof below.

Proof Since $B(\rho_k * u) \to Bu$ in $\mathcal{D}'(\mathbb{R}^n)$, we have
\[
\langle Bu, \varphi(x') \otimes \psi(x_n) \rangle
= \lim_{k \to \infty} \langle B(\rho_k * u), \varphi(x') \otimes \psi(x_n) \rangle
= \lim_{k \to \infty} \frac{1}{(2\pi)^n} \int_{\mathbb{R}^{n-1}} \int_{\Gamma_{\xi'}} \widehat{\rho_k}(\xi)\, \widehat{u}(\xi', \xi_n)
\left( \int_{\mathbb{R}^n} e^{i x \cdot \xi}\, a(x', x_n, \xi', \xi_n)\, \varphi(x')\, \psi(x_n)\, dx'\, dx_n \right) d\xi_n\, d\xi'.
\]

Here we remark the following facts:
(a) $\widehat{\rho_k}(\xi) = \widehat{\rho}(\xi/k) \in \mathcal{S}(\mathbb{R}^n)$ for every integer $k > 0$.
(b) $|\xi_n| \leq c_0\,(1 + |\xi'|)$ for all $\xi_n \in \Gamma_{\xi'}$ and $|\xi'| \geq 1$.
(c) $\Gamma_{\xi'} = |\xi'|\, \Gamma_1$ for all $|\xi'| \geq 1$.

Hence, by using Fubini's theorem (Theorem 2.18) we obtain that
\[
\frac{1}{(2\pi)^n} \int_{\mathbb{R}^{n-1}} \int_{\Gamma_{\xi'}} \widehat{\rho}\Bigl(\frac{(\xi', \xi_n)}{k}\Bigr)\, \widehat{u}(\xi', \xi_n)
\left( \int_{\mathbb{R}^n} e^{i x \cdot \xi}\, a(x', x_n, \xi', \xi_n)\, \varphi(x')\, \psi(x_n)\, dx'\, dx_n \right) d\xi_n\, d\xi'
= \frac{1}{(2\pi)^n} \int_0^\infty \psi(x_n) \int_{\mathbb{R}^{n-1}} \int_{\Gamma_{\xi'}} e^{i x_n \xi_n}\, \widehat{\rho}\Bigl(\frac{(\xi', \xi_n)}{k}\Bigr)\, \widehat{u}(\xi', \xi_n)
\left( \int_{\mathbb{R}^{n-1}} e^{i x' \cdot \xi'}\, a(x', x_n, \xi', \xi_n)\, \varphi(x')\, dx' \right) d\xi_n\, d\xi'\, dx_n.
\]

However, we have the following assertion:
(d) The function
\[
\int_{\mathbb{R}^{n-1}} e^{i x' \cdot \xi'}\, a(x', x_n, \xi', \xi_n)\, \varphi(x')\, dx'
\]
is rapidly decreasing with respect to the variable $\xi'$ for $\varphi(x') \in C_0^\infty(\mathbb{R}^{n-1})$. Indeed, by integration by parts it suffices to note that
\[
\xi'^{\alpha'} \int_{\mathbb{R}^{n-1}} e^{i x' \cdot \xi'}\, a(x', x_n, \xi', \xi_n)\, \varphi(x')\, dx'
= \int_{\mathbb{R}^{n-1}} D_{x'}^{\alpha'}\bigl(e^{i x' \cdot \xi'}\bigr)\, a(x', x_n, \xi', \xi_n)\, \varphi(x')\, dx'
= (-1)^{|\alpha'|} \int_{\mathbb{R}^{n-1}} e^{i x' \cdot \xi'}\, D_{x'}^{\alpha'}\bigl(a(x', x_n, \xi', \xi_n)\, \varphi(x')\bigr)\, dx'.
\]
Hence, we find that the function
\[
G_k(x_n, \xi') = \frac{1}{2\pi} \int_{\Gamma_{\xi'}} e^{i x_n \xi_n}\, \widehat{\rho}\Bigl(\frac{(\xi', \xi_n)}{k}\Bigr)\, \widehat{u}(\xi', \xi_n)
\left( \int_{\mathbb{R}^{n-1}} e^{i x' \cdot \xi'}\, a(x', x_n, \xi', \xi_n)\, \varphi(x')\, dx' \right) d\xi_n
\]
is rapidly decreasing with respect to the variable $\xi'$. Moreover, we have the inequality
\[
\Bigl| \widehat{\rho}\Bigl(\frac{(\xi', \xi_n)}{k}\Bigr) \Bigr| \leq \widehat{\rho}(0) = \int_{\mathbb{R}^n} \rho(x)\, dx = 1 \quad \text{for every integer } k > 0.
\]

Lebesgue’s

dominated

convergence

theorem

Bu, ϕ(x ) ⊗ ψ(xn )   1 ρk (ξ)  u (ξ , ξn ) = lim k→∞ (2π)n Rn−1 Γξ    i x·ξ × e a(x , xn , ξ , ξn ) ϕ(x ) ψ(xn ) d x d xn dξn dξ n R      ∞ 1 (ξ , ξn ) i xn ξn  u (ξ , ξn ) = lim ψ(x ) e  ρ n k→∞ (2π)n 0 n−1 k R Γξ     i x ·ξ × e a(x , xn , ξ , ξn )ϕ(x ) d x dξn dξ d xn Rn−1    ∞ 1 = ψ(xn ) ei xn ξn  u (ξ , ξn ) (2π)n 0 Rn−1 Γξ     × ei x ·ξ a(x , xn , ξ , ξn )ϕ(x ) d x dξn dξ d xn Rn−1    ∞ 1 d xn , = ψ(x ) G(x , ξ ) dξ n n (2π)n−1 0 Rn−1 where G(xn , ξ )    1 ei xn ξn  u (ξ , ξn ) ei x ·ξ a(x , xn , ξ , ξn ) ϕ(x ) d x dξn . = 2π Γξ Rn−1 

The proof of Claim 9.46 is complete. Step (7): By virtue of Claim 9.46, we find that the distribution [0, ∞)  xn −→ Bu(·, xn )|R+n   is equal to a function in the space C ∞ [0, ∞); D (Rn−1 ) , given by the formula C0∞ (Rn−1 )

1  ϕ −→ (2π)n−1  ×





Γξ

Rn−1

Rn−1



ei xn ξn  u (ξ , ξn )



ei x ·ξ a(x , xn , ξ , ξn ) ϕ(x ) d x

Now the proof of Theorem 9.42 is complete.



 dξn dξ . 


Remark 9.47 It is easy to see that the function $G(x_n, \xi')$ is rapidly decreasing with respect to the variable $\xi'$.

9.8.3.1 Surface Potentials

First, by using Theorem 9.42 we can obtain, just as in Chazarain–Piriou [35, Chapitre V, Théorème 2.4], a theorem that covers surface potentials:

Theorem 9.48 Let $A \in L^m_{\mathrm{cl}}(M)$ be properly supported and as in Theorem 9.42. Then we have the following three assertions:
(i) The operator
\[
H : v \longmapsto A(v \otimes \delta)\big|_\Omega
\]
is continuous on $C^\infty(\partial\Omega)$ into $C^\infty(\Omega)$. If $v \in \mathcal{D}'(\partial\Omega)$, then the distribution $Hv \in \mathcal{D}'(\Omega)$ has sectional traces on $\partial\Omega$ of any order.
(ii) The operator
\[
S : C^\infty(\partial\Omega) \longrightarrow C^\infty(\partial\Omega), \qquad v \longmapsto Hv\big|_{\partial\Omega},
\]
belongs to the class $L^{m+1}_{\mathrm{cl}}(\partial\Omega)$. Furthermore, its homogeneous principal symbol is given by the formula
\[
(x', \xi') \longmapsto \frac{1}{2\pi} \int_{\Gamma_{\xi'}} a_0(x', 0, \xi', \xi_n)\, d\xi_n,
\]
where $a_0(x', x_n, \xi', \xi_n) \in C^\infty(T^*(M) \setminus \{0\})$ is the homogeneous principal symbol of $A$, and $\Gamma_{\xi'}$ is a circle in the upper half-plane $\{\xi_n \in \mathbb{C} : \operatorname{Im} \xi_n > 0\}$ which encloses the poles $\xi_n$ of $a_0(x', 0, \xi', \xi_n)$ there (see Fig. 9.16 below).
(iii) The operator $H$ extends to a continuous linear operator
\[
H : H^s(\partial\Omega) \longrightarrow H^{s - m - 1/2}(\Omega) \quad \text{for all } s \in \mathbb{R}
\]
(see Fig. 9.17 below).
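As an illustration (added here; the choice of symbol is ours, not the text's), take for $A$ a parametrix of $-\Delta$ in $\mathbb{R}^n$, whose homogeneous principal symbol $a_0(\xi', \xi_n) = 1/(|\xi'|^2 + \xi_n^2)$ is rational with a single pole $\xi_n = i|\xi'|$ in the upper half-plane. The contour integral in assertion (ii) can then be evaluated by residues, for example with SymPy:

```python
from sympy import I, pi, residue, simplify, symbols

xi = symbols("xi", positive=True)   # stands for |xi'|
zn = symbols("zeta_n")              # the complexified variable xi_n

a0 = 1 / (xi**2 + zn**2)            # principal symbol of a parametrix of -Delta (order m = -2)

# (1/(2*pi)) * (contour integral over Gamma_{xi'}) = i * (residue at zn = i*|xi'|)
s = (1 / (2 * pi)) * 2 * pi * I * residue(a0, zn, I * xi)
assert simplify(s - 1 / (2 * xi)) == 0
```

The result $1/(2|\xi'|)$ is positively homogeneous of degree $-1 = m + 1$ with $m = -2$, in agreement with $S \in L^{m+1}_{\mathrm{cl}}(\partial\Omega)$.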

9.8.3.2 Volume Potentials

The next theorem covers volume potentials (see [35, Chapitre V, Théorème 2.5]):

Theorem 9.49 Let $A \in L^m_{\mathrm{cl}}(M)$ be as in Theorem 9.42. Then we have the following two assertions:

Fig. 9.16 A circle $\Gamma_{\xi'}$ in the upper half-plane

Fig. 9.17 The mapping properties of a surface potential $H$: the operator $H$ maps $\mathcal{D}'(\partial\Omega)$ into $\mathcal{D}'(\Omega)$, $H^s(\partial\Omega)$ into $H^{s-m-1/2}(\Omega)$, and $C^\infty(\partial\Omega)$ into $C^\infty(\Omega)$

Fig. 9.18 The mapping properties of a volume potential $G$: the operator $G$ maps $H^s(\Omega)$ into $H^{s-m}(\Omega)$, and $C^\infty(\overline{\Omega})$ into $C^\infty(\overline{\Omega})$

(i) The operator
\[
G : f \longmapsto A(f^0)\big|_\Omega
\]
is continuous on $C^\infty(\overline{\Omega})$ into itself.
(ii) The operator $G$ extends to a continuous linear operator
\[
G : H^s(\Omega) \longrightarrow H^{s - m}(\Omega) \quad \text{for all } s > -1/2
\]
(see Fig. 9.18).

9.9 The Sharp Gårding Inequality

Let $\Omega$ be an open subset of $\mathbb{R}^n$ and let $A$ be a properly supported, pseudo-differential operator of order $m$ on $\Omega$. In this section we are concerned with inequalities from below for $A$ of the form
\[
\operatorname{Re}\, (Au, u)_{L^2(\Omega)} \geq -C_K \|u\|_s^2 \quad \text{for all } u \in C_K^\infty(\Omega),
\tag{9.39}
\]

where $K$ is a compact subset of $\Omega$ and $(\cdot, \cdot)_{L^2(\Omega)}$ is the inner product of the Hilbert space $L^2(\Omega) = H^0(\Omega)$ and
\[
C_K^\infty(\Omega) = \bigl\{ u \in C^\infty(\Omega) : \operatorname{supp} u \subset K \bigr\}.
\]
We remark that inequality (9.39) is always true for $s \geq m/2$, since we have the inequality
\[
|(Au, u)_{L^2(\Omega)}| \leq C_K \|u\|_{m/2}^2 \quad \text{for all } u \in C_K^\infty(\Omega)
\]
with a constant $C_K > 0$. Indeed, it suffices to note that
\[
-\operatorname{Re}\, (Au, u)_{L^2(\Omega)}
\leq \bigl| \operatorname{Re}\, (Au, u)_{L^2(\Omega)} \bigr|
\leq \bigl| (Au, u)_{L^2(\Omega)} \bigr|
\leq C_K \|u\|_{m/2}^2 \leq C_K \|u\|_s^2 \quad \text{for all } u \in C_K^\infty(\Omega).
\]
In what follows we give sufficient conditions on $A$ for inequality (9.39) to hold true for $s < m/2$. These results will play an important role in deriving a priori estimates for (non-)elliptic boundary value problems in Chap. 11.

(I) The next result, first proved by Gårding [72] for differential operators, is a milestone in the theory of elliptic boundary value problems:

Theorem 9.50 (Gårding) Let $A$ be a properly supported, pseudo-differential operator of order $m$ on $\Omega$ with principal symbol $a_m(x, \xi)$. Assume that there exists a constant $a_0 > 0$ such that
\[
\operatorname{Re} a_m(x, \xi) \geq a_0 |\xi|^m \quad \text{for all } (x, \xi) \in T^*(\Omega) = \Omega \times \mathbb{R}^n.
\]
Then, for every compact $K \subset \Omega$ and $s < m/2$, there exist constants $c_{K,s} > 0$ and $C_{K,s} > 0$ such that
\[
\operatorname{Re}\, (Au, u)_{L^2(\Omega)} \geq c_{K,s} \|u\|_{m/2}^2 - C_{K,s} \|u\|_s^2 \quad \text{for all } u \in C_K^\infty(\Omega).
\tag{9.40}
\]
The inequality (9.40) is called Gårding's inequality.

(II) A sharpened form of Gårding's inequality is given by Hörmander [85]:

Theorem 9.51 (the sharp Gårding inequality) Let $A \in L^m(\Omega)$ be as in Theorem 9.50. Assume that
\[
\operatorname{Re} a_m(x, \xi) \geq 0 \quad \text{for all } (x, \xi) \in T^*(\Omega) = \Omega \times \mathbb{R}^n.
\]
Then, for every compact $K \subset \Omega$ and $s < (m-1)/2$, there exist constants $c_{K,s} > 0$ and $C_{K,s} > 0$ such that
\[
\operatorname{Re}\, (Au, u)_{L^2(\Omega)} \geq -c_{K,s} \|u\|_{(m-1)/2}^2 - C_{K,s} \|u\|_s^2 \quad \text{for all } u \in C_K^\infty(\Omega).
\tag{9.41}
\]
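To see what Gårding's inequality (9.40) asserts in the simplest possible case (our illustration, not the text's): for $A = -d^2/dx^2$ on the one-dimensional torus, $m = 2$ and $a_2(\xi) = \xi^2$, and on the Fourier side $\operatorname{Re}(Au, u) = \sum_k k^2 |\widehat{u}_k|^2 = \|u\|_1^2 - \|u\|_0^2$, so the inequality holds with $c = C = 1$ and $s = 0 < m/2$. A discrete check:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 256
u = rng.standard_normal(n)

uh = np.fft.fft(u) / np.sqrt(n)           # normalized so that sum |uh|^2 = sum |u|^2
k = np.fft.fftfreq(n, d=1.0 / n)          # integer frequencies -n/2 .. n/2 - 1

Au_u = np.sum(k**2 * np.abs(uh)**2)       # Re(Au, u) for A = -d^2/dx^2 on the torus
H1 = np.sum((1 + k**2) * np.abs(uh)**2)   # ||u||_1^2
H0 = np.sum(np.abs(uh)**2)                # ||u||_0^2

# Gårding: Re(Au,u) >= c ||u||_{m/2}^2 - C ||u||_s^2 with c = C = 1, s = 0
assert Au_u >= H1 - H0 - 1e-9
```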

9 L 2 Theory of Pseudo-differential Operators

442

It should be emphasized that Fefferman and Phong have proved the following precise inequality for $m = 2$ (cf. [57, Theorem], [91, Corollary 18.6.11]):

Theorem 9.52 (Fefferman–Phong) Let $A = a(x, D)$ be a pseudo-differential operator with complete symbol $a(x, \xi) \in S^2_{1,0}(\mathbb{R}^n \times \mathbb{R}^n)$ such that
\[
a(x, \xi) \geq 0 \quad \text{on } T^*(\mathbb{R}^n) = \mathbb{R}^n \times \mathbb{R}^n.
\]
Then there exists a constant $C > 0$ such that we have, for all $u \in \mathcal{S}(\mathbb{R}^n)$,
\[
\operatorname{Re}\, (Au, u)_{L^2} \geq -C \|u\|_0^2.
\]
Here the constant $C$ may be chosen uniformly for $a(x, \xi)$ in a bounded subset of the symbol class $S^2_{1,0}(\mathbb{R}^n \times \mathbb{R}^n)$.

(III) Melin [126] goes further, giving a necessary and sufficient condition on $A$ for the following inequality to hold true for every $\varepsilon > 0$:
\[
\operatorname{Re}\, (Au, u)_{L^2(\Omega)} \geq -\varepsilon \|u\|_{(m-1)/2}^2 - C_{\varepsilon,s} \|u\|_s^2 \quad \text{for all } u \in C_K^\infty(\Omega).
\]
We formulate precisely Melin's result. Let $M$ be an $n$-dimensional, compact $C^\infty$ manifold without boundary and let $A$ be a classical pseudo-differential operator of order $m$ on $M$ such that the complete symbol $\sigma(A)(x, \xi)$ has an asymptotic expansion:
\[
\sigma(A)(x, \xi) \sim a_m(x, \xi) + a_{m-1}(x, \xi) + \ldots,
\tag{9.42}
\]
where $a_m(x, \xi) = \sigma_A(x, \xi)$ and $a_{m-1}(x, \xi)$ is positively homogeneous of degree $m - 1$ in the variable $\xi$. For simplicity, we assume that
\[
a_m(x, \xi) \geq 0 \quad \text{on the cotangent bundle } T^*(M).
\tag{9.43}
\]
We let
\[
\Sigma = \bigl\{ (x, \xi) \in T^*(M) : a_m(x, \xi) = 0 \bigr\}.
\]
The set $\Sigma$ is called the characteristic set of $A$. Let $u$ be an arbitrary tangent vector of $T^*(M)$ at a point $\rho$ of $\Sigma$. Then we choose a $C^\infty$ vector field $v$ on $T^*(M)$ equal to $u$ at $\rho$, and define a quadratic form $Q_\rho(u, u)$ on the product space $T_\rho(T^*(M)) \times T_\rho(T^*(M))$ by the formula
\[
Q_\rho(u, u) = \bigl(v^2 a_m\bigr)(\rho).
\]
In view of condition (9.43), it is easy to verify that $Q_\rho(u, u)$ is independent of the vector field $v$ chosen. The form $Q_\rho$ is called the Hessian of $a_m$ at $\rho$.

Let
\[
\widetilde{T}_\rho(T^*(M)) = T_\rho(T^*(M)) \otimes \mathbb{C}
\]
be the complexification of the tangent space $T_\rho(T^*(M))$. We consider the symplectic form
\[
\sigma = \sum_{j=1}^n d\xi_j \wedge dx_j
\]
and the quadratic form $Q_\rho$ as bilinear forms on the product space $\widetilde{T}_\rho(T^*(M)) \times \widetilde{T}_\rho(T^*(M))$. Since the form $\sigma$ is non-degenerate, we can define a linear map
\[
F_\rho : \widetilde{T}_\rho(T^*(M)) \longrightarrow \widetilde{T}_\rho(T^*(M))
\]
by the formula
\[
\sigma(u, F_\rho v) = Q_\rho(u, v) \quad \text{for } u, v \in \widetilde{T}_\rho(T^*(M)).
\]
The map $F_\rho$ is called the Hamilton map of $Q_\rho$. It is easy to see that the eigenvalues of $F_\rho$ are situated on the imaginary axis, symmetrically around the origin. For each $\rho = (x, \xi) \in \Sigma$, we let
\[
\operatorname{Tr}^+(x, \xi) = \text{the sum of the positive eigenvalues of the map } -\sqrt{-1}\, F_\rho,
\]
where each eigenvalue is counted with its multiplicity. Furthermore, we let
\[
a'_{m-1}(x, \xi) = a_{m-1}(x, \xi) + \frac{\sqrt{-1}}{2} \sum_{j=1}^n \frac{\partial^2 a_m}{\partial x_j\, \partial \xi_j}(x, \xi).
\]
The function $a'_{m-1}(x, \xi)$ is invariantly defined at the points of the characteristic set $\Sigma$ of $A$, and is called the subprincipal symbol of $A$. Then we have the following (cf. [91, Theorem 22.3.3]):

Theorem 9.53 (Melin’s inequality) Let A be in L m cl (M) with asymptotic expansion (9.42). Assume that am (x, ξ) ≥ 0 on the cotangent bundle T ∗ (M). Then the inequality

9 L 2 Theory of Pseudo-differential Operators

444

Re (Aϕ, ϕ) L 2 (M) ≥ −ε ϕ2(m−1)/2 − Cε,s ϕ2s , ϕ ∈ C ∞ (M), holds true for every ε > 0 and s < (m − 1)/2 if and only if the following condition is satisfied: 1 r Ham (x, ξ) ≥ 0 (x, ξ) + T Re am−1 2   on 1 = (x, ξ) ∈ T ∗ (M) : am (x, ξ) = 0, |ξ| = 1 . Here (·, ·) L 2 (M) is the inner product of the Hilbert space L 2 (M) = H 0 (M). (IV) Fefferman–Phong [57] have proved some result of this nature for differential operators, which we will describe. Let M be an n-dimensional compact C ∞ manifold without boundary and let A be a second order differential operator with real coefficients on M such that, in terms of local coordinates, N 

 ∂2 ∂ a (x) + bi (x) + c(x). A= ∂x ∂x ∂x i j i i, j=1 i=1 N

ij

(9.44)

Here:
(1) The $a^{ij}(x)$ are the components of a $C^\infty$ symmetric contravariant tensor of type $\binom{2}{0}$ on $M$ and
\[
\sum_{i,j=1}^n a^{ij}(x)\, \xi_i \xi_j \geq 0 \quad \text{for all } x \in M \text{ and } \xi = \sum_{j=1}^n \xi_j\, dx_j \in T_x^*(M),
\]
where $T_x^*(M)$ is the cotangent space of $M$ at $x$. Namely, the principal symbol $\sum_{i,j=1}^n a^{ij}(x)\, \xi_i \xi_j$ of $A$ is non-negative on the cotangent bundle $T^*(M) = \bigcup_{x \in M} T_x^*(M)$.
(2) $b^i(x) \in C^\infty(M)$ for $1 \leq i \leq n$.
(3) $c(x) \in C^\infty(M)$.

A tangent vector
\[
v = \sum_{j=1}^n v_j\, \frac{\partial}{\partial x_j} \in T_x(M)
\]
is said to be subunit for the operator


\[
A_0 = \sum_{i,j=1}^n a^{ij}(x)\, \frac{\partial^2}{\partial x_i\, \partial x_j}
\]
if it satisfies the condition
\[
\Bigl( \sum_{j=1}^n v_j\, \xi_j \Bigr)^2 \leq \sum_{i,j=1}^n a^{ij}(x)\, \xi_i \xi_j
\quad \text{for all } \xi = \sum_{j=1}^n \xi_j\, dx_j \in T_x^*(M).
\]
If $\rho > 0$, we define a "non-Euclidean" ball $B_{A_0}(x, \rho)$ of radius $\rho$ about $x$ as follows: $B_{A_0}(x, \rho)$ is the set of all points $y \in M$ which can be joined to $x$ by a Lipschitz path $\gamma : [0, \rho] \to M$ for which the tangent vector $\dot{\gamma}(t)$ of $M$ at $\gamma(t)$ is subunit for $A_0$ for almost every $t$. Also we let $B_E(x, \rho)$ be the ordinary Euclidean ball of radius $\rho$ about $x$.

The next result is due to Fefferman–Phong [57]:

Theorem 9.54 (Fefferman–Phong) Let the differential operator $A$ be of the form (9.44). Then the following two conditions are equivalent:
(i) There exist constants $0 < \varepsilon \leq 1$ and $C > 0$ such that we have, for $\rho > 0$ sufficiently small,
\[
B_E(x, \rho) \subset B_{A_0}(x, C \rho^\varepsilon) \quad \text{for every } x \in M.
\tag{9.45}
\]
(ii) There exist constants $c_0 > 0$ and $C_0 > 0$ such that
\[
-\operatorname{Re}\, (A\varphi, \varphi)_{L^2(M)} \geq c_0 \|\varphi\|_\varepsilon^2 - C_0 \|\varphi\|_0^2 \quad \text{for all } \varphi \in C^\infty(M).
\tag{9.46}
\]
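A standard illustration (our example; the operator is not discussed at this point of the text) is the Grushin-type operator $A_0 = \partial^2/\partial x^2 + x^2\, \partial^2/\partial y^2$ on $\mathbb{R}^2$, for which a vector $(v_1, v_2)$ is subunit at $(x, y)$, $x \neq 0$, exactly when $v_1^2 + (v_2/x)^2 \leq 1$. A subunit path can thus move with speed at most $1$ in $x$ but only with speed $|x|$ in $y$; reaching $(0, \delta)$ from the origin takes time of order $\sqrt{\delta}$, so condition (9.45) holds with $\varepsilon = 1/2$:

```python
import numpy as np

rng = np.random.default_rng(3)

# subunit condition for A0 = d^2/dx^2 + x^2 d^2/dy^2:
#   (v1*xi1 + v2*xi2)^2 <= xi1^2 + x^2*xi2^2  for all xi  (Cauchy-Schwarz)
x = 0.3
v = np.array([0.6, x * 0.8])                      # v1^2 + (v2/x)^2 = 1, so v is subunit
for xi in rng.standard_normal((1000, 2)):
    assert (v @ xi) ** 2 <= xi[0] ** 2 + x**2 * xi[1] ** 2 + 1e-12

# travel time from (0,0) to (0, delta): go out to x = r (time r), slide in y
# at speed r (time delta/r), come back (time r); optimal at r = sqrt(delta)
for delta in (1e-2, 1e-4, 1e-6):
    r = np.sqrt(delta)
    T = 2 * r + delta / r
    assert abs(T / np.sqrt(delta) - 3.0) < 1e-9   # T is comparable to sqrt(delta)
```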

9.10 Hypoelliptic Pseudo-differential Operators

Let $\Omega$ be an open subset of $\mathbb{R}^n$. A properly supported pseudo-differential operator $A$ on $\Omega$ is said to be hypoelliptic if it satisfies the condition
\[
\operatorname{sing\,supp} u = \operatorname{sing\,supp} Au \quad \text{for every } u \in \mathcal{D}'(\Omega).
\tag{9.47}
\]
For example, Theorem 9.22 tells us that elliptic operators are hypoelliptic. It is easy to see that condition (9.47) is equivalent to the following condition: for any open subset $\Omega_1$ of $\Omega$, we have the assertion
\[
u \in \mathcal{D}'(\Omega),\ Au \in C^\infty(\Omega_1) \implies u \in C^\infty(\Omega_1).
\tag{9.48}
\]

We say that $A$ is globally hypoelliptic if it satisfies the weaker condition
\[
u \in \mathcal{D}'(\Omega),\ Au \in C^\infty(\Omega) \implies u \in C^\infty(\Omega).
\]
It should be noticed that these two notions may be transferred to manifolds.

The following criterion for hypoellipticity is due to Hörmander:

Theorem 9.55 (Hörmander) Let $1 - \rho \leq \delta < \rho \leq 1$, and let $A = p(x, D) \in L^m_{\rho,\delta}(\Omega)$ be properly supported. Assume that, for any compact $K \subset \Omega$ and any multi-indices $\alpha$, $\beta$, there exist constants $C_{K,\alpha,\beta} > 0$, $C_K > 0$ and $\mu \in \mathbb{R}$ such that we have, for all $x \in K$ and $|\xi| \geq C_K$,
\[
\bigl| D_\xi^\alpha D_x^\beta p(x, \xi) \bigr| \leq C_{K,\alpha,\beta}\, |p(x, \xi)|\, (1 + |\xi|)^{-\rho|\alpha| + \delta|\beta|},
\qquad
|p(x, \xi)|^{-1} \leq C_K\, (1 + |\xi|)^\mu.
\]
Then there exists a parametrix $B \in L^\mu_{\rho,\delta}(\Omega)$ for $A$.

In this section we describe two classes of hypoelliptic pseudo-differential operators of Hörmander [89] and Melin–Sjöstrand [127], which arise in the study of elliptic boundary value problems.

(I) First, we formulate precisely Hörmander's result. Let $A$ be a properly supported, classical pseudo-differential operator of order $m$ on $\Omega \subset \mathbb{R}^n$ such that the complete symbol $\sigma(A)(x, \xi)$ has an asymptotic expansion
\[
\sigma(A)(x, \xi) \sim a_m(x, \xi) + a_{m-1}(x, \xi) + \ldots
\]
as in formula (9.42). The following criterion for hypoellipticity is due to Hörmander [89, Theorem 5.9]:

Theorem 9.56 (Hörmander) Let $A \in L^m_{\mathrm{cl}}(\Omega)$ be properly supported. Assume that
\[
a_m(x, \xi) \geq 0 \quad \text{on } T^*(\Omega)
\]
and further that the range of $-a'_{m-1}(x, \xi)$ on $\Sigma$ belongs to a closed angle which intersects the positive real axis only at the origin. Then the following two conditions are equivalent:
(i) For every compact $K \subset \Omega$, $s \in \mathbb{R}$ and $t < s + m - 1$, there exists a constant $C_{K,s,t} > 0$ such that
\[
\|u\|_{s+m-1} \leq C_{K,s,t} \bigl( \|Au\|_s + \|u\|_t \bigr) \quad \text{for all } u \in C_K^\infty(\Omega).
\]
(ii) At every point $\rho$ of $\Sigma$, either the subprincipal symbol $a'_{m-1}(\rho)$ is non-zero or else the Hamilton map $F_\rho$ of the Hessian $Q_\rho$ of $a_m$ is not nilpotent.


Furthermore, each of conditions (i) and (ii) implies that
\[
u \in \mathcal{D}'(\Omega),\ Au \in H^s_{\mathrm{loc}}(\Omega) \implies u \in H^{s+m-1}_{\mathrm{loc}}(\Omega).
\tag{9.49}
\]
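For orientation (an illustration we add, with $\rho = 1$, $\delta = 0$ and $\mu = 0$): an elliptic symbol such as $p(\xi) = 1 + |\xi|^2$ satisfies the hypotheses of Theorem 9.55, since each $\xi$-derivative improves the decay by one order while $|p|^{-1}$ is bounded. A crude numerical check of the first-order estimate:

```python
import numpy as np

rng = np.random.default_rng(2)
xi = rng.standard_normal((100_000, 2)) * 50.0   # random frequencies in R^2
r = np.linalg.norm(xi, axis=1)

p = 1.0 + r**2                                  # p(xi) = 1 + |xi|^2, elliptic of order 2
dp = 2.0 * np.abs(xi[:, 0])                     # |d p / d xi_1|

# |D_xi p| <= C |p| (1 + |xi|)^{-1}  (rho = 1, delta = 0); C = 1 + sqrt(2) suffices
ratio = dp * (1.0 + r) / p
assert ratio.max() < 3.0

# |p(xi)|^{-1} <= (1 + |xi|)^0, i.e. mu = 0 works
assert (1.0 / p).max() <= 1.0
```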

Remark 9.57 Regularity result (9.49) involves a loss of one derivative compared with the elliptic regularity theorem (Theorem 9.22). We express this by saying that $A$ is hypoelliptic, with loss of one derivative.

(II) Secondly, we formulate precisely Melin–Sjöstrand's result. Let $M$ be an $n$-dimensional, compact smooth manifold without boundary and let $A$ be a classical pseudo-differential operator of first order on $M$ such that
\[
A(x, D) = \beta(x, D) + \sqrt{-1}\, B(x, D),
\tag{9.50}
\]
where:
(1) $\beta(x, \partial) = \sqrt{-1}\, \beta(x, D)$ is a real $C^\infty$ vector field on $M$.
(2) $B \in L^1_{\mathrm{cl}}(M)$ and its homogeneous principal symbol $b_1(x, \xi)$ is real.

We remark that the homogeneous principal symbol $\beta(x, \xi)$ of $\beta(x, D)$ is a polynomial of degree one in the variable $\xi$.

The following criterion for global hypoellipticity is due to Melin–Sjöstrand [127, Introduction]:

Theorem 9.58 (Melin–Sjöstrand) Let $A \in L^1_{\mathrm{cl}}(M)$ be of the form (9.50). Assume that the following three conditions are satisfied:
(A) The symbol $b_1(x, \xi)$ does not change sign on the cotangent bundle $T^*(M)$, that is, either $b_1(x, \xi) \geq 0$ on $T^*(M)$ or $b_1(x, \xi) \leq 0$ on $T^*(M)$.
(B) The vector field $\beta$ is non-zero on the set
\[
K = \bigl\{ x \in M : b_1(x, \xi) = 0 \text{ for some } (x, \xi) \in T^*(M) \setminus \{0\} \bigr\}.
\]
(C) Any maximal integral curve $x(t; x_0)$ of $\beta$ starting at $x_0 \in K$ is not entirely contained in $K$ (see Fig. 9.19 below).
Then we have, for all $s \in \mathbb{R}$,
\[
u \in \mathcal{D}'(M),\ Au \in H^s(M) \implies u \in H^s(M).
\tag{9.51}
\]

Furthermore, for any $t < s$, there exists a constant $C_{s,t} > 0$ such that
\[
\|u\|_s \leq C_{s,t} \bigl( \|Au\|_s + \|u\|_t \bigr).
\]
Assertion (9.51) implies that $A$ is globally hypoelliptic, with loss of one derivative.

An essential idea of the Melin–Sjöstrand theory [127] may be explained as follows. They construct a parametrix $E$ for $A$ which is "formally" given by a Fourier integral operator with a complex phase function (see [127, formula (4.19)]):


\[
E = \int_0^\infty \exp\left[-t A\right] dt = A^{-1}.
\]
Here:
(1) The operator $\exp[-tA]$ is a Fourier integral operator of order zero with a complex phase function.
(2) The continuity of $\exp[-tA] : H^s(M) \to H^s(M)$ for every $s \in \mathbb{R}$ can be reduced to the case of pseudo-differential operators of the Hörmander class $L^0_{1/2,1/2}(M)$, which is established by Calderón–Vaillancourt [28, Theorem].
(3) The operator
\[
E = \int_0^\infty \exp\left[-t A\right] dt : H^s(M) \longrightarrow H^s(M)
\]
is continuous for every $s \in \mathbb{R}$.
(4) We have "formally"
\[
AE = \int_0^\infty A \exp\left[-t A\right] dt
= \int_0^\infty \Bigl( -\frac{d}{dt} \exp\left[-t A\right] \Bigr)\, dt
= -\exp\left[-t A\right] \Big|_{t=0}^{t=\infty} = I
\]
and
\[
EA = \int_0^\infty \exp\left[-t A\right] A\, dt = \int_0^\infty A \exp\left[-t A\right] dt = I.
\]

Fig. 9.19 An intuitive meaning of hypothesis (C): an integral curve $x(t; x_0)$ starting at $x_0$ leaves the set $K = \{x \in M : b_1(x, \xi) = 0 \text{ for some } (x, \xi) \in T^*(M) \setminus \{0\}\}$

(III) Thirdly, let $\Omega$ be an open subset of $\mathbb{R}^n$. We say that a properly supported, pseudo-differential operator $A \in L^m(\Omega)$ is sub-elliptic with loss of $\delta$ derivatives, $0 \leq \delta < 1$, if, for every compact $K \subset \Omega$, $s \in \mathbb{R}$ and $t < s + m - \delta$, we have the inequality
\[
\|u\|_{s+m-\delta} \leq C_{K,s,t} \bigl( \|Au\|_s + \|u\|_t \bigr) \quad \text{for all } u \in C_K^\infty(\Omega).
\]


It is known (cf. Hörmander [85]) that sub-elliptic operators are hypoelliptic, with loss of $\delta$ derivatives. Egorov [48] and Hörmander [90] have obtained necessary and sufficient conditions that a properly supported, classical pseudo-differential operator $A \in L^m_{\mathrm{cl}}(\Omega)$ be sub-elliptic. More precisely, we have the following (see [90, Theorem 3.4]):

Theorem 9.59 (Egorov–Hörmander) Let $A$ be a properly supported, pseudo-differential operator with principal symbol $a_m(x, \xi)$. Then $A$ is sub-elliptic with loss of $\delta$ derivatives, $0 \leq \delta < 1$, if and only if, at every point $x_0$ of $\Omega$, there exists a neighborhood $V$ of $x_0$ such that the following two conditions (i) and (ii) are satisfied:
(i) For any point $(x, \xi) \in V \times (\mathbb{R}^n \setminus \{0\})$, the function
\[
\bigl( H_{\operatorname{Re} z\, a_m} \bigr)^j \bigl( \operatorname{Im} z\, a_m \bigr)(x, \xi)
\tag{9.52}
\]
is different from zero for some complex number $z$ and some non-negative integer $j \leq \delta/(1 - \delta)$, where $H_f$ is the Hamilton vector field defined by the formula
\[
H_f = \sum_{j=1}^n \frac{\partial f}{\partial \xi_j}\, \frac{\partial}{\partial x_j} - \sum_{j=1}^n \frac{\partial f}{\partial x_j}\, \frac{\partial}{\partial \xi_j}.
\]

(ii) If $j$ is an odd integer and is the smallest integer such that the function (9.52) is not identically zero, then the function (9.52) is non-negative for all $(x, \xi) \in V \times (\mathbb{R}^n \setminus \{0\})$.

Finally, the next theorem is a special version of Theorem 9.54 adapted to sub-ellipticity (see [210, Theorem 2.1 and Proposition 6.3]):

Theorem 9.60 (Fefferman–Phong) Let $M$ be an $n$-dimensional, compact smooth manifold without boundary and let $A$ be a second order differential operator of the form (9.44). Then the following three conditions are equivalent:
(i) There exist constants $0 < \varepsilon \leq 1$ and $C > 0$ such that we have, for $\rho > 0$ sufficiently small,
\[
B_E(x, \rho) \subset B_{A_0}(x, C \rho^\varepsilon) \quad \text{for every } x \in M.
\tag{9.45}
\]
(ii) There exist constants $c_0 > 0$ and $C_0 > 0$ such that
\[
-\operatorname{Re}\, (A\varphi, \varphi)_{L^2(M)} \geq c_0 \|\varphi\|_\varepsilon^2 - C_0 \|\varphi\|_0^2 \quad \text{for all } \varphi \in C^\infty(M).
\tag{9.46}
\]
(iii) There exist constants $c_1 > 0$ and $C_1 > 0$ such that
\[
\|A\varphi\|_{L^2(M)}^2 + C_1 \|\varphi\|_0^2 \geq c_1 \|\varphi\|_{2\varepsilon}^2 \quad \text{for all } \varphi \in C^\infty(M).
\tag{9.53}
\]


In this case, the differential operator $A$ is hypoelliptic with loss of $2(1 - \varepsilon)$ derivatives on $M$. Namely, we have, for every $s \in \mathbb{R}$,
\[
\varphi \in \mathcal{D}'(M),\ A\varphi \in H^s(M) \implies \varphi \in H^{s+2\varepsilon}(M).
\tag{9.54}
\]
Moreover, for any $t < s + 1$ there exists a constant $C_{s,t} > 0$ such that
\[
\|\varphi\|_{s+2\varepsilon}^2 \leq C_{s,t} \bigl( \|A\varphi\|_s^2 + \|\varphi\|_t^2 \bigr).
\tag{9.55}
\]

9.11 Notes and Comments

Our $L^2$ theory of pseudo-differential operators follows the exposition of Chazarain–Piriou [35] and also a part of Seeley's exposition [167]. This chapter is adapted from these sources in such a way as to make it accessible to graduate students and advanced undergraduates as well. For detailed studies of the $L^p$ theory of pseudo-differential operators, the reader may be referred to Bourdaud [23], Coifman–Meyer [39], Hörmander [91], Kumano-go [108], Nagase [130], Seeley [167], Shubin [170], Taylor [220] and [221].

Section 9.1: The symbol classes $S^m_{\rho,\delta}$ were first introduced by Hörmander [87].

Section 9.4: For the theory of Fourier integral operators, see Hörmander [88], Duistermaat–Hörmander [44] and Duistermaat [43].

Section 9.7: Our treatment of index theory of elliptic operators is adapted from Palais [139]. To prove Theorem 9.41, we need an interpolation argument. See Lions–Magenes [116] and Taylor [220].

Section 9.8: This section is taken from Chazarain–Piriou [35, Chapitre V, Sect. 2]. Theorem 9.48 is due to Hörmander [85]; see also Seeley [167] and Èskin [52]. A detailed proof of Theorem 9.48 is given in [209, Theorem 6.2]. Theorem 9.49 is due to Boutet de Monvel [24] (see [150, p. 162, part (iv) of Theorem 2]).

Section 9.10: The notion of hypoellipticity was introduced by Schwartz (cf. Schwartz [163]). Hypoelliptic second order differential operators have been studied in detail by Treves [222], Hörmander [86], Fediĭ [55], Oleĭnik–Radkevič [138] and many others.

Part III

Maximum Principles and Elliptic Boundary Value Problems

Chapter 10

Maximum Principles for Degenerate Elliptic Operators

In this chapter we prove various maximum principles for degenerate elliptic differential operators of second order, and reveal the underlying analytical mechanism of propagation of maxima in terms of subunit vectors introduced by Fefferman– Phong [57] (Theorems 10.14 and 10.17). The results may be applied to questions of uniqueness for elliptic boundary value problems. Furthermore, the mechanism of propagation of maxima will play an important role in the interpretation and study of Markov processes from the viewpoint of functional analysis in Part V (see also [202]).

10.1 Introduction

We begin with the following elementary result: let $I$ be an open interval of $\mathbf{R}$. If $u \in C^2(I)$, $\frac{d^2 u}{dx^2}(x) \ge 0$ in $I$ and $u$ takes its maximum at a point of $I$, then $u(x)$ is a constant. This result can be extended to the $N$-dimensional case, with the operator $d^2/dx^2$ replaced by the usual Laplacian
$$\Delta = \sum_{i=1}^{N} \frac{\partial^2}{\partial x_i^2}:$$

Let $D$ be a connected, open subset of $\mathbf{R}^N$. If $u \in C^2(D)$, $\Delta u(x) \ge 0$ in $D$ and $u$ takes its maximum at a point of $D$, then $u(x)$ is a constant. (10.1)

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 K. Taira, Functional Analytic Techniques for Diffusion Processes, Springer Monographs in Mathematics, https://doi.org/10.1007/978-981-19-1099-9_10



Result (10.1) is well known by the name of the strong maximum principle for the Laplacian. We now study the underlying analytical mechanism of propagation of maxima for degenerate elliptic differential operators of second order, which will reveal an intimate connection between partial differential equations and Markov processes.

Let $A$ be a second order differential operator with real coefficients such that
$$A = \sum_{i,j=1}^{N} a^{ij}(x)\,\frac{\partial^2}{\partial x_i \partial x_j} + \sum_{i=1}^{N} b^i(x)\,\frac{\partial}{\partial x_i}, \tag{10.2}$$
where the coefficients $a^{ij}(x)$, $b^i(x)$ satisfy the following two conditions:

(1) The $a^{ij}(x)$ are $C^2$ functions on $\mathbf{R}^N$ all of whose derivatives of order $\le 2$ are bounded in $\mathbf{R}^N$, $a^{ij}(x) = a^{ji}(x)$ for all $x \in \mathbf{R}^N$ and all $1 \le i, j \le N$, and satisfy the condition
$$\sum_{i,j=1}^{N} a^{ij}(x)\,\xi_i \xi_j \ge 0 \quad \text{for all } (x, \xi) \in T^*(\mathbf{R}^N) = \mathbf{R}^N \times \mathbf{R}^N. \tag{10.3}$$
Here $T^*(\mathbf{R}^N)$ is the cotangent bundle of $\mathbf{R}^N$.

(2) The $b^i(x)$ are $C^1$ functions on $\mathbf{R}^N$ with bounded derivatives in $\mathbf{R}^N$ for $1 \le i \le N$.

We consider the following problem:

Problem 10.1 Let $D$ be a connected open subset of $\mathbf{R}^N$ and $x$ a point of $D$. Then determine the largest connected, relatively closed subset $D(x)$ of $D$, containing $x$, such that:

If $u \in C^2(D)$, $Au(x) \ge 0$ in $D$ and $\sup_D u = M < \infty$, then $u(x) \equiv M$ throughout $D(x)$. (10.4)

The set $D(x)$ is called the propagation set of $x$ in $D$. We now give a coordinate-free description of the set $D(x)$ in terms of subunit vectors, a notion introduced by Fefferman–Phong [57]. A tangent vector
$$X = \sum_{j=1}^{N} \gamma^j\,\frac{\partial}{\partial x_j} \in T_x(D)$$
at $x \in D$ is said to be subunit for the operator
$$A_0 = \sum_{i,j=1}^{N} a^{ij}(x)\,\frac{\partial^2}{\partial x_i \partial x_j}$$


if it satisfies the condition
$$\Bigl(\sum_{j=1}^{N} \gamma^j \eta_j\Bigr)^2 \le \sum_{i,j=1}^{N} a^{ij}(x)\,\eta_i \eta_j \quad \text{for all } \eta = \sum_{j=1}^{N} \eta_j\,dx_j \in T_x^*(D),$$
where $T_x^*(D)$ is the cotangent space of $D$ at $x$. Note that this notion is coordinate-free. Hence we can rotate the coordinate axes so that the symmetric matrix $\bigl(a^{ij}\bigr)$ is diagonalized at $x$:
$$\bigl(a^{ij}(x)\bigr) = \bigl(\lambda_i \delta_{ij}\bigr), \quad \lambda_1 > 0, \ldots, \lambda_r > 0, \quad \lambda_{r+1} = \cdots = \lambda_N = 0,$$
where $r = \operatorname{rank}\bigl(a^{ij}(x)\bigr)$. Then it is easy to see that the vector $X$ is subunit for $A_0$ if and only if it is contained in the following ellipsoid of dimension $r$:
$$\frac{(\gamma^1)^2}{\lambda_1} + \cdots + \frac{(\gamma^r)^2}{\lambda_r} \le 1, \quad \gamma^{r+1} = \cdots = \gamma^N = 0. \tag{10.5}$$

A subunit trajectory is a Lipschitz path $\gamma\colon [t_1, t_2] \to D$ such that the tangent vector
$$\dot\gamma(t) = \frac{d}{dt}(\gamma(t))$$
is subunit for $A_0$ at $\gamma(t)$ for almost every $t$. We remark that if $\dot\gamma(t)$ is subunit for $A_0$, so is $-\dot\gamma(t)$; hence subunit trajectories are not oriented.

We let
$$X_0(x) := \sum_{i=1}^{N} \Bigl(b^i(x) - \sum_{j=1}^{N} \frac{\partial a^{ij}}{\partial x_j}(x)\Bigr)\frac{\partial}{\partial x_i}.$$
The vector field $X_0(x)$ is called the drift vector field in probability theory, while it is the so-called subprincipal part of the operator $A$ in the theory of partial differential equations. A drift trajectory is a curve $\theta\colon [t_1, t_2] \to D$ such that
$$\dot\theta(t) = X_0(\theta(t)) \quad \text{on } [t_1, t_2],$$
and this curve is oriented in the direction of increasing $t$.

Our main result reads as follows (Theorem 10.14):

The propagation set $D(x)$ of $x$ in $D$ contains the closure $D'(x)$ in $D$ of all points $y \in D$ that can be joined to $x$ by a finite number of subunit and drift trajectories. (10.6)
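In the diagonalized situation above, the defining inequality for subunit vectors and the ellipsoid criterion (10.5) are easy to compare numerically. The following Python sketch is an illustration added to this text, not part of the book; the diagonal matrix and the sample vectors are arbitrary choices. It tests the inequality $(\sum_j \gamma^j\eta_j)^2 \le \sum_{i,j} a^{ij}\eta_i\eta_j$ on random covectors and checks the verdict against the ellipsoid condition for $(a^{ij}) = \operatorname{diag}(2, 1/2, 0)$, where $r = 2$:

```python
import random

# Illustrative check (not from the book): for the diagonal matrix
# (a^{ij}) = diag(2, 1/2, 0), a vector X = (g1, g2, g3) is subunit for A_0
# exactly when g1^2/2 + g2^2/(1/2) <= 1 and g3 = 0 (criterion (10.5)).

lam = [2.0, 0.5, 0.0]
random.seed(0)

def subunit_by_sampling(g, trials=20000):
    """Test (sum_j g_j eta_j)^2 <= sum_i lam_i eta_i^2 on random covectors."""
    for _ in range(trials):
        eta = [random.uniform(-1.0, 1.0) for _ in lam]
        lhs = sum(gj * ej for gj, ej in zip(g, eta)) ** 2
        rhs = sum(li * ei * ei for li, ei in zip(lam, eta))
        if lhs > rhs + 1e-12:
            return False          # found a violating covector eta
    return True

def subunit_by_ellipsoid(g):
    if g[2] != 0.0:               # component along the kernel direction
        return False
    return g[0] ** 2 / lam[0] + g[1] ** 2 / lam[1] <= 1.0

for g in [(1.0, 0.5, 0.0), (1.5, 0.0, 0.0), (0.0, 0.0, 0.1), (0.0, 0.9, 0.0)]:
    assert subunit_by_sampling(g) == subunit_by_ellipsoid(g)
```

Random sampling can only ever refute the inequality, so the agreement here is probabilistic; the point is that the ellipsoid test (10.5) gives the same verdicts without any search.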


This result tells us that if the matrix $\bigl(a^{ij}(x)\bigr)$ is non-degenerate at a point $x$, that is, if $r = \operatorname{rank}\bigl(a^{ij}(x)\bigr) = N$, then the maximum propagates in an open neighborhood of $x$; but if the matrix $\bigl(a^{ij}(x)\bigr)$ is degenerate at $x$, then the maximum propagates only in a "thin" ellipsoid of dimension $r$ defined by formula (10.5) and in the direction of $X_0$. Now we see the reason why the strong maximum principle (10.1) holds true for the Laplacian.

We consider four simple examples in the case when $D$ is the square $(-1, 1) \times (-1, 1)$ ($N = 2$) (see Figs. 1.8 and 1.9 in Sect. 1.2).

Example 10.2 $A_1 = \partial^2/\partial x^2 + x^2\,\partial^2/\partial y^2$. The subunit vector fields for $A_1$ are generated by the following:
$$\frac{\partial}{\partial x}, \quad x\,\frac{\partial}{\partial y}.$$
Hence we have the assertion

The set $D'((x, y))$ is equal to $D$ for every $(x, y) \in D$. (10.7)

That is, the strong maximum principle (10.1) remains valid for the operator $A_1$.

Example 10.3 $A_2 = x^2\,\partial^2/\partial x^2 + \partial^2/\partial y^2$. The subunit vector fields for $A_2$ are generated by the following:
$$x\,\frac{\partial}{\partial x}, \quad \frac{\partial}{\partial y}. \tag{10.8}$$
Thus we have the assertion
$$D'((x, y)) = \begin{cases} [0, 1) \times (-1, 1) & \text{if } x > 0, \\ \{0\} \times (-1, 1) & \text{if } x = 0, \\ (-1, 0] \times (-1, 1) & \text{if } x < 0. \end{cases}$$
It can be shown that the operator $A_2$ does not have the property (10.7) in some weak sense (see [206, Example 6.2]).

Example 10.4 $A_3 = x^2\,\partial^2/\partial x^2 + \partial^2/\partial y^2 + y\,\partial/\partial x$. The subunit vector fields for $A_3^0 = A_2$ are generated by formula (10.8), and the drift vector field is equal to the following:
$$(y - 2x)\,\frac{\partial}{\partial x}.$$
Thus, by virtue of the drift vector field, we have assertion (10.7), and so the strong maximum principle (10.1) remains valid for the operator $A_3$.

Example 10.5 $A_4 = x^2\,\partial^2/\partial x^2 + \partial^2/\partial y^2 + \partial/\partial x$. The subunit vector fields for $A_4^0 = A_2$ are generated by (10.8), and the drift vector field is equal to the following:


$$(1 - 2x)\,\frac{\partial}{\partial x}.$$

Hence we have the assertion
$$D'((x, y)) = \begin{cases} D & \text{if } x < 0, \\ [0, 1) \times (-1, 1) & \text{if } x \ge 0. \end{cases}$$
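The drift fields in Examples 10.4 and 10.5 are obtained mechanically from the formula $X_0^i = b^i - \sum_j \partial a^{ij}/\partial x_j$ introduced above. The Python sketch below (an illustration added to this text, not the book's computation; the sample point is arbitrary) recovers the drift $(y - 2x)\,\partial/\partial x$ of Example 10.4 by central finite differences:

```python
# Illustrative computation (not from the book) of the drift vector field
#   X_0^i(x) = b^i(x) - sum_j d a^{ij}/dx_j (x)
# for Example 10.4, A_3 = x^2 d^2/dx^2 + d^2/dy^2 + y d/dx,
# using central finite differences for the derivatives of a^{ij}.

def a(p):
    x, y = p
    return [[x * x, 0.0], [0.0, 1.0]]   # diffusion coefficients (a^{ij})

def b(p):
    x, y = p
    return [y, 0.0]                      # first-order coefficients (b^i)

def drift(p, h=1e-6):
    out = list(b(p))
    for i in range(2):
        for j in range(2):
            pp, pm = list(p), list(p)
            pp[j] += h
            pm[j] -= h
            out[i] -= (a(pp)[i][j] - a(pm)[i][j]) / (2.0 * h)
    return out

# At (x, y) = (0.3, 0.5) the drift should be (y - 2x, 0) = (-0.1, 0).
print(drift((0.3, 0.5)))
```

Replacing `b` by the coefficients of $A_4$ reproduces the drift $(1 - 2x)\,\partial/\partial x$ of Example 10.5 in the same way.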

It can also be shown that the operator $A_4$ does not have the property (10.7) in some weak sense (see [206, Example 6.4]).

It is worth pointing out here that the propagation set $D'(x)$ coincides with the support of the Markov process corresponding to the operator $A$, which is the closure of the collection of all possible trajectories of a Markovian particle, starting at $x$, with generator $A$.

In the case where the differential operator $A$ is written as a sum of squares of vector fields, we can give another (equivalent) description of the set $D'(x)$. Assume that the differential operator $A$ is of the form
$$A = \sum_{k=1}^{r} Y_k^2 + Y_0, \tag{10.9}$$
where the $Y_k$ are real $C^2$ vector fields on $\mathbf{R}^N$ and $Y_0$ is a real $C^1$ vector field on $\mathbf{R}^N$, respectively (see Hörmander [86], Bony [21]). If $A$ is of the form (10.9), then Hill's diffusion trajectory is a curve $\beta\colon [t_1, t_2] \to D$ such that
$$\dot\beta(t) = Y_k(\beta(t)), \quad \dot\beta(t) \ne 0 \quad \text{on } [t_1, t_2].$$
Hill's diffusion trajectories are not oriented; they may be traversed in either direction. Hill's drift trajectories are defined similarly, with $Y_k$ replaced by $Y_0$, but they are oriented in the direction of increasing $t$. In this case, our main result (10.6) can be restated as follows (Theorem 10.17):

The propagation set $D'(x)$ coincides with the closure in $D$ of all points $y \in D$ which can be joined to $x$ by a finite number of Hill's diffusion and drift trajectories.

Furthermore, our result (10.6) may be reformulated in various ways. For example, we have the following result (Theorem 10.19):


Let $c(x)$ be a continuous function on $D$ such that $c(x) \le 0$ in $D$. If $u \in C^2(D)$, $(A + c)u(x) \ge 0$ in $D$ and $u$ attains its positive maximum $M$ at a point $x$ of $D$, then $u(x) \equiv M$ throughout $D'(x)$.
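The probabilistic reading of this section can be made concrete for Example 10.3: the principal part $x^2\,\partial^2/\partial x^2$ generates the one-dimensional diffusion $dX_t = \sqrt{2}\,X_t\,dW_t$, whose paths started at $x_0 > 0$ stay strictly positive (indeed $X_t = x_0\exp(\sqrt{2}\,W_t - t)$), matching the fact that $D'((x, y))$ never crosses $\{x = 0\}$. The following Python sketch is an illustration added to this text (step sizes and seeds are arbitrary choices):

```python
import math
import random

# Euler-Maruyama sketch (illustrative, not from the book): the x-component
# diffusion generated by x^2 d^2/dx^2 is dX = sqrt(2) X dW, whose exact
# solution X_t = x0 * exp(sqrt(2) W_t - t) never changes sign.  Paths
# started at x0 > 0 therefore stay in {x > 0}, as Example 10.3 predicts.

def simulate(x0, steps=5000, dt=1e-4, rng=None):
    rng = rng or random.Random(0)
    x = x0
    for _ in range(steps):
        x += math.sqrt(2.0) * x * rng.gauss(0.0, math.sqrt(dt))
        if x <= 0.0:             # would contradict the support description
            return x
    return x

for seed in range(5):
    assert simulate(0.5, rng=random.Random(seed)) > 0.0
```

The discrete scheme preserves the sign because a step multiplies $x$ by $1 + \sqrt{2}\,\Delta W$, and $|\Delta W| \ge 1/\sqrt{2}$ is an astronomically unlikely event at this step size.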

10.2 Maximum Principles

Let $D$ be a bounded domain in $\mathbf{R}^N$ with boundary $\partial D$, and let $A$ be a second order degenerate elliptic differential operator with real coefficients such that
$$A = \sum_{i,j=1}^{N} a^{ij}(x)\,\frac{\partial^2}{\partial x_i \partial x_j} + \sum_{i=1}^{N} b^i(x)\,\frac{\partial}{\partial x_i} + c(x), \tag{10.10}$$
where:

(1) $a^{ij} \in C(\mathbf{R}^N)$, $a^{ij}(x) = a^{ji}(x)$ for all $x \in \mathbf{R}^N$ and
$$\sum_{i,j=1}^{N} a^{ij}(x)\,\xi_i \xi_j \ge 0 \quad \text{for all } (x, \xi) \in T^*(\mathbf{R}^N) = \mathbf{R}^N \times \mathbf{R}^N. \tag{10.11}$$

(2) $b^i \in C(\mathbf{R}^N)$ for $1 \le i \le N$.
(3) $c \in C(\mathbf{R}^N)$ and $A1(x) = c(x) \le 0$ in $D$.

First, we prove the following theorem:

Theorem 10.6 (the weak maximum principle). Let the differential operator $A$ of the form (10.10) satisfy the degenerate elliptic condition (10.11). Assume that a function $u \in C(\overline{D}) \cap C^2(D)$ satisfies either
$$Au(x) \ge 0 \text{ and } c(x) < 0 \text{ in } D, \tag{10.12}$$
or
$$Au(x) > 0 \text{ and } c(x) \le 0 \text{ in } D. \tag{10.13}$$
Then the function $u(x)$ may take its positive maximum only on the boundary $\partial D$.

Proof The proof is based on a reduction to absurdity. Assume, to the contrary, that

the function $u(x)$ takes its positive maximum at an interior point $x_0$ of $D$. (10.14)

Without loss of generality, we may choose a local coordinate system


$y = (y_1, y_2, \ldots, y_N)$ in a neighborhood of $x_0$ such that $x_0 =$ the origin and
$$A = \sum_{j,k=1}^{N} \alpha^{jk}(y)\,\frac{\partial^2}{\partial y_j \partial y_k} + \sum_{k=1}^{N} \beta^k(y)\,\frac{\partial}{\partial y_k} + c(y),$$
with
$$\bigl(\alpha^{jk}(0)\bigr) = \begin{pmatrix} E_r & 0 \\ 0 & 0 \end{pmatrix}.$$
Here $r = \operatorname{rank}\bigl(a^{ij}(x_0)\bigr)$ and $E_r$ is the $r \times r$ unit matrix. Then the assumption (10.14) implies that
$$\frac{\partial u}{\partial y_k}(0) = 0, \qquad \frac{\partial^2 u}{\partial y_k^2}(0) \le 0 \quad \text{for } 1 \le k \le r,$$
so that we have the inequalities
$$Au(x_0) = Au(0) \le c(0)\,u(0) \begin{cases} < 0 & \text{if } c(x_0) < 0, \\ \le 0 & \text{if } c(x_0) \le 0. \end{cases}$$
This contradicts the hypothesis (10.12) or (10.13). The proof of Theorem 10.6 is complete. $\square$
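As a concrete illustration of Theorem 10.6 (added here; the example is not from the text), take $A = d^2/dx^2 + c$ with $c \equiv -1 < 0$ on $D = (-1, 1)$ and $u(x) = \cosh x$. Then $Au = u'' - u = 0 \ge 0$, so the positive maximum of $u$ over $\overline{D}$ must sit on $\partial D = \{\pm 1\}$, which the following Python check confirms on a grid:

```python
import math

# Weak maximum principle check (illustrative): A = d^2/dx^2 - 1 and
# u = cosh satisfy Au = u'' - u = 0 >= 0 on D = (-1, 1), so the positive
# maximum of u over the closed interval may only occur at the boundary.

n = 201
xs = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
u = [math.cosh(x) for x in xs]
h = xs[1] - xs[0]

for i in range(1, n - 1):
    Au = (u[i - 1] - 2.0 * u[i] + u[i + 1]) / h ** 2 - u[i]
    assert Au >= -1e-6           # Au >= 0 up to discretization error

imax = max(range(n), key=lambda i: u[i])
assert imax in (0, n - 1)        # the maximum sits on the boundary
```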



As an application of the weak maximum principle, we can obtain a pointwise estimate for solutions of the inhomogeneous equation $Au = f$ in $D$:

Theorem 10.7 Assume that $c(x) < 0$ on $\overline{D} = D \cup \partial D$. Then we have, for all $u \in C(\overline{D}) \cap C^2(D)$,
$$\max_{\overline{D}} |u| \le \max\Bigl\{\sup_D \Bigl|\frac{Au}{c}\Bigr|,\ \max_{\partial D} |u|\Bigr\}. \tag{10.15}$$

Proof We let
$$M := \max\Bigl\{\sup_D \Bigl|\frac{Au}{c}\Bigr|,\ \max_{\partial D} |u|\Bigr\},$$


and consider the functions
$$v_\pm(x) = M \pm u(x).$$
Then it follows that
$$Av_\pm(x) = c(x)\,M \pm Au(x) \le 0 \quad \text{in } D.$$
Hence, by applying Theorem 10.6 to the functions $-v_\pm(x)$ we obtain that the functions $v_\pm(x)$ may take their negative minima only on the boundary $\partial D$. However, we have the inequality
$$v_\pm(x) = M \pm u(x) \ge 0 \quad \text{on } \partial D,$$
so that
$$v_\pm(x) \ge 0 \quad \text{on } \overline{D} = D \cup \partial D.$$
This proves the desired estimate (10.15). The proof of Theorem 10.7 is complete. $\square$



Remark 10.8 In the case where $Au(x) = 0$ in $D$, the estimate (10.15) can be replaced by the following equality:
$$\max_{\overline{D}} |u| = \max_{\partial D} |u|.$$

We consider the case where the operator $A$ is strictly elliptic on $\overline{D}$, that is, there exists a constant $a_0 > 0$ such that
$$\sum_{i,j=1}^{N} a^{ij}(x)\,\xi_i \xi_j \ge a_0 |\xi|^2 \quad \text{for all } x \in \overline{D} \text{ and } \xi \in \mathbf{R}^N. \tag{10.16}$$
Then we have the following:

Theorem 10.9 Assume that $A$ is strictly elliptic on $\overline{D}$ and $c(x) \equiv 0$ in $D$. If $u \in C^2(D) \cap C(\overline{D})$ and $Au(x) \ge 0$ in $D$, then we have the formula
$$\max_{\overline{D}} u = \max_{\partial D} u. \tag{10.17}$$

Proof Taking $\xi = (1, 0, \ldots, 0)$ in inequality (10.16), we obtain that
$$a^{11}(x) \ge a_0 \quad \text{on } \overline{D}.$$
Hence we can find a constant $\alpha > 0$ so large that
$$A e^{\alpha x_1} = \bigl(\alpha^2 a^{11}(x) + \alpha b^1(x)\bigr) e^{\alpha x_1} > 0 \quad \text{on } \overline{D}.$$
Then we have, for all $\varepsilon > 0$,
$$A\bigl(u(x) + \varepsilon e^{\alpha x_1}\bigr) \ge \varepsilon A e^{\alpha x_1} > 0 \quad \text{in } D.$$
Thus, by arguing as in the proof of Theorem 10.6 we find that the function $u(x) + \varepsilon e^{\alpha x_1}$ may take its maximum only on the boundary $\partial D$. This implies that
$$\max_{\overline{D}} \bigl(u + \varepsilon e^{\alpha x_1}\bigr) = \max_{\partial D} \bigl(u + \varepsilon e^{\alpha x_1}\bigr). \tag{10.18}$$
The desired equality (10.17) follows by letting $\varepsilon \downarrow 0$ in equality (10.18). The proof of Theorem 10.9 is complete. $\square$



Corollary 10.10 Assume that $A$ is strictly elliptic on $\overline{D}$. If $u \in C^2(D) \cap C(\overline{D})$ and $Au(x) = 0$ in $D$, then we have the formula
$$\max_{\overline{D}} |u| = \max_{\partial D} |u|.$$

Proof By replacing $u(x)$ by $-u(x)$ if necessary, we may assume that
$$\max_{\overline{D}} u > 0.$$
We let
$$D^+ = \{x \in D : u(x) > 0\}.$$
Then we have the assertion
$$\sum_{i,j=1}^{N} a^{ij}(x)\,\frac{\partial^2 u}{\partial x_i \partial x_j} + \sum_{i=1}^{N} b^i(x)\,\frac{\partial u}{\partial x_i} = -c(x)\,u \ge 0 \quad \text{in } D^+.$$
Hence, by applying Theorem 10.9 with $D := D^+$ we obtain that (see Fig. 10.1 below)
$$\max_{\overline{D^+}} u = \max_{\partial D^+} u.$$
However, since $u = 0$ on $\partial D^+ \cap D$, this implies that
$$\max_{\overline{D}} u = \max_{\overline{D^+}} u = \max_{\partial D^+} u = \max_{\partial D} u.$$
The proof of Corollary 10.10 is complete. $\square$

Now we study the inner normal derivative $\partial u/\partial \mathbf{n}$ of $u$ at a point where the function $u$ takes its non-negative maximum. In what follows let $D$ be a domain of class $C^2$. We let
$$\rho(x) = \operatorname{dist}(x, \partial D) \quad \text{for } x \in \mathbf{R}^N.$$


Fig. 10.1 The sets D + and {u = 0}

Then it follows that
$$\begin{cases} \rho \in C^1(\mathbf{R}^N), \\ x \in \partial D \iff \rho(x) = 0, \\ \operatorname{grad} \rho = \text{the unit inward normal } \mathbf{n} \text{ to } \partial D. \end{cases}$$
Following Fichera [61], we define two disjoint subsets $\Sigma_3$ and $\Sigma_0$ of the boundary $\partial D$ by the formulas
$$\Sigma_3 = \Bigl\{x' \in \partial D : \sum_{i,j=1}^{N} a^{ij}(x')\,n_i n_j > 0\Bigr\},$$
$$\Sigma_0 = \Bigl\{x' \in \partial D : \sum_{i,j=1}^{N} a^{ij}(x')\,n_i n_j = 0\Bigr\},$$
where $\mathbf{n} = (n_1, n_2, \ldots, n_N)$. In other words, $\Sigma_3$ is the set of non-characteristic points with respect to the operator $A$ and $\Sigma_0$ is the set of characteristic points with respect to the operator $A$, respectively. The next lemma justifies the definition of the sets $\Sigma_3$ and $\Sigma_0$ (see [206, Lemma 3.1]):

Lemma 10.11 The sets $\Sigma_3$ and $\Sigma_0$ are invariant under $C^2$-diffeomorphisms preserving normal vectors.

Proof Let $x_0$ be an arbitrary point of $\Sigma_3$ and consider, in a neighborhood $U$ of $x_0$, a $C^2$-diffeomorphism
$$y = F(x) = \bigl(F^1(x), F^2(x), \ldots, F^N(x)\bigr)$$


which preserves normal vectors. Then it follows that
$$\partial D \cap U = \{y \in U : \Phi(y) = 0\}, \qquad \Phi = \rho \circ F^{-1},$$
and also
$$n_i = \frac{\partial \rho}{\partial x_i} = \sum_{\ell=1}^{N} \frac{\partial F^\ell}{\partial x_i}\,\frac{\partial \Phi}{\partial y_\ell}, \quad 1 \le i \le N.$$
Furthermore, we can rewrite the operator $A$ in the form
$$\begin{aligned} A &= \sum_{i,j=1}^{N} a^{ij}(x)\,\frac{\partial^2}{\partial x_i \partial x_j} + \sum_{i=1}^{N} b^i(x)\,\frac{\partial}{\partial x_i} + c(x) \\ &= \sum_{\ell,m=1}^{N} \Bigl(\sum_{i,j=1}^{N} a^{ij}\bigl(F^{-1}(y)\bigr)\,\frac{\partial F^\ell}{\partial x_i}\,\frac{\partial F^m}{\partial x_j}\Bigr)\frac{\partial^2}{\partial y_\ell \partial y_m} \\ &\quad + \sum_{\ell=1}^{N} \Bigl(\sum_{i=1}^{N} b^i\bigl(F^{-1}(y)\bigr)\,\frac{\partial F^\ell}{\partial x_i} + \sum_{i,j=1}^{N} a^{ij}\bigl(F^{-1}(y)\bigr)\,\frac{\partial^2 F^\ell}{\partial x_i \partial x_j}\Bigr)\frac{\partial}{\partial y_\ell} \\ &\quad + c\bigl(F^{-1}(y)\bigr). \end{aligned}$$
However, we have the formula
$$\begin{aligned} \sum_{i,j=1}^{N} a^{ij}(x)\,\frac{\partial \rho}{\partial x_i}\,\frac{\partial \rho}{\partial x_j} &= \sum_{i,j=1}^{N} a^{ij}(x)\Bigl(\sum_{\ell=1}^{N}\frac{\partial F^\ell}{\partial x_i}\,\frac{\partial \Phi}{\partial y_\ell}\Bigr)\Bigl(\sum_{m=1}^{N}\frac{\partial F^m}{\partial x_j}\,\frac{\partial \Phi}{\partial y_m}\Bigr) \\ &= \sum_{\ell,m=1}^{N} \Bigl(\sum_{i,j=1}^{N} a^{ij}(x)\,\frac{\partial F^\ell}{\partial x_i}\,\frac{\partial F^m}{\partial x_j}\Bigr)\frac{\partial \Phi}{\partial y_\ell}\,\frac{\partial \Phi}{\partial y_m} \\ &= \sum_{\ell,m=1}^{N} \widetilde a^{\ell m}(y)\,\frac{\partial \Phi}{\partial y_\ell}\,\frac{\partial \Phi}{\partial y_m}, \end{aligned}$$
where
$$\widetilde a^{\ell m}(y) = \sum_{i,j=1}^{N} a^{ij}\bigl(F^{-1}(y)\bigr)\,\frac{\partial F^\ell}{\partial x_i}\,\frac{\partial F^m}{\partial x_j}.$$
This proves the invariance of the sets $\Sigma_3$ and $\Sigma_0$, since the diffeomorphism $F$ preserves normal vectors and so $\operatorname{grad} \Phi$ has the same direction as the unit inward normal $\mathbf{n}$, that is,
$$\sum_{i,j=1}^{N} a^{ij}(x)\,\frac{\partial \rho}{\partial x_i}\,\frac{\partial \rho}{\partial x_j} > 0 \iff \sum_{\ell,m=1}^{N} \widetilde a^{\ell m}(y)\,\frac{\partial \Phi}{\partial y_\ell}\,\frac{\partial \Phi}{\partial y_m} > 0,$$
$$\sum_{i,j=1}^{N} a^{ij}(x)\,\frac{\partial \rho}{\partial x_i}\,\frac{\partial \rho}{\partial x_j} = 0 \iff \sum_{\ell,m=1}^{N} \widetilde a^{\ell m}(y)\,\frac{\partial \Phi}{\partial y_\ell}\,\frac{\partial \Phi}{\partial y_m} = 0.$$
The proof of Lemma 10.11 is complete. $\square$

Fig. 10.2 The unit inward normal $\mathbf{n}$ at a point $x_0$ of the set $\Sigma_3$
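The invariance computation in the proof of Lemma 10.11 can be spot-checked on a linear change of variables. In the Python sketch below (a hypothetical example added to this text, not from the book) the boundary is $\{x_2 = 0\}$, $\rho(x) = x_2$, and $F(x) = (x_1 + x_2,\ 2x_2)$ is a shear-plus-stretch preserving the normal direction; the quantity $\sum a^{ij}\,\partial_i\rho\,\partial_j\rho$ is unchanged when recomputed with $\widetilde a^{\ell m}$ and $\Phi = \rho \circ F^{-1}$:

```python
# Illustrative check (hypothetical example) of the transformation rule
#   a~^{lm} = sum_{ij} a^{ij} (dF^l/dx_i)(dF^m/dx_j)
# from the proof of Lemma 10.11: the boundary quantity
#   sum a^{ij} drho_i drho_j = sum a~^{lm} dPhi_l dPhi_m
# is invariant, so membership in Sigma_3 or Sigma_0 is preserved.

J = [[1.0, 1.0], [0.0, 2.0]]            # Jacobian dF^l/dx_i of F(x) = (x1 + x2, 2 x2)
a = [[3.0, 1.0], [1.0, 2.0]]            # a positive definite choice of (a^{ij})
grad_rho = [0.0, 1.0]                   # rho(x) = x2
grad_Phi = [0.0, 0.5]                   # Phi = rho o F^{-1}, i.e. Phi(y) = y2 / 2

atilde = [[sum(a[i][j] * J[l][i] * J[m][j] for i in range(2) for j in range(2))
           for m in range(2)] for l in range(2)]

lhs = sum(a[i][j] * grad_rho[i] * grad_rho[j] for i in range(2) for j in range(2))
rhs = sum(atilde[l][m] * grad_Phi[l] * grad_Phi[m] for l in range(2) for m in range(2))
assert abs(lhs - rhs) < 1e-12           # both equal a^{22} = 2
```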



The next lemma will be useful in Chap. 13:

Lemma 10.12 (Hopf's boundary point lemma). Assume that a function $u \in C(\overline{D}) \cap C^2(D)$ satisfies the condition
$$Au(x) \ge 0 \quad \text{in } D, \tag{10.19}$$
and that there exists a point $x_0$ of the set $\Sigma_3$ such that
$$u(x_0) = \max_{\overline{D}} u \ge 0, \tag{10.20a}$$
$$u(x) < u(x_0) \quad \text{for all } x \in D. \tag{10.20b}$$
Then the unit inward normal derivative $\frac{\partial u}{\partial \mathbf{n}}(x_0)$ of $u$ at $x_0$, if it exists, satisfies the condition (see Fig. 10.2 above)
$$\frac{\partial u}{\partial \mathbf{n}}(x_0) < 0. \tag{10.21}$$

Proof By Lemma 10.11, we may choose a local coordinate system y = (y1 , y2 , . . . , y N ) in a neighborhood of x0 such that




$$x_0 = \text{the origin}, \qquad \rho = y_N.$$
Assume that the operator $A$ is written in the form
$$A = \sum_{j,k=1}^{N} \alpha^{jk}(y)\,\frac{\partial^2}{\partial y_j \partial y_k} + \sum_{k=1}^{N} \beta^k(y)\,\frac{\partial}{\partial y_k} + c(y).$$
Note that
$$\alpha^{NN}(0) > 0, \tag{10.22}$$
since $0 \in \Sigma_3$ and $\mathbf{n} = (0, \ldots, 0, 1)$.

Now we consider the function
$$v(y) = \alpha \sum_{i=1}^{N-1} y_i^2 - \beta y_N - y_N^2,$$
where $\alpha$, $\beta$ are positive constants to be chosen later on. Then we have the formula
$$Av(0) = 2\alpha \sum_{i=1}^{N-1} \alpha^{ii}(0) - 2\alpha^{NN}(0) - \beta\,\beta^N(0).$$
In view of inequality (10.22), it follows that there exists a neighborhood $V$ of the origin $0$ such that
$$Av(x) < 0 \quad \text{in } V, \tag{10.23}$$
if the constants $\alpha$ and $\beta$ are chosen sufficiently small. We let
$$E = \text{the domain surrounded by the hypersurface } \{v = 0\} \text{ and the hyperplane } \{y_N = \eta\}.$$
Here $\eta$ is a positive constant to be chosen small enough so that $E \subset V$. Furthermore, we let
$$w(y) := \varepsilon v(y) - u(y) + u(0),$$
where $\varepsilon$ is a positive constant to be chosen later on. Then it follows from inequalities (10.19), (10.20a) and (10.23) that
$$Aw(x) = \varepsilon Av(x) - Au(x) + c(x)\,u(0) \le \varepsilon Av(x) < 0 \quad \text{in } E.$$


Thus, by applying Theorem 10.6 to the function $-w(x)$ we find that the function $w(x)$ may take its negative minimum only on the boundary $\partial E$ of $E$. However, condition (10.20b) implies that
$$w(y) = \varepsilon v(y) + \bigl(u(0) - u(y)\bigr) \ge 0 \quad \text{for all } y \in \partial E,$$
if $\varepsilon$ is chosen sufficiently small. Hence it follows that
$$w(y) = \varepsilon v(y) + u(0) - u(y) \ge 0 \quad \text{for all } y \in E \cup \partial E.$$
Therefore, by taking $y = (0, y_N)$ with $0 < y_N < \eta$ we have the inequality
$$\varepsilon\,\frac{v(0, y_N) - v(0, 0)}{y_N} \ge \frac{u(0, y_N) - u(0, 0)}{y_N}. \tag{10.24}$$
If the derivative $\frac{\partial u}{\partial \mathbf{n}}$ exists at $x_0$, we can let $y_N \downarrow 0$ in inequality (10.24) to obtain that
$$\frac{\partial u}{\partial \mathbf{n}}(x_0) = \frac{\partial u}{\partial y_N}(0) \le \varepsilon\,\frac{\partial v}{\partial y_N}(0) = -\varepsilon\beta.$$
This proves the desired inequality (10.21). The proof of Lemma 10.12 is complete. $\square$
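Lemma 10.12 is easy to see in action for the Laplacian. In the Python sketch below (an added illustration with a hypothetical example, not from the text), $u(x, y) = x$ is harmonic on the unit disk, attains its maximum over the closed disk only at the boundary point $x_0 = (1, 0)$, and the inward normal derivative there is strictly negative, as the lemma predicts:

```python
# Illustrative check of Hopf's boundary point lemma for the Laplacian on
# the unit disk: u(x, y) = x satisfies Au = 0 >= 0, u attains its maximum
# over the closed disk at x0 = (1, 0), and u < u(x0) inside.  The lemma
# predicts du/dn(x0) < 0 along the inward normal.

def u(p):
    return p[0]

x0 = (1.0, 0.0)
n_in = (-1.0, 0.0)          # unit inward normal at x0

h = 1e-6
# one-sided difference quotient along the inward normal
dudn = (u((x0[0] + h * n_in[0], x0[1] + h * n_in[1])) - u(x0)) / h
print(dudn)                 # approximately -1, in accordance with (10.21)
```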



10.3 Propagation of Maxima

Let $D$ be a connected open subset of $\mathbf{R}^N$. The following result is well known by the name of the strong maximum principle for the Laplacian
$$\Delta = \sum_{i=1}^{N} \frac{\partial^2}{\partial x_i^2}:$$
if $u \in C^2(D)$, $\Delta u(x) \ge 0$ in $D$ and $u(x)$ takes its maximum at a point of $D$, then $u(x)$ is a constant.

The purpose of this section is to reveal the underlying analytical mechanism of propagation of maxima for degenerate elliptic differential operators of second order, explaining the above result. The mechanism of propagation of maxima is closely related to the diffusion phenomenon of Markovian particles. Let $A$ be a second order differential operator with real coefficients such that

$$A = \sum_{i,j=1}^{N} a^{ij}(x)\,\frac{\partial^2}{\partial x_i \partial x_j} + \sum_{i=1}^{N} b^i(x)\,\frac{\partial}{\partial x_i}, \tag{9.44}$$
where:

(1) The $a^{ij}$ are $C^2$ functions on $\mathbf{R}^N$ all of whose derivatives of order $\le 2$ are bounded in $\mathbf{R}^N$, $a^{ij}(x) = a^{ji}(x)$ for $x \in \mathbf{R}^N$, and satisfy the condition
$$\sum_{i,j=1}^{N} a^{ij}(x)\,\xi_i \xi_j \ge 0 \quad \text{for all } (x, \xi) \in T^*(\mathbf{R}^N) = \mathbf{R}^N \times \mathbf{R}^N. \tag{10.3}$$

(2) The $b^i$ are $C^1$ functions on $\mathbf{R}^N$ with bounded derivatives in $\mathbf{R}^N$.

In this section we consider the following problem.

Problem 10.13 Let $D$ be a connected open subset of $\mathbf{R}^N$ and $x$ a point of $D$. Then determine the largest connected, relatively closed subset $D(x)$ of $D$, containing $x$, such that

If $u \in C^2(D)$, $Au(x) \ge 0$ in $D$ and $\sup_D u = M < \infty$, then $u(x) \equiv M$ throughout $D(x)$. (10.4)

The set $D(x)$ is called the propagation set of $x$ in $D$. We give a coordinate-free description of the propagation set $D(x)$ in terms of subunit vectors, introduced by Fefferman–Phong [57] (cf. Theorem 9.53 in Sect. 9.9).

10.3.1 Statement of Results

Following Fefferman–Phong [57], we say that a tangent vector
$$X = \sum_{j=1}^{N} \gamma^j\,\frac{\partial}{\partial x_j} \in T_x(D)$$
at $x \in D$ is subunit for the operator
$$A_0 = \sum_{i,j=1}^{N} a^{ij}(x)\,\frac{\partial^2}{\partial x_i \partial x_j}$$
if it satisfies the condition
$$\Bigl(\sum_{j=1}^{N} \gamma^j \eta_j\Bigr)^2 \le \sum_{i,j=1}^{N} a^{ij}(x)\,\eta_i \eta_j \quad \text{for all } \eta = \sum_{j=1}^{N} \eta_j\,dx_j \in T_x^*(D),$$
where $T_x^*(D)$ is the cotangent space of $D$ at $x$. We remark that this notion is coordinate-free. So we rotate the coordinate axes so that the matrix $\bigl(a^{ij}\bigr)$ is diagonalized at $x$:
$$\bigl(a^{ij}(x)\bigr) = \bigl(\lambda_i \delta_{ij}\bigr), \quad \lambda_1 > 0,\ \lambda_2 > 0, \ldots, \lambda_r > 0, \quad \lambda_{r+1} = \cdots = \lambda_N = 0,$$
where $r = \operatorname{rank}\bigl(a^{ij}(x)\bigr)$. Then it is easy to see that the vector $X$ is subunit for $A_0$ if and only if it is contained in the following ellipsoid of dimension $r$:
$$\frac{(\gamma^1)^2}{\lambda_1} + \cdots + \frac{(\gamma^r)^2}{\lambda_r} \le 1, \quad \gamma^{r+1} = \cdots = \gamma^N = 0. \tag{10.25}$$

A subunit trajectory is a Lipschitz path $\gamma\colon [t_1, t_2] \to D$ such that the tangent vector
$$\dot\gamma(t) = \frac{d}{dt}(\gamma(t))$$
is subunit for $A_0$ at $\gamma(t)$ for almost every $t$. We remark that if $\dot\gamma(t)$ is subunit for $A_0$, so is $-\dot\gamma(t)$; hence subunit trajectories are not oriented. We let
$$X_0(x) = \sum_{i=1}^{N} \Bigl(b^i(x) - \sum_{j=1}^{N} \frac{\partial a^{ij}}{\partial x_j}(x)\Bigr)\frac{\partial}{\partial x_i}.$$
The vector field $X_0(x)$ is called the drift vector field in probability theory, while it is the so-called subprincipal part of the operator $A$ in the theory of partial differential equations. A drift trajectory is a curve $\theta\colon [t_1, t_2] \to D$ such that
$$\dot\theta(t) = X_0(\theta(t)) \quad \text{on } [t_1, t_2],$$
and this curve is oriented in the direction of increasing $t$.

Now we can state our main result for the strong maximum principle:

Theorem 10.14 (the strong maximum principle). Let the differential operator $A$ of the form (10.2) satisfy the degenerate elliptic condition (10.3). Then the propagation set $D(x)$ of $x$ in $D$ contains the closure $D'(x)$ in $D$ of all points $y \in D$ that can be joined to $x$ by a finite number of subunit and drift trajectories.

Theorem 10.14 tells us that if the matrix $\bigl(a^{ij}\bigr)$ is non-degenerate at $x$, that is, if $r = \operatorname{rank}\bigl(a^{ij}(x)\bigr) = N$, then the maximum propagates in a neighborhood of $x$; but if the matrix $\bigl(a^{ij}\bigr)$ is degenerate at $x$, then the maximum propagates only in a "thin"


ellipsoid of dimension $r$ (cf. formula (10.25)) and in the direction of $X_0$. Now we see the reason why the strong maximum principle holds true for the Laplacian $\Delta$.

Stroock and Varadhan [177] characterized the support of the diffusion process corresponding to the operator $A$ (which is the closure of the collection of all possible trajectories of a Markovian particle, starting at $x$, with generator $A$) and, as one of its applications, they gave a (not coordinate-free) description of the propagation set $D(x)$. The next theorem asserts that our propagation set $D'(x)$ coincides with that of Stroock–Varadhan [178]:

Theorem 10.15 The propagation set $D'(x)$ of Theorem 10.14 coincides with the closure in $D$ of the points $\phi(t)$, $t \ge 0$, where $\phi\colon [0, t] \to D$ is a path for which there exists a piecewise $C^1$ function $\psi\colon [0, t] \to \mathbf{R}^N$ such that
$$\phi^i(s) = x_i + \int_0^s \sum_{j=1}^{N} a^{ij}(\phi(\tau))\,\psi_j(\tau)\,d\tau + \int_0^s \Bigl(b^i(\phi(\tau)) - \sum_{j=1}^{N} \frac{\partial a^{ij}}{\partial x_j}(\phi(\tau))\Bigr)d\tau, \quad 1 \le i \le N. \tag{10.26}$$

Remark 10.16 By Stroock–Varadhan [178, Theorem 4.1], we see that our propagation set $D'(x)$ is the largest subset of $D$ having the property (10.4) in some weak sense (see also [92, Chap. VI, Theorem 8.3], [177, 191]).

In the case where the operator $A$ is written as a sum of squares of vector fields, Hill [80] gave another (coordinate-free) description of a propagation set, although his proof was not complete. Hill's result is completely proved and extended to the non-linear case by Redheffer [148, Theorem 2] (see Bony [21, Théorème 2.1]). As a byproduct of Theorem 10.14, we can prove that our propagation set $D'(x)$ coincides with that of Hill [80, Theorem 1]. In order to formulate Hill's result ([80, Theorem 1]), we assume that the operator $A$ is written as a sum of squares of vector fields:
$$A = \sum_{k=1}^{r} Y_k^2 + Y_0, \tag{10.27}$$
where the $Y_k$ are real $C^2$ vector fields on $\mathbf{R}^N$ and $Y_0$ is a real $C^1$ vector field on $\mathbf{R}^N$. Hill's diffusion trajectory is a curve $\beta\colon [t_1, t_2] \to D$ such that
$$\dot\beta(t) = Y_k(\beta(t)), \quad \dot\beta(t) \ne 0 \quad \text{on } [t_1, t_2].$$
Hill's diffusion trajectories are not oriented; they may be traversed in either direction. Hill's drift trajectories are defined similarly, with $Y_k$ replaced by $Y_0$, but they are oriented in the direction of increasing $t$.


We can prove the following version of the strong maximum principle:

Theorem 10.17 (the strong maximum principle). Assume that the differential operator $A$ of the form (10.2) can be written as the sum (10.27) of squares of vector fields. Then the propagation set $D'(x)$ of Theorem 10.14 coincides with the closure in $D$ of all points $y \in D$ that can be joined to $x$ by a finite number of Hill's diffusion and drift trajectories.

Remark 10.18 Theorem 10.17 is implicitly proved by Stroock–Varadhan (cf. [177, Theorem 5.2]; [178, Theorem 3.2]), since the support of the diffusion process corresponding to the operator $A$ does not depend on the expression of $A$.

Theorem 10.14 may be reformulated in various ways. For example, we have the following version of the strong maximum principle:

Theorem 10.19 (the strong maximum principle). Let the differential operator $A$ of the form (10.10) satisfy the degenerate elliptic condition (10.11) and let $c(x)$ be a continuous function on $D$ such that $c(x) \le 0$ in $D$. If $u \in C^2(D)$, $(A + c(x))\,u \ge 0$ in $D$ and if $u$ attains its positive maximum $M$ at a point $x$ of $D$, then $u(x) \equiv M$ throughout $D'(x)$.

10.3.2 Preliminaries

First, we prove the following version of the weak maximum principle (cf. Theorem 10.6):

Theorem 10.20 (the weak maximum principle). Let the differential operator $A$ of the form (10.10) satisfy the degenerate elliptic condition (10.11) and let $c(x)$ be a continuous function on $D$ such that $c(x) \le 0$ in $D$. Assume that $u \in C^2(D)$, $Au(x) > 0$ in $D$ and $\sup_D u = M < +\infty$. Then the function $u(x)$ takes its maximum $M$ only on the boundary $\partial D$.

Proof The proof is based on a reduction to absurdity. Assume, to the contrary, that

the function $u(x)$ takes its maximum $M$ at a point $x_0$ of $D$.

Without loss of generality, we may choose a local coordinate system $y = (y_1, y_2, \ldots, y_N)$ in a neighborhood of $x_0$ such that


$$x_0 = \text{the origin}, \tag{10.28a}$$
$$A = \sum_{j,k=1}^{N} \alpha^{jk}(y)\,\frac{\partial^2}{\partial y_j \partial y_k} + \sum_{k=1}^{N} \beta^k(y)\,\frac{\partial}{\partial y_k} + c(y), \tag{10.28b}$$
with
$$\bigl(\alpha^{jk}(0)\bigr) = \begin{pmatrix} E_r & 0 \\ 0 & 0 \end{pmatrix}. \tag{10.28c}$$
Here $r = \operatorname{rank}\bigl(a^{ij}(x_0)\bigr)$ and $E_r$ is the $r \times r$ unit matrix. Since the function $u(x)$ takes its maximum $M$ at $x_0$, it follows from assumptions (10.28) that
$$\frac{\partial u}{\partial y_k}(0) = 0, \qquad \frac{\partial^2 u}{\partial y_k^2}(0) \le 0 \quad \text{for } 1 \le k \le N,$$
so that
$$Au(x_0) = \sum_{k=1}^{r} \frac{\partial^2 u}{\partial y_k^2}(0) \le 0.$$
This contradicts the hypothesis: $Au(x) > 0$ in $D$. The proof of Theorem 10.20 is complete. $\square$



Next we prove elementary lemmas on non-negative functions:

Lemma 10.21 Let $f(x)$ be a non-negative $C^2$ function on $\mathbf{R}$ such that
$$\sup_{x \in \mathbf{R}} |f''(x)| \le C \tag{10.29}$$
for some constant $C > 0$. Then we have the inequality
$$|f'(x)| \le \sqrt{2C f(x)} \quad \text{on } \mathbf{R}. \tag{10.30}$$

Proof In view of Taylor's formula, it follows that
$$0 \le f(y) = f(x) + f'(x)(y - x) + \frac{f''(\xi)}{2}(y - x)^2,$$
where $\xi$ is between $x$ and $y$. Thus, by letting $z = y - x$ we obtain from estimate (10.29) that
$$0 \le f(x) + f'(x)\,z + \frac{f''(\xi)}{2}\,z^2 \le f(x) + f'(x)\,z + \frac{C}{2}\,z^2,$$
so that
$$\frac{C}{2}\,z^2 + f'(x)\,z + f(x) \ge 0 \quad \text{for all } z \in \mathbf{R}.$$
Therefore, we have the inequality
$$f'(x)^2 - 2C f(x) \le 0.$$
This proves the desired inequality (10.30). The proof of Lemma 10.21 is complete. $\square$
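Inequality (10.30) can be sanity-checked numerically. The sketch below (an added illustration, not from the text) uses $f(x) = x^2 + 1$, for which $f'' \equiv 2$; with $C = 2$, inequality (10.30) asserts $|2x| \le 2\sqrt{x^2 + 1}$:

```python
import math

# Numerical check (illustrative) of Lemma 10.21 for f(x) = x^2 + 1:
# f is non-negative, f'' = 2, so with C = 2 inequality (10.30) reads
# |f'(x)| <= sqrt(2 * C * f(x)), i.e. |2x| <= 2 * sqrt(x^2 + 1).

C = 2.0
for i in range(-500, 501):
    x = i / 50.0                     # grid on [-10, 10]
    f, fp = x * x + 1.0, 2.0 * x
    assert abs(fp) <= math.sqrt(2.0 * C * f) + 1e-12
```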



Lemma 10.22 Let $f(x)$ be a non-negative $C^2$ function on $\mathbf{R}$ such that
$$\sup_{x \in \mathbf{R}} |f''(x)| \le 1. \tag{10.31}$$
Then we have the inequalities
$$\frac{1}{3}\bigl(y^2 + f(0)\bigr) \le y^2 + f(y) \le 2\bigl(y^2 + f(0)\bigr) \quad \text{on } \mathbf{R}. \tag{10.32}$$

Proof Since the derivative of the function $f(y) + f(-y)$ vanishes at $y = 0$, by using the Taylor expansion we obtain that
$$y^2 + f(y) \le y^2 + f(y) + f(-y) \le y^2 + 2 f(0) + \Bigl(\sup_{x \in \mathbf{R}} |f''(x)|\Bigr) y^2.$$
By virtue of estimate (10.31), this yields the inequality on the right-hand side of inequalities (10.32). On the other hand, we have, by the mean value theorem,
$$\begin{aligned} \frac{|f(0) + f(2y) - 2 f(y)|}{y^2} &= \frac{1}{|y|}\,\Bigl|\frac{f(2y) - f(y)}{y} - \frac{f(y) - f(0)}{y}\Bigr| = \frac{1}{|y|}\,|f'(z) - f'(w)| \\ &= \Bigl|\frac{z - w}{y}\Bigr| \cdot \Bigl|\frac{f'(z) - f'(w)}{z - w}\Bigr| \le \Bigl|\frac{z - w}{y}\Bigr| \sup_{x \in \mathbf{R}} |f''(x)|. \end{aligned} \tag{10.33}$$
Here $z$ is between $y$ and $2y$ and $w$ is between $0$ and $y$, and so
$$|z - w| \le 2|y|. \tag{10.34}$$
Therefore, in view of conditions (10.31) and (10.34) it follows from inequality (10.33) that
$$f(0) \le 2y^2 + 2 f(y) - f(2y) \le 2y^2 + 2 f(y).$$
This yields the inequality on the left-hand side of inequalities (10.32). The proof of Lemma 10.22 is complete. $\square$



As one of the applications of Lemma 10.21, we obtain the following lemmas on positive semi-definite quadratic forms:

Lemma 10.23 Let $a^{ij}(x)$ be bounded continuous functions on $\mathbf{R}^N$, and assume that
$$\sum_{i,j=1}^{N} a^{ij}(x)\,\xi_i \xi_j \ge 0 \quad \text{for all } x \in \mathbf{R}^N \text{ and } \xi \in \mathbf{R}^N.$$
Then we have, for $1 \le j \le N$,
$$\Bigl(\sum_{i=1}^{N} a^{ij}(x)\,\xi_i\Bigr)^2 \le a^{jj}(x) \sum_{k,\ell=1}^{N} a^{k\ell}(x)\,\xi_k \xi_\ell \tag{10.35}$$
for all $x \in \mathbf{R}^N$ and $\xi \in \mathbf{R}^N$.

Indeed, the desired inequality (10.35) is an immediate consequence of inequality (10.30) if we apply Lemma 10.21 to the function
$$\mathbf{R} \ni \xi_j \longmapsto \frac{1}{2}\sum_{k,\ell=1}^{N} a^{k\ell}(x)\,\xi_k \xi_\ell.$$
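Inequality (10.35) is a Cauchy–Schwarz-type bound and can be tested on random positive semi-definite matrices. The Python sketch below (an added illustration, not from the text) builds $A = B^{\mathsf T} B$ so that $A$ is automatically positive semi-definite:

```python
import random

# Illustrative check of Lemma 10.23: for a positive semi-definite matrix
# (a^{ij}) and any xi, (sum_i a^{ij} xi_i)^2 <= a^{jj} * (xi^T a xi).

random.seed(0)
N = 3
for _ in range(200):
    B = [[random.uniform(-1.0, 1.0) for _ in range(N)] for _ in range(N)]
    a = [[sum(B[k][i] * B[k][j] for k in range(N)) for j in range(N)]
         for i in range(N)]                      # a = B^T B is PSD
    xi = [random.uniform(-1.0, 1.0) for _ in range(N)]
    quad = sum(a[k][l] * xi[k] * xi[l] for k in range(N) for l in range(N))
    for j in range(N):
        lhs = sum(a[i][j] * xi[i] for i in range(N)) ** 2
        assert lhs <= a[j][j] * quad + 1e-9
```

This is exactly the Cauchy–Schwarz inequality $(e_j^{\mathsf T} A \xi)^2 \le (e_j^{\mathsf T} A e_j)(\xi^{\mathsf T} A \xi)$ in the semi-inner product defined by $A$.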

Lemma 10.24 Assume that the $a^{ij}(x)$ are $C^2$ functions on $\mathbf{R}^N$ all of whose second derivatives are bounded on $\mathbf{R}^N$, and that
$$\sum_{i,j=1}^{N} a^{ij}(x)\,\xi_i \xi_j \ge 0 \quad \text{for all } x \in \mathbf{R}^N \text{ and } \xi \in \mathbf{R}^N.$$
Then we have, for $1 \le k \le N$,


$$\Bigl|\sum_{i,j=1}^{N} \frac{\partial a^{ij}}{\partial x_k}(x)\,\lambda_i \mu_j\Bigr| \le C\Bigl\{|\mu|\Bigl(\sum_{i,j=1}^{N} a^{ij}(x)\,\lambda_i \lambda_j\Bigr)^{1/2} + |\lambda|\Bigl(\sum_{i,j=1}^{N} a^{ij}(x)\,\mu_i \mu_j\Bigr)^{1/2}\Bigr\} \tag{10.36}$$
for all $x \in \mathbf{R}^N$ and $\lambda$, $\mu \in \mathbf{R}^N$, where $C > 0$ is a constant depending only on the bounds on the second derivatives of the $a^{ij}$.

Proof Let $x_0$ be an arbitrary point of $\mathbf{R}^N$. We may assume that the matrix $\bigl(a^{ij}(x_0)\bigr)$ is diagonal. Then, by applying Lemma 10.21 to the function
$$\mathbf{R} \ni x_k \longmapsto \frac{1}{2}\sum_{i,j=1}^{N} a^{ij}(x)\,\lambda_i \lambda_j,$$
we obtain that
$$\Bigl(\sum_{i,j=1}^{N} \frac{\partial a^{ij}}{\partial x_k}(x)\,\lambda_i \lambda_j\Bigr)^2 \le 2 \sup_{\substack{x \in \mathbf{R}^N \\ 1 \le \ell, m \le N}} \Bigl|\sum_{i,j=1}^{N} \frac{\partial^2 a^{ij}}{\partial x_\ell \partial x_m}(x)\,\lambda_i \lambda_j\Bigr| \cdot \Bigl(\sum_{i,j=1}^{N} a^{ij}(x)\,\lambda_i \lambda_j\Bigr). \tag{10.37}$$
Thus, by taking $x := x_0$ and $\lambda := e_i$ in inequality (10.37), where $e_i$ is the $i$-th coordinate vector, we have the inequality
$$\Bigl|\frac{\partial a^{ii}}{\partial x_k}(x_0)\Bigr| \le C_1 \bigl(a^{ii}(x_0)\bigr)^{1/2}. \tag{10.38}$$
Furthermore, by taking $\lambda := e_i + e_j$, $i \ne j$, we have the inequality
$$\Bigl|\frac{\partial a^{ii}}{\partial x_k}(x_0) + 2\,\frac{\partial a^{ij}}{\partial x_k}(x_0) + \frac{\partial a^{jj}}{\partial x_k}(x_0)\Bigr| \le C_2 \bigl(a^{ii}(x_0) + a^{jj}(x_0)\bigr)^{1/2},$$
and so
$$\begin{aligned} 2\,\Bigl|\frac{\partial a^{ij}}{\partial x_k}(x_0)\Bigr| &\le C_2 \bigl(a^{ii}(x_0) + a^{jj}(x_0)\bigr)^{1/2} + \Bigl|\frac{\partial a^{ii}}{\partial x_k}(x_0)\Bigr| + \Bigl|\frac{\partial a^{jj}}{\partial x_k}(x_0)\Bigr| \\ &\le C_2 \bigl(a^{ii}(x_0) + a^{jj}(x_0)\bigr)^{1/2} + C_1\Bigl(\bigl(a^{ii}(x_0)\bigr)^{1/2} + \bigl(a^{jj}(x_0)\bigr)^{1/2}\Bigr) \\ &\le C_3 \bigl(a^{ii}(x_0) + a^{jj}(x_0)\bigr)^{1/2}. \end{aligned} \tag{10.39}$$
Here $C_1$, $C_2$, $C_3$ are positive constants depending only on the bounds on the second derivatives of the $a^{ij}(x)$. Therefore, it follows from inequalities (10.38) and (10.39) that
$$\begin{aligned} \Bigl|\sum_{i,j=1}^{N} \frac{\partial a^{ij}}{\partial x_k}(x_0)\,\lambda_i \mu_j\Bigr| &\le \sum_{i,j=1}^{N} \Bigl|\frac{\partial a^{ij}}{\partial x_k}(x_0)\Bigr|\,|\lambda_i|\,|\mu_j| \\ &\le C_4 \sum_{i,j=1}^{N} \bigl(a^{ii}(x_0) + a^{jj}(x_0)\bigr)^{1/2}\,|\lambda_i|\,|\mu_j| \\ &\le N C_4 \Bigl[\sum_{i,j=1}^{N} \bigl(a^{ii}(x_0) + a^{jj}(x_0)\bigr)\,\lambda_i^2 \mu_j^2\Bigr]^{1/2} \\ &= N C_4 \Bigl[\Bigl(\sum_{i=1}^{N} a^{ii}(x_0)\,\lambda_i^2\Bigr)\Bigl(\sum_{j=1}^{N} \mu_j^2\Bigr) + \Bigl(\sum_{j=1}^{N} a^{jj}(x_0)\,\mu_j^2\Bigr)\Bigl(\sum_{i=1}^{N} \lambda_i^2\Bigr)\Bigr]^{1/2} \\ &\le N C_4 \Bigl\{\Bigl(\sum_{i,j=1}^{N} a^{ij}(x_0)\,\lambda_i \lambda_j\Bigr)^{1/2}|\mu| + \Bigl(\sum_{i,j=1}^{N} a^{ij}(x_0)\,\mu_i \mu_j\Bigr)^{1/2}|\lambda|\Bigr\}, \end{aligned} \tag{10.40}$$
since the matrix $\bigl(a^{ij}(x_0)\bigr)$ is diagonal. Here
$$C_4 = \max\Bigl\{C_1, \frac{C_3}{2}\Bigr\}.$$
Inequality (10.40) proves the desired inequality (10.36) with
$$C := N C_4 = N \max\Bigl\{C_1, \frac{C_3}{2}\Bigr\}.$$
The proof of Lemma 10.24 is complete. $\square$

We prove an approximation theorem for integral curves of vector fields. To do this, we need two elementary lemmas.

Lemma 10.25 (Gronwall). Assume that $y(t)$ is an absolutely continuous function on $\mathbf{R}$, and that there exist two continuous functions $f(t)$ and $g(t)$ on $\mathbf{R}$ such that
$$y'(t) + f(t)\,y(t) \le g(t) \quad \text{almost everywhere in } \mathbf{R}. \tag{10.41}$$
Then we have the inequality
$$y(t) \le \Bigl(y(0) + \int_0^t g(s) \exp\Bigl(\int_0^s f(\sigma)\,d\sigma\Bigr)ds\Bigr) \exp\Bigl(-\int_0^t f(s)\,ds\Bigr) \tag{10.42}$$
on $\mathbf{R}$.

Proof It follows from inequality (10.41) that
$$\frac{d}{dt}\Bigl(y(t)\exp\Bigl(\int_0^t f(s)\,ds\Bigr)\Bigr) \le g(t)\exp\Bigl(\int_0^t f(s)\,ds\Bigr)$$
almost everywhere in $\mathbf{R}$. Hence, by integrating with respect to $t$ we obtain the desired inequality (10.42). The proof of Lemma 10.25 is complete. $\square$

Lemma 10.26 Let $Z(x)$ be a Lipschitz continuous vector field on $\mathbf{R}^N$ and let $\omega(t)$ be a bounded continuous function on $\mathbf{R}$. Assume that $x(t)$ is a unique solution of the initial-value problem
$$\begin{cases} \dot x(t) = Z(x(t)) & \text{in } \mathbf{R}, \\ x(0) = x_0 \in \mathbf{R}^N, \end{cases} \tag{10.43}$$
and that $y(t)$ is a piecewise $C^1$ function on $\mathbf{R}$ satisfying the conditions
$$\begin{cases} \dot y(t) = Z(y(t)) + \omega(t) & \text{almost everywhere in } \mathbf{R}, \\ y(0) = x_0 \in \mathbf{R}^N. \end{cases} \tag{10.44}$$
Then we have the estimate
$$|x(t) - y(t)| \le \frac{\varepsilon}{K}\bigl(e^{Kt} - 1\bigr) \quad \text{on } \mathbf{R}, \tag{10.45}$$
where $\varepsilon = \sup_{t \in \mathbf{R}} |\omega(t)|$ and $K$ is the Lipschitz constant for the vector field $Z(x)$.

Proof We let
$$u(t) := |x(t) - y(t)|.$$
We observe that the function $u(t)$ is absolutely continuous, since $x(t)$ and $y(t)$ are piecewise $C^1$ functions. Thus, in view of conditions (10.43) and (10.44) it follows that

10.3 Propagation of Maxima


u̇(t) ≤ |ẋ(t) − ẏ(t)| ≤ |Z(x(t)) − Z(y(t))| + |ω(t)| ≤ K |x(t) − y(t)| + ε = K u(t) + ε

almost everywhere in R. Therefore, the desired inequality (10.45) follows from an application of Lemma 10.25, since u(0) = 0. The proof of Lemma 10.26 is complete. □

Now we can prove an approximation theorem for integral curves of vector fields, essentially due to Bony [21]:

Theorem 10.27 (Bony). Let X_1(x), X_2(x), ..., X_m(x) be Lipschitz continuous vector fields on R^N and let

Z(x) = Σ_{k=1}^m λ_k(x) X_k(x),

where the λ_k(x) are real-valued C¹ functions on R^N. Then each integral curve of the vector field Z(x) can be approximated uniformly by piecewise differentiable curves, each differentiable arc of which is an integral curve of one of the vector fields X_k(x).

Proof It suffices to prove the theorem in the case m = 2:

Z(x) = λ_1(x) X_1(x) + λ_2(x) X_2(x).

We consider a piecewise differentiable curve x(t) defined by the following formulas:

x(0) = x_0,  (10.46a)
ẋ(t) = λ_1(x(2kθ)) X_1(x(t)) for 2kθ ≤ t ≤ (2k+1)θ,  (10.46b)
ẋ(t) = λ_2(x(2kθ)) X_2(x(t)) for (2k+1)θ ≤ t ≤ (2k+2)θ,  (10.46c)

where θ is a positive parameter and k ranges over all integers. Furthermore, we let y(t) be the polygonal line defined by the following:

y(t) := x(2kθ) + ((t − kθ)/θ) ( x((2k+2)θ) − x(2kθ) ), kθ ≤ t ≤ (k+1)θ.  (10.47)

Then, by virtue of the Taylor expansion, it follows from formulas (10.47) and (10.46) that we have, for kθ ≤ t ≤ (k+1)θ,


ẏ(t) = (1/θ) ( x((2k+2)θ) − x(2kθ) )  (10.48)
 = ( x((2k+1)θ) − x(2kθ) )/θ + ( x((2k+2)θ) − x((2k+1)θ) )/θ
 = ẋ(2kθ) + ẋ((2k+1)θ) + an error term of order θ
 = λ_1(x(2kθ)) X_1(x(2kθ)) + λ_2(x(2kθ)) X_2(x((2k+1)θ)) + an error term of order θ.

However, by the mean value theorem, we have, for kθ ≤ t ≤ (k+1)θ,

|y(t) − x(2kθ)| = ((t − kθ)/θ) |x((2k+2)θ) − x(2kθ)|  (10.49)
 ≤ |x((2k+2)θ) − x(2kθ)|
 ≤ |x((2k+2)θ) − x((2k+1)θ)| + |x((2k+1)θ) − x(2kθ)|
 = a term of order θ,

and also

|y(t) − x((2k+1)θ)| ≤ |y(t) − x(2kθ)| + |x(2kθ) − x((2k+1)θ)| = a term of order θ.  (10.50)

Therefore, combining formulas (10.48), (10.49) and (10.50) we find that

ẏ(t) = Z(y(t)) + an error term of order θ almost everywhere in R.

In view of Lemma 10.26, this implies that, as θ ↓ 0, the polygonal line y(t) converges uniformly to the integral curve of Z(x) issuing from x_0. Since the distance between x(t) and y(t) tends to zero as θ ↓ 0, it follows that, as θ ↓ 0, the piecewise differentiable curve x(t), defined by formulas (10.46), converges uniformly to the integral curve of Z(x) issuing from x_0. The proof of Theorem 10.27 is complete. □

Finally, we study the behavior of integral curves of vector fields with small initial data.

Lemma 10.28 Assume that

X(x) = Σ_{i=1}^N a^i(x) ∂/∂x_i

is a C¹ vector field on R^N such that


X = ∂/∂x_1 at x = 0.

Let x(t, y) = (x_1(t, y), x_2(t, y), ..., x_N(t, y)) be the unique solution of the initial-value problem

ẋ(t, y) = X(x(t, y)),  x(0, y) = y ∈ R^N.  (10.51)

Then we have, as |t| + |y| → 0,

x_1(t, y) = y_1 + t + (1/2) ∂a^1/∂x_1(0) t² + Σ_{j=1}^N ∂a^1/∂x_j(0) y_j t + o(|t|² + |y|²),  (10.52a)
x_i(t, y) = y_i + (1/2) ∂a^i/∂x_1(0) t² + Σ_{j=1}^N ∂a^i/∂x_j(0) y_j t + o(|t|² + |y|²)  (10.52b)

for 2 ≤ i ≤ N.

Proof We let

w_i(t, y) := x_i(t, y) − y_i − (1/2) ∂a^i/∂x_1(0) t² − Σ_{j=1}^N ∂a^i/∂x_j(0) y_j t for 1 ≤ i ≤ N.

Then it follows from condition (10.51) that

w_i(0, y) = 0 for all 1 ≤ i ≤ N,  (10.53)

and further that

ẇ_i(t, y) = a^i(x(t, y)) − ∂a^i/∂x_1(0) t − Σ_{j=1}^N ∂a^i/∂x_j(0) y_j  (10.54)
 = δ_{i1} + ( a^i(x(t, y)) − a^i(0) − Σ_{j=1}^N ∂a^i/∂x_j(0) x_j(t, y) )
  + ∂a^i/∂x_1(0) ( (x_1(t, y) − x_1(0, y))/t − 1 ) t
  + Σ_{j=2}^N ∂a^i/∂x_j(0) ( x_j(t, y) − y_j ) for all 1 ≤ i ≤ N,

since a^i(0) = δ_{i1} and x_1(0, y) = y_1.


By using the mean value theorem, we can estimate each term on the right-hand side of formula (10.54) as follows:

• a^i(x(t, y)) − a^i(0) − Σ_{j=1}^N ∂a^i/∂x_j(0) x_j(t, y) = O(|x(t, y)|²) = o(|t| + |y|),

since x_j(t, y) = y_j + t a^j(x(s, y)) for some s between 0 and t.

• ( (x_1(t, y) − x_1(0, y))/t − 1 ) t = ( a^1(x(s, y)) − 1 ) t = o(|t| + |y|), since a^1(0) = 1.

• x_j(t, y) − y_j = t a^j(x(s, y)) = o(|t| + |y|) for all 2 ≤ j ≤ N, since a^j(0) = 0.

Summing up, we can rewrite formula (10.54) in the form

ẇ_i(t, y) = δ_{i1} + o(|t| + |y|).  (10.55)

Hence it follows from formulas (10.53) and (10.55) that

w_i(t, y) = ∫_0^t ẇ_i(s, y) ds = δ_{i1} t + o(|t|² + |y|²) for all 1 ≤ i ≤ N.

This proves the desired formulas (10.52). The proof of Lemma 10.28 is complete. □
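The alternating-flow construction (10.46) in Theorem 10.27 can be exercised numerically. The constant fields X_1 = ∂/∂x_1, X_2 = ∂/∂x_2 with λ_1(x) = 1 and λ_2(x) = x_1 are an assumed toy example (so Z = ∂/∂x_1 + x_1 ∂/∂x_2, whose integral curve from the origin is z(t) = (t, t²/2)); the sketch checks that the endpoint error shrinks as θ ↓ 0:

```python
import numpy as np

# Toy instance of the scheme (10.46): X1 = d/dx1, X2 = d/dx2,
# lambda1(x) = 1, lambda2(x) = x1, so Z(x) = (1, x1) and the
# integral curve from the origin is z(t) = (t, t^2/2).
def bony_endpoint(theta, T=1.0):
    x = np.zeros(2)
    n = int(round(T / theta))
    for _ in range(n):                      # one window: flow X1, then X2,
        base = x.copy()                     # coefficients frozen at x(2k*theta)
        x = x + theta * np.array([1.0, 0.0])        # lambda1(base) * X1 for time theta
        x = x + theta * np.array([0.0, base[0]])    # lambda2(base) * X2 for time theta
    return x

exact = np.array([1.0, 0.5])
errs = [np.linalg.norm(bony_endpoint(th) - exact) for th in (0.2, 0.1, 0.05)]
assert errs[0] > errs[1] > errs[2]          # error decreases as theta decreases
assert errs[2] < 0.05
```

Each halving of θ roughly halves the uniform error, consistent with the "error term of order θ" in the proof.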

10.3.3 Proof of Theorem 10.14

We use a modification of the techniques originally introduced by Hopf [82] for elliptic operators and later adapted by Bony [21] for degenerate elliptic ones (cf. Hill [80], Redheffer [148], Ole˘ınik–Radkeviˇc [138], Amano [13]). Before the proof of Theorem 10.14, we summarize these techniques in the form of lemmas (Lemma 10.29 and Lemma 10.31 below).

Let F be a (relatively) closed subset of D. Following Bony [21], we say that a vector ν is normal to the set F at one of its points, x_0, if there exists an open ball Q, contained in the set D \ F and centered at x_1, such that (see Fig. 10.3 below)

the point x_0 is on the boundary of the ball Q;
ν = s(x_1 − x_0) with s > 0.

Fig. 10.3 The vector ν is normal to the set F at x_0 in the sense of Bony [21]

The next lemma, essentially due to Bony [21], will play a fundamental role in the proof of Theorem 10.14:

Lemma 10.29 Let X(x) be a Lipschitz continuous vector field on R^N and let x(t) be an integral curve of X. Assume that, at each point x_0 of the set F, the inner product ⟨X(x_0), ν⟩ is non-positive for any vector ν normal to F at x_0:

⟨X(x_0), ν⟩ ≤ 0.  (10.56)

If x(t_0) ∈ F for some t_0, then it follows that x(t) ∈ F for all t ≥ t_0.

Remark 10.30 If ⟨X(x_0), ν⟩ = 0 for any vector ν normal to F at x_0, then we can replace t by −t, and deduce that x(t) ∈ F for all t, not just for t ≥ t_0.

Proof of Lemma 10.29 Our proof is based on a reduction to absurdity. We let

δ(t) = inf_{z∈F} |x(t) − z|,

and assume, to the contrary, that

δ(t) > 0 for t_0 < t ≤ t_1.  (10.57)

Here recall that the set F is a (relatively) closed subset of D.

Step 1: First, we show that

lim inf_{h↓0} ( δ(t−h)² − δ(t)² )/h ≥ −2K δ(t)² for t_0 < t ≤ t_1,  (10.58)

where K is the Lipschitz constant for the vector field X(x).


Let {h_n}_{n=1}^∞ be a sequence, h_n ↓ 0, such that

lim_{n→∞} ( δ(t−h_n)² − δ(t)² )/h_n = lim inf_{h↓0} ( δ(t−h)² − δ(t)² )/h,  (10.59)

and let y_n be the projection on the set F of the point x(t − h_n):

|x(t−h_n) − y_n| = inf_{z∈F} |x(t−h_n) − z| = δ(t−h_n).  (10.60)

Now we remark that we can choose a sufficiently small open ball B, centered at x(t), such that its closure B̄ in R^N is contained in D. Since the set B̄ ∩ F is compact and x(t−h_n) ∈ B for sufficiently large n, by passing to a subsequence we may assume that the sequence {y_n} converges to some point y of F. Then it follows from formula (10.60) that

|x(t) − y| = inf_{z∈F} |x(t) − z| = δ(t).  (10.61)

In other words, the limit point y is the projection on F of the point x(t). Thus we have, by the mean value theorem,

( δ(t−h_n)² − δ(t)² )/h_n = ( |x(t−h_n) − y_n|² − |x(t) − y|² )/h_n  (10.62)
 ≥ ( |x(t−h_n) − y_n|² − |x(t) − y_n|² )/h_n
 = −2 ⟨x(t_n) − y_n, X(x(t_n))⟩,

where t − h_n < t_n < t. By virtue of formula (10.59), we can let n → ∞ in inequality (10.62) to obtain that

lim inf_{h↓0} ( δ(t−h)² − δ(t)² )/h ≥ −2 ⟨x(t) − y, X(x(t))⟩  (10.63)
 = −2 ⟨x(t) − y, X(y)⟩ − 2 ⟨x(t) − y, X(x(t)) − X(y)⟩.

However, we have, by hypothesis (10.56),

⟨x(t) − y, X(y)⟩ ≤ 0,

since formula (10.61) implies that the vector x(t) − y is normal to F at y. Hence, by using Schwarz's inequality we obtain from inequality (10.63) that

lim inf_{h↓0} ( δ(t−h)² − δ(t)² )/h ≥ −2 ⟨x(t) − y, X(x(t)) − X(y)⟩
 ≥ −2 |x(t) − y| |X(x(t)) − X(y)|
 ≥ −2K |x(t) − y|² = −2K δ(t)².

This proves the desired inequality (10.58).

Step 2: Next we show that if f is a continuous function on the closed interval [t_0, t_1] such that

lim inf_{h↓0} ( f(t−h) − f(t) )/h ≥ −C for t_0 < t ≤ t_1,  (10.64a)
f(t_0) = 0,  (10.64b)

then we have the inequality

f(t) ≤ C (t − t_0) on [t_0, t_1].  (10.65)

Here C is a non-negative constant.

The proof is based on a reduction to absurdity. Assume, to the contrary, that there exists a point s ∈ [t_0, t_1] such that

f(s) > C (s − t_0).  (10.66)

We remark that s ≠ t_0, since f(t_0) = 0, and further that f(s) > 0, since C ≥ 0. We let

Φ(t) := f(t) − ( f(s)/(s − t_0) ) (t − t_0),

and let s_0 be a point of [t_0, s] at which the function Φ(t) attains its non-negative maximum on [t_0, s]. We may take s_0 ≠ t_0, since Φ(s) = 0. Then we have, by inequalities (10.64) and (10.66),

0 ≥ lim inf_{h↓0} ( Φ(s_0−h) − Φ(s_0) )/h
 = lim inf_{h↓0} ( f(s_0−h) − f(s_0) )/h + f(s)/(s − t_0)
 ≥ −C + f(s)/(s − t_0) > 0.

This is a contradiction.


Step 3: Now the proof of Lemma 10.29 is easy. We let

θ := min( t_1 − t_0, 1/(4K) ).

Then we find from inequality (10.58) that the function f(t) := δ(t)² satisfies hypothesis (10.64) with

t_1 := t_0 + θ, C := 2K max_{t_0≤s≤t_0+θ} δ(s)².

Thus it follows from inequality (10.65) that we have, for all t_0 ≤ t ≤ t_0 + θ,

δ(t)² ≤ 2K max_{t_0≤s≤t_0+θ} δ(s)² · (t − t_0) ≤ 2Kθ max_{t_0≤s≤t_0+θ} δ(s)² ≤ (1/2) max_{t_0≤s≤t_0+θ} δ(s)²,

and so δ(t) ≡ 0 for all t_0 ≤ t ≤ t_0 + θ. This contradicts the assumption (10.57). The proof of Lemma 10.29 is complete. □
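Gronwall's inequality (10.42), which underlies both Lemma 10.26 and the comparison made in Steps 1–3 above, can be checked numerically on an assumed example where (10.41) holds with equality (f ≡ 1, g ≡ 1), so the bound is attained exactly:

```python
import numpy as np

# Lemma 10.25 with f = g = 1 and y' + y = 1, y(0) = 2:
# the exact solution y(t) = 1 + e^{-t} must coincide with the bound (10.42).
t = np.linspace(0.0, 5.0, 101)
y = 1.0 + np.exp(-t)                             # solves y' + y = 1, y(0) = 2
bound = (2.0 + (np.exp(t) - 1.0)) * np.exp(-t)   # ( y(0) + int_0^t g e^{int f} ds ) e^{-int f}
assert np.allclose(y, bound)
```

For a strict inequality in (10.41) the computed `bound` would lie strictly above `y`.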

Construction of Barriers

Next we prove a lemma on the construction of "barriers":

Lemma 10.31 Assume that u ∈ C²(D), Au(x) ≥ 0 in D, sup_D u = M < +∞ and u(x_0) = M for some point x_0 of D. If there exists a C² function v(x) on D such that v(x_0) = 0, grad v(x_0) ≠ 0 and Av(x_0) > 0, then, for any sufficiently small neighborhood U of x_0, the function u(x) attains its maximum M at some point of the set {x ∈ ∂U : v(x) > 0}. Here ∂U denotes the boundary of U.

Proof Since v(x_0) = 0 and grad v(x_0) ≠ 0, we can construct a C² function V(x) from v(x) such that:

V(x_0) = 0, grad V(x_0) ≠ 0, AV(x_0) > 0,
{ x ∈ D : V(x) ≥ 0 } \ {x_0} ⊂ { x ∈ D : v(x) > 0 }.

Thus, in order to prove the lemma, it suffices to show the following assertion: if we choose a sufficiently small open ball B centered at x_0, then, for any neighborhood U of x_0 contained in B, we have the assertion


sup { u(x) : x ∈ ∂U, V(x) ≥ 0 } = M.  (10.67)

The proof of assertion (10.67) is based on a reduction to absurdity. Assume, to the contrary, that for any open ball B centered at x_0 there exists a neighborhood U of x_0, contained in B, such that

sup { u(x) : x ∈ ∂U, V(x) ≥ 0 } < M.

Then we can find a neighborhood σ of the set ∂U ∩ {V ≥ 0} and a sufficiently small constant ε > 0 such that

u(x) + εV(x) < M for all x ∈ σ.  (10.68)

Also, since there exists a constant δ > 0 such that V(x) ≤ −δ for all x ∈ ∂U \ σ, it follows that

u(x) + εV(x) < M for all x ∈ ∂U \ σ.  (10.69)

Therefore, we obtain from inequalities (10.68) and (10.69) that

u(x) + εV(x) < M on ∂U.

On the other hand, since AV(x_0) > 0, choosing the ball B sufficiently small, we may assume that AV(x) > 0 in U. Then we have

A(u + εV) ≥ εAV > 0 in U.

Therefore, by applying Theorem 10.20 to the function u(x) + εV(x), we obtain that the function u(x) + εV(x) may attain its maximum only on the boundary ∂U. However, this is a contradiction, since we have the assertions

u(x_0) + εV(x_0) = u(x_0) = M,
u(x) + εV(x) < M on ∂U.

This contradiction proves the desired assertion (10.67) and hence Lemma 10.31. The proof of Lemma 10.31 is complete. □

End of Proof of Theorem 10.14

Assume that u ∈ C²(D), Au(x) ≥ 0 in D, sup_D u = M < +∞ and u(x) = M for some point x of D. We let


F = { y ∈ D : u(y) = M }.

We remark that, by the continuity of u, the set F is a (relatively) closed subset of D.

Step 1: First, we prove that the maximum M propagates along subunit trajectories. To do this, in view of Lemma 10.29 and Remark 10.30, it suffices to show the following:

Lemma 10.32 Let x_0 be a point of the set F and let γ be a subunit vector for the operator

A_0 = Σ_{i,j=1}^N a^{ij}(x) ∂²/∂x_i∂x_j

at x_0. Then we have the assertion

⟨γ, ν⟩ = 0 for every vector ν normal to F at x_0.

Proof The proof is based on a reduction to absurdity. Assume, to the contrary, that we have the assertion

⟨γ, ν⟩ = s Σ_{j=1}^N γ_j ( x_1^j − x_0^j ) ≠ 0 for some vector ν normal to F at x_0,

where ν = s(x_1 − x_0), s > 0, and x_1 is the center of some open ball Q contained in D \ F, and x_0 is on the boundary of Q as in Fig. 10.3. Then, since the vector γ is subunit for A_0 at x_0, it follows that

0 < ⟨γ, ν⟩²/s² = ( Σ_{j=1}^N γ_j (x_1^j − x_0^j) )² ≤ Σ_{i,j=1}^N a^{ij}(x_0) (x_1^i − x_0^i)(x_1^j − x_0^j).  (10.70)

Now we consider the barrier function

v(x) := e^{−q|x−x_1|²} − e^{−q|x_1−x_0|²},

where q is a positive constant. Then we have v(x_0) = 0 and

grad v(x_0) = 2q (x_1 − x_0) e^{−q|x_1−x_0|²} ≠ 0,

and further, by inequality (10.70),

Av(x_0) = 2q e^{−q|x_1−x_0|²} ( 2q Σ_{i,j=1}^N a^{ij}(x_0)(x_1^i − x_0^i)(x_1^j − x_0^j) − Σ_{i=1}^N a^{ii}(x_0) + Σ_{i=1}^N b^i(x_0)(x_1^i − x_0^i) ) > 0,

if we choose the constant q sufficiently large. Therefore, by applying Lemma 10.31 to our situation we obtain that, for any sufficiently small neighborhood U of x_0, the function u(x) attains its maximum M at some point of the set

{ x ∈ ∂U : v(x) > 0 } = ∂U ∩ Q.

This contradicts the assumption Q ⊂ D \ F. The proof of Lemma 10.32 is complete. □

Step 2: Next we prove that the maximum M propagates along drift trajectories. To do this, in view of Lemma 10.29, it suffices to show the following lemma due to Redheffer [148, Sect. 11]:

Lemma 10.33 (Redheffer). The drift vector field

X_0(x) = Σ_{i=1}^N ( b^i(x) − Σ_{j=1}^N ∂a^{ij}/∂x_j(x) ) ∂/∂x_i

satisfies condition (10.56) in Lemma 10.29. Namely, at each point x_0 of the set F, the inner product ⟨X_0(x_0), ν⟩ is non-positive for any vector ν normal to F at x_0:

⟨X_0(x_0), ν⟩ ≤ 0.  (10.71)

Proof We divide the proof of Lemma 10.33 into four steps.

Step (1): The proof is based on a reduction to absurdity. Assume, to the contrary, that for some vector ν normal to F at x_0 we have the condition

⟨X_0(x_0), ν⟩ = s Σ_{i=1}^N ( b^i(x_0) − Σ_{j=1}^N ∂a^{ij}/∂x_j(x_0) ) ( x_1^i − x_0^i ) > 0,  (10.72)

where ν = s(x_1 − x_0), s > 0, and x_1 is the center of some open ball Q contained in D \ F, and x_0 is on the boundary of Q as in Fig. 10.3.


First, it follows from inequality (10.63) in Lemma 10.23 that the tangent vector

( Σ_{j=1}^N a^{kj}(x_0)(x_1^j − x_0^j) / ( Σ_{i,j=1}^N a^{ij}(x_0)(x_1^i − x_0^i)(x_1^j − x_0^j) )^{1/2} )_{1≤k≤N}  (10.73)

is well defined. Furthermore, by using Schwarz's inequality we find that the tangent vector (10.73) is subunit for the operator

A_0 = Σ_{i,j=1}^N a^{ij} ∂²/∂x_i∂x_j

at x_0. Hence we have, by Lemma 10.32,

Σ_{k,j=1}^N a^{kj}(x_0) ( x_1^k − x_0^k )( x_1^j − x_0^j ) = 0.  (10.74)

Therefore, without loss of generality, we may choose a local coordinate system y = (y_1, y_2, ..., y_N) in a neighborhood of x_0 such that

x_0 = the origin,  (10.75a)
x_1 − x_0 = (0, ..., 0, 1),  (10.75b)
( a^{ij}(x_0) ) = ( E_r 0 ; 0 0 ).  (10.75c)

Here we remark by condition (10.74) that r = rank( a^{ij}(x_0) ) < N, and E_r is the r × r unit matrix. Indeed, we have only to choose coordinates so that x_0 = 0 and so that the vector x_1 − x_0 is directed along the positive x_N-axis, and then rotate the coordinates, keeping the x_N-axis fixed, so that the matrix ( a^{ij}(x_0) ) is diagonalized. Then the condition (10.72) is expressed as follows:

b^N(0) − Σ_{j=1}^N ∂a^{Nj}/∂y_j(0) > 0.  (10.76)

However, we have, by inequality (10.64) in Lemma 10.24,

Σ_{i,j=r+1}^N ∂a^{ij}/∂y_k(0) λ_i μ_j = 0 for all λ_i, μ_j ∈ R, if 1 ≤ k ≤ N.


This implies that

∂a^{Nj}/∂y_k(0) = 0 if r+1 ≤ j ≤ N and 1 ≤ k ≤ N.  (10.77)

Thus the condition (10.76) can be simplified in the following form:

b^N(0) − Σ_{j=1}^r ∂a^{Nj}/∂y_j(0) > 0.  (10.78)

Step (2): Now we consider the function

v(y) = y_N − ( (1/2) Σ_{i=1}^r ∂a^{Ni}/∂y_i(0) y_i² + Σ_{i=1}^r Σ_{j=i+1}^N ∂a^{Ni}/∂y_j(0) y_i y_j ) − c Σ_{i=1}^r y_i² − C Σ_{i=r+1}^N y_i²,

where c and C are positive constants to be chosen later on. Then it follows that

v(0) = 0, grad v(0) = (0, ..., 0, 1) ≠ 0,

and further, from conditions (10.75a), (10.75b), (10.75c) and (10.78), that

Av(0) = −2rc + b^N(0) − Σ_{i=1}^r ∂a^{Ni}/∂y_i(0) > 0,

if we choose the constant c sufficiently small. Therefore, by applying Lemma 10.31 we obtain that, for any sufficiently small ε > 0, the function u(x) attains its maximum M at some point z of the set { y ∈ D : |y| < ε, v(y) > 0 }; so that z ∈ F. Since v(z) > 0, it follows that the point z = (z_1, z_2, ..., z_N) satisfies the inequality

z_N > (1/2) Σ_{i=1}^r ∂a^{Ni}/∂y_i(0) z_i² + Σ_{i=1}^r Σ_{j=i+1}^N ∂a^{Ni}/∂y_j(0) z_i z_j + c Σ_{i=1}^r z_i² + C Σ_{i=r+1}^N z_i².  (10.79)

Step (3): We let


X_i := Σ_{j=1}^N a^{ij} ∂/∂y_j for 1 ≤ i ≤ N,

and consider a chain of integral curves

y^{(i)}(t) = ( y_1^{(i)}(t), ..., y_N^{(i)}(t) ) for 1 ≤ i ≤ r,

defined by the following:

ẏ^{(1)}(t) = X_1( y^{(1)}(t) ),  y^{(1)}(0) = z;
ẏ^{(2)}(t) = X_2( y^{(2)}(t) ),  y^{(2)}(0) = y^{(1)}(−z_1);
ẏ^{(3)}(t) = X_3( y^{(3)}(t) ),  y^{(3)}(0) = y^{(2)}( −y_2^{(2)}(0) );
  ⋮
ẏ^{(r)}(t) = X_r( y^{(r)}(t) ),  y^{(r)}(0) = y^{(r−1)}( −y_{r−1}^{(r−1)}(0) ).

For simplicity, we write

y(z) = ( y_1(z), y_2(z), ..., y_N(z) ) := y^{(r)}( −y_r^{(r)}(0) ).

First, we show that

y(z) ∈ F.  (10.80)

In view of inequality (10.63) in Lemma 10.23, it follows that the vector fields

X_i/(a^{ii})^{1/2} = Σ_{j=1}^N ( a^{ij}/(a^{ii})^{1/2} ) ∂/∂y_j for 1 ≤ i ≤ r

are subunit for A_0. Therefore, we find from Theorem 10.27 that the chain of integral curves y^{(i)}(t) can be approximated uniformly by subunit trajectories; so that assertion (10.80) is obtained from Step 1, since z ∈ F.

Next we show that: for any α > 0, if we choose the constant ε > 0 sufficiently small and the constant C sufficiently large, then the point y(z), |z| < ε, is contained in the open ball of radius α about (0, ..., 0, α) (see Fig. 10.4 below):

Σ_{i=1}^{N−1} y_i(z)² + ( y_N(z) − α )² < α² for all |z| < ε.  (10.81)

Since we have, by conditions (10.75a), (10.75b) and (10.75c),


Fig. 10.4 The point y(z) is contained in the open ball of radius α about (0, ..., 0, α)

X_i(0) = ∂/∂y_i for 1 ≤ i ≤ r,

it follows from an application of Lemma 10.28 with X := X_1 that, as |z| → 0, we have

y_1^{(1)}(−z_1) = −(1/2) ∂a^{11}/∂y_1(0) z_1² − Σ_{j=2}^N ∂a^{11}/∂y_j(0) z_1 z_j + o(|z|²),
y_i^{(1)}(−z_1) = z_i − (1/2) ∂a^{1i}/∂y_1(0) z_1² − Σ_{j=2}^N ∂a^{1i}/∂y_j(0) z_1 z_j + o(|z|²) for 2 ≤ i ≤ N.

Furthermore, by replacing X_1 by X_2, we have, as |z| → 0,

• y_1^{(2)}( −y_2^{(1)}(−z_1) )
 = −(1/2) ∂a^{11}/∂y_1(0) z_1² − (1/2) ∂a^{21}/∂y_2(0) z_2²
  − Σ_{j=2}^N ∂a^{11}/∂y_j(0) z_1 z_j − Σ_{j=3}^N ∂a^{21}/∂y_j(0) z_2 z_j + o(|z|²),

• y_2^{(2)}( −y_2^{(1)}(−z_1) )
 = −(1/2) ∂a^{22}/∂y_2(0) z_2² − Σ_{j=3}^N ∂a^{22}/∂y_j(0) z_2 z_j + o(|z|²),

• y_i^{(2)}( −y_2^{(1)}(−z_1) )
 = z_i − (1/2) ∂a^{1i}/∂y_1(0) z_1² − (1/2) ∂a^{2i}/∂y_2(0) z_2²
  − Σ_{j=2}^N ∂a^{1i}/∂y_j(0) z_1 z_j − Σ_{j=3}^N ∂a^{2i}/∂y_j(0) z_2 z_j + o(|z|²)

for 3 ≤ i ≤ N. Continuing this process, we have, after r steps,

Σ_{i=1}^{N−1} y_i(z)² ≤ K ( Σ_{i=1}^r z_i⁴ + Σ_{i=r+1}^N z_i² ),  (10.82)

y_N(z) = z_N − (1/2) Σ_{i=1}^r ∂a^{Ni}/∂y_i(0) z_i² − Σ_{i=1}^r Σ_{j=i+1}^N ∂a^{Ni}/∂y_j(0) z_i z_j + o(|z|²),  (10.83)

where K is a positive constant independent of z. By combining formula (10.83) and inequality (10.79), we obtain that

y_N(z) > ( c + o(1) ) Σ_{i=1}^r z_i² + ( C + o(1) ) Σ_{i=r+1}^N z_i².  (10.84)

Therefore, for any α > 0, if we choose the constant ε > 0 sufficiently small and the constant C sufficiently large, we conclude from inequalities (10.82) and (10.84) and formula (10.83) that

y_N(z) ( 2α − y_N(z) ) ≥ α y_N(z)
 > α ( c + o(1) ) Σ_{i=1}^r z_i² + α ( C + o(1) ) Σ_{i=r+1}^N z_i²
 ≥ K ( Σ_{i=1}^r z_i⁴ + Σ_{i=r+1}^N z_i² )
 ≥ Σ_{i=1}^{N−1} y_i(z)² for all |z| < ε.

This proves the desired assertion (10.81).

Step (4): In view of conditions (10.75a), (10.75b) and (10.75c), the assertions (10.80) and (10.81) imply that the vector ν is not normal to F at x_0 in the sense of Bony. This contradiction proves the desired inequality (10.71) and hence Lemma 10.33. The proof of Lemma 10.33 is complete. □

Now the proof of Theorem 10.14 is complete. □
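The subunit condition used throughout the proof above (a vector γ is subunit for A_0 at x when (Σ_j γ_j η_j)² ≤ Σ_{i,j} a^{ij} η_i η_j for every covector η) is easy to test numerically. For the assumed degenerate example a = diag(1, 0), the vector e_1 is subunit while e_2 is not:

```python
import numpy as np

def is_subunit(gamma, a, trials=1000, seed=0):
    # Sample covectors eta and test (gamma . eta)^2 <= eta' a eta.
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        eta = rng.standard_normal(len(gamma))
        if (gamma @ eta) ** 2 > eta @ a @ eta + 1e-12:
            return False
    return True

a = np.diag([1.0, 0.0])                           # degenerate coefficient matrix a^{ij}
assert is_subunit(np.array([1.0, 0.0]), a)        # e1 points along the nondegenerate direction
assert not is_subunit(np.array([0.0, 1.0]), a)    # e2 fails: take eta with large second component
```

Random sampling only certifies the failure case; the success case holds exactly since (η_1)² ≤ η_1² with equality.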


10.3.4 Proof of Theorem 10.19

The proof of Theorem 10.19 is essentially the same as that of Theorem 10.14. Indeed, it is easy to see that, in the proof of Theorem 10.14, the assumption Au(x) ≥ 0 is needed only in a sufficiently small neighborhood U of a point x_0 where u(x_0) = M. However, if M > 0, we may assume that u(x) > 0 in U, and hence

Au(x) ≥ −c(x) u(x) ≥ 0 in U,

since c(x) ≤ 0 in D. Therefore, the proof goes through as before. □

Proof of Theorem 10.15

First, we prove the following assertion:

Each trajectory φ(t) of the form (10.26) in Theorem 10.15 can be approximated uniformly by a finite number of subunit and drift trajectories.  (10.85)

Let ϕ be a path [0, ρ] → D for which there exists a piecewise C¹ function ψ : [0, ρ] → R^N such that

ϕ̇_i(t) = Σ_{j=1}^N a^{ij}(ϕ(t)) ψ_j(t) for 1 ≤ i ≤ N.

Then we have the assertion:

The path ϕ becomes a subunit trajectory after a change of the time scale.  (10.86)

Indeed, we let

a := sup_{x∈D, |ξ|=1} Σ_{i,j=1}^N a^{ij}(x) ξ_i ξ_j,

ψ(ρ) := sup_{0≤t≤ρ} ( Σ_{j=1}^N ψ_j(t)² )^{1/2},

and define a path

γ(t) := ϕ(c_ρ t) for 0 ≤ t ≤ ρ/c_ρ,

where c_ρ is the positive constant defined by the formula

c_ρ = 1/( a^{1/2} ψ(ρ) ).

Then it is easy to verify that

γ̇_i(t) = c_ρ ϕ̇_i(c_ρ t) = c_ρ Σ_{j=1}^N a^{ij}( ϕ(c_ρ t) ) ψ_j(c_ρ t) = c_ρ Σ_{j=1}^N a^{ij}( γ(t) ) ψ_j(c_ρ t).

Hence, we have, by Schwarz's inequality,

( Σ_{i=1}^N γ̇_i(t) η_i )² = ( Σ_{i,j=1}^N c_ρ a^{ij}(γ(t)) ψ_j(c_ρ t) η_i )²
 ≤ c_ρ² ( Σ_{i,j=1}^N a^{ij}(γ(t)) ψ_i(c_ρ t) ψ_j(c_ρ t) ) ( Σ_{i,j=1}^N a^{ij}(γ(t)) η_i η_j )
 ≤ c_ρ² a ψ(ρ)² ( Σ_{i,j=1}^N a^{ij}(γ(t)) η_i η_j )
 = Σ_{i,j=1}^N a^{ij}(γ(t)) η_i η_j for all η = Σ_{i=1}^N η_i dx_i ∈ T*_{γ(t)}(D).

This implies that the path γ(t) = ϕ(c_ρ t) is a subunit trajectory. Therefore, the desired assertion (10.85) is obtained from fact (10.86) and an application of Theorem 10.27.

The proof of the converse of assertion (10.85) is based on the following proposition, essentially due to Fefferman–Phong [57, the proof of Lemma 1]:

Proposition 10.34 Assume that a point y ∈ D can be joined to a point x ∈ D by a Lipschitz path v : [0, ρ] → D for which the tangent vector v̇(t) is subunit for A_0 at v(t) for almost every t. Then we can join x to y by a Lipschitz path γ̃ : [0, C̃ρ] → D of the form

γ̃_i(t) = x_i + Σ_{j=1}^N ∫_0^t a^{ij}( γ̃(s) ) ξ_j(s) ds, 0 ≤ t ≤ C̃ρ,  (10.87)

where C̃ is a positive constant and ξ(t) = Σ_{i=1}^N ξ_i(t) dx_i is a piecewise C¹ covector field.


Granting Proposition 10.34 for the moment, we shall prove the converse of assertion (10.85).

Step (1): First, we remark that the trajectories φ(t) of the form (10.26) in Theorem 10.15 contain drift trajectories as the particular case ψ_j ≡ 0.

Step (2): Let γ̃(t) be a Lipschitz path of the form (10.87). If n is a positive integer, we define a path φ_n(t) of the form (10.26) in Theorem 10.15 as follows:

φ_n^i(t) := x_i + Σ_{j=1}^N ∫_0^t a^{ij}( φ_n(s) ) n ξ_j(ns) ds + ∫_0^t ( b^i( φ_n(s) ) − Σ_{j=1}^N ∂a^{ij}/∂x_j( φ_n(s) ) ) ds for 0 ≤ t ≤ C̃ρ/n,

and let

ϕ_n(t) := φ_n(t/n) for 0 ≤ t ≤ C̃ρ.

Then we have the formula

ϕ_n^i(t) = x_i + Σ_{j=1}^N ∫_0^{t/n} a^{ij}( φ_n(s) ) n ξ_j(ns) ds + ∫_0^{t/n} ( b^i( φ_n(s) ) − Σ_{j=1}^N ∂a^{ij}/∂x_j( φ_n(s) ) ) ds
 = x_i + Σ_{j=1}^N ∫_0^t a^{ij}( ϕ_n(τ) ) ξ_j(τ) dτ + (1/n) ∫_0^t ( b^i( ϕ_n(τ) ) − Σ_{j=1}^N ∂a^{ij}/∂x_j( ϕ_n(τ) ) ) dτ.

Thus it follows from an application of Lemma 10.25 that the path γ̃(t) can be approximated uniformly by the paths ϕ_n(t), and hence by the paths φ_n(t). Therefore, by combining this fact and Proposition 10.34 we find that the subunit trajectories can be approximated uniformly by paths of the form (10.26) in Theorem 10.15.

Step (3): Summing up, we conclude that any finite number of subunit and drift trajectories can be approximated uniformly by a finite number of trajectories of the form (10.26) in Theorem 10.15. Theorem 10.15 is proved, apart from the proof of Proposition 10.34. □

Proof of Proposition 10.34


Our proof mimics that of Fefferman–Phong [57, Lemma 1]. We divide the proof into three steps.

Step (I): Let x be a point of D and ρ > 0. With the differential operator

A_0 = Σ_{i,j=1}^N a^{ij}(x) ∂²/∂x_i∂x_j,

we associate a "non-Euclidean" ball B_{A_0}(x, ρ) of radius ρ about x as follows:

B_{A_0}(x, ρ) = the set of all points y ∈ D which can be joined to x by a Lipschitz path v : [0, ρ] → D for which the tangent vector v̇(t) is subunit for A_0 at v(t) for almost every t.

Also we let

B_E(x, ρ) = the ordinary Euclidean ball of radius ρ about x.

We remark that if the operator A_0 is the usual Laplacian

Δ = Σ_{i=1}^N ∂²/∂x_i²,

then the two balls B_{A_0}(x, ρ) and B_E(x, ρ) coincide. Since we have the inequality

( a^{ij}(x) ) ≤ ( a^{ij}(x) + ρ⁸ δ_{ij} ) as matrices,

it follows that

B_{A_0}(x, ρ) ⊂ B_{A_0+ρ⁸Δ}(x, ρ).

Here the term ρ⁸Δ is a technicality. Let ( g_{ij}(x) ) be the inverse matrix of ( g^{ij}(x) ) = ( a^{ij}(x) + ρ⁸δ_{ij} ):

( g_{ij}(x) ) = ( a^{ij}(x) + ρ⁸δ_{ij} )^{−1}.

Then, for any point y ∈ B_{A_0}(x, ρ), we may join x to y by a geodesic γ : [0, ρ] → D in the metric

ds² = Σ_{i,j=1}^N g_{ij}(x) dx_i dx_j.

First, we have the following:

Lemma 10.35 If we parametrize the geodesic γ by its arc-length t, then the tangent vector γ̇(t) is a subunit vector for the operator A_0 + ρ⁸Δ. Namely, we have the


inequality

( Σ_{j=1}^N γ̇_j(t) η_j )² ≤ Σ_{i,j=1}^N ( a^{ij}(γ(t)) + ρ⁸ δ_{ij} ) η_i η_j  (10.88)

for all

η = Σ_{i=1}^N η_i dx_i ∈ T*_{γ(t)}(D),

where

γ̇(t) = Σ_{j=1}^N γ̇_j(t) ∂/∂x_j ∈ T_{γ(t)}(D).

Proof Since inequality (10.88) is independent of the particular local chart, we rotate the coordinate axes so that the matrix ( a^{ij}(γ(t)) ) is diagonalized as follows:

( a^{ij}(γ(t)) ) = ( λ_i δ_{ij} ) with λ_i ≥ 0.

Then it follows from Schwarz's inequality that

( Σ_{j=1}^N γ̇_j(t) η_j )² ≤ ( Σ_{j=1}^N γ̇_j(t)²/(λ_j + ρ⁸) ) ( Σ_{j=1}^N (λ_j + ρ⁸) η_j² )
 = ( Σ_{i,j=1}^N g_{ij}(γ(t)) γ̇_i(t) γ̇_j(t) ) ( Σ_{i,j=1}^N ( a^{ij}(γ(t)) + ρ⁸δ_{ij} ) η_i η_j )
 = Σ_{i,j=1}^N ( a^{ij}(γ(t)) + ρ⁸δ_{ij} ) η_i η_j,

since we have the formula

Σ_{i,j=1}^N g_{ij}(γ(t)) γ̇_i(t) γ̇_j(t) = 1.

The proof of Lemma 10.35 is complete. □

Corollary 10.36 We have, for all sufficiently small ρ > 0,

B_E(x, ρ²) ⊂ B_{A_0}(x, ρ).  (10.89)

Proof Every point y ∈ B_E(x, ρ²) may be joined to the point x by a geodesic γ in the metric

ds² = Σ_{i,j=1}^N g_{ij}(x) dx_i dx_j,

where

( g_{ij}(x) ) = ( a^{ij}(x) + ρ⁸δ_{ij} )^{−1} = ( (λ_i + ρ⁸)^{−1} δ_{ij} ).

Hence the desired assertion (10.89) is an immediate consequence of Lemma 10.35. Indeed, we have, for all sufficiently small ρ > 0,

ρ ds = ρ ( Σ_{i=1}^N (dx_i)²/(λ_i + ρ⁸) )^{1/2} ≥ ρ/( max_{1≤i≤N} λ_i + ρ⁸ )^{1/2} ( Σ_{i=1}^N (dx_i)² )^{1/2} ≥ ρ² ( Σ_{i=1}^N (dx_i)² )^{1/2}.

The proof of Corollary 10.36 is complete.
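Inequality (10.88) states that a vector of unit length in the metric g = (a + ρ⁸I)^{−1} is subunit for A_0 + ρ⁸Δ; since it is just the Cauchy–Schwarz step in the proof of Lemma 10.35, it can be verified numerically on an assumed random degenerate example:

```python
import numpy as np

rng = np.random.default_rng(1)
rho = 0.5
m = rng.standard_normal((4, 3))
a = m @ m.T                                  # PSD, rank-deficient stand-in for (a^{ij})
g_inv = a + rho**8 * np.eye(4)               # g^{ij} = a^{ij} + rho^8 delta_ij
g = np.linalg.inv(g_inv)

v = rng.standard_normal(4)
v = v / np.sqrt(v @ g @ v)                   # unit speed in the metric g: v' g v = 1

for _ in range(200):
    eta = rng.standard_normal(4)
    # the subunit inequality (10.88), with a small multiplicative tolerance
    assert (v @ eta) ** 2 <= (eta @ g_inv @ eta) * (1 + 1e-9)
```

Equality occurs exactly when η is proportional to g v, which is the Cauchy–Schwarz equality case.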


Step (II): Let γ be the geodesic as in Step (I): γ is parametrized by its arc-length on [0, ρ], and the tangent vector γ̇(t) is subunit for A_0 + ρ⁸Δ. We perturb the path γ(t) to a broken path γ̃(t), which starts at x and ends very near y, so that c γ̃̇(t) is a subunit vector for A_0 for some positive constant c.

Step (II-a): First, we need a lemma on perturbation of tangent vectors.

Lemma 10.37 Let X = Σ_{j=1}^N γ^j ∂/∂x_j be a subunit vector for A_0 + ρ⁸Δ at a point x⁰ ∈ D and let x¹ be an arbitrary point of D such that we have, for some positive constant c_1,

|x¹ − x⁰| ≤ c_1 ρ⁴.

Then there exist a tangent vector

Y = Σ_{j=1}^N δ^j ∂/∂x_j ∈ T_{x¹}(D)

and a cotangent vector


ζ = Σ_{j=1}^N ζ_j dx_j ∈ T*_{x¹}(D)

which satisfy the following three conditions:

(i) |δ^j − γ^j| ≤ Cρ⁴.
(ii) The vector cY is subunit for A_0 at x¹.
(iii) δ^i = Σ_{j=1}^N a^{ij}(x¹) ζ_j and |ζ| ≤ Cρ^{−4}.

Here C and c are positive constants depending only on c_1 and the bounds on the second derivatives of the a^{ij}(x).

Proof Since the tangent vector X = Σ_{j=1}^N γ^j ∂/∂x_j is subunit for A_0 + ρ⁸Δ at x⁰, it follows that

( Σ_{j=1}^N γ^j η_j )² ≤ Σ_{i,j=1}^N ( a^{ij}(x⁰) + ρ⁸δ_{ij} ) η_i η_j  (10.90)

for all

η = Σ_{i=1}^N η_i dx_i ∈ T*_{x⁰}(D).

To estimate the right-hand side of inequality (10.90), we let

g(t) := ( Σ_{i,j=1}^N ( a^{ij}( x¹ + t(x⁰ − x¹) ) + ρ⁸δ_{ij} ) η_i η_j ) / |η|² for t ∈ R and η ≠ 0.

Then we have the assertions

g(t) ≥ 0 on R,
|g″(t)| ≤ K |x¹ − x⁰|² on R,

where K is a positive constant depending only on the bounds on the second derivatives of the a^{ij}(x). By applying Lemma 10.29 to the function

f(t) := g(t)/( K |x¹ − x⁰|² ),

we obtain that

g(1)/( K|x¹−x⁰|² ) + 1 ≤ 2 ( 1 + g(0)/( K|x¹−x⁰|² ) ),

so that


Σ_{i,j=1}^N ( a^{ij}(x⁰) + ρ⁸δ_{ij} ) η_i η_j / |η|² ≤ 2 Σ_{i,j=1}^N ( a^{ij}(x¹) + ρ⁸δ_{ij} ) η_i η_j / |η|² + K |x¹ − x⁰|².

Thus we have the estimate

Σ_{i,j=1}^N ( a^{ij}(x⁰) + ρ⁸δ_{ij} ) η_i η_j ≤ K̃ ( Σ_{i,j=1}^N ( a^{ij}(x¹) + ρ⁸δ_{ij} ) η_i η_j + |x¹ − x⁰|² |η|² ),  (10.91)

where

K̃ = max(K, 2).

Hence, by combining inequalities (10.90) with (10.91), we have, for |x¹ − x⁰| ≤ c_1ρ⁴,

( Σ_{j=1}^N γ^j η_j )² ≤ K̃ (1 + c_1²) Σ_{i,j=1}^N ( a^{ij}(x¹) + ρ⁸δ_{ij} ) η_i η_j.

Therefore, by letting

X̃ := Σ_{j=1}^N γ^j ∂/∂x_j at x¹,
c̃ := ( K̃ (1 + c_1²) )^{−1/2},

we obtain that

The vector c̃X̃ is subunit for A_0 + ρ⁸Δ at x¹.  (10.92)

Since conditions (i), (ii) and (iii) are unaffected by the rotation of coordinate axes, we may assume that

( a^{ij}(x¹) ) = ( λ_i δ_{ij} ) with λ_i ≥ 0.

Then the assertion (10.92) can be restated as follows:

( Σ_{i=1}^N c̃ γ^i η_i )² ≤ Σ_{i=1}^N ( λ_i + ρ⁸ ) η_i² for all η = (η_1, η_2, ..., η_N) ∈ R^N.

In particular, we have the estimate

c̃² (γ^i)² ≤ λ_i + ρ⁸ for all 1 ≤ i ≤ N.  (10.93)

Now we define a tangent vector Y = Σ_{i=1}^N δ^i ∂/∂x_i at x¹ by

δ^i = γ^i if λ_i ≥ ρ⁸, δ^i = 0 if 0 ≤ λ_i < ρ⁸,

and a cotangent vector ζ = Σ_{i=1}^N ζ_i dx_i at x¹ by

ζ_i = γ^i/λ_i if λ_i ≥ ρ⁸, ζ_i = 0 if 0 ≤ λ_i < ρ⁸.

Then we can verify conditions (i), (ii) and (iii) as follows:

(i) First, by using inequality (10.93) we have the estimate

|δ^i − γ^i|² = 0 if λ_i ≥ ρ⁸,
|δ^i − γ^i|² = (γ^i)² ≤ (λ_i + ρ⁸)/c̃² ≤ 2ρ⁸/c̃² if 0 ≤ λ_i < ρ⁸,

so that

|δ^i − γ^i| ≤ Cρ⁴ with the constant C := √2/c̃ = ( 2K̃(1 + c_1²) )^{1/2}.

(ii) Secondly, we have the estimate

( Σ_{i=1}^N δ^i η_i )² = ( Σ_{λ_i≥ρ⁸} γ^i η_i )² ≤ N Σ_{λ_i≥ρ⁸} (γ^i)² η_i² ≤ (N/c̃²) Σ_{λ_i≥ρ⁸} ( λ_i + ρ⁸ ) η_i² ≤ (2N/c̃²) Σ_{i=1}^N λ_i η_i².

Hence, by letting

c := c̃/√(2N) = 1/( √(2N) ( K̃(1 + c_1²) )^{1/2} ),

we obtain that

The vector cY is subunit for A_0 at x¹.  (10.94)


(iii) Finally, we have the formula
\[
  \sum_{j=1}^N a^{ij}(x^1)\,\zeta_j = \lambda_i\zeta_i =
  \begin{cases}
    \gamma^i & \text{if } \lambda_i \ge \rho^8, \\
    0 & \text{if } 0 \le \lambda_i < \rho^8
  \end{cases}
  = \delta^i,
\]
and also
\[
  |\zeta|^2 = \sum_{\substack{1 \le i \le N \\ \lambda_i \ge \rho^8}} \frac{(\gamma^i)^2}{\lambda_i^2}
  \le \sum_{\substack{1 \le i \le N \\ \lambda_i \ge \rho^8}} \frac{\lambda_i + \rho^8}{\widetilde{c}^{\,2}\lambda_i^2}
  \le \sum_{\substack{1 \le i \le N \\ \lambda_i \ge \rho^8}} \frac{2}{\widetilde{c}^{\,2}\lambda_i}
  \le \frac{2N}{\widetilde{c}^{\,2}}\,\rho^{-8}.
\]
This proves that
\[
  |\zeta| \le C\rho^{-4}
\]
with the constant
\[
  C := \frac{\sqrt{2N}}{\widetilde{c}} = \bigl(2N\,\widetilde{K}\,(1 + c_1^2)\bigr)^{1/2}.
\]
Lemma 10.37 is proved. □
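The truncation used in the proof can be exercised numerically. The sketch below (not part of the text's argument; the dimension, ρ and c̃ are ad hoc choices) builds δ and ζ from a random γ satisfying (10.93) and checks the three conclusions of Lemma 10.37.

```python
import numpy as np

rng = np.random.default_rng(0)
N, rho, c_tilde = 5, 0.3, 0.5
lam = rng.uniform(0.0, 1.0, N)                      # eigenvalues lambda_i >= 0
# gamma satisfying (10.93): c_tilde^2 * gamma_i^2 <= lam_i + rho^8
gamma = rng.uniform(-1.0, 1.0, N) * np.sqrt(lam + rho ** 8) / c_tilde

big = lam >= rho ** 8                                # indices with lambda_i >= rho^8
delta = np.where(big, gamma, 0.0)                    # tangent vector Y
zeta = np.where(big, gamma / np.where(big, lam, 1.0), 0.0)  # cotangent vector

C = np.sqrt(2.0) / c_tilde
c = c_tilde / np.sqrt(2 * N)
eta = rng.standard_normal(N)

# (i): |delta_i - gamma_i| <= C rho^4
assert np.all(np.abs(delta - gamma) <= C * rho ** 4 + 1e-12)
# (iii): lambda_i * zeta_i = delta_i and |zeta| <= sqrt(2N)/c_tilde * rho^{-4}
assert np.allclose(lam * zeta, delta)
assert np.linalg.norm(zeta) <= np.sqrt(2 * N) / c_tilde * rho ** (-4)
# (ii): cY is subunit for A^0: (sum c*delta_i*eta_i)^2 <= sum lam_i*eta_i^2
assert (c * delta @ eta) ** 2 <= lam @ eta ** 2 + 1e-12
print("conditions (i)-(iii) verified")
```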



Step (II-b): Next we need a lemma on estimates for the second derivatives of Hamiltonian paths. We let
\[
  H(x, \xi) = \frac{1}{2}\sum_{i,j=1}^N a^{ij}(x)\,\xi_i\xi_j ,
\]
and consider the Hamiltonian equations
\[
  \dot{x}_i(t) = \sum_{j=1}^N a^{ij}(x(t))\,\xi_j(t), \qquad x(0) = x_0,
  \tag{10.95a}
\]
\[
  \dot{\xi}_\ell(t) = -\frac{1}{2}\sum_{i,j=1}^N \frac{\partial a^{ij}}{\partial x_\ell}(x(t))\,\xi_i(t)\xi_j(t), \qquad \xi(0) = \xi_0 .
  \tag{10.95b}
\]
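The Hamiltonian system (10.95) can also be examined numerically. The following sketch (not part of the text's argument; the coefficient choice a^{11} = 1, a^{22} = x_1^2 is a hypothetical Grushin-type example) integrates (10.95) with a fourth-order Runge–Kutta scheme and checks that H(x(t), ξ(t)) stays constant along the flow — the conservation property invoked after (10.105) below.

```python
import numpy as np

def a(x):
    # coefficient matrix (a^{ij}(x)); hypothetical Grushin-type example
    return np.array([[1.0, 0.0], [0.0, x[0] ** 2]])

def grad_a(x):
    # d a^{ij} / d x_l, indexed as (l, i, j); only a^{22} depends on x_1
    g = np.zeros((2, 2, 2))
    g[0, 1, 1] = 2.0 * x[0]
    return g

def rhs(x, xi):
    xdot = a(x) @ xi                                              # (10.95a)
    xidot = -0.5 * np.einsum("lij,i,j->l", grad_a(x), xi, xi)     # (10.95b)
    return xdot, xidot

def H(x, xi):
    return 0.5 * xi @ a(x) @ xi

def rk4_step(x, xi, h):
    k1x, k1p = rhs(x, xi)
    k2x, k2p = rhs(x + h / 2 * k1x, xi + h / 2 * k1p)
    k3x, k3p = rhs(x + h / 2 * k2x, xi + h / 2 * k2p)
    k4x, k4p = rhs(x + h * k3x, xi + h * k3p)
    return (x + h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x),
            xi + h / 6 * (k1p + 2 * k2p + 2 * k3p + k4p))

x, xi = np.array([1.0, 0.0]), np.array([0.3, 0.7])
H0 = H(x, xi)
for _ in range(1000):
    x, xi = rk4_step(x, xi, 1e-3)
print("H drift:", abs(H(x, xi) - H0))   # small: only the RK4 truncation error
```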

Then we obtain the following: Lemma 10.38 Let |ξ0 | ≤ Cρ−4 for some positive constant C. If we flow for time 0 ≤ t ≤ c1 ρ4 where c1 is a sufficiently small positive constant, then we have the estimates


\[
  |\xi(t)| \le C'\rho^{-4},
  \tag{10.96a}
\]
\[
  |\ddot{x}(t)| \le C''\rho^{-8},
  \tag{10.96b}
\]
along the path. Here C' and C'' are positive constants depending only on C, c_1 and the bounds on the a^{ij}(x) and their first derivatives.

Proof First, it follows from the Hamiltonian equations (10.95) that
\[
  |\dot{x}(t)| \le K\,|\xi(t)|,
  \tag{10.97a}
\]
\[
  |\dot{\xi}(t)| \le K\,|\xi(t)|^2,
  \tag{10.97b}
\]
and so
\[
  |\ddot{x}(t)| \le K\,|\xi(t)|^2,
  \tag{10.98}
\]
since we have the equations
\[
  \ddot{x}_i(t) = \sum_{j=1}^N a^{ij}(x(t))\,\dot{\xi}_j(t)
  + \sum_{j,\ell=1}^N \frac{\partial a^{ij}}{\partial x_\ell}(x(t))\,\dot{x}_\ell(t)\,\xi_j(t).
\]

Here K is a positive constant depending only on the bounds on the a^{ij}(x) and their first derivatives. By the mean value theorem and estimates (10.97), it follows that we have, for 0 ≤ t ≤ c_1ρ^4,
\[
  |\xi(t)| \le |\xi_0| + \sup_{0 \le s \le c_1\rho^4} |\dot{\xi}(s)| \cdot t
  \le C\rho^{-4} + K \sup_{0 \le s \le c_1\rho^4} |\xi(s)|^2 \cdot c_1\rho^4,
\]
so that
\[
  K c_1\rho^4 \Bigl(\sup_{0 \le s \le c_1\rho^4} |\xi(s)|\Bigr)^2
  - \sup_{0 \le s \le c_1\rho^4} |\xi(s)| + C\rho^{-4} \ge 0 .
\]
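This last inequality forces S := sup_{0 ≤ s ≤ c₁ρ⁴}|ξ(s)| to lie outside the open interval between the two roots of the quadratic polynomial (a one-line elaboration, not in the original text):

```latex
S_{\pm} = \frac{1 \pm \sqrt{1 - 4c_1KC}}{2Kc_1}\,\rho^{-4}.
```

Since S depends continuously on the length of the time interval and tends to |ξ₀| ≤ Cρ⁻⁴ ≤ S₋ as the interval shrinks, continuity keeps S below the smaller root S₋.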

Thus, if the constant c_1 is so small that 1 − 4c_1KC > 0, we obtain that
\[
  \sup_{0 \le s \le c_1\rho^4} |\xi(s)|
  \le \frac{1 - \sqrt{1 - 4c_1KC}}{2Kc_1}\,\rho^{-4}.
  \tag{10.99}
\]
Indeed, it suffices to note that we have, as c_1 ↓ 0,


\[
  \frac{1 + \sqrt{1 - 4c_1KC}}{2Kc_1} \longrightarrow +\infty,
  \qquad
  \sup_{0 \le s \le c_1\rho^4} |\xi(s)| \longrightarrow |\xi_0| .
\]

Therefore, the desired estimates (10.96) follow from estimates (10.99) and (10.98), with
\[
  C' := \frac{1 - \sqrt{1 - 4c_1KC}}{2Kc_1},
  \qquad
  C'' := K\,C'^{\,2}.
\]

The proof of Lemma 10.38 is complete. □

Step (II-c): Now we construct the broken path γ′(t) mentioned above.
(1) Assume that we have constructed a path γ′(t) for 0 ≤ t ≤ τ_k such that
\[
  \gamma'(0) = \gamma(0) = x,
  \tag{10.100a}
\]
\[
  |\gamma'(t) - \gamma(t)| \le C_+\tau_k\rho^4 \quad\text{for } 0 \le t \le \tau_k .
  \tag{10.100b}
\]
The large constant C_+ and the division points τ_k will be picked later on (see formulas (10.116) and (10.117) below) so that we have, for all sufficiently small ρ > 0,
\[
  C_+\tau_k \le C_+\rho \le c_1,
  \tag{10.101}
\]
where c_1 is the same constant as in Lemma 10.38. We remark that assertion (10.100) is vacuous for τ_0 = 0. By virtue of estimates (10.100) and (10.101), we can apply Lemma 10.37 with
\[
  x^0 := \gamma(\tau_k), \qquad x^1 := \gamma'(\tau_k), \qquad
  X := \sum_{j=1}^N \dot{\gamma}^j(\tau_k)\,\frac{\partial}{\partial x_j},
\]
to obtain the following assertion: there exist a tangent vector Y = Σ_{j=1}^N δ^j ∂/∂x_j at γ′(τ_k) and a cotangent vector ζ = Σ_{j=1}^N ζ_j dx_j at γ′(τ_k) which satisfy the following three conditions:


(i)
\[
  |\delta^j - \dot{\gamma}^j(\tau_k)| \le C\rho^4 .
  \tag{10.102a}
\]
(ii)
\[
  \text{The vector } cY \text{ is subunit for } A^0 \text{ at } \gamma'(\tau_k).
  \tag{10.102b}
\]
(iii)
\[
  \delta^i = \sum_{j=1}^N a^{ij}\bigl(\gamma'(\tau_k)\bigr)\,\zeta_j
  \quad\text{and}\quad |\zeta| \le C\rho^{-4}.
  \tag{10.102c}
\]

Now we define a path γ′(t) for τ_k ≤ t ≤ τ_{k+1} as the projection onto the x-coordinate of the Hamiltonian curve for the Hamiltonian
\[
  H(x, \xi) = \frac{1}{2}\sum_{i,j=1}^N a^{ij}(x)\,\xi_i\xi_j
\]
starting at (γ′(τ_k), ζ) for t = τ_k:
\[
  \begin{cases}
    \dot{\gamma}'^{\,i}(t) = \displaystyle\sum_{j=1}^N a^{ij}\bigl(\gamma'(t)\bigr)\,\xi_j(t), & \gamma'(t)\big|_{t=\tau_k} = \gamma'(\tau_k), \\[2ex]
    \dot{\xi}_\ell(t) = -\dfrac{1}{2}\displaystyle\sum_{i,j=1}^N \frac{\partial a^{ij}}{\partial x_\ell}\bigl(\gamma'(t)\bigr)\,\xi_i(t)\xi_j(t), & \xi(\tau_k) = \zeta .
  \end{cases}
  \tag{10.103}
\]

Then it follows from conditions (iii) and (ii) of (10.102) that
\[
  c^2\Bigl(\sum_{i,j=1}^N a^{ij}\bigl(\gamma'(\tau_k)\bigr)\zeta_i\zeta_j\Bigr)^2
  = \Bigl(\sum_{i=1}^N c\,\delta^i\zeta_i\Bigr)^2
  \le \sum_{i,j=1}^N a^{ij}\bigl(\gamma'(\tau_k)\bigr)\zeta_i\zeta_j ,
\]
so that
\[
  \sum_{i,j=1}^N a^{ij}\bigl(\gamma'(\tau_k)\bigr)\zeta_i\zeta_j \le \frac{1}{c^2}.
  \tag{10.104}
\]

On the other hand, in view of Schwarz's inequality, it follows from the initial value problem (10.103) that we have, for all η = Σ_{i=1}^N η_i dx_i ∈ T*_{γ′(t)}(D),
\[
  \Bigl(\sum_{i=1}^N \dot{\gamma}'^{\,i}(t)\,\eta_i\Bigr)^2
  = \Bigl(\sum_{i,j=1}^N a^{ij}\bigl(\gamma'(t)\bigr)\,\eta_i\,\xi_j(t)\Bigr)^2
  \tag{10.105}
\]
\[
  \le \Bigl(\sum_{i,j=1}^N a^{ij}\bigl(\gamma'(t)\bigr)\,\xi_i(t)\xi_j(t)\Bigr)
     \Bigl(\sum_{i,j=1}^N a^{ij}\bigl(\gamma'(t)\bigr)\,\eta_i\eta_j\Bigr)
\]
\[
  = \Bigl(\sum_{i,j=1}^N a^{ij}\bigl(\gamma'(\tau_k)\bigr)\,\zeta_i\zeta_j\Bigr)
    \Bigl(\sum_{i,j=1}^N a^{ij}\bigl(\gamma'(t)\bigr)\,\eta_i\eta_j\Bigr),
\]


since the function H(γ′(t), ξ(t)) is conserved along the path. Therefore, by combining inequalities (10.105) and (10.104), we obtain that:

The tangent vector cγ̇′(t) is subunit for A^0 throughout τ_k ≤ t ≤ τ_{k+1}.

Here we remark that the constant c is independent of the division points τ_k, depending essentially on the constant c_1 (see inequality (10.101) and formula (10.94)). Moreover, since |ξ(τ_k)| = |ζ| ≤ Cρ^{−4}, it follows from an application of Lemma 10.38 that
\[
  |\ddot{\gamma}'(t)| \le C''\rho^{-8} \quad\text{for } \tau_k \le t \le \tau_{k+1},
  \tag{10.106}
\]
provided that τ_{k+1} − τ_k ≤ c_1ρ^4.

(2) We estimate the second derivative γ̈(t) of the geodesic γ(t). First, we recall that the geodesic γ(t) satisfies the equations
\[
  \frac{d^2\gamma_i}{dt^2} + \sum_{j,k=1}^N \Gamma^i_{jk}\,\frac{d\gamma_j}{dt}\frac{d\gamma_k}{dt} = 0
  \quad\text{for } 1 \le i \le N,
  \tag{10.107}
\]
where
\[
  \Gamma^i_{jk} = \frac{1}{2}\sum_{\ell=1}^N g^{i\ell}(x)
  \left(\frac{\partial g_{\ell j}}{\partial x_k} + \frac{\partial g_{\ell k}}{\partial x_j} - \frac{\partial g_{jk}}{\partial x_\ell}\right)
  \tag{10.108}
\]

are the Christoffel symbols. Thus, to estimate the second derivative γ̈(t), it suffices to estimate the Γ^i_{jk} and the first derivative γ̇(t). Differentiating the identity
\[
  \sum_{j=1}^N g^{ij}(x)\,g_{jk}(x) = \delta^i_k
\]
with respect to the variable x_ℓ, and then multiplying it on the left by g_{im}(x) and summing over i, we obtain that
\[
  \frac{\partial g_{mk}}{\partial x_\ell} = -\sum_{i,j=1}^N g_{im}(x)\,\frac{\partial g^{ij}}{\partial x_\ell}\,g_{jk}(x).
\]

Substituting these into the right-hand side of formula (10.108), we have the formula
\[
  \Gamma^i_{jk} = -\frac{1}{2}\left(\sum_{n=1}^N \frac{\partial g^{in}}{\partial x_k}\,g_{nj}(x)
  + \sum_{n=1}^N \frac{\partial g^{in}}{\partial x_j}\,g_{nk}(x)
  - \sum_{\ell,m,n=1}^N g^{i\ell}(x)\,\frac{\partial g^{mn}}{\partial x_\ell}\,g_{mj}(x)\,g_{nk}(x)\right)
  \tag{10.109}
\]
\[
  = -\frac{1}{2}\left(\sum_{n=1}^N \frac{\partial a^{in}}{\partial x_k}\,g_{nj}(x)
  + \sum_{n=1}^N \frac{\partial a^{in}}{\partial x_j}\,g_{nk}(x)
  - \sum_{\ell,m,n=1}^N \bigl(a^{i\ell}(x) + \rho^8\delta_{i\ell}\bigr)\,\frac{\partial a^{mn}}{\partial x_\ell}\,g_{mj}(x)\,g_{nk}(x)\right).
\]
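Definition (10.108) and the derived expression (10.109) can be checked symbolically. The sketch below (assuming sympy is available; the diagonal choice a^{11} = 1, a^{22} = x_1^2 is a hypothetical Grushin-type example) computes the Christoffel symbols of g_{ij} = (a^{ij} + ρ^8 δ_{ij})^{−1} both ways and verifies that they agree.

```python
import sympy as sp

x1, x2, rho = sp.symbols("x1 x2 rho", positive=True)
X = (x1, x2)
N = 2
a_up = sp.Matrix([[1, 0], [0, x1 ** 2]])    # hypothetical a^{ij}
g_up = a_up + rho ** 8 * sp.eye(N)          # g^{ij} = a^{ij} + rho^8 delta_{ij}
g_dn = g_up.inv()                           # g_{ij}

def christoffel(i, j, k):
    # definition (10.108)
    return sp.Rational(1, 2) * sum(
        g_up[i, l] * (sp.diff(g_dn[l, j], X[k]) + sp.diff(g_dn[l, k], X[j])
                      - sp.diff(g_dn[j, k], X[l]))
        for l in range(N))

def christoffel_via_109(i, j, k):
    # formula (10.109), written with derivatives of g^{ij} only
    t1 = sum(sp.diff(g_up[i, n], X[k]) * g_dn[n, j] for n in range(N))
    t2 = sum(sp.diff(g_up[i, n], X[j]) * g_dn[n, k] for n in range(N))
    t3 = sum(g_up[i, l] * sp.diff(g_up[m, n], X[l]) * g_dn[m, j] * g_dn[n, k]
             for l in range(N) for m in range(N) for n in range(N))
    return -sp.Rational(1, 2) * (t1 + t2 - t3)

diff_total = sum(sp.simplify(christoffel(i, j, k) - christoffel_via_109(i, j, k))
                 for i in range(N) for j in range(N) for k in range(N))
print(diff_total)
```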

We estimate the three terms on the right-hand side of formula (10.109). Since the bounds on γ̈(t) are unaffected by a rotation of the coordinate axes, we may assume that
\[
  \bigl(a^{ij}(\gamma(t))\bigr) = \bigl(\lambda_i\,\delta_{ij}\bigr) \quad\text{with } \lambda_i \ge 0,
\]
so that
\[
  \bigl(g^{ij}(\gamma(t))\bigr) = \bigl((\lambda_j + \rho^8)\,\delta_{ij}\bigr),
  \tag{10.110a}
\]
\[
  \bigl(g_{ij}(\gamma(t))\bigr) = \bigl((\lambda_j + \rho^8)^{-1}\,\delta_{ij}\bigr).
  \tag{10.110b}
\]
Then, since the tangent vector γ̇(t) is subunit for A^0 − ρ^8Δ, it follows that
\[
  \left(\frac{d\gamma_j}{dt}\right)^2 \le \lambda_j + \rho^8 \quad\text{for all } 1 \le j \le N.
  \tag{10.111}
\]
Hence we have, by formulas (10.110) and (10.111),
\[
  \left|\sum_{n=1}^N \frac{\partial a^{in}}{\partial x_k}\,g_{nj}(x)\,\frac{d\gamma_j}{dt}\frac{d\gamma_k}{dt}\right|
  \le \left|\frac{\partial a^{ij}}{\partial x_k}\right| \frac{(\lambda_j + \rho^8)^{1/2}(\lambda_k + \rho^8)^{1/2}}{\lambda_j + \rho^8}
  \le K\rho^{-4},
  \tag{10.112}
\]
where K is a positive constant depending only on the bounds on the first derivatives of the a^{ij}(x). Similarly, we have the estimates
\[
  \left|\sum_{n=1}^N \frac{\partial a^{in}}{\partial x_j}\,g_{nk}(x)\,\frac{d\gamma_j}{dt}\frac{d\gamma_k}{dt}\right| \le K\rho^{-4},
  \tag{10.113}
\]
\[
  \left|\sum_{\ell,m,n=1}^N \bigl(a^{i\ell} + \rho^8\delta_{i\ell}\bigr)\,g_{mj}(x)\,g_{nk}(x)\,\frac{\partial a^{mn}}{\partial x_\ell}\,\frac{d\gamma_j}{dt}\frac{d\gamma_k}{dt}\right| \le K\rho^{-8}.
  \tag{10.114}
\]

Therefore, in view of estimates (10.112), (10.113) and (10.114), it follows from equation (10.107) that


\[
  \left|\frac{d^2\gamma_i}{dt^2}\right|
  = \left|-\sum_{j,k=1}^N \Gamma^i_{jk}\,\frac{d\gamma_j}{dt}\frac{d\gamma_k}{dt}\right|
  \le C'''\rho^{-8},
\]
so that
\[
  |\ddot{\gamma}(t)| \le C'''\rho^{-8}.
  \tag{10.115}
\]
Here C''' is a positive constant depending only on the bounds on the a^{ij}(x) and their first derivatives.

(3) Now we pick the constants C_+ and τ_k so that
\[
  |\gamma'(t) - \gamma(t)| \le C_+\tau_{k+1}\rho^4 \quad\text{for } \tau_k \le t \le \tau_{k+1}.
\]
First, it follows from conditions (iii) and (i) of (10.102) that
\[
  \bigl|\dot{\gamma}'^{\,i}(\tau_k) - \dot{\gamma}^i(\tau_k)\bigr|
  = \Bigl|\sum_{j=1}^N a^{ij}\bigl(\gamma'(\tau_k)\bigr)\,\xi_j(\tau_k) - \dot{\gamma}^i(\tau_k)\Bigr|
  = \bigl|\delta^i - \dot{\gamma}^i(\tau_k)\bigr| \le C\rho^4 .
  \tag{10.116}
\]
We assume that τ_{k+1} − τ_k = ρ^{12}. Then, by the mean value theorem, it follows from estimates (10.116), (10.106) and (10.115) that
\[
  |\dot{\gamma}'(t) - \dot{\gamma}(t)| \le C\rho^4 + (C'' + C''')\rho^{-8}(t - \tau_k)
  \le (C + C'' + C''')\rho^4 \quad\text{for } \tau_k \le t \le \tau_{k+1},
\]
so that we have, by the induction hypothesis (10.100),
\[
  |\gamma'(t) - \gamma(t)| \le C_+\tau_k\rho^4 + (C + C'' + C''')\rho^4(t - \tau_k)
  \le C_+\tau_{k+1}\rho^4 \quad\text{for } \tau_k \le t \le \tau_{k+1},
\]
provided that we pick the constant C_+ so that
\[
  C_+ \ge C + C'' + C'''.
  \tag{10.117}
\]
Our construction of the path γ′(t) is now complete with
\[
  \tau_k := k\rho^{12}.
  \tag{10.118}
\]


Summing up, we have constructed a broken path γ′ : [0, ρ] → D such that:
(i)
\[
  \gamma'(0) = \gamma(0) = x .
  \tag{10.119a}
\]
(ii)
\[
  |\gamma'(t) - \gamma(t)| \le C_+(k+1)\rho^{16} \quad\text{for } k\rho^{12} \le t \le (k+1)\rho^{12}.
  \tag{10.119b}
\]
(iii)
\[
  \text{The tangent vector } c\,\dot{\gamma}'(t) \text{ is subunit for } A^0 \text{ throughout } k\rho^{12} \le t \le (k+1)\rho^{12}.
  \tag{10.119c}
\]

Step (III): End of the proof of Proposition 10.34. Given any point y ∈ B_{A^0}(x, ρ), let γ : [0, ρ] → D be a geodesic in the metric
\[
  ds^2 = \sum_{i,j=1}^N g_{ij}(x)\,dx_i\,dx_j, \qquad
  \bigl(g_{ij}(x)\bigr) = \Bigl(\bigl(a^{ij}(x) + \rho^8\delta_{ij}\bigr)\Bigr)^{-1},
\]
such that γ joins x = γ(0) to y = γ(ρ). Let γ′ : [0, ρ] → D be a broken path satisfying conditions (10.119). Then it follows from condition (ii) of (10.119) that
\[
  |\gamma'(\rho) - y| = |\gamma'(\rho) - \gamma(\rho)| \le C_+(\rho^5 + \rho^{16}) \le \rho^4,
  \tag{10.120}
\]
since kρ^{12} ≤ t = ρ and ρ is sufficiently small. Thus we have the assertion
\[
  y \in B_E(y_1, \rho^4), \qquad y_1 = \gamma'(\rho).
\]
By applying Corollary 10.36 with x := y_1 and ρ := ρ^2, we obtain that
\[
  B_E(y_1, \rho^4) \subset B_{A^0 - (\rho^2)^8\Delta}(y_1, \rho^2).
  \tag{10.121}
\]
Now we define a path
\[
  \widetilde{\gamma}'(t) = \gamma'(ct) \quad\text{for } 0 \le t \le \frac{\rho}{c},
  \tag{10.122}
\]
where c is the same constant as in condition (iii) of (10.119). Then we have the following three assertions:
(1) γ̃′(0) = γ′(0) = x.
(2) γ̃′(ρ/c) = γ′(ρ) = y_1.
(3) The tangent vector dγ̃′/dt = cγ̇′ is subunit for A^0.
This proves that there is a broken path
\[
  \gamma^0 = \widetilde{\gamma}' : [0, \rho/c] \longrightarrow D
\]
with tangent vectors subunit for A^0 such that γ^0 joins the point x = γ^0(0)


to the point
\[
  y_1 = \gamma^0(\rho/c).
\]
Also it follows from estimate (10.120) that |y − y_1| ≤ ρ^4. By virtue of assertion (10.121), we can repeat the above process, replacing x and ρ by y_1 and ρ^2 respectively, to obtain that there is a broken path
\[
  \gamma^1 : [0, \rho^2/c] \longrightarrow D
\]
with tangent vectors subunit for A^0 such that γ^1 joins the point y_1 = γ^1(0) to the point
\[
  y_2 = \gamma^1\bigl(\rho^2/c\bigr),
\]
and |y − y_2| ≤ (ρ^2)^4. Repeating the process yields a sequence of paths
\[
  \gamma^\mu : [0, t_{\mu+1}] \longrightarrow D \quad\text{for } \mu = 0, 1, 2, \ldots,
\]
with tangent vectors subunit for A^0 such that γ^0 joins the point x = γ^0(0) to the point y_1 = γ^0(t_1), and γ^\mu joins the point y_\mu to the point y_{\mu+1}. Moreover, it follows that
\[
  y_\mu \longrightarrow y \quad\text{as } \mu \to \infty,
\]
and further that

\[
  \sum_{\mu=0}^{\infty} t_{\mu+1} \le C\rho \quad\text{for some positive constant } C.
\]
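The total-time bound can be read off from the construction (a short elaboration, not in the original text): the path γ^μ is defined on an interval of length t_{μ+1} = ρ^{2^μ}/c, since ρ is replaced by ρ² at each stage, so for 0 < ρ ≤ 1/2,

```latex
\sum_{\mu=0}^{\infty} t_{\mu+1}
  = \frac{1}{c}\sum_{\mu=0}^{\infty}\rho^{2^{\mu}}
  \le \frac{1}{c}\sum_{\mu=0}^{\infty}\rho^{\mu+1}
  = \frac{\rho}{c\,(1-\rho)}
  \le \frac{2}{c}\,\rho,
```

using 2^μ ≥ μ + 1 and 1 − ρ ≥ 1/2; this yields the constant C = 2/c.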

By combining the sequence of paths γ^μ into a single Lipschitz path γ̃ : [0, Cρ] → D, we see from formulas (10.103) and (10.122) that the path γ̃ is of the form (10.87) and joins x to y, as desired. The proof of Proposition 10.34, and hence that of Theorem 10.15, is complete. □

10.3.5 Proof of Theorem 10.17

The proof of Theorem 10.17 is divided into two steps.
Step (I): We let
\[
  Y_k = \sum_{i=1}^N b^{ik}(x)\,\frac{\partial}{\partial x_i} \quad\text{for } 1 \le k \le r;
  \qquad
  Y_0 = \sum_{i=1}^N c^i(x)\,\frac{\partial}{\partial x_i}.
\]
Then we have the formula
\[
  A = \sum_{k=1}^r Y_k^2 + Y_0
  = \sum_{i,j=1}^N a^{ij}(x)\,\frac{\partial^2}{\partial x_i\partial x_j}
  + \sum_{i=1}^N b^i(x)\,\frac{\partial}{\partial x_i},
\]
where
\[
  a^{ij}(x) = \sum_{k=1}^r b^{ik}(x)\,b^{jk}(x),
  \tag{10.123a}
\]
\[
  b^i(x) = c^i(x) + \sum_{k=1}^r \sum_{j=1}^N b^{jk}(x)\,\frac{\partial b^{ik}}{\partial x_j}.
  \tag{10.123b}
\]
Thus it follows that the vector fields Y_k are subunit for the operator


\[
  A^0 = \sum_{i,j=1}^N a^{ij}(x)\,\frac{\partial^2}{\partial x_i\partial x_j},
\]

so that Hill’s diffusion trajectories (integral curves of Yk ) are subunit trajectories. Moreover, it follows from formulas (10.123a) and (10.123b) that the vector field Y0 is expressed as follows: Y0 =

N  i=1

ci (x)

(10.124)

⎞ ik ∂b ⎝bi (x) − ⎠ ∂ b jk (x) = ∂x ∂xi j i=1 k=1 j=1 ⎛ ⎞ ⎛ ⎞  N N r N N ij jk      ∂ ∂a ∂b ∂ i ik ⎝b (x) − ⎠ ⎝ ⎠ = + b (x) ∂x j ∂xi ∂x j ∂xi i=1 j=1 k=1 j=1 i=1 ⎛ ⎞ r N   ∂b jk ⎠ ⎝ = X0 + Yk . ∂x j k=1 j=1 N 



∂ ∂xi

r  N 
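The identities (10.123a) and (10.123b) amount to expanding Σ Y_k² + Y₀ and collecting coefficients, which can be verified symbolically. The sketch below (assuming sympy, with a hypothetical two-dimensional example with a single square, r = 1) compares both sides applied to a generic function u.

```python
import sympy as sp

x1, x2 = sp.symbols("x1 x2")
u = sp.Function("u")(x1, x2)
X = (x1, x2)

b = [x2, x1]          # hypothetical coefficients b^{i1} of Y_1
c = [1, 0]            # hypothetical coefficients c^i of Y_0

def Y1(f):
    return sum(b[i] * sp.diff(f, X[i]) for i in range(2))

def Y0(f):
    return sum(c[i] * sp.diff(f, X[i]) for i in range(2))

lhs = sp.expand(Y1(Y1(u)) + Y0(u))

# coefficients predicted by (10.123a) and (10.123b)
a_up = [[b[i] * b[j] for j in range(2)] for i in range(2)]
bi = [c[i] + sum(b[j] * sp.diff(b[i], X[j]) for j in range(2)) for i in range(2)]

rhs = sp.expand(
    sum(a_up[i][j] * sp.diff(u, X[i], X[j]) for i in range(2) for j in range(2))
    + sum(bi[i] * sp.diff(u, X[i]) for i in range(2)))

print(sp.simplify(lhs - rhs))
```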

Therefore, in view of Theorem 10.27, we obtain that Hill's drift trajectories (integral curves of Y_0) can be approximated uniformly by piecewise differentiable curves, of which each differentiable arc is a subunit or drift trajectory.
Summing up, we have proved that any finite number of Hill's diffusion and drift trajectories can be approximated uniformly by a finite number of subunit and drift trajectories.
Step (II): To prove the converse, we remark that, by Proposition 10.34, the subunit trajectories can be replaced by integral curves of the vector fields
\[
  X_i = \sum_{j=1}^N a^{ij}(x)\,\frac{\partial}{\partial x_j} \quad\text{for } 1 \le i \le N.
\]
Moreover, it follows from formulas (10.123a) and (10.123b) that the vector fields X_i are expressed as follows:
\[
  X_i = \sum_{j=1}^N \left(\sum_{k=1}^r b^{ik}(x)\,b^{jk}(x)\right)\frac{\partial}{\partial x_j}
  = \sum_{k=1}^r b^{ik}(x)\left(\sum_{j=1}^N b^{jk}(x)\,\frac{\partial}{\partial x_j}\right)
  = \sum_{k=1}^r b^{ik}(x)\,Y_k \quad\text{for } 1 \le i \le N.
\]


Hence, by using Theorem 10.27 we find that the subunit trajectories can be approximated uniformly by piecewise differentiable curves, of which each differentiable arc is one of Hill's diffusion trajectories. Indeed, it suffices to note that if β̇(t) = Y_k(β(t)) on [t_1, t_2] and if β̇(t_0) = 0 for some t_0 ∈ [t_1, t_2], then, by the uniqueness property, we have β(t) ≡ β(t_0) on this interval, so that the trace of the integral curve of Y_k is unchanged when this arc is dropped. This implies that the condition β̇(t) ≠ 0 on [t_1, t_2] may be assumed. Moreover, by virtue of Theorem 10.27 it follows from formula (10.124) that the drift trajectories (integral curves of X_0) can be approximated uniformly by piecewise differentiable curves, of which each differentiable arc is one of Hill's diffusion or drift trajectories. Therefore, we conclude that any finite number of subunit and drift trajectories can be approximated uniformly by a finite number of Hill's diffusion and drift trajectories. The proof of Theorem 10.17 is now complete. □

10.4 Notes and Comments

The maximum principles in this chapter are adapted from Oleĭnik–Radkevič [138] and Gilbarg–Trudinger [74]. We make use of a modification of the technique originally introduced by Hopf [82] for elliptic operators and later adapted by Nirenberg [133] and Friedman [65] for parabolic operators, in such a way as to make it accessible to graduate students and advanced undergraduates. For a general study of maximum principles, the reader is referred to Protter–Weinberger [146] and López-Gómez [117, Chaps. 1 and 7]. The Hopf boundary point lemma (Lemma 10.12) was proved independently by Hopf [83] and Oleĭnik [137]. Theorem 10.14 is inspired by the work of Fefferman–Phong [57]. Our proof of Theorem 10.14 follows Bony [21] and Amano [13]; see also Redheffer [148]. As mentioned in the text, the virtue of this theorem is that the notion of a subunit trajectory is coordinate-free. It seems quite likely that there is an intimate connection between propagation of maxima and propagation of singularities for degenerate elliptic differential operators of second order (cf. Hörmander [86], Fefferman–Phong [57, 190]). For concrete examples, see Fujiwara–Omori [70, Theorem] and Taira [206, Theorems 4.1 and 5.1]. The maximum principle may be applied to questions of uniqueness for degenerate elliptic boundary value problems. Furthermore, the mechanism of propagation of maxima plays an important role in the interpretation and study of Markov processes from the viewpoint of functional analysis (see [202, 204, 205, 207]).

Part IV
L² Theory of Elliptic Boundary Value Problems

Chapter 11
Elliptic Boundary Value Problems

This chapter is devoted to general boundary value problems for second order elliptic differential operators in the framework of L² Sobolev spaces. In Sect. 11.1 we begin with a summary of the basic facts about existence, uniqueness and regularity of solutions of the Dirichlet problem (D) in the framework of Hölder spaces (Theorem 11.1). In Sect. 11.2, by using the calculus of pseudo-differential operators developed in Sects. 9.8–9.10, we prove an existence, uniqueness and regularity theorem for the Dirichlet problem (D) in the framework of L² Sobolev spaces (Theorem 11.6). In Sect. 11.3 we formulate a general boundary value problem, and show that such problems can be reduced to the study of pseudo-differential operators on the boundary (Proposition 11.9). The virtue of this reduction is that there is no difficulty in taking adjoints after restricting attention to the boundary, whereas boundary value problems in general do not have adjoints. This allows us to discuss the existence theory more easily (Theorems 11.10–11.18). In Sect. 11.4 we study the basic questions of existence and uniqueness of solutions of a general boundary value problem with a spectral parameter α. We prove two existence and uniqueness theorems for this boundary value problem in the framework of Sobolev spaces when α → +∞ (Theorem 11.19 and Corollary 11.20). For this purpose, we make use of a method essentially due to Agmon and Nirenberg (see Agmon [3], Agmon–Nirenberg [5], Lions–Magenes [116]). This is a technique of treating the spectral parameter α as a second order elliptic differential operator of an extra variable and relating the old problem to a new one with the additional variable. Our presentation of this technique is due to Fujiwara [69]. Theorem 11.19 plays a fundamental role in constructing Feller semigroups (Markov processes) in Chap. 13.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 K. Taira, Functional Analytic Techniques for Diffusion Processes, Springer Monographs in Mathematics, https://doi.org/10.1007/978-981-19-1099-9_11


11.1 The Dirichlet Problem in the Framework of Hölder Spaces

In this section, following Gilbarg–Trudinger [74] we state the classical existence, uniqueness and regularity theorems for the Dirichlet problem in the framework of Hölder spaces. Let Ω be a bounded domain in R^n with boundary ∂Ω. We let
\[
  A = \sum_{i,j=1}^n a^{ij}(x)\,\frac{\partial^2}{\partial x_i\partial x_j}
  + \sum_{i=1}^n b^i(x)\,\frac{\partial}{\partial x_i} + c(x)
  \tag{11.1}
\]
be a second order strictly elliptic differential operator with real coefficients such that:
(1) a^{ij} ∈ C^θ(Ω̄) with 0 < θ < 1, a^{ij}(x) = a^{ji}(x) for all x ∈ Ω̄ and all 1 ≤ i, j ≤ n, and there exists a constant a_0 > 0 such that
\[
  \sum_{i,j=1}^n a^{ij}(x)\,\xi_i\xi_j \ge a_0|\xi|^2
  \quad\text{for all } (x, \xi) \in T^*(\overline{\Omega}) = \overline{\Omega}\times\mathbf{R}^n,
  \tag{11.2}
\]
where T^*(Ω̄) is the cotangent bundle of Ω̄.
(2) b^i ∈ C^θ(Ω̄) for 1 ≤ i ≤ n.
(3) c ∈ C^θ(Ω̄) and c(x) ≤ 0 in Ω.
We are interested in the following Dirichlet problem: Given functions f(x) and ϕ(x′) defined in Ω and on ∂Ω respectively, find a function u(x) in Ω such that
\[
  \begin{cases}
    Au = f & \text{in } \Omega, \\
    u = \varphi & \text{on } \partial\Omega .
  \end{cases}
  \tag{D}
\]

The next theorem summarizes the basic facts about the Dirichlet problem in the framework of Hölder spaces (see [74, Theorems 6.24, 6.17 and 6.19]):

Theorem 11.1 Let the differential operator A of the form (11.1) satisfy the strict ellipticity condition (11.2). Then we have the following three assertions:
(i) (Existence and Uniqueness) Assume that the domain Ω is of class C². If f ∈ C^θ(Ω) and ϕ ∈ C(∂Ω), then problem (D) has a unique solution u in C(Ω̄) ∩ C^{2+θ}(Ω).
(ii) (Interior Regularity) Assume that the functions a^{ij}(x), b^i(x) and c(x) belong to C^{k+θ}(Ω) for some non-negative integer k. If u ∈ C²(Ω) and Au = f ∈ C^{k+θ}(Ω), then we have u ∈ C^{k+2+θ}(Ω).
(iii) (Global Regularity) Assume that the domain Ω is of class C^{k+2+θ} and that the functions a^{ij}(x), b^i(x) and c(x) belong to C^{k+θ}(Ω̄) for some non-negative integer k. If f ∈ C^{k+θ}(Ω̄) and ϕ ∈ C^{k+2+θ}(∂Ω), then a solution u ∈ C(Ω̄) ∩ C²(Ω) of problem (D) belongs to C^{k+2+θ}(Ω̄).


In Appendix A, following Gilbarg–Trudinger [74] we present a brief introduction to the potential theoretic approach to the Dirichlet problem for Poisson's equation, that is, the case A = Δ.

11.2 The Dirichlet Problem in the Framework of L² Sobolev Spaces

In this section, by using the theory of pseudo-differential operators we consider the Dirichlet problem in the framework of L² Sobolev spaces. This is a modern version of the classical potential approach to the Dirichlet problem.
Let Ω be a bounded domain in R^n with smooth boundary ∂Ω. Its closure Ω̄ = Ω ∪ ∂Ω is an n-dimensional, compact smooth manifold with boundary. By virtue of Theorems 4.19 and 4.20, we may assume that (see Figs. 11.1 and 11.2 below):
(a) The domain Ω is a relatively compact open subset of an n-dimensional, compact smooth manifold M without boundary.
(b) In a neighborhood W of ∂Ω in M a normal coordinate t is chosen so that the points of W are represented as (x′, t) with x′ ∈ ∂Ω and −1 < t < 1; t > 0 in Ω, t < 0 in M \ Ω̄ and t = 0 only on ∂Ω.
(c) The manifold M is equipped with a strictly positive density μ which, on W, is the product of a strictly positive density ω on ∂Ω and the Lebesgue measure dt on (−1, 1). This manifold M = Ω̂ is called the double of Ω.
We let
\[
  A = \sum_{i,j=1}^n a^{ij}(x)\,\frac{\partial^2}{\partial x_i\partial x_j}
  + \sum_{i=1}^n b^i(x)\,\frac{\partial}{\partial x_i} + c(x)
  \tag{11.3}
\]
be a second order strictly elliptic differential operator with real coefficients on the double M = Ω̂ of Ω such that:

Fig. 11.1 The domain Ω and the double M = Ω̂


Fig. 11.2 The boundary ∂Ω and the tubular neighborhood W

(1) The a^{ij} are the components of a C^∞ symmetric contravariant tensor of type (2, 0) on M and there exists a constant a_0 > 0 such that
\[
  \sum_{i,j=1}^n a^{ij}(x)\,\xi_i\xi_j \ge a_0|\xi|^2 \quad\text{for all } (x, \xi) \in T^*(M),
  \tag{11.4}
\]
where T^*(M) is the cotangent bundle of M.
(2) b^i ∈ C^∞(M) for 1 ≤ i ≤ n.
(3) c ∈ C^∞(M) and c(x) ≤ 0 in M.
Furthermore, for simplicity, we assume that:
\[
  \text{The function } (A1)(x) = c(x) \text{ does not vanish identically on } M.
  \tag{11.5}
\]

First, we construct a volume potential for A, which plays the same role for A as the Newtonian potential plays for the Laplacian Δ:

Theorem 11.2 Let the differential operator A of the form (11.3) satisfy the strict ellipticity condition (11.4) and condition (11.5). Then we have the following two assertions:
(i) The operator A : C^∞(M) → C^∞(M) is bijective, and its inverse Q is an elliptic operator in L^{−2}_{cl}(M).
(ii) The operators A and Q extend respectively to isomorphisms for each s ∈ R, which are still inverses of each other:
\[
  A : H^s(M) \longrightarrow H^{s-2}(M),
  \qquad
  Q : H^{s-2}(M) \longrightarrow H^s(M).
\]

Proof We apply Theorem 9.39 to the strictly elliptic differential operator A. Since A is elliptic on M, by applying Theorem 10.19 to our situation we obtain that
\[
  N(A) = \{u \in C^\infty(M) : Au = 0 \text{ in } M\} = \{\text{constant functions}\}.
\]
In view of hypothesis (11.5), this implies that N(A) = {0}. On the other hand, since the principal symbol −Σ_{i,j} a^{ij}(x)ξ_iξ_j of A is real, it follows from Corollary 9.37 that ind A = 0. Therefore, Theorem 11.2 follows from an application of Theorem 9.39. The proof of Theorem 11.2 is complete. □
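On a model closed manifold the inverse Q can be written down explicitly as a Fourier multiplier. The following sketch (a hypothetical flat analogue on M = S¹, not the text's construction) takes A = d²/dθ² − 1, which satisfies c ≡ −1 < 0, inverts it in Fourier space as an operator of order −2, and checks Au = f in physical space.

```python
import numpy as np

n = 256
h = 2 * np.pi / n
theta = h * np.arange(n)
k = np.fft.fftfreq(n, d=1.0 / n)          # integer frequencies

f = np.exp(np.cos(theta))                 # arbitrary smooth datum
u = np.fft.ifft(np.fft.fft(f) / (-k ** 2 - 1.0)).real   # u = Qf

# check Au = f with a periodic second difference
Au = (np.roll(u, -1) - 2 * u + np.roll(u, 1)) / h ** 2 - u
err = np.max(np.abs(Au - f))
print(err)   # O(h^2) finite-difference error
```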



Next we construct a surface potential for A, which is a modern version of the classical Poisson kernel for the Laplacian Δ. We let
\[
  Kv := \gamma_0\bigl(Q(v \otimes \delta)\bigr) \quad\text{for } v \in C^\infty(\partial\Omega),
\]
where v ⊗ δ is a distribution on M defined by the formula
\[
  \langle v \otimes \delta, \varphi\cdot\mu\rangle = \langle v, \varphi(\cdot, 0)\cdot\omega\rangle
  \quad\text{for all } \varphi \in C^\infty(M).
\]
In view of part (ii) of Theorem 9.48, it follows that the operator K is in L^{−1}_{cl}(∂Ω) and maps C^∞(∂Ω) continuously into itself. Furthermore, we have the following theorem for the operator K:

v ⊗ δ, ϕ · μ = v, ϕ(·, 0) · ω for all ϕ ∈ C ∞ (M). In view of part (ii) of Theorem 9.48, it follows that the operator K is in L −1 cl (∂Ω) and maps C ∞ (∂Ω) continuously into itself. Furthermore, we have the following theorem for the operator K : Theorem 11.3 Let the differential operator A of the form (11.3) satisfy the strict ellipticity condition (11.4) and condition (11.5). Then we have the following three assertions: (i) The operator K is an elliptic operator in L −1 cl (∂Ω). (ii) The operator K : C ∞ (∂Ω) → C ∞ (∂Ω) is bijective, and its inverse L is an elliptic operator in L 1cl (∂Ω). (iii) The operators K and L extend respectively to isomorphisms for each s ∈ R, which are still inverses of each other: ⎧ K ⎨ H s (∂Ω) −→ H s+1 (∂Ω), s ⎩ H (∂Ω) ←− H s+1 (∂Ω). L

Proof (i) We calculate the homogeneous principal symbol of K ∈ L^{−1}_{cl}(∂Ω). In a tubular neighborhood W of ∂Ω in M, we can write the operator A = A(x, D) uniquely in the form
\[
  A(x, D) = A_2(x)\,D_t^2 + A_1(x, D_{x'})\,D_t + A_0(x, D_{x'}) \quad\text{for } x = (x', t),
  \tag{11.6}
\]
where A_j(x, D_{x'}) (j = 0, 1, 2) is a differential operator of order 2 − j acting along the surfaces parallel to ∂Ω. We denote by a_1(x, ξ′) and a_0(x, ξ′) the principal symbols of A_1(x, D_{x'}) and A_0(x, D_{x'}), respectively. Since A is elliptic on M, it follows from formula (11.6) that:
(1) A_2(x) < 0 for x ∈ W.
(2) a_1(x, ξ′)² − 4A_2(x)a_0(x, ξ′) < 0 for x = (x′, t) ∈ W and ξ′ ∈ T*_{x′}(∂Ω) \ {0}.
Hence the principal symbol of A can be decomposed as follows:
\[
  A_2(x)\xi_n^2 + a_1(x, \xi')\xi_n + a_0(x, \xi')
  = A_2(x)\bigl(\xi_n - \xi_n^+(x, \xi')\bigr)\bigl(\xi_n - \xi_n^-(x, \xi')\bigr),
\]
where
\[
  \xi_n^\pm(x, \xi') = \frac{-a_1(x, \xi') \mp i\bigl(4A_2(x)a_0(x, \xi') - a_1(x, \xi')^2\bigr)^{1/2}}{2A_2(x)}.
\]
Since the principal symbol of Q = A^{−1} is
\[
  \frac{1}{A_2(x)\bigl(\xi_n - \xi_n^+(x, \xi')\bigr)\bigl(\xi_n - \xi_n^-(x, \xi')\bigr)},
\]
by applying formula (9.34) to our situation we obtain that the homogeneous principal symbol k(x′, ξ′) of K is given by the following formula:
\[
  k(x', \xi') = \frac{1}{2\pi}\int \frac{d\xi_n}{A_2(x', 0)\bigl(\xi_n - \xi_n^+(x', 0, \xi')\bigr)\bigl(\xi_n - \xi_n^-(x', 0, \xi')\bigr)}
  = -\frac{1}{\bigl(4A_2(x', 0)a_0(x', 0, \xi') - a_1(x', 0, \xi')^2\bigr)^{1/2}}.
  \tag{11.7}
\]
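As a quick consistency check of formula (11.7) (not in the original text), take the flat model where A is the Laplacian near the boundary: in the coordinates (x′, t) one has Δ = −D_t² − |D_{x′}|², so A₂ = −1, a₁ = 0 and a₀(x, ξ′) = −|ξ′|², and (11.7) gives

```latex
k(x', \xi') = -\frac{1}{\bigl(4\cdot(-1)\cdot(-|\xi'|^{2}) - 0\bigr)^{1/2}}
            = -\frac{1}{2|\xi'|},
```

which is homogeneous of degree −1 and nowhere vanishing for ξ′ ≠ 0 — a constant multiple of the classical symbol of the single layer potential trace on ∂Ω, in agreement with assertion (i).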

This proves that K ∈ L^{−1}_{cl}(∂Ω) is elliptic.
(ii) We apply Theorem 9.39 to the operator K. First, since the homogeneous principal symbol k(x′, ξ′) of K is real, we obtain from Corollary 9.37 that ind K = 0. Now we show that
\[
  N(K) = \{v \in C^\infty(\partial\Omega) : Kv = 0\} = \{0\};
\]
then part (ii) of Theorem 11.3 follows from an application of Theorem 9.39. Assume that v ∈ C^∞(∂Ω) and Kv = 0. Then, by applying part (i) of Theorem 9.48 to our situation we obtain that
\[
  Q(v \otimes \delta)\big|_\Omega \in C^\infty(\overline{\Omega}),
  \tag{11.8}
\]
\[
  Q(v \otimes \delta)\big|_{M\setminus\overline{\Omega}} \in C^\infty(\overline{M\setminus\Omega}),
  \tag{11.9}
\]


and also
\[
  Q(v \otimes \delta)\big|_{\partial\Omega} = Kv = 0.
  \tag{11.10}
\]
However, we have the formula
\[
  A\bigl(Q(v \otimes \delta)|_\Omega\bigr) = AQ(v \otimes \delta)\big|_\Omega = (v \otimes \delta)\big|_\Omega = 0 \quad\text{in } \Omega,
  \tag{11.11}
\]
since A is a differential (hence local) operator. Therefore, in view of assertions (11.8), (11.11) and (11.10) we can apply the maximum principle (Corollary 10.10) to obtain that
\[
  Q(v \otimes \delta) = 0 \quad\text{on } \overline{\Omega}.
  \tag{11.12}
\]
This gives that
\[
  Q(v \otimes \delta) = \bigl(Q(v \otimes \delta)|_{M\setminus\overline{\Omega}}\bigr)^0 .
\]

Thus it follows from an application of the jump formula (8.14) that
\[
  v \otimes \delta = AQ(v \otimes \delta)
  = A\Bigl(\bigl(Q(v \otimes \delta)|_{M\setminus\overline{\Omega}}\bigr)^0\Bigr)
  \tag{11.13}
\]
\[
  = \bigl(AQ(v \otimes \delta)|_{M\setminus\overline{\Omega}}\bigr)^0
  + \frac{1}{i}\,A_2(x)\bigl(D_tQ(v \otimes \delta)|_{\partial'\Omega}\bigr) \otimes \delta
  + \frac{1}{i}\,A_2(x)\bigl(Q(v \otimes \delta)|_{\partial'\Omega}\bigr) \otimes D_t\delta
  + \frac{1}{i}\,A_1(x, D_{x'})\bigl(Q(v \otimes \delta)|_{\partial'\Omega}\bigr) \otimes \delta
\]
\[
  = \frac{1}{i}\bigl\{A_2(x)\bigl(D_tQ(v \otimes \delta)|_{\partial'\Omega}\bigr) + A_1(x, D_{x'})\bigl(Q(v \otimes \delta)|_{\partial'\Omega}\bigr)\bigr\} \otimes \delta
  + \frac{1}{i}\,A_2(x)\bigl(Q(v \otimes \delta)|_{\partial'\Omega}\bigr) \otimes D_t\delta,
  \qquad i = \sqrt{-1},
\]
where the first term vanishes since (v ⊗ δ)|_{M\Ω̄} = 0, and
\[
  u|_{\partial'\Omega} := \text{the trace of } u \text{ on } \partial\Omega \text{ from the outside } M \setminus \overline{\Omega}.
\]
In order that formula (11.13) may hold true, the last term on the right-hand side must vanish; hence we have the formula
\[
  Q(v \otimes \delta)\big|_{\partial'\Omega} = 0,
  \tag{11.14}
\]

since A_2(x) < 0 in W. However, we have the formula
\[
  A\bigl(Q(v \otimes \delta)|_{M\setminus\overline{\Omega}}\bigr)
  = AQ(v \otimes \delta)\big|_{M\setminus\overline{\Omega}}
  = (v \otimes \delta)\big|_{M\setminus\overline{\Omega}} = 0 .
  \tag{11.15}
\]
Therefore, in view of assertions (11.9), (11.15) and (11.14) we can apply the maximum principle (Corollary 10.10) to obtain that
\[
  Q(v \otimes \delta) = 0 \quad\text{on } \overline{M\setminus\Omega}.
  \tag{11.16}
\]
Consequently it follows from assertions (11.12) and (11.16) that
\[
  Q(v \otimes \delta) = 0 \quad\text{on } M.
\]
Since the operator Q is invertible, this implies that v ⊗ δ = 0 on M, so that v = 0 on ∂Ω.
Now the proof of Theorem 11.3 is complete. □

The next uniqueness theorem for the Dirichlet problem will play a fundamental role in the sequel:

Theorem 11.4 Let s ∈ R. If u ∈ H^s(Ω) satisfies the conditions
\[
  \begin{cases}
    Au = 0 & \text{in } \Omega, \\
    \gamma_0 u = 0 & \text{on } \partial\Omega,
  \end{cases}
  \tag{11.17}
\]
then it follows that u = 0 in Ω.

Proof Since u ∈ H^s(Ω) and Au = 0 in Ω, by applying Theorem 8.27 we find that the distribution u has sectional traces γ_ju of any order j = 0, 1, 2, …, and γ_ju ∈ H^{s−j−1/2}(∂Ω). In a neighborhood W of ∂Ω in M, we can write the operator A = A(x, D) uniquely in the form (11.6):
\[
  A(x, D) = A_2(x)\,D_t^2 + A_1(x, D_{x'})\,D_t + A_0(x, D_{x'}).
\]
Then it follows from an application of the jump formula (8.14) that
\[
  A(u^0) = (Au)^0 + \frac{1}{i}A_2(\gamma_1u \otimes \delta) + \frac{1}{i}A_2(\gamma_0u \otimes D_t\delta) + \frac{1}{i}A_1(\gamma_0u \otimes \delta)
  = \frac{1}{i}A_2(\gamma_1u \otimes \delta),
\]
since Au = 0 in Ω and γ_0u = 0. By Theorem 11.2, this gives that
\[
  u^0 = \frac{1}{i}\,Q\bigl(A_2(\gamma_1u \otimes \delta)\bigr),
\]
so that
\[
  u = \frac{1}{i}\,Q\bigl(A_2(\gamma_1u \otimes \delta)\bigr)\Big|_\Omega .
  \tag{11.18}
\]
In other words, every solution u ∈ H^s(Ω) of the Dirichlet problem (11.17) can be expressed in the form (11.18). Thus we have the formula
\[
  0 = \gamma_0u = \frac{1}{i}\,K\bigl((A_2|_{\partial\Omega})\cdot\gamma_1u\bigr),
\]
and hence γ_1u = 0, since the operator K is invertible and A_2 < 0 on ∂Ω. Therefore, it follows from formula (11.18) that u = 0 in Ω.

The proof of Theorem 11.4 is complete. □

We let
\[
  P\varphi := Q(L\varphi \otimes \delta)\big|_\Omega \quad\text{for } \varphi \in C^\infty(\partial\Omega).
  \tag{11.19}
\]
By virtue of Theorems 11.3 and 9.48, it follows that P maps C^∞(∂Ω) continuously into C^∞(Ω̄), and it extends to a continuous linear operator
\[
  P : H^{s-1/2}(\partial\Omega) \longrightarrow H^s(\Omega) \quad\text{for each } s \in \mathbf{R}.
  \tag{11.20}
\]
Furthermore, we have, for all ϕ ∈ H^{s−1/2}(∂Ω),
\[
  \begin{cases}
    A(P\varphi) = AQ(L\varphi \otimes \delta)\big|_\Omega = (L\varphi \otimes \delta)\big|_\Omega = 0 & \text{in } \Omega, \\
    \gamma_0(P\varphi) = KL\varphi = \varphi & \text{on } \partial\Omega .
  \end{cases}
  \tag{11.21}
\]
The operator P is called the Poisson operator. We let
\[
  N(A, s) := \{u \in H^s(\Omega) : Au = 0 \text{ in } \Omega\}.
\]
Since the injection H^s(Ω) → D′(Ω) is continuous, it follows that N(A, s) is a closed subspace of H^s(Ω); hence it is a Hilbert space. Then we have the following theorem:

Theorem 11.5 The Poisson operator P maps H^{s−1/2}(∂Ω) isomorphically onto N(A, s) for each s ∈ R, and its inverse is the trace operator γ_0:
\[
  P : H^{s-1/2}(\partial\Omega) \longrightarrow N(A, s),
  \qquad
  \gamma_0 : N(A, s) \longrightarrow H^{s-1/2}(\partial\Omega).
\]


Proof In view of assertions (11.20) and (11.21), it suffices to prove the surjectivity of P. Let w be an arbitrary element of N(A, s), and let
\[
  u := P(\gamma_0 w).
\]
Then it follows from an application of Theorem 8.27 that γ_0w ∈ H^{s−1/2}(∂Ω); hence we have, by assertion (11.20), u ∈ H^s(Ω). Moreover, by formulas (11.21) it follows that
\[
  \begin{cases}
    A(w - u) = 0 & \text{in } \Omega, \\
    \gamma_0(w - u) = 0 & \text{on } \partial\Omega .
  \end{cases}
\]
Therefore, by applying Theorem 11.4 we obtain that w − u = 0 in Ω, that is,
\[
  w = u = P(\gamma_0 w).
\]
This proves the surjectivity of P, and also P^{−1} = γ_0. The proof of Theorem 11.5 is complete. □



By combining Theorems 11.2 and 11.5, we can obtain the following theorem:

Theorem 11.6 Let s ≥ 2. The Dirichlet problem
\[
  \begin{cases}
    Au = f & \text{in } \Omega, \\
    \gamma_0 u = \varphi & \text{on } \partial\Omega
  \end{cases}
  \tag{D}
\]
has a unique solution u in H^s(Ω) for any f ∈ H^{s−2}(Ω) and any ϕ ∈ H^{s−1/2}(∂Ω).

Indeed, it suffices to note that the unique solution u of the Dirichlet problem (D) is given by the following formula:
\[
  u = v + w := (QEf)\big|_\Omega + P\bigl(\varphi - \gamma_0(QEf)\bigr),
  \tag{11.22}
\]
where
\[
  v = (QEf)\big|_\Omega ,
  \tag{11.23a}
\]
\[
  w = P\bigl(\varphi - \gamma_0(QEf)\bigr).
  \tag{11.23b}
\]
Here E : H^{s−2}(Ω) → H^{s−2}(M) is the Seeley extension operator (see Theorem 7.45).
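The splitting (11.22) — a volume-potential term v that solves Au = f with no regard for the boundary condition, plus a Poisson-operator term w that is A-harmonic and corrects the boundary values — can be illustrated by a one-dimensional finite-difference analogue with A = d²/dx² (a hypothetical numerical sketch, not the pseudo-differential construction of the text).

```python
import numpy as np

# Solve u'' = f on (0, 1), u(0) = alpha, u(1) = beta, by the splitting of (11.22):
#   v : a solution of v'' = f (here normalized to vanish at the boundary),
#   w : harmonic (w'' = 0) with boundary values alpha - v(0), beta - v(1).
n = 200
x = np.linspace(0.0, 1.0, n + 1)
h = 1.0 / n
f = np.sin(np.pi * x)
alpha, beta = 2.0, -1.0

# second-difference operator on the interior nodes
T = (np.diag(-2.0 * np.ones(n - 1))
     + np.diag(np.ones(n - 2), 1)
     + np.diag(np.ones(n - 2), -1)) / h ** 2

# v: volume-potential analogue
v = np.zeros(n + 1)
v[1:-1] = np.linalg.solve(T, f[1:-1])

# w: Poisson-operator analogue (linear, hence harmonic in 1D)
w = (alpha - v[0]) * (1.0 - x) + (beta - v[-1]) * x

u = v + w
exact = -np.sin(np.pi * x) / np.pi ** 2 + alpha + (beta - alpha) * x
print(np.max(np.abs(u - exact)))   # O(h^2) discretization error
```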


11.3 General Boundary Value Problems

In this section, by using the Dirichlet problem we consider general boundary value problems for elliptic differential operators in the framework of Sobolev spaces.

11.3.1 Formulation of Boundary Value Problems

Let Ω be a bounded domain in R^n with smooth boundary ∂Ω, and let
\[
  A = \sum_{i,j=1}^n a^{ij}(x)\,\frac{\partial^2}{\partial x_i\partial x_j}
  + \sum_{i=1}^n b^i(x)\,\frac{\partial}{\partial x_i} + c(x)
  \tag{11.3}
\]
be a second order strictly elliptic differential operator with real coefficients on the double M = Ω̂ of Ω that satisfies the conditions
\[
  \sum_{i,j=1}^n a^{ij}(x)\,\xi_i\xi_j \ge a_0|\xi|^2 \quad\text{for all } (x, \xi) \in T^*(M)
  \tag{11.4}
\]
and
\[
  \text{The function } (A1)(x) = c(x) \text{ does not vanish identically on } M.
  \tag{11.5}
\]

H Aσ,τ := u ∈ H σ (Ω) : Au ∈ H τ (Ω) . We equip the space H Aσ,τ with the inner product (u, v) H Aσ,τ = (u, v) H σ (Ω) + (Au, Av) H τ (Ω) and with the associated norm 1/2  . u H Aσ,τ = u2H σ (Ω) + Au2H τ (Ω) Then it is easy to see that H Aσ,τ is a Hilbert space. By virtue of formulas (11.22) and (11.23), we find that every element u ∈ H Aσ,τ can be decomposed as follows: u =v+w


where

    v = (QE(Au))|Ω ∈ H^{τ+2}(Ω),    (11.24a)
    w = u − v ∈ N(A, σ).    (11.24b)

Since the operators E : H^s(Ω) → H^s(M) and Q : H^s(M) → H^{s+2}(M) are both continuous, it follows that the decomposition (11.24) is continuous; more precisely, we have the inequalities

    ‖v‖_{H^{τ+2}(Ω)} ≤ C ‖Au‖_{H^τ(Ω)}    (11.25)

and

    ‖w‖_{H^σ(Ω)} ≤ C ‖u‖_{H_A^{σ,τ}}.    (11.26)

Here and in the following the letter C denotes a generic positive constant.

Now we take τ ≥ 0. Then it follows from Theorem 8.25 that the trace maps

    γi : H^{τ+2}(Ω) → H^{τ−i+3/2}(∂Ω) for i = 0, 1

are continuous:

    |γi v|_{H^{τ−i+3/2}(∂Ω)} ≤ C ‖v‖_{H^{τ+2}(Ω)} for v ∈ H^{τ+2}(Ω).    (11.27)

On the other hand, by applying Theorem 8.27 we obtain that the trace maps

    γj : N(A, σ) → H^{σ−j−1/2}(∂Ω) for j = 0, 1, 2, ...

are continuous for each σ ∈ R (see inequality (8.12)):

    |γj w|_{H^{σ−j−1/2}(∂Ω)} ≤ C ‖w‖_{H^σ(Ω)} for w ∈ N(A, σ).    (11.28)

Therefore, if u ∈ H_A^{σ,τ}, we can define its traces γi u by the formulas

    γi u := γi v + γi w for i = 0, 1,

and let

    γu := {γ0 u, γ1 u}.    (11.29)


Then we have the following proposition:

Proposition 11.7 If σ ≤ τ + 2 and τ ≥ 0, then the trace mapping

    γ : H_A^{σ,τ} → H^{σ−1/2}(∂Ω) × H^{σ−3/2}(∂Ω)

is continuous.

Proof It follows from inequalities (11.27) and (11.25) that

    |γi v|_{H^{σ−i−1/2}(∂Ω)} ≤ |γi v|_{H^{τ−i+3/2}(∂Ω)} ≤ C ‖v‖_{H^{τ+2}(Ω)} ≤ C ‖Au‖_{H^τ(Ω)} for i = 0, 1.    (11.30)

Furthermore, it follows from inequalities (11.28) and (11.26) that

    |γi w|_{H^{σ−i−1/2}(∂Ω)} ≤ C ‖w‖_{H^σ(Ω)} ≤ C ‖u‖_{H_A^{σ,τ}} for i = 0, 1.    (11.31)

In view of formulas (11.29), the continuity of γ follows from inequalities (11.30) and (11.31). The proof of Proposition 11.7 is complete. □

Let B_j (j = 0, 1) be a classical pseudo-differential operator of order m_j on ∂Ω, and define a boundary operator Bγ as follows:

    Bγ u := B0 γ0 u + B1 γ1 u for u ∈ H_A^{σ,τ}.    (11.32)

Then we have the following proposition:

Proposition 11.8 If σ ≤ τ + 2 and τ ≥ 0, then the mapping

    Bγ : H_A^{σ,τ} → H^{σ−m−1/2}(∂Ω)

is continuous. Here m = max(m0, m1 + 1).

Proposition 11.8 follows immediately from Proposition 11.7, since the operators

    B0 : H^{σ−1/2}(∂Ω) → H^{σ−m0−1/2}(∂Ω),
    B1 : H^{σ−3/2}(∂Ω) → H^{σ−m1−3/2}(∂Ω)

are both continuous.

Now we can formulate our boundary value problem for (A, B) as follows: Given functions f ∈ H^τ(Ω) and ϕ ∈ H^{τ−m+3/2}(∂Ω), find a function u ∈ H^σ(Ω) such that

    Au = f in Ω,  Bγ u = ϕ on ∂Ω.    (∗)


The boundary value problem (∗) is said to be elliptic (or coercive) if σ = τ + 2, while it is said to be subelliptic if τ + 1 < σ < τ + 2.

11.3.2 Reduction to the Boundary

In this subsection we show that the boundary value problem (∗) can be reduced to the study of a pseudo-differential operator on the boundary. Table 11.1 below gives a bird's-eye view of strong Markov processes and elliptic boundary value problems through the reduction to the boundary.

Assume that u ∈ H_A^{σ,τ}, σ ≤ τ + 2, τ ≥ 0, is a solution of the boundary value problem

    Au = f in Ω,  Bγ u = ϕ on ∂Ω.    (∗)

Then, by virtue of decomposition (11.24) of u, this is equivalent to saying that w ∈ H^σ(Ω) is a solution of the boundary value problem

    Aw = 0 in Ω,  Bγ w = ϕ − Bγ v on ∂Ω.    (11.33)

Here

    v = (QEf)|Ω ∈ H^{τ+2}(Ω),  w = u − v.

However, Theorem 11.5 tells us that the spaces N(A, σ) and H^{σ−1/2}(∂Ω) are isomorphic in such a way that

    γ0 : N(A, σ) → H^{σ−1/2}(∂Ω),  P : H^{σ−1/2}(∂Ω) → N(A, σ).

Table 11.1 A bird's-eye view of strong Markov processes and elliptic boundary value problems through the reduction to the boundary

    Field                        Probability                                Partial differential equations
    Analytic subject             Strong Markov processes on the domain      Elliptic boundary value problems
    Reduction to the boundary    Strong Markov processes on the boundary    Fredholm integral equations on the boundary


Therefore, we find that w ∈ H^σ(Ω) is a solution of the boundary value problem (11.33) if and only if ψ ∈ H^{σ−1/2}(∂Ω) is a solution of the equation

    Bγ(Pψ) = ϕ − Bγ v on ∂Ω.    (∗∗)

Here ψ = γ0 w, or equivalently, w = Pψ. Summing up, we have the following proposition:

Proposition 11.9 Let σ ≤ τ + 2 and τ ≥ 0. For functions f ∈ H^τ(Ω) and ϕ ∈ H^{τ−m+3/2}(∂Ω), there exists a solution u ∈ H_A^{σ,τ} of the boundary value problem (∗) if and only if there exists a solution ψ ∈ H^{σ−1/2}(∂Ω) of the equation (∗∗). Furthermore, the solutions u and ψ are related as follows:

    u = (QEf)|Ω + Pψ.

We remark that the pseudo-differential equation (∗∗) is a modern version of the classical Fredholm integral equation.

We let

    T : C∞(∂Ω) → C∞(∂Ω),  ϕ ↦ Bγ(Pϕ).    (11.34)

Then we have, by formula (11.32),

    T = B0 + B1 Π,    (11.35)

where the operator

    Π = γ1 P : C∞(∂Ω) → C∞(∂Ω)    (11.36)

is called the Dirichlet-to-Neumann operator. By applying part (ii) of Theorem 9.48, we find that Π is a classical, elliptic pseudo-differential operator of first order on ∂Ω. Indeed, we have, by formula (11.19),

    Π ϕ = ∂/∂n (Q(Lϕ ⊗ δ)) for ϕ ∈ C∞(∂Ω).

Since L = K⁻¹, we find from formula (11.7) that the principal symbol p1(x′, ξ′) of Π is given by the formula

    p1(x′, ξ′) = (1 / (2A₂(x′, 0))) ( −a₁(x′, 0, ξ′) i − ( 4A₂(x′, 0) a₀(x′, 0, ξ′) − a₁(x′, 0, ξ′)² )^{1/2} ).

For example, if the operator A is the usual Laplacian Δ, we can write down the complete symbol p(x′, ξ′) of Π as follows (see [71], [202, Sect. 10.7]):


    p(x′, ξ′) = −|ξ′| − (1/2) ( ω_{x′}(ξ, ξ)/|ξ′|² − (n − 1) M(x′) ) + (√−1/2) div δ(ξ′)(x′) + terms of order ≤ −1.

Here:
(a) |ξ′| is the length of the cotangent vector ξ′ with respect to the Riemannian metric of ∂Ω induced by the natural metric of R^n.
(b) M(x′) is the mean curvature of the boundary ∂Ω at x′.
(c) ω_{x′}(ξ, ξ) is the second fundamental form of ∂Ω at x′, while ξ ∈ T_{x′}(∂Ω) is the tangent vector corresponding to the cotangent vector ξ′ ∈ T*_{x′}(∂Ω) by the duality between T_{x′}(∂Ω) and T*_{x′}(∂Ω) with respect to the Riemannian metric (g_{ij}(x′)) of ∂Ω.
(d) div δ(ξ′) is the divergence of a real smooth vector field δ(ξ′) on ∂Ω defined (in terms of local coordinates) by the formula

    δ(ξ′) = Σ_{j=1}^{n−1} (∂|ξ′|/∂ξ_j) (∂/∂x_j) for ξ′ ≠ 0.
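The pseudo-differential nature of the Dirichlet-to-Neumann operator is transparent in the model case of the Laplacian on the unit disk: the harmonic extension of e^{inθ} is r^{|n|} e^{inθ}, so the radial derivative on the boundary is |n| e^{inθ}, and the operator acts as the Fourier multiplier |n| — a first order operator, matching the principal symbol −|ξ′| above up to the sign convention for the normal. A short numerical sketch of this (illustrative only, not taken from the text):

```python
import numpy as np

# Boundary data on the unit circle, sampled on an equispaced grid.
N = 256
theta = 2 * np.pi * np.arange(N) / N
phi = np.cos(3 * theta) + 0.5 * np.sin(7 * theta)

# The DtN operator of the disk as a Fourier multiplier: the n-th mode
# of the boundary data is multiplied by |n|.
c = np.fft.fft(phi)
freqs = np.fft.fftfreq(N, d=1.0 / N)            # integer frequencies n
dtn_spectral = np.real(np.fft.ifft(np.abs(freqs) * c))

# Separation of variables gives the same answer directly: the harmonic
# extension of cos(3t) + 0.5 sin(7t) is r^3 cos(3t) + 0.5 r^7 sin(7t),
# whose radial derivative at r = 1 is 3 cos(3t) + 3.5 sin(7t).
dtn_exact = 3 * np.cos(3 * theta) + 3.5 * np.sin(7 * theta)
err = np.max(np.abs(dtn_spectral - dtn_exact))
```

The multiplier |n| grows linearly in the frequency, which is exactly what "classical pseudo-differential operator of first order" means in this simplest setting.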

Namely, if A = Δ, we have the formula p1(x′, ξ′) = −|ξ′|.

Therefore, it follows that the operator T is a classical pseudo-differential operator of order m on ∂Ω. Consequently, Proposition 11.9 asserts that the boundary value problem (∗) can be reduced to the study of the pseudo-differential operator T on the boundary ∂Ω. We shall formulate this fact more precisely in terms of functional analysis.

First, we remark that the operator T : C∞(∂Ω) → C∞(∂Ω) extends to a continuous linear operator

    T : H^s(∂Ω) → H^{s−m}(∂Ω) for each s ∈ R.

Then we have, by definition (11.34),

    T ϕ = Bγ(Pϕ) for ϕ ∈ H^{σ−1/2}(∂Ω),    (11.37)

since the Poisson kernel

    P : H^{σ−1/2}(∂Ω) → N(A, σ)

and the boundary operator

    Bγ : H_A^{σ,τ} → H^{σ−m−1/2}(∂Ω)

are both continuous.

We associate with the boundary value problem (∗) a linear operator

    A : H^σ(Ω) → H^τ(Ω) × H^{τ−m+3/2}(∂Ω)

as follows:
(a) The domain D(A) of A is the space

    D(A) = { u ∈ H_A^{σ,τ} : Bγ u ∈ H^{τ−m+3/2}(∂Ω) }.

(b) Au = {Au, Bγ u} for every u ∈ D(A).

Since the operators

    A : H_A^{σ,τ} → H^{σ−2}(Ω),
    Bγ : H_A^{σ,τ} → H^{σ−m−1/2}(∂Ω)

are both continuous, it follows that A is a closed operator. Furthermore, the operator A is densely defined, since D(A) contains C∞(Ω̄) and so is dense in H^σ(Ω).

Similarly, we associate with equation (∗∗) a linear operator

    T : H^{σ−1/2}(∂Ω) → H^{τ−m+3/2}(∂Ω)

as follows:
(α) The domain D(T) of T is the space

    D(T) = { ϕ ∈ H^{σ−1/2}(∂Ω) : T ϕ ∈ H^{τ−m+3/2}(∂Ω) }.

(β) T ϕ = T ϕ for every ϕ ∈ D(T).

Then the operator T is a densely defined, closed operator, since the operator T : H^{σ−1/2}(∂Ω) → H^{σ−m−1/2}(∂Ω) is continuous and since D(T) contains C∞(∂Ω).

In what follows, we prove the following:

(I) The null space N(A) of A has finite dimension if and only if the null space N(T) of T has finite dimension, and we have the formula

    dim N(A) = dim N(T).


(II) The range R(A) of A is closed if and only if the range R(T) of T is closed; and R(A) has finite codimension if and only if R(T) has finite codimension, and we have the formula

    codim R(A) = codim R(T).

(III) The operator A is a Fredholm operator if and only if the operator T is a Fredholm operator, and we have the formula

    ind A = ind T.

First, we prove the following theorem for the null spaces N(A) and N(T):

Theorem 11.10 (Null Spaces) The null spaces N(A) and N(T) are isomorphic; hence we have the formula

    dim N(A) = dim N(T).

Proof In view of assertion (11.37), it follows from Theorem 11.5 that the spaces N(A) and N(T) are isomorphic in such a way that

    γ0 : N(A) → N(T),  P : N(T) → N(A).

This proves the theorem. □

For the ranges R(A) and R(T), we have the following theorem:

Theorem 11.11 (Ranges) The following two conditions are equivalent:
(i) The range R(A) is closed in H^τ(Ω) × H^{τ−m+3/2}(∂Ω).
(ii) The range R(T) is closed in H^{τ−m+3/2}(∂Ω).

Proof (i) ⟹ (ii): Let ψ be an arbitrary element of the closure of the range R(T), and let {ϕ_j}_{j=1}^∞ be a sequence in the domain D(T) ⊂ H^{σ−1/2}(∂Ω) such that

    T ϕ_j → ψ in H^{τ−m+3/2}(∂Ω) as j → ∞.

Then, by letting w_j = Pϕ_j we obtain that

    Aw_j = {Aw_j, Bγ w_j} = {0, T ϕ_j} → {0, ψ} in H^τ(Ω) × H^{τ−m+3/2}(∂Ω) as j → ∞.


Thus it follows from condition (i) that there exists an element w ∈ D(A) ⊂ H_A^{σ,τ} such that Aw = {0, ψ}, that is,

    Aw = 0 in Ω,  Bγ w = ψ on ∂Ω.

However, Theorem 11.5 tells us that the distribution w can be written as w = Pϕ with ϕ = γ0 w ∈ H^{σ−1/2}(∂Ω). Hence we have the formula

    T ϕ = Bγ(Pϕ) = Bγ w = ψ ∈ H^{τ−m+3/2}(∂Ω).

This proves that ϕ ∈ D(T), so that ψ = T ϕ ∈ R(T).

(ii) ⟹ (i): Let {f, ϕ} be an arbitrary element of the closure of the range R(A), and let {u_j}_{j=1}^∞ be a sequence in the domain D(A) ⊂ H_A^{σ,τ} such that

    Au_j = {Au_j, Bγ u_j} → {f, ϕ} in H^τ(Ω) × H^{τ−m+3/2}(∂Ω) as j → ∞.

We decompose the u_j as in formula (11.24):

    u_j = v_j + w_j,  j = 1, 2, ...,

where

    v_j = (QE(Au_j))|Ω ∈ H^{τ+2}(Ω),  w_j = u_j − v_j ∈ N(A, σ).

Since the operators E : H^τ(Ω) → H^τ(M) and Q : H^τ(M) → H^{τ+2}(M) are both continuous, it follows that

    v_j = (QE(Au_j))|Ω → (QEf)|Ω in H^{τ+2}(Ω) as j → ∞.

Thus, by letting v = (QEf)|Ω, we obtain from Proposition 11.8 with σ = τ + 2 that

    Bγ v_j → Bγ v in H^{τ−m+3/2}(∂Ω) as j → ∞,

so that

    Bγ w_j = Bγ u_j − Bγ v_j → ϕ − Bγ v in H^{τ−m+3/2}(∂Ω) as j → ∞.

However, we have, by Theorem 11.5,

    w_j = Pϕ_j with ϕ_j = γ0 w_j ∈ H^{σ−1/2}(∂Ω),

and hence

    Bγ w_j = Bγ(Pϕ_j) = T ϕ_j ∈ R(T).

Therefore, it follows from condition (ii) that there exists an element ψ ∈ D(T) ⊂ H^{σ−1/2}(∂Ω) such that

    T ψ = ϕ − Bγ v.

We let

    u := v + Pψ.

Then we find that u ∈ H^σ(Ω), and further that

    Au = Av = A(QEf)|Ω = (Ef)|Ω = f ∈ H^τ(Ω),
    Bγ u = Bγ v + Bγ(Pψ) = Bγ v + T ψ = ϕ ∈ H^{τ−m+3/2}(∂Ω).

This proves that u ∈ D(A), so that

    {f, ϕ} = {Au, Bγ u} = Au ∈ R(A).

The proof of Theorem 11.11 is complete. □
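Statements (I)–(III) have a transparent finite-dimensional analogue: if two matrices are intertwined by invertible changes of coordinates — here playing the role of the trace map γ0 and the Poisson operator P — then their kernels, cokernels, and hence their indices agree. A toy numpy sketch (the matrices U, T, V below are of course hypothetical illustrations, not the operators of the text):

```python
import numpy as np

# T is a model "boundary" operator with dim ker = 1 and codim ran = 1.
T = np.array([[1.0, 0.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 0.0]])

# U, V play the role of the isomorphisms relating the two problems.
rng = np.random.default_rng(0)
U = 3.0 * np.eye(3) + rng.random((3, 3))   # diagonally dominant, hence invertible
V = 3.0 * np.eye(3) + rng.random((3, 3))
A = U @ T @ V                              # the "domain" operator

def fredholm_data(M):
    # dim ker, codim ran, and the index of a matrix, computed via its rank.
    r = np.linalg.matrix_rank(M)
    dim_ker = M.shape[1] - r
    codim_ran = M.shape[0] - r
    return dim_ker, codim_ran, dim_ker - codim_ran

same = fredholm_data(A) == fredholm_data(T) == (1, 1, 0)
```

In infinite dimensions the closedness of the range becomes a genuine issue, which is exactly why Theorem 11.11 has to be proved separately before the index comparison in Theorem 11.14.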



In order to study a relation between codim R(A) and codim R(T), we consider the transposes A′ and T′. By using the duality theorem (Theorem 8.18), we find that the Sobolev spaces H^t(Ω) and H_Ω^{−t}(M) are dual to each other for every t ∈ R. Hence it follows that the transpose

    A′ : H_Ω^{−τ}(M) × H^{−τ+m−3/2}(∂Ω) → H_Ω^{−σ}(M)

is a closed linear operator such that

    ⟨Au, {v, ψ}⟩ = ⟨u, A′{v, ψ}⟩ for all u ∈ D(A) and {v, ψ} ∈ D(A′).

On the other hand, since the Sobolev spaces H^s(∂Ω) and H^{−s}(∂Ω) are dual to each other for every s ∈ R, it follows that the transpose

    T′ : H^{−τ+m−3/2}(∂Ω) → H^{−σ+1/2}(∂Ω)

is a closed linear operator such that

    ⟨T ϕ, ψ⟩ = ⟨ϕ, T′ψ⟩ for all ϕ ∈ D(T) and ψ ∈ D(T′).

The situation can be visualized as in Figs. 11.3 and 11.4 below.

Then we have the following theorem for the null spaces N(A′) and N(T′):

Theorem 11.12 Assume that the ranges R(A) and R(T) are closed. Then the following two conditions are equivalent:
(i) The null space N(A′) has finite dimension.
(ii) The null space N(T′) has finite dimension.

Fig. 11.3 The closed operators A and A′:

    A : H^σ(Ω) → H^τ(Ω) × H^{τ−m+3/2}(∂Ω),
    A′ : H_Ω^{−τ}(M) × H^{−τ+m−3/2}(∂Ω) → H_Ω^{−σ}(M).

Fig. 11.4 The closed operators T and T′:

    T : H^{σ−1/2}(∂Ω) → H^{τ−m+3/2}(∂Ω),
    T′ : H^{−τ+m−3/2}(∂Ω) → H^{−σ+1/2}(∂Ω).


Moreover, in this case, we have the formula

    dim N(A′) = dim N(T′).    (11.38)

Proof (i) ⟹ (ii): Assume that the null space N(A′) has dimension ℓ, and let

    {v_j, ψ_j}_{j=1}^ℓ ⊂ H_Ω^{−τ}(M) × H^{−τ+m−3/2}(∂Ω)

be a basis of N(A′). We show that the family {ψ_j}_{j=1}^ℓ is a basis of the null space N(T′). To do this, in view of the closed range theorem (Theorem 5.53) it suffices to prove that an arbitrary element ψ of H^{τ−m+3/2}(∂Ω) belongs to the range R(T) if and only if we have the conditions

    ⟨ψ, ψ_j⟩ = 0 for all 1 ≤ j ≤ ℓ.

The "only if" part follows immediately. Indeed, if ψ = T ϕ with ϕ ∈ D(T), then, by letting w = Pϕ we obtain that

    ⟨ψ, ψ_j⟩ = ⟨T ϕ, ψ_j⟩ = ⟨Aw, {v_j, ψ_j}⟩ = ⟨w, A′{v_j, ψ_j}⟩ = 0 for all 1 ≤ j ≤ ℓ,

since Aw = {0, T ϕ} and {v_j, ψ_j} ∈ N(A′).

In order to prove the "if" part, we assume that an element ψ ∈ H^{τ−m+3/2}(∂Ω) satisfies the conditions

    ⟨ψ, ψ_j⟩ = 0 for all 1 ≤ j ≤ ℓ.

Then it follows that

    ⟨{0, ψ}, {v_j, ψ_j}⟩ = ⟨ψ, ψ_j⟩ = 0 for 1 ≤ j ≤ ℓ.

Since the family {v_j, ψ_j}_{j=1}^ℓ is a basis of the null space N(A′), by applying the closed range theorem we obtain that the element {0, ψ} belongs to the space °N(A′) = R(A), that is, there exists an element w ∈ D(A) ⊂ H_A^{σ,τ} such that

    Aw = 0 in Ω,  Bγ w = ψ on ∂Ω.

In view of Theorem 11.5, this implies that


ψ = Bγ(Pϕ) = T ϕ ∈ R(T), with ϕ = γ0 w ∈ D(T).

(ii) ⟹ (i): Assume that the null space N(T′) has dimension ℓ, and let

    {ψ_j}_{j=1}^ℓ ⊂ H^{−τ+m−3/2}(∂Ω)

be a basis of N(T′). We let

    v_j := −E′Q′(Bγ)′ψ_j for 1 ≤ j ≤ ℓ,    (11.39)

where the operators

    E′ : H^{−τ}(M) → H_Ω^{−τ}(M),
    Q′ : H^{−τ−2}(M) → H^{−τ}(M),
    (Bγ)′ : H^{−τ+m−3/2}(∂Ω) → H^{−τ−2}(M)

are the transposes of E, Q and Bγ, respectively. We show that

    {v_j, ψ_j}_{j=1}^ℓ

is a basis of the null space N(A′). To do this, in view of the closed range theorem (Theorem 5.53) it suffices to prove that an arbitrary element {f, ϕ} of H^τ(Ω) × H^{τ−m+3/2}(∂Ω) belongs to the range R(A) if and only if we have the conditions

    ⟨{f, ϕ}, {v_j, ψ_j}⟩ = 0 for 1 ≤ j ≤ ℓ.    (11.40)

In view of formula (11.39), it follows that

    ⟨{f, ϕ}, {v_j, ψ_j}⟩ = ⟨ϕ, ψ_j⟩ − ⟨Bγ(QEf), ψ_j⟩ = ⟨ϕ − Bγ(QEf), ψ_j⟩.

Since the family {ψ_j}_{j=1}^ℓ is a basis of the null space N(T′), by applying the closed range theorem we find that conditions (11.40) hold true if and only if we have the condition

    ϕ − Bγ(QEf) ∈ °N(T′) = R(T).

However, in view of Proposition 11.9, this is equivalent to saying that {f, ϕ} ∈ R(A).

Finally, we remark that formula (11.38) is clear from the above proof.


The proof of Theorem 11.12 is complete. □

By using the closed range theorem (Theorem 5.53), we have the following theorem for codim R(A) and codim R(T):

Corollary 11.13 Assume that the ranges R(A) and R(T) are closed. Then the following two conditions are equivalent:
(i) The range R(A) has finite codimension.
(ii) The range R(T) has finite codimension.
Moreover, in this case, we have the formula

    codim R(A) = codim R(T).

Indeed, Corollary 11.13 is an immediate consequence of the closed range theorem and Theorem 11.12.

By combining Theorems 11.10–11.12 and Corollary 11.13, we obtain the following theorem for ind A and ind T:

Theorem 11.14 (Indices) The following two conditions are equivalent:
(i) The operator A is a Fredholm operator.
(ii) The operator T is a Fredholm operator.
Moreover, in this case, we have the formula

    ind A = ind T.

The next theorem states that A has the regularity property if and only if T has:

Theorem 11.15 (Regularity) Let σ ≤ τ + 2, τ ≥ 0 and t < σ. Then the following two conditions are equivalent:
(i) If u ∈ H^t(Ω), Au ∈ H^τ(Ω) and Bγ u ∈ H^{τ−m+3/2}(∂Ω), then u ∈ H^σ(Ω).
(ii) If ϕ ∈ H^{t−1/2}(∂Ω) and T ϕ ∈ H^{τ−m+3/2}(∂Ω), then ϕ ∈ H^{σ−1/2}(∂Ω).

Proof (i) ⟹ (ii): Assume that ϕ ∈ H^{t−1/2}(∂Ω) and T ϕ ∈ H^{τ−m+3/2}(∂Ω). Then, by letting u = Pϕ we obtain that

    u ∈ H^t(Ω),  Au = 0,  Bγ u = T ϕ ∈ H^{τ−m+3/2}(∂Ω).

Hence it follows from condition (i) that u ∈ H^σ(Ω).


In view of Theorem 11.5, this implies that ϕ = γ0 u ∈ H^{σ−1/2}(∂Ω).

(ii) ⟹ (i): Assume that

    u ∈ H^t(Ω),  Au ∈ H^τ(Ω),  Bγ u ∈ H^{τ−m+3/2}(∂Ω).

Then the distribution u can be decomposed as in formula (11.24):

    u = v + w,

where

    v = (QE(Au))|Ω ∈ H^{τ+2}(Ω),  w = u − v ∈ N(A, t).

Theorem 11.5 tells us that the distribution w can be written as w = Pϕ with ϕ = γ0 w ∈ H^{t−1/2}(∂Ω). Hence we have

    T ϕ = Bγ(Pϕ) = Bγ w = Bγ u − Bγ v ∈ H^{τ−m+3/2}(∂Ω).

Therefore, it follows from condition (ii) that ϕ ∈ H^{σ−1/2}(∂Ω), so that

    w = Pϕ ∈ H^σ(Ω).

This proves that

    u = v + w ∈ H^σ(Ω).

The proof of Theorem 11.15 is complete. □

In particular, we have the following corollary:

Corollary 11.16 The following two conditions are equivalent:
(i) N(A) ⊂ C∞(Ω̄).
(ii) N(T) ⊂ C∞(∂Ω).


For the regularity of the null spaces N(A′) and N(T′), we have the following theorem:

Theorem 11.17 Assume that the null spaces N(A′) and N(T′) have finite dimension. Then the following two conditions are equivalent:
(i) N(A′) ⊂ C∞(Ω̄) × C∞(∂Ω).
(ii) N(T′) ⊂ C∞(∂Ω).

Proof (i) ⟹ (ii): This is clear from the proof of the implication (i) ⟹ (ii) of Theorem 11.12.

(ii) ⟹ (i): We know from the proof of the implication (ii) ⟹ (i) of Theorem 11.12 that if the family {ψ_j}_{j=1}^ℓ is a basis of the null space N(T′), ℓ = dim N(T′), then the family

    {−E′Q′(Bγ)′ψ_j, ψ_j}_{j=1}^ℓ    (11.41)

is a basis of the null space N(A′). However, we have, by formula (11.32),

    (Bγ)′ψ_j = γ0′(B0′ψ_j) + γ1′(B1′ψ_j).

We remark that γ0′(B0′ψ_j) and γ1′(B1′ψ_j) are distributions on M with support in ∂Ω. If condition (ii) is satisfied, that is, if {ψ_j}_{j=1}^ℓ ⊂ C∞(∂Ω), by applying part (i) of Theorem 9.48 to our situation we obtain that

    (Q′γ0′(B0′ψ_j))|Ω, (Q′γ1′(B1′ψ_j))|Ω ∈ C∞(Ω̄),

and also

    (Q′γ0′(B0′ψ_j))|_{M∖Ω̄}, (Q′γ1′(B1′ψ_j))|_{M∖Ω̄} ∈ C∞(M ∖ Ω).

Hence it follows from an application of Proposition 8.19 that

    E′Q′(Bγ)′ψ_j = E′Q′γ0′(B0′ψ_j) + E′Q′γ1′(B1′ψ_j) ∈ C∞(Ω̄) for 1 ≤ j ≤ ℓ.

In view of assertion (11.41), this proves condition (i). The proof of Theorem 11.17 is complete. □

The next theorem states that a priori estimates for A are entirely equivalent to corresponding a priori estimates for T:

Theorem 11.18 (Estimates) Let σ ≤ τ + 2, τ ≥ 0 and t < σ. Then the following two estimates are equivalent:


    ‖u‖_{H^σ(Ω)} ≤ C ( ‖Au‖_{H^τ(Ω)} + |Bγ u|_{H^{τ−m+3/2}(∂Ω)} + ‖u‖_{H^t(Ω)} ) for u ∈ D(A).    (11.42)

    |ϕ|_{H^{σ−1/2}(∂Ω)} ≤ C ( |T ϕ|_{H^{τ−m+3/2}(∂Ω)} + |ϕ|_{H^{t−1/2}(∂Ω)} ) for ϕ ∈ D(T).    (11.43)

Here and in the following the letter C denotes a generic positive constant.

Proof (i) ⟹ (ii): By taking u = Pϕ with ϕ ∈ D(T) in estimate (11.42), we obtain that

    ‖Pϕ‖_{H^σ(Ω)} ≤ C ( |T ϕ|_{H^{τ−m+3/2}(∂Ω)} + ‖Pϕ‖_{H^t(Ω)} ).    (11.44)

However, Theorem 11.5 tells us that the Poisson operator P maps H^{s−1/2}(∂Ω) isomorphically onto N(A, s) for each s ∈ R. Thus estimate (11.43) follows from estimate (11.44).

(ii) ⟹ (i): Every element u ∈ D(A) can be decomposed as in formula (11.24):

    u = v + w,

where

    v = (QE(Au))|Ω ∈ H^{τ+2}(Ω),  w = u − v ∈ N(A, σ).

Then we have, by estimate (11.25),

    ‖v‖_{H^σ(Ω)} ≤ ‖v‖_{H^{τ+2}(Ω)} ≤ C ‖Au‖_{H^τ(Ω)}.    (11.45)

Furthermore, by applying estimate (11.43) to the distribution γ0 w we obtain that

    |γ0 w|_{H^{σ−1/2}(∂Ω)} ≤ C ( |T(γ0 w)|_{H^{τ−m+3/2}(∂Ω)} + |γ0 w|_{H^{t−1/2}(∂Ω)} )
        = C ( |Bγ w|_{H^{τ−m+3/2}(∂Ω)} + |γ0 w|_{H^{t−1/2}(∂Ω)} )
        ≤ C ( |Bγ u|_{H^{τ−m+3/2}(∂Ω)} + |Bγ v|_{H^{τ−m+3/2}(∂Ω)} + |γ0 w|_{H^{t−1/2}(∂Ω)} ).

In view of Theorem 11.5, this gives that

    ‖w‖_{H^σ(Ω)} ≤ C ( |Bγ u|_{H^{τ−m+3/2}(∂Ω)} + |Bγ v|_{H^{τ−m+3/2}(∂Ω)} + ‖w‖_{H^t(Ω)} )    (11.46)
        ≤ C ( |Bγ u|_{H^{τ−m+3/2}(∂Ω)} + |Bγ v|_{H^{τ−m+3/2}(∂Ω)} + ‖u‖_{H^t(Ω)} + ‖v‖_{H^t(Ω)} ).

However, it follows from Proposition 11.8 with σ := τ + 2 that


    |Bγ v|_{H^{τ−m+3/2}(∂Ω)} ≤ C ‖v‖_{H^{τ+2}(Ω)} ≤ C ‖Au‖_{H^τ(Ω)}.    (11.47)

Thus, by carrying estimates (11.45) and (11.47) into estimate (11.46) we obtain that

    ‖w‖_{H^σ(Ω)} ≤ C ( ‖Au‖_{H^τ(Ω)} + |Bγ u|_{H^{τ−m+3/2}(∂Ω)} + ‖u‖_{H^t(Ω)} ).    (11.48)

Therefore, the desired estimate (11.42) follows by combining estimates (11.45) and (11.48). The proof of Theorem 11.18 is complete. □

11.4 Unique Solvability Theorem for General Boundary Value Problems

Let Ω be a bounded domain in R^n with C∞ boundary ∂Ω. We let

    A = Σ_{i,j=1}^n a^{ij}(x) ∂²/∂xi∂xj + Σ_{i=1}^n b^i(x) ∂/∂xi + c(x)

be a second order strictly elliptic differential operator with real coefficients on the double M = Ω̂ of Ω as in Sect. 11.2, and assume that

    The function (A1)(x) = c(x) does not vanish identically on M.    (11.5)

In this section we consider the following boundary value problem: Given functions f(x) and ϕ(x′) defined in Ω and on ∂Ω, respectively, find a function u(x) in Ω such that

    (A − α)u = f in Ω,
    Bu := B0(u|∂Ω) + B1((∂u/∂ν)|∂Ω) = ϕ on ∂Ω.    (∗)α

Here:
(1) α is a non-negative spectral parameter.
(2) B_j, j = 0, 1, is a classical pseudo-differential operator of order m_j on ∂Ω.
(3) ν is the unit exterior normal to ∂Ω.

We prove an existence and uniqueness theorem for the boundary value problem (∗)α in the framework of Sobolev spaces as α → +∞. For this purpose, we make use of a method essentially due to Agmon and Nirenberg (Agmon [3], Agmon–Nirenberg [5], Lions–Magenes [116]). This is a technique of treating the spectral parameter α as a second order elliptic differential operator acting in an extra variable on the unit circle, and of relating the original problem to a new one with the additional variable. The presentation of this technique follows Fujiwara [69].


Fig. 11.5 The product domain Ω × S

We introduce an auxiliary variable y on the unit circle S = R/2πZ, and replace the parameter α by the differential operator −∂²/∂y². We consider, instead of the boundary value problem (∗)α, the following boundary value problem on the product domain Ω × S (see Fig. 11.5): Given functions f̃(x, y) and ϕ̃(x′, y) defined in Ω × S and on ∂Ω × S, respectively, find a function ũ(x, y) in Ω × S such that

    Ã ũ := (A + ∂²/∂y²) ũ = f̃ in Ω × S,
    B̃ ũ := B0(ũ|∂Ω×S) + B1((∂ũ/∂ν)|∂Ω×S) = ϕ̃ on ∂Ω × S.    (∗̃)

It should be emphasized that the second order differential operator

    Ã = A + ∂²/∂y² = Σ_{i,j=1}^n a^{ij}(x) ∂²/∂xi∂xj + ∂²/∂y² + Σ_{i=1}^n b^i(x) ∂/∂xi + c(x)

is strictly elliptic on the product space M × S = Ω̂ × S as in Sect. 11.2, and satisfies the condition

    The function (Ã1)(x) = (A1)(x) = c(x) does not vanish identically on M × S = Ω̂ × S.    (11.5)


Indeed, it suffices to note that

    Σ_{i,j=1}^n a^{ij}(x) ξi ξj + η² ≥ min{a0, 1} (|ξ|² + η²)

for all ((x, y), (ξ, η)) ∈ T*(M × S), where T*(M × S) is the cotangent bundle of M × S.

Then, roughly speaking, the most important relationship between the boundary value problems (∗)α and (∗̃) is stated as follows:

    If the index of the boundary value problem (∗̃) is finite, then the index of the boundary value problem (∗)α is equal to zero for all α ≥ 0.    (11.49)
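The mechanism behind this device can be checked by hand on separated functions: on a single Fourier mode e^{iℓy} of the circle, the operator −∂²/∂y² acts as multiplication by ℓ², so a solution of the augmented problem of the form u ⊗ e^{iℓy} corresponds exactly to a solution of the original problem with α = ℓ². A symbolic sketch in a one-dimensional model (A = d²/dx² on (0, 1) is an illustrative stand-in for the operator A of the text):

```python
import sympy as sp

x, y = sp.symbols('x y')
l = 4                                     # Fourier mode on S; alpha = l**2 = 16
W = sp.sinh(l * x) / sp.sinh(l)           # solves (A - alpha)W = 0 for A = d^2/dx^2
u_tilde = W * sp.cos(l * y)               # separated function on Omega x S

# The augmented operator A + d^2/dy^2 annihilates u_tilde ...
residual_aug = sp.diff(u_tilde, x, 2) + sp.diff(u_tilde, y, 2)
# ... precisely because the x-factor solves the problem with alpha = l**2:
residual_orig = sp.diff(W, x, 2) - l**2 * W
```

Summing over the modes ℓ ∈ Z (Lemma 11.22 below makes this precise) is what ties the single problem (∗̃) on Ω × S to the whole family of problems (∗)_{ℓ²}.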

11.4.1 Statement of Main Results

We state assertion (11.49) more precisely. Let s ≥ max(2, m + 1/2), where m = max(m0, m1 + 1), and let 0 < κ ≤ 2. We associate with the boundary value problem (∗)α a densely defined, closed linear operator

    A(α) : H^{s−2+κ}(Ω) → H^{s−2}(Ω) × H^{s−m−1/2}(∂Ω)

as follows:
(a) The domain D(A(α)) of A(α) is the space

    D(A(α)) = { u ∈ H^{s−2+κ}(Ω) : (A − α)u ∈ H^{s−2}(Ω), Bu ∈ H^{s−m−1/2}(∂Ω) }.

(b) A(α)u = {(A − α)u, Bu} for every u ∈ D(A(α)).

Similarly, we associate with the boundary value problem (∗̃) a densely defined, closed linear operator

    Ã : H^{s−2+κ}(Ω × S) → H^{s−2}(Ω × S) × H^{s−m−1/2}(∂Ω × S)


as follows:
(ã) The domain D(Ã) of Ã is the space

    D(Ã) = { ũ ∈ H^{s−2+κ}(Ω × S) : Ã ũ ∈ H^{s−2}(Ω × S), B̃ ũ ∈ H^{s−m−1/2}(∂Ω × S) }.

(b̃) Ã ũ = {Ã ũ, B̃ ũ} for every ũ ∈ D(Ã).

Now we can state our main result:

Theorem 11.19 Let s ≥ max(2, m + 1/2), 0 < κ ≤ 2 and s − 5/2 + κ > 0. Then the following two conditions are equivalent:
(i) The operator

    Ã : H^{s−2+κ}(Ω × S) → H^{s−2}(Ω × S) × H^{s−m−1/2}(∂Ω × S)

is a Fredholm operator.
(ii) For all α ≥ 0, the operator

    A(α) : H^{s−2+κ}(Ω) → H^{s−2}(Ω) × H^{s−m−1/2}(∂Ω)

is a Fredholm operator with index zero; and there exists a constant R′ > 0 such that if α′ = ℓ² with ℓ ∈ Z and ℓ² ≥ R′, then the operator A(α′) is bijective and we have, for all u ∈ D(A(α′)),

    ‖u‖²_{H^{s−2+κ}(Ω)} + α′^{s−2+κ} ‖u‖²_{L²(Ω)}
        ≤ C′ ( ‖(A − α′)u‖²_{H^{s−2}(Ω)} + α′^{s−2} ‖(A − α′)u‖²_{L²(Ω)}
            + |Bu|²_{H^{s−m−1/2}(∂Ω)} + α′^{s−m−1/2} |Bu|²_{L²(∂Ω)} ),    (11.50)

with a constant C′ > 0 independent of α′ ≥ R′.

In the "sub-elliptic" case, that is, in the case 1 < κ ≤ 2, we can prove the following corollary:

Corollary 11.20 Let s ≥ max(2, m + 1/2) and 1 < κ ≤ 2. Then the following two conditions are equivalent:
(i) The operator

    Ã : H^{s−2+κ}(Ω × S) → H^{s−2}(Ω × S) × H^{s−m−1/2}(∂Ω × S)

is a Fredholm operator.


(ii) For all α ≥ 0, the operator

    A(α) : H^{s−2+κ}(Ω) → H^{s−2}(Ω) × H^{s−m−1/2}(∂Ω)

is a Fredholm operator with index zero; and there exists a constant R > 0 such that if α ≥ R, then the operator A(α) is bijective and we have, for all u ∈ D(A(α)),

    ‖u‖²_{H^{s−2+κ}(Ω)} + α^{s−2+κ} ‖u‖²_{L²(Ω)}
        ≤ C ( ‖(A − α)u‖²_{H^{s−2}(Ω)} + α^{s−2} ‖(A − α)u‖²_{L²(Ω)}
            + |Bu|²_{H^{s−m−1/2}(∂Ω)} + α^{s−m−1/2} |Bu|²_{L²(∂Ω)} ),    (11.51)

with a constant C > 0 independent of α ≥ R.

Remark 11.21 The boundary value problem (∗) is elliptic (or coercive) if and only if κ = 2, and it is sub-elliptic if and only if 1 < κ < 2. In the elliptic case, Corollary 11.20 was proved by Agranovich–Vishik [6] (cf. [6, Theorems 4.1 and 5.1]).
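The flavour of the parameter-dependent estimate (11.51) is already visible in the simplest model, the Dirichlet Laplacian on (0, 1): expanding in the eigenbasis sin(kπx), with A-eigenvalues −(kπ)², gives ‖(A − α)u‖_{L²} ≥ α‖u‖_{L²} for every α ≥ 0, so the resolvent norm decays like 1/α as α → +∞. A finite-difference sketch (illustrative only; it does not reproduce the full scale of norms in (11.51)):

```python
import numpy as np

# Dirichlet Laplacian A = d^2/dx^2 on (0, 1), discretized on an interior grid.
n = 400
x = np.linspace(0.0, 1.0, n + 2)[1:-1]
h = x[1] - x[0]
A = (np.diag(-2.0 * np.ones(n))
     + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / h**2

u = np.sin(np.pi * x) + 0.3 * np.sin(5 * np.pi * x)   # test function, zero BC
l2 = lambda v: np.sqrt(h * np.sum(v**2))

# ||(A - alpha)u|| >= alpha ||u|| because every eigenvalue of A is <= -pi^2 < 0,
# so |lambda_k - alpha| >= alpha on each eigencomponent.
ratios = [l2(A @ u - alpha * u) / (alpha * l2(u)) for alpha in (1.0, 10.0, 1000.0)]
```

For this model every α ≥ 0 lies in the resolvent set, which is the situation the theorem guarantees in general only for α ≥ R.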

11.4.2 Proof of Theorem 11.19

Step 1: First, we reduce the study of the boundary value problem (∗)α to that of a pseudo-differential operator on the boundary. By applying Theorem 11.5 to the operator A − α for α ≥ 0, we obtain the following two results:

(a) The Dirichlet problem

    (A − α)w = 0 in Ω,  γ0 w = ϕ on ∂Ω    (D)α

has a unique solution w in H^t(Ω) for any ϕ ∈ H^{t−1/2}(∂Ω) (t ∈ R).

(b) The mapping

    P(α) : H^{t−1/2}(∂Ω) → H^t(Ω),

defined by w = P(α)ϕ, is an isomorphism of H^{t−1/2}(∂Ω) onto the null space

    N(A − α, t) = { u ∈ H^t(Ω) : (A − α)u = 0 in Ω }

for all t ∈ R; and its inverse is the trace operator γ0 on ∂Ω:


    P(α) : H^{t−1/2}(∂Ω) → N(A − α, t),  γ0 : N(A − α, t) → H^{t−1/2}(∂Ω).

We let

    T(α) : C∞(∂Ω) → C∞(∂Ω),  ϕ ↦ B(P(α)ϕ).

Then the operator T(α) can be written in the form

    T(α) = B P(α) = B0 + B1 Π(α),

where the operator

    Π(α) : C∞(∂Ω) → C∞(∂Ω),  ϕ ↦ ((∂/∂ν)(P(α)ϕ))|∂Ω,

is called the Dirichlet-to-Neumann operator. By applying Theorem 11.3 with A := A − α, we find that Π(α) is a classical pseudo-differential operator of first order on ∂Ω. Hence the operator T(α) is a classical pseudo-differential operator of order m on ∂Ω, and it extends to a continuous linear operator

    T(α) : H^t(∂Ω) → H^{t−m}(∂Ω) for all t ∈ R.

Thus we can associate with equation (∗∗) a densely defined, closed linear operator

    T(α) : H^{s−5/2+κ}(∂Ω) → H^{s−m−1/2}(∂Ω)

as follows:
(α) The domain D(T(α)) of T(α) is the space

    D(T(α)) = { ϕ ∈ H^{s−5/2+κ}(∂Ω) : T(α)ϕ ∈ H^{s−m−1/2}(∂Ω) }.

(β) T(α)ϕ = T(α)ϕ = B(P(α)ϕ) for every ϕ ∈ D(T(α)).

Then, by arguing as in Sect. 11.3.2 (with σ := s − 2 + κ, τ := s − 2) we can prove the following:

(I) The null space N(A(α)) of A(α) has finite dimension if and only if the null space N(T(α)) of T(α) has finite dimension, and we have the formula

    dim N(A(α)) = dim N(T(α)).


(II) The range R(A(α)) of A(α) is closed if and only if the range R(T(α)) of T(α) is closed; and R(A(α)) has finite codimension if and only if R(T(α)) has finite codimension, and we have the formula

    codim R(A(α)) = codim R(T(α)).

(III) The operator A(α) is a Fredholm operator if and only if the operator T(α) is a Fredholm operator, and we have the formula

    ind A(α) = ind T(α).

Step 2: Similarly, we reduce the study of the boundary value problem (∗̃) to that of a pseudo-differential operator on the boundary. By virtue of condition (11.5), we can apply Theorem 11.5 to the strictly elliptic differential operator

    Ã = A + ∂²/∂y²

to obtain the following two results:

(ã) The Dirichlet problem

    Ã w̃ = 0 in Ω × S,  γ0 w̃ = ϕ̃ on ∂Ω × S    (D̃)

has a unique solution w̃ in H^t(Ω × S) for any ϕ̃ ∈ H^{t−1/2}(∂Ω × S) (t ∈ R).

(b̃) The mapping

    P̃ : H^{t−1/2}(∂Ω × S) → H^t(Ω × S),

defined by w̃ = P̃ϕ̃, is an isomorphism of H^{t−1/2}(∂Ω × S) onto the null space

    N(Ã, t) = { ũ ∈ H^t(Ω × S) : Ã ũ = 0 in Ω × S }

for all t ∈ R; and its inverse is the trace operator γ0 on ∂Ω × S:

    P̃ : H^{t−1/2}(∂Ω × S) → N(Ã, t),  γ0 : N(Ã, t) → H^{t−1/2}(∂Ω × S).

We let

    T̃ : C∞(∂Ω × S) → C∞(∂Ω × S),  ϕ̃ ↦ B̃(P̃ϕ̃).


Then the operator T̃ can be written in the form

    T̃ = B̃ P̃ = B0 + B1 Π̃,

where the operator

    Π̃ = γ1 P̃ : C∞(∂Ω × S) → C∞(∂Ω × S),  ϕ̃ ↦ ((∂/∂ν)(P̃ϕ̃))|∂Ω×S,

is called the Dirichlet-to-Neumann operator. By applying Theorem 11.3 with A := Ã, we find that Π̃ is a classical pseudo-differential operator of first order on ∂Ω × S. Hence the operator T̃ is a classical pseudo-differential operator of order m on ∂Ω × S, and it extends to a continuous linear operator

    T̃ : H^t(∂Ω × S) → H^{t−m}(∂Ω × S) for all t ∈ R.

We can define a densely defined, closed linear operator

    T̃ : H^{s−5/2+κ}(∂Ω × S) → H^{s−m−1/2}(∂Ω × S)

as follows:
(α̃) The domain D(T̃) of T̃ is the space

    D(T̃) = { ϕ̃ ∈ H^{s−5/2+κ}(∂Ω × S) : T̃ϕ̃ ∈ H^{s−m−1/2}(∂Ω × S) }.

(β̃) T̃ϕ̃ = T̃ϕ̃ = B̃(P̃ϕ̃) for every ϕ̃ ∈ D(T̃).

Then we have the following three results, analogous to results (I), (II) and (III):

(Ĩ) The null space N(Ã) of Ã has finite dimension if and only if the null space N(T̃) of T̃ has finite dimension, and we have the formula

    dim N(Ã) = dim N(T̃).

(ĨĨ) The range R(Ã) of Ã is closed if and only if the range R(T̃) of T̃ is closed; and R(Ã) has finite codimension if and only if R(T̃) has finite codimension, and we have the formula

    codim R(Ã) = codim R(T̃).

(ĨĨĨ) The operator Ã is a Fredholm operator if and only if the operator T̃ is a Fredholm operator, and we have the formula


    ind Ã = ind T̃.

Step 3: Now we study the null spaces N(T̃) and N(T(α′)) when α′ = ℓ² with ℓ ∈ Z. In doing so, we need a lemma on the Fourier expansion:

Lemma 11.22 Let M = ∂Ω or M = Ω. Then every function ϕ̃ ∈ H^t(M × S) for t ∈ R can be uniquely expanded as follows:

    ϕ̃(x, y) = Σ_{ℓ∈Z} ϕℓ(x) ⊗ e^{iℓy} in H^t(M × S),  ϕℓ(x) ∈ H^t(M).

Furthermore, if t ≥ 0, we have the formula

    |ϕ̃|²_{H^t(M×S)} ≈ Σ_{ℓ∈Z} ( |ϕℓ|²_{H^t(M)} + (1 + ℓ²)^t |ϕℓ|²_{L²(M)} ).    (11.52)

Here the symbol ≈ denotes equivalent norms.

Proof By considering the double Ω̂ of Ω in the case M = Ω, we may assume that M is a compact C∞ manifold without boundary. Indeed, it suffices to note that

    H^t(Ω × S) = the space of restrictions to Ω × S of elements of H^t(Ω̂ × S),
    H^t(Ω) = the space of restrictions to Ω of elements of H^t(Ω̂).

Let {χ_j} be the eigenfunctions of the Laplace–Beltrami operator −Δ_M on M and {λ_j} its corresponding eigenvalues:

    −Δ_M χ_j = λ_j χ_j with λ_j ≥ 0.

Then it follows from an application of Theorem 9.41 that the following two results hold true:

(a) H^t(M) = { ϕ ∈ D′(M) : Σ_j (1 + λ_j)^t |⟨ϕ, χ_j⟩|² < +∞ }.
(b) Every function ϕ ∈ H^t(M) can be uniquely expanded as follows:

    ϕ(x) = Σ_j ⟨ϕ, χ_j⟩ χ_j(x) in H^t(M).

Similarly, by applying Theorem 9.41 with M := M × S we obtain the following two results:

11.4 Unique Solvability Theorem for General Boundary Value Problems

(a) The Sobolev space H^t(M × S) can be characterized as follows:

H^t(M × S) = { ϕ̃ ∈ D′(M × S) : Σ_{j,ℓ} (1 + λ_j + ℓ²)^t |(ϕ̃, χ_j ⊗ e^{iℓy})|² < +∞ }.

(b) Every function ϕ̃ ∈ H^t(M × S) can be uniquely expanded as follows:

ϕ̃(x, y) = Σ_{j,ℓ} (ϕ̃, χ_j ⊗ e^{iℓy}) χ_j(x) ⊗ e^{iℓy} in H^t(M × S).

Therefore, we have the expansion

ϕ̃(x, y) = Σ_{ℓ∈Z} ϕ_ℓ(x) ⊗ e^{iℓy} in H^t(M × S),

with

ϕ_ℓ(x) = Σ_j (ϕ̃, χ_j ⊗ e^{iℓy}) χ_j(x) ∈ H^t(M).

Indeed, it suffices to note the following:

Σ_j (1 + λ_j)^t |(ϕ̃, χ_j ⊗ e^{iℓy})|²
≤ { Σ_j (1 + λ_j + ℓ²)^t |(ϕ̃, χ_j ⊗ e^{iℓy})|² ≤ |ϕ̃|²_{H^t(M×S)}   if t ≥ 0,
    (1 + ℓ²)^{−t} Σ_j (1 + λ_j + ℓ²)^t |(ϕ̃, χ_j ⊗ e^{iℓy})|² ≤ (1 + ℓ²)^{−t} |ϕ̃|²_{H^t(M×S)}   if t ≤ 0.

Furthermore, if t ≥ 0, we have the inequalities

(1/2) ( (1 + λ_j)^t + (1 + ℓ²)^t ) ≤ (1 + λ_j + ℓ²)^t ≤ 2^t ( (1 + λ_j)^t + (1 + ℓ²)^t ).

Hence this gives that


|ϕ̃|²_{H^t(M×S)} = Σ_{j,ℓ} (1 + λ_j + ℓ²)^t |(ϕ̃, χ_j ⊗ e^{iℓy})|²
≈ Σ_{j,ℓ} ( (1 + λ_j)^t |(ϕ̃, χ_j ⊗ e^{iℓy})|² + (1 + ℓ²)^t |(ϕ̃, χ_j ⊗ e^{iℓy})|² )
= Σ_{ℓ∈Z} ( |ϕ_ℓ|²_{H^t(M)} + (1 + ℓ²)^t |ϕ_ℓ|²_{L²(M)} ).

The proof of Lemma 11.22 is complete. □

The next lemma will play a fundamental role in the sequel:

Lemma 11.23 We have, for all ϕ ∈ D′(∂Ω) and all ℓ ∈ Z,

P̃(ϕ ⊗ e^{iℓy}) = P(ℓ²)ϕ ⊗ e^{iℓy} in D′(Ω × S),   (11.53)
T̃(ϕ ⊗ e^{iℓy}) = T(ℓ²)ϕ ⊗ e^{iℓy} in D′(∂Ω × S).   (11.54)

Proof First, we remark that

D′(∂Ω) = ⋃_{t∈R} H^t(∂Ω),

since the boundary ∂Ω is compact. Thus we may assume that ϕ ∈ H^t(∂Ω) for some t ∈ R. We let

w̃(x, y) := P(ℓ²)ϕ(x) ⊗ e^{iℓy} for (x, y) ∈ M × S.

Then we have the assertion w̃ ∈ H^{t+1/2}(Ω × S), since P(ℓ²)ϕ ∈ H^{t+1/2}(Ω) (cf. the proof of Lemma 11.22). Furthermore, the distribution w̃ satisfies the conditions

Ãw̃ = (A − ℓ²)(P(ℓ²)ϕ) ⊗ e^{iℓy} = 0 in Ω × S,
w̃|_{∂Ω×S} = ϕ ⊗ e^{iℓy} on ∂Ω × S.

Thus we have, by the uniqueness of solutions of the boundary value problem (D̃),

P̃(ϕ ⊗ e^{iℓy}) = w̃ = P(ℓ²)ϕ ⊗ e^{iℓy} ∈ H^{t+1/2}(Ω × S),

and hence


T̃(ϕ ⊗ e^{iℓy}) = B̃P̃(ϕ ⊗ e^{iℓy}) = B(P(ℓ²)ϕ) ⊗ e^{iℓy} = T(ℓ²)ϕ ⊗ e^{iℓy} in H^{t−m}(∂Ω × S).

The proof of Lemma 11.23 is complete. □

Now we can prove the most important relationship between the null spaces N(T) and N(T(α′)) when α′ = ℓ² with ℓ ∈ Z.

Proposition 11.24 The following two conditions are equivalent:

(i) dim N(T) < ∞.
(ii) There exists a finite subset I of Z such that

dim N(T(ℓ²)) < ∞ if ℓ ∈ I,
dim N(T(ℓ²)) = 0 if ℓ ∉ I.

Moreover, in this case, we have the formulas

N(T) = ⊕_{ℓ∈I} N(T(ℓ²)) ⊗ e^{iℓy},
dim N(T) = Σ_{ℓ∈I} dim N(T(ℓ²)).

Proof By applying Lemma 11.22, we obtain that every function ϕ̃ ∈ H^{s−5/2+κ}(∂Ω × S) can be uniquely expanded as follows:

ϕ̃(x′, y) = Σ_{ℓ∈Z} ϕ_ℓ(x′) ⊗ e^{iℓy} in H^{s−5/2+κ}(∂Ω × S),   ϕ_ℓ(x′) ∈ H^{s−5/2+κ}(∂Ω).

Thus we have, by formula (11.54),

T̃ϕ̃ = Σ_{ℓ∈Z} T(ℓ²)ϕ_ℓ ⊗ e^{iℓy} in H^{s−5/2+κ−m}(∂Ω × S).

By the uniqueness of the Fourier expansion (Lemma 11.22), this gives that

N(T) = Σ_{ℓ∈Z} N(T(ℓ²)) ⊗ e^{iℓy}   (formal sum).

Hence it is easy to see that conditions (i) and (ii) are equivalent, since the spaces N(T(ℓ²)) ⊗ e^{iℓy} are linearly independent. The proof of Proposition 11.24 is complete. □

Step 2: Next we study the ranges R(T) and R(T(α′)) when α′ = ℓ² with ℓ ∈ Z. First, we have the following lemma:


Lemma 11.25 If the range R(T) is closed in H^{s−m−1/2}(∂Ω × S), then the range R(T(ℓ²)) is closed in H^{s−m−1/2}(∂Ω) for all ℓ ∈ Z.

Proof Let ψ be an arbitrary element of the closure of the range R(T(ℓ²)) in H^{s−m−1/2}(∂Ω), and let {ϕ^{(k)}}_{k=1}^∞ be a sequence in the domain

D(T(ℓ²)) ⊂ H^{s−5/2+κ}(∂Ω)

such that

T(ℓ²)ϕ^{(k)} → ψ in H^{s−m−1/2}(∂Ω) as k → ∞.

We let

ϕ̃^{(k)}(x′, y) = ϕ^{(k)}(x′) ⊗ e^{iℓy}.

Then, by using Lemmas 11.23 and 11.22 we find that

ϕ̃^{(k)} ∈ D(T),
T ϕ̃^{(k)} = T(ℓ²)ϕ^{(k)} ⊗ e^{iℓy} → ψ ⊗ e^{iℓy} in H^{s−m−1/2}(∂Ω × S) as k → ∞.

Since the range R(T) is closed, there exists a function

ϕ̃ ∈ D(T) ⊂ H^{s−5/2+κ}(∂Ω × S)

such that

T ϕ̃ = ψ(x′) ⊗ e^{iℓy}.

However, Lemma 11.22 tells us that ϕ̃(x′, y) can be uniquely expanded as follows:

ϕ̃(x′, y) = Σ_{k∈Z} ϕ_k(x′) ⊗ e^{iky} in H^{s−5/2+κ}(∂Ω × S),   ϕ_k(x′) ∈ H^{s−5/2+κ}(∂Ω).

Hence we have, by formula (11.54),

ψ(x′) ⊗ e^{iℓy} = T̃ϕ̃ = Σ_{k∈Z} T(k²)ϕ_k(x′) ⊗ e^{iky}.

Therefore, by the uniqueness of the Fourier expansion it follows that

T(ℓ²)ϕ_ℓ = ψ ∈ H^{s−m−1/2}(∂Ω).

This proves that ϕ_ℓ ∈ D(T(ℓ²)), so that

ψ = T(ℓ²)ϕ_ℓ ∈ R(T(ℓ²)).

Fig. 11.6 The closed operators T(α) and T(α)* with α = ℓ²:
T(α) : H^{s−5/2+κ}(∂Ω) → H^{s−m−1/2}(∂Ω),
T(α)* : H^{−s+m+1/2}(∂Ω) → H^{−s+5/2−κ}(∂Ω).

Fig. 11.7 The closed operators T and T*:
T : H^{s−5/2+κ}(∂Ω × S) → H^{s−m−1/2}(∂Ω × S),
T* : H^{−s+m+1/2}(∂Ω × S) → H^{−s+5/2−κ}(∂Ω × S).

The proof of Lemma 11.25 is complete. □

In order to study relationships between codim R(T) and codim R(T(ℓ²)) for ℓ ∈ Z, we consider the adjoints T* and T(ℓ²)*. The adjoint

T* : H^{−s+m+1/2}(∂Ω × S) → H^{−s+5/2−κ}(∂Ω × S)

is a closed linear operator such that

(T ϕ̃, ψ̃) = (ϕ̃, T*ψ̃) for all ϕ̃ ∈ D(T) and ψ̃ ∈ D(T*),

and the adjoint

T(ℓ²)* : H^{−s+m+1/2}(∂Ω) → H^{−s+5/2−κ}(∂Ω)

is a closed linear operator such that

(T(ℓ²)ϕ, ψ) = (ϕ, T(ℓ²)*ψ) for all ϕ ∈ D(T(ℓ²)) and ψ ∈ D(T(ℓ²)*).

The situation can be visualized as in Figs. 11.6 and 11.7 above.

The next lemma allows us to give a characterization of the adjoints T* and T(ℓ²)* for ℓ ∈ Z in terms of pseudo-differential operators:

Lemma 11.26 Let M be a compact C^∞ manifold without boundary. If T is a classical pseudo-differential operator of order m on M, we define a densely defined, closed linear operator

𝒯 : H^{s−5/2+κ}(M) → H^{s−m−1/2}(M) for s ∈ R

as follows:


(a) The domain D(𝒯) of 𝒯 is the space

D(𝒯) = { ϕ ∈ H^{s−5/2+κ}(M) : Tϕ ∈ H^{s−m−1/2}(M) }.

(b) 𝒯ϕ = Tϕ for every ϕ ∈ D(𝒯).

Then the adjoint 𝒯* of 𝒯 is characterized as follows:

(c) The domain D(𝒯*) of 𝒯* is contained in the space

{ ψ ∈ H^{−s+m+1/2}(M) : T*ψ ∈ H^{−s+5/2−κ}(M) },

where T* ∈ L^m_cl(M) is the adjoint of T.

(d) 𝒯*ψ = T*ψ for every ψ ∈ D(𝒯*).

Proof Let ψ be an arbitrary element of D(𝒯*) ⊂ H^{−s+m+1/2}(M), and let {ψ_j}_{j=1}^∞ be a sequence in C^∞(M) such that

ψ_j → ψ in H^{−s+m+1/2}(M) as j → ∞.

Then we have, for all ϕ ∈ C^∞(M) ⊂ D(𝒯),

(𝒯*ψ, ϕ) = (ψ, 𝒯ϕ) = (ψ, Tϕ) = lim_j (ψ_j, Tϕ) = lim_j (T*ψ_j, ϕ) = (T*ψ, ϕ),

so that

𝒯*ψ = T*ψ ∈ H^{−s+5/2−κ}(M).

This proves the lemma.



By applying Lemma 11.26 to the pseudo-differential operators T̃ and T(α) for α ≥ 0, we obtain the following lemma:

Lemma 11.27 The null spaces N(T*) and N(T(α)*) for α ≥ 0 are characterized respectively as follows:

N(T*) = { ψ̃ ∈ H^{−s+5/2−κ}(∂Ω × S) : T̃*ψ̃ = 0 },   (11.55a)
N(T(α)*) = { ψ ∈ H^{−s+5/2−κ}(∂Ω) : T(α)*ψ = 0 }.   (11.55b)

Furthermore, we have the following lemma:


Lemma 11.28 The following two conditions are equivalent:

(i) dim N(T*) < ∞.
(ii) There exists a finite subset J of Z such that

dim N(T(ℓ²)*) < ∞ if ℓ ∈ J,
dim N(T(ℓ²)*) = 0 if ℓ ∉ J.

Moreover, in this case, we have the formula

dim N(T*) = Σ_{ℓ∈J} dim N(T(ℓ²)*).

Proof By passing to the adjoint in formula (11.54), we have, for all ψ ∈ D′(∂Ω) and ℓ ∈ Z,

T̃*(ψ ⊗ e^{iℓy}) = T(ℓ²)*ψ ⊗ e^{iℓy} in D′(∂Ω × S).   (11.56)

Indeed, if ϕ ∈ C^∞(∂Ω) and k ∈ Z, we have the formula

(T̃*(ψ ⊗ e^{iℓy}), ϕ ⊗ e^{iky}) = (ψ ⊗ e^{iℓy}, T̃(ϕ ⊗ e^{iky})) = (ψ ⊗ e^{iℓy}, T(k²)ϕ ⊗ e^{iky})
= { 2π (ψ, T(ℓ²)ϕ) if k = ℓ,
    0 if k ≠ ℓ,

and also

(T(ℓ²)*ψ ⊗ e^{iℓy}, ϕ ⊗ e^{iky}) = { 2π (T(ℓ²)*ψ, ϕ) = 2π (ψ, T(ℓ²)ϕ) if k = ℓ,
                                      0 if k ≠ ℓ.

This proves formula (11.56), since the set

{ ϕ ⊗ e^{iky} : ϕ ∈ C^∞(∂Ω), k ∈ Z }

is dense in C^∞(∂Ω × S).

By virtue of formulas (11.55) and (11.56), just as in the proof of Proposition 11.24 we can prove that conditions (i) and (ii) are equivalent. The proof of Lemma 11.28 is complete. □

The next proposition gives the most important relationship between codim R(T) and codim R(T(α′)) when α′ = ℓ² for ℓ ∈ Z.

Proposition 11.29 Assume that the ranges R(T) and R(T(ℓ²)) for all ℓ ∈ Z are closed. Then the following two conditions are equivalent:


(i) codim R(T) < ∞.
(ii) There exists a finite subset J of Z such that

codim R(T(ℓ²)) < ∞ if ℓ ∈ J,
codim R(T(ℓ²)) = 0 if ℓ ∉ J.

Moreover, in this case, we have the formula

codim R(T) = Σ_{ℓ∈J} codim R(T(ℓ²)).

Proposition 11.29 is an immediate consequence of the closed range theorem (Theorem 5.53) and Lemma 11.28. □
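The dimension and codimension counts in Propositions 11.24 and 11.29 ultimately rest on the norm equivalence (11.52), whose proof uses the elementary two-sided spectral bound ((1+λ)^t + (1+ℓ²)^t)/2 ≤ (1+λ+ℓ²)^t ≤ 2^t((1+λ)^t + (1+ℓ²)^t) for t ≥ 0. This bound can be sanity-checked numerically; the following Python sketch is an illustration only, not part of the book:

```python
import numpy as np

rng = np.random.default_rng(0)
lam = rng.uniform(0.0, 50.0, size=1000)                 # eigenvalues lambda_j >= 0
ell = rng.integers(-20, 21, size=1000).astype(float)    # Fourier modes on the circle S

for t in [0.0, 0.5, 1.0, 2.5]:
    a = (1.0 + lam) ** t
    b = (1.0 + ell ** 2) ** t
    c = (1.0 + lam + ell ** 2) ** t                     # joint spectral weight
    assert np.all(0.5 * (a + b) <= c + 1e-12)           # lower bound
    assert np.all(c <= 2.0 ** t * (a + b) + 1e-12)      # upper bound
print("two-sided spectral bound behind (11.52) verified")
```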

11.4.3 End of Proof of Theorem 11.19

(i) ⟹ (ii): The proof is divided into three steps.

Step 1: Assume that the operator

Ã : H^{s−2+κ}(Ω × S) → H^{s−2}(Ω × S) × H^{s−m−1/2}(∂Ω × S)

is a Fredholm operator. Then it follows from result (III) that the operator

T : H^{s−5/2+κ}(∂Ω × S) → H^{s−m−1/2}(∂Ω × S)

is a Fredholm operator, and

ind Ã = ind T.

Therefore, by applying Proposition 11.24, Lemma 11.25 and Proposition 11.29 we obtain the following two results:

(a) There exists a finite subset I of Z such that

dim N(T(ℓ²)) < ∞ if ℓ ∈ I,
dim N(T(ℓ²)) = 0 if ℓ ∉ I.

(b) The range R(T(ℓ²)) is closed for all ℓ ∈ Z, and there exists a finite subset J of Z such that

codim R(T(ℓ²)) < ∞ if ℓ ∈ J,
codim R(T(ℓ²)) = 0 if ℓ ∉ J.


In other words, the operator

T(ℓ²) : H^{s−5/2+κ}(∂Ω) → H^{s−m−1/2}(∂Ω)

is a Fredholm operator for all ℓ ∈ Z, and it is bijective if ℓ ∉ I ∪ J. Hence, in view of results (I), (II) and (III), it follows that the operator

A(ℓ²) : H^{s−2+κ}(Ω) → H^{s−2}(Ω) × H^{s−m−1/2}(∂Ω)

is a Fredholm operator for all ℓ ∈ Z, and it is bijective if ℓ ∉ I ∪ J.

Step 2: Next we show that

ind A(α) = 0 for all α ≥ 0.

First, we observe that the domain D(A(α)) does not depend on α ≥ 0. Take an integer ℓ such that ℓ ∉ I ∪ J, and let α′ := ℓ². Then we have, for all u ∈ D(A(α)),

A(α)u = {(A − α)u, Bu} = {(A − α′)u, Bu} + {(α′ − α)u, 0} = A(α′)u + {(α′ − α)u, 0},

that is,

A(α) = A(α′) + {(α′ − α) I, 0}.

Since the operator A(α′) is bijective, this gives that

A(α)A(α′)^{−1} = I + {(α′ − α) I, 0} A(α′)^{−1}.

However, it follows from an application of the closed graph theorem (Theorem 5.50) that the inverse

A(α′)^{−1} : H^{s−2}(Ω) × H^{s−m−1/2}(∂Ω) → H^{s−2+κ}(Ω)

is continuous. Furthermore, by applying Rellich's theorem (Theorem 8.20) we obtain that the injection

H^{s−2+κ}(Ω) → H^{s−2}(Ω)

is compact for κ > 0. Thus we find that the operator

{(α′ − α) I, 0} A(α′)^{−1}

is compact. Therefore, it follows from an application of Theorem 5.48 (Riesz–Schauder) that


ind (A(α)A(α′)^{−1}) = 0.

Hence we have, by Theorem 5.63,

ind A(α) = ind ((A(α)A(α′)^{−1}) A(α′)) = ind (A(α)A(α′)^{−1}) + ind A(α′) = 0.

Step 3: Finally, we show that:

(11.57) There exists a constant R′ > 0 such that if α′ = ℓ² with ℓ ∈ Z and ℓ² ≥ R′, then we have inequality (11.50) for all u ∈ D(A(α′)).

By applying Peetre's theorem (Theorem 5.67) with

X := H^{s−2+κ}(Ω × S), Y := H^{s−2}(Ω × S) × H^{s−m−1/2}(∂Ω × S), Z := L²(Ω × S), T := Ã,

we obtain that there exists a constant C̃ > 0 such that

‖ũ‖²_{H^{s−2+κ}(Ω×S)} ≤ C̃ ( ‖Ãũ‖²_{H^{s−2}(Ω×S)} + |B̃ũ|²_{H^{s−m−1/2}(∂Ω×S)} + ‖ũ‖²_{L²(Ω×S)} )
for all ũ ∈ D(Ã).   (11.58)

Now we take

ũ(x, y) = u(x) ⊗ e^{iℓy} for u ∈ D(A(ℓ²)) and ℓ ∈ Z.

Then we can apply inequality (11.52) to obtain the following three estimates:

• ‖ũ‖²_{H^{s−2+κ}(Ω×S)} ≈ ‖u‖²_{H^{s−2+κ}(Ω)} + (1 + ℓ²)^{s−2+κ} ‖u‖²_{L²(Ω)}.   (11.59a)
• ‖Ãũ‖²_{H^{s−2}(Ω×S)} = ‖(A − ℓ²)u ⊗ e^{iℓy}‖²_{H^{s−2}(Ω×S)}
  ≈ ‖(A − ℓ²)u‖²_{H^{s−2}(Ω)} + (1 + ℓ²)^{s−2} ‖(A − ℓ²)u‖²_{L²(Ω)}.   (11.59b)
• |B̃ũ|²_{H^{s−m−1/2}(∂Ω×S)} = |Bu ⊗ e^{iℓy}|²_{H^{s−m−1/2}(∂Ω×S)}
  ≈ |Bu|²_{H^{s−m−1/2}(∂Ω)} + (1 + ℓ²)^{s−m−1/2} |Bu|²_{L²(∂Ω)}.   (11.59c)


Therefore, by carrying the estimates (11.59a)–(11.59c) into inequality (11.58), we have, with a constant C̃′ > 0 independent of α′ = ℓ²,

‖u‖²_{H^{s−2+κ}(Ω)} + (α′)^{s−2+κ} ‖u‖²_{L²(Ω)}
≤ C̃′ ( ‖(A − α′)u‖²_{H^{s−2}(Ω)} + (α′)^{s−2} ‖(A − α′)u‖²_{L²(Ω)}
      + |Bu|²_{H^{s−m−1/2}(∂Ω)} + (α′)^{s−m−1/2} |Bu|²_{L²(∂Ω)} + ‖u‖²_{L²(Ω)} ).

However, since s − 2 + κ > 0, we can eliminate the last term ‖u‖²_{L²(Ω)} on the right-hand side if α′ is sufficiently large. This proves the desired assertion (11.57).

(ii) ⟹ (i): The proof is divided into two steps.

Step 1: Assume that condition (ii) is satisfied. Then it follows from results (I), (II) and (III) that the operator

T(α) : H^{s−5/2+κ}(∂Ω) → H^{s−m−1/2}(∂Ω)

is a Fredholm operator with index zero for all α ≥ 0, and it is bijective if α = ℓ², ℓ ∈ Z and ℓ² ≥ R′. Thus Proposition 11.24 tells us that

dim N(T) < ∞.

Step 2: We show that the range R(T) is closed and has finite codimension:

codim R(T) < ∞.

Then condition (i) follows from result (III).

In view of Proposition 11.29, it suffices to prove the closedness of R(T). To do this, we show that there exists a constant C̃ > 0 such that

|ϕ̃|²_{H^{s−5/2+κ}(∂Ω×S)} ≤ C̃ ( ‖T ϕ̃‖²_{H^{s−m−1/2}(∂Ω×S)} + |ϕ̃|²_{L²(∂Ω×S)} )
for all ϕ̃ ∈ D(T).   (11.60)

Then the closedness of R(T) follows from an application of Peetre's theorem (Theorem 5.67).

We let

I := { ℓ ∈ Z : ℓ² < R′ }.

Since the operators T(ℓ²) for ℓ ∈ I are Fredholm operators and I is a finite set, by applying Theorem 5.67 with


X := H^{s−5/2+κ}(∂Ω), Y := H^{s−m−1/2}(∂Ω), Z := L²(∂Ω), T := T(ℓ²),

we obtain that there exists a constant C_I > 0 independent of ℓ ∈ I such that

|ϕ|²_{H^{s−5/2+κ}(∂Ω)} ≤ C_I ( |T(ℓ²)ϕ|²_{H^{s−m−1/2}(∂Ω)} + |ϕ|²_{L²(∂Ω)} ) for all ϕ ∈ D(T(ℓ²)).

Thus, by using inequality (11.52) with M := ∂Ω we obtain that

|ϕ ⊗ e^{iℓy}|²_{H^{s−5/2+κ}(∂Ω×S)}
≤ C̃_I ( |T(ℓ²)ϕ ⊗ e^{iℓy}|²_{H^{s−m−1/2}(∂Ω×S)} + |ϕ ⊗ e^{iℓy}|²_{L²(∂Ω×S)} )
for all ϕ ∈ D(T(ℓ²)).   (11.61)

Here C̃_I > 0 is a constant independent of ℓ ∈ I.

On the other hand, by using inequality (11.52) we find that inequality (11.50) is equivalent to the following:

‖u ⊗ e^{iℓy}‖²_{H^{s−2+κ}(Ω×S)}
≤ C̃ ( ‖Ã(u ⊗ e^{iℓy})‖²_{H^{s−2}(Ω×S)} + |Bu ⊗ e^{iℓy}|²_{H^{s−m−1/2}(∂Ω×S)} )
for all u ∈ D(A(ℓ²)).

Here ℓ ∈ Z \ I and C̃ > 0 is a constant independent of ℓ ∈ Z \ I.

We take

u = P(ℓ²)ϕ for ϕ ∈ D(T(ℓ²)) and ℓ ∈ Z \ I.

Then, in view of formula (11.53) we obtain that

|ϕ ⊗ e^{iℓy}|²_{H^{s−5/2+κ}(∂Ω×S)} ≤ C̃_{II} |T(ℓ²)ϕ ⊗ e^{iℓy}|²_{H^{s−m−1/2}(∂Ω×S)}
for all ϕ ∈ D(T(ℓ²)),   (11.62)

where C̃_{II} > 0 is a constant independent of ℓ ∈ Z \ I. Indeed, it suffices to note that the Poisson operator P̃ is an isomorphism of H^{s−5/2+κ}(∂Ω × S) onto the null space

N(Ã, s − 2 + κ) = { ũ ∈ H^{s−2+κ}(Ω × S) : Ãũ = 0 in Ω × S }.

Now let ϕ̃ be an arbitrary element of D(T). Then, by using Lemmas 11.22 and 11.23 we find that ϕ̃ can be expanded as follows:


ϕ̃(x′, y) = Σ_{ℓ∈Z} ϕ_ℓ(x′) ⊗ e^{iℓy} in H^{s−5/2+κ}(∂Ω × S),   ϕ_ℓ(x′) ∈ D(T(ℓ²)).

Therefore, by combining inequalities (11.61) and (11.62) we obtain that

‖T ϕ̃‖²_{H^{s−m−1/2}(∂Ω×S)}
= Σ_{ℓ∈Z} |T(ℓ²)ϕ_ℓ ⊗ e^{iℓy}|²_{H^{s−m−1/2}(∂Ω×S)}
= Σ_{ℓ∈I} |T(ℓ²)ϕ_ℓ ⊗ e^{iℓy}|²_{H^{s−m−1/2}(∂Ω×S)} + Σ_{ℓ∈Z\I} |T(ℓ²)ϕ_ℓ ⊗ e^{iℓy}|²_{H^{s−m−1/2}(∂Ω×S)}
≥ (1/C̃_I) Σ_{ℓ∈I} |ϕ_ℓ ⊗ e^{iℓy}|²_{H^{s−5/2+κ}(∂Ω×S)} − Σ_{ℓ∈I} |ϕ_ℓ ⊗ e^{iℓy}|²_{L²(∂Ω×S)}
  + (1/C̃_{II}) Σ_{ℓ∈Z\I} |ϕ_ℓ ⊗ e^{iℓy}|²_{H^{s−5/2+κ}(∂Ω×S)}
≥ min(1/C̃_I, 1/C̃_{II}) Σ_{ℓ∈Z} |ϕ_ℓ ⊗ e^{iℓy}|²_{H^{s−5/2+κ}(∂Ω×S)} − Σ_{ℓ∈Z} |ϕ_ℓ ⊗ e^{iℓy}|²_{L²(∂Ω×S)}
= min(1/C̃_I, 1/C̃_{II}) |ϕ̃|²_{H^{s−5/2+κ}(∂Ω×S)} − |ϕ̃|²_{L²(∂Ω×S)}.

This proves the desired inequality (11.60). The proof of Theorem 11.19 is now complete. □



11.4.4 Proof of Corollary 11.20

In view of Theorem 11.19, it suffices to prove the following: If the operator

Ã : H^{s−2+κ}(Ω × S) → H^{s−2}(Ω × S) × H^{s−m−1/2}(∂Ω × S)

is a Fredholm operator and if 1 < κ < 2, then there exists a constant R > 0 such that if α ≥ R, the operator

A(α) : H^{s−2+κ}(Ω) → H^{s−2}(Ω) × H^{s−m−1/2}(∂Ω)

is bijective and we have inequality (11.51) for all u ∈ D(A(α)).

Step 1: First, we show that there exists a constant R > 0 such that if α ≥ R, then we have, for all w ∈ D(A(α)) satisfying Bw = 0 on ∂Ω,


‖w‖²_{H^{s−2+κ}(Ω)} + α^{s−2+κ} ‖w‖²_{L²(Ω)}
≤ C″ ( ‖(A − α)w‖²_{H^{s−2}(Ω)} + α^{s−2} ‖(A − α)w‖²_{L²(Ω)} ),   (11.63)

with a constant C″ > 0 independent of α ≥ R.

We choose a function ζ ∈ C₀^∞(R) such that 0 ≤ ζ(y) ≤ 1 on R and supp ζ ⊂ [π/3, 5π/3], and let

w̃(x, y) = w(x) ⊗ ζ(y) e^{i√α y} for α ≥ 0 and i = √−1.

Then we have the formulas

Ãw̃ = (A + ∂²/∂y²) w̃
    = (A − α)w(x) ⊗ ζ(y)e^{i√α y} + w(x) ⊗ ζ″(y)e^{i√α y} + 2(i√α) w(x) ⊗ ζ′(y)e^{i√α y} in Ω × S,

and

B̃w̃(x′, y) = Bw(x′) ⊗ ζ(y)e^{i√α y} = 0 on ∂Ω × S.

Hence, by using inequality (11.52) we find that w̃ ∈ D(Ã). Therefore, by applying inequality (11.58) to the function w̃(x, y) = w(x) ⊗ ζ(y)e^{i√α y} we obtain that

‖w ⊗ ζ e^{i√α y}‖²_{H^{s−2+κ}(Ω×S)}   (11.64)
≤ C̃ ( ‖Ã(w ⊗ ζ e^{i√α y})‖²_{H^{s−2}(Ω×S)} + ‖w ⊗ ζ e^{i√α y}‖²_{L²(Ω×S)} )
≤ C̃ ( ‖(A − α)w ⊗ ζ e^{i√α y}‖²_{H^{s−2}(Ω×S)} + ‖w ⊗ ζ″ e^{i√α y}‖²_{H^{s−2}(Ω×S)}
     + 4α ‖w ⊗ ζ′ e^{i√α y}‖²_{H^{s−2}(Ω×S)} + ‖w ⊗ ζ e^{i√α y}‖²_{L²(Ω×S)} ).

We can estimate each term of inequality (11.64) in the following way (cf. the proof of inequality (11.52)):

(i) First, we have the inequality

11.4 Unique Solvability Theorem for General Boundary Value Problems

• ‖(A − α)w ⊗ ζ e^{i√α y}‖²_{H^{s−2}(Ω×S)}
  ≈ ‖(A − α)w‖²_{H^{s−2}(Ω)} |ζ e^{i√α y}|²_{L²(S)} + ‖(A − α)w‖²_{L²(Ω)} |ζ e^{i√α y}|²_{H^{s−2}(S)}
  ≤ c₁ ( ‖(A − α)w‖²_{H^{s−2}(Ω)} + α^{s−2} ‖(A − α)w‖²_{L²(Ω)} ),   (11.65)

where c₁ > 0 is a constant depending only on ζ and s. Indeed, it suffices to note that

|ζ e^{i√α y}|²_{H^{s−2}(S)} = |(1 − d²/dy²)^{(s−2)/2} (ζ e^{i√α y})|²_{L²(S)}   (11.66)
≈ ∫_R (1 + η²)^{s−2} |ζ̂(η − √α)|² dη = ∫_R (1 + (η + √α)²)^{s−2} |ζ̂(η)|² dη
≤ 4^{s−2} ( ∫_R (1 + η²)^{s−2} |ζ̂(η)|² dη + α^{s−2} ∫_R |ζ̂(η)|² dη )
= 4^{s−2} ( |ζ|²_{H^{s−2}(S)} + α^{s−2} |ζ|²_{L²(S)} ) for every s ≥ 2.

(ii) Secondly, we have the inequality

• ‖w ⊗ ζ′ e^{i√α y}‖²_{H^{s−2}(Ω×S)}
  ≈ ‖w‖²_{H^{s−2}(Ω)} |ζ′ e^{i√α y}|²_{L²(S)} + ‖w‖²_{L²(Ω)} |ζ′ e^{i√α y}|²_{H^{s−2}(S)}
  ≤ c₂ ( ‖w‖²_{H^{s−2}(Ω)} + α^{s−2} ‖w‖²_{L²(Ω)} ),   (11.67)

where c₂ > 0 is a constant depending only on ζ and s.

(iii) Thirdly, we have the inequality

• ‖w ⊗ ζ″ e^{i√α y}‖²_{H^{s−2}(Ω×S)} ≤ c₃ ( ‖w‖²_{H^{s−2}(Ω)} + α^{s−2} ‖w‖²_{L²(Ω)} ),   (11.68)

where c₃ > 0 is a constant depending only on ζ and s.

(iv) Fourthly, by using inequality (11.65) with s := 2 and s := s + κ we have the two inequalities


• ‖w ⊗ ζ e^{i√α y}‖²_{L²(Ω×S)} ≈ ‖w‖²_{L²(Ω)}.   (11.69a)
• ‖w ⊗ ζ e^{i√α y}‖²_{H^{s−2+κ}(Ω×S)}
  ≈ ‖w‖²_{H^{s−2+κ}(Ω)} |ζ e^{i√α y}|²_{L²(S)} + ‖w‖²_{L²(Ω)} |ζ e^{i√α y}|²_{H^{s−2+κ}(S)}
  ≥ c₄ ‖w‖²_{H^{s−2+κ}(Ω)} + c₅ α^{s−2+κ} ‖w‖²_{L²(Ω)} − c₆ ‖w‖²_{L²(Ω)}.   (11.69b)

Here c₄, c₅, c₆ are positive constants depending only on ζ and s. Indeed, it suffices to note that

|ζ e^{i√α y}|²_{H^{s−2+κ}(S)} ≈ ∫_R (1 + (η + √α)²)^{s−2+κ} |ζ̂(η)|² dη
≥ 4^{−(s−2+κ)} α^{s−2+κ} ∫_R |ζ̂(η)|² dη − ∫_R (1 + η²)^{s−2+κ} |ζ̂(η)|² dη
= 4^{−(s−2+κ)} α^{s−2+κ} |ζ|²_{L²(S)} − |ζ|²_{H^{s−2+κ}(S)}.

Therefore, by carrying these inequalities (11.65)–(11.69) into inequality (11.64), we have, with a constant C₁ > 0 independent of α ≥ 1,

‖w‖²_{H^{s−2+κ}(Ω)} + α^{s−2+κ} ‖w‖²_{L²(Ω)}
≤ C₁ ( ‖(A − α)w‖²_{H^{s−2}(Ω)} + α^{s−2} ‖(A − α)w‖²_{L²(Ω)}
     + α ‖w‖²_{H^{s−2}(Ω)} + α^{s−1} ‖w‖²_{L²(Ω)} ).   (11.70)

To eliminate the term α‖w‖²_{H^{s−2}(Ω)} on the right-hand side of inequality (11.70), we need the following interpolation inequalities (see [164, Ehrling's inequality]):

(a) For every ε > 0, there exists a constant C_ε > 0 such that

‖u‖²_{H^{s−1}(Ω)} ≤ ε ‖u‖²_{H^{s−2+κ}(Ω)} + C_ε ‖u‖²_{L²(Ω)} for u ∈ H^{s−2+κ}(Ω).   (11.71)

(b) There exists a constant C₂ > 0 independent of α ≥ 0 such that

α ‖v‖²_{H^{s−2}(Ω)} ≤ C₂ ( ‖v‖²_{H^{s−1}(Ω)} + α^{s−1} ‖v‖²_{L²(Ω)} ) for v ∈ H^{s−1}(Ω).   (11.72)

Inequality (11.72) is an immediate consequence of the following elementary inequality:


α (1 + |ξ|²)^{s−2} ≤ (1 + |ξ|²)^{s−1} + α^{s−1} for all α ≥ 0 and ξ ∈ R^n.

By applying inequalities (11.71) and (11.72) to the function w, we obtain that

α ‖w‖²_{H^{s−2}(Ω)} ≤ ε C₂ ‖w‖²_{H^{s−2+κ}(Ω)} + C₂ ( C_ε + α^{s−1} ) ‖w‖²_{L²(Ω)},

and hence (taking ε := 1/(2C₁C₂)) that

α C₁ ‖w‖²_{H^{s−2}(Ω)} ≤ (1/2) ‖w‖²_{H^{s−2+κ}(Ω)} + C₃ α^{s−1} ‖w‖²_{L²(Ω)},

with a constant C₃ > 0 independent of α ≥ 1. Therefore, by carrying this into inequality (11.70) we have, with another constant C₄ > 0,

‖w‖²_{H^{s−2+κ}(Ω)} + α^{s−2+κ} ‖w‖²_{L²(Ω)}
≤ C₄ ( ‖(A − α)w‖²_{H^{s−2}(Ω)} + α^{s−2} ‖(A − α)w‖²_{L²(Ω)} + α^{s−1} ‖w‖²_{L²(Ω)} ).

Since s − 2 + κ > s − 1, we can eliminate the last term α^{s−1}‖w‖²_{L²(Ω)} on the right-hand side if α is sufficiently large. This proves inequality (11.63).

Step 2: Inequality (11.63) tells us that the operator A(α) is injective for all α ≥ R; hence it is bijective for all α ≥ R, since ind A(α) = 0 for all α ≥ 0.

Step 3: Finally, we show that inequality (11.51) holds true for all u ∈ D(A(α)). We may assume that

(1) R′ = ℓ₀² for some positive integer ℓ₀;
(2) R ≥ R′.

Here R′ is the constant in condition (ii) of Theorem 11.19 and R is the constant in Step 1. Thus, for any α ≥ R, we can choose a positive integer ℓ ≥ ℓ₀ such that

ℓ² ≤ α ≤ (ℓ + 1)².

We let α′ := ℓ².
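The elementary inequality α(1 + |ξ|²)^{s−2} ≤ (1 + |ξ|²)^{s−1} + α^{s−1} invoked above follows from a case split: if α ≤ 1 + |ξ|² then α(1 + |ξ|²)^{s−2} ≤ (1 + |ξ|²)^{s−1}, while if α > 1 + |ξ|² then α(1 + |ξ|²)^{s−2} ≤ α · α^{s−2} = α^{s−1}, using s ≥ 2. A numerical check (an illustration only, assuming s ≥ 2):

```python
import numpy as np

rng = np.random.default_rng(2)
alpha = rng.uniform(0.0, 100.0, size=2000)   # spectral parameter alpha >= 0
xi2 = rng.uniform(0.0, 100.0, size=2000)     # |xi|^2
X = 1.0 + xi2

for s in [2.0, 2.5, 3.0, 4.0]:
    lhs = alpha * X ** (s - 2.0)
    rhs = X ** (s - 1.0) + alpha ** (s - 1.0)
    assert np.all(lhs <= rhs + 1e-9)
print("elementary inequality behind (11.72) verified")
```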

Then we have the inequalities

α′ ≤ α ≤ (ℓ + 1)² ≤ 4α′,
α − α′ ≤ 2ℓ + 1 ≤ 3√α′.   (11.73)

Now let u be an arbitrary element of D(A(α)). Since α′ = ℓ² ≥ ℓ₀² = R′, it follows from Theorem 11.19 that there exists a unique solution v ∈ H^{s−2+κ}(Ω) of the boundary value problem




(A − α′)v = 0 in Ω,   Bv = Bu on ∂Ω,   (11.74)

and that

‖v‖²_{H^{s−2+κ}(Ω)} + (α′)^{s−2+κ} ‖v‖²_{L²(Ω)}
≤ C′ ( |Bu|²_{H^{s−m−1/2}(∂Ω)} + (α′)^{s−m−1/2} |Bu|²_{L²(∂Ω)} ).

By inequalities (11.73), this gives that

‖v‖²_{H^{s−2+κ}(Ω)} + α^{s−2+κ} ‖v‖²_{L²(Ω)}
≤ C₅ ( |Bu|²_{H^{s−m−1/2}(∂Ω)} + α^{s−m−1/2} |Bu|²_{L²(∂Ω)} ).   (11.75)

Here C₅ > 0 is a constant independent of α ≥ R.

We let w := u − v. Then, in view of formula (11.74) it follows that

(A − α)w = (A − α)u − (α′ − α)v in Ω,   Bw = 0 on ∂Ω.

Thus we can apply inequality (11.63) to obtain that

‖w‖²_{H^{s−2+κ}(Ω)} + α^{s−2+κ} ‖w‖²_{L²(Ω)}
≤ C″ ( ‖(A − α)u − (α′ − α)v‖²_{H^{s−2}(Ω)} + α^{s−2} ‖(A − α)u − (α′ − α)v‖²_{L²(Ω)} )
≤ 2C″ ( ‖(A − α)u‖²_{H^{s−2}(Ω)} + (α′ − α)² ‖v‖²_{H^{s−2}(Ω)}
      + α^{s−2} ‖(A − α)u‖²_{L²(Ω)} + α^{s−2} (α′ − α)² ‖v‖²_{L²(Ω)} ).

Furthermore, in view of inequalities (11.73) and (11.71), this gives that


‖w‖²_{H^{s−2+κ}(Ω)} + α^{s−2+κ} ‖w‖²_{L²(Ω)}
≤ C₆ ( ‖(A − α)u‖²_{H^{s−2}(Ω)} + α^{s−2} ‖(A − α)u‖²_{L²(Ω)}
     + ‖v‖²_{H^{s−1}(Ω)} + α^{s−1} ‖v‖²_{L²(Ω)} ).   (11.76)

Here C₆ > 0 is a constant independent of α ≥ R. Since s − 2 + κ > s − 1, by combining inequalities (11.75) and (11.76) we obtain the desired inequality (11.51). The proof of Corollary 11.20 is complete. □

11.5 Notes and Comments

Section 11.1: Theorem 11.1 is adapted from the book of Gilbarg–Trudinger [74], where a thorough treatment of quasilinear elliptic equations is given.

Section 11.2: The proof of Theorem 11.4, based on the jump formula, may conceivably be new. Theorem 11.5 is an expression of the fact that every solution u of the equation Au = 0 can be expressed by means of a single layer potential.

Section 11.3: The main idea of the proof of Theorems 11.10–11.18 is due to Hörmander [85] and Seeley [168], and the details were carried out by Taira [181].

Section 11.4: Theorem 11.19 and Corollary 11.20 are adapted from Taira [191] in such a way as to make them accessible to graduate students and advanced undergraduates as well. It is worth pointing out here that the key lemma in the proof of Theorem 11.19 is Lemma 11.23, which follows from the unique solvability of the Dirichlet problem. Hence the methods and results in this section can be extended to treat general boundary value problems for degenerate elliptic differential operators of second order which enjoy an existence and uniqueness theory for the Dirichlet problem in the framework of appropriate function spaces. For detailed studies of the Dirichlet problem for such operators, the reader might be referred to Oleĭnik–Radkevič [138] and Stroock–Varadhan [178], which are based on the work of Fichera [61].

There are many topics on elliptic boundary value problems which we have not touched on. The reader is referred especially to Agmon [3], Evans [54], Folland [62], Grubb [77], Hörmander [91], Lions–Magenes [116], López-Gómez [117], McLean [125], Rempel–Schulze [150], Schechter [161], Troianiello [226] and Wloka [239] for more material. Višik [229], Agranovich–Višik [6], Višik–Èskin [230], Vaĭnberg–Grušin [228] and Ladyzhenskaya–Ural'tseva [109] are the Russian classics for the theory of elliptic boundary value problems.

Part V

Markov Processes, Feller Semigroups and Boundary Value Problems

Chapter 12

Markov Processes, Transition Functions and Feller Semigroups

This chapter is the heart of the subject, and its content may be summarized as in Table 12.1 below. In this chapter we introduce a class of (temporally homogeneous) Markov processes which we will deal with in this book (Definition 12.3). Intuitively, the Markov property is that the prediction of subsequent motion of a physical particle, knowing its position at time t, depends neither on the value of t nor on what has been observed during the time interval [0, t); that is, a physical particle “starts afresh” (see formula (12.2)). From the point of view of analysis, however, the transition function of a Markov process is something more convenient than the Markov process itself. In fact, it can be shown that the transition functions of Markov processes generate solutions of certain parabolic partial differential equations such as the classical diffusion equation; and, conversely, these differential equations can be used to construct and study the transition functions and the Markov processes themselves. In Sect. 12.1 we give the precise definition of a Markov transition function adapted to the Hille–Yosida theory of semigroups (Definition 12.4). A Markov process is called a strong Markov process if the “starting afresh” property holds not only for every fixed moment but also for suitable random times. In Sect. 12.1.7 we formulate precisely this “strong” Markov property (Definition 12.26), and give a useful criterion for the strong Markov property (Theorem 12.27). In Sect. 12.1.8 we introduce the basic notion of uniform stochastic continuity of transition functions (Definition 12.28), and give simple criteria for the strong Markov property in terms of transition functions (Theorems 12.27 and 12.29). In Sect. 12.2 we introduce a class of semigroups associated with Markov processes (Definition 12.31), called Feller semigroups, and we give a characterization of Feller semigroups in terms of Markov transition functions (Theorems 12.34 and 12.37). 
Section 12.3 is devoted to a version of the Hille–Yosida theorem (Theorem 5.14) adapted to the present context. We prove generation theorems for Feller semigroups (Theorems 12.38 and 12.53) which form a functional analytic background for the proof of Theorem 1.38 in Chap. 13. In particular, Theorem 12.38 and Corollary 12.39 give useful criteria in terms of maximum principles.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. K. Taira, Functional Analytic Techniques for Diffusion Processes, Springer Monographs in Mathematics, https://doi.org/10.1007/978-981-19-1099-9_12

In Sections 12.4 and 12.5,


Table 12.1 An overview of this chapter

Probability: strong Markov process χ = (x_t, F, F_t, P_x); Markov transition function p_t(·, dy); Chapman–Kolmogorov equation; various diffusion phenomena.

Functional analysis: Feller semigroup {T_t}, T_t f(·) = ∫ p_t(·, dy) f(y); infinitesimal generator A, T_t = exp[tA]; semigroup property T_{t+s} = T_t · T_s; function spaces C(K), C₀(K).

Boundary value problems: Waldenfels integro-differential operator W = P + S; Ventcel' boundary condition L.
following Ventcel’ (Wentzell) [236] we study the problem of determining all possible boundary conditions for multi-dimensional diffusion processes. More precisely, we describe analytically the infinitesimal generator A of a Feller semigroup {Tt } in the case where the state space is the closure D of a bounded domain D in Euclidean space R N (Theorems 12.55 and 12.57). Theorems 12.55 and 12.57 are essentially due to Ventcel’ [236]. Our proof of these theorems follows Bony–Courrège–Priouret [22], where the infinitesimal generators of Feller semigroups are studied in great detail in terms of the maximum principle. Analytically, a Markovian particle in D is governed by an integro-differential operator W , called Waldenfels operator, in the interior D of the domain, and it obeys a boundary condition L, called Ventcel’ boundary condition, on the boundary ∂ D of the domain. Probabilistically, a Markovian particle moves both by jumps and continuously in the state space and it obeys the Ventcel’ boundary condition which consists of six terms corresponding to a diffusion along the boundary, an absorption phenomenon, a reflection phenomenon, a sticking (or viscosity) phenomenon and a jump phenomenon on the boundary and an inward jump phenomenon from the boundary (see Figs. 12.16, 12.17 and 12.18. For the probabilistic meanings of Ventcel’ boundary conditions, the reader might be referred to Dynkin–Yushkevich [47]. In this way, we can reduce the problem of existence of Feller semigroups to the unique solvability of the boundary value problem for Waldenfels integro-differential operators W with Ventcel’ boundary conditions L in the theory of partial differential equations. In Sect. 12.6 we prove general existence theorems for Feller semigroups in terms of boundary value problems in the case when the measures e(x 0 , ·) in formula (12.57) and the measures ν(x  , ·) in formula (12.66) identically vanish in D and on ∂ D, respectively (see Theorems 12.74, 12.77 and 12.81). 
In other words, we confine ourselves to a class of Feller semigroups whose infinitesimal generators have no integro-differential operator term in formulas (12.57) and (12.66).


12.1 Markov Processes and Transition Functions

In 1828 the English botanist R. Brown observed that pollen grains suspended in water move chaotically, incessantly changing their direction of motion. The physical explanation of this phenomenon is that a single grain suffers innumerable collisions with the randomly moving molecules of the surrounding water. A mathematical theory for Brownian motion was put forward by Einstein [50] in 1905. Let p(t, x, y) be the probability density function that a one-dimensional Brownian particle starting at position x will be found at position y at time t. Einstein derived the following formula from statistical mechanical considerations:

p(t, x, y) = (1/√(2πDt)) exp( −(y − x)² / (2Dt) ).

Here D is a positive constant determined by the radius of the particle, the interaction of the particle with the surrounding molecules, the temperature and the Boltzmann constant. This gives an accurate method of measuring the Avogadro number by observing particles. Einstein's theory was experimentally tested by Perrin [145] between 1906 and 1909. In Sect. 12.1 we give the precise definition of a Markov transition function adapted to the Hille–Yosida theory of semigroups (Definition 12.4).
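Einstein's density p(t, x, y) = exp(−(y − x)²/(2Dt))/√(2πDt) is the fundamental solution of the diffusion equation ∂p/∂t = (D/2) ∂²p/∂y², a fact one can check numerically by finite differences. The following Python sketch is an illustration only, not part of the book:

```python
import numpy as np

D = 0.7  # arbitrary positive diffusion coefficient for the check

def p(t, x, y):
    # Einstein's transition density of one-dimensional Brownian motion
    return np.exp(-(y - x) ** 2 / (2.0 * D * t)) / np.sqrt(2.0 * np.pi * D * t)

x, t = 0.0, 1.3
y = np.linspace(-2.0, 2.0, 9)
h, k = 1e-4, 1e-5                  # space and time steps for finite differences

dt = (p(t + k, x, y) - p(t - k, x, y)) / (2.0 * k)                     # d p / d t
dyy = (p(t, x, y + h) - 2.0 * p(t, x, y) + p(t, x, y - h)) / h ** 2    # d^2 p / d y^2
assert np.allclose(dt, 0.5 * D * dyy, atol=1e-6)
print("p satisfies the diffusion equation dp/dt = (D/2) d^2p/dy^2")
```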

12.1.1 Definitions of Markov Processes

Brownian motion was put on a firm mathematical foundation for the first time by Wiener [237] in 1923. Let Ω be the space of continuous functions ω : [0, ∞) → R with coordinates xt(ω) = ω(t), and let F be the smallest σ-algebra in Ω which contains all sets of the form {ω ∈ Ω : a ≤ xt(ω) < b} for t ≥ 0 and a < b. Wiener constructed probability measures Px, x ∈ R, on F for which the following formula holds true:
\[
P_x\{\omega \in \Omega : a_1 \le x_{t_1}(\omega) < b_1,\ a_2 \le x_{t_2}(\omega) < b_2,\ \ldots,\ a_n \le x_{t_n}(\omega) < b_n\} \tag{12.1}
\]
\[
= \int_{a_1}^{b_1} \int_{a_2}^{b_2} \cdots \int_{a_n}^{b_n} p(t_1, x, y_1)\, p(t_2 - t_1, y_1, y_2) \cdots p(t_n - t_{n-1}, y_{n-1}, y_n)\, dy_1\, dy_2 \cdots dy_n,
\]
\[
0 < t_1 < t_2 < \cdots < t_n < \infty.
\]
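Formula (12.1) makes the increments \(x_{t_k} - x_{t_{k-1}}\) independent centered Gaussians with variances \(t_k - t_{k-1}\), which is how Brownian paths are simulated in practice. A Monte Carlo sketch (helper names are ours, not the book's) can check one standard consequence, the covariance \(E[x_{t_1} x_{t_2}] = \min(t_1, t_2)\) for the process started at 0:

```python
import math, random

random.seed(42)

def sample_path(times):
    """Sample (x_{t1}, ..., x_{tn}) for Brownian motion started at 0,
    using the independent-increment structure implied by (12.1)."""
    x, prev, out = 0.0, 0.0, []
    for t in times:
        x += random.gauss(0.0, math.sqrt(t - prev))
        prev = t
        out.append(x)
    return out

t1, t2, N = 0.5, 2.0, 200_000
acc = 0.0
for _ in range(N):
    a, b = sample_path([t1, t2])
    acc += a * b
print("empirical E[x_t1 x_t2]:", acc / N, " theory: min(t1, t2) =", t1)
```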

12 Markov Processes, Transition Functions and Feller Semigroups

This formula (12.1) expresses the "starting afresh" property of Brownian motion: if a Brownian particle reaches a position, then it behaves subsequently as though that position had been its initial position. The measure Px is called the Wiener measure starting at x.

Let (Ω, F) be a measurable space. A non-negative measure P on F is called a probability measure if P(Ω) = 1. The triple (Ω, F, P) is called a probability space. The elements of Ω are known as sample points, those of F as events, and the values P(A), A ∈ F, are their probabilities. An extended real-valued, F-measurable function X on Ω is called a random variable. The integral
\[ \int_\Omega X(\omega)\, dP \]
(if it exists) is called the expectation of X, and is denoted by E(X).

We begin with a review of conditional probabilities and conditional expectations (see Sects. 5.5 and 5.6). Let G be a σ-algebra contained in F. If X is an integrable random variable, then the conditional expectation of X for given G is any random variable Y which satisfies the following two conditions (CE1) and (CE2):

(CE1) The function Y is G-measurable.
(CE2) \(\int_A Y(\omega)\, dP = \int_A X(\omega)\, dP\) for all A ∈ G.

We recall that conditions (CE1) and (CE2) determine Y up to a set in G of measure zero. We shall write Y = E(X | G). When X is the characteristic function χB of a set B ∈ F, we shall write
\[ P(B \mid \mathcal{G}) = E(\chi_B \mid \mathcal{G}). \]
The function P(B | G) is called the conditional probability of B for given G. This function can also be characterized as follows:

(CP1) The function P(B | G) is G-measurable.
(CP2) P(A ∩ B) = E(P(B | G); A) for every A ∈ G. Namely, we have, for every A ∈ G,
\[ P(A \cap B) = \int_A P(B \mid \mathcal{G})(\omega)\, dP. \]

It should be emphasized that the function P(B | G) is determined up to a set in G of P-measure zero; that is, it is an equivalence class of G-measurable functions on Ω with respect to the measure P.

Markov processes are an abstraction of the idea of Brownian motion. Let K be a locally compact, separable metric space and B the σ-algebra of all Borel sets in K, that is, the smallest σ-algebra containing all open sets in K. Let (Ω, F, P) be a probability space. A function X defined on Ω taking values in K is called a random variable if it satisfies the condition

\[ X^{-1}(E) = \{X \in E\} \in \mathcal{F} \quad \text{for all } E \in \mathcal{B}. \]
We express this by saying that X is F/B-measurable. A family {xt}t≥0 of random variables is called a stochastic process, and it may be thought of as the motion in time of a physical particle. The space K is called the state space and Ω the sample space. For a fixed ω ∈ Ω, the function xt(ω), t ≥ 0, defines in the state space K a trajectory or path of the process corresponding to the sample point ω.

In this generality the notion of a stochastic process is of course not so interesting. The most important class of stochastic processes is the class of Markov processes, which is characterized by the Markov property. Intuitively, this is the principle of the lack of any "memory" in the system. More precisely, the temporally homogeneous Markov property, or simply Markov property, is that the prediction of the subsequent motion of a physical particle, knowing its position at time t, depends neither on the value of t nor on what has been observed during the time interval [0, t); that is, a physical particle "starts afresh".

This vague idea can be made precise and effective in several ways. If {Zλ}λ∈Λ is a family of random variables, we let (see Proposition 3.11)
\[ \sigma(Z_\lambda : \lambda \in \Lambda) = \text{the smallest σ-algebra contained in } \mathcal{F} \text{ with respect to which all } Z_\lambda \text{ are measurable}. \]
If {xt}t≥0 is a stochastic process, we introduce three sub-σ-algebras F≤t, F=t and F≥t of F as follows:
\[
\begin{aligned}
\mathcal{F}_{\le t} &= \sigma(x_s : 0 \le s \le t), \\
\mathcal{F}_{= t} &= \sigma(x_t), \\
\mathcal{F}_{\ge t} &= \sigma(x_s : t \le s < \infty),
\end{aligned}
\]
that is, the smallest σ-algebras contained in F with respect to which all xs, 0 ≤ s ≤ t, the single variable xt, and all xs, t ≤ s < ∞, respectively, are measurable.
Intuitively, an event in F≤t is determined by the behavior of the process {xs } up to time t and an event in F≥t by its behavior after time t. Thus they represent respectively the “past” and “future” relative to the “present” moment. Definition 12.1 A stochastic process X = {xt } is called a Markov process if it satisfies the condition

\[ P(B \mid \mathcal{F}_{\le t}) = P(B \mid \mathcal{F}_{= t}) \quad \text{for any “future” set } B \in \mathcal{F}_{\ge t}. \]
More precisely, we have, for any “future” set B ∈ F≥t,
\[ P(A \cap B) = \int_A P(B \mid \mathcal{F}_{= t})(\omega)\, dP \quad \text{for every “past” set } A \in \mathcal{F}_{\le t}. \]

Intuitively, this means that the conditional probability of a “future” event B given the “present” is the same as the conditional probability of B given the “present” and “past”. An observer may record not only the trajectories of the process, but also some other occurrences, only indirectly related or entirely unrelated to the process. Thus we obtain a broader and more flexible formulation of the Markov property if we enlarge the “past” as follows: Let {Ft }t≥0 be a family of sub-σ-algebras of F which satisfies the following two conditions (a) and (b): (a) If s < t, then Fs ⊂ Ft . (b) For each t ≥ 0, the function xt is Ft /B-measurable, that is, {xt ∈ E} ∈ Ft for all E ∈ B. We express property (a) by saying that the family {Ft } is increasing, and property (b) by saying

that the stochastic process {xt} is adapted to {Ft}. We remark that the family {F≤t}t≥0 satisfies both conditions and is the minimal possible one.

Definition 12.2 Let {xt}t≥0 be a stochastic process and let {Ft}t≥0 be an increasing family of sub-σ-algebras of F. We say that {xt} is a Markov process with respect to {Ft} if it satisfies the following two conditions (i) and (ii):

(i) {xt} is adapted to {Ft}.
(ii) P(B | Ft) = P(B | F=t) for all B ∈ F≥t.

It should be noticed that Definition 12.2 reduces to Definition 12.1 if we take Ft := F≤t. Moreover, enlarging the family {Ft} chosen as the "past" has the effect of making it harder for the Markov property to hold true, while the property becomes more powerful. Now we define a class of (temporally homogeneous) Markov processes which we will deal with in this book:

Definition 12.3 Assume that we are given the following:

(1) A locally compact, separable metric space K and the σ-algebra B of all Borel sets in K. A point ∂ is adjoined to K as the point at infinity if K is not compact, and as an isolated point if K is compact (see Fig. 12.1 below). We let

[Fig. 12.1 The one-point compactification K∂ of K]

\[ K_\partial = K \cup \{\partial\}, \qquad \mathcal{B}_\partial = \text{the σ-algebra in } K_\partial \text{ generated by } \mathcal{B}. \]
The space K∂ = K ∪ {∂} is called the one-point compactification of K.

(2) The space Ω of all mappings ω : [0, ∞] → K∂ such that ω(∞) = ∂ and such that if ω(t) = ∂, then ω(s) = ∂ for all s ≥ t. Let ω∂ be the constant map ω∂(t) = ∂ for all t ∈ [0, ∞].

(3) For each t ∈ [0, ∞], the coordinate map xt defined by xt(ω) = ω(t) for all ω ∈ Ω.

(4) For each t ∈ [0, ∞], a pathwise shift mapping θt : Ω → Ω defined by the formula θtω(s) = ω(t + s) for all ω ∈ Ω. We remark that θ∞ω = ω∂ and that xt ∘ θs = xt+s for all t, s ∈ [0, ∞].

(5) A σ-algebra F in Ω and an increasing family {Ft}0≤t≤∞ of sub-σ-algebras of F.

(6) For each x ∈ K∂, a probability measure Px on (Ω, F).

We say that these elements define a temporally homogeneous Markov process or simply Markov process X = (xt, F, Ft, Px) if the following four conditions (i)–(iv) are satisfied:

(i) For each 0 ≤ t < ∞, the function xt is Ft/B∂-measurable, that is, {xt ∈ E} ∈ Ft for all E ∈ B∂.
(ii) For each 0 ≤ t < ∞ and E ∈ B, the function pt(x, E) = Px{xt ∈ E} is a Borel measurable function of x ∈ K.
(iii) Px{ω ∈ Ω : x0(ω) = x} = 1 for each x ∈ K∂.
(iv) For all t, h ∈ [0, ∞], x ∈ K∂ and E ∈ B∂, we have the formula (the Markov property)
\[ P_x\{x_{t+h} \in E \mid \mathcal{F}_t\} = p_h(x_t, E) \quad \text{a.e.,} \tag{12.2} \]
or equivalently

[Fig. 12.2 The transition probability pt(x, E)]

\[ P_x\bigl(A \cap \{x_{t+h} \in E\}\bigr) = \int_A p_h(x_t(\omega), E)\, dP_x(\omega) \quad \text{for all } A \in \mathcal{F}_t. \tag{12.2′} \]

In Definition 12.3, the term 'Markov process' means a family of Markov processes over the measure spaces (Ω, F, Px) with respect to {Ft}, one Markov process for each of the measures Px corresponding to all possible initial positions x ∈ K∂. Here is an intuitive way of thinking about this definition of a Markov process. The sub-σ-algebra Ft may be interpreted as the collection of events which are observed during the time interval [0, t]. The value Px(A), A ∈ F, may be interpreted as the probability of the event A under the condition that a particle starts at position x; hence the value pt(x, E) expresses the transition probability that a particle starting at position x will be found in the set E at time t (see Fig. 12.2). The function pt(x, ·) is called the transition function of the process X. The transition function pt(x, ·) specifies the probability structure of the process. The intuitive meaning of the crucial condition (iv) is that the future behavior of a particle, knowing its history up to time t, is the same as the behavior of a particle starting at xt(ω); that is, a particle starts afresh. By using the Markov property (12.2) repeatedly, we easily obtain the following formula (12.3), analogous to formula (12.1):

\[
P_x\{\omega \in \Omega : x_{t_1}(\omega) \in A_1,\ x_{t_2}(\omega) \in A_2, \ldots, x_{t_n}(\omega) \in A_n\} \tag{12.3}
\]
\[
= \int_{A_1}\int_{A_2}\cdots\int_{A_n} p_{t_1}(x, dy_1)\, p_{t_2 - t_1}(y_1, dy_2) \cdots p_{t_n - t_{n-1}}(y_{n-1}, dy_n),
\]
\[
0 < t_1 < t_2 < \cdots < t_n < \infty, \qquad A_1, A_2, \ldots, A_n \in \mathcal{B}.
\]

A Markovian particle moves in the space K until it “dies” or “disappears” at the time when it reaches the point ∂; hence the point ∂ is called the terminal point or cemetery. With this interpretation in mind, we let ζ(ω) := inf {t ∈ [0, ∞] : xt (ω) = ∂} . The random variable ζ is called the lifetime of the process X . The process X is said to be conservative if it satisfies the condition

Px {ζ = ∞} = 1 for each x ∈ K .

12.1.2 Transition Functions

From the point of view of analysis, the transition function is more convenient to work with than the Markov process itself. In fact, it can be shown that the transition functions of Markov processes generate solutions of certain parabolic partial differential equations, such as the classical diffusion equation; and, conversely, these differential equations can be used to construct and study the transition functions and the Markov processes themselves. Our first job is thus to give the precise definition of a transition function adapted to the Hille–Yosida theory of semigroups:

Definition 12.4 Let (K, ρ) be a locally compact, separable metric space and B the σ-algebra of all Borel sets in K. A function pt(x, E), defined for all t ≥ 0, x ∈ K and E ∈ B, is called a temporally homogeneous Markov transition function on K, or simply Markov transition function on K, if it satisfies the following four conditions (a)–(d):

(a) pt(x, ·) is a non-negative measure on B and pt(x, K) ≤ 1 for each t ≥ 0 and each x ∈ K.
(b) pt(·, E) is a Borel measurable function for each t ≥ 0 and each E ∈ B.
(c) p0(x, {x}) = 1 for each x ∈ K.
(d) (The Chapman–Kolmogorov equation) For any t, s ≥ 0, any x ∈ K and any E ∈ B, we have the formula (see Fig. 12.3 below)
\[ p_{t+s}(x, E) = \int_K p_t(x, dy)\, p_s(y, E). \tag{12.4} \]
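Condition (d) can be checked numerically for a concrete kernel. At the density level, (12.4) says that convolving the Brownian densities for times t and s (Example 12.9, taken here on faith as a Markov transition function) reproduces the density for time t + s; a quadrature sketch with illustrative parameter values:

```python
import math

def p(t, x, y):
    # Brownian transition density (cf. Example 12.9 below)
    return math.exp(-(y - x) ** 2 / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

t, s, x, z = 0.7, 0.3, -0.2, 1.1
h = 0.001
ys = [-15.0 + h * (k + 0.5) for k in range(int(30.0 / h))]  # midpoint rule
lhs = p(t + s, x, z)                                # density of p_{t+s}(x, .)
rhs = sum(p(t, x, y) * p(s, y, z) for y in ys) * h  # int p_t(x, dy) p_s(y, .)
print(lhs, rhs)  # the two sides of (12.4) at the density level
```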

Remark 12.5 It is just condition (d) which reflects the Markov property that a particle starts afresh. Here is an intuitive way of thinking about the above definition of a Markov transition function. The value pt(x, E) expresses the transition probability that a physical particle starting at position x will be found in the set E at time t. Equation (12.4) expresses the idea that a transition from the position x to the set E in time t + s is composed of a transition from x to some position y in time t, followed by a transition from y to the set E in the remaining time s; the latter transition has probability ps(y, E), which depends only on y (see Fig. 12.3):
\[ P_x\{x_{t+s} \in E\} = \int_K P_x\{x_t \in dy\}\, P_y\{x_s \in E\}. \tag{12.4′} \]

Thus a physical particle “starts afresh”; this property is called the Markov property.

[Fig. 12.3 The intuitive meaning of formula (12.4)]

The Chapman–Kolmogorov equation (12.4) tells us that pt(x, K) is monotonically increasing as t ↓ 0, so that the limit
\[ p_{+0}(x, K) = \lim_{t \downarrow 0} p_t(x, K) \]
exists. A Markov transition function pt(x, ·) is said to be normal if it satisfies the condition
\[ p_{+0}(x, K) = \lim_{t \downarrow 0} p_t(x, K) = 1 \quad \text{for every } x \in K. \]

The next theorem justifies the definition of a transition function, and hence it will be fundamental for our further study of Markov processes:

Theorem 12.6 For every Markov process, the function pt, defined by the formula
\[ p_t(x, E) := P_x\{x_t \in E\} \quad \text{for } x \in K,\ E \in \mathcal{B} \text{ and } t \ge 0, \]
is a Markov transition function. Conversely, every normal Markov transition function corresponds to some Markov process.

Here are some important examples of normal transition functions on the line R (see Lamperti [111, Chap. 7, Sect. 8]):

Example 12.7 (uniform motion) If t ≥ 0, x ∈ R and E ∈ B, we let
\[ p_t(x, E) = \chi_E(x + vt), \]
where v is a constant, and χE(y) = 1 if y ∈ E and χE(y) = 0 if y ∉ E. This process, starting at x, moves deterministically with constant velocity v.

Example 12.8 (Poisson process) If t ≥ 0, x ∈ R and E ∈ B, we let
\[ p_t(x, E) = e^{-\lambda t} \sum_{n=0}^{\infty} \frac{(\lambda t)^n}{n!}\, \chi_E(x + n), \]
where λ is a positive constant. This process, starting at x, advances one unit by jumps, and the probability of n jumps between times 0 and t is equal to e^{−λt}(λt)^n/n!.

Example 12.9 (Brownian motion) If t > 0, x ∈ R and E ∈ B, we let
\[ p_t(x, E) = \frac{1}{\sqrt{2\pi t}} \int_E \exp\Bigl( -\frac{(y - x)^2}{2t} \Bigr)\, dy, \]
and p0(x, E) = χE(x). This is a mathematical model of one-dimensional Brownian motion. Its character is quite different from that of the Poisson process; the transition function pt(x, E) satisfies the condition
\[ p_t(x, (x - \varepsilon, x + \varepsilon)) = 1 - o(t), \]
or equivalently,
\[ p_t(x, \mathbf{R} \setminus (x - \varepsilon, x + \varepsilon)) = o(t), \]
for every ε > 0 and every x ∈ R. This means that the process never stands still, unlike the Poisson process. In fact, this process changes state not by jumps but by continuous motion. A Markov process with this property is called a diffusion process.

Example 12.10 (Brownian motion with constant drift) If t > 0, x ∈ R and E ∈ B, we let
\[ p_t(x, E) = \frac{1}{\sqrt{2\pi t}} \int_E \exp\Bigl( -\frac{(y - mt - x)^2}{2t} \Bigr)\, dy, \]
and p0(x, E) = χE(x), where m is a constant. This represents Brownian motion with a constant drift of magnitude m superimposed; the process can be represented as {xt + mt}, where {xt} is Brownian motion on R.
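The o(t) estimate in Example 12.9 can be made concrete. For the Brownian kernel, pt(x, R \ (x − ε, x + ε)) = 2(1 − Φ(ε/√t)), where Φ is the standard normal distribution function; this Gaussian tail identity is standard but not stated in the text. A sketch showing that the ratio to t vanishes as t ↓ 0:

```python
import math

def escape(t, eps):
    # p_t(x, R \ (x - eps, x + eps)) for the Brownian kernel:
    # P(|N(0, t)| >= eps) = 2 * (1 - Phi(eps / sqrt(t)))
    return 2.0 * (1.0 - 0.5 * (1.0 + math.erf(eps / math.sqrt(2.0 * t))))

eps = 0.1
ratios = [escape(t, eps) / t for t in (1e-2, 1e-3, 1e-4)]
print(ratios)  # decreasing toward 0: the escape probability is o(t)
```

For the Poisson process of Example 12.8, by contrast, the same escape probability is of exact order λt, which is why condition (i) of Sect. 12.1.3 fails there.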

Example 12.11 (Cauchy process) If t > 0, x ∈ R and E ∈ B, we let
\[ p_t(x, E) = \frac{1}{\pi} \int_E \frac{t}{t^2 + (y - x)^2}\, dy, \]
and p0(x, E) = χE(x). This process can be thought of as the "trace" on the real line of trajectories of two-dimensional Brownian motion, and it moves by jumps (see Knight [102, Lemma 2.12]). More precisely, if B1(t) and B2(t) are two independent Brownian motions and if T is the first passage time of B1(t) to x, then B2(T) has the Cauchy density
\[ \frac{1}{\pi} \frac{|x|}{x^2 + y^2} \quad \text{for } y \in (-\infty, \infty). \]

Here are two examples of diffusion processes on the closed half-line K = R+ = [0, ∞) in which we must take account of the effect of the boundary point 0 of K (see [111, Chap. 7, Sect. 8]):

Example 12.12 (reflecting barrier Brownian motion) If t > 0, x ∈ K = [0, ∞) and E ∈ B, we let
\[ p_t(x, E) = \frac{1}{\sqrt{2\pi t}} \Bigl( \int_E \exp\Bigl( -\frac{(y - x)^2}{2t} \Bigr) dy + \int_E \exp\Bigl( -\frac{(y + x)^2}{2t} \Bigr) dy \Bigr), \tag{12.5} \]
and p0(x, E) = χE(x). This represents Brownian motion with a reflecting barrier at x = 0; the process may be represented as {|xt|}, where {xt} is Brownian motion on R. Indeed, since {|xt|} goes from x to y if {xt} goes from x to ±y, due to the symmetry of the transition function in Example 12.9 about x = 0 we find that (see Fig. 12.4 below)
\[ p_t(x, E) = P_x\{|x_t| \in E\} = \frac{1}{\sqrt{2\pi t}} \Bigl( \int_E \exp\Bigl( -\frac{(y - x)^2}{2t} \Bigr) dy + \int_E \exp\Bigl( -\frac{(y + x)^2}{2t} \Bigr) dy \Bigr). \]

Example 12.13 (sticking barrier Brownian motion) If t > 0, x ∈ K = [0, ∞) and E ∈ B, we let

[Fig. 12.4 The reflecting barrier]

\[
p_t(x, E) = \frac{1}{\sqrt{2\pi t}} \Bigl( \int_E \exp\Bigl( -\frac{(y - x)^2}{2t} \Bigr) dy - \int_E \exp\Bigl( -\frac{(y + x)^2}{2t} \Bigr) dy \Bigr) + \Bigl( 1 - \frac{1}{\sqrt{2\pi t}} \int_{-x}^{x} \exp\Bigl( -\frac{z^2}{2t} \Bigr) dz \Bigr) \chi_E(0),
\]
and p0(x, E) = χE(x). This represents Brownian motion with a sticking barrier at x = 0. When a Brownian particle reaches the boundary point 0 for the first time, instead of reflecting it sticks there forever; in this case the state 0 is called a trap.

Here is a typical example of a diffusion process on the closed interval K = [0, 1] in which we must take account of the effect of the two boundary points 0 and 1 of K:

Example 12.14 (reflecting barrier Brownian motion) If t > 0, x ∈ K = [0, 1] and E ∈ B, we let
\[ p_t(x, E) = \frac{1}{\sqrt{2\pi t}} \int_E \sum_{n=-\infty}^{\infty} \Bigl( \exp\Bigl( -\frac{(y - x + 2n)^2}{2t} \Bigr) + \exp\Bigl( -\frac{(y + x + 2n)^2}{2t} \Bigr) \Bigr) dy, \]
and p0(x, E) = χE(x). This represents Brownian motion with two reflecting barriers at x = 0 and x = 1.

It was assumed so far that pt(x, K) ≤ 1 for each t ≥ 0 and each x ∈ K. This implies that a Markovian particle may die or disappear in a finite time. Here are three typical examples of an absorbing barrier Brownian motion.

[Fig. 12.5 The absorbing barrier]

Example 12.15 (absorbing barrier Brownian motion) If t > 0, x ∈ K = [0, ∞) and E ∈ B, we let
\[ p_t(x, E) = \frac{1}{\sqrt{2\pi t}} \Bigl( \int_E \exp\Bigl( -\frac{(y - x)^2}{2t} \Bigr) dy - \int_E \exp\Bigl( -\frac{(y + x)^2}{2t} \Bigr) dy \Bigr), \tag{12.6} \]
and p0(x, E) = χE(x). This represents Brownian motion with an absorbing barrier at x = 0; a Brownian particle dies at the first moment when it hits the boundary point x = 0 (see Fig. 12.5). Namely, the boundary point 0 of K is the terminal point.

Example 12.16 (absorbing barrier Brownian motion) If t > 0, x ∈ K = [0, 1] and E ∈ B, we let
\[ p_t(x, E) = \frac{1}{\sqrt{2\pi t}} \int_E \sum_{n=-\infty}^{\infty} \Bigl( \exp\Bigl( -\frac{(y - x + 2n)^2}{2t} \Bigr) - \exp\Bigl( -\frac{(y + x + 2n)^2}{2t} \Bigr) \Bigr) dy, \]
and p0(x, E) = χE(x). This represents Brownian motion with two absorbing barriers at x = 0 and x = 1.

Example 12.17 (absorbing–reflecting barrier Brownian motion) Let λ be a constant such that 0 < λ < 1. If t > 0, x ∈ K = [0, ∞) and E ∈ B, we let

\[
p_t(x, E) = \frac{1}{\sqrt{2\pi t}} \Bigl( \int_E \exp\Bigl( -\frac{(y - x)^2}{2t} \Bigr) dy + \int_E \exp\Bigl( -\frac{(y + x)^2}{2t} \Bigr) dy \Bigr) \tag{12.7}
\]
\[
\qquad - \frac{2(1 - \lambda)}{\lambda} \frac{1}{\sqrt{2\pi t}} \int_E \exp\Bigl( \frac{(1 - \lambda)y}{\lambda} \Bigr) \Bigl( \int_{-\infty}^{-y} \exp\Bigl( \frac{(1 - \lambda)z}{\lambda} \Bigr) \exp\Bigl( -\frac{(z - x)^2}{2t} \Bigr) dz \Bigr) dy,
\]
and p0(x, E) = χE(x). This process {xt} may be thought of as a "combination" of the absorbing and reflecting Brownian motions; the absorbing and reflecting cases are formally obtained by letting λ → 0 and λ → 1, respectively. In fact, it is easy to verify that formulas (12.6) and (12.5) may be obtained from formula (12.7) by letting λ → 0 and λ → 1, respectively.

A Markov transition function pt(x, ·) is said to be conservative if it satisfies the condition that we have, for all t > 0,
\[ p_t(x, K) = 1 \quad \text{for each } x \in K. \]
For example, the reflecting barrier Brownian motion of Example 12.12 is conservative. Indeed, it suffices to note that we have, for all t > 0,
\[
p_t(x, [0, \infty)) = \frac{1}{\sqrt{2\pi t}} \Bigl( \int_0^\infty \exp\Bigl( -\frac{(y - x)^2}{2t} \Bigr) dy + \int_0^\infty \exp\Bigl( -\frac{(y + x)^2}{2t} \Bigr) dy \Bigr) = \frac{1}{\sqrt{2\pi t}} \int_{-\infty}^{\infty} \exp\Bigl( -\frac{(y - x)^2}{2t} \Bigr) dy = 1
\]
for every x ∈ K = [0, ∞).

There is a simple trick which allows us to turn the general case into the conservative case. We add a new point ∂ to the locally compact space K as the point at infinity if K is not compact, and as an isolated point if K is compact, so that the space K∂ = K ∪ {∂} is compact. Then we can extend a Markov transition function pt(x, ·) on K to a Markov transition function p̃t(x, ·) on K∂ by the formulas
\[
\begin{cases}
\tilde{p}_t(x, E) = p_t(x, E) & \text{for } x \in K \text{ and } E \in \mathcal{B}, \\
\tilde{p}_t(x, \{\partial\}) = 1 - p_t(x, K) & \text{for } x \in K, \\
\tilde{p}_t(\partial, K) = 0, \quad \tilde{p}_t(\partial, \{\partial\}) = 1 & \text{for } x = \partial.
\end{cases}
\]
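The extension puts the "missing" mass 1 − pt(x, K) on the cemetery point. For the absorbing barrier kernel (12.6), the surviving mass has the closed form Φ(x/√t) − Φ(−x/√t), a consequence of the reflection principle that is assumed here rather than derived in the text; with it, conservativeness of the extension is immediate:

```python
import math

Phi = lambda u: 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))  # N(0,1) cdf

def surviving_mass(t, x):
    """p_t(x, [0, inf)) for the absorbing kernel (12.6): integrating the
    two Gaussian terms gives Phi(x/sqrt(t)) - Phi(-x/sqrt(t)), i.e. the
    probability that Brownian motion from x > 0 has not yet hit 0."""
    return Phi(x / math.sqrt(t)) - Phi(-x / math.sqrt(t))

t, x = 1.0, 0.5
alive = surviving_mass(t, x)
dead = 1.0 - alive   # mass the extension assigns to the cemetery point
print(alive, dead, alive + dead)  # extended kernel has total mass 1
```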

Intuitively, this means that a Markovian particle moves in the space K until it dies, at which time it reaches the point ∂; hence the point ∂ is the terminal point or cemetery. We remark that our convention is consistent, since Tt f(∂) = f(∂) = 0. In the sequel, we will not distinguish in our notation between p̃t(x, ·) and pt(x, ·); in the cases of interest to us the point ∂ will be absorbing.

12.1.3 Kolmogorov's Equations

Among the first works devoted to Markov processes, the most fundamental was A. N. Kolmogorov's work (1931), where the general concept of a Markov transition function was introduced for the first time and an analytic method of describing Markov transition functions was proposed. We now take a close look at Kolmogorov's work (see Lamperti [111, Chap. 6, Sect. 5]). Let pt(x, ·) be a transition function on R, and assume that the following two conditions (i) and (ii) are satisfied:

(i) For each ε > 0, we have the assertion
\[ \lim_{t \downarrow 0} \frac{1}{t} \sup_{x \in \mathbf{R}} p_t(x, \mathbf{R} \setminus (x - \varepsilon, x + \varepsilon)) = 0. \]

(ii) The three limits
\[ \lim_{t \downarrow 0} \frac{1}{t} \int_{x - \varepsilon}^{x + \varepsilon} p_t(x, dy)\,(y - x)^2 := a(x), \]
\[ \lim_{t \downarrow 0} \frac{1}{t} \int_{x - \varepsilon}^{x + \varepsilon} p_t(x, dy)\,(y - x) := b(x), \]
\[ \lim_{t \downarrow 0} \frac{1}{t} \bigl( p_t(x, \mathbf{R}) - 1 \bigr) := c(x) \]
exist for each x ∈ R.

Physically, the limit a(x) may be thought of as the variance (over ω ∈ Ω) of the instantaneous (with respect to t) velocity when the process is at position x (see Sect. 12.1.1), and the limit b(x) has a similar interpretation as a mean. The transition functions in Examples 12.7, 12.9 and 12.10 satisfy conditions (i) and (ii) with a(x) = 0, b(x) = v, c(x) = 0; a(x) = 1, b(x) = c(x) = 0; and a(x) = 1, b(x) = m, c(x) = 0, respectively, whereas the transition functions in Examples 12.8 and 12.11 do not satisfy condition (i). Furthermore, we assume that the transition function pt(x, ·) has a density p(t, x, y) with respect to the Lebesgue measure dy. Intuitively, the density p(t, x, y) represents the state of the process at position y at time t, starting at the initial state in which a unit mass is at position x. Under certain regularity conditions, Kolmogorov

showed that the density p(t, x, y) is, for fixed y, the fundamental solution of the Cauchy problem
\[
\begin{cases}
\dfrac{\partial p}{\partial t} = \dfrac{a(x)}{2} \dfrac{\partial^2 p}{\partial x^2} + b(x) \dfrac{\partial p}{\partial x} + c(x)\, p & \text{for } t > 0, \\
\lim_{t \downarrow 0} p(t, x, y) = \delta_y(x),
\end{cases} \tag{12.8}
\]
and is, for fixed x, the fundamental solution of the Cauchy problem
\[
\begin{cases}
\dfrac{\partial p}{\partial t} = \dfrac{\partial^2}{\partial y^2} \Bigl( \dfrac{a(y)}{2}\, p \Bigr) - \dfrac{\partial}{\partial y} \bigl( b(y)\, p \bigr) + c(y)\, p & \text{for } t > 0, \\
\lim_{t \downarrow 0} p(t, x, y) = \delta_x(y).
\end{cases} \tag{12.9}
\]
Here δ is the Dirac measure (see Example 7.11), and δy and δx represent unit masses at positions y and x, respectively. Equation (12.8) is called Kolmogorov's backward equation, since we consider the terminal state (the variable y) to be fixed and vary the initial state (the variable x). In this context, Eq. (12.9) is called Kolmogorov's forward equation. These equations are also called the Fokker–Planck partial differential equations. In the case of Brownian motion (Example 12.9), Eqs. (12.8) and (12.9) become the classical diffusion (or heat) equations for t > 0:
\[ \frac{\partial p}{\partial t} = \frac{1}{2} \frac{\partial^2 p}{\partial x^2}, \qquad \frac{\partial p}{\partial t} = \frac{1}{2} \frac{\partial^2 p}{\partial y^2}. \]
Conversely, Kolmogorov raised the problem of constructing Markov transition functions by solving the given Fokker–Planck partial differential equations (12.8) and (12.9). It is worth pointing out here that the forward equation (12.9) is given in a more intuitive form than the backward equation (12.8), but the regularity conditions on the functions a(y) and b(y) are more stringent than those needed in the backward case. This suggests that the backward approach is more convenient than the forward approach from the viewpoint of analysis.

In 1936, Feller treated this problem by classical analytic methods, and proved that Eq. (12.8) (or (12.9)) has a unique solution p(t, x, y) under certain regularity conditions on the functions a(x), b(x) and c(x), and that this solution p(t, x, y) determines a Markov process. In 1943, Fortet proved that these solutions correspond to Markov processes with continuous paths. On the other hand, Bernstein (1938) and Lévy (1948) made probabilistic approaches to this problem, using stochastic differential equations.

12.1.4 Feller and C0 Transition Functions

Let (K, ρ) be a locally compact, separable metric space and B the σ-algebra of all Borel sets in K. Let B(K) be the space of real-valued, bounded Borel measurable functions on K; B(K) is a Banach space with the supremum norm
\[ \|f\|_\infty = \sup_{x \in K} |f(x)|. \]

If pt is a transition function on K, we let
\[ T_t f(x) = \int_K p_t(x, dy)\, f(y) \quad \text{for every } f \in B(K). \]

Then, by applying Theorem 3.7 with
\[ \mathcal{F} := \mathcal{B}, \qquad \mathcal{H} := \{ f \in B(K) : T_t f \text{ is Borel measurable} \}, \]
we obtain that H = B(K), that is, the function Tt f is Borel measurable whenever f ∈ B(K). Indeed, it suffices to note the following two facts (i) and (ii):

(i) Condition (b) of Definition 12.4 implies condition (i) of Theorem 3.7.
(ii) An application of the monotone convergence theorem (Theorem 2.10) gives that condition (ii) of Theorem 3.7 is satisfied.

In view of condition (a) of Definition 12.4, it follows that, for each t ≥ 0, the operator Tt is non-negative and contractive on B(K) into itself:
\[ f \in B(K),\ 0 \le f(x) \le 1 \text{ on } K \implies 0 \le T_t f(x) \le 1 \text{ on } K. \]
Furthermore, we have, by condition (d) of Definition 12.4 and Fubini's theorem (Theorem 2.18),
\[
T_{t+s} f(x) = \int_K p_{t+s}(x, dy)\, f(y) = \int_K \Bigl( \int_K p_t(x, dz)\, p_s(z, dy) \Bigr) f(y) = \int_K p_t(x, dz) \int_K p_s(z, dy)\, f(y) = \int_K p_t(x, dz)\, T_s f(z) = T_t (T_s f)(x),
\]
so that the operators Tt form a semigroup:
\[ T_{t+s} = T_t \cdot T_s \quad \text{for } t, s \ge 0. \]
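The computation above has an elementary finite-state, discrete-time analogue: with a stochastic kernel matrix P, the operators Tn f = P^n f satisfy T_{m+n} = T_m T_n simply by matrix multiplication. A sketch (the matrix entries are arbitrary illustrative values):

```python
# Finite-state, discrete-time analogue of the semigroup computation:
# (T_n f)(i) = sum_j P^n[i][j] f(j), and T_{m+n} f = T_m (T_n f).
P = [[0.9, 0.1, 0.0],
     [0.2, 0.5, 0.3],
     [0.0, 0.4, 0.6]]
f = [1.0, -2.0, 0.5]

def apply_T(P, f):
    return [sum(P[i][j] * f[j] for j in range(len(f))) for i in range(len(f))]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

lhs = apply_T(matmul(P, P), f)     # T_2 f via the two-step kernel P^2
rhs = apply_T(P, apply_T(P, f))    # T_1 (T_1 f)
print(lhs, rhs)  # identical up to rounding: the semigroup property
```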

We also have, by condition (c) of Definition 12.4,
\[ T_0 = I = \text{the identity operator}. \]
The Hille–Yosida theory of semigroups requires the strong continuity of {Tt}t≥0:
\[ \lim_{t \downarrow 0} \|T_t f - f\|_\infty = 0 \quad \text{for every } f \in B(K), \tag{12.10} \]
that is,
\[ \lim_{t \downarrow 0} \sup_{x \in K} \Bigl| \int_K p_t(x, dy)\, f(y) - f(x) \Bigr| = 0 \quad \text{for every } f \in B(K). \tag{12.10′} \]
Now, by taking f := χ{x} ∈ B(K) in formula (12.10′), we obtain that
\[ \lim_{t \downarrow 0} p_t(x, \{x\}) = 1 \quad \text{for every } x \in K. \tag{12.11} \]
However, the Brownian motion transition function in Example 12.9, the most important and interesting example, does not satisfy condition (12.11). Thus we shift our attention to continuous functions on K, instead of Borel measurable functions on K.

Now let C(K) be the space of real-valued, bounded continuous functions on K; C(K) is a normed linear space with the supremum norm
\[ \|f\|_\infty = \sup_{x \in K} |f(x)|. \]
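Both halves of this observation can be seen numerically for the Brownian kernel. For f(x) = exp(−x²/2) the convolution Tt f has the closed form used below (a standard Gaussian identity, assumed rather than taken from the text), and its sup-norm distance to f shrinks with t; for an indicator of a single point, by contrast, Tt f vanishes identically for t > 0:

```python
import math

def Ttf(t, x):
    """T_t applied to f(x) = exp(-x^2/2) under the Brownian kernel;
    the Gaussian convolution has this closed form."""
    return math.exp(-x * x / (2.0 * (1.0 + t))) / math.sqrt(1.0 + t)

f = lambda x: math.exp(-x * x / 2.0)
xs = [0.01 * k for k in range(-1000, 1001)]  # grid standing in for sup over x
errs = [max(abs(Ttf(t, x) - f(x)) for x in xs) for t in (0.1, 0.01, 0.001)]
print(errs)
# The sup-norm distance shrinks with t, so (12.10) holds on this f.  For
# f = chi_{x0}, however, T_t f = 0 for every t > 0 under the Brownian
# kernel, so ||T_t f - f||_inf = 1 and (12.10) fails on all of B(K).
```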

We add a new point ∂ to the locally compact space K as the point at infinity if K is not compact, and as an isolated point if K is compact, so that the space K∂ = K ∪ {∂} is compact. Then we say that a function f ∈ C(K) converges to a limit a ∈ R as x → ∂ if, for each ε > 0, there exists a compact subset E of K such that (see Sect. 2.4)
\[ |f(x) - a| < \varepsilon \quad \text{for all } x \in K \setminus E. \]
We shall write
\[ \lim_{x \to \partial} f(x) = a. \]
Let C0(K) be the subspace of C(K) which consists of all functions satisfying the condition limx→∂ f(x) = 0; C0(K) is a closed subspace of C(K). We remark that C0(K) may be identified with C(K) if K is compact. Namely, we have the formula
\[
C_0(K) =
\begin{cases}
\{ f \in C(K) : \lim_{x \to \partial} f(x) = 0 \} & \text{if } K \text{ is locally compact}, \\
C(K) & \text{if } K \text{ is compact}.
\end{cases}
\]

[Fig. 12.6 The spaces C(K∂) and C0(K) when K is locally compact]

Moreover, we introduce a useful convention:

Any real-valued function f(x) on K is extended to the compact space K∂ = K ∪ {∂} by setting f(∂) = 0.

From this viewpoint, the space C0(K) is identified with the subspace of C(K∂) which consists of all functions f satisfying the condition f(∂) = 0. More precisely, we have the following decomposition (see Fig. 12.6):
\[ C(K_\partial) = \{\text{constant functions}\} + C_0(K). \]
We recall that a transition function pt on K and the extended transition function p̃t on K∂ are related as follows:
\[
\begin{cases}
\tilde{p}_t(x, E) = p_t(x, E) & \text{for } x \in K \text{ and } E \in \mathcal{B}, \\
\tilde{p}_t(x, \{\partial\}) = 1 - p_t(x, K) & \text{for } x \in K, \\
\tilde{p}_t(\partial, K) = 0, \quad \tilde{p}_t(\partial, \{\partial\}) = 1 & \text{for } x = \partial.
\end{cases}
\]
Now we can introduce two important conditions on the measures pt(x, ·) related to continuity in x ∈ K, for fixed t ≥ 0:

Definition 12.18 (i) A transition function pt is called a Feller transition function or simply Feller function if the function
\[ T_t f(x) = \int_K p_t(x, dy)\, f(y) \]
is a continuous function of x ∈ K whenever f(x) is bounded and continuous on K. Namely, the Feller property is equivalent to saying that the space C(K) is an invariant subspace of B(K) for the operators Tt:
\[ f \in C(K) \implies T_t f \in C(K). \]

(ii) Moreover, we say that pt is a C0 transition function if the space C0 (K ) is an invariant subspace of C(K ) for the operators Tt : f ∈ C0 (K ) =⇒ Tt f ∈ C0 (K ). Remark 12.19 The Feller property is equivalent to saying that the measures pt (x, ·) depend continuously on x ∈ K in the usual weak topology, for every fixed t ≥ 0.

12.1.5 Path Functions of Markov Processes

It is naturally interesting and important to ask the following question:

Question 12.20 Given a Markov transition function pt, under which conditions on pt does there exist a Markov process with transition function pt whose paths are almost surely continuous?

A Markov process X = (xt, F, Ft, Px) is said to be right-continuous provided that, for each x ∈ K,
\[ P_x\{\omega \in \Omega : \text{the mapping } t \mapsto x_t(\omega) \text{ is a right-continuous function from } [0, \infty) \text{ into } K_\partial\} = 1. \]
Furthermore, we say that X is continuous provided that, for each x ∈ K,
\[ P_x\{\omega \in \Omega : \text{the mapping } t \mapsto x_t(\omega) \text{ is a continuous function from } [0, \zeta) \text{ into } K_\partial\} = 1. \]
Here ζ is the lifetime of the process X.

Now we give some useful criteria for path-continuity in terms of transition functions:

Theorem 12.21 Let K be a locally compact, separable metric space and let pt be a normal transition function on K.
(i) Assume that the following two conditions (L) and (M) are satisfied:
(L) For each s > 0 and each compact E ⊂ K, we have the assertion
\[ \lim_{x \to \partial} \sup_{0 \le t \le s} p_t(x, E) = 0. \tag{12.12} \]
(M) For each ε > 0 and each compact E ⊂ K, we have the assertion
\[ \lim_{t \downarrow 0} \sup_{x \in E} p_t(x, K \setminus U_\varepsilon(x)) = 0, \tag{12.13} \]

where Uε(x) = {y ∈ K : ρ(y, x) < ε} is an ε-neighborhood of x. Then there exists a Markov process X with transition function pt whose paths are right-continuous on [0, ∞) and have left-hand limits on [0, ζ) almost surely.
(ii) Assume that condition (L) and the following condition (N) (replacing condition (M)) are satisfied:
(N) For each ε > 0 and each compact E ⊂ K, we have the assertion
\[ \lim_{t \downarrow 0} \frac{1}{t} \sup_{x \in E} p_t(x, K \setminus U_\varepsilon(x)) = 0. \]
Then there exists a Markov process X with transition function pt whose paths are almost surely continuous on [0, ζ).

Remark 12.22 (1) Condition (L) is trivially satisfied if the state space K is compact.
(2) It is known (see Dynkin [45, Lemma 6.2]) that if the paths of a Markov process are right-continuous, then the transition function pt satisfies the condition
\[ \lim_{t \downarrow 0} p_t(x, U_\varepsilon(x)) = 1 \quad \text{for all } x \in K. \]

12.1.6 Stopping Times

In this subsection we formulate the starting afresh property for suitable random times τ; that is, the events {ω ∈ Ω : τ(ω) < a} should depend on the process {xt} only "up to time a", but not on the "future" after time a. This idea leads us to the following definition:

Definition 12.23 Let {Ft : t ≥ 0} be an increasing family of σ-algebras in a probability space (Ω, F, P). A mapping τ : Ω → [0, ∞] is called a stopping time or Markov time with respect to {Ft} if it satisfies the condition
\[ \{\tau < a\} = \{\omega \in \Omega : \tau(\omega) < a\} \in \mathcal{F}_a \quad \text{for all } a > 0. \tag{12.14} \]

If we introduce another condition

    {τ ≤ a} = {ω ∈ Ω : τ(ω) ≤ a} ∈ F_a for all a > 0,   (12.15)

then condition (12.15) implies condition (12.14); hence we obtain a smaller family of stopping times. Indeed, we have, for all a > 0,

12.1 Markov Processes and Transition Functions

    {τ < a} = {ω ∈ Ω : τ(ω) < a} = ⋃_{n=1}^∞ {ω ∈ Ω : τ(ω) ≤ a − 1/n} = ⋃_{n=1}^∞ {τ ≤ a − 1/n},

and it follows from condition (12.15) that each set in the union belongs to F_{a−1/n}. Hence we obtain from the monotonicity of {F_t} that

    {τ < a} = ⋃_{n=1}^∞ {τ ≤ a − 1/n} ∈ F_a for all a > 0.

This proves that condition (12.14) is satisfied. Conversely, we can prove the following lemma:

Lemma 12.24 Assume that the family {F_t} is right-continuous, that is,

    F_t = ⋂_{s>t} F_s for each t ≥ 0.

Then condition (12.14) implies condition (12.15).

Proof First, we have, for all a > 0,

    {τ ≤ a} = {ω ∈ Ω : τ(ω) ≤ a} = ⋂_{n=1}^∞ {ω ∈ Ω : τ(ω) < a + 1/n} = ⋂_{n=1}^∞ {τ < a + 1/n},

and it follows from condition (12.14) that, for each n, the set {τ ≤ a} = ⋂_{m≥n} {τ < a + 1/m} belongs to F_{a+1/n}. Hence we obtain from the right-continuity of {F_t} that

    {τ ≤ a} ∈ ⋂_{n=1}^∞ F_{a+1/n} = F_a for all a > 0.

This proves that condition (12.15) is satisfied. The proof of Lemma 12.24 is complete. □

Summing up, we have proved that conditions (12.14) and (12.15) are equivalent provided that the family {F_t} is right-continuous.
If τ is a stopping time with respect to the right-continuous family {F_t} of σ-algebras, we let

    F_τ = {A ∈ F : A ∩ {τ ≤ a} ∈ F_a for all a > 0}.

Intuitively, we may think of F_τ as the "past" up to the random time τ. Then we have the following lemma:

Lemma 12.25 F_τ is a σ-algebra.

Proof (1) It is clear that ∅ ∈ F_τ.
(2) If A ∈ F_τ, then we have, by condition (12.15),

    A^c ∩ {τ ≤ a} = {τ ≤ a} \ (A ∩ {τ ≤ a}) ∈ F_a for all a > 0.

This proves that A^c ∈ F_τ.
(3) If A_k ∈ F_τ for k = 1, 2, ..., then we have, by condition (12.15),

    (⋃_{k=1}^∞ A_k) ∩ {τ ≤ a} = ⋃_{k=1}^∞ (A_k ∩ {τ ≤ a}) ∈ F_a for all a > 0.

This proves that ⋃_{k=1}^∞ A_k ∈ F_τ. The proof of Lemma 12.25 is complete. □

Now we list some elementary properties of stopping times and their associated σ-algebras:

(i) Any non-negative constant mapping is a stopping time. More precisely, if τ ≡ t_0 for some constant t_0 ≥ 0, then it follows that τ is a stopping time and that F_τ reduces to F_{t_0}.

Proof Since we have the assertion

    {τ ≡ t_0 ≤ a} = ∅ ∈ F_a if 0 < a < t_0,  and  {τ ≡ t_0 ≤ a} = Ω ∈ F_a if a ≥ t_0,

it follows that τ is a stopping time, and further from the right-continuity of {F_t} that

    F_τ = {A ∈ F : A ∩ {τ ≤ a} ∈ F_a for all a > 0} = {A ∈ F : A ∈ F_a for all a ≥ t_0} = ⋂_{a≥t_0} F_a = F_{t_0}.

The proof of Assertion (i) is complete. □

(ii) If {τ_n} is a finite or denumerable collection of stopping times for the family {F_t}, then it follows that τ = inf_n τ_n is also a stopping time.

Proof Since each τ_n is a stopping time, we have, for all a > 0,

    {τ = inf_n τ_n < a} = ⋃_n {τ_n < a} ∈ F_a.

Indeed, it suffices to note that each set in the union belongs to F_a. □

(iii) If {τ_n} is a finite or denumerable collection of stopping times for the family {F_t}, then it follows that τ = sup_n τ_n is also a stopping time.

Proof Since each τ_n is a stopping time and since {F_t} is increasing, we have, for all a > 0,

    {τ = sup_n τ_n < a} = ⋃_{k=1}^∞ ⋂_n {τ_n < a − 1/k} ∈ F_a.

Indeed, it suffices to note that each set in the intersection belongs to F_{a−1/k} ⊂ F_a. □

(iv) If τ is a stopping time and t_0 is a positive constant, then it follows that τ + t_0 is also a stopping time.

Proof Since the stopping time τ is non-negative, we have, by the monotonicity of {F_t},

    {τ + t_0 < a} = {τ < a − t_0} = ∅ ∈ F_a if 0 < a ≤ t_0,
    {τ + t_0 < a} = {τ < a − t_0} ∈ F_{a−t_0} ⊂ F_a if a > t_0.

This proves that τ + t_0 is a stopping time. □

(v) Let τ_1 and τ_2 be stopping times for the family {F_t} such that τ_1 ≤ τ_2 on Ω. Then it follows that F_{τ_1} ⊂ F_{τ_2}. This is a generalization of the monotonicity of the family {F_t}.

Proof If A is an arbitrary element of F_{τ_1}, then it satisfies the condition

    A ∩ {τ_1 ≤ a} ∈ F_a for all a > 0.

Since we have the assertion

    {τ_2 ≤ a} ⊂ {τ_1 ≤ a} for all a > 0,

it follows that

    A ∩ {τ_2 ≤ a} = (A ∩ {τ_1 ≤ a}) ∩ {τ_2 ≤ a} ∈ F_a for all a > 0.   (12.16)

This proves that A ∈ F_{τ_2}. □

(vi) Let {τ_n}_{n=1}^∞ be a sequence of stopping times for the family {F_t} such that τ_{n+1} ≤ τ_n on Ω. Then it follows that the limit

    τ = lim_{n→∞} τ_n = inf_{n≥1} τ_n

is a stopping time, and further that

    F_τ = ⋂_{n≥1} F_{τ_n}.

This property generalizes the right-continuity of the family {F_t}.

Proof First, by assertion (ii) it follows that τ is a stopping time. Moreover, we have, for each n = 1, 2, ...,

    τ = inf_{k≥1} τ_k ≤ τ_n.

Hence, it follows from assertion (v) that F_τ ⊂ F_{τ_n} for each n = 1, 2, ..., so that

    F_τ ⊂ ⋂_{n≥1} F_{τ_n}.

Conversely, let A be an arbitrary element of ⋂_{n≥1} F_{τ_n}. Then it follows that, for each n = 1, 2, ...,

    A ∩ {τ_n ≤ a} ∈ F_a for all a > 0.

However, since τ_n ↓ τ as n → ∞, we have the assertion

    {τ ≤ a} = ⋂_{m∈N} ⋃_{n∈N} {τ_n ≤ a + 1/m}.

Hence we obtain from assertion (12.16) that

    A ∩ {τ ≤ a} = A ∩ (⋂_{m∈N} ⋃_{n∈N} {τ_n ≤ a + 1/m}) = ⋂_{m∈N} ⋃_{n∈N} (A ∩ {τ_n ≤ a + 1/m}),

where each member in the union belongs to F_{a+1/m}. Therefore, it follows from the right-continuity of {F_t} that

    A ∩ {τ ≤ a} ∈ ⋂_{m∈N} F_{a+1/m} = F_a for all a > 0.

This proves that A ∈ F_τ. The proof of Assertion (vi) is complete. □

12.1.7 Definition of Strong Markov Processes

A Markov process is called a strong Markov process if the "starting afresh" property holds not only for every fixed moment but also for suitable random times. In this subsection we formulate precisely this "strong" Markov property (Definition 12.26 and formula (12.17)), and give a useful criterion for the strong Markov property (Theorem 12.27).
Let (K, ρ) be a locally compact, separable metric space. We add a new point ∂ to the locally compact space K as the point at infinity if K is not compact, and as an isolated point if K is compact; so the space K_∂ = K ∪ {∂} is compact. Let X = (x_t, F, F_t, P_x) be a Markov process. For each t ∈ [0, ∞], we define a mapping

    Φ_t : [0, t] × Ω → K_∂

by the formula Φ_t(s, ω) = x_s(ω). A Markov process X = (x_t, F, F_t, P_x) is said to be progressively measurable with respect to {F_t} if the mapping Φ_t is B_{[0,t]} × F_t / B_∂-measurable for each t ∈ [0, ∞], that is, if we have the condition

    Φ_t^{−1}(E) = {Φ_t ∈ E} ∈ B_{[0,t]} × F_t for all E ∈ B_∂.

Here B_{[0,t]} is the σ-algebra of all Borel sets in the interval [0, t] and B_∂ is the σ-algebra in K_∂ generated by B. It should be noticed that if X is progressively measurable and if τ is a stopping time, then the mapping x_τ : ω ↦ x_{τ(ω)}(ω) is F_τ / B_∂-measurable. Now we are in a position to introduce the following definition:

Definition 12.26 We say that a progressively measurable Markov process X = (x_t, F, F_t, P_x) has the strong Markov property with respect to {F_t} if the following condition is satisfied: For all h ≥ 0, x ∈ K_∂, E ∈ B_∂ and all stopping times τ, we have the formula

    P_x{x_{τ+h} ∈ E | F_τ} = p_h(x_τ, E),   (12.17)

or equivalently,

    P_x(A ∩ {x_{τ+h} ∈ E}) = ∫_A p_h(x_{τ(ω)}(ω), E) dP_x(ω) for all A ∈ F_τ.   (12.17′)

This expresses the idea of "starting afresh" at random times. The next result gives a useful criterion for the strong Markov property:

Theorem 12.27 (Dynkin) If the transition function of a right-continuous Markov process has the C_0-property, then it is a strong Markov process.
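As a discrete-time illustration of "starting afresh" at a stopping time, consider the first hitting time τ_a = min{k : S_k ≥ a} of the simple symmetric random walk. The reflection principle P(max_{k≤n} S_k ≥ a) = 2P(S_n > a) + P(S_n = a) is proved precisely by restarting the walk at time τ_a; the following Python sketch (our own illustration, not taken from the book) verifies the identity by dynamic programming:

```python
from math import comb

def p_hit(n, a):
    """P(max_{k<=n} S_k >= a) for the simple symmetric walk, by dynamic
    programming with an absorbing barrier at level a (tau_a = first hit)."""
    dist = {0: 1.0}          # distribution of the walk killed at level a
    absorbed = 0.0
    for _ in range(n):
        nxt = {}
        for s, p in dist.items():
            for s2 in (s - 1, s + 1):
                if s2 >= a:
                    absorbed += 0.5 * p
                else:
                    nxt[s2] = nxt.get(s2, 0.0) + 0.5 * p
        dist = nxt
    return absorbed

def p_reflect(n, a):
    """2 P(S_n > a) + P(S_n = a), with S_n = 2*Binomial(n, 1/2) - n."""
    def p_eq(b):             # P(S_n = b)
        return comb(n, (n + b) // 2) / 2.0 ** n if (n + b) % 2 == 0 and abs(b) <= n else 0.0
    return 2.0 * sum(p_eq(b) for b in range(a + 1, n + 1)) + p_eq(a)

assert abs(p_hit(20, 4) - p_reflect(20, 4)) < 1e-12
```

The equality of the two computations is exactly the "starting afresh" property at the stopping time τ_a, in the simplest setting where it can be checked by exact enumeration.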

12.1.8 Strong Markov Property and Uniform Stochastic Continuity

In this subsection we introduce the basic notion of uniform stochastic continuity of transition functions (Definition 12.28), and give simple criteria for the strong Markov property in terms of transition functions (Theorems 12.29 and 12.30). Let (K, ρ) be a locally compact, separable metric space. We begin with the following definition:

Definition 12.28 A transition function p_t on K is said to be uniformly stochastically continuous on K if it satisfies the following condition:
(U) For each ε > 0 and each compact E ⊂ K, we have the assertion

    lim_{t↓0} sup_{x∈E} [1 − p_t(x, U_ε(x))] = 0,   (12.18)

where U_ε(x) = {y ∈ K : ρ(y, x) < ε} is an ε-neighborhood of x.

It should be noticed that every uniformly stochastically continuous transition function p_t is normal and satisfies condition (M) in Theorem 12.21. Therefore, by combining part (i) of Theorem 12.21 with Theorem 12.27 we obtain the following theorem:

Theorem 12.29 If a uniformly stochastically continuous, C_0 transition function satisfies condition (L), then it is the transition function of some strong Markov process whose paths are right-continuous and have no discontinuities other than jumps.

[Fig. 12.7 An overview of Theorems 12.27 and 12.29: uniform stochastic continuity + condition (L) yields a right-continuous Markov process (Theorem 12.29), and the C_0-property upgrades it to a strong Markov process (Theorem 12.27).]

We remark that Theorems 12.27 and 12.29 can be visualized as in Fig. 12.7. A continuous strong Markov process is called a diffusion process. The next result states a sufficient condition for the existence of a diffusion process with a prescribed transition function:

Theorem 12.30 If a uniformly stochastically continuous, C_0 transition function satisfies conditions (L) and (N), then it is the transition function of some diffusion process.

This theorem is an immediate consequence of part (ii) of Theorem 12.21 and Theorem 12.27.

12.2 Feller Semigroups and Transition Functions

In Sect. 12.2 we introduce a class of semigroups associated with Markov processes (Definition 12.31), called Feller semigroups, and we give a characterization of Feller semigroups in terms of Markov transition functions (Theorems 12.34 and 12.37).

12.2.1 Definition of Feller Semigroups

Let (K, ρ) be a locally compact, separable metric space and let C(K) be the Banach space of real-valued, bounded continuous functions on K with the supremum norm

    ‖f‖_∞ = sup_{x∈K} |f(x)|.

Recall (see Sect. 12.1.4) that C_0(K) is the closed subspace of C(K) which consists of all functions satisfying the condition lim_{x→∂} f(x) = 0, and further that C_0(K) may be identified with C(K) if K is compact. Now we introduce a class of semigroups associated with Markov processes:

Definition 12.31 A family {T_t}_{t≥0} of bounded linear operators acting on the space C_0(K) is called a Feller semigroup on K if it satisfies the following three conditions (i), (ii) and (iii):

(i) T_{t+s} = T_t · T_s for all t, s ≥ 0 (the semigroup property); T_0 = I.
(ii) The family {T_t} is strongly continuous in t for t ≥ 0:

    lim_{s↓0} ‖T_{t+s} f − T_t f‖_∞ = 0 for every f ∈ C_0(K).

(iii) The family {T_t} is non-negative and contractive on C_0(K):

    f ∈ C_0(K), 0 ≤ f(x) ≤ 1 on K ⟹ 0 ≤ T_t f(x) ≤ 1 on K.
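For the Gaussian (heat) semigroup on R — convolution with the N(0, t) kernel — conditions (i) and (iii) can be verified in closed form on Gaussian test functions, since convolving two Gaussians adds their variances. A small Python sketch (our own illustration; the helper names are ours, not the book's):

```python
import math

def heat_gaussian(a, s2, t):
    """Apply the heat semigroup T_t (convolution with the N(0, t) kernel)
    to the Gaussian f(y) = a * exp(-y^2 / (2 s2)): the result is again a
    Gaussian, with variance s2 + t and rescaled amplitude."""
    return a * math.sqrt(s2 / (s2 + t)), s2 + t

def evaluate(a, s2, x):
    return a * math.exp(-x * x / (2.0 * s2))

f = (1.0, 1.0)                        # f(y) = exp(-y^2/2), so 0 <= f <= 1
t, s = 0.3, 0.7
one_step = heat_gaussian(*f, t + s)                   # T_{t+s} f
two_steps = heat_gaussian(*heat_gaussian(*f, s), t)   # T_t (T_s f)
for x in (-2.0, 0.0, 1.5):
    u, v = evaluate(*one_step, x), evaluate(*two_steps, x)
    assert abs(u - v) < 1e-12         # semigroup property (i)
    assert 0.0 <= u <= 1.0            # non-negativity and contraction (iii)
```

Condition (ii), the strong continuity, appears here as the continuity of the output in t, which is visible from the closed form s2 + t.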

12.2.2 Characterization of Feller Semigroups in Terms of Transition Functions

In Sect. 12.1.4, we have proved the following theorem:

Theorem 12.32 If p_t is a Feller transition function on K, then the associated operators {T_t}_{t≥0}, defined by the formula

    T_t f(x) = ∫_K p_t(x, dy) f(y) for every f ∈ C(K),   (12.19)

form a non-negative and contraction semigroup on C(K):
(i) T_{t+s} = T_t · T_s, t, s ≥ 0 (the semigroup property); T_0 = I.
(ii) f ∈ C(K), 0 ≤ f(x) ≤ 1 on K ⟹ 0 ≤ T_t f(x) ≤ 1 on K.

The purpose of this subsection is to prove a converse:

Theorem 12.33 If {T_t}_{t≥0} is a non-negative and contraction semigroup on the space C_0(K), then there exists a unique C_0 transition function p_t on K such that formula (12.19) holds true for all f ∈ C_0(K).

Proof We fix t ≥ 0 and x ∈ K, and define a linear functional F on C_0(K) as follows:

    F(f) = T_t f(x) for all f ∈ C_0(K).

Then it follows that F is non-negative and bounded with norm ‖F‖ ≤ 1, since T_t is a non-negative and contractive operator on C_0(K). Therefore, by applying the Riesz–Markov representation theorem (Theorem 5.41 with X := K) to the functional F we obtain that there exists a unique Radon measure p_t(x, ·) on K such that

    T_t f(x) = F(f) = ∫_K p_t(x, dy) f(y) for all f ∈ C_0(K).   (12.20)

We show that the measures p_t satisfy conditions (a)–(d) of Definition 12.4.

[Fig. 12.8 The function f_n: equal to 0 off the open set G and increasing to 1 inside G.]

(a) First, we have the inequality

    p_t(x, K) = sup {F(f) : f ∈ C_0(K), 0 ≤ f ≤ 1 on K} = ‖F‖ ≤ 1 for all x ∈ K,

since F is contractive.
(c) Since T_0 = I, it follows that

    f(x) = T_0 f(x) = ∫_K p_0(x, dy) f(y) for all f ∈ C_0(K).

This proves that p_0(x, {x}) = 1 for each x ∈ K.
(b) We prove that the function p_t(·, E) is Borel measurable for each E ∈ B. To do this, it suffices to show that the collection

    A = {E ∈ B : p_t(·, E) is B-measurable}

coincides with the σ-algebra B. The proof is divided into five steps.
Step 1: The collection A contains the collection O of all open subsets of K:

    A ⊃ O.   (12.21)

Indeed, if G ∈ O, we let (see Fig. 12.8)

    f_n(x) := min{n ρ(x, K \ G), 1}, n = 1, 2, ....

Then f_n is a function in C_0(K), and satisfies the condition

    lim_{n→∞} f_n(x) = 1 if x ∈ G, and = 0 if x ∈ K \ G.

Thus, by virtue of the dominated convergence theorem (Theorem 2.12) we obtain from formula (12.20) with f := f_n that

    lim_{n→∞} T_t f_n(x) = lim_{n→∞} ∫_K p_t(x, dy) f_n(y) = p_t(x, G).

Since the functions T_t f_n are continuous, this proves that the limit function p_t(·, G) is B-measurable and so G ∈ A.
Step 2: We have, by assertion (12.21),

    d(O) ⊂ d(A).   (12.22)

Step 3: The collection A is a d-system:

    d(A) = A.   (12.23)

Indeed, it is easy to verify the following three assertions (i), (ii) and (iii):
(i) Since K is open, it follows that K ∈ O ⊂ A.
(ii) If A, B ∈ A and A ⊂ B, then it follows that the function

    p_t(·, B \ A) = p_t(·, B) − p_t(·, A)

is B-measurable. This proves that B \ A ∈ A.
(iii) If {A_n}_{n=1}^∞ is an increasing sequence of elements of A, then it follows that the function

    p_t(·, ⋃_{n=1}^∞ A_n) = lim_{n→∞} p_t(·, A_n)

is B-measurable. This proves that ⋃_{n=1}^∞ A_n ∈ A.
Step 4: Since O is a π-system, it follows from an application of the monotone class theorem (Theorem 2.7) that

    d(O) = σ(O) = B.   (12.24)

Step 5: By combining assertions (12.24), (12.22) and (12.23), we obtain that

    B = d(O) ⊂ d(A) = A ⊂ B,

so that A = B.
(d) In view of the semigroup property and Fubini's theorem (Theorem 2.18), it follows from formula (12.20) that we have, for all f ∈ C_0(K),

    ∫_K p_{t+s}(x, dz) f(z) = T_{t+s} f(x) = T_t(T_s f)(x) = ∫_K p_t(x, dy) ∫_K p_s(y, dz) f(z).

Hence the uniqueness part of the Riesz–Markov representation theorem (Theorem 5.41 with X := K) gives that

    p_{t+s}(x, E) = ∫_K p_t(x, dy) p_s(y, E) for all E ∈ B.

Finally, the C_0-property of p_t comes automatically, since T_t : C_0(K) → C_0(K). The proof of Theorem 12.33 is now complete. □

It should be emphasized that the Feller or C_0-property deals with continuity of a Markov transition function p_t(x, E) in x, and, by itself, is not concerned with continuity in t. Now we give a necessary and sufficient condition on p_t(x, E) in order that its associated operators {T_t}_{t≥0} are strongly continuous in t on the space C_0(K):

    lim_{s↓0} ‖T_{t+s} f − T_t f‖_∞ = 0 for all f ∈ C_0(K).   (12.25)

Theorem 12.34 Let p_t(x, ·) be a C_0 transition function on K. Then the associated operators {T_t}_{t≥0}, defined by formula (12.19), are strongly continuous in t on C_0(K) if and only if p_t(x, ·) is uniformly stochastically continuous on K and satisfies the following condition (L):
(L) For each s > 0 and each compact E ⊂ K, we have the assertion

    lim_{x→∂} sup_{0≤t≤s} p_t(x, E) = 0.   (12.12)

Remark 12.35 Since the semigroup {T_t}_{t≥0} is a contraction semigroup, it follows from Remark 6.6 that the strong continuity (12.25) of {T_t} in t for t ≥ 0 is equivalent to the strong continuity at t = 0:

    lim_{t↓0} ‖T_t f − f‖_∞ = 0 for all f ∈ C_0(K).   (12.25′)

Proof The proof is divided into two steps.
Step 1: First, we prove the "if" part of the theorem. Since continuous functions with compact support are dense in C_0(K), it suffices to prove the strong continuity of {T_t} at t = 0 for all such functions f. For any compact subset E of K containing supp f, we have the inequality

    ‖T_t f − f‖_∞ ≤ sup_{x∈E} |T_t f(x) − f(x)| + sup_{x∈K\E} |T_t f(x)|
                  ≤ sup_{x∈E} |T_t f(x) − f(x)| + ‖f‖_∞ · sup_{x∈K\E} p_t(x, supp f).   (12.26)

However, condition (L) implies that, for each ε > 0, we can find a compact subset E of K such that, for all sufficiently small t > 0,

    sup_{x∈K\E} p_t(x, supp f) < ε.   (12.27)

On the other hand, we have, for each δ > 0,

    T_t f(x) − f(x) = ∫_{U_δ(x)} p_t(x, dy)(f(y) − f(x)) + ∫_{K\U_δ(x)} p_t(x, dy)(f(y) − f(x)) − f(x)(1 − p_t(x, K)),

and so

    sup_{x∈E} |T_t f(x) − f(x)| ≤ sup_{ρ(x,y)<δ} |f(y) − f(x)| + 3‖f‖_∞ · sup_{x∈E} [1 − p_t(x, U_δ(x))].

Since f is uniformly continuous, for each ε > 0 we can choose δ > 0 such that

    sup_{ρ(x,y)<δ} |f(y) − f(x)| < ε.

Moreover, the uniform stochastic continuity of p_t(x, ·) implies that, for all sufficiently small t > 0,

    sup_{x∈E} [1 − p_t(x, U_δ(x))] < ε.

Hence we have, for all sufficiently small t > 0,

    sup_{x∈E} |T_t f(x) − f(x)| < ε(1 + 3‖f‖_∞).   (12.28)

Therefore, by carrying inequalities (12.27) and (12.28) into inequality (12.26) we obtain that, for all sufficiently small t > 0,

    ‖T_t f − f‖_∞ < ε(1 + 4‖f‖_∞).

This proves formula (12.25′), that is, the strong continuity of {T_t}.
Step 2: Next, we prove the "only if" part of the theorem.

[Fig. 12.9 The function f_x: a tent function on the ε-neighborhood of x, equal to 1 at x and to 0 for ρ(x, y) > ε.]

(1) Let x be an arbitrary point of K. For any ε > 0, we define (see Fig. 12.9)

    f_x(y) = 1 − (1/ε) ρ(x, y) if ρ(x, y) ≤ ε,  and  f_x(y) = 0 if ρ(x, y) > ε.   (12.29)

If E is a compact subset of K, then the functions f_x, x ∈ E, are in C_0(K) for all sufficiently small ε > 0, and satisfy the condition

    ‖f_x − f_z‖_∞ ≤ (1/ε) ρ(x, z) for x, z ∈ E.   (12.30)

However, for any δ > 0, by the compactness of E we can find a finite number of points x_1, x_2, ..., x_n of E such that

    E ⊂ ⋃_{k=1}^n U_{δε/4}(x_k),

and hence

    min_{1≤k≤n} ρ(x, x_k) ≤ δε/4 for all x ∈ E.

Thus, by combining this inequality with inequality (12.30) we obtain that

    min_{1≤k≤n} ‖f_x − f_{x_k}‖_∞ ≤ δ/4 for all x ∈ E.

Now we have, by formula (12.29),

    0 ≤ 1 − p_t(x, U_ε(x)) ≤ 1 − ∫_{K_∂} p_t(x, dy) f_x(y) = f_x(x) − T_t f_x(x)
      ≤ ‖f_x − T_t f_x‖_∞
      ≤ ‖f_x − f_{x_k}‖_∞ + ‖f_{x_k} − T_t f_{x_k}‖_∞ + ‖T_t f_{x_k} − T_t f_x‖_∞
      ≤ 2‖f_x − f_{x_k}‖_∞ + ‖f_{x_k} − T_t f_{x_k}‖_∞.   (12.31)

[Fig. 12.10 The function f: equal to 1 on E and to 0 outside the relatively compact set U.]

In view of inequality (12.31), the first term on the last line is bounded by δ/2 for the right choice of k. Furthermore, it follows from the strong continuity (12.25′) of {T_t} that the second term tends to zero as t ↓ 0, for each k = 1, ..., n. Consequently, we have, for all sufficiently small t > 0,

    sup_{x∈E} [1 − p_t(x, U_ε(x))] ≤ δ.

This proves condition (12.18), that is, the uniform stochastic continuity of p_t(x, ·).
(2) It remains to verify condition (L). We assume, to the contrary, that: For some s > 0 and some compact E ⊂ K, there exist a constant ε_0 > 0, a sequence {t_k}, t_k ↓ t (0 ≤ t ≤ s), and a sequence {x_k}, x_k → ∂, such that

    p_{t_k}(x_k, E) ≥ ε_0.   (12.32)

Now we take a relatively compact subset U of K containing E, and let (see Fig. 12.10)

    f(x) = ρ(x, K \ U) / (ρ(x, E) + ρ(x, K \ U)).

Then it follows that the function f(x) is in C_0(K) and satisfies the condition

    T_t f(x) = ∫_K p_t(x, dy) f(y) ≥ p_t(x, E) ≥ 0.

Therefore, by combining this inequality with inequality (12.32) we obtain that

    T_{t_k} f(x_k) = ∫_K p_{t_k}(x_k, dy) f(y) ≥ p_{t_k}(x_k, E) ≥ ε_0.   (12.33)

However, we have the inequality

    T_{t_k} f(x_k) ≤ ‖T_{t_k} f − T_t f‖_∞ + T_t f(x_k).   (12.34)

Since the semigroup {T_t} is strongly continuous and T_t f ∈ C_0(K), we can let k → ∞ in inequality (12.34) to obtain that

[Fig. 12.11 An overview of Theorems 12.32–12.34: p_t(x, ·) uniformly stochastically continuous with the C_0-property and condition (L) ⟺ {T_t} a Feller semigroup on C_0(K).]

    lim sup_{k→∞} T_{t_k} f(x_k) = 0.

This contradicts inequality (12.33). The proof of Theorem 12.34 is complete. □

Definition 12.36 A family {T_t}_{t≥0} of bounded linear operators acting on C_0(K) is called a Feller semigroup on K if it satisfies the following three conditions:
(i) T_{t+s} = T_t · T_s, t, s ≥ 0 (the semigroup property); T_0 = I.
(ii) {T_t} is strongly continuous in t for t ≥ 0:

    lim_{s↓0} ‖T_{t+s} f − T_t f‖_∞ = 0 for every f ∈ C_0(K).

(iii) {T_t} is non-negative and contractive on C_0(K):

    f ∈ C_0(K), 0 ≤ f(x) ≤ 1 on K ⟹ 0 ≤ T_t f(x) ≤ 1 on K.

By combining Theorems 12.32–12.34, we have the following characterization of Feller semigroups in terms of transition functions:

Theorem 12.37 If p_t(x, ·) is a uniformly stochastically continuous, C_0 transition function on K and satisfies condition (L), then its associated operators {T_t}_{t≥0} form a Feller semigroup on K. Conversely, if {T_t}_{t≥0} is a Feller semigroup on K, then there exists a uniformly stochastically continuous, C_0 transition function p_t(x, ·) on K, satisfying condition (L), such that formula (12.19) holds true for all f ∈ C_0(K).

We remark that Theorem 12.37 can be visualized as in Fig. 12.11 above.

12.3 The Hille–Yosida Theory of Feller Semigroups

This section is devoted to a version of the Hille–Yosida theorem (Theorem 5.14) adapted to the present context. In particular, we prove generation theorems for Feller semigroups (Theorems 12.38 and 12.53) which form a functional analytic background for the proof of Theorem 1.38 in Chap. 13.

12.3.1 Generation Theorems for Feller Semigroups

Let (K, ρ) be a locally compact, separable metric space and let C(K) be the space of real-valued, bounded continuous functions on K; C(K) is a normed linear space with the supremum norm

    ‖f‖_∞ = sup_{x∈K} |f(x)|.

We add a new point ∂ to the locally compact space K as the point at infinity if K is not compact, and as an isolated point if K is compact. Hence the space K_∂ = K ∪ {∂} is compact. Recall that C_0(K) is the closed subspace of C(K) which consists of all functions satisfying the condition f(∂) = 0 (see Fig. 12.6), and further that C_0(K) may be identified with C(K) if K is compact. Namely, we have the formula

    C_0(K) = {f ∈ C(K_∂) : f(∂) = 0} if K is not compact,  and  C_0(K) = C(K) if K is compact.

If {T_t}_{t≥0} is a Feller semigroup on K, we define its infinitesimal generator A by the formula

    Au = lim_{t↓0} (T_t u − u)/t,   (12.35)

provided that the limit (12.35) exists in the space C_0(K). More precisely, the generator A is a linear operator from C_0(K) into itself defined as follows:
(1) The domain D(A) of A is the set

    D(A) = {u ∈ C_0(K) : the limit (12.35) exists in C_0(K)}.

(2) Au = lim_{t↓0} (T_t u − u)/t for every u ∈ D(A).

The next theorem is a version of the Hille–Yosida theorem (Theorem 5.14) adapted to the present context:

Theorem 12.38 (Hille–Yosida) (i) Let {Tt }t≥0 be a Feller semigroup on K and A its infinitesimal generator. Then we have the following four assertions (a)–(d):

(a) The domain D(A) is dense in the space C_0(K).
(b) For each α > 0, the equation (αI − A)u = f has a unique solution u in D(A) for any f ∈ C_0(K). Hence, for each α > 0 the Green operator (αI − A)^{−1} : C_0(K) → C_0(K) can be defined by the formula u = (αI − A)^{−1} f for every f ∈ C_0(K).
(c) For each α > 0, the operator (αI − A)^{−1} is non-negative on C_0(K):

    f ∈ C_0(K), f ≥ 0 on K ⟹ (αI − A)^{−1} f ≥ 0 on K.

(d) For each α > 0, the operator (αI − A)^{−1} is bounded on C_0(K) with norm

    ‖(αI − A)^{−1}‖ ≤ 1/α.

(ii) Conversely, if A is a linear operator from C_0(K) into itself satisfying condition (a) and if there is a constant α_0 ≥ 0 such that, for all α > α_0, conditions (b)–(d) are satisfied, then A is the infinitesimal generator of some Feller semigroup {T_t}_{t≥0} on K.

Proof In view of Theorem 5.14, it suffices to show that the semigroup {T_t}_{t≥0} is non-negative if and only if its resolvents {(αI − A)^{−1}}_{α>α_0} are non-negative. The "only if" part is an immediate consequence of the integral expression of (αI − A)^{−1} in terms of the semigroup {T_t} (see formula (6.13)):

    (αI − A)^{−1} = ∫_0^∞ e^{−αt} T_t dt for α > 0.

On the other hand, the "if" part follows from expression (6.18) of the semigroup T_t(α) in terms of the Yosida approximation J_α = α(αI − A)^{−1}:

    T_t(α) = e^{−αt} exp[αt J_α] = e^{−αt} Σ_{n=0}^∞ ((αt)^n / n!) J_α^n,

and the definition (6.19) of the semigroup T_t:

    T_t = lim_{α→∞} T_t(α).

The proof of Theorem 12.38 is complete. □

Corollary 12.39 Let K be a compact metric space and let A : C(K ) → C(K ) be the infinitesimal generator of a Feller semigroup on K . Assume that the constant function 1 belongs to the domain D(A) of A and that we have, for some constant c,

    (A1)(x) ≤ −c on K.   (12.36)

Then the operator A′ = A + cI is the infinitesimal generator of some Feller semigroup on K.

Proof It follows from an application of part (i) of Theorem 12.38 that, for all α > c, the operators

    (αI − A′)^{−1} = ((α − c)I − A)^{−1}

are defined and non-negative on the whole space C(K). However, in view of inequality (12.36) we obtain that

    α ≤ α − (A1 + c) = (αI − A′)1 on K,

so that

    α(αI − A′)^{−1}1 ≤ (αI − A′)^{−1}(αI − A′)1 = 1 on K.

Hence we have, for all α > c,

    ‖(αI − A′)^{−1}‖ = ‖(αI − A′)^{−1}1‖_∞ ≤ 1/α.

Therefore, by applying part (ii) of Theorem 12.38 to the operator A′ = A + cI we find that A′ is the infinitesimal generator of some Feller semigroup on K. The proof of Corollary 12.39 is complete. □

Now we write down explicitly the infinitesimal generators of Feller semigroups associated with the transition functions in Examples 12.7–12.13.

Example 12.40 (uniform motion) Let K = R and

    D(A) = {f ∈ C_0(K) ∩ C^1(K) : f′ ∈ C_0(K)},  Af = v f′ for every f ∈ D(A),

where v is a positive constant. Then the resolvents {(αI − A)^{−1}}_{α>0} are given by the formula

    (αI − A)^{−1} g(x) = (1/v) ∫_x^∞ e^{−(α/v)(y−x)} g(y) dy for every g ∈ C_0(K).

Example 12.41 (Poisson process) Let K = R and

D(A) = C0 (K ), A f (x) = λ( f (x + 1) − f (x)) for every f ∈ D(A).
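Here the transition function is explicit — x_t = x + N_t with N_t Poisson-distributed with mean λt — so the generator can be checked by truncating the Poisson series for T_t f and forming the difference quotient (12.35). A short Python sketch (the function names and the test function are ours, not the book's):

```python
import math

def Tt(f, x, lam, t, kmax=40):
    """T_t f(x) for the Poisson process: x_t = x + N_t, N_t ~ Poisson(lam*t),
    so T_t f(x) = sum_k e^{-lam t} (lam t)^k / k! * f(x + k)."""
    return sum(math.exp(-lam * t) * (lam * t) ** k / math.factorial(k) * f(x + k)
               for k in range(kmax))

f = lambda y: math.exp(-y * y)          # a smooth function in C_0(R)
lam, x = 2.0, 0.3
Af = lam * (f(x + 1.0) - f(x))          # the claimed generator value
errs = [abs((Tt(f, x, lam, t) - f(x)) / t - Af) for t in (1e-2, 1e-3, 1e-4)]
# the difference quotient converges to A f(x) at rate O(t)
assert errs[0] > errs[1] > errs[2] and errs[2] < 1e-3
```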

The operator A is not "local"; the value Af(x) depends on the values f(x) and f(x + 1). This reflects the fact that the Poisson process changes state by jumps.

Example 12.42 (Brownian motion) Let K = R and

    D(A) = {f ∈ C_0(K) ∩ C^2(K) : f′ ∈ C_0(K), f″ ∈ C_0(K)},  Af = (1/2) f″ for every f ∈ D(A).

The operator A is "local", that is, the value Af(x) is determined by the values of f in an arbitrarily small neighborhood of x. This reflects the fact that Brownian motion changes state by continuous motion. The resolvents {(αI − A)^{−1}}_{α>0} are given by the formula

    (αI − A)^{−1} g(x) = ∫_{−∞}^∞ G_α(x, y) g(y) dy = (1/√(2α)) ∫_{−∞}^∞ e^{−√(2α)|x−y|} g(y) dy for every g ∈ C_0(K).

Here

    G_α(x, y) = (1/√(2α)) exp[−√(2α)|x − y|]

is the Green kernel of the resolvent (αI − A)^{−1} for α > 0.

Example 12.43 (Brownian motion with constant drift) Let K = R and

    D(A) = {f ∈ C_0(K) ∩ C^2(K) : f′ ∈ C_0(K), f″ ∈ C_0(K)},  Af = (1/2) f″ + m f′ for every f ∈ D(A).

Example 12.44 (Cauchy process) Let K = R. The domain D(A) contains the C^2 functions on K with compact support, and the infinitesimal generator A takes the following form (see [209, Table 4.4 with n = 1 and α = 1]):

    Af(x) = (1/π) ∫_R [f(x + y) − f(x) − y f′(x)] dy/y² = (1/π) ∫_R ∫_0^1 f″(x + ty)(1 − t) dt dy.

The operator A is not "local", which reflects the fact that the Cauchy process changes state by jumps (see [209, Example 4.8 with n = 1 and α = 1]).

Example 12.45 (reflecting barrier Brownian motion) Let K = [0, ∞) and

    D(A) = {f ∈ C_0(K) ∩ C^2(K) : f′ ∈ C_0(K), f″ ∈ C_0(K), f′(0) = 0},  Af = (1/2) f″ for every f ∈ D(A).

The resolvents {(αI − A)^{−1}}_{α>0} are given by the formula

    (αI − A)^{−1} g(x) = (1/√(2α)) ∫_0^∞ e^{√(2α)(x−y)} g(y) dy + (1/√(2α)) ∫_0^∞ e^{−√(2α)(x+y)} g(y) dy
                       − (1/√(2α)) ∫_0^x [e^{√(2α)(x−y)} − e^{−√(2α)(x−y)}] g(y) dy   (12.37)

for every g ∈ C_0(K).

Example 12.46 (sticking barrier Brownian motion) Let K = [0, ∞) and

    D(A) = {f ∈ C_0(K) ∩ C^2(K) : f′ ∈ C_0(K), f″ ∈ C_0(K), f″(0) = 0},  Af = (1/2) f″ for every f ∈ D(A).

The resolvents {(αI − A)^{−1}}_{α>0} are given by the formula

    (αI − A)^{−1} g(x) = (1/√(2α)) ∫_x^∞ e^{√(2α)(x−y)} g(y) dy + (1/√(2α)) ∫_0^x e^{−√(2α)(x−y)} g(y) dy
                       + (1/α) g(0) e^{−√(2α)x} − (1/√(2α)) ∫_0^∞ e^{−√(2α)(x+y)} g(y) dy   (12.38)

for every g ∈ C_0(K).
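The Green kernels in these examples are Laplace transforms in t of the corresponding transition densities; for instance, for the free Brownian motion of Example 12.42, ∫_0^∞ e^{−αt}(2πt)^{−1/2} e^{−|x−y|²/(2t)} dt = G_α(x, y). A Python quadrature check (our own illustration; the substitution t = s² removes the integrable singularity at t = 0):

```python
import math

def green_by_laplace(d, alpha, S=8.0, n=40000):
    """Laplace transform in t of the Gaussian kernel (2 pi t)^{-1/2} e^{-d^2/(2t)}
    at distance d = |x - y|. With t = s^2 the integrand becomes the smooth
    function sqrt(2/pi) * exp(-alpha s^2 - d^2/(2 s^2)), handled by trapezoid."""
    h = S / n
    total = 0.0
    for i in range(1, n):       # the integrand vanishes at both endpoints
        s = i * h
        total += math.exp(-alpha * s * s - d * d / (2.0 * s * s))
    return math.sqrt(2.0 / math.pi) * total * h

d, alpha = 1.0, 1.0
exact = math.exp(-math.sqrt(2.0 * alpha) * d) / math.sqrt(2.0 * alpha)  # G_alpha(x, y)
assert abs(green_by_laplace(d, alpha) - exact) < 1e-6
```

The same Laplace-transform relation, applied to the reflected and killed transition densities built by the method of images, produces the boundary terms in formulas (12.37) and (12.38).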

Moreover, we can obtain the following example:

Example 12.47 (reflecting barrier Brownian motion) Let K = [0, 1] and

    D(A) = {f ∈ C^2(K) : f′(0) = f′(1) = 0},  Af = (1/2) f″ for every f ∈ D(A).

Example 12.48 (absorbing barrier Brownian motion) Let K = [0, ∞) and

    D(A) = {f ∈ C_0(K) ∩ C^2(K) : f′ ∈ C_0(K), f″ ∈ C_0(K), f(0) = 0},  Af = (1/2) f″ for every f ∈ D(A).

⎨ D(A) = f ∈ C0 (K ) ∩ C 2 (K ) : f  ∈ C0 (K ), f  ∈ C0 (K ), f (0) = 0 , 1 ⎩ A f = f  for every f ∈ D(A). 2

This represents Brownian motion with an absorbing barrier at x = 0; a Brownian particle dies at the first moment when it hits the boundary x = 0. Namely, the point 0 is the terminal point.

Example 12.49 (absorbing barrier Brownian motion) Let K = [0, 1], where the boundary points 0 and 1 are identified with the point at infinity ∂. More precisely, we introduce a subspace C_0(K) of C(K) as follows:

    C_0(K) = {f ∈ C(K) : f(0) = f(1) = 0}.

Then we define a linear operator A : C_0(K) → C_0(K) by the formula

    D(A) = {f ∈ C^2(K) : f′ ∈ C_0(K), f″ ∈ C_0(K), f(0) = f(1) = 0},  Af = (1/2) f″ for every f ∈ D(A).

This represents Brownian motion with two absorbing barriers at x = 0 and x = 1; a Brownian particle dies at the first moment when it hits the boundary points x = 0 and x = 1. Namely, the two points 0 and 1 are the terminal points.

Example 12.50 (absorbing–reflecting barrier Brownian motion) Let K = [0, ∞) and

    D(A) = {f ∈ C_0(K) ∩ C^2(K) : f′ ∈ C_0(K), f″ ∈ C_0(K), λ f′(0) − (1 − λ) f(0) = 0},  Af = (1/2) f″ for every f ∈ D(A).

Here λ is a constant such that 0 < λ < 1. This process {x_t} may be thought of as a "combination" of the absorbing and reflecting Brownian motions; the absorbing and reflecting cases are formally obtained by letting λ → 0 and λ → 1, respectively.
Here is an example where it is difficult to begin with a transition function and the infinitesimal generator is the basic tool for describing the process.

Example 12.51 (sticky barrier Brownian motion) Let K = [0, ∞). We define a linear operator A : C_0(K) → C_0(K) by the formula

    D(A) = {f ∈ C_0(K) ∩ C^2(K) : f′ ∈ C_0(K), f″ ∈ C_0(K), f′(0) − λ f″(0) = 0},  Af = (1/2) f″ for every f ∈ D(A).

Here λ is a positive constant. This process {x_t} may be thought of as a "combination" of the reflecting and sticking Brownian motions; the reflecting and sticking cases are

formally obtained by letting λ → 0 and λ → ∞, respectively. Upon hitting x = 0, a Brownian particle leaves immediately, but it spends a positive duration of time there. We remark that the set {t > 0 : x_t = 0} is somewhat analogous to Cantor-like sets of positive Lebesgue measure. The resolvents {(αI − A)^{−1}}_{α>0} are given by the formula

    (αI − A)^{−1} g(x) = (1/√(2α)) ∫_0^∞ e^{√(2α)(x−y)} g(y) dy + C e^{−√(2α)x}
                       − (1/√(2α)) ∫_0^x [e^{√(2α)(x−y)} − e^{−√(2α)(x−y)}] g(y) dy   (12.39)

for every g ∈ C_0(K), where C is a constant given by the formula

    C = [(1/λ − √(2α)) / (1/λ + √(2α))] · (1/√(2α)) ∫_0^∞ e^{−√(2α)y} g(y) dy + 2g(0) / [√(2α)(1/λ + √(2α))].

It should be noticed that formulas (12.37) and (12.38) may be obtained from formula (12.39) by letting λ → 0 and λ → ∞, respectively.
Finally, it is worth pointing out here that a strong Markov process cannot stay at a single position for a positive length of time and then leave that position by continuous motion; it must either jump away or leave instantaneously. We give a simple example of a strong Markov process which changes state not by continuous motion but by jumps when the motion reaches the boundary:

Example 12.52 Let K = [0, ∞) and

    D(A) = {f ∈ C_0(K) ∩ C^2(K) : f′ ∈ C_0(K), f″ ∈ C_0(K), f′(0) = 2c ∫_0^∞ (f(y) − f(0)) dF(y)},  Af = (1/2) f″ for every f ∈ D(A).

Here c is a positive constant and F is a distribution function on (0, ∞). This process {x_t} may be interpreted as follows: When a Brownian particle reaches the boundary x = 0, it stays there for a positive length of time and then jumps back to a random point, chosen with the function F, in the interior (0, ∞). The constant c is the parameter in the "waiting time" distribution at the boundary x = 0. We remark that the boundary condition

    f′(0) = 2c ∫_0^∞ (f(y) − f(0)) dF(y)

12.3 The Hille–Yosida Theory of Feller Semigroups


depends on the values of f (y) far away from the boundary x = 0, unlike the boundary conditions in Examples 12.45–12.48.
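The barrier conditions of Examples 12.45–12.52 have transparent finite-difference analogues. The following sketch is mine, not the book's: on a grid of mesh $h$ the generator $Af = \tfrac12 f''$ becomes a matrix, and each barrier condition becomes a choice of boundary row; the resulting matrices are sub-Markov generators, the discrete counterpart of the positive maximum principle discussed in the next subsection.

```python
import numpy as np

# A finite-difference sketch (mine, not from the book) of barrier conditions:
# on a grid of mesh h the generator Af = f''/2 becomes a matrix, and each
# boundary condition becomes a choice of boundary row.
h, n = 0.1, 6                      # mesh size and number of grid points

def half_laplacian_interior(Q):
    for i in range(1, n - 1):
        Q[i, i - 1] = Q[i, i + 1] = 1.0 / (2 * h * h)
        Q[i, i] = -1.0 / (h * h)

Q_abs = np.zeros((n, n))           # absorbing barrier: the particle is
half_laplacian_interior(Q_abs)     # killed at 0 (zero boundary row)

Q_ref = np.zeros((n, n))           # reflecting barrier: excess mass at 0
half_laplacian_interior(Q_ref)     # is pushed back to the first interior
Q_ref[0, 0] = -1.0 / (2 * h * h)   # point (a discrete Neumann condition)
Q_ref[0, 1] = 1.0 / (2 * h * h)

# Both are sub-Markov generator matrices: non-negative off-diagonal entries
# and non-positive row sums -- the discrete positive maximum principle.
for Q in (Q_abs, Q_ref):
    off = Q - np.diag(np.diag(Q))
    assert (off >= 0).all() and (Q.sum(axis=1) <= 1e-12).all()

u = np.array([0.1, 0.4, 1.0, 0.3, -0.2, 0.0])   # positive maximum at i = 2
print(float(Q_ref[2] @ u) <= 0)    # True: (Qu)(i) <= 0 at the maximum
```

The last check is exactly condition (β′) below, read at matrix level: at a positive maximum the discrete second difference is non-positive.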

12.3.2 Generation Theorems for Feller Semigroups in Terms of Maximum Principles

Although Theorem 12.38 tells us precisely when a linear operator $A$ is the infinitesimal generator of some Feller semigroup, it is usually difficult to verify conditions (b)–(d). So we give useful criteria in terms of maximum principles ([202, Theorem 2.18 and Corollary 2.19]):

Theorem 12.53 (Hille–Yosida–Ray) Let $K$ be a compact metric space. Then we have the following two assertions (i) and (ii):

(i) Let $B$ be a linear operator from $C(K) = C_0(K)$ into itself, and assume that:
(α) The domain $D(B)$ of $B$ is dense in the space $C(K)$.
(β) There exists an open and dense subset $K_0$ of $K$ such that if $u \in D(B)$ takes a positive maximum at a point $x_0$ of $K_0$, then we have the inequality $Bu(x_0) \le 0$.
Then the operator $B$ is closable in the space $C(K)$.

(ii) Let $B$ be as in part (i), and further assume that:
(β′) (the positive maximum principle) If $u \in D(B)$ takes a positive maximum at a point $x'$ of $K$, then we have the inequality $Bu(x') \le 0$.
(γ) For some $\alpha_0 \ge 0$, the range $R(\alpha_0 I - B)$ of $\alpha_0 I - B$ is dense in the space $C(K)$.
Then the minimal closed extension $\overline{B}$ of $B$ is the infinitesimal generator of some Feller semigroup on $K$.

Proof (i) By Theorem 5.9, it suffices to show that
\[
\{u_n\} \subset D(B),\ u_n \to 0 \ \text{and}\ Bu_n \to v \ \text{in } C(K) \implies v = 0.
\]
Replacing $v$ by $-v$ if necessary, we assume to the contrary that the function $v$ takes a positive value at some point of $K$. Then, since $K_0$ is open and dense in $K$, we can find a point $x_0$ of $K_0$, a neighborhood $U$ of $x_0$ contained in $K_0$ and a constant $\varepsilon > 0$ such that, for sufficiently large $n$,


\[
B u_n(x) > \varepsilon \quad \text{for all } x \in U. \tag{12.40}
\]

On the other hand, by condition (α) there exists a function $h \in D(B)$ such that
\[
\begin{cases}
h(x_0) > 1,\\
h(x) < 0 \quad \text{for all } x \in K \setminus U.
\end{cases}
\]

Therefore, since $u_n \to 0$ in $C(K)$, it follows that the function
\[
\tilde u_n(x) = u_n(x) + \frac{\varepsilon\, h(x)}{1 + \|Bh\|}
\]
satisfies the conditions
\[
\tilde u_n(x_0) = u_n(x_0) + \frac{\varepsilon\, h(x_0)}{1 + \|Bh\|} > 0, \qquad
\tilde u_n(x) = u_n(x) + \frac{\varepsilon\, h(x)}{1 + \|Bh\|} < 0, \quad x \in K \setminus U,
\]
if $n$ is sufficiently large. This implies that the function $\tilde u_n \in D(B)$ takes its positive maximum at a point $x_n$ of $U \subset K_0$. Hence we have, by condition (β),
\[
B \tilde u_n(x_n) \le 0.
\]
However, it follows from inequality (12.40) that
\[
B \tilde u_n(x_n) = B u_n(x_n) + \varepsilon\, \frac{B h(x_n)}{1 + \|Bh\|} > B u_n(x_n) - \varepsilon > 0.
\]

This is a contradiction.

(ii) We apply part (ii) of Theorem 12.38 to the operator $\overline{B}$.

Step (1): First, we show that
\[
u \in D(B),\ (\alpha_0 I - B)\, u \ge 0 \ \text{on } K \implies u \ge 0 \ \text{on } K. \tag{12.41}
\]
By condition (γ), we can find a function $v \in D(B)$ such that $(\alpha_0 I - B)\, v \ge 1$ on $K$. Then we have, for any $\varepsilon > 0$,

\[
\begin{cases}
u + \varepsilon v \in D(B),\\
(\alpha_0 I - B)(u + \varepsilon v) \ge \varepsilon \quad \text{on } K.
\end{cases} \tag{12.42}
\]


In view of condition (β′) (the positive maximum principle), this implies that the function $-(u + \varepsilon v)$ does not take any positive maximum on $K$, so that $u + \varepsilon v \ge 0$ on $K$. Thus, by letting $\varepsilon \downarrow 0$ we obtain that $u \ge 0$ on $K$. This proves the desired assertion (12.41).

Step (2): It follows from assertion (12.41) that the inverse $(\alpha_0 I - B)^{-1}$ of $\alpha_0 I - B$ is defined and non-negative on the range $R(\alpha_0 I - B)$. Moreover it is bounded with norm
\[
\|(\alpha_0 I - B)^{-1}\| \le \|v\|. \tag{12.43}
\]
Here $v$ is a function which satisfies condition (12.42). Indeed, since $g = (\alpha_0 I - B)\, v \ge 1$ on $K$, it follows that we have, for all $f \in C(K)$,
\[
-\|f\|\, g \le f \le \|f\|\, g \quad \text{on } K.
\]
Hence, by the non-negativity of $(\alpha_0 I - B)^{-1}$ we have, for all $f \in R(\alpha_0 I - B)$,
\[
-\|f\|\, v \le (\alpha_0 I - B)^{-1} f \le \|f\|\, v \quad \text{on } K.
\]
This proves the desired inequality (12.43).

Step (3): Next we show that
\[
R(\alpha_0 I - \overline{B}) = C(K). \tag{12.44}
\]

Let $f$ be an arbitrary element of $C(K)$. By condition (γ), we can find a sequence $\{u_n\}$ in $D(B)$ such that $f_n = (\alpha_0 I - B)\, u_n \to f$ in $C(K)$. Since the inverse $(\alpha_0 I - B)^{-1}$ is bounded, it follows that $u_n = (\alpha_0 I - B)^{-1} f_n$ converges to some $u \in C(K)$, so that $B u_n = \alpha_0 u_n - f_n \to \alpha_0 u - f$ in $C(K)$. Thus we have the assertions
\[
\begin{cases}
u \in D(\overline{B}),\\
\overline{B}\, u = \alpha_0 u - f,
\end{cases}
\]


and so
\[
(\alpha_0 I - \overline{B})\, u = f.
\]
This proves the desired assertion (12.44).

Step (4): Furthermore, we show that
\[
u \in D(\overline{B}),\ (\alpha_0 I - \overline{B})\, u \ge 0 \ \text{on } K \implies u \ge 0 \ \text{on } K. \tag{12.41′}
\]
Since $R(\alpha_0 I - \overline{B}) = C(K)$, in view of the proof of assertion (12.41) it suffices to show the following: if $u \in D(\overline{B})$ takes a positive maximum at a point $x'$ of $K$, then we have the inequality
\[
\overline{B}\, u(x') \le 0. \tag{12.45}
\]
Assume, to the contrary, that $\overline{B}\, u(x') > 0$. Since there exists a sequence $\{u_n\}$ in $D(B)$ such that

\[
u_n \to u \ \text{in } C(K), \qquad B u_n \to \overline{B}\, u \ \text{in } C(K),
\]
we can find a neighborhood $U$ of $x'$ and a constant $\varepsilon > 0$ such that we have, for $n$ sufficiently large,
\[
B u_n(x) > \varepsilon \quad \text{for all } x \in U. \tag{12.46}
\]
Furthermore, by condition (α) we can find a function $h \in D(B)$ such that
\[
\begin{cases}
h(x') > 1,\\
h(x) < 0 \quad \text{for all } x \in K \setminus U.
\end{cases}
\]
Then the function
\[
\tilde u_n(x) = u_n(x) + \frac{\varepsilon\, h(x)}{1 + \|Bh\|}
\]
satisfies the conditions
\[
\tilde u_n(x') > u(x') > 0, \qquad \tilde u_n(x) < u(x') \quad \text{for all } x \in K \setminus U,
\]
if $n$ is sufficiently large. This implies that the function $\tilde u_n \in D(B)$ takes its positive maximum at a point $x_n$ of $U$. Hence we have, by condition (β′) (the positive maximum principle),


\[
B \tilde u_n(x_n) \le 0.
\]
However, it follows from inequality (12.46) that
\[
B \tilde u_n(x_n) = B u_n(x_n) + \varepsilon\, \frac{B h(x_n)}{1 + \|Bh\|} > B u_n(x_n) - \varepsilon > 0.
\]
This is a contradiction.

Step (5): In view of Steps (3) and (4), we obtain that the inverse $(\alpha_0 I - \overline{B})^{-1}$ of $\alpha_0 I - \overline{B}$ is defined on the whole space $C(K)$, and is bounded with norm
\[
\|(\alpha_0 I - \overline{B})^{-1}\| = \|(\alpha_0 I - B)^{-1}\| \le \|v\|.
\]

Step (6): Finally, we show that, for all $\alpha > \alpha_0$, the inverse $(\alpha I - \overline{B})^{-1}$ of $\alpha I - \overline{B}$ is defined on the whole space $C(K)$, and is non-negative and bounded with norm
\[
\|(\alpha I - \overline{B})^{-1}\| \le \frac{1}{\alpha}. \tag{12.47}
\]
We let
\[
G_{\alpha_0} = (\alpha_0 I - \overline{B})^{-1}.
\]
First choose a constant $\alpha_1 > \alpha_0$ such that
\[
(\alpha_1 - \alpha_0)\, \|G_{\alpha_0}\| < 1,
\]
and let $\alpha_0 < \alpha \le \alpha_1$. Then, for any $f \in C(K)$, the Neumann series
\[
u = \left( I + \sum_{n=1}^{\infty} (\alpha_0 - \alpha)^n\, G_{\alpha_0}^n \right) G_{\alpha_0} f
\]
converges in $C(K)$, and is a solution of the equation
\[
u - (\alpha_0 - \alpha)\, G_{\alpha_0} u = G_{\alpha_0} f.
\]
Hence, we have the assertions
\[
\begin{cases}
u \in D(\overline{B}),\\
(\alpha I - \overline{B})\, u = f.
\end{cases}
\]


This proves that
\[
R(\alpha I - \overline{B}) = C(K) \quad \text{for } \alpha_0 < \alpha \le \alpha_1. \tag{12.48}
\]

Thus, by arguing as in the proof of Step (1) we obtain that we have, for $\alpha_0 < \alpha \le \alpha_1$,
\[
u \in D(\overline{B}),\ (\alpha I - \overline{B})\, u \ge 0 \ \text{on } K \implies u \ge 0 \ \text{on } K. \tag{12.49}
\]

By combining assertions (12.48) and (12.49), we find that the inverse $(\alpha I - \overline{B})^{-1}$ is defined and non-negative on the whole space $C(K)$ for $\alpha_0 < \alpha \le \alpha_1$. We let
\[
G_\alpha = (\alpha I - \overline{B})^{-1} \quad \text{for } \alpha_0 < \alpha \le \alpha_1.
\]
Then the operator $G_\alpha$ is bounded with norm
\[
\|G_\alpha\| \le \frac{1}{\alpha}. \tag{12.50}
\]
Indeed, in view of assertion (12.45) it follows that if $u \in D(\overline{B})$ takes a positive maximum at a point $x'$ of $K$, then we have the inequality $\overline{B}\, u(x') \le 0$, and so
\[
\max_K u = u(x') \le \frac{1}{\alpha}\, (\alpha I - \overline{B})\, u(x') \le \frac{1}{\alpha}\, \|(\alpha I - \overline{B})\, u\|. \tag{12.51}
\]
Similarly, if $u$ takes a negative minimum at a point of $K$, then (replacing $u$ by $-u$) we have the inequality
\[
-\min_K u = \max_K (-u) \le \frac{1}{\alpha}\, \|(\alpha I - \overline{B})\, u\|. \tag{12.52}
\]
The desired inequality (12.47) for $\alpha_0 < \alpha \le \alpha_1$ follows from inequalities (12.51) and (12.52).

Now we assume that assertion (12.47) is proved for $\alpha_0 < \alpha \le \alpha_{n-1}$, $n = 2, 3, \ldots$. Then, by taking
\[
\alpha_n := 2\alpha_{n-1} - \frac{\alpha_1}{2},
\]
or equivalently
\[
\alpha_n := \left( 2^{\,n-2} + \frac{1}{2} \right) \alpha_1,
\]


we have, for $\alpha_{n-1} < \alpha \le \alpha_n$,
\[
(\alpha - \alpha_{n-1})\, \|G_{\alpha_{n-1}}\| \le \frac{\alpha - \alpha_{n-1}}{\alpha_{n-1}} \le \frac{\alpha_n - \alpha_{n-1}}{\alpha_{n-1}} = \frac{1}{1 + 2^{\,2-n}} < 1.
\]
Hence, the desired assertion (12.47) for $\alpha_{n-1} < \alpha \le \alpha_n$ is proved just as in the proof of assertion (12.47) for $\alpha_0 < \alpha \le \alpha_1$. Summing up, we have proved the desired assertion (12.47) for all $\alpha > \alpha_0$.

Consequently, by applying part (ii) of Theorem 12.38 to the operator $\overline{B}$ we obtain that $\overline{B}$ is the infinitesimal generator of some Feller semigroup on $K$. The proof of Theorem 12.53 is now complete. □

Corollary 12.54 Let $A$ be the infinitesimal generator of a Feller semigroup on a compact metric space $K$, and $M$ a bounded linear operator on $C_0(K) = C(K)$ into itself. Assume that either $M$ or $A' = A + M$ satisfies condition (β′) (the positive maximum principle). Then the operator $A' = A + M$ is the infinitesimal generator of some Feller semigroup on $K$.

Proof We apply part (ii) of Theorem 12.53 to the operator $A'$. First, we note that $A' = A + M$ is a densely defined, closed linear operator from $C(K)$ into itself. Since the semigroup $\{T_t\}_{t \ge 0}$ is non-negative and contractive on $C(K)$, it follows that if $u \in D(A)$ takes a positive maximum at a point $x'$ of $K$, then we have the inequality

\[
A u(x') = \lim_{t \downarrow 0} \frac{T_t u(x') - u(x')}{t} \le 0.
\]
This implies that if $M$ satisfies condition (β′), so does $A' = A + M$.

We let
\[
G_{\alpha_0} := (\alpha_0 I - A)^{-1} \quad \text{for } \alpha_0 > 0.
\]
If $\alpha_0$ is so large that
\[
\|G_{\alpha_0} M\| \le \|G_{\alpha_0}\| \cdot \|M\| \le \frac{\|M\|}{\alpha_0} < 1,
\]
then the C. Neumann series
\[
u = \left( I + \sum_{n=1}^{\infty} (G_{\alpha_0} M)^n \right) G_{\alpha_0} f
\]
converges in $C(K)$ for any $f \in C(K)$, and is a solution of the equation
\[
u - G_{\alpha_0} M u = G_{\alpha_0} f.
\]


Hence, we have the assertions
\[
\begin{cases}
u \in D(A) = D(A'),\\
(\alpha_0 I - A')\, u = f.
\end{cases}
\]
This proves that
\[
R(\alpha_0 I - A') = C(K).
\]

Therefore, by applying part (ii) of Theorem 12.53 to the operator $A'$ we obtain that $A'$ is the infinitesimal generator of some Feller semigroup on $K$. The proof of Corollary 12.54 is complete. □
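Step (6) above and Corollary 12.54 rest on the same device: extend a known resolvent by a convergent Neumann series. Here is a finite-dimensional numerical sketch of mine (not the book's), with a $3 \times 3$ conservative Q-matrix standing in for the generator:

```python
import numpy as np

# A finite-dimensional sketch (mine, not the book's) of the Neumann-series
# device of Step (6): from the resolvent G_a0 = (a0*I - B)^{-1} at one point
# a0, build the resolvent at a nearby a.
B = np.array([[-2.0, 2.0, 0.0],    # a conservative Q-matrix (B1 = 0), the
              [1.0, -3.0, 2.0],    # generator of a Feller (Markov) semigroup
              [0.0, 4.0, -4.0]])   # on a three-point state space
I = np.eye(3)
a0, a = 1.0, 1.5                   # requires (a - a0) * ||G_a0|| < 1

G0 = np.linalg.inv(a0 * I - B)
S, term = G0.copy(), G0.copy()
for _ in range(200):               # u = (I + sum_n ((a0 - a) G_a0)^n) G_a0 f
    term = (a0 - a) * (G0 @ term)
    S += term

G = np.linalg.inv(a * I - B)       # the true resolvent, for comparison
print(np.allclose(S, G))                            # True
print(np.abs(G).sum(axis=1).max() <= 1 / a + 1e-9)  # True: ||G|| <= 1/a
```

The second printed check is the resolvent bound (12.47) in the sup-norm: since $B\mathbf{1} = 0$ and the resolvent is non-negative, its row sums equal exactly $1/\alpha$.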

12.4 Infinitesimal Generator of Feller Semigroups on a Bounded Domain (i)

In the early 1950s, W. Feller characterized completely the analytic structure of one-dimensional diffusion processes; he gave an intrinsic representation of the infinitesimal generator $A$ of a one-dimensional diffusion process and determined all possible boundary conditions which describe the domain $D(A)$ of $A$. The probabilistic meaning of Feller's work was clarified by Dynkin [45, 46], Itô and McKean, Jr. [95], Ray [147] and others. One-dimensional diffusion processes are completely studied both from analytic and probabilistic viewpoints. Now we take a close look at Feller's work [59] and [60].

Let $X = (x_t, \mathcal{F}, \mathcal{F}_t, P_x)$ be a one-dimensional diffusion process with state space $K$. A point $x$ of $K$ is called a right (resp. left) singular point if $x_t(\omega) \ge x$ (resp. $x_t(\omega) \le x$) for all $t \in [0, \zeta(\omega))$ with $P_x$-measure one. A right and left singular point is called a trap. For example, the point at infinity $\partial$ is a trap. A point which is neither right nor left singular is called a regular point.

For simplicity, we assume that the state space $K$ is the half-line $K = [0, \infty)$, and all its interior points are regular. Feller proved that there exist a strictly increasing, continuous function $s$ on $(0, \infty)$ and Borel measures $m$ and $k$ on $(0, \infty)$ such that the infinitesimal generator $A$ of the process $X$ can be expressed as follows:
\[
A f(x) = \lim_{y \downarrow x} \frac{f^+(y) - f^+(x) - \int_{(x, y]} f(z)\,dk(z)}{m((x, y])}. \tag{12.53}
\]
Here:
(1) $f^+(x) = \lim_{\varepsilon \downarrow 0} \dfrac{f(x + \varepsilon) - f(x)}{s(x + \varepsilon) - s(x)}$, the right-derivative of $f$ at $x$ with respect to $s$.


(2) The measure $m$ is positive for non-empty open subsets, and is finite for compact sets.
(3) The measure $k$ is finite for compact subsets.

The function $s$ is called a canonical scale, and the measures $m$ and $k$ are called a canonical measure (or speed measure) and a killing measure for the process $X$, respectively. They determine the behavior of a Markovian particle in the interior of the state space $K$. We remark that the right-hand side of (12.53) is a generalization of the second order differential operator
\[
a(x) f'' + b(x) f' + c(x) f,
\]
where $a(x) > 0$ and $c(x) \le 0$ on $K$. For example, the formula
\[
A f = a(x) f'' + b(x) f'
\]
can be written in the form (12.53), if we take
\[
s(x) = \int_0^x \exp \left( - \int_0^y \frac{b(z)}{a(z)}\,dz \right) dy, \qquad
dm(x) = \frac{1}{a(x)} \exp \left( \int_0^x \frac{b(y)}{a(y)}\,dy \right) dx, \qquad
dk(x) = 0.
\]

The boundary point $0$ is called a regular boundary if we have, for a point $r \in (0, \infty)$,
\[
\int_{(0, r)} [\, s(r) - s(x) \,]\, [\, dm(x) + dk(x) \,] < \infty, \qquad
\int_{(0, r)} [\, m((x, r)) + k((x, r)) \,]\, ds(x) < \infty.
\]
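The formulas for $s$ and $m$ above can be checked symbolically. The following sketch is mine, not the book's, and uses a sample pair of coefficients $a$, $b$ chosen only for illustration:

```python
import sympy as sp

# A symbolic check (mine, not from the book) that Feller's canonical scale
# s and speed measure m turn the generalized derivative (d/dm)(df/ds) back
# into  a(x) f'' + b(x) f'  for a sample choice of coefficients a, b.
x, y = sp.symbols('x y')
a = 1 + x**2                                  # a(x) > 0
b = sp.sin(x)                                 # an arbitrary smooth drift
f = sp.Function('f')(x)

E = sp.Integral(sp.sin(y) / (1 + y**2), (y, 0, x))   # E(x) = int_0^x b/a
# s'(x) = exp(-E(x)),  m'(x) = exp(E(x)) / a(x)  (killing measure k = 0)
df_ds = sp.diff(f, x) * sp.exp(E)             # (df/ds)(x) = f'(x) / s'(x)
lhs = sp.diff(df_ds, x) * a * sp.exp(-E)      # (d/dm)(df/ds)
rhs = a * sp.diff(f, x, 2) + b * sp.diff(f, x)

print(sp.simplify(lhs - rhs))   # 0
```

The identity $(d/dm)(df/ds) = a f'' + b f'$ is exactly why $s$ and $m$ are called the scale and speed of the diffusion.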

It can be shown that this notion is independent of the point $r$ used. Intuitively, the regularity of the boundary point means that a Markovian particle approaches the boundary in finite time with positive probability, and also enters the interior from the boundary.

The behavior of a Markovian particle at the boundary point is characterized by boundary conditions. In the case of regular boundary points, Feller determined all possible boundary conditions which are satisfied by the functions $f(x)$ in the domain $D(A)$ of $A$. A general boundary condition is of the form
\[
\gamma f(0) - \delta\, A f(0) + \mu f^+(0) = 0,
\]


where $\gamma$, $\delta$ and $\mu$ are constants such that
\[
\gamma \le 0, \quad \delta \ge 0, \quad \mu \ge 0, \quad \mu + \delta > 0.
\]
If we admit jumps from the boundary into the interior, then a general boundary condition takes the form
\[
\gamma f(0) - \delta\, A f(0) + \mu f^+(0) + \int_{(0, \infty)} [\, f(x) - f(0) \,]\,d\nu(x) = 0, \tag{12.54}
\]
where $\nu$ is a Borel measure with respect to which the function $\min(1, s(x) - s(+0))$ is integrable. It should be noticed that boundary condition (12.54) is a “combination” of the boundary conditions in Examples 12.45, 12.48 and 12.52 if we take
\[
s(x) = x, \quad dm(x) = 2\,dx, \quad dk(x) = 0.
\]
A Markov process is said to be one-dimensional or multi-dimensional according to whether the state space is a subset of $\mathbf{R}$ or $\mathbf{R}^N$ for $N \ge 2$. The main purpose of this book is to generalize Feller's work to the multi-dimensional case. In 1959 Ventcel' [236] studied the problem of determining all possible boundary conditions for multi-dimensional diffusion processes.

In this section and the next section, we describe analytically the infinitesimal generator of a Feller semigroup in the case when the state space $K = \overline{D}$, the closure of a bounded domain $D$ in $\mathbf{R}^N$ (see Theorems 12.55 and 12.57).

Let $K$ be a compact metric space and let $C(K) = C_0(K)$ be the Banach space of real-valued continuous functions $f(x)$ on $K$ with the maximum norm
\[
\|f\|_\infty = \max_{x \in K} |f(x)|.
\]

A sequence {μn }∞ n=1 of real Borel measures on K is said to converge weakly to a real Borel measure μ on K if it satisfies the condition   f (x) dμn (x) = f (x) dμ(x) for every f ∈ C(K ). (12.55) lim n→∞

K

K
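Weak convergence (12.55) is easy to watch numerically. The following demonstration is mine (not from the book): the empirical measures of i.i.d. uniform samples on $K = [0, 1]$ converge weakly to Lebesgue measure, so the integral of any fixed continuous test function converges.

```python
import numpy as np

# A numerical sketch (mine, not from the book) of weak convergence (12.55):
# empirical measures mu_n of i.i.d. uniform samples on K = [0, 1] converge
# weakly to Lebesgue measure, so int f dmu_n -> int f dx for continuous f.
rng = np.random.default_rng(42)
f = lambda t: np.cos(3 * t) + t**2
exact = np.sin(3.0) / 3.0 + 1.0 / 3.0     # int_0^1 f(t) dt, in closed form

errs = {n: abs(f(rng.uniform(size=n)).mean() - exact) for n in (100, 10**6)}
print(errs[10**6] < 1e-2)   # True: the n = 10^6 empirical integral is close
```

Note that only integrals of continuous functions need converge; the empirical measures themselves are purely atomic, so no convergence in total variation holds here — which is exactly why (12.55) is the weak* convergence mentioned below.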

We remark (see Theorem 5.44) that the space of all real Borel measures $\mu$ on $K$ is a normed linear space with the norm $\|\mu\| =$ the total variation $|\mu|(K)$ of $\mu$, and further (see formula (5.23)) that the weak convergence (12.55) of real Borel measures is just the weak* convergence in the dual space $C(K)'$ of $C(K)$.

Now we recall that a Feller semigroup $\{T_t\}_{t \ge 0}$ on $K$ is a strongly continuous semigroup of bounded linear operators $T_t$ acting on $C(K)$ such that
\[
f \in C(K),\ 0 \le f(x) \le 1 \ \text{on } K \implies 0 \le T_t f(x) \le 1 \ \text{on } K.
\]
The infinitesimal generator $A$ of $\{T_t\}$ is defined by the formula


\[
A u = \lim_{t \downarrow 0} \frac{T_t u - u}{t}, \tag{12.56}
\]

provided that the limit (12.56) exists in $C(K)$. Namely, the generator $A$ is a linear operator from $C(K)$ into itself whose domain $D(A)$ consists of all $u \in C(K)$ for which the limit (12.56) exists. Theorem 12.38, a version of the Hille–Yosida theorem, asserts that a Feller semigroup is completely characterized by its infinitesimal generator. Therefore, we are reduced to the study of the infinitesimal generators of Feller semigroups.

Our first job is to derive an explicit formula in the interior $D$ of $\overline{D}$ for the infinitesimal generator $A$ of a Feller semigroup $\{T_t\}_{t \ge 0}$ on the closure $\overline{D}$. The next result is essentially due to Ventcel' [236] (see the proof of Theorem 12.57 in Sect. 12.5):

Theorem 12.55 (Ventcel') Let $D$ be a bounded domain in $\mathbf{R}^N$ and let $\{T_t\}_{t \ge 0}$ be a Feller semigroup on $\overline{D}$ and $A$ its infinitesimal generator. Assume that, for every point $x^0$ of $D$, there exist a local coordinate system $x = (x_1, x_2, \ldots, x_N)$ in a neighborhood of $x^0$ and continuous functions $\chi_1(x), \chi_2(x), \ldots, \chi_N(x)$ defined on $\overline{D}$ such that $\chi_i = x_i$ in a neighborhood of $x^0$, and they satisfy the conditions
\[
1,\ \chi_1(x),\ \chi_2(x),\ \ldots,\ \chi_N(x) \in D(A), \qquad \sum_{i=1}^N \chi_i(x)^2 \in D(A).
\]
Then we have, for all $u \in D(A) \cap C^2(D)$,
\begin{align*}
A u(x^0) &= \sum_{i, j=1}^N a^{ij}(x^0)\, \frac{\partial^2 u}{\partial x_i \partial x_j}(x^0) + \sum_{i=1}^N b^i(x^0)\, \frac{\partial u}{\partial x_i}(x^0) + c(x^0)\, u(x^0) \tag{12.57}\\
&\quad + \int_D e(x^0, dy) \left[ u(y) - u(x^0) - \sum_{i=1}^N \frac{\partial u}{\partial x_i}(x^0) \left( \chi_i(y) - \chi_i(x^0) \right) \right].
\end{align*}
Here:


(1) The matrix $(a^{ij}(x^0))$ is symmetric and positive semi-definite.
(2) $b^i(x^0) = A(\chi_i - \chi_i(x^0))(x^0)$ for all $1 \le i \le N$.
(3) $c(x^0) = A1(x^0)$.
(4) $e(x^0, \cdot)$ is a non-negative Borel measure on $D$ such that, for any neighborhood $U$ of $x^0$,
\[
e(x^0, D \setminus U) < \infty, \tag{12.58a}
\]
\[
\int_U e(x^0, dy) \left[ \sum_{i=1}^N \left( \chi_i(y) - \chi_i(x^0) \right)^2 \right] < \infty. \tag{12.58b}
\]
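The coefficients in (12.57) really are small-time limits of moments of the transition function, and this can be checked numerically. The sketch below is mine, not the book's, in the simplest possible setting:

```python
import numpy as np

# A numerical sketch (mine, not from the book) of how the coefficients in
# (12.57) arise from a transition function.  For Brownian motion on R
# (N = 1, chi(y) = y) the kernel is p_t(x, .) = N(x, t), and the small-t
# limits in the proof give c = 0, b = 0 and a = 1/2, i.e. A = (1/2) d^2/dx^2.
t, x0 = 1e-4, 0.3
y = np.linspace(x0 - 0.2, x0 + 0.2, 400001)    # p_t is concentrated here
p = np.exp(-((y - x0) ** 2) / (2 * t)) / np.sqrt(2 * np.pi * t)
w = np.gradient(y)                              # quadrature weights

c = ((p * w).sum() - 1.0) / t                   # (p_t(x0, R) - 1) / t
b = (p * (y - x0) * w).sum() / t                # first moment / t
a = 0.5 * (p * (y - x0) ** 2 * w).sum() / t     # half second moment / t
print(abs(c) < 1e-2, abs(b) < 1e-2, abs(a - 0.5) < 1e-2)   # True True True
```

Here the kernel conserves mass, so the termination coefficient $c$ vanishes; a sub-Markov kernel (mass $< 1$) would produce $c < 0$.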

Proof The proof is divided into three steps.

Step 1: By applying Theorem 12.37 with $C_0(K) := C(\overline{D})$, we obtain that there corresponds to a Feller semigroup $\{T_t\}_{t \ge 0}$ on $\overline{D}$ a unique uniformly stochastically continuous, Feller transition function $p_t$ on $\overline{D}$ in the following manner:
\[
T_t f(x) = \int_{\overline{D}} p_t(x, dy)\, f(y) \quad \text{for every } f \in C(\overline{D}).
\]
Since the functions $1, \chi_1, \chi_2, \ldots, \chi_N$ and $\sum_{i=1}^N \chi_i^2$ belong to the domain $D(A)$, it follows that
\[
\chi_1 - \chi_1(x^0),\ \ldots,\ \chi_N - \chi_N(x^0),\ \sum_{i=1}^N \left( \chi_i - \chi_i(x^0) \right)^2 \in D(A).
\]
Thus we have the formula
\begin{align*}
A u(x^0) &= \lim_{t \downarrow 0} \frac{1}{t} \left( T_t u(x^0) - u(x^0) \right) \tag{12.59}\\
&= \lim_{t \downarrow 0} \frac{1}{t} \left( \int_{\overline{D}} p_t(x^0, dy)\, u(y) - u(x^0) \right)\\
&= \lim_{t \downarrow 0} \left[ \frac{1}{t} \left( p_t(x^0, \overline{D}) - 1 \right) u(x^0)
+ \frac{1}{t} \sum_{i=1}^N \int_{\overline{D}} p_t(x^0, dy) \left( \chi_i(y) - \chi_i(x^0) \right) \frac{\partial u}{\partial x_i}(x^0) \right.\\
&\qquad \left. + \frac{1}{t} \int_{\overline{D}} p_t(x^0, dy) \left( u(y) - u(x^0) - \sum_{i=1}^N \frac{\partial u}{\partial x_i}(x^0) \left( \chi_i(y) - \chi_i(x^0) \right) \right) \right]\\
&= c(x^0)\, u(x^0) + \sum_{i=1}^N b^i(x^0)\, \frac{\partial u}{\partial x_i}(x^0)
+ \lim_{t \downarrow 0} \frac{1}{t} \int_{\overline{D} \setminus \{x^0\}} p_t(x^0, dy)\, \tilde u(x^0, y)\, d(x^0, y),
\end{align*}


where
\[
c(x^0) = \lim_{t \downarrow 0} \frac{T_t 1(x^0) - 1}{t} = A1(x^0), \qquad
b^i(x^0) = \lim_{t \downarrow 0} \frac{T_t (\chi_i - \chi_i(x^0))(x^0)}{t} = A(\chi_i - \chi_i(x^0))(x^0),
\]
and
\[
d(x^0, y) = \sum_{i=1}^N \left( \chi_i(y) - \chi_i(x^0) \right)^2,
\]
\[
\tilde u(x^0, y) = \frac{u(y) - u(x^0) - \sum_{i=1}^N \frac{\partial u}{\partial x_i}(x^0) \left( \chi_i(y) - \chi_i(x^0) \right)}{d(x^0, y)}
\quad \text{for } y \in \overline{D} \setminus \{x^0\}.
\]

To rewrite the last term of formula (12.59), we define a non-negative measure $\tilde p_t(x^0, \cdot)$ on $\overline{D}$ by the formula
\[
\tilde p_t(x^0, E) = \frac{1}{t} \int_E p_t(x^0, dy)\, d(x^0, y) \quad \text{for all } E \in \mathcal{B}_{\overline{D}}.
\]

Here and in the following $\mathcal{B}_K$ denotes the $\sigma$-algebra of all Borel sets in a metric space $K$. Then we can rewrite formula (12.59) as follows:
\[
A u(x^0) = c(x^0)\, u(x^0) + \sum_{i=1}^N b^i(x^0)\, \frac{\partial u}{\partial x_i}(x^0)
+ \lim_{t \downarrow 0} \int_{\overline{D} \setminus \{x^0\}} \tilde p_t(x^0, dy)\, \tilde u(x^0, y). \tag{12.59′}
\]

We remark that, for all sufficiently small $t > 0$,
\[
\tilde p_t(x^0, \overline{D}) = \frac{1}{t} \int_{\overline{D}} p_t(x^0, dy)\, d(x^0, y)
\le A \left( \sum_{i=1}^N \left( \chi_i - \chi_i(x^0) \right)^2 \right)(x^0) + 1. \tag{12.60}
\]

Step 2: Now we introduce a compactification of the space $\overline{D} \setminus \{x^0\}$ to which the function $\tilde u(x^0, \cdot)$ may be continuously extended. We let
\[
z_{ij}(x^0, y) := \frac{\left( \chi_i(y) - \chi_i(x^0) \right) \left( \chi_j(y) - \chi_j(x^0) \right)}{d(x^0, y)} \quad \text{for } y \in \overline{D} \setminus \{x^0\}.
\]
Then it is easy to see that the functions $z_{ij}(x^0, \cdot)$ satisfy the condition


\[
|z_{ij}(x^0, y)| \le 1,
\]
and that the matrix $(z_{ij}(x^0, \cdot))$ is symmetric and positive semi-definite. We define a compact subspace $\mathcal{M}$ of symmetric, positive semi-definite matrices by the formula
\[
\mathcal{M} = \left\{ (z_{ij})_{1 \le i, j \le N} : z_{ij} = z_{ji},\ (z_{ij}) \ge 0,\ |z_{ij}| \le 1 \right\},
\]
and consider an injection (see Fig. 12.12 below)
\[
\Phi_{x^0} : \overline{D} \setminus \{x^0\} \ni y \longmapsto \left( y, \left( z_{ij}(x^0, y) \right) \right) \in \overline{D} \times \mathcal{M}.
\]
Then the function $\tilde u(x^0, \Phi_{x^0}^{-1}(\cdot))$, defined on $\Phi_{x^0}(\overline{D} \setminus \{x^0\})$, can be extended to a continuous function $\hat u(x^0, \cdot)$ on the closure
\[
H_{x^0} = \overline{\Phi_{x^0}(\overline{D} \setminus \{x^0\})}
\]
of $\Phi_{x^0}(\overline{D} \setminus \{x^0\})$ in $\overline{D} \times \mathcal{M}$. Indeed, by using Taylor's formula we have, in a neighborhood of $x^0$,
\begin{align*}
u(y) = u(x^0) &+ \sum_{i=1}^N \frac{\partial u}{\partial x_i}(x^0) \left( \chi_i(y) - \chi_i(x^0) \right)\\
&+ \sum_{i, j=1}^N \int_0^1 \frac{\partial^2 u}{\partial x_i \partial x_j} \left( x^0 + \theta (y - x^0) \right) (1 - \theta)\,d\theta
\times \left( \chi_i(y) - \chi_i(x^0) \right) \left( \chi_j(y) - \chi_j(x^0) \right),
\end{align*}
and hence (see Fig. 12.12)
\begin{align*}
\tilde u(x^0, y) &= \sum_{i, j=1}^N \int_0^1 \frac{\partial^2 u}{\partial x_i \partial x_j} \left( x^0 + \theta (y - x^0) \right) (1 - \theta)\,d\theta\; z_{ij}(x^0, y) \tag{12.61}\\
&\longrightarrow \hat u(x^0, h) = \frac{1}{2} \sum_{i, j=1}^N \frac{\partial^2 u}{\partial x_i \partial x_j}(x^0)\, z_{ij}
\quad \text{as } \Phi_{x^0}(y) = \left( y, \left( z_{ij}(x^0, y) \right) \right) \to h = \left( x^0, (z_{ij}) \right).
\end{align*}
We define a non-negative measure $\hat p_t(x^0, \cdot)$ on $H_{x^0}$ by the formula
\[
\hat p_t(x^0, \hat E) = \tilde p_t \left( x^0, \Phi_{x^0}^{-1}(\hat E) \right) \quad \text{for all } \hat E \in \mathcal{B}_{H_{x^0}}. \tag{12.62}
\]

Then it follows from inequality (12.60) that we have, for all sufficiently small t > 0,

Fig. 12.12 The compactification $H_{x^0}$ of $\overline{D} \setminus \{x^0\}$
\[
\hat p_t(x^0, H_{x^0}) \le \tilde p_t(x^0, \overline{D}) \le A \left( \sum_{i=1}^N \left( \chi_i - \chi_i(x^0) \right)^2 \right)(x^0) + 1.
\]

Hence, by applying Theorem 5.4 to our situation we obtain that there exists a sequence $\{t_n\}$, $t_n \downarrow 0$, such that the measures $\hat p_{t_n}(x^0, \cdot)$ converge weakly to a finite non-negative Borel measure $\hat p(x^0, \cdot)$ on $H_{x^0}$. Therefore, in view of (12.61) and (12.62), we can pass to the limit in formula (12.59′) to obtain the following formula (12.59″):
\begin{align*}
A u(x^0) &= c(x^0)\, u(x^0) + \sum_{i=1}^N b^i(x^0)\, \frac{\partial u}{\partial x_i}(x^0)
+ \lim_{t \downarrow 0} \int_{\overline{D} \setminus \{x^0\}} \tilde p_t(x^0, dy)\, \tilde u(x^0, y) \tag{12.59″}\\
&= c(x^0)\, u(x^0) + \sum_{i=1}^N b^i(x^0)\, \frac{\partial u}{\partial x_i}(x^0)
+ \lim_{n \to \infty} \int_{H_{x^0}} \hat p_{t_n}(x^0, dh)\, \hat u(x^0, h)\\
&= c(x^0)\, u(x^0) + \sum_{i=1}^N b^i(x^0)\, \frac{\partial u}{\partial x_i}(x^0)
+ \int_{H_{x^0}} \hat p(x^0, dh)\, \hat u(x^0, h).
\end{align*}
To rewrite the last term of formula (12.59″), we define a non-negative Borel measure $\tilde p(x^0, \cdot)$ on $\overline{D} \setminus \{x^0\}$ by the formula
\[
\tilde p(x^0, E) = \hat p \left( x^0, \Phi_{x^0}(E) \right) \quad \text{for all } E \in \mathcal{B}_{\overline{D} \setminus \{x^0\}},
\]
and let
\[
Z : \overline{D} \times \mathcal{M} \ni h = \left( y, (z_{ij}) \right) \longmapsto (z_{ij}) \in \mathcal{M}.
\]

634

12 Markov Processes, Transition Functions and Feller Semigroups

 & p (x, dh)& u (x, h)

(12.63)

Hx 0



 =

Hx 0 \Φx 0 (D\{x 0 })

& p (x, dh)& u (x, h) +

Φx 0 (D\{x 0 })

& p (x, dh)& u (x, h)

N  ∂2u 1 & p (x, dh)Z i j (h) (x 0 ) = 2 i, j=1 Hx 0 \Φx 0 (D\{x 0 }) ∂xi ∂x j  + % p (x 0 , dy)% u (x 0 , y) D\{x 0 }

=

N

a i j (x 0 )

i, j=1

 + where

∂2u (x 0 ) ∂xi ∂x j

  N  ∂u 0  e(x 0 , dy) u(y) − u(x 0 ) − (x ) χi (y) − χi (x 0 ) , ∂xi D i=1 1 a (x ) = 2 ij



0

Hx 0 \Φx 0 (D\{x 0 })

& p (x, dh)Z i j (h),

and e(x 0 , {x 0 }) = 0,  e(x 0 , E) =

1 & p (x , dy) d(x 0 , y) E\{x 0 } 0

 for E ∈ B D .

Therefore, by combining formulas (12.59 ) and (12.63) we obtain the desired expression (12.57) for Au in the interior D of D. Step 3: Properties (12.58) follow from our construction of a i j (x 0 ), bi (x 0 ), c(x 0 ) and e(x 0 , ·). The proof of Theorem 12.55 is now complete.   Remark 12.56 Bony, Courrège and Priouret give a more precise characterization of the infinitesimal generators of Feller semigroups in terms of the maximum principle (cf. Bony–Courrège–Priouret [22, Théorèmes IX and XIV]). Theorem 12.55 asserts that the infinitesimal generator A of a Feller semigroup {Tt }t≥0 on D is written in the interior D of D as the sum W = P + S of a degenerate elliptic differential operator P of second order and an integro-differential operator S:

12.4 Infinitesimal Generator of Feller Semigroups on a Bounded Domain (i) Fig. 12.13 A Markovian particle moves both by jumps and continuously in the state space D

635

D

W u(x) = Pu(x) + Su(x) (12.64) ⎛ ⎞ N N ∂2u ∂u := ⎝ a i j (x) (x) + bi (x) (x) + c(x)u(x)⎠ ∂x ∂x ∂x i j i i, j=1 i=1 " #  N ∂u + e(x, dy) u(y) − u(x) − (x) (χi (y) − χi (x)) . ∂xi D i=1 The differential operator P is called a diffusion operator which describes analytically a strong Markov process with continuous paths in the interior D such as Brownian motion, and the functions a i j (x), bi (x) and c(x) are called the diffusion coefficients, the drift coefficients and the termination coefficient, respectively. The operator S is called a second order Lévy operator which is supposed to correspond to a jump phenomenon in the interior D; a Markovian particle moves by jumps to a random point, chosen with kernel e(x, dy), in the interior D. Therefore, the operator W = P + S, called a Waldenfels integro-differential operator or simply Waldenfels operator, is supposed to correspond to such a physical phenomenon that a Markovian particle moves both by jumps and continuously in the state space D (see Fig. 12.13). Intuitively, the above result may be interpreted as follows: By Theorems 12.37 and 12.6, there correspond to a Feller semigroup {Tt }t≥0 a unique transition function pt and a Markov process X = (xt , F, Ft , Px ) in the following manner:  Tt f (x) =

pt (x, dy) f (y) for f ∈ C(D); D

pt (x, E) = Px {xt ∈ E} for E ∈ B D . In view of Theorem 12.21 and Remark 12.22, it will be true that if the paths of X are continuous, then the transition function pt has local character such as condition (N) of Theorem 12.21; hence the infinitesimal generator A is local, that is, the value Au(x 0 ) at an interior point x 0 is determined by the values of u in an arbitrary small neighborhood of x 0 . However, it is well known (see Peetre’s theorem 7.7) that a

636

12 Markov Processes, Transition Functions and Feller Semigroups

Fig. 12.14 A local coordinate system x = (x  , x N ) = (x1 , x2 , . . . , x N −1 , x N ) on U

∂D

D

U

xN

n x

linear operator is local if and only if it is a differential operator. Therefore, we have an assurance of the following assertion: The infinitesimal generator A of a Feller semigroup {Tt }t≥0 on D is a differential operator in the interior D of D if the paths of its corresponding Markov process X are continuous. In the general case when the paths of X may have discontinuities such as jumps, the infinitesimal generator A takes the form of the sum W of a differential operator P and an integro-differential (non-local) operator S, as proved in Theorem 12.55.

12.5 Infinitesimal Generator of Feller Semigroups on a Bounded Domain (ii) In this section, we derive an explicit formula on the boundary ∂ D of D for the infinitesimal generator A of a Feller semigroup {Tt }t≥0 on the closure D (Theorem 12.57). Let D be a bounded domain in Euclidean space R N , with smooth boundary ∂ D, and choose, for each point x  of ∂ D, a neighborhood U of x  in R N and a local coordinate system   x = x  , x N = (x1 , x2 , . . . , x N −1 , x N ) on U such that (see Fig. 12.14 above) x ∈ U ∩ D ⇐⇒ x ∈ U, x N (x) > 0; x ∈ U ∩ ∂ D ⇐⇒ x ∈ U, x N (x) = 0, and that the functions x  = (x1 , x2 , . . . , x N −1 ) restricted to U ∩ ∂ D, form a local coordinate system of ∂ D on U ∩ ∂ D (see Sect. 10.2). Furthermore, we may assume that the functions

12.5 Infinitesimal Generator of Feller Semigroups on a Bounded Domain (ii)

In this section, we derive an explicit formula on the boundary $\partial D$ of $D$ for the infinitesimal generator $A$ of a Feller semigroup $\{T_t\}_{t \ge 0}$ on the closure $\overline{D}$ (Theorem 12.57).

Let $D$ be a bounded domain in Euclidean space $\mathbf{R}^N$, with smooth boundary $\partial D$, and choose, for each point $x'$ of $\partial D$, a neighborhood $U$ of $x'$ in $\mathbf{R}^N$ and a local coordinate system
\[
x = (x', x_N) = (x_1, x_2, \ldots, x_{N-1}, x_N)
\]
on $U$ such that (see Fig. 12.14 above)
\[
x \in U \cap D \iff x \in U,\ x_N(x) > 0; \qquad
x \in U \cap \partial D \iff x \in U,\ x_N(x) = 0,
\]
and that the functions $x' = (x_1, x_2, \ldots, x_{N-1})$, restricted to $U \cap \partial D$, form a local coordinate system of $\partial D$ on $U \cap \partial D$ (see Sect. 10.2). Furthermore, we may assume that the functions

637

{x1 , x2 , . . . , x N −1 , x N } can be extended respectively to smooth functions {χ1 , χ2 , . . . , χ N −1 , χ N } on R N , so that d(x  , y) = χ N (y) +

N −1  2 χi (y) − χi (x  ) > 0

(12.65)

i=1

if x  ∈ U ∩ ∂ D and y ∈ D \ {x  }. The next theorem, due to Ventcel’ [236], asserts that every C 2 function in the domain D(A) of A must obey a boundary condition at each point of ∂ D ([191, Theorem 9.5.1]): Theorem 12.57 (Ventcel’) Let D be a bounded domain in R N , with smooth boundary ∂ D, and let {Tt }t≥0 be a Feller semigroup on D and A its infinitesimal generator. Then every function u in D(A) ∩ C 2 (D) satisfies, at each point x  of ∂ D, the boundary condition of the form N −1

αi j (x  )

i, j=1

N −1

∂2u ∂u  (x  ) + β i (x  ) (x ) ∂xi ∂x j ∂xi i=1

(12.66)

∂u  + γ(x  )u(x  ) + μ(x  ) (x ) − δ(x  )Au(x  ) ∂x N " #  N −1  ∂u      + ν(x , dy) u(y) − u(x ) − (x ) χi (y) − χi (x ) ∂xi D i=1 = 0. Here: (1) (2) (3) (4) (5)

  The matrix αi j (x  ) is symmetric and positive semi-definite. γ(x  ) ≤ 0. μ(x  ) ≥ 0. δ(x  ) ≥ 0. ν(x  , ·) is a non-negative Borel measure on D such that, for any neighborhood V of x  in R N , ν(x  , D \ V ) < ∞, " #  N −1     2 χi (y) − χi (x ) ν(x , dy) χ N (y) + < ∞. V ∩D

i=1

(12.67a) (12.67b)

638

12 Markov Processes, Transition Functions and Feller Semigroups

Proof The proof is essentially the same as that of Theorem 12.55. The proof is divided into five steps. Step 1: By Theorem 12.37, there corresponds to a Feller semigroup {Tt }t≥0 on D a unique uniformly stochastically continuous, Feller transition function pt on D in the following manner:  Tt f (x) =

pt (x, dy) f (y) for all f ∈ C(D). D

Thus we have the formula  1 Tt u(x  ) − u(x  ) (12.68) t N −1     ∂u  1 1   pt (x , D) − 1 u(x ) + pt (x  , dy) χi (y) − χi (x  ) (x ) = t t i=1 D ∂xi " #  N −1   ∂u 1 + pt (x  , dy) u(y) − u(x  ) − (x  ) χi (y) − χi (x  ) t D ∂x i i=1  N −1 ∂u  1 j = γt (x  )u(x  ) + βt (x  ) (x ) + pt (x  , dy)% u (x  , y)d(x  , y), ∂x t i D i=1 where  1 pt (x  , D) − 1 , t    1 j  βt (x ) = pt (x  , dy) χ j (y) − χ j (x  ) , t D γt (x  ) =

and d(x  , y) = χ N (y) +

N −1  2 χi (y) − χi (x  ) ,

y ∈ D,

i=1

% u (x  , y) =

u(y) − u(x  ) −

$ N −1 ∂u  (x )(χi (y) − χi (x  )) i=1 ∂xi for y ∈ D \ {x  }. d(x  , y)

We rewrite the last term of formula (12.68). To do this, we introduce a non-negative function  1 t (x  ) = pt (x  , dy)d(x  , y), t D and consider two cases. Case A: t (x  ) > 0. In this case we can write

12.5 Infinitesimal Generator of Feller Semigroups on a Bounded Domain (ii)

1 t

 D\{x  }

pt (x  , dy)% u (x  , y)d(x  , y) = t (x  )

where % qt (x  , E) =

1 tt (x  )



 D\{x  }

pt (x  , dy)d(x  , y),

639

% qt (x  , dy)% u (x  , y),

E ∈ BD .

E

Here and in the following B K denotes the σ-algebra of all Borel sets in K . We remark that  1 pt (x  , dy)d(x  , y) % qt (x  , D \ {x  }) = tt (x  ) D\{x  }    D\{x  } pt (x , dy)d(x , y) =  = 1,   D pt (x , dy)d(x , y) since it follows from condition (12.65) that d(x  , x  ) = 0. Case B: t (x  ) = 0. In this case, we have the formula  1  0 = t (x ) = pt (x  , dy)d(x  , y), t D and so, by condition (12.65), pt (x  , D \ {x  }) = 0. Hence we can write   1 pt (x  , dy)% u (x  , y)d(x  , y) = t (x  ) % qt (x  , dy)% u (x  , y) = 0,  t D\{x  } D\{x } where (for example) % qt (x  , ·) = the unit mass at a point of D, so that

% qt (x  , D \ {x  }) = 1.

Summing up, we obtain from Case A and Case B that N −1

j  1 ∂u  Tt u(x  ) − u(x  ) = γt (x  )u(x  ) + βt (x  ) (x ) t ∂x j j=1  + t (x  ) % qt (x  , dy)% u (x  , y). D\{x  }

(12.68 )

640

12 Markov Processes, Transition Functions and Feller Semigroups

Step 2: Now we introduce a compactification of D \ {x  } to which the function % u (x  , ·) may be continuously extended. We let χ N (y) for y ∈ D \ {x  }, d(x  , y) (χi (y) − χi (x  ))(χ j (y) − χ j (x  )) for y ∈ D \ {x  }. z i j (x  , y) := d(x  , y)

w(x  , y) :=

Then it is easy to see that the functions w(x  , ·) and z i j (x  , ·) satisfy the conditions 0 ≤ w(x  , y) ≤ 1, |z i j (x  , y)| ≤ 1, w(x  , y) +

N −1

z ii (x  , y) = 1,

i=1

and the matrix (z i j (x  , ·)) is symmetric and positive semi-definite. We define a compact subspace M of symmetric, positive semi-definite matrices by the formula

M = (z i j )1≤i, j≤N −1 : z i j = z ji , (z i j ) ≥ 0, |z i j | ≤ 1 , and a compact subspace H of D × [0, 1] × M by the formula  H=





y, w, z

 ij

∈ D × [0, 1] × M : w +

N −1

+ z ii = 1 ,

(12.69)

i=1

and consider an injection (see Fig. 12.15 below)      Φx  : D \ {x  }  y −→ y, w x  , y , z i j (x  , y) ∈ H. Then the function % u (x  , Φx  (·)), defined on Φx  (D \ {x  }), can be extended to a continuous function & u (x  , ·) on the closure Hx  = Φx  (D \ {x  }), of Φx  (D \ {x  }) in H . Indeed, by using Taylor’s formula we have, in a neighborhood of x  ,

12.5 Infinitesimal Generator of Feller Semigroups on a Bounded Domain (ii)

641

Hx = Φx (D \ {x })

D \ {x } Φx

h = (x , w, (z ij ))

.

y

x

Φx (y) = (y, w(x , y), (z ij (x , y)))

Fig. 12.15 The compactification Hx  of D \ {x  }

u(y) = u(x  ) + +

N 

N −1  ∂u   ∂u  (x ) χi (y) − χi (x  ) + (x )χ N (y) ∂x ∂x i N i=1

∂2u (x  + θ(y − x  ))(1 − θ) dθ ∂x ∂x i j 0 i, j=1    × χi (y) − χi (x  ) χ j (y) − χ j (x  ) , 1

and hence (see Fig. 12.15) ∂u  (x )w(x  , y) (12.70) ∂x N N  1 ∂2u + (x  + θ(y − x  ))(1 − θ)dθ × z i j (x  , y) ∂x ∂x i j 0 i, j=1

% u (x  , y) =

N ∂u  1 ∂2u (x )w + (x  )z i j ∂x N 2 i, j=1 ∂xi ∂x j         as Φx  (y) = y, w x  , y , z i j (x  , y) → h = x  , w, z i j .

−→ & u (x  , h) =

We define a non-negative measure & qt (x  , ·) on Hx  by the formula & =% & for all E & ∈ BH  . qt (x  , Φx−1 & qt (x  , E)  ( E)) x Then we can write formula (12.68 ) as follows: N −1

j  1 ∂u  Tt u(x  ) − u(x  ) = γt (x  )u(x  ) + βt (x  ) (x ) t ∂x j j=1  & qt (x  , dh)& u (x  , h). + t (x  ) Hx 

We remark that the measure & qt (x  , ·) is a probability measure on Hx  .

(12.68 )


Step 3: We pass to the limit in formula (12.68$''$). To do this, we introduce non-negative functions

$$\theta_m(x') = -\gamma_{1/m}(x') + \sum_{j=1}^{N-1} \bigl|\beta^j_{1/m}(x')\bigr| + \ell_{1/m}(x') \quad\text{for } m = 1, 2, \ldots, \tag{12.71}$$

and consider two cases.

Case I: $\liminf_{m\to\infty} \theta_m(x') = 0$. In this case, there exists a subsequence $\{\theta_{m_k}(x')\}$ of $\{\theta_m(x')\}$ such that

$$\lim_{k\to\infty} \theta_{m_k}(x') = 0.$$

Thus, by passing to the limit in formula (12.68$''$) with $t := 1/m_k$ we obtain that $Au(x') = 0$. Hence we have the desired condition (12.66), by taking

$$\alpha^{ij}(x') = \beta^i(x') = \gamma(x') = \mu(x') = 0, \qquad \delta(x') = 1, \qquad \nu(x', dx) = 0.$$

Case II: $\liminf_{m\to\infty} \theta_m(x') > 0$. In this case, there exist a subsequence $\{\theta_{m_k}(x')\}$ of $\{\theta_m(x')\}$ and a function $\theta(x')$ such that

$$\lim_{k\to\infty} \theta_{m_k}(x') = \theta(x') > 0. \tag{12.72}$$
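To make the limit passage in Case I explicit, the following one-line estimate can be supplied (an expanded sketch of the step above, not taken verbatim from the text; the constant $C(u)$ is introduced here for illustration, and we use that $\gamma_t \le 0$, $\ell_t \ge 0$ and that the transported measures are probability measures):

```latex
% Each boundary coefficient at t = 1/m_k is dominated by \theta_{m_k}(x'), so the
% right-hand side of the boundary expansion with t := 1/m_k is bounded by
\left| \frac{T_{t_k}u(x') - u(x')}{t_k} \right|
  \le \theta_{m_k}(x')\,\Bigl( \|u\|_\infty
      + \sum_{j=1}^{N-1} \Bigl\| \tfrac{\partial u}{\partial x_j} \Bigr\|_\infty
      + \sup_{h \in H_{x'}} |\widehat{u}(x',h)| \Bigr)
  =: \theta_{m_k}(x')\, C(u) \longrightarrow 0 \quad (k \to \infty),
% while the left-hand side tends to Au(x'); hence Au(x') = 0.
```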

Then, by dividing both sides of formula (12.68$''$) with $t := 1/m_k$ by the function $\theta_{m_k}(x')$ we obtain that

$$\delta_k(x')\,\frac{T_{t_k} u(x') - u(x')}{t_k} = \gamma_k(x')\,u(x') + \sum_{j=1}^{N-1} \beta^j_k(x')\,\frac{\partial u}{\partial x_j}(x') + \ell_k(x') \int_{H_{x'}} q_k(x',dh)\,\widehat{u}(x',h), \tag{12.73}$$

where

$$t_k = \frac{1}{m_k}, \qquad \delta_k(x') = \frac{1}{\theta_{m_k}(x')}, \qquad \gamma_k(x') = \frac{\gamma_{t_k}(x')}{\theta_{m_k}(x')}, \qquad \beta^j_k(x') = \frac{\beta^j_{t_k}(x')}{\theta_{m_k}(x')},$$
$$\ell_k(x') = \frac{\ell_{t_k}(x')}{\theta_{m_k}(x')}, \qquad q_k(x',\cdot) = \widehat{q}_{t_k}(x',\cdot).$$


However, we have, by formula (12.72),

$$0 \le \delta_k(x') < \infty,$$

and further, by formula (12.71),

$$0 \le -\gamma_k(x') \le 1, \qquad -1 \le \beta^j_k(x') \le 1, \qquad 0 \le \ell_k(x') \le 1,$$
$$-\gamma_k(x') + \sum_{j=1}^{N-1} \bigl|\beta^j_k(x')\bigr| + \ell_k(x') = 1.$$

We remark that the measures $q_k(x',\cdot)$ are probability measures on $H_{x'}$. Since the metric spaces $[0,+\infty]$, $[0,1]$ and $[-1,1]$ are compact and since the space of probability measures on $H_{x'}$ is also compact (see Theorem 5.36), we can pass to the limit in formula (12.73) to obtain the following formula:

$$\delta(x')\,Au(x') = \gamma(x')\,u(x') + \sum_{j=1}^{N-1} \beta^j(x')\,\frac{\partial u}{\partial x_j}(x') + \ell(x') \int_{H_{x'}} \widehat{q}(x',dh)\,\widehat{u}(x',h). \tag{12.74}$$

Here the functions $\delta(x')$, $\gamma(x')$, $\beta^j(x')$ and $\ell(x')$ satisfy the conditions

$$0 \le \delta(x') < \infty, \qquad 0 \le -\gamma(x') \le 1, \qquad -1 \le \beta^j(x') \le 1, \qquad 0 \le \ell(x') \le 1,$$

and

$$-\gamma(x') + \sum_{j=1}^{N-1} \bigl|\beta^j(x')\bigr| + \ell(x') = 1, \tag{12.75}$$

and the measure $\widehat{q}(x',\cdot)$ is a probability measure on $H_{x'}$.

To rewrite the last term of formula (12.74), we define a non-negative Borel measure $\widetilde{q}(x',\cdot)$ on $\overline{D}\setminus\{x'\}$ by the formula

$$\widetilde{q}(x', E) = \widehat{q}\bigl(x', \Phi_{x'}(E)\bigr) \quad\text{for all } E \in \mathcal{B}_{\overline{D}\setminus\{x'\}},$$

and let


$$W : \overline{D} \times [0,1] \times M \ni h = \bigl(y, w, (z_{ij})\bigr) \longmapsto w \in [0,1],$$
$$Z^{ij} : \overline{D} \times [0,1] \times M \ni h = \bigl(y, w, (z_{ij})\bigr) \longmapsto z_{ij} \in [-1,1].$$

Then, in view of assertion (12.70) it follows that

$$\ell(x') \int_{H_{x'}} \widehat{q}(x',dh)\,\widehat{u}(x',h)$$
$$= \ell(x') \int_{H_{x'}\setminus\Phi_{x'}(\overline{D}\setminus\{x'\})} \widehat{q}(x',dh)\,\widehat{u}(x',h) + \ell(x') \int_{\Phi_{x'}(\overline{D}\setminus\{x'\})} \widehat{q}(x',dh)\,\widehat{u}(x',h)$$
$$= \ell(x') \int_{H_{x'}\setminus\Phi_{x'}(\overline{D}\setminus\{x'\})} \widehat{q}(x',dh)\,W(h)\,\frac{\partial u}{\partial x_N}(x')$$
$$\quad + \frac{1}{2}\sum_{i,j=1}^{N-1} \ell(x') \int_{H_{x'}\setminus\Phi_{x'}(\overline{D}\setminus\{x'\})} \widehat{q}(x',dh)\,Z^{ij}(h)\,\frac{\partial^2 u}{\partial x_i \partial x_j}(x')$$
$$\quad + \ell(x') \int_{\overline{D}\setminus\{x'\}} \widetilde{q}(x',dy)\,\widetilde{u}(x',y)$$
$$= \mu(x')\,\frac{\partial u}{\partial x_N}(x') + \sum_{i,j=1}^{N-1} \alpha^{ij}(x')\,\frac{\partial^2 u}{\partial x_i \partial x_j}(x')$$
$$\quad + \int_{\overline{D}} \nu(x',dy) \Bigl[ u(y) - u(x') - \sum_{i=1}^{N-1} \frac{\partial u}{\partial x_i}(x')\bigl(\chi_i(y)-\chi_i(x')\bigr) \Bigr], \tag{12.76}$$

where

$$\mu(x') = \ell(x') \int_{H_{x'}\setminus\Phi_{x'}(\overline{D}\setminus\{x'\})} \widehat{q}(x',dh)\,W(h), \tag{12.77}$$

$$\alpha^{ij}(x') = \frac{\ell(x')}{2} \int_{H_{x'}\setminus\Phi_{x'}(\overline{D}\setminus\{x'\})} \widehat{q}(x',dh)\,Z^{ij}(h), \tag{12.78}$$

and

$$\nu(x', \{x'\}) = 0, \tag{12.79a}$$

$$\nu(x', E) = \ell(x') \int_{E\setminus\{x'\}} \frac{\widetilde{q}(x',dy)}{d(x',y)} \quad\text{for } E \in \mathcal{B}_{\overline{D}}. \tag{12.79b}$$

Therefore, by combining formulas (12.74) and (12.76) we obtain the desired boundary condition (12.66) in Case II.

Step 4: Properties (12.67a) and (12.67b) follow from our construction of $\alpha^{ij}(x')$, $\beta^i(x')$, $\gamma(x')$, $\mu(x')$, $\delta(x')$ and $\nu(x',\cdot)$.

Step 5: Finally, we show that the boundary condition (12.66) is consistent, that is, condition (12.66) does not take the form $0 = 0$. In Case I, we have taken


$\delta(x') = 1$. In Case II, we assume that

$$\gamma(x') = \beta^i(x') = 0, \qquad \nu(x',\cdot) = 0.$$

Then we have, by equation (12.75),

$$\ell(x') = 1,$$

and hence, by formulas (12.79),

$$\widehat{q}\bigl(x', \Phi_{x'}(\overline{D}\setminus\{x'\})\bigr) = \widetilde{q}\bigl(x', \overline{D}\setminus\{x'\}\bigr) = 0.$$

This implies that

$$\widehat{q}\bigl(x', H_{x'}\setminus\Phi_{x'}(\overline{D}\setminus\{x'\})\bigr) = 1,$$

since the measure $\widehat{q}(x',\cdot)$ is a probability measure on $H_{x'}$. Therefore, in view of definition (12.69) it follows from formulas (12.77) and (12.78) that

$$\mu(x') + 2\sum_{i=1}^{N-1} \alpha^{ii}(x') = \ell(x') \int_{H_{x'}} \widehat{q}(x',dh) \Bigl[ W(h) + \sum_{i=1}^{N-1} Z^{ii}(h) \Bigr]$$
$$= \ell(x')\,\widehat{q}\bigl(x', H_{x'}\setminus\Phi_{x'}(\overline{D}\setminus\{x'\})\bigr) = 1.$$

The proof of Theorem 12.57 is now complete. $\Box$

Remark 12.58 We can reconstruct the functions $\alpha^{ij}(x')$, $\beta^i(x')$, $\gamma(x')$, $\mu(x')$ and $\delta(x')$ so that they are bounded and Borel measurable on the boundary $\partial D$ (cf. Bony–Courrège–Priouret [22, Théorème XIII]).

Probabilistically, Theorems 12.55 and 12.57 may be interpreted as follows: A Markovian particle in a Markov process $\mathcal{X}$ on the state space $\overline{D}$ is governed by an integro-differential (nonlocal) operator $W$ of the form (12.64) in the interior $D$ of $\overline{D}$, and it obeys an integro-differential (nonlocal) boundary condition $L$ of the form (12.66) on the boundary $\partial D$ of $\overline{D}$:

Fig. 12.16 Absorption and reflection phenomena

$$Lu(x') := \sum_{i,j=1}^{N-1} \alpha^{ij}(x')\,\frac{\partial^2 u}{\partial x_i \partial x_j}(x') + \sum_{i=1}^{N-1} \beta^i(x')\,\frac{\partial u}{\partial x_i}(x') + \gamma(x')\,u(x') + \mu(x')\,\frac{\partial u}{\partial x_N}(x') - \delta(x')\,Wu(x')$$
$$\qquad + \int_{\overline{D}} \nu(x',dy) \Bigl[ u(y) - u(x') - \sum_{i=1}^{N-1} \frac{\partial u}{\partial x_i}(x')\bigl(\chi_i(y)-\chi_i(x')\bigr) \Bigr] = 0. \tag{12.80}$$

The boundary condition $L$ is called a second order Ventcel' boundary condition (cf. [236]). It should be emphasized that the six terms of $L$

$$\sum_{i,j=1}^{N-1} \alpha^{ij}(x')\,\frac{\partial^2 u}{\partial x_i \partial x_j}(x') + \sum_{i=1}^{N-1} \beta^i(x')\,\frac{\partial u}{\partial x_i}(x'),$$
$$\gamma(x')\,u(x'), \qquad \mu(x')\,\frac{\partial u}{\partial x_N}(x'), \qquad \delta(x')\,Wu(x'),$$
$$\int_{\partial D} \nu(x',dy') \Bigl[ u(y') - u(x') - \sum_{j=1}^{N-1} \frac{\partial u}{\partial x_j}(x')\bigl(\chi_j(y')-\chi_j(x')\bigr) \Bigr],$$
$$\int_{D} \nu(x',dy) \Bigl[ u(y) - u(x') - \sum_{j=1}^{N-1} \frac{\partial u}{\partial x_j}(x')\bigl(\chi_j(y)-\chi_j(x')\bigr) \Bigr]$$

may be supposed to correspond to a diffusion phenomenon along the boundary (like Brownian motion on $\partial D$), an absorption phenomenon, a reflection phenomenon, a sticking (or viscosity) phenomenon, a jump phenomenon on the boundary and an inward jump phenomenon from the boundary, respectively (see Figs. 12.16, 12.17 and 12.18 below).

Analytically, via a version of the Hille–Yosida theorem (Theorem 12.38), Theorems 12.55 and 12.57 may be interpreted as follows:
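As a quick sanity check (an illustration added here, not taken from the original text), the classical boundary conditions are recovered from the Ventcel' condition by specializing its coefficients:

```latex
% Special cases of the second order Ventcel' boundary condition:
% (absorption / Dirichlet)  \gamma \equiv -1, all other coefficients and \nu zero:
u(x') = 0 \quad\text{on } \partial D;
% (reflection / Neumann)    \mu \equiv 1, all other coefficients and \nu zero:
\frac{\partial u}{\partial x_N}(x') = 0 \quad\text{on } \partial D;
% (sticking / viscosity)    \delta \equiv 1, all other coefficients and \nu zero:
W u(x') = 0 \quad\text{on } \partial D.
```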

Fig. 12.17 Diffusion along $\partial D$ and viscosity phenomenon

Fig. 12.18 Jump phenomenon into the interior and jump phenomenon on the boundary

A Feller semigroup $\{T_t\}_{t\ge 0}$ on $\overline{D}$ is described by an integro-differential (nonlocal) operator $W$ of the form (12.57) and an integro-differential (nonlocal) boundary condition $L$ of the form (12.66). Therefore, we are reduced to the study of boundary value problems for Waldenfels integro-differential operators $W$ with Ventcel' boundary conditions $L$ in the theory of partial differential equations.

12.6 Feller Semigroups and Boundary Value Problems

By virtue of Theorems 12.55 and 12.57, we can reduce the study of Feller semigroups to the study of boundary value problems. In this section we prove general existence theorems for Feller semigroups in terms of boundary value problems in the case when the measures $e(x_0,\cdot)$ in formula (12.57) and the measures $\nu(x',\cdot)$ in formula (12.66) identically vanish in $D$ and on $\partial D$, respectively (see formulas (12.81) and (12.82) below). In other words, we confine ourselves to a class of Feller semigroups whose infinitesimal generators have no integro-differential operator term in formulas (12.57) and (12.66).

We start by formulating our problem precisely. Let $D$ be a bounded domain in $\mathbb{R}^N$ with smooth boundary $\partial D$, and choose, for each point $x_0$ of $\partial D$, a neighborhood $U$ of $x_0$ in $\mathbb{R}^N$ and a local coordinate system

$$x = (x_1, \ldots, x_{N-1}, x_N)$$

on $U$ such that (see Fig. 12.14)

$$x \in U \cap D \iff x \in U,\ x_N(x) > 0; \qquad x \in U \cap \partial D \iff x \in U,\ x_N(x) = 0,$$

and that the functions $(x_1, x_2, \ldots, x_{N-1})$, restricted to $U \cap \partial D$, form a local coordinate system of $\partial D$ on $U \cap \partial D$. We may take

$$x_N(x) = \operatorname{dist}(x, \partial D) \quad\text{for } x \in \mathbb{R}^N.$$

Then we have the formula

$$\operatorname{grad} x_N(x') = \text{the unit interior normal } \mathbf{n} \text{ to } \partial D \text{ at } x',$$

and so

$$\frac{\partial}{\partial x_N} = \frac{\partial}{\partial \mathbf{n}}.$$

Let $A$ be a second order strictly elliptic differential operator with real coefficients such that

$$Au(x) = \sum_{i,j=1}^{N} a^{ij}(x)\,\frac{\partial^2 u}{\partial x_i \partial x_j}(x) + \sum_{i=1}^{N} b^i(x)\,\frac{\partial u}{\partial x_i}(x) + c(x)\,u(x), \tag{12.81}$$

where:

(1) $a^{ij} \in C^\infty(\mathbb{R}^N)$, $a^{ij}(x) = a^{ji}(x)$ for all $x \in \mathbb{R}^N$ and all $1 \le i, j \le N$, and there exists a constant $a_0 > 0$ such that

$$\sum_{i,j=1}^{N} a^{ij}(x)\,\xi_i \xi_j \ge a_0 |\xi|^2 \quad\text{for all } (x,\xi) \in T^*\mathbb{R}^N = \mathbb{R}^N \times \mathbb{R}^N. \tag{12.82}$$

Here $T^*\mathbb{R}^N$ is the cotangent bundle of $\mathbb{R}^N$.
(2) $b^i \in C^\infty(\mathbb{R}^N)$ for $1 \le i \le N$.
(3) $c \in C^\infty(\mathbb{R}^N)$ and $c(x) \le 0$ on $\overline{D}$.

The functions $a^{ij}(x)$, $b^i(x)$ and $c(x)$ are called the diffusion coefficients, the drift coefficients and the termination coefficient, respectively.

Let $L$ be a boundary condition such that

$$Lu(x') = \sum_{i,j=1}^{N-1} \alpha^{ij}(x')\,\frac{\partial^2 u}{\partial x_i \partial x_j}(x') + \sum_{i=1}^{N-1} \beta^i(x')\,\frac{\partial u}{\partial x_i}(x') + \gamma(x')\,u(x') + \mu(x')\,\frac{\partial u}{\partial \mathbf{n}}(x') - \delta(x')\,Au(x'), \tag{12.83}$$

Fig. 12.19 The drift vector field $\beta$ on $\partial D$ and the unit inward normal $\mathbf{n}$ to $\partial D$

where:

(1) The $\alpha^{ij}$ are the components of a $C^\infty$ symmetric contravariant tensor of type $\binom{2}{0}$ on $\partial D$, and satisfy the degenerate elliptic condition

$$\sum_{i,j=1}^{N-1} \alpha^{ij}(x')\,\eta_i \eta_j \ge 0 \quad\text{for all } x' \in \partial D \text{ and all } \eta = \sum_{j=1}^{N-1} \eta_j\,dx_j \in T^*_{x'}(\partial D), \tag{12.84}$$

where $T^*_{x'}(\partial D)$ is the cotangent space of $\partial D$ at $x'$.
(2) $\beta^i \in C^\infty(\partial D)$ for $1 \le i \le N - 1$.
(3) $\gamma \in C^\infty(\partial D)$ and $\gamma(x') \le 0$ on $\partial D$.
(4) $\mu \in C^\infty(\partial D)$ and $\mu(x') \ge 0$ on $\partial D$.
(5) $\delta \in C^\infty(\partial D)$ and $\delta(x') \ge 0$ on $\partial D$.
(6) $\mathbf{n}$ is the unit inward normal to $\partial D$ (see Fig. 12.19).
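For instance (an illustrative choice of coefficients, not taken from the text), the flat coefficients $\alpha^{ij} = \delta^{ij}$ on $\partial D$ satisfy the degenerate elliptic condition (12.84), with the quadratic form equal to the full squared norm:

```latex
\sum_{i,j=1}^{N-1} \delta^{ij}\,\eta_i\eta_j = \sum_{j=1}^{N-1} \eta_j^2 = |\eta|^2 \ge 0
\quad\text{for all } \eta = \sum_{j=1}^{N-1} \eta_j\,dx_j \in T^*_{x'}(\partial D),
% and the corresponding second order part of L is the flat Laplacian
% \sum_{j=1}^{N-1} \partial^2 u/\partial x_j^2 in the boundary coordinates.
```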

The condition $L$ will be called a Ventcel' boundary condition. Its four terms

$$\sum_{i,j=1}^{N-1} \alpha^{ij}(x')\,\frac{\partial^2 u}{\partial x_i \partial x_j}(x') + \sum_{i=1}^{N-1} \beta^i(x')\,\frac{\partial u}{\partial x_i}(x'),$$
$$\gamma(x')\,u(x'), \qquad \mu(x')\,\frac{\partial u}{\partial \mathbf{n}}(x'), \qquad \delta(x')\,Au(x'),$$

are supposed to correspond to a diffusion along the boundary, an absorption phenomenon, a reflection phenomenon and a viscosity phenomenon, respectively (see Figs. 12.16 and 12.17).

We are interested in the following problem:

Problem 12.59 Given analytic data $(A, L)$, can we construct a Feller semigroup $\{T_t\}_{t\ge 0}$ on $\overline{D}$ whose infinitesimal generator $A$ is characterized by $(A, L)$?

Remark 12.60 In the case $N = 1$, this problem has been solved comprehensively both from probabilistic and analytic viewpoints by Feller [58–60], Dynkin [45, 46], Itô–McKean Jr. [95] and Ray [147]. So we consider the case $N \ge 2$.

In this section, we prove general existence theorems for Feller semigroups on $\partial D$ and then on $\overline{D}$ (Theorems 12.74 and 12.81) if the Ventcel' boundary value problem

$$\begin{cases} (\alpha - A)\,u = 0 & \text{in } D, \\ (\lambda - L)\,u = \varphi & \text{on } \partial D, \end{cases} \tag{$*$}_{\alpha,\lambda}$$

is solvable for sufficiently many functions $\varphi$ in $C(\partial D)$. Here $\alpha$ and $\lambda$ are positive constants.

First, we consider the following Dirichlet problem: For given functions $f$ and $\varphi$ defined in $D$ and on $\partial D$ respectively, find a function $u$ in $D$ such that

$$\begin{cases} (\alpha - A)\,u = f & \text{in } D, \\ u = \varphi & \text{on } \partial D. \end{cases} \tag{D}_\alpha$$

Theorem 11.1 tells us that the Dirichlet problem (D)$_\alpha$ has a unique solution $u$ in $C^{2+\theta}(\overline{D})$ for any $f \in C^\theta(\overline{D})$ and any $\varphi \in C^{2+\theta}(\partial D)$. Therefore, we can introduce linear operators

$$G^0_\alpha : C^\theta(\overline{D}) \longrightarrow C^{2+\theta}(\overline{D})$$

and

$$H_\alpha : C^{2+\theta}(\partial D) \longrightarrow C^{2+\theta}(\overline{D})$$

as follows:

(1) For any $f \in C^\theta(\overline{D})$, the function $G^0_\alpha f \in C^{2+\theta}(\overline{D})$ is a unique solution of the problem

$$\begin{cases} (\alpha - A)\,G^0_\alpha f = f & \text{in } D, \\ G^0_\alpha f = 0 & \text{on } \partial D. \end{cases} \tag{12.85}$$

(2) For any $\varphi \in C^{2+\theta}(\partial D)$, the function $H_\alpha \varphi \in C^{2+\theta}(\overline{D})$ is a unique solution of the problem

$$\begin{cases} (\alpha - A)\,H_\alpha \varphi = 0 & \text{in } D, \\ H_\alpha \varphi = \varphi & \text{on } \partial D. \end{cases} \tag{12.86}$$

The operator $G^0_\alpha$ is called the Green operator of the Dirichlet problem (12.85) and the operator $H_\alpha$ is called the harmonic operator of the Dirichlet problem (12.86), respectively. It should be noticed that the harmonic operator $H_\alpha$ is essentially the same as the Poisson operator $P$ of the Dirichlet problem (11.21) with $A$ replaced by $A - \alpha$.

Then we have the following lemma:

Lemma 12.61 The operator $G^0_\alpha$ ($\alpha > 0$), considered from $C(\overline{D})$ into itself, is non-negative and continuous with norm


$$\|G^0_\alpha\| = \|G^0_\alpha 1\| = \sup_{x \in \overline{D}} G^0_\alpha 1(x).$$

Proof Let $f$ be an arbitrary function in $C^\theta(\overline{D})$ such that $f \ge 0$ on $\overline{D}$. Then, by applying Theorem 10.6 (the weak maximum principle) with $A := A - \alpha$ to the function $-G^0_\alpha f$ we obtain from formula (12.85) that

$$G^0_\alpha f \ge 0 \quad\text{on } \overline{D}.$$

This proves the non-negativity of $G^0_\alpha$.

Since $G^0_\alpha$ is non-negative, we have, for all $f \in C^\theta(\overline{D})$,

$$-G^0_\alpha \|f\| \le G^0_\alpha f \le G^0_\alpha \|f\| \quad\text{on } \overline{D}.$$

This implies the continuity of $G^0_\alpha$ with norm $\|G^0_\alpha\| = \|G^0_\alpha 1\|$.

The proof of Lemma 12.61 is complete. $\Box$

Similarly, we obtain from formula (12.86) the following lemma:

Lemma 12.62 The operator $H_\alpha$, $\alpha > 0$, considered from $C(\partial D)$ into $C(\overline{D})$, is non-negative and continuous with norm

$$\|H_\alpha\| = \|H_\alpha 1\| = \sup_{x \in \overline{D}} H_\alpha 1(x).$$

More precisely, we have the following theorem:

Theorem 12.63 (i) (a) The operator $G^0_\alpha$, $\alpha > 0$, can be uniquely extended to a non-negative, bounded linear operator on $C(\overline{D})$ into itself, denoted again $G^0_\alpha$, with norm

$$\|G^0_\alpha\| = \|G^0_\alpha 1\| \le \frac{1}{\alpha}. \tag{12.87}$$

(b) For all $f \in C(\overline{D})$, we have the assertion

$$G^0_\alpha f\big|_{\partial D} = 0 \quad\text{on } \partial D.$$

(c) For all $\alpha$ and $\beta > 0$, the resolvent equation holds true:

$$G^0_\alpha f - G^0_\beta f + (\alpha - \beta)\, G^0_\alpha G^0_\beta f = 0 \quad\text{for every } f \in C(\overline{D}). \tag{12.88}$$

(d) For any $f \in C(\overline{D})$, we have the assertion

$$\lim_{\alpha \to +\infty} \alpha G^0_\alpha f(x) = f(x) \quad\text{for each } x \in D. \tag{12.89}$$

Furthermore, if $f|_{\partial D} = 0$, then this convergence is uniform in $x \in \overline{D}$, that is,

$$\lim_{\alpha \to +\infty} \alpha G^0_\alpha f = f \quad\text{in } C(\overline{D}). \tag{12.89$'$}$$

(e) The operator $G^0_\alpha$ maps $C^{k+\theta}(\overline{D})$ into $C^{k+2+\theta}(\overline{D})$ for any non-negative integer $k$.

(ii) (a$'$) The operator $H_\alpha$, $\alpha > 0$, can be uniquely extended to a non-negative, bounded linear operator on $C(\partial D)$ into $C(\overline{D})$, denoted again $H_\alpha$, with norm $\|H_\alpha\| = 1$.

(b$'$) For all $\varphi \in C(\partial D)$, we have the assertion

$$H_\alpha \varphi\big|_{\partial D} = \varphi \quad\text{on } \partial D.$$

(c$'$) For all $\alpha$ and $\beta > 0$, we have the assertion

$$H_\alpha \varphi - H_\beta \varphi + (\alpha - \beta)\, G^0_\alpha H_\beta \varphi = 0 \quad\text{for all } \varphi \in C(\partial D). \tag{12.90}$$

(d$'$) The operator $H_\alpha$ maps $C^{k+2+\theta}(\partial D)$ into $C^{k+2+\theta}(\overline{D})$ for any non-negative integer $k$.

Proof (i) (a) Making use of mollifiers (cf. Sect. 4.2), we find that the space $C^\theta(\overline{D})$ is dense in $C(\overline{D})$ and further that non-negative functions can be approximated by non-negative $C^\infty$ functions. Hence, by Lemma 12.61 it follows that the operator $G^0_\alpha : C^\theta(\overline{D}) \to C^{2+\theta}(\overline{D})$ can be uniquely extended to a non-negative, bounded linear operator

$$G^0_\alpha : C(\overline{D}) \longrightarrow C(\overline{D})$$

with norm $\|G^0_\alpha\| = \|G^0_\alpha 1\|$. Furthermore, since the function $G^0_\alpha 1$ satisfies the conditions

$$\begin{cases} (A - \alpha)\,G^0_\alpha 1 = -1 & \text{in } D, \\ G^0_\alpha 1 = 0 & \text{on } \partial D, \end{cases}$$

by applying Theorem 10.7 with $A := A - \alpha$ and $\alpha > 0$, we obtain that

$$\|G^0_\alpha\| = \|G^0_\alpha 1\| \le \frac{1}{\alpha}.$$

(b) This follows from formula (12.85), since the space $C^\theta(\overline{D})$ is dense in $C(\overline{D})$ and the operator $G^0_\alpha : C(\overline{D}) \to C(\overline{D})$ is bounded.

(c) We find from the uniqueness property of solutions of the Dirichlet problem (D)$_\alpha$ that equation (12.88) holds true for all $f \in C^\theta(\overline{D})$. Hence, it holds true for all $f \in C(\overline{D})$, since the space $C^\theta(\overline{D})$ is dense in $C(\overline{D})$ and since the operators $G^0_\alpha$ are bounded.
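The uniqueness argument behind the resolvent equation can be written out in one computation (an expanded sketch of the step above, not from the original text): for $f \in C^\theta(\overline{D})$, put $u := G^0_\beta f - (\alpha-\beta)\,G^0_\alpha G^0_\beta f$; then

```latex
(\alpha - A)\,u
  = \underbrace{(\beta - A)\,G^0_\beta f}_{=\,f} + (\alpha-\beta)\,G^0_\beta f
    - (\alpha-\beta)\,\underbrace{(\alpha - A)\,G^0_\alpha G^0_\beta f}_{=\,G^0_\beta f}
  = f \quad\text{in } D,
\qquad u = 0 \quad\text{on } \partial D,
% so by the uniqueness of solutions of the Dirichlet problem (D)_\alpha,
% u = G^0_\alpha f, which is exactly the resolvent equation (12.88).
```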


(d) First, let $f$ be an arbitrary function in $C^{2+\theta}(\overline{D})$ satisfying the condition

$$f|_{\partial D} = 0. \tag{12.91}$$

Then it follows from the uniqueness property of solutions of the Dirichlet problem (D)$_\alpha$ that we have, for all $\alpha$ and $\beta > 0$,

$$f - \alpha G^0_\alpha f = G^0_\alpha\bigl((\beta - A)f\bigr) - \beta G^0_\alpha f.$$

Thus we have, by estimate (12.87),

$$\|f - \alpha G^0_\alpha f\| \le \frac{1}{\alpha}\,\|(\beta - A)f\| + \frac{\beta}{\alpha}\,\|f\|,$$

and so

$$\lim_{\alpha \to +\infty} \|f - \alpha G^0_\alpha f\| = 0.$$

Now let $f$ be an arbitrary function in $C(\overline{D})$ satisfying the boundary condition (12.91). By means of mollifiers, we can find a sequence $\{f_j\}$ in $C^{2+\theta}(\overline{D})$ such that

$$f_j \longrightarrow f \quad\text{in } C(\overline{D}) \text{ as } j \to \infty, \qquad f_j|_{\partial D} = 0 \quad\text{on } \partial D.$$

Then we have, by estimate (12.87),

$$\|f - \alpha G^0_\alpha f\| \le \|f - f_j\| + \|f_j - \alpha G^0_\alpha f_j\| + \|\alpha G^0_\alpha f_j - \alpha G^0_\alpha f\| \le 2\|f - f_j\| + \|f_j - \alpha G^0_\alpha f_j\|,$$

and so

$$\limsup_{\alpha \to +\infty} \|f - \alpha G^0_\alpha f\| \le 2\|f - f_j\|.$$

This proves the desired assertion (12.89$'$), since $\|f - f_j\| \to 0$ as $j \to \infty$.

To prove assertion (12.89), let $f$ be an arbitrary function in $C(\overline{D})$ and $x$ an arbitrary point of $D$. Take a function $\psi \in C(\overline{D})$ such that

$$\begin{cases} 0 \le \psi \le 1 & \text{on } \overline{D}, \\ \psi = 0 & \text{in a neighborhood of } x, \\ \psi = 1 & \text{near the boundary } \partial D. \end{cases}$$

Then it follows from the non-negativity of $G^0_\alpha$ and estimate (12.87) that

$$0 \le \alpha G^0_\alpha \psi(x) + \alpha G^0_\alpha (1 - \psi)(x) = \alpha G^0_\alpha 1(x) \le 1. \tag{12.92}$$

However, by applying assertion (12.89$'$) to the function $1 - \psi$ we have the assertion

$$\lim_{\alpha \to +\infty} \alpha G^0_\alpha (1 - \psi)(x) = (1 - \psi)(x) = 1.$$

In view of inequalities (12.92), this implies that

$$\lim_{\alpha \to +\infty} \alpha G^0_\alpha \psi(x) = 0.$$

Thus, since we have the estimates $-\|f\|\,\psi \le f\psi \le \|f\|\,\psi$ on $\overline{D}$, it follows that

$$\bigl|\alpha G^0_\alpha (f\psi)(x)\bigr| \le \|f\|\,\alpha G^0_\alpha \psi(x) \longrightarrow 0 \quad\text{as } \alpha \to +\infty.$$

Therefore, by applying assertion (12.89$'$) to the function $(1 - \psi)f$ we obtain that

$$f(x) = \bigl((1 - \psi)f\bigr)(x) = \lim_{\alpha \to +\infty} \alpha G^0_\alpha \bigl((1 - \psi)f\bigr)(x) = \lim_{\alpha \to +\infty} \alpha G^0_\alpha f(x) \quad\text{for each } x \in D.$$

(e) This is an immediate consequence of part (iii) of Theorem 11.1.

(ii) (a$'$) Since the space $C^{2+\theta}(\partial D)$ is dense in $C(\partial D)$, by Lemma 12.62 it follows that the operator $H_\alpha : C^{2+\theta}(\partial D) \to C^{2+\theta}(\overline{D})$ can be uniquely extended to a non-negative, bounded linear operator

$$H_\alpha : C(\partial D) \longrightarrow C(\overline{D}).$$

Furthermore, by applying Theorem 10.7 with $A := A - \alpha$ (cf. Remark 10.8) we have the assertion

$$\|H_\alpha\| = \|H_\alpha 1\| = 1.$$

(b$'$) This follows from formula (12.86), since the space $C^{2+\theta}(\partial D)$ is dense in $C(\partial D)$ and since the operator $H_\alpha : C(\partial D) \to C(\overline{D})$ is bounded.

(c$'$) We find from the uniqueness property of solutions of the Dirichlet problem (D)$_\alpha$ that formula (12.90) holds true for all $\varphi \in C^{2+\theta}(\partial D)$. Hence, it holds true for all $\varphi \in C(\partial D)$, since the space $C^{2+\theta}(\partial D)$ is dense in $C(\partial D)$ and since the operators $G^0_\alpha$ and $H_\alpha$ are bounded.

(d$'$) This is an immediate consequence of part (iii) of Theorem 11.1.

The proof of Theorem 12.63 is now complete. $\Box$


Fig. 12.20 The mapping properties of the Green operator $G^0_\alpha$ of the Dirichlet problem (12.85): $G^0_\alpha : C(\overline{D}) \to C(\overline{D})$, restricting to $D(G^0_\alpha) = C^\theta(\overline{D}) \to C^{2+\theta}(\overline{D})$

Fig. 12.21 The mapping properties of the harmonic operator $H_\alpha$ of the Dirichlet problem (12.86): $H_\alpha : C(\partial D) \to C(\overline{D})$, restricting to $D(H_\alpha) = C^{2+\theta}(\partial D) \to C^{2+\theta}(\overline{D})$

Fig. 12.22 The operators $A$ and $\overline{A}$: $\overline{A} : D(\overline{A}) \subset C(\overline{D}) \to C(\overline{D})$, restricting to $D(A) = C^2(\overline{D}) \to C(\overline{D})$

Summing up, we have Figs. 12.20 and 12.21 for the mapping properties of the operators $G^0_\alpha$ and $H_\alpha$, respectively.

Next we consider the Ventcel' boundary value problem $(*)_{\alpha,\lambda}$ in the framework of the spaces of continuous functions. To do this, we introduce three operators associated with problem $(*)_{\alpha,\lambda}$.

(I) First, we introduce a linear operator

$$A : C(\overline{D}) \longrightarrow C(\overline{D})$$

as follows:

(1) The domain $D(A)$ of $A$ is the space $C^2(\overline{D})$.
(2) $Au = \sum_{i,j=1}^{N} a^{ij}(x)\,\dfrac{\partial^2 u}{\partial x_i \partial x_j} + \sum_{i=1}^{N} b^i(x)\,\dfrac{\partial u}{\partial x_i} + c(x)\,u$ for all $u \in D(A)$.

Then we have the following lemma:

Lemma 12.64 The operator $A$ has its minimal closed extension $\overline{A}$ in $C(\overline{D})$. The operators $A$ and $\overline{A}$ can be visualized as in Fig. 12.22.

Proof We apply part (i) of Theorem 12.53 to the operator $A$.


Assume that a function $u \in C^2(\overline{D})$ takes a positive maximum at a point $x_0$ of $D$. Since the matrix $(a^{ij}(x))$ is positive semi-definite, it follows that

$$\sum_{i,j=1}^{N} a^{ij}(x_0)\,\frac{\partial^2 u}{\partial x_i \partial x_j}(x_0) \le 0, \qquad \frac{\partial u}{\partial x_i}(x_0) = 0 \quad\text{for } 1 \le i \le N,$$

so that

$$Au(x_0) = \sum_{i,j=1}^{N} a^{ij}(x_0)\,\frac{\partial^2 u}{\partial x_i \partial x_j}(x_0) + c(x_0)\,u(x_0) \le 0.$$

This implies that the operator $A$ satisfies condition ($\beta$) of Theorem 12.53 with $K_0 := D$ and $K := \overline{D}$. Therefore, Lemma 12.64 follows from an application of the same theorem. $\Box$

Remark 12.65 Since the injection $C(\overline{D}) \to \mathcal{D}'(D)$ is continuous, we have the formula

$$\overline{A}u = \sum_{i,j=1}^{N} a^{ij}(x)\,\frac{\partial^2 u}{\partial x_i \partial x_j} + \sum_{i=1}^{N} b^i(x)\,\frac{\partial u}{\partial x_i} + c(x)\,u \quad\text{for } u \in D(\overline{A}),$$

where the right-hand side is taken in the sense of distributions.

The extended operators $G^0_\alpha : C(\overline{D}) \to C(\overline{D})$ and $H_\alpha : C(\partial D) \to C(\overline{D})$ still satisfy formulas (12.85) and (12.86) respectively in the following sense:

Lemma 12.66 (i) For any $f \in C(\overline{D})$, we have the assertions

$$G^0_\alpha f \in D(\overline{A}), \qquad \bigl(\alpha I - \overline{A}\bigr)\,G^0_\alpha f = f \quad\text{in } D.$$

(ii) For any $\varphi \in C(\partial D)$, we have the assertions

$$H_\alpha \varphi \in D(\overline{A}), \qquad \bigl(\alpha I - \overline{A}\bigr)\,H_\alpha \varphi = 0 \quad\text{in } D.$$


Here $D(\overline{A})$ is the domain of $\overline{A}$.

Proof (i) For any $f \in C(\overline{D})$, we choose a sequence $\{f_j\}$ in $C^\theta(\overline{D})$ such that $f_j \to f$ in $C(\overline{D})$ as $j \to \infty$. Then it follows from the boundedness of $G^0_\alpha$ that

$$G^0_\alpha f_j \longrightarrow G^0_\alpha f \quad\text{in } C(\overline{D}),$$

and also

$$(\alpha - A)\,G^0_\alpha f_j = f_j \longrightarrow f \quad\text{in } C(\overline{D}).$$

Hence, we have the assertions

$$G^0_\alpha f \in D(\overline{A}), \qquad \bigl(\alpha I - \overline{A}\bigr)\,G^0_\alpha f = f \quad\text{in } D,$$

since the operator $\overline{A} : C(\overline{D}) \to C(\overline{D})$ is closed.

(ii) Similarly, part (ii) is proved, since the space $C^{2+\theta}(\partial D)$ is dense in $C(\partial D)$ and the operator $H_\alpha : C(\partial D) \to C(\overline{D})$ is bounded.

The proof of Lemma 12.66 is complete. $\Box$

Corollary 12.67 Every function $u$ in $D(\overline{A})$ can be written in the following form:

$$u = G^0_\alpha\bigl(\bigl(\alpha I - \overline{A}\bigr)u\bigr) + H_\alpha\bigl(u|_{\partial D}\bigr) \quad\text{for } \alpha > 0. \tag{12.93}$$

Proof We let

$$w = u - G^0_\alpha\bigl(\bigl(\alpha I - \overline{A}\bigr)u\bigr) - H_\alpha\bigl(u|_{\partial D}\bigr).$$

Then it follows from Lemma 12.66 that the function $w$ is in $D(\overline{A})$ and satisfies the conditions

$$\bigl(\alpha I - \overline{A}\bigr)w = 0 \quad\text{in } D, \qquad w|_{\partial D} = 0 \quad\text{on } \partial D.$$

Thus, in view of Remark 12.65 we can apply Theorem 11.4 with $A := A - \alpha$ and $s := 0$ to obtain that $w = 0$. This proves the desired formula (12.93).

The proof of Corollary 12.67 is complete. $\Box$

(II) Secondly, we introduce a linear operator

$$LG^0_\alpha : C(\overline{D}) \longrightarrow C(\partial D)$$


as follows:

(1) The domain $D(LG^0_\alpha)$ of $LG^0_\alpha$ is the space

$$D\bigl(LG^0_\alpha\bigr) = \bigl\{ f \in C(\overline{D}) : G^0_\alpha f \in C^2(\overline{D}) \bigr\}.$$

(2) $LG^0_\alpha f = L\bigl(G^0_\alpha f\bigr)$ for every $f \in D\bigl(LG^0_\alpha\bigr)$.

We remark that the domain $D\bigl(LG^0_\alpha\bigr)$ contains the space $C^\theta(\overline{D})$. Then we have the following lemma:

Lemma 12.68 The operator $LG^0_\alpha$, $\alpha > 0$, can be uniquely extended to a non-negative, bounded linear operator $\overline{LG^0_\alpha} : C(\overline{D}) \to C(\partial D)$.

Proof Let $f(x)$ be an arbitrary function in $D(LG^0_\alpha)$ such that $f \ge 0$ on $\overline{D}$. Then we have the assertions

$$G^0_\alpha f \in C^2(\overline{D}), \qquad G^0_\alpha f \ge 0 \quad\text{on } \overline{D}, \qquad G^0_\alpha f|_{\partial D} = 0 \quad\text{on } \partial D,$$

and so

$$LG^0_\alpha f = \mu\,\frac{\partial}{\partial \mathbf{n}}\bigl(G^0_\alpha f\bigr) - \delta\,A G^0_\alpha f = \mu\,\frac{\partial}{\partial \mathbf{n}}\bigl(G^0_\alpha f\bigr) + \delta f \ge 0 \quad\text{on } \partial D.$$

This proves that the operator $LG^0_\alpha$ is non-negative.

By the non-negativity of $LG^0_\alpha$, we have, for all $f \in D(LG^0_\alpha)$,

$$-LG^0_\alpha \|f\| \le LG^0_\alpha f \le LG^0_\alpha \|f\| \quad\text{on } \partial D.$$

This implies the boundedness of $LG^0_\alpha$ with norm $\|LG^0_\alpha\| = \|LG^0_\alpha 1\|$.

Recall that the space $C^\theta(\overline{D})$ is dense in $C(\overline{D})$ and that non-negative functions can be approximated by non-negative $C^\infty$ functions. Hence, we find that the operator $LG^0_\alpha$ can be uniquely extended to a non-negative, bounded linear operator $\overline{LG^0_\alpha} : C(\overline{D}) \to C(\partial D)$.

The proof of Lemma 12.68 is complete. $\Box$

The operators $LG^0_\alpha$ and $\overline{LG^0_\alpha}$ can be visualized as in Fig. 12.23 below. The next lemma states a fundamental relationship between the operators $\overline{LG^0_\alpha}$ and $\overline{LG^0_\beta}$ for $\alpha$ and $\beta > 0$:
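The key simplification in the proof of Lemma 12.68 — that only the $\mu$ and $\delta$ terms of $L$ survive when $L$ is applied to $G^0_\alpha f$ — can be expanded as follows (a sketch under the notation of (12.83) and (12.85), not from the original text):

```latex
% Since G^0_\alpha f = 0 on \partial D, all tangential derivatives of G^0_\alpha f
% along \partial D vanish, so the \alpha^{ij}, \beta^i and \gamma terms of L drop out:
L\bigl(G^0_\alpha f\bigr)
  = \mu\,\frac{\partial}{\partial \mathbf{n}}\bigl(G^0_\alpha f\bigr) - \delta\,A G^0_\alpha f
  = \mu\,\frac{\partial}{\partial \mathbf{n}}\bigl(G^0_\alpha f\bigr)
    - \delta\,\bigl(\alpha G^0_\alpha f - f\bigr)
  = \mu\,\frac{\partial}{\partial \mathbf{n}}\bigl(G^0_\alpha f\bigr) + \delta f
  \quad\text{on } \partial D,
% where we used (\alpha - A)G^0_\alpha f = f in D and G^0_\alpha f|_{\partial D} = 0.
```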

Fig. 12.23 The operators $LG^0_\alpha$ and $\overline{LG^0_\alpha}$: $\overline{LG^0_\alpha} : C(\overline{D}) \to C(\partial D)$, restricting to $D(LG^0_\alpha) \supset C^\theta(\overline{D}) \to C^{1+\theta}(\partial D)$

Lemma 12.69 For any $f \in C(\overline{D})$, we have the formula

$$\overline{LG^0_\alpha}\,f - \overline{LG^0_\beta}\,f + (\alpha - \beta)\,\overline{LG^0_\alpha}\,G^0_\beta f = 0 \quad\text{for } \alpha \text{ and } \beta > 0. \tag{12.94}$$

Proof For any $f \in C(\overline{D})$, we choose a sequence $\{f_j\}$ in $C^\theta(\overline{D})$ such that $f_j \to f$ in $C(\overline{D})$ as $j \to \infty$. Then, by using the resolvent equation (12.88) with $f := f_j$ we have the assertion

$$LG^0_\alpha f_j - LG^0_\beta f_j + (\alpha - \beta)\,LG^0_\alpha G^0_\beta f_j = 0.$$

Hence, the desired formula (12.94) follows by letting $j \to \infty$, since the operators $\overline{LG^0_\alpha}$, $\overline{LG^0_\beta}$ and $G^0_\beta$ are all bounded.

The proof of Lemma 12.69 is complete. $\Box$

(III) Finally, we introduce a linear operator

$$LH_\alpha : C(\partial D) \longrightarrow C(\partial D)$$

as follows:

(1) The domain $D(LH_\alpha)$ of $LH_\alpha$ is the space $C^{2+\theta}(\partial D)$.
(2) $LH_\alpha \psi = L(H_\alpha \psi)$ for every $\psi \in D(LH_\alpha)$.

It should be noticed that the operator $LH_\alpha$ is essentially the same as the operator $T = B_\gamma P$ defined by formula (11.34) with $B_\gamma$ replaced by $L$ and $P$ replaced by $H_\alpha$, respectively.

Then we have the following lemma:

Lemma 12.70 The operator $LH_\alpha$, $\alpha > 0$, has its minimal closed extension $\overline{LH_\alpha}$ in $C(\partial D)$.

Proof We apply part (i) of Theorem 12.53 to the operator $LH_\alpha$. To do this, it suffices to show that the operator $LH_\alpha$ satisfies condition ($\beta'$) with $K := \partial D$ or condition ($\beta$) with $K := K_0 = \partial D$ of the same theorem.

Assume that a function $\psi(x')$ in the domain

$$D(LH_\alpha) = C^{2+\theta}(\partial D)$$

Fig. 12.24 The operators $LH_\alpha$ and $\overline{LH_\alpha}$: $\overline{LH_\alpha} : C(\partial D) \to C(\partial D)$, restricting to $D(LH_\alpha) = C^{2+\theta}(\partial D) \to C^\theta(\partial D)$

takes its positive maximum at some point $x_0$ of $\partial D$. Since the function $H_\alpha \psi$ is in $C^{2+\theta}(\overline{D})$ and satisfies the conditions

$$\begin{cases} (A - \alpha)\,H_\alpha \psi = 0 & \text{in } D, \\ H_\alpha \psi|_{\partial D} = \psi & \text{on } \partial D, \end{cases}$$

by applying Theorem 10.6 (the weak maximum principle) with $A := A - \alpha$ to the function $H_\alpha \psi$, we find that the function $H_\alpha \psi(x')$ takes its positive maximum at $x_0 \in \partial D$. Thus we can apply Hopf's boundary point lemma (Lemma 10.12) with $\Sigma_3 := \partial D$ to obtain that

$$\frac{\partial}{\partial \mathbf{n}}\bigl(H_\alpha \psi\bigr)(x_0) < 0. \tag{12.95}$$

Hence we have, by hypotheses (12.84) and inequality (12.95),

$$L(H_\alpha \psi)(x_0) = \sum_{i,j=1}^{N-1} \alpha^{ij}(x_0)\,\frac{\partial^2 \psi}{\partial x_i \partial x_j}(x_0) + \mu(x_0)\,\frac{\partial}{\partial \mathbf{n}}\bigl(H_\alpha \psi\bigr)(x_0) + \gamma(x_0)\,\psi(x_0) - \alpha\,\delta(x_0)\,\psi(x_0) \le 0.$$

This verifies condition ($\beta'$) (the positive maximum principle) of Theorem 12.53. Therefore, Lemma 12.70 follows from an application of the same theorem. $\Box$

The operators $LH_\alpha$ and $\overline{LH_\alpha}$ can be visualized as in Fig. 12.24 above.

Remark 12.71 In view of assertion (12.45), we find that the operator $\overline{LH_\alpha}$ enjoys the following property: If a function $\psi \in D(\overline{LH_\alpha})$ takes its positive maximum at some point $x'$ of $\partial D$, then we have the inequality

$$\overline{LH_\alpha}\,\psi(x') \le 0. \tag{12.96}$$

The next lemma states a fundamental relationship between the operators $\overline{LH_\alpha}$ and $\overline{LH_\beta}$ for $\alpha$ and $\beta > 0$:

Lemma 12.72 The domain $D(\overline{LH_\alpha})$ of $\overline{LH_\alpha}$ does not depend on $\alpha > 0$; so we denote by $\mathcal{D}$ the common domain. Then we have the formula


$$\overline{LH_\alpha}\,\psi - \overline{LH_\beta}\,\psi + (\alpha - \beta)\,\overline{LG^0_\alpha}\,H_\beta \psi = 0 \quad\text{for all } \alpha, \beta > 0 \text{ and } \psi \in \mathcal{D}. \tag{12.97}$$

Proof Let $\psi$ be an arbitrary function in the domain $D(\overline{LH_\beta})$. If we choose a sequence $\{\psi_j\}$ in $D(LH_\beta) = C^{2+\theta}(\partial D)$ such that

$$\psi_j \longrightarrow \psi \quad\text{in } C(\partial D), \qquad LH_\beta \psi_j \longrightarrow \overline{LH_\beta}\,\psi \quad\text{in } C(\partial D),$$

then it follows from the boundedness of $H_\beta$ and $\overline{LG^0_\alpha}$ that

$$LG^0_\alpha\bigl(H_\beta \psi_j\bigr) = \overline{LG^0_\alpha}\bigl(H_\beta \psi_j\bigr) \longrightarrow \overline{LG^0_\alpha}\bigl(H_\beta \psi\bigr) \quad\text{in } C(\partial D).$$

Therefore, by using formula (12.90) with $\varphi := \psi_j$ we obtain that

$$LH_\alpha \psi_j = LH_\beta \psi_j - (\alpha - \beta)\,LG^0_\alpha\bigl(H_\beta \psi_j\bigr) \longrightarrow \overline{LH_\beta}\,\psi - (\alpha - \beta)\,\overline{LG^0_\alpha}\bigl(H_\beta \psi\bigr) \quad\text{in } C(\partial D).$$

This implies that

$$\psi \in D(\overline{LH_\alpha}), \qquad \overline{LH_\alpha}\,\psi = \overline{LH_\beta}\,\psi - (\alpha - \beta)\,\overline{LG^0_\alpha}\bigl(H_\beta \psi\bigr).$$

Conversely, by interchanging $\alpha$ and $\beta$ we have the assertion $D(\overline{LH_\alpha}) \subset D(\overline{LH_\beta})$, and so $D(\overline{LH_\alpha}) = D(\overline{LH_\beta})$. Therefore, we have proved that the domain $D(\overline{LH_\alpha})$ does not depend on $\alpha > 0$.

The proof of Lemma 12.72 is complete. $\Box$

In view of Remark 12.65, it follows that every function $f \in C(\overline{D}) \subset L^2(D)$ satisfies the equation

$$(\alpha - A)\,G^0_\alpha f = f \quad\text{in } D$$


in the sense of distributions. Hence, by applying Theorem 8.25 with $A := A - \alpha$ to the function $G^0_\alpha f$ we find that the boundary condition $L\bigl(G^0_\alpha f\bigr)$ can be defined as a distribution on the boundary $\partial D$. Similarly, we find that the boundary condition $L(H_\alpha \psi)$ for any function $\psi \in C(\partial D)$ can be defined as a distribution on $\partial D$, since we have the equation

$$(\alpha - A)\,(H_\alpha \psi) = 0 \quad\text{in } D.$$

More precisely, we can prove the following lemma:

Lemma 12.73 (i) If we define a linear operator

$$\widetilde{L}G^0_\alpha : C(\overline{D}) \longrightarrow \mathcal{D}'(\partial D)$$

by the formula

$$\widetilde{L}G^0_\alpha f = L\bigl(G^0_\alpha f\bigr) \quad\text{for } f \in C(\overline{D}),$$

then we have the assertion

$$\overline{LG^0_\alpha} \subset \widetilde{L}G^0_\alpha.$$

(ii) If we define a linear operator

$$\widetilde{L}H_\alpha : C(\partial D) \longrightarrow \mathcal{D}'(\partial D)$$

by the formula

$$\widetilde{L}H_\alpha \psi = L(H_\alpha \psi) \quad\text{for } \psi \in C(\partial D),$$

then we have the assertion

$$\overline{LH_\alpha} \subset \widetilde{L}H_\alpha.$$

12.6 Feller Semigroups and Boundary Value Problems



663

G 0α f j −→ G 0α f in C(D), 0 0 (α − A) G α f j = f j −→ f = (α − A) G α f in C(D).

Thus, by applying Theorem 8.25 with A := A − α and s := σ = 0 we obtain that LG 0α f j −→  LG 0α f in D (∂ D). On the other hand, by the boundedness of LG 0α it follows that LG 0α f j −→ LG 0α f in C(∂ D). Hence we have the assertion

LG 0α f =  LG 0α f.

This proves part (i). Similarly, part (ii) follows from the closedness of L Hα . The proof of Lemma 12.73 is complete.

 

Now we can prove a general existence theorem for Feller semigroups on $\partial D$ in terms of the boundary value problem $(*)_{\alpha,\lambda}$. The next theorem tells us that the operator $\overline{LH_\alpha}$ is the infinitesimal generator of some Feller semigroup on $\partial D$ if and only if problem $(*)_{\alpha,\lambda}$ is solvable for sufficiently many functions $\varphi$ in $C(\partial D)$.

Theorem 12.74 (the existence theorem) (i) If the operator $\overline{LH_\alpha}$, $\alpha > 0$, is the infinitesimal generator of a Feller semigroup on $\partial D$, then, for each constant $\lambda > 0$, the boundary value problem

$$\begin{cases} (\alpha - A)\,u = 0 & \text{in } D, \\ (\lambda - L)\,u = \varphi & \text{on } \partial D \end{cases} \tag{$*$}_{\alpha,\lambda}$$

has a solution $u \in C^{2+\theta}(\overline{D})$ for any $\varphi$ in some dense subset of $C(\partial D)$.

(ii) Conversely, if, for some constant $\lambda \ge 0$, the boundary value problem $(*)_{\alpha,\lambda}$ has a solution $u \in C^{2+\theta}(\overline{D})$ for any $\varphi$ in some dense subset of $C(\partial D)$, then the operator $\overline{LH_\alpha}$ is the infinitesimal generator of some Feller semigroup on $\partial D$.

Proof (i) If the operator $\overline{LH_\alpha}$ generates a Feller semigroup on $\partial D$, applying part (i) of Theorem 12.38 with $K := \partial D$ to the operator $\overline{LH_\alpha}$ we obtain that

$$R\bigl(\lambda I - \overline{LH_\alpha}\bigr) = C(\partial D) \quad\text{for each } \lambda > 0.$$

This implies that the range $R(\lambda I - LH_\alpha)$ is a dense subset of $C(\partial D)$ for each $\lambda > 0$. However, if $\varphi \in C(\partial D)$ is in the range $R(\lambda I - LH_\alpha)$, and if $\varphi = (\lambda I - LH_\alpha)\psi$ with $\psi \in C^{2+\theta}(\partial D)$, then the function $u = H_\alpha \psi \in C^{2+\theta}(\overline{D})$ is a solution of the boundary value problem $(*)_{\alpha,\lambda}$. This proves part (i).


(ii) We apply part (ii) of Theorem 12.53 with $K = \partial D$ to the operator $LH_\alpha$. To do this, it suffices to show that the operator $LH_\alpha$ satisfies condition ($\gamma$) of the same theorem, since it satisfies condition ($\beta'$) (the positive maximum principle), as is shown in the proof of Lemma 12.70.

By the uniqueness theorem for the Dirichlet problem (D)$_\alpha$, it follows that any function $u \in C^{2+\theta}(\overline{D})$ which satisfies the equation

$$(\alpha - A)\,u = 0 \quad\text{in } D$$

can be written in the form

$$u = H_\alpha\bigl(u|_{\partial D}\bigr), \qquad u|_{\partial D} \in C^{2+\theta}(\partial D) = D(LH_\alpha).$$

Thus we find that if there exists a solution $u \in C^{2+\theta}(\overline{D})$ of the boundary value problem $(*)_{\alpha,\lambda}$ for a function $\varphi \in C(\partial D)$, then we have the formula

$$(\lambda I - LH_\alpha)\bigl(u|_{\partial D}\bigr) = \varphi,$$

and so $\varphi \in R(\lambda I - LH_\alpha)$. Therefore, if, for some constant $\lambda \ge 0$, the boundary value problem $(*)_{\alpha,\lambda}$ has a solution $u \in C^{2+\theta}(\overline{D})$ for any $\varphi$ in some dense subset of $C(\partial D)$, then the range $R(\lambda I - LH_\alpha)$ is dense in $C(\partial D)$. This verifies condition ($\gamma$) (with $\alpha_0 := \lambda$) of Theorem 12.53. Hence, part (ii) follows from an application of the same theorem.

The proof of Theorem 12.74 is complete. $\Box$

Furthermore, we give a general existence theorem for Feller semigroups on $\overline{D}$ in terms of Feller semigroups on $\partial D$. In other words, we construct a Feller semigroup on $\overline{D}$ by making use of Feller semigroups on $\partial D$.

First, we give a precise meaning to the boundary condition $Lu$ for functions $u$ in $D(\overline{A})$. We let

$$D(L) = \bigl\{ u \in D(\overline{A}) : u|_{\partial D} \in \mathcal{D} \bigr\},$$

where $\mathcal{D}$ is the common domain of the operators $\overline{LH_\alpha}$ for all $\alpha > 0$. We remark that the space $D(L)$ contains $C^{2+\theta}(\overline{D})$, since we have the assertion $C^{2+\theta}(\partial D) = D(LH_\alpha) \subset \mathcal{D}$.

Corollary 12.67 tells us that every function $u$ in $D(L) \subset D(\overline{A})$ can be written in the form

$$u = G^0_\alpha\bigl(\bigl(\alpha I - \overline{A}\bigr)u\bigr) + H_\alpha\bigl(u|_{\partial D}\bigr) \quad\text{for } \alpha > 0. \tag{12.93}$$

12.6 Feller Semigroups and Boundary Value Problems

Then we define

Lu = LG^0_α((αI − A)u) + LH_α(u|_∂D).   (12.98)

The next lemma justifies definition (12.98) of Lu for u ∈ D(L).

Lemma 12.75 The right-hand side of formula (12.98) depends only on u, not on the choice of expression (12.93).

Proof Assume that

u = G^0_α((αI − A)u) + H_α(u|_∂D) = G^0_β((βI − A)u) + H_β(u|_∂D),

where α > 0, β > 0. Then it follows from formula (12.94) with f := (αI − A)u and formula (12.97) with ψ := u|_∂D that

LG^0_α((αI − A)u) + LH_α(u|_∂D)   (12.99)
= LG^0_β((αI − A)u) − (α − β)LG^0_α G^0_β((αI − A)u) + LH_β(u|_∂D) − (α − β)LG^0_α H_β(u|_∂D)
= LG^0_β((βI − A)u) + LH_β(u|_∂D) + (α − β)(LG^0_β u − LG^0_α G^0_β((αI − A)u) − LG^0_α H_β(u|_∂D)).

However, we obtain from formula (12.94) with f := u that

LG^0_β u − LG^0_α G^0_β((αI − A)u) − LG^0_α H_β(u|_∂D)   (12.100)
= LG^0_β u − LG^0_α(G^0_β((βI − A)u) + H_β(u|_∂D) + (α − β)G^0_β u)
= LG^0_β u − LG^0_α u − (α − β)LG^0_α G^0_β u
= 0.

Therefore, by combining formulas (12.99) and (12.100) we have the formula

LG^0_α((αI − A)u) + LH_α(u|_∂D) = LG^0_β((βI − A)u) + LH_β(u|_∂D).

This proves the lemma. □

We introduce a definition on the boundary condition L.

Definition 12.76 A Ventcel’ boundary condition L is said to be transversal on ∂D if it satisfies the condition

μ(x′) + δ(x′) > 0 on ∂D.   (12.101)


Fig. 12.25 Every Markov process on ∂D is the “trace” on ∂D of trajectories of some Markov process on D = D ∪ ∂D under the transversality condition (12.101)

Table 12.2 A bird’s-eye view of strong Markov processes and elliptic boundary value problems through the reduction to the boundary under the transversality condition (12.101)

Mathematical field:        Probability (microscopic approach) / Partial differential equations (mesoscopic approach)
Mathematical subject:      Strong Markov processes on the domain / Elliptic boundary value problems
Reduction to the boundary: Strong Markov processes on the boundary / Fredholm integral equations on the boundary
Mathematical theory:       Stochastic calculus / Potential theory

Intuitively, the transversality condition (12.101) implies that either a reflection or a viscosity phenomenon occurs at each point of ∂D. Probabilistically, this means that every Markov process on the boundary ∂D is the “trace” on ∂D of trajectories of some Markov process on the closure D = D ∪ ∂D. The situation is represented schematically in Fig. 12.25. Analytically, Table 12.2 gives a bird’s-eye view of strong Markov processes and elliptic boundary value problems through the reduction to the boundary under the transversality condition (12.101) (see Sect. 11.3.2).

The next theorem tells us that the transversality condition on L permits us to “piece together” a Markov process (Feller semigroup) on the boundary ∂D with A-diffusion in D to construct a Markov process (Feller semigroup) on the closure D = D ∪ ∂D:

Theorem 12.77 Define a linear operator A : C(D) → C(D) as follows:

(1) The domain D(A) of A is the space

D(A) = {u ∈ D(A) : u|_∂D ∈ D, Lu = 0 on ∂D}.   (12.102)


(2) Au = Au for every u ∈ D(A).

Assume that the boundary condition L is transversal on ∂D and that the operator LH_α, α > 0, is the infinitesimal generator of some Feller semigroup on ∂D. Then the operator A is the infinitesimal generator of some Feller semigroup on D, and the Green operator G_α = (αI − A)^{-1}, α > 0, is given by the following formula:

G_α f = G^0_α f − H_α((LH_α)^{-1}(LG^0_α f)) for f ∈ C(D).   (12.103)
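Formula (12.103) expresses the Green operator of the Ventcel’ problem through the Dirichlet objects G^0_α and H_α, and it can be checked directly in a discretized model. The following sketch is an illustration only, not from the book: the one-dimensional setting D = (0, 1) with A = d²/dx², the finite-difference discretization, and every coefficient value (μ, γ, δ, the grid size, the sample f) are invented for the test. It builds discrete analogues of G^0_α, H_α and L, assembles G_α by (12.103), and verifies that (α − A)G_α f = f in D and LG_α f = 0 on ∂D.

```python
import numpy as np

# Invented toy data: D = (0, 1), A = d^2/dx^2 (b = 0, c = 0), first order
# Ventcel' condition  Lu = mu du/dn + gamma u - delta (Au)  at both endpoints.
n, alpha = 400, 2.0
h = 1.0 / n
x = np.linspace(0.0, 1.0, n + 1)
mu = np.array([1.0, 0.5])       # mu >= 0, and mu + delta > 0 (transversality)
gamma = np.array([-0.3, 0.0])   # gamma <= 0
delta = np.array([0.2, 1.0])    # delta >= 0

# Discrete (alpha - A), with Dirichlet rows at the two boundary points.
M = np.zeros((n + 1, n + 1))
M[0, 0] = M[n, n] = 1.0
for i in range(1, n):
    M[i, i - 1] = M[i, i + 1] = -1.0 / h ** 2
    M[i, i] = alpha + 2.0 / h ** 2

def G0(f):               # G^0_alpha: (alpha - A)u = f in D, u = 0 on the boundary
    rhs = f.copy(); rhs[0] = rhs[n] = 0.0
    return np.linalg.solve(M, rhs)

def H(psi):              # H_alpha: (alpha - A)u = 0 in D, u = psi on the boundary
    rhs = np.zeros(n + 1); rhs[0], rhs[n] = psi
    return np.linalg.solve(M, rhs)

def L(u, Au_bdry):       # Ventcel' condition; du/dn is the inward normal derivative
    dn = np.array([(u[1] - u[0]) / h, (u[n - 1] - u[n]) / h])
    return mu * dn + gamma * u[[0, n]] - delta * Au_bdry

f = 1.0 + np.cos(np.pi * x)                        # sample right-hand side, f >= 0
u0 = G0(f)
LG0f = L(u0, alpha * u0[[0, n]] - f[[0, n]])       # A G0 f = alpha G0 f - f
LH = np.column_stack([L(H(e), alpha * e)           # L H_alpha as a 2 x 2 matrix
                      for e in (np.array([1.0, 0.0]), np.array([0.0, 1.0]))])
u = u0 - H(np.linalg.solve(LH, LG0f))              # formula (12.103)

print(np.abs((M @ u)[1:-1] - f[1:-1]).max())       # (alpha - A)u = f in D
print(np.abs(L(u, alpha * u[[0, n]] - f[[0, n]])).max())   # Lu = 0 on the boundary
```

Since μ + δ > 0 at both endpoints, the 2 × 2 matrix representing LH_α is invertible here, which mirrors what Step (2) of the proof establishes in general.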

Proof We apply part (ii) of Theorem 12.38 to the operator A. The proof is divided into several steps.

Step (1): First, we prove that

if the operator LH_α generates a Feller semigroup on ∂D for some α > 0, then the operator LH_β generates a Feller semigroup on ∂D for any β > 0.

We apply Corollary 12.54 with K := ∂D to the operator LH_β. By formula (12.97), it follows that the operator LH_β can be written as

LH_β = LH_α + M_{αβ},

where M_{αβ} = (α − β)LG^0_α H_β is a bounded linear operator on C(∂D) into itself. Furthermore, assertion (12.96) implies that the operator LH_β satisfies condition (β′) (the positive maximum principle) of Theorem 12.53. Therefore, it follows from an application of Corollary 12.54 that the operator LH_β generates a Feller semigroup on ∂D.

Step (2): Next, we prove that

if the operator LH_α, α > 0, is the infinitesimal generator of some Feller semigroup on ∂D and if the boundary condition L is transversal on ∂D, then the equation

LH_α ψ = ϕ   (12.104)

has a unique solution ψ in D(LH_α) for any ϕ ∈ C(∂D); hence the inverse (LH_α)^{-1} of LH_α can be defined on the whole space C(∂D). Furthermore, the operator −(LH_α)^{-1} is non-negative and bounded on C(∂D).

By applying Theorem 10.6 with A := A − α to the function H_α 1, we obtain that the function H_α 1 takes its positive maximum 1 only on the boundary ∂D. Thus we can apply Lemma 10.12 with Σ₃ := ∂D to obtain that

∂(H_α 1)/∂n < 0 on ∂D.   (12.105)


Hence, the transversality condition (12.101) gives that

L(H_α 1) = μ ∂(H_α 1)/∂n + γ − αδ < 0 on ∂D,

and so

k_α = − sup_{x′∈∂D} L(H_α 1)(x′) > 0.

Furthermore, by using Corollary 12.39 with K := ∂D, A := LH_α and c := k_α, we obtain that the operator LH_α + k_α I is the infinitesimal generator of some Feller semigroup on ∂D. Therefore, since k_α > 0, it follows from an application of part (i) of Theorem 12.38 with A := LH_α + k_α I that the equation

−LH_α ψ = (k_α I − (LH_α + k_α I))ψ = ϕ

has a unique solution ψ ∈ D(LH_α) for any ϕ ∈ C(∂D), and further that the operator

−(LH_α)^{-1} = (k_α I − (LH_α + k_α I))^{-1}

is non-negative and bounded on C(∂D) with norm

‖(−LH_α)^{-1}‖ = ‖(k_α I − (LH_α + k_α I))^{-1}‖ ≤ 1/k_α.

Step (3): By assertion (12.104), we can define the right-hand side of formula (12.103) for all α > 0. Now we prove that

G_α = (αI − A)^{-1} for α > 0.   (12.106)

In view of Lemmas 12.66, 12.72 and 12.73, it follows that we have, for any f ∈ C(D),

G_α f = G^0_α f − H_α((LH_α)^{-1}(LG^0_α f)) ∈ D(A),
(G_α f)|_∂D = −(LH_α)^{-1}(LG^0_α f) ∈ D(LH_α) = D,
LG_α f = LG^0_α f − LH_α((LH_α)^{-1}(LG^0_α f)) = 0,

and that

(αI − A)G_α f = f.

This proves that

G_α f ∈ D(A) and (αI − A)G_α f = f,

that is, (αI − A)G_α = I on C(D). Therefore, in order to prove formula (12.106) it suffices to show the injectivity of the operator αI − A for α > 0. Assume that

u ∈ D(A) and (αI − A)u = 0.

Then, by Corollary 12.67 the function u can be written as

u = H_α(u|_∂D), u|_∂D ∈ D = D(LH_α).

Thus we have the formula

LH_α(u|_∂D) = Lu = 0 on ∂D.

In view of assertion (12.104), this implies that u|_∂D = 0 on ∂D, so that u = H_α(u|_∂D) = 0 in D.

Step (4): The non-negativity of G_α, α > 0, follows immediately from formula (12.103), since the operators G^0_α, H_α, −(LH_α)^{-1} and LG^0_α are all non-negative.

Step (5): We prove that the operator G_α is bounded on C(D) with norm

‖G_α‖ ≤ 1/α for α > 0.   (12.107)

In order to prove assertion (12.107), it suffices to show that

G_α 1 ≤ 1/α on D,

since the operator G_α is non-negative on C(D). First, it follows from the uniqueness theorem for the Dirichlet problem (D)_α that

αG^0_α 1 + H_α 1 = 1 + G^0_α c on D.   (12.108)


Indeed, both sides have the same boundary value 1 on ∂D and satisfy the same equation (α − A)u = α in D. By applying the operator L to both sides of equality (12.108), we obtain that

−L(H_α 1) = −L1 − L(G^0_α c) + αL(G^0_α 1)
= −γ(x′) − μ(x′) ∂(G^0_α c)/∂n + αL(G^0_α 1)
≥ αL(G^0_α 1) on ∂D,

since we have the inequalities

γ(x′) ≤ 0 on ∂D, μ(x′) ≥ 0 on ∂D,

and also

c(x) ≤ 0 on D, G^0_α c(x) ≤ 0 on D, with (G^0_α c)(x′) = 0 for all x′ ∈ ∂D.

Therefore, we have, by the non-negativity of −(LH_α)^{-1},

−(LH_α)^{-1}(L(G^0_α 1)) ≤ 1/α on ∂D.   (12.109)

By using formula (12.103) with f := 1, inequality (12.109) and equality (12.108), we obtain that

G_α 1 = G^0_α 1 + H_α((−LH_α)^{-1}(L(G^0_α 1)))
≤ G^0_α 1 + (1/α)H_α 1
= 1/α + (1/α)G^0_α c
≤ 1/α on D,

since the operators H_α and G^0_α are non-negative.

Step (6): Finally, we prove that

the domain D(A) is everywhere dense in C(D).   (12.110)

Substep (6-1): Before the proof, we need some lemmas on the behavior of the operators G^0_α, H_α and (LH_α)^{-1} as α → +∞.

Lemma 12.78 For all f ∈ C(D), we have the assertion

lim_{α→+∞} (αG^0_α f + H_α(f|_∂D)) = f in C(D).   (12.111)


Proof Choose a constant β > 0 and let

g = f − H_β(f|_∂D).

Then, by using formula (12.90) with ϕ := f|_∂D we obtain that

αG^0_α g − g = αG^0_α f + H_α(f|_∂D) − f − βG^0_α H_β(f|_∂D).   (12.112)

However, we have, by estimate (12.87),

lim_{α→+∞} G^0_α H_β(f|_∂D) = 0 in C(D),

and by assertion (12.89),

lim_{α→+∞} αG^0_α g = g in C(D),

since g|_∂D = 0. Therefore, the desired assertion (12.111) follows by letting α → +∞ in formula (12.112). The proof of Lemma 12.78 is complete. □

Lemma 12.79 The function

(∂(H_α 1)/∂n)|_∂D

diverges to −∞ uniformly and monotonically as α → +∞.

Proof First, formula (12.90) with ϕ := 1 gives that

H_α 1 = H_β 1 − (α − β)G^0_α H_β 1 for α, β > 0.

Thus, in view of the non-negativity of G^0_α and H_α it follows that

α ≥ β > 0 =⇒ H_α 1 ≤ H_β 1 on D.

Since H_α 1|_∂D = H_β 1|_∂D = 1, this implies that the functions

(∂(H_α 1)/∂n)|_∂D

are monotonically non-increasing in α > 0. Furthermore, by using formula (12.89) with f := H_β 1 we find that the functions

H_α 1(x) = H_β 1(x) − (1 − β/α) αG^0_α H_β 1(x)

Fig. 12.26 A neighborhood U of ∂D, relative to D, with smooth boundary ∂U


converge to zero monotonically as α → +∞, for each x ∈ D.

Now, for any given constant K > 0, we can construct a function u ∈ C²(D) such that

u|_∂D = 1 on ∂D,   (12.113a)
∂u/∂n ≤ −K on ∂D.   (12.113b)

Indeed, it follows from part (d′) of Theorem 12.63 that, for any integer m > 0, the function

u = (H_{α₀}1)^m for α₀ > 0

belongs to C^∞(D) and satisfies condition (12.113a). Furthermore, we have the inequality

(∂u/∂n)|_∂D = m ∂(H_{α₀}1)/∂n ≤ m sup_{x′∈∂D} ∂(H_{α₀}1)/∂n (x′).

In view of inequality (12.105), this implies that the function u = (H_{α₀}1)^m satisfies condition (12.113b) for m sufficiently large.

Take a function u ∈ C²(D) which satisfies conditions (12.113a) and (12.113b), and choose a neighborhood U of ∂D, relative to D, with smooth boundary ∂U such that (see Fig. 12.26)

u ≥ 1/2 on U.   (12.114)

Recall that the function H_α 1 converges to zero in D monotonically as α → +∞. Since we have the formulas


u|_∂D = H_α 1|_∂D = 1,

by using Dini’s theorem we can find a constant α > 0 (depending on u and hence on K) such that

H_α 1 ≤ u on ∂U \ ∂D,   (12.115a)
α > 2‖Au‖.   (12.115b)

It follows from inequalities (12.114) and (12.115b) that

(A − α)(H_α 1 − u) = αu − Au ≥ α/2 − Au > 0 in U.

Thus, by applying Theorem 10.6 with A := A − α to the function H_α 1 − u we obtain that the function H_α 1 − u may take its positive maximum only on the boundary ∂U. However, conditions (12.113a) and (12.115a) imply that

H_α 1 − u ≤ 0 on ∂U = (∂U \ ∂D) ∪ ∂D.

Therefore, we have the inequality

H_α 1 ≤ u on U = U ∪ ∂U,

and hence

∂(H_α 1)/∂n ≤ (∂u/∂n)|_∂D ≤ −K on ∂D,

since u|_∂D = H_α 1|_∂D = 1. The proof of Lemma 12.79 is complete. □

Lemma 12.80 We have the assertion

lim_{α→+∞} ‖(−LH_α)^{-1}‖ = 0.

Proof In view of Lemma 12.79, the transversality condition (12.101) implies that the function

L(H_α 1)(x′) = μ(x′) ∂(H_α 1)/∂n (x′) − αδ(x′) + γ(x′) for x′ ∈ ∂D

diverges to −∞ monotonically as α → +∞. By Dini’s theorem, this convergence is uniform in x′ ∈ ∂D. Hence the function 1/L(H_α 1)(x′) converges to zero uniformly in x′ ∈ ∂D as α → +∞. This gives that

‖(−LH_α)^{-1}‖ = ‖(−LH_α)^{-1}1‖ ≤ ‖1/L(H_α 1)‖ → 0 as α → +∞,

since we have the inequality

1 = (−L(H_α 1)(x′)) / |L(H_α 1)(x′)| ≤ ‖1/L(H_α 1)‖ · (−L(H_α 1)(x′)) for all x′ ∈ ∂D.

The proof of Lemma 12.80 is complete. □

Substep (6-2): Proof of assertion (12.110). Since the space C^{2+θ}(D) is dense in C(D), it suffices to prove that

lim_{α→+∞} ‖αG_α f − f‖ = 0 for each f ∈ C^{2+θ}(D).   (12.116)

First, we remark that

‖αG_α f − f‖ = ‖αG^0_α f − αH_α((LH_α)^{-1}(LG^0_α f)) − f‖
≤ ‖αG^0_α f + H_α(f|_∂D) − f‖ + ‖−αH_α((LH_α)^{-1}(LG^0_α f)) − H_α(f|_∂D)‖
≤ ‖αG^0_α f + H_α(f|_∂D) − f‖ + ‖−α(LH_α)^{-1}(LG^0_α f) − f|_∂D‖.

Thus, in view of formula (12.111) it suffices to show that

lim_{α→+∞} (α(LH_α)^{-1}(LG^0_α f) + f|_∂D) = 0 in C(∂D).   (12.117)

Take a constant β such that 0 < β < α, and write

f = G^0_β g + H_β ϕ,

where (cf. formula (12.93)):

g = (β − A)f ∈ C^θ(D), ϕ = f|_∂D ∈ C^{2+θ}(∂D).

Then, by using equations (12.88) with f := g and (12.90) we obtain that

G^0_α f = G^0_α G^0_β g + G^0_α H_β ϕ = (1/(α − β))(G^0_β g − G^0_α g + H_β ϕ − H_α ϕ).

Hence, we have the inequality


‖−α(LH_α)^{-1}(LG^0_α f) − f|_∂D‖
= ‖(α/(α − β))(−LH_α)^{-1}(LG^0_β g − LG^0_α g + LH_β ϕ) + (α/(α − β))ϕ − ϕ‖
≤ (α/(α − β))‖(−LH_α)^{-1}‖ · ‖LG^0_β g + LH_β ϕ‖ + (α/(α − β))‖(−LH_α)^{-1}‖ · ‖LG^0_α‖ · ‖g‖ + (β/(α − β))‖ϕ‖ for all α > β.

By Lemma 12.80, it follows that the first term on the right-hand side converges to zero as α → +∞. For the second term, using formula (12.94) with f := 1 and the non-negativity of G^0_β and LG^0_α, we find that

‖LG^0_α‖ = ‖LG^0_α 1‖ = ‖LG^0_β 1 − (α − β)LG^0_α G^0_β 1‖ ≤ ‖LG^0_β 1‖ for all α > β,

so that, again by Lemma 12.80, the second term also converges to zero as α → +∞. It is clear that the third term converges to zero as α → +∞. This completes the proof of assertion (12.117) and hence of assertion (12.116).

Step (7): Summing up, we have proved that the operator A, defined by formula (12.102), satisfies conditions (a)–(d) in Theorem 12.38. Hence, it follows from an application of the same theorem that the operator A is the infinitesimal generator of some Feller semigroup on D. The proof of Theorem 12.77 is now complete. □

By combining Theorems 12.74 and 12.77, we can prove general existence theorems for Feller semigroups in terms of the boundary value problem (∗)_{α,λ}:

Theorem 12.81 (the existence theorem) Let the differential operator A satisfy condition (12.82) and let the boundary condition L satisfy condition (12.84). Assume that L is transversal on ∂D, and further that the following two conditions are satisfied:

[I] (Existence) For some constants α ≥ 0 and λ ≥ 0, the boundary value problem

(α − A)u = 0 in D,
(λ − L)u = ϕ on ∂D   (∗)_{α,λ}

has a solution u in C(D) for any ϕ in some dense subset of C(∂D).

[II] (Uniqueness) For some constant α > 0, we have the assertion

u ∈ C(D), (α − A)u = 0 in D, Lu = 0 on ∂D =⇒ u = 0 in D.

Then there exists a Feller semigroup {T_t}_{t≥0} on D whose infinitesimal generator A is characterized as follows:

(1) The domain D(A) of A is the space


D(A) = {u ∈ C(D) : Au ∈ C(D), Lu = 0 on ∂D}.   (12.118)

(2) Au = Au for every u ∈ D(A). Here Au and Lu are taken in the sense of distributions.

Proof Part (ii) of Theorem 12.74 tells us that if condition [I] is satisfied, then the operator LH_α is the infinitesimal generator of some Feller semigroup on ∂D; hence Theorem 12.77 applies. It remains to show that if condition [II] is satisfied, then the two definitions (12.102) and (12.118) of D(A) coincide:

D(A) := {u ∈ D(A) : u|_∂D ∈ D, Lu = 0 on ∂D}
= {u ∈ C(D) : Au ∈ C(D), Lu = 0 on ∂D}.   (12.119)

In view of Remark 12.65 and Lemmas 12.73 and 12.75, it follows that

D(A) ⊂ {u ∈ C(D) : Au ∈ C(D), Lu = 0 on ∂D}.

Conversely, let u be an arbitrary function in C(D) such that

Au ∈ C(D), Lu = 0 on ∂D,

and let

w := u − G_α((α − A)u).

Then we have, by formula (12.106),

(α − A)w = 0 in D, Lw = 0 on ∂D.

Hence, condition [II] gives us that w = 0 in D, so that

u = G_α((α − A)u) ∈ D(A).

This proves the desired assertion (12.119). The proof of Theorem 12.81 is complete. □

In general, there is a close relationship between the uniqueness and regularity properties of solutions of boundary value problems. In fact, we obtain the following corollary:


Corollary 12.82 Let A and L be as in Theorem 12.81. We assume that condition [I] and the following condition (replacing condition [II]) are satisfied:

[III] (Regularity) For some constant α > 0, we have the assertion

u ∈ C(D), (α − A)u = 0 in D, Lu ∈ C^∞(∂D) =⇒ u ∈ C^∞(D).

Then there exists a Feller semigroup {T_t}_{t≥0} on D whose infinitesimal generator A enjoys property (12.118), and coincides with the minimal closed extension in C(D) of the restriction of A to the space {u ∈ C²(D) : Lu = 0 on ∂D}:

A = the minimal closed extension of the operator A|_{{u ∈ C²(D) : Lu = 0 on ∂D}}.

Proof The proof is divided into three steps.

Step (1): First, we show that conditions [I] and [III] imply condition [II]; hence Theorem 12.81 applies. Assume that

u ∈ C(D), (α − A)u = 0 in D, Lu = 0 on ∂D.

Then we obtain from condition [III] that u ∈ C^∞(D). Thus, by the uniqueness property of solutions of the Dirichlet problem (D)_α it follows that the function u can be written in the form

u = H_α(u|_∂D), u|_∂D ∈ C^∞(∂D) ⊂ D(LH_α).

Hence, we have the assertion

LH_α(u|_∂D) = Lu = 0 on ∂D.   (12.120)

However, by combining part (ii) of Theorem 12.74 and assertion (12.104) we find that if condition [I] is satisfied and if the boundary condition L is transversal on ∂D, then the minimal closed extension of LH_α is bijective for each α > 0. Thus we have, by assertion (12.120),

u|_∂D = 0 on ∂D,

and so u = H_α(u|_∂D) = 0 in D. This proves that condition [II] is satisfied.

Step (2): Next we show that if condition [III] is satisfied, then we have the assertion

f ∈ C^∞(D) =⇒ G_α f ∈ C^∞(D).   (12.121)


Part (e) of Theorem 12.63 tells us that G^0_α f ∈ C^∞(D) whenever f ∈ C^∞(D). We let

w = H_α((LH_α)^{-1}(LG^0_α f)).

Then it follows from Lemmas 12.66 and 12.73 that

(α − A)w = 0 in D,
Lw = LH_α((LH_α)^{-1}(LG^0_α f)) = LG^0_α f ∈ C^∞(∂D).

Hence, condition [III] gives us that w ∈ C^∞(D). In view of formula (12.103), this implies that

G_α f = G^0_α f − w ∈ C^∞(D).

Step (3): Finally, we show that the operator A, defined by formula (12.118), coincides with the minimal closed extension in C(D) of the restriction of A to the space

{u ∈ C²(D) : Lu = 0 on ∂D}.

Let u be an arbitrary element of D(A). We choose a sequence {f_n} in C^∞(D) such that

f_n → (α − A)u in C(D),

and let

u_n = G_α f_n.

Then we have, by assertions (12.106) and (12.121),

u_n ∈ D(A) ∩ C^∞(D).

Furthermore, since the operator G_α : C(D) → C(D) is bounded, it follows that

u_n = G_α f_n → G_α((αI − A)u) = u in C(D),

and also

Au_n = αu_n − f_n → αu − (αI − A)u = Au in C(D).

This proves that

the graph of A := {(u, Au) : u ∈ D(A)}
= the closure in the product space C(D) × C(D) of the graph {(u, Au) : u ∈ C²(D), Lu = 0 on ∂D}.

The proof of Corollary 12.82 is complete. □


12.7 Notes and Comments

This chapter is a revised and expanded version of Chap. 9 of Taira [191], rewritten so as to make it accessible to graduate students and advanced undergraduates as well. The results discussed here are adapted from Blumenthal–Getoor [20], Dynkin [45, 46], Dynkin–Yushkevich [47], Ethier–Kurtz [53], Feller [59, 60], Ikeda–Watanabe [92], Itô–McKean, Jr. [95], Lamperti [111], Revuz–Yor [151] and Stroock–Varadhan [178]. In particular, our treatment of temporally homogeneous Markov processes follows the expositions of Dynkin [45, 46] and Blumenthal–Getoor [20]. However, unlike many other books on Markov processes, this chapter focuses on the relationship among three subjects: Feller semigroups, transition functions and Markov processes. Our approach to the problem of construction of Markov processes with Ventcel’ boundary conditions is distinguished by the extensive use of ideas and techniques characteristic of recent developments in functional analysis methods.

Section 12.1: Theorem 12.6 is taken from Dynkin [45, Chap. 4, Sect. 2], while Theorem 12.21 is taken from Dynkin [45, Chap. 6] and [46, Chap. 3, Sect. 2]. Theorem 12.27 is due to Dynkin [45, Theorem 5.10] and Theorem 12.29 is due to Dynkin [45, Theorem 6.3], respectively. Theorem 12.29 is a non-compact version of Lamperti [111, Chap. 8, Sect. 3, Theorem 1]. Section 12.1.6 is adapted from Lamperti [111, Chap. 9, Sect. 2].

Section 12.2: The semigroup approach to Markov processes can be traced back to the work of Kolmogorov [103]. It was substantially developed in the early 1950s, with Feller [59] and [60] doing the pioneering work. Our presentation here follows the book of Dynkin [46] and also part of Lamperti’s [111]. Theorem 12.37 is a non-compact version of Lamperti [111, Chap. 7, Sect. 7, Theorem 1].

Section 12.3: Theorem 12.53 is due to Sato–Ueno [156, Theorem 1.2] and Bony–Courrège–Priouret [22, Théorème de Hille–Yosida–Ray] (cf. [94, 147, 191]).

Section 12.4: Theorem 12.55 is adapted from Sato–Ueno [156], while the main idea of its proof is due to Ventcel’ [236]. Moreover, Bony, Courrège and Priouret [22] give a more precise characterization of the infinitesimal generators of Feller semigroups in terms of the maximum principle (see Chap. 10).

Section 12.5: Theorem 12.57 is due to Ventcel’ [236]. We can reconstruct the functions α^{ij}(x′), β^i(x′), γ(x′), μ(x′) and δ(x′) so that they are bounded and Borel measurable on the boundary ∂D (see Bony–Courrège–Priouret [22, Théorème XIII]). For the probabilistic meanings of Ventcel’ boundary conditions, the reader is referred to Dynkin–Yushkevich [47].

Section 12.6: The results discussed here are adapted from Sato–Ueno [156] and Bony–Courrège–Priouret [22], while Theorem 12.81 and Corollary 12.82 are due to Taira [188]. In the Notes and Comments of Sect. 13.4 we formulate Theorem 12.81 and Corollary 12.82 for some degenerate elliptic differential operators of second order (see Theorems 13.20, 13.22 and 13.24). However, in Sect. 12.6 we confined ourselves to the strictly elliptic case. This makes it possible to develop the basic machinery of Taira [188] with a minimum of bother, and the principal ideas can be presented more concretely and explicitly.

Chapter 13

L² Approach to the Construction of Feller Semigroups

In Chap. 12 we reduced the problem of construction of Feller semigroups to the problem of unique solvability for the boundary value problem

(α − A)u = 0 in D,
(λ − L)u = ϕ on ∂D,   (∗)_{α,λ}

and gave existence theorems for Feller semigroups, where α and λ are positive constants (Theorems 12.74, 12.77 and 12.81). In this chapter we prove two existence and uniqueness theorems for the boundary value problem (∗)_{α,λ} in the framework of L² Sobolev spaces, and construct Feller semigroups (Theorems 13.1 and 13.3). The proof of Theorem 13.1 is flowcharted in Table 13.1 in Sect. 13.2, and the proof of Theorem 13.3 in Table 13.2 in Sect. 13.3.

Our proof of the existence and uniqueness theorems for the boundary value problem (∗)_{α,λ} is based on the maximum principle discussed in Sect. 10.2 (Theorem 10.6) and the a priori estimates stated in Sects. 9.9 and 9.10 (Theorems 9.51, 9.53, 9.56, 9.58 and 9.60). We make use of these estimates on the one hand to prove regularity theorems for the boundary value problem (∗)_{α,λ}, and on the other hand to show that the index of the boundary value problem (∗)_{α,λ} is equal to zero, by using a variant of the Agmon–Nirenberg method [5] developed in Sect. 11.4 (Theorem 11.19). By combining the regularity theorems and the maximum principles, we obtain the uniqueness theorems and hence the existence theorems for the boundary value problem (∗)_{α,λ}, since the index of the boundary value problem (∗)_{α,λ} is equal to zero.

Intuitively, our results may be stated as follows: if a Markovian particle passes through the set where no reflection phenomenon occurs in finite time, then there exists a Feller semigroup corresponding to such a diffusion phenomenon.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 K. Taira, Functional Analytic Techniques for Diffusion Processes, Springer Monographs in Mathematics, https://doi.org/10.1007/978-981-19-1099-9_13


13.1 Statements of Main Results

Let D be a bounded domain in R^N with C^∞ boundary ∂D, and let A be a second order, strictly elliptic differential operator with real coefficients on R^N of the form

Au = Σ_{i,j=1}^{N} a^{ij}(x) ∂²u/(∂x_i ∂x_j) + Σ_{i=1}^{N} b^i(x) ∂u/∂x_i + c(x)u.   (13.1)

Here:

(1) a^{ij} ∈ C^∞(R^N), a^{ij}(x) = a^{ji}(x) for all x ∈ R^N and all 1 ≤ i, j ≤ N, and there exists a constant a₀ > 0 such that

Σ_{i,j=1}^{N} a^{ij}(x) ξ_i ξ_j ≥ a₀|ξ|² for all (x, ξ) ∈ T*(R^N) = R^N × R^N,   (13.2)

where T*(R^N) is the cotangent bundle of R^N.

(2) b^i ∈ C^∞(R^N) for 1 ≤ i ≤ N.

(3) c ∈ C^∞(R^N) and c(x) ≤ 0 on D.

Let L be a second order boundary condition of the form

Lu = Σ_{i,j=1}^{N−1} α^{ij}(x′) ∂²u/(∂x_i ∂x_j) + Σ_{i=1}^{N−1} β^i(x′) ∂u/∂x_i + γ(x′)u + μ(x′) ∂u/∂n − δ(x′)Au.   (13.3)

Here:

(1) The α^{ij} are the components of a C^∞ symmetric contravariant tensor of type (2, 0) on ∂D, and satisfy the degenerate elliptic condition

Σ_{i,j=1}^{N−1} α^{ij}(x′) η_i η_j ≥ 0 for all x′ ∈ ∂D and all η = Σ_{j=1}^{N−1} η_j dx_j ∈ T*_{x′}(∂D),   (13.4)

where T*_{x′}(∂D) is the cotangent space of ∂D at x′.

(2) β^i ∈ C^∞(∂D) for 1 ≤ i ≤ N − 1.

(3) γ ∈ C^∞(∂D) and γ(x′) ≤ 0 on ∂D.

(4) μ ∈ C^∞(∂D) and μ(x′) ≥ 0 on ∂D.


(5) δ ∈ C^∞(∂D) and δ(x′) ≥ 0 on ∂D.

(6) n is the unit inward normal to ∂D (see Fig. 12.19).

The boundary condition L is called a second order Ventcel’ boundary condition (see [22, 229, 236]). Recall that the three terms of the boundary condition L,

Σ_{i,j=1}^{N−1} α^{ij}(x′) ∂²u/(∂x_i ∂x_j) + Σ_{i=1}^{N−1} β^i(x′) ∂u/∂x_i,   γ(x′)u,   μ(x′) ∂u/∂n,

are supposed to correspond to a diffusion phenomenon along the boundary, an absorption phenomenon and a reflection phenomenon, respectively (see Figs. 1.6 and 1.7).

We remark that the Ventcel’ boundary condition (13.3) is non-degenerate or coercive if and only if the second order differential operator

Q(x′, D_{x′}) := Σ_{i,j=1}^{N−1} α^{ij}(x′) ∂²/(∂x_i ∂x_j) + Σ_{i=1}^{N−1} β^i(x′) ∂/∂x_i + γ(x′)

is elliptic on ∂D, that is, there exists a constant α₀ > 0 such that

Σ_{i,j=1}^{N−1} α^{ij}(x′) ξ_i ξ_j ≥ α₀|ξ′|² on the cotangent bundle T*(∂D).
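Both the strict ellipticity condition (13.2) for A and the ellipticity of Q(x′, D_{x′}) amount to a uniform positive lower bound on the smallest eigenvalue of a symmetric coefficient matrix. Such a bound can be probed numerically over a sample of points; the following sketch is illustrative only, and the sample coefficient matrix is invented (it is not one of the book’s examples).

```python
import numpy as np

def uniform_ellipticity_bound(a, points):
    """Smallest eigenvalue of the symmetrized matrix a(x) over the sample points.

    Condition (13.2) holds with a0 equal to the infimum of this quantity,
    provided it stays positive over all of R^N, not merely over the samples.
    """
    return min(np.linalg.eigvalsh(0.5 * (a(x) + a(x).T)).min() for x in points)

# Invented sample coefficients in dimension N = 2: a(x) = I + 0.3 w(x) w(x)^T
# with w(x) = (sin x1, cos x2), a rank-one positive semi-definite perturbation.
a = lambda x: np.eye(2) + 0.3 * np.outer([np.sin(x[0]), np.cos(x[1])],
                                         [np.sin(x[0]), np.cos(x[1])])
samples = [np.array([s, t]) for s in np.linspace(-3, 3, 25)
                            for t in np.linspace(-3, 3, 25)]
a0 = uniform_ellipticity_bound(a, samples)
print(a0)   # positive: the sampled matrices are uniformly elliptic
```

For this particular a(x) the perturbation is positive semi-definite of rank one, so the smallest eigenvalue equals 1 at every point and the ellipticity constant is a₀ = 1.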

The non-degenerate case is studied by Višik [229, Sect. 8], Hörmander [84, p. 264, problem (10.5.13)], Agranovich–Vishik [6, p. 69, formula (3.11)] and Bony–Courrège–Priouret [22, p. 436, formula (II.2.1)]. In this chapter we study the Ventcel’ boundary condition L under the degeneracy condition (13.4).

In order to state our fundamental hypotheses on L, we introduce some notation and definitions. We say that a tangent vector

v = Σ_{j=1}^{N−1} v_j ∂/∂x_j ∈ T_{x′}(∂D)

is subunit for the operator

L₀ = Σ_{i,j=1}^{N−1} α^{ij}(x′) ∂²/(∂x_i ∂x_j)

if it satisfies the condition

(Σ_{j=1}^{N−1} v_j η_j)² ≤ Σ_{i,j=1}^{N−1} α^{ij}(x′) η_i η_j for all η = Σ_{j=1}^{N−1} η_j dx_j ∈ T*_{x′}(∂D).
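Since the subunit condition says that the rank-one quadratic form η ↦ (Σ v_j η_j)² is dominated by η ↦ Σ α^{ij}η_iη_j, it holds exactly when the symmetric matrix (α^{ij}) − vvᵀ is positive semi-definite. This gives a simple numerical test; the sketch below is illustrative only, with invented sample data.

```python
import numpy as np

def is_subunit(v, alpha_mat, tol=1e-10):
    """v is subunit for L0 = sum alpha^{ij} d_i d_j  iff  alpha_mat - v v^T >= 0."""
    m = 0.5 * (alpha_mat + alpha_mat.T) - np.outer(v, v)
    return np.linalg.eigvalsh(m).min() >= -tol

# Invented example on a 2-dimensional boundary: a degenerate form of rank one,
# which satisfies (13.4) but is not elliptic.
alpha_mat = np.array([[1.0, 0.0],
                      [0.0, 0.0]])
print(is_subunit(np.array([1.0, 0.0]), alpha_mat))  # along the non-degenerate direction
print(is_subunit(np.array([0.0, 0.1]), alpha_mat))  # motion in the null direction is not subunit
```

In this degenerate example only vectors pointing along the non-degenerate direction (with length at most 1) are subunit, which is why the non-Euclidean balls B_{L₀}(x′, ρ) defined next can be much smaller than Euclidean balls.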


If ρ > 0, we define a “non-Euclidean” ball B_{L₀}(x′, ρ) of radius ρ about x′ as follows:

B_{L₀}(x′, ρ) = the set of all points y ∈ ∂D which can be joined to x′ by a Lipschitz path v : [0, ρ] → ∂D for which the tangent vector v̇(t) of ∂D at v(t) is subunit for L₀ for almost every t.

Also we let

B_E(x′, ρ) = the ordinary Euclidean ball of radius ρ about x′.

Recall (see Definition 12.76 and Fig. 12.25) that the boundary condition L is said to be transversal on ∂D if it satisfies the condition

μ(x′) + δ(x′) > 0 on ∂D.   (13.5)

Now we can state our existence theorem for a Feller semigroup:

Theorem 13.1 (the existence theorem) Let the differential operator A of the form (13.1) satisfy the strict ellipticity condition (13.2) and let the boundary condition L of the form (13.3) satisfy the degenerate elliptic condition (13.4). Assume that L is transversal on ∂D and further that:

(A.1) There exist constants 0 < ε ≤ 1 and C > 0 such that we have, for all sufficiently small ρ > 0,

B_E(x′, ρ) ⊂ B_{L₀}(x′, Cρ^ε) on the set M = {x′ ∈ ∂D : μ(x′) = 0}.

Then there exists a Feller semigroup {T_t}_{t≥0} on D whose infinitesimal generator A is characterized as follows:

(1) The domain D(A) of A is the space

D(A) = {u ∈ C(D) : Au ∈ C(D), Lu = 0 on ∂D}.

(2) Au = Au for every u ∈ D(A).

More precisely, the generator A coincides with the minimal closed extension in C(D) of the restriction of A to the space

{u ∈ C²(D) : Lu = 0 on ∂D}.

Remark 13.2 Theorem 10.6 tells us that the non-Euclidean ball B_{L₀}(x′, ρ) may be interpreted as the set of all points where a Markovian particle with generator L₀, starting at x′, diffuses during the time interval [0, ρ]. Hence the intuitive meaning of


hypothesis (A.1) is that a Markovian particle with generator L₀ goes through the set M, where no reflection phenomenon occurs, in finite time (see Fig. 1.12).

Furthermore, we consider the first order case of the Ventcel’ boundary condition L, that is, the case where α^{ij} ≡ 0 on ∂D:

Lu = Σ_{i=1}^{N−1} β^i(x′) ∂u/∂x_i + γ(x′)u + μ(x′) ∂u/∂n − δ(x′)(Au)
   := β(x′) · u + γ(x′)u + μ(x′) ∂u/∂n − δ(x′)(Au),   (13.6)

where β = Σ_{i=1}^{N−1} β^i ∂/∂x_i is a C^∞ vector field on ∂D (see Fig. 12.19). Then we can prove the following existence theorem for a Feller semigroup:

Theorem 13.3 (the existence theorem) Let the strictly elliptic differential operator A of the form (13.1) satisfy condition (13.2) and let the boundary condition L be of the form (13.6). Assume that L is transversal on ∂D and further that:

(A.2) The vector field β is non-zero on the set M = {x′ ∈ ∂D : μ(x′) = 0}, and no maximal integral curve of β is entirely contained in M (see Fig. 9.19).

Then we have the same conclusions as in Theorem 13.1.

Remark 13.4 The vector field β is the drift vector field. Hence Theorem 10.6 tells us that hypothesis (A.2) has an intuitive meaning similar to that of hypothesis (A.1) (see Fig. 1.14).

13.2 Proof of Theorem 13.1

The proof of Theorem 13.1 can be flowcharted as in Table 13.1 below. We apply Corollary 12.82. The next theorem allows us to verify conditions [I] and [III] of the same corollary:

Theorem 13.5 Let A and L be as in Theorem 13.1. Assume that hypothesis (A.1) is satisfied. Then there exists a constant 0 < κ ≤ 1 such that, for any α > 0, the boundary value problem

(α − A)u = f in D,
Lu = ϕ on ∂D   (13.7)

has a unique solution u ∈ H^{s−2+κ}(D) for any function f ∈ H^{s−2}(D) and any function ϕ ∈ H^{s−5/2}(∂D) with s ≥ 3. Furthermore, we have, for all s ≥ 3 and t < s − 2 + κ,


Table 13.1 A flowchart for the proof of Theorem 13.1, linking the following ingredients: Theorem 12.81 (conditions [I], [II]); Corollary 12.82 (conditions [I], [III]); Theorem 12.38 (Hille–Yosida); Theorem 13.1 (hypothesis (A.1)); Lemma 13.6 (hypoellipticity of T(α)); Theorem 13.5 (conditions [I], [III]); Assertion (13.28); Assertion (13.29)

u ∈ H^t(D), (α − A)u ∈ H^{s−2}(D), Lu ∈ H^{s−5/2}(∂D) =⇒ u ∈ H^{s−2+κ}(D).   (13.8)

Here α ≥ 0. Granting Theorem 13.5 for the moment, we shall prove Theorem 13.1. In view of the Sobolev imbedding theorem (Theorem 8.17), it follows from Theorem 13.5 that: (1) For any α > 0, the boundary value problem 

(α − A) u = 0 in D, −Lu = ϕ on ∂ D

()α,0

has a unique solution u ∈ C ∞ (D) for any function ϕ ∈ C ∞ (∂ D). (2) For any α ≥ 0, we have the regularity property u ∈ C(D), (α − A) u = 0, Lu ∈ C ∞ (∂ D) =⇒ u ∈ C ∞ (D). These results (1) and (2) verify conditions [I] with λ = 0 and [III] of Corollary 12.82, respectively. Therefore, Theorem 13.1 follows from an application of the same corollary, apart from the proof of Theorem 13.5.


13.2.1 Proof of Theorem 13.5

We divide the proof of Theorem 13.5 into five steps.

Step 1: First, we reduce the study of problem (13.7) to that of a pseudo-differential operator on the boundary, just as in Sect. 11.3. By applying Theorem 11.5 to the operator $A - \alpha$ for $\alpha \ge 0$, we obtain the following:

(a) The Dirichlet problem
$$\begin{cases} (\alpha - A)\,u = 0 & \text{in } D, \\ \gamma_0 u = \varphi & \text{on } \partial D \end{cases} \tag{D}_{\alpha}$$
has a unique solution $w$ in $H^t(D)$ for any function $\varphi \in H^{t-1/2}(\partial D)$ with $t \in \mathbb{R}$.

(b) The mapping
$$P(\alpha) : H^{t-1/2}(\partial D) \longrightarrow H^t(D),$$
defined by $w = P(\alpha)\varphi$, is an isomorphism of $H^{t-1/2}(\partial D)$ onto the null space
$$N(\alpha - A, t) = \{u \in H^t(D) : (\alpha - A)\,u = 0 \text{ in } D\}$$
for all $t \in \mathbb{R}$; and its inverse is the trace operator $\gamma_0$ on $\partial D$:
$$H^{t-1/2}(\partial D) \xrightarrow{\ P(\alpha)\ } N(\alpha - A, t), \qquad H^{t-1/2}(\partial D) \xleftarrow{\ \gamma_0\ } N(\alpha - A, t).$$

We let
$$T(\alpha) : C^\infty(\partial D) \longrightarrow C^\infty(\partial D), \qquad \varphi \longmapsto L\,(P(\alpha)\varphi).$$
Since we have the formula
$$L\,(P(\alpha)\varphi) = \sum_{i,j=1}^{N-1} \alpha^{ij}(x')\,\frac{\partial^2 \varphi}{\partial x_i \partial x_j} + \sum_{i=1}^{N-1} \beta^i(x')\,\frac{\partial \varphi}{\partial x_i} + \gamma(x')\,\varphi + \mu(x')\,\frac{\partial}{\partial n}\bigl(P(\alpha)\varphi\bigr) - \alpha\,\delta(x')\,\varphi,$$
it follows that the operator $T(\alpha)$ can be written in the form
$$T(\alpha)\varphi = Q(\alpha)\varphi + \mu(x')\,\Pi(\alpha)\varphi, \tag{13.9}$$
where
$$Q(\alpha)\varphi = \sum_{i,j=1}^{N-1} \alpha^{ij}(x')\,\frac{\partial^2 \varphi}{\partial x_i \partial x_j} + \sum_{i=1}^{N-1} \beta^i(x')\,\frac{\partial \varphi}{\partial x_i} + \bigl(\gamma(x') - \alpha\,\delta(x')\bigr)\varphi,$$
$$\Pi(\alpha)\varphi = \gamma_1\bigl(P(\alpha)\varphi\bigr) = \left.\frac{\partial}{\partial n}\bigl(P(\alpha)\varphi\bigr)\right|_{\partial D}.$$

We remark that:

(1) The operator $Q(\alpha)$ is a second order degenerate elliptic differential operator on $\partial D$, and its complete symbol is given by the formula
$$-\sum_{i,j=1}^{N-1} \alpha^{ij}(x')\,\xi_i \xi_j + \sqrt{-1}\,\sum_{i=1}^{N-1} \beta^i(x')\,\xi_i + \bigl(\gamma(x') - \alpha\,\delta(x')\bigr),$$
where
$$\sum_{i,j=1}^{N-1} \alpha^{ij}(x')\,\xi_i \xi_j \ge 0 \quad \text{on the cotangent bundle } T^*(\partial D).$$

(2) The operator
$$\Pi(\alpha) = \gamma_1 P(\alpha) : C^\infty(\partial D) \longrightarrow C^\infty(\partial D)$$
is called the Dirichlet-to-Neumann operator. By arguing just as in the proof of Theorem 11.3, we find that the operator $\Pi(\alpha)$ is a classical, elliptic pseudo-differential operator of first order on $\partial D$, and its complete symbol is given by the formula
$$p_1(x', \xi') + \sqrt{-1}\,q_1(x', \xi') + \text{terms of order} \le 0 \text{ depending on } \alpha,$$
where
$$p_1(x', \xi') = \frac{\bigl(4 A_2(x')\,a_0(x', \xi') - a_1(x', \xi')^2\bigr)^{1/2}}{2 A_2(x')}, \tag{13.10a}$$
$$q_1(x', \xi') = -\frac{a_1(x', \xi')}{2 A_2(x')} \tag{13.10b}$$
(cf. formulas (11.4) and (11.5)). We remark that $p_1(x', \xi') < 0$ on the bundle $T^*(\partial D) \setminus \{0\}$ of non-zero cotangent vectors. Therefore, we obtain that the operator
$$T(\alpha) = L P(\alpha) = Q(\alpha) + \mu(x')\,\Pi(\alpha),$$


defined by formula (13.9), is a classical pseudo-differential operator of second order on $\partial D$, and its complete symbol is given by the formula
$$-\sum_{i,j=1}^{N-1} \alpha^{ij}(x')\,\xi_i \xi_j + \mu(x')\,p_1(x', \xi') + \sqrt{-1}\,\Bigl(\mu(x')\,q_1(x', \xi') + \sum_{i=1}^{N-1} \beta^i(x')\,\xi_i\Bigr) + \text{terms of order} \le 0 \text{ depending on } \alpha. \tag{13.11}$$

Since the operator $T(\alpha) = L P(\alpha) : C^\infty(\partial D) \to C^\infty(\partial D)$ extends to a continuous linear operator
$$T(\alpha) : H^\sigma(\partial D) \longrightarrow H^{\sigma-2}(\partial D)$$
for all $\sigma \in \mathbb{R}$, we can introduce a densely defined, closed linear operator
$$T(\alpha) : H^{s-5/2+\kappa}(\partial D) \longrightarrow H^{s-5/2}(\partial D)$$
as follows:

(α) The domain $D(T(\alpha))$ of $T(\alpha)$ is the space
$$D(T(\alpha)) = \{\varphi \in H^{s-5/2+\kappa}(\partial D) : T(\alpha)\varphi \in H^{s-5/2}(\partial D)\}.$$
(β) $T(\alpha)\varphi = L\,(P(\alpha)\varphi)$ for every $\varphi \in D(T(\alpha))$.

Here $\kappa$ is a positive constant which will be fixed later on (see formula (13.24) below). The situation can be visualized as in Fig. 13.1.

Fig. 13.1 The mapping properties of the operators $T(\alpha)$ and $L P(\alpha)$:
$$C(\partial D) \xrightarrow{\ L P(\alpha)\ } C(\partial D), \qquad D(T(\alpha)) \xrightarrow{\ T(\alpha)\ } H^{s-5/2}(\partial D), \qquad C^\infty(\partial D) \xrightarrow{\ L P(\alpha)\ } C^\infty(\partial D).$$
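As a concrete sanity check on the Dirichlet-to-Neumann operator $\Pi(\alpha)$, the following sketch verifies its symbol symbolically in the flat model (our illustration, assuming $A = \Delta$ on the half-space $x_N > 0$ with flat boundary, not the general operator of the text):

```python
import sympy as sp

x1, xN, xi, alpha = sp.symbols("x1 x_N xi alpha", positive=True)

# Bounded solution of the Dirichlet problem (alpha - Laplacian) u = 0 on the
# half-space x_N > 0 with boundary data exp(i*x1*xi) on {x_N = 0}: this is the
# flat-model analogue of the Poisson operator P(alpha).
u = sp.exp(sp.I * x1 * xi) * sp.exp(-sp.sqrt(xi**2 + alpha) * xN)

# The PDE (alpha - Laplacian) u = 0 holds identically:
pde = sp.simplify(alpha * u - sp.diff(u, x1, 2) - sp.diff(u, xN, 2))
assert pde == 0

# Dirichlet-to-Neumann symbol: normal derivative at the boundary divided by
# the boundary value.  It is first order in xi and negative, as p_1 < 0 predicts.
dtn = sp.simplify(sp.diff(u, xN).subs(xN, 0) / u.subs(xN, 0))
print(dtn)
```

The computed symbol $-\sqrt{|\xi'|^2 + \alpha}$ is elliptic of first order with negative principal part, consistent with the sign assertion $p_1 < 0$ above.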


Then, by arguing as in Sect. 11.3 we can prove that the problems of existence, uniqueness and regularity of solutions of the boundary value problem (13.7) are reduced to the same problems for the operator $T(\alpha)$, respectively (cf. Theorems 11.14 through 11.15).

Step 2: The next lemma is an essential step in the proof of Theorem 13.5 (see Oleĭnik–Radkevič [138, Theorem 2.4.2], Fediĭ [55, Theorem 2], Paneyakh [140, Theorem 4.2]):

Lemma 13.6 Let $A$ and $L$ be as in Theorem 13.1, and assume that hypothesis (A.1) is satisfied. Then there exists a constant $0 < \kappa \le 1$ such that we have, for all $s \in \mathbb{R}$,
$$\varphi \in \mathcal{D}'(\partial D),\ T(\alpha)\varphi \in H^s(\partial D) \implies \varphi \in H^{s+\kappa}(\partial D). \tag{13.12}$$
Furthermore, for any $t < s + \kappa$, there exists a constant $C_{s,t} > 0$ such that
$$|\varphi|_{H^{s+\kappa}(\partial D)} \le C_{s,t}\,\bigl(|T(\alpha)\varphi|_{H^s(\partial D)} + |\varphi|_{H^t(\partial D)}\bigr). \tag{13.13}$$

Thus, the operator $T(\alpha)$ is hypoelliptic with loss of $2 - \kappa$ derivatives.

Proof The proof of Lemma 13.6 is inspired by Oleĭnik–Radkevič [138], Fediĭ [55] and Paneyakh [140]. Hence we only give a sketch of the proof.

Step 2-1: If $P = P(x, D)$ is a pseudo-differential operator with symbol $p(x, \xi)$, we denote by $P^{(j)} = P^{(j)}(x, D)$ a pseudo-differential operator with symbol
$$p^{(j)}(x, \xi) = \frac{\partial p}{\partial \xi_j}(x, \xi),$$
and by $P_{(j)} = P_{(j)}(x, D)$ a pseudo-differential operator with symbol
$$p_{(j)}(x, \xi) = \frac{1}{\sqrt{-1}}\,\frac{\partial p}{\partial x_j}(x, \xi),$$
respectively. For example, since the principal symbol $q(\alpha)(x', \xi')$ of the differential operator $Q(\alpha)$ is given by the formula
$$q(\alpha)(x', \xi') = -\sum_{i,j=1}^{N-1} \alpha^{ij}(x')\,\xi_i \xi_j,$$
we have the formulas, for $1 \le k \le N - 1$:
$$q(\alpha)^{(k)}(x', \xi') = \frac{\partial q(\alpha)}{\partial \xi_k} = -2 \sum_{j=1}^{N-1} \alpha^{jk}(x')\,\xi_j,$$
$$q(\alpha)_{(k)}(x', \xi') = \frac{1}{\sqrt{-1}}\,\frac{\partial q(\alpha)}{\partial x_k} = \sqrt{-1}\,\sum_{i,j=1}^{N-1} \frac{\partial \alpha^{ij}}{\partial x_k}(x')\,\xi_i \xi_j.$$
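The two formulas above can be checked mechanically. The following sketch (our illustration; the coefficient functions `a00`, `a01`, `a11` are hypothetical placeholders for the $\alpha^{ij}(x')$) verifies them with sympy in the case $N - 1 = 2$:

```python
import sympy as sp

x = sp.symbols("x1 x2")
xi = sp.symbols("xi1 xi2")
# Symmetric placeholder coefficients alpha^{ij}(x'), i,j = 1,2 (hypothetical
# data): using min/max in the name forces a[i][j] == a[j][i].
a = [[sp.Function(f"a{min(i, j)}{max(i, j)}")(*x) for j in range(2)]
     for i in range(2)]

# Principal symbol q(alpha)(x', xi') = -sum a^{ij} xi_i xi_j.
q = -sum(a[i][j] * xi[i] * xi[j] for i in range(2) for j in range(2))

for k in range(2):
    # q^{(k)} = dq/dxi_k = -2 sum_j a^{jk} xi_j   (symmetry of a^{ij} is used).
    assert sp.simplify(sp.diff(q, xi[k])
                       + 2 * sum(a[j][k] * xi[j] for j in range(2))) == 0
    # q_{(k)} = (1/sqrt(-1)) dq/dx_k = sqrt(-1) sum_{i,j} (da^{ij}/dx_k) xi_i xi_j.
    lhs = sp.diff(q, x[k]) / sp.I
    rhs = sp.I * sum(sp.diff(a[i][j], x[k]) * xi[i] * xi[j]
                     for i in range(2) for j in range(2))
    assert sp.simplify(lhs - rhs) == 0

print("symbol identities verified for N - 1 = 2")
```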

Step 2-2: First, we prove an energy estimate for the differential operator Q(α) (see [138, Theorem 2.6.1], [140, Theorem 2.1]):


Proposition 13.7 (the energy estimate) Let $A$ and $L$ be as in Theorem 13.1, and let $(U, \psi)$ be a chart on $\partial D$ with $\psi(x') = (x_1, \ldots, x_{N-1})$. Then, for every compact $K \subset U$ and $s \ge 0$, there exists a constant $C_{K,s} > 0$ such that
$$\sum_{i=1}^{N-1} \Bigl|\sum_{j=1}^{N-1} \alpha^{ij}(x')\,D_j \varphi\Bigr|^2_{H^s(\partial D)} + \sum_{\ell,m=1}^{N-1} \Bigl|\sum_{j=1}^{N-1} \frac{\partial \alpha^{\ell m}}{\partial x_j}\,D_\ell D_m \varphi\Bigr|^2_{H^{s-1}(\partial D)} \le C_{K,s}\,\bigl(|T(\alpha)\varphi|^2_{L^2(\partial D)} + |\varphi|^2_{H^{2s}(\partial D)}\bigr) \tag{13.14}$$
for all $\varphi \in C^\infty_K(U)$.

Proof In the proof we shall denote by the letter $C$ a generic positive constant depending only on $K$ and $s$. We recall formula (13.9):
$$T(\alpha) = Q(\alpha) + \mu(x')\,\Pi(\alpha),$$
and rewrite the differential operator $Q(\alpha)$ in the following form:
$$Q(\alpha)\varphi = -\sum_{i,j=1}^{N-1} D_j\bigl(\alpha^{ij}(x')\,D_i \varphi\bigr) + \sum_{i=1}^{N-1} \Bigl(\sum_{j=1}^{N-1} D_j \alpha^{ij}(x') + \sqrt{-1}\,\beta^i(x')\Bigr) D_i \varphi + \bigl(\gamma(x') - \alpha\,\delta(x')\bigr)\varphi.$$
Then we have, by integration by parts,
$$\operatorname{Re}\,(Q(\alpha)\varphi, \varphi) = -\sum_{i,j=1}^{N-1} \bigl(\alpha^{ij}(x')\,D_i \varphi,\, D_j \varphi\bigr) + \bigl(h(x')\,\varphi,\, \varphi\bigr)$$
with some function $h(x') \in C^\infty(\partial D)$. Here $(\cdot, \cdot)$ is the inner product of the Hilbert space $L^2(\partial D)$. Thus, in view of the Schwarz inequality it follows that
$$\sum_{i,j=1}^{N-1} \bigl(\alpha^{ij}(x')\,D_i \varphi,\, D_j \varphi\bigr) \le -\operatorname{Re}\,(Q(\alpha)\varphi, \varphi) + C\,|\varphi|^2_{L^2(\partial D)} \tag{13.15}$$
for all $\varphi \in C^\infty(\partial D)$.

On the other hand, the operator $\mu(x')\,\Pi(\alpha)$ is a first order pseudo-differential operator with principal symbol
$$\mu(x')\,\bigl(p_1(x', \xi') + \sqrt{-1}\,q_1(x', \xi')\bigr),$$
and
$$\mu(x')\,p_1(x', \xi') \le 0 \quad \text{on } T^*(\partial D).$$


Hence, by applying the sharp Gårding inequality (Theorem 9.51) to the operator $-\mu\,\Pi(\alpha)$ we obtain that
$$-\operatorname{Re}\,\bigl(\mu(x')\,\Pi(\alpha)\varphi,\, \varphi\bigr) \ge -C\,|\varphi|^2_{L^2(\partial D)} \quad \text{for all } \varphi \in C^\infty_K(U). \tag{13.16}$$
Therefore, by combining estimates (13.15) and (13.16) we obtain from formula (13.9) that
$$\sum_{i,j=1}^{N-1} \bigl(\alpha^{ij}(x')\,D_i \varphi,\, D_j \varphi\bigr) \le -\operatorname{Re}\,(T(\alpha)\varphi, \varphi) + C\,|\varphi|^2_{L^2(\partial D)} \tag{13.17}$$

for all $\varphi \in C^\infty_K(U)$. The desired energy estimate (13.14) follows from estimate (13.17), just as in the proof of Oleĭnik–Radkevič [138, Theorem 2.6.1]. The proof of Proposition 13.7 is complete. □

Step 2-3: Secondly, we prove an energy estimate for the pseudo-differential operator $T(\alpha) = Q(\alpha) + \mu(x')\,\Pi(\alpha)$ (see [138, Theorem 2.6.2] and [55, Lemma 7]).

Proposition 13.8 (the energy estimate) Let $A$ and $L$ be as in Theorem 13.1, and assume that hypothesis (A.1) is satisfied. Then, for any point $x_0'$ of $\partial D$ we can find a neighborhood $U(x_0')$ of $x_0'$ such that: for every compact $K \subset U(x_0')$, there exists a constant $0 < \kappa(K) \le 1$ such that we have, for all $s \in \mathbb{R}$ and $t < s + \kappa(K)$,
$$\sum_{j=1}^{N-1} \Bigl(\bigl|T(\alpha)^{(j)}\varphi\bigr|^2_{H^{s+\kappa/2}(\partial D)} + \bigl|T(\alpha)_{(j)}\varphi\bigr|^2_{H^{s-1+\kappa/2}(\partial D)}\Bigr) + |\varphi|^2_{H^{s+\kappa}(\partial D)} \le C_{K,s,t}\,\bigl(|T(\alpha)\varphi|^2_{H^s(\partial D)} + |\varphi|^2_{H^t(\partial D)}\bigr) \tag{13.18}$$
for all $\varphi \in C^\infty_K(U(x_0'))$, with a constant $C_{K,s,t} > 0$.

Proof (1) First, we prove the energy estimate (13.18) in the case $\mu(x_0') = 0$. In doing so, we make essential use of Theorem 9.60 (or Theorem 9.53) due to Fefferman–Phong [57]. Let $x_0'$ be an arbitrary point of $\partial D$ such that $\mu(x_0') = 0$. Since hypothesis (A.1) is satisfied, we can find a neighborhood $U(x_0')$ of $x_0'$ such that we have, for all sufficiently small $\rho > 0$,
$$B_E(x', \rho) \subset B_{L_0}(x', 2C\rho^\varepsilon) \quad \text{for every point } x' \in U(x_0').$$
Thus, by applying Theorem 9.53 to the operator $Q(\alpha)$ we obtain that: for every compact $K \subset U(x_0')$, there exist constants $c_K > 0$ and $C_K > 0$ such that


$$-\operatorname{Re}\,(Q(\alpha)\varphi, \varphi) \ge c_K\,|\varphi|^2_{H^\varepsilon(\partial D)} - C_K\,|\varphi|^2_{L^2(\partial D)} \quad \text{for all } \varphi \in C^\infty_K(U(x_0')). \tag{13.19}$$

On the other hand, by applying the sharp Gårding inequality (Theorem 9.51) to the operator $-\mu\,\Pi(\alpha)$ we have the estimate
$$-\operatorname{Re}\,\bigl(\mu(x')\,\Pi(\alpha)\varphi,\, \varphi\bigr) \ge -C_K'\,|\varphi|^2_{L^2(\partial D)} \quad \text{for all } \varphi \in C^\infty_K(U(x_0')), \tag{13.20}$$
with a constant $C_K' > 0$. Hence, by combining estimates (13.19) and (13.20) we obtain from formula (13.9) that
$$-\operatorname{Re}\,(T(\alpha)\varphi, \varphi) \ge c_K\,|\varphi|^2_{H^\varepsilon(\partial D)} - (C_K + C_K')\,|\varphi|^2_{L^2(\partial D)}.$$
By virtue of the Schwarz inequality, this gives that
$$|\varphi|^2_{H^\varepsilon(\partial D)} \le C_K''\,\bigl(|T(\alpha)\varphi|^2_{H^s(\partial D)} + |\varphi|^2_{L^2(\partial D)}\bigr) \tag{13.21}$$
for all $\varphi \in C^\infty_K(U(x_0'))$, with a constant $C_K'' > 0$. Therefore, by using estimates (13.14) and (13.21) we can obtain the desired energy estimate (13.18) with $\kappa(K) = \varepsilon$, just as in Oleĭnik–Radkevič [138, Theorem 2.6.2] and Paneyakh [140, Theorem 2.5].

(2) Now we prove the energy estimate (13.18) in the case $\mu(x_0') > 0$. In doing so, we make essential use of Theorem 9.56 due to Hörmander [89]. In view of formula (13.11), we have the following:

(a) The principal symbol of $T(\alpha)$ is equal to
$$-\sum_{i,j=1}^{N-1} \alpha^{ij}(x')\,\xi_i \xi_j.$$

(b) The subprincipal symbol of $T(\alpha)$ on the characteristic set
$$\Sigma = \Bigl\{(x', \xi') \in T^*(\partial D) \setminus \{0\} : \sum_{i,j=1}^{N-1} \alpha^{ij}(x')\,\xi_i \xi_j = 0\Bigr\}$$
is equal to
$$\mu(x')\,p_1(x', \xi') + \sqrt{-1}\,\Bigl(\mu(x')\,q_1(x', \xi') + \sum_{i=1}^{N-1} \beta^i(x')\,\xi_i - \sum_{i,j=1}^{N-1} \frac{\partial \alpha^{ij}}{\partial x_j}(x')\,\xi_i\Bigr). \tag{13.22}$$


Since we have the inequalities
$$\sum_{i,j=1}^{N-1} \alpha^{ij}(x')\,\xi_i \xi_j \ge 0 \quad \text{on } T^*(\partial D), \qquad \mu(x')\,p_1(x', \xi') \le 0 \quad \text{on } T^*(\partial D),$$
we find that all the hypotheses of Theorem 9.56 are satisfied for the pseudo-differential operator $-T(\alpha)$.

Let $x_0'$ be an arbitrary point of $\partial D$ such that $\mu(x_0') > 0$. Then we can find a neighborhood $U(x_0')$ of $x_0'$ such that
$$\mu(x')\,p_1(x', \xi') < 0 \quad \text{for all } x' \in U(x_0') \text{ and } \xi' \in T^*_{x'}(\partial D) \setminus \{0\},$$
since we have the assertion $p_1(x', \xi') < 0$ on $T^*(\partial D) \setminus \{0\}$. In view of formula (13.22), this implies that condition (ii) of Theorem 9.56 is satisfied. Hence, by applying Theorem 9.56 to the pseudo-differential operator $-T(\alpha)$, we obtain that: for every compact $K \subset U(x_0')$, there exists a constant $C_K > 0$ such that
$$|\varphi|^2_{H^1(\partial D)} \le C_K\,\bigl(|T(\alpha)\varphi|^2_{L^2(\partial D)} + |\varphi|^2_{L^2(\partial D)}\bigr) \tag{13.23}$$
for all $\varphi \in C^\infty_K(U(x_0'))$. Therefore, by using estimates (13.14) and (13.23) we can obtain the desired energy estimate (13.18) with $\kappa(K) = 1$, just as in Oleĭnik–Radkevič [138, Theorem 2.6.2] and Paneyakh [140, Theorem 2.2]. The proof of Proposition 13.8 is complete. □

Remark 13.9 The constant $\kappa(K)$ in the proposition can be chosen as follows:
$$\kappa(K) = \begin{cases} 1 & \text{if } \mu(x_0') > 0, \\ \varepsilon & \text{if } \mu(x_0') = 0. \end{cases}$$

Here $\varepsilon$ is the constant in hypothesis (A.1) (see the proof of Proposition 13.8).

Step 2-4 (End of the proof of Lemma 13.6): By Proposition 13.8, we can cover the boundary $\partial D$ by a finite number of local charts $\{(U_j, \psi_j)\}_{j=1}^{d}$ in each of which estimate (13.18) holds true for all $\varphi \in C^\infty_0(U_j)$. Let $\{\varphi_j\}_{j=1}^{d}$ be a partition of unity subordinate to the covering $\{U_j\}_{j=1}^{d}$, and choose a function $\theta_j \in C^\infty_0(U_j)$ such that $\theta_j = 1$ on $\operatorname{supp} \varphi_j$.


We let


$$\kappa := \min_{1 \le j \le d} \kappa\bigl(\operatorname{supp} \theta_j\bigr), \tag{13.24}$$
where $\kappa(\operatorname{supp} \theta_j)$ is the constant with $K = \operatorname{supp} \theta_j$ defined in Remark 13.9. Note that $0 < \kappa \le 1$.

Now let $\varphi$ be an arbitrary element of $\mathcal{D}'(\partial D)$ such that $T(\alpha)\varphi \in H^s(\partial D)$. Then we may assume that $\varphi \in H^t(\partial D)$ for some $t < s + \kappa$, since we have, by part (ii) of Theorem 7.31,
$$\mathcal{D}'(\partial D) = \bigcup_{t \in \mathbb{R}} H^t(\partial D)$$
for the compact manifold $\partial D$ without boundary. Thus, in order to prove the lemma it suffices to show the following local version of Lemma 13.6 (see [138, Theorem 2.6.2] and [55, Lemma 7]):
$$\varphi_j \varphi \in H^t(\partial D),\ T(\alpha)\varphi \in H^s(\partial D) \implies \varphi_j \varphi \in H^{s+\kappa}(\partial D), \tag{13.12$'$}$$
$$|\varphi_j \varphi|^2_{H^{s+\kappa}(\partial D)} \le C\,\bigl(|T(\alpha)\varphi|^2_{H^s(\partial D)} + |\varphi_j \varphi|^2_{H^t(\partial D)}\bigr). \tag{13.13$'$}$$
Here and in the following the letter $C$ denotes a generic positive constant depending only on $s$ and $t$.

We choose constants $m$ and $k$ such that
$$0 < m < s + \kappa - t, \qquad k = [s] + 1,$$
where $[s]$ stands for the integral part of $s$. Then, by applying the first inequality of Theorem 8.33 with $s_1 := \kappa$ to $\varphi_j \varphi$, and further estimate (13.18) with $s := 0$ and $t := t - s\ (< \kappa)$ to $(\varphi_j \varphi) * \chi_\varepsilon$, we obtain that
$$|\varphi_j \varphi|^2_{H^{(s+\kappa,m,\rho)}(\partial D)} \le C\,\Bigl(\int_0^1 \bigl|(\varphi_j \varphi) * \chi_\varepsilon\bigr|^2_{H^\kappa(\partial D)}\,\Bigl(1 + \frac{\rho^2}{\varepsilon^2}\Bigr)^{-m} \varepsilon^{-2s}\,\frac{d\varepsilon}{\varepsilon} + |\varphi_j \varphi|^2_{H^t(\partial D)}\Bigr) \tag{13.25}$$
$$\le C\,\Bigl(\int_0^1 \bigl|T(\alpha)\bigl((\varphi_j \varphi) * \chi_\varepsilon\bigr)\bigr|^2_{L^2(\partial D)}\,\Bigl(1 + \frac{\rho^2}{\varepsilon^2}\Bigr)^{-m} \varepsilon^{-2s}\,\frac{d\varepsilon}{\varepsilon} + \int_0^1 \bigl|(\varphi_j \varphi) * \chi_\varepsilon\bigr|^2_{H^{t-s}(\partial D)}\,\Bigl(1 + \frac{\rho^2}{\varepsilon^2}\Bigr)^{-m} \varepsilon^{-2s}\,\frac{d\varepsilon}{\varepsilon} + |\varphi_j \varphi|^2_{H^t(\partial D)}\Bigr).$$

However, by using the second inequality of Theorem 8.33 with $s_1 := t - s$, we can estimate the second term on the right-hand side of (13.25) as follows:
$$\int_0^1 \bigl|(\varphi_j \varphi) * \chi_\varepsilon\bigr|^2_{H^{t-s}(\partial D)}\,\Bigl(1 + \frac{\rho^2}{\varepsilon^2}\Bigr)^{-m} \varepsilon^{-2s}\,\frac{d\varepsilon}{\varepsilon} \le C\,|\varphi_j \varphi|^2_{H^{(t,m,\rho)}(\partial D)} \le C\,|\varphi_j \varphi|^2_{H^t(\partial D)}. \tag{13.26}$$

Furthermore, in light of the pseudo-local property of pseudo-differential operators, we can estimate the first term on the right-hand side of (13.25) as follows (cf. Oleĭnik–Radkevič [138, inequality (2.4.46)]):
$$\int_0^1 \bigl|T(\alpha)\bigl((\varphi_j \varphi) * \chi_\varepsilon\bigr)\bigr|^2_{L^2(\partial D)}\,\Bigl(1 + \frac{\rho^2}{\varepsilon^2}\Bigr)^{-m} \varepsilon^{-2s}\,\frac{d\varepsilon}{\varepsilon} \le C\,\bigl(|T(\alpha)\varphi|^2_{H^s(\partial D)} + |\varphi_j \varphi|^2_{H^t(\partial D)}\bigr). \tag{13.27}$$

Therefore, by carrying estimates (13.26) and (13.27) into estimate (13.25) we have the estimate
$$|\varphi_j \varphi|^2_{H^{(s+\kappa,m,\rho)}(\partial D)} \le C\,\bigl(|T(\alpha)\varphi|^2_{H^s(\partial D)} + |\varphi_j \varphi|^2_{H^t(\partial D)}\bigr).$$
By virtue of Lemma 8.32, this proves the desired assertions (13.12$'$) and (13.13$'$). The proof of Lemma 13.6 is complete. □

Step 3: The regularity result (13.8) for the boundary value problem (13.7) is an immediate consequence of the regularity result (13.12) for the pseudo-differential operator $T(\alpha)$.

Step 4: We prove the following uniqueness result: for any $\alpha > 0$, we have the assertion
$$u \in H^{s-2+\kappa}(D),\ (\alpha - A)\,u = 0 \text{ in } D,\ Lu = 0 \text{ on } \partial D \implies u = 0 \text{ in } D. \tag{13.28}$$
The regularity result (13.8) tells us that
$$u \in H^{s-2+\kappa}(D),\ (\alpha - A)\,u = 0 \text{ in } D,\ Lu = 0 \text{ on } \partial D \implies u \in C^\infty(\overline D).$$


Therefore, the uniqueness result (13.28) is an immediate consequence of the following maximum principle (see Paneyakh [140, Theorem 2]):

Proposition 13.10 (the maximum principle) Let $A$ and $L$ be as in Theorem 13.1. Then we have, for any $\alpha > 0$,
$$u \in C^2(\overline D),\ (\alpha - A)\,u \le 0 \text{ in } D,\ Lu \ge 0 \text{ on } \partial D \implies u \le 0 \text{ on } \overline D.$$

Proof If $u$ is a constant, then we have the inequality
$$0 \le (A - \alpha)\,u = (c(x) - \alpha)\,u \quad \text{in } D.$$
This implies that $u$ is non-positive, since $c(x) \le 0$ in $D$ and $\alpha > 0$.

Now we consider the case where $u$ is not a constant. Assume, to the contrary, that
$$\max_{\overline D} u > 0.$$
Then, by applying Theorem 10.6 (the weak maximum principle) to the operator $A - \alpha$ we obtain that there exists a point $x_0'$ of $\partial D$ such that
$$u(x_0') = \max_{\overline D} u, \qquad u(x) < u(x_0') \quad \text{for all } x \in D.$$
Thus it follows from an application of Hopf's boundary point lemma (Lemma 10.12) with $\Sigma_3 := \partial D$ that
$$\frac{\partial u}{\partial n}(x_0') < 0.$$
Furthermore, we have the assertions
$$\frac{\partial u}{\partial x_i}(x_0') = 0 \quad (1 \le i \le N-1), \qquad Au(x_0') \ge \alpha\,u(x_0') > 0,$$
and also
$$\sum_{i,j=1}^{N-1} \alpha^{ij}(x_0')\,\frac{\partial^2 u}{\partial x_i \partial x_j}(x_0') \le 0,$$
since the matrices $\bigl(\alpha^{ij}(x_0')\bigr)$ and $\Bigl(-\dfrac{\partial^2 u}{\partial x_i \partial x_j}(x_0')\Bigr)$ are both positive semi-definite. Hence, in view of the transversality condition (13.5) for $L$ it follows that


$$Lu(x_0') = \sum_{i,j=1}^{N-1} \alpha^{ij}(x_0')\,\frac{\partial^2 u}{\partial x_i \partial x_j}(x_0') + \gamma(x_0')\,u(x_0') + \mu(x_0')\,\frac{\partial u}{\partial n}(x_0') - \delta(x_0')\,Au(x_0') < 0.$$
This contradicts the hypothesis $Lu \ge 0$ on $\partial D$. The proof of Proposition 13.10 is complete. □

Step 5: Finally, we prove the following existence result:
$$\left\{\begin{array}{l} \text{For each } \alpha > 0, \text{ the boundary value problem (13.7) has a solution} \\ u \in H^{s-2+\kappa}(D) \text{ for any function } f \in H^{s-2}(D) \\ \text{and any function } \varphi \in H^{s-5/2}(\partial D) \text{ with } s \ge 3. \end{array}\right. \tag{13.29}$$

We make use of Theorem 11.19 and its proof with $\Omega := D$.

Step 5-1: Now we make use of the Agmon–Nirenberg method just as in Sect. 11.4. More precisely, by replacing the parameter $\alpha$ in problem (13.7) by the differential operator
$$-\frac{\partial^2}{\partial y^2}$$
on the unit circle $S = \mathbb{R}/2\pi\mathbb{Z}$, we consider the boundary value problem
$$\begin{cases} \Bigl(A + \dfrac{\partial^2}{\partial y^2}\Bigr)\tilde u = \tilde f & \text{in } D \times S, \\ L\tilde u = \tilde\varphi & \text{on } \partial D \times S \end{cases} \tag{$\tilde *$}$$
on the product domain $D \times S$ (see Fig. 13.2).

Fig. 13.2 The product domain $D \times S$.

Then, by applying Theorem 11.19 with $\Omega := D$ we find that


$$\left\{\begin{array}{l} \text{If the index of the boundary value problem } (\tilde *) \text{ is finite,} \\ \text{then the index of the boundary value problem (13.7)} \\ \text{is equal to zero for all } \alpha \ge 0. \end{array}\right. \tag{13.30}$$
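For orientation, the effect of the Agmon–Nirenberg substitution can be recorded in one line (our gloss on the standard device, not a statement from the text): separating variables on the circle turns the $y$-derivatives into integer frequencies.

```latex
% For \tilde u(x, y) = u(x)\, e^{\sqrt{-1}\,k y} with k \in \mathbb{Z}:
\[
  \Bigl(A + \frac{\partial^2}{\partial y^2}\Bigr)\tilde u
  = e^{\sqrt{-1}\,k y}\,\bigl(A - k^2\bigr)u ,
\]
% so the single problem (tilde-*) on D x S simultaneously encodes the problems
% (13.7) for the parameter values \alpha = k^2, k \in \mathbb{Z}.
```

This is why finiteness of the index of $(\tilde *)$ controls the index of (13.7) for all $\alpha \ge 0$ at once.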

Step 5-2: We reduce the study of problem $(\tilde *)$ to that of a pseudo-differential operator on the boundary, just as in problem $(*)$. By applying Theorem 11.5 to the strictly elliptic differential operator
$$\tilde A := A + \frac{\partial^2}{\partial y^2} \quad \text{in the product space } D \times S,$$
we obtain the following results:

(ã) The Dirichlet problem
$$\begin{cases} \tilde A \tilde w = 0 & \text{in } D \times S, \\ \gamma_0 \tilde w = \tilde\varphi & \text{on } \partial D \times S \end{cases}$$
has a unique solution $\tilde w$ in $H^t(D \times S)$ for any function $\tilde\varphi \in H^{t-1/2}(\partial D \times S)$, where $t \in \mathbb{R}$.

(b̃) The mapping
$$\tilde P : H^{t-1/2}(\partial D \times S) \longrightarrow H^t(D \times S),$$
defined by the formula $\tilde w = \tilde P \tilde\varphi$, is an isomorphism of $H^{t-1/2}(\partial D \times S)$ onto the null space
$$N(\tilde A, t) = \{\tilde u \in H^t(D \times S) : \tilde A \tilde u = 0 \text{ in } D \times S\}$$
for all $t \in \mathbb{R}$; and its inverse is the trace operator $\gamma_0$ on $\partial D \times S$:
$$H^{t-1/2}(\partial D \times S) \xrightarrow{\ \tilde P\ } N(\tilde A, t), \qquad H^{t-1/2}(\partial D \times S) \xleftarrow{\ \gamma_0\ } N(\tilde A, t).$$

We let
$$\tilde T : C^\infty(\partial D \times S) \longrightarrow C^\infty(\partial D \times S), \qquad \tilde\varphi \longmapsto L\,(\tilde P \tilde\varphi).$$
Then the pseudo-differential operator $\tilde T$ can be decomposed as follows:
$$\tilde T = L \tilde P = \tilde Q + \mu(x')\,\tilde\Pi, \tag{13.31}$$


where
$$\tilde Q \tilde\varphi = \sum_{i,j=1}^{N-1} \alpha^{ij}(x')\,\frac{\partial^2 \tilde\varphi}{\partial x_i \partial x_j} + \delta(x')\,\frac{\partial^2 \tilde\varphi}{\partial y^2} + \sum_{i=1}^{N-1} \beta^i(x')\,\frac{\partial \tilde\varphi}{\partial x_i} + \gamma(x')\,\tilde\varphi,$$
$$\tilde\Pi \tilde\varphi = \left.\frac{\partial}{\partial n}\bigl(\tilde P \tilde\varphi\bigr)\right|_{\partial D \times S}.$$

The operator $\tilde Q$ is a second order differential operator on the product space $\partial D \times S$, and its symbol is given by the formula
$$-\Bigl(\sum_{i,j=1}^{N-1} \alpha^{ij}(x')\,\xi_i \xi_j + \delta(x')\,\eta^2\Bigr) + \sqrt{-1}\,\sum_{i=1}^{N-1} \beta^i(x')\,\xi_i + \gamma(x'),$$
where $\eta$ is the dual variable of $y$ in the cotangent bundle $T^*(S)$. We remark that
$$\sum_{i,j=1}^{N-1} \alpha^{ij}(x')\,\xi_i \xi_j + \delta(x')\,\eta^2 \ge 0 \quad \text{on the cotangent bundle } T^*(\partial D \times S).$$

Furthermore, by arguing as in the proof of Theorem 11.3 we find that the operator $\tilde\Pi$ is a classical, elliptic pseudo-differential operator of first order on $\partial D \times S$, and its principal symbol is given by (cf. formulas (13.10a) and (13.10b)):
$$\tilde p_1(x', \xi', y, \eta) + \sqrt{-1}\,\tilde q_1(x', \xi', y, \eta),$$
where
$$\tilde p_1(x', \xi', y, \eta) = \frac{\bigl(4 A_2(x')\,(a_0(x', \xi') - \eta^2) - a_1(x', \xi')^2\bigr)^{1/2}}{2 A_2(x')}, \tag{13.32a}$$
$$\tilde q_1(x', \xi', y, \eta) = q_1(x', \xi') = -\frac{a_1(x', \xi')}{2 A_2(x')}. \tag{13.32b}$$
Note that $\tilde p_1(x', \xi', y, \eta) < 0$ on the bundle $T^*(\partial D \times S) \setminus \{0\}$ of non-zero cotangent vectors.

Therefore, the operator
$$\tilde T = \tilde Q + \mu(x')\,\tilde\Pi,$$
defined by formula (13.31), is a classical pseudo-differential operator of second order on $\partial D \times S$, and its complete symbol is given by the formula


$$\Bigl[-\Bigl(\sum_{i,j=1}^{N-1} \alpha^{ij}(x')\,\xi_i \xi_j + \delta(x')\,\eta^2\Bigr) + \mu(x')\,\tilde p_1(x', \xi', y, \eta)\Bigr] + \sqrt{-1}\,\Bigl(\mu(x')\,\tilde q_1(x', \xi', y, \eta) + \sum_{i=1}^{N-1} \beta^i(x')\,\xi_i\Bigr) + \text{terms of order} \le 0. \tag{13.33}$$

Since the operator
$$\tilde T = L \tilde P : C^\infty(\partial D \times S) \longrightarrow C^\infty(\partial D \times S)$$
extends to a continuous linear operator
$$\tilde T : H^s(\partial D \times S) \longrightarrow H^{s-2}(\partial D \times S)$$
for all $s \in \mathbb{R}$, we can associate with the boundary value problem $(\tilde *)$ a densely defined, closed linear operator
$$\tilde T : H^{s-5/2+\kappa}(\partial D \times S) \longrightarrow H^{s-5/2}(\partial D \times S)$$
as follows:

(α̃) The domain $D(\tilde T)$ of $\tilde T$ is the space
$$D(\tilde T) = \{\tilde\varphi \in H^{s-5/2+\kappa}(\partial D \times S) : \tilde T \tilde\varphi \in H^{s-5/2}(\partial D \times S)\}.$$
(β̃) $\tilde T \tilde\varphi = L\,(\tilde P \tilde\varphi)$ for every $\tilde\varphi \in D(\tilde T)$.

The situation can be visualized as in Fig. 13.3 below. Then, just as for the boundary value problem (13.7), it is easy to see that the study of the boundary value problem $(\tilde *)$ is reduced to that of the pseudo-differential operator $\tilde T$.

Fig. 13.3 The mapping properties of the operator $\tilde T = L \tilde P$ and its closed realization:
$$H^{s-5/2+\kappa}(\partial D \times S) \longrightarrow H^{s-5/2}(\partial D \times S), \qquad D(\tilde T) \xrightarrow{\ \tilde T\ } H^{s-5/2}(\partial D \times S), \qquad C^\infty(\partial D \times S) \xrightarrow{\ \tilde T = L\tilde P\ } C^\infty(\partial D \times S).$$


Step 5-3: We show that:
$$\left\{\begin{array}{l} \text{If hypothesis (A.1) is satisfied, then the operator } \tilde T \text{ is a Fredholm} \\ \text{operator, that is, the index of the boundary value problem } (\tilde *) \text{ is finite.} \end{array}\right. \tag{13.34}$$

First, we remark that the transversality condition (13.5) of $L$ implies that
$$\delta(x') > 0 \quad \text{on } M = \{x' \in \partial D : \mu(x') = 0\}.$$
Hence we find that if hypothesis (A.1) is satisfied, then the following hypothesis is satisfied:

(Ã.1) There exists a constant $\tilde C > 0$ such that we have, for all sufficiently small $\rho > 0$,
$$B_E\bigl((x', y), \rho\bigr) \subset B_{\tilde L_0}\bigl((x', y), \tilde C \rho^\varepsilon\bigr) \quad \text{for every point } (x', y) \in \tilde M,$$
where
$$\tilde L_0 = \sum_{i,j=1}^{N-1} \alpha^{ij}(x')\,\frac{\partial^2}{\partial x_i \partial x_j} + \delta(x')\,\frac{\partial^2}{\partial y^2}$$
and
$$\tilde M = \{(x', y) \in \partial D \times S : \mu(x') = 0\} = M \times S.$$

Therefore, by arguing as in the proof of Lemma 13.6 we can obtain the following:

Lemma 13.11 Let $A$ and $L$ be as in Theorem 13.1, and assume that hypothesis (A.1) (and hence (Ã.1)) is satisfied. Then we have, for all $s \in \mathbb{R}$,
$$\tilde\varphi \in \mathcal{D}'(\partial D \times S),\ \tilde T \tilde\varphi \in H^{s-5/2}(\partial D \times S) \implies \tilde\varphi \in H^{s-5/2+\kappa}(\partial D \times S).$$
Furthermore, for any $t < s - 5/2 + \kappa$, there exists a constant $\tilde C_{s,t} > 0$ such that
$$|\tilde\varphi|_{H^{s-5/2+\kappa}(\partial D \times S)} \le \tilde C_{s,t}\,\bigl(|\tilde T \tilde\varphi|_{H^{s-5/2}(\partial D \times S)} + |\tilde\varphi|_{H^t(\partial D \times S)}\bigr). \tag{13.35}$$
Here $\kappa$ is the same constant as in Lemma 13.6.

It follows from an application of Rellich's theorem (Theorem 8.12) with $M := \partial D \times S$ that the injection
$$H^{s-5/2+\kappa}(\partial D \times S) \longrightarrow H^t(\partial D \times S)$$
is compact. Hence, by applying Peetre's theorem (Theorem 5.67) with

Fig. 13.4 The mapping properties of the operator $\tilde T^* = (L\tilde P)^*$ and its closed realization:
$$H^{-s+5/2-\kappa}(\partial D \times S) \xleftarrow{\ \tilde T^*\ } H^{-s+5/2}(\partial D \times S), \qquad H^{-s+5/2-\kappa}(\partial D \times S) \longleftarrow D(\tilde T^*), \qquad C^\infty(\partial D \times S) \xleftarrow{\ \tilde T^* = (L\tilde P)^*\ } C^\infty(\partial D \times S).$$

$$X := H^{s-5/2+\kappa}(\partial D \times S), \quad Y := H^{s-5/2}(\partial D \times S), \quad Z := H^t(\partial D \times S), \quad T := \tilde T,$$
we obtain from estimate (13.35) that:

(1) $\dim N(\tilde T) < \infty$.
(2) The range $R(\tilde T)$ is closed in $H^{s-5/2}(\partial D \times S)$.

On the other hand, it follows from an application of the closed range theorem (Theorem 5.8) that
$$\operatorname{codim} R(\tilde T) = \dim N(\tilde T^*). \tag{13.36}$$
Here $\tilde T^*$ is the adjoint operator of $\tilde T$ (see Sect. 5.4.12). More precisely, we have the formula
$$\bigl(\tilde T \tilde\varphi,\, \tilde\psi\bigr) = \bigl(\tilde\varphi,\, \tilde T^* \tilde\psi\bigr) \quad \text{for all } \tilde\varphi \in D(\tilde T) \text{ and all } \tilde\psi \in D(\tilde T^*).$$
Here $(\cdot, \cdot)$ on the left-hand (resp. right-hand) side is the sesquilinear pairing of $H^{s-5/2}(\partial D \times S)$ and $H^{-s+5/2}(\partial D \times S)$ (resp. $H^{s-5/2+\kappa}(\partial D \times S)$ and $H^{-s+5/2-\kappa}(\partial D \times S)$). The situation can be visualized as in Fig. 13.4 above.

Furthermore, by applying Lemma 11.26 to the operator $\tilde T$ we find that
$$N(\tilde T^*) = \{\tilde\psi \in H^{-s+5/2}(\partial D \times S) : \tilde T^* \tilde\psi = 0\},$$
where $\tilde T^* = (L \tilde P)^*$, the adjoint of $\tilde T$, is a classical pseudo-differential operator of second order on $\partial D \times S$. However, by formula (13.33) it follows from an application of Theorem 9.16 that the symbol of $\tilde T^*$ is given by the formula
$$\Bigl[-\Bigl(\sum_{i,j=1}^{N-1} \alpha^{ij}(x')\,\xi_i \xi_j + \delta(x')\,\eta^2\Bigr) + \mu(x')\,\tilde p_1(x', \xi', y, \eta)\Bigr] + \sqrt{-1}\,\Bigl(-\mu(x')\,\tilde q_1(x', \xi', y, \eta) - \sum_{i=1}^{N-1} \beta^i(x')\,\xi_i + 2 \sum_{i,j=1}^{N-1} \frac{\partial \alpha^{ij}}{\partial x_j}(x')\,\xi_i\Bigr) + \text{terms of order} \le 0.$$
Here we remark that
$$\sum_{i,j=1}^{N-1} \alpha^{ij}(x')\,\xi_i \xi_j + \delta(x')\,\eta^2 \ge 0 \quad \text{on } T^*(\partial D \times S), \qquad \mu(x')\,\tilde p_1(x', \xi', y, \eta) \le 0 \quad \text{on } T^*(\partial D \times S).$$

Hence, we can obtain the following result, analogous to Lemma 13.11:

Lemma 13.12 Let $A$ and $L$ be as in Theorem 13.1, and assume that hypothesis (A.1) (and hence (Ã.1)) is satisfied. Then we have, for all $s \in \mathbb{R}$,
$$\tilde\psi \in \mathcal{D}'(\partial D \times S),\ \tilde T^* \tilde\psi \in H^{-s+5/2-\kappa}(\partial D \times S) \implies \tilde\psi \in H^{-s+5/2}(\partial D \times S).$$
Furthermore, for any $t < -s + 5/2$, there exists a constant $\tilde C^*_{s,t} > 0$ such that
$$|\tilde\psi|_{H^{-s+5/2}(\partial D \times S)} \le \tilde C^*_{s,t}\,\bigl(|\tilde T^* \tilde\psi|_{H^{-s+5/2-\kappa}(\partial D \times S)} + |\tilde\psi|_{H^t(\partial D \times S)}\bigr). \tag{13.37}$$

Therefore, by applying Peetre's theorem (Theorem 5.67) to the operator $\tilde T^*$, we obtain from estimate (13.37) that
$$\dim N(\tilde T^*) < \infty.$$
In view of formula (13.36), this proves that:

(3) $\operatorname{codim} R(\tilde T) < \infty$.

Summing up, we have proved that the closed operator $\tilde T$ is a Fredholm operator.

Step 5-4: It remains to show that:
$$\text{If hypothesis (A.1) is satisfied, then } \operatorname{codim} R(T(\alpha)) = 0 \text{ for all } \alpha > 0. \tag{13.38}$$
This assertion implies the existence result (13.29). By combining assertions (13.30) and (13.34), we find that if hypothesis (A.1) is satisfied, then we have the assertion
$$\operatorname{ind} T(\alpha) = \dim N(T(\alpha)) - \operatorname{codim} R(T(\alpha)) = 0 \quad \text{for all } \alpha \ge 0. \tag{13.39}$$


However, it follows from the uniqueness result (13.28) that
$$\dim N(T(\alpha)) = 0 \quad \text{for all } \alpha \ge 0.$$
In view of assertion (13.39), this proves the desired assertion (13.38). The proof of Theorem 13.5, and hence that of Theorem 13.1, is now complete. □

13.3 Proof of Theorem 13.3

The proof of Theorem 13.3 can be flowcharted as in Table 13.2 below. We verify conditions [I] and [III] of Corollary 12.82; then Theorem 13.3 follows from an application of the same corollary. In doing so, we make essential use of Theorem 9.58 due to Melin–Sjöstrand [127].

Step 1: First, we verify condition [III] (see assertion (13.44) below). To do this, it suffices to prove the following regularity result, analogous to the regularity result (13.8).

Theorem 13.13 (the regularity theorem) Let $A$ and $L$ be as in Theorem 13.3. Assume that hypothesis (A.2) is satisfied. Then we have, for all $s \ge 0$ and $t < s - 1$,
$$u \in H^t(D),\ (\alpha - A)\,u \in H^{s-2}(D),\ Lu \in H^{s-3/2}(\partial D) \implies u \in H^{s-1}(D). \tag{13.40}$$
Here $\alpha \ge 0$.

Proof As in the proof of Theorem 13.5, we are reduced to the study of the following pseudo-differential operator

Table 13.2 A flowchart for the proof of Theorem 13.3:
Hypothesis (A.2) → Lemma 13.14 (hypoellipticity of $T(\alpha)$) → Theorem 13.13 (condition [III]); Theorem 13.15 → Assertion (13.48) → Corollary 12.82 (conditions [I], [III]) → Theorem 13.3;
Theorem 12.81 (conditions [I], [II]) and Theorem 12.38 (Hille–Yosida) → Corollary 12.82 (conditions [I], [III]).


$$T(\alpha) : C^\infty(\partial D) \longrightarrow C^\infty(\partial D), \qquad \varphi \longmapsto L\,(P(\alpha)\varphi).$$
Since we have the formula
$$T(\alpha)\varphi = L\,(P(\alpha)\varphi) = \sum_{i=1}^{N-1} \beta^i(x')\,\frac{\partial \varphi}{\partial x_i} + \gamma(x')\,\varphi + \mu(x')\,\frac{\partial}{\partial n}\bigl(P(\alpha)\varphi\bigr) - \alpha\,\delta(x')\,\varphi,$$
it follows that the operator $T(\alpha)$ can be written in the form
$$T(\alpha)\varphi = \boldsymbol{\beta}(x')\cdot \varphi + \bigl(\gamma(x') - \alpha\,\delta(x')\bigr)\varphi + \mu(x')\,\Pi(\alpha)\varphi. \tag{13.41}$$
Hence, the symbol of $T(\alpha)$ is given by the formula
$$\mu(x')\,p_1(x', \xi') + \sqrt{-1}\,\Bigl(\mu(x')\,q_1(x', \xi') + \sum_{i=1}^{N-1} \beta^i(x')\,\xi_i\Bigr) + \text{terms of order} \le 0 \text{ depending on } \alpha. \tag{13.42}$$
However, we find from formulas (13.10a) and (13.10b) that:

(a) $p_1(x', \xi') < 0$ on $T^*(\partial D) \setminus \{0\}$.
(b) $q_1(x', \xi')$ is a polynomial of degree one in the variable $\xi'$.

Thus, by formula (13.42) it is easy to verify that hypothesis (A.2) implies hypotheses (B) and (C) of the Melin–Sjöstrand theorem (Theorem 9.58) for the operator $T(\alpha)$. Therefore, by applying Theorem 9.58 to the operator $T(\alpha)$ we can obtain the following result, analogous to Lemma 13.6:

Lemma 13.14 Let $A$ and $L$ be as in Theorem 13.3. If hypothesis (A.2) is satisfied, then we have, for all $s \in \mathbb{R}$,
$$\varphi \in \mathcal{D}'(\partial D),\ T(\alpha)\varphi \in H^{s-3/2}(\partial D) \implies \varphi \in H^{s-3/2}(\partial D). \tag{13.43}$$
Furthermore, for any $t < s - 3/2$, there exists a constant $C_{s,t} > 0$ such that
$$|\varphi|_{H^{s-3/2}(\partial D)} \le C_{s,t}\,\bigl(|T(\alpha)\varphi|_{H^{s-3/2}(\partial D)} + |\varphi|_{H^t(\partial D)}\bigr).$$
Thus, the operator $T(\alpha)$ is globally hypoelliptic, with loss of one derivative.

It follows from an application of Theorem 11.15 (with $m = 1$, $\sigma = s - 1$ and $\tau = s - 2$) that the regularity result (13.43) implies the regularity result (13.40). The proof of Theorem 13.13 is complete. □
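For orientation, the "loss of derivatives" terminology can be made explicit (our gloss, not a statement from the text): for an operator of order $m$, elliptic regularity would give a gain of $m$ derivatives, and the loss $\ell$ measures the shortfall.

```latex
% T is hypoelliptic with loss of \ell derivatives when
\[
  T\varphi \in H^{s}(\partial D) \;\Longrightarrow\; \varphi \in H^{s+m-\ell}(\partial D).
\]
% In (13.43) the operator T(alpha) has order m = 1 and there is no gain at all,
% so \ell = 1 (loss of one derivative); in (13.12) the order is m = 2 and the
% gain is \kappa, so \ell = 2 - \kappa, as stated after Lemma 13.6.
```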


In view of the Sobolev imbedding theorem (Theorem 8.17), it follows from Theorem 13.13 that:
$$\left\{\begin{array}{l} \text{For any } \alpha \ge 0, \text{ we have the regularity property} \\ u \in C(\overline D),\ (\alpha - A)\,u = 0,\ Lu \in C^\infty(\partial D) \implies u \in C^\infty(\overline D). \end{array}\right. \tag{13.44}$$
This result verifies the desired condition [III].

Step 2: Next we verify condition [I] (see assertions (13.47) and (13.48) below). We use the same notation as in Sect. 12.6. Recall that the Poisson operator $P(\alpha)$ is essentially the same as the harmonic operator $H_\alpha$ introduced by Lemma 12.62 (see Fig. 12.21), and further that the operator $T(\alpha) = L P(\alpha)$ is essentially the same as the closed operator $L H_\alpha$ introduced by Lemma 12.70, respectively. See Figs. 12.24 and 11.6 with $s := s + 1$, $m := 2$ and $\kappa := 0$.

We let
$$L_O u(x') = \sum_{i=1}^{N-1} \beta^i(x')\,\frac{\partial u}{\partial x_i}(x') + \gamma(x')\,u(x') + \mu(x')\,\frac{\partial u}{\partial n}(x'),$$
and consider the term $-\delta(x')\,Au(x')$ in $Lu(x')$ as a term of "perturbation" of $L_O u(x')$:
$$Lu(x') = L_O u(x') - \delta(x')\,Au(x').$$
Then, by formula (13.41) it is easy to see that the closed operator $L H_\alpha$ can be decomposed as follows:
$$L H_\alpha = L_O H_\alpha - \alpha\,\delta(x')\,I.$$

The next theorem, analogous to Theorem 13.5, is an essential step in the proof:

Theorem 13.15 Let $A$ and $L$ be as in Theorem 13.3. If hypothesis (A.2) is satisfied, then, for all $\alpha \ge 0$ and $\lambda > 0$, the boundary value problem
$$\begin{cases} (\alpha - A)\,u = f & \text{in } D, \\ (\lambda - L_O)\,u = \varphi & \text{on } \partial D \end{cases} \tag{13.45}$$
has a unique solution $u \in H^{s-1}(D)$ for any function $f \in H^{s-2}(D)$ and any function $\varphi \in H^{s-3/2}(\partial D)$ with $s \ge 2$. Furthermore, we have, for any $t < s - 1$,
$$u \in H^t(D),\ (\alpha - A)\,u \in H^{s-2}(D),\ (\lambda - L_O)\,u \in H^{s-3/2}(\partial D) \implies u \in H^{s-1}(D). \tag{13.46}$$

Granting Theorem 13.15 for the moment, we shall verify condition [I]. Theorem 13.15 tells us that the closed operator $L_O H_\alpha$ is the infinitesimal generator of some Feller semigroup on $\partial D$. Indeed, by virtue of the Sobolev imbedding theorem


(Theorem 8.17), Theorem 13.15 implies the following assertion:
$$\left\{\begin{array}{l} \text{For all } \alpha \ge 0 \text{ and } \lambda > 0, \text{ the boundary value problem (13.45)} \\ \text{has a unique solution } u \in C^\infty(\overline D) \text{ for any } \varphi \in C^\infty(\partial D). \end{array}\right. \tag{13.47}$$
Hence, by applying part (ii) of Theorem 12.74 to the boundary condition $L_O$ we find from assertion (13.47) that the closed operator $L_O H_\alpha$ is the infinitesimal generator of some Feller semigroup on $\partial D$.

Furthermore, it is clear that the operator $-\alpha\,\delta(x')\,I$ is a bounded linear operator on $C(\partial D)$ into itself, and satisfies condition ($\beta'$) (the positive maximum principle) of Theorem 12.53, since $\alpha \ge 0$ and $\delta(x') \ge 0$ on $\partial D$. Therefore, by applying Corollary 12.54 with
$$A := L_O H_\alpha, \qquad M := -\alpha\,\delta(x')\,I,$$
we obtain that the closed operator
$$L H_\alpha = L_O H_\alpha - \alpha\,\delta(x')\,I$$
is the infinitesimal generator of some Feller semigroup on $\partial D$. Hence, it follows from part (i) of Theorem 12.74 that:
$$\left\{\begin{array}{l} \text{For each } \lambda > 0, \text{ the boundary value problem } (*)_{\alpha,\lambda} \\ \text{has a solution } u \in C^{2+\theta}(\overline D) \text{ for all } \varphi \text{ in some dense subset} \\ \text{of the space } C(\partial D). \end{array}\right. \tag{13.48}$$
This result verifies the desired condition [I]. Theorem 13.3 is proved, apart from the proof of Theorem 13.15.
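The perturbation step used here can be recorded in one line (our gloss): the multiplication operator $-\alpha\,\delta(x')\,I$ satisfies the positive maximum principle simply because of the signs of $\alpha$, $\delta$ and $u$ at a positive maximum.

```latex
% If u \in C(\partial D) attains a positive maximum at x_0', then
\[
  \bigl(-\alpha\,\delta(x')\,I\bigr)u\,(x_0')
  = -\alpha\,\delta(x_0')\,u(x_0') \le 0
  \qquad (\alpha \ge 0,\ \delta \ge 0,\ u(x_0') > 0),
\]
% which is condition (beta') of Theorem 12.53; Corollary 12.54 then allows this
% bounded perturbation of the generator L_O H_alpha.
```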

13.3.1 Proof of Theorem 13.15

The proof of Theorem 13.15 is essentially the same as that of Theorem 13.5, except that we use Lemmas 13.16 and 13.18 below instead of Lemmas 13.6, 13.11 and 13.12. So we only give a sketch of the proof. We divide the proof into three steps.

Step (1): As in the proof of Theorem 13.13, we are reduced to the study of the first order pseudo-differential operator

13.3 Proof of Theorem 13.3

709

  T0 (α) := (L O − λ) P(α) = β(x  ) · u + γ(x  ) − λ + μ(x  )Π (α). By formula (13.42), it follows that the symbol of T0 (α) is given by the formula ( μ(x  ) p1 (x  , ξ  ) +



 −1 μ(x  )q1 (x  , ξ  ) +

N −1 

) β i (x  )ξi

i=1

+ terms of order ≤ 0 depending on α and λ. Hence, Lemma 13.14 remains valid for the operator T0 (α): Lemma 13.16 Let A and L be as in Theorem 13.3, and assume that hypothesis (A.2) is satisfied. Then we have, for all s ∈ R, ϕ ∈ D (∂ D), T0 (α)ϕ ∈ H s−3/2 (∂ D) =⇒ ϕ ∈ H s−3/2 (∂ D).

(13.49)

Furthermore, for any t < s − 3/2, there exists a constant C_{s,t} > 0 such that

|ϕ|_{H^{s−3/2}(∂D)} ≤ C_{s,t} ( |T₀(α)ϕ|_{H^{s−3/2}(∂D)} + |ϕ|_{H^t(∂D)} ).

Thus the operator T₀(α) is globally hypoelliptic, with loss of one derivative. The regularity result (13.46) for the boundary value problem (13.45) follows from the regularity result (13.49) for the operator T₀(α).

Step (2): Now we prove the following uniqueness result for the boundary value problem (13.45):

u ∈ H^{s−1}(D), (α − A)u = 0 in D, (λ − L_O)u = 0 on ∂D =⇒ u = 0 in D.    (13.50)

The regularity result (13.46) tells us that

u ∈ H^{s−1}(D), (α − A)u = 0 in D, (λ − L_O)u = 0 on ∂D =⇒ u ∈ C^∞(D).

Therefore, the uniqueness result (13.50) is an immediate consequence of the following maximum principle, analogous to Proposition 13.10:

Proposition 13.17 (the maximum principle) Let A and L be as in Theorem 13.3. Then we have, for all α ≥ 0 and λ > 0,

u ∈ C²(D), (α − A)u ≤ 0 in D, (λ − L_O)u ≤ 0 on ∂D =⇒ u ≤ 0 in D.

Proof We have only to consider the case when u is not a constant. The proof is by contradiction. Assume, to the contrary, that

max_D u > 0.

Then, by arguing as in the proof of Proposition 13.10, we find that there exists a point x₀′ of ∂D such that

u(x₀′) = max_D u > 0,
∂u/∂n (x₀′) < 0,
∂u/∂xᵢ (x₀′) = 0 for 1 ≤ i ≤ N − 1.

Thus we have the inequality

(L_O − λ)u(x₀′) = μ(x₀′) ∂u/∂n (x₀′) + γ(x₀′)u(x₀′) − λu(x₀′) ≤ −λu(x₀′) < 0,

since μ(x₀′) ≥ 0, γ(x₀′) ≤ 0 and the tangential terms βⁱ(x₀′) ∂u/∂xᵢ (x₀′) vanish. This contradicts the hypothesis that (λ − L_O)u ≤ 0 on ∂D. The proof of Proposition 13.17 is complete. □



Step (3): Finally, we prove the following existence result:

For any α ≥ 0 and λ > 0, the boundary value problem (13.45) has a solution u ∈ H^{s−1}(D) for all f ∈ H^{s−2}(D) and ϕ ∈ H^{s−3/2}(∂D) with s ≥ 2.    (13.51)

In order to prove assertion (13.51), we make use of Theorem 11.19, just as in the proof of Theorem 13.5. Namely, instead of the boundary value problem (13.45), we consider the following boundary value problem on the product domain D × S:

( A + ∂²/∂y² ) ũ = f̃ in D × S,
(L_O − λ) ũ = ϕ̃ on ∂D × S.    (∗̃)_{0,λ}

The study of the boundary value problem (∗̃)_{0,λ} is reduced to that of the first order pseudo-differential operator

T̃_{0,λ} := (L_O − λ) P̃ = Σ_{i=1}^{N−1} βⁱ(x′) ∂/∂xᵢ + (γ(x′) − λ) + μ(x′) Π̃.

By formulas (13.32a) and (13.32b), it follows that the complete symbol of T̃_{0,λ} is given by the formula

μ(x′) p̃₁(x′, ξ′, y, η) + √−1 ( μ(x′) q̃₁(x′, ξ′, y, η) + Σ_{i=1}^{N−1} βⁱ(x′) ξᵢ ) + terms of order ≤ 0 depending on λ,

where


(ã) p̃₁(x′, ξ′, y, η) < 0 on T*(∂D × S) \ {0}.
(b̃) q̃₁(x′, ξ′, y, η) = q₁(x′, ξ′) is a polynomial of degree one in the variable ξ′.

Therefore, by applying the Melin–Sjöstrand theorem (Theorem 9.58) to the operators T̃_{0,λ} and T̃*_{0,λ}, we can obtain the following result, analogous to Lemma 13.16:

Lemma 13.18 Let A and L be as in Theorem 13.3. If hypothesis (A.2) is satisfied, then we have the following two assertions:

(i) For each s ∈ R, we have the regularity property

ϕ̃ ∈ D′(∂D × S), T̃_{0,λ}ϕ̃ ∈ H^{s−3/2}(∂D × S) =⇒ ϕ̃ ∈ H^{s−3/2}(∂D × S).

Furthermore, for any t < s − 3/2, there exists a constant C̃_{s,t} > 0 such that

|ϕ̃|_{H^{s−3/2}(∂D×S)} ≤ C̃_{s,t} ( |T̃_{0,λ}ϕ̃|_{H^{s−3/2}(∂D×S)} + |ϕ̃|_{H^t(∂D×S)} ).

(ii) For all s ∈ R, we have the regularity property

ψ̃ ∈ D′(∂D × S), T̃*_{0,λ}ψ̃ ∈ H^{−s+3/2}(∂D × S) =⇒ ψ̃ ∈ H^{−s+3/2}(∂D × S).

Furthermore, for any t < −s + 3/2, there exists a constant C*_{s,t} > 0 such that

|ψ̃|_{H^{−s+3/2}(∂D×S)} ≤ C*_{s,t} ( |T̃*_{0,λ}ψ̃|_{H^{−s+3/2}(∂D×S)} + |ψ̃|_{H^t(∂D×S)} ).

By virtue of Lemma 13.18, the proof of Theorem 13.15 goes through just as in the proof of Theorem 13.5. The proof of Theorem 13.15, and hence that of Theorem 13.3, is now complete. □

Remark 13.19 The boundary value problem (13.45) is the oblique derivative problem. For detailed studies of this problem, the reader might be referred to Egorov–Kondrat'ev [49], Melin–Sjöstrand [127], Paneyakh [140], [141], Maugeri–Palagachev–Softova [123] and also Taira [183] and [184].

13.4 The Degenerate Diffusion Operator Case

In this section we consider the degenerate diffusion operator case. More precisely, let D be a bounded domain in Euclidean space R^N with C^∞ boundary ∂D, and let A be a second order degenerate elliptic differential operator with real coefficients such that

Au = Σ_{i,j=1}^N a^{ij}(x) ∂²u/∂xᵢ∂xⱼ + Σ_{i=1}^N bⁱ(x) ∂u/∂xᵢ + c(x)u,    (13.52)


where:

(1) a^{ij} ∈ C^∞(R^N), a^{ij}(x) = a^{ji}(x) for all x ∈ R^N and all 1 ≤ i, j ≤ N, and the a^{ij} satisfy the degenerate elliptic condition

Σ_{i,j=1}^N a^{ij}(x) ξᵢ ξⱼ ≥ 0 for all x ∈ R^N and ξ ∈ R^N.    (13.53)

(2) bⁱ ∈ C^∞(R^N) for 1 ≤ i ≤ N.
(3) c ∈ C^∞(R^N) and c(x) ≤ 0 on D.

Following Fichera [61], we let

b(x′) := Σ_{i=1}^N ( bⁱ(x′) − Σ_{j=1}^N ∂a^{ij}/∂xⱼ (x′) ) nᵢ for x′ ∈ ∂D,    (13.54)

where n = (n₁, ..., n_N) is the unit inward normal to ∂D at x′ (see Fig. 12.19). It is easy to see that the function b(x′) is invariantly defined (see the proof of [206, Lemma 3.2]). The function b(x′), defined by formula (13.54), is called the Fichera function for the differential operator A of the form (13.52) with the degenerate elliptic condition (13.53). We divide the boundary ∂D into the following four disjoint subsets (cf. Fichera [61], Oleĭnik–Radkevič [138], Stroock–Varadhan [178]):

Σ₃ = { x′ ∈ ∂D : Σ_{i,j=1}^N a^{ij}(x′) nᵢ nⱼ > 0 },    (13.55a)
Σ₂ = { x′ ∈ ∂D : Σ_{i,j=1}^N a^{ij}(x′) nᵢ nⱼ = 0, b(x′) < 0 },    (13.55b)
Σ₁ = { x′ ∈ ∂D : Σ_{i,j=1}^N a^{ij}(x′) nᵢ nⱼ = 0, b(x′) > 0 },    (13.55c)
Σ₀ = { x′ ∈ ∂D : Σ_{i,j=1}^N a^{ij}(x′) nᵢ nⱼ = 0, b(x′) = 0 }.    (13.55d)

It is easy to see that the sets Σ₀, Σ₁, Σ₂ and Σ₃ are all invariantly defined (see the proof of [206, Lemma 3.1]). The fundamental hypothesis for the degenerate elliptic differential operator A of the form (13.52) is the following (see Fig. 13.5 below):

(H) Each Σᵢ (i = 0, 1, 2, 3) consists of a finite number of connected hypersurfaces.

It is easy to see that the sets 0 , 1 , 2 and 3 are all invariantly defined (see the proof of [206, Lemma 3.1]). The fundamental hypothesis for the degenerate elliptic differential operator A of the form (13.52) is the following (see Fig. 13.5 below): (H) Each i (i = 0, 1, 2, 3) consists of a finite number of connected hypersurfaces.

13.4 The Degenerate Diffusion Operator Case Fig. 13.5 The fundamental hypothesis (H) for the diffusion operator A

713

D

Σ3 Σ2 Σ0

Σ1

∂ D = 0 ∪ 1 ∪ 2 ∪ 3 . It should be emphasized that a Markovian particle moves continuously in the interior D, and approaches the boundary portion 2 ∪ 3 in finite time with positive probability ([177, 206]). Hence, we may impose a boundary condition only on the boundary portion 2 ∪ 3 (see [138, 178, 193]).
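The decomposition (13.55) is directly computable from the coefficients of A. The following Python sketch is a hypothetical toy example, not taken from the text (the function names, the operator a^{ij}(x) = (1 − |x|²)δ_ij with bⁱ ≡ 0, and the tolerances are all illustrative assumptions); it evaluates the Fichera function (13.54) by finite differences and classifies boundary points of the unit disk:

```python
import math

# Toy degenerate operator on the unit disk D (illustrative assumption):
#   a^{ij}(x) = (1 - |x|^2) δ_ij,  b^i(x) = 0.
# The Fichera function (13.54) is b(x') = Σ_i (b^i - Σ_j ∂a^{ij}/∂x_j) n_i,
# where n is the unit INWARD normal; on the unit circle, n = -x'.

def a(i, j, x):
    return (1.0 - x[0] ** 2 - x[1] ** 2) * (1.0 if i == j else 0.0)

def d_a(i, j, k, x, h=1e-6):
    # central finite difference for ∂a^{ij}/∂x_k
    xp = list(x); xm = list(x)
    xp[k] += h; xm[k] -= h
    return (a(i, j, xp) - a(i, j, xm)) / (2.0 * h)

def classify(p, tol=1e-9):
    """Assign a boundary point p (|p| = 1) to one of the strata (13.55a)-(13.55d)."""
    n = (-p[0], -p[1])  # unit inward normal
    quad = sum(a(i, j, p) * n[i] * n[j] for i in range(2) for j in range(2))
    fichera = sum((0.0 - sum(d_a(i, j, j, p) for j in range(2))) * n[i]
                  for i in range(2))
    if quad > tol:
        return "Sigma_3"
    if fichera < -tol:
        return "Sigma_2"
    if fichera > tol:
        return "Sigma_1"
    return "Sigma_0"

for k in range(4):
    t = math.pi * k / 2.0
    print(classify((math.cos(t), math.sin(t))))  # "Sigma_2" at every boundary point
```

Here Σ a^{ij}nᵢnⱼ vanishes on |x| = 1 while the Fichera function equals −2 < 0, so ∂D = Σ₂ for this toy operator and a boundary condition may be imposed on all of ∂D.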

13.4.1 The Regular Boundary Case

First, we consider the regular boundary case. Intuitively, the regularity of the boundary means that a Markovian particle approaches some boundary portion in finite time with positive probability, and also enters the interior from some boundary portion (see hypotheses (A.1′) and (A.1″) below). Now let L be a Ventcel' boundary condition such that

Lu = Σ_{i,j=1}^{N−1} α^{ij}(x′) ∂²u/∂xᵢ∂xⱼ + Σ_{i=1}^{N−1} βⁱ(x′) ∂u/∂xᵢ + γ(x′)u + μ(x′) ∂u/∂n − δ(x′)(Au),    (13.56)

where:

(1) The α^{ij} are the components of a C^∞ symmetric contravariant tensor of type (2, 0) on Σ₂ ∪ Σ₃ and

Σ_{i,j=1}^{N−1} α^{ij}(x′) ηᵢ ηⱼ ≥ 0 for all (x′, η) ∈ T*_{x′}(Σ₂ ∪ Σ₃),    (13.57)

where T*_{x′}(Σ₂ ∪ Σ₃) is the cotangent space of Σ₂ ∪ Σ₃ at x′.
(2) βⁱ ∈ C^∞(Σ₂ ∪ Σ₃) for 1 ≤ i ≤ N − 1.
(3) γ ∈ C^∞(Σ₂ ∪ Σ₃) and γ(x′) ≤ 0 on Σ₂ ∪ Σ₃.


(4) μ ∈ C^∞(Σ₂ ∪ Σ₃) and μ(x′) ≥ 0 on Σ₂ ∪ Σ₃.
(5) δ ∈ C^∞(Σ₂ ∪ Σ₃) and δ(x′) ≥ 0 on Σ₂ ∪ Σ₃.

In order to state a hypothesis for the boundary condition L of the form (13.56) with condition (13.57), we let

L₀ := Σ_{i,j=1}^{N−1} α^{ij}(x′) ∂²/∂xᵢ∂xⱼ,

and

B_{L₀}(x′, ρ) = the set of all points y ∈ Σ₃ which can be joined to x′ ∈ Σ₃ by a Lipschitz path v : [0, ρ] → Σ₃ for which the tangent vector v̇(t) of Σ₃ at v(t) is subunit for the operator L₀ for almost every t.

The hypothesis for the boundary condition L on Σ₃ is the following:

(A.1′) The operator A of the form (13.52) is elliptic near Σ₃ and there exist constants 0 ≤ ε₁ < 1 and C₁ > 0 such that we have, for all sufficiently small ρ > 0,

B_E(x′, ρ) ⊂ B_{L₀}(x′, C₁ρ^{ε₁}) on the set M₃ = { x′ ∈ Σ₃ : μ(x′) = 0 }.

Hypothesis (A.1′) has an intuitive meaning similar to hypothesis (A.1) in Sect. 13.1 (see Remark 13.2). In order to state a hypothesis for the boundary condition L on Σ₂, we write the operator A of the form (13.52) in a neighborhood of Σ₂ as follows:

A = A₂ ∂²/∂n² + A₁ ∂/∂n + A₀,    (13.58)

where A_j (j = 0, 1, 2) is a differential operator of order 2 − j acting along the surfaces parallel to Σ₂. We remark that:

(a) A₂ = 0 on Σ₂.
(b) The restriction A₀|_{Σ₂} to Σ₂ of A₀ is a second order differential operator with non-positive principal symbol.

Since μ(x′) ≥ 0 and b(x′) < 0 on Σ₂, we can define a "non-Euclidean" ball B_{L₀ − (μ/b)(A₀|_{Σ₂})}(x′, ρ) in the same way as B_{L₀}(x′, ρ), replacing Σ₃ and L₀ by Σ₂ and L₀ − (μ/b)(A₀|_{Σ₂}), respectively. The hypothesis for L on Σ₂ is the following:


(A.1″) Assume that A is of the form (13.58) near the hypersurface Σ₂. Then there exist constants 0 ≤ ε₂ < 1 and C₂ > 0 such that we have, for all sufficiently small ρ > 0,

B_E(x′, ρ) ⊂ B_{L₀ − (μ/b)(A₀|_{Σ₂})}(x′, C₂ρ^{ε₂}) on the set Σ₂.

The intuitive meaning of hypothesis (A.1″) is that a Markovian particle with generator L₀ − (μ/b)(A₀|_{Σ₂}) diffuses everywhere in the hypersurface Σ₂ in finite time. Now we can state a generalization of Theorem 13.1:

Theorem 13.20 (the existence theorem) Let the degenerate elliptic differential operator A of the form (13.52) satisfy hypothesis (H), and let the boundary condition L of the form (13.56) satisfy the transversality condition on the set Σ₂ ∪ Σ₃:

μ(x′) + δ(x′) > 0 on Σ₂ ∪ Σ₃.    (13.59)

Assume that hypotheses (A.1′) and (A.1″) are satisfied. Then there exists a Feller semigroup {T_t}_{t≥0} on D whose infinitesimal generator coincides with the minimal closed extension in C(D) of the restriction of A to the space

{ u ∈ C^∞(D) : Lu = 0 on Σ₂ ∪ Σ₃ }.

The proof of this theorem is based on the maximum principles discussed in Chap. 10 and the work of Oleĭnik–Radkevič [138] on the Dirichlet problem for degenerate elliptic differential operators of second order (cf. Notes at the end of Chap. 11).

13.4.2 The Totally Characteristic Case

The very recent paper [214] is devoted to the functional analytic approach to the problem of construction of strong Markov processes without boundary conditions in the totally characteristic case, that is, the case Σ₃ = ∅. More precisely, we can construct a Feller semigroup corresponding to such a diffusion phenomenon that a Markovian particle moves continuously in the interior of the state space without reaching the boundary. The paper [214] is inspired by the work of Altomare et al. [7], [8] and [10] (see Remark 13.23), and it is a continuation of the previous works [188, 189, 193, 196, 197] and Taira–Favini–Romanelli [215] and [216]. Our fundamental hypothesis for the operator A is stated as follows (see Fig. 13.6 below):

(G) The boundary ∂D consists of a finite number of connected hypersurfaces contained in the sets Σ₀ and Σ₁; that is, ∂D = Σ₀ ∪ Σ₁.

Fig. 13.6 The fundamental hypothesis (G) for the diffusion operator A: ∂D = Σ₀ ∪ Σ₁.

This hypothesis makes it possible to develop the basic machinery of Oleĭnik and Radkevič [138] with a minimum of bother, and the principal ideas can be presented more concretely and explicitly. It should be emphasized that, under hypothesis (G), we cannot impose any boundary condition on the boundary ∂D, since Σ₂ = Σ₃ = ∅. We give a simple example of hypothesis (G) in the unit disk in the plane R²:

Example 13.21 Let D = {(x₁, x₂) ∈ R² : x₁² + x₂² < 1} be the unit disk with the boundary ∂D = {(x₁, x₂) ∈ R² : x₁² + x₂² = 1}. For a local coordinate system x₁ = r cos θ and x₂ = r sin θ with 0 ≤ θ ≤ 2π near the boundary ∂D = {r = 1}, we assume that the differential operator A is written in the form

A = ϕ(r) ( ∂²/∂r² + (1/r) ∂/∂r + (1/r²) ∂²/∂θ² ) − ∂/∂r = ϕ(r)Δ − ∂/∂r,

where ϕ(r) is a smooth function defined by the formula

ϕ(r) = exp( −1/(1 − r²) ) for r < 1,   ϕ(r) = 0 for r ≥ 1.

Then it is easy to see that Σ₃ = ∅ and that b = 1 on ∂D = {r = 1}. This proves that ∂D = Σ₁.

The next theorem asserts that there exists a Feller semigroup on D corresponding to such a diffusion phenomenon that a Markovian particle moves continuously in the interior D of the state space without reaching the boundary ∂D (see [214, Theorem 1.2]):

Theorem 13.22 (the existence theorem) Assume that hypothesis (G) is satisfied. We define a linear operator A from the space C(D) into itself as follows:

(1) The domain D(A) of A is the space D(A) = C²(D).


(2) Au = Au for every u ∈ D(A).

Then the operator A is closable in the space C(D), and its minimal closed extension Ā is the infinitesimal generator of some Feller semigroup {T_t}_{t≥0} on D.

Remark 13.23 Some remarks are in order:

1° Altomare et al. [7] through [10] consider a convex compact domain K with not necessarily smooth boundary ∂K and a second order differential operator V which degenerates on a subset of the boundary ∂K containing the extreme points of K. They prove that the closure Ā of the operator V generates a Feller semigroup {T_t}_{t≥0}, and further that the Feller semigroup {T_t}_{t≥0} can be approximated by iterates of modified Bernstein–Schnabl operators ([9]). It should be emphasized that Theorem 13.22 coincides with [7, Theorem 4.1], [8, Theorem 4.3] and [10, Theorem 3.1] with K := D if the boundary ∂K is smooth, as in Example 13.21.

2° Theorem 13.22 is proved by Bony–Courrège–Priouret [22] in the strictly elliptic case (see [22, Théorème XVI]) and then by Cancelier [31, Théorème 7.2] in the non-characteristic case ∂D = Σ₃ (see also Derridj [42, Chapitre 3]).

By a version of the Hille–Yosida theorem in semigroup theory, the proof of Theorem 13.22 is reduced to the study of the homogeneous Dirichlet problem in the theory of partial differential equations. However, if hypothesis (G) is satisfied, the proof of Theorem 13.22 is reduced to the study of the equation

(A − λ)u = f in D    (13.60)

without any boundary condition. In this way, an essential step in the proof is the following existence and uniqueness theorem for the equation (13.60) in the framework of Hölder spaces (see [214, Theorem 1.4]):

Theorem 13.24 Assume that hypothesis (G) is satisfied. For each integer m ≥ 2, there exists a constant λ_{m+1} > 0 such that if λ ≥ λ_{m+1}, the equation

(A − λ)u = f in D    (∗)_λ

has a unique solution u in the Hölder space C^{m+θ}(D) for any function f ∈ C^{m+θ}(D) with 0 < θ < 1. Furthermore, the solution u satisfies the inequality

‖u‖_{C^{m+θ}(D)} ≤ C_{m+θ}(λ) ‖f‖_{C^{m+θ}(D)},    (13.61)

where C_{m+θ}(λ) > 0 is a constant independent of f.

Rephrased, the inequality (13.61) of Theorem 13.24 asserts that if hypothesis (G) is satisfied, then the equation (∗)_λ has a unique solution u with loss of two derivatives compared with the elliptic case.
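Returning to Example 13.21, the claim that b ≡ 1 on ∂D (so that ∂D = Σ₁) can be checked numerically. The sketch below is not taken from the text: it rewrites A = ϕ(r)Δ − ∂/∂r in Cartesian coordinates, where a^{ij}(x) = ϕ(|x|)δ_ij and bⁱ(x) = −xᵢ/|x|, and evaluates the Fichera function (13.54) at points r ↑ 1 as a proxy for the boundary value; the helper names are hypothetical.

```python
import math

def phi(r):
    # The degeneracy function of Example 13.21: φ(r) = exp(-1/(1 - r^2)) for r < 1.
    return math.exp(-1.0 / (1.0 - r * r)) if r < 1.0 else 0.0

def fichera_at(r, h=1e-6):
    """Fichera function of A = φ(r)Δ - ∂/∂r, evaluated at the point (r, 0).

    In Cartesian coordinates a^{ij}(x) = φ(|x|) δ_ij and b^i(x) = -x_i/|x|;
    at the boundary point x' = (1, 0), the unit inward normal is n = (-1, 0)."""
    da = (phi(r + h) - phi(r - h)) / (2.0 * h)  # Σ_j ∂a^{1j}/∂x_j at (r, 0)
    b1 = -1.0                                   # drift coefficient b^1 at (r, 0)
    n1 = -1.0                                   # inward normal component
    return (b1 - da) * n1

for r in [0.9, 0.99, 0.999]:
    print(r, fichera_at(r))
# fichera_at(r) → 1 as r ↑ 1: φ and all its derivatives vanish at r = 1, so only
# the drift survives and b = (-1)·(-1) = 1 > 0 on ∂D, i.e. ∂D = Σ_1 as claimed.
```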


13.5 Notes and Comments

Sections 13.1, 13.2 and 13.3: Theorems 13.1 and 13.3 are adapted from Taira [185, 188, 189]. These results are a generalization of Théorème XIX of Bony–Courrège–Priouret [22], where Ventcel' boundary conditions are assumed to be elliptic. We confined ourselves to the case when the differential operator A is strictly elliptic on D. The reason is that when A is not elliptic on D we do not know whether the operator T(α) = LP(α), which plays a fundamental role in the proof, is a pseudo-differential operator or not. It is thus an open problem to extend Theorems 13.1 and 13.3 to the degenerate elliptic case. We remark that Taira [194, 198] has some results along these lines.

Section 13.4: This section is adapted from [206, 214] in such a way as to make it accessible to mathematicians with an interest in probability, functional analysis and partial differential equations.

Chapter 14

Concluding Remarks

This book is devoted to a concise and accessible exposition of the functional analytic approach to the problem of construction of strong Markov processes with Ventcel' boundary conditions in probability theory. Our approach here is distinguished by the extensive use of the ideas and techniques characteristic of the recent developments in the theory of pseudo-differential operators, which may be considered as a modern version of the classical potential theory. For detailed studies of diffusion processes using stochastic differential equations, the reader might be referred to Lévy [115], Itô–McKean, Jr. [95], Ikeda–Watanabe [92], Sato [155], Stroock–Varadhan [179] and Revuz–Yor [151]. Some important remarks are in order:

(I) Very recently, we have been able to solve several long-standing open problems in the spectral analysis of elliptic boundary value problems, such as the subelliptic oblique derivative problem [204, 205] and the hypoelliptic Robin problem [207]. In the proofs we made essential use of the Boutet de Monvel calculus [25], which is one of the most influential ideas in the modern history of analysis. Indeed, we carefully re-worked the classical functional analytic methods for Markov processes (due to Sato–Ueno [156] and Bony–Courrège–Priouret [22]) from the viewpoint of the Boutet de Monvel calculus, which will provide a powerful method for future research in semigroups, boundary value problems and Markov processes. For detailed studies of the Boutet de Monvel calculus, the reader might be referred to Grubb [77], Grubb–Hörmander [78], Rempel–Schulze [150], Schrohe [162] and also [202].

(II) In [200, 201, 208, 213], we prove existence theorems for Feller semigroups with Dirichlet boundary condition, oblique derivative boundary condition and first order Ventcel' boundary condition for second order uniformly elliptic differential operators with discontinuous coefficients [97, 154].
Our approach there is distinguished by the extensive use of the ideas and techniques characteristic of the recent developments in the Calderón–Zygmund theory of singular integral operators with non-smooth kernels [36, 37, 41, 122, 123]. It should be emphasized that the Calderón–Zygmund theory of singular integral operators with non-smooth kernels provides a powerful tool to deal with the smoothness of solutions of elliptic boundary value problems, with minimal assumptions of regularity on the coefficients. The theory of singular integrals continues to be one of the most influential works in the modern history of analysis (see Stein [175], Stein–Shakarchi [176]).

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. K. Taira, Functional Analytic Techniques for Diffusion Processes, Springer Monographs in Mathematics, https://doi.org/10.1007/978-981-19-1099-9_14

(III) The main results of [199, 211] are devoted to logistic Dirichlet problems and logistic Neumann problems with discontinuous coefficients, respectively.

(IV) We state a brief history of the stochastic analysis methods for Ventcel' boundary value problems. More precisely, we remark that Ventcel' boundary value problems are studied by Anderson [14, 15], Cattiaux [33] and Takanobu–Watanabe [217] from the viewpoint of stochastic analysis (see also Ikeda–Watanabe [92, Chap. IV, Sect. 7]).

(1) Anderson [14, 15] studies the non-degenerate case under low regularity in the framework of the submartingale problem and shows the existence and uniqueness of solutions to the considered submartingale problem.

(2) Takanobu–Watanabe [217] study certain cases of both degenerate interior and boundary operators under minimal assumptions of regularity, based on the theory of stochastic differential equations, and they show the existence and uniqueness of solutions. Such existence and uniqueness results on the diffusion processes corresponding to the boundary value problems imply the existence and uniqueness of the associated Feller semigroups on the space of continuous functions.

(3) Cattiaux [33] studies the hypoellipticity for diffusions with Ventcel' boundary conditions.
By making use of a variant of the Malliavin calculus under Hörmander-type conditions [86], he proves that some laws and conditional laws of such diffusions have a smooth density with respect to the Lebesgue measure.

(V) In the paper [212] we consider an elliptic Waldenfels operator W of the form (cf. formula (12.64)):

W u(x) = Au(x) + Su(x)
  := ( Σ_{i,j=1}^n a^{ij}(x) ∂²u/∂xᵢ∂xⱼ (x) + Σ_{i=1}^n bⁱ(x) ∂u/∂xᵢ (x) + c(x)u(x) )
   + ∫_{R^n \ {0}} ( u(x + z) − u(x) − Σ_{k=1}^n z_k ∂u/∂x_k (x) ) K(x, z) μ(dz) in D,

and a first order Ventcel' boundary operator L of the form (cf. formula (12.80) with α^{ij} ≡ βⁱ ≡ δ ≡ 0):

Lu(x′) = Λu(x′) + γ₀(Tu)(x′)
  := μ(x′) ∂u/∂n (x′) + γ(x′)u(x′)
   + ∫_{R^n \ {0}} ( u(x′ + z) − u(x′) ) J(x′, z) ν(dz) on ∂D.

We remark that the integral term γ₀(Tu) is supposed to correspond to an inward jump phenomenon from the boundary (see Fig. 12.18 in Sect. 12.5).

(1) By using real analysis techniques such as Strichartz norms and the complex interpolation method, we prove existence and uniqueness theorems in the framework of Sobolev and Besov spaces of L^p type [212, Theorems 1.1 and 1.2], which extend Bony–Courrège–Priouret [22, Théorème XVII] to the hypoelliptic Robin case:

μ(x′) − γ(x′) > 0 on ∂D.    (14.1)

Our proof is based on various maximum principles for second order elliptic Waldenfels operators with discontinuous coefficients in the framework of L^p Sobolev spaces.

(2) For a probabilistic approach, the reader might be referred to Komatsu [105, Theorems 1 and 2] and Ishikawa [93, Theorems 1 and 2] on the study of the martingale problem for generators of stable processes.

(VI) In Table 14.1 below, we give an overview of general results on generation theorems for Feller semigroups proved by the author using the L^p theory of pseudo-differential operators [39, 85, 108, 167, 170, 220] and the Calderón–Zygmund theory of singular integral operators [29, 30, 173], under the condition (14.1) and the transversality condition

μ(x′) + δ(x′) > 0 on ∂D.    (1.32)

Here recall (formulas (1.30) and (1.34) and also Fig. 1.13 in Sect. 1.3) that

Lu = ( Σ_{i,j=1}^{N−1} α^{ij}(x′) ∂²u/∂xᵢ∂xⱼ + Σ_{i=1}^{N−1} βⁱ(x′) ∂u/∂xᵢ + γ(x′)u ) + μ(x′) ∂u/∂n − δ(x′)Au
   := Qu + μ(x′) ∂u/∂n − δ(x′)Au


Table 14.1 An overview of generation theorems for Feller semigroups under the conditions (14.1) and (1.32)

Diffusion operator A | Ventcel' condition L | Using the theory of | Proved by
Elliptic smooth case | Lu = Qu + μ(x′)∂u/∂n − δ(x′)Au; degenerate elliptic condition (1.31) on Q; hypotheses (A.1), (A.2); condition (1.32) | Pseudo-differential operators | [191], the present book
Elliptic smooth case | L_S u = μ(x′)∂u/∂n + γ(x′)u; condition (14.1) (hypoelliptic Robin case) | Pseudo-differential operators | [195, 202]
Elliptic smooth case | L_D u = −u; γ(x′) ≡ −1 on ∂D (Dirichlet case) | Pseudo-differential operators | [192, 202]
Elliptic discontinuous case | L_D u = −u (Dirichlet case) | Singular integral operators | [200, 208]
Elliptic discontinuous case | L_O u = ∂u/∂n + β(x′)·∇u + γ(x′)u; μ(x′) ≡ 1 on ∂D (oblique derivative case) | Singular integral operators | [201, 213]

and

L_O u = Σ_{i=1}^{N−1} βⁱ(x′) ∂u/∂xᵢ + γ(x′)u + μ(x′) ∂u/∂n
   := β(x′)·∇u + γ(x′)u + μ(x′) ∂u/∂n.

(VII) As an application of a degenerate Robin problem under the condition (14.1), Krietenstein and Schrohe prove the short-time existence of solutions to the porous medium equation with strictly positive initial value (see [107, Theorem 1.9]).

Finally, stochastic methods can be applied to a variety of fields, including finance, biology and medicine. For instance, the stability of stochastic semigroups is used in models of population dynamics and epidemic systems, as in Capasso–Bakstein [32, Part II, Chap. 7].

Appendix

A Brief Introduction to the Potential Theoretic Approach

In this appendix, following faithfully Gilbarg–Trudinger [74], we present a brief introduction to the potential theoretic approach to the Dirichlet problem for the usual Laplacian

Δ = Σ_{j=1}^n ∂²/∂x_j² = ∂²/∂x₁² + ··· + ∂²/∂xₙ².

The approach here can be traced back to the pioneering work of Schauder [158, 159] on the Dirichlet problem for second order elliptic partial differential operators. This appendix is included for the sake of completeness; most of the material will be quite familiar to the reader and may be omitted. Ladyzhenskaya–Ural'tseva [109] and Miranda [128] are the classics for the theory of elliptic partial differential equations.

A.1 Hölder Continuity and Hölder Spaces

Let Ω be an open set in Euclidean space R^n. First, we let

C(Ω) = the space of continuous functions on Ω.

If k is a positive integer, we let

C^k(Ω) = the space of functions of class C^k on Ω.

Furthermore, we let

C(Ω̄) = the space of functions in C(Ω) having continuous extensions to the closure Ω̄ of Ω.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. K. Taira, Functional Analytic Techniques for Diffusion Processes, Springer Monographs in Mathematics, https://doi.org/10.1007/978-981-19-1099-9


If k is a positive integer, we let

C^k(Ω̄) = the space of functions in C^k(Ω) all of whose derivatives of order ≤ k have continuous extensions to Ω̄.

Let 0 < θ < 1. A function u defined on Ω is said to be uniformly Hölder continuous with exponent θ in Ω if the quantity

[u]_{θ;Ω} = sup_{x,y∈Ω, x≠y} |u(x) − u(y)| / |x − y|^θ    (A.1)

is finite. We say that u is locally Hölder continuous with exponent θ in Ω if it is uniformly Hölder continuous with exponent θ on compact subsets of Ω. If 0 < θ < 1, we define the Hölder space C^θ(Ω) as follows:

C^θ(Ω) = the space of functions in C(Ω) which are locally Hölder continuous with exponent θ on Ω.

If k is a positive integer and 0 < θ < 1, we define the Hölder space C^{k+θ}(Ω) as follows:

C^{k+θ}(Ω) = the space of functions in C^k(Ω) all of whose k-th order derivatives are locally Hölder continuous with exponent θ on Ω.

Furthermore, we let

C^θ(Ω̄) = the space of functions in C(Ω̄) which are Hölder continuous with exponent θ on Ω,

and

C^{k+θ}(Ω̄) = the space of functions in C^k(Ω̄) all of whose k-th order derivatives are Hölder continuous with exponent θ on Ω.

Let k be a non-negative integer and 0 < θ < 1. We introduce various seminorms on the spaces C^k(Ω) and C^{k+θ}(Ω) as follows:

[u]_{k,0;Ω} = |D^k u|_{0;Ω} = sup_{|α|=k} sup_{x∈Ω} |D^α u(x)|,    (A.2a)
[u]_{k,θ;Ω} = [D^k u]_{θ;Ω} = sup_{|α|=k} [D^α u]_{θ;Ω}.    (A.2b)


We can define the associated norms on the spaces C^k(Ω̄) and C^{k+θ}(Ω̄) as follows:

‖u‖_{C^k(Ω̄)} = |u|_{k;Ω} = Σ_{j=0}^k |D^j u|_{0;Ω},    (A.3a)
‖u‖_{C^{k+θ}(Ω̄)} = |u|_{k,θ;Ω} = |u|_{k;Ω} + [D^k u]_{θ;Ω}.    (A.3b)

Moreover, if Ω is a bounded domain in R^n with the diameter

d := diam Ω = sup_{x,y∈Ω} |x − y|,

then we can introduce two non-dimensional norms ‖u‖′_{C^k(Ω̄)}, ‖u‖′_{C^{k+θ}(Ω̄)}, equivalent respectively to the norms ‖u‖_{C^k(Ω̄)}, ‖u‖_{C^{k+θ}(Ω̄)}, as follows:

‖u‖′_{C^k(Ω̄)} = Σ_{j=0}^k d^j |D^j u|_{0;Ω},    (A.4a)
‖u‖′_{C^{k+θ}(Ω̄)} = Σ_{j=0}^k d^j |D^j u|_{0;Ω} + d^{k+θ} [D^k u]_{θ;Ω}.    (A.4b)

Then we have the following claims:

Claim A.1 If Ω is bounded, then the space C^k(Ω̄) is a Banach space with the norms ‖·‖_{C^k(Ω̄)} and ‖·‖′_{C^k(Ω̄)}.

Claim A.2 If Ω is bounded, then the Hölder space C^{k+θ}(Ω̄) is a Banach space with the norms ‖·‖_{C^{k+θ}(Ω̄)} and ‖·‖′_{C^{k+θ}(Ω̄)}.

Claim A.3 Let Ω be a bounded domain in R^n with smooth boundary ∂Ω. Let k, j be non-negative integers and 0 < θ, τ < 1. Assume that j + τ < k + θ. Then the injection

C^{k+θ}(Ω̄) ⊂ C^{j+τ}(Ω̄)

is compact (or completely continuous) [74, Chap. 6, Lemma 6.36].
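As a small numerical illustration of the seminorm (A.1) (a sketch, not part of the text; the grid and the test function u(x) = √x are illustrative choices), one can approximate [u]_{θ;Ω} over a finite set of sample points and observe that √x on Ω = (0, 1) is Hölder continuous with exponent θ = 1/2 but not with exponents close to 1:

```python
import math

def holder_seminorm(u, points, theta):
    """Brute-force approximation of [u]_{θ;Ω} = sup_{x≠y} |u(x)-u(y)|/|x-y|^θ
    over a finite set of sample points."""
    best = 0.0
    for i, x in enumerate(points):
        for y in points[i + 1:]:
            best = max(best, abs(u(x) - u(y)) / abs(x - y) ** theta)
    return best

grid = [i / 500.0 for i in range(501)]        # sample points in [0, 1]
print(holder_seminorm(math.sqrt, grid, 0.5))  # ≈ 1: [sqrt]_{1/2} = 1 exactly
print(holder_seminorm(math.sqrt, grid, 0.9))  # ≈ 12 on this grid, and it grows
                                              # without bound as the grid is refined
```

The second value blows up near 0 because |√x − √0| / |x − 0|^θ = x^{1/2 − θ} is unbounded for θ > 1/2, so √x lies in C^{1/2}(Ω̄) but in no C^θ(Ω̄) with θ > 1/2.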

A.2 Interior Estimates for Harmonic Functions

First, by differentiating the Poisson integral we can obtain the following interior derivative estimates for harmonic functions [74, Chap. 2, Theorem 2.10]:

Theorem A.4 Let Ω be an open set in R^n and let Ω′ be an open subset of Ω that has compact closure in Ω: Ω′ ⋐ Ω. If u(x) is harmonic in Ω, then we have, for any multi-index α,

sup_{x∈Ω′} |D^α u(x)| ≤ ( n|α| / d )^{|α|} sup_{x∈Ω} |u(x)|,    (A.5)

where

d = dist(Ω′, ∂Ω).

Proof We only prove the case where |α| = 1. Let y ∈ Ω′, and let B := B(y, R) be an open ball of radius R about y with B ⊂ Ω (see Fig. A.1).

Fig. A.1 The domains Ω and Ω′, with the ball B(y, R) about y.

First, since u(x) is harmonic in Ω, it follows that

Δ(Dᵢu) = Dᵢ(Δu) = 0 in Ω for each 1 ≤ i ≤ n.

Hence, by applying the mean value theorem for harmonic functions to each function Dᵢu (see [62, Corollary 2.9]) and then the divergence theorem (Theorem 4.22), we obtain that

Dᵢu(y) = (n / (ωₙ Rⁿ)) ∫_B Dᵢu(x) dx = (n / (ωₙ Rⁿ)) ∫_{∂B} u(z) νᵢ dσ(z) for each 1 ≤ i ≤ n,

where dσ is the surface element of ∂B,

ωₙ = 2π^{n/2} / Γ(n/2)

is the surface area of the unit ball in R^n, and ν = (ν₁, ν₂, ..., νₙ) is the unit outward normal to ∂B. Then we have the inequality

|Dᵢu(y)| ≤ (n / (ωₙ Rⁿ)) ∫_{∂B} |u(z) νᵢ| dσ(z)
   ≤ (n / (ωₙ Rⁿ)) sup_{z∈∂B} |u(z)| ∫_{∂B} dσ(z) = (n/R) sup_{z∈∂B} |u(z)|
   ≤ (n/R) sup_{x∈B} |u(x)|
   ≤ (n/R) sup_{x∈Ω} |u(x)|.

By letting R ↑ d_y = dist(y, ∂Ω), we obtain that

|Dᵢu(y)| ≤ (n / d_y) sup_{x∈Ω} |u(x)| for all y ∈ Ω.

We remark that

1/d_z ≤ 1/d for all z ∈ Ω′.

Hence we have the inequality

|Dᵢu(z)| ≤ (n/d) sup_{x∈Ω} |u(x)| for all z ∈ Ω′.

This proves the desired interior estimate (A.5) for |α| = 1. The proof of Theorem A.4 is complete. □
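The interior estimate (A.5) can be sanity-checked numerically. The following sketch is not from the text (the sampling grid and the harmonic test function u = Re((x₁ + ix₂)³) are illustrative choices); it verifies the case |α| = 1 with Ω the unit disk and Ω′ the concentric disk of radius 1/2, so that n = 2 and d = 1/2:

```python
import math

def u(x, y):
    # harmonic: the real part of (x + iy)^3
    return x ** 3 - 3.0 * x * y ** 2

def d1u(x, y):
    # ∂u/∂x, computed exactly
    return 3.0 * x ** 2 - 3.0 * y ** 2

# polar sampling grid on the closed unit disk
pts = [(r * math.cos(t), r * math.sin(t))
       for r in [i / 50.0 for i in range(51)]
       for t in [2.0 * math.pi * k / 200.0 for k in range(200)]]

sup_u = max(abs(u(x, y)) for (x, y) in pts)                              # over Ω
sup_du = max(abs(d1u(x, y)) for (x, y) in pts if x * x + y * y <= 0.25)  # over Ω'
print(sup_du, (2 / 0.5) * sup_u)    # 0.75 vs the bound (n/d)·sup|u| = 4
assert sup_du <= (2 / 0.5) * sup_u  # (A.5) with |α| = 1 holds, with room to spare
```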



A.3 Hölder Regularity for the Newtonian Potential

We consider the fundamental solution Γ(x − y) for the Laplacian in the case n ≥ 3:

Γ(x − y) = Γ(|x − y|) = (1 / ((2 − n)ωₙ)) |x − y|^{2−n}.    (A.6)

Then we have the following formulas for the fundamental solution Γ(x − y):

DᵢΓ(x − y) = ∂Γ/∂xᵢ (x, y) = (1/ωₙ) (xᵢ − yᵢ) |x − y|^{−n},    (A.7a)
DᵢDⱼΓ(x − y) = ∂²Γ/∂xᵢ∂xⱼ (x, y) = (1/ωₙ) ( |x − y|² δᵢⱼ − n (xᵢ − yᵢ)(xⱼ − yⱼ) ) |x − y|^{−n−2}.    (A.7b)

We remark that the fundamental solution Γ(x − y) is harmonic for x ≠ y:

Δₓ Γ(x − y) = Σ_{i=1}^n ∂²Γ/∂xᵢ² (x − y) = 0 for x ≠ y.

By formulas (A.7), we have the following estimates for the fundamental solution Γ(x − y):

|DᵢΓ(x − y)| ≤ (1/ωₙ) |x − y|^{1−n},    (A.8a)
|DᵢDⱼΓ(x − y)| ≤ (n/ωₙ) |x − y|^{−n}.    (A.8b)
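For n = 3 the formulas above can be verified numerically. This is a sketch, not from the text: it takes ω₃ = 4π (the surface area of the unit sphere in R³), checks that ΔₓΓ ≈ 0 away from y by second differences, and compares (A.7a) with a finite-difference derivative; the sample point and step sizes are illustrative.

```python
import math

w3 = 4.0 * math.pi   # surface area of the unit sphere in R^3

def gamma(x, y):
    # formula (A.6) with n = 3: Γ(x - y) = |x - y|^{-1} / ((2 - 3) ω_3)
    r = math.dist(x, y)
    return r ** (2 - 3) / ((2 - 3) * w3)

def laplacian_gamma(x, y, h=1e-4):
    # Δ_x Γ by second differences along each coordinate axis
    total = 0.0
    for k in range(3):
        xp = list(x); xm = list(x)
        xp[k] += h; xm[k] -= h
        total += (gamma(xp, y) - 2.0 * gamma(x, y) + gamma(xm, y)) / h ** 2
    return total

x, y = (1.0, 0.5, -0.3), (0.0, 0.0, 0.0)
print(laplacian_gamma(x, y))           # ≈ 0: Γ is harmonic away from y

# (A.7a) with n = 3, i = 1: D_1 Γ(x - y) = (x_1 - y_1) |x - y|^{-3} / ω_3
r = math.dist(x, y)
exact = (x[0] - y[0]) * r ** (-3) / w3
h = 1e-6
fd = (gamma((x[0] + h, x[1], x[2]), y) - gamma((x[0] - h, x[1], x[2]), y)) / (2.0 * h)
print(exact, fd)                       # the two values agree to high accuracy
```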

Claim A.5 Let Ω be a smooth domain with boundary ∂Ω. If u ∈ C²(Ω̄), then we have the Green representation formula

u(y) = ∫_{∂Ω} ( u(x) ∂Γ/∂ν (x, y) − Γ(x, y) ∂u/∂ν (x) ) dσ(x) + ∫_Ω Γ(x, y) Δu(x) dx for y ∈ Ω.    (A.9)

Proof We shall apply Green's second identity (4.23b) to the fundamental solution Γ(x, y). Let y be an arbitrary point of Ω. We replace the domain Ω by the punctured domain Ω \ B_ρ, where B_ρ = B(y, ρ) is a sufficiently small open ball of radius ρ about y (see Fig. A.2 below). By applying Green's second identity to our situation, we obtain that

Appendix: A Brief Introduction to the Potential Theoretic Approach

729

Fig. A.2 The punctured domain Ω \ Bρ

ν ν Bρ = B(y, ρ) y

Ω ∂Ω

Ω\Bρ

  ∫_{Ω\Bρ} Γ(x, y) Δu(x) dx  (A.10)
    = ∫_{∂Ω} ( Γ(x, y) ∂u/∂ν (x) − u(x) ∂Γ/∂ν (x, y) ) dσ(x)
    + ∫_{∂Bρ} ( Γ(x, y) ∂u/∂ν (x) − u(x) ∂Γ/∂ν (x, y) ) dσ(x).

However, we have, as ρ ↓ 0,

  | ∫_{∂Bρ} Γ(x, y) ∂u/∂ν (x) dσ(x) | = |Γ(ρ)| | ∫_{∂Bρ} ∂u/∂ν (x) dσ(x) |
    ≤ (ρ^{2−n}/((n − 2)ωn)) · ωn ρ^{n−1} sup_{Bρ} |Du|
    = (ρ/(n − 2)) sup_{Bρ} |Du| → 0.



On the other hand, since we have the formula

  ∂/∂ν = (1/ρ) Σ_{i=1}^{n} (yi − xi) ∂/∂xi on the sphere ∂Bρ = ∂B(y, ρ),

it follows that


  ∂Γ/∂ν (x, y) = (1/((2 − n)ωn)) (1/ρ) Σ_{i=1}^{n} (yi − xi) ∂/∂xi ( Σ_{i=1}^{n} (xi − yi)² )^{1−n/2}
    = −(1/(ωn ρ)) Σ_{i=1}^{n} (xi − yi)² ( Σ_{i=1}^{n} (xi − yi)² )^{−n/2}
    = −(1/(ωn ρ)) ρ² ρ^{−n}
    = −1/(ωn ρ^{n−1}) on ∂Bρ.

Hence we have, as ρ ↓ 0,

  ∫_{∂Bρ} u(x) ∂Γ/∂ν (x, y) dσ(x) = −(1/(ωn ρ^{n−1})) ∫_{∂Bρ} u(x) dσ(x) → −u(y).

Indeed, by the continuity of u it suffices to note that

  | (1/(ωn ρ^{n−1})) ∫_{∂Bρ} u(x) dσ(x) − u(y) |
    = | (1/(ωn ρ^{n−1})) ∫_{∂Bρ} ( u(x) − u(y) ) dσ(x) |
    ≤ (1/(ωn ρ^{n−1})) ∫_{∂Bρ} |u(x) − u(y)| dσ(x)
    ≤ ( (1/(ωn ρ^{n−1})) ∫_{∂Bρ} dσ(x) ) sup_{|x−y|=ρ} |u(x) − u(y)|
    = sup_{|x−y|=ρ} |u(x) − u(y)| → 0 as ρ ↓ 0.

Therefore, by letting ρ ↓ 0 in formula (A.10) we obtain the desired Green representation formula

  u(y) = ∫_{∂Ω} ( u(x) ∂Γ/∂ν (x, y) − Γ(x, y) ∂u/∂ν (x) ) dσ(x) + ∫_Ω Γ(x, y) Δu(x) dx.

The proof of Claim A.5 is complete. □

Now we study some differentiability properties of the Newtonian potential of a function f(x):

  w(x) := (Γ ∗ f)(x) = ∫_Ω Γ(x − y) f(y) dy


in an open subset Ω of Euclidean space Rⁿ. First, we obtain the following [74, Chap. 4, Lemma 4.1]:

Lemma A.6 Let f(x) be a bounded and integrable function in Ω. Then it follows that

  w(x) = ∫_Ω Γ(x − y) f(y) dy ∈ C¹(Ω),

and we have, for any x ∈ Ω,

  Di w(x) = ∫_Ω Di Γ(x − y) f(y) dy for 1 ≤ i ≤ n.  (A.11)

Secondly, we obtain the following [74, Chap. 4, Lemma 4.2]:

Lemma A.7 Let f(x) be a bounded and locally Hölder continuous function with exponent 0 < θ ≤ 1 in Ω. Then it follows that

  w(x) = ∫_Ω Γ(x − y) f(y) dy ∈ C²(Ω), Δw = f in Ω.

Moreover, take an arbitrary smooth domain Ω0 that contains Ω, and let ν = (ν1, ν2, …, νn) be the unit outward normal to ∂Ω0 and dσ the surface element on ∂Ω0 (see Fig. A.3). Then we have, for any x ∈ Ω,

  Di Dj w(x) = ∫_{Ω0} Di Dj Γ(x − y) ( f0(y) − f(x) ) dy
    − f(x) ∫_{∂Ω0} Di Γ(x − y) νj(y) dσ(y) for 1 ≤ i, j ≤ n,  (A.12)

where f0 is the zero-extension of f outside Ω:

  f0(x) = f(x) in Ω, f0(x) = 0 in Ω0 \ Ω.

Fig. A.3 The domains Ω and Ω0
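Lemma A.7 can be illustrated numerically. For the constant density f ≡ 1 on the ball B(0, R) in R³ the Newtonian potential has the classical closed form w(x) = |x|²/6 − R²/2 for |x| < R, so that Δw = 1 = f there. The following Monte Carlo sketch (an illustration, not from the book: the sample sizes, the seed, and the evaluation point are arbitrary choices) checks the closed form at one interior point.

```python
import numpy as np

# Monte Carlo check: for f = 1 on B(0, R) in R^3, the Newtonian potential
#   w(x) = integral over B of Gamma(x - y) dy,  Gamma(z) = -1 / (4 pi |z|),
# equals |x|^2 / 6 - R^2 / 2 at interior points x, whence Delta w = 1 = f.
rng = np.random.default_rng(0)
R = 1.0
x = np.array([0.5, 0.0, 0.0])    # interior evaluation point

# uniform samples in the ball B(0, R) by rejection from the cube
pts = rng.uniform(-R, R, size=(2_000_000, 3))
pts = pts[np.linalg.norm(pts, axis=1) < R][:400_000]
vol = 4.0 / 3.0 * np.pi * R**3

integrand = -1.0 / (4.0 * np.pi * np.linalg.norm(pts - x, axis=1))
w_mc = vol * integrand.mean()                 # Monte Carlo estimate of w(x)
w_exact = np.dot(x, x) / 6.0 - R**2 / 2.0     # closed-form value
print(w_mc, w_exact)
```

The integrand is singular at y = x but integrable (and of finite variance) in R³, so the plain Monte Carlo average converges; with these sample sizes the two values agree to about three decimal places.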

A.4 Hölder Estimates for the Second Derivatives

We start with the following basic Hölder estimates for the Newtonian potential [74, Chap. 4, Lemma 4.4]:

Lemma A.8 Let B1 = B(x0, R) and B2 = B(x0, 2R) be concentric balls in Rⁿ (see Fig. A.4). For a function f ∈ C^θ(B2) with 0 < θ < 1, let w(x) be the Newtonian potential of f in B2:

  w(x) = ∫_{B2} Γ(x − y) f(y) dy.

Then it follows that w ∈ C^{2+θ}(B1), and we have the interior estimate

  |D²w|_{0,θ;B1} ≤ C1 | f |_{0,θ;B2},  (A.13)

that is,

  |D²w|_{0;B1} + (2R)^θ [D²w]_{θ;B1} ≤ C1 ( | f |_{0;B2} + (4R)^θ [ f ]_{θ;B2} ),

with a constant C1 = C1(n, θ) > 0.

We can prove the following interior Hölder estimate for solutions of Poisson's equation [74, Chap. 4, Theorem 4.6].

Fig. A.4 The concentric balls B1 = B(x0, R) and B2 = B(x0, 2R)

Fig. A.5 The concentric balls B1 and B2 in Ω


Theorem A.9 Let Ω be a domain in Rⁿ and let f ∈ C^θ(Ω) with 0 < θ < 1. If a function u ∈ C²(Ω) satisfies Poisson's equation

  Δu = f in Ω,

then it follows that u ∈ C^{2+θ}(Ω). Moreover, for any two concentric balls (see Fig. A.5)

  B1 = B(x0, R), B2 = B(x0, 2R) ⋐ Ω,

we have the estimate

  |u|_{2,θ;B1} ≤ C2 ( |u|_{0;B2} + (4R)² | f |_{0,θ;B2} ),  (A.14)

that is,

  |u|_{0;B1} + (2R) |Du|_{0;B1} + (2R)² |D²u|_{0;B1} + (2R)^{2+θ} [D²u]_{θ;B1}
    ≤ C2 ( |u|_{0;B2} + (4R)² | f |_{0;B2} + (4R)^{2+θ} [ f ]_{θ;B2} ),

with a constant C2 = C2(n, θ) > 0.

Let Ω be an open set in Rⁿ. For x, y ∈ Ω, we let

  d_x = dist(x, ∂Ω), d_{x,y} = min(d_x, d_y).

If k is a non-negative integer and 0 < θ < 1, then we introduce various interior seminorms and norms on the Hölder spaces C^k(Ω) and C^{k+θ}(Ω) as follows:


  [u]∗_{k,0;Ω} = [u]∗_{k;Ω} = sup_{x∈Ω} sup_{|α|=k} d_x^k |D^α u(x)|,  (A.15a)

  |u|∗_{k;Ω} = Σ_{j=0}^{k} [u]∗_{j;Ω},  (A.15b)

  [u]∗_{k,θ;Ω} = sup_{x,y∈Ω} sup_{|α|=k} d_{x,y}^{k+θ} |D^α u(x) − D^α u(y)| / |x − y|^θ,  (A.15c)

  |u|∗_{k,θ;Ω} = |u|∗_{k;Ω} + [u]∗_{k,θ;Ω}.  (A.15d)

Claim A.10 (i) If Ω is bounded with d = diam Ω, then we have the inequality

  |u|∗_{k,θ;Ω} ≤ max(1, d^{θ+k}) |u|_{k,θ;Ω}.  (A.14′)

(ii) If Ω′ ⋐ Ω with σ = dist(Ω′, ∂Ω), then we have the inequality

  min(1, σ^{θ+k}) |u|_{k,θ;Ω′} ≤ |u|∗_{k,θ;Ω}.  (A.14″)

Moreover, we introduce a seminorm on the Hölder space C^θ(Ω) as follows:

  | f |^(k)_{0,θ;Ω} = sup_{x∈Ω} d_x^k | f(x)| + sup_{x,y∈Ω} d_{x,y}^{k+θ} | f(x) − f(y)| / |x − y|^θ.  (A.16)

Then, by using Theorem A.9 we can obtain a Schauder interior estimate for a general domain Ω [74, Chap. 4, Theorem 4.8]:

Theorem A.11 (the Schauder interior estimate) Assume that a function u ∈ C²(Ω) satisfies the equation

  Δu = f in Ω

for a function f ∈ C^θ(Ω). Then we have the interior estimate

  |u|∗_{2,θ;Ω} ≤ C3 ( |u|_{0;Ω} + | f |^(2)_{0,θ;Ω} )  (A.17)

with a constant C3 = C3(n, θ) > 0.

Proof We have only to consider the case where

  |u|_{0;Ω} < ∞, | f |^(2)_{0,θ;Ω} < ∞.

The proof is divided into two steps. In the following the letter C denotes a generic positive constant independent of u and f.

Step 1: For each point x of Ω, we let (see Fig. A.6 below)


Fig. A.6 The concentric balls B1 and B2 in Ω

  R := (1/3) d_x = (1/3) dist(x, ∂Ω), B1 := B(x, R), B2 := B(x, 2R).

Then we have, by estimate (A.14),

  d_x |Du(x)| + d_x² |D²u(x)| ≤ 3R |Du|_{0;B1} + (3R)² |D²u|_{0;B1}
    ≤ C ( |u|_{0;B2} + R² | f |_{0,θ;B2} )
    ≤ C ( |u|_{0;Ω} + | f |^(2)_{0,θ;B2} ).  (A.18)

However, we have the estimate

  R² | f |_{0,θ;B2} = R² ( | f |_{0;B2} + (4R)^θ [ f ]_{θ;B2} ) ≤ C | f |^(2)_{0,θ;Ω}.  (A.19)

Indeed, since we have the inequality

  R = (1/3) d_x ≤ d_z for all z ∈ B2,

it follows that

  R² sup_{z∈B2} | f(z)| ≤ sup_{z∈B2} d_z² | f(z)| ≤ sup_{x∈Ω} d_x² | f(x)|.

Similarly, since we have the inequality

  R = (1/3) d_x ≤ d_{z,w} = min(d_z, d_w) for all z, w ∈ B2,

it follows that


  R^{2+θ} sup_{z,w∈B2} | f(z) − f(w)| / |z − w|^θ ≤ sup_{z,w∈B2} d_{z,w}^{2+θ} | f(z) − f(w)| / |z − w|^θ
    ≤ sup_{x,y∈Ω} d_{x,y}^{2+θ} | f(x) − f(y)| / |x − y|^θ.

Hence we have the inequality

  R² | f |_{0,θ;B2} ≤ C | f |^(2)_{0,θ;Ω}.

Therefore, by combining inequalities (A.18) and (A.19) we obtain that

  |u|∗_{2;Ω} = |u|_{0;Ω} + sup_{x∈Ω} d_x |Du(x)| + sup_{x∈Ω} d_x² |D²u(x)|
    ≤ C ( |u|_{0;Ω} + | f |^(2)_{0,θ;Ω} ).  (A.20)

Step 2: We assume that d_x ≤ d_y for x, y ∈ Ω, so that d_x = d_{x,y} = 3R. Then it follows that

  |D²u(x) − D²u(y)| / |x − y|^θ ≤ [D²u]_{θ;B1} if y ∈ B1,
  |D²u(x) − D²u(y)| / |x − y|^θ ≤ (1/R^θ) ( |D²u(x)| + |D²u(y)| ) if y ∈ Ω \ B1.

Hence we have, for x, y ∈ Ω,

  d_{x,y}^{2+θ} |D²u(x) − D²u(y)| / |x − y|^θ ≤ (3R)^{2+θ} [D²u]_{θ;B1}
    + 3^θ (3R)² ( |D²u(x)| + |D²u(y)| ).  (A.21)

However, by using inequalities (A.14) and (A.19) we can estimate the first term on the right-hand side of inequality (A.21) as follows:

  (3R)^{2+θ} [D²u]_{θ;B1} ≤ C ( |u|_{0;B2} + R² | f |_{0,θ;B2} ) ≤ C ( |u|_{0;Ω} + | f |^(2)_{0,θ;Ω} ).

On the other hand, by using inequality (A.20) we can estimate the second term on the right-hand side of inequality (A.21) as follows:

  3^θ (3R)² ( |D²u(x)| + |D²u(y)| ) ≤ 6 sup_{x∈Ω} d_x² |D²u(x)| ≤ 6 |u|∗_{2;Ω}
    ≤ C ( |u|_{0;Ω} + | f |^(2)_{0,θ;Ω} ).


Summing up, we obtain that

  sup_{x,y∈Ω} d_{x,y}^{2+θ} |D²u(x) − D²u(y)| / |x − y|^θ ≤ C ( |u|_{0;Ω} + | f |^(2)_{0,θ;Ω} ).  (A.22)

The desired interior estimate (A.17) follows by combining inequalities (A.20) and (A.22). The proof of Theorem A.11 is complete. □

The next corollary provides equicontinuity of solutions of Poisson's equation and their derivatives up to the second order on compact subsets [74, Chap. 6, Corollary 6.3]:

Corollary A.12 Let Ω be an open set in Rⁿ and let Ω′ be an open subset of Ω that has compact closure in Ω: Ω′ ⋐ Ω. Assume that a function u ∈ C²(Ω) satisfies the equation

  Δu = f in Ω

for a function f ∈ C^θ(Ω). Then we have, for any 0 < d ≤ dist(Ω′, ∂Ω),

  d |Du|_{0;Ω′} + d² |D²u|_{0;Ω′} + d^{2+θ} [D²u]_{θ;Ω′} ≤ C4 ( |u|_{0;Ω} + | f |^(2)_{0,θ;Ω} ),  (A.23)

with a constant C4 = C4(n, θ) > 0.

Proof For all x, y ∈ Ω′, we have the assertions

  d_x ≥ d, d_y ≥ d, d_{x,y} ≥ d.

Hence, by using estimate (A.17) we obtain that

  d |Du|_{0;Ω′} + d² |D²u|_{0;Ω′} + d^{2+θ} [D²u]_{θ;Ω′}
    ≤ sup_{x∈Ω′} d_x |Du(x)| + sup_{x∈Ω′} d_x² |D²u(x)| + sup_{x,y∈Ω′} d_{x,y}^{2+θ} |D²u(x) − D²u(y)| / |x − y|^θ
    ≤ |u|∗_{2,θ;Ω′} ≤ |u|∗_{2,θ;Ω}
    ≤ C ( |u|_{0;Ω} + | f |^(2)_{0,θ;Ω} ).

This proves the desired estimate (A.23). The proof of Corollary A.12 is complete. □

Rephrased, this corollary provides a bound on the seminorms |Du|_{0;Ω′}, |D²u|_{0;Ω′} and [D²u]_{θ;Ω′} in any subset Ω′ of Ω for which dist(Ω′, ∂Ω) ≥ d (see Fig. A.7 below).


Fig. A.7 The domains Ω and Ω′

A.5 Hölder Estimates at the Boundary

The purpose of this section is to prove regularity results for solutions of Poisson's equation at a hyperplane portion of the boundary (see Theorems A.14 and A.18). First, we introduce some notation (see Fig. A.8):

  R^n_+ = { (x′, xn) ∈ Rⁿ : xn > 0 },
  T = { (x′, xn) ∈ Rⁿ : xn = 0 },
  B1 = B(x0, R) = { x ∈ Rⁿ : |x − x0| < R } for x0 ∈ R^n_+ ∪ T,
  B2 = B(x0, 2R) = { x ∈ Rⁿ : |x − x0| < 2R } for x0 ∈ R^n_+ ∪ T,
  B1+ = B1 ∩ R^n_+ = B(x0, R) ∩ R^n_+,
  B2+ = B2 ∩ R^n_+ = B(x0, 2R) ∩ R^n_+.

Fig. A.8 The domains B1+, B2+ and T in R^n_+


The next lemma is a version of Lemma A.8 applicable to the intersection of a domain Ω and the half space R^n_+ [74, Chap. 4, Lemma 4.10]:

Lemma A.13 If f ∈ C^θ(B2+), we let w be the Newtonian potential of f in B2+:

  w(x) = ∫_{B2+} Γ(x − y) f(y) dy.

Then it follows that w ∈ C^{2+θ}(B1+), and we have the boundary a priori estimate

  |D²w|_{0,θ;B1+} ≤ C5 | f |_{0,θ;B2+},

that is,

  |D²w|_{0;B1+} + R^θ [D²w]_{θ;B1+} ≤ C5 ( | f |_{0;B2+} + R^θ [ f ]_{θ;B2+} ),

with a constant C5 = C5(n, θ) > 0.

Proof First, by applying Lemma A.7 with Ω := B1+, Ω0 := B2+, we obtain that

  Di Dj w(x) = ∫_{B2+} Di Dj Γ(x − y) ( f(y) − f(x) ) dy
    − f(x) ∫_{∂B2+} Di Γ(x − y) νj(y) dσ(y) in B1+.

However, since Dj Di Γ(x − y) = Di Dj Γ(x − y), it follows from an application of the divergence theorem that

  ∫_{∂B2+} Di Γ(x − y) νj(y) dσ(y) = ∫_{B2+} Dj Di Γ(x − y) dy
    = ∫_{B2+} Di Dj Γ(x − y) dy = ∫_{∂B2+} Dj Γ(x − y) νi(y) dσ(y).

Since we have the formula

  ν = (0, …, 0, −1) on T,  (A.24)


Fig. A.9 The domain B1+ and the boundary ∂B2+ \ T

we have, for i ≠ n or j ≠ n,

  Di Dj w(x) = ∫_{B2+} Di Dj Γ(x − y) ( f(y) − f(x) ) dy
    − f(x) ∫_{∂B2+\T} Di Γ(x − y) νj(y) dσ(y) in B1+.

We remark that (see Fig. A.9)

  R ≤ |x − y| ≤ 3R for x ∈ B1+ and y ∈ ∂B2+ \ T.

Hence we can estimate the derivatives Di Dj w for i ≠ n or j ≠ n as follows:

  |Di Dj w|_{0,θ;B1+} ≤ C | f |_{0,θ;B2+}.

Moreover, it follows from an application of Lemma A.7 that the function

  w(x) = ∫_{B2+} Γ(x − y) f(y) dy

satisfies the equation

  Dn Dn w(x) = f(x) − Σ_{k=1}^{n−1} Dk Dk w(x) in B1+.

Therefore, we can estimate the derivative Dn Dn w as follows:

  |Dn Dn w|_{0,θ;B1+} ≤ | f |_{0,θ;B2+} + Σ_{k=1}^{n−1} |Dk Dk w|_{0,θ;B1+} ≤ C | f |_{0,θ;B2+}.


Here and in the following the letter C denotes a generic positive constant independent of w and f. The proof of Lemma A.13 is complete. □

The next theorem is a generalization of Theorem A.9 to the half space R^n_+ [74, Chap. 4, Theorem 4.11]:

Theorem A.14 Assume that a function u ∈ C²(B2+) ∩ C(B2+) is a solution of the Dirichlet problem

  Δu = f in B2+,
  u = 0 on T

for a function f ∈ C^θ(B2+). Then it follows that u ∈ C^{2+θ}(B1+), and we have the a priori estimate

  |u|_{2,θ;B1+} ≤ C6 ( |u|_{0;B2+} + R² | f |_{0,θ;B2+} )  (A.25)

with a constant C6 = C6(n, θ) > 0.

Proof The proof of Theorem A.14 is divided into five steps.

Step (I): In the proof we make use of a method based on reflection. Namely, for each point x = (x′, xn) ∈ B2+ we define the point (see Fig. A.10)

  x∗ = (x′, −xn) ∈ B2− := B(x0, 2R) ∩ R^n_−.

Fig. A.10 The mapping x ↦ x∗


Fig. A.11 The domains B2±, T and D in Rⁿ

Then we have the following:

Claim A.15 For a function f ∈ C^θ(B2+), we consider the function f∗(x) defined on the domain (see Fig. A.11)

  D = B2+ ∪ B2− ∪ (B2 ∩ T)

as follows:

  f∗(x) = f∗(x′, xn) = f(x′, xn) for xn ≥ 0, f(x′, −xn) for xn < 0.

Then it follows that

  f∗ ∈ C^θ(D),

and we have the estimate

  | f∗|_{0,θ;D} ≤ 4 | f |_{0,θ;B2+}.

Proof Indeed, it suffices to note that

  | f∗|_{0,θ;D} = | f∗|_{0;D} + (diam D)^θ [ f∗]_{θ;D}
    ≤ | f |_{0;B2+} + (2 diam B2+)^θ [ f ]_{θ;B2+}
    ≤ 4 ( | f |_{0;B2+} + (diam B2+)^θ [ f ]_{θ;B2+} )
    = 4 | f |_{0,θ;B2+}.

The proof of Claim A.15 is complete. □

Step (II): We let

  w(x) := ∫_{B2+} ( Γ(x − y) − Γ(x∗ − y) ) f(y) dy.


Since we have the formula

  |x∗ − y| = |(x′ − y′, xn + yn)| = |x − y∗|,

it follows that

  w(x) = ∫_{B2+} ( Γ(x − y) − Γ(x − y∗) ) f(y) dy.

Moreover, we have the following:

Claim A.16 The function w(x) is a solution of the Dirichlet problem

  Δw = f in B2+,
  w = 0 on T.

Proof (1) First, we have, for each y ∈ B2+, the assertion

  ( Γ(x − y) − Γ(x − y∗) ) f(y)
    = (1/((n − 2)ωn)) ( 1/|x − y∗|^{n−2} − 1/|x − y|^{n−2} ) f(y) → 0 as xn ↓ 0,

since |x − y∗| = |x∗ − y| → |x − y| as xn ↓ 0.

Moreover, we have the inequality

  ∫_{B2+} | Γ(x − y) − Γ(x − y∗) | | f(y)| dy
    ≤ sup_{z∈B2+} | f(z)| ( ∫_{B2+} |Γ(x − y)| dy + ∫_{B2+} |Γ(x − y∗)| dy ).

However, we have, for x, y ∈ B2+,

  |x − y| ≤ 4R, |x − y∗| ≤ 8R.

Hence we have the inequality

  ∫_{B2+} | Γ(x − y) − Γ(x − y∗) | | f(y)| dy
    ≤ (1/((n − 2)ωn)) sup_{z∈B2+} | f(z)| ( ∫_{B2+} dy/|x − y|^{n−2} + ∫_{B2+} dy/|x − y∗|^{n−2} )
    ≤ (1/((n − 2)ωn)) sup_{z∈B2+} | f(z)| · ωn ( ∫_0^{4R} r^{2−n} r^{n−1} dr + ∫_0^{8R} r^{2−n} r^{n−1} dr )
    = (40 R²/(n − 2)) | f |_{0;B2+}.


Therefore, by applying Beppo Levi's theorem (Theorem 2.13) we obtain that

  limsup_{xn↓0} | ∫_{B2+} ( Γ(x − y) − Γ(x − y∗) ) f(y) dy |
    ≤ lim_{xn↓0} ∫_{B2+} | Γ(x − y∗) − Γ(x − y) | | f(y)| dy = 0.

This proves that

  w(x) = ∫_{B2+} ( Γ(x − y) − Γ(x − y∗) ) f(y) dy = 0 on T = {xn = 0}.

(2) Secondly, we show that

  Δx Γ(x∗ − y) = 0 for x, y ∈ B2+.

Indeed, since

  |x∗ − y|² = Σ_{j=1}^{n−1} (xj − yj)² + (xn + yn)²,

we have the formulas

  ∂²/∂xn² |x∗ − y|^{2−n} = (n − 2) ( n (xn + yn)² − |x∗ − y|² ) |x∗ − y|^{−n−2},

and

  ∂²/∂xj² |x∗ − y|^{2−n} = (n − 2) ( n (xj − yj)² − |x∗ − y|² ) |x∗ − y|^{−n−2} for 1 ≤ j ≤ n − 1.

This proves that

  Δx Γ(x∗ − y) = (1/((2 − n)ωn)) (n − 2) ( n |x∗ − y|² − n |x∗ − y|² ) |x∗ − y|^{−n−2} = 0 for x, y ∈ B2+.

(3) Summing up, we find from Lemma A.7 that


  Δw(x) = Δx ∫_{B2+} Γ(x − y) f(y) dy − Δx ∫_{B2+} Γ(x∗ − y) f(y) dy
    = Δx ∫_{B2+} Γ(x − y) f(y) dy
    = f(x) in B2+.

The proof of Claim A.16 is complete. □

Step (III): Since we have the assertion

  y ∈ B2+ ⟺ y∗ ∈ B2−, f(y) = f∗(y∗),

we have the formula

  w(x) = ∫_{B2+} ( Γ(x − y) − Γ(x∗ − y) ) f(y) dy
    = ∫_{B2+} Γ(x − y) f(y) dy − ∫_{B2−} Γ(x − y) f∗(y) dy
    = 2 ∫_{B2+} Γ(x − y) f(y) dy − ∫_{B2+} Γ(x − y) f(y) dy − ∫_{B2−} Γ(x − y) f∗(y) dy
    = 2 (Γ ∗ f)(x) − ∫_D Γ(x − y) f∗(y) dy.

However, it follows from Claim A.15 that f∗ ∈ C^θ(D). If we let

  w∗(x) := ∫_D Γ(x − y) f∗(y) dy,

then it follows from an application of Lemma A.8 that (see Fig. A.12 below)

  |D²w∗|_{0,θ;B1+} ≤ C | f∗|_{0,θ;D} ≤ 4C | f |_{0,θ;B2+}.

Therefore, by combining this inequality with Lemma A.13 we obtain that

  |w|_{2,θ;B1+} ≤ 2 |Γ ∗ f |_{2,θ;B1+} + |w∗|_{2,θ;B1+} ≤ C ( | f |_{0;B2+} + R^θ [ f ]_{θ;B2+} )
    = C | f |_{0,θ;B2+}.  (A.26)


Fig. A.12 The domains B2+, B1+ and D = B2+ ∪ B2− ∪ (B2 ∩ T)

Step (IV): If we let

  v(x) := u(x) − w(x),

it follows that v is a solution of the Dirichlet problem

  Δv = Δu − Δw = 0 in B2+,
  v = 0 on T.

Moreover, we have the following:

Claim A.17 The function

  V(x) = V(x′, xn) = v(x′, xn) for xn ≥ 0, −v(x′, −xn) for xn < 0

is harmonic in D:

  ΔV = 0 in D.

Moreover, we have the estimate

  |V|_{0;D} ≤ |v|_{0;B2+}.

Proof First, it follows that

  ΔV = 0 in B2+, ΔV = 0 in B2−,

and that V = v = 0 on B2 ∩ T. Hence we have the assertion

  V ∈ C(D) ∩ C^∞(B2+) ∩ C^∞(B2−).

Secondly, we show that V is harmonic near T in D. To do so, let

  Br := B((x′, 0), r) ⋐ D

be an arbitrarily small open ball of radius r about (x′, 0) ∈ T. Then we have the formula



  ∫_{∂Br} V(y) dσ(y) = ∫_{∂Br+} v(y′, yn) dσ(y) − ∫_{∂Br−} v(z′, −zn) dσ(z)
    = ∫_{∂Br+} v(y′, yn) dσ(y) − ∫_{∂Br+} v(y′, yn) dσ(y)
    = 0.

Hence there exists a constant ε > 0 such that

  ∫_{∂Br} V(y) dσ(y) = 0 for all 0 < r < ε.

Therefore, by applying Green's first identity (4.23a) in our situation we obtain that

  0 = d/dr ( (1/(r^{n−1} ωn)) ∫_{∂Br} V(y) dσ(y) )
    = d/dr ( (1/ωn) ∫_{S(0,1)} V((x′, 0) + r z) dσ(z) )
    = (1/ωn) ∫_{S(0,1)} z · ∇V((x′, 0) + r z) dσ(z)
    = (1/ωn) ∫_{S(0,r)} (y/r) · ∇V((x′, 0) + y) r^{1−n} dσ(y)
    = (1/(r^{n−1} ωn)) ∫_{S(0,r)} (y/r) · ∇V((x′, 0) + y) dσ(y)
    = (1/(r^{n−1} ωn)) ∫_{S(0,r)} ∂V/∂ν ((x′, 0) + y) dσ(y)
    = (1/(r^{n−1} ωn)) ∫_{Br} ΔV(x) dx,

where

  S(0, r) = { z ∈ Rⁿ : |z| = r }.

Since the integral of ΔV over any such ball near T in D vanishes, it follows that ΔV = 0 near T in D.


Summing up, we have proved that V is harmonic in D. The proof of Claim A.17 is complete.
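The vanishing spherical means exploited above reflect the mean value property of harmonic functions, which is easy to check numerically. In the following sketch (an illustration, not from the book: the harmonic function, the centre, the radii, and the sample size are arbitrary choices) we verify that the average of a harmonic function over small spheres equals the value at the centre, so that its r-derivative vanishes.

```python
import numpy as np

# Mean value property: for a harmonic V, the average of V over the
# sphere of radius r about a point equals the value at that point.
# V(x, y, z) = x*y + z is harmonic on R^3; the centre is illustrative.
rng = np.random.default_rng(1)

def V(p):
    return p[..., 0] * p[..., 1] + p[..., 2]

centre = np.array([0.3, -0.2, 0.1])

def sphere_mean(r, n_samples=200_000):
    # Monte Carlo average of V over the sphere of radius r about `centre`
    z = rng.normal(size=(n_samples, 3))
    z /= np.linalg.norm(z, axis=1, keepdims=True)   # uniform on the unit sphere
    return V(centre + r * z).mean()

means = [sphere_mean(r) for r in (0.05, 0.1, 0.2)]
print(means, V(centre))
```

All three spherical means agree with V(centre) up to Monte Carlo error, which is exactly the fact that makes the r-derivative computation in the proof above vanish.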



Step (V): It remains to prove estimate (A.25).

(1) For the function v = u − w, we have the estimate

  |v|_{0;B1+} ≤ |u|_{0;B1+} + |w|_{0;B1+} ≤ |u|_{0;B2+} + C R² | f |_{0;B2+}.

(2) By Theorem A.4, we have the estimates

  |Dv|_{0;B1+} ≤ (n/R) |V|_{0;B2} ≤ (n/R) |v|_{0;B2+},
  |D²v|_{0;B1+} ≤ (2n/R)² |V|_{0;B2} ≤ (2n/R)² |v|_{0;B2+}.

(3) By Theorem A.9, we have the estimate

  [D²v]_{θ;B1+} ≤ C R^{−2−θ} |V|_{0;B2} ≤ C R^{−2−θ} |v|_{0;B2+}.

(4) By Theorem A.9, we have the estimate

  |w|_{0;B2+} ≤ C R² | f |_{0;B2+}.

For the function v = u − w, we have the estimate

  |v|_{0;B2+} ≤ |u|_{0;B2+} + |w|_{0;B2+} ≤ |u|_{0;B2+} + C R² | f |_{0;B2+}.

Summing up, we obtain the estimate

  |v|_{2,θ;B1+} = |v|_{2;B1+} + (diam B1+)^{2+θ} [D²v]_{θ;B1+}
    ≤ C |v|_{0;B2+} ≤ C ( |u|_{0;B2+} + R² | f |_{0;B2+} )
    ≤ C ( |u|_{0;B2+} + R² | f |_{0,θ;B2+} ).  (A.27)

Therefore, the desired estimate (A.25) follows by combining estimates (A.26) and (A.27). The proof of Theorem A.14 is complete. □

Let Ω be an open set in R^n_+ with an open boundary portion T on the hyperplane

  { (x′, xn) ∈ Rⁿ : xn = 0 }

(see Fig. A.13 below). For x, y ∈ Ω, we let

Fig. A.13 The open set Ω with an open boundary portion T

  d̄_x = dist(x, ∂Ω \ T),
  d̄_{x,y} = min(d̄_x, d̄_y).

If k is a non-negative integer and 0 < θ < 1, then we introduce various interior seminorms and norms on the Hölder spaces C^k(Ω) and C^{k+θ}(Ω) as follows:

  [u]∗_{k,0;Ω∪T} = [u]∗_{k;Ω∪T} = sup_{x∈Ω} sup_{|α|=k} d̄_x^k |D^α u(x)|,  (A.28a)

  |u|∗_{k;Ω∪T} = Σ_{j=0}^{k} [u]∗_{j;Ω∪T},  (A.28b)

  [u]∗_{k,θ;Ω∪T} = sup_{x,y∈Ω} sup_{|α|=k} d̄_{x,y}^{k+θ} |D^α u(x) − D^α u(y)| / |x − y|^θ,  (A.28c)

  |u|∗_{k,θ;Ω∪T} = |u|∗_{k;Ω∪T} + [u]∗_{k,θ;Ω∪T},  (A.28d)

and

  |u|^(k)_{0,θ;Ω∪T} = sup_{x∈Ω} d̄_x^k |u(x)| + sup_{x,y∈Ω} d̄_{x,y}^{k+θ} |u(x) − u(y)| / |x − y|^θ.  (A.28e)

The next theorem is a generalization of Theorem A.11 to the half space R^n_+ [74, Chap. 4, Theorem 4.12]:

Theorem A.18 Let Ω be an open set in R^n_+ with a boundary portion T on { (x′, xn) ∈ Rⁿ : xn = 0 }. Assume that a function u ∈ C²(Ω) ∩ C(Ω ∪ T) is a solution of the Dirichlet problem

  Δu = f in Ω,
  u = 0 on T

for a function f ∈ C^θ(Ω ∪ T). Then we have the boundary estimate

Fig. A.14 The domains B1+ and B2+ in R^n_+

  |u|∗_{2,θ;Ω∪T} ≤ C7 ( |u|_{0;Ω} + | f |^(2)_{0,θ;Ω∪T} ),  (A.29)

with a constant C7 = C7(n, θ) > 0.

Proof The proof of this theorem is similar to that of Theorem A.11.

Step 1: For each point x of Ω, we let (see Fig. A.14)

  R := (1/3) d̄_x, B1 := B(x, R), B2 := B(x, 2R),

and

  B1+ := B(x, R) ∩ R^n_+,
  B2+ := B(x, 2R) ∩ R^n_+.

Then we have, by estimate (A.25),

  d̄_x |Du(x)| + d̄_x² |D²u(x)| ≤ 3R |Du|_{0;B1} + (3R)² |D²u|_{0;B1}
    ≤ C ( |u|_{0;B2} + R² | f |_{0,θ;B2} ).  (A.30)

Moreover, if we assume that d̄_x ≤ d̄_y for x, y ∈ Ω, then it follows that

  d̄_x = d̄_{x,y} = 3R for x, y ∈ Ω.

Hence we have the inequalities

  R² | f(x)| = (1/9) d̄_x² | f(x)| ≤ (1/9) | f |^(2)_{0,θ;Ω∪T},

and

  R² d̄_x^θ | f(x) − f(y)| / |x − y|^θ = (1/9) d̄_{x,y}^{2+θ} | f(x) − f(y)| / |x − y|^θ ≤ (1/9) | f |^(2)_{0,θ;Ω∪T}.

This proves that

  R² | f |_{0,θ;B2} ≤ C | f |^(2)_{0,θ;Ω∪T}.  (A.31)

Therefore, by combining estimates (A.30) and (A.31) we obtain that

  |u|∗_{2;Ω∪T} ≤ C ( |u|_{0;Ω} + | f |^(2)_{0,θ;Ω∪T} ).  (A.32)

Step 2: Similarly, we assume that d̄_x ≤ d̄_y for x, y ∈ Ω, so that d̄_x = d̄_{x,y} = 3R. Then it follows that

  |D²u(x) − D²u(y)| / |x − y|^θ ≤ [D²u]_{θ;B1+} if y ∈ B1,
  |D²u(x) − D²u(y)| / |x − y|^θ ≤ (1/R^θ) ( |D²u(x)| + |D²u(y)| ) if y ∈ Ω \ B1.

Hence we have, for x, y ∈ Ω,

  d̄_{x,y}^{2+θ} |D²u(x) − D²u(y)| / |x − y|^θ
    ≤ (3R)^{2+θ} [D²u]_{θ;B1+} + 3^θ (3R)² ( |D²u(x)| + |D²u(y)| )
    ≤ (3R)^{2+θ} [D²u]_{θ;B1+} + 6 |u|∗_{2;Ω∪T}.  (A.33)

Moreover, by using estimates (A.25) and (A.32) we obtain that

  d̄_{x,y}^{2+θ} |D²u(x) − D²u(y)| / |x − y|^θ ≤ C ( |u|_{0;B2} + R² | f |_{0,θ;B2} ) + 6 |u|∗_{2;Ω∪T}
    ≤ C ( |u|_{0;Ω} + | f |^(2)_{0,θ;Ω∪T} ) + 6 |u|∗_{2;Ω∪T},

so that

  sup_{x,y∈Ω} d̄_{x,y}^{2+θ} |D²u(x) − D²u(y)| / |x − y|^θ
    ≤ C ( |u|_{0;Ω} + | f |^(2)_{0,θ;Ω∪T} ) + 6 |u|∗_{2;Ω∪T}.  (A.34)

Therefore, the desired boundary estimate (A.29) follows by combining estimates (A.32), (A.33) and (A.34). The proof of Theorem A.18 is complete. □


Fig. A.15 The open set Ω with an open boundary portion T

Let Ω be an open set in Rⁿ with a C^{2+θ} boundary portion T. For x, y ∈ Ω, we let (see Fig. A.15)

  d̄_x = dist(x, ∂Ω \ T),
  d̄_{x,y} = min(d̄_x, d̄_y).

If k is a non-negative integer and 0 < θ < 1, then we introduce various boundary seminorms and norms on the Hölder spaces C^k(Ω ∪ T) and C^{k+θ}(Ω ∪ T) as follows:

  [u]∗_{k,0;Ω∪T} = [u]∗_{k;Ω∪T} = sup_{x∈Ω} sup_{|α|=k} d̄_x^k |D^α u(x)|,  (A.35a)

  |u|∗_{k;Ω∪T} = Σ_{j=0}^{k} [u]∗_{j;Ω∪T},  (A.35b)

  [u]∗_{k,θ;Ω∪T} = sup_{x,y∈Ω} sup_{|α|=k} d̄_{x,y}^{k+θ} |D^α u(x) − D^α u(y)| / |x − y|^θ,  (A.35c)

  |u|∗_{k,θ;Ω∪T} = |u|∗_{k;Ω∪T} + [u]∗_{k,θ;Ω∪T},  (A.35d)

and

  |u|^(k)_{0,θ;Ω∪T} = sup_{x∈Ω} d̄_x^k |u(x)| + sup_{x,y∈Ω} d̄_{x,y}^{k+θ} |u(x) − u(y)| / |x − y|^θ.  (A.35e)

Then we can prove the following Schauder local boundary estimate for solutions of the Dirichlet problem for curved boundaries [74, Chap. 6, Lemma 6.5]:

Lemma A.19 (the Schauder local boundary estimate) Let Ω be a C^{2+θ} domain in Rⁿ with boundary ∂Ω. Assume that a function u ∈ C^{2+θ}(Ω) is a solution of the Dirichlet problem


Fig. A.16 The domain B′(x0) and the boundary ∂B′(x0) \ T



  Δu = f in Ω,
  u = 0 on ∂Ω

for a function f ∈ C^θ(Ω). Then, at each boundary point x0 ∈ ∂Ω there is a ball B = B(x0, δ) of radius δ > 0, independent of x0, such that we have the boundary estimate

  |u|_{2,θ;B∩Ω} ≤ C8 ( |u|_{0;Ω} + | f |_{0,θ;Ω} ),  (A.36)

with a constant C8 = C8(n, θ) > 0.

Proof The proof of Lemma A.19 is divided into three steps.

Step (1): First, we consider the case where

  x ∈ B′(x0) = B(x0, ρ) ∩ Ω.

Since we have the inequality (see Fig. A.16)

  d̄_x = dist(x, ∂B′(x0) \ T) ≤ diam Ω,  (A.37)

it follows that

  | f |^(2)_{0,θ;B′(x0)∪T} = sup_{x∈B′(x0)} d̄_x² | f(x)| + sup_{x,y∈B′(x0)} d̄_{x,y}^{2+θ} | f(x) − f(y)| / |x − y|^θ
    ≤ C ( sup_{x∈B′(x0)} | f(x)| + sup_{x,y∈B′(x0)} | f(x) − f(y)| / |x − y|^θ )
    ≤ C | f |_{0,θ;B′(x0)} ≤ C | f |_{0,θ;Ω}.

On the other hand, by applying Theorem A.18 we obtain that

Appendix: A Brief Introduction to the Potential Theoretic Approach T

Fig. A.17 The open neighborhoods B  (x0 ) and B  (x0 ) of x0 in Ω

∂B (x0 ) \ T

B (x0 )

B (x0 ) x0

  |u|∗_{2,θ;B′(x0)∪T} ≤ C ( |u|_{0;B′(x0)} + | f |^(2)_{0,θ;B′(x0)∪T} )
    ≤ C ( |u|_{0;Ω} + | f |^(2)_{0,θ;B′(x0)∪T} ).

Therefore, we have, by inequality (A.37),

  |u|∗_{2,θ;B′(x0)∪T} ≤ C1 ( |u|_{0;Ω} + | f |_{0,θ;Ω} ),  (A.38)

with a positive constant

  C1 = C1(n, θ, B′(x0)).

Step (2): Secondly, we consider the case where

  x ∈ B″(x0) = B(x0, ρ/2) ∩ Ω.

However, we remark that (see Fig. A.17)

  d̄_x = dist(x, ∂B′(x0) \ T) ≥ ρ/2 for all x ∈ B″(x0),
  d̄_{x,y} ≥ ρ/2 for all x, y ∈ B″(x0).

Hence we have the inequality

  |u|∗_{2,θ;B′(x0)∪T} = sup_{x∈B′(x0)} |u(x)| + sup_{x∈B′(x0)} d̄_x |Du(x)| + sup_{x∈B′(x0)} d̄_x² |D²u(x)|
    + sup_{x,y∈B′(x0)} d̄_{x,y}^{2+θ} |D²u(x) − D²u(y)| / |x − y|^θ  (A.39)
    ≥ min( 1, ρ/2, (ρ/2)^{2+θ} ) ( sup_{x∈B″(x0)} |u(x)| + sup_{x∈B″(x0)} |Du(x)| + sup_{x∈B″(x0)} |D²u(x)|
    + sup_{x,y∈B″(x0)} |D²u(x) − D²u(y)| / |x − y|^θ )
    = min( 1, ρ/2, (ρ/2)^{2+θ} ) |u|_{2,θ;B″(x0)}.

Therefore, by combining inequalities (A.38) and (A.39) we obtain that

  |u|_{2,θ;B″(x0)} ≤ C2 ( |u|_{0;Ω} + | f |_{0,θ;Ω} ),  (A.40)

with a positive constant

  C2 = C2(n, θ, B′(x0), B″(x0)).

Fig. A.18 The subdomain Ωσ and the ball B(xi, ρi/4) about xi

Step (3): Since the boundary ∂Ω is compact, we can find a finite number of boundary points {xi}_{i=1}^{N} and positive numbers {ρi}_{i=1}^{N} such that (see Fig. A.18)

  ∂Ω ⊂ ∪_{i=1}^{N} B(xi, ρi/4).

We let

  δ := min_{1≤i≤N} ρi/4,

and

  C := max_{1≤i≤N} C2(n, θ, B′(xi), B″(xi)).

Then, for each boundary point x0 ∈ ∂Ω we can find some ball B(xi, ρi/4) such that

  x0 ∈ B(xi, ρi/4).

Hence we have the inequality


  |x − xi| ≤ |x − x0| + |x0 − xi| ≤ δ + ρi/4 < ρi/2 for all x ∈ B,

and so

  B ∩ Ω ⊂ B(xi, ρi/2) ∩ Ω = B″(xi).

By using inequality (A.40), we obtain that

  |u|_{2,θ;B∩Ω} ≤ |u|_{2,θ;B″(xi)} ≤ C ( |u|_{0;Ω} + | f |_{0,θ;Ω} ).

This proves the desired estimate (A.36). The proof of Lemma A.19 is complete. □



Finally, by using Lemma A.19 we can obtain the following Schauder global estimate for solutions with C^{2+θ} boundary values defined on a C^{2+θ} domain Ω [74, Chap. 6, Theorem 6.6]:

Theorem A.20 (the Schauder global estimate) Let Ω be a C^{2+θ} domain in Rⁿ with boundary ∂Ω. For given functions f ∈ C^θ(Ω) and ϕ ∈ C^{2+θ}(Ω), assume that a function u ∈ C^{2+θ}(Ω) is a solution of the Dirichlet problem for the Laplacian Δ

  Δu = f in Ω,
  u = ϕ on ∂Ω.

Then we have the global estimate

  |u|_{2,θ;Ω} ≤ C9 ( |u|_{0;Ω} + |ϕ|_{2,θ;Ω} + | f |_{0,θ;Ω} ),  (A.41)

with a constant C9 = C9(n, θ) > 0.

Proof The proof of Theorem A.20 is divided into two steps.

Step I: The homogeneous case where ϕ = 0. We show that every solution u ∈ C^{2+θ}(Ω) of the Dirichlet problem

  Δu = f in Ω,
  u = 0 on ∂Ω

satisfies the global estimate

  |u|_{2,θ;Ω} ≤ C ( |u|_{0;Ω} + | f |_{0,θ;Ω} ),  (A.42)

with a positive constant C = C(n, θ).

(1) First, we consider the case where

  x ∈ B(x0, δ) ∩ Ω.

(A.42)


Here δ is the positive constant in Lemma A.19. Then it follows from an application of Lemma A.19 that

  |Du(x)| + |D²u(x)| ≤ |u|_{2,θ;B(x0,δ)∩Ω}
    ≤ Cδ ( |u|_{0;Ω} + | f |_{0,θ;Ω} ) for all x ∈ B(x0, δ) ∩ Ω,  (A.43)

with a positive constant Cδ = C(n, θ, δ).

(2) Secondly, we consider the case where

  x ∈ Ωσ := { x ∈ Ω : dist(x, ∂Ω) > σ } for σ := δ/2.

Then it follows from an application of estimate (A.23) that

  σ |Du|_{0;Ωσ} + σ² |D²u|_{0;Ωσ} ≤ C ( |u|_{0;Ω} + | f |^(2)_{0,θ;Ω} )
    ≤ C ( |u|_{0;Ω} + | f |_{0,θ;Ω} ).

This proves that

  |Du(x)| + |D²u(x)| ≤ Cσ ( |u|_{0;Ω} + | f |_{0,θ;Ω} ) for all x ∈ Ωσ,  (A.44)

with a positive constant Cσ = C(n, θ, σ).

Therefore, by combining estimates (A.43) and (A.44) we obtain that

  |u|_{2;Ω} ≤ Cδ,σ ( |u|_{0;Ω} + | f |_{0,θ;Ω} ),  (A.45)

with a positive constant Cδ,σ = C(n, θ, δ, σ).

(3) It remains to estimate the quantity [D²u]_{θ;Ω}.

(a) First, we consider the case where x, y ∈ B(x0, δ) ∩ Ω. Then it follows from an application of estimate (A.36) that

  |D²u(x) − D²u(y)| / |x − y|^θ ≤ |u|_{2,θ;B(x0,δ)∩Ω} ≤ C1 ( |u|_{0;Ω} + | f |_{0,θ;Ω} ).  (A.46)

(b) Secondly, we consider the case where x, y ∈ Ωσ. It follows from an application of estimate (A.23) that


  σ^{2+θ} |D²u(x) − D²u(y)| / |x − y|^θ ≤ C ( |u|_{0;Ω} + | f |^(2)_{0,θ;Ω} )
    ≤ C2 ( |u|_{0;Ω} + | f |_{0,θ;Ω} ).  (A.47)

(c) Finally, we consider the case where |x − y| > σ, with either x ∉ Ωσ or y ∉ Ωσ. Then it follows from estimate (A.45) that

  |D²u(x) − D²u(y)| / |x − y|^θ ≤ σ^{−θ} ( |D²u(x)| + |D²u(y)| ) ≤ 2 σ^{−θ} |u|_{2;Ω}
    ≤ C3 ( |u|_{0;Ω} + | f |_{0,θ;Ω} ).  (A.48)

Therefore, we obtain from estimates (A.46), (A.47) and (A.48) that

  [D²u]_{θ;Ω} = sup_{x,y∈Ω} |D²u(x) − D²u(y)| / |x − y|^θ  (A.49)
    ≤ ( C1 + C2/σ^{2+θ} + C3/σ^θ ) ( |u|_{0;Ω} + | f |_{0,θ;Ω} ) for σ = δ/2.

The desired global estimate (A.42) follows by combining estimates (A.45) and (A.49).

Step II: The non-homogeneous case where ϕ ∈ C^{2+θ}(Ω). Assume that a function u ∈ C^{2+θ}(Ω) is a solution of the Dirichlet problem

  Δu = f in Ω,
  u = ϕ on ∂Ω.

If we let

  v := u − ϕ ∈ C^{2+θ}(Ω),

then the function v is a solution of the homogeneous Dirichlet problem

  Δv = f − Δϕ in Ω,
  v = 0 on ∂Ω.

Hence, by applying estimate (A.42) to the solution v = u − ϕ we obtain that

  |u − ϕ|_{2,θ;Ω} ≤ C ( |u − ϕ|_{0;Ω} + | f − Δϕ|_{0,θ;Ω} ).  (A.50)

However, we have the inequalities


  |Δϕ|_{0,θ;Ω} ≤ C |ϕ|_{2,θ;Ω},

and

  |u − ϕ|_{0;Ω} ≤ |u|_{0;Ω} + |ϕ|_{0;Ω} ≤ |u|_{0;Ω} + |ϕ|_{2,θ;Ω}.

Therefore, we obtain from estimate (A.50) that

  |u − ϕ|_{2,θ;Ω} ≤ C ( |u − ϕ|_{0;Ω} + | f |_{0,θ;Ω} + |ϕ|_{2,θ;Ω} )
    ≤ C ( |u|_{0;Ω} + | f |_{0,θ;Ω} + |ϕ|_{2,θ;Ω} ).

This proves that

  |u|_{2,θ;Ω} ≤ |u − ϕ|_{2,θ;Ω} + |ϕ|_{2,θ;Ω}
    ≤ C ( |u|_{0;Ω} + | f |_{0,θ;Ω} + |ϕ|_{2,θ;Ω} ).

Now the proof of Theorem A.20 is complete. □

A.6 Notes and Comments

Now let Ω be a bounded C^{2+θ} domain in Euclidean space R^N with 0 < θ < 1, and let A be a second order, strictly elliptic differential operator with real coefficients such that

  Au = Σ_{i,j=1}^{N} a^{ij}(x) ∂²u/∂xi ∂xj + Σ_{i=1}^{N} b^i(x) ∂u/∂xi + c(x)u,

where the coefficients a^{ij}, b^i, c satisfy the following conditions:

(1) a^{ij} ∈ C^θ(Ω), a^{ij}(x) = a^{ji}(x) for 1 ≤ i, j ≤ N and x ∈ Ω, and there exists a constant a0 > 0 such that

  Σ_{i,j=1}^{N} a^{ij}(x) ξi ξj ≥ a0 |ξ|² for all (x, ξ) ∈ T∗Ω = Ω × R^N.

(2) b^i ∈ C^θ(Ω) for 1 ≤ i ≤ N.
(3) c ∈ C^θ(Ω) and c(x) ≤ 0 in Ω.

We introduce a family of second order, strictly elliptic differential operators

  At := t A + (1 − t) Δ for 0 ≤ t ≤ 1,

and consider the family of Dirichlet problems for the operators At



  At u = f in Ω,
  u = ϕ on ∂Ω.  (D)t

Then, by using the method of continuity (Theorem 5.19) we can prove the following theorem (see [74, Theorem 6.8]):

Theorem A.21 Assume that the Dirichlet problem for the Laplacian Δ

  Δu = f in Ω,
  u = ϕ on ∂Ω  (D)0

has a solution u ∈ C^{2+θ}(Ω) for any f ∈ C^θ(Ω) and any ϕ ∈ C^{2+θ}(∂Ω). Then the Dirichlet problem for the operator A

  Au = f in Ω,
  u = ϕ on ∂Ω  (D)1

also has a solution u ∈ C^{2+θ}(Ω) for any f ∈ C^θ(Ω) and any ϕ ∈ C^{2+θ}(∂Ω).
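The continuity argument behind Theorem A.21 can be sketched as follows; this is the standard scheme (following the cited method of continuity, not a reproduction of Theorem 5.19), with the uniform constant supplied by the Schauder global estimate (A.41) applied to each At.

```latex
% Sketch of the method of continuity (standard argument; the uniform
% constant C comes from the Schauder estimate and the maximum principle,
% using c \le 0).
% Reduce to \varphi = 0 and regard
%   A_t = tA + (1-t)\Delta :
%     \{ u \in C^{2+\theta}(\overline{\Omega}) : u = 0 \text{ on } \partial\Omega \}
%     \longrightarrow C^{\theta}(\overline{\Omega})
% as a family of bounded operators with the uniform a priori bound
%   \|u\|_{C^{2+\theta}} \le C \, \|A_t u\|_{C^{\theta}}, \qquad 0 \le t \le 1.
% If A_s is surjective and |t - s| is small (depending only on C and on
% the operator norm of A - \Delta), solve A_t u = f by iterating
%   A_s u_{k+1} = f + (A_s - A_t)\, u_k ;
% the a priori bound makes (u_k) a Cauchy sequence in C^{2+\theta}, so
% A_t is surjective as well. Finitely many such steps carry the
% solvability of (D)_0 (the Laplacian) to (D)_1 (the operator A).
```

The uniform bound is the crucial point: it is independent of t precisely because the Schauder constants depend only on n, θ, the ellipticity constant a0, and the Hölder norms of the coefficients.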

References

1. R. Abraham, J. E. Marsden and T. Ratiu: Manifolds, tensor analysis, and applications, second edition, Applied Mathematical Sciences, Vol. 75, Springer-Verlag, New York, 1988.
2. R. A. Adams and J. J. F. Fournier: Sobolev spaces, second edition, Pure and Applied Mathematics, Vol. 140, Elsevier/Academic Press, Amsterdam, 2003.
3. S. Agmon: Lectures on elliptic boundary value problems, Van Nostrand, Princeton, New Jersey, 1965.
4. S. Agmon, A. Douglis and L. Nirenberg: Estimates near the boundary for solutions of elliptic partial differential equations satisfying general boundary conditions I, Comm. Pure Appl. Math., 12 (1959), 623–727.
5. S. Agmon and L. Nirenberg: Properties of solutions of ordinary differential equations in Banach space, Comm. Pure Appl. Math., 16 (1963), 121–239.
6. M. S. Agranovich and M. I. Vishik: Elliptic problems with a parameter and parabolic problems of general type, Uspehi Mat. Nauk, 19 (3)(117) (1964), 53–161 (Russian). English translation: Russian Math. Surv., 19 (3) (1964), 53–157.
7. F. Altomare, M. Cappelletti Montano and S. Diomede: Degenerate elliptic operators, Feller semigroups and modified Bernstein–Schnabl operators, Math. Nachr., 284 (2011), 587–607.
8. F. Altomare, M. Cappelletti Montano, V. Leonessa and I. Raşa: On differential operators associated with Markov operators, J. Funct. Anal., 266 (2014), 3612–3631.
9. F. Altomare, M. Cappelletti Montano, V. Leonessa and I. Raşa: Markov operators, positive semigroups and approximation processes, De Gruyter Studies in Mathematics, Vol. 61, Walter de Gruyter, Berlin-Munich-Boston, 2014.
10. F. Altomare, M. Cappelletti Montano, V. Leonessa and I. Raşa: Elliptic differential operators and positive semigroups associated with generalized Kantorovich operators, J. Math. Anal. Appl., 458 (2018), 153–173.
11. H. Amann: Linear and quasilinear parabolic problems, Vol. I, Abstract linear theory, Monographs in Mathematics, Vol. 89, Birkhäuser Boston, Boston, Massachusetts, 1995.
12. H. Amann: Linear and quasilinear parabolic problems, Vol. II, Function spaces, Monographs in Mathematics, Vol. 106, Birkhäuser/Springer, Cham, 2019.
13. K. Amano: Maximum principles for degenerate elliptic-parabolic operators, Indiana Univ. Math. J., 29 (1979), 545–557.
14. R. F. Anderson: Diffusions with second order boundary conditions I, Indiana Univ. Math. J., 25 (1976), 367–397.

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 K. Taira, Functional Analytic Techniques for Diffusion Processes, Springer Monographs in Mathematics, https://doi.org/10.1007/978-981-19-1099-9


References

15. R. F. Anderson: Diffusions with second order boundary conditions II, Indiana Univ. Math. J., 25 (1976), 403–441.
16. N. Aronszajn and K. T. Smith: Theory of Bessel potentials I, Ann. Inst. Fourier (Grenoble), 11 (1961), 385–475.
17. J. Barros-Neto: An introduction to the theory of distributions, Marcel Dekker, New York, 1973.
18. J. Bergh and J. Löfström: Interpolation spaces, an introduction, Springer-Verlag, Berlin, 1976.
19. S. N. Bernstein: Equations différentielles stochastiques. In: Actualités Sci. et Ind. 738, pp. 5–31, Conf. intern. Sci. Math. Univ. Genève, Hermann, Paris, 1938.
20. R. M. Blumenthal and R. K. Getoor: Markov processes and potential theory, Dover Publications Inc., Mineola, New York, 2007.
21. J.-M. Bony: Principe du maximum, inégalité de Harnack et unicité du problème de Cauchy pour les opérateurs elliptiques dégénérés, Ann. Inst. Fourier (Grenoble), 19 (1969), 277–304.
22. J.-M. Bony, P. Courrège et P. Priouret: Semigroupes de Feller sur une variété à bord compacte et problèmes aux limites intégro-différentiels du second ordre donnant lieu au principe du maximum, Ann. Inst. Fourier (Grenoble), 18 (1968), 369–521.
23. G. Bourdaud: L^p-estimates for certain non-regular pseudo-differential operators, Comm. Partial Differential Equations, 7 (1982), 1023–1033.
24. L. Boutet de Monvel: Comportement d'un opérateur pseudo-différentiel sur une variété à bord, J. Anal. Math., 17 (1966), 241–304.
25. L. Boutet de Monvel: Boundary problems for pseudo-differential operators, Acta Math., 126 (1971), 11–51.
26. H. Brezis: Functional analysis, Sobolev spaces and partial differential equations, Universitext, Springer-Verlag, New York, 2011.
27. R. Brown: A brief account of microscopical observations made in the months of June, July, and August, 1827, on the particles contained in the pollen of plants; and on the general existence of active molecules in organic and inorganic bodies, Philosophical Magazine N. S., 4 (1828), 161–173.
28. A. P. Calderón and R. Vaillancourt: A class of bounded pseudo-differential operators, Proc. Nat. Acad. Sci. USA, 69 (1972), 1185–1187.
29. A. P. Calderón and A. Zygmund: On the existence of certain singular integrals, Acta Math., 88 (1952), 85–139.
30. A. P. Calderón and A. Zygmund: Local properties of solutions of elliptic partial differential equations, Studia Math., 20 (1961), 171–225.
31. C. Cancelier: Problèmes aux limites pseudo-différentiels donnant lieu au principe du maximum, Comm. Partial Differential Equations, 11 (1986), 1677–1726.
32. V. Capasso and D. Bakstein: An introduction to continuous-time stochastic processes - theory, models, and applications to finance, biology, and medicine, fourth edition, Modeling and Simulation in Science, Engineering and Technology, Birkhäuser/Springer, Cham, 2021.
33. P. Cattiaux: Hypoellipticité et hypoellipticité partielle pour les diffusions avec une condition frontière, Ann. Inst. H. Poincaré Probab. Statist., 22 (1986), 67–112.
34. S. Chapman: On the Brownian displacements and thermal diffusion of grains suspended in a non-uniform fluid, Proc. Roy. Soc. London Ser. A, 119 (1928), 34–54.
35. J. Chazarain et A. Piriou: Introduction à la théorie des équations aux dérivées partielles linéaires, Gauthier-Villars, Paris, 1981.
36. F. Chiarenza, M. Frasca and P. Longo: Interior W^{2,p} estimates for nondivergence elliptic equations with discontinuous coefficients, Ricerche mat., 60 (1991), 149–168.
37. F. Chiarenza, M. Frasca and P. Longo: W^{2,p}-solvability of the Dirichlet problem for nondivergence elliptic equations with VMO coefficients, Trans. Amer. Math. Soc., 336 (1993), 841–853.
38. W.-L. Chow: Über Systeme von linearen partiellen Differentialgleichungen erster Ordnung, Math. Ann., 117 (1940), 98–105.
39. R. R. Coifman et Y. Meyer: Au-delà des opérateurs pseudo-différentiels, Astérisque, No. 57, Société Mathématique de France, Paris, 1978.


40. G. de Rham: Variétés différentiables, Hermann, Paris, 1955. English translation: Differentiable manifolds, Springer-Verlag, Berlin Heidelberg New York Tokyo, 1984.
41. G. Di Fazio and D. K. Palagachev: Oblique derivative problem for elliptic equations in nondivergence form with VMO coefficients, Comment. Math. Univ. Carolinae, 37 (1996), 537–556.
42. M. Derridj: Un problème aux limites pour une classe d'opérateurs du second ordre hypoelliptiques, Ann. Inst. Fourier (Grenoble), 21 (1971), 99–148.
43. J. J. Duistermaat: Fourier integral operators, Courant Institute Lecture Notes, New York, 1973.
44. J. J. Duistermaat and L. Hörmander: Fourier integral operators II, Acta Math., 128 (1972), 183–269.
45. E. B. Dynkin: Foundations of the theory of Markov processes, Fizmatgiz, Moscow, 1959 (Russian). German translation: Springer-Verlag, Berlin, 1961. English translation: Pergamon Press, Oxford, 1960.
46. E. B. Dynkin: Markov processes, Vols. I, II, Springer-Verlag, Berlin, 1965.
47. E. B. Dynkin and A. A. Yushkevich: Markov processes, theorems and problems, Nauka, Moscow, 1967 (Russian). English translation: Plenum Press, New York, 1969.
48. Ju. V. Egorov: Subelliptic operators, Uspekhi Mat. Nauk, 30:2 (1975), 57–114; 30:3 (1975), 57–104 (Russian). English translation: Russian Math. Surv., 30:2 (1975), 59–118; 30:3 (1975), 55–105.
49. Ju. V. Egorov and V. A. Kondrat'ev: The oblique derivative problem, Mat. Sbornik, 78(120) (1969), 148–176 (Russian). English translation: Math. USSR Sbornik, 7 (1969), 139–169.
50. A. Einstein: Investigations on the theory of the Brownian movement, Dover, New York, 1956.
51. K.-J. Engel and R. Nagel: One-parameter semigroups for linear evolution equations, Graduate Texts in Mathematics, Vol. 194, Springer-Verlag, New York Berlin Heidelberg, 2000.
52. G. I. Èskin: Boundary value problems for elliptic pseudodifferential equations, Nauka, Moscow, 1973 (Russian). English translation: American Mathematical Society, Providence, Rhode Island, 1981.
53. S. N. Ethier and T. G. Kurtz: Markov processes, characterization and convergence, John Wiley & Sons Inc., New York Chichester Brisbane Toronto Singapore, 1986.
54. L. C. Evans: Partial differential equations, second edition, Graduate Studies in Mathematics, Vol. 19, American Mathematical Society, Providence, Rhode Island, 2010.
55. V. S. Fediĭ: On a criterion for hypoellipticity, Mat. Sbornik, 85 (127) (1971), 18–48 (Russian). English translation: Math. USSR Sbornik, 14(85) (1971), 15–45.
56. C. Fefferman and D. H. Phong: On positivity of pseudo-differential operators, Proc. Nat. Acad. Sci. USA, 75 (1978), 4673–4674.
57. C. Fefferman and D. H. Phong: Subelliptic eigenvalue problems. In: Conference on Harmonic analysis (1981: Chicago, Illinois), pp. 590–606, Wadsworth, Belmont, California, 1983.
58. W. Feller: Zur Theorie der stochastischen Prozesse (Existenz und Eindeutigkeitssätze), Math. Ann., 113 (1936), 113–160.
59. W. Feller: The parabolic differential equations and the associated semigroups of transformations, Ann. of Math. (2), 55 (1952), 468–519.
60. W. Feller: On second order differential equations, Ann. of Math. (2), 61 (1955), 90–105.
61. G. Fichera: Sulle equazioni differenziali lineari ellittico-paraboliche del secondo ordine, Atti Accad. Naz. Lincei. Mem. Cl. Sci. Fis. Mat. Nat. Sez. I (8), 5 (1956), 1–30.
62. G. B. Folland: Introduction to partial differential equations, second edition, Princeton University Press, Princeton, New Jersey, 1995.
63. G. B. Folland: Real analysis, second edition, John Wiley & Sons, New York Chichester Weinheim Brisbane Singapore Toronto, 1999.
64. R. Fortet: Les fonctions aléatoires du type de Markoff associées à certaines équations linéaires aux dérivées partielles du type parabolique, J. Math. Pures Appl., 22 (1943), 177–243.
65. A. Friedman: Remarks on the maximum principle for parabolic equations and its applications, Pacific J. Math., 8 (1958), 201–211.
66. A. Friedman: Foundations of modern analysis, Dover Publications Inc., New York, 1982.
67. A. Friedman: Partial differential equations, Dover Publications Inc., Mineola, New York, 2008.


68. K. Friedrichs: On differential operators in Hilbert spaces, Amer. J. Math., 61 (1939), 523–544.
69. D. Fujiwara: On some homogeneous boundary value problems bounded below, J. Fac. Sci. Univ. Tokyo Sec. IA, 17 (1970), 123–152.
70. D. Fujiwara and H. Omori: An example of a globally hypo-elliptic operator, Hokkaido Math. J., 12 (1983), 293–297.
71. D. Fujiwara and K. Uchiyama: On some dissipative boundary value problems for the Laplacian, J. Math. Soc. Japan, 23 (1971), 625–635.
72. L. Gårding: Dirichlet's problem for linear elliptic partial differential equations, Math. Scand., 1 (1953), 55–72.
73. I. M. Gel'fand and G. E. Shilov: Generalized functions, Vols. I–III, Moscow, 1958 (Russian). English translation: Academic Press, New York, 1964, 1967, 1968.
74. D. Gilbarg and N. S. Trudinger: Elliptic partial differential equations of second order, reprint of the 1998 edition, Classics in Mathematics, Springer-Verlag, Berlin, 2001.
75. I. C. Gohberg and M. G. Kreĭn: The basic propositions on defect numbers, root numbers and indices of linear operators, Uspehi Mat. Nauk., 12 (1957), 43–118 (Russian). English translation: Amer. Math. Soc. Transl. (2), 13 (1960), 185–264.
76. J. A. Goldstein: Semigroups of linear operators and applications, Oxford Mathematical Monographs, Clarendon Press, Oxford University Press, New York, 1985.
77. G. Grubb: Functional calculus for pseudodifferential boundary value problems, second edition, Progress in Mathematics, Vol. 65, Birkhäuser Boston, Boston, Massachusetts, 1996.
78. G. Grubb and L. Hörmander: The transmission property, Math. Scand., 67 (1990), 273–289.
79. D. Henry: Geometric theory of semilinear parabolic equations, Lecture Notes in Mathematics, No. 840, Springer-Verlag, New York Heidelberg Berlin, 1981.
80. C. D. Hill: A sharp maximum principle for degenerate elliptic-parabolic equations, Indiana Univ. Math. J., 20 (1970), 213–229.
81. E. Hille and R. S. Phillips: Functional analysis and semi-groups, American Mathematical Society Colloquium Publications, 1957 edition, American Mathematical Society, Providence, Rhode Island, 1957.
82. E. Hopf: Elementare Bemerkungen über die Lösungen partieller Differentialgleichungen zweiter Ordnung vom elliptischen Typus, Sitz. Ber. Preuss. Akad. Wissensch. Berlin Math.-Phys. Kl., 19 (1927), 147–152.
83. E. Hopf: A remark on linear elliptic differential equations of second order, Proc. Amer. Math. Soc., 3 (1952), 791–793.
84. L. Hörmander: Linear partial differential operators, Springer-Verlag, Berlin Göttingen Heidelberg, 1963.
85. L. Hörmander: Pseudodifferential operators and non-elliptic boundary problems, Ann. of Math. (2), 83 (1966), 129–209.
86. L. Hörmander: Hypoelliptic second order differential equations, Acta Math., 119 (1967), 147–171.
87. L. Hörmander: Pseudo-differential operators and hypoelliptic equations. In: Proc. Sym. Pure Math., X, Singular integrals, A. P. Calderón (ed.), pp. 138–183, American Mathematical Society, Providence, Rhode Island, 1967.
88. L. Hörmander: Fourier integral operators I, Acta Math., 127 (1971), 79–183.
89. L. Hörmander: A class of hypoelliptic pseudodifferential operators with double characteristics, Math. Ann., 217 (1975), 165–188.
90. L. Hörmander: Subelliptic operators. In: Seminar on singularities of solutions of linear partial differential equations, pp. 127–208, Ann. of Math. Stud., No. 91, Princeton University Press, Princeton, New Jersey, 1979.
91. L. Hörmander: The analysis of linear partial differential operators III, Pseudo-differential operators, reprint of the 1994 edition, Classics in Mathematics, Springer-Verlag, Berlin Heidelberg New York Tokyo, 2007.
92. N. Ikeda and S. Watanabe: Stochastic differential equations and diffusion processes, second edition, North-Holland Mathematical Library, Vol. 24, North-Holland Publishing Co., Amsterdam; Kodansha, Ltd., Tokyo, 1989.


93. Y. Ishikawa: A remark on the existence of a diffusion process with nonlocal boundary conditions, J. Math. Soc. Japan, 42 (1990), 171–184.
94. K. Itô: Stochastic processes (Japanese), Iwanami-Shoten, Tokyo, 1957.
95. K. Itô and H. P. McKean, Jr.: Diffusion processes and their sample paths, reprint of the 1974 edition, Classics in Mathematics, Springer-Verlag, Berlin Heidelberg New York, 1996.
96. G. J. O. Jameson: Topology and normed spaces, Chapman and Hall, London, 1974.
97. F. John and L. Nirenberg: On functions of bounded mean oscillation, Comm. Pure and Appl. Math., 14 (1961), 415–426.
98. S. Kakutani: Markoff process and the Dirichlet problem, Proc. Japan Acad., 21 (1945), 227–233.
99. Y. Kannai: Hypoellipticity of certain degenerate elliptic boundary value problems, Trans. Amer. Math. Soc., 217 (1976), 311–328.
100. T. Kato: Perturbation theory for linear operators, reprint of the 1980 edition, Classics in Mathematics, Springer-Verlag, Berlin Heidelberg New York, 1995.
101. J. R. Kinney: Continuity properties of Markov processes, Trans. Amer. Math. Soc., 74 (1953), 289–302.
102. F. B. Knight: Essentials of Brownian motion and diffusion, American Mathematical Society, Providence, Rhode Island, 1981.
103. A. N. Kolmogorov: Über die analytischen Methoden in der Wahrscheinlichkeitsrechnung, Math. Ann., 104 (1931), 415–458.
104. A. N. Kolmogorov and S. V. Fomin: Introductory real analysis, translated from the second Russian edition and edited by R. A. Silverman, Dover Publications, New York, 1975.
105. T. Komatsu: Pseudo-differential operators and Markov processes, J. Math. Soc. Japan, 36 (1984), 387–418.
106. S. G. Kreĭn: Linear differential equations in Banach space, Nauka, Moscow, 1967 (Russian). English translation: Amer. Math. Soc., Providence, Rhode Island, 1971. Japanese translation: Yoshioka Shoten, Kyoto, 1972.
107. T. Krietenstein and E. Schrohe: Bounded H^∞-calculus for a degenerate elliptic boundary value problem, Math. Ann. (2021). https://doi.org/10.1007/s00208-021-02251-1.
108. H. Kumano-go: Pseudodifferential operators, MIT Press, Cambridge, Massachusetts, 1981.
109. O. A. Ladyzhenskaya and N. N. Ural'tseva: Linear and quasilinear elliptic equations, translated from the Russian by Scripta Technica, Inc., translation editor: Leon Ehrenpreis, Academic Press, New York London, 1968.
110. J. Lamperti: Probability, Benjamin, New York, 1966.
111. J. Lamperti: Stochastic processes, Springer-Verlag, New York Heidelberg Berlin, 1977.
112. S. Lang: Real analysis, Addison-Wesley, Reading, Massachusetts, 1983.
113. S. Lang: Differential manifolds, second edition, Springer-Verlag, New York, 1985.
114. P. Lévy: Processus stochastiques et mouvement brownien, Gauthier-Villars, Paris, 1948.
115. P. Lévy: Théorie de l'addition des variables aléatoires, deuxième édition, Gauthier-Villars, Paris, 1954.
116. J.-L. Lions et E. Magenes: Problèmes aux limites non-homogènes et applications, Vol. 1, 2, Dunod, Paris, 1968. English translation: Non-homogeneous boundary value problems and applications, Vol. 1, 2, Springer-Verlag, Berlin Heidelberg New York, 1972.
117. J. López-Gómez: Linear second order elliptic operators, World Scientific Publishing Co. Pte. Ltd., Hackensack, New Jersey, 2013.
118. I. Madsen and J. Tornehave: From calculus to cohomology: de Rham cohomology and characteristic classes, Cambridge University Press, Cambridge New York Melbourne, 1997.
119. J. Malý and W. P. Ziemer: Fine regularity of solutions of elliptic partial differential equations, American Mathematical Society, Providence, Rhode Island, 1997.
120. K. Masuda: Evolution equations (Japanese), Kinokuniya Shoten, Tokyo, 1975.
121. Y. Matsushima: Differentiable manifolds, Marcel Dekker, New York, 1972.
122. A. Maugeri and D. K. Palagachev: Boundary value problem with an oblique derivative for uniformly elliptic operators with discontinuous coefficients, Forum Math., 10 (1998), 393–405.


123. A. Maugeri, D. K. Palagachev and L. G. Softova: Elliptic and parabolic equations with discontinuous coefficients, Mathematical Research, Vol. 109, Wiley-VCH, Berlin.
124. H. P. McKean, Jr.: Elementary solutions for certain parabolic partial differential equations, Trans. Amer. Math. Soc., 82 (1956), 519–548.
125. W. McLean: Strongly elliptic systems and boundary integral equations, Cambridge University Press, Cambridge, 2000.
126. A. Melin: Lower bounds for pseudo-differential operators, Ark. för Mat., 9 (1971), 117–140.
127. A. Melin and J. Sjöstrand: Fourier integral operators with complex phase functions and parametrix for an interior boundary value problem, Comm. Partial Differential Equations, 1 (1976), 313–400.
128. C. Miranda: Partial differential equations of elliptic type, second revised edition, translated from the Italian by Zane C. Motteler, Ergebnisse der Mathematik und ihrer Grenzgebiete, Band 2, Springer-Verlag, New York Berlin, 1970.
129. J. R. Munkres: Elementary differential topology, Ann. of Math. Studies, No. 54, Princeton Univ. Press, Princeton, New Jersey, 1966.
130. M. Nagase: The L^p-boundedness of pseudo-differential operators with non-regular symbols, Comm. Partial Differential Equations, 2 (1977), 1045–1061.
131. R. Narasimhan: Analysis on real and complex manifolds, North-Holland, Amsterdam, 1973.
132. E. Nelson: Dynamical theories of Brownian motion, Princeton University Press, Princeton, New Jersey, 1967.
133. L. Nirenberg: A strong maximum principle for parabolic equations, Comm. Pure Appl. Math., 6 (1953), 167–177.
134. M. Nishio: Probability theory (Japanese), Jikkyo Shuppan, Tokyo, 1978.
135. B. Noble: Methods based on the Wiener–Hopf technique for the solution of partial differential equations, second edition, Chelsea Publishing Company, New York, 1988.
136. K. Ogura: On the theory of approximating functions with applications, to geometry, law of errors and conduction of heat, Tôhoku Math. J. (2), 16 (1919), 103–154.
137. O. A. Oleĭnik: On properties of solutions of certain boundary problems for equations of elliptic type, Mat. Sbornik, 30 (1952), 595–702 (Russian).
138. O. A. Oleĭnik and E. V. Radkevič: Second order equations with nonnegative characteristic form, Itogi Nauki, Moscow, 1971 (Russian). English translation: Amer. Math. Soc., Providence, Rhode Island and Plenum Press, New York, 1973.
139. R. Palais: Seminar on the Atiyah–Singer index theorem, Ann. of Math. Studies, No. 57, Princeton University Press, Princeton, New Jersey, 1963.
140. B. P. Paneyakh: Some boundary value problems for elliptic equations, and the Lie algebras associated with them, Math. USSR Sbornik, 54 (1986), 207–237.
141. B. P. Paneyakh: The oblique derivative problem. The Poincaré problem. Mathematical Topics, Vol. 17, Wiley-VCH Verlag, Berlin, 2000.
142. A. Pazy: Semigroups of linear operators and applications to partial differential equations, Springer-Verlag, New York Berlin Heidelberg Tokyo, 1983.
143. J. Peetre: Rectification à l'article "Une caractérisation des opérateurs différentiels", Math. Scand., 8 (1960), 116–120.
144. J. Peetre: Another approach to elliptic boundary problems, Comm. Pure Appl. Math., 14 (1961), 711–731.
145. J. B. Perrin: Les atomes, Gallimard, Paris, 1970.
146. M. H. Protter and H. F. Weinberger: Maximum principles in differential equations, corrected second printing, Springer-Verlag, New York, 1984.
147. D. Ray: Stationary Markov processes with continuous paths, Trans. Amer. Math. Soc., 82 (1956), 452–493.
148. R. M. Redheffer: The sharp maximum principle for nonlinear inequalities, Indiana Univ. Math. J., 21 (1971), 227–248.
149. M. Reed and B. Simon: Methods of modern mathematical physics I: Functional analysis, revised and enlarged edition, Academic Press, New York, 1980.


150. S. Rempel and B.-W. Schulze: Index theory of elliptic boundary problems, Akademie-Verlag, Berlin, 1982.
151. D. Revuz and M. Yor: Continuous martingales and Brownian motion, third edition, Springer-Verlag, Berlin New York Heidelberg, 1999.
152. S. Rosenberg: The Laplacian on a Riemannian manifold, London Mathematical Society Student Texts, No. 31, Cambridge University Press, Cambridge New York Melbourne, 1997.
153. W. Rudin: Real and complex analysis, third edition, McGraw-Hill, New York, 1987.
154. D. Sarason: Functions of vanishing mean oscillation, Trans. Amer. Math. Soc., 207 (1975), 391–405.
155. K. Sato: Lévy processes and infinitely divisible distributions, translated from the 1990 Japanese original, revised edition of the 1999 English translation, Cambridge Studies in Advanced Mathematics, Vol. 68, Cambridge University Press, Cambridge, 2013.
156. K. Sato and T. Ueno: Multi-dimensional diffusion and the Markov process on the boundary, J. Math. Kyoto Univ., 14 (1965), 529–605.
157. H. H. Schaefer: Topological vector spaces, third printing, Graduate Texts in Mathematics, Vol. 3, Springer-Verlag, New York Berlin, 1971.
158. J. Schauder: Über lineare elliptische Differentialgleichungen zweiter Ordnung, Math. Z., 38 (1934), 257–282.
159. J. Schauder: Numerische Abschätzungen in elliptischen linearen Differentialgleichungen, Studia Math., 5 (1935), 34–42.
160. M. Schechter: Principles of functional analysis, second edition, Graduate Studies in Mathematics, Vol. 36, American Mathematical Society, Providence, Rhode Island, 2002.
161. M. Schechter: Modern methods in partial differential equations, Dover Publications Inc., Mineola, New York, 2014.
162. E. Schrohe: A short introduction to Boutet de Monvel's calculus. In: Approaches to Singular Analysis, J. Gil, D. Grieser and M. Lesch (eds.), pp. 85–116, Oper. Theory Adv. Appl., 125, Birkhäuser, Basel, 2001.
163. L. Schwartz: Théorie des distributions, Hermann, Paris, 1966.
164. G. Schwarz: Hodge decomposition – A method for solving boundary value problems, Lecture Notes in Mathematics, Vol. 1607, Springer-Verlag, Berlin Heidelberg New York Tokyo, 1995.
165. R. T. Seeley: Extension of C^∞ functions defined in a half-space, Proc. Amer. Math. Soc., 15 (1964), 625–626.
166. R. T. Seeley: Refinement of the functional calculus of Calderón and Zygmund, Proc. Nederl. Akad. Wetensch. Ser. A, 68 (1966), 521–531.
167. R. T. Seeley: Singular integrals and boundary value problems, Amer. J. Math., 88 (1966), 781–809.
168. R. T. Seeley: Topics in pseudo-differential operators. In: Pseudo-differential operators (C.I.M.E., Stresa, 1968), L. Nirenberg (ed.), pp. 167–305, Edizioni Cremonese, Roma, 1969. Reprint of the first edition, Springer-Verlag, Berlin Heidelberg, 2010.
169. L. V. Seregin: Continuity conditions for stochastic processes, Teoriya Veroyat. i ee Primen., 6 (1961), 3–30 (Russian). English translation: Theory Prob. and its Appl., 6 (1961), 1–26.
170. M. A. Shubin: Pseudodifferential operators and spectral theory, translated from the 1978 Russian original by Stig I. Andersson, second edition, Springer-Verlag, Berlin Heidelberg, 2001.
171. I. M. Singer and J. A. Thorpe: Lecture notes on elementary topology and geometry, Springer-Verlag, New York Heidelberg Berlin, 1967.
172. E. M. Stein: The characterization of functions arising as potentials II, Bull. Amer. Math. Soc., 68 (1962), 577–582.
173. E. M. Stein: Singular integrals and differentiability properties of functions, Princeton Mathematical Series, Princeton University Press, Princeton, New Jersey, 1970.
174. E. M. Stein: The differentiability of functions in R^n, Ann. of Math. (2), 113 (1981), 383–385.
175. E. M. Stein: Harmonic analysis: real-variable methods, orthogonality, and oscillatory integrals, Princeton Mathematical Series, Vol. 43, Monographs in Harmonic Analysis, III, Princeton University Press, Princeton, New Jersey, 1993.


176. E. M. Stein and R. Shakarchi: Real analysis. Measure theory, integration, and Hilbert spaces, Princeton Lectures in Analysis, Vol. 3, Princeton University Press, Princeton, New Jersey, 2005.
177. D. W. Stroock and S. R. S. Varadhan: On the support of diffusion processes with applications to the strong maximum principle. In: Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability (Univ. California, Berkeley, Calif., 1970/1971), Vol. III: Probability theory, pp. 333–359, Univ. California Press, Berkeley, California, 1972.
178. D. W. Stroock and S. R. S. Varadhan: On degenerate elliptic-parabolic operators of second order and their associated diffusions, Comm. Pure Appl. Math., 25 (1972), 651–713.
179. D. W. Stroock and S. R. S. Varadhan: Multidimensional diffusion processes, reprint of the 1997 edition, Classics in Mathematics, Springer-Verlag, Berlin, 2006.
180. M. H. Taibleson: On the theory of Lipschitz spaces of distributions on Euclidean n-space I, J. Math. Mech., 13 (1964), 407–479.
181. K. Taira: On non-homogeneous boundary value problems for elliptic differential operators, Kōdai Math. Sem. Rep., 25 (1973), 337–356.
182. K. Taira: On some degenerate oblique derivative problems, J. Fac. Sci. Univ. Tokyo Sec. IA, 23 (1976), 259–287.
183. K. Taira: Sur le problème de la dérivée oblique I, J. Math. Pures Appl., 57 (1978), 379–395.
184. K. Taira: Sur le problème de la dérivée oblique II, Ark. Mat., 17 (1979), 177–191.
185. K. Taira: Sur l'existence de processus de diffusion, Ann. Inst. Fourier (Grenoble), 29 (1979), 99–126.
186. K. Taira: A strong maximum principle for degenerate elliptic operators, Comm. Partial Differential Equations, 4 (1979), 1201–1212.
187. K. Taira: Un théorème d'existence et d'unicité des solutions pour des problèmes aux limites non-elliptiques, J. Funct. Anal., 43 (1981), 166–192.
188. K. Taira: Semigroups and boundary value problems, Duke Math. J., 49 (1982), 287–320.
189. K. Taira: Semigroups and boundary value problems II, Proc. Japan Acad., 58 (1982), 277–280.
190. K. Taira: Le principe du maximum et l'hypoellipticité globale, Séminaire Bony–Sjöstrand–Meyer 1984–1985, Exposé No. I, Ecole Polytechnique, Palaiseau, 1985.
191. K. Taira: Diffusion processes and partial differential equations, Academic Press Inc., Boston, Massachusetts, 1988. http://hdl.handle.net/2241/0002001094.
192. K. Taira: On the existence of Feller semigroups with boundary conditions, Mem. Amer. Math. Soc. 99, No. 475, American Mathematical Society, Providence, Rhode Island, 1992.
193. K. Taira: On the existence of Feller semigroups with Dirichlet condition, Tsukuba J. Math., 17 (1993), 377–427.
194. K. Taira: On the existence of Feller semigroups with boundary conditions II, J. Funct. Anal., 129 (1995), 108–131.
195. K. Taira: Boundary value problems for elliptic integro-differential operators, Math. Z., 222 (1996), 305–327.
196. K. Taira: Feller semigroups and degenerate elliptic operators I, Conf. Semin. Mat. Univ. Bari, No. 274 (1999), 1–29.
197. K. Taira: Feller semigroups and degenerate elliptic operators II, Conf. Semin. Mat. Univ. Bari, No. 275 (1999), 1–30.
198. K. Taira: Feller semigroups generated by degenerate elliptic operators II. In: Proceedings of the First International Conference on Semigroups of Operators, Theory and Applications (Newport Beach, CA, 1998), pp. 304–319, Progr. Nonlinear Differential Equations Appl., Vol. 42, Birkhäuser, Basel, 2000.
199. K. Taira: Logistic Dirichlet problems with discontinuous coefficients, J. Math. Pures Appl., 82 (2003), 1137–1190.
200. K. Taira: On the existence of Feller semigroups with discontinuous coefficients, Acta Math. Sinica (English Series), 22 (2006), 595–606.
201. K. Taira: On the existence of Feller semigroups with discontinuous coefficients II, Acta Math. Sinica (English Series), 25 (2009), 715–740.


202. K. Taira: Semigroups, boundary value problems and Markov processes, second edition, Springer Monographs in Mathematics, Springer-Verlag, Heidelberg, 2014.
203. K. Taira: Analytic semigroups and semilinear initial boundary value problems, second edition, London Mathematical Society Lecture Note Series, No. 434, Cambridge University Press, London New York, 2016.
204. K. Taira: Analytic semigroups for the subelliptic oblique derivative problem, J. Math. Soc. Japan, 69 (2017), 1281–1330.
205. K. Taira: Spectral analysis of the subelliptic oblique derivative problem, Ark. Mat., 55 (2017), 243–270.
206. K. Taira: A strong maximum principle for globally hypoelliptic operators, Rend. Circ. Mat. Palermo (2), 68 (2019), 193–217.
207. K. Taira: Spectral analysis of the hypoelliptic Robin problem, Ann. Univ. Ferrara Sez. VII Sci. Mat., 65 (2019), 171–199.
208. K. Taira: Dirichlet problems with discontinuous coefficients and Feller semigroups, Rend. Circ. Mat. Palermo (2), 69 (2020), 287–323.
209. K. Taira: Boundary value problems and Markov processes: Functional analysis methods for Markov processes, third edition, Lecture Notes in Mathematics, 1499, Springer-Verlag, Berlin, 2020.
210. K. Taira: Spectral analysis of hypoelliptic Vishik–Ventcel' boundary value problems, Ann. Univ. Ferrara Sez. VII Sci. Mat., 66 (2020), 157–230.
211. K. Taira: Logistic Neumann problems with discontinuous coefficients, Ann. Univ. Ferrara Sez. VII Sci. Mat., 66 (2020), 409–485.
212. K. Taira: Ventcel' boundary value problems for elliptic Waldenfels operators, Boll. Unione Mat. Ital., 13 (2020), 213–256.
213. K. Taira: Oblique derivative problems and Feller semigroups with discontinuous coefficients, Ricerche mat. (2020). https://doi.org/10.1007/s11587-020-00509-5.
214. K. Taira: Feller semigroups and degenerate elliptic operators III, Math. Nachr., 294 (2021), 377–417.
215. K. Taira, A. Favini and S. Romanelli: Feller semigroups generated by degenerate elliptic operators, Semigroup Forum, 60 (2000), 296–309.
216. K. Taira, A. Favini and S. Romanelli: Feller semigroups and degenerate elliptic operators with Wentzell boundary conditions, Studia Math., 145 (2001), 17–53.
217. S. Takanobu and S. Watanabe: On the existence and uniqueness of diffusion processes with Wentzell's boundary conditions, J. Math. Kyoto Univ., 28 (1988), 71–80.
218. H. Tanabe: Equations of evolution, Iwanami-Shoten, Tokyo, 1975 (Japanese). English translation: Pitman, London, 1979.
219. H. Tanabe: Functional analytic methods for partial differential equations, Marcel Dekker, New York Basel, 1997.
220. M. E. Taylor: Pseudodifferential operators, Princeton Mathematical Series, Vol. 34, Princeton University Press, Princeton, New Jersey, 1981.
221. M. E. Taylor: Pseudodifferential operators and nonlinear PDE, Progress in Mathematics, Vol. 100, Birkhäuser Boston, Boston, Massachusetts, 1991.
222. F. Treves: An invariant criterion of hypoellipticity, Amer. J. Math., 83 (1961), 645–668.
223. F. Treves: Topological vector spaces, distributions and kernels, Academic Press, New York London, 1967.
224. H. Triebel: Interpolation theory, function spaces, differential operators, North-Holland, Amsterdam, 1978.
225. H. Triebel: Theory of function spaces, Birkhäuser, Basel Boston Stuttgart, 1983.
226. G. M. Troianiello: Elliptic differential equations and obstacle problems, The University Series in Mathematics, Plenum Press, New York London, 1987.
227. T. Ueno: The diffusion satisfying Wentzell's boundary condition and the Markov process on the boundary II, Proc. Japan Acad., 36 (1960), 625–629.
228. B. R. Vaĭnberg and V. V. Grušin: Uniformly nonelliptic problems I, II, Mat. Sbornik, 72(114) (1967), 602–636; 73(115) (1967), 126–154 (Russian). English translation: Math. USSR Sbornik, 1(72) (1967), 543–568; 2(73) (1967), 111–133.



229. M. I. Višik: On general boundary problems for elliptic differential equations, Trudy Moskov. Mat. Obšč. 1 (1952), 187–246 (Russian). English translation: Amer. Math. Soc. Transl. (2) 24 (1963), 107–172.
230. M. I. Višik and G. I. Èskin: Normally solvable problems for elliptic systems of convolution equations, Mat. Sb. (N.S.), 74(116) (1967), 326–356 (Russian).
231. W. von Waldenfels: Positive Halbgruppen auf einem n-dimensionalen Torus, Archiv der Math., 15 (1964), 191–203.
232. F. W. Warner: Foundations of differentiable manifolds and Lie groups, Graduate Texts in Mathematics, No. 94, Springer-Verlag, New York Berlin Heidelberg Tokyo, 1983.
233. S. Watanabe: Construction of diffusion processes with Wentzell’s boundary conditions by means of Poisson point processes of Brownian excursions. In: Probability Theory, pp. 255–271, Banach Center Publications, vol. 5, PWN-Polish Scientific Publishers, Warsaw, 1979.
234. G. N. Watson: A treatise on the theory of Bessel functions, reprint of the second (1944) edition, Cambridge Mathematical Library, Cambridge University Press, Cambridge, 1995.
235. R. O. Wells, Jr.: Differential analysis on complex manifolds, Graduate Texts in Mathematics, No. 65, Springer-Verlag, Berlin Heidelberg New York, 1980.
236. A. D. Wentzell (Ventcel’): On boundary conditions for multidimensional diffusion processes, Teoriya Veroyat. i ee Primen., 4 (1959), 172–185 (Russian). English translation: Theory Prob. and its Appl., 4 (1959), 164–177.
237. N. Wiener: Differential space, J. Math. Phys., 2 (1923), 131–174.
238. N. Wiener and E. Hopf: Über eine Klasse singulärer Integralgleichungen, Sitzungsberichte der Preußischen Akademie, Math. Phys. Kl. (1931), 696–706.
239. J. Wloka: Partial differential equations, Cambridge University Press, Cambridge, 1987.
240. K. Yosida: Functional analysis, reprint of the sixth (1980) edition. Classics in Mathematics, Springer-Verlag, Berlin Heidelberg New York, 1995.

Index

A Absolutely continuous, 475 Absolutely continuous measure, 63 Absorbing, 196 Absorbing barrier Brownian motion, 19, 587, 588, 616, 617 Absorbing–reflecting barrier Brownian motion, 588, 617 Absorption, 646 Absorption phenomenon, 646 Accumulation point, 45 Adapted stochastic process, 580 Adjoint, 202, 203, 215, 224, 225, 333, 346 Adjoint operator, 202, 203, 215, 224, 225, 413 Admissible chart, 142 Admissible inner product, 142 Agmon–Nirenberg method, 31, 517, 544, 681, 698 Algebra, 52, 54 Algebra generated by, 107 Algebraic complement, 210 Algebra of pseudo-differential operators, 405 Almost everywhere (a. e.), 14, 61, 77, 96, 348, 475 Almost surely (a. s.), 77, 596 Alternating, 160 Alternation mapping, 160 Amplitude, 397, 399, 400, 404 Annihilator, 198 Antidual, 222 Antilinear, 222 Approximation theorem, 80, 81 Approximation to the identity, 340, 341

a priori estimate, 441, 542 Ascoli–Arzelà theorem, 51, 360 Associated distribution, 294 Associated Fourier integral distribution, 397, 398 Associated Fourier integral operator, 399, 400 Associated homogeneous principal symbol, 411 Associated initial-value problem, 229 Associated norm, 357, 527, 725 Associated semigroup, 11, 13, 229 Associated seminorm, 343, 361 Associative algebra, 52 Associative law, 52, 156 Asymptotic expansion, 394 Asymptotic expansion of a symbol, 394, 404, 406, 407, 442, 443, 446 Atlas, 141, 142 Atlas of charts with boundary, 173 Avogadro number, 3, 577

B Baire–Hausdorff theorem, 48 Baire’s category, 48 Balanced, 195 Banach’s closed graph theorem, 209, 421, 561 Banach’s closed range theorem, 210 Banach’s open mapping theorem, 209 Banach space, 189, 578, 603, 725 Banach space valued function, 229 Banach–Steinhaus theorem, 187, 192, 300, 355 Barrier for maximum principles, 484

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 K. Taira, Functional Analytic Techniques for Diffusion Processes, Springer Monographs in Mathematics, https://doi.org/10.1007/978-981-19-1099-9


Base space, 165 Basis, 49 Beppo Levi’s theorem, 61, 744 Bessel function, 321 Bessel potential, 321, 326, 332, 399, 405 Bessel’s inequality, 223 Bidual space, 155, 199 Bijection, 141 Bijective, 193 Bilinear, 155, 218 Bilinear form, 218 Bochner integral, 230 Bolzano–Weierstrass theorem, 360 Borel integrable, 230 Borel measurable, 84, 581, 592 Borel measure, 77, 630, 633, 637, 643 Borel measure space, 230 Borel set, 55, 230, 578, 583, 592, 631, 639 Borel σ -algebra, 55 Boundary, 36, 45, 172–174 Boundary condition, 637, 645, 647 Boundary point lemma, 513, 660, 697 Boundary value problem, 647 Bounded, 255 Bounded Borel measurable function, 207, 208 Bounded continuous function, 180, 204 Bounded convergence topology, 187, 188, 191 Bounded linear functional, 197 Bounded linear operator, 191, 234 Bounded set, 185, 187 Brownian excursion, 34 Brownian motion, 1, 2, 7, 15, 577, 585, 615, 635 Brownian motion with constant drift, 7, 15, 585, 615 Bundle atlas, 165 Bundle of non-zero cotangent vectors, 166

C (C0) semigroup, 255, 269, 275, 276 C0 transition function, 16, 592, 595, 602–604, 607, 611 C k function, 287 C k,θ domain, 348 C k,θ hypograph, 347, 348 C r domain, 346, 347 C r manifold, 142 C r manifold with boundary, 173 C r structure, 143

C r structure generated by an atlas, 142 Calderón–Vaillancourt theorem, 448 Calderón–Zygmund integro-differential operator, 332 Canonical measure, 627 Canonical projection, 165 Canonical scale, 627 Carathéodory extension theorem, 88 Cartesian product of measurable spaces, 99 Cauchy convergence condition, 48, 180, 183 Cauchy density, 586 Cauchy problem, 8, 591 Cauchy process, 7, 15, 585, 586, 615 Cauchy sequence, 48, 180, 183, 189, 355 Cauchy’s theorem, 312, 433, 434 Cemetery, 12, 582, 590 Chain of integral curves, 490 Change of variable formula, 170 Chapman–Kolmogorov equation, 6, 10, 251, 252, 576, 583 Characteristic, 462 Characteristic function, 77, 84 Characteristic set, 442, 693 Chart, 141 Chart with boundary, 173 Christoffel symbol, 506 Classical pseudo-differential operator, 405, 410, 689 Classical symbol, 394 Closable, 209, 717 Closable operator, 619 Closed extension, 209, 225, 619 Closed graph theorem, 209 Closed (linear) operator, 208, 214 Closed range theorem, 210 Closed set, 45 Closed submanifold, 348 Closed subspace, 183 Closest-point theorem, 220 Closure, 184, 190, 209 Codimension, 52, 213 Coercive, 548, 683 Collar, 174 Commutative, 53 Compact, 184, 194, 725 Compact convergence topology, 187 Compact metric space, 207, 613, 619, 625, 628, 643 Compact operator, 211–214, 226 Compact perturbation, 214 Compact set, 46, 184 Compact subset, 184

Compact support, 204, 290, 303, 326 Compactification, 46, 184, 581, 631, 633, 640 Compactly supported, 204 Compactness, 46 Compatible, 142 Complemented, 210 Complemented subspace, 210 Complete, 177, 181 Complete measure, 73, 74 Complete metric, 48 Complete orthonormal system, 223 Complete symbol, 299, 404 Completely continuous, 211, 725 Completeness of a space, 189, 191, 219 Complexification of the tangent space, 443 Complex linear space, 49 Complex number field, 49 Complex phase function, 447 Complex vector space, 49 Conditional distribution of a random variable, 124, 137–139 Conditional expectation, 127, 131, 578 Conditional probability, 113, 114, 578 Conditional probability of a random variable, 120 Conjugate exponent, 286 Conjugate linear, 222 Conjugation, 202, 300 Connected component, 47 Connectedness, 47 Connected set, 47 Connected space, 47 Conservative, 582, 589 Consistency condition, 111, 113 Constant map, 581 Construction of barriers, 484 Construction of Feller semigroups, 681 Construction of random processes, 111 Continuity condition, 88 Continuity from above, 81 Continuity from below, 81 Continuity of linear operators, 186 Continuous mapping, 48 Continuous Markov process, 595, 596 Continuous path, 595, 596, 636 Contour, 433, 434 Contraction mapping, 181 Contraction mapping principle, 179, 181 Contraction semigroup, 11, 13, 235–237, 239, 240, 242, 243, 245, 247, 248, 255, 275, 604, 607 Contraction semigroup of class (C0), 235

Contractive linear operator, 10, 604, 611 Contractive operator, 592, 604 Contravariant tensor, 156, 157, 444, 520, 649, 682, 713 Contravariant tensor field, 157 Converges strongly, 183, 189 Converges weakly, 199, 200, 628, 633 Converges weakly*, 201 Convex, 50, 195 Convolution, 286, 287, 305, 306, 310 Coordinate map, 581 Coordinate neighborhood, 141 Coordinate neighborhood system, 141 Coordinate transformation, 141 Cotangent bundle, 26, 32, 146, 154, 157, 166, 454, 518, 520, 546, 682, 688 Cotangent space, 26, 33, 148, 154, 444, 455, 649, 682, 713 Countable collection, 54, 80 Countably additive, 80, 87 Cramer’s rule, 352 Curve, 151, 153

D Definition of conditional distributions, 124 Definition of conditional expectations, 131 Definition of conditional probabilities, 113, 120 Definition of Feller semigroups, 9, 13, 14, 17, 24, 603, 611 Definition of Feller transition functions, 594 Definition of Markov processes, 577, 579, 580, 582 Definition of stopping times, 596 Definition of strong Markov property, 601 Definition of temporally homogeneous Markov processes, 580 Definition of temporally homogeneous transition functions, 583 Definition of transition functions, 583 Degenerate, 683 Degenerate diffusion operator case, 711 Degenerate elliptic, 25, 454, 466, 634, 649, 712 Degenerate elliptic condition, 32, 682 Degenerate elliptic differential operator, 23, 454, 458, 466, 712, 715 Dense, 34, 45, 201, 203, 213, 224, 242, 243 Densely defined, closed operator, 689

Densely defined operator, 201, 209, 210, 213, 225 Density, 166, 167, 337, 342, 343 Density of a domain, 238 Density on a manifold, 167, 168, 341 Derivation, 149 Derivative, 64, 145, 146 Diagonal, 331 Diameter, 725 Difference, 43, 107 Differential form, 162 Differential operator, 294, 299, 345 Differential operator on a manifold, 344 Differentiation, 298 Differentiation of a distribution, 298 Diffusion along the boundary, 646 Diffusion coefficient, 635 Diffusion operator, 635 Diffusion process, 16, 585, 586, 603 Diffusion trajectory, 29 Dimension, 49 Dini’s theorem, 673 Dirac measure, 297, 311, 321, 328, 339–341, 591 Direct sum, 50 Dirichlet problem, 192, 338, 339, 427, 518, 519, 524–527, 548, 550, 650, 687, 699, 741, 760 Dirichlet-to-Neumann operator, 531, 549, 551, 688 Disjoint, 54 Disjoint union, 54 Distance, 180 Distance function, 47, 180 Distribution, 294, 344 Distribution function of a random variable, 96 Distribution kernel, 325, 331 Distribution of a random variable, 95 Distribution of class C k , 309 Distribution on a manifold, 342 Distribution theory on a manifold, 341 Distribution with compact support, 303, 326 Divergence, 159, 176, 532 Divergence theorem, 175, 176, 726, 739 Domain, 346 Domain of class C k,θ , 348 Domain of class C r , 346 Domain of definition, 80, 181, 629, 637 Dominated convergence theorem, 61, 130, 132, 134, 260, 310, 327, 438, 605 Double integral, 71, 74 Double layer potential, 336, 426–429

Double of a manifold, 172, 174, 370, 519, 527, 544, 552 Drift, 7, 585, 615 Drift coefficient, 635 Drift trajectory, 27, 30, 455, 468, 512, 513 Drift vector field, 27, 38, 455, 456, 468, 487, 685 d-system, 78–80, 100, 117 Dual basis, 156, 158, 161, 162 Duality theorem, 363, 370, 372 Dual operator, 201, 202 Dual space, 148, 197, 276, 277, 294, 295, 302, 315 Dual space of a normed factor space, 198 Du Bois-Reymond’s lemma, 296 Dynkin class theorem, 80, 100, 117, 119, 120

E Eberlein–Shmulyan theorem, 200 Egorov–Hörmander theorem, 449 Ehrling’s inequality, 568 Eigenfunction, 423–425, 552 Eigenspace, 212, 263 Eigenvalue, 212, 263 Eigenvector, 212, 263 Elementary family, 54, 117, 119 Elementary symmetric polynomial, 353 Ellipsoid, 26, 455, 456, 468, 469 Elliptic, 548, 683 Elliptic boundary value problem, 1, 2, 24, 25, 31, 391, 400, 441, 446, 453, 513, 530, 571, 666, 719, 720 Elliptic differential operator, 391, 411, 426, 429, 520, 527, 550, 685, 719 Elliptic pseudo-differential operator, 31, 391, 406, 411, 412, 416, 419–422, 700 Elliptic regularity theorem, 416 Elliptic symbol, 394 Energy estimate, 690, 692 Equicontinuous, 51, 188 Equivalence class, 52, 198 Equivalence law, 52, 198 Equivalent (function), 285 Equivalent (norm), 189 Equivalent relation, 52 Essentially bounded, 286 Event, 77, 102 Everywhere dense, 45 Example of vector bundles, 157 Existence theorem for the Dirichlet problem, 192

Existence theorem of a Feller semigroup, 33, 35, 37, 681, 715, 716 Expectation, 77, 89, 96, 578 Exponential function, 233 Extended real numbers, 83 Extension, 182, 209, 225 Extension of an operator, 182 Exterior form, 160 Exterior normal, 348 Exterior product, 159, 160 Exterior product of differential forms, 162

F Factor space, 52, 189 Fatou’s lemma, 61 Fefferman–Phong theorem, 26, 442, 445, 453, 454, 467, 494, 496, 513, 692 Feller semigroup, 13, 517, 575, 576, 603, 604, 611, 629, 634, 637, 646, 681 Feller transition function, 592, 594, 630, 638 Fiber bundle of densities, 342 Fibre, 165 Fichera function, 712 Finite codimension, 211 Finite codimensional space, 211, 213 Finite dimensional distribution, 111 Finite dimensional space, 194, 211–213 Finitely additive, 80 Finite measure, 81, 633 Finite measure space, 81 First axiom of countability, 46, 182 First category, 48 First order Ventcel’ boundary condition, 36, 685 First passage time, 586 Fixed point, 181 Fixed point theorem, 181 Fokker–Planck partial differential equations, 9, 591 Fourier coefficient, 223 Fourier integral distribution, 397 Fourier integral operator, 399, 400 Fourier integral operator with complex phase function, 447 Fourier inversion formula, 314, 328 Fourier series, 224 Fourier series expansion, 224 Fourier transform, 294, 310, 321, 322, 338 Fréchet space, 183 Fréchet differentiable, 348

Fredholm alternative, 212, 215 Fredholm integral equation, 531 Fredholm operator, 213, 416, 540, 547, 548, 550, 551 Friedrichs’ lemma, 387 Friedrichs’ mollifier, 292, 382, 436 Fubini’s theorem, 70–75, 305, 310, 430, 437, 592, 606 Function rapidly decreasing at infinity, 311 Function space, 248, 285 Function with compact support, 204 Functional, 181, 197 Fundamental neighborhood system, 45 Fundamental solution, 728 G Gårding’s inequality, 441 Generation theorem for Feller semigroups, 612, 619 Geometric multiplicity, 263 Globally hypoelliptic, 446, 447, 706, 709 Gradient vector field, 159 Gram–Schmidt orthogonalization, 223 Graph, 208 Green kernel, 615 Green operator, 613 Green operator of a Feller semigroup, 667 Green operator of a semigroup, 613 Green operator of the Dirichlet problem, 650 Green representation formula, 283, 338, 730 Green’s identity, 175, 176, 728, 747 Green’s representation formula, 428 Gronwall’s lemma, 475 H Hahn–Banach extension theorem, 195 Hahn decomposition, 62, 66, 68 Hahn–Jordan decomposition, 62 Hamiltonian, 505 Hamiltonian curve, 505 Hamilton vector field, 449 Hamilton map, 443 Harmonic function, 726 Harmonic operator of the Dirichlet problem, 650 Hausdorff space, 46 Heat equation, 341 Heat kernel, 248, 340 Heaviside function, 295, 323 Hessian, 442, 446 Hilbert–Schmidt theory, 225, 424 Hilbert space, 218, 219

Hill’s diffusion trajectory, 29, 457, 469, 470, 512, 513 Hill’s drift trajectory, 30, 457, 469, 512 Hille–Yosida–Ray theorem, 619 Hille–Yosida theorem, 242, 269, 612 Hille–Yosida theory, 235, 593 Hille–Yosida theory of contraction semigroups, 235, 593, 611 Hille–Yosida theory of Feller semigroups, 611 Hölder continuity, 723 Hölder continuous, 290, 724 Hölder estimate for the second derivative, 732 Hölder estimate at the boundary, 738 Hölder norm, 725 Hölder regularity for the Newtonian potential, 728 Hölder seminorm, 724, 734 Hölder’s inequality, 286 Hölder space, 290, 291, 723, 724 Holomorphic, 431 Homeomorphic, 48 Homeomorphism, 48 Homogeneous principal symbol, 405, 411 Hopf’s boundary point lemma, 464, 660, 697 Hypoelliptic, 445, 447, 690 Hypoelliptic with loss of one derivative, 447 Hypograph, 347

I Ideal, 53 Idempotent operator, 221 Identity operator, 10, 180, 192, 253, 255, 340, 341, 593 Image measure, 97, 103 Independence, 102 Independent algebras, 105 Independent events, 102 Independent random variables, 103, 105 Index of an elliptic pseudo-differential operator, 416 Index of an operator, 213 Indicator function, 77 Inductive limit topology, 290 Infinite dimensional, 49, 211, 212 Infinitesimal generator, 236, 257, 612–614, 617, 619, 625, 626, 629, 634, 636, 637 Infinitesimal generator of a Feller semigroup, 612–614, 619, 625, 629, 634, 636, 637

Infinitesimal generator of a Feller semigroup on a bounded domain, 626, 636 Infinitesimal generator of a semigroup, 236 Initial-value problem for the heat equation, 341 Injective, 190, 331 Injectivity, 669 Inner product, 218 Inner product space, 218 Inner regular, 59 Integrable, 22, 60, 61, 63, 65–67, 72–74, 76, 98, 131, 170, 172, 230, 239, 285, 295, 297, 315, 396, 578, 628, 731 Integral, 77 Integral curve, 151–153 Integration on a manifold, 170 Integro-differential operator, 634, 645 Interior, 45, 173, 174 Interior estimate for harmonic functions, 726 Interior Hölder estimate, 732 Interior Hölder norm, 733 Interior Hölder seminorm, 733 Interior product of tensors, 156 Interpolation inequality, 360, 568 Inverse, 181, 190 Inverse Fourier transform, 311, 322 Inverse image of a distribution, 364 Inverse operator, 181, 190 Inward jump phenomenon from the boundary, 646 Inward normal, 22, 463, 464, 712 Inward normal derivative, 464 Isometry, 190 Isomorphic, 190 Isomorphism, 190 Iterated integral, 71 J Jacobian determinant, 164, 168, 170, 172, 175 Jacobian matrix, 364 Joint distribution, 103 Jordan decomposition, 63, 206 Jordan–Hahn decomposition, 62 Jump formula, 307, 308, 310, 338, 379, 427, 428, 523, 524, 571 Jump phenomenon on the boundary, 646 K Kernel, 329, 331 Kernel on a manifold, 345 Kernel theorem, 331

Killing measure, 627 Kolmogorov extension theorem, 113 Kolmogorov’s backward equation, 9, 10, 591 Kolmogorov’s forward equation, 9, 11, 591

L L p space, 285 Laplace–Beltrami operator, 345 Laplacian, 308, 338, 345, 723, 756 Lax’s lemma, 395 Layer potential, 336, 337, 355 Lebesgue dominated convergence theorem, 61, 310, 327, 370, 438, 605 Lebesgue measurable, 57, 285, 295 Lebesgue measurable function, 285, 286 Lebesgue measure, 8, 57, 60, 171, 285, 349, 376, 519, 590, 618, 720 Lebesgue monotone convergence theorem, 61, 97, 100, 101, 118, 120, 128, 129, 136, 380, 592 Lebesgue–Stieltjes integral, 96 Leibniz formula, 232, 234, 299 Leibniz–Hörmander formula, 300 Lévy operator, 635 Lie derivative, 149 Lie group of linear automorphisms, 165 Lifetime, 582 Linear functional, 54, 181, 195–197 Linearly dependent, 49 Linearly independent, 49 Linear mapping, 161, 190, 212, 290, 294, 344 Linear operator, 53, 181, 186 Linear space, 49 Linear subspace, 50 Linear subspace spanned by, 50 Linear topological space, 50, 51, 290, 343 Lipschitz constant, 348 Lipschitz continuous, 346 Lipschitz domain, 347 Lipschitz hypograph, 346 Local coordinate, 142 Local coordinate system, 142 Locally compact, 46, 184, 204 Locally compact metric space, 204, 578, 580, 583, 592, 595, 601–603 Locally compact topological space, 204 Locally convex linear topological space, 51 Locally finite, 142, 144, 171, 301, 343 Locally Hölder continuous, 291 Locally integrable function, 295, 344 Local operator, 294, 635

Loss of two derivatives, 717 M Manifold, 141, 142, 341, 342, 344, 345 Manifold with boundary, 172, 173, 347 Mapping, 44 Markov process, 575, 577, 579–581, 645 Markov property, 579, 581, 583 Markov time, 596 Markov transition function, 583 Maximal atlas, 142 Maximal integral curve, 38, 447, 685 Maximum norm, 628 Maximum principle, 619, 634, 697 Mazur’s theorem, 195, 200 Mean, 96 Mean curvature, 532 Mean value theorem, 151, 472, 478, 480, 482, 503, 508 Mean value theorem for harmonic functions, 726 Measurability of functions, 77, 79, 83, 99 Measurable, 77, 579 Measurable function, 77, 83, 85 Measurable mapping, 85 Measurable set, 54, 77, 84, 112 Measurable space, 54, 77, 83, 85 Measure, 80, 81 Measure space, 77, 81 Melin’s inequality, 443 Melin–Sjöstrand theorem, 447 Method of continuity, 192, 760 Metric, 47, 180, 182, 189 Metric space, 47, 179, 204 Metrizable, 47, 180, 182 Minimal closed extension, 36, 209, 225, 619, 677, 684, 715 Minkowski functional, 196 Modified Bessel function, 321 Module, 52 Mollifier, 283, 292, 382, 391, 436, 652, 653 Monotone class theorem, 78, 85, 86, 606 Monotone convergence theorem, 61, 97, 100, 101, 118, 120, 128, 129, 136, 380, 592 Monotonicity, 81, 114 Multi-index, 284 Multiplication by functions, 298 Multiplicity, 212, 226 N n-dimensional random variable, 91

Negative variation measure, 206, 207 Neighborhood, 45 Neighborhood system, 45 Neumann series, 192, 194, 212 Newtonian potential, 321, 325, 332, 336, 426–429, 728, 730, 732, 739 Non-characteristic, 378, 462, 717 Non-degenerate, 683 Non-dimensional Hölder norm, 725 Non-Euclidean ball, 684 Non-negative, 10, 13, 22, 56, 57, 60, 61, 66, 67, 71, 84, 85, 87, 88, 93, 94, 97, 98, 101, 114, 171, 191, 206, 208, 219, 284, 288–292, 294, 295, 306, 311, 315, 320, 328–330, 335, 347, 359, 363, 366, 369, 371, 373, 376, 377, 380–382, 384, 392, 432, 444, 449, 461, 471, 472, 483, 518, 544, 578, 583, 598, 599, 604, 611, 613, 614, 621, 623–625, 630, 631, 638, 642, 650–652, 654, 658, 667–670, 724, 725, 733, 749, 752 Non-negative Borel measure, 630, 633, 637, 643 Non-negative linear functional, 206 Non-negative measure, 80, 631, 632, 641 Non-negative operator, 592, 604 Norm, 188, 190, 248, 255, 285, 286, 289, 291, 592, 593, 603, 612 Normal, 348 Normal coordinate, 370, 376, 519 Normal in the sense of Bony, 481, 492 Normal transition function, 584, 595 Norm continuous, 232 Norm differentiable, 232 Normed factor space, 189, 198 Normed linear space, 188 Norm-preserving, 190 Nowhere dense, 48 Null space, 181, 212, 534

O Oblique derivative problem, 37, 711 One-point compactification, 46, 184, 204, 581 Open covering, 46, 184 Open mapping theorem, 209 Open set, 45 Open submanifold, 143 Operator, 181, 186, 329 Operator norm, 191, 231, 255 Operator on a manifold, 345

Operator valued function, 231 Order of a differential operator, 21, 25, 299, 308, 333, 345, 378, 379, 382, 388, 389, 392, 408, 521, 714 Order of a distribution, 295 Order of a Sobolev space, 358, 412 Order of an operator, 413 Oriented, 175 Orthogonal, 220 Orthogonal complement, 221 Orthogonal decomposition, 221 Orthogonal projection, 221 Orthogonal set, 223 Orthogonality, 220 Orthonormal, 223 Orthonormal set, 223 Orthonormal system, 223 Oscillatory integral, 397, 398 Outer regular, 59 Outward normal, 176, 727, 731

P π -system, 78 Paley–Wiener–Schwartz theorem, 433 Paracompact, 142, 145, 174 Parallelogram law, 220 Parametrix, 406 Parseval formula, 314, 322, 360, 366 Parseval’s identity, 224 Partition of unity, 144, 170, 301, 342 Path, 579 Path-continuity, 595 Path function, 595 Peetre’s inequality, 385 Peetre’s theorem, 294 Peetre’s theorem for differential operators, 283, 635 Peetre’s theorem for Fredholm operators, 215, 562, 563, 704 Permutation, 112 Phase function, 395–400, 448 Piecewise differentiable curve, 477, 478, 512, 513 Plancherel theorem, 322 Plane-wave expansion, 397 Point at infinity, 46, 184, 204, 580, 601, 612 Point spectrum, 212, 263 Poisson integral, 726 Poisson integral formula, 338–340, 427 Poisson kernel, 340, 521, 532 Poisson operator, 525, 650 Poisson process, 7, 585, 614

Poisson’s equation, 519 Positive definite, 158 Positively homogeneous, 297, 392 Positive maximum principle, 619, 621, 622, 625, 660, 664, 667, 708 Positive semi-definite, 630, 632, 637, 640 Positive variation measure, 206, 207 Potential, 337, 426 Potential theoretic approach, 723 Precompact, 184 Pre-Hilbert space, 218 Principal part, 394 Principal part of a symbol, 394 Principal symbol, 299, 404 Probability, 87, 578 Probability measure, 87, 578, 581, 641, 643 Probability space, 87, 578 Product measure, 59, 73 Product neighborhood, 174 Product neighborhood theorem, 174 Product of linear operators, 191 Product space, 188, 189 Product topological space, 46 Product topology, 46, 188, 189 Progressively measurable, 601 Propagation set, 454, 467 Proper mapping, 48 Properly supported, 402–404, 406, 408, 429, 439–441, 445, 446, 448, 449 Pseudo-differential operator, 400, 409, 721 Pseudo-local property, 401 Pull-back, 344, 407 Push-forward, 344, 407 Q Quasibounded, 255 Quasinorm, 183 Quasinormed linear space, 182, 183 R Rademacher’s theorem, 348 Radon measure, 59, 206, 604 Radon–Nikodým derivative, 64 Radon–Nikodým theorem, 62, 63, 114 Random variable, 89, 578 Range, 181 Rapidly decreasing, 311, 431, 436, 437, 439 Rational function, 429 Real Borel measure, 58 Real linear space, 49 Real measure, 58 Real number field, 49

Real vector space, 49 Reduction to the boundary, 530 Refinement, 142 Reflecting barrier Brownian motion, 16, 18, 586, 587, 615, 616 Reflection phenomenon, 646 Reflexive, 199 Reflexive Banach space, 209 Reflexivity, 200, 300 Reflexivity of Hilbert spaces, 222 Regular Borel measure, 58 Regular boundary, 627 Regular boundary case, 713 Regular distribution, 308 Regularity property, 540 Regularization, 292, 307 Regularizer, 334, 401 Regular point, 626 Relatively compact, 46, 51, 184, 211 Relative topology, 46 Rellich’s theorem, 360, 367, 373, 417, 419, 423, 561, 702 Representative, 52 Resolvent, 212, 240, 262 Resolvent equation, 263, 651 Resolvent set, 212, 262 Resonance theorem, 192, 232, 245, 256, 273 Restriction of a distribution, 298 Restriction of an operator, 182 Riemannian manifold, 159, 345 Riemannian metric, 158, 345, 348 Riemann–Stieltjes integral, 96 Riemann sum, 276, 335 Riesz kernel, 321, 326, 397 Riesz–Markov representation theorem, 206, 207 Riesz operator, 332 Riesz potential, 321, 325, 326, 331 Riesz representation theorem, 221, 224 Riesz’s theorem, 221, 224 Riesz–Schauder theory, 179, 212, 412, 521 Right-continuous Markov process, 595, 602 Right-continuous path, 596 Right-continuous σ -algebras, 597

S σ -algebra, 54, 77, 78 σ -algebra of all Borel sets, 578, 583, 592, 601, 631, 639 σ -compact, 51, 184 σ -finite, 81

σ -algebra generated by, 55, 107, 581 Sample point, 4, 87, 578, 579 Sample space, 579 Scalar field, 181, 194 Scalar multiplication, 187 Scalar product, 218 Schauder global estimate, 756 Schauder interior estimate, 734 Schauder local boundary estimate, 752 Schur’s lemma, 287 Schwartz space, 311 Schwartz’s kernel theorem, 331 Schwarz’s inequality, 218, 286, 482, 488, 494, 497, 505, 691, 693 Second axiom of countability, 46 Second category, 48 Second dual space, 199 Second fundamental form, 532 Second order elliptic differential operator, 31, 32, 517, 544 Second order Ventcel’ boundary condition, 32, 683 Sectional trace, 309, 377, 439 Sectional trace theorem, 377 Section of a function, 70 Section of a set, 70, 99 Section of a vector bundle, 162, 166 Self-adjoint, 225, 424 Semigroup, 235, 255, 592, 603 Semigroup property, 235, 604 Seminorm, 182, 183, 186, 188, 287, 288, 291, 311, 316, 343, 359, 361, 365, 392, 724, 733, 734, 737, 749 Separable, 45, 578, 580, 592, 595, 601–603 Sequential density, 305 Sequentially compact, 184 Sequentially dense, 303 Sequential weak compactness, 200 Sequential weak* compactness, 201 Sesquilinear, 203 Sesquilinear form, 203 Sesquilinearity, 218 Set, 43 Set function, 80 Sharp Gårding inequality, 441, 692, 693 Sheaf property, 301, 344 Shift mapping, 581 Signature, 326 Signed measure, 58, 62, 63, 205, 206

Simple convergence topology, 187, 199 Simple function, 84 Single layer potential, 336–338, 426, 429 Singular integral operator, 720 Singular point, 626 Singular support, 398 Slobodeckiĭ seminorm, 359 Smallest σ -algebra, 55, 92, 577–579 Small perturbation, 214 Smooth mapping, 143 Sobolev imbedding theorem, 363 Sobolev’s theorem, 371 Space of continuous functions, 204, 287, 577, 593, 603, 612, 655 Space of densities, 166 Space of signed measures, 205 Spectral theorem, 226 Spectrum, 212, 263 Speed measure, 627 State space, 579 Sticking barrier Brownian motion, 18, 586, 616 Sticking phenomenon, 646 Sticky barrier Brownian motion, 19, 617 Stochastic process, 579, 580 Stokes’s theorem, 175 Stopping time, 596, 602 Strictly elliptic, 32, 460, 461, 518, 519, 527, 544, 648, 682, 759 Strong bidual space, 199, 222 Strong closure, 200 Strong continuity, 10, 235, 593 Strong convergence, 183, 189 Strong dual space, 197 Stronger topology, 45 Strong limit, 192 Strongly continuous, 13, 229, 232 Strongly continuous semigroup, 235, 604 Strongly differentiable, 231, 232 Strongly integrable, 230 Strong Markov process, 15, 16, 601, 602 Strong Markov property, 602 Strong maximum principle, 25, 27, 28, 30, 453, 454, 456, 466, 468–470 Strong second dual space, 199 Strong solution, 389, 390 Strong topology, 191, 197, 300 Structure theorem for distributions with compact support, 320, 335

Structure theorem for tempered distributions, 315 Subadditivity, 81 Subcollection, 102, 184 Sub-elliptic, 449, 548 Sub-elliptic pseudo-differential operator, 448 Submanifold, 348 Subprincipal part, 27, 455 Subprincipal symbol, 443, 446, 693 Sub-σ -algebra, 579, 580 Subspace spanned by, 50 Subunit, 444, 683 Subunit tangent vector, 26–28, 31, 444, 453–456, 467, 486, 496, 498 Subunit trajectory, 27, 455, 468, 494, 513 Sum of linear operators, 187 Sum of linear spaces, 210 Support, 204, 290, 301, 302 Support of a function, 204 Supremum norm, 592, 593, 603, 612 Surface area of the unit ball, 727 Surface element, 176, 727, 731 Surface measure, 297, 348, 349 Surface potential, 429, 439, 521 Surjective, 331 Symbol, 392 Symmetric, 158 Symmetric contravariant tensor, 32, 156, 157, 444, 520, 682, 713 Symmetric difference, 81, 107 Symmetric matrix, 632, 637, 640 Symmetric operator, 221 Symplectic form, 443

T Tangent bundle, 145, 166 Tangent bundle projection, 146, 154 Tangent map, 147 Tangent space, 145, 146, 157, 342, 443 Tangent vector, 145, 442 Tempered distribution, 314, 315, 322 Temporally homogeneous Markov process, 581 Temporally homogeneous Markov property, 579 Temporally homogeneous Markov transition function, 583 Tensor, 155 Tensor field, 157 Tensor product, 303, 305

Tensor product of distributions, 303, 305 Tensor product of Fréchet spaces, 393 Tensor product of tensors, 156 Tensor product of test function spaces, 303 Terminal point, 5, 582, 588, 617 Termination coefficient, 635 Test function, 290 Topological complement, 210 Topological space, 45, 180, 182 Topological subspace, 46 Topological vector space, 50 Topology defined by a metric, 180 Topology defined by a norm, 191 Topology defined by an atlas, 173 Topology defined by neighborhoods, 184 Topology defined by open sets, 184 Topology defined by seminorms, 183 Topology of linear operators, 187 Topology of uniform convergence, 187 Topology on a manifold, 142 Totally bounded, 185, 186 Totally characteristic case, 715 Total space, 165 Total variation, 58 Total variation measure, 58 Total variation norm, 10, 206, 207, 628 Trace map, 373, 375, 376 Trace of trajectories of a Markov process, 666 Trace operator, 687, 699 Trace theorem, 373, 375, 376 Trajectory, 579 Trajectory of a Markov process, 579 Transformation, 44 Transition function, 575, 577, 582, 583, 603, 604 Transition map, 141 Transitivity, 52 Translate, 50 Translation invariant, 57 Transpose, 202, 210, 215, 332, 346 Transpose of an operator, 413 Transposition, 160 Transversal, 33, 665, 675, 684, 715 Trap, 587, 626 Triangle inequality, 47, 180, 183, 189 Trivial line bundle, 166 Type of a tensor, 156 Type of a tensor field, 157 Typical fibre, 165 U Uniformly bounded, 51

Uniformly stochastically continuous, 15, 602, 603, 607, 611, 630, 638 Uniform motion, 6, 584, 614 Uniform stochastic continuity, 602, 603, 610, 611 Uniform topology of operators, 191 Uniqueness of the Fourier expansion, 555, 556 Uniqueness theorem for the Dirichlet problem, 524 Unit ball, 194

V Valeur principale (v.p.), 298 Vandermonde determinant, 352 Vanish at infinity, 204, 248 Vector, 49 Vector bundle, 157, 162, 164 Vector bundle of differential forms, 162 Vector bundle of exterior forms, 162 Vector bundle of tensors, 157 Vector field, 147, 151 Vector space, 49, 50, 165 Ventcel’ boundary condition, 713 Ventcel’ boundary value problem, 650, 655 Ventcel’ (Wentzell) boundary condition, 23, 24, 32, 33, 36, 576, 637, 646, 649, 665, 683 Ventcel’s theorem, 629, 637 Viscosity phenomenon, 646 Volume element, 164, 166 Volume potential, 428, 439, 520

W Waldenfels integro-differential operator, 576, 635 Waldenfels operator, 635 Weak closure, 200 Weak compactness, 200 Weak convergence, 199 Weak convergence of measures, 208, 595, 628 Weaker topology, 45 Weakly compact, 200 Weakly convergent, 199, 222 Weakly* convergent, 201 Weak maximum principle, 458, 459, 470, 651, 660, 697 Weak solution, 389 Weak topology, 199 Weak* convergence, 201, 628 Weak* dual space, 197 Weak* topology, 197, 201, 300, 307 Wentzell (Ventcel’) boundary condition, 2, 23 Wiener measure, 3, 578

Y Yosida approximation, 243, 270, 613 Young’s inequality, 287, 305

Z Zero-extension, 307, 310, 731 Zero vector, 49